
We will explore two ways to autoscale a microservice hosted in AKS. You can use either Kubernetes Metrics Server or KEDA.

Kubernetes Metrics Server

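To autoscale a microservice in AKS based on message rate, you can use the Kubernetes Horizontal Pod Autoscaler (HPA) along with Kubernetes Metrics Server. Keep in mind that Metrics Server on its own only reports CPU and memory usage; a custom metric such as message-rate also has to be exposed through the Kubernetes custom metrics API by a metrics adapter. Here is an example YAML file that demonstrates how to configure HPA for a microservice: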

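# autoscaling/v2beta1 was removed in Kubernetes 1.25; autoscaling/v2 is the stable API
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-microservice-hpa
  namespace: my-namespace
spec:
  # The workload whose replica count the HPA adjusts
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-microservice-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      # The Kubernetes object the metric is read from
      describedObject:
        apiVersion: v1
        kind: Service
        name: my-microservice-service
      metric:
        name: message-rate
      target:
        type: Value
        value: "100"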

In this example, the HPA monitors the message-rate metric of the my-microservice-service service in the my-namespace namespace and scales the my-microservice-deployment deployment between a minimum of 2 and a maximum of 10 replicas. The target value of 100 is what the HPA tries to hold: when message-rate rises above 100 it adds replicas in proportion to the overshoot, and when message-rate falls below 100 it removes replicas. Small deviations from the target (10% by default) do not trigger any scaling.
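The scale-up and scale-down behaviour described above follows the formula given in the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas × currentValue / targetValue), clamped to the configured minimum and maximum. The following is a minimal Python sketch of that calculation, not the controller's actual code. The function name and parameters are hypothetical, the defaults mirror the example manifest, and stabilization windows, pod readiness and missing metrics are ignored.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=2, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a Value-type metric."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        # Scale proportionally to how far the metric is from the target.
        desired = math.ceil(current_replicas * ratio)
    # Never go outside the minReplicas/maxReplicas bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 2 replicas seeing a message-rate of 250 against a target of 100
    print(desired_replicas(2, 250, 100))
```

With 2 replicas and a message-rate of 250, the ratio is 2.5, so the sketch asks for 5 replicas. A reading of 105 stays inside the 10% tolerance and leaves the replica count alone.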

You will also need a Metrics Server running inside your AKS cluster. You can use the following command to deploy it:

kubectl apply -f

Kubernetes-based Event Driven Autoscaler (KEDA)

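Another way to achieve autoscaling based on message rate in AKS is to use the Kubernetes-based Event Driven Autoscaler (KEDA). KEDA is a Kubernetes operator that autoscales your containers based on event-driven metrics such as message rate or queue depth. It creates and manages an HPA for the target workload behind the scenes, and it can also scale a workload down to zero replicas when there is nothing to process.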

Install KEDA

# Add the KEDA Helm repository
helm repo add kedacore
# Update the Helm repository
helm repo update
# Install KEDA in the kube-system namespace
helm install keda kedacore/keda --namespace kube-system
# Verify that the KEDA pods are running
kubectl get pods -n kube-system

Here is an example YAML file that demonstrates how to configure KEDA for a microservice:

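apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-microservice-scaledobject
  namespace: my-namespace
spec:
  # The workload KEDA scales
  scaleTargetRef:
    kind: Deployment
    name: my-microservice-deployment
  pollingInterval: 10
  cooldownPeriod: 300
  triggers:
  - type: azure-servicebus-queue
    metadata:
      queueName: my-queue
      # Name of an environment variable on the deployment's container that holds
      # the Service Bus connection string (the variable name here is a placeholder);
      # avoid putting the shared access key in the manifest itself
      connectionFromEnv: SERVICEBUS_CONNECTION_STRING
      # Queue length above which KEDA scales up from zero
      activationMessageCount: "0"
      # Target number of messages per replica
      messageCount: "10"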

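In this example, KEDA monitors the my-queue Azure Service Bus queue on behalf of the my-microservice-deployment deployment in the my-namespace namespace. Note that this trigger measures the number of messages waiting in the queue rather than a true message rate. The pollingInterval is set to 10 seconds, so KEDA checks the queue every 10 seconds, and the cooldownPeriod is set to 300 seconds, so KEDA waits 300 seconds after the trigger was last active before scaling the deployment back to zero. When the queue holds more than 10 messages per replica, KEDA scales up the number of replicas, and as the backlog drains it scales them down again. A queue length cannot fall below 0, so the lower value of 0 marks the point at which the trigger counts as inactive: an empty queue.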
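To make the queue-based scaling above concrete, here is a minimal Python sketch of the steady-state replica count for a queue-length trigger with a target of 10 messages per replica. It is an approximation under stated assumptions, not KEDA's implementation. The function name and parameters are hypothetical, the defaults of 0 and 100 replicas mirror KEDA's default minReplicaCount and maxReplicaCount, and the polling interval, cooldown period and HPA tolerance are ignored.

```python
import math


def keda_desired_replicas(queue_length, messages_per_replica=10,
                          min_replicas=0, max_replicas=100):
    """Approximate steady-state replica count for a queue-length trigger."""
    if queue_length == 0:
        # Empty queue: the trigger is inactive, so fall back to the minimum
        # (zero by default, reached once the cooldown period has elapsed).
        return min_replicas
    # One replica per `messages_per_replica` queued messages, rounded up.
    desired = math.ceil(queue_length / messages_per_replica)
    # At least one replica while messages are waiting, within the bounds.
    return max(1, min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    for backlog in (0, 1, 11, 95, 5000):
        print(backlog, keda_desired_replicas(backlog))
```

A backlog of 95 messages maps to 10 replicas, a single message is enough to bring the deployment up from zero, and very large backlogs are capped at the maximum replica count.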