Setting Up KEDA (Kubernetes Event-driven Autoscaling)

Comprehensive guide for implementing event-driven autoscaling in Kubernetes using KEDA

This guide covers installing and configuring KEDA in a Kubernetes cluster, with examples of common scalers, authentication, monitoring, high availability, and troubleshooting.

What is KEDA?

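KEDA (Kubernetes Event-driven Autoscaling) is an event-driven autoscaler for Kubernetes. It scales workloads such as Deployments, StatefulSets, and Jobs based on the number of events waiting to be processed, for example messages in a queue or consumer lag on a topic, and it can scale a workload down to zero when there is nothing to process. KEDA works alongside the Horizontal Pod Autoscaler (HPA): for each ScaledObject it creates and manages an HPA and supplies it with external metrics.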

Prerequisites

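  • Kubernetes cluster (KEDA 2.12, the version installed below, is tested against Kubernetes v1.26 to v1.28)
  • kubectl CLI tool
  • Helm 3 (optional, only for the Helm installation method)
  • Metrics Server installed (needed for the CPU and memory scalers)
  • Prometheus (optional, for the Prometheus scaler and for monitoring KEDA itself)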

Installation Methods

1. Using Helm

# Add KEDA Helm repository
helm repo add kedacore https://kedacore.github.io/charts
helm repo update

# Create namespace
kubectl create namespace keda

# Install KEDA
helm install keda kedacore/keda --namespace keda --version 2.12.0

2. Using YAML Manifests

# Install KEDA (server-side apply avoids the annotation size limit that KEDA's large CRDs can hit)
kubectl apply --server-side -f https://github.com/kedacore/keda/releases/download/v2.12.0/keda-2.12.0.yaml

Verification

# Check KEDA components
kubectl get pods -n keda

# Expected output:
# NAME                                      READY   STATUS    RESTARTS   AGE
# keda-admission-webhooks-xxx-xxx           1/1     Running   0          1m
# keda-operator-xxx-xxx                     1/1     Running   0          1m
# keda-operator-metrics-apiserver-xxx-xxx   1/1     Running   0          1m

Common Scalers Configuration

1. RabbitMQ Scaler

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: rabbitmq-consumer
    kind: Deployment
  minReplicaCount: 1
  maxReplicaCount: 10
  pollingInterval: 30
  cooldownPeriod: 300
  triggers:
  - type: rabbitmq
    metadata:
      protocol: amqp
      queueName: orders
      host: amqp://rabbitmq.default.svc:5672
      mode: QueueLength   # scale on the number of messages in the queue
      value: "50"         # target number of messages per replica
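
The timing and replica fields in this ScaledObject are easy to misread, so the fragment below annotates them. It is a sketch rather than a recommended configuration: the values are the ones used above, except minReplicaCount, which is set to 0 here purely to illustrate scale-to-zero. As a rough worked example, with a target of 50 messages per replica, a backlog of 120 messages works out to ceil(120 / 50) = 3 replicas, capped at maxReplicaCount.

```yaml
# Annotated fragment of the RabbitMQ ScaledObject spec above.
# minReplicaCount: 0 is an illustrative change; the original example uses 1.
spec:
  minReplicaCount: 0     # allow KEDA to remove all replicas when the queue is empty
  maxReplicaCount: 10    # upper bound handed to the HPA that KEDA manages
  pollingInterval: 30    # seconds between checks of the queue
  cooldownPeriod: 300    # seconds to wait after the last active trigger before scaling to 0
```

Note that cooldownPeriod only governs the final step down to zero. Scaling between 1 and maxReplicaCount is handled by the HPA and its own stabilization settings.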

2. Kafka Scaler

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-scaledobject
spec:
  scaleTargetRef:
    name: kafka-consumer
    kind: Deployment
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.default.svc:9092
      consumerGroup: my-group
      topic: orders
      lagThreshold: "100"

3. Prometheus Scaler

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
spec:
  scaleTargetRef:
    name: my-app
    kind: Deployment
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc
      # metricName is deprecated in KEDA 2.x and is omitted here
      threshold: "100"
      query: sum(rate(http_requests_total{service="my-service"}[2m]))

4. CPU/Memory Scaler

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cpu-scaledobject
spec:
  scaleTargetRef:
    name: my-app
    kind: Deployment
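  triggers:
  - type: cpu
    metricType: Utilization   # percentage of the containers' CPU requests
    metadata:
      value: "50"
  - type: memory
    metricType: Utilization   # percentage of the containers' memory requests
    metadata:
      value: "70"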

Advanced Configuration

1. TriggerAuthentication

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: kafka-trigger-auth
  namespace: default
spec:
  secretTargetRef:
  - parameter: sasl
    name: kafka-secrets
    key: sasl
  - parameter: username
    name: kafka-secrets
    key: username
  - parameter: password
    name: kafka-secrets
    key: password
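
A TriggerAuthentication has no effect until a trigger references it. The sketch below shows one way to wire kafka-trigger-auth into the Kafka ScaledObject from the previous section. The Secret contents are placeholders, and the plaintext SASL mechanism is an assumption; match it to the broker's actual configuration.

```yaml
# Hypothetical Secret holding the values that kafka-trigger-auth reads.
apiVersion: v1
kind: Secret
metadata:
  name: kafka-secrets
  namespace: default
stringData:
  sasl: plaintext              # assumed SASL mechanism
  username: <kafka-username>   # placeholder
  password: <kafka-password>   # placeholder
---
# The Kafka ScaledObject from the scalers section, now pointing at the TriggerAuthentication.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: kafka-consumer
    kind: Deployment
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.default.svc:9092
      consumerGroup: my-group
      topic: orders
      lagThreshold: "100"
    authenticationRef:
      name: kafka-trigger-auth   # must live in the same namespace as the ScaledObject
```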

2. ClusterTriggerAuthentication

apiVersion: keda.sh/v1alpha1
kind: ClusterTriggerAuthentication
metadata:
  name: azure-servicebus-auth
spec:
  podIdentity:
    provider: azure

3. Scaling Jobs

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: batch-processor-job
spec:
  jobTargetRef:
    template:
      spec:
        containers:
        - name: batch-processor
          image: batch-processor:latest
        restartPolicy: Never   # Job pods must use Never or OnFailure
  pollingInterval: 30
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
  triggers:
  - type: postgresql
    metadata:
      connectionFromEnv: POSTGRESQL_CONNECTION
      query: "SELECT COUNT(*) FROM tasks WHERE status='pending'"
      targetQueryValue: "1"
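
The connectionFromEnv setting names an environment variable on the job's container, so that variable has to be defined for the trigger to work. A minimal sketch follows, assuming the connection string is stored in a Secret. The Secret name postgres-credentials and the key connection are hypothetical.

```yaml
# Fragment of the ScaledJob above: only the job template is shown.
spec:
  jobTargetRef:
    template:
      spec:
        containers:
        - name: batch-processor
          image: batch-processor:latest
          env:
          - name: POSTGRESQL_CONNECTION      # the name referenced by connectionFromEnv
            valueFrom:
              secretKeyRef:
                name: postgres-credentials   # hypothetical Secret name
                key: connection              # hypothetical key holding the connection string
        restartPolicy: Never
```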

Monitoring and Metrics

1. Prometheus ServiceMonitor

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: keda-metrics
  namespace: monitoring
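spec:
  namespaceSelector:
    matchNames:
    - keda   # the KEDA Services live in the keda namespace, not in monitoring
  selector:
    matchLabels:
      app: keda-operator
  endpoints:
  - port: metrics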
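
A ServiceMonitor only helps if the KEDA operator actually exposes Prometheus metrics. With the Helm installation this is controlled through chart values. The sketch below assumes the value names used by the kedacore/keda chart around version 2.12; it is worth confirming them with `helm show values kedacore/keda` before relying on them. The file name is hypothetical.

```yaml
# values-metrics.yaml (hypothetical file name)
# Apply with: helm upgrade keda kedacore/keda --namespace keda --version 2.12.0 -f values-metrics.yaml
prometheus:
  operator:
    enabled: true   # assumed chart value: expose the operator's Prometheus metrics endpoint
```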

2. Grafana Dashboard

apiVersion: v1
kind: ConfigMap
metadata:
  name: keda-dashboard
  namespace: monitoring
data:
  keda-dashboard.json: |
    {
      "title": "KEDA Metrics",
      "panels": [
        {
          "title": "Active Scalers",
          "type": "graph"
        },
        {
          "title": "Scaling Operations",
          "type": "graph"
        }
      ]
    }

High Availability Configuration

apiVersion: keda.sh/v1alpha1
kind: KedaController
metadata:
  name: keda
  namespace: keda
spec:
  watchNamespace: ""
  operator:
    replicaCount: 2
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - keda-operator
            topologyKey: kubernetes.io/hostname
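
The KedaController resource is typically available only when KEDA is installed through its OLM operator. For the Helm installation described earlier, a roughly equivalent setup can be sketched with chart values. The value names below are assumed from the kedacore/keda chart and should be checked with `helm show values kedacore/keda`. The file name is hypothetical.

```yaml
# values-ha.yaml (hypothetical file name)
# Apply with: helm upgrade keda kedacore/keda --namespace keda --version 2.12.0 -f values-ha.yaml
operator:
  replicaCount: 2   # one replica acts as leader, the other stands by for failover
metricsServer:
  replicaCount: 2   # assumed to be supported by the installed KEDA version
```

Running two operator replicas improves failover time rather than throughput, since only the elected leader reconciles ScaledObjects at any moment.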

Best Practices

  1. Resource Management

    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: best-practice-scaler
    spec:
      scaleTargetRef:
        name: my-app
      minReplicaCount: 1
      maxReplicaCount: 100
      pollingInterval: 30
      cooldownPeriod: 300
      advanced:
        restoreToOriginalReplicaCount: true
        horizontalPodAutoscalerConfig:
          behavior:
            scaleDown:
              stabilizationWindowSeconds: 300
              policies:
              - type: Percent
                value: 100
                periodSeconds: 15

  2. Security Configuration

    apiVersion: keda.sh/v1alpha1
    kind: TriggerAuthentication
    metadata:
      name: secure-trigger-auth
    spec:
      secretTargetRef:
      - parameter: connectionString
        name: secure-secrets
        key: connectionString

Troubleshooting

Common Issues and Solutions

  1. Scaling Not Working

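    # Check KEDA operator logs
    kubectl logs -n keda -l app=keda-operator

    # Check ScaledObject status and events
    kubectl get scaledobject -n <namespace>
    kubectl describe scaledobject <name> -n <namespace>

    # Check the HPA that KEDA manages for the ScaledObject
    kubectl get hpa -n <namespace>

  2. Authentication Issues

    # Verify TriggerAuthentication
    kubectl get triggerauthentication -n <namespace>
    kubectl describe triggerauthentication <name> -n <namespace>

    # Check that the referenced secrets exist in the same namespace
    kubectl get secrets -n <namespace>

  3. Metrics Issues

    # Verify metrics server (used by the cpu and memory scalers)
    kubectl get apiservice v1beta1.metrics.k8s.io

    # Verify the KEDA metrics adapter (used by all other scalers)
    kubectl get apiservice v1beta1.external.metrics.k8s.io

    # Check metrics
    kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"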

Integration Examples

1. AWS SQS Integration

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: aws-sqs-scaler
spec:
  scaleTargetRef:
    name: aws-sqs-consumer
  triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.region.amazonaws.com/account/queueName
      queueLength: "5"
      awsRegion: "us-east-1"

2. Azure Service Bus Integration

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: azure-servicebus-scaler
spec:
  scaleTargetRef:
    name: azure-servicebus-consumer
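  triggers:
  - type: azure-servicebus
    metadata:
      queueName: myqueue
      namespace: <service-bus-namespace>   # required when authenticating with pod identity
      messageCount: "5"
    authenticationRef:
      name: azure-servicebus-auth          # the ClusterTriggerAuthentication defined earlier
      kind: ClusterTriggerAuthentication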

Performance Tuning

1. Scaling Behavior

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: optimized-scaler
spec:
  # scaleTargetRef and triggers are omitted for brevity; both are required
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15

2. Resource Optimization

apiVersion: apps/v1
kind: Deployment
metadata:
  name: optimized-deployment
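spec:
  # Fragment: selector, pod labels, and container image are omitted for brevity
  template:
    spec:
      containers:
      - name: app
        resources:
          requests:   # needed by the cpu and memory scalers with Utilization targets
            cpu: 100m
            memory: 128Mi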
          limits:
            cpu: 500m
            memory: 512Mi

Additional Resources