Pausing Autoscaling in KEDA

KEDA allows you to pause autoscaling of workloads temporarily, which is useful for:

  • Cluster maintenance.
  • Avoiding resource starvation by scaling down non-critical workloads.

Procedure

Immediate Pause with Current Replicas

Add the following annotation to your ScaledObject definition to pause scaling without changing the current replica count:

metadata:
  annotations:
    autoscaling.keda.sh/paused: "true"

Pause After Scaling to a Specific Replica Count

Use this annotation to scale the workload to a specific number of replicas and then pause:

metadata:
  annotations:
    autoscaling.keda.sh/paused-replicas: "<number>"

Behavior When Both Annotations are Set

If both paused and paused-replicas are specified:

  • KEDA scales the workload to the value defined in paused-replicas.
  • Autoscaling is paused afterward.

Unpausing Autoscaling

To resume autoscaling:

  • Remove both paused and paused-replicas annotations from the ScaledObject.
  • If only paused: "true" was used, set it to false:
    metadata:
      annotations:
        autoscaling.keda.sh/paused: "false"

Scaling to Zero

Example ScaledObject Configuration:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-scaledobject
  namespace: <your-namespace>
  annotations:
    autoscaling.keda.sh/paused-replicas: "0"  # Scale to 0 replicas and pause

Verification

To verify that the ScaledObject has scaled to zero, you can check the number of replicas of the target deployment:

kubectl get deployment <your-deployment> -n <your-namespace>

Or you can check the number of pods in the target deployment:

kubectl get pods -n <your-namespace> -l <your-deployment-label-key>=<your-deployment-label-value>

The number of pods should be zero, indicating that the deployment has scaled to zero.