Runbook: keda_scaled_object_errors

This runbook covers steps to investigate and remediate KEDA ScaledObject errors. The alert fires when the ScaledObject error count is greater than 0 for more than 5 minutes.

Impact

Autoscaling for the affected workload may be degraded. While the ScaledObject fails to reconcile, the HPA managed by KEDA may not be created or updated, so the workload may stay at its current replica count regardless of load.

Steps

1. Get admin permissions via Escalator

Request admin access through Escalator:

https://escalator.marqo-staging.com/

2. Copy admin credentials to local terminal

Copy the admin credentials from Escalator and export them in your terminal.
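Before moving on, it can help to confirm that the credentials are actually exported. The sketch below assumes Escalator issues temporary AWS credentials as the three standard AWS environment variables; this runbook does not state the exact format, so treat that as an assumption. The function name check_aws_env is illustrative.

```shell
# Hypothetical helper: report which standard AWS credential variables are unset.
# Assumes Escalator issues temporary AWS credentials (not confirmed by this runbook).
check_aws_env() {
  missing=0
  for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN; do
    # eval reads the variable whose name is stored in $var (POSIX-compatible)
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing: $var"
      missing=1
    fi
  done
  # Exit status 0 when all three are set, 1 otherwise
  return "$missing"
}
```

If check_aws_env reports nothing, running aws sts get-caller-identity should show the assumed admin identity.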

3. Get EKS cluster credentials

aws eks update-kubeconfig --region us-east-1 --name cell2-MultitenantEKSCluster

4. Check the ScaledObject status

kubectl get scaledobject <scaled-object-name> -n <namespace> -o yaml

Look at the status section for error conditions and messages.
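The full YAML can be long. As a sketch, the status conditions alone can be printed one per line with a JSONPath template. The function name scaledobject_conditions is illustrative, not part of KEDA.

```shell
# Print one line per status condition: type, status, reason, message (tab-separated).
# Usage: scaledobject_conditions <scaled-object-name> <namespace>
scaledobject_conditions() {
  kubectl get scaledobject "$1" -n "$2" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
}
```

A healthy ScaledObject normally reports a Ready condition with status True. For a failing one, the reason and message columns usually point at the cause.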

5. Check KEDA operator logs

kubectl logs -n keda -l app=keda-operator --tail=300 | grep -iE "<scaled-object-name>|error"

Look for:

ScaleTarget not found (the deployment/statefulset referenced doesn't exist)
HPA creation/update failures
Trigger configuration errors
CRD validation errors
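The log filter from this step can be wrapped in a small function so that the ScaledObject name is matched as a literal string rather than as a regular expression (names containing dots would otherwise match more than intended). This is a minimal sketch, and the function name filter_keda_logs is illustrative.

```shell
# Keep only log lines that mention the ScaledObject name or the word "error".
# -i: case-insensitive, -F: patterns are literal strings, -e: one flag per pattern.
# Usage: kubectl logs -n keda -l app=keda-operator --tail=300 | filter_keda_logs <scaled-object-name>
filter_keda_logs() {
  grep -iF -e "$1" -e "error"
}
```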

6. Verify the scale target exists

kubectl get deployment <target-name> -n <namespace>

If the target is a statefulset:

kubectl get statefulset <target-name> -n <namespace>
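If it is unclear which workload the ScaledObject points at, the target can be read from its spec instead of guessed. The sketch below reads spec.scaleTargetRef. KEDA treats an omitted kind as Deployment, so the function falls back to that. The function name scale_target is illustrative.

```shell
# Print "<kind> <name>" of the workload a ScaledObject points at.
# KEDA treats an omitted kind as Deployment, so fall back to that.
# Usage: scale_target <scaled-object-name> <namespace>
scale_target() {
  kind=$(kubectl get scaledobject "$1" -n "$2" -o jsonpath='{.spec.scaleTargetRef.kind}')
  name=$(kubectl get scaledobject "$1" -n "$2" -o jsonpath='{.spec.scaleTargetRef.name}')
  echo "${kind:-Deployment} $name"
}
```

The printed kind and name can then be passed to kubectl get in the same namespace to confirm that the target exists.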

7. Check the associated HPA

kubectl get hpa -n <namespace> | grep <scaled-object-name>
kubectl describe hpa <hpa-name> -n <namespace>
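The HPA name does not have to be found by grep. As a sketch, it can be read from the ScaledObject's status.hpaName field. When that field is empty, the function below falls back to KEDA's default naming, keda-hpa-<scaled-object-name>, which only holds if no custom HPA name is configured on the ScaledObject. The function name hpa_name is illustrative.

```shell
# Print the HPA name recorded in the ScaledObject status.
# If status.hpaName is empty, assume KEDA's default name: keda-hpa-<scaled-object-name>.
# Usage: hpa_name <scaled-object-name> <namespace>
hpa_name() {
  name=$(kubectl get scaledobject "$1" -n "$2" -o jsonpath='{.status.hpaName}')
  echo "${name:-keda-hpa-$1}"
}
```

The result can be passed straight to kubectl describe hpa in the same namespace.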

8. Remediate

Depending on the root cause:

If the scale target is missing: Create the referenced deployment/statefulset, or update the ScaledObject to reference the correct target.
If there is an HPA conflict: Another HPA may already exist for the same target. Remove the conflicting HPA so that only the KEDA-managed HPA controls the workload.
If there are trigger errors: Fix the ScaledObject trigger configuration (see the keda_scaler_errors runbook).
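For the HPA conflict case, one way to spot a competing HPA is to list every HPA in the namespace together with the workload it targets and keep only those pointing at the affected target. This is a minimal sketch, and the function name hpas_for_target is illustrative.

```shell
# List HPAs in a namespace that target the given workload name.
# Output columns: HPA name, target kind, target name (tab-separated).
# More than one line of output suggests two HPAs are competing for the same target.
# Usage: hpas_for_target <target-name> <namespace>
hpas_for_target() {
  kubectl get hpa -n "$2" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.scaleTargetRef.kind}{"\t"}{.spec.scaleTargetRef.name}{"\n"}{end}' \
    | awk -F '\t' -v target="$1" '$3 == target'
}
```

Any HPA in the output that is not the one managed by KEDA is a candidate for removal. Confirm with the owning team before deleting it.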

If the KEDA operator itself is unhealthy: Restart the operator:

kubectl rollout restart deployment keda-operator -n keda
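After a restart, it is worth waiting for the rollout to finish before re-checking the ScaledObject. The sketch below wraps kubectl rollout status. The 120s default timeout is an arbitrary assumption, and the function name wait_for_keda_operator is illustrative.

```shell
# Block until the restarted operator is fully rolled out, or fail after the timeout.
# Usage: wait_for_keda_operator [timeout]   (default: 120s, an arbitrary choice)
wait_for_keda_operator() {
  kubectl rollout status deployment keda-operator -n keda --timeout="${1:-120s}"
}
```

Once the rollout completes, repeat step 4 to confirm that the ScaledObject no longer reports errors.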
