Runbook: keda_scaled_object_errors
This runbook covers steps to investigate and remediate KEDA ScaledObject errors. The alert fires when the ScaledObject error count is greater than 0 for more than 5 minutes.
Impact
Autoscaling for the affected workload may stall. While the ScaledObject cannot reconcile, the HPA that KEDA manages for it may not be created or updated, so the workload may stay at its current replica count regardless of load.
Steps
1. Get admin permissions via Escalator
Request admin access through Escalator:
https://escalator.marqo-staging.com/
2. Copy admin credentials to local terminal
Copy the admin credentials from Escalator and export them in your terminal.
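A minimal sketch of the export step, assuming Escalator hands out standard AWS credentials with a session token. The variable names are the ones the AWS CLI reads; the placeholder values and the `missing_aws_creds` helper are hypothetical and not part of Escalator.

```shell
# Placeholders: paste the real values copied from Escalator.
# Assumption: Escalator issues temporary credentials that include a session token.
export AWS_ACCESS_KEY_ID="<access-key-id>"
export AWS_SECRET_ACCESS_KEY="<secret-access-key>"
export AWS_SESSION_TOKEN="<session-token>"

# Hypothetical helper: print the name of every credential variable that is
# unset or empty, so a partial paste is caught before running aws or kubectl.
missing_aws_creds() {
  for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN; do
    eval "value=\${$var:-}"
    [ -n "$value" ] || echo "$var"
  done
}
```

If `missing_aws_creds` prints nothing, `aws sts get-caller-identity` can then confirm which identity the terminal is using.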
3. Get EKS cluster credentials
aws eks update-kubeconfig --region us-east-1 --name cell2-MultitenantEKSCluster
4. Check the ScaledObject status
kubectl get scaledobject <scaled-object-name> -n <namespace> -o yaml
Look at the status section for error conditions and messages.
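The full YAML can be long, so the sketch below prints only the status conditions, one per line. The function name `so_conditions` is hypothetical; the jsonpath assumes the ScaledObject exposes `status.conditions` entries with `type`, `status`, `reason`, and `message` fields.

```shell
# Print one tab-separated line per condition: TYPE, STATUS, REASON, MESSAGE.
# Usage: so_conditions <scaled-object-name> <namespace>
so_conditions() {
  kubectl get scaledobject "$1" -n "$2" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
}
```

A `Ready` condition whose status is not `True` is the usual place to start reading.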
5. Check KEDA operator logs
kubectl logs -n keda -l app=keda-operator --tail=300 | grep -i "<scaled-object-name>\|error"
Look for:
- ScaleTarget not found (the deployment/statefulset referenced doesn't exist)
- HPA creation/update failures
- Trigger configuration errors
- CRD validation errors
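The `\|` alternation in the grep pattern above is not part of POSIX basic regular expressions, so it may not behave the same way in every grep. A portable sketch of the same filter follows; the function name `filter_keda_logs` is hypothetical, and the ScaledObject name is matched as a literal string rather than a regular expression.

```shell
# Keep log lines that mention the ScaledObject name or the word "error",
# ignoring case. -F treats both patterns as fixed strings.
# Usage: kubectl logs -n keda -l app=keda-operator --tail=300 | filter_keda_logs <scaled-object-name>
filter_keda_logs() {
  grep -i -F -e "$1" -e 'error'
}
```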
6. Verify the scale target exists
kubectl get deployment <target-name> -n <namespace>
If the target is a statefulset:
kubectl get statefulset <target-name> -n <namespace>
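If it is unclear which workload the ScaledObject references, the target can be read from its spec instead of guessed. This is a sketch under the assumption that the target is declared in `spec.scaleTargetRef`; the function name `so_target` is hypothetical.

```shell
# Print "<kind>/<name>" for the workload the ScaledObject points at.
# Usage: so_target <scaled-object-name> <namespace>
so_target() {
  kind="$(kubectl get scaledobject "$1" -n "$2" -o jsonpath='{.spec.scaleTargetRef.kind}')"
  name="$(kubectl get scaledobject "$1" -n "$2" -o jsonpath='{.spec.scaleTargetRef.name}')"
  # KEDA treats an omitted kind as Deployment; lowercase it for kubectl.
  kind="$(printf '%s' "${kind:-Deployment}" | tr '[:upper:]' '[:lower:]')"
  echo "${kind}/${name}"
}
```

The output can be passed straight to kubectl, for example `kubectl get "$(so_target <scaled-object-name> <namespace>)" -n <namespace>`.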
7. Check the associated HPA
kubectl get hpa -n <namespace> | grep <scaled-object-name>
kubectl describe hpa <hpa-name> -n <namespace>
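To spot a conflicting HPA, it can help to list every HPA in the namespace next to the workload it targets. The sketch below does that and flags targets claimed more than once. The function names are hypothetical, and it assumes KEDA's default naming, where the generated HPA is called `keda-hpa-<scaled-object-name>`.

```shell
# List every HPA in the namespace as "<hpa-name><TAB><kind>/<target-name>".
# Usage: hpa_targets <namespace>
hpa_targets() {
  kubectl get hpa -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.scaleTargetRef.kind}{"/"}{.spec.scaleTargetRef.name}{"\n"}{end}'
}

# Print each target that appears on more than one line of hpa_targets output.
# Usage: hpa_targets <namespace> | duplicate_targets
duplicate_targets() {
  cut -f2 | sort | uniq -d
}
```

Any target printed by `duplicate_targets` has more than one HPA pointing at it; the one not prefixed with `keda-hpa-` is the likely conflict.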
8. Remediate
Depending on the root cause:
- If scale target is missing: The referenced deployment/statefulset needs to be created or the ScaledObject updated to reference the correct target.
- If HPA conflicts: Another HPA may already exist for the same target. Remove the conflicting HPA.
- If trigger errors: Fix the ScaledObject trigger configuration (see the keda_scaler_errors runbook).
- If KEDA operator issues: Restart the operator:
kubectl rollout restart deployment keda-operator -n keda
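After a restart, it is worth confirming that the operator came back and that the ScaledObject reconciled. A minimal sketch, assuming the operator deployment is `keda-operator` in the `keda` namespace as above; the function name `verify_recovery` and the 120-second timeout are hypothetical choices.

```shell
# Wait for the restarted operator to become available, then print the status
# of the ScaledObject's Ready condition.
# Usage: verify_recovery <scaled-object-name> <namespace>
verify_recovery() {
  kubectl rollout status deployment keda-operator -n keda --timeout=120s || return 1
  kubectl get scaledobject "$1" -n "$2" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}
```

If the Ready status is still not `True` a few minutes after the rollout completes, the operator was probably not the root cause, and the earlier steps should be revisited.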