So far, we have extensively covered KEDA and its ability to scale Kubernetes pods using various metrics:
Horizontal Pod Autoscaling with Kubernetes Event-Driven Autoscaler (KEDA)
Scaling Amazon Elastic Kubernetes Service Workloads with KEDA and Amazon CloudWatch
Scaling Amazon Elastic Kubernetes Service Workloads with KEDA and SQS
Scaling Amazon Elastic Kubernetes Service Workloads with KEDA, OTEL Collector, and Amazon CloudWatch
Scaling Kubernetes Pods Based on HTTP Traffic using KEDA HTTP Add-on
Today, we will explore the Cron scaler, which can be combined with the previously mentioned methods to enhance our scaling strategies.
Here's the scenario: our applications already autoscale based on metrics such as requests or messages. Every morning, traffic climbs quickly to its regular level as normal business operations begin. With the current setup, the autoscaler needs time to ramp up to the ideal number of pods, and during this period users might experience delays in the applications.
One solution could be to increase the minimum number of pods in our setup. However, this would mean maintaining that number even outside business hours. The other option is to keep a higher minimum number of pods only during business hours, which can be achieved by adding a cron trigger to our ScaledObject definition:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: myconsumer-scaler
spec:
  scaleTargetRef:
    name: myconsumer-deployment
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: MyQueue
        queueLength: '10'
      authenticationRef:
        name: myconsumer-trigger-authentication
    - type: cron
      metadata:
        timezone: America/Bogota
        start: 0 8 * * *
        end: 0 20 * * *
        desiredReplicas: "5"
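As a rough illustration of how the two triggers in this definition combine, the Python sketch below models the replica count the workload would settle on. It is a simplification under stated assumptions, not KEDA's actual code: the function names are hypothetical, the queue trigger is modeled as one pod per 10 messages (rounded up), the cron trigger is modeled as contributing no demand outside its window, and a fixed UTC-5 offset stands in for the America/Bogota timezone.

```python
import math
from datetime import datetime, time, timedelta, timezone

# Assumption: a fixed UTC-5 offset stands in for America/Bogota (no DST).
BOGOTA = timezone(timedelta(hours=-5))

# Values copied from the ScaledObject definition.
MIN_REPLICAS = 1          # minReplicaCount
MAX_REPLICAS = 10         # maxReplicaCount
QUEUE_LENGTH_TARGET = 10  # queueLength: target messages per pod
CRON_START = time(8, 0)   # start: 0 8 * * *
CRON_END = time(20, 0)    # end: 0 20 * * *
CRON_DESIRED = 5          # desiredReplicas


def queue_replicas(messages: int) -> int:
    """Pods wanted by the queue trigger: one per QUEUE_LENGTH_TARGET messages."""
    return math.ceil(messages / QUEUE_LENGTH_TARGET)


def cron_replicas(now: datetime) -> int:
    """Pods wanted by the cron trigger; modeled as 0 outside the window."""
    local = now.astimezone(BOGOTA).time()
    return CRON_DESIRED if CRON_START <= local < CRON_END else 0


def effective_replicas(messages: int, now: datetime) -> int:
    """Take the highest demand across triggers, then clamp to min/max."""
    wanted = max(queue_replicas(messages), cron_replicas(now))
    return min(MAX_REPLICAS, max(MIN_REPLICAS, wanted))


if __name__ == "__main__":
    morning = datetime(2024, 1, 15, 9, 0, tzinfo=BOGOTA)
    night = datetime(2024, 1, 15, 22, 0, tzinfo=BOGOTA)
    print(effective_replicas(0, morning))   # 5: cron floor during business hours
    print(effective_replicas(80, morning))  # 8: queue demand exceeds the cron floor
    print(effective_replicas(0, night))     # 1: falls back to minReplicaCount
```

The cron trigger behaves like a time-boxed floor: it never caps the queue trigger, it only raises the lowest replica count the workload can settle on during the window.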
When a ScaledObject has multiple triggers, KEDA starts scaling as soon as any one of them becomes active. A desired replica count is calculated for each trigger, and the highest one is used to scale the workload. In the given example, from 8 am to 8 pm (America/Bogota time), the cron trigger keeps at least 5 pods running. If the workload demands more pods, the azure-servicebus trigger can scale the deployment up to the maxReplicaCount of 10. Outside that window, the deployment can fall back to the minReplicaCount of 1 when the queue is quiet. This approach improves the user experience by reducing latency at the start of the day, and it avoids the cost of keeping 5 pods running around the clock.
Thank you, and happy coding.