Kubernetes Cluster Over-Provisioning: Proactive App Scaling

The need for cluster over-provisioning

Let’s say we have a Kubernetes cluster with X worker nodes, all running at full capacity, so no node has room for incoming pods. So far everything works fine: the applications running on the cluster respond in time, and we scale them based on memory/CPU usage. Then the load on an application suddenly increases. Its memory/CPU consumption rises with the load, and once the metric used for scaling crosses its threshold, Kubernetes scales the application horizontally by adding new pods. But all of the newly created pods go into a pending state, because every worker node is already running at full capacity, and they stay pending until a new node joins the cluster.
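To make the scaling trigger concrete, here is a minimal sketch of the replica calculation documented for the Horizontal Pod Autoscaler. The numbers are hypothetical, and the sketch ignores the autoscaler's tolerance and stabilization settings.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Replica count the autoscaler aims for: scale in proportion to metric / target."""
    return math.ceil(current_replicas * current_metric / target_metric)


# Hypothetical example: 4 pods averaging 90% CPU against a 60% target.
extra_pods = desired_replicas(4, 90, 60) - 4
print(f"pods to add: {extra_pods}")
```

If the cluster has no free capacity, every one of those extra pods ends up pending.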

To overcome this problem, we considered two solutions:

  1. Fix the number of extra nodes in the cluster
  2. Make use of pod-priority-preemption

Fix the number of extra nodes in the cluster

One idea was to keep a fixed number of extra nodes in the cluster, always available and ready to accept new pods. This way, our apps could always scale up without waiting for AWS to create a new EC2 instance and join it to the Kubernetes cluster, and new pods would not get stuck in a pending state. That was exactly what we wanted.

Make use of pod-priority-preemption

The second solution was difficult to implement: we run multiple applications that are sensitive to scaling delays, so it was hard to decide which of them should get the higher pod priority. Any delay in scaling an application can degrade its performance.

Introduction to Horizontal Cluster Proportional Autoscaler

Scaling with over-provisioning: what happens under the hood

  1. Load hits the cluster.
  2. Kubernetes starts scaling the application horizontally by adding new pods.
  3. Kube-scheduler tries to place the newly created application pods but finds insufficient resources.
  4. The placeholder pods (pause pods in our case) get evicted because they have a lower priority, which frees room for the application pods.
  5. The application pods get placed, and scaling happens immediately without any delay.
  6. The evicted placeholder pods go into a pending state and cannot be scheduled due to insufficient resources.
  7. Cluster Autoscaler notices the pending pods and scales the cluster by adding new nodes.
  8. Kube-scheduler waits for the new instance to be provisioned, boot, join the cluster, and become ready.
  9. Kube-scheduler notices the new node in the cluster and schedules the placeholder pods on it.
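The steps above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: one resource (CPU), a hypothetical `schedule` helper that is not a Kubernetes API, and evicted pods that are simply re-queued, whereas in a real cluster their Deployment recreates them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pod:
    name: str
    cpu: int       # requested CPU in millicores
    priority: int  # higher value wins during preemption


@dataclass
class Node:
    name: str
    capacity: int  # allocatable CPU in millicores
    pods: List[Pod] = field(default_factory=list)

    def free(self) -> int:
        return self.capacity - sum(p.cpu for p in self.pods)


def schedule(pod: Pod, nodes: List[Node], pending: List[Pod]) -> Optional[str]:
    """Place the pod, evicting lower-priority pods if no node has room."""
    for node in nodes:  # first, look for a node with enough free CPU
        if node.free() >= pod.cpu:
            node.pods.append(pod)
            return node.name
    for node in nodes:  # otherwise, try preempting lower-priority pods
        victims = sorted((p for p in node.pods if p.priority < pod.priority),
                         key=lambda p: p.priority)
        freed, chosen = node.free(), []
        for victim in victims:
            if freed >= pod.cpu:
                break
            chosen.append(victim)
            freed += victim.cpu
        if freed >= pod.cpu:
            for victim in chosen:
                node.pods.remove(victim)
                pending.append(victim)  # evicted pods wait for capacity
            node.pods.append(pod)
            return node.name
    pending.append(pod)  # nothing fits: the pod stays pending
    return None


# A full node: one application pod plus one placeholder pod.
nodes = [Node("node-1", 2000)]
pending: List[Pod] = []
schedule(Pod("app-1", 1000, 0), nodes, pending)
schedule(Pod("placeholder-1", 1000, -1), nodes, pending)

# Steps 1-5: load arrives and the new application pod preempts the placeholder.
placed_on = schedule(Pod("app-2", 1000, 0), nodes, pending)
# Step 6: the placeholder is now pending.
pending_after_preemption = [p.name for p in pending]

# Steps 7-8: the pending pod triggers a scale-up and a new node joins.
nodes.append(Node("node-2", 2000))
# Step 9: the pending placeholder is scheduled on the new node.
for pod in list(pending):
    pending.remove(pod)
    schedule(pod, nodes, pending)

print(placed_on, pending_after_preemption, [p.name for p in nodes[1].pods])
```

The application pod never waits for a node: the only pod that waits is the placeholder, which does no real work.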


Note: Set the Cluster Autoscaler flag --expendable-pods-priority-cutoff to -10 so that the pause pods are taken into account during scale-up and scale-down.
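Why the cutoff must sit below the placeholder priority can be shown with a small sketch. It assumes the rule described in the Cluster Autoscaler FAQ: pods whose priority is lower than the cutoff are treated as expendable, meaning they neither trigger a scale-up nor block a scale-down. The function name is hypothetical, and the cutoff of 0 is included only for comparison.

```python
def is_expendable(pod_priority: int, cutoff: int) -> bool:
    """Pods with a priority below the cutoff are ignored by Cluster Autoscaler."""
    return pod_priority < cutoff


PLACEHOLDER_PRIORITY = -1  # priority of the pause pods in this article

# With the cutoff at -10, pending pause pods still trigger a scale-up.
print(is_expendable(PLACEHOLDER_PRIORITY, cutoff=-10))
# With a cutoff of 0 (comparison only), the pause pods would be ignored.
print(is_expendable(PLACEHOLDER_PRIORITY, cutoff=0))
```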

The setup consists of the following Kubernetes objects:

  1. Placeholder-pod (pause-pod) Deployment
  2. Cluster proportional autoscaler Deployment
  3. ServiceAccount
  4. ClusterRole
  5. ClusterRoleBinding
  6. ConfigMap
  7. PriorityClass for the pause pods with a priority value of -1
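How many placeholder pods the cluster proportional autoscaler keeps running depends on the parameters in its ConfigMap. Below is a minimal sketch of the linear mode as described in the project's README, assuming both `coresPerReplica` and `nodesPerReplica` are set. The function and the parameter values are illustrative, not taken from our configuration.

```python
import math


def linear_replicas(nodes: int, cores: int,
                    nodes_per_replica: float, cores_per_replica: float,
                    min_replicas: int, max_replicas: int) -> int:
    """Replica count in linear mode: grows with cluster size, clamped to [min, max]."""
    by_cores = math.ceil(cores / cores_per_replica)
    by_nodes = math.ceil(nodes / nodes_per_replica)
    replicas = max(by_cores, by_nodes)
    replicas = min(replicas, max_replicas)
    return max(replicas, min_replicas)


# Hypothetical cluster: 10 nodes with 40 cores in total.
placeholders = linear_replicas(nodes=10, cores=40,
                               nodes_per_replica=5, cores_per_replica=16,
                               min_replicas=1, max_replicas=50)
print(f"placeholder replicas: {placeholders}")
```

Because the replica count follows the cluster size, the amount of spare capacity grows as the cluster grows instead of staying fixed.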


Cluster over-provisioning is needed when the cluster runs applications that are sensitive to scaling delays and cannot wait for new nodes to be created and join the cluster before they scale.


  1. https://github.com/kubernetes-sigs/cluster-proportional-autoscaler
  2. https://github.com/kubernetes-sigs/cluster-proportional-autoscaler#horizontal-cluster-proportional-autoscaler-container
  3. https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-can-i-configure-overprovisioning-with-cluster-autoscaler
  4. https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
  5. https://www.ianlewis.org/en/almighty-pause-container



Ajit Vedpathak

I'm Ajit Vedpathak, a dynamic IT professional + loves challenging assignments and leadership challenges + loves playing volleyball.