Kubernetes – Set max pod (replica) limit per node

We need to run N pods of service A per node, and no more. When an (N+1)th pod of A is created, it should fail to schedule on that node for lack of capacity, so that Cluster Autoscaler adds a new node.

A similar use case is described in the Kubernetes GitHub repo issue "add a new predicate: max replicas limit per node" (Issue #71930).

Since Kubernetes 1.16, it seems Pod Topology Spread Constraints can be used to address this.

You can use topology spread constraints to control how Pods are spread across your cluster among failure-domains such as regions, zones, nodes, and other user-defined topology domains. This can help to achieve high availability as well as efficient resource utilization.
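
As a rough sketch of what this looks like on a cluster that supports the feature, the Pod spec fragment below asks the scheduler to keep the number of matching Pods on each node within a skew of 1. The app: service-a label is a hypothetical name chosen for illustration. Note that maxSkew bounds the difference between nodes rather than setting a hard per-node maximum, so it evens out the spread but may not by itself enforce exactly N Pods per node.

```yaml
# Pod spec fragment (goes under spec.template.spec in a Deployment).
topologySpreadConstraints:
  - maxSkew: 1                          # max allowed difference in matching Pod count between nodes
    topologyKey: kubernetes.io/hostname # treat each node as its own topology domain
    whenUnsatisfiable: DoNotSchedule    # leave the Pod Pending instead of violating the constraint
    labelSelector:
      matchLabels:
        app: service-a                  # hypothetical label; must match the Pod's own labels
```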

Unfortunately, I am working on an old version of Kubernetes (v1.12), so this problem needs a workaround.

The workaround I found uses the Kubernetes Guaranteed QoS class: each pod gets equal CPU requests and limits, sized so that only N pods fit on a node.

When Kubernetes creates a Pod it assigns one of these QoS classes to the Pod:

  • Guaranteed
  • Burstable
  • BestEffort

Kubernetes QoS class

For a Pod to be given a QoS class of Guaranteed, every container in it must have both a limit and a request set for CPU and memory, and the two values must be the same (the request is optional: if only the limit is set, the request defaults to it). These pods are treated as high priority: they are not killed unless they exceed their limits, or the node is under memory pressure and there are no lower-priority pods left to terminate.

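For example, a t3.xlarge node has 4 CPUs and 16 GB of memory. To fit 3 pods per node, set each pod's CPU request and limit to the same value, 1 CPU; the remaining 1 CPU is left for Kubernetes system pods and logging pods such as Fluentd, … Because the scheduler places pods according to their requests, a fourth pod requesting a full CPU does not fit alongside them, so it stays Pending until Cluster Autoscaler adds a new node.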

Pod manifest

containers:
  - name: pong
    resources:
      # requests equal to limits gives the Pod the Guaranteed QoS class
      limits:
        cpu: 1
        memory: 1Gi
      requests:
        cpu: 1
        memory: 1Gi
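
To show where that fragment sits, here is a minimal sketch of a complete Deployment, assuming the apps/v1 API. The Deployment name, the app label, the image, and the replica count are placeholders for illustration, not values from the original setup.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pong                             # placeholder name
spec:
  replicas: 6                            # if 3 Pods fit per node as in the example, this needs 2 nodes
  selector:
    matchLabels:
      app: pong                          # hypothetical label
  template:
    metadata:
      labels:
        app: pong
    spec:
      containers:
        - name: pong
          image: example.com/pong:latest # placeholder image
          resources:
            limits:
              cpu: 1
              memory: 1Gi
            requests:                    # equal to limits, so the Pod is Guaranteed
              cpu: 1
              memory: 1Gi
```

Scaling replicas beyond what the current nodes can hold should leave the extra Pods Pending, which is the signal Cluster Autoscaler reacts to.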

This workaround works like a charm for my case.