Kubernetes - Resource Limits
If someone tells you that you can use a shared service without limits, they are either lying or the system will eventually collapse.
Efficient resource utilization is about properly sharing resources. Most applications are designed to use as many resources as they can get, and that doesn't play well in a distributed cluster environment. You need to build some virtual fences.
The broadest resource configuration in Kubernetes is the namespace quota. It restricts how many resources the pods in a namespace can take in total.
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-example
spec:
  hard:
    requests.cpu: 2
    requests.memory: 2Gi
    limits.cpu: 3
    limits.memory: 4Gi
```
Setting namespace quotas forces pods to specify `requests`/`limits`. Otherwise they are "unlimited" and will never get scheduled. Because of this, I find it a good practice to always specify namespace quotas so you are forced to specify something.
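If you don't want every manifest author to remember this, a `LimitRange` in the namespace can supply defaults for containers that omit `resources`. A sketch with hypothetical values and a hypothetical name:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits-example  # hypothetical name
spec:
  limits:
    - type: Container
      defaultRequest:  # applied when a container omits `requests`
        cpu: 100m
        memory: 128Mi
      default:         # applied when a container omits `limits`
        cpu: 500m
        memory: 256Mi
```

With this in place, pods without explicit `resources` still get values counted against the quota instead of being rejected by the scheduler.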
Pod Container Quotas
You specify how many resources each container will take. This is defined under the `resources` key. Remember that these limitations are per container, not per pod. Pod limitations are the sum of its containers'.
```yaml
# this goes under each entry in the `containers:` definition...
resources:
  requests:
    memory: 300Mi # <- the pod requires 300 MB of memory or it won't be scheduled
    cpu: 500m     # <- the pod requires 500 millicores (0.5 CPU core) or it won't be scheduled
  limits:
    memory: 600Mi # <- the container will be killed if it tries to use more than 600 MB of memory
    cpu: 1        # <- the container will be throttled if it tries to use more than 1 CPU core
```
- If you haven't made your program multi-core, use `limits.cpu: 1` or below.
- If `limits.memory` is not set, the container takes "as much as it can".
- If `limits.cpu` is not set, the container takes "as much as it can".
It is imperative that you set both CPU and memory `limits`. Otherwise the cluster will not work properly in the long run, and in the worst case can even crash the virtual instance nodes.
It can be hard to figure out what your `limits` should be. A decent way to figure out the values:
- Start a single-container pod.
- Start as much processing as you can.
- Record the peak vCPU and memory usage during processing.
- Stop the processing.
- Record the usage after cooling down, when idle.
- The idle values are your `requests`.
- The processing values are your `limits`; maybe add +15% memory to avoid OOM kills.
- If your memory usage can climb rapidly, e.g. up by 1 GB in a second, consider setting `requests.memory` to the same value as `limits.memory` because sudden memory bursts can strangle the host virtual machine. This is called "no memory overcommitment".
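As a worked example with hypothetical measurements: say the container idles around 250 MB and 100 millicores, and peaks at roughly 520 MB and 900 millicores while processing. Following the steps above, with about 15% memory headroom, you would end up with something like:

```yaml
resources:
  requests:
    memory: 250Mi  # idle memory usage
    cpu: 100m      # idle CPU usage
  limits:
    memory: 600Mi  # peak ~520 MB plus ~15% headroom
    cpu: 1         # peak ~900m, rounded up to one core
```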
If a node runs out of memory, Kubernetes will start looking for pods to kill. The kill order is something along the lines of:
- pods with no `requests.memory` set but whose memory usage is still under their `limits.memory` (which can be infinite if not defined)
- if there are multiple candidates, rank them by priority and terminate the lowest-priority pod
- if there are multiple with the same lowest priority, terminate the one that is most over what it requested
- if none of the previous steps cause anything to be terminated, start killing any workloads so system components don't get stuck
`kubectl describe pod` lets you know if a container was out-of-memory killed. It will look something like this:
```
Last State:  Terminated
  Reason:    OOMKilled
  Exit Code: 137  # <- tried to use more memory than its `limits`
```
You should monitor your node CPU and memory usage. If memory jumps close to 100% from time to time, some processes will die midway. If CPU jumps close to 100% from time to time, some processes will be slowed. If you have issues, remember to inspect CPU and memory usage by container, not by pod.
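A quick way to do this inspection from the command line, assuming the metrics-server is installed in the cluster (the pod and namespace names are placeholders):

```
# Node-level view: is total usage creeping toward capacity?
kubectl top node

# Per-container usage; the --containers flag splits the numbers
# so you see each container instead of the pod total
kubectl top pod my-pod --containers -n my-namespace
```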
Watch out if your total CPU/memory `limits` go over 100% of the node capacity. A sudden surge in memory usage can even crash the node and bring down all the pods in it. Going over 100% of memory capacity is called "overcommit". It's safer to specify limits such that this won't happen, e.g. by making the memory request and limit equal.
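A sketch of that equal-request-and-limit pattern, with hypothetical values. Since the scheduler places pods by their requests, a node can never promise more memory than the pods are actually allowed to use:

```yaml
resources:
  requests:
    memory: 600Mi  # equal to the limit -> no memory overcommitment
    cpu: 500m
  limits:
    memory: 600Mi
    cpu: 1
```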