☸️ Kubernetes - Resource Limits
If someone tells you that you can use a shared service without limits, they are either lying or the system will eventually collapse.
Efficient resource utilization is about sharing resources properly. Most applications are designed to use as much of the available resources as possible, and this doesn't play well in a distributed cluster environment. You need to build some virtual fences.
Namespace Quotas
The most global resource configuration in Kubernetes is the namespace quota. It restricts how many resources the pods in a namespace can take in total.
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-example
spec:
  hard:
    requests.cpu: 2
    requests.memory: 2Gi
    limits.cpu: 3
    limits.memory: 4Gi
```
Setting namespace quotas forces pods to specify requests/limits; otherwise they count as "unlimited" and won't ever get scheduled in a quota-restricted namespace. Because of this, I find it a good practice to always specify namespace quotas so you are forced to specify something.
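To apply the quota and see how much of it is in use, something like the following works; the file and namespace names are just examples:

```sh
# apply the quota into the namespace it should restrict
kubectl apply -f mem-cpu-example.yaml --namespace=team-a

# show the quota's hard limits and how much of them is currently used
kubectl describe resourcequota mem-cpu-example --namespace=team-a
```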
Pod Container Quotas
You specify how many resources each container can take. This is defined under the `resources` key. Remember that these limitations are per container, not per pod. A pod's limitations are the sum of its containers'.
```yaml
# this is somewhere under a `containers:` definition...
resources:
  requests:
    memory: 300Mi # <- the container requires 300 MiB of memory or the pod won't be scheduled
    cpu: 500m     # <- the container requires 500 millicores (0.5 CPU core) or the pod won't be scheduled
  limits:
    memory: 600Mi # <- the container will be killed if it tries to use more than 600 MiB of memory
    cpu: 1        # <- the container will be throttled if it tries to use more than 1 CPU core
```
- If your program can't make use of multiple cores, use `limits.cpu: 1` or below.
- If `limits.memory` is not set, the container takes "as much as it can".
- If `limits.cpu` is not set, the container takes "as much as it can".
It is imperative that you set both CPU and memory `requests` and `limits`. Otherwise the cluster will not work properly in the long run, and in the worst case it can even crash the virtual instance nodes.
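For context, here is a minimal sketch of a complete single-container pod with both CPU and memory requests and limits set; the pod name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-limited-example   # placeholder name
spec:
  containers:
    - name: app
      image: nginx:1.25            # placeholder image
      resources:
        requests:
          memory: 300Mi
          cpu: 500m
        limits:
          memory: 600Mi
          cpu: 1
```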
A decent way to figure out the values: it can be hard to know what your `requests` and `limits` should be, so here is one approach (the `kubectl top` sketch after the list shows how to record the numbers):
- Start a single container pod.
- Start as much processing as you can.
- Record the peak vCPU and memory usage when processing.
- Stop the processing.
- Record usage after cooling down and when idle.
- The idle values are your `requests`.
- The processing values are your `limits`; maybe add +15% memory to avoid OOM kills.
- If your memory usage can climb rapidly, e.g. up by 1 GB in a second for a while, consider setting `requests.memory` and `limits.memory` to the same value, because sudden memory bursts can strangle the host virtual machine. This is called "no memory overcommitment".
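For the recording steps, `kubectl top` is usually enough, assuming the metrics-server is installed in the cluster; the pod name is a placeholder:

```sh
# run this repeatedly while the pod is processing to catch the peak,
# and again after it has cooled down to get the idle values
kubectl top pod my-batch-pod --containers
```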
If a node runs out of memory, Kubernetes will start looking for pods to kill. The kill order is something along these lines:
- pods with no `requests.memory` set, or with `requests.memory` set but memory usage still under their `limits.memory` (which can be infinite if not defined)
- if there are multiple found, rank these according to priority and terminate the lowest-priority pod (see the PriorityClass sketch after this list)
- if there are multiple with the same low priority, terminate the one that is most over what it requested
- if none of the previous steps cause anything to be terminated, start killing any workloads so system components don't get stuck
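Priority here means pod priority. A minimal sketch of defining one, with placeholder names:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: batch-low          # placeholder name
value: 1000                # higher value = higher priority, terminated later
globalDefault: false
description: "Low-priority batch workloads that are safe to kill first."
```

Pods opt in by setting `priorityClassName: batch-low` in their spec.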
`kubectl describe pod` tells if a container was out-of-memory killed. It will look something like this:

```
Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137   # <- tried to use more memory than its `limits.memory`
```
You should monitor your node CPU and memory usage. If memory jumps close to 100% from time to time, some processes will die midway. If CPU jumps close to 100% from time to time, some processes will be slowed. If you have issues, remember to inspect CPU and memory usage by container, not by pod.
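Without a full monitoring stack, a quick spot check is possible with `kubectl top`, again assuming the metrics-server is installed:

```sh
# current node CPU/memory usage, both absolute and as a percentage of capacity
kubectl top node
```

Per-container usage is the same `kubectl top pod <pod> --containers` command shown earlier.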
Keep track of whether your total CPU/memory `limits` go over 100% of the node capacity. A sudden surge in memory usage can even crash the node and bring down all the pods on it. Going over 100% of memory capacity is called "overcommit". It's safer to specify limits such that this won't happen, e.g. by making the memory request and limit equal.
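One way to spot this per node is `kubectl describe node`, which prints an "Allocated resources" section with the summed requests and limits as percentages of what the node can allocate; the node name is a placeholder:

```sh
# look for the "Allocated resources" table near the end of the output;
# memory limits above 100% mean the node is overcommitted
kubectl describe node my-node-1
```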