Chapter 2
Now that we know how to ensure our application is healthy with probes, let’s address another critical aspect of running applications in a shared cluster: resource management.
Managing Container Resources (Requests & Limits)
By default, a Pod can consume as much CPU and memory as is available on the Node it’s running on. In a cluster with many applications, this default behavior is dangerous and leads to two major problems:
- The “Noisy Neighbor” Problem: A single buggy application that starts consuming 100% of a Node’s CPU or memory can starve all other applications on that same Node, causing them to fail. It can even destabilize the Node itself.
- Inefficient Scheduling: When Kubernetes doesn’t know how many resources your application needs, it can’t make smart decisions about where to place your Pods. It might schedule a memory-intensive Pod on a Node that has very little free memory, only for that Pod to be killed moments later when it tries to start.
To solve this, Kubernetes allows you to specify resource requests and limits for each container in your Pod.
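As a preview of the syntax, the `resources` stanza sits on each container in the Pod spec. A minimal sketch (the container name and image here are placeholders, not from a real manifest):

```yaml
# Hypothetical container entry; the resources stanza is the point here
containers:
  - name: my-app        # placeholder name
    image: my-app:1.0   # placeholder image
    resources:
      requests:         # what the scheduler reserves for this container
        cpu: "250m"
        memory: "64Mi"
      limits:           # the enforcement ceiling on the Node
        cpu: "500m"
        memory: "128Mi"
```

We will build a complete Deployment with exactly this structure later in the chapter.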
Requests vs. Limits
It’s crucial to understand the difference between a request and a limit.
- Requests: This is the amount of resources that Kubernetes guarantees your container will have.
  - Purpose: Primarily used for scheduling. The Kubernetes scheduler will not place your Pod on a Node unless that Node has enough available resources to satisfy the Pod’s requests.
  - Analogy: This is like reserving a 2-core, 4GB RAM virtual machine. You are guaranteed to get at least that much.
- Limits: This is the maximum amount of resources your container is allowed to use.
  - Purpose: Primarily used for enforcement on the Node.
  - CPU Limit: If your container tries to use more CPU than its limit, it will be throttled (artificially slowed down).
  - Memory Limit: If your container tries to use more memory than its limit, it will be terminated. This is known as being OOMKilled (Out Of Memory Killed).
Resource Units:

- CPU: Measured in units of “cores”. You can use `1` for a full core or `500m` (500 millicores) for half a core.
- Memory: Measured in bytes. You can use standard suffixes like `Mi` (mebibytes) and `Gi` (gibibytes), for example `128Mi` or `2Gi`.
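The decimal and suffixed forms are interchangeable. As a small illustration (the values are arbitrary), each pair below describes the same reservation:

```yaml
resources:
  requests:
    cpu: "0.5"        # half a core; identical to "500m"
    memory: "128Mi"   # 128 * 1024 * 1024 bytes
  limits:
    cpu: "1"          # one full core; identical to "1000m"
    memory: "1Gi"     # 1024Mi
```

Note that the lowercase `m` suffix on CPU means millicores, while `M` and `Mi` on memory are decimal megabytes and binary mebibytes respectively, so the suffixes are not mix-and-match across the two resource types.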
Quality of Service (QoS) Classes
Based on the requests and limits you set, Kubernetes assigns a Quality of Service (QoS) class to your Pod. This is important because it determines which Pods get killed first when a Node runs out of memory.
- Guaranteed: `limits` are set and are equal to `requests` for all resources. These are the highest-priority Pods and the last to be killed.
- Burstable: `requests` are less than `limits`. These Pods are allowed to “burst” and use more resources if they are available on the Node. They are killed after BestEffort Pods.
- BestEffort: No requests or limits are set. These are the lowest priority and the first Pods to be killed if the Node is under pressure.
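One handy consequence: if you set only `limits`, Kubernetes defaults the `requests` to the same values, which places the container in the Guaranteed class. A sketch (the container name is a placeholder):

```yaml
# Setting only limits: Kubernetes copies them into requests,
# so this container qualifies for the Guaranteed QoS class.
containers:
  - name: qos-demo      # placeholder name
    image: busybox:1.36
    resources:
      limits:
        cpu: "200m"
        memory: "100Mi"
      # requests are defaulted to cpu: 200m, memory: 100Mi
```

You can confirm the class Kubernetes assigned to a running Pod with `kubectl get pod <name> -o jsonpath='{.status.qosClass}'`.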
1. Enable the Metrics Server

   To monitor resource usage, we need to enable the `metrics-server` addon in Minikube. This may take a minute or two to start up.

   ```shell
   minikube addons enable metrics-server
   ```
2. The Deployment with Resource Limits

   Create a file named `deployment-with-resources.yaml`. We will set a strict CPU and memory limit.

   ```yaml
   # deployment-with-resources.yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: resource-demo-deployment
   spec:
     replicas: 1
     selector:
       matchLabels:
         app: resource-demo
     template:
       metadata:
         labels:
           app: resource-demo
       spec:
         containers:
           - name: app-container
             image: busybox:1.36
             # This command will just keep the container running
             command: ['/bin/sh', '-c', 'sleep 3600']
             # This is where we define the resources for this container
             resources:
               requests:
                 memory: "50Mi"
                 cpu: "100m" # 1/10th of a core
               limits:
                 memory: "100Mi"
                 cpu: "200m" # 1/5th of a core
   ```
3. Deploy and Observe

   Apply the deployment and wait for the Pod to be ready.

   ```shell
   kubectl apply -f deployment-with-resources.yaml
   kubectl get pods
   ```

   Once it’s running, check its baseline resource usage with `kubectl top pods`. It should be near zero.
4. Stress Test the CPU Limit

   Now, let’s `exec` into the Pod and run a command that pegs the CPU. The `dd` command is a perfect tool for this.

   ```shell
   # Get your Pod's name
   POD_NAME=$(kubectl get pods -l app=resource-demo -o jsonpath='{.items[0].metadata.name}')

   # Run the stress test inside the Pod
   kubectl exec -it $POD_NAME -- /bin/sh -c 'dd if=/dev/zero of=/dev/null'
   ```

   While that command is running in one terminal, open a second terminal and watch the resource usage:

   ```shell
   # Watch the resource usage every 2 seconds
   watch kubectl top pod $POD_NAME
   ```

   You will see the CPU usage climb to around `200m` (our limit) and then stay there. It is being throttled and cannot consume more CPU, protecting the Node from this greedy process. Press `Ctrl+C` in the first terminal to stop the stress test.
5. Trigger an OOMKilled Event

   Now let’s test the memory limit by allocating about 150MB of memory inside the container, which is more than our 100MiB limit. Note that `dd if=/dev/zero of=/dev/null` is not suitable here: it streams data through a small fixed buffer and never actually holds much memory. Instead, we pipe 150MB into `tail`, which must buffer its entire input in memory because the stream contains no newlines.

   ```shell
   # Allocate ~150MB of memory inside the container (157286400 bytes = 150 MiB)
   kubectl exec -it $POD_NAME -- /bin/sh -c 'head -c 157286400 /dev/zero | tail'
   ```

   Wait a moment. You will likely see the command fail with a “Killed” message. Now check the status of your Pod:
   ```shell
   kubectl get pod $POD_NAME
   ```

   You should see that the `RESTARTS` count has increased by 1: the container was killed because it violated its memory limit, and the kubelet automatically restarted it per the Pod’s restart policy. (On some clusters the kernel kills only the exec’d process rather than the whole container, in which case the count stays at 0.) To confirm the reason, describe the Pod:
   ```shell
   kubectl describe pod $POD_NAME
   ```

   In the `Last State` section for the container, you will see the reason: `OOMKilled`.
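   If you prefer a one-liner over scanning the `describe` output, the same reason is recorded in the Pod’s status and can be extracted with JSONPath (this assumes the Pod from this exercise is still running and `POD_NAME` is still set):

   ```shell
   # Print why the previous container instance was terminated
   kubectl get pod $POD_NAME -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
   ```
   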
6. Clean Up

   ```shell
   kubectl delete -f deployment-with-resources.yaml
   ```
Key Takeaways for Chapter 2:
- Always define requests and limits for your production workloads. This is fundamental to cluster stability and application reliability.
- Requests guarantee resources for your app and ensure it gets scheduled correctly.
- Limits protect the rest of your cluster from misbehaving or resource-hungry applications.
- Use monitoring tools like `kubectl top` (and more advanced tools like Prometheus) to observe your application’s real-world usage and adjust its requests and limits over time.
You now know how to build truly robust Pods that are both healthy and well-behaved citizens of the cluster. When you’re ready, let me know, and we’ll move to Chapter 3: Organizing the Cluster with Namespaces.