
Ready for Production

Welcome to the next series of our Kubernetes lessons. Building directly on the foundation we’ve already established, we will make your applications truly robust, secure, and ready for production environments.

Pod Health Checks with Probes (Liveness & Readiness)


In our first lesson series, we learned that a Deployment provides “self-healing” by restarting a Pod if it crashes. This happens when the main process inside the container exits with an error.

But what happens if your application’s process is still running, but it’s frozen in an infinite loop, deadlocked, or otherwise unable to respond to requests? From Kubernetes’ perspective, the process is alive, so it thinks everything is fine. Your application is effectively down, but Kubernetes doesn’t know it needs to intervene.

To solve this, Kubernetes provides probes.

Probes are diagnostic checks that the kubelet (the Kubernetes agent on each Node) periodically performs on your containers. These checks allow Kubernetes to have a much deeper understanding of your application’s health.

There are two primary types of probes you must know:

  1. Liveness Probe: This probe answers the question, “Is the application still alive and working?” If the liveness probe fails a configured number of times, the kubelet assumes the container is in an unrecoverable state and kills it. The container is then restarted according to its restart policy. This is how you recover from deadlocks.
  2. Readiness Probe: This probe answers the question, “Is this application ready to accept new traffic?” If the readiness probe fails, the container is not killed. Instead, its Pod is marked as “Not Ready.” This automatically removes the Pod’s IP address from the endpoints of any Service that directs traffic to it. When the readiness probe eventually succeeds, the Pod is added back. This is absolutely critical for achieving zero-downtime rolling updates, as it prevents traffic from being sent to a new Pod that is still starting up and not yet ready to serve requests.

You can configure a probe to check your application in one of three ways:

  • HTTP GET: The kubelet sends an HTTP GET request to a specific path (e.g., /healthz) on your container. Any response code from 200 to 399 is considered a success.
  • TCP Socket: The kubelet tries to open a TCP connection to a specified port on your container. If the connection is successful, the probe succeeds.
  • Exec Command: The kubelet executes a command inside your container. If the command exits with a status code of 0, the probe succeeds.
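As a quick sketch, here is how each of the three mechanisms looks in a container spec. These are three separate fragments (a container defines at most one probe of each kind), and the paths and ports shown are illustrative placeholders, not taken from a real application:

```yaml
# HTTP GET: the kubelet requests this path; a 200-399 response means success
livenessProbe:
  httpGet:
    path: /healthz   # placeholder path
    port: 8080       # placeholder port

# TCP Socket: the probe succeeds if the connection can be opened
readinessProbe:
  tcpSocket:
    port: 5432       # placeholder port

# Exec: the probe succeeds if the command exits with status code 0
livenessProbe:
  exec:
    command: ['cat', '/tmp/healthy']
```

The Exec form is the one we will use in the hands-on exercise below, since it lets us flip the probe between success and failure just by creating or deleting a file.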


  1. The Deployment with Probes

    Instead of a real broken app, we will use a simple shell script inside a busybox container. We will define probes that check for the existence of specific files; by creating and deleting those files, we can simulate the application’s health changing.

    Create a file named deployment-with-probes.yaml:

    deployment-with-probes.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: probe-demo-deployment
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: probe-demo
      template:
        metadata:
          labels:
            app: probe-demo
        spec:
          containers:
            - name: app-container
              image: busybox:1.36
              # Start a shell and just sleep. It won't create any files initially.
              command: ['/bin/sh', '-c', 'echo "Container started, but not yet ready or healthy."; sleep 600']
              # --- Readiness Probe ---
              # Kubernetes will start checking this after 5 seconds.
              readinessProbe:
                exec:
                  command: # 'cat' succeeds (exit code 0) only if the file exists
                    - cat
                    - /tmp/ready
                initialDelaySeconds: 5
                periodSeconds: 5
              # --- Liveness Probe ---
              # Kubernetes will start checking this after 15 seconds.
              livenessProbe:
                exec:
                  command: # 'cat' succeeds (exit code 0) only if the file exists
                    - cat
                    - /tmp/healthy
                initialDelaySeconds: 15
                periodSeconds: 5
                failureThreshold: 3

    Some important fields to note:

    • initialDelaySeconds: How long to wait after the container starts before performing the first probe.
    • periodSeconds: How often to perform the probe.
    • failureThreshold: How many consecutive probe failures are needed before Kubernetes considers the container to have failed the probe. For our liveness probe, 3 failures * 5 seconds = roughly 15 seconds of unresponsiveness before a restart.
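    Two more timing fields are worth knowing, shown here as a sketch applied to our liveness probe (they are not needed for this exercise; both default to 1):

    ```yaml
    livenessProbe:
      exec:
        command: ['cat', '/tmp/healthy']
      initialDelaySeconds: 15   # wait 15s after container start before the first check
      periodSeconds: 5          # check every 5 seconds
      timeoutSeconds: 1         # each check must complete within 1 second (default 1)
      successThreshold: 1       # successes needed to count as healthy (must be 1 for liveness)
      failureThreshold: 3       # 3 consecutive failures => the container is restarted
    ```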
  2. Deploy and Observe the Readiness Failure

    Apply the deployment:

    Terminal window
    kubectl apply -f deployment-with-probes.yaml

    Now, quickly check the status of your Pods. The --watch flag will update the output automatically.

    Terminal window
    kubectl get pods --watch

    You will see an output like this. Notice the READY column is 0/1 and the STATUS is Running. This means the container is running, but it’s not ready to accept traffic.

    NAME                                     READY   STATUS    RESTARTS   AGE
    probe-demo-deployment-559d47976c-xyz12   0/1     Running   0          10s

    Let’s find out why. Get your Pod’s name and use kubectl describe:

    Terminal window
    # Replace <pod-name> with your actual pod name
    kubectl describe pod <pod-name>

    Scroll to the bottom to the Events section. You will see repeated messages saying Readiness probe failed: cat: can't open '/tmp/ready': No such file or directory.

  3. Make the Pod “Ready”

    Let’s simulate the application finishing its startup sequence by creating the file the readiness probe is looking for.

    Terminal window
    # Create the file the readiness probe checks for
    kubectl exec <pod-name> -- touch /tmp/ready

    Go back to your kubectl get pods --watch terminal. Within a few seconds (the periodSeconds), you will see the status change!

    NAME                                     READY   STATUS    RESTARTS   AGE
    probe-demo-deployment-559d47976c-xyz12   1/1     Running   0          1m

    The Pod is now READY. A Service would now start sending traffic to it.
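    To make the readiness mechanics concrete, here is a hypothetical Service that selects these Pods (not part of this exercise; our busybox container serves no actual traffic). While the Pod was Not Ready, its IP was absent from the Service’s endpoints; once the readiness probe passes, it is added back:

    ```yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: probe-demo-service    # hypothetical name
    spec:
      selector:
        app: probe-demo           # matches our Deployment's Pod label
      ports:
        - port: 80
          targetPort: 8080        # illustrative; busybox here exposes no port
    ```

    You could confirm the behavior with kubectl get endpoints probe-demo-service, which lists addresses only for Pods that are Ready.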

  4. Trigger a Liveness Failure and Restart

    Our Pod is ready, but the liveness probe is still failing because /tmp/healthy doesn’t exist. Let’s watch what happens.

    Keep your kubectl get pods --watch terminal open. The liveness probe will fail 3 times (our failureThreshold). After about 15 seconds of failures, you will see Kubernetes take action: the RESTARTS count will jump to 1 as the kubelet kills the container and starts a new one in its place. Note that the Pod itself is not deleted; only the container inside it is restarted, which is why the Pod name stays the same.

    # The container is killed and restarted in place
    NAME                                     READY   STATUS    RESTARTS   AGE
    probe-demo-deployment-559d47976c-xyz12   1/1     Running   0          2m
    ...
    probe-demo-deployment-559d47976c-xyz12   0/1     Running   1          2m15s

    If you describe the pod again, you will see events for the liveness probe failing and the container being killed. The new container starts again in the same unhealthy state, and the cycle will repeat.

  5. Clean Up

    Terminal window
    kubectl delete -f deployment-with-probes.yaml
  • Probes give Kubernetes a deeper understanding of your application’s health beyond just “is the process running?”.
  • Use a Readiness Probe to signal when your application is fully initialized and ready to receive traffic. This is essential for zero-downtime deployments.
  • Use a Liveness Probe to tell Kubernetes when your application is broken and needs to be restarted.
  • Configuring probes is a non-negotiable step for running production-grade applications on Kubernetes.

Let me know when you are ready to move on to Chapter 2: Managing Container Resources (Requests & Limits).