May 2, 2020

Understanding Kubernetes' pod lifecycle: the readiness probe

Introduction

Understanding Kubernetes’ concepts is key to running highly available applications.

This article will take you through the scenario of deploying a new version of a pod, and show how understanding the pod lifecycle and implementing a readiness probe will help you deploy new releases without downtime.

Without a readiness probe, Kubernetes will guess when your pod is ready and then route traffic to it. If your pod has latency between the point in time when the container is running and the point when it can actually handle traffic, transactions will be dropped, a.k.a. downtime. A typical situation where this may happen is a container with a heavy initialization sequence: starting multiple processes, running consistency checks, or downloading external content that it should serve.

Demonstration

All steps shown here can be found in this GitHub repo. To generate latency we will use a pre-built Docker image, trondhindenes/k8sprobetester. The author has also written a good article about probes.

In each step we will perform a release update from v1 to v2 by preparing the environment (deploying deployment-v1.yaml) and then applying deployment-v2.yaml while running a test (using curl with a time-out of 1 second). The test result shows how each method of deploying a new version affected the availability of the application.
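The test can be sketched as a small shell script: poll the service repeatedly with a 1-second curl timeout and count how many requests succeeded versus timed out or failed. The URL and the number of attempts are assumptions for illustration, not taken from the repo.

```shell
#!/bin/sh
# probe_once succeeds only if the endpoint answers within 1 second.
probe_once() {
  curl --silent --fail --max-time 1 "$1" > /dev/null 2>&1
}

# run_test polls the URL a fixed number of times and reports the tally.
run_test() {
  url="$1"; tries="$2"; ok=0; failed=0
  i=1
  while [ "$i" -le "$tries" ]; do
    if probe_once "$url"; then
      ok=$((ok + 1))
    else
      failed=$((failed + 1))
    fi
    i=$((i + 1))
  done
  echo "ok=$ok failed=$failed"
}

# Example (hypothetical service address):
#   run_test http://my-service.example/ 30
```

During a rollout, a run of `failed` counts corresponds to the dropped transactions described above.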

If you want to follow along in your own environment, the repo has all the needed instructions.

Step 1: naive approach

The v1 deployment in step1 has no readiness probe, so Kubernetes will switch traffic to the new pod as soon as it has started. Since we have introduced a latency of 10 seconds, curl will time out a few times and recover once the new pod is ready to handle traffic.
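For reference, a minimal sketch of such a probe-less deployment could look like this (the image is the one from the repo; the names, labels and replica count are assumptions, not copied from deployment-v1.yaml):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: probetester
spec:
  replicas: 1
  selector:
    matchLabels:
      app: probetester
  template:
    metadata:
      labels:
        app: probetester
    spec:
      containers:
      - name: probetester
        image: trondhindenes/k8sprobetester:latest
        ports:
        - containerPort: 80
        # No readinessProbe: Kubernetes considers the pod ready as soon as
        # the container process has started, regardless of startup latency.
```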

This video demonstrates the transaction behavior (left window) and the way Kubernetes replaces the v1 pod with v2 (top right window). Notice that the state of the new pod becomes “1/1” right away, as Kubernetes “thinks” it is ready.

Step 2: with readiness probe

The v1 deployment in step2 has a readiness probe, so Kubernetes will switch traffic to the new pod only when its probe says it is ready. Since we have introduced a latency of 10 seconds, traffic will be routed to the previous version of the pod until the new pod is ready, and curl will not time out.

      spec:
        containers:
        - image: trondhindenes/k8sprobetester:latest
          ...
          readinessProbe:
            httpGet:
              path: /healthz
              port: 80
              httpHeaders:
                - name: Host
                  value: KubernetesLivenessProbe
            initialDelaySeconds: 5  # wait 5 s after container start before probing
            failureThreshold: 2     # mark the pod unready after 2 consecutive failures

This video demonstrates the transaction behavior (left window) and the way Kubernetes replaces the v1 pod with v2 (top right window). Notice that the state of the new pod first becomes “0/1”, and only becomes “1/1” after a delay (the introduced latency). As long as the new pod is not ready, Kubernetes routes the traffic to the previous version, leading to a zero-downtime deployment.

Content licensed under CC BY 4.0