Deploying Stateful Applications in Kubernetes

This is a short introduction to stateful applications, based on my personal notes. We will also look at how they are implemented in Kubernetes.

Table of Contents

- What is a Stateful Application?
- Stateful Applications in the World of Containers
- Stateful Applications in Kubernetes

What is a Stateful Application?

A stateful application saves client data from one session for use in the next, so it remembers the state of the interaction between the client and the server. A stateless application, by contrast, treats every request as an independent transaction unrelated to any previous request. Examples of stateful applications are databases, data stores, and even e-commerce websites or online games. All of them need to remember some information from previous states or sessions.

Stateful Applications in the World of Containers

Now the question is: how do we get a stateful application into containers and into Kubernetes?

When containers first became popular, they were used mainly for stateless applications. These were a perfect fit, since containers can be started, stopped, and replicated easily without worrying about any internal state.

Since then, the container ecosystem has evolved and become better at handling stateful applications. Technologies such as Docker volumes and Kubernetes StatefulSets were introduced, and they make it possible to run databases, content management systems, and other stateful applications in containers.

Stateful Applications in Kubernetes

Now that we have a basic understanding of what stateful applications are, let’s take a look at how they’re implemented in Kubernetes.

In Kubernetes, StatefulSets are used to manage stateful applications. Let’s explore them with a practical example.

apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
    - port: 80
      name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: registry.k8s.io/nginx-slim:0.8
          ports:
            - containerPort: 80
              name: web
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi

This example contains two parts: a headless service and a StatefulSet. Let’s look at the headless service first.

apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
    - port: 80
      name: web
  clusterIP: None
  selector:
    app: nginx

A headless service is a service with clusterIP set to None, which means no cluster IP is assigned to it. A DNS query for the service name is answered with the IPs of the pods selected by the service, rather than with a single virtual IP. Together with a StatefulSet, it gives each pod a stable network identity, which is crucial for stateful applications. So remember: we reach the pods through DNS names and not through a service IP address.
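To see this behaviour once the manifest has been applied, the service name itself can be resolved from inside the cluster. The following is a minimal sketch, assuming the default namespace and the default cluster.local domain; the helper name lookup_service and the pod name dns-debug are made up for illustration.

```shell
# Hypothetical helper: resolve a headless service from a throwaway pod.
# With clusterIP: None, the answer should list one address per ready pod
# instead of a single virtual IP.
lookup_service() {
  svc="$1"
  ns="${2:-default}"   # assumption: namespace defaults to "default"
  kubectl run -i --rm dns-debug --image=busybox --restart=Never -- \
    nslookup "$svc.$ns.svc.cluster.local"
}

# Usage (requires access to a cluster where the manifest is applied):
# lookup_service nginx default
```

The function is only defined here and not called, so nothing touches the cluster until the usage line is run by hand.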

Let’s look at the StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: registry.k8s.io/nginx-slim:0.8
          ports:
            - containerPort: 80
              name: web
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi

In the YAML file of a StatefulSet, we reference the headless service we created before through serviceName and, similar to a Deployment, we specify replicas and a template for the pods. In addition, we provide volumeClaimTemplates for PersistentVolumeClaims. The StatefulSet creates one claim per pod automatically. The matching PersistentVolumes are then provisioned dynamically if the cluster has a suitable StorageClass; otherwise they have to be supplied by an administrator.
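The names of the generated claims are predictable: each one is built as <template-name>-<statefulset-name>-<ordinal>. As a small sketch, the helper below (pvc_names is a hypothetical name, not a kubectl feature) prints the claim names to expect for the example manifest.

```shell
# Hypothetical helper: print the PVC names a StatefulSet derives from one
# volumeClaimTemplate, following <template-name>-<statefulset-name>-<ordinal>.
pvc_names() {
  template="$1"
  sts="$2"
  replicas="$3"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "$template-$sts-$i"
    i=$((i + 1))
  done
}

# Template "www", StatefulSet "web", 2 replicas, as in the manifest above.
pvc_names www web 2

# After applying the manifest, compare with the real claims:
# kubectl get pvc
```

Because the name contains the pod's ordinal, a rescheduled pod is bound to the same claim again.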
Let’s apply the YAML file and take a closer look at what the StatefulSet creates.

kubectl apply -f stateful-application.yaml

Let’s look at the pods created.

kubectl get pods

We see that each pod has a unique identifier, the ordinal index appended to its name (web-0, web-1), which is kept across rescheduling and restarts. This allows each pod to be reached via a stable hostname. We can test it for the web-0 pod like this:

kubectl run -i --rm debug --image=busybox --restart=Never -- nslookup web-0.nginx.default.svc.cluster.local

A debug pod is used to test DNS resolution and is deleted afterwards. We use the FQDN (Fully Qualified Domain Name) here, which is web-0.nginx.default.svc.cluster.local (format: <pod-name>.<headless-svc-name>.<namespace>.svc.cluster.local, where cluster.local is the default cluster domain). The nslookup command returns the IP of the web-0 pod, which means the lookup worked and we can reach the pod via DNS.
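The naming format can be captured in a small helper. This is only a sketch: pod_fqdn is a hypothetical function name, and the namespace and cluster domain defaults are assumptions that match the example above.

```shell
# Hypothetical helper: build the per-pod DNS name
# <pod-name>.<headless-svc-name>.<namespace>.svc.<cluster-domain>
pod_fqdn() {
  pod="$1"
  svc="$2"
  ns="${3:-default}"            # assumption: "default" namespace
  domain="${4:-cluster.local}"  # assumption: default cluster domain
  printf '%s.%s.%s.svc.%s\n' "$pod" "$svc" "$ns" "$domain"
}

pod_fqdn web-0 nginx
pod_fqdn web-1 nginx default
```

The printed names can be passed to the nslookup command shown earlier to check each replica in turn.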

Another fact is that the pods are created sequentially, in the order of their ordinal numbers. For example, web-0 is created first, and web-1 is only created once web-0 is running and ready, and so on. When scaling down, the pods are removed in reverse order.
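As a rough illustration of this ordering, the sketch below prints the sequence in which pods would be created or removed under the default ordering. The helper ordered_pods is a made-up name and does not talk to the cluster; the commented kubectl lines show one way to observe the real behaviour.

```shell
# Hypothetical helper: print pod names in creation order ("up", the default)
# or in termination order ("down") for a StatefulSet with the given replicas.
ordered_pods() {
  name="$1"
  replicas="$2"
  direction="${3:-up}"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    if [ "$direction" = "down" ]; then
      echo "$name-$((replicas - 1 - i))"
    else
      echo "$name-$i"
    fi
    i=$((i + 1))
  done
}

ordered_pods web 2        # creation order
ordered_pods web 2 down   # termination order

# To watch the real ordering in a cluster, run these in two terminals:
# kubectl get pods -w -l app=nginx
# kubectl scale statefulset web --replicas=3
```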

Thank you for reading this post on stateful applications in Kubernetes! I will update it with more information soon, so stay tuned.