In this post, I was going to cover two controller kinds (Deployment and StatefulSet) and the resource kinds PersistentVolume, PersistentVolumeClaim, and StorageClass, then go over examples of non-persistent and persistent service delivery. Then I realized it was going to be way too long. So I’ll cover the basics of stateless/stateful and storage in this post. Next post, I’ll cover the Deployment and StatefulSet controllers and provide some examples of use.
You’ve likely heard of the ‘Twelve-Factor App’ methodology. It was published in 2011 by Adam Wiggins of Heroku. Adam said the twelve factors should apply to: “Any developer building applications which run as a service and/or Ops engineers who deploy or manage such applications”. Number six of the twelve-factor methodology originally stated: “Execute the app as one or more stateless processes”. It’s important to read this as applying to “applications which run as a service”, which does not equal “any and all applications”. Since its publication in 2011, the sixth factor has been subtly reworded by some to reflect that.
When we say stateless, it needs to be put in context. There are many attributes of state, and a stateless application/pod in K8s terms is a bit of a misnomer. With regards to K8s, we consider a stateless application to be one that is immutable. Any changes made to it at runtime will be lost when it is restarted, because it does not persist storage. Think of it roughly as a server with a read-only drive.
From the advent of K8s, as with Borg, a fundamental goal was to create a distributed system for processes that are highly portable and not bound to the compute infrastructure. So in the beginning, the first incarnation of a process/task (now known as a pod) was implemented with no persistent storage. That’s not to say Google wasn’t implementing persistent storage constructs with Borg; they were. At the time, Borg utilized centralized and distributed processes to map persistent data to workloads. The sum of the moving parts implemented for Borg was too great and complex to move forward into K8s. K8s took a new approach to solving data persistence when required. That method and its uses have evolved, and continue to evolve.
So, K8s storage is a key concept in understanding a stateless or stateful app in this context. I don’t want this to become a complete explanation of storage concepts in K8s, but I’ll delve into some basic resource concepts so we can better understand the differences between Deployment and StatefulSet.
At a high level, K8s leverages its extensible architecture to allow for many types of storage providers. How we consume storage varies based on the underlying storage type and the provider we use. At the time of this writing, K8s is moving from in-tree to out-of-tree (CSI) volume plugins. That distinction isn’t necessary to understand here, but know that one way or another, we start with a provider/plugin that bridges physical storage into K8s. Our pods consume that disk resource via the volumes definition in the pod spec. The following examples are specific to working with in-tree volume providers.
The pod.spec.volumes field defines a name and a reference to a Volume or PVC for each entry. We then use volumeMounts in the container definition of our pod spec (pod.spec.containers[].volumeMounts) to mount the volume and define the path of the mount within the container. A volumeMounts entry is correlated to a volumes entry via the volumes entry’s name field.
(Note: I’ll go through Volume and PVC separately; for now, each is pointing to a bit of provisioned and bound disk space. I upper-cased Volume because it refers to a host-based volume that K8s consumes, rather than a field in a pod spec.)
In this example, you see the volumes definition for AWS EBS (awsElasticBlockStore:) with a volumeID that links to a pre-provisioned Volume. It has a name of ‘test-volume’. The volumeMounts definition in containers links to it by this name (test-volume) and mounts it at /test-ebs in the container file system hierarchy. If we were at the command prompt inside the container and executed ls /, we would see the test-ebs directory in the listing.
apiVersion: v1
kind: Pod
metadata:
  name: test-ebs
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-ebs
      name: test-volume
  volumes:
  - name: test-volume
    # This AWS EBS volume must already exist.
    awsElasticBlockStore:
      volumeID: <volume-id>
      fsType: ext4
A spec.volumes entry’s lifecycle matches that of its pod. If a container dies, the volume persists and will be reattached once the kubelet restarts the container. If a pod in a ReplicaSet is killed, it is not seen as being dead, just missing; as the ReplicaSet controller restarts it, the volumes will persist and be reattached. It isn’t until the cluster is told that a pod and all its replicas should no longer exist that the volumes also come to an end.
Take note of the comment in the pod spec example above. It says ‘This AWS EBS volume must already exist.’ There are two ways a Volume can be created: manually or dynamically.
The first manual approach is to create the Volume directly with the storage provider, outside of K8s, and then point to it with its storage-specific attributes (as is done above). In that case, the comment means the administrator needs to create the AWS EBS volume and retrieve the volumeID and fsType to supply in the pod spec.
In the case of the AWS EBS provider, the command would be something like this:
aws ec2 create-volume --availability-zone=eu-west-1a --size=10 --volume-type=gp2
The other manual option is to create a PersistentVolume (aka PV). A PV is an abstraction that allows us to refer to pre-provisioned storage with a common method. This enhances the portability of pod specs and eases the user experience. As in the EBS example above, if the administrator created the EBS Volume directly, they would need to provide the volumeID and other Volume-specific info to the user for inclusion in the pod spec. If the administrator instead created the storage as a PV, the user could create a PersistentVolumeClaim (aka PVC) and K8s would automatically find a suitable PV and bind it. The user need not know any of the details of the Volume or underlying storage. They simply create the PVC, and then refer to it in the pod spec.
Here is an example of a PV for EBS storage. Notice that instead of running a CLI command outside of K8s purview, we’re defining a resource with the storage specific details. As K8s is aware of this as a PV, it will see the PVC and match it to this (or another suitable PV) automatically.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  awsElasticBlockStore:
    fsType: "ext4"
    volumeID: "vol-f16a04ba"
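To consume a pre-created PV like the one above, a user creates a PVC whose requested capacity and access mode the PV can satisfy; K8s then binds the two. A minimal sketch (the claim name mypv-claim is illustrative, not from the examples above):

```yaml
# A claim that the pv0001 PV above could satisfy: K8s matches on
# requested capacity (<= 10Gi) and access mode (ReadWriteOnce).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypv-claim   # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

Note that K8s binds the claim to any suitable PV, not necessarily pv0001; the user doesn’t choose (or need to know) which PV backs the claim.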
So that’s better, create a bunch of PVs ahead of time, then let users gobble them up at will. No need for never-ending creation of Volumes and shuffling of Volume specific info to the users, each time a volume is needed for a pod.
Better, but still not ideal, for a number of reasons. The relationship of PVC to PV is one-to-one. If I created ten PVs, each 10Gi in size, and ten PVCs were created that each required 2Gi, we would be wasting 80Gi of available storage. Another issue comes with PVC storage class requirements. If we don’t know what kind of storage will be needed, we’d need to triple or quadruple the ten PVs so that we have ten of each class. The problems continue, but I’m guessing you get the gist of it.
This is where dynamic provisioning comes to the rescue. With dynamic provisioning, we don’t pre-provision Volume or PVs. We create definitions of storage providers along with class specific settings, and store them as resource definitions in the cluster. These resources are called StorageClasses.
Notice that we are driving the definition of our underlying storage configuration higher and higher in the abstraction plane. We’ve gone from Volume to PersistentVolume to StorageClass. A storage class specific to AWS EBS looks like this.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  iopsPerGB: "10"
  fsType: ext4
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
  - debug
volumeBindingMode: Immediate
We use a PVC to reference a StorageClass, and K8s automatically provisions a PV based on it and then binds the PVC to it. The PV is provisioned based on the attributes we specify in the PVC. So if I need one PV with 2Gi and another with 5Gi, the PVs will be provisioned to match those requirements exactly. This abstraction and functionality solves many of the challenges with the previous methods covered.
Here we see an example of a PVC that makes use of the StorageClass above to dynamically provision a PV and bind to it:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 8Gi
  storageClassName: standard
  selector:
    matchLabels:
      release: "stable"
Once the PVC is bound to a PV, it can be consumed as a volume via a pod spec and mounted in a container’s file system directory tree. The container and volume are separate objects. If a container dies and is restarted, the volume is reattached. Thus, the data persists.
So with a PVC created, this is what our pod.spec will look like now:
apiVersion: v1
kind: Pod
metadata:
  name: test-ebs
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-ebs
      name: test-volume
  volumes:
  - name: test-volume
    persistentVolumeClaim:
      claimName: myclaim
The final detail is that we can define a default StorageClass that will be automatically applied any time we don’t explicitly reference one in our PVC. That’s it for storage at this level. This is enough to understand persistent storage in the context of a basic stateful application’s use.
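Marking a class as the default is done with an annotation on the StorageClass itself. A minimal sketch based on the EBS class from earlier (trimmed to the essentials):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    # This annotation makes the class the cluster default; any PVC
    # that omits storageClassName will be provisioned against it.
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
```

Only one StorageClass should carry this annotation at a time; with more than one default, the behavior of PVCs that omit storageClassName becomes ambiguous.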
To summarize the storage concepts:
- Storage is made available to K8s by storage plugins (or providers)
- Volumes represent storage available to pods
- A PersistentVolume (PV) is an abstraction that enables the use of PersistentVolumeClaims. It abstracts the details of the underlying storage parameters from the requester.
- A PersistentVolumeClaim (PVC) is the method for requesting a Volume via the PersistentVolume abstraction.
- A StorageClass further abstracts PVs by defining underlying storage provisioning parameters and allowing a PVC to reference it. In this case, a PV is created on demand.
- A pod.spec defines volumes and then defines how those volumes are mounted within a container.
Next up, I’ll cover the Deployment and StatefulSet controllers, and provide examples of how each operates differently given stateless and stateful requirements.