Cluster API vSphere Provider – Customize Image

TL;DR: Steps 1-9 at the bottom. If you read my last post, you’ll recall I was in the middle of troubleshooting provisioning issues related to disk I/O latency. In the end, I returned my lab server to its prior form: a vSphere ESXi host managed by a vCenter instance running on that same host.

With the disk I/O issues resolved, I went back to my original task of configuring a Cluster API control plane to provision K8s clusters. The setup is fairly simple. (These steps vary by infrastructure provider; here I am covering the vSphere provider.) Link to the docs:

  1. Download the clusterctl CLI binary
  2. Download an OVA template and load it into vCenter (You can use the image URLs from the docs directly in the vSphere web client. If you choose to download the images locally, change the URLs from http to https.)
  3. Take a snapshot and convert to template
  4. Set values needed for the control plane initialization (e.g. vCenter address, admin user, password, etc.)
  5. Create a single node KinD cluster and use clusterctl to initialize it with the required control plane pods
  6. Deploy a managed cluster to vSphere, then use clusterctl to initialize it as a control plane cluster (i.e., bootstrap and pivot)
  7. Delete the KinD cluster
  8. Use the pivoted control plane to provision managed clusters from there on
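
As a rough sketch of steps 5-7, the snippet below only assembles the command lines in order; it does not run anything. The cluster name, Kubernetes version, and kubeconfig path are hypothetical placeholders, and the subcommands reflect my understanding of recent kind and clusterctl releases (older clusterctl releases used `clusterctl config cluster` instead of `clusterctl generate cluster`), so check them against the versions you have installed.

```python
import shlex


def bootstrap_and_pivot_commands(cluster_name, k8s_version, mgmt_kubeconfig):
    """Return the CLI invocations for the bootstrap-and-pivot flow, in order.

    Nothing is executed here; the caller decides how (and whether) to run them.
    All argument values are placeholders supplied by the caller.
    """
    return [
        # Step 5: temporary single-node KinD cluster with the CAPI/CAPV controllers.
        ["kind", "create", "cluster"],
        ["clusterctl", "init", "--infrastructure", "vsphere"],
        # Step 6: render the manifest for the first vSphere cluster (prints YAML
        # to stdout, which is typically piped to `kubectl apply -f -`).
        ["clusterctl", "generate", "cluster", cluster_name,
         "--infrastructure", "vsphere",
         "--kubernetes-version", k8s_version,
         "--control-plane-machine-count", "1",
         "--worker-machine-count", "1"],
        # Fetch the new cluster's kubeconfig (printed to stdout; save it to mgmt_kubeconfig).
        ["clusterctl", "get", "kubeconfig", cluster_name],
        # Initialize the new cluster as a control plane, then pivot to it.
        ["clusterctl", "init", "--infrastructure", "vsphere",
         "--kubeconfig", mgmt_kubeconfig],
        ["clusterctl", "move", "--to-kubeconfig", mgmt_kubeconfig],
        # Step 7: the KinD cluster is no longer needed.
        ["kind", "delete", "cluster"],
    ]


if __name__ == "__main__":
    for cmd in bootstrap_and_pivot_commands("mgmt", "v1.20.1", "./mgmt.kubeconfig"):
        print(shlex.join(cmd))
```

Printing the commands instead of executing them makes it easy to review the sequence before touching vCenter. The vSphere connection values from step 4 still have to be set before the first `clusterctl init`.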

With that accomplished, I ran into my next hurdle. The project’s clusterctl template is intended to consume vSphere’s Cloud Native Storage CSI for persistent volumes. That is great and makes perfect sense for running production clusters on vSphere, but it requires shared storage and configuration that would force me to either add hardware to my lab or create more VMs to simulate vSAN.

This is undesirable for me, as I want to maximize the lab server resources available for the K8s patterns I’m working with. I’ve found the Longhorn and OpenEBS storage projects to work very well in this regard. But there was a catch: both require the open-iscsi package to be installed, and the project-hosted VM OVAs do not contain it.

Luckily, there is a process/framework for creating customized OVAs. In this case, I needed to add a directive to the cloud-init user-data config to install the open-iscsi deb in my Ubuntu OVA. There are a few ways to do this; I chose to run the make process directly against vSphere.

I don’t know how I missed the images that are an all-in-one build (they are named build-node-ova-vsphere-[OS]). I ended up going with the two-stage, base -> clone method. This method is better for testing, so it was not a terrible oversight. I will describe the base -> clone method I used.

  1. Download the files for the image builder framework (via git clone or curl)
  2. Check that the dependencies are installed on your machine. This can be accomplished with ‘make deps’ from the downloaded framework, or you can install them manually. I found installing them manually worked best for me. From what I recall, packer, ansible, and pip3 were needed. Then run ‘make deps-ova’.
  3. Define the required values in vsphere.json. This is where the docs are a bit confusing for the two-stage process. Basically, the final OVA is the result of two make/build stages. The first creates a base image OVA; the second creates the final ‘ansibilized’ OVA image from that base. (This makes iterating over the build process faster in subsequent runs, as you’re not building the entire image each time.)
  4. From the root of the image-builder framework directory, go to the ./packer/config/ directory. There you will find a number of .json configs. These are used by the ansible stage to construct the cloud-init user-data.
  5. Depending on what you want to accomplish, edit the appropriate json. In my case, it was common.json, where I added my deb to the ‘extra_debs’ key.
  6. Use ‘make help’ to see a list of available images. For example, I chose build-node-ova-vsphere-base-ubuntu-2004. With this choice, you would set the vsphere.json ‘linked-clone’ value to ‘base-ubuntu-2004’.
  7. Run the make for the base image and wait for it to complete. This will leave a powered off VM in your cluster. Convert it to a template.
  8. Now run make again with the corresponding clone image. In my case, that was build-node-ova-vsphere-clone-ubuntu-2004.
  9. If all goes well, this will result in another powered off VM. Take a snapshot and convert to template. Use this template for your cluster config.
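
Step 5 can also be scripted. This is a minimal sketch, assuming the key in common.json is spelled `extra_debs` and holds a space-delimited string of package names; check your copy of the file before relying on either assumption. The helper names are my own, not part of the image builder framework.

```python
import json
from pathlib import Path


def add_extra_deb(config, package, key="extra_debs"):
    """Return a copy of `config` with `package` added to the space-delimited `key`.

    The key name and the space-delimited format are assumptions about
    common.json; the package is not added twice if already present.
    """
    packages = config.get(key, "").split()
    if package not in packages:
        packages.append(package)
    updated = dict(config)
    updated[key] = " ".join(packages)
    return updated


def patch_config_file(path, package):
    """Read a packer config json, add `package`, and write the file back."""
    path = Path(path)
    config = json.loads(path.read_text())
    path.write_text(json.dumps(add_extra_deb(config, package), indent=2) + "\n")


if __name__ == "__main__":
    # In-memory demonstration; call patch_config_file("packer/config/common.json",
    # "open-iscsi") from the framework root to edit the real file.
    print(add_extra_deb({"extra_debs": ""}, "open-iscsi"))
```

After patching the file, the two builds from steps 7 and 8 are run in order: the base target first (build-node-ova-vsphere-base-ubuntu-2004 in my case), then the matching clone target (build-node-ova-vsphere-clone-ubuntu-2004).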

That’s it! Specify your new template and K8s version in your cluster config yaml and create your cluster. You can ssh into your nodes to validate the customization. In my case, I also defined a cluster resource set to deploy the Longhorn and CNI plugin manifests during provisioning.
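
For reference, a cluster resource set of the kind mentioned above might look roughly like the manifest built below. This is a sketch, not my exact config: the resource names, the namespace, and the `storage: longhorn` label are hypothetical, and the apiVersion depends on your Cluster API release (earlier releases used v1alpha3/v1alpha4, and the feature may need to be enabled explicitly). The referenced ConfigMaps are assumed to already exist in the management cluster and to contain the Longhorn and CNI manifests.

```python
import json


def cluster_resource_set(name, namespace, match_labels, configmap_names):
    """Build a ClusterResourceSet manifest as a plain dict.

    Clusters whose labels match `match_labels` get the contents of the
    listed ConfigMaps applied once after they are provisioned.
    """
    return {
        # Assumption: v1beta1; adjust to the API version your release serves.
        "apiVersion": "addons.cluster.x-k8s.io/v1beta1",
        "kind": "ClusterResourceSet",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "clusterSelector": {"matchLabels": dict(match_labels)},
            "resources": [
                {"kind": "ConfigMap", "name": cm} for cm in configmap_names
            ],
        },
    }


if __name__ == "__main__":
    manifest = cluster_resource_set(
        name="longhorn-and-cni",            # hypothetical name
        namespace="default",
        match_labels={"storage": "longhorn"},  # hypothetical cluster label
        configmap_names=["longhorn-manifests", "cni-manifests"],  # hypothetical
    )
    # kubectl accepts JSON as well as YAML, so this output can be applied as-is.
    print(json.dumps(manifest, indent=2))
```

The workload cluster then only needs the matching label in its Cluster metadata for the manifests to be applied during provisioning.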

Shoutout to yastij and voor on the Slack #cluster-api-vsphere channel for giving me some sanity checks along the way.