Predictive Auto-scaling of vSphere VMs and Container Services

Reactive and predictive auto-scaling have both existed for some time. Large service providers have custom-built and used predictive auto-scaling for almost a decade.

To define a few terms: Scaling is the process of adjusting resources to satisfactorily serve demand. Auto-scaling replaces manual processes with automated IT processes that react to current-state metrics. Predictive auto-scaling relies on metrics observed over time to predict an upcoming change in demand and scale ahead of it. This is advantageous because the work of scaling does not compete for resources with services that are already under increased demand.
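The difference between reacting and predicting can be shown with a minimal sketch. It assumes a naive linear-trend forecast, a fixed capacity, and a 75% utilization threshold; the function names, sample values, and forecasting method are illustrative assumptions, not how any particular product makes the decision.

```python
from statistics import mean

def reactive_decision(current_load, capacity, threshold=0.75):
    """Scale only once current utilization already exceeds the threshold."""
    return current_load / capacity > threshold

def forecast_next(history, window=3):
    """Naive trend forecast: last sample plus the average of the recent steps."""
    recent = history[-(window + 1):]
    if len(recent) < 2:
        return recent[-1]  # not enough data to estimate a trend
    steps = [b - a for a, b in zip(recent, recent[1:])]
    return recent[-1] + mean(steps)

def predictive_decision(history, capacity, threshold=0.75, window=3):
    """Scale when the forecast utilization exceeds the threshold."""
    return forecast_next(history, window) / capacity > threshold

history = [40, 50, 60, 70]  # hypothetical requests/sec samples, oldest first
capacity = 100              # hypothetical requests/sec the service can absorb

print(reactive_decision(history[-1], capacity))  # current load is still under the threshold
print(predictive_decision(history, capacity))    # the trend says the threshold is about to be crossed
```

With these sample values the reactive check does not fire yet, while the predictive check does, which is the head start the predictive approach is meant to provide.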

Scaling can be up, down, out, or in. Scaling up and down means adding resources to, or removing them from, a single provider. Scaling out and in means adding or removing whole providers. So, to scale a service up, we might add more RAM and CPU to existing workers. To scale a service out, we would add more workers.
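As a rough illustration of the two directions, the sketch below models a pool of workers and applies each kind of scaling to it. The `Worker` type and the function names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Worker:
    cpus: int
    ram_gb: int

def scale_up(pool, extra_cpus, extra_ram_gb):
    """Scale up: give every existing worker more CPU and RAM."""
    return [replace(w, cpus=w.cpus + extra_cpus, ram_gb=w.ram_gb + extra_ram_gb)
            for w in pool]

def scale_out(pool, count, template):
    """Scale out: add more workers built from a template."""
    return pool + [template] * count

def scale_in(pool, count):
    """Scale in: remove workers, but always keep at least one."""
    keep = max(1, len(pool) - count)
    return pool[:keep]

pool = [Worker(cpus=2, ram_gb=4), Worker(cpus=2, ram_gb=4)]
taller = scale_up(pool, extra_cpus=2, extra_ram_gb=4)       # 2 workers, 4 CPUs each
wider = scale_out(pool, count=2, template=Worker(2, 4))     # 4 workers, 2 CPUs each

print(sum(w.cpus for w in taller), sum(w.cpus for w in wider))
```

Both results end up with the same total CPU count; the difference is whether the capacity sits in fewer, larger workers or in more, smaller ones.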

There are multiple ways to auto-scale a vSphere environment. vRealize Operations provides click-to-configure scale-up and scale-down of compute at the VM level. VMware PKS (a VMware/Pivotal implementation of Kubernetes with BOSH) provides scale-up automation for containerized services.

By leveraging vRealize Automation, vRealize Orchestrator, some scripting, and data from vCenter and/or vRealize Operations/Endpoint Operations (or another performance metric collector), we can extend this to cover both scale up/down and scale out/in. This involves a number of moving parts that need to be well understood and configured.

Finally, with dynamic thresholds and historical data from vRealize Operations, we can elevate the extended approach to include predictive triggers.
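To make the idea of a predictive trigger concrete, here is a minimal sketch that derives a per-hour upper bound from historical samples and checks the coming hour against available capacity. The mean-plus-k-standard-deviations rule, the function names, and the sample data are assumptions for illustration only; this is not a description of how vRealize Operations computes its dynamic thresholds.

```python
from collections import defaultdict
from statistics import mean, pstdev

def dynamic_thresholds(samples, k=2.0):
    """Build {hour_of_day: upper_bound} from (hour_of_day, value) samples."""
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {h: mean(v) + k * pstdev(v) for h, v in by_hour.items()}

def should_prescale(thresholds, next_hour, capacity, headroom=0.8):
    """True if the expected upper bound for next_hour exceeds usable capacity."""
    expected = thresholds.get(next_hour)
    return expected is not None and expected > capacity * headroom

# Hypothetical CPU demand samples (hour of day, percent of capacity) from past days.
samples = [(8, 30), (8, 34), (8, 32), (9, 70), (9, 80), (9, 90)]
thresholds = dynamic_thresholds(samples)

print(should_prescale(thresholds, next_hour=8, capacity=100))
print(should_prescale(thresholds, next_hour=9, capacity=100))
```

In this sample data the 9 o'clock hour has historically run hot, so a trigger evaluated shortly before it would start scaling ahead of the demand.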

Add in the SDDC components of software-defined storage (e.g. vSAN) and software-defined networking (e.g. NSX), and we can scale compute, storage, and security/networking functions. Container-based architectures have their own constructs for scaling, separate from IaaS. These exist within your chosen container orchestration platform (e.g. Kubernetes, Swarm, etc.).

I believe the best approach for containerized services in this context will be one that maintains a logical separation between the IaaS and CaaS automation. This will allow for improved portability and flexibility of the overall scaling automation solution.

For example, we may want to perform IaaS operations across on-prem vSphere resources and off-prem AWS resources, or scale a container cluster across one or both. Logical separation of the codified processes for each layer allows us to incorporate such IaaS and CaaS architectures and scenarios more easily.

Ideally, we would leverage RESTful APIs for the above approach. We can use point-and-click UIs to define workflows, configurations, etc. But tying everything together is best accomplished through RESTful API calls within a master routine that can incorporate other discrete functions across both automation models.
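A master routine of this kind might look roughly like the sketch below, which keeps the IaaS and CaaS steps as separate REST calls and runs them in order. The base URLs, paths, and payload fields are entirely hypothetical placeholders and do not correspond to the real vRealize, Kubernetes, or Swarm APIs; the transport is injectable so the routine can be exercised without a network.

```python
import json
import urllib.request

IAAS_API = "https://iaas.example.local/api"  # hypothetical IaaS automation endpoint
CAAS_API = "https://caas.example.local/api"  # hypothetical container platform endpoint

def http_send(method, url, payload):
    """Default transport: send a JSON request and return the decoded JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

def scale_out_plan(extra_nodes, service, replicas):
    """IaaS step first (grow the cluster), then CaaS step (grow the service)."""
    return [
        ("POST", f"{IAAS_API}/clusters/workers/scale", {"add_nodes": extra_nodes}),
        ("POST", f"{CAAS_API}/services/{service}/scale", {"replicas": replicas}),
    ]

def run_plan(plan, send=http_send):
    """Execute each step in order; a failed step raises and stops the plan."""
    return [send(method, url, payload) for method, url, payload in plan]

# Dry run with a fake transport that only records what would be sent.
sent = []
def fake_send(method, url, payload):
    sent.append((method, url, payload))
    return {"ok": True}

results = run_plan(scale_out_plan(2, "web", 6), send=fake_send)
print(sent)
```

Because each layer is just a list of steps against its own API, swapping the IaaS target or the container platform changes the plan, not the routine that runs it.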

We will preconfigure the underlying vSphere environment so that our container clusters have room to grow and contract. But there will be use cases where we need to scale both layers in tandem.

For example, if we run monolithic applications in VMs alongside containerized services, we may need to adjust the shares of resources before and after an application/service scaling operation. If we are temporarily scaling out across resources normally reserved for dev/test, we will want to adjust the non-prod workloads.
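The effect of adjusting shares around a scaling operation can be sketched as follows. Shares are modeled here as simple proportional weights under full contention; the workload names, share values, and capacity figure are hypothetical, and the context manager is only an illustration of "adjust before, restore after".

```python
from contextlib import contextmanager

def allocate(shares, capacity_mhz):
    """Split contended capacity in proportion to each workload's shares."""
    total = sum(shares.values())
    return {name: capacity_mhz * s / total for name, s in shares.items()}

@contextmanager
def temporary_shares(shares, overrides):
    """Apply share overrides for the duration of a scaling operation, then restore."""
    original = dict(shares)
    shares.update(overrides)
    try:
        yield shares
    finally:
        shares.clear()
        shares.update(original)

shares = {"monolith-vm": 2000, "container-nodes": 2000, "dev-test": 1000}
before = allocate(shares, capacity_mhz=10000)

# Favor the container nodes and throttle non-prod while the service is scaled out.
with temporary_shares(shares, {"container-nodes": 4000, "dev-test": 500}) as adjusted:
    during = allocate(adjusted, capacity_mhz=10000)

after = allocate(shares, capacity_mhz=10000)
print(before["dev-test"], round(during["dev-test"]), after["dev-test"])
```

Restoring the original values in the `finally` clause means the non-prod workloads get their normal allocation back even if the scaling operation fails partway through.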

As we continue to move up the stack in automated IT service delivery (software-defined network -> software-defined storage -> software-defined compute -> software-defined OS), and as the technology involved evolves, we will need to rely on flexible approaches to fully realize the benefits.

In a coming post, I will walk through a demo of auto-scaling a VM-based K8s or Swarm cluster across vSphere hosts and then scaling a container-based service/stack onto the expanded cluster in an automated fashion. With any luck, I'll be able to build on that with a predictive scaling demo for a follow-up. This will require monitoring of both VM and container image metrics, automated provisioning of VMs, coordinated automated scaling of a container service/stack, and various tweaks within the IaaS environment along the way. Stay tuned.