Predictive Auto-scaling of vSphere VMs and Docker Container Services – Shifting Gears

In the previous posts, I detailed the two main functions for performing the auto-scaling procedure: one to scale the VM-backed K8s cluster across additional physical hosts, and one to scale the K8s pod deployment across the added hosts/nodes. The predictive trigger of these functions was to be the focus of this post.

Time has passed, work took me away from this project, and I am now without my former lab setup to test it. I could rebuild the lab and continue with the final bit of code that predicts a pattern in past CPU demand and calls the functions. But for my sanity’s sake, I’m going to pass on that for now and move on to the next logical progression in this series.

If I were to complete the last part of this, I would likely keep it in the vein of open source, unsupported distributions. TICK and Prophet are the two pieces I had earmarked for the work.

Unless you are Uber, Google, Facebook, etc., you are better off implementing a vendor-supported solution wherever possible. The integration points have largely been worked out for you, and you are left with far less to manage on your own.

Ultimately, the outcome was on a trajectory toward the result I anticipated. No matter how much work you put into a container service hacked together from multiple open source projects, there is no end in sight to reaching something production-ready.

So a vendor-supported approach will be the focus of my next pass at the sample functionality I created with the unsupported open source software. I will aim to implement a standard upstream version of K8s with an auto-scaling capability via VMware PKS, and then finally look at incorporating a predictive function into that implementation.

Along the way, I will refer back to the previous methods to compare the pros and cons of either approach.


On a side note…

Another thing I’ve learned along the way is that time series forecasting is an immense topic, with many rabbit holes. I find it quite interesting, but truly understanding it would require far more time than I have to commit. I’ll leave the details to the data scientists and be happy with knowing how to pick an appropriate set of time series analysis models for a given task.

The more I’ve read into time series ‘forecasting’, the more I’ve come to believe it should be referred to as time series ‘prediction’. A time series forecast is really just a best guess of an input’s future value based on its own historic values. While the science has proven to be accurate to a degree, it’s sort of akin to predicting the weather tomorrow based on the past six months of weather, without taking into consideration any of the weather indicators that affect it. The accuracy of a time series forecast is largely dependent on the properties that influence the metric you are analyzing.
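To make the "best guess from its own history" point concrete, here is a minimal sketch of a seasonal naive forecast in plain Python. It is not the Prophet-based approach I had earmarked, just a stand-in to show the idea. The function name, the sample CPU values, and the four-sample "day" are all made up for illustration.

```python
from statistics import mean


def seasonal_naive_forecast(history, season_length, horizon):
    """Predict each future step as the mean of past values seen at the
    same position within the season (e.g. the same hour of the day).

    The forecast uses nothing but the metric's own history, which is
    exactly the limitation discussed above.
    """
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")

    forecast = []
    for step in range(horizon):
        # Position within the season of the future point being predicted.
        phase = (len(history) + step) % season_length
        # Every past observation that fell on that same position.
        same_phase = history[phase::season_length]
        forecast.append(mean(same_phase))
    return forecast


# Hypothetical CPU utilisation (%), two "days" of four samples each.
cpu_history = [20, 60, 80, 30, 24, 64, 76, 34]
print(seasonal_naive_forecast(cpu_history, season_length=4, horizon=4))
```

The result is simply the averaged shape of the previous days repeated forward. Anything that has never shown up in the history, such as a one-off event, is invisible to it.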

I believe a truly intelligent predictive auto-scaling function will need to rely on more than a set of time series forecasting models that are only looking at traditional metrics like CPU. It will require knowledge of real-world events and trends. For example, tracking the path of a hurricane as an influence on when, where, and how to scale a service. That could be an interesting project for Kafka. I’ll hold off on committing to that for now as well.