Predictive Auto-scaling of vSphere VMs and Docker Container Services – The Plumbing (2 of 2)

In the previous post, I wrote the function to scale a K8s deployment with a REST API call. In this post, I’ll write the other function required to codify the scaling of a K8s cluster across physical resources.

The function here will power on a K8s node VM that resides on an ESXi host, making it available to the K8s cluster as additional compute. I will use the vSphere REST API and then combine it with the previous function to complete the scale-out operation.

I will not be implementing all of the error handling and functionality that would be required for a production-ready capability; just enough to establish a starting point.

For my lab, the Python code to power on a vSphere VM looks like this:

import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning

vsp_api_url = 'https://192.168.106.121/rest'
vsp_api_user = 'administrator@nate.lab'
vsp_api_pass = 'VMware1!'

def main():
    # The lab vCenter uses a self-signed certificate, so suppress TLS warnings
    requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
    power_vm_on()

def auth_vcenter(username, password):
    # Exchange basic-auth credentials for a vSphere API session ID
    resp = requests.post('{}/com/vmware/cis/session'.format(vsp_api_url),
                         auth=(username, password), verify=False)
    return resp.json()['value']

def power_vm_on():
    sid = auth_vcenter(vsp_api_user, vsp_api_pass)
    # Power on the target VM (managed object ID hard-coded for my lab)
    requests.post('{}/vcenter/vm/vm-690/power/start'.format(vsp_api_url),
                  verify=False, headers={'vmware-api-session-id': sid})

main()

The above is hard-coded with my login credentials, vCenter API address, and the VM I am targeting. I supply credentials to obtain an authenticated session with the API server and then post a call to power on the VM. This forms the basis of adding additional physical compute to the K8s cluster.
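As a small step away from the hard-coding, the connection settings could be read from environment variables with hard-coded lab values only as fallbacks. A minimal sketch (the variable names here are my own, not anything the vSphere API requires):

```python
import os

def load_vsphere_config():
    # Prefer environment variables; fall back to the lab defaults above
    return {
        'url': os.environ.get('VSP_API_URL', 'https://192.168.106.121/rest'),
        'user': os.environ.get('VSP_API_USER', 'administrator@nate.lab'),
        'password': os.environ.get('VSP_API_PASS', ''),
    }
```

The globals at the top of the script could then be replaced with a single call to this helper.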

There are so many resources online covering the vSphere REST API that I won’t go into detail here. It is a fantastic addition to the vSphere platform and greatly improves the ease of integrating vSphere functions with others (e.g. Kubernetes).
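One example of what the API makes easy: the managed object ID (vm-690) is hard-coded above, but it can also be looked up by VM name through the /vcenter/vm endpoint’s name filter. A sketch, with the lookup split from the response parsing (function names are my own):

```python
import requests

def parse_vm_id(payload):
    # The API returns {'value': [{'vm': 'vm-690', 'name': ..., ...}]}
    vms = payload.get('value', [])
    return vms[0]['vm'] if vms else None

def get_vm_id_by_name(api_url, session_id, vm_name):
    # Filter the VM list by name; returns the managed object ID or None
    resp = requests.get('{}/vcenter/vm'.format(api_url),
                        params={'filter.names.1': vm_name},
                        headers={'vmware-api-session-id': session_id},
                        verify=False)
    return parse_vm_id(resp.json())
```

With this, power_vm_on() could take a VM name instead of embedding the ID.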

Combining my two functions to power on the VM and then scale the K8s deployment across the added compute is the next step. For this, I needed to add another function that waits until the K8s master recognizes the ready state of the powered-on worker node. The default K8s node-status heartbeat detection is a bit slow for this use case; it could be tuned, but I will leave it as-is. The combined code for my lab looks like this:

import requests
import json
from requests.packages.urllib3.exceptions import InsecureRequestWarning
from time import sleep

vsp_api_url = 'https://192.168.106.121/rest'
vsp_api_user = 'administrator@nate.lab'
vsp_api_pass = 'VMware1!'
k8s_api_url = 'http://localhost:8080/api/v1'

def main():
    requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
    power_vm_on()
    print('K8s nodes scaled out - Added 1 physical host')
    wait_node_ready()
    print('Additional K8s worker node ready')
    scale_deployment()
    print('K8s deployment scaled out')

def auth_vcenter(username, password):
    resp = requests.post('{}/com/vmware/cis/session'.format(vsp_api_url),
                         auth=(username, password), verify=False)
    return resp.json()['value']

def power_vm_on():
    sid = auth_vcenter(vsp_api_user, vsp_api_pass)
    requests.post('{}/vcenter/vm/vm-690/power/start'.format(vsp_api_url),
                  verify=False, headers={'vmware-api-session-id': sid})

def node_is_ready(node):
    # Look up the 'Ready' condition by type rather than relying on its
    # position in the conditions list, which is not guaranteed
    for condition in node['status']['conditions']:
        if condition['type'] == 'Ready':
            return condition['status'] == 'True'
    return False

def wait_node_ready():
    print('Waiting for node k8s-wn2 to become ready')
    while True:
        resp = requests.get('{}/nodes/k8s-wn2'.format(k8s_api_url))
        if node_is_ready(resp.json()):
            break
        sleep(1)
    sleep(4)

def scale_deployment():
    url = ('http://localhost:8080/apis/extensions/v1beta1'
           '/namespaces/default/deployments/nginx-deployment/scale')
    data = {
        'kind': 'Scale',
        'apiVersion': 'extensions/v1beta1',
        'metadata': {
            'name': 'nginx-deployment',
            'namespace': 'default',
        },
        'spec': {
            'replicas': 2
        }
    }
    requests.put(url, data=json.dumps(data),
                 headers={'Content-Type': 'application/json'})

main()

The above code:

  1. Obtains a session ID from vCenter with supplied credentials
  2. Powers on the specified VM
  3. Waits until the K8s master recognizes the node as ready state
  4. Scales the nginx-deployment deployment to two pods

There are many other operations that need to be added. I haven’t added the logic to evict pods from nodes that are scaling in, check the status of pods before and after scale operations, and so on. The K8s scheduler should naturally select the added node for the scaled-out pod. In a real-world implementation, we would develop this much further and likely add our own custom scheduler. The K8s scheduler is fine for basic operations but is lacking quite a bit for enterprise services that will be actively scaled.
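As a first step toward the scale-in logic, a node can be cordoned through the same K8s API before it is powered off, so no new pods land on it (the equivalent of kubectl cordon). A sketch, assuming the same unauthenticated localhost:8080 endpoint as above; the function names are my own:

```python
import requests
import json

def cordon_patch_body():
    # Strategic merge patch that marks a node unschedulable
    return json.dumps({'spec': {'unschedulable': True}})

def cordon_node(api_url, node_name):
    # PATCH the node object; existing pods keep running, but the
    # scheduler will not place new pods on it
    resp = requests.patch(
        '{}/nodes/{}'.format(api_url, node_name),
        data=cordon_patch_body(),
        headers={'Content-Type': 'application/strategic-merge-patch+json'})
    return resp.status_code == 200
```

Actually evicting the running pods (a drain) would still need to be layered on top of this before the VM is powered off.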

That’s it for now. In my next post, I will dive into time series forecasting functions, time series databases, message queues, and time series platforms as they relate to predictive auto-scaling (and, hopefully, complete a demo of a predictive auto-scale operation based on a time series forecast).