Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies
So far we have worked mostly with the OnDemand nodes of a K8s cluster. This post demonstrates how to use Spot Instances as K8s worker nodes and covers provisioning, automatic scaling, and handling interruptions (terminations) of worker nodes across your cluster. Spot Instances can save you up to 70–90% of the cost as compared to OnDemand. Though Spot Instances are cheaper, you cannot run all your worker nodes as Spot: keep some OnDemand Instances as a backup, because Spot Instances can betray you anytime with an interruption 😉
In this article, we discuss how you can use Spot Instances on an EKS cluster as well as on a cluster you run yourself on EC2 servers.
Refer to our public Github Repo, which contains the files/templates we have used in this implementation. This blog covers the below-mentioned points:
- Provision k8s Cluster on EKS
- Provision k8s Cluster on EC2 Servers Using KOPS
- Use Cloudformation template to provision worker nodes for the EKS cluster
- For Kops cluster, use Instance Groups to provision spot and on-demand worker nodes
- Run Cluster Autoscaler on OnDemand Nodes
- Run Spot Interrupt Handler on Spot Instances
- Deploy Microservices
- Test AutoScaling
Kubernetes Operations with AWS EKS
AWS EKS is a managed service that simplifies the management of Kubernetes servers. It provides a highly available and secure K8s control plane. There are two major components associated with your EKS Cluster:
- EKS control plane which consists of control plane nodes that run the Kubernetes software, like etcd and the Kubernetes API server.
- EKS worker nodes that are registered with the control plane.
With EKS, you no longer need to manage the installation, scaling, or administration of master nodes; AWS takes care of the control plane and lets you focus on your worker nodes and applications.
Prerequisites:
- EC2 Server to provision the EKS cluster using AWS CLI commands.
- The latest version of AWS CLI Installed on your Server
- IAM Permissions to create the EKS Cluster. Create an IAM Instance profile with the permissions attached and assign it to the EC2 Server.
- EKS Service Role
- Kubectl installed on the server.
Provision K8s Cluster with EKS
Execute the below command to provision an EKS Cluster:
aws eks create-cluster --name puck8s --role-arn arn:aws:iam::ACCOUNT:role/puc-eks-servicerole --resources-vpc-config subnetIds=subnet-xxxxx,subnet-xxxxx,subnet-xxxxxx,securityGroupIds=sg-xxxxx --region us-east-2
We have used private subnets available in our account to provision a private cluster.
Wait for the cluster to become available.
aws eks describe-cluster --name puck8s --query cluster.status --region us-east-2
Amazon EKS uses IAM to provide authentication to your Kubernetes cluster through the AWS IAM Authenticator for Kubernetes (link in the References section below). Install it using the below commands:
curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/bin/linux/amd64/aws-iam-authenticator
chmod +x ./aws-iam-authenticator
sudo cp ./aws-iam-authenticator /usr/bin/aws-iam-authenticator
Update the ~/.kube/config file, which will be used by kubectl to access the cluster.
aws eks update-kubeconfig --name puck8s --region us-east-2
Execute “kubectl get svc” to verify that you can reach the cluster.
Launch Spot and OnDemand Worker Nodes
We have provisioned the EKS worker nodes using a CloudFormation template provided by AWS. The template is also available in our Github repo, i.e. provision-eks-worker-nodes/amazon-eks-nodegroup-with-spot.yaml. The template will provision three Autoscaling Groups:
- 2 ASGs with Spot Instances, using the two instance types given in the parameters
- 1 ASG with OnDemand Instances, using the instance type given in the parameter
Create a CloudFormation stack and provide the values for the parameters. For the AMI parameter, enter the ID from the below table:
| Region | AMI |
| --- | --- |
| US East (Ohio) (us-east-2) | ami-0958a76db2d150238 |
Launch the stack and wait for its creation to complete. Note down the NodeInstanceRole ARN from the Outputs.
Now get the config map from our repo.
Open the file “aws-auth-cm.yaml” and replace the <ARN of instance role (not instance profile)> placeholder with the NodeInstanceRole value that you recorded in the previous step, then save the file.
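For reference, the ConfigMap has the following shape; the rolearn line is the one you edit:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # replace the placeholder below with your NodeInstanceRole ARN
    - rolearn: <ARN of instance role (not instance profile)>
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
```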
kubectl apply -f aws-auth-cm.yaml
kubectl get nodes --watch
Wait for the nodes to be ready.
Kubernetes Operations with KOPS
Kops is an official Kubernetes project for managing production-grade Kubernetes clusters. It has commands for provisioning multi-node clusters, updating their settings (including nodes and masters), and applying infrastructure changes to an existing cluster. Kops is currently one of the best tools for managing a k8s cluster on AWS.
Note: You can use kops in the AWS regions that AWS EKS does not yet support.
Prerequisites:
- EC2 Server to provision the cluster using CLI commands.
- Route53 domain (for example, k8sdemo.powerupcloud.com) in the same account from which you are provisioning the cluster. Kops uses DNS to identify the cluster and adds the records for the API in your Route53 Hosted Zone.
Note: For a public hosted zone, you will have to add NS records for the above domain to your actual DNS. For example, we have added an NS record for "k8sdemo.powerupcloud.com" to "powerupcloud.com". This is used for DNS resolution. For a private hosted zone, ensure that you associate the required VPCs.
- IAM Permissions to create the cluster resources and update DNS records in Route53. Create an IAM Instance profile with the permissions attached and assign it to the EC2 Server.
- S3 bucket for the state store.
- Kubectl installed.
Log into the EC2 server and execute the below command to install Kops on the Server:
curl -LO https://github.com/kubernetes/kops/releases/download/$(curl -s https://api.github.com/repos/kubernetes/kops/releases/latest | grep tag_name | cut -d '"' -f 4)/kops-linux-amd64
chmod +x kops-linux-amd64
sudo mv kops-linux-amd64 /usr/local/bin/kops
Provision K8s Cluster
kops create cluster k8sdemo.powerupcloud.com --ssh-public-key ~/.ssh/id_rsa.pub --master-zones ap-south-1a --zones ap-south-1a,ap-south-1b --master-size=t2.medium --node-count=1 --master-count 1 --node-size t2.medium --topology private --dns public --networking calico --vpc vpc-xxxx --state s3://k8sdemo-kops-state-store --subnets subnet-xxxx,subnet-xxxx --utility-subnets subnet-xxxx,subnet-xxxx --kubernetes-version 1.11.4 --admin-access xx.xxx.xxxx.xx/32 --ssh-access xx.xxx.xxx.xx/32 --cloud-labels "Environment=DEMO"
Refer to our previous blog for the explanation of the arguments in the above command.
kops update cluster k8sdemo.powerupcloud.com --state s3://k8sdemo-kops-state-store --yes
Once the above command is successful, we will have a private K8s Cluster ready with Master and Nodes in the private subnets.
Use the command “kops validate cluster CLUSTER_NAME” to validate the nodes in your k8s cluster.
Create Instance Groups for Spot and OnDemand Instances
Kops Instance Groups help in grouping similar instances; each Instance Group maps to an Autoscaling Group in AWS. We can use the “kops edit” command to edit the configuration of the nodes in an editor, and the “kops update” command to apply the changes to the existing nodes.
Once we have provisioned the cluster, we will have two Instance Groups, i.e. one for the master and one for the nodes. Execute the below command to list the available Instance Groups:
kops get ig
Edit the nodes instance group to provision spot workers. Add the below key values, and set the maxPrice property to your bid. For example, “0.10” represents a spot-price bid of $0.10 (10 cents) per hour.
The final configuration will look like the screenshot below:
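In case the screenshot is hard to read, here is a minimal sketch of such a spot instance group spec; the image, machine type, sizes, and the lifecycle label are illustrative values from our setup, not requirements:

```yaml
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8sdemo.powerupcloud.com
  name: nodes
spec:
  image: kope.io/k8s-1.11-debian-stretch-amd64-hvm-ebs-2018-08-17
  machineType: t2.medium
  # maxPrice turns the group into Spot Instances; "0.10" = $0.10/hour bid
  maxPrice: "0.10"
  maxSize: 5
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
    lifecycle: spot
  role: Node
  subnets:
  - ap-south-1a
  - ap-south-1b
```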
Create one more Spot Instance Group for a different instance type.
kops create ig nodes2 --subnet ap-south-1a,ap-south-1b --role Node
kops edit ig nodes2
Add maxPrice and node labels; the final configuration will look like the screenshot below:
Now, we have configured two spot worker node groups for our cluster. Create an instance group for OnDemand Worker Nodes by executing the below command:
kops create ig ondemand-nodes --subnet ap-south-1a,ap-south-1b --role Node
kops edit ig ondemand-nodes
Add node labels for the OnDemand workers.
We have also added taints to keep pods away from the OnDemand worker nodes, so new pods will preferably be assigned to the Spot workers.
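As a hedged sketch, the additions to the ondemand-nodes spec could look like the following; the lifecycle label and the dedicated taint key are our own naming convention, not kops requirements:

```yaml
spec:
  nodeLabels:
    lifecycle: ondemand
  # PreferNoSchedule keeps pods off the OnDemand nodes unless the
  # scheduler has no other option, so the Spot workers are preferred
  taints:
  - dedicated=ondemand:PreferNoSchedule
```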
To apply the above configurations, execute the below command:
kops update cluster
kops update cluster --yes
kops rolling-update cluster --yes
Cluster Autoscaler
Cluster Autoscaler is an open-source tool that automatically adjusts the size of a Kubernetes cluster when one of the following conditions is true:
- there are pods that failed to run in the cluster due to insufficient resources
- there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
CA will run as a daemonset on the cluster's OnDemand nodes. The YAML file for the daemonset is available in our Github Repo, i.e. https://github.com/powerupcloud/kubernetes-spot-webinar/tree/master/cluster-autoscaler.
Update the following variables in cluster-autoscaler/cluster-autoscaler-ds.yaml:
- Autoscaling Group names of the OnDemand and Spot groups
- Minimum count of instances in each Autoscaling Group
- Maximum count of instances in each Autoscaling Group
- AWS Region
- The node selector, which ensures that the CA pods always run on the OnDemand nodes.
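For illustration, the relevant part of the daemonset's pod template might look like the sketch below; the ASG names, the min/max counts, and the lifecycle node label are placeholders for the values you substitute:

```yaml
spec:
  # run CA only on the OnDemand nodes
  nodeSelector:
    lifecycle: ondemand
  containers:
  - name: cluster-autoscaler
    image: k8s.gcr.io/cluster-autoscaler:v1.2.2
    command:
    - ./cluster-autoscaler
    - --v=4
    - --cloud-provider=aws
    # --nodes=MIN:MAX:ASG_NAME, one flag per Autoscaling Group
    - --nodes=1:5:spot-nodegroup-1-asg
    - --nodes=1:5:spot-nodegroup-2-asg
    - --nodes=1:3:ondemand-nodegroup-asg
    env:
    - name: AWS_REGION
      value: us-east-2
```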
Create the Cluster Autoscaler on both of the k8s clusters, EKS as well as the cluster provisioned using kops. Ensure that the below permissions are attached to the IAM role assigned to the cluster worker nodes:
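The standard Cluster Autoscaler policy covers the Autoscaling API calls shown below; treat it as a reference, and scope the Resource down if your security posture requires it:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}
```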
The daemonset YAML file for the EKS cluster will look like the screenshot below.
Similarly, for the cluster provisioned using Kops, the YAML file will be:
kubectl create -f cluster-autoscaler/cluster-autoscaler-ds.yaml
Now create a pod disruption budget for CA, which ensures that at least one cluster autoscaler pod is always running.
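A minimal PDB of that shape, assuming the CA pods carry an app=cluster-autoscaler label (check the labels in your daemonset YAML), would be:

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: cluster-autoscaler-pdb
  namespace: kube-system
spec:
  # always keep at least one cluster-autoscaler pod running
  minAvailable: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
```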
kubectl create -f cluster-autoscaler/cluster-autoscaler-pdb.yaml
Verify the Cluster Autoscaler pod logs in the kube-system namespace:
kubectl get pods -n kube-system
kubectl logs -f pod/cluster-autoscaler-xxx-xxxx -n kube-system
Spot Termination Handler
The major drawbacks of a Spot Instance are:
- it may take a long time to become available (or may never become available),
- and it may be reclaimed by AWS at any time.
Amazon EC2 can interrupt your Spot Instance when the Spot price exceeds your maximum price, when the demand for Spot Instances rises, or when the supply of Spot Instances decreases. Whenever you are opting for Spot, you should always be prepared for the interruptions.
So, we are creating an interrupt handler on the clusters, which will run as a daemonset on the Spot worker nodes. The workflow of the Spot Interrupt Handler can be summarized as:
- Identify that a Spot Instance is being reclaimed.
- Use the 2-minute notification window to gracefully prepare the node for termination.
- Taint the node and cordon it off to prevent new pods from being placed.
- Drain connections on the running pods.
- Replace the pods on the remaining nodes to maintain the desired capacity.
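The daemonset in our repo implements this workflow. As a sketch, its pod template pins the handler to the Spot nodes with a node selector; the image name is a placeholder, and the lifecycle: spot label assumes the node labels we applied earlier:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: spot-interrupt-handler
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: spot-interrupt-handler
  template:
    metadata:
      labels:
        app: spot-interrupt-handler
    spec:
      # run only on the Spot worker nodes
      nodeSelector:
        lifecycle: spot
      serviceAccountName: spot-interrupt-handler
      containers:
      - name: spot-interrupt-handler
        image: <your-spot-interrupt-handler-image>
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
```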
Create the Spot Interrupt Handler DaemonSet on both the k8s clusters using the below command:
kubectl apply -f spot-termination-handler/deploy-k8-pod/spot-interrupt-handler.yaml
Deploy Microservices with Istio
We have taken a BookInfo Sample application to deploy on our cluster which uses Istio.
Istio is an open platform to connect, manage, and secure microservices. For more info, see the link in the References section below. To deploy Istio on the k8s cluster, follow the steps below:
curl -LO https://github.com/istio/istio/releases/download/1.0.4/istio-1.0.4-linux.tar.gz
tar -xvzf istio-1.0.4-linux.tar.gz
cd istio-1.0.4
In our case, we have provisioned the worker nodes in private subnets. For Istio to provision a publicly accessible load balancer, tag the public subnets in your VPC with the below tag:
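A commonly used pair of tags for public (load balancer) subnets is shown below; the cluster name puck8s is our example, so substitute your own:

```
kubernetes.io/cluster/puck8s = shared
kubernetes.io/role/elb = 1
```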
Install helm (see the link in the References section below) and then execute:
kubectl create -f install/kubernetes/helm/helm-service-account.yaml
helm init --service-account tiller --wait
helm install --wait --name istio --namespace istio-system install/kubernetes/helm/istio --set global.configValidation=false --set sidecarInjectorWebhook.enabled=false
kubectl get svc -n istio-system
You will get the LoadBalancer endpoint.
Create a gateway for the Bookinfo sample application.
kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml
The BookInfo sample application source code, Dockerfile, and Kubernetes deployment YAML files are available in the sample-app directory in our Github repo.
Build a docker image from each of the provided Dockerfiles and update the IMAGE variable in k8s/deployment.yaml for all four services. Deploy each service using:
kubectl apply -f k8s
Hit http://LB_Endpoint/productpage and you will get the frontend of your application.
AutoScaling when the Application load is High
If the number of pods increases with the application load, the cluster autoscaler will provision more worker nodes in the Autoscaling Group. If Spot Instances are not available, it will fall back to OnDemand Instances.
Initial Settings in the ASG:
Scale up the number of pods for one deployment, for example, product page. Execute:
kubectl scale --replicas=200 deployment/productpage-v1
Watch the Cluster Autoscaler manage the ASG.
Similarly, if the application load decreases, CA will scale down the ASG.
Note: We don't recommend running stateful applications on Spot nodes. Use OnDemand nodes for your stateful services.
And that’s all..!! Hope you found it useful. Happy Savings..!!