Kubernetes: Assigning a Specific Pod to a Particular Cluster Node — Part VI


Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies

In this article, we discuss how to deploy a specific microservice on a particular node. As a solution, we use the Taints and Tolerations feature of Kubernetes. Tolerations are applied to pods and allow them to be scheduled onto nodes with matching taints.

Setup:

  • Kubernetes — v1.8+
  • All cluster nodes reside in public subnets.
  • Cluster Autoscaler is configured for the cluster nodes.
  • Prometheus is used for monitoring.

Requirement:

  • One specific Microservice needs to run on a Private Node.

Workflow:

  • Provision a private node and add it to the running Kubernetes cluster.
  • Taint and label the new private node.
  • Deploy the MicroService with a matching toleration and node selector.
  • Attach the new node to the existing nodes autoscaling group.
  • Fix the Prometheus DaemonSetsMissScheduled alert.

Provision Private Node

Since all the existing nodes are in public subnets, we first need to provision a private node and add it to the existing Kubernetes cluster. Note the AMI used by the running cluster nodes and launch an EC2 instance from that AMI, selecting an existing private subnet and the existing nodes' IAM role.

Copy the userdata script from the existing nodes' launch configuration. This script joins the node to the Kubernetes cluster as soon as it is provisioned.

Paste it into the User data field in the Advanced Details section.
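If you prefer the CLI, the user data can also be pulled out of the existing launch configuration with the AWS CLI; the launch configuration name below is a placeholder for your own:

aws autoscaling describe-launch-configurations \
  --launch-configuration-names <existing-nodes-launch-config> \
  --query 'LaunchConfigurations[0].UserData' \
  --output text | base64 --decode > userdata.sh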

Add the same tags as the existing nodes. Make sure to add the “KubernetesCluster” tag.
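The same provisioning can be scripted as well. A rough sketch, assuming the user data was saved to userdata.sh as above; all IDs and names are placeholders, so use the AMI, instance type, subnet, security group, instance profile and cluster name of your existing nodes:

# Launch a private node from the existing nodes' AMI with the copied user data and tags
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type m4.large \
  --subnet-id subnet-0abc1234 \
  --security-group-ids sg-0abc1234 \
  --iam-instance-profile Name=nodes-instance-profile \
  --user-data file://userdata.sh \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=KubernetesCluster,Value=<cluster-name>},{Key=Name,Value=private-node}]'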

Launch it. Once the server is provisioned, log in to it and check the syslog. Ensure the Docker containers are running:

docker ps

Now execute the below command from a machine that has access to the Kubernetes API:

kubectl get nodes

It should list the new private node. The node is now added to the existing Kubernetes Cluster.

Taint the Private Node

Taint the private node by executing the below command:

kubectl taint nodes ip-xx-xx-xx-xxx.REGION.compute.internal private=true:NoSchedule

where the key=value pair is private=true and the effect is NoSchedule. This means that no pod will be scheduled onto the specified node unless it has a matching toleration. The key-value pair can be changed to suit your setup.
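The general form of the command, and a quick way to confirm the taint was applied (node name is a placeholder):

# General form: kubectl taint nodes <node-name> <key>=<value>:<effect>
kubectl describe node ip-xx-xx-xx-xxx.REGION.compute.internal | grep Taints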

If you want to list the tainted nodes, you can do so via a Go template:

tolerations.tmpl

{{printf "%-50s %-12s\n" "Node" "Taint"}}
{{- range .items}}
{{- if $taint := (index .spec "taints") }}
{{- .metadata.name }}{{ "\t" }}
{{- range $taint }}
{{- .key }}={{ .value }}:{{ .effect }}{{ "\t" }}
{{- end }}
{{- "\n" }}
{{- end}}
{{- end}}

Execute:

kubectl get nodes -o go-template-file="tolerations.tmpl"
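If you prefer a one-liner over a template file, custom columns should print roughly the same information:

kubectl get nodes -o custom-columns='NODE:.metadata.name,TAINTS:.spec.taints'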

Label the Node

Apply a label to the private node by executing the below command:

kubectl label nodes <Node> <key>=<value>

For example: kubectl label nodes ip-xx-xx-xx-xxx.REGION.compute.internal private=true
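To verify, filter the nodes on the label (the same key=value as above); only the private node should be listed:

kubectl get nodes -l private=true
kubectl get nodes --show-labels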

Deploy the MicroService with Tolerations

Update the deployment.yaml to include a toleration matching the taint applied to the node, and the node label in the nodeSelector:

tolerations:
- key: "private"
  operator: "Equal"
  value: "true"
  effect: "NoSchedule"
nodeSelector:
  private: "true"

Deploy it.

kubectl apply -f deployment.yaml

Execute “kubectl get pod <podname> -o wide” and check the NODE column to confirm the pod was scheduled onto the private node.

Attaching the Private Node to the existing AutoScaling Group

Enable termination protection on the private node and temporarily suspend the Terminate process on the nodes autoscaling group.

Attach the new node to the nodes autoscaling group: in the Auto Scaling Groups console, select the private node instance and choose “Set Scale In Protection”.

Since Cluster Autoscaler is configured for the cluster nodes, the new private node could otherwise be terminated by the autoscaler (due to its lower load compared to the other nodes). Therefore, it is safer to set instance protection on the private node.
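Depending on your Cluster Autoscaler version, the node can additionally be excluded from scale-down on the Kubernetes side with the scale-down-disabled annotation (node name is a placeholder):

kubectl annotate node ip-xx-xx-xx-xxx.REGION.compute.internal cluster-autoscaler.kubernetes.io/scale-down-disabled=true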

Now remove the Terminate process from the suspended processes. Keep an eye on the Cluster Autoscaler logs; the private node should be skipped by the autoscaler.
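For reference, the same protect/suspend/attach/resume flow could be scripted with the AWS CLI; the instance ID and autoscaling group name below are placeholders:

# Enable EC2 termination protection on the private node
aws ec2 modify-instance-attribute --instance-id i-0abc1234567890def --disable-api-termination
# Temporarily stop the ASG from terminating instances
aws autoscaling suspend-processes --auto-scaling-group-name nodes-asg --scaling-processes Terminate
# Attach the private node and protect it from scale-in
aws autoscaling attach-instances --instance-ids i-0abc1234567890def --auto-scaling-group-name nodes-asg
aws autoscaling set-instance-protection --instance-ids i-0abc1234567890def --auto-scaling-group-name nodes-asg --protected-from-scale-in
# Resume the Terminate process once done
aws autoscaling resume-processes --auto-scaling-group-name nodes-asg --scaling-processes Terminate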

Fix Prometheus DaemonSetsMissScheduled Alert

After setting up the private node, we started getting DaemonSetsMissScheduled alerts from Prometheus for the calico-node DaemonSet. We debugged the issue and followed the steps below to fix it.

Problem: We had a total of 8 nodes in our cluster (including masters, nodes in public subnets and the node in the private subnet), but “desiredNumberScheduled” in the DaemonSet status showed 7, excluding the private node.

Solution: Since the private node is tainted, the DaemonSet's pods must tolerate the taint. To fix the above problem, we added the same toleration used for the private node to the calico-node DaemonSet.

Execute:

kubectl edit ds/calico-node -n kube-system

Check the value of “desiredNumberScheduled”. In our case it was one less than the total number of nodes; you can get the number of nodes with “kubectl get nodes”.
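Both numbers can also be pulled out directly for a quick comparison:

kubectl get ds calico-node -n kube-system -o jsonpath='{.status.desiredNumberScheduled}'
kubectl get nodes --no-headers | wc -l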

Next, add the same toleration as you provided for the private node's pods in the second step above (Taint the Private Node).
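The toleration goes under spec.template.spec.tolerations of the DaemonSet, alongside any tolerations that are already there:

tolerations:
- key: "private"
  operator: "Equal"
  value: "true"
  effect: "NoSchedule"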

Now execute:

kubectl describe ds/calico-node -n kube-system

Check the “Desired Number of Nodes Scheduled:” value. It should now equal the number of nodes currently in the cluster.

Look at the status of calico-node pods too:

kubectl get pods -n kube-system -o wide -l k8s-app=calico-node

And that’s how we were able to assign a specific pod to a particular node without any misconfiguration. Hope you found it useful. Keep following our blogs for further parts on Kubernetes, and do visit the previous parts of this series.

