Kubernetes Security Practices on AWS


Written by Praful Tamrakar, Senior Cloud Engineer, Powerupcloud Technologies

Security at the Cloud and Infrastructure Level

  1. Ensure the worker node AMIs meet the CIS benchmark. For the Kubernetes benchmark itself, several tools and resources can automate validation of a cluster against the CIS Kubernetes Benchmark (see the sketch after this list).
  2. Verify that the Security Groups and NACLs do not allow all traffic, and that the rules allow access only to the ports and protocols needed for the application and SSH.
  3. Make sure you have encryption of data at rest. Amazon KMS can be used for encryption of data at rest. For example:
  • EBS volumes for control plane nodes and worker nodes can be encrypted via KMS.
  • Log data, whether in CloudWatch Logs or in S3, can be encrypted using KMS.
  4. If instances are behind an ELB, make sure HTTPS encryption and decryption (generally known as SSL termination) is handled by the Elastic Load Balancer.
  5. Make sure the worker nodes and RDS are provisioned in private subnets.
  6. It's always best practice to have a separate Kubernetes (EKS) cluster for each environment (Dev/UAT/Prod).
  7. Use AWS Shield/WAF to prevent DDoS attacks.
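
One commonly used tool for these checks is kube-bench, which runs the CIS Kubernetes Benchmark tests as a Job inside the cluster. A minimal sketch (the manifest path is taken from the kube-bench project and may change between releases):

kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl logs job/kube-bench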

Container Level

  1. Use a minimal base image (e.g., an Alpine image to run the app).
  2. Ensure that the Docker image registry you are using is a trusted, authorized, and private registry, e.g., Amazon ECR.
  3. Make sure you remove all unnecessary files from your Docker image. E.g., in a Tomcat server, you need to remove:
  • $CATALINA_HOME/webapps/examples
  • $CATALINA_HOME/webapps/host-manager
  • $CATALINA_HOME/webapps/manager
  • $CATALINA_HOME/conf/Catalina/localhost/manager.xml
  4. Disable the display of the app-server version or server information. For example, in the Tomcat server below, the server information is displayed. This can be mitigated using the procedure below.

Set server.info to an empty value (server.info="") in the file $CATALINA_HOME/lib/org/apache/catalina/util/ServerInfo.properties.
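
In a containerized Tomcat this can be done at image build time. A minimal sketch, assuming the official tomcat base image and the default CATALINA_HOME of /usr/local/tomcat:

FROM tomcat:9
# Override ServerInfo.properties on the common classpath so the version string is blanked out
RUN mkdir -p /usr/local/tomcat/lib/org/apache/catalina/util && \
    printf 'server.info=\nserver.number=\nserver.built=\n' \
    > /usr/local/tomcat/lib/org/apache/catalina/util/ServerInfo.properties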

  5. Ensure not to copy or add any sensitive file/data into the Docker image; it's always recommended to use Secrets instead (Kubernetes Secrets can be encrypted at rest from Kubernetes v1.13 onwards). You may also use another secret management tool of your choice, such as AWS Secrets Manager or HashiCorp Vault.
    • E.g., do not put database endpoints, usernames, or passwords in the Dockerfile. Use K8s Secrets, and consume them as environment variables:
apiVersion: v1
kind: Pod
metadata:
  name: secret-env-pod
spec:
  containers:
  - name: myapp
    image: myapp
    env:
      - name: DB_USERNAME
        valueFrom:
          secretKeyRef:
            name: dbsecret
            key: username
      - name: DB_PASSWORD
        valueFrom:
          secretKeyRef:
            name: dbsecret
            key: password
      - name: DB_ENDPOINT
        valueFrom:
          secretKeyRef:
            name: dbsecret
            key: endpoint

6. Ensure Bash is removed or disabled in the container images.

7. Adopt multi-stage builds for smaller, cleaner, and more secure images.

To understand how you can leverage multi-stage builds, see:

https://docs.docker.com/develop/develop-images/multistage-build/
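
A minimal multi-stage sketch (a Go service is used here purely for illustration; the same pattern applies to any runtime):

# Stage 1: build the binary with the full toolchain
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

# Stage 2: copy only the built artifact into a minimal runtime image
FROM alpine:3.19
COPY --from=build /app /app
USER 1000
ENTRYPOINT ["/app"]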

8. Verify that container images are scanned for vulnerabilities before they are pushed to the registry. AWS ECR has a repository-level Scan on Push feature, and assessment tools such as Clair or Aqua can also be used to scan images. These tools can be embedded in the CI/CD pipeline so that the image push is rejected/terminated if any vulnerability is found. Find a sample implementation at https://www.powerupcloud.com/email-va-report-of-docker-images-in-ecr/

K8s Level

  1. Make sure to use, or upgrade to, the latest stable version of Kubernetes.
  2. It's recommended not to use the default namespace. Instead, create a namespace for each application, i.e., separate namespaces for separate sensitive workloads.
  3. Make sure to enable Role-Based Access Control (RBAC) for clients (service accounts/users) so that they receive only restricted privileges.

RBAC Elements:

  • Subjects: The set of users and processes that want to access the Kubernetes API.
  • Resources: The set of Kubernetes API Objects available in the cluster. Examples include Pods, Deployments, Services, Nodes, and PersistentVolumes, among others.
  • Verbs: The set of operations that can be executed to the resources above. Different verbs are available (examples: get, watch, create, delete, etc.), but ultimately all of them are Create, Read, Update or Delete (CRUD) operations.

Let's see what RBAC means for running Kubernetes as a production-ready platform:

  • Have multiple users with different properties, establishing a proper authentication mechanism.
  • Have full control over which operations each user or group of users can execute.
  • Have full control over which operations each process inside a pod can execute.
  • Limit the visibility of certain resources of namespaces.
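
As an illustration, a namespace-scoped Role and RoleBinding that give a service account read-only access to Pods might look like this (the namespace, role, and service-account names are placeholders):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: app-dev
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: app-dev
subjects:
- kind: ServiceAccount
  name: app-sa
  namespace: app-dev
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io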

4. Make sure to standardize the naming and labeling conventions for Pods, Deployments, and Services. This will ease the operational burden of security management (e.g., pod network policies).

5. Use Kubernetes network policies to restrict pod communication, i.e., how groups of pods are allowed to communicate with each other and with other network endpoints. Please find how to implement network policies on Amazon EKS at https://blog.powerupcloud.com/restricting-k8s-services-access-on-amazon-eks-part-ix-7d75c97c9f3e
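
For instance, a default-deny ingress policy (namespace name is a placeholder) blocks all incoming pod traffic in that namespace until further policies explicitly allow it:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: app-dev
spec:
  podSelector: {}
  policyTypes:
  - Ingress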

6. AWS Single Sign-On (SSO), AWS Managed Microsoft Active Directory Service, and the AWS IAM authenticator can be used to control access to your Amazon EKS cluster running on the AWS cloud.

7. Make sure to use a Pod Security Context.

  • Disable root access; the Docker image should run as a non-root user.
  • Make sure to configure a read-only root filesystem.
  • Security-Enhanced Linux (SELinux): you can assign SELinuxOptions objects using the seLinuxOptions field. Note that the SELinux module needs to be loaded on the underlying Linux nodes for these policies to take effect.
  • Drop unneeded Linux capabilities, and add non-default Linux capabilities only if they are required.
  • Make sure not to run pods/containers as privileged unless you require access to all devices on the host. Permission to access an object, like a file, is based on user ID (UID) and group ID (GID).

Please find a snippet for the Pod security context below:

...
spec:
  securityContext:              # pod-level settings
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seLinuxOptions:
      level: "s0:c123,c456"
  containers:
  - name: myapp
    image: myapp
    securityContext:            # container-level settings
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - NET_RAW
          - CHOWN
        add: ["NET_ADMIN", "SYS_TIME"]
...

Note: a security context can be set at the pod level as well as at the container level, as shown below.

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo-2
spec:
  #Pod level
  securityContext:
    runAsUser: 1000
  containers:
  - name: sec-ctx-demo-2
    image: gcr.io/google-samples/node-hello:1.0
   #container level
    securityContext:
      runAsUser: 2000
      allowPrivilegeEscalation: false

8. Make sure to enable these Kubernetes admission controllers wherever possible:

  • AlwaysPullImages – modifies every new Pod to force the image pull policy to Always. This is useful in a multi-tenant cluster so that users can be assured that their private images can only be used by those who have the credentials to pull them.
  • DenyEscalatingExec – denies exec and attach commands to pods that run with escalated privileges that allow host access. This includes pods that run as privileged, have access to the host IPC namespace, or have access to the host PID namespace.
  • ResourceQuota – observes the incoming request and ensures that it does not violate any of the constraints enumerated in the ResourceQuota object in a Namespace.
  • LimitRanger – observes the incoming request and ensures that it does not violate any of the constraints enumerated in the LimitRange object in a Namespace, e.g., CPU and memory.
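
On a self-managed control plane, admission plugins are enabled through a kube-apiserver flag (on EKS the API server is managed by AWS, so this is not user-configurable); a sketch:

kube-apiserver \
  --enable-admission-plugins=AlwaysPullImages,DenyEscalatingExec,ResourceQuota,LimitRanger \
  ...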

9. Ensure manifest files (YAML/JSON) are scanned for any credentials passed in objects (Deployments, charts), using tools such as Palo Alto Prisma or the Alcide Kubernetes Advisor.

10. Use TLS authentication for Tiller when Helm (v2) is being used.

11. It's always recommended not to use the default service account.

  • The default service account has a very wide range of permissions in the cluster and should therefore be disabled.

12. Do not create a service account or a user with full cluster-admin privileges unless necessary; always follow the least-privilege rule.

13. Make sure to disable anonymous access and send Unauthorized responses to unauthenticated requests. Verify the following Kubernetes security settings when configuring kubelet parameters:

  • --anonymous-auth is set to false to disable anonymous access (the kubelet will then send 401 Unauthorized responses to unauthenticated requests).
  • The kubelet has a --client-ca-file flag, providing a CA bundle to verify client certificates.
  • --authorization-mode is not set to AlwaysAllow; the more secure Webhook mode delegates authorization decisions to the Kubernetes API server.
  • --read-only-port is set to 0 to avoid unauthorized connections to the read-only endpoint (optional).

14. Ensure access to etcd is restricted to only the API server and the nodes that need it. This can be restricted in the security group attached to the control plane.

K8s API Call Level

  1. Ensure that all communication from clients (pods/end users) to the Kubernetes API server is TLS encrypted.
    • Note that you may experience throttling if a huge number of API calls happens.
  2. Ensure that all communication from the K8s API server to etcd, the kube-controller-manager, the kubelet, the worker nodes, kube-proxy, and the kube-scheduler is TLS encrypted.
  3. Enable control plane API call logging and auditing (e.g., EKS control plane logging; see the example after this list).
  4. If you are using a managed Kubernetes service such as Amazon EKS, GKE, or Azure Kubernetes Service (AKS), all of this is taken care of for you.
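
For example, EKS control plane logging for all log types can be enabled with a single CLI call (the cluster name and region here are illustrative):

aws eks update-cluster-config \
  --region us-east-2 \
  --name puck8s \
  --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'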

EKS Security Considerations

  • EKS does not support Kubernetes network policies out of the box, nor any other way to create firewall rules for Kubernetes workloads apart from Security Groups on the worker nodes, since it uses the VPC CNI plugin by default, which does not support network policy. Fortunately, this has a simple fix: the Calico CNI can be deployed in EKS to run alongside the VPC CNI, providing Kubernetes network policy support.
  • Ensure to protect EC2 instance role credentials and manage AWS IAM permissions for pods. This can be configured using the approach below.
  • By using the IAM roles for service accounts (IRSA) feature, we no longer need to provide extended permissions to the worker node's IAM role so that pods on that node can call AWS APIs. We can scope IAM permissions to a service account, and only pods that use that service account have access to those permissions. This feature also eliminates the need for third-party solutions such as kiam or kube2iam.

https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-technical-overview.html
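
If eksctl is used, associating an IAM role with a service account can be done roughly as follows (the cluster, namespace, service-account name, and policy ARN are placeholders):

eksctl utils associate-iam-oidc-provider --cluster puck8s --approve
eksctl create iamserviceaccount \
  --cluster puck8s \
  --namespace app-dev \
  --name app-sa \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
  --approve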

Security Monitoring of K8s

Sysdig Falco is an open-source container security monitor designed to detect anomalous activity in your containers. Falco taps into your host's (or node's, in the case of Kubernetes) system calls to generate an event stream of all system activity. Falco's rules engine then allows you to create rules based on this event stream, letting you alert on system events that seem abnormal. Since containers should have a very limited scope in what they run, you can easily create rules to alert on abnormal behavior inside a container.

Ref: https://sysdig.com/opensource/falco/
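
As an illustration, a Falco rule that flags an interactive shell spawned inside a container could look roughly like this:

- rule: Terminal shell in container
  desc: A shell was spawned inside a container
  condition: container.id != host and proc.name in (bash, sh)
  output: "Shell spawned in a container (user=%user.name container=%container.name command=%proc.cmdline)"
  priority: WARNING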

The Alcide Advisor is a continuous Kubernetes and Istio hygiene-check tool that provides a single-pane view of all your K8s-related issues, including audits, compliance, topology, networks, policies, and threats. This ensures that you get a better understanding and control of distributed and complex Kubernetes projects through continuous and dynamic analysis. A partial list of the checks it runs includes:

  • Kubernetes vulnerability scanning
  • Hunting misplaced secrets, or excessive secret access
  • Workload hardening from Pod Security to network policies
  • Istio security configuration and best practices
  • Ingress controllers for security best practices.
  • Kubernetes API server access privileges.
  • Kubernetes operators security best practices.

Ref :https://aws.amazon.com/blogs/apn/driving-continuous-security-and-configuration-checks-for-amazon-eks-with-alcide-advisor/

Migrate for Anthos: Modernized approach for migrating Compute Engine to Kubernetes Engine


Written by Madan Mohan K, Associate Cloud Architect

Anthos: One Management Solution for a Hybrid and Multi-Cloud World

The growing importance of hybrid cloud and multi-cloud environments is transforming the entire computing industry as well as the way businesses can leverage technology to innovate. Economics and speed are the two greatest issues driving this market change. Using a hybrid cloud/multi-cloud not only allows companies to scale computing resources, but it also eliminates the need to make massive capital expenditures to handle short-term spikes in demand as well as when the business needs to free up local resources for more sensitive data or applications.

Anthos:

Anthos is an application platform that enables an enterprise to modernize its existing applications in hybrid or multi-cloud environments. You can also build new applications and run them anywhere in a secure manner. Anthos is built on open-source technologies pioneered by Google, including Kubernetes, Istio, and Knative, and enables consistency between on-premises and cloud environments.

When workloads are migrated to containers, IT departments can eliminate OS-level maintenance and security patching for VMs and automate policy and security updates at scale. Monitoring across on-premises and cloud environments is done through a single interface in the Google Cloud Console.

Scenario:

Rewriting existing applications for Kubernetes isn't always possible, or feasible to do manually. That's where Migrate for Anthos can help, by modernizing existing applications and getting them to run in Kubernetes.

Migrate for Anthos:

Migrate for Anthos provides an almost real-time solution to take an existing VM and make it available as a Kubernetes-hosted pod, with all the value that comes from executing applications in a Kubernetes cluster.

Let’s look at an example, migrating a Compute Engine instance to a Kubernetes Engine cluster running Migrate for Anthos to start with the basics.

Prerequisites:

https://cloud.google.com/migrate/anthos/docs/gce-to-gke-prerequisites

Compatible VM operating systems:

https://cloud.google.com/migrate/anthos/docs/compatible-os-versions

Instance Creation:

  • From the Console go to Compute Engine > VM Instances, then click the Create button
  • Name the instance “migrate-vm-anthos” or whichever preferred, check the box for “Allow HTTP traffic“, and accept all the other defaults. Click Create.
  • Once the VM is created, SSH into it.

Install the Apache web server by running the following commands:

sudo apt-get update
 sudo apt-get install apache2 -y
 echo "Hello World" > index.html
 sudo mv index.html /var/www/html

A sample Hello World page is displayed.

Note: To migrate the VM, first stop it from running

We need a Kubernetes cluster to migrate the virtual machine into. The Migrate for Anthos app can be deployed to an existing cluster as well. Let’s install Migrate for Anthos through the Google Cloud Marketplace.

Deploying Migrate for Anthos

Navigate to Market place and search Migrate for Anthos.

Click the configure button.

For this lab, we can accept the default settings. Click the Create cluster button.

Once the cluster is created, check the box to accept the Terms of Service, then click the Deploy button. The migration for the Anthos environment will now be set up.

Migrating your VM to your new container:

Open cloud shell and run the following

pip3 install --user pyyaml 

This installs a Python prerequisite that will process YAML files.

Execute the following command

python3 /google/migrate/anthos/gce-to-gke/clone_vm_disks.py \
  -p $GOOGLE_CLOUD_PROJECT \
  -z us-central1-a \
  -T us-central1-a \
  -i migrate-vm-anthos \
  -A myworkload \
  -o myYaml.yaml

This command will take a few minutes to complete. The migration file is a YAML file called myYaml.yaml, created by Migrate for Anthos. When deployed to Kubernetes, it will perform the migration.

A successful yaml generation is seen in the below screenshot

Next, we must initialize the kubectl environment and perform the migration by running the following command:

kubectl apply -f myYaml.yaml

The execution result is obtained as shown

In the Console, from the Navigation menu, browse to Kubernetes Engine > Workloads. You will see a workload called myworkload. Wait for its status to change to OK

Validate the Migrated Instance:

In Cloud Shell, log in to the Kubernetes pod that is running the workload that has been migrated:

kubectl exec -it myworkload-0 -- /bin/bash
curl localhost

Well, the application works as expected.

Open the code editor in Cloud Shell, edit the myYaml.yaml file, and add the following at the top:

apiVersion: v1
kind: Service
metadata:
  name: myworkload-svc
  labels:
    app: myworkload
spec:
  type: LoadBalancer
  ports:
  - port: 80
    name: web
  selector:
    app: myworkload
---

Now find the StatefulSet definition and locate the containers entry nested within it. Add the following two lines directly below name: myworkload:

ports:
- containerPort: 80

Make sure the indentation places the ports element at the same level as name: myworkload, inside the same container entry. Save the file.

Apply the Kubernetes changes:

kubectl apply -f myYaml.yaml

There you go: the service is exposed, and we can validate it by navigating to the Services & Ingress section.

A browser tab will appear, and you will see the web page which is the simple text “Hello World”. This illustrates that you have successfully migrated the web server that was running in the Compute Engine to be running in a Kubernetes cluster.

Inference:

Anthos unites all of Google Cloud Platform's powerful tools under one roof, and in doing so it delivers unprecedented efficiency, scalability, and cost-effectiveness to IT operations. With Anthos, an organization can enjoy the full benefits of managing its multi- and hybrid-cloud environment with ease, and it also gains the ability to innovate using cloud technologies.

Running Kubernetes Workloads on AWS Spot Instances-Part 8


Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies

Till now we have practiced a lot on the On-Demand nodes of a K8s cluster. This post demonstrates how to use Spot Instances as K8s worker nodes, and covers provisioning, automatic scaling, and handling interruptions (termination) of K8s worker nodes across your cluster. Spot Instances can save you 70–90% in cost as compared to On-Demand. Though Spot Instances are cheaper, you cannot run all your worker nodes as Spot. You must have some On-Demand Instances as a backup, because Spot Instances can betray you anytime with interruptions 😉

In this article, we are discussing how you can use Spot Instances on EKS Cluster as well as the cluster you own on EC2 Servers.

Refer to our public Github Repo which contains the files/templates we have used in the implementation. This blog is covering the below-mentioned points:

Kubernetes Operations with AWS EKS

AWS EKS is a managed service that simplifies the management of Kubernetes servers. It provides a highly available and secure K8s control plane. There are two major components associated with your EKS Cluster:

  • EKS control plane which consists of control plane nodes that run the Kubernetes software, like etcd and the Kubernetes API server.
  • EKS worker nodes that are registered with the control plane.

With EKS, the need to manage the installation, scaling, or administration of master nodes is no longer required i.e. AWS will take care of the control plane and let you focus on your worker nodes and application.

Prerequisites

  • EC2 Server to provision the EKS cluster using AWS CLI commands.
  • The latest version of AWS CLI Installed on your Server
  • IAM Permissions to create the EKS Cluster. Create an IAM Instance profile with the permissions attached and assign it to the EC2 Server.
  • EKS Service Role
  • Kubectl installed on the server.

Provision K8s Cluster with EKS

Execute the below command to provision an EKS Cluster:

aws eks create-cluster --name puck8s --role-arn arn:aws:iam::ACCOUNT:role/puc-eks-servicerole --resources-vpc-config subnetIds=subnet-xxxxx,subnet-xxxxx,subnet-xxxxxx,securityGroupIds=sg-xxxxx --region us-east-2

We have given private subnets available in our account to provision a private cluster.

Wait for the cluster to become available.

aws eks describe-cluster --name puck8s --query cluster.status --region us-east-2

Amazon EKS uses IAM to provide authentication to your Kubernetes cluster through the AWS IAM Authenticator for Kubernetes (link in the References section below). Install it using the below commands:

curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/bin/linux/amd64/aws-iam-authenticator
chmod +x ./aws-iam-authenticator
cp ./aws-iam-authenticator /usr/bin/aws-iam-authenticator

Update ~/.kube/config file which will be used by kubectl to access the cluster.

aws eks update-kubeconfig --name puck8s --region us-east-2

Execute “kubectl get svc”.

Launch Spot and OnDemand Worker Nodes

We have provisioned the EKS worker nodes using a CloudFormation template provided by AWS. The template is available in our Github repo as well, i.e., provision-eks-worker-nodes/amazon-eks-node group-with-spot.yaml. The template will provision three Auto Scaling Groups:

  • 2 ASG with Spot Instances with two different Instance types as given in the parameters
  • 1 ASG with OnDemand Instance with Instance type as given in the parameter

Create a Cloudformation stack and provide the values in the parameters. For the AMI parameter, enter the ID from the below table:

| Region                  |      AMI               | 
|-------------------------| ---------------------- |
| US East(Ohio)(us-east-2)| ami-0958a76db2d150238 |

Launch the stack and wait for the stack to be completed. Note down the Instance ARN from the Outputs.

Now get the config map from our repo.

https://github.com/powerupcloud/kubernetes-spot-webinar/blob/master/provision-eks-worker-nodes/aws-cm-auth.yaml

Open the file "aws-cm-auth.yaml" and replace the <ARN of instance role (not instance profile)> snippet with the NodeInstanceRole value that you recorded in the previous step, and save the file.

kubectl apply -f aws-cm-auth.yaml
kubectl get nodes --watch

Wait for the nodes to be ready.

Kubernetes Operations with KOPS

Kops is an official Kubernetes project for managing production-grade Kubernetes clusters. It has commands for provisioning multi-node clusters, updating their settings including nodes and masters, and applying infrastructure changes to an existing cluster. Currently, Kops is arguably the best tool for managing k8s clusters on AWS.

Note: You can use kops in the AWS regions which AWS EKS doesn't support.

Prerequisites:

  • Ec2 Server to provision the cluster using CLI commands.
  • Route53 domain, (for example, k8sdemo.powerupcloud.com) in the same account from where you are provisioning the cluster. Kops uses DNS for identifying the cluster. It adds the records for APIs in your Route53 Hosted Zone.
Note: For public hosted zone, you will have to add the NS records for the above domain to your actual DNS. For example, we have added an NS record for "k8sdemo.powerupcloud.com" to "powerupcloud.com". This will be used for the DNS resolution. For the private hosted zone, ensure to add the VPCs.
  • IAM Permissions to create the cluster resources and update DNS records in Route53. Create an IAM Instance profile with the permissions attached and assign it to the EC2 Server.
  • S3 bucket for the state store.
  • Kubectl installed.

Install Kops

Log into the EC2 server and execute the below command to install Kops on the Server:

curl -LO https://github.com/kubernetes/kops/releases/download/$(curl -s https://api.github.com/repos/kubernetes/kops/releases/latest | grep tag_name | cut -d '"' -f 4)/kops-linux-amd64
chmod +x kops-linux-amd64
sudo mv kops-linux-amd64 /usr/local/bin/kops

Provision K8s Cluster

kops create cluster k8sdemo.powerupcloud.com --ssh-public-key ~/.ssh/id_rsa.pub --master-zones ap-south-1a --zones ap-south-1a,ap-south-1b,ap-south-1a --master-size=t2.medium --node-count=1 --master-count 1 --node-size t2.medium --topology private --dns public --networking calico --vpc vpc-xxxx --state s3://k8sdemo-kops-state-store --subnets subnet-xxxx,subnet-xxxx --utility-subnets subnet-xxxx,subnet-xxxx --kubernetes-version 1.11.4 --admin-access xx.xxx.xxxx.xx/32 --ssh-access xx.xxx.xxx.xx/32 --cloud-labels "Environment=DEMO"

Refer to our previous blog for the explanation of the arguments in the above command.

kops update cluster --yes

Once the above command is successful, we will have a private K8s Cluster ready with Master and Nodes in the private subnets.

Use the command “kops validate cluster CLUSTER_NAME” to validate the nodes in your k8s cluster.

Create Instance Groups for Spot and OnDemand Instances

Kops Instance Group helps in the grouping of similar instances which maps to an Autoscaling Group in AWS. We can use the “kops edit” command to edit the configuration of the nodes in the editor. The “kops update” command applies the changes to the existing nodes.

Once we have provisioned the cluster, we will have two Instance groups i.e. One for master and One for Nodes. Execute the below command to get available Instance Groups:

kops get ig

Edit the nodes instance group to provision spot workers, adding the below key values. Set the maxPrice property to your bid; for example, "0.10" represents a spot-price bid of $0.10 (10 cents) per hour.

spec:
...
  maxPrice: "1.05"
  nodeLabels:
    lifecycle: Ec2Spot
    node-role.kubernetes.io/spot-worker: "true"

The final configuration will look as shown in the below screenshot:

Create one more Spot Instance Group for a different instance type.

kops create ig nodes2 --subnet ap-south-1a,ap-south-1b --role Node
kops edit ig nodes2

Add maxPrice and node labels; the final configuration will look as shown in the below screenshot:

Now, we have configured two spot worker node groups for our cluster. Create an instance group for OnDemand Worker Nodes by executing the below command:

kops create ig ondemand-nodes --subnet ap-south-1a,ap-south-1b --role Node

kops edit ig ondemand-nodes

Add node labels for the OnDemand workers.

Also, we have added taints to keep pods off the OnDemand worker nodes; preferably, new pods will be scheduled on the Spot workers (a sketch follows below).
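
A sketch of what the on-demand instance group spec might contain after these edits (the label and taint values are illustrative):

spec:
...
  nodeLabels:
    lifecycle: OnDemand
  taints:
  - nodeRole=ondemand-worker:PreferNoSchedule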

To apply the above configurations, execute the below command:

kops update cluster
kops update cluster --yes
kops rolling-update cluster --yes

Cluster Autoscaler

Cluster Autoscaler is an open-source tool which automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:

  • there are pods that failed to run in the cluster due to insufficient resources
  • there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.

CA will run as a daemonset on the Cluster OnDemand Nodes. The YAML file for daemonset is provided in our Github Repo i.e. https://github.com/powerupcloud/kubernetes-spot-webinar/tree/master/cluster-autoscaler.

Update the following variables in the cluster-autoscaler/cluster-autoscaler-ds.yaml

  • Autoscaling Group Names of On-demand and Spot Groups
  • Update minimum count of instances in the Autoscaling group
  • Update max count of instances in the Autoscaling group
  • AWS Region
  • The node selector, which ensures the CA pods always run on the OnDemand nodes.

Create the Cluster Autoscaler on both of the k8s clusters, EKS as well as the cluster provisioned using kops. Ensure the below permissions are attached to the IAM role assigned to the cluster worker nodes:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}

The daemonset YAML file for the EKS cluster will look as shown in the below screenshot.

Similarly, for the cluster provisioned using Kops, the YAML file will be:


Create DaemonSet.

kubectl create -f cluster-autoscaler/cluster-autoscaler-ds.yaml

Now create a pod disruption budget for CA, which ensures that at least one cluster autoscaler pod is always running.

kubectl create -f cluster-autoscaler/cluster-autoscaler-pdb.yaml
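
A pod disruption budget of this shape achieves that (the API version matches the Kubernetes 1.11-era clusters used here, and the label selector is an assumption that must match the CA pod labels):

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: cluster-autoscaler-pdb
  namespace: kube-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: cluster-autoscaler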

Verify the Cluster autoscaler pod logs in kube-system namespace:

kubectl get pods -n kube-system
kubectl logs -f pod/cluster-autoscaler-xxx-xxxx -n kube-system

Spot Termination Handler

The major drawbacks of a Spot Instance are:

  • it may take a long time to become available (or may never become available),
  • and it may be reclaimed by AWS at any time.

Amazon EC2 can interrupt your Spot Instance when the Spot price exceeds your maximum price, when the demand for Spot Instances rises, or when the supply of Spot Instances decreases. Whenever you are opting for Spot, you should always be prepared for the interruptions.

So, we are creating an interrupt handler on the clusters, which runs as a daemonset on the Spot worker nodes. The workflow of the Spot Interrupt Handler can be summarized as:

  • Identify that a Spot Instance is being reclaimed.
  • Use the 2-minute notification window to gracefully prepare the node for termination.
  • Taint the node and cordon it off to prevent new pods from being placed.
  • Drain connections on the running pods.
  • To maintain desired capacity, replace the pods on remaining nodes

Create the Spot Interrupt Handler DaemonSet on both the k8s clusters using the below command:

kubectl apply -f spot-termination-handler/deploy-k8-pod/spot-interrupt-handler.yaml

Deploy Microservices with Istio

We have taken a BookInfo Sample application to deploy on our cluster which uses Istio.

Istio is an open platform to connect, manage, and secure microservices. For more info, see the link in the References section below. To deploy Istio on the k8s cluster, follow the steps below:

wget https://github.com/istio/istio/releases/download/1.0.4/istio-1.0.4-linux.tar.gz
tar -xvzf istio-1.0.4-linux.tar.gz
cd istio-1.0.4

In our case, we have provisioned the worker nodes in private subnets. For Istio to provision a publicly accessible load balancer, tag the public subnets in your VPC with the below tag:

kubernetes.io/cluster/puck8s:shared

Install helm from the link below:

https://github.com/helm/helm

kubectl create -f install/kubernetes/helm/helm-service-account.yaml
helm init --service-account tiller --wait
helm install --wait --name istio --namespace istio-system install/kubernetes/helm/istio --set global.configValidation=false --set sidecarInjectorWebhook.enabled=false
kubectl get svc -n istio-system

You will get the LoadBalancer endpoint.

Create a gateway for the Bookinfo sample application.

kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml

The BookInfo sample application source code, Dockerfile, and Kubernetes deployment YAML files are available in the sample-app directory in our Github repo.

Build a docker image out of provided Dockerfiles and update the IMAGE variable in k8s/deployment.yaml for all the four services. Deploy each service using:

kubectl apply -f k8s

Hit http://LB_Endpoint/productpage. You will get the frontend of your application.

AutoScaling when the Application load is High

If the number of pods increases with the application load, the cluster autoscaler will provision more worker nodes in the Autoscaling Group. If the Spot Instance is not available, it will opt for OnDemand Instances.

Initial Settings in the ASG:

Scale up the number of pods for one deployment, for example, product page. Execute:

kubectl scale --replicas=200 deployment/productpage-v1

Watch the Cluster Autoscaler manage the ASG.

Similarly, if the application load is less, CA will manage the size of the ASG.

Note: We don't recommend running stateful applications on Spot nodes. Use OnDemand nodes for your stateful services.

and that’s all..!! Hope you found it useful. Happy Savings..!!

References:

Automated Deployment of PHP Application using Gitlab CI on Kubernetes – Part 7


Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies

Recently, we got an opportunity to develop and deploy an application on a Kubernetes cluster running on AWS Cloud. We developed a sample PHP application that parses a CSV file and uploads the content of the file into a MySQL RDS instance. The application UI also supports some other functionality, like updating/deleting a particular row from the database, storing and viewing processed files via an AWS S3 bucket, and viewing all the records of the MySQL database. The Kubernetes cluster is provisioned using the KOPS tool. This article discusses the following points:

Prerequisites

  • Route53 hosted zone (required for KOPS)
  • One S3 bucket (required for KOPS to store state information)
  • One S3 bucket (required to store the processed CSV files, for example, pucdemo-processed-.csv)
  • One S3 bucket to store application access logs of the load balancer.
  • MySQL RDS in a private subnet. Port 3306 is opened to the Kubernetes cluster nodes.
  • A table to store the data from the CSV file in supported variables. In our case, we used the following commands to create the database and table:
create database csvdb;
CREATE TABLE puc_csv(
sku INT,
name VARCHAR(200),
price DOUBLE
);

Setup

  • Cloud: Amazon Web Services
  • Scripting languages used: HTML, JavaScript, and PHP
  • Kubernetes version: 1.11
  • K8s cluster instance type: t2.medium
  • Instances are launched in private subnets
  • 3 masters and 2 nodes (autoscaling configured)
  • K8s master/worker nodes are in Auto Scaling groups for HA/scalability/fault tolerance
  • S3 buckets to store data (details in Prerequisites)
  • Route53 has been used for DNS management
  • RDS: MySQL 5.7 (Multi-AZ enabled)

Provision Kubernetes Cluster on AWS

kops create cluster pucdemo.powerupcloud.com --ssh-public-key ~/.ssh/id_rsa.pub --master-zones ap-south-1a --zones ap-south-1a,ap-south-1b,ap-south-1a --master-size=t2.medium --node-count=2 --master-count 3 --node-size t2.small --topology private --dns public --networking calico --vpc vpc-xxxx --state s3://pucdemo-kops-state-store --subnets subnet-xxxx,subnet-xxxx --utility-subnets subnet-xxx,subnet-xxx --kubernetes-version 1.11.0 --api-loadbalancer-type internal --admin-access 172.31.0.0/16 --ssh-access 172.31.xx.xxx/32 --cloud-labels "Environment=TEST" --master-volume-size 100 --node-volume-size 100 --encrypt-etcd-storage;

where,

  • We have provided our public key in the --ssh-public-key argument. The corresponding private key will be used for SSH access to your master and nodes.
  • Private subnets are provided as arguments in --subnets: these will be used by the Kubernetes API (internal).
  • Public subnets are provided as arguments in --utility-subnets: these will be used by Kubernetes services (external).
  • --admin-access will have the IP CIDR for which the Kubernetes API port will be allowed.
  • --ssh-access will have the IP from where you will be able to SSH into the master nodes of the Kubernetes cluster.
  • pucdemo.powerupcloud.com is the hosted zone created in Route 53. KOPS will create API-related DNS records within it.

Attach the ECR full-access policy to the cluster nodes' instance profile.

Create Required Kubernetes Resources

Clone the below Github repo:

https://github.com/powerupcloud/k8s-data-from-csvfile-to-database.git

Create GitLab Instance:

Replace the values for the following variables in kubernetes-gitlab/gitlab-deployment.yml:

  • GITLAB_ROOT_EMAIL
  • GITLAB_ROOT_PASSWORD
  • GITLAB_HOST
  • GITLAB_SSH_HOST
kubectl create -f kubernetes-gitlab/gitlab-ns.yml
kubectl create -f kubernetes-gitlab/postgresql-deployment.yml
kubectl create -f kubernetes-gitlab/postgresql-svc.yml
kubectl create -f kubernetes-gitlab/redis-deployment.yml
kubectl create -f kubernetes-gitlab/redis-svc.yml
kubectl create -f kubernetes-gitlab/gitlab-deployment.yml
kubectl create -f kubernetes-gitlab/gitlab-svc.yml

"kubectl get svc -n gitlab" will give the provisioned load balancer endpoint. Create a DNS record for the endpoint, for example, git.demo.powerupcloud.com.

Create Gitlab Runner:

Replace the values for the following variables in gitlab-runners/configmap.yml:

  • Gitlab URL
  • Registration Token

Go to the Gitlab Runners section in the Gitlab console to get the above values.

kubectl create -f gitlab-runners/rbac.yaml
kubectl create -f gitlab-runners/configmap.yaml
kubectl create -f gitlab-runners/deployment.yaml

Create CSVParser Application:

Create a base Docker image with Nginx and PHP 7.0 installed on it and push it to ECR. Specify the base image in csvparser/k8s/deployment.yaml.

kubectl create -f csvparser/k8s/deployment.yaml
kubectl create -f csvparser/k8s/service.yaml

"kubectl get svc" will give the provisioned load balancer endpoint. Create a DNS record for the endpoint, for example, app.demo.powerupcloud.com.

Application Functionality

  • Basic Authentication is enabled for the main page.
  • The browse field will accept the CSV file only.
  • After uploading, the data will be imported into the database by clicking the “Import” button.
  • The processed files can be viewed by clicking on the “View Files” button.
  • “View Data” button will list the records from the database in tabular format.
  • The data record can be edited inline and updated into the database by clicking the “Archive” button.
  • A particular row can be deleted from the database by clicking the “Delete” button.
  • The application is running on two different nodes in different subnets and is deployed behind a Classic Load Balancer.

CI/CD

  • The Gitlab Instance and Runner are running as pods on the Kubernetes Cluster.
  • The application code is available in the Gitlab Repository along with Dockerfile and .gitlab-ci.yml
  • The pipeline is implemented in Gitlab Console using .gitlab-ci.yml file.
  • Whenever a commit is pushed to the repository, the pipeline is triggered, which executes the following steps (a sketch of such a pipeline file follows this list):
  • Build: builds a Docker image from the Dockerfile and pushes it to the AWS ECR repo.
  • Deploy: updates the Docker image for the already running application pod on the Kubernetes cluster.
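
A minimal sketch of what such a .gitlab-ci.yml can look like (the ECR_REPO variable, region, and deployment/container names are illustrative assumptions, not the repository's actual pipeline):

stages:
  - build
  - deploy

build:
  stage: build
  script:
    - docker build -t $ECR_REPO:$CI_COMMIT_SHORT_SHA .
    - $(aws ecr get-login --no-include-email --region ap-south-1)
    - docker push $ECR_REPO:$CI_COMMIT_SHORT_SHA

deploy:
  stage: deploy
  script:
    - kubectl set image deployment/csvparser csvparser=$ECR_REPO:$CI_COMMIT_SHORT_SHA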

Application in Action

Hit the Gitlab Service:

Sign in with the credentials.

Create a new Project and push the code. It will look like:

The Pipelines will look like:



The Application

View Data:

View Processed Files:

Editable table:

“Archive” will update the database.

Delete will delete the row from the database.

Note: We don't recommend using this application code in any real scenario; it's just for our testing purposes and is not written using best practices. This article showcases the provisioning of a Kubernetes cluster using KOPS with best practices, and the deployment of a PHP application on the cluster using GitLab pipelines.

Hope you found it useful. Keep following our blogs for more interesting articles on Kubernetes. Do visit the previous parts of this series.

References

Kubernetes Assigning a Specific Pod to a particular Cluster Node – Part 6


Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies

In this article, we discuss how one can deploy a specific microservice on a particular node. As a solution, we are using the Taints and Tolerations feature of Kubernetes. Tolerations are applied to pods and allow the pods to be scheduled onto nodes with matching taints.

Setup:

  • Kubernetes v1.8+
  • All cluster nodes reside in public subnets.
  • Cluster Autoscaler configured for the cluster nodes.
  • Prometheus is being used for monitoring.

Requirement:

  • One specific Microservice needs to run on a Private Node.

Workflow:

  • Provision a private node and add it to the running Kubernetes Cluster.
  • Taint the new Private Node with tolerations.
  • Deploy the MicroService
  • Attach the new node to the existing nodes autoscaling group
  • Fix Prometheus DaemonSetsMissScheduled Alert

Provision Private Node

Since all the available nodes exist in public subnets, we first need to provision a private node and add it to the existing Kubernetes cluster. Note the AMI used by the running cluster nodes and launch an EC2 server from that AMI. Select an existing private subnet and the existing nodes' IAM role.

Copy the userdata script from the existing nodes' launch configuration. The script is required to join the node to the Kubernetes cluster as soon as it is provisioned.

Paste it in the user data in the Advanced Details section.

Add the same tags as on an existing node. Ensure you add the "KubernetesCluster" tag.

Launch it. Once the server is provisioned, log in to the server and check the syslog. Ensure the Docker containers are running.

docker ps

Now execute the below command on the server from where you will be able to access the Kubernetes API.

kubectl get nodes

It should list the new private node. The node is now added to the existing Kubernetes Cluster.

Taint the Private Node

Taint the private node by executing the below command:

kubectl taint nodes ip-xx.xx.xx.xxx.REGION.compute.internal private=true:NoSchedule

where the key=value pair is private=true and the effect is NoSchedule. This means that no pod will be able to schedule onto the specified node unless it has a matching toleration. The key-value pair can be modified here.

If you want to list the available tainted nodes, you can list it via a template:

tolerations.tmpl

{{printf "%-50s %-12s\n" "Node" "Taint"}}
{{- range .items}}
{{- if $taint := (index .spec "taints") }}
{{- .metadata.name }}{{ "\t" }}
{{- range $taint }}
{{- .key }}={{ .value }}:{{ .effect }}{{ "\t" }}
{{- end }}
{{- "\n" }}
{{- end}}
{{- end}}

Execute:

kubectl get nodes -o go-template-file="tolerations.tmpl"

Label the Node

Apply a label to the private node by executing the below command:

kubectl label nodes <Node> <key>=<value>

Example, kubectl label nodes ip-xx.xx.xx.xxx.REGION.compute.internal private=true

Deploy the MicroService with Tolerations

Update the deployment.yaml to include tolerations matching the taint applied to the node, and the node label in the nodeSelector:

      tolerations:
      - key: "private"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
      nodeSelector:
        private: "true"

Deploy it.

kubectl apply -f deployment.yaml

Execute "kubectl get pod/podname -o wide" and check the node to which it is assigned.

Attaching the Private Node to the existing AutoScaling Group

Enable termination protection on the private node and temporarily suspend the Terminate process on the nodes' Auto Scaling group.

Attach the new node to the nodes' Auto Scaling group. Go to the Auto Scaling Group, select the private node, and set instance protection ("Set Scale In Protection").

Since we have Cluster Autoscaler configured for the cluster nodes, the new private node would get terminated by the autoscaler (due to lower load compared to the other nodes). Therefore, it's safer to set instance protection on the private node.

Remove the Terminate process from the Suspended Processes now. Do have a look at the Cluster Autoscaler logs. The private node will be skipped by the autoscaler.

Fix Prometheus DaemonSetsMissScheduled Alert

After setting up the private node completely, we started getting DaemonSetsMissScheduled alerts for the calico-node DaemonSet from Prometheus. We have debugged and followed the below steps to fix it.

Problem: We had a total of 8 nodes in our cluster (including masters, nodes in public subnets and a node in private subnet) but the “desiredNumberScheduled” in DaemonSet was showing 7 (excluding the private node).

Solution: Since we have a tainted private node, the daemonset must have a matching toleration. To fix the above problem, we added the same toleration as on the private node to the calico-node DaemonSet.

Execute:

kubectl edit ds/calico-node -n kube-system

Check the value of “desiredNumberScheduled”. It was one less than the total number of nodes. You can get the number of nodes by the command: “kubectl get nodes”.

Next, add the toleration, the same as you provided to the private node in the "Taint the Private Node" step above.

Now execute:

kubectl describe ds/calico-node -n kube-system

Check the “Desired Number of Nodes Scheduled:”. It should be equal to your number of nodes currently available.

Look at the status of calico-node pods too:

kubectl get pods -n kube-system -o wide -l k8s-app=calico-node

and that’s how we were able to assign a specific pod to a particular node without any misconfigurations. Hope you found it useful. Keep following our blogs for the further parts on Kubernetes. Do visit the previous parts of this series.

References:

Kubernetes Event Notifications to a Slack Channel- Part V


Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies.

In the previous parts of our Kubernetes blog series, we have discussed installing, deploying, monitoring and debugging the Kubernetes Cluster. In this article, we are covering the notification part i.e. how to get notified through Slack for each activity (creation/termination/update/restart) of Kubernetes resources.

Get the Slack Token

Go to Slack and create a new bot.

https://<your_organisation>.slack.com/apps/new/bot

Save the API Token that you got on the next screen.

Now switch to slack channels and click on the slack channel on which you want the notifications from the Kubernetes cluster.

Execute:

/invite @<bot-username>

Hit “https://<your_organisation>.slack.com/apps/manage/custom-integrations”

Click Bots and you will be able to view the Bots. Select the bot which you have created and edit. It will show the channel that is added to your bot.

Using Kubewatch for Create/Delete/Update Notifications

We are using Kubewatch, described at the link below, to get notified when Kubernetes resources are created, terminated, or updated.

https://github.com/bitnami-labs/kubewatch

Create a ConfigMap for kubewatch. Replace the Slack API token and channel name.

kubewatch-config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubewatch
data:
  .kubewatch.yaml: |
    handler:
      slack:
        token: <SLACK_API_TOKEN>
        channel: <SLACK_CHANNEL_NAME>
    resource:
      deployment: true
      replicationcontroller: true
      replicaset: true
      daemonset: true
      services: true
      pod: true

Execute command:

kubectl create -f kubewatch-config.yaml

Create service account, cluster role and role bindings with the below yaml file:

rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubewatch
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kubewatch
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  - replicationcontrollers
  verbs:
  - list
  - watch
  - get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kubewatch
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kubewatch
subjects:
- kind: ServiceAccount
  name: kubewatch
  namespace: monitoring

Create it using the command:

kubectl create -f rbac.yaml

Create the Pod yaml file:

kubewatch.yaml

apiVersion: v1
kind: Pod
metadata:
  name: kubewatch
  namespace: monitoring
spec:
  containers:
  - image: tuna/kubewatch:v0.0.1
    imagePullPolicy: Always
    name: kubewatch
    volumeMounts:
    - name: config-volume
      mountPath: /root
  - image: gcr.io/skippbox/kubectl:v1.3.0
    args:
    - proxy
    - "-p"
    - "8080"
    name: proxy
    imagePullPolicy: Always
  restartPolicy: Always
  serviceAccount: kubewatch
  serviceAccountName: kubewatch
  volumes:
  - name: config-volume
    configMap:
      name: kubewatch

kubectl create -f kubewatch.yaml

The notifications in slack will be like:

Note: Kubewatch will notify for the creation/termination/update of the resources. It won’t notify if the pod restarts.

Using Lifecycle Hook to get Notifications for Pod Restarts

We can get the restart count from kubectl get pods.

NAME                               READY     STATUS    RESTARTS   AGE
<#podname#-xxxxxxx-xxxxxxxxxx>     2/2       Running   21         17d

We can see the restart count in the above output. To get notified on a pod restart, we are using a postStart lifecycle hook. Here's the workflow:

  • Put a postStart lifecycle hook in the deployment.yaml, which will execute whenever the pod starts.
  • Execute a POST curl request from the lifecycle hook to post the message to the Slack channel using the API token.

Refer to the deployment YAML content below for adding post start Life cycle Hook.

spec:
  containers:
  - image: xxxxxxx.dkr.ecr.ap-south-1.amazonaws.com/dev-tomcatapp:v1.0
    imagePullPolicy: IfNotPresent
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "-c", "export hostname=`hostname` && curl -X POST -H 'Authorization: Bearer <SLACK_API_TOKEN>' -H 'Content-type: application/json' --data '{\"channel\":\"<SLACK_CHANNEL_NAME>\",\"text\":\"'\"The pod has started: $hostname\"'\"}' https://slack.com/api/chat.postMessage"]
      preStop:
        exec:

and create the deployment using the command:

kubectl create -f deployment.yaml

If you want to update an existing deployment, you can execute the below commands to add the postStart hook:

kubectl get deployment -l name=<label> -o yaml > deployed-app.yaml

Edit deployed-app.yaml, add the postStart lifecycle hook with proper indentation, and execute the below command to do the deployment:

kubectl apply -f deployed-app.yaml

For more details on the above deployment.yaml, see our previous blog.

The notifications from the post start lifecycle hook will be like:

and that’s it..!! Hope you found it useful. Keep following our Kubernetes series for more interesting articles.

References:

Kubernetes Log Management using Fluentd as a Sidecar Container and preStop Lifecycle Hook- Part IV


Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies

Application logs play a vital role in any successful deployment. The first thing checked after any deployment is the logs. Whenever someone gets an error or is unable to access the application, the logs are what everyone turns to for debugging. The logs are particularly useful for debugging problems and monitoring what is happening on the application server side.

In Kubernetes, the application logs from a pod can easily be fetched from the command: “kubectl logs <podname>”.

But what if your container crashes or the pod becomes inaccessible and you still want to access the old logs? In such cases, we must have permanent storage for our logs so that we don't miss any application logs.

In this article, we will be discussing logs Management in Kubernetes. We are introducing the following two scenarios here:

  • Scenario 1: Store the Logs in a Centralized Location( S3). So that if the pod is deleted, you can easily retrieve logs from the S3 bucket.
  • Scenario 2: Stream the application logs to Elasticsearch in real-time.

PreStop Lifecycle Hook:

If a pod is deleted for some reason, here is the workflow we have followed to get the logs from the container:

  • Execute a preStop lifecycle hook, which runs before the pod is terminated. The preStop hook does the following tasks:
  • Zip the locations where the application logs exist in the container, for example, /usr/local/tomcat/logs for any Tomcat application.
  • Put the zip file into the AWS S3 bucket.

Prerequisites:

  • S3 bucket set up for putting logs.
  • AWS CLI installed in the container to put the logs into the S3 bucket. Use the below three commands to install awscli from the Dockerfile:
RUN apt-get install -y python3-pip
RUN apt-get install -y zip
RUN pip3 install --upgrade awscli
  • Ensure the IAM role attached to the K8s cluster nodes has permissions to access the S3 bucket which is configured for putting logs.

Creating ConfigMap for the Bash Script:

prestop-config.yaml

apiVersion: v1
data:
  prestop.sh: |
    #!/bin/bash
    DATE=`date '+%Y-%m-%d-%H%M%S'`
    HOSTNAME=`hostname`
    ### custom application log location
    APPLOGS="/home/Apps/Logs/*"
    ### tomcat logs location
    TOMCATLOGS="/usr/local/tomcat/logs/*"
    zip -r /tmp/${DATE}-${HOSTNAME}.zip ${APPLOGS} ${TOMCATLOGS}
    SERVICE=`echo $HOSTNAME | cut -d'-' -f -2`
    aws s3 cp /tmp/${DATE}-${HOSTNAME}.zip s3://dev-k8s-prestop-logs/${SERVICE}/
kind: ConfigMap
metadata:
  name: dev-prestop-script-config
  namespace: default

In our case, we have two logs locations for writing the application logs:

  • /usr/local/tomcat/logs/
  • /home/Apps/Logs/

Execute the below command to create a configMap:

kubectl create -f prestop-config.yaml

Verify whether the config map created or not:

kubectl get configmap

Once the config map is created, put the bash script into the container using a volume mount in the deployment.yaml file which is used to create the deployment for the application.

        - name: prestop-script
          mountPath: /mnt

We are adding the script to /mnt location of the application container. The location can be modified according to the requirements.

Specify the volumes for the config map which we have created in the above step:

      - name: prestop-script
        configMap:
          name: dev-prestop-script-config

Ensure that you are giving the same names to the prestop volume and volume mount.

And add the preStop lifecycle hook under spec.containers:

        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "/mnt/prestop.sh"]

Refer to the final deployment.yaml in the sections below.

Fluentd as a Sidecar Container:

Workflow:

If you want to stream the pod logs to AWS Elasticsearch Service, here is the workflow:

  • Run two containers for every single pod. (Since a pod is a group of one or more containers)
  • One is an application container where the tomcat application is being deployed
  • Another one is a Fluentd container which will be used to stream the logs to AWS Elasticsearch Service.
  • Share the logs directories from application containers to fluentd containers using volume mounts.
  • Specify those logs directories in fluentd config so that the logs will be taken from them and streamed to Elasticsearch.
  • Hit Kibana URL to view the logs.

Prerequisites:

  • AWS Elasticsearch Service setup (Cognito-enabled authentication for Elasticsearch will also work here).
  • Ensure the IAM role, which is attached to K8s cluster nodes, is having permissions to access the AWS Elasticsearch Domain.

We have prepared a Dockerfile for building the fluentd image which is used to stream the logs to AWS Elasticsearch Service.

Dockerfile

FROM ubuntu:16.04
RUN apt-get update
RUN ulimit -n 65536
RUN apt-get install -y curl
RUN curl https://packages.treasuredata.com/GPG-KEY-td-agent | apt-key add -
RUN echo "deb http://packages.treasuredata.com/3/ubuntu/xenial/ xenial contrib" > /etc/apt/sources.list.d/treasure-data.list
RUN apt-get update && apt-get install -y -q curl make g++ && apt-get clean && apt-get install -y td-agent && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN sed -i -e "s/USER=td-agent/USER=root/" -e "s/GROUP=td-agent/GROUP=root/" /etc/init.d/td-agent
RUN /usr/sbin/td-agent-gem install fluent-plugin-aws-elasticsearch-service -v 1.0.0
CMD /usr/sbin/td-agent $FLUENTD_ARGS

Build a Docker image out of it and push it to a Docker registry. We have used Amazon ECR, the Docker registry provided by AWS.

docker build -t <registry/repo:tag> -f <Dockerfile> .
docker push <registry/repo:tag>
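Note that when pushing to ECR, Docker must first be authenticated against the registry. A quick sketch, assuming AWS CLI v2 and the ap-south-1 registry used later in this post (the account ID is a placeholder):

# Log Docker in to the ECR registry before pushing (AWS CLI v2)
aws ecr get-login-password --region ap-south-1 | \
  docker login --username AWS --password-stdin xxxxxxx.dkr.ecr.ap-south-1.amazonaws.com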

Create a fluentd config map using below YAML:

fluentd-config.yaml

apiVersion: v1
data:
  td-agent.conf: |
    <source>
      @type tail
      format multiline
      format_firstline /[0-9]{2}-[A-Za-z]{3}-[0-9]{4}/
      format1 /^(?<datetime>[0-9]{2}-[A-Za-z]{3}-[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3}) (?<Log-Level>[A-Z]*) (?<message>.*)$/
      path /usr/local/tomcat/logs/catalina*,/usr/local/tomcat/logs/localhost*.log
      path_key tailed_tomcat_path
      pos_file /usr/local/tomcat/logs/tomcat.catalina.pos
      tag tomcat.tomcat.logs
    </source>
    <source>
      @type tail
      format apache
      path /usr/local/tomcat/logs/localhost_access*
      path_key tailed_localhost_access_path
      pos_file /usr/local/tomcat/logs/tomcat.localhost.access.pos
      tag tomcat.localhost.access.logs
    </source>
    <source>
      @type tail
      format multiline
      format_firstline /[0-9]{4}-[0-9]{2}-[0-9]{2}/
      format1 /^(?<datetime>[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}) (?<Level>[A-Z]*) (?<Message>.*)$/
      path /home/Apps/Logs/*.log
      path_key tailed_app_path
      pos_file /home/Apps/Logs/tomcat.app.pos
      tag tomcat.app.logs
    </source>
    <filter tomcat.tomcat.logs>
      @type record_transformer
      <record>
        hostname ${hostname}
      </record>
    </filter>
    <filter tomcat.localhost.access.logs>
      @type record_transformer
      <record>
        hostname ${hostname}
      </record>
    </filter>
    <filter tomcat.app.logs>
      @type record_transformer
      <record>
        hostname ${hostname}
      </record>
    </filter>
    <match **>
      @type copy
      <store>
        @type aws-elasticsearch-service
        type_name fluentd
        logstash_format true
        logstash_prefix dev-tomcatapp
        flush_interval 60s
        num_threads 8
        <endpoint>
          url https://AWS_ES_ENDPOINT
          region ap-south-1
        </endpoint>
      </store>
      <store>
        @type stdout
      </store>
    </match>
kind: ConfigMap
metadata:
  name: dev-tomcatapp-fluentd-config

The <source> section can be changed according to the application platform.

Execute the below command to create the configmap:

kubectl create -f fluentd-config.yaml

Refer to the final deployment.yaml file below.

Creating a YAML file for the Deployment

Now that we are aware of the workflows, let's create the deployment on the Kubernetes cluster.

Here is the deployment.yaml with lifecycle hook and fluentd container:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    name: dev-tomcatapp
    tier: dev
  name: dev-tomcatapp
  selfLink: /apis/extensions/v1beta1/namespaces/default/deployments/dev-tomcatapp
spec:
  replicas: 1
  selector:
    matchLabels:
      name: dev-tomcatapp
      tier: dev
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: dev-tomcatapp
        tier: dev
    spec:
      containers:
      - image: xxxxxxx.dkr.ecr.ap-south-1.amazonaws.com/dev-tomcatapp:v1.0
        imagePullPolicy: IfNotPresent
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "/mnt/prestop.sh"]
        name: dev-tomcatapp
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          limits:
            cpu: 200m
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 256Mi
        volumeMounts:
        - mountPath: /usr/local/tomcat/logs
          name: tomcatapp-tomcat-logs
        - mountPath: /home/Apps/Logs
          name: tomcatapp-app-logs
        - name: prestop-script
          mountPath: /mnt
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - name: fluentd
        image: xxxxxxx.dkr.ecr.ap-south-1.amazonaws.com/tdagent-aws-es
        env:
        - name: FLUENTD_ARGS
          value: -c /etc/td-agent/td-agent.conf
        volumeMounts:
        - name: tomcatapp-tomcat-logs
          mountPath: /usr/local/tomcat/logs
        - name: tomcatapp-app-logs
          mountPath: /home/Apps/Logs
        - name: config-volume
          mountPath: /etc/td-agent
      volumes:
      - name: tomcatapp-tomcat-logs
        emptyDir: {}
      - name: tomcatapp-app-logs
        emptyDir: {}
      - name: config-volume
        configMap:
          name: dev-tomcatapp-fluentd-config
      - name: prestop-script
        configMap:
          name: dev-prestop-script-config
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: secret01
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30

Execute the below command to create the deployment now:

kubectl create -f deployment.yaml

Verifying Logs

Get the pod name:

kubectl get pods -l name=dev-tomcatapp

and execute

kubectl logs -f pod/<podname> -c fluentd

It will show, on stdout, the logs being parsed and shipped to Elasticsearch:

Create an index pattern in Kibana:

Once the index is created, hit Discover to browse the logs.

When a pod is deleted, the zip file of its logs will appear in the S3 bucket:
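You can confirm the upload from any machine that has access to the bucket; the bucket and service prefix are the ones used in the preStop script:

# List the uploaded log archives for a given service prefix
aws s3 ls s3://dev-k8s-prestop-logs/<service>/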

and that’s it..!! Hope you found it very useful. Unlike us, at least now you won’t have to spend hours and days to research for a complete logging solution in Kubernetes. Follow this whole article and there you go without missing any logs. Happy Pod Logging..!! 🙂

Do visit our previous blogs for more interesting stuff on Kubernetes.

Running Stable Scalable Kubernetes Cluster on AWS

By | AWS, Blogs, Cloud, Cloud Assessment, Kubernetes | No Comments

Written by Manoj Kumar, Principal Cloud Architect, Powerupcloud Technologies

Problem Statement

The customer is a large e-commerce start-up with all their applications running on AWS. The customer was running their core microservices application in Amazon Elastic Beanstalk multi-container environment. The setup had the below problems:

  • The existing environment was unable to scale individual services
  • Deployment of one service was affecting other services too
  • Cost of running their microservices was high
  • Memory-based scaling for each microservice was not implemented

How did the Powerup team help?

The Powerup DevOps team helped the customer implement a Kubernetes cluster from scratch to overcome the issues in the existing setup.

AWS Architecture & Description

  • The website domain is handled via Amazon Route53
  • A VPC is provisioned with public and private subnets
  • Separate VPC is created for each environment (Dev, Stage, and Prod) in separate AWS Accounts
  • Separate subnets are created for the Kubernetes master nodes, worker nodes, ELBs, cache/search, and databases in each availability zone
  • Separate security groups are created for each layer
  • Kong is used as API Gateway for the microservices
  • Kong servers are implemented in Cluster with Cassandra as a backend for Kong API
  • All microservices are created in Kubernetes with HPA for the pod level scaling.
  • All microservices are stateless
  • ElastiCache Redis is being used to store the sessions
  • Few services use Hazelcast as an in-memory cache
  • Hazelcast cluster gets automatically created based on the Kubernetes service name
  • An Elasticsearch cluster is used for search and it is configured in HA mode
  • MongoDB and MySQL are used as Database engines
  • MongoDB is installed in EC2 instance and PostgreSQL is running on RDS
  • The e-commerce application is designed to be highly available, highly scalable, and fault-tolerant
  • Highly Available: each layer of the application is spread across 2 availability zones.
  • Highly Scalable: Kubernetes nodes are configured in an Auto Scaling group, so the application scales horizontally based on node CPU utilization. HPA is configured for service-level scaling.

CI/CD Setup

  • Source Code: Fabricator (Ben10). It is configured on an EC2 instance, and access is restricted to the CI server and an office IP only.
  • CI Tool: Jenkins
  • List of plugins installed: Kubernetes, Docker, Selenium, JUnit Test case, Pipeline, Multiple SCMs
  • Jenkins authentication is integrated with G Suite; only G Suite users can log in.
  • Artefacts: Nexus Repo
  • Build Tool: Maven is being used to build the war. This will pull the artifacts from the Nexus repo.
  • Code Review: SonarQube
  • Docker Image Repo: Elastic Container Registry
  • Separate Jenkins servers have been created for the production and non-production environments

Configuration Management

  • Docker and Ansible are used as the configuration management tools
  • Docker Image is created for all the microservices
  • It is maintained across all the Dev / QA / Prod environments
  • Ansible Playbook was prepared for the following components
  1. Monitoring Agents: Sensu monitoring agent Installation
  2. Software updates: Update all the packages and update the specific package

Operations

Deployment Work Flow

Build

  • Jenkins job is created to clone the code
  • Maven is used to build the war file
  • The artefacts are downloaded from Nexus repo

Pack

  • Using Jenkins Docker plugin, prepare the Docker image with the war file from the build step
  • Push the Docker image to the ECR repository and tag it with the build number

Deploy

  • Using Kubernetes Jenkins plugin, deploy the latest image into the respective service in the Kubernetes cluster
  • Wait for the deployment to be successful
  • If the deployment fails, Kubernetes automatically rolls back to the previous version

Test

  • Selenium test cases are triggered after the successful deployment

Write to us at cloud@powerupcloud.com to know more about the challenges we faced and the learning we had from this project.

Autoscaling based on CPU/Memory in Kubernetes — Part II

By | AWS, Blogs, Cloud, Cloud Assessment, Kubernetes | No Comments

Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies

In Part I, we have discussed,

  • Setting up a cluster,
  • Creating Deployments, and
  • Accessing the services.

In this part, we are going to show how you can autoscale the pods on the CPU/Memory based metrics.

CPU Based Scaling

With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization.

Execute the command: “kubectl get deployment” to get the existing deployments.

Create a Horizontal Pod Autoscaler i.e. hpa for a particular deployment using the command:

kubectl autoscale deployment <deployment-name> --min=2 --max=5 --cpu-percent=80

Execute “kubectl get hpa” to get the available hpa in your cluster.

So, now we have an HPA running for our deployment "tomcat02". It compares the arithmetic mean of the pods' CPU utilization with the target CPU utilization defined in the HPA spec, and adjusts the replicas of the scale target if needed to match the target (preserving the condition MinReplicas <= Replicas <= MaxReplicas). For more information on HPA, you can refer to this link here:
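The same autoscaler can also be written declaratively. A minimal sketch of the equivalent manifest, assuming the autoscaling/v1 API and the values used in the kubectl autoscale command above, applied straight from stdin:

# Declarative equivalent of the kubectl autoscale command
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: tomcat02
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: tomcat02
  minReplicas: 2
  maxReplicas: 5
  targetCPUUtilizationPercentage: 80
EOF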

But how will you update the minimum number of replicas in an existing HPA? In our case, the minimum number of replicas is currently set to 1; what if we need to update it to 2? In this scenario, we just need to get the HPA in YAML format and update the YAML file. Here's an example:

kubectl get hpa/tomcat02 -o yaml > tomcat-hpa.yaml
vim tomcat-hpa.yaml

Update the count as highlighted in below screenshot:

Save the yaml file and apply the changes:

kubectl apply -f tomcat-hpa.yaml

Once the changes have been applied, it will launch one more pod as shown in the above screenshot.
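Alternatively, for a one-field change like this, you can patch the HPA in place without exporting and editing the YAML (a quick sketch using the HPA name from above):

# Bump minReplicas on the existing HPA directly
kubectl patch hpa tomcat02 -p '{"spec":{"minReplicas":2}}'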

Memory Based Scaling

Since it is not possible to create a memory-based HPA natively in Kubernetes (at the time of writing), we have written a script to achieve the same. You can find our script here by clicking on this link:

https://github.com/powerupcloud/kubernetes-1/blob/master/memory-based-autoscaling.sh

Clone the repository:

https://github.com/powerupcloud/kubernetes-1.git

and then go to the Kubernetes directory. Execute the help command to get the instructions:

./memory-based-autoscaling.sh --help

Pod Memory Based AutoScaling

In this section, we discuss how you can deploy autoscaling on the basis of the memory that pods are consuming. We have used the command "kubectl top pod" to get the utilized pod memory and applied the scaling logic on top of it.

  • Get the average pod memory of the running pods: Execute the script as follows:
./memory-based-autoscaling.sh --action get-podmemory --deployment <deploymentname>

Once this command is executed, you can check the logs in directory /var/log/kube-deploy/.

  • Deploy autoscaling on the basis of pod memory: Execute the script as follows:
./memory-based-autoscaling.sh --action deploy-pod-autoscaling --deployment <deployment-name> --scaleup <scaleupthreshold> --scaledown <scaledownthreshold>

Check the same log file once the script is executed. If the average pod memory crosses the scale-up threshold, it will launch one more pod. Similarly, the scale-down policy checks two things:

  • whether the average pod memory is less than the scale-down threshold, and
  • whether the number of current pods is greater than the minimum count we have set while creating the hpa.

For example, if 3 pods are running and the minimum count is 2, the deployment will be scaled down when the average pod memory is less than the scale-down threshold. On the other hand, if only 2 pods are running and the minimum is also set to 2, the deployment won't be scaled down even if the average pod memory is below the threshold.
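To make that decision flow concrete, here is a minimal bash sketch of the scale-down check. It is only an illustration of the logic described above, not the script itself, and the variable names are hypothetical:

# Illustrative only: scale down one replica when both conditions hold
if [ "$AVG_POD_MEMORY" -lt "$SCALEDOWN_THRESHOLD" ] && [ "$CURRENT_REPLICAS" -gt "$MIN_REPLICAS" ]; then
  kubectl scale deployment "$DEPLOYMENT" --replicas=$((CURRENT_REPLICAS - 1))
fi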

Once you verify that the above actions are working fine, you can schedule the script as a cron job to execute every 5 minutes. Provide the full path of the script in the crontab.

*/5 * * * * /bin/bash /opt/kubernetes/memory-based-autoscaling.sh --action deploy-pod-autoscaling --deployment xxxxxx --scaleup 80 --scaledown 20 > /dev/null 2>&1

Java Heap Memory Based AutoScaling

In the case of Java applications, heap memory plays a very important role. When a Java program starts, the Java Virtual Machine (JVM) gets some memory from the operating system. The JVM uses this memory for all its needs, and part of this memory is called the Java heap memory. Whenever the heap memory is full, the application starts throwing the java.lang.OutOfMemoryError: Java heap space error.

In this section, we are showing how you can deploy autoscaling on the basis of the heap memory that Java process is consuming. Once the heap memory crosses the threshold, one more pod will get launched.

Prerequisites:

  • Ensure you have allocated the max heap memory to the JVM Process.
  • Ensure that jstat is installed in your Docker image. Use the following commands to install jstat on Ubuntu / Debian:
  • apt-get update
  • apt install -t jessie-backports openjdk-8-jre-headless ca-certificates-java
  • apt-get install -y openjdk-8-jdk

To update your deployment with the updated docker image, you can execute the below command:

kubectl set image deployment/<deploymentname> <containername>=<image>

You can verify that jstat is installed in your pod by executing the jstat command, as shown in the below screenshot.
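A quick way to check from outside the pod (the pod name and JVM PID are placeholders; PID 1 is common when the JVM is the container's main process):

# Print heap/GC utilization percentages for the JVM process inside the pod
kubectl exec -t <podname> -- jstat -gcutil 1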

In the script, we filter for the -Xmx value of the running JVM process to get the total heap memory, use the jstat command to get the utilized heap memory, and apply the autoscaling logic on top of that.

  • Get the average heap memory of the running pods: You can verify the allocated memory by checking the -Xmx value in the running jvm process:

kubectl exec -t <podname> -- ps -ef | grep java

Execute the script as follows:

./memory-based-autoscaling.sh --action get-heapmemory --deployment <deployment-name>

and then check the logs in /var/log/kube-deploy. The log file name will have the deployment name included in it i.e. kube-deploymentname-date.log.

  • Deploy autoscaling on the basis of heap memory: Execute the script as follows:
./memory-based-autoscaling.sh --action deploy-heap-autoscaling --deployment <deployment-name> --scaleup <scaleupthreshold> --scaledown <scaledownthreshold>

Check the same log file for the logs.

Once you verify that all actions are working fine, you can schedule the script as a cron job to execute every 5 minutes. Provide the full path of the script.

*/5 * * * * /bin/bash /opt/kubernetes/memory-based-autoscaling.sh --action deploy-heap-autoscaling --deployment xxxxxx --scaleup 80 --scaledown 20 > /dev/null 2>&1

& that’s it. Hope you found it useful. Happy Pod Scaling..!! 🙂

Keep following us for further parts on Kubernetes..!!

Getting Started with Kubernetes in AWS

By | AWS, Blogs, Cloud, Cloud Assessment, Kubernetes | No Comments

Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies

Kicking off the series of posts dealing with running Kubernetes in production on AWS with this introductory article. Stay tuned for more!

If you are reading this article, chances are you know Kubernetes already. It has Google's engineering pedigree behind it and is a leader in the container orchestration space. But for beginners, I will go ahead and state it. Kubernetes is an open-source platform for automating deployment, scaling, and operations of application containers across clusters of hosts, providing container-centric infrastructure.

With Kubernetes, you will be able to quickly and efficiently respond to scaling demands:

  • Deploy your applications quickly and predictably.
  • Scale your applications on the fly.
  • Seamlessly roll out new features.
  • Optimize the use of your hardware by using only the resources you need.

Downloading Kubernetes

So let’s get straight ahead with exploring Kubernetes. Download Kubernetes package

wget -q -O - https://get.k8s.io | bash

Go to the cluster directory of your Kubernetes. There you can see the different environments available for Kubernetes.

In this blog, we are focusing on creating the cluster in an AWS environment. The default configuration can be found in the "aws" directory. You can either change the configurations in the default file itself or export the required variables to bring up the cluster. To export the variables, we have created a file "export.sh" with our required configurations. Btw, don't take the instance types in the config below seriously; you should choose the right instance type depending on your cluster setup and workload.

#!/bin/bash
export KUBERNETES_PROVIDER=aws
export KUBE_AWS_ZONE=us-east-1c
export NUM_NODES=2
export MASTER_SIZE=m3.medium
export NODE_SIZE=t2.medium
export AWS_S3_REGION=us-east-1
export AWS_S3_BUCKET=kubernetes-puc01

Export all these variables by executing this file:

source ./export.sh

Now, bring up the cluster by executing the following command from the cluster directory i.e. /opt/Kubernetes/cluster

./kube-up.sh

Once the cluster is launched, we will get the endpoints as shown in below screenshots:

Hit the Kubernetes-dashboard URL to access the dashboard.

Get the username and password from the /root/.kube/config file. After providing the credentials, we will get the Kubernetes dashboard as shown below:

Installing kubectl

This is straightforward. Follow this link for installing kubectl on your system:

Creating Deployment in Kubernetes

We are going to build a sample java application. For this, we have Dockerfile as mentioned below:

FROM tomcat
RUN apt-get update
RUN apt-get install -y zip curl wget
RUN wget https://github.com/manulachathurika/Apache_Stratos_Tomcat_Applications/raw/master/Calendar.war -O /usr/local/tomcat/webapps/Calendar.war
CMD ["catalina.sh", "run"]

Build the docker image using the command:

docker build -t <tag-name> .

Push the image to docker hub:


docker push <tag-name>
Deploy the Docker image in Kubernetes:

We have two options available for deploying an application in Kubernetes:

  • Either through dashboard
  • or through a yaml file

Deployment through Kubernetes Dashboard:

The interface looks kind of like the Google Cloud Console. That's right, it's Google all the way. Click Create and specify the Docker image and the type of service you want.

  • Internal: Exposes the service on a cluster-internal IP. Choosing this value makes the service only reachable from within the cluster. This is the default ServiceType.
  • External: Exposes the service externally using a cloud provider’s load balancer. The service will be accessible from outside the cluster.

Deployment through a yaml file:

Create a yaml file which includes the deployment details such as the name, docker image to use, the ports to expose, etc. Refer to the following sample file which we have used for the deployment of a java application:
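The sample file itself was shown as a screenshot in the original post. Here is a minimal sketch of what an example.yaml for this Tomcat deployment might look like, assuming the image pushed above and the tomcat01 name referenced below; the resource values are illustrative only:

cat <<'EOF' > example.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tomcat01
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: tomcat01
    spec:
      containers:
      - name: tomcat01
        image: <tag-name>   # the image built and pushed above
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 200m
            memory: 512Mi
EOF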

Also, we have specified the resources we want for our deployment by using limits and requests in the yaml file. Once we are ready with our yaml file, you can either deploy it through the dashboard:

or through kubectl command:

kubectl create -f example.yaml

We have specified the name of our deployment i.e. tomcat01 in the selector for which we are going to create a load balancer type service. Create the service using the command:

kubectl create -f service-example.yaml
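The service manifest was likewise shown as a screenshot; a minimal sketch of what service-example.yaml might contain, assuming a LoadBalancer service selecting the tomcat01 pods on port 8080:

cat <<'EOF' > service-example.yaml
apiVersion: v1
kind: Service
metadata:
  name: tomcat01
spec:
  type: LoadBalancer
  selector:
    name: tomcat01
  ports:
  - port: 80
    targetPort: 8080
EOF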

Go to Services Dashboard. You will get an external endpoint through which your application will be accessible.

Hit the endpoint.

Get the existing deployments using the command:

kubectl get deployments

Get the existing services using the command:

kubectl get services

If you want to get the services across all namespaces, use command:

kubectl get services --all-namespaces

So, this is all about how you can create the deployments in Kubernetes cluster and how you can access them externally. Keep following our blog for further parts on Kubernetes. Happy container orchestration..!!