
Significance of BI tools in the Era of Big Data


Written by Anjali Sharma, Software Engineer at Powerupcloud Technologies

Demand for business intelligence (BI) tools has boomed in the big data world. After "big data" itself, one of the most used buzzwords in the business world today is "business intelligence", so how do the two relate? The rise of BI to the top of most companies' priorities has made BI analysts highly sought after. BI tools enable organizations to gain revealing insights into their operations and processes and use them to improve productivity, boost revenue, cut costs, and more.

BI refers to the business strategy and technological tools used for analysing business information, including analysis of historical data, analysis of current data and future predictions. BI is therefore a business discipline as much as a technological one. On the technology side, companies use various databases and data analytics tools, which together form their enterprise BI infrastructure. BI tools have been around for decades; however, in recent years the advent of big data and artificial intelligence technologies has increased the number of BI technologies and broadened their functionality.

Gone are the days when running a business was assumed to be like gambling, when there was no option but to make "the perfect guess." When it comes to a company's future, guessing is no longer an acceptable way to arrive at a strategy. With business intelligence software, you get accurate data, real-time updates, and the means to forecast and even predict future conditions.

Assortments: a BI tool can take several forms, depending on business demands and technical requirements:

  • Data visualization tools
  • Data mining tools
  • Reporting tools
  • Querying tools
  • Analysis tools
  • Geolocation analysis tools, etc.

What makes Tableau the most powerful BI tool

Now let's understand what makes Tableau the most powerful and user-friendly of all the BI tools.

Tableau offers powerful and sophisticated data collection, analysis and visualization. One of the claims on Tableau's website is that "Tableau helps people see and understand their data." Tableau allows users to drill deep into data, create powerful visualizations to analyse the information, and automatically produce valuable business insights.

Several Data Source Connections

One of the main strengths of Tableau is that it can automatically connect with hundreds of data sources without any programming needed, including big data providers.

Tableau is one of the leading BI tools for big data and Hadoop. It provides connectivity to various Hadoop data sources such as Hive, Cloudera and Hortonworks. Beyond Hadoop, Tableau can also connect to data from over 50 different sources, including AWS and SAP.

Drag & Drop facility

Tableau's drag-and-drop facility makes it easy and user friendly: most interaction takes place by dragging and dropping icons. You can quickly create visuals from data by dragging the icon for the relevant data set into the visualisation area. In other words, you can surface important insights within a few clicks.

Live and Extracted Data Connection

Tableau supports both live and extracted data connections, and users can instantly switch between them. You can also schedule extract refreshes and get notified when live data connections fail.

Security

Users can collaborate securely across networks or the cloud, using Tableau Server and Tableau Online. This allows rapid sharing of insights, meaning that people can take action more quickly to save costs or make more money for the business.

The above features set Tableau apart from other BI tools. Data is growing faster than ever; with the proliferation of the internet, we now generate even more information. According to IBM, 2.5 quintillion bytes of data are created every day, yet less than 0.5% of it is ever analysed and used. This is why data analysis tools matter more than ever. For the past six years Tableau has been the leader among data analysis and visualisation tools. Specializing in beautiful visualizations, Tableau lets you perform complex tasks with simple drag-and-drop functionality and numerous chart types.

If you are a beginner, let's get some hands-on practice with Tableau using sample data for better understanding. Here I am using a skill-registry dataset: we created a Google Form for the employees of our organisation and shared it with them so they could fill in their name, email address, skills, total experience, etc. After collecting the data, we created a CEO dashboard.

Download and install the Tableau Desktop 14-day trial version:

https://www.tableau.com/en-gb/products/trial

You can also try the free Tableau Public version 2020.2.

Open Tableau and connect to the data source where your data resides; Tableau provides more than 100 data sources we can connect to.

After connecting the data source, check whether the data is in the correct format, whether any data source filters need to be applied, whether the Data Interpreter should be used, and so on. Connections can be live or extracted as per the requirements.

What is live vs. extract? (Refer to the link below.)

https://www.tableau.com/about/blog/2016/4/tableau-online-tips-extracts-live-connections-cloud-data-53351

If the data in one table is not sufficient, you can bring in another table using joins.

Now go to a sheet; this is the first step towards creating your very first dashboard.

Tableau divides data into two types: measures and dimensions.

Dimensions contain qualitative values such as name, date, country, etc.

Measures are fields that can be aggregated or used in mathematical operations; in short, they are the numeric values associated with the dimensions.

As I am using employee data, I can plot their locations in one sheet using a map chart.

For another view I have put employees' skills in two different sheets, skill categories and skill sub-categories, using the count of names as a measure, so that we can analyse how many resources we have in each skill category.

In the last view I have added resource information such as email address and service group; I have also added resumes using an action filter.

Now click the dashboard icon, put all the sheets together and create a visual representation. You can apply filters as required and use the formatting options to make your dashboard clean and colourful.

(For data security reasons, the counts and resource information are hidden.)

For practice, you can download sample datasets from https://www.kaggle.com/datasets and create your own dashboard.

DevOps for Databases using Liquibase, Jenkins and CodeCommit


Written by Arun Kumar, Associate Cloud Architect at Powerupcloud Technologies

In infrastructure modernization, we move complete architectures to microservices with CI/CD deployment methods, and these techniques suit almost any application deployment. Most of the time, however, database deployments remain manual. Applications and databases are growing day by day; database sizes and operational activities in particular are getting complex, and maintaining them manually is a tedious task for a database administrator.

For enterprise organizations this is even more complex when it comes to managing multiple database engines with hundreds of databases or multi-tenant databases. Below are some of the manual activities and challenges a DBA currently faces:

  • Creating or modifying stored procedures, triggers and functions in the database.
  • Altering tables in the database.
  • Rolling back database deployments.
  • Developers must wait for the DBA to make database changes, which increases the turnaround time (TAT) to test new features, even in non-production environments.
  • Granting database access for changes raises security concerns, and maintaining that access is a huge overhead.
  • With vertical scaling and different database engines, the databases become difficult to manage.

One of our enterprise customers faced all of the above challenges. To overcome them, we explored various tools and arrived at a strategy of using Liquibase for deployments. Liquibase supports standard SQL databases (SQL Server, MySQL, Oracle, PostgreSQL, Redshift, Snowflake, DB2, etc.), and the community is improving its support for NoSQL databases; it now supports MongoDB and Cassandra. Liquibase helps us with versioning, deployment and rollback.
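
To make the idea concrete, here is a minimal, hedged sketch of driving the Liquibase CLI from Python instead of a shell script. The JDBC URL, credentials and file paths are placeholders, and the flags simply mirror the ones used in the deployment scripts later in this post.

import subprocess

# Placeholder connection details; replace with real values.
JDBC_URL = "jdbc:sqlserver://<rds-endpoint>:1433;databaseName=employee;integratedSecurity=false;"
CHANGELOG = "/opt/db/db-script.sql"   # Liquibase changelog file
DRIVER_JAR = "/opt/liquibase/mssql-jdbc-7.4.1.jre8.jar"

def run_liquibase(command):
    """Invoke the Liquibase CLI, e.g. with 'update' or 'rollbackCount 1'."""
    args = [
        "liquibase",
        "--driver=com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "--classpath=" + DRIVER_JAR,
        "--url=" + JDBC_URL,
        "--changeLogFile=" + CHANGELOG,
        "--username=xxxx",
        "--password=xxxx",
    ] + command.split()
    subprocess.run(args, check=True)  # raise if Liquibase reports an error

# Apply all pending changesets; uncomment the second call to roll back the last one.
run_liquibase("update")
# run_liquibase("rollbackCount 1")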

Drawing on our DevOps experience, we integrated two open-source tools, Liquibase and the Jenkins automation server, for continuous deployment. This solution can be implemented on any cloud platform or on-premises.

Architecture

For this demonstration we will use the AWS platform. MS SQL is our main database; let's see how to set up a CI/CD pipeline for it.

Prerequisites:

  • Set up a sample repo in CodeCommit.
  • Jenkins server up and running.
  • Notification service configured in Jenkins.
  • RDS MSSQL up and running.

Set up the AWS CodeCommit repo:

To create a code repo in AWS CodeCommit, refer to the following link:

https://docs.aws.amazon.com/codecommit/latest/userguide/setting-up-https-windows.html

Integration of CodeCommit with Jenkins:

To trigger a webhook from AWS CodeCommit, we need to configure AWS SQS and SNS. Please follow this link:

https://github.com/riboseinc/aws-codecommit-trigger-plugin

For the webhook connection from CodeCommit to Jenkins, we need to install the AWS CodeCommit Trigger Plugin.

Select -> Manage Jenkins -> Manage Plugins -> Available ->  AWS CodeCommit Trigger Plugin.

  • In Jenkins, create a new freestyle project.
  • In Source Code Management, add your CodeCommit repo URL and credentials.

Jenkins -> Manage Jenkins -> Configure System -> AWS CodeCommit Trigger SQS Plugin.

Installation and configuration of Liquibase:

# Install Java 8 (required by Liquibase)
sudo apt update
sudo apt install openjdk-8-jdk
java -version

# Download and extract Liquibase 3.8.1 into /opt/liquibase
sudo mkdir -p /opt/liquibase
cd /opt/liquibase
sudo wget https://github.com/liquibase/liquibase/releases/download/v3.8.1/liquibase-3.8.1.tar.gz
sudo tar -xvzf liquibase-3.8.1.tar.gz

Based on your database, you need to download the JDBC driver (jar file) into the same Liquibase directory. Go through the following link:

https://docs.microsoft.com/en-us/sql/connect/jdbc/download-microsoft-jdbc-driver-for-sql-server?view=sql-server-ver15

Integration of Jenkins with Liquibase:

During deployment, Jenkins will SSH into the Liquibase instance, so we need to generate an SSH key pair for the Jenkins user and add the public key to the Linux user on the Liquibase server. Here we have an ubuntu user on the Liquibase server.

Prepare the deployment scripts on the Liquibase server. Both scripts start from a changelog template, temp.sql, whose changeset line contains a placeholder value that is replaced with the Git commit ID on every run.

For Single database server deployment: singledb-deployment.sh

#!/bin/bash
# Script name: singledb-deployment.sh
set -x
# Commit ID of the triggering push, written to this file by the Jenkins job.
GIT_COMMIT=$(cat /tmp/gitcommit.txt)
# Copy the changelog template and replace the placeholder changeset value
# with the commit ID so each deployment is recorded as a new changeset.
sudo cp /opt/db/temp/temp.sql /opt/db/db-script.sql
old=$(sudo grep 'change' /opt/db/db-script.sql | cut -d ":" -f 2)
sudo sed -i "s/$old/$GIT_COMMIT/g" /opt/db/db-script.sql
# The first line of the "test" file from the repo is "<db-endpoint>:<dbname>";
# the remaining lines are the SQL to deploy.
dburl=$(head -1 /home/ubuntu/test | cut -d ":" -f 1)
dbname=$(head -1 /home/ubuntu/test | cut -d ":" -f 2)
sed -i -e "1d" /home/ubuntu/test
sudo sh -c 'cat /home/ubuntu/test >> /opt/db/db-script.sql'
export PATH=/opt/liquibase/:$PATH
echo "DB_URL is $dburl"
echo "DB_Name is $dbname"
# Run the Liquibase update against each database listed on the first line.
for prepare in $dbname; do
  liquibase --driver=com.microsoft.sqlserver.jdbc.SQLServerDriver \
    --classpath="/opt/liquibase/mssql-jdbc-7.4.1.jre8.jar" \
    --url="jdbc:sqlserver://$dburl:1433;databaseName=$prepare;integratedSecurity=false;" \
    --changeLogFile="/opt/db/db-script.sql" \
    --username=xxxx --password=xxxxx Update
done
# Clean up working files.
sudo rm -rf /opt/db/db-script.sql /home/ubuntu/test /tmp/gitcommit.txt

For Multi database server deployment: Multidb-deployment.sh

#!/bin/bash
# Script name: Multidb-deployment.sh
set -x
# Commit ID of the triggering push, written to this file by the Jenkins job.
GIT_COMMIT=$(cat /tmp/gitcommit.txt)
# Copy the changelog template and stamp it with the commit ID.
sudo cp /opt/db/temp/temp.sql /opt/db/db-script.sql
old=$(sudo grep 'change' /opt/db/db-script.sql | cut -d ":" -f 2)
sudo sed -i "s/$old/$GIT_COMMIT/g" /opt/db/db-script.sql
# Split the "test" file at the #----# separator: test00 holds the
# "<db-endpoint>:<dbname>" server list, test01 holds the SQL to deploy.
csplit -sk /home/ubuntu/test '/#----#/' --prefix=/home/ubuntu/test
sed -i -e "1d" /home/ubuntu/test01
# Append the SQL to the changelog once, then deploy it to every server.
sudo sh -c 'cat /home/ubuntu/test01 >> /opt/db/db-script.sql'
export PATH=/opt/liquibase/:$PATH
while IFS=: read -r db_url db_name; do
  echo "########"
  echo "db_url is $db_url"
  echo "db_name is $db_name"
  for prepare in $db_name; do
    liquibase --driver=com.microsoft.sqlserver.jdbc.SQLServerDriver \
      --classpath="/opt/liquibase/mssql-jdbc-7.4.1.jre8.jar" \
      --url="jdbc:sqlserver://$db_url:1433;databaseName=$prepare;integratedSecurity=false;" \
      --changeLogFile="/opt/db/db-script.sql" \
      --username=xxxx --password=xxxx Update
  done
done < /home/ubuntu/test00
# Clean up working files.
sudo rm -rf /opt/db/db-script.sql /home/ubuntu/test* /tmp/gitcommit.txt
  • In your Jenkins job, use a shell build step to execute these scripts.
  • The file test comes from your CodeCommit repo and contains the SQL server information and the SQL queries to deploy; for example, its first line holds the RDS endpoint and database name separated by a colon, followed by the SQL statements.
  • Below is an example job for multiple database servers, which triggers Multidb-deployment.sh. If you are using a single SQL server deployment, use singledb-deployment.sh.

Prepare a sample SQL database for the demo:

CREATE DATABASE employee;
use employee;
CREATE TABLE employees
( employee_id INT NOT NULL,
  last_name VARCHAR(30) NOT NULL,
  first_name VARCHAR(30),
  salary VARCHAR(30),
  phone BIGINT NOT NULL,
  department VARCHAR(30),
  emp_role VARCHAR(30)
);
INSERT into [dbo].[employees] values ('1', 'kumar' ,'arun', '1000000', '9999998888', 'devops', 'architect' );
INSERT into [dbo].[employees] values ('2', 'hk' ,'guna', '5000000', '9398899434', 'cloud', 'engineer' );
INSERT into [dbo].[employees] values ('3', 'kumar' ,'manoj', '900000', '98888', 'lead', 'architect' );

Deployment 1: (for single SQL server deployment)

We are going to insert a new row using the CI/CD pipeline.

  • db-mssql: CodeCommit repo.
  • test: SQL server information (RDS endpoint:DBname) and the SQL that we need to deploy.
  • Once we commit our code to the CodeCommit repository, the webhook triggers the deployment.

Check the SQL server to verify the row inserted:

Deployment 2: (multiple SQL servers, deploying the same SQL statements)

  • db-mssql: CodeCommit Repo
  • test: SQL server information (RDS endpoint:DBname) and the SQL that we need to deploy.
  • #----#: this is the separator between the server list and the SQL queries, so don't remove it.

Deployment 3: (multiple SQL servers, deploying the same SQL stored procedure)

  • db-mssql: CodeCommit Repo
  • test: SQL server information (RDS endpoint:DBname) and the SQL that we need to deploy.
  • #----#: this is the separator between the server list and the SQL queries, so don't remove it.

Notification:

  • Once the job is executed, you will get an email notification.

Liquibase Limitations:

  • Comments inside a function or stored procedure will not get updated in the database.

Conclusion:

Here we used Liquibase on AWS, together with RDS, CodeCommit and related services. You can use the same method to configure an automated database deployment pipeline, with versioning and rollback, on AWS RDS, Azure SQL Database, Google Cloud SQL or Snowflake using the open-source tools Liquibase and Jenkins.

Migration: Assessment & Planning for one of the largest low-cost airlines


About Customer

The customer is a UAE aviation corporation that has served over 70 million passengers to date. Their ticket booking application, the Passenger Service System (PSS), was a legacy system they intended to migrate to a cloud environment while ensuring they could leverage managed cloud services, which they prepared for by conducting a Migration Readiness Assessment & Planning (MRAP) exercise.

Problem Statement

The Passenger Service System (PSS) was the customer's existing ticket booking application. The objective of the MRAP assessment was to understand this legacy system and recommend how it could be migrated to AWS while leveraging cloud-native capabilities. The focus would be application modernization rather than a lift-and-shift migration to the cloud. The customer team intended to leverage managed cloud services and adopt serverless, containers, open source, etc. wherever possible. They also wanted to move away from the commercial Oracle database to the open-source-based AWS Aurora PostgreSQL due to the high licensing costs imposed by Oracle.

MRAP is critical for any organization planning to adopt the cloud, as this tool-based assessment checks an application's readiness for the cloud. Powerup was approached to perform MRAP on the existing setup and, after the analysis, propose a migration plan and roadmap.

Proposed Solution

The customer’s MRAP Process

To begin with, the RISC Networks RN150 virtual appliance, an application discovery tool offered as an optional deployment architecture, was configured and installed in the customer's existing PSS Equinix data centre (DC) to collect data and produce a detailed tool-based assessment of the existing setup's readiness for migration.

Application stacks were built for the applications in scope, and assessments as well as group interviews were conducted with all stakeholders. Data gathered from stakeholders was cross-verified with the information provided by the customer's IT and application teams to bridge any gaps. The Powerup team would then create a proposed migration plan and roadmap.

MRAP Deliverables

A comprehensive and detailed MRAP report included the following information:

Existing overall architecture

The existing PSS system was bought from a vendor called Radixx International, which provided three major services:

  • Availability service, an essential core service mainly used by online travel agencies (OTAs), end users and global distribution systems (GDS) to check the availability of the customer's flights. Its base system contained modules like Connect Point CP (core), payments and the enterprise application (Citrix app), all written in .NET, plus an enterprise application for operations and administration written in VB6.
  • Reservation service was used for booking passengers' tickets; data was stored in two stores, Couchbase and the Oracle database. Its web page traffic was 1000:1 compared to the availability service.
  • The DCS system (check-in & departure control system) is another core system of any airline; it assists with passenger check-in, baggage check-in and alerting the required officials. It is a desktop application used by airport officials to manage passengers from one location to another, with an online check-in module also available.

Existing database: Oracle is the current core database storing all critical information. It consists of 4 nodes: 2 read-write nodes in RAC1 and 2 read-only nodes in RAC2. All availability checks are directed to the read-only Oracle nodes. The Oracle database nodes are heavily utilized, at roughly 60-70% on average, and the database currently has 14 schemas accessed by the various modules. Oracle Advanced Queuing is used in some cases to push data into the Oracle database.

Recommended AWS Landing zone structure

The purpose of AWS Landing Zone is to set up a secure, scalable, automated multi-account AWS environment derived from AWS best practices while implementing an initial security baseline through the creation of core accounts and resources.

The following Landing Zone Account structure was recommended for the customer:

AWS Organizations Account:

Primarily used to manage configuration and access to AWS Landing Zone managed accounts, the AWS organizations account provides the ability to create and financially manage member accounts.

Shared Services Account:

It serves as the reference for shared infrastructure services. In the customer's case, the Shared Services account will have 2 VPCs: one for management applications like AD, Jenkins, the monitoring server, bastion, etc., and the other for shared services like the NAT gateway and firewall. A Palo Alto firewall will be deployed in the shared services VPC across 2 Availability Zones (AZs) and load balanced using an AWS Application Load Balancer.

AWS SSM will be configured in this account for patch management of all servers, and AWS Pinpoint will be configured to send notifications to customers via email, SMS and push notifications.

Centralized Logging Account:

The log archive account contains a central Amazon S3 bucket for storing copies of all logs like CloudTrail, Config, CloudWatch logs, ALB Access logs, VPC flow logs, Application Logs etc. The logging account will also host the Elasticsearch cluster, which can be used to create custom reports as per customer needs, and Kibana will be used to visualize those reports. All logs will be pushed to the current Splunk solution used by the customer for further analysis.

Security Account:

The Security account creates auditor (read-only) and administrator (full-access) cross-account roles from a security account to all AWS Landing Zone managed accounts. The organization’s security and compliance team can audit or perform emergency security operations with this setup and this account is also designated as the master Amazon GuardDuty account. Security Hub will be configured in this account to get a centralized view of security findings across all the AWS accounts and AWS KMS will be configured to encrypt sensitive data on S3, EBS volumes & RDS across all the accounts. Separate KMS keys will be configured for each account and each of the above-mentioned services as a best practice.

Powerup recommended Trend Micro as the preferred anti-malware solution and the management server can be deployed in the security account.

Production Account:

This account will be used to deploy the production PSS application and the supporting modules. High availability (HA) and DR will be considered for all deployments in this account, and auto-scaling will be enabled wherever possible.

UAT Account – Optimized Lift & Shift:

This account will be used to deploy the UAT version of the PSS application. HA and scalability are not a priority in this account. It is recommended to shut down the servers during off-hours to save cost.

DR Account:

Based on an understanding of the customer's business, a hot-standby DR was recommended, where a scaled-down version of the production setup is always running and can be quickly scaled up in the event of a disaster.

UAT Account – Cloud-Native:

This account is where the customer's developers will test all the architectures in scope. Once the team has made the required application changes, they will use this account to test the application on cloud-native services like Lambda, EKS, Fargate, Cognito, DynamoDB, etc.

Application Module – Global Distribution Systems (GDS)

A global distribution system (GDS) is one of the 15 modules of the PSS application. It is a computerized network system that enables transactions between travel industry service providers, mainly airlines, hotels, car rental companies, and travel agencies, using real-time inventory (e.g. the number of hotel rooms, flight seats or rental cars available).

  • The customer gets bookings from various GDS systems like Amadeus, Sabre, Travelport etc.
  • ARINC is the provider that connects the customer with the various GDS systems.
  • The request comes from GDS systems and is pushed into the IBM MQ cluster of ARINC where it’s further pushed to the customer IBM MQ.
  • The GMP application then polls the IBM MQ queue and sends the requests to the PSS core, which in turn reads/writes to the Oracle DB.
  • GNP application talks with the Order Middleware, which then talks with the PSS systems to book, cancel, edit/change tickets etc.
  • Pricing is provided by the Offer Middleware.

Topology Diagram from RISC tool showing interdependency of various applications and modules:

Any changes in the GDS architecture can break the interaction between applications and modules or cause a discrepancy in the system that might lead to a compromise in data security. In order to protect the system from becoming vulnerable, Powerup recommended migrating the architecture as is while leveraging the cloud capabilities.

Proposed Migration Plan

The IBM MQ cluster will be set up on EC2, and auto-scaling will be enabled to maintain the required number of nodes, ensuring availability of the EC2 instances at all times. IBM MQ will be deployed in a private subnet.

Amazon Elastic File System (Amazon EFS) will be automatically mounted on the IBM MQ server instance for distributed storage, to ensure high availability of the queue manager service and the message data. If the IBM MQ server fails in one availability zone, a new server is created in the second availability zone and connected to the existing data, so that no persistent messages are lost.

An Application Load Balancer will be used to automatically distribute connections to the active IBM MQ server. The GMP application and the PNL & ADL application will be deployed on EC2 across 2 AZs for high availability. GMP will be deployed in an auto-scaling group that scales based on the queue length in the IBM MQ server, so messages are consumed and processed as soon as possible, whereas PNL & ADL will scale out in case of high traffic.

APIS Inbound Application, AVS application, PSF & PR application and the Matip application will all be deployed on EC2 across 2 AZs for high availability in an auto-scaling group to scale out in case of high traffic.

Cloud-Native Architecture

  • The GMP and GMP code-sharing applications will be deployed as Lambda functions; the function will run whenever a new message arrives in IBM MQ.
  • The PNL & ADL application will be deployed as a Lambda function that runs when a PNR number changes and a message must be sent to the airport.
  • The AVS application will be deployed as a Lambda function that runs whenever a message is sent to the external systems.
  • The Matip application will be deployed as a Lambda function that runs whenever a message is sent using the MATIP protocol.
  • The PFS & PR application will be deployed as Lambda functions that run whenever a booking message is sent to the airport.
  • The APIS Inbound application will be deployed as a Lambda function that runs whenever an APIS message is sent to the GDS systems.

For all the above, the required compute resources will be assigned as needed, and the Lambda functions will scale based on load.

Application modifications recommended

All the application components (GMP, AVS, PNL & ADL, PFS & PR, Matip, etc.) are currently written in .NET and have to be moved to .NET Core to run as Lambda functions. It is recommended that the applications be broken down into microservices.

Oracle to Aurora Database Migration

The AWS Schema Conversion Tool (SCT) is run on the source database to generate a schema conversion report, which helps in understanding the interdependencies of the existing schemas and how they can be migrated to Aurora PostgreSQL. The report lists the database objects that SCT can convert directly and those that need manual intervention. For Oracle functionality that is not supported in Aurora PostgreSQL, the application team must write custom code. Once all the schemas are migrated, AWS Database Migration Service (DMS) will be used to migrate the entire data set from Oracle to Aurora.
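
As a rough illustration of the final step, a DMS full-load task could be created with boto3 along these lines. This is only a sketch: the endpoint and replication instance ARNs, the task name and the table-mapping rule are hypothetical placeholders.

import json
import boto3

dms = boto3.client("dms")

# Select every table in every schema; in practice the mapping would follow
# the schema-to-module mapping described above.
table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all",
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }
    ]
}

dms.create_replication_task(
    ReplicationTaskIdentifier="oracle-to-aurora-full-load",
    SourceEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:oracle-source",
    TargetEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:aurora-postgres-target",
    ReplicationInstanceArn="arn:aws:dms:us-east-1:111122223333:rep:replication-instance",
    MigrationType="full-load",
    TableMappings=json.dumps(table_mappings),
)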

Oracle to Aurora-PostgreSQL Roadmap

  • Lift & shift:

The current Oracle database will be moved to AWS as-is, without any changes, in order to kick-start the migration. The Oracle database can run on the AWS RDS service or on EC2 instances. One RDS node will be the master database in read/write mode; the master instance is the only instance the application can write to. There will be 3 additional read replicas spread across 2 AZs of AWS to handle the incoming read requests. If the master node goes down, one of the read replicas is promoted to master.

  • Migrate the Oracle schemas to Aurora:

Once the Oracle database is fully migrated to AWS, the next step is to gradually migrate the schemas one by one to Aurora PostgreSQL. The first step is to map all 14 schemas to the customer's application modules. The schemas will be migrated based on this mapping; schemas with no dependencies on other modules will be identified and migrated first.

The application will be modified to work with the new Aurora schema. Any functionality not supported by Aurora will be moved into application logic.

DB links can be established from Oracle to Aurora; however, they cannot be established from Aurora to Oracle.

Any new application development that is in progress should be compatible and aligned with the Aurora schema.

  • Final Database:

Finally, all 14 schemas will be migrated to Aurora and the data will be migrated using the DMS service. The entire process is expected to take up to 1 year. There will be 4 Aurora nodes, one master (write) node and 3 read replicas, spread across 2 AZs of AWS for high availability.

Key Findings

The assessment provided a roadmap for moving from Oracle to PostgreSQL, saving up to 30% in Oracle licensing costs. It also provided a way forward for each application towards cloud-native.

The currently provisioned infrastructure was utilized at only around 40-50%, and a significant reduction in total cost of ownership (TCO) was identified if the customer went ahead with cloud migration. Reduced administration through AWS managed services also proved promising, facilitating smooth and optimized functioning of the system with minimal administrative effort.

With the MRAP assessment and findings in place, the customer now has greater visibility towards cloud migration and the benefits it would derive from implementing it.

Cloud platform

AWS.

Technologies used

EPS, ALB, PostgreSQL Aurora, Lambda, RDS Oracle, VPC.

Thundering Clouds – Technical overview of AWS vs Azure vs Google Cloud


Compiled by Kiran Kumar, Business Analyst at Powerupcloud Technologies.

The battle of the Big 3 Cloud Service Providers

The cloud ecosystem is in a constant state of evolution; with increasing maturity and adoption, the battle for the mind and wallet intensifies. With Amazon Web Services (AWS), Microsoft Azure, and Google Cloud (GCP) leading in IaaS maturity, players from Salesforce, SAP, and Oracle to Workday, which recently reached $1B in quarterly revenue, are both gaining ground and carving out niches in the 'X'aaS space. The recent COVID crisis has accelerated both adoption and consideration as enterprises transform to cope, differentiate, and sustain an advantage over the competition.

In this article, I will stick to AWS, Azure, and GCP, terming them the Big 3. A disclaimer: Powerup is a top-tier partner with all three, and the comparisons are purely objective, based on currently publicly available information. It is very likely that by the time you read this article a lot will already have changed. Having said that, the future will belong to those who excel at providing managed solutions around artificial intelligence, analytics, IoT, and edge computing. So let's dive right in:

Amazon Web Services – The oldest of the three and the most widely known, with the biggest spread of availability zones and an extensive roster of services. It has capitalized on its maturity to activate a developer ecosystem globally, which has proven to be a critical enabler of its widespread use.

Microsoft Azure – Azure is the closest one gets to AWS in terms of products and services. While AWS has fully leveraged its head start, Azure has tapped into Microsoft's huge enterprise customer base and let those customers take advantage of their existing infrastructure by providing better value through Windows support and interoperability.

Google Cloud Platform – Google Cloud was announced in 2011; for a platform less than a decade old, it has created a significant footprint. It was initially intended to strengthen Google's own products, but later grew into an enterprise offering. Its deep expertise in AI, ML, deep learning and data analytics is expected to give it a significant edge over the other providers.

AWS vs. Azure vs. Google Cloud: Overall Pros and Cons

In this analysis, I dive into broad technical aspects of these 3 cloud providers based on the common parameters listed below.

  • Compute
  • Storage
  • Exclusives  

Compute

AWS Compute:

Amazon EC2: EC2, or Elastic Compute Cloud, is Amazon's compute offering. EC2 supports multiple instance types (bare metal, GPU, Windows, Linux, and more) and can be launched with different security and networking options, and you can choose from a wide range of templates based on your use case. EC2 instances can be both resized and auto-scaled to handle changing requirements, which eliminates the need for complex governance.

Amazon Elastic Container Service (ECS) is a highly scalable, high-performance container orchestration service that supports Docker containers and allows you to easily run and scale containerized applications, manage and scale a cluster of VMs, or schedule containers on those VMs.

Amazon EKS makes it easy to deploy, manage, and scale containerized applications using Kubernetes on AWS.

It also has its own Fargate service, which automates server and cluster management for containers; a virtual private server option known as Lightsail; Batch for batch computing jobs; Elastic Beanstalk for running and scaling web applications; and Lambda for launching serverless applications.

Container services also include Amazon Elastic Container Registry (ECR), a fully managed Docker container registry that allows you to store, manage, and deploy Docker container images.

Microsoft Azure Compute:

Azure VMs: Azure Virtual Machines are a secure and highly scalable compute solution, with various instance types optimized for high-performance computing, AI/ML workloads and container instances and, in line with Azure's emphasis on hybrid computing, support for multiple OS types, Microsoft software, and services. Virtual Machine Scale Sets are used to auto-scale your instances.

Azure container services include Azure Kubernetes Service (AKS), a fully managed Kubernetes-based container solution.

Azure Container Registry lets you store and manage container images across all types of Azure deployments.

Service Fabric is a fully managed service that lets you develop microservices and orchestrate containers on Windows or Linux.

Other services include Web App for Containers, which lets you run, scale, and deploy containerized web apps; Azure Functions for launching serverless applications; and Azure Red Hat OpenShift, with support for OpenShift.

Google Compute Engine:

Google Compute Engine (GCE) is Google's compute service. Google is fairly new to the cloud compared to the other two CSPs, and this is reflected in its catalogue of services. GCE offers the standard array of features: Windows and Linux instances, RESTful APIs, load balancing, data storage and networking, CLI and GUI interfaces, and easy scaling. Backed by Google's infrastructure, GCE can spin up instances faster than most of the competition in most cases. It runs on carbon-neutral infrastructure and offers the best value for money among the competition.

Google Kubernetes Engine (GKE) is based on Kubernetes, which was originally developed in-house; Google has the deepest expertise when it comes to Kubernetes and has integrated it tightly into Google Cloud Platform. The GKE service can be used to automate many of your deployment, maintenance, and management tasks, and it can also be used with hybrid clouds via the Anthos service.

Storage

AWS Storage:

Amazon S3 is an object storage service that offers scalability, data availability, security, and performance for most storage requirements. Amazon Elastic Block Store (EBS) provides persistent block storage for Amazon EC2 instances, and Elastic File System (EFS) provides scalable file storage.

Other storage services include S3 Glacier, a secure, durable, and extremely low-cost storage service for data archiving and long-term backup; Storage Gateway for hybrid storage; and Snowball, a device used for offline small-to-medium-scale data transfer.

Database

Database services include Amazon Aurora, a SQL-compatible relational database; RDS (Relational Database Service); DynamoDB, a NoSQL database; Amazon ElastiCache, an in-memory data store; Redshift, a data warehouse; and Amazon Neptune, a graph database.

Azure Storage:

Azure Blobs is a massively scalable object storage solution that includes support for big data analytics through Data Lake Storage Gen2; Azure Files is a managed file storage solution with on-prem support; Azure Queues is a reliable messaging store; and Azure Tables is a NoSQL store for structured data.

Azure Disks provides block-level storage volumes for Azure VMs, similar to Amazon EBS.

Database

Database services include SQL-based offerings like Azure SQL Database, Azure Database for MySQL, and Azure Database for PostgreSQL; for NoSQL and data warehouse needs, Cosmos DB and Table Storage; SQL Server Stretch Database, a hybrid storage service designed specifically for organizations running Microsoft SQL Server on-prem; and Azure Cache for Redis, an in-memory data store.

Google Cloud Storage:

GCP's storage services include Google Cloud Storage, unified, scalable, and highly durable object storage; Filestore, network-attached storage (NAS) for Compute Engine and GKE instances; Persistent Disk, block storage for VM instances; and Transfer Appliance for large data transfers.

Database

On the database side, GCP has three NoSQL databases: Cloud Bigtable for storing big data, Firestore, a document database for mobile and web application data, and Firebase Realtime Database, a cloud database for storing and syncing data in real time. It also offers BigQuery, an analytics data warehouse; Memorystore for in-memory storage; SQL-based Cloud SQL; and a relational database called Cloud Spanner that is designed for mission-critical workloads.

Benchmarks Reports

An additional drill-down is to analyze performance figures for the three across network, storage, and CPU; here I quote research data from a study conducted by Cockroach Labs.

Network

GCP has made significant strides in networking and latency compared to last year, and it now even outperforms AWS and Azure in network performance.

  • Some of GCP's best performing machines hover around 40-60 GB/sec.
  • AWS machines stick to their claims and offer a consistent 20 to 25 GB/sec.
  • Azure's machines offered significantly less, at 8 GB/sec.
  • When it comes to latency, AWS outshines the competition by offering consistent performance across all of its machines.
  • GCP does undercut AWS in some cases but still lacks the consistency of AWS.
  • Azure's weak network performance is reflected in high latency, making it the lowest performer of the three.

NOTE: GCP attributes the increase in network performance to the Skylake processors used in the n1 family of machines.

Storage

AWS has superior storage performance; neither GCP nor Azure comes close to its read/write speeds and latency figures. This is largely due to storage-optimized instances like the i3 series. Azure and GCP do not have storage-optimized instances, and their performance is comparable to Amazon's non-storage-optimized instances. Of the two, Azure offered slightly better read/write speed, while GCP offered better latency.

CPU

Comparing CPU performance, Azure machines showed slightly higher figures thanks to using conventional 16-core CPUs: Azure machines use 16 cores with a single thread per core, while the other clouds use hyperthreading to reach 16 vCPUs by combining 8 cores with 2 threads each. After comparing each offering across the three platforms, here is the best each cloud platform has to offer:

  • AWS c5d.4xlarge: 25,000-50,000 Bogo ops per second
  • Azure Standard_DS14_v2: just over 75,000 Bogo ops per second
  • GCP c2-standard-16: 25,000-50,000 Bogo ops per second
  • While the AWS and GCP figures look similar, AWS overall performs slightly better than GCP.
  • Avoiding hyperthreading has inflated Azure's figures; while Azure may still be superior in performance, the numbers may not accurately represent the real difference in performance.

For detailed benchmarking reports visit Cockroach Labs  

Key Exclusives

Going forward, technologies like artificial intelligence, machine learning, the Internet of Things (IoT), and serverless computing will play a huge role in shaping the technology industry. Most services and products will try to take advantage of these technologies to deliver solutions more efficiently and precisely. All of the Big 3 providers have begun experimenting with offerings in these areas, and this could well become the key differentiator between them.

AWS Key Tools:

Some of the latest additions to the AWS portfolio include AWS Graviton processors built using 64-bit Arm Neoverse cores. The EC2 M6g, C6g, and R6g instances are powered by these new-generation processors. Thanks to the power-efficient Arm architecture, they are said to provide 40% better price performance over x86-based instances.

AWS Outposts: Outposts is Amazon's answer to hybrid architecture; it is a fully managed IT-as-a-service (ITaaS) solution that brings AWS products and services anywhere by physically deploying AWS infrastructure at your site. It aims to offer a consistent hybrid experience with the scalability and flexibility of AWS.

AWS has put a lot of time and effort into developing a broad range of products and services in the AI and ML space. Some of the important ones include the Amazon SageMaker service for training and deploying machine learning models, the Lex conversational interface and Polly text-to-speech service that power Alexa, the Greengrass IoT service, and the Lambda serverless computing service.

There are also AI-powered offerings like DeepLens, which can be trained and used for OCR and image and character recognition, and Gluon, an open-source deep-learning library designed to build and quickly train neural networks without having to know AI programming.

Azure Key Tools:

When it comes to hybrid support, Azure offers a very strong proposition: services like Azure Stack and Azure Arc minimize your risk of going wrong. Knowing that many enterprises already use Microsoft's services, Azure deepens this relationship by offering enhanced security and flexibility through its hybrid services. With Azure Arc, customers can manage resources deployed inside and outside of Azure through the same control plane, enabling organizations to extend Azure services to their on-prem data centers.

Azure also offers a comprehensive family of AI services and cognitive APIs that help you build intelligent apps; services like the Bing Web Search API, Text Analytics API, Face API, Computer Vision API and Custom Vision Service fall under this umbrella. For IoT, it has several management and analytics services, and it also has a serverless computing service known as Azure Functions.

Google Cloud Key Tools:

AI and machine learning are big areas of focus for GCP. Google is a leader in AI development thanks to TensorFlow, an open-source software library for building machine learning applications. It is the single most popular library of its kind in the market, and AWS has added support for TensorFlow in acknowledgment of this.

Google Cloud has strong API offerings for natural language, speech, translation, and more. It also offers IoT and serverless services, though both are still in beta. Google has been working extensively on Anthos which, as Sundar Pichai put it, follows a "write once and run anywhere" approach, allowing organizations to run Kubernetes workloads on-premises, on AWS or on Azure; Azure support, however, is still in beta testing.

Verdict

Each of the three has its own set of features and comes with its own constraints and advantages. The selection of the appropriate cloud provider should therefore, as with most enterprise software, be based on your organizational goals over the long term.

That said, we strongly believe that multi-cloud will be the way forward for most organizations. For example, if an organization is an existing user of Microsoft's services, it is natural for it to prefer Azure. Most small, web-based or digitally native companies looking to scale quickly by leveraging AI/ML and data services would want to take a good look at Google Cloud. And of course, AWS, with the sheer scale and maturity of its products and services, is very hard to ignore in any mix.

I hope this sheds some light on the technical considerations; we will follow this up with some of the other key evaluation factors that we think you should consider while selecting your cloud provider.

Infra transformation through complete automation


Customer: One of India’s top media solutions company

 

Summary

Powerupcloud helped the customer completely transform their business environment through end-to-end automation. Our architecture design and solution engineering improved business process efficiency without any manual intervention, decreasing turnaround time by more than 90%. With most of their applications now running on the cloud, the customer has become one of the most customer-friendly media companies in India.

Problem Statement

The customer's team wanted to concentrate on building applications rather than spending time on infrastructure setup and on installing and maintaining dependency packages on the servers. The proposed solution needed to be quick and scalable so that business performance would improve significantly.

Proposed Solution

Focusing on workload and transaction volume, we designed a customer-friendly, network-optimized, highly agile and scalable cloud platform that enabled cost optimization, effective management, and easy deployment. This helped reduce manual intervention and cost overheads.

CloudFormation Templates:

We used the AWS-native tool CloudFormation to deploy the infrastructure as code; the idea is that the same templates that deploy the infrastructure can also be used for disaster recovery.

CloudFormation templates were implemented in the stage and prod environments based on AWS best practices, with the servers residing in private subnets and internet routing handled by a NAT gateway.

To remove IP dependencies and better manage failures, the servers and websites are pointed to Application Load Balancers; for cost optimization, a single load balancer serves multiple target groups.
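
As an illustration of the infrastructure-as-code idea, the same template can be launched, or re-launched in a DR region, with a few lines of boto3. This is only a sketch: the stack name, template URL, region and parameter names are placeholders.

import boto3

# Re-deploying the same template in another region is what makes the
# stack reusable for disaster recovery.
cfn = boto3.client("cloudformation", region_name="ap-south-1")

cfn.create_stack(
    StackName="media-prod-stack",
    TemplateURL="https://s3.amazonaws.com/example-bucket/prod-template.yaml",
    Parameters=[
        {"ParameterKey": "Environment", "ParameterValue": "prod"},
    ],
    Capabilities=["CAPABILITY_NAMED_IAM"],  # required when the template creates IAM resources
)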

 

Base Packages Dependency:

The solution had to remove the developers' dependency on manually installing packages on the servers to support the application.

The packages need to be installed as part of the infrastructure setup, so that developers can deploy their code using the code deployment service rather than spending time installing dependencies.

Hence, we proposed and implemented the solution with Ansible; with Ansible we can manage multiple servers under a single roof. We prepared a shell script that installs the required packages on the servers.

The architecture is primarily split into backend and frontend modules.

Backend module: this is where the Java application runs, so a shell script runs on the backend servers to install Java 8 and create the home path based on a standard path, so that the application's home-path requirement is always satisfied.

Frontend module: essentially Nginx combined with Node.js, set up using the same methodology.

Volume Mount

The application's logs and other backup artifacts are kept on a secondary EBS volume; the mount point and fstab entries are also automated.

Code-deploy agent

The main part of the deployment is handled by AWS CodeDeploy, so the CodeDeploy agent must be installed on the servers during setup, which is also done through Ansible.

User Access:

User access is another area we addressed: access to the servers is restricted for the development team and is granted only with the approval of their leads.

We had dev, QA, PSR, stage and prod environments. We grouped all the servers in the Ansible inventory, generated a public/private key pair and placed the keys on a standard path. When the user-add script runs, Ansible copies the public key and creates a user on the destination server, pasting the public key into the authorized_keys file.

This method keeps the key handling hidden from the end user, and when we are asked to remove a user, we use Ansible to delete that user from the servers.

Monitoring with Sensu:

The infra team is responsible for monitoring the infrastructure, so we created a shell script, run via Ansible, that installs Sensu on the destination servers for monitoring.

By implementing these solutions, the development team no longer had to worry about package dependencies, which allowed them to concentrate on application development and bug fixes, and user access was streamlined.

Bastion with MFA settings:

The servers in the environment can be accessed only via the bastion server, which acts as the entry point.

This bastion server was set up with MFA, so each user must authenticate with MFA to access it, as a security best practice.

SSL renewal

In one of the legacy accounts, SSL is offloaded at the server level with a lot of vhosts, so renewing certificates used to take time. To reduce this, we used Ansible to rotate the SSL certificates quickly with less human effort.

Automation in the pipeline:

  • Terraform implementation
  • Base package installation on boot-up, which removes one installation step.
  • User access with automatic expiry.

Challenges

In addition to the ongoing consulting engagement with the customer for enhancements and designing a system to meet the client's needs, Powerupcloud also faced some challenges. The infrastructure had to be created quickly, with 13 servers behind the application load balancers, covering networking, compute, and load balancers with target groups. The instances had to be provisioned with certain dependencies to run the application smoothly. As a result, the development process became more complicated.

The solution was also expected to meet very high standards of security, continuous monitoring, non-stop 24x7 operation, high availability, agility, scalability, low turnaround time, and high performance, which was a considerable challenge given the high business criticality of the application.

To overcome these challenges, we established a predictive performance model for early problem detection and prevention. We also started a dedicated performance analysis team with active participation from various client groups.

All configuration changes are executed smoothly and rapidly with a view to minimizing load imbalance and outage time.

Business Result & outcome

With the move to automation, the customer's turnaround time decreased by 30%. The new system also helped them reduce capital investment, as it is completely automated. The solution was designed in keeping with our approach of security, scalability, agility, and reusability.

  • Complete automation
  • Successful implementation of the CloudFormation template.
  • Improved business process efficiency by over 90%
  • Network optimized for a virtualized environment.
  • Key-based access mechanism with secure logins.
  • Highly agile and scalable environment.

Cloud platform

AWS.

Technologies used

CloudFormation templates, Ansible.

Serverless Data processing Pipeline with AWS Step Functions and Lambda


Written by Arun Kumar, Associate Cloud Architect at Powerupcloud Technologies

In the traditional ETL world, we generally use our own scripts, a paid tool, an open-source data processing tool, or an orchestrator to deploy our data pipelines. If the data processing pipeline is not complex, these server-based solutions can add unnecessary cost. In AWS we have multiple serverless options, such as Lambda and Glue, but Lambda has an execution time limit and Glue runs an EMR cluster in the background, which ultimately becomes expensive. So we decided to explore AWS Step Functions with Lambda: both are serverless, and Step Functions acts as an orchestration service that executes our process on an event basis and terminates the resources once the process completes. Let's see how we can build a data pipeline with this.

Architecture Description:

  1. The on-prem Teradata server sends the input.csv file to the S3 bucket (data_processing folder) on a schedule.
  2. A CloudWatch Event Rule triggers the Step Function on a PutObject in the specified S3 bucket and starts processing the input file.
  3. The cleansing scripts are hosted on ECS.
  4. The AWS Step Function calls a Lambda function, which triggers the ECS tasks (a bunch of Python and R scripts).
  5. Once the cleansing is done, the output file is uploaded to the target S3 bucket.
  6. An AWS Lambda function is triggered to get the output file from the target bucket and send it to the respective team.

Create a custom CloudWatch Event Rule for S3 put object operation

Choose Event Pattern -> Service Name -> S3 -> Event Type -> Object level operations -> choose put object -> give the bucket name.

  • In Targets, choose the Step Function to be triggered and give the name of the state machine you created.
  • Create a new role or use an existing role, as CloudWatch Events requires permission to send events to your Step Function.
  • Choose one more target to trigger the Lambda function, and choose the function we created.
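
As a rough equivalent of the console steps above, the rule and its targets could also be created with boto3. This is a minimal sketch: the bucket name, state machine ARN, Lambda ARN and role ARN are placeholders, and this CloudTrail-based pattern assumes data-event logging is enabled for the bucket.

import json
import boto3

events = boto3.client("events")

# Match PutObject calls on the input bucket (delivered via CloudTrail).
pattern = {
    "source": ["aws.s3"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["s3.amazonaws.com"],
        "eventName": ["PutObject"],
        "requestParameters": {"bucketName": ["my-input-bucket"]},
    },
}

events.put_rule(
    Name="s3-putobject-rule",
    EventPattern=json.dumps(pattern),
    State="ENABLED",
)

# Two targets: the Step Functions state machine and the notification Lambda.
events.put_targets(
    Rule="s3-putobject-rule",
    Targets=[
        {
            "Id": "state-machine",
            "Arn": "arn:aws:states:us-east-1:111122223333:stateMachine:demo-pipeline",
            "RoleArn": "arn:aws:iam::111122223333:role/cwe-start-stepfunction",
        },
        {
            "Id": "notify-lambda",
            "Arn": "arn:aws:lambda:us-east-1:111122223333:function:send-output-email",
        },
    ],
)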

Go to the AWS Management Console and search for Step Functions

  • Create a state machine.
  • On the Define state machine page, select Author with code snippets.
  • Give it a name. Review the state machine definition and visual workflow.
  • Use the graph in the Visual Workflow pane to check that your Amazon States Language code describes your state machine correctly.
  • Create a new IAM role, or select an existing one if you have previously created an IAM role.
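
For reference, a minimal state machine like the one used here, a single Task state that invokes the ECS-triggering Lambda, could also be created programmatically as sketched below; the names, ARNs and region are hypothetical.

import json
import boto3

sfn = boto3.client("stepfunctions")

# Amazon States Language definition: one Task state that invokes the Lambda
# which launches the ECS cleansing task.
definition = {
    "Comment": "Serverless data processing pipeline",
    "StartAt": "RunCleansingTask",
    "States": {
        "RunCleansingTask": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:111122223333:function:run-ecs-cleansing",
            "End": True,
        }
    },
}

sfn.create_state_machine(
    name="demo-pipeline",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::111122223333:role/stepfunctions-execution-role",
)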

Create an ECS with Fargate

ECS console -> choose create cluster -> choose cluster template -> Networking only -> Next -> Configure -> cluster -> give cluster name -> create

In the navigation pane, choose Task Definitions, Create new Task Definition.

On the Select compatibilities page, select the launch type that your task should use and choose Next step. Choose Fargate launch type.

For Task Definition Name, type a name for your task definition.

For Task Role, choose an IAM role that provides permissions for containers in your task to make calls to AWS API operations on your behalf.

To create an IAM role for your tasks

a.   Open the IAM console.

b.   In the navigation pane, choose Roles, Create New Role.

c.   In the Select Role Type section, for the Amazon Elastic Container Service Task Role service role, choose Select.

d.   In the Attach Policy section, select the policy to use for your tasks and then choose Next Step.

e.   For Role Name, enter a name for your role. Choose Create Role to finish.

For Task execution IAM role, either select your task execution role or choose Create new role so that the console can create one for you.

For Task size, choose values for Task memory (GB) and Task CPU (vCPU).

For each container in your task definition, complete the following steps:

a.   Choose Add container.

b.   Fill out each required field and any optional fields to use in your container definitions.

c.   Choose Add to add your container to the task definition.
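
The same cluster and Fargate task definition could also be created with boto3, roughly as sketched below. The account ID, image URI, roles and log group are placeholders, while the cluster and family names match the ones used by the Lambda function later in this post.

import boto3

ecs = boto3.client("ecs")

# "Networking only" cluster for Fargate tasks.
ecs.create_cluster(clusterName="Demo")

ecs.register_task_definition(
    family="Demo-ubuntu-new",            # referenced by the Lambda run_task call below
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",                # required for Fargate
    cpu="512",                           # 0.5 vCPU
    memory="1024",                       # 1 GB
    executionRoleArn="arn:aws:iam::111122223333:role/ecsTaskExecutionRole",
    taskRoleArn="arn:aws:iam::111122223333:role/ecsTaskRole",
    containerDefinitions=[
        {
            "name": "cleansing",
            "image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/cleansing:latest",
            "essential": True,
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/cleansing",
                    "awslogs-region": "us-east-1",
                    "awslogs-stream-prefix": "ecs",
                },
            },
        }
    ],
)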

Create a Lambda function

  • Create lambda function to call the ECS

In the Lambda console -> Create function -> choose Author from scratch -> give the function name -> for Runtime choose Python 3.7 -> create a new role, or if you have an existing role, choose one with the required permissions [ Amazon ECS Full Access, AWS Lambda Basic Execution Role ]

import time

import boto3

client = boto3.client('ecs')


def lambda_handler(event, context):
    # Start the Fargate task that runs the cleansing scripts.
    response = client.run_task(
        cluster='Demo',
        launchType='FARGATE',
        taskDefinition='Demo-ubuntu-new',
        count=1,
        platformVersion='LATEST',
        networkConfiguration={
            'awsvpcConfiguration': {
                'subnets': ['subnet-f5e959b9', 'subnet-11713279'],
                'assignPublicIp': 'ENABLED',
                'securityGroups': ['sg-0462860d9c60d87d3']
            },
        }
    )
    print("this is the response", response)
    task_arn = response['tasks'][0]['taskArn']
    print(task_arn)

    # Wait for the task to finish its work before stopping it.
    time.sleep(31.5)
    stop_response = client.stop_task(
        cluster='Demo',
        task=task_arn
    )
    print(stop_response)
    return str(stop_response)
  • Give the required details such as cluster, launchType, taskDefinition, count, platformVersion, and networkConfiguration.
  • The application hosted on ECS Fargate will process the data_process.csv file, and the output file will be pushed to the output folder of the target S3 bucket.

Create a Notification to trigger the Lambda function (Send Email)

  • To enable event notifications for an S3 bucket -> open the Amazon S3 console.
  • In the Bucket name list, choose the name of the bucket that you want to enable events for.
  • Choose Properties -> under Advanced settings, choose Events -> choose Add notification.
  • In Name, type a descriptive name for your event configuration.
  • Under Events, select one or more types of event occurrences that you want to receive notifications for. When the event occurs, a notification is sent to the destination that you choose.
  • Type an object name Prefix and/or a Suffix to filter the event notifications by the prefix and/or suffix.
  • Select the type of destination to have the event notifications sent to.
  • If you select the Lambda Function destination type, do the following:
  • In Lambda Function, type or choose the name of the Lambda function that you want to receive notifications from Amazon S3, and choose Save. (A boto3 sketch of this notification configuration appears after the Node.js code below.)
  • Create a Lambda function with Node.js
    • Note: Provide the bucket name, folder, file name, and a verified email address.
var aws = require('aws-sdk');
var nodemailer = require('nodemailer');
var ses = new aws.SES({ region: 'us-east-1' });
var s3 = new aws.S3();

// Fetch the output file from S3 so it can be attached to the email.
function getS3File(bucket, key) {
    return new Promise(function (resolve, reject) {
        s3.getObject(
            {
                Bucket: bucket,
                Key: key
            },
            function (err, data) {
                if (err) return reject(err);
                else return resolve(data);
            }
        );
    });
}

exports.handler = function (event, context, callback) {
    getS3File('window-demo-1', 'output/result.csv')
        .then(function (fileData) {
            var mailOptions = {
                from: 'arun.kumar@powerupcloud.com',
                subject: 'File uploaded in S3 succeeded!',
                html: `<p>You got a contact message from: <b>${event.emailAddress}</b></p>`,
                to: 'arun.kumar@powerupcloud.com',
                attachments: [
                    {
                        filename: "result.csv",
                        content: fileData.Body
                    }
                ]
            };
            console.log('Creating SES transporter');
            // create Nodemailer SES transporter
            var transporter = nodemailer.createTransport({
                SES: ses
            });
            // send email
            transporter.sendMail(mailOptions, function (err, info) {
                if (err) {
                    console.log(err);
                    console.log('Error sending email');
                    callback(err);
                } else {
                    console.log('Email sent successfully');
                    callback();
                }
            });
        })
        .catch(function (error) {
            console.log(error);
            console.log('Error getting attachment from S3');
            callback(error);
        });
};
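As referenced above, the same S3 event notification can also be configured programmatically. The following is a minimal boto3 sketch, assuming a placeholder Lambda ARN, filtering on the output/ prefix and .csv suffix; the bucket name is taken from the code above.

import boto3

s3 = boto3.client('s3')

s3.put_bucket_notification_configuration(
    Bucket='window-demo-1',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [{
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:111122223333:function:send-result-email',  # placeholder
            'Events': ['s3:ObjectCreated:Put'],
            'Filter': {
                'Key': {
                    'FilterRules': [
                        {'Name': 'prefix', 'Value': 'output/'},
                        {'Name': 'suffix', 'Value': '.csv'}
                    ]
                }
            }
        }]
    }
)

# Note: Amazon S3 must also be granted permission to invoke the function
# (lambda add-permission with principal s3.amazonaws.com).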

Conclusion:

If you are looking for a serverless orchestrator for your batch processing or a complex data processing pipeline, give AWS Step Functions and Lambda a try; here we used ECS Fargate to cleanse the data. If your data processing script is more complex, you can integrate with Glue, and Step Functions will still act as your orchestrator.

Migration & App Modernization (Re-platforming)

By | Case Study, Cloud Case Study | No Comments

Summary

The customer is one of the largest Indian entertainment companies, acquiring, co-producing, and distributing Indian cinema across the globe. They believe that media and OTT platforms can derive maximum benefit from the multi-tenant media management solutions provided by the cloud. Therefore, they are looking at migrating their existing servers, databases, applications, and content management system to the cloud for better scalability, maintenance of large volumes of data, modernization, and cost-effectiveness. The customer also intends to look at alternative migration strategies like restructuring and refactoring if need be.

About customer

The customer is a global Indian entertainment company that acquires, co-produces, and distributes Indian films across all available formats such as cinema, television and digital new media. The customer became the first Indian media company to list on the New York Stock Exchange. It has experience of over three decades in establishing a global platform for Indian cinema. The company has an extensive and growing movie library comprising over 3,000 films, which include Hindi, Tamil, and other regional language films for home entertainment distribution.

The company also owns the rapidly growing Over The Top (OTT) platform. With over 100 million registered users and 7.9 million paying subscribers, the customer is one of India’s leading OTT platforms with the biggest catalogue of movies and music across several languages.

Problem statement / Objective

The online video market has brought a paradigm shift in the way technology is being used to enhance the customer journey and user experience. Media companies have huge storage and serving needs as well as the requirement for high availability via disaster recovery plans so that a 24x7x365 continuous content serving is available for users. Cloud could help media and OTT platforms address some pressing business challenges. Media and OTT companies are under constant pressure to continuously upload more content cost-effectively. At the same time, they have to deal with changing patterns in media consumption and the ways in which it is delivered to the audience.

The customer was keen on migrating their flagship OTT platform from a key public cloud platform to Microsoft Azure. Some of the key requirements were improved maintainability, scalability, and modernization of technology platforms. The overall migration involved re-platforming and migrating multiple key components such as the content management system (CMS), the Application Program Interfaces (APIs), and the data layer.

Solution

Powerup worked closely with the client’s engineering teams and with the OEM partner (Microsoft) to re-architect and re-platform the CMS component by leveraging the right set of PaaS services. The deployment and management methodology changed to containers (Docker) and Kubernetes.

Key learnings from the project are listed below:

  • Creating a bridge between the old database (in MySql) and a new database (in Postgres).
  • Migration of a massive volume of content from the source cloud platform to Microsoft Azure.
  • Rewriting the complete CMS app using a modern technology stack (using Python/Django) while incorporating functionality enhancements.
  • Setting up and maintaining the DevOps pipeline on Azure using open source components.

Outcome/Result

The modernized infrastructure powered by Azure provided improved scalability and stability. The customer was able to minimize infrastructure maintenance using PaaS services. The modular design enabled by the migration empowered developers to prototype new features faster.

Cloud Platform

Azure.

Technologies used

Blob storage, MySQL, DevOps, AppGateway, Azure AD, Azure DNS.

Azure Data Factory – Setting up Self-Hosted IR HA enabled

By | Blogs, data | No Comments

Written by Tejaswee Das, Software Engineer, Powerupcloud Technologies

Introduction

In the world of big data, raw, unorganized data is often stored in relational, non-relational, and other storage systems. However, on its own, raw data doesn’t have the proper context or meaning to provide meaningful insights to analysts, data scientists, or business decision-makers.

Big data requires a service that can orchestrate and operationalize processes to refine these enormous stores of raw data into actionable business insights. Azure Data Factory (ADF) is a managed cloud service that's built for these complex hybrid extract-transform-load (ETL), extract-load-transform (ELT), and data integration projects.

This is how Azure introduces you to ADF. You can refer to the Azure documentation on ADF to know more.

Simply put, ADF is an ETL tool that helps you connect to various data sources to load data, perform transformations as per your business logic, and store the results in different types of storage. It is a powerful tool and will help solve a variety of use cases.

In this blog, we will create a self hosted integration runtime (IR) with two nodes for high availability.

Use Case

A reputed OTT client building an entire Content Management System (CMS) application on Azure had to migrate their old or historical data from AWS, which hosts their current production environment. That's when ADF with self-hosted IRs comes to the rescue – it lets you connect to a different cloud, a different VPC, a private network, or on-premises data sources.

Our use case here was to read data from a production AWS RDS MySQL server inside a private VPC from ADF. To make this happen, we set up a two-node self-hosted IR with high availability (HA).

Pre-requisites

  •  Windows Server VMs (Min 2 – Node1 & Node2)
  • .NET Framework 4.6.1 or later
  • For working with Parquet, ORC, and Avro formats, you will require:
    • Visual C++ 2010 Redistributable Package (x64)
    • Java

Installation Steps

Step 1: Log in to the Azure Portal. Go to https://portal.azure.com

Step 2: Search for Data Factory in the Search bar. Click on + Add to create a new Data Factory.

Step 3: Enter a valid name for your ADF.

Note: The name can contain only letters, numbers, and hyphens. The first and last characters must be a letter or number. Spaces are not allowed.

Select the Subscription & Resource Group you want to create this ADF in. It is usually good practice to enable Git for your ADF. Apart from being able to store all your code safely, this also helps when you have to migrate your ADF to a production subscription, since you can carry over all your pipelines.

Step 4: Click Create

You will need to wait for a few minutes, till your deployment is complete. If you get any error messages here, check your Subscription & Permission level to make sure you have the required permissions to create data factories.

Click on Go to resource

Step 5:

Click on Author & Monitor

Next, click on the Pencil button on the left side panel

Step 6: Click on Connections

Step 7: Under Connections tab, click on Integration runtimes, click on + New to create a new IR

Step 8: On clicking New, you will be taken to the IR set-up wizard.

Select Azure, Self-Hosted and click on Continue

Step 9: Select Self-Hosted  and Continue

Step 10: Enter a valid name for your IR, and click Create

Note: Integration runtime Name can contain only letters, numbers and the dash (-) character. The first and last characters must be a letter or number. Every dash (-) character must be immediately preceded and followed by a letter or a number. Consecutive dashes are not permitted in integration runtime names.
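If you prefer to script this step rather than use the portal wizard, the self-hosted IR can also be created with the Azure SDK for Python. This is a minimal sketch, assuming the azure-identity and azure-mgmt-datafactory packages are installed; the subscription ID, resource group, and factory names are placeholders, and parameter names may vary slightly across SDK versions.

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    IntegrationRuntimeResource,
    SelfHostedIntegrationRuntime,
)

# Placeholder values -- replace with your own subscription, resource group, and factory names.
subscription_id = "<subscription-id>"
client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

ir = client.integration_runtimes.create_or_update(
    resource_group_name="my-resource-group",
    factory_name="my-data-factory",
    integration_runtime_name="selfhosted-ir-ha",
    integration_runtime=IntegrationRuntimeResource(
        properties=SelfHostedIntegrationRuntime(description="Two-node self-hosted IR")
    ),
)
print(ir.name)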

Step 11:

On clicking Create, your IR will be created.

Next you will need to install the IR in your Windows VMs. At this point you should log in to your VM (Node1) or wherever you want to install your IR.

You are provided with two options for installation :

  • Express Setup – This is the easiest way to install and configure your IRs. We are following the Express Setup here. Connect to the Windows Server where you want to install it.

Login to Azure Portal in your browser (inside your VM) → Data Factory →  select your ADF → Connections → Integration Runtimes →  integrationRuntime1 → Click Express Setup → Click on the link to download setup files.

  • Manual Setup – You can download the integration runtime and add the authentication keys to validate your installation.

Step 12: Express Setup

Click on the downloaded file.

On clicking on the downloaded file, your installation will start automatically.

Step 13:

Once the installation and authentication is successfully completed, go to the Start Menu → Microsoft Integration Runtime → Microsoft Integration Runtime

Step 14: You will need to wait till your node is able to connect to the cloud service. If for any reason, you get any error at this step, you can troubleshoot by referring to self hosted integration runtime troubleshoot guide

Step 15: High availability 

One node setup is complete. For high availability, we will need to set up at least 2 nodes. An IR can have a max of 4 nodes.

Note: Before setting up other nodes, you need to enable remote access. Make sure you enable it on your very first node, i.e., while you still have a single node; you might face connectivity issues later if you forget this step.

Go to the Settings tab and click on Change under Remote access from intranet

Step 16:

Select Enable without TLS/SSL certificate (Basic) for dev/test purposes, or use TLS/SSL for a more secure connection.

You can select a different TCP port – else use the default 8060

Step 17:

Click on OK. Your IR will need to be restarted for this change to be effected. Click OK again.

You will notice remote access enabled for your node.

Step 18:

Login to your other VM (Node2). Repeat Steps 11 to 17. At this point you will probably get a Connection Limited message stating your nodes are not able to connect to each other. Guess why? We will need to enable inbound access to port 8060 for both nodes.

Go to Azure Portal → Virtual Machines → Select your VM (Node1) → Networking.

Click on Add inbound port rule

Step 19:

Select Source → IP Addresses → Set Source IP as the IP of your Node2. Node2 will need to connect to Port 8060 of Node 1. Click Add

For example, Node1 IP – 10.0.0.1 & Node2 IP – 10.0.0.2. You can use either private or public IP addresses.

We will need to do a similar exercise for Node2.

Go to the VM page of Node2 and add Inbound rule for Port 8060. Node1 & Node2 need to be able to communicate with each other via port 8060.

Step 20:

If you go to your IR inside your Node1 and Node2, you will see the green tick implying your nodes are successfully connected to each other and also to the cloud. You can wait for some time for this sync to happen. If for some reason, you get an error at this step, you can view integration runtime logs from Windows Event Viewer to further troubleshoot. Restart both of your nodes.

To verify this connection, you can also check in the ADF Console.

Go to your Data Factory → Monitor (Watch symbol on the left panel, below Pencil symbol – Check Step 5) → Integration runtimes

Here you can see the number of registered nodes and their resource utilization. The HIGH AVAILABILITY ENABLED feature is turned ON now.

Step 21: Test Database connectivity from your Node

If you want to test database connectivity from your Node, make sure you have whitelisted the Public IP of your Node at the Database Server inbound security rules.

For example, if your Node1 has a public IP address of 66.66.66.66 and needs to connect to an AWS RDS MySQL server, go to your RDS security group and add an inbound rule for your MySQL port for this IP.

To test this, log in to your Node1 → Start → Microsoft Integration Runtime → Diagnostics → add your RDS connection details → click on Test.
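Alternatively, a quick way to verify connectivity from the node itself is a short Python check. This is a minimal sketch, assuming Python and the pymysql package are installed on the node; the host, user, password, and database values are placeholders.

import pymysql

# Placeholder connection details -- replace with your RDS endpoint and credentials.
connection = pymysql.connect(
    host='mydb.xxxxxxxx.us-east-1.rds.amazonaws.com',
    port=3306,
    user='admin',
    password='********',
    database='mydatabase',
    connect_timeout=10
)

with connection.cursor() as cursor:
    cursor.execute('SELECT VERSION()')
    print('Connected, MySQL version:', cursor.fetchone()[0])

connection.close()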

Conclusion

This brings you to the end of successfully setting up a self-hosted IR with high availability enabled.

Hope this was informative. Do leave your comments below. Thanks for reading.


Migration of Applications from Monolithic to Microservices and DevOps Automation

By | Case Study, Cloud Case Study | No Comments

Customer: India’s largest trucking platform

Problem Statement

The customer's environment on AWS was facing scalability challenges, as it was maintained across a heterogeneous set of software solutions with many different programming languages and systems, and no fault-tolerance mechanism was implemented. The lead time to get a developer operational was high, as developers ended up waiting a long time to access cloud resources like EC2, RDS, etc. Additionally, the deployment process was manual, which increased the chances of human error and configuration discrepancies. Configuration management took a long time, which further slowed down the deployment process. Furthermore, there was no centralized mechanism for user management, log management, or cron job monitoring.

Proposed Solution

For AWS cloud development the built-in choice for infrastructure as code (IAC) is AWS CloudFormation. However, before building the AWS Cloudformation (CF) templates, Powerup conducted a thorough assessment of the customer’s existing infrastructure to identify the gaps and plan the template preparation phase accordingly. Below were a few key findings from their assessment:

  • Termination Protection was not enabled for many EC2 instances
  • IAM Password policy was not implemented
  • Root Multi-Factor Authentication (MFA) was not enabled
  • IAM roles were not used to access the AWS services from EC2 instances
  • CloudTrail was not integrated with CloudWatch Logs
  • S3 access logs for the CloudTrail S3 bucket were not enabled
  • Log metrics were not enabled for unauthorised API calls; use of the root account to access the AWS console; IAM policy changes; changes to CloudTrail, AWS Config, or S3 bucket policies; or alarms for any security group, NACL, route table, or VPC changes
  • SSH ports of a few security groups were open to the public
  • VPC Flow Logs were not enabled for a few VPCs

Powerup migrated their monolithic service into smaller independent services which are self-deployable, sustainable, and scalable. They also set up CI/CD using Jenkins and Ansible. Centralized user management was implemented using FreeIPA, while ELK stack was used to implement centralized log management. Healthcheck.io was used to implement centralized cron job monitoring.

CloudFormation (CF) templates were then used to create the complete AWS environment. The templates can be reused to create multiple environments in the future. 20 microservices in the stage environment were deployed and handed over to the customer team for validation. Powerup also shared the Ansible playbooks, which help in setting up the following components – Server Hardening / Jenkins / Metabase / FreeIPA / Repository.

The below illustrates the architecture:

  • Different VPCs are provisioned for Stage, Production, and Infra management. VPC peering is established from Infra VPC to Production / Stage VPC.
  • A VPN tunnel is established between the customer's office and the AWS Infra VPC for SSH access / infra tool access.
  • All layers except the elastic load balancer are configured in a private subnet.
  • Separate security groups are configured for each layer (DB / Cache / Queue / App / ELB / Infra), with only the required inbound/outbound rules allowed.
  • Amazon ECS is configured in Auto-scaling mode. So the ECS workers will scale horizontally based on the Load to the entire ECS cluster.
  • Service level scaling is implemented for each service to scale the individual service automatically based on the load.
  • Elasticache (Redis) is used to store the end-user session
  • A highly available RabbitMQ cluster is configured. RabbitMQ is used as messaging broker between the microservices.
  • For MySQL and Postgresql, RDS Multi-AZ is configured. MongoDB is configured in Master-slave mode.
  • IAM roles are configured for accessing the AWS resources like S3 from EC2 instances.
  • VPC flow logs/cloud trail / Cloud Config are enabled for logging purposes. The logs are streamed into AWS Elasticsearch services using AWS Lambda. Alerts are configured for critical events like instance termination, IAM user deletion, Security group updates, etc.
  • AWS system manager is used to manage to collect the OS, Application, instance metadata of EC2 instances for inventory management.
  • AMIs and backups are configured for business continuity.
  • Jenkins is configured for CI / CD process.
  • CloudFormation template is being used for provisioning/updating of the environment.
  • Ansible is used as configuration management for all the server configurations like Jenkins / Bastion / FreeIPA etc.
  • Sensu monitoring system is configured to monitor system performance
  • New Relic is configured for application performance monitoring and deployment tracking

Cloud platform

AWS.

Technologies used

Amazon Redshift, FreeIPA, Amazon RDS, Redis.

Benefit

IaC enabled the customer to spin up an entire infrastructure architecture by running a script. This allows the customer not only to deploy virtual servers, but also to launch pre-configured databases, network infrastructure, storage systems, load balancers, and any other cloud service that is needed. IaC completely standardized the setup of infrastructure, thereby decreasing the chance of incompatibility issues so that infrastructure and applications run more smoothly. IaC also helps with risk mitigation because the code can be version-controlled: every change to the server configuration is documented, logged, and tracked, and these configurations can be tested just like code. So if there is an issue with a new setup configuration, it can be pinpointed and corrected much more easily, minimizing the risk of failure.

Developer productivity drastically increases with the use of IaC. Cloud architectures can be easily deployed in multiple stages to make the software development life cycle much more efficient.

Detect highly distributed web DDoS on CloudFront, from botnets

By | Blogs, Cloud, Cloud Assessment | No Comments

Author: Niraj Kumar Gupta, Cloud Consulting at Powerupcloud Technologies.

Contributors: Mudit Jain, Hemant Kumar R and Tiriveedi Srividhya

INTRODUCTION TO SERVICES USED

 CloudWatch Metrics

Metrics are abstract data points indicating performance of your systems. By default, several AWS services provide free metrics for resources (such as Amazon EC2 instances, Amazon EBS volumes, and Amazon RDS DB instances).

CloudWatch Alarms

AWS CloudWatch Alarm is a powerful service provided by Amazon for monitoring and managing our AWS services. It provides us with data and actionable insights that we can use to monitor our application/websites, understand and respond to critical changes, optimize resource utilization, and get a consolidated view of the entire account. CloudWatch collects monitoring and operational information in the form of logs, metrics, and events. You can configure alarms to initiate an action when a condition is satisfied, like reaching a pre-configured threshold.

CloudWatch Dashboard

Amazon CloudWatch Dashboards is a feature of AWS CloudWatch that offers basic monitoring home pages for your AWS accounts. It provides resource status and performance views via graphs and gauges. Dashboards can monitor resources in multiple AWS regions to present a cohesive account-wide view of your accounts.

CloudWatch Composite Alarms

Composite alarms enhance existing alarm capability, giving customers a way to logically combine multiple alarms. A single infrastructure event may generate multiple alarms, and the volume of alarms can overwhelm operators or mislead the triage and diagnosis process. If this happens, operators can end up dealing with alarm fatigue or waste time reviewing a large number of alarms to identify the root cause. Composite alarms give operators the ability to add logic and group alarms into a single high-level alarm, which is triggered when the underlying conditions are met. This helps operators make intelligent decisions and reduces the time to detect and diagnose performance issues when they happen.

What are Anomaly detection-based alarms?

Amazon CloudWatch Anomaly Detection applies machine-learning algorithms to continuously analyze system and application metrics, determine a normal baseline, and surface anomalies with minimal user intervention. You can use Anomaly Detection to isolate and troubleshoot unexpected changes in your metric behavior.

Why Composite Alarms?

  1. Simple Alarms monitor single metrics. Most of the alarms triggered, limited by the design, will be false positives on further triage. This adds to maintenance overhead and noise.
  2. Advance use cases cannot be conceptualized and achieved with simple alarms.

Why Anomaly Detection?

  1. Static alarms trigger based on fixed upper and/or lower limits. There is no direct way to change these limits based on the day of the month, day of the week, and/or time of the day. For most businesses these values change massively across different times of the day, especially for metrics driven by user behaviour, such as incoming or outgoing traffic. This leaves static alarms futile most of the time.
  2. It is cheap, AI-based regression on the metrics.

Solution Overview

  1. Request count > monitored by anomaly detection based Alarm1.
  2. Cache hit > monitored by anomaly detection based Alarm2.
  3. Alarm1 and Alarm2 > monitored by composite Alarm3.
  4. Alarm3 > Send Notification(s) to SNS2, which has lambda endpoint as subscription.
  5. Lambda Function > Sends custom notification with CloudWatch Dashboard link to the distribution lists subscribed in SNS1.

Solution

Prerequisites

  1. Enable additional CloudFront Cache-Hit metrics.

Configuration

This is applicable to all enterprise’s CloudFront CDN distributions.

  1. We will configure an Anomaly Detection alarm on the request count increasing by 10% (for example) over the expected average (a boto3 sketch of these alarms follows the configuration steps).

2. We will add an Anomaly Detection alarm on the CacheHitRate percentage going lower than 10% (for example) below the expected average.

3. We will create a composite alarm for the above-mentioned alarms using logical AND operation.

4. Create a CloudWatch Dashboard with all required information in one place for quick access.

5. Create a lambda function:

This will be triggered by SNS2 (SNS topic) when the composite alarm state changes to “ALARM”. This lambda function will execute to send custom notifications (EMAIL alerts) to the users via SNS1 (SNS topic)

The target ARN should be SNS1, where the users' email IDs are configured as endpoints.

In the message section, type the custom message that needs to be sent to the user; here we have included the CloudWatch dashboard URL.
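A minimal sketch of such a handler is shown below, assuming a placeholder SNS1 topic ARN and dashboard URL (not the actual values used here); it simply republishes a custom message to SNS1 whenever SNS2 invokes the function.

import boto3

sns = boto3.client('sns')

# Placeholder values -- replace with your SNS1 topic ARN and dashboard URL.
TARGET_TOPIC_ARN = 'arn:aws:sns:us-east-1:111122223333:SNS1'
DASHBOARD_URL = 'https://console.aws.amazon.com/cloudwatch/home#dashboards:name=cdn-ddos'


def lambda_handler(event, context):
    # Triggered by SNS2 when the composite alarm state changes to ALARM.
    sns.publish(
        TopicArn=TARGET_TOPIC_ARN,
        Subject='Possible distributed web DDoS on CloudFront',
        Message=(
            'The composite alarm has moved to ALARM state. '
            'Review the CloudWatch dashboard: ' + DASHBOARD_URL
        )
    )
    return 'notification sent'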

6. Create two SNS topics:

  • SNS1 – With EMAIL alerts to users [preferably to email distribution list(s)].
  • SNS2 – A Lambda function subscription with code sending custom notifications via SNS1 with links to CloudWatch dashboard(s). The same Lambda can be used to pick different dashboard links based on the specific composite alarm triggered, using a DynamoDB table that maps the SNS target topic ARN to the CloudWatch dashboard link.

7. Add a notification action to the composite alarm to send notifications to SNS2.
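The alarm pieces of this configuration (steps 1–3 and 7) could be scripted roughly as below. This is a minimal boto3 sketch with placeholder alarm names, distribution ID, and SNS ARN; the band width of 2 standard deviations is an illustrative choice, not the exact threshold used here.

import boto3

cloudwatch = boto3.client('cloudwatch')

DISTRIBUTION_ID = 'E1XXXXXXXXXXXX'                       # placeholder CloudFront distribution ID
SNS2_ARN = 'arn:aws:sns:us-east-1:111122223333:SNS2'     # placeholder

# Alarm1: request count breaking above its anomaly-detection band.
cloudwatch.put_metric_alarm(
    AlarmName='cdn-requests-anomaly',
    ComparisonOperator='GreaterThanUpperThreshold',
    EvaluationPeriods=3,
    ThresholdMetricId='band',
    Metrics=[
        {'Id': 'requests',
         'MetricStat': {
             'Metric': {'Namespace': 'AWS/CloudFront',
                        'MetricName': 'Requests',
                        'Dimensions': [{'Name': 'DistributionId', 'Value': DISTRIBUTION_ID},
                                       {'Name': 'Region', 'Value': 'Global'}]},
             'Period': 300, 'Stat': 'Sum'},
         'ReturnData': True},
        {'Id': 'band',
         'Expression': 'ANOMALY_DETECTION_BAND(requests, 2)',
         'Label': 'Expected requests',
         'ReturnData': True},
    ],
)

# Alarm2: cache hit rate dropping below its anomaly-detection band
# (requires the additional CloudFront metrics to be enabled).
cloudwatch.put_metric_alarm(
    AlarmName='cdn-cachehitrate-anomaly',
    ComparisonOperator='LessThanLowerThreshold',
    EvaluationPeriods=3,
    ThresholdMetricId='band',
    Metrics=[
        {'Id': 'chr',
         'MetricStat': {
             'Metric': {'Namespace': 'AWS/CloudFront',
                        'MetricName': 'CacheHitRate',
                        'Dimensions': [{'Name': 'DistributionId', 'Value': DISTRIBUTION_ID},
                                       {'Name': 'Region', 'Value': 'Global'}]},
             'Period': 300, 'Stat': 'Average'},
         'ReturnData': True},
        {'Id': 'band',
         'Expression': 'ANOMALY_DETECTION_BAND(chr, 2)',
         'Label': 'Expected cache hit rate',
         'ReturnData': True},
    ],
)

# Alarm3: composite alarm that fires only when both alarms are in ALARM,
# and notifies SNS2 (which triggers the notification Lambda).
cloudwatch.put_composite_alarm(
    AlarmName='cdn-ddos-composite',
    AlarmRule='ALARM("cdn-requests-anomaly") AND ALARM("cdn-cachehitrate-anomaly")',
    AlarmActions=[SNS2_ARN],
    ActionsEnabled=True,
)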

Possible False Positives

  1. There is some new promotion activity and the newly developed pages for the promotional activity.
  2. Some hotfix went wrong at the time of spikes in traffic.

Summary

This is one example of implementing a simple setup of composite alarms and anomaly-based detection alarms to achieve advance security monitoring. We are submitting the case that these are very powerful tools and can be used to design a lot of advanced functionalities.