Navigating Identity and Access Management (IAM)


Written by Kiran Kumar, Business analyst at Powerup Cloud Technologies

Contributor Agnel Bankien, Head – Marketing at Powerup Cloud Technologies

Introduction to IAM 

In the early 1960s, Fernando Corbató introduced the computer password, which changed the world almost overnight: passwords let people secure their files while still sharing them easily. However, as networks and storage evolved, passwords became harder to manage. This created the need for a system that could manage access controls, giving rise to Identity and Access Management (IAM). The earliest IAM setups used spreadsheets to track access controls; a significant improvement for the time, but one that still had a long way to go. As the internet grew, IAM systems became an integral part of the network and saw a huge increase in adoption by web applications. Yet they remained expensive and complex to maintain, with limited capabilities.

Coming back to 2021, in a day and age where roughly 90% of enterprises use the cloud in some form, IAM systems have transformed significantly and can perform far more complex tasks such as authentication, authorization, policy implementation and access control. So let us explore IAM in more detail.

How do you Define IAM?

Identity and access management is a system that enables enterprises to define who can access particular content, tools and application features, to manage roles by specifying what actions a particular user can perform, and to assign the necessary permissions and resources to that user. IAM can work with multiple types of identities, including people, software and hardware, across robotics and IoT devices.

Why Organizations Should Adopt IAM

Since the onset of the pandemic, the majority of organizations have been forced to move their employees to remote work without being able to fully assess the potential risks. With data breaches and security incidents at an all-time high, there is a need for policies, guidelines and risk limiters that can help mitigate these cyber threats by carefully managing and governing a company’s assets.

Additionally, IAM has become a matter of compliance across various regions. International regulations such as GDPR, as well as governance bodies around the world, have made IAM a compulsory affair, particularly for organizations handling payment card data, because IAM enables a business to build a strong authentication system and ensure that the right person is accessing the right information at any point in time.

How does IAM Work?

To understand how IAM works, we first need to understand the structure of an IAM system. Every IAM system needs these essential components to function properly:

  • A database with a list of user identities and their defined privileges
  • Tools required to manage and modify access controls
  • A system for log management

The first step in an IAM journey is to define resources, such as user groups, federated users, roles and policies, along with identifiers for these resources. IAM systems also come with predefined roles that need to be assigned to users:

  • Access approval approver – can view, approve or revoke access approval requests and view the configuration.
  • Access approval config editor – can update the access approval configuration.
  • Principal – coordinates with the service provider to raise requests and learn about new updates, features, etc.

However, these roles are not market standard and might vary from one provider to another.

Any change in your IAM setup on the service provider’s side requires a request to be raised by the principal through the console, and only principals authenticated by the provider can raise these requests.

Authentication can be done by simply logging in to the service provider’s console with your account ID and password. Sometimes you might also be required to provide additional security information, such as a multi-factor authentication (MFA) code. Once authentication is done, a request can be raised, and for any request to be processed it must include the following information:

  • The action that needs to be taken
  • The resources and users involved in the action, and their details
  • The details of the environment

The next step is authorization to complete the request. During this process, the provider validates the information in the request against the policies that apply and decides whether the users involved should be allowed or denied. If the user is allowed, the provider then checks which operations the user can perform, such as viewing, creating, editing and deleting.
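
To make this allow-or-deny evaluation concrete, here is a minimal sketch using AWS as the provider and the boto3 SDK. The bucket name and the policy document are hypothetical, and the simulation API only evaluates the supplied policy; it does not change anything in the account.

```python
import json
import boto3

iam = boto3.client("iam")

# A hypothetical identity policy that only allows read access to one bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::example-reports-bucket",
            "arn:aws:s3:::example-reports-bucket/*",
        ],
    }],
}

# Ask IAM to evaluate the request elements (action + resource) against the policy.
result = iam.simulate_custom_policy(
    PolicyInputList=[json.dumps(policy)],
    ActionNames=["s3:GetObject", "s3:DeleteObject"],
    ResourceArns=["arn:aws:s3:::example-reports-bucket/report.csv"],
)

for evaluation in result["EvaluationResults"]:
    # EvalDecision is "allowed", "implicitDeny" or "explicitDeny".
    print(evaluation["EvalActionName"], "->", evaluation["EvalDecision"])
```

In this sketch, s3:GetObject would come back as allowed while s3:DeleteObject is implicitly denied, mirroring the allow-or-deny decision described above.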

IAM Strategy and Approaches

Here are some of the strategies and best practices that need to be incorporated to set up a good IAM system.

RBAC or Role-Based Access Control

Under this approach, your existing groups, directories and their attributes are mapped onto the new directories, letting you control data access and grant privileges based on the job roles or groups that users belong to.
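
As a minimal illustration of role-based access on AWS with boto3 (the group and user names are hypothetical; ReadOnlyAccess is an AWS managed policy):

```python
import boto3

iam = boto3.client("iam")

# Create a group that represents a job role, e.g. analysts who only read data.
iam.create_group(GroupName="data-analysts")

# Attach an AWS managed policy to the group; every member inherits it.
iam.attach_group_policy(
    GroupName="data-analysts",
    PolicyArn="arn:aws:iam::aws:policy/ReadOnlyAccess",
)

# New joiners get the role's permissions simply by being added to the group.
iam.add_user_to_group(GroupName="data-analysts", UserName="alice")
```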

Single Sign-on

This approach requires users to log in only once, after which they have access to everything in the system. While this boosts productivity, it is best to start by moving just one directory to this model so you can fully understand how the system behaves.

Multi-factor Authentication

Multi-factor authentication simply means that the user is authenticated by more than one factor. Since the onset of cloud computing, the need for stronger security has skyrocketed. An MFA solution provides end-to-end encryption from the IAM setup to the MFA app, making external attacks far harder to pull off.
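
One common way to enforce this on AWS is a policy that denies sensitive actions unless the request was authenticated with MFA. A minimal sketch with boto3 follows; the policy name and the protected actions are hypothetical.

```python
import json
import boto3

iam = boto3.client("iam")

# Deny destructive EC2 actions whenever the caller did not sign in with MFA.
require_mfa = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": ["ec2:TerminateInstances", "ec2:StopInstances"],
        "Resource": "*",
        "Condition": {
            "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
        },
    }],
}

iam.create_policy(
    PolicyName="deny-without-mfa",
    PolicyDocument=json.dumps(require_mfa),
)
```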

CASB (Cloud Access Security Broker) Integration

IAM systems are usually integrated with CASBs as they focus on security after the login. CASB helps you detect if your account has been hacked and have an IAM solution deactivate the compromised account. CASBs can detect other types of security issues as well.

Benefits of IAM

How Does IAM Make a Difference?

Enhanced security – This is one of the most prominent advantages of an IAM system. IAM lets you control user access, reducing data breaches, theft and unauthorized access to the organization’s network, and also prevents compromised login credentials from spreading.

Streamlined Policy Implementation – IAM lets you create, edit and update your security policies in a centralized manner, enabling you to simplify operations and save time.

Centralized Security Operations – IAM lets you manage all your security tickets through a single help desk and also sometimes automate it depending on the use case resulting in faster ticket resolutions. 

Compliance – IAM is one of the key parameters on which the organizations are validated for industry regulations like GDPR and HIPAA and having an IAM strategy can help them meet the requirements faster.

Collaboration – With the increased number of remote workers, IAM provides organizations with safe and effective ways to manage their remote workers and promote collaboration between external vendors and suppliers.

Our Experience with IAM 

Here, I would like to share some of our experience setting up an IAM system for a product company.

Our client is one of India’s largest online mutual fund investment platforms. One of their challenges was a standalone VPC with no centralized control for network administration and private access. Their monitoring tool, Monit, had feasibility issues, and there was no centralized log storage, among other gaps. LTI-Powerup helped them move their platform to GCP and built their access control around Cloud IAM for complete control over the network with secure access. With this setup, the customer achieved granular monitoring for GKE, laid a strong foundation for their security practice with IAM at the core, and improved operational efficiency.

Conclusion

There is no doubt that IAM lifts your infrastructure security up a notch, but IAM has its limitations. One common challenge is that whenever a new employee joins the organization, an administrator has to manually define all of their access permissions. That may sound simple, but imagine scaling it to thousands of users, or coping with a major restructuring in the leadership; it becomes very difficult for administrators. Using more than one cloud provider brings its own problems and adds to the complexity.

Most importantly, organizations need to see IAM as an entry point and not as a destination. IAM provides a strong foundation for an organization’s security practice, and they need to build on it by capturing and implementing best practices and by integrating other technologies, such as log management and automation, to improve and effectively manage access to data.

Application Deployment on AWS


Summary

A large-scale security organization offering advanced integrated and technology-based security solutions was looking to move its entire legacy setup to AWS cloud.

LTI-Powerup steered the client through a smooth migration process along with providing them round-the-clock managed services for all their applications and services on the cloud. LTI-Powerup also ensured the client could derive maximum benefits from automated deployments and enhanced DevOps capabilities. 

About Client

The client is a leading innovative Singapore-based operations-technology organization that provides commercial supplementary-armed security forces to government as well as private organizations. They build and manage customized security solutions for complex and crucial operations, offering unique integrated security systems that comprise modern technology, facilities management, security management, customer service and human resources.

Their goal is to constantly develop and deliver collaborative, integrated security services that drive operational efficiency and fruitful business outcomes.

Business Needs

The client intended to deploy all their internal web, application, and databases onto the cloud. They were looking for an experienced consulting firm that would help them –

  • Migrate to AWS, 
  • Manage the applications 24/7 and 
  • Build DevOps capabilities along with it. 

LTI-Powerup, premier and trusted consulting partner of AWS, facilitated the migration of the client’s entire on-premise setup to AWS cloud followed by instrumentation of DevOps practices, managed services, and cloud best practices.

Solution

LTI-Powerup’s cloud experts engaged with the client to understand and implement migration strategies for their current on-premise setup that was to be moved to AWS. 

Migration

A highly available architecture was designed to host the primary site in Availability Zone A, while disaster recovery workloads were recommended to be hosted in Availability Zone B of the AWS Singapore region. The purpose of positioning workloads across multiple independent availability zones was to ensure business continuity in case of unforeseen setbacks and to strengthen the AWS resiliency strategy.

The applications and database workloads were hosted on AWS EC2 services for better compute security and capacity. 

Elastic Load Balancing (ELB) would automatically help distribute the incoming application or network traffic and route requests to registered EC2 instances, containers, or IP addresses in multiple Availability Zones depending upon the load capacity and reachability it could support. 

ELB is capable of scaling the load balancer according to the variation in the incoming traffic over a length of time. 

As a result, developers were able to design and deploy scalable, fault-tolerant, versioned and more consistent architectures on the cloud.

AWS Systems Manager Patch Manager provisioned centralized patching of all the servers, while the AWS NTP service was used for time synchronization of all instances running in VPCs across all AWS public regions. Servers were accessed through OpenVPN Access Server, which supports all major operating systems, including desktop and mobile platforms.

AWS CloudFormation allowed provisioning a collection of related AWS infrastructure using a simple text file, which acts as a template that can be used to design and build one-click deployments. Also known as infrastructure as code, it simplified and accelerated provisioning and management on AWS. The deployed solutions became more reliable and adhered to AWS recommended best practices.
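
As a minimal sketch of what such a one-click deployment looks like through the API (the stack name and the single-bucket template are placeholders, not the client’s actual template):

```python
import json
import boto3

cfn = boto3.client("cloudformation")

# A deliberately tiny template: one S3 bucket stands in for the real infrastructure.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppDataBucket": {"Type": "AWS::S3::Bucket"},
    },
}

# One call provisions everything the template describes, repeatably.
cfn.create_stack(
    StackName="demo-app-stack",
    TemplateBody=json.dumps(template),
)

# Block until the stack (and every resource in it) is ready.
cfn.get_waiter("stack_create_complete").wait(StackName="demo-app-stack")
```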

Managed Service

AWS Identity and Access Management (IAM) assisted in creating and managing user access to AWS services and resources via secure encryption keys. AWS users and groups were formed with pre-defined access permissions using IAM.

Amazon S3 (Amazon Simple Storage Service) provided data storage in one place that could be accessed through easy-to-use application interfaces. Amazon S3 easily managed data and access controls for the client, as it is designed for 99.999999999% (11 9’s) data durability. It helped protect data from failures and threats, consequently enhancing scalability, availability and application performance.

With S3 Object Lambda, customized code could be run to process or modify data as it is returned to an application. AWS Lambda functions enabled code to run on infrastructure fully managed by AWS without having to provision or manage servers. With Lambda, backup automation could be set up along with alerts to monitor backup statuses.

This brought down costs dramatically as the client ended up paying only for the consumed compute time. It also warranted code scalability with high availability. 
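
A minimal sketch of the kind of Lambda-based backup alert described above (the SNS topic ARN is a placeholder, and using AWS Backup’s job listing is an assumption about how the backups are tracked):

```python
import os
import boto3

backup = boto3.client("backup")
sns = boto3.client("sns")

# Hypothetical topic that the operations team subscribes to.
ALERT_TOPIC_ARN = os.environ.get(
    "ALERT_TOPIC_ARN",
    "arn:aws:sns:ap-southeast-1:123456789012:backup-alerts",
)


def lambda_handler(event, context):
    """Scheduled function: flag any failed backup jobs and notify the team."""
    failed = backup.list_backup_jobs(ByState="FAILED").get("BackupJobs", [])

    if failed:
        names = ", ".join(job.get("ResourceArn", "unknown") for job in failed)
        sns.publish(
            TopicArn=ALERT_TOPIC_ARN,
            Subject="Backup failures detected",
            Message=f"{len(failed)} backup job(s) failed: {names}",
        )

    return {"failed_jobs": len(failed)}
```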

AWS CloudWatch is a metrics repository that helped monitor client’s resources and applications on cloud. Operational data logs, metrics and events could be viewed and tracked via consolidated automated dashboards. 

Opensource solutions like Grafana and Sensu were configured for detailed monitoring of AWS EC2 resources.

AWS CloudTrail was implemented to capture event history of the client’s Amazon web services account activity. CloudTrail facilitated cloud governance, adhering to compliance as well as operational and risk auditing of the entire client AWS accounts. 

AWS Key Management Service (KMS) was used to encrypt data volumes. It helped control data usage across AWS services and AWS KMS could also be integrated with AWS CloudTrail to provide logs of all key usage in order to meet regulatory and compliance needs.

AWS Config helped assess, track and evaluate the configurations, inventory and changes related to AWS resources whereas Amazon GuardDuty aided in continuous monitoring and intelligent detection of unauthorized threats or malicious activity to protect the client’s AWS accounts, workloads, and data stored in Amazon S3.

Business Impact

  • With the migration to AWS, the client was able to seamlessly integrate the migrated applications with its on-premise legacy systems generating technical agility across the organization. 
  • The AWS well-architected framework facilitated greater scalability, high availability and operational flexibility of applications helping them boost their customer support, service and retention. 
  • The client was able to continuously and proactively monitor systems, detect as well as resolve issues in real time on automated basis, invariably enhancing security, reliability and application performance. 
  • The backup and disaster recovery solutions were more efficient and trouble-free with the move to AWS.
  • The client was able to create standard templates for infrastructure deployment ensuring speed, consistency and minimal or zero component-level failures.

Overview of AWS Service Catalog


Written by Aparna M, Associate Solutions Architect at Powerupcloud Technologies

For enterprise customers, provisioning and managing AWS accounts and cloud environments across all areas of the business, while providing the right access and maintaining governance and security as per business standards, is difficult and time-consuming.

When using AWS cloud resources, it can be frustrating to make sure everyone has the right level of access to the services they need, especially when different roles and positions have different needs. It gets complicated when administrators also want to ensure everything is properly updated and compliant with security standards. To overcome this, AWS launched Service Catalog. In this blog post we will talk more about Service Catalog: what it is, its benefits, and its key concepts.

What is AWS Service Catalog?

With AWS Service Catalog you can create and manage catalogs of IT services that are approved for use on AWS. These IT services can include everything from virtual machine images, servers, software, and databases to complete multi-tier application architectures. It allows you to centrally manage deployed IT services and your applications, resources, and metadata. Administrators can control which users have access to each product to enforce governance and meet your compliance requirements.

Why AWS Service Catalog?

Self-service
  • Increase agility with access to services
Promote standardization
  • Share best practices
  • Ensure compliance with business goals and policies
Control provisioning of AWS resources
  • Tag resources while provisioning
  • Restrict user permissions
Rapidly find and deploy approved services
  • Build your own catalog of AWS services and AWS Marketplace software
Connect with ITSM/ITOM tools
  • ITSM tools such as ServiceNow and Jira Service Desk can be integrated with Service Catalog
Improved security
  • Apply constraints on which IAM role should be used to provision resources
  • Build the Service Catalog on top of CloudFormation to follow security best practices
Cost savings
  • Add constraints on resource provisioning

Features of AWS Service Catalog

  • Standardization of assets
  • Self-service discovery and launch
  • Fine-grain access control
  • Extensibility and version control
  • Central management of resources

Key AWS Service Catalog concepts

A product is a blueprint for building your AWS resources: a collection of AWS resources such as EC2 instances, application servers, RDS databases and so on that are instantiated and managed as a stack. You create products by importing AWS CloudFormation templates. The templates define the AWS resources required for the product, the relationships between resources, and the parameters for launching the product, for example to configure security groups, create key pairs and perform other customizations.
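
As a rough sketch of how a product can be created programmatically with boto3 (the product name, owner and template URL are placeholders; the console can equally be used):

```python
import boto3

sc = boto3.client("servicecatalog")

# Register a CloudFormation template as version v1.0 of a catalog product.
response = sc.create_product(
    Name="hardened-ec2-instance",
    Owner="platform-team",
    ProductType="CLOUD_FORMATION_TEMPLATE",
    ProvisioningArtifactParameters={
        "Name": "v1.0",
        "Type": "CLOUD_FORMATION_TEMPLATE",
        # Template stored in S3; the URL is a placeholder.
        "Info": {"LoadTemplateFromURL": "https://s3.amazonaws.com/example-bucket/ec2-hardened.template"},
    },
)

product_id = response["ProductViewDetail"]["ProductViewSummary"]["ProductId"]
print("Created product:", product_id)
```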

A portfolio is a collection of products together with configuration information; if a product has multiple versions, it is a good idea to group them in a portfolio. You can use portfolios to manage user access to specific products, and you can grant portfolio access at the IAM user, IAM group and IAM role level.
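
A minimal follow-on sketch that groups a product into a portfolio and grants an IAM role access to it (the portfolio name, product ID and role ARN are placeholders):

```python
import boto3

sc = boto3.client("servicecatalog")

# Create a portfolio to hold related products.
portfolio = sc.create_portfolio(
    DisplayName="approved-compute",
    ProviderName="platform-team",
)
portfolio_id = portfolio["PortfolioDetail"]["Id"]

# Add an existing product (ID returned by create_product) to the portfolio.
sc.associate_product_with_portfolio(
    ProductId="prod-abcd1234example",
    PortfolioId=portfolio_id,
)

# Grant an IAM role (a user or group ARN works too) access to the portfolio.
sc.associate_principal_with_portfolio(
    PortfolioId=portfolio_id,
    PrincipalARN="arn:aws:iam::123456789012:role/developers",
    PrincipalType="IAM",
)
```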

Users: Catalog administrators – manage a catalog of products, organizing them into portfolios and granting access to end users.

End users – Use AWS Service Catalog to launch products to which they have been granted access.

Constraints control the way users can deploy a product. Launch constraints allow you to specify a role for a product in a portfolio; this role is used to provision the resources at launch, so you can restrict user permissions without impacting users’ ability to provision products from the catalog.
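
A small sketch of a launch constraint that forces provisioning to happen through a dedicated role (the IDs and the role ARN are placeholders):

```python
import json
import boto3

sc = boto3.client("servicecatalog")

# End users keep minimal IAM permissions; this role does the actual provisioning.
sc.create_constraint(
    PortfolioId="port-abcd1234example",
    ProductId="prod-abcd1234example",
    Type="LAUNCH",
    Parameters=json.dumps({"RoleArn": "arn:aws:iam::123456789012:role/SCLaunchRole"}),
)
```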

Access control You can use AWS IAM permissions to control who can view and modify your products and portfolios. By assigning an IAM role to each product, you can avoid giving users permissions to perform unapproved operations, and enable them to provision resources using the catalog.

Versioning Service Catalog allows you to manage multiple versions of the products in your catalog.

Stack: every AWS Service Catalog product is launched as an AWS CloudFormation stack. You can use CloudFormation StackSets to launch Service Catalog products across multiple regions and accounts.

Getting started

  1. Create and Log In to AWS and open the Console.
  2. Grant permissions to administrators and end-users.
  3. Start building Service Catalogs. Use the Getting Started Guide to learn how.

Administrator Workflow

As a catalog administrator, you create a product based on a CloudFormation template, which defines the AWS resources used by the product. You add the product to a portfolio and distribute it to the end users. Finally, you log in as an end user to test the product.

End-user Workflow

Once a product is available, the end-user workflow begins: the end user browses the products they have been granted access to and provisions them, while the administrator’s tasks of creating, organizing and sharing products run in parallel.
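
For the provisioning step itself, here is a minimal sketch of what an end user’s launch looks like through the API (the IDs, the provisioned product name and the parameter key depend on the underlying template and are placeholders here):

```python
import boto3

sc = boto3.client("servicecatalog")

# Launch an approved product; Service Catalog creates the CloudFormation stack.
sc.provision_product(
    ProductId="prod-abcd1234example",
    ProvisioningArtifactId="pa-abcd1234example",   # the product version to launch
    ProvisionedProductName="dev-ec2-for-alice",
    ProvisioningParameters=[
        {"Key": "InstanceType", "Value": "t3.micro"},
    ],
)
```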

Benefits

  • Controlled provisioning of AWS Services
  • Reduces AWS Costs
  • Provide self-service to end users
  • Ensure resource standardization, compliance, and consistent service offerings
  • Enhanced security
  • Help employees quickly find and deploy approved IT services
  • Connect with ITSM/ITOM software
  • Fine grain access control
  • Extensibility and version control
  • Automate backups
  • Setting up the DevOps pipelines

Our Experience with AWS Service Catalog

  • One of our customers, a leader in new product innovation, needed a self-service portal so that users across the business could provision AWS resources as per their needs while meeting company standards, compliance and security requirements. They also wanted to integrate with ITSM tools to manage and track resource creation. As part of this, we leveraged AWS Service Catalog, with several Service Catalog products consisting of AWS resources to provision EC2, S3 and more. We integrated Service Catalog with Jira so that users can self-serve the resources they need. This dramatically simplified the process of creating or replicating AWS environments, enabling their teams to offer self-service options for standard configurations. By providing services through AWS Service Catalog, they improved setup times and reduced the load on in-house IT teams, with the assurance that baseline security configurations are maintained.
  • Another customer wanted to set up separate AWS accounts to meet the different needs of their organization. Configuring baseline security practices and meeting governance and security standards manually takes significant effort. We used the AWS Landing Zone concept to set up and organize the AWS accounts in line with the company’s standards. The Account Vending Machine (AVM) is a key AWS Landing Zone component, provided as an AWS Service Catalog product, which allows customers to create new AWS accounts pre-configured with an account security baseline. Monitoring, logging, security and compliance are pre-configured during account setup. This helped the customer reduce infrastructure setup and operations costs, required minimal effort to set up the infrastructure, and allowed them to migrate to AWS in less time by removing the manual effort of configuring the security baseline.

Conclusion

We understood how the AWS Service Catalog allows organizations to centrally manage commonly deployed IT services. We also saw the various benefits of using the Service catalog on how it reduces the effort of users in provisioning the resources.

Site Reliability Engineer – SRE


Compiled by Kiran Kumar, Business analyst at Powerup Cloud Technologies

Contributor Agnel Bankien, Head – Marketing at Powerup Cloud Technologies

Summary

SRE is a systematic and automated approach to enhancing IT service delivery using standardized tools and practices for the acceptable implementation of DevOps concepts. Let us look at the SRE team’s roles and responsibilities along with how their effort and productivity can be assessed and why they need organizational support to ensure uninterrupted functioning. The blog will also help understand the various models of SRE implementation, the current day SRE tools, and the benefits derived from it.

Index

1. What is Site Reliability Engineering (SRE)?

2. SRE roles and responsibilities 

2.1 Build software engineering

2.2 Fixing support escalation

2.3 On-Call Process Optimization 

2.4 Documenting Knowledge

2.5 Optimizing SDLC

3. SRE challenges

4. Why organizational SRE support is important?

5. Measuring SRE efforts and effectiveness

6. SRE implementation and its models 

6.1 A composite solo team

6.2 Application or Product-specific

6.3 Infrastructure specific

6.4 Tool centered

6.5 Embedded

6.6 Consulting

7. DevOps and SRE

8. SRE tools and technologies

9. Conclusion

1. What is Site Reliability Engineering (SRE)?

Site reliability engineering (SRE) is a systematic software engineering approach that uses software as a tool to manage IT infrastructure and operations. In the words of Benjamin Treynor Sloss, VP engineering at Google Cloud, “SRE is what happens when you ask a software engineer to design an operations function.” 

The notion was conceptualized at Google in 2003 when the technology giant established SRE to make its websites highly available, reliable and scalable. 

Subsequently, site reliability engineering was embraced by the IT industry to mitigate risks and issues, manage systems and automate operations like capacity and performance planning, disaster management and quality monitoring.

Standardization and automation are the fundamental components of an SRE model. The idea is to shift operations from manual control to DevOps practices that implement and manage large, complex systems on the cloud through software code and automation, accelerating the efficiency and sustainability of a business.

2. SRE Roles and Responsibilities 

According to the latest report by LinkedIn on the ‘Emerging Jobs 2020’, Site Reliability Engineer is among the top 10 in-demand jobs for 2020.

Site reliability engineering is a progressive form of QA that strengthens the collaboration between software developers and IT operations in the DevOps environment.

Some roles of an individual SRE defined within the teams are:

  • System administrators, engineers, and operations experts 
  • Software developers and engineers
  • Quality assurance and test engineers 
  • Build automation pipeline experts and
  • Database administrators. 

SRE responsibilities include:

2.1 Build software Engineering

Site reliability engineers, in order to develop and improve IT and support, build dedicated tools to mitigate risks, manage incidents and support activities such as production changes, code changes, alerting and monitoring.

2.2 Fixing Support Escalation

SREs must be capable of identifying and routing critical issues pertaining to support escalation to concerned teams. They must also work in collaboration with relevant teams to remediate issues. As SRE functions mature with time, the systems become more robust and reliable leading to lesser support escalations. 

2.3 On-Call Process Optimization 

There are instances where SREs take up on-call responsibilities and its process optimization. They implement runbook tools and other automation techniques to ready incident response teams, enhance their collaborative responses in real-time, and appraise documents.

2.4 Documenting Knowledge

SREs often function alongside IT operations, development, support as well as technical teams constructing a large amount of knowledge repository in the process. Documenting such comprehensive information would ensure a smooth flow of operations among teams.

2.5 Optimizing SDLC 

Site reliability engineers ensure that IT resources and software developers detect and assess incidents to document their findings to facilitate informed decision-making. Based on post-incident reviews, SRE can then optimize the SDLC to augment service reliability.  

3. SRE Challenges

Many times, organizations are reluctant to change and any change often comes with additional costs, lost productivity, and uncertainties. A few challenges are:

  • Maintaining a high level of network and application availability,
  • Establishing standards and performance metrics for monitoring purposes,
  • Analyzing cloud and on-premise infrastructure scalability and limitations,
  • Understanding application requirements, test and debugging needs as well as general security issues to ensure smooth function of systems,
  • Keeping alerting systems in place to detect, prioritize and resolve incidents if any,
  • Generate best practices documents and conduct training to facilitate smooth collaboration between identified SRE resources and other cross-functional teams.

4. Why Organizational SRE Support is Important?

In today’s digital ecosystems, enterprises have to constantly work towards highly integrated setups in controlled environments that pose a colossal challenge for them. Catering to build dependable services while ensuring uninterrupted functioning of business is the key to escalated performance and user experience.

Organizations need to actively obtain consent from senior leaders to set up SRE teams and processes. Its success is directly proportional to the teams and business units supporting it. SRE creates better visibility across all SDLC phases and while reliability engineers solely work on incident management and scalability concerns, DevOps teams can focus perpetually on continuous deployment and integration.

With an SRE setup, organizations can better adhere to SLAs and SLOs, run tests, monitor software and hardware, stay prepared for incidents, reduce issue escalations, speed up software development and save the potential costs of downtime. It promotes automation, post-incident reviews and on-call support engineering to establish the level of reliability in new deployments and infrastructure.

5. Measuring SRE Efforts and Effectiveness

Irrespective of whether an organization has adapted fully to DevOps or is still transitioning, it is vital to continuously improve the people, process, and technology within IT organizations. 

Establishing a formalized SRE process to improve the health of systems has therefore become necessary. Setting up metrics and benchmarks to ensure better performance, uptime and enhanced user experience paves the way to an effective monitoring strategy.

  • Define a Benchmark for Good Latency Rates

This is to determine the time taken to service a request by monitoring the latency of successful requests against failed ones to keep a check on system health. The process highlights non-performing services allowing teams to detect incidents faster.

  • Control Traffic Demands

SRE teams must monitor the amount of stress a system can take at a given point in time in terms of user interactions, number of applications running, or service transaction events. This gives organizations a clearer picture of customer product experience and the system’s efficacy to hold up to changes in demand.

  • Track the Rate of Errors

It is important to scan and detect the rate of requests that are failing across individual services as well as the whole system at large. It is crucial to further understand whether they are manual errors or obvious errors like failed HTTP requests. SRE teams must also ensure categorizing the errors into critical, medium, or minor to help organizations keep a check on the true health of service and take appropriate action to fix errors.

  • Monitor Overall Service Capacity

Saturation provides a high-level overview of how much more capacity the service has. As most systems start to deteriorate before they hit 100% utilization, SRE teams must set a benchmark for accepted levels of saturation, thus ensuring service performance and availability for customers.
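
As a minimal sketch of how such benchmarks translate into numbers, here is a toy error-budget calculation in Python; the SLO target and request counts are made up for illustration.

```python
# Toy error-budget calculation for an availability SLO.
SLO_TARGET = 0.999          # 99.9% of requests should succeed this period

total_requests = 4_200_000  # hypothetical traffic for the period
failed_requests = 2_900     # hypothetical count of failed (e.g. 5xx) requests

availability = 1 - failed_requests / total_requests
error_budget = (1 - SLO_TARGET) * total_requests   # failures we may "spend"
budget_remaining = error_budget - failed_requests

print(f"availability:     {availability:.5%}")
print(f"error budget:     {error_budget:.0f} requests")
print(f"budget remaining: {budget_remaining:.0f} requests")

# A negative remaining budget is a signal to slow releases and focus on reliability.
```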

6. SRE Implementation and its Models

To implement SRE as per business specifications, organizational teams need to decide on the approach that works best for them. To begin with, clarification and advice can be sought from an SRE proponent at project kick-off, who will also be able to test the SRE models best suited to the organization.

The SRE team can then work sequentially with the product teams, be existent around them or function as a discrete centralized unit.

The primary focus is continuous improvement with a data-driven approach to operations. SRE supports the application of automation practices, failure restoration while also ensuring error reduction and prevention. 

A simulation-based inference aids in effective root cause analysis of incidents, performance and capacity. SRE helps determine the permissible amount of risk through error budgeting as well as offers change management techniques in case the system has become unstable, thus guaranteeing a controlled risk environment.

A few SRE Models to Look at:

6.1 A Composite Solo Team

  • It is a single SRE team and is the most widely accepted model.
  • Best fit for smaller set ups with narrower product range and customer base.
  • Handles all organizational processes and is capable of identifying patterns and alikeness between different projects.
  • Enables implementation of integrative solutions across contrasting software projects. 

6.2 Application or Product-specific

  • Such teams focus on upgrading reliability of one specific product or application of the organization that is business-critical.
  • Bridges the gap between business objectives and how the team’s effort reciprocates.
  • Best suited for large organizations that cannot implement SRE for their entire range of products or services and therefore focus specifically on crucial deliveries. 

6.3 Infrastructure Specific

  • Defines Infrastructure as Code and enables automation of iterative tasks to simplify and speed up software delivery. 
  • Infrastructure SRE teams improve quality, continuity and performance of business. 
  • Such models are suitable for large sized organizations with multiple independent development teams that need common uniform processes applicable to all.

6.4 Tool Centered

  • Tool-centered SRE teams exclusively create tools and features to enhance productivity as well as mitigate process hold-ups and restrictions.
  • They need SRE charters, defined scope, and continuous monitoring to keep themselves segregated from the Infrastructure team.
  • All organizations that require software tools other than those provided by DevOps or SaaS platforms can form this SRE team.

6.5 Embedded

  • SRE experts are implanted within development teams that address specific issues of product development.
  • Helps change environment configurations to enhance performance in the SDLC lifecycle and enables implementation of development best practices.
  • It is beneficial to have the embedded team when organizations have just begun the SRE journey and need to accelerate transformation. 

6.6 Consulting

  • The consulting model helps build specific tools that complement the existing processes.
  • Helps scale the impact of SRE outside of the initial project scope and can be performed by third party specialists.
  • Can be implemented before the actual SRE implementation begins to comprehend SRE best practices. 

7. DevOps and SRE

DevOps offers high-speed, top-quality service delivery and contemporary designs that ensure increased business value and operational responsiveness, and it instills a modernized approach to development, automation and operations culture. SRE, on the other hand, can be considered an implementation of DevOps. Like DevOps, SRE is also about team culture and relationships, and it constantly works towards building a tightly integrated development and operations team.

Both DevOps and SRE provision speedy application development along with enhanced quality, reliability and performance of services.

SRE depends on site reliability engineers who possess the skill set of both development and operations and are embedded within development teams to ensure smooth communication and workflow. The SRE team assists developers who need guidance with operational functions and helps balance tasks between DevOps and SRE, where the DevOps team focuses on code, new features and the development pipeline while SRE ensures reliability.

8. SRE Tools and Technologies

Google Site Reliability Engineer Liz Fong-Jones states, “One SRE team is going to have a really difficult time supporting 50 different software engineering teams if they’re each doing their own separate thing, and they’re each using separate tooling.”

Hence standardization of SRE tools and practices for proper implementation of DevOps principles was a must. 

SRE toolchains are in line with the DevOps phases where the DevOps toolchain assists teams to choose the right tools to plan, build, integrate, test, monitor, evaluate, deploy, release and operate the software they develop.

Tools used by Site Reliability Engineers

9. Conclusion

SRE establishes a healthy and productive relationship between development and operations, thus letting organizations envision a holistic picture towards better resilience, authenticity, and speed of their IT services.  

As Jonathan Schwietert, SRE Team Lead at VictorOps rightly states, “SRE shines a light on the unknowns in a running system, it eliminates fear associated with the unknown. SRE brings awareness to the maintainers of a system and builds confidence toward handling unknown issues”.

SRE is one of the most efficient team models to implement DevOps standards in the present-day software delivery systems that aim to direct more productive and progressive business objectives going forward.

Application Modernization and Management on AWS


Summary

Financial Ruler is a comprehensive financial planning and management platform that is looking to move to the cloud to tighten security around its data and architecture while ensuring reliability, performance efficiency as well as cost and compliance optimization. LTI was engaged to restructure and migrate their infrastructure to AWS for higher availability and scalability.

Project type

Infrastructure Migration 

About 

The client is an all-inclusive social media platform for financial planning that offers services such as asset management, budgeting, retirement planning, and managing your financial data repository all under one roof. They also provide financial projections by converting available complex data into easy-to-view reports and charts along with augmented intelligence tools via integrated technologies to enhance their clients’ usage and experience.

Business Needs

The client is currently running its day-to-day business operations on an external network system and is planning to migrate and host its existing applications in the AWS cloud mainly for future scalability, reliability and high availability. 

LTI, a premier and trusted consulting partner of AWS, was engaged to re-architect the client’s applications by setting up highly available application and database layers to help them migrate to the cloud efficiently, with minimal downtime and optimum infrastructure security.

Solution

LTI’s cloud experts engaged with the customer to understand their requirements, application setup, current architecture and network to draft a suitable cloud migration strategy.

LTI also facilitated assessment and scalability of the entire architecture on the cloud, along with devising a progressive plan of action for security and compliance, establishing database high availability and minimizing overall infrastructure costs.

Application Re-architecting

At the outset, the application was a monolithic setup where the application and database were both hosted on a single physical server.

LTI proposed re-architecting this setup into a three-tier architecture by splitting the web, application and database servers to rebuild it as a service-oriented, scalable design.

This approach helps organizations take full advantage of cloud-native capabilities. In this case, the proposed AWS architecture allowed them to easily add capacity, strengthen security and accelerate productivity.

The AWS setup

Using root accounts for deployments is not advisable as per AWS recommended security best practices. To securely access different AWS services, MFA-enabled IAM accounts with least-privilege access were created, and users were given access only to those services that were needed. This ensured an additional layer of security was in place in case credentials got compromised.

AWS Web Application Firewall (WAF) was set up to secure the web applications hosted on AWS from hacks and cyber attacks.

A VPC with public and private subnets was provisioned, with a NAT gateway to enable internet access for servers in the private subnets. The network was designed using an appropriate CIDR range, subnets and route tables.
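
A condensed sketch of that network layout with boto3 (the CIDR ranges and availability zone are illustrative, and the public subnet’s internet gateway and its routing are assumed to exist already):

```python
import boto3

ec2 = boto3.client("ec2")

# VPC with separate public and private subnets.
vpc_id = ec2.create_vpc(CidrBlock="10.0.0.0/16")["Vpc"]["VpcId"]
public_subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.1.0/24",
                                  AvailabilityZone="ap-southeast-1a")["Subnet"]["SubnetId"]
private_subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.2.0/24",
                                   AvailabilityZone="ap-southeast-1a")["Subnet"]["SubnetId"]

# NAT gateway lives in the public subnet with an Elastic IP.
eip = ec2.allocate_address(Domain="vpc")["AllocationId"]
nat_id = ec2.create_nat_gateway(SubnetId=public_subnet,
                                AllocationId=eip)["NatGateway"]["NatGatewayId"]
ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

# Route the private subnet's outbound traffic through the NAT gateway.
rtb_id = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]
ec2.create_route(RouteTableId=rtb_id, DestinationCidrBlock="0.0.0.0/0", NatGatewayId=nat_id)
ec2.associate_route_table(RouteTableId=rtb_id, SubnetId=private_subnet)
```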

LTI helped configure Network Access Control Lists (NACLs) based on security requirements to control traffic at the subnet level, where rules are defined to control traffic from the required IP addresses.

They created security groups to control traffic at the VM layer and opened only required ports with least access. IP whitelisting for various third party providers was also covered at the security group level. 

SSL certificates were deployed on the load balancers to protect data in transit. Customers could either bring in their own SSL certificates or use public SSL certificates from AWS for free.

A load balancer serves as the single point of contact and distributes incoming application traffic across multiple targets such as EC2 instances, in multiple availability zones. This elevates the availability of the client’s application. 

LTI established the required config rules, removed default VPCs across all regions and accounts, and enabled VPC Flow Logs to capture IP traffic data across the network interfaces on the cloud. VPN tunnels were also enabled between AWS, customer locations and data centers.

Enabling CloudTrail helped the client capture all API activities in the account.

Migration

LTI provisioned EC2 instances, which are virtual servers in the AWS EC2 service that help run web applications on the AWS infrastructure. Subsequently, they configured web and application servers and provisioned EC2 instances for the MongoDB primary, secondary and arbiter setup. MongoDB is a document-oriented, cross-platform NoSQL database used for modern applications.

LTI devised auto scaling for the application instances that support it, and took a MongoDB backup to restore data onto the primary MongoDB instance on AWS, thus enabling replication between the primary, secondary and arbiter MongoDB servers.

The primary is the only member in the replica set that receives read/write operations by default; however, a fixed server name cannot be relied upon for it, since the primary can change. A secondary maintains a copy of the primary’s data set and applies operations from the primary’s operations log to replicate data onto its own data set. Arbiters are MongoDB instances that are part of a replica set but do not require any dedicated hardware and do not hold any data. Arbiters can be added to the setup if a replica set has an even number of members.
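
A rough sketch of how such a primary/secondary/arbiter replica set can be initiated with PyMongo, assuming three mongod hosts are already running with the same --replSet name (the host names and set name are placeholders, and a reasonably recent PyMongo is assumed):

```python
from pymongo import MongoClient

# Connect directly to the node that will become the primary.
client = MongoClient("mongodb://mongo-primary:27017", directConnection=True)

# One data-bearing secondary plus an arbiter keeps the voting membership odd.
config = {
    "_id": "rs0",  # must match the --replSet option on every mongod
    "members": [
        {"_id": 0, "host": "mongo-primary:27017"},
        {"_id": 1, "host": "mongo-secondary:27017"},
        {"_id": 2, "host": "mongo-arbiter:27017", "arbiterOnly": True},
    ],
}

client.admin.command("replSetInitiate", config)

# myState becomes 1 (PRIMARY) on the first node once the election completes.
print(client.admin.command("replSetGetStatus")["myState"])
```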

Post-migration checks were essential, and LTI ensured an end-to-end application validation was conducted and that any issues found in the infrastructure configuration were fixed swiftly.

To enable going live on production, Amazon Route 53, the AWS DNS web service, and Amazon CloudFront were configured to deliver highly scalable and secure web applications, data and APIs across the globe, while AWS WAF protected the application through its WAF policies. The go-live also involved cleaning up unwanted data logs on the production environment, putting the app in maintenance mode, restarting all necessary components, performing application validation and updating the DNS to point to AWS.

AWS Monitoring and Logging 

LTI inducted the CloudWatch tool, which provides infrastructure and application monitoring of all the applications, resources and services that run on the cloud and on-premise. It helps collect operational data logs, metrics and events that can be viewed in consolidated dashboards. Appropriate alarms were configured in CloudWatch to notify the customer when certain thresholds were crossed, and Amazon SNS was used for notifications.
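
A minimal sketch of one such threshold alarm wired to SNS (the instance ID, topic ARN and the 80% CPU threshold are illustrative):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alert the team when average CPU stays above 80% for two 5-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="app-server-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:ap-south-1:123456789012:ops-alerts"],
)
```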

AWS Config service enabled assessment, audit and evaluation of all AWS resources. Config monitored and recorded all AWS resource configurations in accordance with AWS best practices for change management. It can eventually automate evaluated records against expected configurations as well. 

Managed Services

LTI  supported AWS infrastructure operations by leveraging AWS services in the following areas: 

  • Continuous cost optimization through rightsizing EC2 instances, scheduling, upgrading instances to latest generation and deleting unused cloud-based storage volumes.
  • AWS Server Management provided 24/7 support for server monitoring, disaster recovery in case of server outage or compromise, speedy response time along with environment and app monitoring.
  • Security management was administered via AWS cloud security services that helped the client meet their security and compliance requirements, ascertained data protection and confidentiality with stringent measures in place for any security threats, enabled secure scaling, provided greater visibility and automated security tasks.
  • AWS Network Services ensured consistent network availability, data integrity and 24/7 monitoring of cloud infrastructure that can run workloads with high throughput and lowest latency requirements. AWS network capabilities are one of the largest globally and can deliver client applications and content across the world over a private network.  
  • AWS Backup enables backup policy configurations from a centrally managed console to confirm that application data across AWS services are backed up and secure. It supports automated backup processes and maintains consolidated backup activity logs for all AWS services. 
  • DR support minimizes downtime and data loss by enabling speedy and reliable recovery of physical, virtual and cloud-based servers into AWS cloud. This simplifies architecture implementation and protects enterprise applications and databases without compromising business continuity. 
  • AWS Audit Manager provides continuous audit of AWS usage to assess risks and compliance against industry standards. It provides prebuilt frameworks that are mapped to the client’s AWS resources to ensure if regulatory controls are being abided by. Audit manager facilitates automated collection of data for assessment on daily, weekly or monthly basis as per client’s requirement.

Business Impact

  • With the migration to AWS, the client acquired a much stronger, more secure and efficient infrastructure. Application and server performance became more consistent.
  • Database is now fully secured with prevention of unauthorized public access and static data stored in AWS S3 bucket is encrypted.
  • Moving to cloud signified cost optimization by eliminating unwanted costs and managing expenses without overspending.
  • The AWS well-architected framework offered operational flexibility and excellence, took care of monitoring the systems on a continuous and automated basis and helped recover from quick service or infrastructure disruptions if any.

Thus, the client has reaped benefits of up to 30% in operational savings and up to 25% in AWS infrastructure savings while also improving their operational SLAs, security, and compliance posture. 

Configured for Cloud with IaC


Compiled by Kiran Kumar, Business analyst at Powerup Cloud Technologies

Contributor Agnel Bankien, Head – Marketing at Powerup Cloud Technologies

Summary:

IaC is a DevOps initiative that is all about writing code to provision infrastructure and deploy applications automatically on the cloud. The blog touches upon why IaC is important in today’s times, the principles it is based on, how automating workflows is the need of the hour, and its benefits and best practices, along with a few dominant IaC tools in the market.

Index

1. What is Infrastructure as code?

2. Why is it important?

3. Principles of Infrastructure as code       

4. IaC Workflow

5. Automating Infrastructure Workflows                                      

5.1 Mutable vs Immutable

5.2 Imperative vs Declarative

5.3 DevOps

6. Benefits of IaC

7. Best practices of Infrastructure as code

7.1 Codify everything

7.2 Use minimum documentation

7.3 Maintain version control

7.4 Continuous testing

7.5 Go modular

8. Infrastructure as code tools

9. Conclusion

1. Introducing Infrastructure as Code (IaC)

With the revolution in IT, configuring a cloud environment through code is fast becoming the trend. Popularly termed “infrastructure as code”, it is a system adopted by organizations to devise, develop and support IT infrastructure.

Unlike the traditional setup where system administrators manually built and managed all the hardware and software requirements, Infrastructure as Code (IaC) is the management and provision of networks, virtual machines, load balancers, and connection topology via computer-readable files without involving physical hardware configuration or interactive configuration tools.

Gartner predicts that by 2024, automation and analytics will help the digital resources shift 30% of their time spent on endpoint support and repair to continuous engineering.

2. Why is it Important?

While conventional IT infrastructure management requires a huge amount of expertise, IaC employs nominal resources that lead to cost-effective management and simple uniform communication, making it faster and more consistent.  A code-based approach makes it easier to get more done in less time. 

IaC is capable of managing large distributed systems, cloud-native applications and service-based architecture, giving cloud users a better understanding of the setup and granting them the power to make changes when required without impacting the existing infrastructure. It develops scalability and availability while improving monitoring and performance visibility of the overall infrastructure. 

3. Principles of Infrastructure as Code         

Adam Jacob, co-founder of Opscode, states; “There are two steps to IaC principles:

1. Break the infrastructure down into independent, reusable, network-accessible services and 

2. Integrate these services in such a way as to produce the functionality your infrastructure requires”.

The major principles on which IaC works upon are:

  • Services must be broken into smaller and simpler modules.
  • Re-building of systems should be effortless and flexible with zero dependency on manual decisions, thus eliminating the majority of risks and failures.
  •  Create a comprehensive design at the initial stage that takes all possible requirements and scenarios into account. Design should be able to accommodate change in a way that promotes continuous improvement.
  • Services must be able to build and integrate complex systems with also the ability to edit, transfer, destroy, upgrade or resize resources to cater to the ever-changing cloud infrastructure.
  • Deploy a unified automated structure that facilitates compute, network and storage capacities to run a workload dynamically in production through API tools.
  • Services must produce the same results when used repeatedly with maximum focus on the component level and its functions. It should be in concurrence with the policies and the overall system as a whole eventually.

4. IaC Workflow 

Infrastructure as code is a key DevOps strategy that is treated the same way as application source code where the teams would examine its version control, write tests for it and ensure it is in concurrence with continuous delivery.

Developers define configuration specifications in a domain-specific language after which the instruction files are sent to a master server, management API or code repository based on which the IaC platform creates the infrastructure.

All the parameters are saved as machine-readable files called manifests, which are easy to reuse, edit, copy or share. IaC users therefore need not configure an environment each time they plan to develop, test or deploy software, making the process swifter and more consistent.

Developers then systematize and store the configuration files with version control. In case of edits or pull requests, code review workflows are able to verify the exactness of the changes.

5. Automating Infrastructure Workflows

Structuring infrastructure through IaC provides a standard template to organizations for provisioning servers, operating systems, storage and other components without the involvement of developers, every time something is developed or deployed. The infrastructure aspect is treated more like software where a code is written and executed manually or via automation to build and run servers, load balancers, networks, storage, firewall, policies, databases and application configs.

According to Gartner, more than 90% of enterprises will have an automation architect by 2025, up from less than 20% today.

5.1 Mutable vs Immutable

Mutable infrastructure originated in the physical data center world and as acquiring new servers was expensive, the existing servers were upgraded regularly along with ad hoc fixes, edits or patches when necessary. Recurrent manual changes made the setup complicated, fragile and difficult to duplicate. 

With the advent of cloud, virtualization and on-demand cloud computing revolutionized server architectures, making them more affordable, scalable and fast. Configuration management and cloud APIs gained momentum, with new servers being automatically provisioned and deployed via code and, in the immutable model, never modified afterwards; they are replaced instead.

5.2 Imperative vs Declarative 

An imperative style defines the specific commands that need to be run. In a declarative approach, the desired resources with specific properties are affirmed, which the IaC tool then configures. 

It also maintains a list of the current state of system objects that assist in pulling down the infrastructure with ease.

IaC tools mostly use a declarative approach that automatically provisions the required infrastructure. A declarative IaC tool will apply any changes needed to reach the declared state, while an imperative tool will not implement changes on its own.
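
To make the contrast concrete, here is a toy sketch in Python: the imperative style issues explicit creation calls, while the declarative style states the desired buckets and lets a reconcile step work out what to change. The bucket names are placeholders, and the sketch assumes the default us-east-1 region, where create_bucket needs no location constraint.

```python
import boto3

s3 = boto3.client("s3")

# Imperative: spell out each step; running it twice would raise errors.
# s3.create_bucket(Bucket="example-app-logs")
# s3.create_bucket(Bucket="example-app-artifacts")

# Declarative: describe the desired end state and reconcile toward it.
desired_buckets = {"example-app-logs", "example-app-artifacts"}

def reconcile(desired):
    existing = {b["Name"] for b in s3.list_buckets()["Buckets"]}
    missing = desired - existing
    for name in missing:
        s3.create_bucket(Bucket=name)   # only create what is missing
    return missing                      # report what actually changed

print("created:", reconcile(desired_buckets))
```

Running the reconcile step repeatedly is safe: once the real state matches the declared state, it changes nothing, which is exactly the property declarative IaC tools rely on.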

5.3 DevOps

With IaC, DevOps teams are able to turn the code into artifacts, the deployable components produced by the build. In the case of infrastructure as code, Docker images or VM images can be considered artifacts.

Once the build is installed, unit, integration and security checks can be performed to ensure all sensitive information is intact.

The scripts can be unit tested to check for syntax errors or best practice violations without provisioning an entire system. Conduct tests to ensure right server platforms are being used in the correct environment and that packages are being installed as expected. The next step is integration tests to verify if the system gets deployed and provisioned accurately. This is followed by security testing of the infrastructure code to ensure security mechanisms are not compromised and that IaC is compliant with industry standards.
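
As a small sketch of the “unit test for syntax errors” idea, assuming CloudFormation templates are the IaC format in use (the template path is hypothetical, and validate_template only checks syntax, not whether the resources can actually be created):

```python
import boto3
from botocore.exceptions import ClientError

cfn = boto3.client("cloudformation")

def template_is_valid(path: str) -> bool:
    """Return True if CloudFormation accepts the template's syntax."""
    with open(path) as handle:
        body = handle.read()
    try:
        cfn.validate_template(TemplateBody=body)
        return True
    except ClientError as error:
        print("validation failed:", error.response["Error"]["Message"])
        return False

# Example: run this in CI before any deployment step.
assert template_is_valid("templates/network.yaml")
```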

Automation of IaC saves significant debugging time, enables tracking and fixing of errors, is subject to shorter recovery time, experiences more predictable deployments and speeds up the software delivery process. These factors are vital for quick-paced software delivery. 

The IaC approach helps DevOps teams create servers, deploy operating systems, containers and application configurations, set up data storage, network and component integrations. IaC can also be integrated with CI/CD tools that help build infrastructure code for your pipelines.

6. Benefits of IaC

  • Speed and Consistency: Code based approach of IaC eliminates manual processes and enables repeated usage of code, making it easier to get more done in less time. Iterations are faster and repeatable, consistency is the key value and changes can be implemented globally without altering the software version.
  • Collaboration: Version control helps multiple teams from different locations to collaborate on the same environment. Developers are able to work on varied infrastructure sections and release changes in a controlled format.
  • Efficiency: IaC enhances competency and productivity of the code and infrastructure across the development lifecycle via established quality assurance practices. It also keeps a repository of all the environment builds allowing developers to focus more on application development.
  • Scalability: The current infrastructure can be upgraded as well as expanded effortlessly through IaC.
  • Disaster Recovery: IaC facilitates recovery of large systems in case of a disaster by re-running its manifest code scripts where the system can be made available on a different location if needed.
  • Reduced Overheads: Unlike the conventional setup, IaC reduces the cost of developing software and does not need a group of admins to govern the storage, networking, compute, and other layers of hardware and middleware. IaC offers a utilization-based cost structure paying only for those resources that are being used, thus reducing remarkable cost-overheads.

7. Best Practices of Infrastructure as Code

7.1 Codify Everything

All parameters must be explicitly coded in configuration files that describe the cloud components to be used, how they relate to one another, and how the environment is provisioned. Only then can infrastructure be deployed quickly and transparently.

7.2 Use Minimum Documentation:

The IaC code itself acts as documentation, with specifications and parameters defined within it. Diagrams and setup instructions may still exist for team members unfamiliar with the deployment process, but deployment itself should happen through the configuration code, leaving minimal or no additional documentation.

7.3 Maintain Version Control:

All configuration files must be version controlled. Any change in code can be managed, tracked or resolved just like application code. Maintaining versions of IaC codebase provides an audit trail for code changes and the ability to collaborate, review, and test IaC code before it goes into production.

7.4 Continuous Testing:

Constantly test, monitor, integrate and deploy environments before pushing changes to production. To avoid post-deployment issues, a series of unit, regression, integration, security and functional tests should be run repeatedly, across multiple environments, preferably through automation to save time and effort.

DevSecOps is the association of DevOps and security professionals to detect and eliminate risks, threats and violations, if any. 

7.5 Go Modular:

IaC partitioning divides infrastructure into multiple components that can be combined through automation. This lets organizations control who has access to which parts of their code while limiting the number of changes that can be made to manifests.

8. Infrastructure as Code Tools

Tools are chosen depending on the infrastructure and application code in use. Combining tools improves decision-making about how systems should be structured.

Tools commonly used in infrastructure as code are:

  1. Terraform: An IaC provisioning tool that creates execution plans using its own DSL, stating exactly what will happen when the code is run. It builds a graph of resources and automates changes with minimal human interaction across multiple cloud service providers simultaneously and cohesively.
  2. AWS CloudFormation: A configuration orchestration tool used to automate deployments. Limited to AWS, CloudFormation lets you preview proposed changes to a stack to see how they impact resources and how their dependencies are managed (see the sketch after this list).
  3. Azure Resource Manager: Azure offers in-house IaC tools that define the infrastructure and dependencies for the applications, group dependent resources for instant deployment or deletion and provide control access through user permissions.
  4. Google Cloud Deployment Manager: GCP offers features similar to AWS and Azure such as template creation and change previews prior to deployment for automation of infrastructure stack.
  5. Puppet: A configuration management tool that helps in the continuous delivery of software that supports remote execution of commands. Once the desired config is declared, Puppet deciphers how to achieve it.
  6. Ansible: An infrastructure automation tool that describes how the infrastructure components and system relate to one another as opposed to managing systems independently.
  7. Juju: Juju contains a set of scripts that deploy and operate software bundles linked together to establish an application infrastructure as a whole. 
  8. Docker: Docker creates containers that package code and dependencies together so applications can run in any environment.
  9. Vagrant: This tool facilitates the creation of development environments using a small number of VMs instead of an entire cloud infrastructure.
  10. Pallet: An IaC tool that automates cloud infrastructure providing a high level of environment customization. Pallet can be used to start, stop and configure nodes, deploy projects as well as run administrative tasks.
  11. CFEngine: The desired state of the infrastructure can be defined using DSL after which CFEngine agents monitor the cloud environments’ convergence. It claims to be the fastest infrastructure automation tool with execution time under 1 second.
  12. NixOS: A configuration management tool that ensures easy, reliable and safe upgrade of infrastructure systems or convenient rollbacks to old configuration.
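
For example, the change-preview behaviour mentioned for CloudFormation above can be scripted. The sketch below is a minimal, hedged illustration, assuming boto3, an existing stack named "web-dev" and an updated template.json (both hypothetical): it creates a change set and prints the proposed changes before anything is executed.

```python
# A minimal sketch of previewing CloudFormation changes before applying them.
# Assumes an existing stack "web-dev" and an updated template.json (hypothetical).
import boto3

cfn = boto3.client("cloudformation")

with open("template.json") as fh:
    body = fh.read()

# Create a change set instead of updating the stack directly.
cfn.create_change_set(
    StackName="web-dev",
    ChangeSetName="preview-changes",
    TemplateBody=body,
    ChangeSetType="UPDATE",
)
cfn.get_waiter("change_set_create_complete").wait(
    StackName="web-dev", ChangeSetName="preview-changes"
)

# Inspect what would change before deciding to execute it.
details = cfn.describe_change_set(StackName="web-dev", ChangeSetName="preview-changes")
for change in details["Changes"]:
    rc = change["ResourceChange"]
    print(rc["Action"], rc["LogicalResourceId"], rc["ResourceType"])

# To apply the previewed changes:
# cfn.execute_change_set(StackName="web-dev", ChangeSetName="preview-changes")
```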

9. Conclusion

DevOps teams need to equip themselves with a broader skill set to keep pace with accelerating cloud infrastructure capabilities.

Gartner predicts that 50% of organizations will fail to meet cloud adoption goals due to lack of in-house skills and experience.

As a solution, organizations will largely leverage infrastructure as code to furnish such expertise and improve their infrastructure quality and productivity.

Infrastructure as Code can simplify and accelerate infrastructure-provisioning processes, comply with policies, keep environments consistent and immaculate while saving considerable time and costs. With the implementation of IaC, developers can focus on innovative growth, become more productive and increase the quality of customer service.

Overview of Microservices Architecture

By | Powerlearnings | No Comments

Compiled by Kiran Kumar, Business analyst at Powerup Cloud Technologies

Contributor Agnel Bankien, Head – Marketing at Powerup Cloud Technologies

Summary

Microservices architecture is widely favored for distributed systems. Microservices are successors to monolithic architecture: applications are loosely coupled and fragmented into smaller components that run as individual processes.

Businesses that grow over time are obliged to revamp their frameworks to cater to frequent code changes, application updates, real-time communication and rising network traffic, to name a few demands.

The challenges and benefits of building microservices architecture along with how it can actually impact business is what we will explore in this blog. 

Index

1. What is Microservices Architecture?

2. Why Microservices Architecture?

3. Evolution of Microservices Architecture 

4. Fundamentals to a Successful Microservices Design

4.1 Right Scoping the Functionality

4.2 Cohesion and Coupling

4.3 Unique Means of Identification

4.4 API Integration

4.5 Data Storage Segregation

4.6 Traffic Management

4.7 Automating Processes

4.8 Isolated Database Tables

4.9 Constant Monitoring

5. Key Challenges in Microservices

    5.1 Being Equipped for Speedy Provisioning and Deployment

    5.2 Execute Powerful Monitoring Controls

    5.3 Incorporate DevOps Culture

    5.4 Provision Accurate Testing Capabilities

    5.5 Design for Failure

6. Benefits of Microservices Architecture 

      6.1 Increased Elasticity

      6.2 Better Scalability

      6.3 Right Tools for the Right Task

      6.4 Faster Time to Market

      6.5 Easy Maintenance

      6.6 Improved ROI and Reduced TCO

      6.7 Continuous Delivery

7. Building Microservices Architecture

8. Microservices Best Practices

9. How can Microservices add Value to Organizations?

10. Conclusion 

1. What is Microservices Architecture?

Microservices, or microservices architecture, is a service-oriented style built from a collection of discrete smaller services that run independently of each other. Applications are loosely coupled, with fragmented components running in individual processes. Each independent service can be updated, scaled up or taken down without interrupting other services in the application.

Microservices make applications highly maintainable, testable and transformable, generating a scalable, distributed framework that can be built to boost business capabilities. The architecture comprises fine-grained services communicating over lightweight protocols, which increases flexibility and modernizes the business technology stack.

Microservices enable speedy, reliable and continuous deployment and delivery of huge complex applications.

2. Why Microservices Architecture?

Updating or reconfiguring traditional monolithic applications involves expensive, time-consuming and inconvenient processes. With microservices architecture, it is simpler to work with independent services that run on their own, in different programming languages and on multiple platforms. Once executed or updated, these small components can be grouped together and delivered as a complete application.

Thus, teams can function in smaller, independent and agile groups instead of being part of larger, impenetrable projects. One microservice can communicate with other available microservices without interruption, even when failures occur.

Any enterprise that uses applications needing repeated updates, faces dynamic traffic on their network or requires near real-time information exchange can benefit from adapting to microservices architecture. For instance, social media platforms like Twitter and Instagram, retailers like Amazon, media platforms like Netflix, and services like Uber use microservices, setting new standards for container technology.

3. Evolution of Microservices Architecture 

Monolithic

Initially, monolithic applications consisted of presentation, application and database layers built and hosted in a single data center. Users interacted with the presentation layer, which in turn talked to the business logic and database layers, after which information travelled back up the stack to the end user.

However, this structure created multiple single points of failure and long outages when systems failed or crashed, and it did not accommodate automatic error recovery or scaling on the existing setup.

SOA

A few years down the line, architectural changes gave birth to service-oriented architecture (SOA), where a service is independently made available to other application modules via a network protocol. The approach enabled shorter, autonomous development cycles via APIs, but the complexity of testing integrated services, along with too many failed and expensive implementations, gave it a substandard reputation.

Microservices

Today, cloud microservices have the ability to further decompose the SOA strategy facilitating speedy code updates for a single function that is part of a larger service or application, such as data search or logging function. This technique enables making changes or updates without affecting the rest of the microservices, thus increasing flexibility, providing automated self-healing capabilities, minimizing failure points and creating a more stable application architecture. 

As per latest reports, 58% of enterprises run or plan to run between 10-49 microservices in production, and 15% run or plan to run more than 100.

Microservices architectures use Docker containers, a grouping construct more efficient than VMs. They allow the code and its required libraries to be deployed on any operating system and launched or redeployed instantly on any cloud platform. Many organizations are mirroring their systems onto cloud microservices to enable uniform development operations across any location or cloud-native setup.

4. Fundamentals to a Successful Microservice Design 

For microservices to fit into well-defined distributed application architecture, it is vital that the following elements are taken into consideration first.

4.1 Right Scoping the Functionality

Microservices enable functionality to be partitioned into smaller services, each an independent software component. Over-segregating the functionality leads to excess microservices, so it is imperative to identify which functionalities of the monolithic app are frequently called by multiple functions. Once identified, that functionality can be broken out into its own service to serve diverse situations without overload.

Mirroring an entire module if it is not dependent on other modules within the application is also another method of scoping functionality. For example, authentication groups that manage user identity and authorization can be scoped and built as a microservice.

While defining the scope, it is best to limit the size of microservices to the lines of code (LOC) that can be re-implemented on a periodic basis in order to avoid overloads and bloating of services.

4.2 Cohesion and Coupling

A loosely coupled system has low interdependence and therefore enables deployment, edit or update of a new service without disturbing other existing services present. 

It is also vital to combine related code or homogeneous functions while partitioning a monolithic architecture into smaller services, a property known as cohesion. Higher cohesion means greater autonomy, leading to better microservices architecture.

4.3 Unique Means of Identification

In a microservice design, any one particular service needs to act as the unique source of identification for the remaining parts of the system. For instance, once an order is placed on Flipkart, a unique order ID is generated; as a microservice, this order ID is the single source of information providing all the details about the order placed.

4.4 API Integration

For the broken-down microservices to communicate, relate and work together, it is fundamental to use appropriate APIs that enable convenient communication between the service and client calls, aiding the transition and execution of functions.

Defining the business domain while creating an API eases the process of singularizing the functionality. However, as individual services evolve, richer APIs may have to be created, with additional functionality exposed alongside the old API. API changes must be fully incorporated so that the service behind the API can evolve while continuing to handle calls from multiple client types.
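
As a hedged illustration of sections 4.3 and 4.4, the minimal Flask sketch below exposes a hypothetical order service whose API is keyed on the unique order ID; the route, port and in-memory data are illustrative assumptions, not a prescribed design.

```python
# A minimal sketch of a single microservice exposing its data through an API,
# assuming Flask; the order data and endpoint shape are hypothetical.
from flask import Flask, jsonify, abort

app = Flask(__name__)

# Stand-in for the service's own data store (see section 4.5).
ORDERS = {
    "ORD-1001": {"status": "SHIPPED", "items": 3, "total": 1499.0},
}

@app.route("/orders/<order_id>", methods=["GET"])
def get_order(order_id):
    """The order ID is the single source of identification for this service."""
    order = ORDERS.get(order_id)
    if order is None:
        abort(404)
    return jsonify({"order_id": order_id, **order})

if __name__ == "__main__":
    app.run(port=5001)
```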

4.5 Data Storage Segregation

Data stored and accessed by a service should be owned by that service and shared only through an API, minimizing dependency and access among services. Data classification should be based on the users, which can be achieved through Command Query Responsibility Segregation (CQRS).
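
A minimal, in-memory sketch of the CQRS idea follows, with illustrative class and store names: writes go through a command handler to the service's own store, while reads are served from a separate, read-optimised view. Real systems would back each side with its own datastore.

```python
# A minimal, in-memory sketch of Command Query Responsibility Segregation (CQRS).
class OrderCommandHandler:
    """Handles writes: each accepted command updates the write store and the read view."""
    def __init__(self, write_store, read_view):
        self.write_store = write_store
        self.read_view = read_view

    def place_order(self, order_id, items):
        self.write_store[order_id] = {"items": items, "status": "PLACED"}
        # Project the change into the read-optimised view.
        self.read_view[order_id] = {"status": "PLACED", "item_count": len(items)}

class OrderQueryHandler:
    """Handles reads only, never touching the write store."""
    def __init__(self, read_view):
        self.read_view = read_view

    def order_summary(self, order_id):
        return self.read_view.get(order_id)

write_store, read_view = {}, {}
commands = OrderCommandHandler(write_store, read_view)
queries = OrderQueryHandler(read_view)
commands.place_order("ORD-1001", ["laptop", "mouse"])
print(queries.order_summary("ORD-1001"))  # {'status': 'PLACED', 'item_count': 2}
```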

4.6 Traffic Management

Once services are able to call each other via APIs, it is necessary to gauge the traffic reaching different services. Traffic may vary, and slow responses or an overload of calls to a service may even cause a system crash. To manage and smoothen traffic flows, auto scaling must be implemented; it can terminate instances that cause delays or affect performance, track services continually, and return partial data in the case of broken calls or unresponsive services.

4.7 Automating Processes

Independently designed microservices function by themselves, and automation further allows sustained self-deployment. Services progressively become more adaptable and cloud-native, able to be deployed in any environment, and DevOps plays a major role in this evolution.

4.8 Isolated Database Tables

A microservice design must cater to business functions rather than the database and its workings, as accessing and fetching data from a full-fledged database is time-consuming and unnecessary. Therefore, a microservice should have a minimum number of tables, focused only on the business.

4.9 Constant Monitoring

The smaller components of microservices architecture, with their data layers and caching, may enhance application performance, but monitoring all the changes in such a setup becomes challenging. It is vital to monitor data via microservice monitoring tools that track individual services and eventually combine the data in a central location. Thus, any changes can be reflected without affecting the performance of the system.

Processes must also be defined to monitor API performance to ensure the functionality meets the standards of speed, responsiveness and overall performance.

Most of all, building a well-designed secure software code and architecture from the inception phase ensures accountability, validation, data integrity, and privacy as well as safe accessibility to sturdy and protected systems.

The patterns to secure microservices architecture are:

  • Introduce security into software design from the start
  • Scan all the dependencies
  • Use HTTPS even for static sites to ensure data privacy and integrity
  • Authorize identity of users via access tokens for a secure server-to-server communication
  • Use encryption
  • Monitor systems especially while executing CI/CD pipelines through DevSecOps initiatives
  • Implement strategies in the code to limit network traffic and consecutively delay or avert attackers
  • Use Docker Rootless Mode to safeguard sensitive data
  • As docker containers are integral to microservices, scan them for vulnerabilities 
  • Use time-based security through multi-factor authentication and network traffic controllers
  • Know the 4C’s of cloud-native security – code, container, cluster, and cloud.

5. Key Challenges in Microservices

Microservices are favorable, but not every business is capable of adopting them. Some organizational reservations are:

5.1 Being Equipped for Speedy Provisioning and Deployment

Microservices demand instant server provisioning and new application and service deployments. Organizations need to keep pace and be well furnished with high-speed development and delivery mechanisms. 

5.2 Execute Powerful Monitoring Controls

With microservices functioning independently, enterprises need to administer efficient monitoring of various teams working concurrently on different microservices as well as the infrastructure to keep track of failures and down time.

5.3 Incorporate DevOps Culture

Unlike the traditional setup, DevOps ensures everyone is responsible for everything. The cross-functional teams need to collaboratively work towards developing functionalities, provisioning services, managing operations and remediating failures. 

5.4 Provision Accurate Testing Capabilities

Since each service has its own dependencies, and new features add new dependencies, it becomes difficult to keep track of changes. Complexity also increases with the number of services. Implementing flexible and penetrative tests to detect database errors, network latency, caching issues or service unavailability in microservices is a must.

5.5 Design for Failure

Designing for system downtime, slow service and unexpected responses is essential. Being prepared for load balancing, setting up back up plans and ensuring failures do not bring the entire system to a halt helps businesses handle failures or issues better. 

6. Benefits of Microservices Architecture 

6.1 Increased Elasticity

Due to the distributed and granular structure of services, failures have minimal impact. The application is distributed, so even when multiple services are down for maintenance, end users are not affected.

6.2 Better Scalability

Scaling up a single function or service that is crucial to business without disturbing the application as a whole increases availability and performance.

 6.3 Right Tools for the Right Task

With microservices there is flexibility in terms of services using its own language and framework without being confined to a specific vendor. The freedom of being able to choose the right tool for the right task that smoothens communication with other services is a significant gain. 

6.4 Faster Time to Market

As microservices have loosely coupled services, code development or modification can be constricted to relevant services instead of rewriting the code for the entire application. Therefore, working in smaller independent increments leads to swifter testing and deployment, enabling services to reach the markets faster.

 6.5 Easy Maintenance

Debugging, testing and maintaining applications are easier with microservices. With smaller modules being continuously tested and delivered, the services become more enhanced and error-free. 

 6.6 Improved ROI and Reduced TCO

Microservices allows optimization of resources as multiple teams work on independent services, enabling quicker deployment, code reuse and reduction in development time. Decoupling cuts down on infrastructure costs and minimizes downtime resulting in improved ROI.

 6.7 Continuous Delivery

Code can be modified, updated, tested and deployed at any given point in time and released as small packets of code using continuous integration / continuous delivery (CI/CD).

After adopting microservices architecture, organizations have reported a 64% improvement in the scalability of applications, experienced 60% faster time to market, increased application flexibility by almost 50% and given their development teams 54% more autonomy.

7. Building Microservices Architecture

Step 1:

Monolith architecture is the traditional method of building and deploying software applications. As businesses grow, the list of key business capabilities to be provided by existing systems also expands.

Microservices work best if the roles of different services required by the system are correctly identified without which redefining service interactions, APIs and data structures in microservices proves costly. Adopting microservices after business has matured and gained sufficient feedback from customers is the best-case scenario.

However, it is advisable to switch to microservices before the code gets too complicated on the monolithic setup. It is important to determine and break the services into smaller pieces, separate the code from the web UI ensuring it interacts with the database via APIs. This warrants smooth transition to microservices especially when organizations would look at moving more API resources to different services in the future.

Step 2:

Microservices is not just about splitting the code, accommodating for failures, recovering from network issues or dealing with service load monitoring. It is equally crucial to reorganize teams, preferably in small numbers who have the required competency to develop and maintain microservices architecture.

Werner Vogels, CTO at Amazon, says, “you build it, you run it”, implying that developers can analyze the impact of their code in production, work on reducing risks and eventually deliver better releases. Also, multiple teams can work collaboratively on code upgrades and automation of the deployment pipeline.

Step 3:

Once the service boundaries and teams are established, the traditional architecture can be broken to create microservices. Communication between services must be via simple APIs to avoid the components from being tightly integrated. Using basic message queuing services and transmitting messages over the network without much complexity works best in microservices.

Moreover, it is advisable to split data across the decoupled services, as each service can have its own data store holding just what it needs. For example, suppose customer information lives in an “order” table that the billing system also uses to generate invoice details. With microservices, invoices can still be accessed even if the ordering system is down, because the invoice data is kept in its own store, independent of the others.

However, to eliminate duplicate data in different databases, businesses can adopt an event-driven architecture to help data syncing across multiple services. For instance; when a customer updates his personal information, an event is triggered by the account service to update billing and delivery tracking services as well.
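
A minimal in-process sketch of that event-driven synchronisation follows, with hypothetical service names; in production the bus would typically be a managed message broker rather than a Python dictionary.

```python
# A minimal sketch of event-driven data syncing between services;
# the event bus here is in-memory purely for illustration.
from collections import defaultdict

subscribers = defaultdict(list)

def subscribe(event_type, handler):
    subscribers[event_type].append(handler)

def publish(event_type, payload):
    for handler in subscribers[event_type]:
        handler(payload)

# Billing and delivery services keep their own copies of customer data.
billing_db, delivery_db = {}, {}

subscribe("CustomerUpdated", lambda e: billing_db.update({e["customer_id"]: e["address"]}))
subscribe("CustomerUpdated", lambda e: delivery_db.update({e["customer_id"]: e["address"]}))

# The account service publishes an event when a customer edits their details.
publish("CustomerUpdated", {"customer_id": "C-42", "address": "221B Baker Street"})
print(billing_db, delivery_db)
```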

While transitioning from the monolithic architecture, ensure designs are built for failure right from the beginning. As the system is now distributed, multiple points of failure can arise and microservices needs to address and remediate not just individual service related issues but also system failures and slower network responses if any.

Since microservices are distributed by nature, it is challenging to monitor or log individual services. A centralized logging service that aggregates logs from each service instance should be in place, accessible through standard tools, right from the beginning. Likewise, CPU and memory usage can be collated and stored centrally.

With an increasing number of services getting deployed multiple times, it is most advantageous to regulate continuous delivery. Practicing continuous integration and delivery will assure that each service has passed the acceptance tests, there is minimal risk of failure and a robust system with superior quality of releases is being built with time. 

8. Microservices Best Practices

  • Microservices should follow the single-responsibility principle: every service module is responsible for a single piece of functionality.
  • Customize database storage and infrastructure exclusively for each microservice's needs; other microservices can request the same data through APIs.
  • Use asynchronous communication or events between microservices to avoid building tightly coupled components. 
  • Employ circuit breakers to achieve fault tolerance, speed up responses and time out delayed external calls. This isolates failing services and keeps the rest of the microservices healthy (a minimal sketch follows this list).
  • Proxy microservices requests through an API Gateway instead of directly calling for the service. This enables more layers of protection, traffic control, and rejection of unauthorized requests to microservices.
  • Ensure your API changes are backward compatible by conducting contract testing for APIs on behalf of end-users, which allows applications to get into production at a faster rate. 
  • Version your microservices with each change and customers can choose to use the new version as per their convenience. Support for older versions of microservices would continue for a limited timeframe.
  • While hosting microservices, create a dedicated infrastructure by setting apart the microservices infrastructure from other components to achieve error isolation and enhanced performance.
  • Microservices must have their own separate release detached from other components.
  • Build standards within the organization for multiple teams to develop and release microservices on the same lines. Create enterprise solutions for API security; log aggregation, monitoring, API documentation, secrets management, config management, and distributed tracing. 
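
As an illustration of the circuit-breaker practice mentioned above, here is a minimal Python sketch; the failure threshold and timeout are arbitrary assumptions, and production systems would normally rely on a library or a service-mesh feature instead.

```python
# A minimal circuit-breaker sketch: after repeated failures the circuit "opens"
# and calls fail fast until a cool-down period has elapsed.
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None          # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()
            raise
        self.failures = 0                   # a success closes the circuit again
        return result
```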

9. How can Microservices add Value to Organizations? 

Evolving user preferences are driving organizations to adopt digital agility.

A survey on digital transformation by Bernd Rücker revealed that 63% of the 354 questioned enterprises were looking into microservices because they wanted to improve employee efficiency, customer experience and optimize tools and infrastructure costs. 

Netflix was one of the pioneers of microservices almost a decade ago with other companies like Amazon, Uber and eBay joining the trend. Former CTO of eBay, Steve Fisher, stated that as of today, eBay utilizes more than 1000 services that include front-end services to send API calls and back-end services to execute tasks. 

It is easier for business capabilities to get aligned with the fragmented parts of microservices as it facilitates application development to align directly with functionalities prioritized by business value. This ensures businesses are highly available, become more resilient and structured. 

Businesses can look towards scalability of applications backed by a team of experts and advanced technology, deployment becomes easier due to independent manageability and enterprises become better equipped to handle dynamic market conditions as well as meet growing customer needs. The business value proposition must be the key drivers to embracing microservices architecture.

10. Conclusion

Microservices are beneficial not just technically for code development process but also strategic to overall business maturity. Despite being considered complex, microservices provide businesses the capability and flexibility to cater to frequent functional and operational changes making them the most sought-after architectures of today.  

Its granular approach assists business communication and efficiency. Microservices are a novel concept but seem promising for the areas of application development. 

Organizations can consider microservices architecture, if best suited, to digitally transform their businesses to become more dynamic and competitive.

AI : Cloud :: Intelligence : Muscle

By | Powerlearnings | No Comments

Written by Vinit Balani, Senior Specialist – Cloud Services and Software

Just like social distancing went on to become the new normal during and post the pandemic in the physical world, the cloud is slowly becoming (if not already) the new normal in the enterprise world. With remote working and a serious push to digitization, it has become inevitable for enterprises to continue delivering products and services to their end-users. COVID-19 has been a major driver in cloud adoption for many enterprises across the globe.

Worldwide revenues for the artificial intelligence (AI) market are forecast to grow 16.4% year over year in 2021 to $327.5 billion, according to the latest release of the IDC Worldwide Semiannual AI Tracker. By 2024, the market is expected to break the $500 billion mark with a five-year compound annual growth rate (CAGR) of 17.5% and total revenues reaching an impressive $554.3 billion.

If we look at India, India Inc.’s AI spending is expected to grow at a CAGR of 30.8% to touch USD 880.5 million in 2023 as per the IDC report. AI is now being used by enterprises to get a competitive advantage with BFSI and manufacturing verticals leading the race in terms of AI spending.

So, why has AI become mainstream across industries and picked up so drastically over the last decade? One of the major reasons is the cloud. I would like to draw an analogy between AI, Cloud and the human body: if AI is the intelligence that resides in the brain, Cloud is the muscle it needs to execute any action, or any algorithm in this case. The advantage of running AI on the cloud rather than on local, on-premise infrastructure is that cost does not grow proportionately with the amount of data you train on, thanks to the economies of scale the cloud provides. In fact, in the world of cloud computing, more is better. This is also one of the biggest reasons for the increase in AI adoption by enterprises after cloud adoption.

Below are some of the areas which I believe will see major developments in the coming 5 years –

Niche AI services

With the democratization of data, we can already see AI services being developed for different industries and, in many cases, for specific use cases. Enterprises are looking for automation within their domains, and AI (in some cases along with RPA) is playing a major role in addressing business challenges. The growth of industry-focused AI (retail, manufacturing, healthcare and more), and in certain cases offerings one level deeper within an industry (conversational AI, computer vision, etc.), has been phenomenal. With on-demand compute available at a click, entrepreneurs are picking up focused challenges that can be addressed with AI and building products and services around them.

Accurate AI models

Due to the massive boost to digitization post COVID-19, a massive amount of digital data is being generated in multiple formats, and thanks to cloud storage, all of it is being stored in raw form by different enterprises. Bias is one of the most important factors in the accuracy of AI models, and it is also one of the factors that can hamper AI's uptake and application within enterprises. However, with most data moving to digital form and high volumes being generated (and now available for training), existing AI models are bound to get more accurate and reliable. With much happening in the data quality space as well, we can expect AI models and services to get smarter and more accurate by the day.

AI Governance

While there has been talk of using AI responsibly, not much has been done in this space. The regulatory and policy landscape for AI is an emerging issue in jurisdictions globally. There have been certain initiatives and papers in this space from national, regional and international authorities, but we still see a lack of a governance framework to regulate the use of AI. With growing adoption, we can expect frameworks and governance to be established and to play a major role in driving the development of responsible AI.

Cloud AIOps

There are interesting trends already emerging around automating cloud environments using artificial intelligence. With enterprises migrating their workloads to the cloud, an incredible amount of telemetry data and metrics is generated by these applications and the underlying infrastructure. All of this data can be used to train AI, initially to pinpoint issues, then to provide resolutions, and gradually to automate fixing such issues without human intervention. However, it will be interesting to see if any of the hyperscalers (AWS, Azure, GCP) can build that intelligence from the telemetry data of their clients' environments and create a service for this automation.

AI@Edge

With the adoption of IoT (Internet of Things) across industries like manufacturing, retail, healthcare, energy, financial services, logistics and agriculture, more data is being generated by devices, with a need for analysis and processing near the device.

According to Gartner, companies generated just 10% of their data outside of the data center or cloud in 2019; this amount is expected to grow up to 75% by 2025.

As a result, IDC predicts that in the next three years, 45% of data will be stored, analyzed, processed and acted upon close to the devices. We already see our smartphones carrying chips to process AI. This growing IoT adoption will lead to AI models being deployed onto more edge devices across domains.

Self-service AI

While enterprises are looking to create self-service experiences in different domains, cloud service providers, on the other hand, are building products to create self-service platforms that help these enterprises reach the market faster.

As per Gartner, 65% of the app development work will be done using low or no-code platforms and a big chunk of this is going to be platforms to build and train AI.

Hyperscalers have been putting in their best efforts to create no-code or low-code platforms so that non-techies can create and train their models on the cloud. From chatbots and computer vision to custom ML models, enterprises are using these platforms to create their offerings with on-demand resources on the cloud instead of reinventing the wheel.

These are some of the areas where I believe we will see advancements in AI. It would be great to hear some thoughts on what you folks think about the trends in AI. 

At LTI, we aspire to create the data-to-decision ecosystem to enable organizations to take quantum leaps in business transformation with AI. LTI’s Mosaic ecosystem has been created to provide enterprises with a competitive edge using its transformative products. While Mosaic AI simplifies designing, development, and deployment of AI/ML at enterprise scale, Mosaic AIOps infuses AI in IT operations to enable pro-active operational intelligence with real-time processing of events. With data being at the center of AI, Mosaic Decisions ensures ingestion, integrity, storage, and governance aspects with Mosaic Catalog ensuring ease of discovering enterprise data.

Mosaic Agnitio’s in-built deep learning enables enterprises to extract insights from data and automate business processes, while Mosaic Lens’s augmented analytics capabilities help uncover hidden insights within the data using AI. 

To know more details on LTI’s Mosaic ecosystem, you can visit – https://www.lntinfotech.com/digital-platforms/mosaic/ 

Cloud Governance with ‘x’Ops -Part 3

By | Powerlearnings | No Comments

Compiled by Kiran Kumar, Business analyst at Powerup Cloud Technologies

Contributor Agnel Bankien, Head – Marketing at Powerup Cloud Technologies

Summary:

The xOps umbrella consists of four major ops functions broadly categorized under cloud management and cloud governance. In our previous blogs, we had a detailed look at how IT Ops could be built effectively and by what means DevOps and CloudOps play a major role in cloud management functions. In this concluding blog of the xOps series, we will have a close look at the financial and security operations on cloud and its significance in cloud governance that has paved the way to a more integrated and automated approach to cloud practices.

Index

1. Introduction

2. What is FinOps?

3. FinOps Team

4. Capabilities Architecture

4.1 FinOps Lifecycle

4.1.1 Inform

4.1.2 Optimize

4.1.3 Operate

5. Benefits of FinOps

6. What is SecOps?

7. SOC & the SecOps team

8. How SecOps works?

9. Benefits of SecOps

10. Conclusion

Optimizing cloud governance through FinOps and SecOps

1. Introduction        

According to Gartner, the market for public cloud services will grow at a compound annual growth rate of 16.6% by 2022. A surge in the usage of the cloud has determined organizations to not only upscale their capabilities to be more reliable, compliant, flexible, and collaborative but also equip themselves to handle their cloud finances and security more effectively.

Financial operations, widely known as FinOps, has led businesses to become vigilant and conscious about their financial strategies and analytics, to plan, budget and predict their cloud expenses better, helping them gain more flexibility and agility over time.

In today’s technology-driven business environments, data is the biggest asset and securing our data, applications, and infrastructure on cloud is a massive concern with growing cloud practices. 69% of executives surveyed by Accenture Security’s 2020 State of Cyber Resilience state that staying ahead of attackers is a constant battle and the cost is unsustainable. With the global IT spends close to $4 trillion, modernized security strategies combined with business operations need to be implemented right from the beginning of the software development lifecycle.

In the first two parts of the ‘x’Ops series, we saw how IT Ops could be built productively by focusing more on DevOps and CloudOps.

The xOps umbrella consists of four major ops functions broadly categorized under cloud management and cloud governance and in this conclusive blog of the xOps series, we will have a close look at cloud governance through FinOps and SecOps practices.

2. What is FinOps?

FinOps is short for Cloud Financial Management and is the convergence of finance and operations on cloud.

The traditional setup of IT was unaware of the inefficiency and roadblocks that occur due to the silo work culture, limitations in infrastructure adaptability with regards to business requirements and the absence of technology-led cloud initiatives.

With the onset of FinOps, the people, process and technology framework was brought together to manage operating expenses as well as impose financial accountability to the variable spend on cloud.

Organizations needed to establish efficient cost control mechanisms in order to enable easy access to cloud spend and devise steady FinOps practices.

3. FinOps Team

Workforces from every level and area of business would have unique individual roles to play in the FinOps practices.

Executive heads like VP of Infrastructure, Head of Cloud Center of Excellence, CTO or CIO would be responsible for driving teams to be efficient and accountable while also building transparency and controls.

FinOps practitioners would focus on forecasting, allocating and budgeting cloud spend for designated teams. FinOps experts would typically include a FinOps Analyst, Director of Cloud Optimization, Manager of Cloud Operations or Cost Optimization Data Analyst, to name a few.

Engineering and operations departments comprising of Lead Software Engineer, Principal Systems Engineer, Cloud Architect, Service Delivery Manager, Engineering Manager or Director of Platform Engineering, would focus on building and supporting services for the organization.

Technology Procurement Manager, Financial Planning and Analyst Manager and Financial Business Advisor would form the finance and procurement team to use FinOps team’s historical records for future requirements and forecasts. They would work closely with FinOps to understand existing billing data and rate negotiation techniques to construct enhanced cost models for future capacity and resource planning.

Thus for organizations operating on the FinOps model, a cross-functional team known as a Cloud Cost Center of Excellence would be set up to strategize, manage and govern cloud cost policies and operations as well as implement best practices to optimize and stir up the enterprise cloud businesses.

4. Capabilities Architecture

Organizations adapting to FinOps practices need to inculcate a cultural change first.

Cloud cost forms a significant part of performance metrics and can be tracked and monitored to determine the right team size as per workload specifications, allocate container costs, identify and compute unused storage and highlight inconsistency if any, in the expected cloud spends. 

FinOps is a trusted operating model for teams to manage all of the above. Technology teams can collaborate with business and finance teams to shape informed decisions, drive continuous optimization and gain faster financial and operational control.

4.1 FinOps Lifecycle

The FinOps journey on cloud consists of three iterative stages – Inform, Optimize, and Operate. 

4.1.1 Inform

Provides a detailed assessment of cloud assets for better visibility, understanding, budget allocations, and benchmarking industry standards to detect and optimize areas of improvement.

Considering the dynamic nature of the cloud, stakeholders are compelled to customize pricing and discounts, ensure accurate allocation of cloud spends based on business mapping, and ascertain ROIs are driven in view of the set budgets and forecasts.
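
As a hedged example of the visibility this phase calls for, the sketch below uses the AWS Cost Explorer API via boto3 to break a month's spend down by a cost-allocation tag; the tag key "team" and the date range are illustrative assumptions, and credentials and a default region are presumed to be configured.

```python
# A minimal sketch of tag-level cost visibility, assuming AWS Cost Explorer is
# enabled and resources carry a cost-allocation tag named "team" (hypothetical).
import boto3

ce = boto3.client("ce")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2021-06-01", "End": "2021-07-01"},  # hypothetical month
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "team"}],
)

for result in response["ResultsByTime"]:
    for group in result["Groups"]:
        tag_value = group["Keys"][0]                              # e.g. "team$payments"
        amount = group["Metrics"]["UnblendedCost"]["Amount"]
        print(f"{tag_value}: ${float(amount):.2f}")
```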

4.1.2 Optimize

Once organizations and teams are commissioned, it is time to optimize their cloud footprint.

This phase helps set alerts and measures to identify areas where spend needs to be adjusted and resources redistributed.

It generates real-time decision-making capacity regarding timely and consistent spend and recommends application or architecture changes where necessary. For instance, cloud providers often offer lucrative discounts on reserved instances in order to increase usage commitment levels. Cloud environments can also be optimized through rightsizing and automation to curb wasteful use of resources.

4.1.3 Operate

Helps to align plans and evaluate business objectives through metrics on a continuous basis.

Optimizes costs by instilling proactive cost control measures at the resource level. It enables distributed teams to drive the business with speed, cost and quality in balance. This phase provides flexibility in operations, creates financial accountability for variable cloud spend, and helps teams understand cloud finances better.

5. Benefits of FinOps

  • The shift to FinOps empowers teams to build a robust cloud cost and ROI framework.
  • Enables organizations to estimate, forecast, and optimize cloud spends.
  • Improves the decision-making process of enterprises and provides traceability to the decisions made.
  • Helps in financial, procurement, demand, and operational management on cloud.
  • Increases cost efficiency, helps teams attain clear visibility to make their own financial choices with regards to cloud operations.
  • Creates a finance model that conforms to the dynamics of the cloud business.

 6. What is SecOps?

As per the latest studies, 54% of security leaders state that they communicate effectively with IT professionals to which only 45% of IT professionals agree. As IT operations stress upon rapid innovation and push new products to market, security teams are weighed down with identifying security vulnerabilities and compliance issues. This has created a huge mismatch between the IT and security teams that needs to be jointly addressed and resolved effectively.

SecOps is the integration of IT security and operations teams that combine technology and processes to reduce the risk and impact on business, keep data and infrastructure safe, and develop a culture of continuous improvement to eventually enhance business agility. SecOps ensures data protection is given priority over innovation, speed to market, and costs at all times.

7. SOC & the SecOps team

SecOps teams are expected to interact with cross-functional teams and work 24/7 to record all tasks and mitigate risks. For this purpose, a Security Operations Center (SOC) is established that commands and oversees all security-related activities on the cloud.

The Chief Information Security Officer (CISO) is primarily responsible for assembling a synergetic SecOps team that defines clear roles and responsibilities and devises strategies to restrict security threats and cyber-attacks. Every SecOps team will comprise:

  • An incident responder, who identifies, classifies and prioritizes threats and configures as well as monitors security tools.
  • Security investigator that identifies affected devices, evaluates running and terminated processes, carries out threat analysis and drafts the mitigation strategies.
  • An advanced security analyst, who recognizes hidden flaws, reviews and assesses threats as well as vendor and product health, and recommends process or tool changes if any.
  • A SOC manager, who manages the entire SOC team, communicates with the CISO and business heads and oversees people and crisis management activities.
  • A security engineer or architect, who evaluates vendor tools, takes care of the security architecture and ensures it is part of the development cycle as well as compliant with industry standards.
  • SecOps has lately seen many new cybersecurity roles unfold. Cloud security specialists, third-party risk specialists, and digital ethics professionals to name some. These roles essentially highlight the vulnerabilities in supply chain processes, privacy concerns, and the impact of cloud computing on IT businesses.

8. How SecOps works?

Gartner states that through 2020, “99% of vulnerabilities exploited will continue to be the ones known by security and IT professionals for at least one year.”

Therefore, the most important aspect is to establish security guardrails and monitor the security spectrum on the cloud continuously.

Dave Shackleford, principal consultant at Voodoo Security stated that for a SOC monitored cloud, SecOps teams must:

  • Establish a discrete cloud account for themselves to ensure entire security controls lie solely with them,
  • Administer multifactor authentication for all cloud accounts while also creating a few least privilege accounts to perform specific cloud functions as and when required and
  • Enable write-once storage for all logs and evidence.

Moreover, the SecOps team must ensure to be primarily responsible and accountable towards security incidents with proactive and reactive monitoring of the entire security scope of the organization’s cloud ecosystem.

According to Forrester Research, “Today’s security initiatives are impossible to execute manually. As infrastructure-as-a-code, edge computing, and internet-of-things solutions proliferate, organizations must leverage automation to protect their business technology strategies.”

Additionally, firewalls and VPNs are no longer considered competent enough to combat the present day’s advanced security threats.

Therefore, it is believed that enterprises that automate core security functions such as vulnerability remediation and compliance enforcement are five times more likely to be sure of their teams communicating effectively. 

Businesses need to implement SecOps practices:

  • That can apply checks on their cloud environment against security benchmarks and guidelines as per industry standards (a minimal sketch follows this list).
  • Use vulnerability management tools to scan, analyze and detect potential security-related risks and threats.
  • Assure access authorization and employ frameworks that automate user behavior, profiling, and control.
  • Conduct recurrent audits as preventive measures to keep a check on cloud health and status
  • Use AI to automate SecOps that encapsulate incident detection, response, and analysis, help categorize, prioritize and mitigate threats, recommend remediation, detect unused resources and assign risk scores.
  • Deploy SecOps software that caters to DNS, network, and anti-phishing security along with the application of advanced analytics like data discovery.
  • Implement cloud orchestrations to coordinate automated tasks and consolidate cloud processes and workflows for a more sophisticated and proactive defense.
  • Last but not the least, implement best practices to ensure continuous monitoring and structuring of cloud security operations.
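
As a minimal, hedged example of the first kind of automated check in the list above, the sketch below (assuming AWS and boto3) flags security groups that leave SSH open to the whole internet; real benchmark tooling covers far more rules than this single one.

```python
# A minimal sketch of an automated security check: flag security groups
# that allow SSH (port 22) from anywhere (0.0.0.0/0). Assumes AWS + boto3.
import boto3

ec2 = boto3.client("ec2")

for page in ec2.get_paginator("describe_security_groups").paginate():
    for sg in page["SecurityGroups"]:
        for rule in sg.get("IpPermissions", []):
            all_traffic = rule.get("IpProtocol") == "-1"
            covers_ssh = all_traffic or (
                rule.get("FromPort") is not None and rule["FromPort"] <= 22 <= rule["ToPort"]
            )
            open_to_world = any(
                r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", [])
            )
            if covers_ssh and open_to_world:
                print(f"ALERT: {sg['GroupId']} ({sg.get('GroupName')}) exposes SSH to the internet")
```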

9. Benefits of SecOps

Security and operations together provide:

– Continuous protection of data, applications, and infrastructure on cloud

– Prevention and mitigation of risks and threats

– Speedy and effective response time

– Adherence to compliance standards

– Cost savings from optimizing security measures

– Building security expertise and

– Instilling flexibility and high availability while eliminating redundancy in business operations.

10. Conclusion

With the onset of development, security, finance, and cloud operations coming together under one umbrella, IT operations have gained immense competency in cloud-based services.

The current trend facilitates Dev+Sec+Ops teams to collaborate and incorporate security-related strategies, processes, and policies from the inception phase of the SDLC. The idea is for everyone to be responsible for security by strategically placing security checkpoints at different stages of the SDLC.

Moving forward, the future of SecOps will be relying more on AI and machine learning tools to construct powerful, intelligent, and dynamic SecOps strategies.

83% of organizations admit that with stronger security operations on cloud, their business productivity has appreciably risen. Their security risks have decreased significantly, by 56%, while overall costs have come down by almost 50%, improving their ability to be agile and innovative.

Cloud Landing Zone Guide

By | Powerlearnings | No Comments

Written by Aparna M, Associate Solutions Architect at Powerupcloud Technologies

The Cloud is the backbone and foundation of digital transformation in its many forms. When planning a cloud migration and an adoption strategy, it is important to create an effective operational and governance model that is connected to business goals and objectives. At this point, building an efficient cloud landing zone plays a big role. In this article, we will take a deeper look into why having a cloud landing zone is a key foundation block in the cloud adoption journey.

What is Cloud Landing Zone?

A landing zone is defined as ‘a configured environment with a standard set of secured cloud infrastructure, policies, best practices, guidelines, and centrally managed services.’ It helps customers quickly set up a secure, multi-account cloud environment based on industry best practices. With a large number of design choices, setting up a multi-account environment can take a significant amount of time, involving the configuration of multiple accounts and services and requiring a deep understanding of the cloud provider's services (AWS, Azure, or GCP).

This solution can help to save time by automating the set-up of an environment for running secure and scalable workloads while implementing an initial security baseline through the creation of core accounts and resources.

Why a Landing Zone?

As large customers move to the cloud, the main concerns are security, time constraints and cost. AWS Landing Zone, for example, is a solution that helps set up a secure, multi-account AWS environment following best practices. With so many design choices, it is good to be able to start without spending time on configuration and with minimal cost. A landing zone saves time by automating the setup of an environment to run secure and scalable workloads.

Fundamentals of the Landing Zone when Migrating to the Cloud:

Before deciding which cloud provider to use (AWS, GCP, or Azure), it is important to assess certain basic considerations:

1. Security & Compliance

A landing zone allows you to enforce security at the global and account level, establishing a security baseline with preventative and detective controls. Company-wide compliance and data residency policies can be implemented with landing zones. As part of this process, a consistent architecture is deployed for concerns such as edge security, threat management, vulnerability management and transmission security.

2. Standardized Tenancy

A landing zone provides a framework for creating and baselining a multi-account setup. Automating the multi-account environment saves setup time while also implementing an initial security baseline for any digital environment you are going to use. The automated multi-account structure includes security, audit and shared-service requirements. It also enforces tagging policies across multiple cloud tenants and provides standardized tenants for different security profiles (dev/staging/prod).

3. Identity and Access Management

Implement the principle of least privilege by defining roles and access policies, and implement SSO for cloud logins.
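
A minimal sketch of codifying least privilege follows, assuming AWS IAM via boto3; the bucket name and policy name are hypothetical placeholders, and a real landing zone would typically manage such policies through IaC rather than ad hoc scripts.

```python
# A minimal sketch of a least-privilege IAM policy: read-only access to one bucket.
# The bucket and policy names are hypothetical placeholders.
import json
import boto3

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-reports-bucket",
                "arn:aws:s3:::example-reports-bucket/*",
            ],
        }
    ],
}

iam = boto3.client("iam")
iam.create_policy(
    PolicyName="reports-read-only",
    PolicyDocument=json.dumps(policy_document),
    Description="Least-privilege read access to the reports bucket",
)
```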

4. Networking

Designing and implementing cloud networking capabilities is a critical part of your cloud adoption efforts. Networking is composed of multiple products and services that provide different networking capabilities, and implementation measures should ensure the network is highly available, resilient and scalable.

5. Operations

Centralize logging from various accounts by leveraging the cloud provider's services. Configure automated backups and set up DR using cloud-native tools, configure monitoring and alerts for cost management, reactive scalability and reliability, and automate regular patching of servers.
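
As a hedged example of the monitoring and alerting mentioned above, the sketch below creates a CloudWatch CPU alarm for one instance via boto3; the instance ID, SNS topic ARN and threshold are illustrative assumptions.

```python
# A minimal sketch of a CloudWatch alarm that notifies an SNS topic when an
# EC2 instance's average CPU stays above 80% for two 5-minute periods.
# The instance ID and topic ARN are hypothetical placeholders.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="web-dev-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
)
```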

Benefits of the Cloud Landing Zone:

  • Automated environment setup
  • Speed and scalability and governance in a multi-account environment 
  • Security and compliance
  • Flexibility
  • Reduced operational costs

Best Practices of the Cloud Landing Zone

  • Organizations Master Account: It is the root account that provisions and manages member accounts at the organization level under Organizations Services.
  • Core Accounts in an Organizational Unit: This provides essential functions that are common to all the accounts under the Organization, such as log archive, security management, and shared services like the directory service.
  • Team/Group Accounts in Organizational Units: Teams and groups are logically grouped under Teams. These are designated for individual business units at the granularity of the team or group level. For example, a set of Team accounts may consist of the team’s shared services account, development account, pre-production account, and production account.
  • Developer Accounts: Enterprises should have separate “sandboxes” or developer accounts for individual learning and experimentation, as a best practice.
  • Billing: An account is the only way to separate items at a billing level. The multi-account strategy helps create separate billable items across business units, functional teams, or individual users.
  • Quota Allocation: Service provider quotas are set up on a per-account basis. Separating workloads into different accounts gives each account (such as a project) a well-defined, individual quota.
  • Multiple Organizational Units (OUs): Use separate OUs for the core accounts and for each team or group described above, so that policies can be applied and inherited at the OU level rather than account by account.
  • Connectivity: You can also choose the type of connection you want to use. By setting up networking patterns and combining them with external data centers, you can create a hybrid or multi-cloud adoption.
  • Security Baseline:
    • All accounts send logs to a centrally managed log archival account
    • A central VPC for all accounts, with VPC peering used where applicable
    • A configured password policy
    • Cross-account access with limited permissions
    • Alarms/events configured to send notifications on root account logins and API authentication failures
  • Automation: Automation ensures that your infrastructure is set up in a way that is repeatable and can evolve as your use is refined and demands grow.
  • Tagging: Tagging resources helps customers in many ways, for example cost analysis and optimization (see the sketch after this list).
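
To make the tagging practice concrete, the sketch below applies a consistent set of cost-allocation tags to a batch of resources via the AWS Resource Groups Tagging API in boto3; the ARNs and tag keys are illustrative assumptions.

```python
import boto3

tagging = boto3.client("resourcegroupstaggingapi")

# Apply consistent cost-allocation tags to a batch of resources (ARNs are placeholders)
tagging.tag_resources(
    ResourceARNList=[
        "arn:aws:ec2:ap-south-1:111111111111:instance/i-0abc123def4567890",
        "arn:aws:s3:::example-team-logs-bucket",
    ],
    Tags={
        "CostCenter": "analytics",
        "Environment": "prod",
        "Owner": "platform-team",
    },
)
```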

Cloud Landing Zone Life Cycle

Let’s talk about the different phases of a landing zone’s lifecycle:

  • Design
  • Deploy
  • Operate

In software development you often hear the terms

“Day 0/Day 1/Day 2”

Those refer to different phases in the life of a piece of software: from specifications and design (Day 0), to development and deployment (Day 1), to operations (Day 2). For this blog post, we’re going to use this terminology to describe the phases of the landing zone lifecycle.

Designing a Landing Zone (Day 0)

Regardless of the deployment option, you should carefully consider each design area, because your decisions affect the platform foundation on which every landing zone depends. A well-designed landing zone should take care of four aspects in the cloud:

  1. Security and Compliance
  2. Standardized tenancy
  3. Identity and access management
  4. Networking

Deploying a Landing Zone (Day 1)

When it comes to customizing and deploying a landing zone according to the design and specifications determined during the design phase, every public cloud provider handles the landing zone concept differently.

Amazon Web Services: The solution provided by AWS is called AWS Landing Zone. It helps customers set up a multi-account architecture more quickly, with an initial security baseline covering identity and access management, governance, data security, network design, and logging. AWS offers three options for creating your landing zone: a service-based landing zone using AWS Control Tower, a CloudFormation-based solution, and a fully customized landing zone that you build yourself.
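
For the CloudFormation route, the deployment itself can be scripted. The following is a hedged boto3 sketch, not the official AWS Landing Zone template; the template URL, stack name, and parameter keys are placeholders.

```python
import boto3

cfn = boto3.client("cloudformation")

# Launch a landing-zone baseline stack from a template stored in S3
# (template URL and parameter names are placeholders)
cfn.create_stack(
    StackName="landing-zone-baseline",
    TemplateURL="https://s3.amazonaws.com/example-bucket/landing-zone-baseline.yaml",
    Parameters=[
        {"ParameterKey": "LogArchiveAccountEmail", "ParameterValue": "logs@example.com"},
        {"ParameterKey": "SecurityAccountEmail", "ParameterValue": "security@example.com"},
    ],
    Capabilities=["CAPABILITY_NAMED_IAM"],  # stack creates IAM roles, so this must be acknowledged
)

# Block until the stack finishes (or fails)
cfn.get_waiter("stack_create_complete").wait(StackName="landing-zone-baseline")
```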

Microsoft Azure: The solution provided by Azure is called the Cloud Adoption Framework. A major tool is Azure Blueprints: you can choose and configure migration and landing zone blueprints within Azure to set up your cloud environments. As an alternative, you can use third-party tools such as Terraform.

Google Cloud Platform: The solution provided by Google Cloud is called Google Cloud Deployment Manager. You use a declarative YAML format, or Python and Jinja2 templates, to configure your deployments.
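
As a small illustration of the Python template option, Deployment Manager imports a template that exposes a generate_config(context) function returning the resources to create. The sketch below is a minimal, assumed example; the file name, network name, and property values are placeholders.

```python
# vpc_template.py -- a minimal Deployment Manager Python template sketch.
# Deployment Manager calls generate_config(context) and expects a dict of
# resources; names and properties below are placeholders.

def generate_config(context):
    resources = [{
        "name": context.properties.get("networkName", "landing-zone-vpc"),
        "type": "compute.v1.network",
        "properties": {
            # Custom-mode VPC: subnets are created explicitly elsewhere
            "autoCreateSubnetworks": False,
        },
    }]
    return {"resources": resources}
```

Such a template would typically be referenced from a YAML configuration file through its imports section and deployed with the gcloud tooling.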

Operations (Day 2):

Operating a landing zone is an ongoing effort. The objective of the operations workstream is to review your current operational model and develop an operations integration approach that supports the future-state operating model as you migrate to the cloud. Infrastructure as Code ensures your configurations are managed in a repeatable way and evolve through DevOps disciplines and tooling. This also means leveraging logging solutions such as Splunk, Sumo Logic, or the ELK stack, implementing backups and patching using cloud provider services or tools, and planning and designing disaster recovery, which plays a very important role in ensuring high availability of the infrastructure.
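
For the backup piece specifically, a scheduled, tag-driven backup plan is one repeatable pattern. Below is a hedged boto3 sketch using AWS Backup; the plan name, schedule, retention, vault, and IAM role ARN are assumptions.

```python
import boto3

backup = boto3.client("backup")

# A daily backup rule with 35-day retention (names, schedule, and vault are assumptions)
plan = backup.create_backup_plan(
    BackupPlan={
        "BackupPlanName": "landing-zone-daily-backups",
        "Rules": [{
            "RuleName": "daily-0300-utc",
            "TargetBackupVaultName": "Default",
            "ScheduleExpression": "cron(0 3 * * ? *)",
            "Lifecycle": {"DeleteAfterDays": 35},
        }],
    }
)

# Assign resources to the plan by tag, so anything tagged Backup=true is covered
backup.create_backup_selection(
    BackupPlanId=plan["BackupPlanId"],
    BackupSelection={
        "SelectionName": "tagged-resources",
        "IamRoleArn": "arn:aws:iam::111111111111:role/service-role/AWSBackupDefaultServiceRole",
        "ListOfTags": [{
            "ConditionType": "STRINGEQUALS",
            "ConditionKey": "Backup",
            "ConditionValue": "true",
        }],
    },
)
```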

Our Experience with Cloud Landing Zone:

We at Powerup ensure a seamless migration to the cloud by using trusted, proven cloud migration tools, integrating with existing operational capabilities, and leveraging the most powerful and best-integrated tooling available for each platform.

Many of our customers use the landing zone concept. In one such example, the customer wanted to set up separate AWS accounts to meet the different needs of their organization. Although multiple accounts simplify operational issues and provide isolation based on functionality, it takes manual effort to configure the baseline security practices in each one. To save time and effort when creating new accounts, we use the Account Vending Machine (AVM), a key component of AWS Landing Zone. The AVM is provided as an AWS Service Catalog product that allows customers to create new AWS accounts pre-configured with an account security baseline; monitoring, logging, security, and compliance are set up during account creation. This helps customers reduce infrastructure setup and operations costs, with minimal effort needed to stand up the infrastructure.
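
To show what launching the AVM looks like in practice, the sketch below provisions the Service Catalog product with boto3; the product id, provisioning artifact id, and parameter keys are placeholders for whatever the AVM product in a given portfolio actually expects.

```python
import boto3

sc = boto3.client("servicecatalog")

# Launch the account-vending product for a new team account
# (ids and parameter keys are placeholders)
sc.provision_product(
    ProductId="prod-xxxxxxxxxxxxx",
    ProvisioningArtifactId="pa-xxxxxxxxxxxxx",
    ProvisionedProductName="dev-team-account",
    ProvisioningParameters=[
        {"Key": "AccountName", "Value": "dev-team"},
        {"Key": "AccountEmail", "Value": "dev-team@example.com"},
        {"Key": "OrgUnitName", "Value": "Development"},
    ],
)
```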