Why hybrid is the preferred strategy for all your cloud needs


Written by Kiran Kumar, Business analyst at Powerupcloud Technologies.

While the public cloud is a globally accepted and proven solution for CIOs and CTOs looking for a more agile, scalable, and versatile IT environment, questions remain about security, reliability, and the cloud readiness of enterprises, and a full migration to a cloud-native organization takes considerable time and resources. This is especially challenging for start-ups, for whom working with these uncertainties is too much of a risk. It demands a solution that is low-risk and inexpensive enough to draw them out of the comfort of their existing on-prem infrastructure.

In such cases, a hybrid cloud is the best approach, providing you with the best of both worlds while keeping pace with your performance and compliance needs within the comfort of your own datacenter.

So what is a hybrid cloud?

Hybrid cloud delivers a seamless computing experience to organizations by combining the power of the public and private cloud and allowing data and applications to be shared between them. It gives enterprises the ability to easily scale their on-premises infrastructure up to the public cloud to handle any fluctuations in workload, without giving third-party datacenters access to the entirety of their data. Recognizing these benefits, organizations around the world have streamlined their offerings to integrate smoothly into hybrid infrastructures. However, an enterprise has no direct control over the architecture of a public cloud, so for a hybrid deployment, enterprises must architect their private cloud to achieve a consistent hybrid experience with the desired public cloud or clouds.

In a 2019 survey of 2,650 IT decision-makers from around the world, respondents reported steady and substantial hybrid deployment plans over the next five years. More than 80% of respondents selected hybrid cloud as their ideal IT operating model, more than half cited hybrid cloud as the model that meets all of their needs, and more than 60% stated that data security is the biggest influencer.

Respondents also felt that having the flexibility to match the right cloud to each application showcases the level of adaptability enterprises can work with in a hybrid multi-cloud environment.

Banking is one of the industries poised to embrace the full benefits of hybrid cloud: because of how the industry operates, it requires a unique mix of services and an infrastructure that is both easily accessible and affordable.

In a recent IBM survey:

  • 50 percent of banking executives say they believe the hybrid cloud can lower their cost of IT ownership 
  • 47 percent of banking executives say they believe hybrid cloud can improve operating margin 
  • 47 percent of banking executives say they believe hybrid cloud can accelerate innovation

Hybrid adoption – best practices and guidelines 

According to the report, some of the biggest challenges in cloud adoption include security, talent, and costs. Hybrid computing has shown that it can reduce security challenges and manage risk by keeping the most important digital assets and data on-prem. Private clouds are still considered an appropriate solution for hosting and managing sensitive data and applications, and enterprises still need the means to support their conventional enterprise computing models. A sizeable number of businesses still have substantial on-premises assets comprising archaic technology, sensitive collections of data, and tightly coupled legacy apps that cannot easily be moved or swapped for public cloud.

Here are some of the guidelines for hybrid adoption.

Have a cloud deployment model for applications and data

Deployment models describe which cloud resources and applications should be deployed and where. Hence it is crucial to understand the two-paced system, i.e., steady-paced and fast-paced systems, to determine the deployment models.

A steady-paced system must continue to support the traditional enterprise applications on-prem to keep the business running and maintain the current on-premises services. Additionally, off-premises services, such as private dedicated IaaS, can be used to increase infrastructure flexibility for enterprise services.

A fast-paced system is required to satisfy more spontaneous needs, such as delivering applications and services quickly, whether that means scaling existing services to meet spikes in demand or providing new applications rapidly to meet an immediate business need.

The next step is determining where applications and data must reside.

The placement of applications and datasets on private, public, or on-prem infrastructure is crucial, since IT architects must assess the right application architecture to achieve maximum benefit. This includes understanding application workload characteristics and determining the right deployment model for multi-tier applications.

Create heterogeneous environments 

To achieve maximum benefit from a hybrid strategy, the enterprise must leverage its existing in-house investments together with cloud services by integrating them efficiently. As new cloud services are deployed, integrating the applications running on them with the various on-premises applications and systems becomes important.

Integration between applications typically includes 

  • Process (or control) integration, where an application invokes another one in order to execute a certain workflow. 
  • Data integration, where applications share common data, or one application’s output becomes another application’s input. 
  • Presentation integration, where multiple applications present their results simultaneously to a user through a dashboard or mashup.

To obtain a seamless integration between heterogeneous environments, the following actions are necessary:

  • The cloud service provider must support open-source technologies for admin and business interfaces.
  • Examine the compatibility of in-house systems to work with cloud service providers, and ensure that on-premises applications follow SOA design principles and can utilize and expose APIs to enable interoperability with private or public cloud services.
  • Leverage third-party identity and access management functionality to authenticate and authorize access to cloud services, and put suitable API management capabilities in place to prevent unauthorized access.

Network security requirements 

Network type – the technology used for the physical connection over the WAN depends on aspects like bandwidth, latency, service levels, and cost. Hybrid cloud solutions can rely on point-to-point links as well as the Internet to connect on-premises data centers and cloud providers. The choice of connectivity type depends on an analysis of aspects like performance, availability, and the type of workloads.

Security – the connectivity domain needs to be evaluated and understood in order to match the cloud provider's network security standards with the overall network security policies, guidelines, and compliance requirements. Encrypting and authenticating traffic on the WAN can be handled at the application level, aspects such as the computing resources and applications involved must be considered, and technologies such as VPNs can be employed to provide secure connections between components running in different environments.

Web application security and management services such as DNS and DDoS protection, which are available in the cloud, can free up the dedicated resources an enterprise would otherwise need to procure, set up, and maintain such services, letting it concentrate on business applications instead. This is especially applicable to hybrid cloud workloads that have components deployed into a cloud service and exposed to the Internet. Systems deployed on-premises need to adapt to work with the cloud, to facilitate problem identification across multiple systems with different governance boundaries.

Security and privacy challenges & counter-measures

Hybrid cloud computing has to coordinate applications and services spanning various environments, which involves the movement of applications and data between those environments. Security protocols need to be applied consistently across the whole system, and additional risks must be addressed with suitable controls to account for the loss of control over any assets and data placed into a cloud provider's systems. Despite this inherent loss of control, enterprises still need to take responsibility for their use of cloud computing services to maintain situational awareness, weigh alternatives, set priorities, and effect changes in security and privacy that are in the best interest of the organization.

  • A single, uniform interface must be used to curtail the risks arising from using services from various cloud providers, since each provider is likely to have its own set of security and privacy characteristics.
  • Authentication and authorization: in a hybrid environment, gaining access to the public cloud environment could lead to access to the on-premises environment.
  • Compliance must be checked between the cloud providers used and in-house systems.

Counter-measures

  • A single identity and access management (IAM) system should be used.
  • Networking facilities such as VPN are recommended between the on-premises environment and the cloud.  
  • Encryption needs to be in place for all sensitive data, wherever it is located.
  • Firewalls, DDoS attack handling, etc., need to be coordinated across all environments with external interfaces.

Set up an appropriate data backup & DR plan

As already discussed, a hybrid environment gives organizations the option to work with multiple clouds, thus offering business continuity, which has always been one of the most important aspects of business operations. It is not just a simple data backup to the cloud or a disaster recovery plan; it means that when a disaster or failure occurs, data is still accessible with little to no downtime. This is measured in terms of time to restart (RTO: recovery time objective) and maximum data loss allowed (RPO: recovery point objective).

Therefore a business continuity solution has to be planned considering key elements such as resilience, time to restart (RTO), and maximum data loss allowed (RPO), as agreed upon with the cloud provider.

Here are some of the challenges encountered while making a DR plan 

  • Although the RTO and RPO values give a general idea of the outcome, they cannot be relied on fully, so the time required to restart operations may be longer than expected.
  • As systems come back up and become operational, there will be a sudden burst of requests for resources, which is more apparent in large-scale disasters.
  • Selecting the right CSP is crucial, as most cloud providers do not provide DR as a managed service; instead, they provide the basic infrastructure to enable your own DRaaS.

Hence enterprises have to be clear in selecting the DR strategy that best suits their IT infrastructure. This is crucial in providing mobility to the business, making it easily accessible from anywhere in the world, and in insuring data against natural disasters or technical failures by minimizing downtime and the costs associated with such an event.

How are leading providers like AWS, Azure and Google Cloud adapting to this changing landscape?

Google Anthos

In early 2019, Google came up with Anthos, one of the first multi-cloud solutions from a mainstream cloud provider. Anthos is an open application modernization platform that enables you to modernize your existing applications, build new ones, and run them anywhere. It is built on open source: Kubernetes serves as its central command and control center, Istio enables federated network management across the platform, and Knative provides an open API and runtime environment that lets you run your serverless workloads anywhere you choose. Anthos enables consistency between on-premises and cloud environments, helps accelerate application development, and strategically enables your business with transformational technologies.

AWS Outposts

AWS Outposts is a fully managed service that extends the same AWS hardware infrastructure, services, APIs, and tools to build and run your applications on-premises and in the cloud for a truly consistent hybrid experience. AWS compute, storage, database, and other services run locally on Outposts, and you can access the full range of AWS services available in the Region to build, manage, and scale your on-premises applications using familiar AWS services and tools across your on-premises and cloud environments. Your Outposts infrastructure and AWS services are managed, monitored, and updated by AWS just like in the cloud.

Azure Stack

Azure Stack is a hybrid solution from Azure, built and distributed by approved hardware vendors (like Dell, Lenovo, HPE, etc.) that brings the Azure cloud into your on-prem data center. It is a fully managed offering where the hardware is managed by the certified vendors and the software is managed by Microsoft Azure. Using Azure Stack you can extend Azure technology anywhere, from the datacenter to edge locations and remote offices, enabling you to build, deploy, and run hybrid and edge computing apps consistently across your IT ecosystem, with flexibility for diverse workloads.

How Powerup approaches Hybrid cloud for its customers 

Powerup is one of the few companies in the world to have achieved launch-partner status with AWS Outposts, with experience across more than 200 projects in various verticals and top-tier certified expertise in all three major cloud providers in the market. We can bring an agile, secure, and seamless hybrid experience to the table. Outposts is a fully managed service, so it eliminates the hassle of managing an on-prem data center and lets enterprises concentrate on optimizing their infrastructure.

Reference Material

Practical Guide to Hybrid Cloud Computing

Multicast in AWS using AWS Transit Gateway


Written by Aparna M, Associate Solutions Architect at Powerupcloud Technologies.

Multicast is a communication protocol used for delivering a single stream of data to multiple receiving computers simultaneously.

Now AWS Transit Gateway multicast makes it easy for customers to build multicast applications in the cloud and distribute data across thousands of connected Virtual Private Cloud networks. Multicast delivers a single stream of data to many users simultaneously. It is a preferred protocol to stream multimedia content and subscription data such as news articles and stock quotes, to a group of subscribers.

Now let’s understand the key concepts of Multicast:

  1. Multicast domain – Multicast domain allows the segmentation of a multicast network into different domains and makes the transit gateway act as multiple multicast routers. This is defined at the subnet level.
  2. Multicast Group – A multicast group is used to identify a set of sources and receivers that will send and receive the same multicast traffic. It is identified by a group IP address.
  3. Multicast source – An elastic network interface associated with a supported EC2 instance that sends multicast traffic.
  4. Multicast group member – An elastic network interface associated with a supported EC2 instance that receives multicast traffic. A multicast group has multiple group members.

Key Considerations for setting up Multicast in AWS:

  • Create a new transit gateway to enable multicast
  • You cannot share multicast-enabled transit gateways with other accounts
  • Internet Group Management Protocol (IGMP) support for managing group membership is not supported right now
  • A subnet can only be in one multicast domain.
  • If you use a non-Nitro instance, you must disable the Source/Dest check. 
  • A non-Nitro instance cannot be a multicast sender.

Let’s walkthrough how to set up multicast via AWS Console.

Create a Transit gateway for multicast:

To create a multicast-enabled transit gateway, follow the steps below:

  1. Open the Amazon VPC console at https://console.aws.amazon.com/vpc/.
  2. On the navigation pane, choose Create Transit Gateway.
  3. For Name tag, enter a name to identify the Transit gateway.
  4. Enable Multicast support.
  5. Choose Create Transit Gateway.

Create a Transit Gateway Multicast Domain

  1. On the navigation pane, choose the Transit Gateway Multicast.
  2. Choose Create Transit Gateway Multicast domain.
  3. (Optional) For Name tag, enter a name to identify the domain.
  4. For Transit Gateway ID, select the transit gateway that processes the multicast traffic.
  5. Choose Create Transit Gateway multicast domain.

Associate VPC Attachments and Subnets with a Transit Gateway Multicast Domain

To associate VPC attachments with a transit gateway multicast domain using the console

  1. On the navigation pane, choose Transit Gateway Multicast.
  2. Select the transit gateway multicast domain, and then choose Actions, Create association.
  3. For Transit Gateway ID, select the transit gateway attachment.
  4. For Choose subnets to associate, select the subnets to include in the domain.
  5. Choose Create association.

Register Sources with a Multicast Group

In order to register sources for transit gateway multicast:

  1. On the navigation pane, choose Transit Gateway Multicast.
  2. Select the transit gateway multicast domain, and then choose Actions, Add group sources.
  3. For Group IP address, enter either the IPv4 CIDR block or IPv6 CIDR block to assign to the multicast domain. The IP range must be within 224.0.0.0/4.
  4. Under Choose network interfaces, select the multicast sender's (EC2 server) network interfaces.
  5. Choose Add sources.

Register Members with a Multicast Group

To register members in the transit gateway multicast:

  1. On the navigation pane, choose Transit Gateway Multicast.
  2. Select the transit gateway multicast domain, and then choose Actions, Add group members.
  3. For Group IP address, enter either the IPv4 CIDR block or IPv6 CIDR block to assign to the multicast domain. Specify the same multicast group IP that was specified while adding the sources.
  4. Under Choose network interfaces, select the multicast receivers' (EC2 servers) network interfaces.
  5. Choose Add members. (An SDK sketch covering these setup steps follows below.)

Modify the security groups of the member servers (receivers):

  1. Allow inbound traffic on Custom UDP port 5001
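Before moving on to testing, if you prefer to script the transit gateway side of this setup rather than click through the console, the sketch below shows the equivalent boto3 calls. It assumes a VPC attachment already exists, and the attachment, subnet, and ENI IDs are placeholders rather than values from this walkthrough.

import boto3

ec2 = boto3.client('ec2')

# 1. Transit gateway with multicast support (must be enabled at creation time).
tgw = ec2.create_transit_gateway(
    Description='multicast-demo',
    Options={'MulticastSupport': 'enable'}
)['TransitGateway']

# 2. Multicast domain on that transit gateway (wait until the gateway is 'available').
domain = ec2.create_transit_gateway_multicast_domain(
    TransitGatewayId=tgw['TransitGatewayId']
)['TransitGatewayMulticastDomain']
domain_id = domain['TransitGatewayMulticastDomainId']

# 3. Associate an existing VPC attachment and its subnets with the domain.
ec2.associate_transit_gateway_multicast_domain(
    TransitGatewayMulticastDomainId=domain_id,
    TransitGatewayAttachmentId='tgw-attach-0123456789abcdef0',   # placeholder
    SubnetIds=['subnet-0123456789abcdef0']                       # placeholder
)

# 4. Register the sender and receiver ENIs against the multicast group IP.
ec2.register_transit_gateway_multicast_group_sources(
    TransitGatewayMulticastDomainId=domain_id,
    GroupIpAddress='224.0.0.50',
    NetworkInterfaceIds=['eni-0aaaaaaaaaaaaaaaa'])               # sender ENI (placeholder)
ec2.register_transit_gateway_multicast_group_members(
    TransitGatewayMulticastDomainId=domain_id,
    GroupIpAddress='224.0.0.50',
    NetworkInterfaceIds=['eni-0bbbbbbbbbbbbbbbb'])               # receiver ENI (placeholder)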

Once your setup is complete, follow the steps below to test the multicast routing.

  1. Log in to all the source and member servers.
  2. Make sure the iperf package is installed on all the servers in order to test the functionality.
  3. Run the below command on all the member servers so that they listen on the multicast group (224.0.0.50 is the multicast group IP provided during the setup):

iperf -s -u -B 224.0.0.50 -i 1

  4. Run the below command on the source machine to start sending traffic to the group:

iperf -c 224.0.0.50 -u -T 32 -t 100 -i 1

Once you start sending data from the source server, it can be seen simultaneously across all the members. Below is the screenshot for your reference.

Conclusion

This blog showed how to host multicast applications on AWS leveraging AWS Transit Gateway. Hope you found it useful.

Text to Speech using Amazon Polly with React JS & Python


Written by Ishita Saha, Software Engineer, Powerupcloud Technologies

In this blog, we will discuss how we can integrate AWS Polly using Python & React JS to a chatbot application.

Use Case

We are developing a chatbot framework where we use AWS Polly for a rich and lively voice experience for our users.

Problem Statement

We are trying to showcase how we can integrate AWS Polly voice services with our existing chatbot application built on React JS & Python.

What is AWS Polly?

Amazon Polly is a service that turns text into lifelike speech. Amazon Polly enables existing applications to speak as a first-class feature and creates the opportunity for entirely new categories of speech-enabled products, from mobile apps and cars to devices and appliances. Amazon Polly includes dozens of lifelike voices and support for multiple languages, so you can select the ideal voice and distribute your speech-enabled applications in many geographies. Amazon Polly is easy to use – you just send the text you want converted into speech to the Amazon Polly API, and Amazon Polly immediately returns the audio stream to your application so you can play it directly or store it in a standard audio file format, such as MP3.

AWS Polly is easy to use. We only need an AWS subscription. We can test Polly directly from the AWS Console.

Go to :

https://console.aws.amazon.com/polly/home/SynthesizeSpeech

There is an option to select Voice from Different Languages & Regions.

Why Amazon Polly?

You can use Amazon Polly to power your application with high-quality spoken output. This cost-effective service has very low response times, and is available for virtually any use case, with no restrictions on storing and reusing generated speech.

Implementation

The user provides input to the chatbot. This input goes to our React JS frontend, which interacts internally with a Python application in the backend. This Python application is responsible for interacting with AWS Polly and sending the response back to the React app, which plays the audio streaming output as mp3.

React JS

In this implementation, we are using the Audio() constructor.

The Audio() constructor creates and returns a new HTMLAudioElement which can be either attached to a document for the user to interact with and/or listen to, or can be used offscreen to manage and play audio.

Syntax :

audio = new Audio(url);

Methods :

  • play – Make the media object play or resume after pausing.
  • pause – Pause the media object.
  • load – Reload the media object.
  • canPlayType – Determine if a media type can be played.
 
Here, we are using only play() and pause() methods in our implementation.

Step 1: We have to initialize variables in the state.

this.state = {
  audio: "",
  languageName: "",
  voiceName: ""
};

Step 2: Clean up the input by replacing slashes with spaces and removing line breaks.

response = response.replace(/\//g, " ");
response = response.replace(/(\r\n|\n|\r)/gm, "");

Step 3: If any existing reply from the bot is already playing, we stop it first.

if (this.state.audio) {
  this.state.audio.pause();
}

Step 4:

This method interacts with our Python application. It sends a request to our Python backend with the following parameters and creates a new Audio() object. We pass the following parameters dynamically to the handleSpeaker() method:

  • languageName
  • voiceName
  • inputText
handleSpeaker = inputText => {
  this.setState({
    audio: ""
  });
  this.setState({
    audio: new Audio(
      POLLY_API +
        "/texttospeech?LanguageCode=" +
        this.state.languageName +
        "&VoiceId=" +
        this.state.voiceName +
        "&OutputFormat=mp3" +
        "&Text=" +
        inputText
    )
  });
};

Step 5: On getting the response from our POLLY_API Python app, we will need to play the mp3 file.

this.state.audio.play();

Python

The Python application communicates with AWS Polly using AWS Python SDK – boto3.

Step 1: Now we will need to configure AWS credentials for accessing AWS Polly, using the secret key, access key & region.

import boto3

def connectToPolly(aws_access_key, aws_secret_key, region):
    # Build a boto3 session with explicit credentials and return a Polly client
    polly_client = boto3.Session(
        aws_access_key_id=aws_access_key,
        aws_secret_access_key=aws_secret_key,
        region_name=region).client('polly')

    return polly_client

Here, we are creating a polly client to access AWS Polly Services.

Step 2: We are using synthesize_speech() to get an audio stream file.

Request Syntax :

response = client.synthesize_speech(
    Engine='standard'|'neural',
    LanguageCode='arb'|'cmn-CN'|'cy-GB'|'da-DK'|'de-DE'|'en-AU'|'en-GB'|'en-GB-WLS'|'en-IN'|'en-US'|'es-ES'|'es-MX'|'es-US'|'fr-CA'|'fr-FR'|'is-IS'|'it-IT'|'ja-JP'|'hi-IN'|'ko-KR'|'nb-NO'|'nl-NL'|'pl-PL'|'pt-BR'|'pt-PT'|'ro-RO'|'ru-RU'|'sv-SE'|'tr-TR',
    OutputFormat='json'|'mp3'|'ogg_vorbis'|'pcm',
    Text='string',
    TextType='ssml'|'text',
    VoiceId='Aditi'|'Amy'|'Astrid'|'Bianca'|'Brian'|'Camila'|'Carla'|'Carmen'|'Celine'|'Chantal'|'Conchita'|'Cristiano'|'Dora'|'Emma'|'Enrique'|'Ewa'|'Filiz'|'Geraint'|'Giorgio'|'Gwyneth'|'Hans'|'Ines'|'Ivy'|'Jacek'|'Jan'|'Joanna'|'Joey'|'Justin'|'Karl'|'Kendra'|'Kimberly'|'Lea'|'Liv'|'Lotte'|'Lucia'|'Lupe'|'Mads'|'Maja'|'Marlene'|'Mathieu'|'Matthew'|'Maxim'|'Mia'|'Miguel'|'Mizuki'|'Naja'|'Nicole'|'Penelope'|'Raveena'|'Ricardo'|'Ruben'|'Russell'|'Salli'|'Seoyeon'|'Takumi'|'Tatyana'|'Vicki'|'Vitoria'|'Zeina'|'Zhiyu'
)

Response Syntax :

{
    'AudioStream': StreamingBody(),
    'ContentType': 'string',
    'RequestCharacters': 123
}
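As a quick illustration of how that AudioStream is consumed, here is a minimal standalone boto3 sketch (not part of the chatbot code) that synthesizes a short phrase and writes the returned StreamingBody to an mp3 file; the region, voice, and text are arbitrary examples.

import boto3

polly = boto3.client('polly', region_name='us-east-1')

result = polly.synthesize_speech(
    Text='Hello from Amazon Polly!',
    OutputFormat='mp3',
    VoiceId='Joanna'
)

# AudioStream is a StreamingBody; read it once and persist it as an mp3 file.
with open('speech.mp3', 'wb') as f:
    f.write(result['AudioStream'].read())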

We are calling the textToSpeech Flask API, which accepts the parameters sent by React and internally calls AWS Polly. The response is sent back to React as an mp3 stream, and the React application then plays the audio for the user.

# Flask wiring shown for completeness; 'credentials' is the module holding connectToPolly() above
from flask import Flask, request, send_file
import credentials

app = Flask(__name__)
AUDIO_FORMATS = {'mp3': 'audio/mpeg'}  # maps the output format to a MIME type for send_file

# Placeholder credentials; in practice load these from configuration or the environment
aws_access_key = "xxxxxx"
aws_secret_key = "xxxxxx"
region = "xxxxxx"

@app.route('/textToSpeech', methods=['GET'])
def textToSpeech():
    languageCode = request.args.get('LanguageCode')
    voiceId = request.args.get('VoiceId')
    outputFormat = request.args.get('OutputFormat')
    text = request.args.get('Text')
    polly_client = credentials.connectToPolly(aws_access_key, aws_secret_key, region)
    response = polly_client.synthesize_speech(Text="<speak>" + text + "</speak>",
        LanguageCode=languageCode,
        VoiceId=voiceId,
        OutputFormat=outputFormat,
        TextType='ssml')
    return send_file(response.get("AudioStream"),
                     mimetype=AUDIO_FORMATS['mp3'])

Conclusion

This blog showcases the simple implementation of React JS integration with Python to utilize AWS Polly services. This can be used as a reference for such use cases with chatbots.

AWS WorkSpaces Implementation


Written by Arun Kumar, Associate Cloud Architect at Powerupcloud Technologies

Introduction

Amazon WorkSpaces is a managed and secure Desktop-as-a-Service (DaaS) offering from the AWS cloud. WorkSpaces eliminates the need for provisioning hardware and software configurations, making it an easy task for IT admins to provision managed desktops on the cloud. End users can access their virtual desktop from any device or browser, including Windows, Linux, iPad, and Android. Managing corporate applications for end users becomes easier using WAM (WorkSpaces Application Manager) or by integrating with existing solutions like SCCM, WSUS, and more.

To manage end users and provide them access to WorkSpaces, the solutions below can be leveraged with AWS.

  • Extending the existing on-premises Active Directory by using AD Connector in AWS.
  • Create & configure AWS managed Simple AD or Microsoft Active Directory based on size of the organization.

WorkSpaces architecture with simple AD approach

In this architecture, WorkSpaces are deployed for Windows and Linux virtual desktops; both are associated with the VPC and the directory service (Simple AD) that stores and manages information about users and WorkSpaces.

The above architecture describes the flow of end users accessing Amazon WorkSpaces using Simple AD, which authenticates users. Users access their WorkSpaces by using a client application from a supported device or a web browser, and they log in using their directory credentials. The login information is sent to an authentication gateway, which forwards the traffic to the directory for the WorkSpace. Once the user is authenticated, streaming traffic is processed through the streaming gateway, which works over the PCoIP protocol to give end users the complete desktop experience.

Prerequisites

To use WorkSpaces, the following requirements need to be met.

  • A directory service to authenticate users and provide access to their WorkSpace.
  • The WorkSpaces client application is based on the user’s device and requires an Internet connection.

For this demo we have created a Simple AD; this can be created from the WorkSpaces console.

Directory

  • Create the Simple AD.
  • Choose the directory size based on your organization's size.
  • Enter the fully qualified domain name and the administrator password, and make a note of the admin password somewhere for reference.
  • We'll need a minimum of two subnets created for the AWS Directory Service, which requires a Multi-AZ deployment.
  • The directory is now created. (For teams automating this step, a minimal SDK sketch follows below.)
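The following boto3 sketch shows the equivalent AWS Directory Service call for the console steps above; the domain name, password, VPC ID, and subnet IDs are placeholders you would replace with your own values.

import boto3

ds = boto3.client('ds')

response = ds.create_directory(
    Name='corp.example.com',                   # fully qualified domain name (placeholder)
    ShortName='CORP',
    Password='REPLACE_WITH_ADMIN_PASSWORD',    # keep a note of this admin password
    Description='Simple AD for Amazon WorkSpaces demo',
    Size='Small',                              # 'Small' or 'Large', based on organization size
    VpcSettings={
        'VpcId': 'vpc-0123456789abcdef0',      # placeholder
        'SubnetIds': [                         # two subnets in different AZs (Multi-AZ requirement)
            'subnet-0123456789abcdef0',
            'subnet-0fedcba9876543210'
        ]
    }
)
print('DirectoryId:', response['DirectoryId'])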

WorkSpace

Now let’s create the WorkSpace for employees.

  • Select the directory in which you need to create WorkSpaces for user access.
  • Select the appropriate subnets that we created in the previous section to provision the WorkSpaces in a Multi-AZ deployment.
  • Ensure that self-service permissions are always set to "NO"; otherwise users will have the privilege to change WorkSpaces configurations on the fly without the WorkSpaces admin's knowledge.
  • Enable WorkDocs based on the user's requirement.
  • You can select the user from the directory list, or a new user can be created on the fly.
  • Select the bundle of compute, operating system, and storage for each of your users.

You can select the running mode of the WorkSpaces based on your company's needs. This directly impacts the monthly bill: the "Always-On" mode has fixed pricing, whereas the "AutoStop" mode is an on-demand pricing model. Ensure the right running mode is selected during WorkSpaces creation based on the business requirements of the user (see the SDK sketch after the list below).

  • Review and launch your WorkSpace.
  • Your WorkSpace is now up and running. Once it is available and ready to use, you will receive an email from Amazon with the WorkSpaces login details.
  • Select the URL in the email to create a password for your user to access the WorkSpace.
  • Download the client based on your device, or use the web login.
  • Install the WorkSpaces client on your local machine.
  • Open the WorkSpaces client and enter the registration code which you received in the email.
  • It prompts for the username and password.
  • You are now logged in to your virtual desktop.
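The same provisioning can be scripted with the WorkSpaces API. The sketch below is a minimal boto3 equivalent of the console steps above; the directory ID, bundle ID, and user name are placeholders, and the running mode is set to AutoStop as discussed earlier.

import boto3

workspaces = boto3.client('workspaces')

response = workspaces.create_workspaces(
    Workspaces=[
        {
            'DirectoryId': 'd-1234567890',                    # placeholder directory ID
            'UserName': 'ravi.bharti',                        # placeholder directory user
            'BundleId': 'wsb-0123456789',                     # placeholder bundle (compute/OS/storage)
            'WorkspaceProperties': {
                'RunningMode': 'AUTO_STOP',                   # or 'ALWAYS_ON' for fixed monthly pricing
                'RunningModeAutoStopTimeoutInMinutes': 60     # must be a multiple of 60
            },
            'RootVolumeEncryptionEnabled': False,             # set True (with a KMS key) to encrypt at rest
            'UserVolumeEncryptionEnabled': False
        }
    ]
)

# Requests that could not be fulfilled are reported here rather than raised as exceptions.
for failed in response.get('FailedRequests', []):
    print(failed['ErrorCode'], failed['ErrorMessage'])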

Security and compliance of WorkSpace

  • Encryption in transit by default.
  • KMS can be used to encrypt our data at rest.
  • IP-based restrictions.
  • Multi-factor authentication (RADIUS).
  • PCI DSS Level 1 compliant.
  • HIPAA-eligible with a business associate agreement.
  • Certifications: ISO 9001 and ISO 27001.

Cost

  • No upfront payment.
  • On-demand pricing (AutoStop mode) – in this model, when the user is not using the virtual desktop, the WorkSpace is automatically stopped based on the AutoStop hours selected for the user.
  • Fixed pricing (Always-On mode) – in this model the WorkSpace virtual desktop cost is calculated on a fixed monthly basis, based on the selected bundle.

Licensing

  • Built-in licensing, which allows us to select the right Windows bundle as per business needs.
  • WorkSpaces additionally supports the BYOL (bring your own license) model for Windows 10.

Monitoring

  • CloudTrail can monitor the API calls.
  • CloudWatch monitoring can show the number of users connected to WorkSpaces, the latency of the sessions, and more.

Additional features

  • API support (SDK, AWS CLI).
  • WorkSpaces Application Manager (WAM).
  • Custom images.
  • Audio input.
  • Pre-built applications in the AWS Marketplace that we can add to our WorkSpaces.
  • User control at the directory level.
  • Integration with WorkDocs.

Conclusion

By adopting AWS WorkSpaces we can enable end users to securely access the business applications and documents that they currently use within their organization's devices or existing VDI solutions, and experience seamless desktop performance on the cloud. WorkSpaces can also be accessed in a highly secure way that helps prevent data breaches, by enabling the encryption options and restricting client devices for users.

Other benefits include reducing the overhead of maintaining existing hardware and purchasing new hardware. Monitoring and managing the end-user WorkSpaces becomes an easy task by integrating with AWS native services.

Copying objects using AWS Lambda based on S3 events – Part 2 – date partition


Written by Tejaswee Das, Software Engineer, Powerupcloud Technologies

Introduction

If you are coming here from the first part of this series on S3 events with AWS Lambda, you will find the more complex S3 object keys that we will be handling here.

If you are new here, you may want to visit the first part, which covers the basics and the steps for creating your Lambda function and configuring S3 event triggers.

You can find link to part 1 here :

Use Case

This is a similar use case where we copy new files to a different location (bucket/path) while preserving the hierarchy; in addition, we will partition the files according to their file names and store them in a date-partitioned structure.

Problem Statement

Our tech lead suggested a change in the application logic, so now the same application writes files to the S3 bucket in a different fashion. The activity file for Ravi Bharti is written to source-bucket-006/RaviRanjanKumarBharti/20200406-1436246999.parquet.

Haha! Say our manager wants to check the activity files of Ravi Bharti date-wise, hour-wise, minute-wise, and... no, not seconds, we can skip that!

So we need to store them in our destination bucket as:

  • destination-test-bucket-006/RaviRanjanKumarBharti/2020-04-06/20200406-1436246999.parquet — Date wise
  • destination-test-bucket-006/RaviRanjanKumarBharti/2020-04-06/14/20200406-1436246999.parquet — Hour wise
  • destination-test-bucket-006/RaviRanjanKumarBharti/2020-04-06/14/36/20200406-1436246999.parquet — Hour/Min wise

Tree:

source-bucket-006
| - AjayMuralidhar
| - GopinathP
| - IshitaSaha
| - RachanaSharma
| - RaviRanjanKumarBharti
		| - 20200406-1436246999.parquet
| - Sagar Gupta
| - SiddhantPathak

Solution

Our problem is not that complex; a good quick play with split and join on strings should solve it. You can choose any programming language for this, but we will continue using Python and the AWS Python SDK, boto3.

Python Script

Everything remains the same; we will just need to change our script as per our sub-requirements. We will make use of the event dictionary to get the file name & path of the uploaded object.

source_bucket_name = event['Records'][0]['s3']['bucket']['name']

file_key_name = event['Records'][0]['s3']['object']['key']

  • destination-test-bucket-006/RaviRanjanKumarBharti/2020-04-06/20200406-1436246999.parquet

Format: source_file_path/YYYY-MM-DD/file.parquet

You can be lazy and hard-code it for a quick test:

file_key_name = "RaviRanjanKumarBharti/20200406-1436246999.parquet"

Splitting file_key_name with '/' to extract the Employee (folder name) & filename:

file_root_dir_struct = file_key_name.split('/')[0]

file_path_struct = file_key_name.split('/')[1]

Splitting the filename with '-' to extract the date part:

date_file_path_struct = file_path_struct.split('-')[0]

Since we know the string format will always be the same, we can slice it by position and concatenate:

YYYY - MM - DD
string[:4] - string[4:6] - string[6:8]

date_partition_path_struct = date_file_path_struct[:4] + "-" + date_file_path_struct[4:6] + "-" + date_file_path_struct[6:8]

Since Python is all about one-liners, we will also solve this using a list comprehension:

n_split = [4, 2, 2]

date_partition_path_struct = "-".join([date_file_path_struct[sum(n_split[:i]):sum(n_split[:i+1])] for i in range(len(n_split))])

We get date_partition_path_struct as '2020-04-06'.

  • destination-test-bucket-006/RaviRanjanKumarBharti/2020-04-06/14/20200406-1436246999.parquet

time_file_path_struct = file_key_name.split('/')[1]

We will further need to split this to separate out the file extension. Using the same variable for simplicity:

time_file_path_struct = file_key_name.split('/')[1].split('-')[1].split('.')[0]


This gives us time_file_path_struct  as '1436246999'


hour_time_file_path_struct = time_file_path_struct[:2]
  • destination-test-bucket-006/RaviRanjanKumarBharti/2020-04-06/14/36/20200406-1436246999.parquet

Similarly for minute

min_time_file_path_struct = time_file_path_struct[2:4]

# Complete Code

import json
import boto3

# boto3 S3 initialization
s3_client = boto3.client("s3")


def lambda_handler(event, context):
  destination_bucket_name = 'destination-test-bucket-006'

  source_bucket_name = event['Records'][0]['s3']['bucket']['name']

  file_key_name = event['Records'][0]['s3']['object']['key']

  #Split file_key_name with ‘ / ’ to extract Employee & filename
  file_root_dir_struct = file_key_name.split('/')[0]

  file_path_struct = file_key_name.split('/')[1]

  # Split filename with ‘-’ to extract date & time
  date_file_path_struct = file_path_struct.split('-')[0]

  # Date Partition Lazy Solution

  # date_partition_path_struct = date_file_path_struct[:4] + "-" + date_file_path_struct[4:6] + "-" + date_file_path_struct[6:8]

  # Date Partition using List Comprehension

  n_split = [4, 2, 2]

  date_partition_path_struct = "-".join([date_file_path_struct[sum(n_split[:i]):sum(n_split[:i+1])] for i in range(len(n_split))])

  # Split to get time part
  time_file_path_split = file_key_name.split('/')[1]

  # Time Partition
  time_file_path_struct = time_file_path_split.split('-')[1].split('.')[0]

  # Hour Partition
  hour_time_file_path_struct = time_file_path_struct[:2]

  # Minute Partition
  min_time_file_path_struct = time_file_path_struct[2:4]

  # Concat all required strings to form destination path || date partition
  # destination_file_path = file_root_dir_struct + "/" \
  #  + date_partition_path_struct + "/" + file_path_struct

  # # Concat all required strings to form destination path || hour partition
  # destination_file_path = file_root_dir_struct + "/" + date_partition_path_struct + "/" + \
  #                         hour_time_file_path_struct + "/" + file_path_struct

  # Concat all required strings to form destination path || minute partition
  # (keep exactly one of these destination_file_path variants uncommented)
  destination_file_path = file_root_dir_struct + "/" + date_partition_path_struct + "/" + \
                          hour_time_file_path_struct + "/" + min_time_file_path_struct + "/" + file_path_struct

  # Copy Source Object
  copy_source_object = {'Bucket': source_bucket_name, 'Key': file_key_name}

  # S3 copy object operation
  s3_client.copy_object(CopySource=copy_source_object, Bucket=destination_bucket_name, Key=destination_file_path)

  return {
      'statusCode': 200,
      'body': json.dumps('Hello from S3 events Lambda!')
  }

You can test your implementation by uploading a file in any folders of your source bucket, and then check your destination bucket of the respective Employee.

source-test-bucket-006

destination-test-bucket-006

Conclusion

This helped us solve one of the most common use cases in data migration: storing files in a partitioned structure for better readability.

Hope this two-part blog series was useful for understanding how we can use AWS Lambda to process your S3 objects based on event triggers.

Do leave your comments. Happy reading.

References

https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html

https://docs.aws.amazon.com/lambda/latest/dg/with-s3.html

https://stackoverflow.com/questions/44648145/split-the-string-into-different-lengths-chunks

Tags: Amazon S3, AWS Lambda, S3 events, Python, Boto3, S3 Triggers, Lambda Trigger, S3 copy objects, date-partitioned, time-partitioned

Copying objects using AWS Lambda based on S3 events – Part 1


Written by Tejaswee Das, Software Engineer, Powerupcloud Technologies

Introduction

In this era of cloud, data is always on the move. It is imperative for anyone dealing with moving data to have heard of Amazon's Simple Storage Service, popularly known as S3. As the name suggests, it is a simple file storage service where we can upload or remove files, better referred to as objects. It is very flexible storage, and it takes care of scalability, security, performance, and availability. So this is something that comes in very handy for a lot of applications and use cases.

The next best thing we use here is AWS Lambda, the new world of serverless computing. You will be able to run your workloads easily using Lambda without bothering about provisioning any resources; Lambda takes care of it all.

Advantages

S3, as we already know, is object-based storage that is highly scalable and efficient. We can use it as a data source or even as a destination for various applications. AWS Lambda, being serverless, allows us to run anything without thinking about any underlying infrastructure. So you can use Lambda for a lot of your processing jobs, or even for simply communicating with any of your AWS resources.

Use Case

Copying new files to a different location (bucket/path) while preserving the hierarchy. We will use the AWS Python SDK to solve this.

Problem Statement

Say we have an application writing files to an S3 bucket path every time an employee updates his/her tasks at any time of the day during working hours.

For example, the work activity of Ajay Muralidhar for 6th April 2020, at 12:00 PM, will be stored in source-bucket-006/AjayMuralidhar/2020-04-06/12/my-task.txt. Refer to the tree for more clarity. We need to move these task files to a new bucket while preserving the file hierarchy.

Solution

For solving this problem, we will use Amazon S3 events. Every file pushed to the source bucket will be an event; this needs to trigger a Lambda function which can then process the file and move it to the destination bucket.

1. Creating a Lambda Function

1.1 Go to the AWS Lambda Console and click on Create Function

1.2 Select an Execution Role for your Function

This is important because it ensures that your Lambda has access to your source & destination buckets. Either you can use an existing role that already has access to the S3 buckets, or you can choose to create a new execution role. If you choose the latter, you will need to attach S3 permissions to your role.

1.2.1 Optional – S3 Permission for new execution role

Go to Basic settings in your Lambda function; you will find this when you scroll down the function page. Click Edit. You can edit your Lambda runtime settings here, like Timeout, which has a maximum of 15 minutes. This is the time for which your Lambda can run, so it is advisable to set it as per your job requirement. Any time you get a "Lambda timed out" error, you can increase this value.

Or you can also check the Permissions section for the role.

Click on the View the <your-function-name>-role-<xyzabcd> role link; this takes you to the IAM console. Click on Attach policies. You can also create an inline policy if you need more control over the access you are providing; for example, you can restrict it to particular buckets. For ease of demonstration, we are using AmazonS3FullAccess here.

Select AmazonS3FullAccess, then click on Attach policy.

Once the policy is successfully attached to your role, you can go back to your Lambda Function.

2. Setting S3 Event Trigger

2.1 Under Designer tab, Click on Add trigger

2.2 From the Trigger List dropdown, select S3 events

Select your source bucket. There are various event types you can choose from.

Find out more about S3 events here, https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html#notification-how-to-event-types-and-destinations

We are using PUT since we want this event to trigger our Lambda when any new files are uploaded to our source bucket. You can add a Prefix & Suffix if you need to filter for particular types of files. Check Enable Trigger.
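The console steps above are all you need, but if you would rather codify this trigger, here is a hedged boto3 sketch of the same wiring; the Lambda function name, account ID, and Region in the ARN are placeholders, while the bucket name matches the example used in this post.

import boto3

s3 = boto3.client('s3')
lambda_client = boto3.client('lambda')

# Allow S3 to invoke the function (function name and ARNs are placeholders).
lambda_client.add_permission(
    FunctionName='copy-s3-objects',
    StatementId='AllowS3Invoke',
    Action='lambda:InvokeFunction',
    Principal='s3.amazonaws.com',
    SourceArn='arn:aws:s3:::source-test-bucket-006'
)

# Wire the PUT event on the source bucket to the Lambda function.
s3.put_bucket_notification_configuration(
    Bucket='source-test-bucket-006',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [
            {
                'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:111122223333:function:copy-s3-objects',
                'Events': ['s3:ObjectCreated:Put']
            }
        ]
    }
)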

Python Script

We now write a simple Python script which will pick up the incoming file from our source bucket and copy it to another location. The best thing about setting up the Lambda S3 trigger is that whenever a new file is uploaded, it will trigger our Lambda. We make use of the event object here to gather all the required information.

This is what a sample event object looks like. It is passed to your Lambda function.

{
   "Records":[
      {
         "eventVersion":"2.1",
         "eventSource":"aws:s3",
         "awsRegion":"xx-xxxx-x",
         "eventTime":"2020-04-08T19:36:34.075Z",
         "eventName":"ObjectCreated:Put",
         "userIdentity":{
            "principalId":"AWS:POWERUPCLOUD:powerup@powerupcloud.com"
         },
         "requestParameters":{
            "sourceIPAddress":"XXX.XX.XXX.XX"
         },
         "responseElements":{
            "x-amz-request-id":"POWERUPCLOUD",
            "x-amz-id-2":"POWERUPCLOUD/POWERUPCLOUD/POWERUPCLOUD/POWERUPCLOUD/POWERUPCLOUD/POWERUPCLOUD/POWERUPCLOUD/POWERUPCLOUD"
         },
         "s3":{
            "s3SchemaVersion":"1.0",
            "configurationId":"powerup24-powerup-powerup-powerup",
            "bucket":{
               "name":"source-test-bucket-006",
               "ownerIdentity":{
                  "principalId":"POWERUPCLOUD"
               },
               "arn":"arn:aws:s3:::source-test-bucket-006"
            },
            "object":{
               "key":"AjayMuralidhar/2020-04-06/12/my-tasks.txt",
               "size":20,
               "eTag":"1853ea0cebd1e10d791c9b2fcb8cc334",
               "sequencer":"005E8E27C31AEBFA2A"
            }
         }
      }
   ]
}

Your Lambda function makes use of this event dictionary to identify the location where the file is uploaded.

import json
import boto3

# boto3 S3 initialization
s3_client = boto3.client("s3")


def lambda_handler(event, context):
   destination_bucket_name = 'destination-test-bucket-006'

   # event contains all information about uploaded object
   print("Event :", event)

   # Bucket Name where file was uploaded
   source_bucket_name = event['Records'][0]['s3']['bucket']['name']

   # Filename of object (with path)
   file_key_name = event['Records'][0]['s3']['object']['key']

   # Copy Source Object
   copy_source_object = {'Bucket': source_bucket_name, 'Key': file_key_name}

   # S3 copy object operation
   s3_client.copy_object(CopySource=copy_source_object, Bucket=destination_bucket_name, Key=file_key_name)

   return {
       'statusCode': 200,
       'body': json.dumps('Hello from S3 events Lambda!')
   }

You can test your implementation by uploading a file in any folders of your source bucket, and then check your destination bucket for the same file.

source-test-bucket-006

destination-test-bucket-006

You can check your Lambda execution logs in CloudWatch. Go to Monitoring and click View Logs in CloudWatch

Congrats! We have solved our problem. Just before we conclude this blog, we would like to discuss an important feature of Lambda which will help you scale your jobs. What if your application is writing a huge number of files at the same time? Don't worry, Lambda will help you with this too. By default, Lambda provides 1,000 concurrent executions per Region across your account. If you need to scale up, you can increase this as per your business requirements.
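If you do want to manage that concurrency explicitly, the sketch below reserves a slice of the account pool for this function using boto3; the function name is a placeholder, and raising the overall account limit itself is a Service Quotas request rather than an API call on the function.

import boto3

lambda_client = boto3.client('lambda')

# Reserve (and cap) 100 concurrent executions for the copy function so a burst of
# S3 events cannot starve, or be starved by, other functions in the account.
lambda_client.put_function_concurrency(
    FunctionName='copy-s3-objects',            # placeholder function name
    ReservedConcurrentExecutions=100
)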

Conclusion

This is how easy it was to use S3 with Lambda to move files between buckets.

In Part 2 of this series, we will try to handle a bit more complex problem, where we will try to move files as date partitioned structures at our destination.

You can find link to part 2 here :

Hope this was helpful as an overview of the basics of using S3 event triggers with AWS Lambda. Do leave your comments. Happy reading.

References

https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html

https://docs.aws.amazon.com/lambda/latest/dg/with-s3.html

Tags: Amazon S3, AWS Lambda, S3 events, Python, Boto3, S3 Triggers, Lambda Trigger, S3 copy objects

Handling Asynchronous Workflow-Driven pipeline with AWS CodePipeline and AWS Lambda


Written by Praful Tamrakar, Senior Cloud Engineer, Powerupcloud Technologies

Most AWS customers use AWS Lambda widely for performing almost every task; it is an especially handy tool when it comes to customizing the way your pipeline works. If we are talking about pipelines, AWS Lambda is a service that can be directly integrated with AWS CodePipeline, and the combination of these two services makes it possible for AWS customers to automate various tasks, including infrastructure provisioning, blue/green deployments, serverless deployments, AMI baking, database provisioning, and dealing with asynchronous behavior.

Problem Statement:

Our customer had a requirement to trigger and monitor the status of a Step Functions state machine, which is a long-running asynchronous process. The customer uses AWS Step Functions to run ETL jobs with the help of AWS Glue jobs and AWS EMR. We proposed to achieve this with Lambda, but Lambda has a timeout limitation of 15 minutes. The real problem is that such an asynchronous process needs to continue and succeed even if it exceeds a fifteen-minute runtime (a limit in Lambda).

In this blog we present a solution that shows how we solved and automated this, using a combination of Lambda and AWS CodePipeline with continuation tokens.

Assumptions:

This blog assumes you are familiar with AWS CodePipeline and AWS Lambda and know how to create pipelines, functions, Glue jobs and the IAM policies and roles on which they depend.

Pre-requisites:

  1. Glue jobs have already been configured.
  2. A Step Functions state machine configured to run the Glue jobs.
  3. A CodeCommit repository for the Glue scripts.

Solution:

In this blog post, we discuss how a CodePipeline action can trigger a Step Functions state machine and how the pipeline and the state machine are kept decoupled through a Lambda function.

The source code for the sample pipeline, pipeline actions, and state machine used in this post is available at https://github.com/powerupcloud/lambdacodepipeline.git.

The diagram below highlights the CodePipeline and Step Functions integration that is described in this post. The pipeline contains two stages: a Source stage represented by a CodeCommit Git repository, and a DEV stage with CodeCommit, CodeBuild, and Invoke Lambda actions that represent the workflow-driven action.

The steps involved in the CI/CD pipeline:

  1. Developers commit the AWS Glue job's code to the SVC (AWS CodeCommit).
  2. The AWS CodePipeline in the Tools account gets triggered by the commit in step 1.
  3. The CodeBuild step involves multiple things, as mentioned below:
    • Installation of the dependencies and packages needed.
    • Copying the Glue and EMR job scripts to the S3 location from which the Glue jobs will pick up their scripts.
  4. CHECK_OLD_SFN: a Lambda function is invoked to ensure that the previous Step Functions execution is not still in a running state before we run the actual state machine (a minimal sketch of such a function follows this list). The process is as follows:
    • This action invokes a Lambda function (1).
    • In (2), the Lambda checks the state machine status, which returns the Step Functions state machine status.
    • In (3), the Lambda gets the execution state of the state machine (RUNNING || COMPLETED || TIMEOUT).
    • In (4), the Lambda function sends a continuation token back to the pipeline.

If the state machine is RUNNING, then seconds later the pipeline invokes the Lambda function again (4), passing the continuation token received. The Lambda function checks the execution state of the state machine and communicates the status to the pipeline. The process is repeated until the state machine execution is complete.

Else (5), the Lambda sends a job completion token and completes the pipeline stage.

  5. TRIGGER_SFN_and_CONTINUE: a Lambda function is invoked to start a new Step Functions execution and check the status of that new execution. The process is as follows:
    • This action invokes a Lambda function (1), which, in turn, triggers a Step Functions state machine to process the request (2).
    • The Lambda function sends a continuation token back to the pipeline (3) to continue its execution later and terminates.
    • Seconds later, the pipeline invokes the Lambda function again (4), passing the continuation token received. The Lambda function checks the execution state of the state machine (5,6) and communicates the status to the pipeline. The process is repeated until the state machine execution is complete.
    • Then the Lambda function notifies the pipeline that the corresponding pipeline action is complete (7). If the state machine has failed, the Lambda function will fail the pipeline action and stop its execution (7). While running, the state machine triggers various Glue jobs to perform ETL operations. The state machine and the pipeline are fully decoupled; their interaction is handled by the Lambda function.
  6. Approval to the higher environment. In this stage, we add a manual approval action to the pipeline in CodePipeline, which can be implemented using https://docs.aws.amazon.com/codepipeline/latest/userguide/approvals-action-add.html
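The repository linked above contains the actual check_StepFunction.py and trigger_StepFunction.py used in this pipeline; the snippet below is only a minimal sketch of the CHECK_OLD_SFN pattern, assuming the state machine ARN arrives via UserParameters, to illustrate how the continuation token keeps each Lambda run well within its 15-minute limit.

import json
import boto3

sfn = boto3.client('stepfunctions')
codepipeline = boto3.client('codepipeline')

def lambda_handler(event, context):
    job = event['CodePipeline.job']
    job_id = job['id']
    user_params = json.loads(
        job['data']['actionConfiguration']['configuration']['UserParameters'])
    state_machine_arn = user_params['stateMachineARN']

    # Is any execution of this state machine still running?
    running = sfn.list_executions(
        stateMachineArn=state_machine_arn,
        statusFilter='RUNNING',
        maxResults=1
    )['executions']

    if running:
        # Tell CodePipeline the action is still in progress; it will re-invoke this
        # Lambda later with the continuation token, so no single run exceeds 15 minutes.
        codepipeline.put_job_success_result(
            jobId=job_id,
            continuationToken=running[0]['executionArn'])
    else:
        # Nothing in flight: mark the pipeline action as complete.
        codepipeline.put_job_success_result(jobId=job_id)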

Deployment Steps:

Step 1: Create a Pipeline

  1. Sign in to the AWS Management Console and open the CodePipeline console at http://console.aws.amazon.com/codesuite/codepipeline/home.
  2. On the Welcome page, Getting started page, or the Pipelines page, choose Create pipeline.
  3. In Choose pipeline settings, in Pipeline name, enter the pipeline name.
  4. In Service role, do one of the following:
    • Choose New service role to allow CodePipeline to create a new service role in IAM.
    • Choose Existing service role to use a service role already created in IAM. In Role name, choose your service role from the list.
  5. Leave the settings under Advanced settings at their defaults, and then choose Next.

6. In the Add source stage, in Source provider, choose CodeCommit.

7. Provide the repository name and branch name.

8. In Change detection options, choose AWS CodePipeline.

9. In Add build stage, in Build provider, choose AWS CodeBuild, and choose the Region.

10. Select an existing project name or create a project.

11. You can add environment variables, which you may use in the buildspec.yaml file, and click Next.

NOTE: The build step has a very specific purpose: here we copy the Glue script from the SVC (AWS CodeCommit) to the S3 bucket, from where the Glue job picks up its script for its next execution.

12. In Add deploy stage, skip the deploy stage.

13. Now, finally, click Create pipeline.

Step 2: Create the CHECK OLD SFN LAMBDA Lambda Function

  1. Create the execution role
  • Sign in to the AWS Management Console and open the IAM console

Choose Policies, and then choose Create Policy. Choose the JSON tab, and then paste the following policy into the field.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "states:*",
                "codepipeline:PutJobFailureResult",
                "codepipeline:PutJobSuccessResult"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "logs:*",
            "Resource": "arn:aws:logs:*:*:*"
        }
    ]
}
  • Choose Review policy.
  • On the Review policy page, in Name, type a name for the policy (for example, CodePipelineLambdaExecPolicy). In Description, enter Enables Lambda to execute code.
  • Choose Create policy.
  • On the policy dashboard page, choose Roles, and then choose Create role.
  • On the Create role page, choose AWS service. Choose Lambda, and then choose Next: Permissions.
  • On the Attach permissions policies page, select the checkbox next to CodePipelineLambdaExecPolicy, and then choose Next: Tags. Choose Next: Review.
  • On the Review page, in Role name, enter the name, and then choose Create role.

2. Create the CHECK_OLD_SFN_LAMBDA Lambda function to use with CodePipeline

  • Open the Lambda console and choose the Create function.
  • On the Create function page, choose Author from scratch. In the Function name, enter a name for your Lambda function (for example, CHECK_OLD_SFN_LAMBDA ) .
  • In Runtime, choose Python 2.7.
  • Under Role, select Choose an existing role. In Existing role, choose the role you created earlier, and then choose Create function.
  • The detail page for your created function opens.
  • Copy the check_StepFunction.py code into the Function code box
  • In Basic settings, for Timeout, replace the default of 3 seconds with 5 Min.
  • Choose Save.

3. Create the TRIGGER_SFN_and_CONTINUE Lambda function to use with CodePipeline

  • Open the Lambda console and choose Create function.
  • On the Create function page, choose Author from scratch. In Function name, enter a name for your Lambda function (for example, TRIGGER_SFN_and_CONTINUE).
  • In Runtime, choose Python 2.7.
  • Under Role, select Choose an existing role. In Existing role, choose the role you created earlier, and then choose Create function.
  • The detail page for your created function opens.
  • Copy the trigger_StepFunction.py code into the Function code box
  • In Basic settings, for Timeout, replace the default of 3 seconds with 5 minutes.
  • Choose Save.

Step 3: Add the CHECK OLD SFN LAMBDA, Lambda Function to a Pipeline in the CodePipeline Console

In this step, you add a new stage to your pipeline, and then add a Lambda action that calls your function to that stage.

To add stage

  • Sign in to the AWS Management Console and open the CodePipeline console at http://console.aws.amazon.com/codesuite/codepipeline/home.
  • On the Welcome page, choose the pipeline you created.
  • On the pipeline view page, choose Edit.
  • On the Edit page, choose + Add stage to add a stage after the Build stage. Enter a name for the stage (for example, CHECK_OLD_SFN_LAMBDA), and choose Add stage.
  • Choose + Add action group. In Edit action, in Action name, enter a name for your Lambda action (for example, CHECK_OLD_SFN_LAMBDA). In Provider, choose AWS Lambda. In Function name, choose or enter the name of your Lambda function (for example, CHECK_OLD_SFN_LAMBDA).
  • In UserParameters, you must provide a JSON string with a parameter, for example: { "stateMachineARN": "<ARN_OF_STATE_MACHINE>" }
  • Choose Save.

Step 4: Add the TRIGGER_SFN_and_CONTINUE  Lambda Function to a Pipeline in the CodePipeline Console

In this step, you add a new stage to your pipeline, and then add a Lambda action that calls your function to that stage.

To add a stage

  • Sign in to the AWS Management Console and open the CodePipeline console at http://console.aws.amazon.com/codesuite/codepipeline/home.
  • On the Welcome page, choose the pipeline you created.
  • On the pipeline view page, choose Edit.
  • On the Edit page, choose + Add stage to add a stage after the Build stage. Enter a name for the stage (for example, TRIGGER_SFN_and_CONTINUE), and choose Add stage.
  • Choose + Add action group. In Edit action, in Action name, enter a name for your Lambda action (for example, TRIGGER_SFN_and_CONTINUE). In Provider, choose AWS Lambda. In Function name, choose or enter the name of your Lambda function (for example, TRIGGER_SFN_and_CONTINUE).
  • In UserParameters, you must provide a JSON string with a parameter: { "stateMachineARN": "<ARN_OF_STATE_MACHINE>" }
  • Choose Save.

Step 5: Test the Pipeline with the Lambda function

  • To test the function, release the most recent change through the pipeline from the console.
  • On the pipeline details page, choose Release change. This runs the most recent revision available in each source location specified in a source action through the pipeline.
  • When the Lambda action is complete, choose the Details link to view the log stream for the function in Amazon CloudWatch, including the billed duration of the event. If the function failed, the CloudWatch log provides information about the cause.
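
The same release can also be started programmatically; below is a minimal boto3 sketch, assuming a hypothetical pipeline name.

import boto3

codepipeline = boto3.client('codepipeline')

# Equivalent of choosing "Release change" in the console (hypothetical pipeline name).
response = codepipeline.start_pipeline_execution(name='glue-deployment-pipeline')
print('Started execution: ' + response['pipelineExecutionId'])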

Example JSON Event

The following example shows a sample JSON event sent to Lambda by CodePipeline. The structure of this event is similar to the response to the GetJobDetails API, but without the actionTypeId and pipelineContext data types. Two action configuration details, FunctionName and UserParameters, are included in both the JSON event and the response to the GetJobDetails API. The values shown are examples or placeholders, not real values.

{
    "CodePipeline.job": {
        "id": "11111111-abcd-1111-abcd-111111abcdef",
        "accountId": "111111111111",
        "data": {
            "actionConfiguration": {
                "configuration": {
                    "FunctionName": "MyLambdaFunctionForAWSCodePipeline",
                    "UserParameters": "some-input-such-as-a-URL"
                }
            },
            "inputArtifacts": [
                {
                    "location": {
                        "s3Location": {
                            "bucketName": "s3-bucket-name",
                            "objectKey": "for example CodePipelineDemoApplication.zip"
                        },
                        "type": "S3"
                    },
                    "revision": null,
                    "name": "ArtifactName"
                }
            ],
            "outputArtifacts": [],
            "artifactCredentials": {
                "secretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
                "sessionToken": "MIICiTCCAfICCQD6m7oRw0uXOjANBgkqhkiG9w
0BAQUFADCBiDELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAldBMRAwDgYDVQQHEwdTZEDmFJl0ZxBHjJnyp378OD8uTs7fLvjx79LjSTbNYiytVbZPQUQ5Yaxu2jXnimvwdasdadasljdajldlakslkdjakjdkaljdaljdasljdaljdalklakkoi9494k3k3owlkeroieowiruwpirpdk3k23j2jk234hjl2343rrszlaEXAMPLE=",
                "accessKeyId": "AKIAIOSFODNN7EXAMPLE"
            },
            "continuationToken": "A continuation token if continuing job",
            "encryptionKey": { 
              "id": "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
              "type": "KMS"
            }
        }
    }
}

Conclusion

In this blog post, we discussed how a Lambda function can be used to fully decouple the pipeline and the state machine and manage their interaction. We also saw how asynchronous processes that need to continue and succeed, even when they exceed the fifteen-minute Lambda runtime limit, are handled using a continuation token.
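
For reference, the continuation-token mechanic boils down to something like the sketch below. This is a generic, simplified illustration (not the exact code of CHECK_OLD_SFN_LAMBDA or TRIGGER_SFN_and_CONTINUE): while the Step Functions execution is still running, the Lambda returns a continuation token so CodePipeline keeps the job open and re-invokes the function; once the execution finishes, it reports success or failure.

import json
import boto3

codepipeline = boto3.client('codepipeline')
stepfunctions = boto3.client('stepfunctions')

def lambda_handler(event, context):
    """Generic sketch of the continuation-token pattern for a CodePipeline Lambda action."""
    job = event['CodePipeline.job']
    job_id = job['id']
    data = job['data']

    if 'continuationToken' in data:
        # Re-invocation: the execution ARN was stashed in the continuation token earlier.
        execution_arn = json.loads(data['continuationToken'])['executionArn']
    else:
        # First invocation: start the state machine passed in UserParameters.
        params = json.loads(data['actionConfiguration']['configuration']['UserParameters'])
        execution_arn = stepfunctions.start_execution(
            stateMachineArn=params['stateMachineARN'])['executionArn']

    status = stepfunctions.describe_execution(executionArn=execution_arn)['status']
    if status == 'RUNNING':
        # Keep the pipeline job open; CodePipeline will invoke this function again later.
        codepipeline.put_job_success_result(
            jobId=job_id,
            continuationToken=json.dumps({'executionArn': execution_arn}))
    elif status == 'SUCCEEDED':
        codepipeline.put_job_success_result(jobId=job_id)
    else:
        codepipeline.put_job_failure_result(
            jobId=job_id,
            failureDetails={'type': 'JobFailed',
                            'message': 'Execution ended with status ' + status})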

Please visit our blogs for more interesting articles.

Automate and Manage AWS KMS from Centralized AWS Account

By | AWS, Blogs, Cloud, Cloud Assessment | No Comments

Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies

As discussed in our previous blog, we use the AWS Landing Zone concept for many of our customers, which consists of separate AWS accounts so they can meet the different needs of their organization. One of these accounts is the Security account, where the security-related components reside. KMS keys are one of the key security components that help in the encryption of data.

A Customer Master Key (CMK) is a logical representation of a master key which includes the following details:

  • metadata, such as the key ID, creation date, description
  • key state
  • key material used to encrypt and decrypt data.

There are three types of AWS KMS CMKs:

  • Customer Managed CMK: CMKs that you create, own, and manage. You have full control over these CMKs.
  • AWS Managed CMK: CMKs that are created, managed, and used on your behalf by an AWS service that is integrated with AWS KMS. Some AWS services support only an AWS managed CMK.
  • AWS Owned CMK:  CMKs that an AWS service owns and manages for use in multiple AWS accounts. You do not need to create or manage the AWS owned CMKs.

This blog covers the automation of Customer Managed CMKs, i.e. how we can use CloudFormation templates to create Customer Managed CMKs. It also discusses the strategy that we follow for our enterprise customers for enabling encryption across accounts.

KMS Encryption Strategy

We are covering the KMS strategy that we follow for most of our customers.

In each of the Accounts, create a set of KMS Keys for the encryption of data. For example,

  • UAT/EC2
    • For enabling default EC2 encryption, go to the EC2 dashboard Settings on the right-hand side as shown in the screenshot below:

Select “Always encrypt the EBS volumes” and change the default key. Paste the UAT/EC2 KMS key ARN. (The same settings can also be applied programmatically; see the sketch after this list.)

  • UAT/S3
    • Copy the UAT/S3 KMS key ARN.
    • Go to the bucket Properties and enable Default encryption with Custom AWS-KMS. Provide the KMS key ARN from the security account.
  • UAT/RDS
    • This Key can be used while provisioning the RDS DB instance.
    • Ensure to provide the key ARN if the key is used from another account (cross-account).
  • UAT/OTHERS
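
For reference, the EC2 and S3 default-encryption settings described above can also be applied programmatically. Below is a minimal boto3 sketch; the key ARNs and bucket name are hypothetical placeholders.

import boto3

ec2 = boto3.client('ec2')
s3 = boto3.client('s3')

# Hypothetical placeholders -- use the key ARNs created in the security account.
UAT_EC2_KEY_ARN = 'arn:aws:kms:<REGION>:<SECURITY_ACCOUNT_ID>:key/<KEY_ID>'
UAT_S3_KEY_ARN = 'arn:aws:kms:<REGION>:<SECURITY_ACCOUNT_ID>:key/<KEY_ID>'

# UAT/EC2: always encrypt new EBS volumes with the UAT/EC2 key by default.
ec2.enable_ebs_encryption_by_default()
ec2.modify_ebs_default_kms_key_id(KmsKeyId=UAT_EC2_KEY_ARN)

# UAT/S3: enable default bucket encryption with the UAT/S3 key (hypothetical bucket name).
s3.put_bucket_encryption(
    Bucket='my-uat-bucket',
    ServerSideEncryptionConfiguration={
        'Rules': [{
            'ApplyServerSideEncryptionByDefault': {
                'SSEAlgorithm': 'aws:kms',
                'KMSMasterKeyID': UAT_S3_KEY_ARN
            }
        }]
    })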

Automated KMS Keys Creation

The below CloudFormation template can be used to create a set of KMS keys as follows:

https://github.com/powerupcloud/automate-kms-keys-creation/blob/master/kms-cf-template.json

Ensure to replace the SECURITY_ACCOUNT_ID variable with the 12-digit AWS account ID of the security account where the KMS keys will be created.

The CF Template does the following:

  • Creates the below KMS Keys in the Target Account:
    • PROD/EC2
      • It is used to encrypt the EBS Volumes.
    • PROD/S3
      • Used to encrypt the S3 buckets.
    • PROD/RDS
      • Used to encrypt the RDS data.
    • PROD/OTHERS
      • It can be used to encrypt AWS resources other than EC2, S3, and RDS. For example, if EFS needs to be created in the production account, the PROD/OTHERS KMS key can be used for the encryption of EFS.
  • In our case, we are using the Landing Zone concept, so the “OrganizationAccountAccessRole” IAM role used for switch-role access from the Master account is one of the key administrators.
  • Also, since we have enabled Single Sign-On in our account, the IAM role created by SSO, “AWSReservedSSO_AdministratorAccess_3687e92578266b74”, also has key administrator access.

The Key administrators can be changed as required in the Key Policy.

The “ExternalAccountID” parameter in the CloudFormation template is used to enable cross-account access via the KMS key policy.
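
To give a rough idea, a key with a cross-account key policy of this kind can be created with boto3 as sketched below. The policy here is illustrative, not a copy of the template (which remains the source of truth); the alias, role name, and account IDs are placeholders.

import json
import boto3

kms = boto3.client('kms')

SECURITY_ACCOUNT_ID = '<SECURITY_ACCOUNT_ID>'   # account that owns and administers the key
EXTERNAL_ACCOUNT_ID = '<EXTERNAL_ACCOUNT_ID>'   # account allowed to use the key

key_policy = {
    'Version': '2012-10-17',
    'Statement': [
        {
            # Key administrators (e.g. OrganizationAccountAccessRole used via Switch Role)
            'Sid': 'AllowKeyAdministration',
            'Effect': 'Allow',
            'Principal': {'AWS': 'arn:aws:iam::%s:role/OrganizationAccountAccessRole' % SECURITY_ACCOUNT_ID},
            'Action': 'kms:*',
            'Resource': '*'
        },
        {
            # Cross-account use of the key for encryption and decryption
            'Sid': 'AllowExternalAccountUse',
            'Effect': 'Allow',
            'Principal': {'AWS': 'arn:aws:iam::%s:root' % EXTERNAL_ACCOUNT_ID},
            'Action': ['kms:Encrypt', 'kms:Decrypt', 'kms:ReEncrypt*',
                       'kms:GenerateDataKey*', 'kms:DescribeKey'],
            'Resource': '*'
        }
    ]
}

key = kms.create_key(Description='PROD/EC2 CMK', Policy=json.dumps(key_policy))
kms.create_alias(AliasName='alias/PROD/EC2', TargetKeyId=key['KeyMetadata']['KeyId'])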

Hope you found it useful.

AWS EKS Authentication and Authorization using AWS Single SignOn

By | AWS, Blogs, Cloud, Cloud Assessment | No Comments

Written by Priyanka Sharma, DevOps Architect, Powerupcloud Technologies

Amazon EKS uses IAM to provide authentication to the Kubernetes cluster. The “aws eks get-token” command is used to get the token for authentication. However, IAM is only used for authentication of valid IAM entities. All permissions for interacting with the Amazon EKS cluster’s Kubernetes API are managed through the native Kubernetes RBAC system.

In this article, we cover the authentication and authorization of AWS EKS through OnPrem Active Directory, i.e. how we can provide EKS access to OnPrem AD users. We have divided the solution into two parts: one is integrating OnPrem AD with AWS through AWS Single SignOn, and the other is implementing RBAC policies on the Kubernetes cluster.

For the solution, we are using the following services:

  • OnPrem Active Directory with predefined users and groups: Ensure the below ports are allowed for your AD:
    • TCP 53/UDP 53, TCP 88/UDP 88, TCP 389/UDP 389
  • AD Connector on AWS which connects to the OnPrem AD
  • AWS Single SignOn: Integrates with AD Connector, allowing users to access the AWS Command Line Interface with a set of temporary AWS credentials.
  • AWS EKS: version 1.14

Integrating OnPrem AD with the AWS Single SignOn

For instance, we have created the following users and groups on OnPrem AD for demo purposes:

AD Username      Respective AD Group
user1            EKS-Admins
user2            EKS-ReadOnly
dev1             EKS-Developers

Ensure to set up an AD Connector in the same region as AWS SingleSignOn. Refer to our previous blog for setting up an Active Directory Connector.

Switch to AWS SingleSignOn Console and change the user directory. Select the AD connector created in the above step.

Select the account where you have setup the EKS Cluster.

Search for the AD group for which you want to give the EKS access.

Create a custom permission set.

Attach the below custom permission policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
          "Effect": "Allow",
          "Action": "sts:AssumeRole",
          "Resource": "*"
        }
    ]
 }

Select the Permission set created above.

Finish. Similarly, create permission sets for the ReadOnly and Developers groups with the same permission set policy. Verify the permission sets in the AWS accounts once.

Note that no specific permissions are assigned to the assumed role for the AD Users/Groups at this stage. By default the assumed role will not have permission to perform any operations. The specific authorization permissions will be defined via Kubernetes RBAC in the next section.

Behind the scenes, AWS SSO performs the following operations in Account B (member):

  • Sets up SAML federation by configuring an Identity Provider (IdP) in AWS IAM. The Identity Provider enables the AWS account to trust AWS SSO for allowing SSO access.
  • Creates an AWS IAM role and attaches the above permission set as a policy to the role. This is the role that AWS SSO assumes on behalf of the Microsoft AD user/group to access AWS resources. The role created will have prefix “AWSReservedSSO”.

Go to the account you have applied permission set to. There will be an IAM role created by SSO. In our case, below is the screenshot from the target account:

Creating RBAC Policies on Kubernetes

A Role is used to grant permissions within a single namespace. Apply the below manifests to create the Kubernetes Roles for the Admins, ReadOnly, Developers, and Monitoring-Admins groups respectively:

---
 apiVersion: rbac.authorization.k8s.io/v1
 kind: Role
 metadata:
   name: default:ad-eks-admins
   namespace: default
 rules:
 - apiGroups: ["*"]
   resources: ["*"]
   verbs: ["*"]
---
 apiVersion: rbac.authorization.k8s.io/v1
 kind: Role
 metadata:
   name: default:ad-eks-readonly
   namespace: default
 rules:
 - apiGroups: [""]
   resources: ["*"]
   verbs: ["get", "list", "watch"]
---
 apiVersion: rbac.authorization.k8s.io/v1
 kind: Role
 metadata:
   name: default:ad-eks-developers
   namespace: default
 rules:
 - apiGroups: ["*"]
   resources: ["services","deployments", "pods", "configmaps", "pods/log"]
   verbs: ["get", "list", "watch", "update", "create", "patch"]
---
 apiVersion: rbac.authorization.k8s.io/v1
 kind: Role
 metadata:
   name: default:ad-eks-monitoringadmins
   namespace: monitoring
 rules:
 - apiGroups: ["*"]
   resources: ["*"]
   verbs: ["*"]

Edit the existing aws-auth configmap through the below command:

kubectl edit configmap aws-auth --namespace kube-system

Add the below entries under the mapRoles section:

    - rolearn: arn:aws:iam::ACCOUNTID:role/AWSReservedSSO_AD-EKS-Admins_b2abd90bad1696ac
      username: adminuser:{{SessionName}}
      groups:
        - default:ad-eks-admins
    - rolearn: arn:aws:iam::ACCOUNTID:role/AWSReservedSSO_AD-EKS-ReadOnly_2c5eb8d559b68cb5
      username: readonlyuser:{{SessionName}}
      groups:
        - default:ad-eks-readonly
    - rolearn: arn:aws:iam::ACCOUNTID:role/AWSReservedSSO_AD-EKS-Developers_ac2b0d744059fcd6
      username: devuser:{{SessionName}}
      groups:
        - default:ad-eks-developers
    - rolearn: arn:aws:iam::ACCOUNTID:role/AWSReservedSSO_AD-EKS-Monitoring-Admins_ac2b0d744059fcd6
      username: monitoringadminuser:{{SessionName}}
      groups:
        - default:ad-eks-monitoring-admins

Ensure to remove the path aws-reserved/sso.amazonaws.com/ from the rolearn, since the aws-auth configmap does not accept role ARNs that contain a path.

kubectl create rolebinding eks-admins-binding --role default:ad-eks-admins --group default:ad-eks-admins --namespace default

kubectl create rolebinding eks-dev-binding --role default:ad-eks-developers --group default:ad-eks-developers --namespace default

kubectl create rolebinding eks-readonly-binding --role default:ad-eks-readonly --group default:ad-eks-readonly --namespace default

kubectl create clusterrolebinding clusterrole-eks-admins-binding --clusterrole=cluster-admin  --group default:ad-eks-admins

kubectl create clusterrolebinding clusterrole-eks-readonly-binding --clusterrole=system:aggregate-to-view  --group default:ad-eks-readonly

Time for some Action

Hit SSO User Portal URL (highlighted in the below screenshot):

Enter the AD credentials of a user that is added to the EKS-Admins group:

Click on Programmatic access:

It gives temporary AWS Credentials:

Create a ~/.aws/credentials file on the server with the credentials obtained from SSO:

[ACCOUNTID_AD-EKS-Admins]
aws_access_key_id = ASIAZMQ74VVIMRLK2RLO
aws_secret_access_key = gzOs61AcQ/vyh0/E9y+naT3GF3PDKUqB5stZLWvv
aws_session_token = AgoJb3JpZ2luX2VjENr//////////wEaCXVzLWVhc3QtMSJIMEYCIQCcs8/t5OK/UlOvSQ/NSXt+giJm9WkxVkfUhY6MFVnJwgIhAMOuJxb/CqhNx12ObPY4Obhe4KmxyEdyosqzqq63BOLaKt8CCKP//////////wEQABoMNjQ1Mzg1NzI3MzEyIgzw9o8jbVmjgcjgTHIqswICmRCh/7qIgBbxjG0kZJdmrGFEHjssv1b4Rl3AnIel7p0RizMDzzY9lQIlsuE5S7xYVB4alVVl1MNQ/1+iNSrSAG4LlCtSIaMrmUZ+hspR1qiQ5cqS2954UhgzEb081QCzYMbPgtvtPWwiiDZ9LkYOU2tp9hWbX7mHAZksFTHgEOO62hEuJWl3bh6dGYJWqyvTO3iwSJZYeqKJ/vY0MNnx5bjcqjgehUA6LnpUES3YlxelAGQPns7nbS0kOzDatoMe4erBIUTiP60vJ4JXJ2CFPsPmX6Doray0MWrkG/C9QlH4s/dZNCIm6In5C3nBWLAjpYWXQGA9ZC6e6QZRYq5EfMmgRTV6vCGJuSWRKffAZduXQJiZsvTQKEI0r7sVMGJ9fnuMRvIXVbt28daF+4ugyp+8MOCXjewFOrMB8Km775Vi0EIUiOOItQPj0354cao+V9XTNA/Pz23WTs8kF+wA5+il7mBOOEkmhLNrxEkuRTOCv0sn52tm9TeO9vSHRbH4e4xaKoJohyBYZTlEAysiu8aRQgahg4imniLYge+qvelQeDl1zYTBsea8Z71oQDcVVtBZzxmcIbS0V+AOOm81NTLRIIM1TNcu004Z7MnhGmD+MiisD0uqOmKVLTQsGLeTKur3bKImXoXNaZuF9Tg=

Update the KubeConfig for the EKS Admins group as below:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: XXXXXX
    server: https://3B6E58DAA490F4F0DD57DAE9D9DFD099.yl4.ap-south-1.eks.amazonaws.com
  name: arn:aws:eks:ap-south-1:ACCOUNTID:cluster/puck8s
contexts:
- context:
    cluster: arn:aws:eks:ap-south-1:ACCOUNTID:cluster/puck8s
    user: arn:aws:eks:ap-south-1:ACCOUNTID:cluster/puck8s
  name: arn:aws:eks:ap-south-1:ACCOUNTID:cluster/puck8s
current-context: arn:aws:eks:ap-south-1:ACCOUNTID:cluster/puck8s
kind: Config
preferences: {}
users:
- name: arn:aws:eks:ap-south-1:ACCOUNTID:cluster/puck8s
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      args:
      - --region
      - ap-south-1
      - eks
      - get-token
      - --cluster-name
      - puck8s
      - --role-arn
      - arn:aws:iam::ACCOUNTID:role/aws-reserved/sso.amazonaws.com/AWSReservedSSO_AD-EKS-Admins_b2abd90bad1696ac
      command: aws
      env:
        - name: AWS_PROFILE
          value: "ACCOUNTID_AD-EKS-Admins"

Similarly, update the KubeConfig and get the temporary credentials for ReadOnly and Dev Users.

KubeConfig for ReadOnly Users:

users:
- name: arn:aws:eks:ap-south-1:ACCOUNTID:cluster/puck8s
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      args:
      - --region
      - ap-south-1
      - eks
      - get-token
      - --cluster-name
      - puck8s
      - --role-arn
      - "arn:aws:iam::ACCOUNTID:role/aws-reserved/sso.amazonaws.com/AWSReservedSSO_AD-EKS-ReadOnly_2c5eb8d559b68cb5"
      command: aws
      env:
        - name: AWS_PROFILE
          value: "ACCOUNTID_AD-EKS-ReadOnly"

AWS Profile for Temporary Credentials:

Export the KUBECONFIG for EKS Admin Users and try out the following commands:

Export the KUBECONFIG for EKS ReadOnly Users and try out the following commands:

That’s all..!! Hope you found it useful.

References:

https://aws.amazon.com/blogs/opensource/integrating-ldap-ad-users-kubernetes-rbac-aws-iam-authenticator-project/

Email VA report of Docker Images in ECR

By | AWS, Blogs, Cloud, Cloud Assessment, Containerization, Containers on Cloud | 3 Comments

Written by Praful Tamrakar, Senior Cloud Engineer, Powerupcloud Technologies

Amazon ECR (Elastic Container Registry) is a managed container registry provided by Amazon to store, encrypt, and manage container images. Recently, AWS announced the image scanning feature for images stored in ECR. Amazon ECR uses the Common Vulnerabilities and Exposures (CVEs) database from the open-source CoreOS Clair project and provides you with a list of scan findings.

Building container images in a Continuous Integration (CI) pipeline and pushing these artifacts into ECR is a widely adopted approach. Along with this, we would also like to scan the container image and send a Vulnerability Assessment (VA) report to the customer. The email alert will be triggered only if the container has any critical vulnerabilities.

How Can I Scan My Container Images?

Container images can be scanned using many third-party tools such as Clair, Sysdig Secure, etc. To use these tools, however, we need to manage the required servers/databases ourselves, which adds additional effort for the operations team.

To reduce these efforts, we can use the Image scanning feature of the ECR.

  • You can scan your container images stored in ECR manually.
  • Enable scan on push on your repositories so that each and every image pushed is checked against an aggregated set of Common Vulnerabilities and Exposures (CVEs).
  • Scan images using an API command, which allows you to set up periodic scans for your container images and ensures continuous monitoring (a minimal sketch follows this list).
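
A minimal boto3 sketch of such an on-demand/periodic scan is shown below, assuming hypothetical repository and tag names; it could run, for example, from a scheduled Lambda.

import boto3

ecr = boto3.client('ecr')

REPOSITORY = 'my-app-repo'        # hypothetical repository name
IMAGE = {'imageTag': 'latest'}    # hypothetical image tag

# Trigger a scan of the image and wait for it to complete.
ecr.start_image_scan(repositoryName=REPOSITORY, imageId=IMAGE)
ecr.get_waiter('image_scan_complete').wait(repositoryName=REPOSITORY, imageId=IMAGE)

# Read back the findings summary (counts per severity level).
findings = ecr.describe_image_scan_findings(repositoryName=REPOSITORY, imageId=IMAGE)
print(findings['imageScanFindings'].get('findingSeverityCounts', {}))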

Problem Statement

Currently, there is no direct way to get the scan results into CloudWatch or CloudTrail. However, this can be achieved using the following approach.

Resolution:

  1. Configure an existing repository to scan on push using the AWS CLI:
aws ecr put-image-scanning-configuration --repository-name <ECR_REPO_NAME> --image-scanning-configuration scanOnPush=true --region <REGION_CODE>
  2. Getting the image scan findings (scan results from ECR) can be achieved through a Lambda function that uses an API call.
    a. Create an SNS topic with EMAIL as the subscription.
    b. Create a Lambda function with runtime Python 3.7 or above, and attach the AWS managed policies AmazonEC2ContainerRegistryPowerUser and AmazonSNSFullAccess to the Lambda service role.

c. Paste the following Python code, which will return the image scan findings summary:

import json
from datetime import datetime
from logging import getLogger, INFO
import os
import boto3
from botocore.exceptions import ClientError


logger = getLogger()
logger.setLevel(INFO)

ecr = boto3.client('ecr')
sns = boto3.client('sns')

def get_findings(tag):
    """Returns the image scan findings summary"""
    
    try:
        response = ecr.describe_image_scan_findings(
            repositoryName='<NAME_OF_ECR>',
            registryId='<AWS_ACCOUNT_ID>',
            imageId={
            'imageTag': tag},
        )
        
        criticalresultList = []
        findings = response['imageScanFindings']['findings']
        for finding in findings:
            if finding['severity'] == "CRITICAL": #Can be CRITICAL | HIGH
                # Build a fresh dict per finding so each entry in the list is independent
                criticalresult = {
                    "name": finding['name'],
                    "description": finding['description'],
                    "severity": finding['severity']
                }
                criticalresultList.append(criticalresult)
        return criticalresultList
        
            
    except ClientError as err:
        logger.error("Request failed: %s", err.response['Error']['Message'])

def lambda_handler(event, context):
    """AWS Lambda Function to send ECR Image Scan Findings to EMAIL"""
    scan_result = get_findings(event['tag'])
    print (scan_result)
    
    
    sns_response = sns.publish(
    TopicArn='arn:aws:sns:<AWS_REGION_CODE>:<AWS_ACCOUNT_ID>:<SNS_TOPIC>',    
    Message=json.dumps({'default': json.dumps(scan_result)}),
    MessageStructure='json')
    
    print (sns_response)

This email contains the Name, Description, and Severity level for the scanned image. For example:

[{
"name": "CVE-2019-2201", 
"description": "In generate_jsimd_ycc_rgb_convert_neon of jsimd_arm64_neon.S, there is a possible out of bounds write due to a missing bounds check. This could lead to remote code execution in an unprivileged process with no additional execution privileges needed. User interaction is needed for exploitation.Product: AndroidVersions: Android-8.0 Android-8.1 Android-9 Android-10Android ID: A-120551338", 
"severity": "CRITICAL"
}]

d. Trigger this from the Jenkins pipeline:

stage('Image Scan'){
    node('master'){
        sh'''
        sleep 60
        if [[ $(aws ecr describe-image-scan-findings --repository-name <ECR_REPO_NAME> --image-id imageTag=${IMAGETAG} --region ap-southeast-1 --registry-id <AWS_ACCOUNT_ID> --output json --query imageScanFindings.findingSeverityCounts.CRITICAL) -gt 0 ]]
        then
            # Quote the payload so that ${IMAGETAG} is expanded by the shell
            aws lambda invoke --function-name <LAMBDA_FUNCTION_NAME> --invocation-type Event --payload '{"tag":"'${IMAGETAG}'"}' response.json
        fi
        '''
    }
}

OPTIONAL

If you don’t want to trigger this Lambda function from the pipeline, you can instead create a CloudWatch Events rule that matches the event for a completed image push, using the following event pattern:

{
    "version": "0",
    "id": "13cde686-328b-6117-af20-0e5566167482",
    "detail-type": "ECR Image Action",
    "source": "aws.ecr",
    "account": "123456789012",
    "time": "2019-11-16T01:54:34Z",
    "region": "us-west-2",
    "resources": [],
    "detail": {
        "result": "SUCCESS",
        "repository-name": "my-repo",
        "image-digest": "sha256:7f5b2640fe6fb4f46592dfd3410c4a79dac4f89e4782432e0378abcd1234",
        "action-type": "PUSH",
        "image-tag": "latest"
    }
}

This event pattern matches the event for a completed image push. You can then pass the matched events to the Lambda function so that it extracts the values of “repository-name”, “image-digest” and “image-tag”, which are passed to the DescribeImageScanFindings API call. For reference: https://docs.aws.amazon.com/AmazonECR/latest/userguide/ecr-eventbridge.html
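
A minimal sketch of such a Lambda handler is shown below, assuming the “ECR Image Action” event shape from the sample above; the SNS publishing part would remain the same as in the function shown earlier.

import boto3

ecr = boto3.client('ecr')

def lambda_handler(event, context):
    """Sketch: extract image details from the ECR push event and fetch its scan findings."""
    detail = event['detail']
    repository = detail['repository-name']
    image_digest = detail['image-digest']
    image_tag = detail.get('image-tag')

    # Note: with scan-on-push the scan may still be in progress right after the push,
    # so a retry or the image_scan_complete waiter may be needed here.
    findings = ecr.describe_image_scan_findings(
        repositoryName=repository,
        imageId={'imageDigest': image_digest})

    severity_counts = findings['imageScanFindings'].get('findingSeverityCounts', {})
    print('Severity counts for %s:%s -> %s' % (repository, image_tag, severity_counts))
    return severity_counts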

Note:

If the ECR repository is not in the same account where the Lambda is configured, then you must configure the repository permissions for the image as follows.

  1. Go to the ECR console and select the ECR repository.
  2. Click on Permissions in the left-hand side menu.
  3. Click on Edit policy JSON and add the following policy:
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "pull and push",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
                    "arn:aws:sts::<AWS_ACCOUNT_ID>:assumed-role/<LAMBDA_ROLE_NAME>"
        ]
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:BatchGetImage",
        "ecr:CompleteLayerUpload",
        "ecr:DescribeImageScanFindings",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetDownloadUrlForLayer",
        "ecr:InitiateLayerUpload",
        "ecr:PutImage",
        "ecr:UploadLayerPart"
      ]
    }
  ]
}
  4. Save it.
  5. Trigger the Lambda.

And that’s it..!! Hope you found it useful. Keep following our Blog for more interesting articles.