Customer: An e-commerce Company-Running Websites at Scale on App Service.
One of India’s largest media companies uses various SaaS platforms to run its OTT streaming application, which leaves its data spread across several disparate sources. There are around 20 of these data sources, and together they produce roughly 600 GB of raw data per day. This made extracting customer metadata complex and made search and building recommendations difficult.
The solution was to build a data lake that brings all of the company’s customer and operations data into one place, so that it can understand its business better. Powerupcloud built real-time and batch ETL jobs to bring the data from the varied data sources into S3, where the raw data was stored. The data was then loaded into Redshift for reporting, while advanced analytics was run using Hadoop-based ML engines on EMR. Reporting was done using QuickSight.
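The flow from raw storage in S3 to reporting in Redshift can be illustrated with a minimal sketch. It is not the implementation used in this project: the key layout, the function names (`raw_object_key`, `redshift_copy_sql`), the bucket, table, source name, and IAM role below are all hypothetical, and the gzipped JSON file format is an assumption. It shows one common way to partition raw batches by source and date in S3 and then load a prefix into Redshift with a `COPY` statement.

```python
from datetime import datetime, timezone


def raw_object_key(source: str, event_time: datetime, batch_id: str) -> str:
    """Build a date-partitioned S3 key for one raw batch from one data source.

    The layout (raw/source=.../year=.../month=.../day=...) is an assumed
    convention, not one taken from the case study.
    """
    t = event_time.astimezone(timezone.utc)  # normalise to UTC before partitioning
    return (
        f"raw/source={source}/"
        f"year={t:%Y}/month={t:%m}/day={t:%d}/"
        f"{batch_id}.json.gz"
    )


def redshift_copy_sql(table: str, bucket: str, prefix: str, iam_role_arn: str) -> str:
    """Render a Redshift COPY statement that loads gzipped JSON from an S3 prefix."""
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{prefix}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS JSON 'auto' GZIP;"
    )


# Hypothetical values, for illustration only.
key = raw_object_key(
    "billing",
    datetime(2019, 3, 5, 12, 0, tzinfo=timezone.utc),
    "batch-0001",
)
sql = redshift_copy_sql(
    table="events",
    bucket="example-data-lake",
    prefix="raw/source=billing/year=2019/month=03/day=05/",
    iam_role_arn="arn:aws:iam::123456789012:role/example-redshift-load",
)
print(key)
print(sql)
```

Partitioning keys by source and date keeps each of the roughly 20 sources separate and lets a daily load, or a query engine such as Athena, read only the prefixes it needs.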
The solution architecture
S3, DynamoDB, Amazon Elasticsearch Service, Kibana, EMR clusters, Redshift, QuickSight, Lambda, Cognito, API Gateway, Athena, MongoDB, Kinesis