AWS Batch Processing Solution Diagram

2022-01-11 aws, aws-batch-processing, data-engineering, solution-diagram

This diagram shows a typical batch processing solution on AWS with Amazon S3, AWS Lambda, Amazon EMR and Amazon Redshift:

  • Amazon S3 is used to store staging data extracted from source systems, whether on premises or in the cloud.
  • AWS Lambda is used to register the arrival of data in S3 buckets with the ETL framework and to trigger the batch process.
  • Amazon EMR is then used to transform the data (for example, aggregations) and load it.
  • Amazon Redshift is then used to store the transformed data.
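
The Lambda step in the list above could look roughly like the following. This is a minimal sketch, not a definitive implementation: it assumes the function is invoked by S3 event notifications and that the transformation runs as a Spark step on an already running EMR cluster. The environment variable names `EMR_CLUSTER_ID` and `TRANSFORM_SCRIPT_URI`, the step name, and the helper function names are hypothetical.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        objects.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return objects


def build_emr_step(bucket, key, script_uri):
    """Build an EMR step definition that runs a Spark job on one staged object."""
    return {
        "Name": f"transform {key}",  # hypothetical naming scheme
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", script_uri, f"s3://{bucket}/{key}"],
        },
    }


def handler(event, context):
    """Lambda entry point: submit one EMR step per newly arrived object."""
    import os
    import boto3  # imported here so the helpers above stay testable without AWS

    steps = [
        # TRANSFORM_SCRIPT_URI and EMR_CLUSTER_ID are assumed configuration values.
        build_emr_step(bucket, key, os.environ["TRANSFORM_SCRIPT_URI"])
        for bucket, key in extract_s3_objects(event)
    ]
    if steps:
        boto3.client("emr").add_job_flow_steps(
            JobFlowId=os.environ["EMR_CLUSTER_ID"], Steps=steps
        )
    return {"submitted": len(steps)}
```

Keeping the event parsing and step building separate from the AWS call means both can be checked locally with a sample event before the function is deployed.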

This pattern follows the traditional ETL approach. You can also change it to an ELT pattern, in which the raw data is loaded first and the transformations run directly in Redshift. Amazon EMR can also be replaced with other data transformation products.
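
As a rough illustration of the ELT variant, the sketch below builds two SQL statements: a Redshift `COPY` that loads the staged files as they are, and an `INSERT ... SELECT` that performs the aggregation inside Redshift. The table names, column names, and the assumption that the staged files are Parquet are all hypothetical.

```python
def build_elt_statements(staging_uri, iam_role_arn):
    """Return SQL for an ELT run: load raw data first, then transform in Redshift."""
    # Load step: copy the staged files into a raw table unchanged.
    # staging.sales_raw is a hypothetical table; Parquet is an assumed file format.
    copy_raw = (
        "COPY staging.sales_raw "
        f"FROM '{staging_uri}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS PARQUET;"
    )
    # Transform step: the aggregation runs inside Redshift instead of EMR.
    # warehouse.daily_sales and its columns are hypothetical.
    aggregate = (
        "INSERT INTO warehouse.daily_sales (sale_date, total_amount) "
        "SELECT sale_date, SUM(amount) "
        "FROM staging.sales_raw "
        "GROUP BY sale_date;"
    )
    return [copy_raw, aggregate]
```

The statements can then be run in order with whichever SQL client or scheduler the ETL framework already uses.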

[Diagram labels: Data Sources; Data Submissions; Data Staging (Raw Data Store): Amazon S3; AWS Lambda; Data Transformation: Amazon EMR; Data Warehouse: Amazon Redshift]