This diagram shows one example of using AWS Glue to crawl, catalog, and process data stored in Amazon S3.
- Data landed in the raw bucket is scanned by a Glue Crawler, and the metadata is stored in the Glue Data Catalog.
- A Glue ETL job loads the raw data, applies transformations, and stores the processed data in the curated bucket.
- The processed files are scanned by a Glue Crawler, which registers their schema in the Glue Data Catalog.
- The processed data is then queried with Amazon Athena and can be used further in reports and dashboards.
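
The steps above could be driven from a script along the following lines. This is a minimal sketch rather than the setup behind the diagram: the crawler names, job name, database, table, bucket paths, and the `--source_path` / `--target_path` job arguments are all hypothetical placeholders, and the polling is deliberately simple. The functions take the Glue and Athena clients as parameters and call only the boto3 operations `start_crawler`, `get_crawler`, `start_job_run`, `get_job_run`, and `start_query_execution`.

```python
import time


def run_crawler(glue, name, poll_seconds=15):
    """Start a Glue crawler and wait until it reports the READY state again."""
    glue.start_crawler(Name=name)
    while glue.get_crawler(Name=name)["Crawler"]["State"] != "READY":
        time.sleep(poll_seconds)


def run_job(glue, job_name, arguments, poll_seconds=15):
    """Start a Glue ETL job run and wait for it to succeed or fail."""
    run_id = glue.start_job_run(JobName=job_name, Arguments=arguments)["JobRunId"]
    while True:
        run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state == "SUCCEEDED":
            return run_id
        if state in ("FAILED", "STOPPED", "TIMEOUT", "ERROR"):
            raise RuntimeError(f"Glue job {job_name} ended in state {state}")
        time.sleep(poll_seconds)


def run_pipeline(glue, athena, cfg, poll_seconds=15):
    """Run the four steps in order and return the Athena query execution ID."""
    # 1. Crawl the raw bucket so its metadata lands in the Data Catalog.
    run_crawler(glue, cfg["raw_crawler"], poll_seconds)
    # 2. Transform raw data and write it to the curated bucket.
    #    The argument names are assumptions; the job script defines its own.
    run_job(
        glue,
        cfg["etl_job"],
        {"--source_path": cfg["raw_path"], "--target_path": cfg["curated_path"]},
        poll_seconds,
    )
    # 3. Crawl the curated bucket so the processed tables are cataloged.
    run_crawler(glue, cfg["curated_crawler"], poll_seconds)
    # 4. Query the processed data with Athena.
    response = athena.start_query_execution(
        QueryString=f'SELECT * FROM "{cfg["table"]}" LIMIT 10',
        QueryExecutionContext={"Database": cfg["database"]},
        ResultConfiguration={"OutputLocation": cfg["athena_output"]},
    )
    return response["QueryExecutionId"]
```

With boto3 installed and AWS credentials configured, the clients would be `boto3.client("glue")` and `boto3.client("athena")`, and `cfg` would be a dictionary holding the names and S3 paths of your own resources. In a production setup the same sequence is more commonly expressed with Glue workflows or triggers than with a polling script.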