AWS Glue ETL Snowflake Integration

Client: AI | Published: 23.10.2025
Budget: $15

I need an experienced data engineer to build and productionize an end-to-end ETL flow that pulls data from several API-based sources, transforms it with PySpark inside AWS Glue, and lands the clean data in Snowflake. I already have a handful of working APIs, and new endpoints will be released over the coming weeks, so the job covers both consuming existing feeds and designing any additional lightweight APIs required to fill gaps. All code should live in a version-controlled repository (Git) and follow clear, modular patterns so future developers can extend it with minimal friction.

Please structure the work so that:
• Glue jobs are parameter-driven and reusable across sources (a minimal sketch of this pattern appears at the end of this post).
• Transformations are written in PySpark, taking advantage of Glue’s dynamic frames where helpful.
• Data is loaded into Snowflake using best-practice staging, error handling, and incremental logic.
• Logging, alerting, and basic data-quality checks are in place so failed loads are easy to troubleshoot.

Deliverables I expect:
1. Source-controlled Glue scripts and supporting Python modules.
2. A Snowflake schema (DDL) aligned with the transformed data.
3. Deployment instructions (or IaC templates if you prefer) so I can reproduce the environment in another AWS account.
4. A short hand-off call or video walkthrough so my team understands the codebase.

If this stack is your daily bread and butter, let’s discuss timelines and any clarifications you need.
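To illustrate the parameter-driven pattern referenced in the first bullet, here is a minimal sketch of how one reusable Glue job might look. The job parameters (SOURCE_NAME, RAW_S3_PATH, SNOWFLAKE_TABLE), the Glue connection name snowflake-conn, the field names in the transformation, and the database/schema values are all illustrative assumptions, not a specification of the final implementation.

import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Per-source parameters passed at run time, so one script can serve every feed.
args = getResolvedOptions(
    sys.argv, ["JOB_NAME", "SOURCE_NAME", "RAW_S3_PATH", "SNOWFLAKE_TABLE"]
)

sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the raw API extract; assumes an upstream step has staged it as JSON in S3.
raw_dyf = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": [args["RAW_S3_PATH"]]},
    format="json",
)

# Illustrative transformation: settle an ambiguous type and drop a scratch field.
clean_dyf = raw_dyf.resolveChoice(specs=[("id", "cast:long")]).drop_fields(["_metadata"])

# Load into a Snowflake staging table through a pre-created Glue connection.
# Connection name, database, and schema below are placeholders.
glue_context.write_dynamic_frame.from_options(
    frame=clean_dyf,
    connection_type="snowflake",
    connection_options={
        "connectionName": "snowflake-conn",
        "dbtable": args["SNOWFLAKE_TABLE"],
        "sfDatabase": "ANALYTICS",
        "sfSchema": "STAGING",
    },
)

job.commit()

In the actual delivery I would expect the incremental logic (for example, a MERGE from the staging table into the target), error handling, and data-quality checks from the bullets above to be layered on top of this skeleton.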