Several years of quantitative records live in scattered CSV and Excel files, while a set of instruments continues to push fresh sensor readings to a local directory. The task is to bring all of this together into one clean, analysis-ready dataset. Here is what needs to happen:

• Build or adapt an automated pipeline (Python, R, or a comparable tool) that ingests every CSV/Excel file in the folder structure and appends incoming sensor files on a rolling basis.
• Apply consistent units, timestamps, and field names so the historical and sensor streams align.
• Handle missing or corrupt rows, flag anomalies, and document any assumptions in a short README.
• Deliver the consolidated file in the format of your choice (Parquet, CSV, or a lightweight relational database), along with the reproducible script/notebook, so the process can be rerun when new data arrives.

Clean structure, transparent code, and a brief note on data quality checks will be the acceptance criteria.
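The steps above could be sketched in Python with pandas. This is a minimal illustration, not the required implementation: the column aliases (`Time`, `ts`, `Temp`, `temp_c`), the canonical names (`timestamp`, `temperature_c`), and the z-score anomaly rule are all assumptions that would be replaced by whatever the real files contain.

```python
from pathlib import Path

import pandas as pd

# Hypothetical mapping from raw headers to canonical field names
# (assumption -- the real aliases depend on the actual files).
COLUMN_ALIASES = {
    "Time": "timestamp", "ts": "timestamp",
    "Temp": "temperature_c", "temp_c": "temperature_c",
}


def load_one(path: Path) -> pd.DataFrame:
    """Read one CSV or Excel file into a DataFrame."""
    if path.suffix.lower() in {".xls", ".xlsx"}:
        return pd.read_excel(path)  # needs openpyxl for .xlsx
    return pd.read_csv(path)


def normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Apply canonical column names, parse timestamps as UTC, drop corrupt rows."""
    df = df.rename(columns=COLUMN_ALIASES)
    df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True, errors="coerce")
    # Rows whose timestamp could not be parsed are treated as corrupt.
    return df.dropna(subset=["timestamp"])


def flag_anomalies(df: pd.DataFrame, col: str = "temperature_c", z: float = 3.0) -> pd.DataFrame:
    """Flag values more than z standard deviations from the column mean."""
    mu, sigma = df[col].mean(), df[col].std()
    df["anomaly"] = (df[col] - mu).abs() > z * sigma
    return df


def build_dataset(folder: Path, out=None) -> pd.DataFrame:
    """Ingest every CSV/Excel file under `folder`, normalize and flag it,
    and optionally write the consolidated table to Parquet."""
    frames = [
        normalize(load_one(p))
        for p in sorted(folder.rglob("*"))
        if p.suffix.lower() in {".csv", ".xls", ".xlsx"}
    ]
    combined = pd.concat(frames, ignore_index=True).sort_values("timestamp")
    combined = flag_anomalies(combined.reset_index(drop=True))
    if out is not None:
        combined.to_parquet(out, index=False)  # needs pyarrow or fastparquet
    return combined
```

Because `build_dataset` re-scans the whole folder each run, rerunning it when new sensor files land is idempotent; an incremental variant could instead track already-ingested filenames.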