I need to pull numerical fields from one or more databases, clean and structure them, then run a solid statistical analysis that answers a set of business questions I will provide once we start. The raw data already sits in the databases; the job is to design a repeatable extraction routine (SQL queries or an ETL script), handle any required data-quality checks, and produce clear statistical insights.

Ideally the workflow ends with two concrete artefacts:

• A tidy dataset (CSV or XLSX) produced directly from the database pull.
• A short statistical report (tables, charts, and a concise narrative of the findings) created in Python (pandas, SciPy, seaborn) or in R if you prefer.

Everything should be reproducible: I want the code, the queries, and brief instructions so I can rerun the whole process as new data arrives. Accuracy and transparency matter more to me than fancy dashboards at this stage.
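To make the shape of the extraction step concrete, here is a minimal sketch of the kind of repeatable pull I have in mind. Everything in it is a placeholder assumption, not the real schema: a local SQLite file, a hypothetical `orders` table, and made-up column names stand in for the actual databases.

```python
# Extraction sketch (Python + pandas). The database file, table, and
# columns are illustrative placeholders to be swapped for the real schema.
import sqlite3
import pandas as pd

QUERY = """
SELECT order_id, order_date, quantity, unit_price, region
FROM orders
WHERE order_date >= :start
"""

def extract(db_path: str, start_date: str) -> pd.DataFrame:
    """Run the parameterised query and return the raw pull as a DataFrame."""
    with sqlite3.connect(db_path) as conn:
        return pd.read_sql_query(QUERY, conn, params={"start": start_date})

if __name__ == "__main__":
    raw = extract("warehouse.db", "2024-01-01")
    raw.to_csv("orders_raw.csv", index=False)  # first artefact: tidy dataset
```

The parameterised date filter is what makes the routine rerunnable as new data arrives; for a non-SQLite source the same pattern works with a SQLAlchemy connection instead.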
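On the data-quality side, what matters to me is that checks are explicit and logged rather than silent. A rough sketch of the style I am after, with the column names again being illustrative stand-ins:

```python
# Data-quality sketch: report problems transparently, fix conservatively.
import pandas as pd

def quality_report(df: pd.DataFrame) -> dict:
    """Return basic data-quality metrics so every run is auditable."""
    return {
        "rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_by_column": df.isna().sum().to_dict(),
    }

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply documented fixes; flag suspect rows instead of dropping silently."""
    out = df.drop_duplicates()
    # Coerce numeric fields; unparseable entries become NaN and show up
    # in the next quality_report() rather than vanishing.
    for col in ("quantity", "unit_price"):
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # Example range check: quantities should be positive.
    bad = out["quantity"] <= 0
    if bad.any():
        print(f"warning: {int(bad.sum())} rows with non-positive quantity")
    return out
```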
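And for the report itself, the pattern I want per business question is a summary table, an appropriate statistical test, and one chart. A sketch of that shape follows; the `region` grouping, the `unit_price` metric, and the Welch t-test are only examples, since the real questions will dictate which tests apply.

```python
# Report sketch: summary table, one test, one chart per business question.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats

df = pd.read_csv("orders_clean.csv")  # output of the cleaning step

# Summary table: one row per group, written out for the report.
summary = df.groupby("region")["unit_price"].agg(["count", "mean", "std"])
summary.to_csv("summary_by_region.csv")

# Example question: do two regions differ in mean unit price?
a = df.loc[df["region"] == "north", "unit_price"].dropna()
b = df.loc[df["region"] == "south", "unit_price"].dropna()
t, p = stats.ttest_ind(a, b, equal_var=False)  # Welch's t-test
print(f"Welch t = {t:.2f}, p = {p:.4f}")

# One chart per question keeps the narrative easy to audit.
sns.boxplot(data=df, x="region", y="unit_price")
plt.savefig("unit_price_by_region.png", dpi=150, bbox_inches="tight")
```

If you prefer R, an equivalent dplyr/ggplot2 pipeline is fine; what I care about is that the queries, code, and a short README travel together so the whole run can be repeated.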