PROJECT TITLE:
Extract Historical Hourly Forecast Temperature Data (Open-Meteo)

PROJECT DESCRIPTION:
I require extraction of historical hourly forecast temperature data from the Open-Meteo Previous Runs API.

The goal is to retrieve archived hourly forecast temperatures for a fixed list of US locations over the following range:

START DATE: 2024-01-01
END DATE: [INSERT TODAY’S DATE]

No UI, dashboard, or hosting is required. This is a data extraction and structuring task only.

------------------------------------------------------------
DATA SOURCE:

Open-Meteo Previous Runs API
Base endpoint: https://previous-runs-api.open-meteo.com/v1/forecast

------------------------------------------------------------
REQUIRED VARIABLES (HOURLY):

Retrieve the following hourly fields:
- temperature_2m
- temperature_2m_previous_day1
- temperature_2m_previous_day2
- temperature_2m_previous_day3
- temperature_2m_previous_day4
- temperature_2m_previous_day5

Timezone must be: UTC
Temperature unit must be: Fahrenheit

------------------------------------------------------------
LOCATIONS:

I will provide a list of 4 US locations, each including:
- Station name
- Latitude
- Longitude

All 4 must be included.

------------------------------------------------------------
DATE RANGE:

Start: 2024-01-01
End: [INSERT TODAY’S DATE]

Full continuous coverage is required across the entire range.

------------------------------------------------------------
REQUIRED OUTPUT STRUCTURE:

Deliver as CSV (UTF-8 encoded).
Each row must represent one combination of: (station_id, timestamp_utc, lead_days)

Required columns:
- station_id (string)
- station_name (string)
- latitude (float)
- longitude (float)
- timestamp_utc (ISO 8601 format, UTC)
- lead_days (integer: 0–5)
- temperature_f (float)

Lead mapping:
lead_days = 0 → temperature_2m
lead_days = 1 → temperature_2m_previous_day1
lead_days = 2 → temperature_2m_previous_day2
lead_days = 3 → temperature_2m_previous_day3
lead_days = 4 → temperature_2m_previous_day4
lead_days = 5 → temperature_2m_previous_day5

------------------------------------------------------------
DATA INTEGRITY REQUIREMENTS (MANDATORY):

1. No duplicate rows. Unique key must be: (station_id, timestamp_utc, lead_days)
2. No missing timestamps within the API-available range.
3. All timestamps must be in UTC.
4. All lead_days layers (0–5) must be present for all timestamps returned.
5. Include an audit summary file showing:
   - Total rows per station
   - Total rows per lead_days layer
   - Date coverage per station
   - Count of null values (if any)

------------------------------------------------------------
EXTRACTION REQUIREMENTS:

- Data must be retrieved in safe date windows (e.g., 7–30 day blocks)
- Script must handle retries and timeouts
- Full coverage must be verified before delivery

------------------------------------------------------------
DELIVERABLES:

1. Final clean CSV dataset
2. Audit summary file
3. Python extraction script used
4. Short README explaining the extraction method

------------------------------------------------------------
ACCEPTANCE CRITERIA:

Work will be accepted if:
- All 4 stations are included
- The full date range is covered
- There are no duplicate composite keys
- Lead layers 0–5 are present
- Audit totals match the dataset

This is a data extraction task only. No analysis is required.
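
------------------------------------------------------------
ILLUSTRATIVE SKETCH (NOT A REQUIRED IMPLEMENTATION):

To make the extraction and output requirements above concrete, here is a minimal Python sketch of one possible approach: splitting the date range into windows, requesting each window with retries and a timeout, reshaping the hourly response into one row per lead layer, and checking the composite key for duplicates. Several things here are assumptions, not confirmed details of the API or of this project: the query parameter names (latitude, longitude, hourly, start_date, end_date, temperature_unit, timezone) follow Open-Meteo's usual conventions, the response is assumed to carry an "hourly" object with a "time" list formatted like 2024-01-01T00:00, and the helper names (date_windows, build_url, fetch_window, to_long_rows, duplicate_keys) and the station dictionary layout are hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request
from datetime import date, timedelta

BASE_URL = "https://previous-runs-api.open-meteo.com/v1/forecast"

# Lead mapping from the spec: lead_days -> hourly field name.
LEAD_FIELDS = {0: "temperature_2m"}
LEAD_FIELDS.update({d: f"temperature_2m_previous_day{d}" for d in range(1, 6)})


def date_windows(start, end, days=30):
    """Yield inclusive (window_start, window_end) pairs covering start..end."""
    cur = start
    while cur <= end:
        win_end = min(cur + timedelta(days=days - 1), end)
        yield cur, win_end
        cur = win_end + timedelta(days=1)


def build_url(lat, lon, win_start, win_end):
    # Assumption: parameter names follow Open-Meteo's usual conventions.
    params = {
        "latitude": lat,
        "longitude": lon,
        "hourly": ",".join(LEAD_FIELDS.values()),
        "start_date": win_start.isoformat(),
        "end_date": win_end.isoformat(),
        "temperature_unit": "fahrenheit",
        "timezone": "UTC",
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch_window(url, retries=3, timeout=30):
    """Fetch one window as parsed JSON, retrying with a growing delay."""
    for attempt in range(1, retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return json.load(resp)
        except OSError:  # URLError and socket timeouts are OSError subclasses
            if attempt == retries:
                raise
            time.sleep(2 ** attempt)


def to_long_rows(station, hourly):
    """Reshape one response's "hourly" block into one row per lead layer."""
    rows = []
    for i, ts in enumerate(hourly["time"]):
        for lead, field in LEAD_FIELDS.items():
            rows.append({
                "station_id": station["station_id"],
                "station_name": station["station_name"],
                "latitude": station["latitude"],
                "longitude": station["longitude"],
                # Assumption: times arrive as YYYY-MM-DDTHH:MM in UTC.
                "timestamp_utc": ts + ":00Z",
                "lead_days": lead,
                "temperature_f": hourly[field][i],
            })
    return rows


def duplicate_keys(rows):
    """Return the set of composite keys that occur more than once."""
    seen, dupes = set(), set()
    for r in rows:
        key = (r["station_id"], r["timestamp_utc"], r["lead_days"])
        (dupes if key in seen else seen).add(key)
    return dupes
```

In a full script, each (station, window) pair would go through build_url, fetch_window, and to_long_rows, and the combined rows would be written with csv.DictWriter only after duplicate_keys returns an empty set. Null temperatures are kept as-is in this sketch so that the audit summary can count them.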