I am conducting academic research and need a developer to build a **reproducible data collection pipeline** for Spotify Daily Top 200 rankings. This is not a one-time scraping task; the goal is to create a **research-grade, structured, and re-executable dataset archive**.

- **Data Scope**
  * Countries: USA, Global, Japan (the system should allow future country expansion)
  * Period: January 1, 2022 – latest available date
  * Frequency: Daily

- **Required Output**
  For each country:
  * One unified CSV file (UTF-8)
  * Streams stored as numeric integers
  * Clean, standardized column names

  Minimum required fields:
  * date
  * rank
  * track_name
  * artist_names
  * streams

  If available, additional ranking-related fields (uri, source, peak_rank, previous_rank, days_on_chart, etc.) may also be included.

- **Critical Requirements (Very Important)**
  This project must meet academic research standards:
  * Fully reproducible process (clear README required)
  * Ability to re-fetch by country and date
  * Per-date logging: success/failure, HTTP status, retry count
  * Missing-date tracking (missing list file)
  * Respect for rate limits (no excessive access)
  * Graceful handling of minor changes to the source

  Environment: Windows
  Language: Python or R acceptable

- **Future Expansion (Not Required Now)**
  The system should allow future extension to include Spotify audio features (danceability, energy, tempo, etc.), but this is not part of the current scope.

Quality, reliability, and reproducibility are more important than speed.
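To illustrate the expected shape of the critical requirements (per-date retry, per-date success/failure logging, missing-date tracking, and a rate-limit delay), here is a minimal Python sketch. It is not the implementation: `fetch_chart` is a hypothetical injected callable standing in for whatever source access the developer chooses, and all names and parameters are assumptions.

```python
import time
from datetime import date, timedelta
from typing import Callable, Dict, List, Tuple

def daterange(start: date, end: date):
    """Yield every date from start to end, inclusive."""
    d = start
    while d <= end:
        yield d
        d += timedelta(days=1)

def collect_country(
    country: str,
    start: date,
    end: date,
    fetch_chart: Callable[[str, date], List[Dict]],  # hypothetical: returns row dicts or raises
    max_retries: int = 3,
    delay: float = 1.0,  # polite pause between retries, to respect rate limits
) -> Tuple[List[Dict], List[str], List[Tuple[str, str, int]]]:
    """Fetch one country's daily charts; return (rows, missing_dates, log)."""
    rows: List[Dict] = []
    missing: List[str] = []          # dates that failed after all retries
    log: List[Tuple[str, str, int]] = []  # (date, "success"/"failure", attempts used)
    for d in daterange(start, end):
        for attempt in range(1, max_retries + 1):
            try:
                rows.extend(fetch_chart(country, d))
                log.append((d.isoformat(), "success", attempt))
                break
            except Exception:
                if attempt == max_retries:
                    missing.append(d.isoformat())
                    log.append((d.isoformat(), "failure", attempt))
                else:
                    time.sleep(delay)
    return rows, missing, log
```

The `missing` list would be written to the missing-list file and the `log` tuples to the per-date log (extended with HTTP status once a concrete fetcher exists); re-fetching by country and date falls out of calling `collect_country` with a one-day range.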