# Polymarket Data Aggregation & Analytics Platform

## Project Overview
Build a data pipeline to collect, store, and query all historical and real-time data from Polymarket's APIs for market analysis and insights.

## Requirements

### 1. Data Collection System
- **Initial Bulk Download**: fetch all historical data from Polymarket's APIs:
  - Markets (all statuses: active, closed, archived)
  - Events and their associated markets
  - Order book data (current state)
  - Historical trade data
  - Historical price data (time series)
- **Incremental Updates**: set up scheduled jobs to fetch new/updated data every 5 minutes
- **Rate Limiting**: respect API limits; implement exponential backoff and error handling

### 2. Database Design & Storage
Use PostgreSQL. The database must store:
- Markets metadata (questions, outcomes, dates, status)
- Order book snapshots (timestamped)
- Trade history
- Price history (OHLCV-style if possible)
- Market events and state changes

### 3. API Endpoints to Integrate
- https://gamma-api.polymarket.com/markets
- https://gamma-api.polymarket.com/events
- https://clob.polymarket.com/book (order books)
- https://clob.polymarket.com/prices-history (time series)
- https://data-api.polymarket.com/trades (historical trades)
- …and any other endpoints that are available

### 4. Query Interface
Build a simple REST API or dashboard that allows users to:
- Search markets by keyword, category, or date range
- Filter by volume, liquidity, or status
- Get historical price charts for any market
- View order book history over time
- View aggregate statistics (total volume, market count, etc.)

### 5. Key Features
- **Market Screener**: find markets closing in the next X minutes with volume above a threshold
- **Historical Analysis**: track price movements and volume trends
- **Liquidity Monitoring**: monitor bid/ask spreads and depth over time
- **Data Export**: allow CSV/JSON export of query results
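As a starting point for the bulk-download and rate-limiting requirements above, here is a minimal Python sketch of a fetcher with exponential backoff. The endpoint URL is taken from the list above; the `limit`/`offset` pagination parameters and all function names are illustrative assumptions, not confirmed details of the Polymarket API.

```python
import time

import requests

GAMMA_MARKETS_URL = "https://gamma-api.polymarket.com/markets"


def fetch_with_backoff(url, params=None, max_retries=5, base_delay=1.0):
    """GET a URL as JSON, retrying with exponential backoff on errors or 429s."""
    for attempt in range(max_retries):
        try:
            resp = requests.get(url, params=params, timeout=30)
            if resp.status_code == 429:  # rate-limited: back off and retry
                time.sleep(base_delay * (2 ** attempt))
                continue
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))


def fetch_all_markets(page_size=100):
    """Page through all markets (assumes limit/offset pagination)."""
    offset = 0
    while True:
        batch = fetch_with_backoff(
            GAMMA_MARKETS_URL, params={"limit": page_size, "offset": offset}
        )
        if not batch:
            break
        yield from batch
        offset += page_size
```

The same wrapper can front all five endpoints, so backoff and error handling live in one place rather than being repeated per data source.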
### 6. Additional Features (once the basic system works)
- WebSocket integration for real-time updates
- Advanced analytics (market volatility, arbitrage detection)
- Automated alerts (market closing soon, price movements)
- Data visualization dashboard
- Dockerized deployment

## Technical Stack
- **Backend**: Python (FastAPI/Flask) or Node.js (Express)
- **Database**: PostgreSQL
- **Scheduler**: cron jobs, APScheduler, or Celery
- **Optional Dashboard**: React, Streamlit, or Grafana

## Deliverables
1. **Data Pipeline**: working scripts to download and update all Polymarket data
2. **Database**: properly structured, with indexes for fast queries
3. **Frontend**: interface to view and analyze markets
4. **Documentation**:
   - Setup instructions (README)
   - Database schema documentation
   - API endpoint documentation
5. **Query API/Dashboard**: interface to search and analyze stored data
6. **Deployment Guide**: instructions for running on the cloud (AWS/GCP/Heroku) or a local server

## Timeline
2–3 weeks

## Evaluation Criteria
- Code quality and documentation
- Database performance (must handle millions of records)
- API response times (<500 ms for typical queries)
- Error handling and reliability
- Data accuracy and completeness

## Questions for Bidders
1. Which database would you recommend, and why?
2. What are the estimated storage requirements for 1 year of data?
3. What is your experience with similar data pipeline projects?
4. Can you provide a proposed architecture diagram?
5. How many hours do you tentatively estimate the project will take at this point?

## Note
If your GitHub doesn't show similar relevant projects, or your old reviews don't show such products, then it will be a no.
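To illustrate the Market Screener requirement ("markets closing in the next X minutes with volume above a threshold"), here is a minimal sketch. It uses SQLite so the example is self-contained; the delivered system would target PostgreSQL, and the table and column names are illustrative assumptions, not a prescribed schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE markets (
        id       TEXT PRIMARY KEY,
        question TEXT,
        status   TEXT,
        end_date TEXT,   -- ISO-8601 UTC timestamp
        volume   REAL
    )"""
)
# A composite index on (status, end_date) keeps this screener fast
# even at millions of rows, per the evaluation criteria.
conn.execute("CREATE INDEX idx_markets_closing ON markets (status, end_date)")


def screen_closing_markets(conn, minutes, min_volume):
    """Return active markets closing within `minutes` with volume > min_volume."""
    now = datetime.now(timezone.utc)
    cutoff = now + timedelta(minutes=minutes)
    rows = conn.execute(
        """SELECT id, question, end_date, volume
           FROM markets
           WHERE status = 'active'
             AND end_date BETWEEN ? AND ?
             AND volume > ?
           ORDER BY end_date""",
        (now.isoformat(), cutoff.isoformat(), min_volume),
    )
    return rows.fetchall()
```

Storing timestamps in a single canonical ISO-8601 UTC format lets the range comparison work lexicographically here; in PostgreSQL a native `timestamptz` column would serve the same purpose.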
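The Data Export feature can be a thin serializer over query results. A stdlib-only sketch, assuming query results arrive as a list of dicts (the function name and interface are illustrative):

```python
import csv
import io
import json


def export_rows(rows, fmt="json"):
    """Serialize query results (a list of dicts) to a JSON or CSV string."""
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if fmt == "csv":
        if not rows:
            return ""
        buf = io.StringIO()
        # Use the first row's keys as the CSV header.
        writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")
```

Wiring this to the query API is then a single endpoint parameter (e.g. `?format=csv`), with the appropriate `Content-Type` header set on the response.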