Fix ESPN NBA Box Score Data Extraction (Python/BeautifulSoup)

Client: AI | Published: 14.10.2025
Budget: $30

Project Overview:
I have a Python script that scrapes live NBA box scores from ESPN during games and updates player statistics in a Supabase database. The script works well for everything EXCEPT the HTML parsing/data extraction portion. I need someone to fix just the data extraction function.

Current Problem:
The script successfully:
- Fetches live games from the database
- Scrapes HTML from ESPN box scores using a proxy
- Connects to Supabase and updates player records

The script FAILS at:
- Correctly extracting player statistics from ESPN's HTML
- Matching the right stats to the right players
- Currently it either misses players, assigns wrong stats, or creates fake stats for DNP players

Technical Details:
- URL Format: https://www.espn.com/nba/boxscore/_/gameId/{game_id}
- Current Stack: Python, BeautifulSoup, requests with ScraperAPI proxy
- ESPN Structure: Uses split tables (one for player names, one for stats), with data-idx attributes
- Required Stats: MIN, FG (made-attempted), 3PT (made-attempted), FT (made-attempted), OREB, DREB, REB, AST, STL, BLK, TO, PF, PTS

What I Need:
Fix the parse_box_score_html() function to:
- Extract stats for ALL players who played (typically 20-30 per game)
- Correctly match each player's name to their stats
- Skip DNP (Did Not Play) players entirely
- Return a list of dictionaries with player names and their actual stats from ESPN
- Never create fake/default values; only use actual ESPN data

Deliverables:
- Working parse_box_score_html() function that correctly extracts all player stats
- Brief explanation of ESPN's HTML structure and how you solved it
- Test results showing successful extraction of all players from a sample game

Provided Materials:
- Current script with working database/scraping infrastructure
- Sample HTML output from ESPN box score
- Example of expected output format
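
For orientation, here is a minimal sketch of one way the name-to-stats matching could work. It is not the existing script and has not been checked against ESPN's real markup. It assumes each team's box score sits in a wrapper holding exactly two tables (names first, stats second), that rows in both tables share a data-idx value, that the stats header row contains "MIN", and that a DNP row collapses into a single cell. The "team-box" class name and the sample HTML are hypothetical placeholders; only parse_box_score_html() and the data-idx attribute come from the description above.

```python
from bs4 import BeautifulSoup


def rows_by_idx(table):
    """Map each row's data-idx value to the row element."""
    return {tr["data-idx"]: tr for tr in table.find_all("tr", attrs={"data-idx": True})}


def cell_texts(row):
    """Return the stripped text of every cell in a row."""
    return [cell.get_text(strip=True) for cell in row.find_all(["td", "th"])]


def parse_box_score_html(html):
    """Return one dict per player who played; DNP rows are skipped."""
    soup = BeautifulSoup(html, "html.parser")
    players = []
    # Assumption: "team-box" is a hypothetical wrapper class, one per team.
    for wrapper in soup.find_all("div", class_="team-box"):
        tables = wrapper.find_all("table")
        if len(tables) != 2:
            continue  # not the split names/stats layout this sketch expects
        name_rows = rows_by_idx(tables[0])
        stat_rows = {idx: cell_texts(tr) for idx, tr in rows_by_idx(tables[1]).items()}
        # Assumption: the stats header row is the one containing "MIN".
        headers = next((cells for cells in stat_rows.values() if "MIN" in cells), None)
        if headers is None:
            continue
        for idx, name_row in name_rows.items():
            link = name_row.find("a")
            cells = stat_rows.get(idx)
            # Rows without a player link (section labels, team totals) are ignored.
            if link is None or cells is None:
                continue
            # Assumption: a DNP row has one merged cell, so the count will not match.
            if len(cells) != len(headers):
                continue
            players.append({"name": link.get_text(strip=True), **dict(zip(headers, cells))})
    return players


# Hypothetical, heavily reduced sample; real pages carry all 13 stat columns.
SAMPLE_HTML = """
<div class="team-box">
  <table>
    <tr data-idx="0"><td>starters</td></tr>
    <tr data-idx="1"><td><a href="#">A. Player</a></td></tr>
    <tr data-idx="2"><td><a href="#">B. Reserve</a></td></tr>
  </table>
  <table>
    <tr data-idx="0"><td>MIN</td><td>FG</td><td>PTS</td></tr>
    <tr data-idx="1"><td>34</td><td>9-17</td><td>24</td></tr>
    <tr data-idx="2"><td colspan="3">DNP-COACH'S DECISION</td></tr>
  </table>
</div>
"""

if __name__ == "__main__":
    for player in parse_box_score_html(SAMPLE_HTML):
        print(player)
```

Values are kept as the strings found in the page, and column names come from the page's own header row, so nothing is defaulted or invented. A row that does not line up with the header is dropped rather than padded.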