Data Collection

Client: AI | Published: 26.02.2026
Budget: $25

I’m compiling a structured dataset of product information pulled directly from several e-commerce websites. The raw HTML pages, pagination, and any AJAX-loaded blocks all need to be parsed so that prices, product names, descriptions, ratings, image URLs, and category paths are captured without omissions. The end audience for the insights is the general public, so accuracy and clean formatting are essential: no broken characters or half-filled fields.

A single Python script (Scrapy, BeautifulSoup, or an equivalent framework is fine) that I can run on my own machine will work, provided it includes simple configuration for target URLs and polite scraping features such as rate limiting, user-agent rotation, and proxy support.

Deliverables
• Well-documented source code with clear instructions
• One sample export (CSV or JSON) showing all required fields
• A brief README outlining prerequisites, run commands, and any third-party libraries used

I’ll share the exact site list once we start; for now assume standard retail storefronts with multi-page product listings.
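
For reference, the configuration, polite-scraping, and export requirements above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the Python standard library rather than Scrapy or BeautifulSoup, the names ScraperConfig, PoliteFetcher, find_incomplete, and to_csv are hypothetical, the default values are placeholders, and per-site HTML parsing is left out because the site list is not known yet.

```python
import csv
import io
import itertools
import time
import urllib.request
from dataclasses import dataclass, field

# Field names follow the posting; the column order is an assumption.
FIELDS = ["name", "price", "description", "rating", "image_url", "category_path"]


@dataclass
class ScraperConfig:
    """Hypothetical configuration object; every default is a placeholder."""
    start_urls: list = field(default_factory=list)
    delay_seconds: float = 2.0  # minimum pause between requests
    user_agents: list = field(default_factory=lambda: ["example-bot/0.1"])
    proxy: str = ""  # e.g. "http://127.0.0.1:8080"; empty means no proxy


class PoliteFetcher:
    """Spaces out requests and rotates the User-Agent header."""

    def __init__(self, config, clock=time.monotonic, sleep=time.sleep):
        self.config = config
        self._agents = itertools.cycle(config.user_agents)
        self._clock = clock
        self._sleep = sleep
        self._last_request = None

    def wait_turn(self):
        """Sleep until delay_seconds have passed since the previous request."""
        if self._last_request is not None:
            elapsed = self._clock() - self._last_request
            remaining = self.config.delay_seconds - elapsed
            if remaining > 0:
                self._sleep(remaining)
        self._last_request = self._clock()

    def build_request(self, url):
        """Return a Request carrying the next User-Agent in the rotation."""
        agent = next(self._agents)
        return urllib.request.Request(url, headers={"User-Agent": agent})

    def fetch(self, url, timeout=30):
        """Download one page as text, honouring the delay and optional proxy."""
        self.wait_turn()
        handlers = []
        if self.config.proxy:
            proxies = {"http": self.config.proxy, "https": self.config.proxy}
            handlers.append(urllib.request.ProxyHandler(proxies))
        opener = urllib.request.build_opener(*handlers)
        with opener.open(self.build_request(url), timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")


def find_incomplete(records):
    """Return (index, missing_fields) for records with empty or absent fields."""
    problems = []
    for index, record in enumerate(records):
        missing = [
            name for name in FIELDS
            if record.get(name) is None or not str(record.get(name)).strip()
        ]
        if missing:
            problems.append((index, missing))
    return problems


def to_csv(records):
    """Serialise records to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

In this sketch, a run would build a ScraperConfig from the target URLs, call fetch for each listing page, pass the parsed records through find_incomplete to catch half-filled fields, and write the result of to_csv to a file opened with encoding="utf-8" so that non-ASCII characters survive. The clock and sleep parameters exist only so the rate limiting can be checked without real waiting.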