Online Store Catalogue Data Extraction

Budget: $250

I need a reliable script that can crawl a single online store and pull down the full catalogue in one sweep. The data points I care about are:

• product categories
• product name
• full description
• price
• reference/SKU
• technical datasheet link or text
• image URLs (all available images, not just the first thumbnail)

Deliver the extracted information in a clean, well-structured Excel spreadsheet. I don’t need any extra filtering or sorting; just capture everything exactly as it appears on the site.

Scope and expectations

1. Build an easily repeatable scraper (Python + BeautifulSoup/Scrapy, Node + Puppeteer, or another language you’re comfortable with).
2. Handle pagination, category hierarchies, and any lazy-loaded images.
3. Store raw images only as URLs; no need to download files.
4. Include simple setup notes so I can rerun the script on my own machine when the catalogue updates.
5. Provide a short demo run or sample sheet so I can verify field mapping before final delivery.

If the store uses basic anti-bot measures (e.g., user-agent checks or modest rate limiting), please account for that with polite delays or rotating headers; no heavy browser automation is expected unless required.

Share your proposed tech stack, estimated turnaround, and a brief example of a similar scrape you’ve done so I know you can hit all fields accurately.
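
To make the image and anti-bot expectations concrete, here is a rough sketch of what I have in mind, using only the Python standard library. It is an illustration under assumptions, not a required design: the attribute names in LAZY_ATTRS, the user-agent strings, and the helper names (ImageCollector, extract_image_urls, polite_headers, polite_sleep) are all hypothetical, and the real store's markup may use different attributes.

```python
import itertools
import random
import time
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed attribute names for lazy-loaded images; inspect the real pages first.
LAZY_ATTRS = ("src", "data-src", "data-lazy-src", "data-original")

# Illustrative user-agent strings to rotate through (assumed values).
USER_AGENTS = itertools.cycle([
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) CatalogueBot/0.1",
    "Mozilla/5.0 (X11; Linux x86_64) CatalogueBot/0.1",
])


class ImageCollector(HTMLParser):
    """Collect every image URL on a page, including lazy-loaded ones."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        candidates = [attrs.get(name) for name in LAZY_ATTRS]
        # srcset holds comma-separated "url descriptor" pairs
        # (naive split; URLs that contain commas would need more care).
        for part in (attrs.get("srcset") or "").split(","):
            fields = part.split()
            if fields:
                candidates.append(fields[0])
        for value in candidates:
            if not value or value.startswith("data:"):
                continue  # skip missing attributes and inline placeholders
            absolute = urljoin(self.base_url, value)
            if absolute not in self.urls:
                self.urls.append(absolute)


def extract_image_urls(html, base_url):
    """Return absolute, de-duplicated image URLs in page order."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    return parser.urls


def polite_headers():
    """Return request headers with the next user agent in the rotation."""
    return {"User-Agent": next(USER_AGENTS)}


def polite_sleep(min_s=1.0, max_s=3.0):
    """Pause for a random interval between requests."""
    time.sleep(random.uniform(min_s, max_s))
```

Calling extract_image_urls(page_html, page_url) on each product page would give the list for the image URLs column, while polite_headers() and polite_sleep() would wrap each request. Writing the rows to Excel would need a third-party library such as openpyxl or pandas, which this sketch leaves out.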

Python
