Python Playwright Scraper Migration

Client: AI | Published: 27.10.2025

I have an in-house Chrome extension that currently handles web scraping tasks on several e-commerce sites. It collects product details, prices, stock status, and pushes the data into our backend via an API call. The extension works, but maintaining it is becoming painful, so I’d like the entire workflow rebuilt in Python using either Playwright (preferred for speed and headless stability) or Selenium if you can demonstrate clear advantages.

Scope
• Replicate every data-gathering step performed by the existing extension. You’ll have access to the manifest, content scripts, and a short screencast that shows the routine in action.
• Implement the scraper in clean, modular Python with sensible separation between navigation logic, data extraction, and API posting.
• Target sites are mainstream e-commerce platforms; they use infinite scroll, lazy-loaded images, and occasional anti-bot measures, so please factor in scrolling, wait-for-selector logic and basic stealth.
• Command-line flags (or a simple .env file) should let me switch between Playwright and Selenium drivers if both are included.

Deliverables
1. Fully annotated Python project (Playwright and/or Selenium) mirroring current extension functionality
2. Requirements.txt / Pipfile with pinned versions
3. README that explains setup, environment variables, and running the scraper on Windows and Linux
4. Light documentation mapping each former extension feature to the new code module
5. One recorded demo run showing successful extraction on at least two target e-commerce sites

Timeline
I need a working first pass in no more than four weeks, so please outline your milestones accordingly.

Acceptance
I’ll run the script against our staging API. If the extracted JSON matches the schema produced by the extension and completes without manual intervention, the milestone is approved.
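To give a rough idea of the kind of schema check I have in mind for acceptance, here is a minimal sketch using only the standard library. The field names (`url`, `name`, `price`, `in_stock`) and the `schema_errors` helper are placeholders for illustration; the real schema is whatever the extension currently emits.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s).
# The real field list comes from the existing extension's output.
EXPECTED_SCHEMA = {
    "url": str,
    "name": str,
    "price": (int, float),
    "in_stock": bool,
}


def schema_errors(record, schema=EXPECTED_SCHEMA):
    """Return a list of mismatches; an empty list means the record conforms."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
            continue
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly
        # for any field that is not declared as bool.
        if isinstance(value, bool) and expected is not bool:
            errors.append(f"wrong type for {field}: bool")
        elif not isinstance(value, expected):
            errors.append(f"wrong type for {field}: {type(value).__name__}")
    # Flag fields the schema does not know about.
    for field in record:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


if __name__ == "__main__":
    sample = json.loads(
        '{"url": "https://example.com/p/1", "name": "Widget", '
        '"price": 9.99, "in_stock": true}'
    )
    print(schema_errors(sample))
```

A record passes when the returned list is empty; anything else is a mismatch I would ask you to fix before approving the milestone.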
Let me know which framework you’re leaning toward, any libraries you’d pair with it (pandas, BeautifulSoup, playwright-stealth, etc.), and your rough milestone breakdown. I’m happy to answer questions or share code snippets once we get started.
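For reference, the driver switch and the infinite-scroll handling from the scope could look roughly like the sketch below. It is only an illustration under assumptions: the `SCRAPER_DRIVER` variable, the `--url` flag, the function names, and the scroll distance are all made up here, and the Selenium path is deliberately left unimplemented. The Playwright calls used (`wait_for_selector`, `query_selector_all`, `mouse.wheel`, `wait_for_timeout`) are from its Python sync API.

```python
import argparse
import os


def parse_args(argv=None):
    """Pick the driver from --driver, falling back to the SCRAPER_DRIVER env var."""
    parser = argparse.ArgumentParser(description="E-commerce scraper (sketch)")
    parser.add_argument(
        "--driver",
        choices=["playwright", "selenium"],
        # Hypothetical variable name; it could equally be loaded from a .env file.
        default=os.environ.get("SCRAPER_DRIVER", "playwright"),
    )
    parser.add_argument("--url", required=True)
    return parser.parse_args(argv)


def scroll_until_stable(page, item_selector, max_rounds=20, pause_ms=1000):
    """Scroll until the number of matched items stops growing.

    Returns the last item count observed before the loop ended.
    """
    page.wait_for_selector(item_selector)
    previous = 0
    for _ in range(max_rounds):
        current = len(page.query_selector_all(item_selector))
        if current == previous:
            break  # nothing new was loaded by the last scroll
        previous = current
        page.mouse.wheel(0, 4000)        # scroll down to trigger lazy loading
        page.wait_for_timeout(pause_ms)  # give the site time to render new items
    return previous


def run_playwright(url, item_selector):
    """Navigation only: open the page headlessly and load all items."""
    # Imported lazily so a Selenium-only install does not need Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        count = scroll_until_stable(page, item_selector)
        browser.close()
    return count


def run(args, item_selector=".product"):
    """Dispatch to the selected driver; '.product' is a placeholder selector."""
    if args.driver == "playwright":
        return run_playwright(args.url, item_selector)
    raise NotImplementedError("Selenium path is not part of this sketch")
```

Something like `run(parse_args(["--driver", "playwright", "--url", "https://example.com"]))` would then drive a single page. Extraction and API posting would live in their own modules, so swapping the driver touches only the navigation layer.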