Multi-Platform Company & Investor Scraper

Customer: AI | Published: 22.11.2025

I need a reliable web scraper that automatically gathers both company and investor information from LinkedIn, Crunchbase and Glassdoor. The workflow should run unattended, respect each platform’s terms of service, and export clean, well-structured data I can slot straight into my analytics pipeline.

Scope of data
• Companies: name, full address, key executives and high-level financial details (funding rounds, revenue ranges, headcount, etc.).
• Investors: name and short profile, past investment history and direct contact information where publicly available.

Technical expectations
Python is preferred: Scrapy or Selenium for dynamic pages, plus BeautifulSoup or similar for parsing. If an official API is the faster route on any platform, I’m open to that. Please build in rate-limiting, login handling, pagination, basic CAPTCHA mitigation and an easy way for me to swap credentials or proxy lists.

Deliverables
1. Well-documented source code.
2. A brief README explaining setup, dependencies and how to extend the scraper to new fields or sites.
3. Sample output in CSV or JSON for a small test batch so I can validate field mapping.
4. Simple logging so I can track success / failure counts per run.

Acceptance
I’ll consider the job complete once the script runs on my machine, pulls the data points above for a provided list of targets and outputs a tidy file with at least 95 % field-fill rate over that list.

Let me know your preferred stack, any past experience scraping these specific sites, and the timeline you’ll need.
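
To make the 95 % field-fill threshold concrete, here is roughly how it could be measured on the output file. This is only a minimal sketch under assumptions: records are flat dictionaries, the rate is counted over all (record, field) cells, and None, blank strings and empty lists count as unfilled. The names `is_filled`, `field_fill_rate` and `REQUIRED_FIELDS`, as well as the sample records, are hypothetical and not part of any agreed schema.

```python
from typing import Any, Mapping, Sequence


def is_filled(value: Any) -> bool:
    """Assumption: None, blank strings and empty containers count as unfilled."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (list, tuple, dict, set)):
        return len(value) > 0
    return True  # numbers (including 0) and other values count as filled


def field_fill_rate(records: Sequence[Mapping[str, Any]],
                    fields: Sequence[str]) -> float:
    """Share of (record, field) cells that hold a usable value."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(1 for rec in records for f in fields if is_filled(rec.get(f)))
    return filled / total


# Hypothetical field names and made-up sample rows, for illustration only.
REQUIRED_FIELDS = ["name", "address", "executives", "funding_rounds"]
sample = [
    {"name": "Example Co", "address": "1 Main St",
     "executives": ["A. Person"], "funding_rounds": "Series A"},
    {"name": "Other Ltd", "address": "",
     "executives": [], "funding_rounds": "Seed"},
]

rate = field_fill_rate(sample, REQUIRED_FIELDS)
print(f"fill rate: {rate:.1%}")  # 6 of 8 cells filled -> fill rate: 75.0%
print("PASS" if rate >= 0.95 else "FAIL")
```

If you would count the rate differently (for example per record rather than per cell, or excluding fields that are not publicly available for a given target), please say so in your proposal so we agree on the definition up front.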