Automated MCA New Incorporation Data Scraper

We are looking for an experienced developer who can build an automated system to extract daily newly incorporated company data from the MCA (Ministry of Corporate Affairs) website (https://www.mca.gov.in). The system should automatically collect and deliver the list of companies incorporated each day in a structured format (Excel / CSV / API / Database).

Scope of Work:

Develop a web scraping or API-based solution to extract daily incorporated company data from the MCA portal. The tool should automatically fetch newly incorporated companies every day.

Data should include the following fields (minimum):
- CIN
- Company Name
- Date of Incorporation
- ROC (Registrar of Companies)
- State
- Company Type (Private Limited / LLP / OPC / Public Limited)
- Authorized Capital (if available)
- Registered Office Address (if available)

The system should:
- Run automatically daily (cron / scheduler)
- Avoid duplicate records
- Export data to Excel / CSV / Google Sheets / API endpoint
- Handle captcha or dynamic website structure if applicable

Preferred technologies:
- Python (BeautifulSoup / Scrapy / Selenium)
- Node.js
- Puppeteer / Playwright
- Any robust scraping framework

Deliverables:
- Fully working scraper or API system
- Source code
- Documentation for running the script
- Optional: Dashboard or automated email delivery of daily data

Additional Preferred Features (Bonus):
- Historical data scraping
- Cloud deployment (AWS / DigitalOcean / VPS)
- API endpoint to access the data
- Auto-updating Google Sheet

Expected Output Example:

Project Type: One-time project with potential for long-term maintenance.

Please include in your proposal:
- Your experience with web scraping government portals
- Technologies you plan to use
- Estimated delivery timeline
- Sample similar projects (if any)

If you can also extract data from the MCA V3 portal or handle captcha systems, please mention it in your proposal.
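To make the "avoid duplicate records" and CSV export requirements concrete, here is a minimal sketch of how a daily run might append only unseen companies to a CSV file, using the CIN as the unique key. The column names, the `append_new_records` and `load_seen_cins` helpers, and the shape of the input records are illustrative assumptions, not part of any MCA interface; fetching the data from the portal is deliberately left out.

```python
import csv
from pathlib import Path

# Assumed column names, mirroring the minimum fields listed in the posting.
FIELDS = [
    "cin",
    "company_name",
    "date_of_incorporation",
    "roc",
    "state",
    "company_type",
    "authorized_capital",
    "registered_office_address",
]


def load_seen_cins(csv_path):
    """Return the set of CINs already stored in the CSV (empty if no file)."""
    path = Path(csv_path)
    if not path.exists():
        return set()
    with path.open(newline="", encoding="utf-8") as f:
        return {row["cin"] for row in csv.DictReader(f)}


def append_new_records(csv_path, records):
    """Append records whose CIN is not yet in the CSV; return how many were added.

    `records` is an iterable of dicts produced by a (hypothetical) scraper step.
    Missing optional fields are written as empty cells.
    """
    path = Path(csv_path)
    seen = load_seen_cins(path)
    write_header = not path.exists() or path.stat().st_size == 0
    added = 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        if write_header:
            writer.writeheader()
        for rec in records:
            # Normalise the key so case or whitespace differences do not create duplicates.
            cin = (rec.get("cin") or "").strip().upper()
            if not cin or cin in seen:
                continue
            seen.add(cin)
            writer.writerow({**rec, "cin": cin})
            added += 1
    return added
```

A scheduled job (cron or similar) could call `append_new_records("mca_incorporations.csv", todays_records)` once per day; running it twice on the same input adds nothing the second time. The file name and `todays_records` are placeholders.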

Python
