Platform for gathering data from different sources in Saudi Arabia

Client: AI | Published: 21.10.2025
Budget: $750

Skilled Web Scraper and Data Engineer Needed for Real Estate Data System in Saudi Arabia

Project Title: Development of an Automated Web Scraping System for Saudi Real Estate Data

Project Description: We are seeking an experienced freelancer or team to develop a robust system for collecting, storing, exporting, and analyzing real estate data from multiple Saudi Arabian websites. The system will scrape data based on predefined criteria, store it in an online database, enable flexible data exports, and provide advanced analytics based on specified KPIs. The focus is on real estate listings in Saudi Arabia, primarily in Arabic, targeting platforms like Aqar.fm and potentially others (e.g., Bayut.sa, PropertyFinder.sa). The system must handle Arabic content, comply with local regulations, and use ethical scraping practices (e.g., APIs where available, respectful scraping with delays).

Scope of Work: The freelancer will develop the system in phases. Please provide a detailed proposal with your tech stack (e.g., Python with Scrapy/Selenium, Supabase/PostgreSQL, Google Sheets), timeline, and pricing.

Requirements Gathering and Design (Phase 1 - 1 week):
- Define data fields (e.g., property ID, type, location, price, size, dates, publisher details, additional attributes like bedrooms or street width).
- Specify target websites (start with Aqar.fm; expand to 2-3 others if feasible).
- Design the system architecture: scraping scripts, database schema, export formats, analytics structure.
- Deliverable: design document (PDF/Word) with architecture and dashboard wireframes.

Web Scraping System Development (Phase 2 - 2-3 weeks):
- Build scrapers in Python (Scrapy, BeautifulSoup, or Selenium for dynamic sites); AI tools and agents are also suitable, and preferred. An illustrative spider sketch follows the phase list.
- Implement filtering based on criteria (e.g., property type, Riyadh districts, price ranges, listing age).
- Schedule scraping for frequent updates (e.g., every 4-24 hours) to capture changes.
- Handle pagination, anti-scraping measures, Arabic text encoding, and proxies to avoid blocks.
- Deliverable: working scraper code (GitHub repo), sample data output.

Online Database Implementation (Phase 3 - 1-2 weeks):
- Set up a scalable online database (Supabase preferred, Google Sheets as a fallback, or cloud-hosted PostgreSQL).
- Create a schema for properties, historical data, and metadata; a schema sketch follows the phase list.
- Enable real-time data insertion/updates and user access controls.
- Deliverable: database setup with test data, access credentials.

Data Export Functionality (Phase 4 - 1 week):
- Develop export capabilities for filtered data (we will share an .xls file with all required fields).
- Support CSV, Excel, JSON, and PDF report formats; allow column selection. An export sketch follows the phase list.
- Provide API endpoints or UI buttons for exports.
- Deliverable: export scripts/tools, sample exports.

Analytics and Reporting Engine (Phase 5 - 2 weeks):
- Build an analytics module for KPIs (e.g., summaries by property type, location, or value metrics); an analytics sketch follows the phase list.
- Generate visualizations (charts via Matplotlib or a dashboard) and reports.
- Implement alerts for specific conditions (e.g., email/WhatsApp notifications).
- Deliverable: analytics scripts/dashboard (e.g., Streamlit or Supabase-integrated), sample reports.

Testing, Deployment, and Maintenance (Phase 6 - 1 week + ongoing):
- Test for accuracy, scalability, and error handling.
- Deploy to the cloud (e.g., Vercel, AWS) with scheduled scraping.
- Provide documentation and training.
- Offer 1 month of post-deployment support.
- Deliverable: deployed system, user guide.
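To make the Phase 2 expectations concrete, below is a minimal sketch of the kind of polite Scrapy spider we have in mind. The start URL and every CSS selector in it are hypothetical placeholders; the real ones depend on Aqar.fm's actual markup and would be fixed during Phase 1.

```python
# Minimal polite-scraping sketch (Phase 2). All selectors and the start URL
# are assumed placeholders, not Aqar.fm's real markup.
import scrapy


class AqarSpider(scrapy.Spider):
    name = "aqar"
    start_urls = ["https://sa.aqar.fm/listings"]  # hypothetical index page

    custom_settings = {
        "ROBOTSTXT_OBEY": True,           # honor the site's robots.txt
        "DOWNLOAD_DELAY": 5,              # respectful delay between requests
        "AUTOTHROTTLE_ENABLED": True,     # back off automatically under load
        "FEED_EXPORT_ENCODING": "utf-8",  # keep Arabic text intact in feeds
        "USER_AGENT": "RealEstateResearchBot/0.1 (contact@example.com)",
    }

    def parse(self, response):
        # ".listing" and the field selectors below are assumed class names.
        for card in response.css(".listing"):
            yield {
                "title": card.css(".title::text").get(),
                "price": card.css(".price::text").get(),
                "district": card.css(".district::text").get(),
                "url": response.urljoin(card.css("a::attr(href)").get()),
            }
        # Follow pagination until no "next" link remains.
        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)
```

Run with: scrapy runspider aqar_spider.py -o listings.json. For the 4-24 hour update cadence, the same spider can be triggered by cron or a cloud scheduler.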
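For Phase 3, one possible starting point for the schema, created over a plain PostgreSQL connection (Supabase exposes one). The fields mirror the field list above, but the exact names, types, and the connection string are assumptions to be finalized in Phase 1.

```python
# Phase 3 schema sketch: a properties table plus a price-history table so
# repeated scrape runs preserve changes over time. Connection string is a
# placeholder.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS properties (
    property_id    TEXT PRIMARY KEY,   -- listing ID from the source site
    source_site    TEXT NOT NULL,      -- e.g., 'aqar.fm'
    property_type  TEXT,
    district       TEXT,
    city           TEXT,
    price_sar      NUMERIC,
    size_sqm       NUMERIC,
    bedrooms       INTEGER,
    street_width_m NUMERIC,
    publisher      TEXT,
    listed_at      TIMESTAMPTZ,
    scraped_at     TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE IF NOT EXISTS price_history (
    property_id TEXT REFERENCES properties(property_id),
    price_sar   NUMERIC,
    observed_at TIMESTAMPTZ DEFAULT now()
);
"""

conn = psycopg2.connect(
    "postgresql://user:password@db.example.supabase.co:5432/postgres"  # placeholder
)
with conn, conn.cursor() as cur:  # commits on successful exit
    cur.execute(DDL)
conn.close()
```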
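For Phase 4, a sketch of a filtered export with column selection via Pandas. The filter values and column list are examples only; the definitive field list comes from the .xls file mentioned above.

```python
# Phase 4 export sketch: pull a filtered slice, keep selected columns, and
# write CSV/Excel/JSON. Filter values and column names are illustrative.
import pandas as pd
import psycopg2

conn = psycopg2.connect(
    "postgresql://user:password@db.example.supabase.co:5432/postgres"  # placeholder
)
df = pd.read_sql(
    "SELECT * FROM properties WHERE city = %s AND price_sar <= %s",
    conn,
    params=("Riyadh", 2_000_000),
)
conn.close()

columns = ["property_id", "property_type", "district", "price_sar", "size_sqm"]
subset = df[columns]

# utf-8-sig adds a BOM so Excel displays Arabic text correctly in the CSV.
subset.to_csv("riyadh_listings.csv", index=False, encoding="utf-8-sig")
subset.to_excel("riyadh_listings.xlsx", index=False)  # requires openpyxl
subset.to_json("riyadh_listings.json", orient="records", force_ascii=False)
```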
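For Phase 5, a sketch of one example KPI (median price per square metre by district) with a simple Matplotlib bar chart; the real KPI set would be agreed in Phase 1. Note that Arabic district labels may need the arabic_reshaper and python-bidi packages to render correctly in Matplotlib.

```python
# Phase 5 analytics sketch: an example KPI and chart. The KPI definition is
# illustrative only.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("riyadh_listings.csv")  # or read straight from the database
df["price_per_sqm"] = df["price_sar"] / df["size_sqm"]

# Median price per sqm and listing count per district, highest first.
kpi = (
    df.groupby("district")["price_per_sqm"]
    .agg(["median", "count"])
    .sort_values("median", ascending=False)
)
print(kpi.head(10))

kpi["median"].head(10).plot(kind="bar")
plt.ylabel("Median price (SAR/sqm)")
plt.title("Top districts by median price per sqm")
plt.tight_layout()
plt.savefig("kpi_price_per_sqm.png")
```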
Requirements:
- Expertise in web scraping (Python tools), Arabic content handling, and ethical practices.
- Proficiency with databases (Supabase, Google Sheets, PostgreSQL), data export (Pandas), and analytics (SQL/Python).
- Portfolio with relevant projects (e.g., web scrapers, data dashboards).
- Bilingual skills (English/Arabic) preferred for communication and data handling.
- Compliance with website terms and local laws (e.g., using delays, headers, or APIs); a minimal compliance sketch follows this list.
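As a minimal illustration of the last point (delays, headers, robots.txt), assuming a hypothetical user agent string and contact address:

```python
# Compliance sketch: consult robots.txt, identify the client honestly, and
# pause between requests. User agent and URLs are placeholders.
import time
import urllib.robotparser

import requests

USER_AGENT = "RealEstateResearchBot/0.1 (contact@example.com)"  # hypothetical

rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://sa.aqar.fm/robots.txt")
rp.read()

url = "https://sa.aqar.fm/"  # placeholder page
if rp.can_fetch(USER_AGENT, url):
    resp = requests.get(url, headers={"User-Agent": USER_AGENT}, timeout=30)
    resp.encoding = resp.apparent_encoding  # guard against mislabeled Arabic pages
    print(resp.status_code)
    time.sleep(5)  # respectful delay before the next request
else:
    print("robots.txt disallows fetching this URL")
```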