Build Enterprise Data Intelligence Platform

Client: AI | Posted: 25.02.2026

Project Overview
================
We are looking for a highly experienced Full-Stack Architect / Big Data Engineer / Search Infrastructure Expert to build a large-scale web application similar to Volza, capable of handling:

- 10+ terabytes of structured trade data
- Trillions of rows
- Sub-second search performance
- Advanced filtering & analytics
- Production-grade scalability

This is not a basic CRUD web app. We are building an enterprise-grade data intelligence platform.

Core Objective
==============
If a user searches by:

- Product name
- HS code
- Importer / exporter name
- Country
- Shipment date range
- Address
- Port
- Any combination of filters

the results must return in seconds (ideally sub-second), even with trillions of records.

Expected Architecture Expertise
===============================
We expect the developer/team to propose and implement a scalable architecture covering:

- Distributed data storage
- Columnar database optimization
- Partitioning & indexing strategies
- Query acceleration techniques
- Caching layers
- Parallel query execution
- Horizontal scaling

Suggested Tech Stack (Open to Better Suggestions)
=================================================
Backend
- Python (FastAPI / Django) or Node.js
- Go (optional, for performance-critical services)

Database options
- ClickHouse (preferred)
- Apache Druid
- Elasticsearch
- BigQuery / Redshift
- Any distributed columnar DB

Frontend
- React / Next.js
- Advanced filtering UI
- Data grid with pagination & lazy loading

Infrastructure
- Kubernetes / Docker
- Load balancing
- CDN
- Caching (Redis)
- Object storage for raw data

Required Features
=================
Advanced search engine
- Full-text search
- Multi-filter query builder
- Auto-suggestions
- Fuzzy matching
- Aggregations (sum, count, trends)

Data Handling
=============
- Bulk data ingestion pipelines
- ETL processing
- Schema optimization
- Index optimization

Performance Requirements
========================
- Query response in seconds
- Pagination with deep-offset handling
- Parallel query execution
- Caching for repeated queries

Security & Access
=================
- User authentication
- Role-based access
- Paid subscription model (optional, phase 2)

Dataset Details
===============
- 10+ TB of structured shipment/export-import data
- Trillions of rows
- Continually growing dataset
- Structured but large-volume, relational-style data

Ideal Candidate
===============
- Experience building large-scale search platforms
- Hands-on experience with distributed databases
- Strong system-design background
- Experience optimizing heavy analytical queries
- Experience handling 1B+ rows minimum (preferably more)

Deliverables
============
- Complete system architecture design
- Scalable backend
- Optimized database schema
- High-performance search engine
- Production-ready deployment
- Documentation

Budget
======
Open to proposals (fixed-price / milestone-based preferred). Serious and experienced teams only.

Timeline
========
- Phase 1 MVP: 8–12 weeks
- Full production version: depends on architecture complexity
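To illustrate the partitioning strategy mentioned above: with a continually growing, date-heavy dataset, a common approach is to partition by shipment month so that date-range filters prune to a handful of partitions instead of scanning everything. A minimal sketch of deriving such a partition key in Python (the function name and monthly granularity are illustrative assumptions, not requirements from this brief):

```python
from datetime import date

def partition_id(shipment_date: date) -> str:
    """Monthly partition key (e.g. '202602'). With trillions of rows,
    a query filtered to a date range only touches the matching
    partitions rather than the whole table."""
    return f"{shipment_date.year:04d}{shipment_date.month:02d}"

# A six-month search window then maps to at most seven partitions,
# regardless of total dataset size.
print(partition_id(date(2026, 2, 25)))
```

In ClickHouse terms this would correspond to a `PARTITION BY toYYYYMM(shipment_date)` clause; the exact expression should come out of the proposed schema design.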
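The "pagination with deep-offset handling" requirement is usually met with keyset (seek-based) pagination rather than `LIMIT ... OFFSET`, since large offsets force the database to scan and discard all skipped rows. A minimal sketch in Python, using an in-memory SQLite table as a stand-in for the real columnar store (table and column names are illustrative assumptions):

```python
import sqlite3

# In-memory stand-in for the analytical store; schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shipments (id INTEGER PRIMARY KEY,"
    " shipment_date TEXT, product TEXT)"
)
rows = [(i, f"2026-01-{i % 28 + 1:02d}", f"product-{i}")
        for i in range(1, 1001)]
conn.executemany("INSERT INTO shipments VALUES (?, ?, ?)", rows)

def fetch_page(conn, last_id=0, page_size=100):
    """Keyset pagination: seek past the last seen key instead of using
    OFFSET, so page N costs roughly the same as page 1."""
    cur = conn.execute(
        "SELECT id, shipment_date, product FROM shipments "
        "WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    )
    return cur.fetchall()

page1 = fetch_page(conn)                        # rows with id 1..100
page2 = fetch_page(conn, last_id=page1[-1][0])  # rows 101..200, no OFFSET
```

The same pattern carries over to ClickHouse or any distributed store: the UI's "next page" token is the sort key of the last row served, not a numeric offset.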
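The "caching for repeated queries" requirement depends on a deterministic cache key: two logically identical filter combinations must map to the same entry regardless of parameter order. A minimal sketch, using a plain dict as a stand-in for Redis (function names and the `search:` key prefix are illustrative assumptions):

```python
import hashlib
import json

def cache_key(filters: dict) -> str:
    """Canonicalize the filter dict (sorted keys, compact separators)
    and hash it, so equivalent queries share one cache entry."""
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    return "search:" + hashlib.sha256(canonical.encode()).hexdigest()

# Stand-in for Redis; any key-value store with TTL support works here.
cache: dict = {}

def cached_search(filters: dict, run_query):
    key = cache_key(filters)
    if key not in cache:
        cache[key] = run_query(filters)  # hit the database only on a miss
    return cache[key]
```

In production the dict would be replaced by Redis `SET` with an expiry, and the cache would be invalidated or versioned when new shipment batches are ingested.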