We are building an industrial-grade AI system to automate order entry from customer emails and PDF attachments into an ERP system. This is not a chatbot and not a prompt-only LLM project.

The system must be deterministic, auditable, and safe for production ERP use, with confidence thresholds and human-in-the-loop review.

What the System Must Do

- Ingest customer orders from Microsoft Outlook emails (including attachments)
- Classify documents (order vs quote vs non-order)
- Extract structured order data:
  - Customer
  - Ship-to
  - Line items
  - Quantity and UOM
- Match free-text descriptions to ERP item master
- Apply confidence scoring and business rules
- Route low-confidence results to review (no auto-posting)
- Produce structured JSON suitable for ERP integration

Ingestion Requirements

Support at least one of the following:

- Microsoft Outlook Office Add-in (preferred)
- Drag-and-drop Outlook .msg files into the app
- Forwarding emails to a dedicated intake address

Must retain full traceability to the original email.

Technical Expectations

- Python-based ML pipeline
- NLP / ML models for classification and extraction
- OCR for scanned PDFs
- Embeddings or similarity search for item matching
- REST API for integration
- Confidence scoring and validation logic
- Model versioning and monitoring

What This Is NOT

- Not a chatbot
- Not prompt-engineering-only
- Not autonomous AI posting directly to ERP
- Not experimental or demo-only work

Reliability, repeatability, and auditability are mandatory.

Deliverables

- Working ingestion and extraction pipeline
- Structured order output (JSON)
- Confidence and validation logic
- Documentation and architecture overview

Ideal Freelancer

- Has built production ML systems
- Experience with document processing or IDP
- Comfortable with structured data and business rules
- Understands why ERP systems require guardrails
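
To make the confidence scoring, business rules, and review routing listed under What the System Must Do more concrete, here is a minimal Python sketch of the kind of output we have in mind. It is an illustration under stated assumptions, not a specification. The item master contents, the 0.85 threshold, the allowed UOM list, the field names (including `source_message_id` for traceability), and the status values are all hypothetical. String similarity from the standard library's `difflib` stands in for the embedding or similarity search a real implementation would use.

```python
import json
from dataclasses import dataclass, asdict, field
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical item master; in production this would be loaded from the ERP.
ITEM_MASTER = {
    "BOLT-M8-50": "Hex bolt M8 x 50 mm zinc plated",
    "NUT-M8": "Hex nut M8 zinc plated",
    "WASH-M8": "Flat washer M8",
}
ALLOWED_UOMS = {"EA", "BOX", "KG"}  # assumed list of valid units of measure
REVIEW_THRESHOLD = 0.85             # assumed value; would be tuned on labeled orders


@dataclass
class LineItem:
    description: str
    quantity: float
    uom: str
    item_id: Optional[str] = None
    match_confidence: float = 0.0
    errors: list = field(default_factory=list)


def match_item(description):
    """Return the closest item-master id and a similarity score in [0, 1].

    difflib is a deterministic placeholder for embedding-based matching.
    """
    def score(candidate):
        return SequenceMatcher(None, description.lower(), candidate.lower()).ratio()

    best_id = max(ITEM_MASTER, key=lambda key: score(ITEM_MASTER[key]))
    return best_id, score(ITEM_MASTER[best_id])


def build_order(customer, ship_to, raw_lines, source_message_id):
    """Match items, apply business rules, and decide whether review is needed."""
    lines = []
    for description, quantity, uom in raw_lines:
        item_id, confidence = match_item(description)
        line = LineItem(description, quantity, uom, item_id, round(confidence, 3))
        # Business rules are plain, deterministic checks that can be audited.
        if quantity <= 0:
            line.errors.append("quantity must be positive")
        if uom not in ALLOWED_UOMS:
            line.errors.append(f"unknown UOM: {uom}")
        lines.append(line)

    # The order is only as trustworthy as its weakest line.
    order_confidence = min((l.match_confidence for l in lines), default=0.0)
    needs_review = order_confidence < REVIEW_THRESHOLD or any(l.errors for l in lines)
    return {
        "source_message_id": source_message_id,  # link back to the original email
        "customer": customer,
        "ship_to": ship_to,
        "lines": [asdict(l) for l in lines],
        "confidence": order_confidence,
        # Nothing is posted to the ERP here; the status only drives routing.
        "status": "needs_review" if needs_review else "validated",
    }


order = build_order(
    "ACME Corp", "Dock 4, Springfield",
    [("Hex nut M8 zinc plated", 100, "EA"), ("blue widget", 5, "EA")],
    "msg-001",
)
print(json.dumps(order, indent=2))
```

In this example the second line has no close match in the item master, so the whole order is marked `needs_review` rather than passed on. Taking the minimum line confidence as the order confidence is one conservative choice among several; we are open to better-justified alternatives.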