Python AI Invoice PDF Parser

Client: AI | Published: 19.11.2025
Budget: $350

I need a clean, AI-assisted Python script that can open a range of vendor-supplied PDF invoices and pull out one thing perfectly: every line item together with the overall total. The primary goal is straightforward invoice PDF parsing, but I want the process to be smart enough to recognise different layouts and recover gracefully if a page relies on scanned text (so feel free to blend pdfplumber, PyPDF2, Tesseract OCR, LayoutLM, or whatever stack you trust).

When the script finishes, it must save two files side by side: a structured JSON document and an Excel workbook (.xlsx). No other formats are required.

Deliverables
• Modular Python 3 code with clear functions for file ingest, parsing logic, and export
• requirements.txt with pinned versions
• README that shows set-up, a one-line command to run the parser, and a note on how to extend field rules
• Sample JSON and Excel outputs generated from my test invoices

Acceptance criteria
• Every total amount and each individual line item from the supplied PDFs is captured exactly (numeric and text values)
• Works on at least three different invoice layouts without manual tweaks

If this sounds clear, let’s get it running.
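
To illustrate what "captured exactly" could look like in practice, here is a minimal sketch of the parsing and JSON export steps. It is only an illustration under assumptions, not a required design: it assumes the page text has already been extracted (by pdfplumber, OCR, or similar), and it assumes a hypothetical row layout of description, quantity, unit price, and line amount. The function names and regular expressions are invented for the example. Amounts are kept as `Decimal` and written to JSON as strings so that no value is altered by floating-point rounding; the .xlsx export is left out to keep the sketch standard-library only.

```python
import json
import re
from decimal import Decimal

# Assumed (hypothetical) row layout: "<description> <qty> <unit price> <amount>"
LINE_ITEM_RE = re.compile(
    r"^(?P<description>.+?)\s+(?P<quantity>\d+(?:\.\d+)?)\s+"
    r"(?P<unit_price>\d[\d,]*\.\d{2})\s+(?P<amount>\d[\d,]*\.\d{2})$"
)
# Assumed total line: "Total: $1,520.00" or "Grand total 1520.00"
TOTAL_RE = re.compile(
    r"^(?:grand\s+)?total\s*:?\s*\$?(?P<total>\d[\d,]*\.\d{2})$",
    re.IGNORECASE,
)


def to_decimal(text):
    """Convert '1,500.00' to Decimal('1500.00') without going through float."""
    return Decimal(text.replace(",", ""))


def parse_invoice_text(text):
    """Collect line items and the overall total from already-extracted text."""
    items, total = [], None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        total_match = TOTAL_RE.match(line)
        if total_match:
            total = to_decimal(total_match.group("total"))
            continue
        item_match = LINE_ITEM_RE.match(line)
        if item_match:
            items.append({
                "description": item_match.group("description"),
                "quantity": Decimal(item_match.group("quantity")),
                "unit_price": to_decimal(item_match.group("unit_price")),
                "amount": to_decimal(item_match.group("amount")),
            })
    return {"line_items": items, "total": total}


def totals_reconcile(invoice):
    """Return True when the line amounts add up exactly to the stated total."""
    if invoice["total"] is None:
        return False
    line_sum = sum((item["amount"] for item in invoice["line_items"]), Decimal("0"))
    return line_sum == invoice["total"]


def to_json(invoice):
    """Serialise the invoice; Decimal values are written as exact strings."""
    return json.dumps(invoice, default=str, indent=2)
```

A reconciliation check like `totals_reconcile` is one possible way to flag a page where extraction or OCR missed a row; it only applies to invoices whose total carries no tax, discount, or shipping on top of the line amounts.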