PDF Text Extractor & Compiler

Budget: $750

I need an application that reads a multipage PDF, pulls out all the text, and then pours that content into a fresh one-page PDF that follows a custom layout I already designed. The flow is simple: open PDF → extract every text element in reading order → map each string to its assigned field or zone in my supplied template → generate a brand-new single-page PDF ready for distribution.

My design file shows exact font sizes, margins, headers, and footers, so the app must reproduce those specifications exactly. Dynamic text should auto-shrink only when a block overruns its allotted space; everything else should remain fixed. No images or tables need processing: pure text only.

I’m open to your preferred stack (Python with PyPDF2/PDFPlumber, Java with PDFBox, or any robust alternative) as long as the final solution:
• Runs on Windows 10+ without extra paid dependencies
• Processes at least 200 pages in under two minutes on a standard laptop
• Lets me update the template later without touching the core code (e.g., via an external JSON or simple GUI field map)
• Outputs a fully flattened PDF with no editable form fields

Please package the source code, a brief setup guide, and a short test report proving it works with the sample files I’ll send after kickoff.
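To make the auto-shrink rule and the external JSON field map more concrete, here is a rough sketch of how I picture them working together. Everything in it is an assumption for illustration only: the zone names, the JSON keys (`width`, `height`, `font_size`, `min_font_size`, `shrink`), and especially the text measurement, which uses a crude average character width instead of real font metrics. A real implementation would measure text with the PDF library it is built on.

```python
import json
import textwrap

# Hypothetical template: zone names, keys, and sizes (in points) are illustrative only.
TEMPLATE_JSON = """
{
  "zones": [
    {"name": "header", "width": 400, "height": 30, "font_size": 18, "shrink": false},
    {"name": "body", "width": 400, "height": 120, "font_size": 12,
     "min_font_size": 6, "shrink": true}
  ]
}
"""

AVG_CHAR_WIDTH = 0.5  # assumption: average glyph width as a fraction of the font size
LINE_HEIGHT = 1.2     # assumption: line height as a multiple of the font size


def load_zones(raw):
    """Parse the external field map into a dict keyed by zone name."""
    return {zone["name"]: zone for zone in json.loads(raw)["zones"]}


def lines_needed(text, width, font_size):
    """Estimate how many wrapped lines the text takes at this font size."""
    chars_per_line = max(1, int(width / (font_size * AVG_CHAR_WIDTH)))
    return len(textwrap.wrap(text, chars_per_line)) or 1


def fits(text, zone, font_size):
    """Check whether the wrapped text stays inside the zone's height."""
    needed = lines_needed(text, zone["width"], font_size) * font_size * LINE_HEIGHT
    return needed <= zone["height"]


def fit_font_size(text, zone, step=0.5):
    """Return the design font size, shrinking only if the zone allows it
    and the text overruns; never go below the zone's minimum size."""
    size = zone["font_size"]
    if not zone.get("shrink", False):
        return size  # fixed zones always keep the design size
    floor = zone.get("min_font_size", size)
    while size > floor and not fits(text, zone, size):
        size = max(floor, size - step)
    return size


zones = load_zones(TEMPLATE_JSON)
short_size = fit_font_size("A short paragraph.", zones["body"])
long_size = fit_font_size("word " * 200, zones["body"])
fixed_size = fit_font_size("word " * 200, zones["header"])
```

With this shape, changing a zone's size or turning shrinking on or off would mean editing only the JSON, not the code.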

Python
