Python Developer Needed for PDF Vector Line Extraction (Structural Drawings / PT Plans)

Budget: $750

IMPORTANT: READ BEFORE APPLYING

You must begin your proposal with the words “Artificial Intelligence”. Any proposal that does not include this phrase at the very start will be automatically rejected. This is how we filter out bots. Only real applicants will be considered.

Project Overview

We are building an internal system for a post-tensioning company that needs to extract detailed geometry (specifically tendons) from post-tensioned slab plan PDFs.

This first milestone focuses purely on confirming the core foundation:

- Extracting vector line segments from a PDF (NOT raster/pixel extraction)
- Reading the scale from the sheet
- Converting line distances to real-world feet
- Bucketing line angles
- Outputting clean CSV data + a visual reconstruction

This step is crucial and will become the engine for banded tendon detection in later milestones. You must have direct experience working with vector PDFs, not OCR tools. The required accuracy is for estimating only, not engineering: the client is comfortable with a few inches of slack on long tendon runs.

Milestone 1: Scope of Work

You will build a Python script that does the following:

1. Load the provided PDF
- Use Python 3
- PyMuPDF (fitz) preferred
- Target a specific page (e.g. page index 8, the main PT slab plan)

2. Extract all vector line segments
Use vector extraction only, no image processing. Capture:
- x1, y1, x2, y2
- Line lengths (in PDF units)
- Line width if available

3. Parse the drawing scale
Detect patterns such as:
- SCALE: 1/8" = 1'-0"
- Scale 1/4" = 1'
If detection fails, fall back to a default (configurable).
Convert: PDF units → inches on paper → feet in reality.
Assume 72 PDF units per inch unless otherwise noted.

4. Compute real-world lengths
Write a helper function:
    length_ft = pdf_units_to_feet(dx, dy, scale)
Accuracy tolerance: ±2–3 inches over typical 20–40 ft tendon runs.

5. Calculate angle for each segment
- Normalize angle to [0°, 180°)
- Bucket angles using 5° increments (0°, 5°, 10°, etc.)

6. Filter tiny noise segments
Remove lines shorter than 2 ft in real-world length.

7. Output clean CSVs
You should produce:
- all_segments.csv
- candidate_segments.csv (length ≥ 2 ft)
Each row includes:
- x1, y1, x2, y2
- length_ft
- angle_deg
- angle_bucket_deg

8. Generate a visual reconstruction
Produce a matplotlib plot of the extracted linework and save it as reconstructed.png. This lets us visually confirm accuracy.

Deliverables

You must deliver:
- main.py (fully working script)
- all_segments.csv
- candidate_segments.csv
- reconstructed.png
- A clean README including:
  - setup instructions
  - dependencies
  - how to run the script
  - any assumptions

Requirements

- Strong Python 3 experience
- Experience with PDF vector extraction (not OCR)
- Familiarity with PyMuPDF or equivalent
- Comfort with 2D geometry (angles, lengths, scaling)
- Ability to work independently and efficiently

Bonus:
- Experience with CAD / structural / construction drawings
- Experience with Shapely or geometry libraries

How to Apply (MUST READ)

To filter out bots, you must begin your proposal with: “Artificial Intelligence”

Any proposal without this phrase at the very beginning will be:
- Ignored
- Rejected
- Not considered further

In your proposal, also answer:
- Have you extracted vector geometry from PDFs before?
- Which libraries do you prefer for vector extraction?
- Can you show an example of similar work?
- What is your fixed-price quote for Milestone 1?
- What is your estimated delivery timeline?
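
For orientation, steps 3 to 5 of the scope (scale parsing, unit conversion, angle bucketing) could be sketched roughly as below. This is an illustrative sketch only, not a required implementation. Apart from pdf_units_to_feet and the 72 units per inch figure, every name here is hypothetical: parse_scale, segment_angle_deg, bucket_angle, the regex, and the fallback of 8 ft per paper inch. The sketch also assumes that `scale` means real-world feet per inch on paper, and that bucketing rounds to the nearest 5° with 180° wrapping to 0°. The posting does not fix either choice.

```python
import math
import re

PDF_UNITS_PER_INCH = 72.0  # default stated in the scope of work

# Hypothetical pattern for notes such as:  SCALE: 1/8" = 1'-0"   or   Scale 1/4" = 1'
SCALE_RE = re.compile(
    r"scale\s*:?\s*(\d+)\s*/\s*(\d+)\s*\"\s*=\s*(\d+)\s*'\s*(?:-?\s*(\d+)\s*\")?",
    re.IGNORECASE,
)


def parse_scale(text, default_ft_per_inch=8.0):
    """Return real-world feet per paper inch, or the default if nothing matches."""
    m = SCALE_RE.search(text)
    if not m:
        return default_ft_per_inch
    num, den, feet, inches = m.groups()
    if int(num) == 0 or int(den) == 0:
        return default_ft_per_inch  # malformed fraction, fall back
    paper_in = int(num) / int(den)
    real_ft = int(feet) + (int(inches) / 12.0 if inches else 0.0)
    return real_ft / paper_in


def pdf_units_to_feet(dx, dy, scale):
    """PDF units -> inches on paper -> feet in reality (scale = ft per paper inch)."""
    paper_inches = math.hypot(dx, dy) / PDF_UNITS_PER_INCH
    return paper_inches * scale


def segment_angle_deg(x1, y1, x2, y2):
    """Direction-independent angle in [0, 180).

    Note: page coordinates from PyMuPDF have y increasing downward, so negate
    dy first if conventional counter-clockwise angles are wanted.
    """
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    return 0.0 if angle >= 180.0 else angle  # guard against float round-up


def bucket_angle(angle_deg, step=5):
    """Round to the nearest multiple of `step` (assumed rule), wrapping 180 to 0."""
    return (int(math.floor(angle_deg / step + 0.5)) * step) % 180
```

With these definitions, a horizontal segment 216 PDF units long on a sheet marked SCALE: 1/8" = 1'-0" is 3 inches on paper, which pdf_units_to_feet converts to 24 ft. If floor-style bucketing is preferred over nearest-multiple rounding, only bucket_angle needs to change.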

Python
