Ongoing ML Data Labeling

Client: AI | Published: 01.04.2026
Budget: $750

I’m midway through an AI project and need an engineer who can take full ownership of the data-labeling pipeline while keeping our machine learning goals front and center. The model we are building is strictly regression-based, so every label you create must be precise enough to support continuous target prediction rather than simple categorization.

You will start by reviewing the raw dataset and defining clear labeling guidelines, then label the data in the platform of your choice: Label Studio, CVAT, or another tool you are comfortable with. I’m open to suggestions on workflow improvements as long as we end up with a clean, version-controlled set of annotations ready for model training.

Deliverables
• A fully labeled dataset aligned with our regression targets
• A concise guideline document so future annotators can replicate your work
• Weekly progress snapshots (exported JSON/CSV plus a brief summary)
• A short report outlining any data quality issues uncovered during labeling and proposed fixes

Acceptance criteria
• ≥ 98% agreement on a 10% random audit sample
• All files named and stored according to the repository’s existing structure
• Final dataset loads without errors into our current Python pipeline (pandas + scikit-learn)

If you’re comfortable shaping raw data into machine-ready gold and can keep an eye on downstream regression performance, let’s talk. I’m ready to get started right away.
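For reference, the audit criterion above could be checked along these lines. This is only a sketch under assumptions that are not part of the posting: "agreement" is taken to mean that a label falls within some tolerance of an auditor's independent re-label, and the function names, the `tolerance` value, and the toy data are all hypothetical. The real definition would be settled in the guideline document.

```python
import random


def draw_audit_sample(record_ids, fraction=0.10, seed=0):
    """Pick a reproducible random subset of record ids for re-labeling."""
    ids = sorted(record_ids)
    k = max(1, round(len(ids) * fraction))
    return random.Random(seed).sample(ids, k)


def agreement_rate(labels, audit_labels, tolerance):
    """Share of audited records whose label is within `tolerance` of the audit value."""
    if not audit_labels:
        raise ValueError("audit sample is empty")
    hits = sum(
        1
        for rid, audited in audit_labels.items()
        if abs(labels[rid] - audited) <= tolerance
    )
    return hits / len(audit_labels)


# Toy data (hypothetical): 100 records with continuous labels.
labels = {i: float(i) for i in range(100)}
sample = draw_audit_sample(labels.keys())

# Simulate the auditor's re-labels, with one deliberate disagreement.
audit_labels = {rid: labels[rid] for rid in sample}
audit_labels[sample[0]] += 5.0

rate = agreement_rate(labels, audit_labels, tolerance=0.5)
passes = rate >= 0.98  # threshold taken from the acceptance criteria
```

Fixing the seed keeps the audit sample reproducible between weekly snapshots; the tolerance is the part most worth agreeing on up front, since exact equality is rarely meaningful for continuous targets.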