Database Entries Cleanup

Customer: AI | Published: 19.12.2025

I have a collection of raw database entries that need to be cleaned before they are migrated to our production environment. The work centres on identifying and removing duplicates, standardising formats (dates, phone numbers, capitalisation, IDs), flagging or filling missing values where possible, and ensuring the final file is import-ready without breaking referential integrity.

The current data sits in CSV exports pulled from MySQL but can be provided in whichever format you prefer (CSV, SQL dump, or even loaded into a temporary test schema). Feel free to use SQL, Python (pandas), Excel Power Query or any other tool you find efficient, as long as the final output meets the acceptance criteria below.

Deliverables
• A cleaned dataset in the same structure as the original, ready for direct import
• A concise log or README outlining the cleaning rules applied, with notes on unresolved anomalies

Acceptance criteria
1. Zero duplicate primary keys and no orphaned foreign keys
2. Consistent formatting for all date, numeric and text fields
3. Missing-value handling clearly documented (filled, left blank, or flagged)
4. Import script or instructions verified against a sample restore to prove integrity

When you reply, focus on your experience handling similar database-level data cleaning tasks: tools, scale of past projects, and any notable challenges you solved. No lengthy project proposal needed; solid, relevant experience is what will stand out.
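
To make acceptance criteria 1 to 3 concrete, here is a minimal Python sketch of the kind of checks involved, using only the standard library. The column names (`id`, `name`, `signup_date`, `phone`), the list of accepted input date formats, and every helper name are assumptions made for illustration; the real schema and cleaning rules will differ and should be agreed before any data is changed.

```python
import re
from datetime import datetime

# Assumed input date formats; the real exports may use others.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")


def normalise_date(value):
    """Return an ISO 8601 date string, or None if blank or unparseable."""
    value = (value or "").strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalise_phone(value):
    """Keep digits only, preserving a leading '+'; None if no digits remain."""
    value = (value or "").strip()
    digits = re.sub(r"\D", "", value)
    if not digits:
        return None
    return ("+" if value.startswith("+") else "") + digits


def duplicate_keys(rows, key):
    """Return the set of key values that occur more than once."""
    seen, dupes = set(), set()
    for row in rows:
        if row[key] in seen:
            dupes.add(row[key])
        seen.add(row[key])
    return dupes


def orphaned_rows(children, fk, parents, pk):
    """Return child rows whose foreign key has no matching parent key."""
    parent_keys = {parent[pk] for parent in parents}
    return [child for child in children if child[fk] not in parent_keys]


def clean_customers(rows):
    """Drop repeated ids (first occurrence wins), standardise fields, and
    return (cleaned_rows, log) so every decision ends up in the README."""
    cleaned, seen, log = [], set(), []
    for row in rows:
        if row["id"] in seen:
            log.append(f"dropped duplicate id {row['id']}")
            continue
        seen.add(row["id"])
        new = dict(row)
        # Collapse whitespace and title-case; names like "McDonald" need review.
        new["name"] = " ".join(row["name"].split()).title()
        new["signup_date"] = normalise_date(row["signup_date"])
        new["phone"] = normalise_phone(row["phone"])
        for field in ("signup_date", "phone"):
            if new[field] is None:
                log.append(f"id {row['id']}: {field} missing or unparseable, left blank")
        cleaned.append(new)
    return cleaned, log
```

In this sketch, `duplicate_keys` and `orphaned_rows` would be run on the cleaned tables as a final gate before import, and the returned log would feed the README deliverable. Keeping the first occurrence of a duplicated key is only one possible rule; merging records or flagging them for manual review may be more appropriate for this data.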