My core dataset lives in Excel, and duplicated records are starting to blur every metric I monitor. I need those duplicates removed, the sheet reorganised into a tidy, analysis-ready format, and then a predictive model built from the cleaned data so I can forecast sales and spot upcoming trends with confidence.

You're free to tackle the job in Excel itself or switch to Python with Pandas and scikit-learn; whatever lets you document repeatable steps so I can rerun the process next quarter. Once the duplicates are gone, I'd like to see feature engineering where it adds real value, followed by a clear explanation of the modelling approach and its accuracy. If a quick Power Query routine or Google Sheets connector helps automate future uploads, feel free to include it.

Deliverables:
• Cleaned, de-duplicated Excel file (or linked Sheet)
• Well-commented script or macro that reproduces the cleaning steps
• Predictive model file with a summary report on performance metrics
• Brief walkthrough or dashboard that highlights key insights

When the final files open with zero duplicate IDs and the model's accuracy is validated against a held-out set, the job's a success. Let me know your preferred toolchain and timeline so we can get started.
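To make the "repeatable steps" concrete, here is a minimal Python sketch of the dedupe → feature-engineer → train → hold-out-validate workflow described above. The column names (`order_id`, `units`, `price`, `sales`) are hypothetical stand-ins since the real schema isn't specified, and the tiny inline DataFrame simply takes the place of `pd.read_excel("sales.xlsx")` so the sketch runs end to end:

```python
# Sketch of the requested pipeline. Column names are assumptions --
# swap in the real sheet's ID and feature columns before reuse.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# In practice: df = pd.read_excel("sales.xlsx")
# Synthetic stand-in data; order_id 2 and 4 are duplicated rows.
df = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4, 4, 5, 6, 7, 8],
    "units":    [3, 5, 5, 2, 7, 7, 4, 6, 1, 8],
    "price":    [10.0, 9.5, 9.5, 11.0, 8.0, 8.0, 10.5, 9.0, 12.0, 7.5],
})
df["sales"] = df["units"] * df["price"]

# 1. De-duplicate on the ID column, keeping the first occurrence.
clean = df.drop_duplicates(subset="order_id", keep="first")
assert clean["order_id"].is_unique  # the "zero duplicate IDs" success check

# 2. One example of an engineered feature: revenue per unit.
clean = clean.assign(rev_per_unit=clean["sales"] / clean["units"])

# 3. Fit on a training split; report accuracy on the held-out portion.
X = clean[["units", "price", "rev_per_unit"]]
y = clean["sales"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)
model = RandomForestRegressor(random_state=42).fit(X_train, y_train)
print(f"held-out R^2: {r2_score(y_test, model.predict(X_test)):.3f}")
```

The same three numbered steps map directly onto the deliverables: the `clean` frame is what gets written back to Excel, the script itself is the reproducible cleaning step, and the held-out R² is the validation figure for the summary report.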