My goal is to build a robust predictive model that can anticipate individual customer actions using the transactional, demographic, and behavioral data already sitting in my warehouse. Roughly 350,000 customers and 8 million transactions span the past two years, exported as CSV from PostgreSQL.

The work begins with data wrangling: merging the three sources, resolving missing values, and engineering features such as RFM scores, CLV proxies, and session-level metrics. When the data is tidy, I need a model that estimates the likelihood of a repeat purchase within the next 30 days. Accuracy, recall, and AUC will guide model selection; gradient-boosted trees or other interpretable, high-performing methods are welcome.

Deliverables
• Clean, feature-engineered dataset (CSV)
• Reproducible scripts or notebooks, Python preferred (pandas, scikit-learn, XGBoost, LightGBM, etc.)
• Brief technical report covering methodology, evaluation results, and actionable marketing insights
• Deployment-ready model file or lightweight prediction API stub

Acceptance criteria: the code runs end-to-end on my machine, AUC exceeds 0.80 on a hold-out set, and the documentation is clear enough for a mid-level analyst to follow.

Let me know if you have any questions so we can get started right away.
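To make the scope concrete, here is a rough sketch of the RFM-style feature engineering I have in mind. The column names (customer_id, order_date, amount) are placeholders rather than my actual schema, and order_date is assumed to already be parsed as a datetime; treat it as an illustration, not a spec.

import pandas as pd

def build_rfm(transactions: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Per-customer recency, frequency, and monetary features with quintile scores."""
    tx = transactions[transactions["order_date"] <= as_of]
    rfm = tx.groupby("customer_id").agg(
        last_order=("order_date", "max"),
        frequency=("order_date", "count"),
        monetary=("amount", "sum"),
    )
    rfm["recency_days"] = (as_of - rfm["last_order"]).dt.days
    # Rank before binning so ties (common with integer day counts and repeated
    # order counts) don't break the quintile edges; recent buyers score 5.
    rfm["r_score"] = pd.qcut(rfm["recency_days"].rank(method="first"), 5, labels=[5, 4, 3, 2, 1]).astype(int)
    rfm["f_score"] = pd.qcut(rfm["frequency"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5]).astype(int)
    rfm["m_score"] = pd.qcut(rfm["monetary"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5]).astype(int)
    return rfm.drop(columns="last_order")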
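For the modeling piece, the loop below is roughly what I expect, assuming a feature table saved as features.csv with a hypothetical repeat_30d label (1 if the customer purchased again within 30 days of the snapshot date). A random stratified split is shown for brevity; given the 30-day look-ahead label, a time-based split is probably the safer choice in practice.

import pandas as pd
from lightgbm import LGBMClassifier
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

features = pd.read_csv("features.csv")  # placeholder path to the engineered dataset
X = features.drop(columns=["customer_id", "repeat_30d"])
y = features["repeat_30d"]

# Hold-out split, stratified to preserve the repeat-purchase rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = LGBMClassifier(n_estimators=500, learning_rate=0.05)
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(y_test, proba))        # acceptance bar: above 0.80
print("Recall:", recall_score(y_test, (proba >= 0.5).astype(int)))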
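For the deployment-ready deliverable, a minimal stub along these lines would be enough (FastAPI shown here, but Flask is equally fine). The model.pkl path and the three feature fields are placeholders; the real stub should mirror whatever feature set the final model uses.

import joblib
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.pkl")  # placeholder path to the trained model file

class CustomerFeatures(BaseModel):
    # Placeholder fields; mirror the engineered feature columns in practice.
    recency_days: float
    frequency: float
    monetary: float

@app.post("/predict")
def predict(payload: CustomerFeatures) -> dict:
    row = pd.DataFrame([payload.dict()])
    proba = float(model.predict_proba(row)[:, 1][0])
    return {"repeat_purchase_30d_probability": proba}

Something like uvicorn main:app would serve it locally, assuming the stub lives in main.py.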