Baseball Totals Prediction Model

Customer: AI | Published: 24.03.2026
Budget: $250

I want a robust statistical / machine-learning model that predicts run totals for both the full game and the first five innings of Major League Baseball matchups. The model must ingest and refresh four key data sets: historical game results, detailed player statistics, real-time weather metrics, and an up-to-date feed of injuries to key players.

To keep the forecasts actionable, I’ll need thoughtful feature engineering (park factors, handedness splits, weather-adjusted run environments, etc.) and a validation framework that back-tests against past closing totals so we can see exactly how the model would have performed.

Preferred stack is Python with pandas, scikit-learn or XGBoost; however, I’m open to R or another proven toolkit if it suits the job better. Please include clear documentation and well-commented code so I can retrain or tweak parameters as new seasons roll in.

Deliverables
• Clean, reproducible data-pipeline scripts pulling the four data sources
• Fully trained model plus training notebooks / scripts
• Evaluation report showing historical performance against the market total, split full-game vs. first-five
• Quick-start guide for running daily predictions and updating the data

Acceptance criteria: on a blind test set the model should beat a naïve league-average baseline by a statistically significant margin (prove it with the report).

Let’s get started. Once these pieces are in place I can handle deployment to my own betting interface.
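One way the acceptance criterion could be checked is a paired sign-flip permutation test on per-game absolute errors. This is only a sketch: the posting does not say which significance test to use, so the choice of test, the function name `paired_permutation_pvalue`, and the toy numbers are all assumptions for illustration. It uses the Python standard library only.

```python
import random
from statistics import mean


def paired_permutation_pvalue(actual, model_pred, baseline_pred,
                              n_resamples=10_000, seed=0):
    """Return (mean error improvement, one-sided p-value).

    Hypothetical helper: tests whether the model's absolute error is
    smaller than the baseline's on the same set of games.
    """
    # Per-game difference in absolute error; positive means the model was closer.
    diffs = [abs(a - b) - abs(a - m)
             for a, m, b in zip(actual, model_pred, baseline_pred)]
    observed = mean(diffs)

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each difference is arbitrary,
        # so flip each sign with probability 0.5 and recompute the mean.
        flipped = mean(d if rng.random() < 0.5 else -d for d in diffs)
        if flipped >= observed:
            hits += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_resamples + 1)


# Toy, made-up numbers purely to show the call shape (not real MLB data).
actual_totals = [7, 9, 11, 5, 8, 10, 6, 12]
model_totals = [7.5, 8.5, 10.0, 6.0, 8.5, 9.5, 6.5, 11.0]
league_average = [8.5] * len(actual_totals)  # naive baseline: one constant

improvement, p_value = paired_permutation_pvalue(
    actual_totals, model_totals, league_average
)
print(f"mean MAE improvement: {improvement:.3f}, p-value: {p_value:.4f}")
```

The same function could be run twice in the evaluation report, once on full-game totals and once on first-five totals, with the blind test set held out from all training and tuning.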