Welcome to the MLB Game Predictor! This project leverages advanced machine learning models to predict the outcomes of MLB games during the 2024 season. The primary aim is to provide accurate game-by-game predictions and season-long projections using a comprehensive historical dataset.
- Historical Data Utilization: The models are trained on a continuously updated dataset of team statistics from nearly every game between the 2000 and 2024 seasons. This gives the models enough data to learn the relationships between statistics and outcomes.
- Comprehensive Team Statistics: The dataset includes a wide range of team statistics that cover almost every aspect of a baseball game. These encompass both offensive and defensive statistics, as well as season-wide and recent performance for each team.
- Machine Learning Models: This project uses a Ridge Classifier to predict game outcomes and Linear Regression to predict scores. Of the various models tested for outcome and run prediction, these two performed most consistently.
- Future Statistic Prediction: This project can project the statistics for any upcoming game from now until the end of the season using exponential moving averages and past team performance. As a result, the model's predictions are not limited to games for which concrete statistics already exist.
- FanGraphs for web scraping all necessary team statistics
- MLB-StatsAPI for finding schedules for given date ranges
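The exponential-moving-average idea from the feature list can be illustrated with a small sketch. This is not the project's actual code: the function names, the stat names, and the smoothing factor `alpha` are all assumptions chosen for illustration. It shows how a team's recent per-game values might be collapsed into a single projected value that weights recent games more heavily.

```python
from typing import Dict, Sequence


def exponential_moving_average(values: Sequence[float], alpha: float = 0.3) -> float:
    """Return the EMA of `values` (oldest first); later entries get more weight."""
    if not values:
        raise ValueError("need at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ema = values[0]
    for value in values[1:]:
        # Blend the newest observation with the running average.
        ema = alpha * value + (1.0 - alpha) * ema
    return ema


def project_next_game(
    recent_stats: Dict[str, Sequence[float]], alpha: float = 0.3
) -> Dict[str, float]:
    """Project each statistic for an upcoming game (hypothetical helper)."""
    return {
        name: exponential_moving_average(series, alpha)
        for name, series in recent_stats.items()
    }


# Made-up per-game values for one team, oldest game first.
recent = {
    "runs_per_game": [4.0, 6.0, 2.0, 5.0],
    "hits_per_game": [8.0, 10.0, 7.0, 9.0],
}
projection = project_next_game(recent, alpha=0.5)
print(projection)  # {'runs_per_game': 4.25, 'hits_per_game': 8.5}
```

A larger `alpha` makes the projection react faster to a team's most recent games; a smaller one keeps it closer to the longer-run average.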
- Clone the repository
    git clone https://github.com/laplaces42/mlb_game_predictor.git
    cd mlb_game_predictor
- Start a virtual environment
    python3 -m venv .venv
    source .venv/bin/activate
- Open an IDE (Optional)
    code .  # For VS Code
- Install requirements.txt
    pip install -r requirements.txt
- Run baseball_dataset.py to update the dataset with any recent statistics
    python baseball_dataset.py
- Run baseball_model.py to train the models with the new data
    python baseball_model.py
- Run baseball_prediction.py to run the main program and request game predictions
    python baseball_prediction.py
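For readers unfamiliar with the two model types named in the feature list, the following is a minimal sketch of how a Ridge Classifier and a Linear Regression model can be fit with scikit-learn. It is not the code in baseball_model.py: the features are synthetic random numbers, and every variable name here is an assumption made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RidgeClassifier

# Synthetic stand-in for per-game team statistics (illustrative only).
rng = np.random.default_rng(0)
n_games = 200
X = rng.normal(size=(n_games, 3))  # three made-up, standardized team stats

# Made-up targets: runs scored and a win/loss label that depend on the stats.
home_runs = 4.5 + X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_games)
home_win = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=n_games) > 0).astype(int)

# Classifier for the game outcome, regressor for the run total.
outcome_model = RidgeClassifier(alpha=1.0).fit(X, home_win)
runs_model = LinearRegression().fit(X, home_runs)

# Predict for one hypothetical upcoming game.
upcoming = np.array([[0.8, -0.2, 0.1]])
predicted_win = int(outcome_model.predict(upcoming)[0])
predicted_runs = float(runs_model.predict(upcoming)[0])
print(f"home win: {predicted_win}, projected runs: {predicted_runs:.2f}")
```

In a real workflow the feature matrix would come from the scraped team statistics, and accuracy would be measured on held-out games rather than on the training data.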
The results of this project are available by running the baseball_prediction.py script.
I am open to all (relevant) contributions! To contribute, please fork the repository and submit a pull request with your changes.
This project is licensed under the MIT License. See the LICENSE file for details.
If you have any questions or feedback, feel free to reach out!
- Email: [email protected]
- LinkedIn: LaPlace Sallis IV
- GitHub: laplaces42
You can also open an issue on this repository if you have any questions or need support.