The Shipment Pricing Prediction project aims to predict shipment prices based on various factors in the supply chain domain using machine learning techniques. This project addresses the growing need for accurate predictions in the rapidly evolving supply chain analytics market.
- Machine Learning: Various regression models
- Python: Programming language for data analysis and modeling
- Libraries:
  - Pandas
  - NumPy
  - Scikit-learn
  - Matplotlib
  - Seaborn
  - Flask
The supply chain analytics market is projected to grow significantly, with organizations needing to optimize pricing strategies. This project focuses on predicting shipment pricing using available data to help supply chain leaders make informed decisions.
- Missing Values: All missing values were replaced with the mode (most frequent value).
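- Numerical Columns: Standardized inside scikit-learn pipelines, so that scaling statistics are learned from the training data only and no information leaks from the test set.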
- Categorical Columns: Encoded using either label encoding or one-hot encoding.
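The preprocessing steps above can be sketched as a single scikit-learn `ColumnTransformer`. This is a minimal illustration, not the project's actual code: the column names (`freight_cost`, `line_item_insurance`, `shipment_mode`) and the toy rows are assumptions, and one-hot encoding is shown where the project may have used label encoding for some columns.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; the real dataset's columns may differ.
numeric_cols = ["freight_cost", "line_item_insurance"]
categorical_cols = ["shipment_mode"]

# Numerical columns: fill gaps with the mode, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with the mode, then one-hot encode.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocessor = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", categorical_steps, categorical_cols),
])

# Toy data, invented for illustration only.
train = pd.DataFrame({
    "freight_cost": [120.0, np.nan, 80.0, 120.0],
    "line_item_insurance": [1.5, 2.0, np.nan, 2.0],
    "shipment_mode": pd.Series(["Air", "Truck", np.nan, "Air"], dtype=object),
})
test = pd.DataFrame({
    "freight_cost": [100.0, 90.0],
    "line_item_insurance": [1.8, np.nan],
    "shipment_mode": pd.Series(["Truck", "Ocean"], dtype=object),
})

# Fit on the training split only, then reuse the fitted statistics on the test split.
X_train = preprocessor.fit_transform(train)
X_test = preprocessor.transform(test)
```

Because the imputer, scaler, and encoder are fitted only inside `fit_transform` on the training rows, the test rows never influence the mode, mean, or standard deviation. That is the leakage protection the pipeline provides.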
- Loaded the collected data into Python using Pandas.
- Performed exploratory data analysis (EDA) to identify distributions, outliers, and trends.
- Checked for null values and imputed any that were found with the mode.
- Encoded categorical values into numeric values and scaled numerical features using StandardScaler.
- New features were created to enhance model building based on business insights.
- Optimized the model for accuracy with a training R-squared of 0.998273 and a test R-squared of 0.991598.
- Key features: Days to Process, Line Item Insurance, Shipment Mode, Freight Cost.
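As a rough illustration of the null check and feature-creation steps, the sketch below derives a days-to-process feature from two date columns. How the project actually defines Days to Process is not stated here, so the column names `order_date` and `delivery_date`, the sample rows, and the definition itself are all assumptions.

```python
import pandas as pd

# Toy data with hypothetical column names, invented for illustration only.
df = pd.DataFrame({
    "order_date": ["2023-01-02", "2023-01-10", "2023-02-01"],
    "delivery_date": ["2023-01-12", "2023-01-15", None],
})

# Parse the date strings; unparseable or missing entries become NaT.
for col in ["order_date", "delivery_date"]:
    df[col] = pd.to_datetime(df[col], errors="coerce")

# Count nulls per column before deciding how to impute them.
null_counts = df.isna().sum()

# Assumed definition: days elapsed between order and delivery.
df["days_to_process"] = (df["delivery_date"] - df["order_date"]).dt.days
```

Rows with a missing date yield a missing `days_to_process`, which would then go through the same imputation step as any other null value.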
- Training R²: 0.998273
- Test R²: 0.991598
- Important features identified: Days to Process, Line Item Insurance, Shipment Mode, Freight Cost.
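The sketch below shows one way to compute training and test R² and rank features by importance. It runs on synthetic data, with `RandomForestRegressor` as a stand-in, since the specific model behind the reported scores is not named here. The feature names mirror the list above, but the numbers it produces are unrelated to the project's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only; not the project's dataset.
rng = np.random.default_rng(0)
feature_names = ["days_to_process", "line_item_insurance",
                 "shipment_mode", "freight_cost"]
X = rng.normal(size=(500, 4))
# Assumed toy relationship: the target depends mostly on freight_cost.
y = 3.0 * X[:, 3] + 1.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Stand-in model; the project may have used a different regressor.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# R-squared on the data the model saw versus held-out data.
train_r2 = r2_score(y_train, model.predict(X_train))
test_r2 = r2_score(y_test, model.predict(X_test))

# Rank features from most to least important according to the fitted model.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

print(f"Training R²: {train_r2:.6f}")
print(f"Test R²: {test_r2:.6f}")
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Comparing the two scores is a quick overfitting check: a test R² close to the training R², as in the reported 0.991598 versus 0.998273, suggests the model generalizes well to unseen shipments.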
- Dataset Link: Click here!
To run this project, you will need Python and the required libraries. Set up a virtual environment and install dependencies using pip:
pip install pandas numpy scikit-learn matplotlib seaborn flask