This repository contains a complete pipeline for predicting house prices using regression models. The project covers data preprocessing, model training, evaluation, and generating predictions.
The goal is to build a model that accurately estimates the sale price of a house from its features.
The dataset used in this project includes:
- **Train Data**: Contains the features and the target variable (`SalePrice`) for training the model.
- **Test Data**: Contains the features for which predictions need to be made.
The dataset may include missing values and irrelevant columns, which are addressed during preprocessing.
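Before preprocessing, both CSV files need to be loaded into pandas DataFrames. The sketch below assumes the files are named `train.csv` and `test.csv`; adjust the paths to match your copy of the dataset:

```python
import pandas as pd

# Assumed file names; adjust to where the dataset is stored
df = pd.read_csv('train.csv')        # training data, includes SalePrice
test_data = pd.read_csv('test.csv')  # test data, SalePrice must be predicted

print(df.shape, test_data.shape)
```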
- **Check for Missing Values**:
```python
# Count missing values per column and show only the columns that have any
missing_values = df.isnull().sum()
print(missing_values[missing_values > 0])
```
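The README notes that missing values are addressed during preprocessing. One simple way to do that, shown here as a hedged sketch rather than the exact strategy used in this project, is to impute numeric columns with the median and categorical columns with the most frequent value:

```python
# Illustrative imputation: median for numeric columns, mode for object columns
for col in df.columns:
    if df[col].isnull().any():
        if df[col].dtype == 'object':
            df[col] = df[col].fillna(df[col].mode()[0])
        else:
            df[col] = df[col].fillna(df[col].median())
```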
- **Drop Irrelevant Columns**:
```python
# 'SomeIrrelevantColumn' is a placeholder for any column that does not help prediction
df_cleaned = df.drop(['Id', 'SomeIrrelevantColumn'], axis=1)
```
- **Convert Categorical Columns**:
```python
from sklearn.preprocessing import OneHotEncoder

# One-hot encode a categorical column ('CategoricalColumn' is a placeholder name)
encoder = OneHotEncoder()
encoded_features = encoder.fit_transform(df_cleaned[['CategoricalColumn']])
```
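`fit_transform` returns a sparse matrix that still has to be joined back onto the remaining features before modeling. A minimal sketch of one way to do that, using the same placeholder column name (`pd.get_dummies` is a simpler alternative when a full scikit-learn pipeline is not needed):

```python
import pandas as pd

# Convert the encoded matrix to a DataFrame with readable column names,
# then swap it in for the original categorical column.
encoded_df = pd.DataFrame(
    encoded_features.toarray(),
    columns=encoder.get_feature_names_out(['CategoricalColumn']),
    index=df_cleaned.index,
)
df_cleaned = pd.concat([df_cleaned.drop('CategoricalColumn', axis=1), encoded_df], axis=1)
```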
- **Prepare Data for Modeling**:
```python
from sklearn.model_selection import train_test_split

# Separate the features from the target and hold out 20% of the rows for validation
X = df_cleaned.drop('SalePrice', axis=1)
y = df_cleaned['SalePrice']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
- **Fit the Model**:
```python
import xgboost as xgb

# Train an XGBoost regressor with default hyperparameters
model = xgb.XGBRegressor()
model.fit(X_train, y_train)
```
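It can also be useful to watch how the model performs on the held-out split while it trains. This optional step is not part of the original pipeline; it is a sketch using xgboost's `eval_set` argument:

```python
import xgboost as xgb

# Refit while tracking the validation metric (RMSE by default for regression)
model = xgb.XGBRegressor()
model.fit(X_train, y_train, eval_set=[(X_test, y_test)], verbose=False)

# evals_result() returns the per-round metric values for each eval set
history = model.evals_result()
print(history['validation_0'].keys())
```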
- **Generate Predictions**:
```python
# Predict sale prices for the held-out validation split
predictions = model.predict(X_test)
```
- **Evaluate Model Performance**:
```python
from sklearn.metrics import mean_squared_error, r2_score

# Compare the predictions against the held-out validation targets
mse = mean_squared_error(y_test, predictions)
r2 = r2_score(y_test, predictions)
print(f"Mean Squared Error: {mse}")
print(f"R^2 Score: {r2}")
```
The XGBoost model achieved an accuracy of around 65-70% on the training dataset. Due to time constraints it was not tuned further, so there is room for improvement (see the sketch below).
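One low-effort way to improve the untuned model is a small cross-validated hyperparameter search. The grid below is a hedged sketch with illustrative values, not the settings behind the reported result:

```python
from sklearn.model_selection import GridSearchCV
import xgboost as xgb

# Illustrative grid; expand it as time allows
param_grid = {
    'n_estimators': [200, 500],
    'max_depth': [3, 5],
    'learning_rate': [0.05, 0.1],
}

search = GridSearchCV(xgb.XGBRegressor(random_state=42), param_grid, cv=5, scoring='r2')
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
```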
- **Prepare Submission File**:
```python
import pandas as pd

# Note: these predictions must come from the preprocessed test data, not the validation split above
submission = pd.DataFrame({'Id': test_data['Id'], 'SalePrice': predictions})
submission.to_csv('submission.csv', index=False)
```
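For the submission to line up with the test rows, the test data has to go through the same preprocessing as the training data before calling `predict`. A minimal sketch, assuming the same placeholder column names as above and that the test set contains no categories unseen during training:

```python
import pandas as pd

# Apply the same cleaning and encoding steps to the test data
test_cleaned = test_data.drop(['Id', 'SomeIrrelevantColumn'], axis=1)
encoded_test = pd.DataFrame(
    encoder.transform(test_cleaned[['CategoricalColumn']]).toarray(),
    columns=encoder.get_feature_names_out(['CategoricalColumn']),
    index=test_cleaned.index,
)
test_features = pd.concat([test_cleaned.drop('CategoricalColumn', axis=1), encoded_test], axis=1)

# Predict on the processed test set and use these values in the submission file
predictions = model.predict(test_features)
```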
Attached are the `Submission` and `Submission2` files.