This repository contains a house price prediction project built on synthetic data. It covers preprocessing, several regression models, hyperparameter tuning, and visualizations for model analysis, and is meant as a worked example of an end-to-end data science and machine learning workflow.
Features:
- Synthetic housing dataset generation
- Advanced preprocessing: KNN imputation, outlier detection, feature engineering
- Multiple regression models: Linear Regression, Random Forest, Gradient Boosting
- Hyperparameter tuning with GridSearchCV
- 9 advanced visualizations for model analysis
- Performance metrics: RMSE, MAE, R²
Requirements:
- Python 3.7+
- scikit-learn
- pandas
- numpy
- matplotlib
- seaborn
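
The imputation, tuning, and metric steps named in the feature list can be sketched roughly as follows. This is a minimal illustration, not the code in `HousePredictionModel.py`: the column names (`sqft`, `bedrooms`, `grade`, `price`), the price formula, the parameter grid, and the helper names `make_synthetic_housing` and `fit_and_score` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import KNNImputer
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline


def make_synthetic_housing(n_rows=300, seed=0):
    """Build a small synthetic dataset (hypothetical columns and coefficients)."""
    rng = np.random.default_rng(seed)
    sqft = rng.uniform(500, 4000, n_rows)
    bedrooms = rng.integers(1, 6, n_rows).astype(float)
    grade = rng.integers(1, 11, n_rows).astype(float)
    noise = rng.normal(0, 20_000, n_rows)
    price = 50_000 + 150 * sqft + 10_000 * bedrooms + 15_000 * grade + noise
    df = pd.DataFrame(
        {"sqft": sqft, "bedrooms": bedrooms, "grade": grade, "price": price}
    )
    # Blank out roughly 5% of sqft values so the imputer has something to fill.
    missing = rng.random(n_rows) < 0.05
    df.loc[missing, "sqft"] = np.nan
    return df


def fit_and_score(df, seed=0):
    """Impute, tune a random forest with GridSearchCV, and report test metrics."""
    X = df.drop(columns="price")
    y = df["price"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # Keeping the imputer inside the pipeline means it is refit on each CV fold,
    # so no information leaks from validation rows into the imputation.
    pipeline = Pipeline(
        [
            ("impute", KNNImputer(n_neighbors=5)),
            ("model", RandomForestRegressor(random_state=seed)),
        ]
    )
    # Illustrative grid only; a real search would usually be wider.
    param_grid = {"model__n_estimators": [50, 100], "model__max_depth": [None, 8]}
    search = GridSearchCV(
        pipeline, param_grid, cv=3, scoring="neg_root_mean_squared_error"
    )
    search.fit(X_train, y_train)
    pred = search.predict(X_test)
    return {
        "best_params": search.best_params_,
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "mae": float(mean_absolute_error(y_test, pred)),
        "r2": float(r2_score(y_test, pred)),
    }


if __name__ == "__main__":
    print(fit_and_score(make_synthetic_housing()))
```

The same pipeline shape should work with Linear Regression or Gradient Boosting by swapping the `model` step and adjusting the grid keys to that estimator's parameters.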
Install required packages:

    pip install pandas numpy scikit-learn matplotlib seaborn

Clone the repository and run the model:

    git clone https://github.com/yourusername/house-price-prediction.git
    cd house-price-prediction
    python HousePredictionModel.py

The model generates the following plots:
- Model Performance Comparison (RMSE)
- R² Score Comparison
- Actual vs Predicted (Best Model)
- Residual Plot
- Feature Correlation Heatmap
- Price Distribution
- Feature Importance (Random Forest)
- Price vs Square Footage (Colored by Grade)
- Model Error Distribution (Box Plot)
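
As a rough illustration of two of the plots listed above (Actual vs Predicted and the Residual Plot), the sketch below draws both side by side with matplotlib. The function name `plot_actual_vs_predicted`, the layout, and the styling are assumptions for the example and may differ from what the script actually produces.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line to open plot windows
import matplotlib.pyplot as plt
import numpy as np


def plot_actual_vs_predicted(y_true, y_pred):
    """Return a figure with an actual-vs-predicted panel and a residual panel."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    fig, (ax_fit, ax_res) = plt.subplots(1, 2, figsize=(10, 4))

    # Left panel: points on the dashed diagonal are perfect predictions.
    ax_fit.scatter(y_true, y_pred, alpha=0.6)
    lo = min(y_true.min(), y_pred.min())
    hi = max(y_true.max(), y_pred.max())
    ax_fit.plot([lo, hi], [lo, hi], "r--")
    ax_fit.set_xlabel("Actual price")
    ax_fit.set_ylabel("Predicted price")
    ax_fit.set_title("Actual vs Predicted")

    # Right panel: residuals should scatter around zero with no clear pattern.
    ax_res.scatter(y_pred, residuals, alpha=0.6)
    ax_res.axhline(0, color="r", linestyle="--")
    ax_res.set_xlabel("Predicted price")
    ax_res.set_ylabel("Residual (actual - predicted)")
    ax_res.set_title("Residual Plot")

    fig.tight_layout()
    return fig, residuals


if __name__ == "__main__":
    fig, _ = plot_actual_vs_predicted([100, 200, 300], [110, 190, 310])
    fig.savefig("actual_vs_predicted.png")
```

A funnel or curve in the residual panel usually suggests the model is missing a nonlinear effect or that the error variance grows with price.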
- Srihari Sir (IIT Guwahati Faculty): Most of the code and structure in this project was created by Srihari Sir, who generously included spaces and comments to teach us advanced concepts. His expertise and guidance were invaluable, and this project would not be possible without his foundational work.
- Masai School: Thank you to Masai for providing this learning opportunity and a supportive environment for growth in data science and machine learning.
This project is not entirely my own work. The majority of the code was written by Srihari Sir, with educational gaps and explanations for us to learn and fill in. My contribution was primarily in learning, understanding, and completing the exercises provided.
License: MIT
Thank you, Srihari Sir and Masai School, for this opportunity and for making advanced machine learning accessible and enjoyable!