GitHub - Sinhaaz/ML-Project1: Project Name : Student Performance Indicator

Project Name :- Student Performance Indicator

Data Ingestion :
- In the data ingestion phase, the data is first read from a CSV file.
- The data is then split into training and testing sets, and each set is saved as a CSV file.
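The ingestion steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `ingest` function, the `train.csv` / `test.csv` file names, the 80/20 split and the toy columns are all assumptions.

```python
import os
import tempfile

import pandas as pd
from sklearn.model_selection import train_test_split


def ingest(raw_csv_path, out_dir, test_size=0.2, random_state=42):
    """Read a CSV, split it into train/test sets, and save both as CSV files."""
    df = pd.read_csv(raw_csv_path)
    train_df, test_df = train_test_split(
        df, test_size=test_size, random_state=random_state
    )

    os.makedirs(out_dir, exist_ok=True)
    train_path = os.path.join(out_dir, "train.csv")  # hypothetical file name
    test_path = os.path.join(out_dir, "test.csv")    # hypothetical file name
    train_df.to_csv(train_path, index=False)
    test_df.to_csv(test_path, index=False)
    return train_path, test_path


# Tiny made-up dataset so the sketch runs on its own.
work_dir = tempfile.mkdtemp()
raw_path = os.path.join(work_dir, "raw.csv")
pd.DataFrame(
    {
        "gender": ["female", "male"] * 5,
        "reading_score": list(range(50, 60)),
        "math_score": list(range(60, 70)),
    }
).to_csv(raw_path, index=False)

train_path, test_path = ingest(raw_path, os.path.join(work_dir, "artifacts"))
```

Fixing `random_state` keeps the split reproducible, so the saved train and test files stay the same across runs.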
Data Transformation :
- In this phase, a ColumnTransformer pipeline is created.
- For numeric variables, SimpleImputer is applied with the median strategy, followed by standard scaling.
- For categorical variables, SimpleImputer is applied with the most-frequent strategy, followed by ordinal encoding; the encoded data is then scaled with StandardScaler.
- The preprocessor is saved as a pickle file.
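A preprocessor of the shape described above might look like the sketch below. The column names and sample values are assumptions made for illustration, and the pickle round trip is done in memory rather than to a file; the project's real columns and file paths may differ.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Hypothetical column names, chosen only for this example.
numeric_cols = ["reading_score", "writing_score"]
categorical_cols = ["gender", "lunch"]

# Numeric branch: median imputation, then standard scaling.
numeric_pipeline = Pipeline(
    [
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
    ]
)

# Categorical branch: most-frequent imputation, ordinal encoding, then scaling.
categorical_pipeline = Pipeline(
    [
        ("imputer", SimpleImputer(strategy="most_frequent")),
        ("encoder", OrdinalEncoder()),
        ("scaler", StandardScaler()),
    ]
)

preprocessor = ColumnTransformer(
    [
        ("num", numeric_pipeline, numeric_cols),
        ("cat", categorical_pipeline, categorical_cols),
    ]
)

# Made-up rows with one missing value in each branch.
X = pd.DataFrame(
    {
        "reading_score": [72.0, np.nan, 95.0, 57.0],
        "writing_score": [74.0, 88.0, 93.0, 44.0],
        "gender": ["female", "female", "male", np.nan],
        "lunch": ["standard", "free/reduced", "standard", "standard"],
    }
)
X_transformed = preprocessor.fit_transform(X)

# Serialize the fitted preprocessor; writing these bytes to disk gives the pickle file.
preprocessor_bytes = pickle.dumps(preprocessor)
restored = pickle.loads(preprocessor_bytes)
```

Pickling the fitted preprocessor, rather than re-fitting it later, ensures that prediction-time inputs are imputed, encoded and scaled with the statistics learned from the training data.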
Model Training :
- In this phase, the base models are tested. The best model found was the CatBoost regressor.
- Hyperparameter tuning is then performed on the CatBoost and KNN models.
- A final VotingRegressor is created, which combines the predictions of the CatBoost, XGBoost and KNN models.
- This model is saved as a pickle file.
Prediction Pipeline :
- This pipeline converts the given input data into a DataFrame and provides functions to load the pickle files and predict the final results in Python.
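A prediction pipeline along these lines could be sketched as below. The `load_object` and `predict` helpers, the file names and the feature names are hypothetical, and a simple `StandardScaler` plus `LinearRegression` pair stands in for the project's saved preprocessor and model.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler


def load_object(path):
    """Load a pickled Python object from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict(record, preprocessor_path, model_path):
    """Turn one input record (a dict) into a DataFrame, transform it, and predict."""
    features = pd.DataFrame([record])
    preprocessor = load_object(preprocessor_path)
    model = load_object(model_path)
    return model.predict(preprocessor.transform(features))


# Stand-in artifacts so the sketch runs on its own (made-up data).
train = pd.DataFrame(
    {
        "reading_score": [50.0, 60.0, 70.0, 80.0],
        "writing_score": [55.0, 62.0, 71.0, 83.0],
    }
)
target = [52.0, 61.0, 69.0, 82.0]
scaler = StandardScaler().fit(train)
model = LinearRegression().fit(scaler.transform(train), target)

artifact_dir = Path(tempfile.mkdtemp())
preprocessor_path = artifact_dir / "preprocessor.pkl"  # hypothetical file name
model_path = artifact_dir / "model.pkl"                # hypothetical file name
with open(preprocessor_path, "wb") as f:
    pickle.dump(scaler, f)
with open(model_path, "wb") as f:
    pickle.dump(model, f)

record = {"reading_score": 65.0, "writing_score": 66.0}
prediction = predict(record, preprocessor_path, model_path)
```

Building the DataFrame with the same column names used at training time matters, because the saved preprocessor selects and transforms columns by name.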

A PDF containing a skeleton overview of the overall project is included in the repository (ML Project-Student Performance Indicator.pdf).

Repository contents :
- .ebextensions
- artifacts
- catboost_info
- notebook
- src
- templates
- .gitignore
- ML Project-Student Performance Indicator.pdf
- README.md
- Screenshot.png
- application.py
- requirements.txt
- setup.py