- Data Ingestion:
  - In the data ingestion phase, the data is first read from a CSV file.
  - The data is then split into training and testing sets, and each set is saved as a CSV file.
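
The ingestion steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `ingest`, the output file names `train.csv` and `test.csv`, and the 80/20 split with a fixed seed are all assumptions.

```python
from pathlib import Path

import pandas as pd
from sklearn.model_selection import train_test_split


def ingest(csv_path, out_dir, test_size=0.2, random_state=42):
    """Read a CSV, split it into train/test sets, and save both as CSV.

    The split ratio, seed, and output file names are illustrative only.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    df = pd.read_csv(csv_path)
    train_df, test_df = train_test_split(
        df, test_size=test_size, random_state=random_state
    )

    train_path = out_dir / "train.csv"
    test_path = out_dir / "test.csv"
    train_df.to_csv(train_path, index=False)
    test_df.to_csv(test_path, index=False)
    return train_path, test_path
```

Fixing `random_state` keeps the split reproducible between runs, which makes later model comparisons easier to interpret.
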
- Data Transformation:
  - In this phase, a ColumnTransformer pipeline is created.
  - For numeric variables, SimpleImputer is applied with the median strategy, followed by standard scaling.
  - For categorical variables, SimpleImputer is applied with the most-frequent strategy, followed by ordinal encoding and then scaling with StandardScaler.
  - The preprocessor is saved as a pickle file.
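
A preprocessor matching the description above might look like the sketch below. The helper names `build_preprocessor` and `save_object`, and the way column lists are passed in, are assumptions; the real column names depend on the dataset.

```python
import pickle

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler


def build_preprocessor(numeric_cols, categorical_cols):
    """Build a ColumnTransformer with separate numeric/categorical pipelines."""
    # Numeric: fill missing values with the median, then standardize.
    num_pipe = Pipeline([
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
    ])
    # Categorical: fill with the most frequent value, encode as ordinal
    # integers, then standardize the encoded values.
    cat_pipe = Pipeline([
        ("imputer", SimpleImputer(strategy="most_frequent")),
        ("encoder", OrdinalEncoder()),
        ("scaler", StandardScaler()),
    ])
    return ColumnTransformer([
        ("num", num_pipe, numeric_cols),
        ("cat", cat_pipe, categorical_cols),
    ])


def save_object(obj, path):
    """Serialize any Python object (here, the fitted preprocessor) with pickle."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)
```

The preprocessor should be fitted on the training set only and then reused, via the saved pickle, to transform the test set and any new data.
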
- Model Training:
  - In this phase, the base models are tested. The best model found was the CatBoost regressor.
  - Hyperparameter tuning is then performed on the CatBoost and KNN models.
  - A final VotingRegressor is created that combines the predictions of the CatBoost, XGBoost, and KNN models.
  - This model is saved as a pickle file.
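
The tuning and ensembling steps could be sketched as below. This is an assumption-laden outline: the parameter grid, the function names, and the use of `GridSearchCV` are illustrative, since the README does not state which search method or parameters were used. CatBoost and XGBoost are imported lazily so the rest of the sketch runs without them, and they need versions compatible with the installed scikit-learn.

```python
from sklearn.ensemble import VotingRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor


def tune_knn(X, y, cv=3):
    """Grid-search a small, purely illustrative KNN parameter grid."""
    grid = {"n_neighbors": [3, 5, 7], "weights": ["uniform", "distance"]}
    search = GridSearchCV(KNeighborsRegressor(), grid, cv=cv, scoring="r2")
    search.fit(X, y)
    return search.best_estimator_


def boosted_estimators():
    """Return default CatBoost and XGBoost regressors (imported lazily)."""
    from catboost import CatBoostRegressor
    from xgboost import XGBRegressor

    return [
        ("catboost", CatBoostRegressor(verbose=False)),
        ("xgboost", XGBRegressor()),
    ]


def build_ensemble(named_estimators):
    """Combine (name, estimator) pairs; predictions are averaged."""
    return VotingRegressor(estimators=named_estimators)
```

A possible usage is `build_ensemble(boosted_estimators() + [("knn", tune_knn(X_train, y_train))])`, followed by `fit` on the transformed training data and pickling the fitted ensemble.
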
- Prediction Pipeline:
  - This pipeline converts the given data into a DataFrame and provides functions to load the pickle files and predict the final results in Python.
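
A prediction pipeline along these lines might be structured as follows. The class name `PredictPipeline`, the helper `load_object`, and the dict-based input are assumptions made for illustration; the project's actual interface may differ.

```python
import pickle

import pandas as pd


def load_object(path):
    """Load a pickled object such as the preprocessor or the model."""
    with open(path, "rb") as f:
        return pickle.load(f)


class PredictPipeline:
    """Load the saved artifacts once, then predict on single records."""

    def __init__(self, preprocessor_path, model_path):
        self.preprocessor = load_object(preprocessor_path)
        self.model = load_object(model_path)

    def predict(self, record):
        """Predict for one record given as a dict of column name to value."""
        # Wrap the record in a one-row DataFrame so column names match
        # those seen by the preprocessor during fitting.
        df = pd.DataFrame([record])
        features = self.preprocessor.transform(df)
        return self.model.predict(features)
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.
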
Sinhaaz/ML-Project1
Project Name : Student Performance Indicator
