This project predicts customer churn using machine learning models. It includes data cleaning, feature encoding, model training, evaluation, and visual analysis.
- Source: `/content/drive/MyDrive/paper/archive (2).zip`
- Target: `Churn`
- Data Preprocessing:
  - Handle missing values
  - Encode categorical variables
  - Drop irrelevant columns (`customerID`)
  - Scale features
- Models Used:
  - Logistic Regression
  - Random Forest
  - XGBoost
- Evaluation Metrics:
  - Accuracy, Precision, Recall, ROC-AUC, Classification Report
- Feature Importance:
  - Visualized using Random Forest
- EDA Visuals:
  - Churn distribution
  - Contract type vs churn
  - Monthly charges, tenure, and customer service calls vs churn
PDF link: https://drive.google.com/file/d/1OoSteePkrppLvPvC8B79myYiCFpfWMu2/view?usp=drive_link
Install the dependencies with `pip install pandas numpy seaborn matplotlib scikit-learn xgboost`, then use Jupyter Notebook or any Python IDE to execute the script.
Month-to-month contracts, low tenure, and a high number of customer service calls are key churn indicators.
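
One way to check indicators like these is to compare churn rates across groups. The sketch below uses a handful of invented rows and assumed column names (`Contract`, `tenure`, `ServiceCalls`), so its numbers only illustrate the computation and say nothing about the real dataset. The 12-month tenure cut-off is likewise arbitrary.

```python
import pandas as pd

# Tiny hand-made sample (hypothetical rows and column names, not the project's data).
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "Month-to-month", "One year",
                 "One year", "Two year", "Two year", "Month-to-month"],
    "tenure": [2, 5, 30, 14, 40, 60, 48, 8],
    "ServiceCalls": [4, 5, 1, 0, 2, 0, 1, 3],
    "Churn": ["Yes", "Yes", "No", "No", "No", "No", "No", "Yes"],
})
df["churned"] = (df["Churn"] == "Yes").astype(int)

# Share of churned customers per contract type.
rate_by_contract = df.groupby("Contract")["churned"].mean()

# Share of churned customers for short vs. long tenure (12 months is an arbitrary cut-off).
df["tenure_group"] = df["tenure"].lt(12).map({True: "under 12 months", False: "12+ months"})
rate_by_tenure = df.groupby("tenure_group")["churned"].mean()

# Average number of service calls for churned vs. retained customers.
calls_by_churn = df.groupby("Churn")["ServiceCalls"].mean()

print(rate_by_contract, rate_by_tenure, calls_by_churn, sep="\n\n")
```

Run against the real data, the same three group-bys give a quick numeric counterpart to the EDA plots.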