Spam Email Classifier

A beginner-friendly machine learning project that classifies SMS messages as Spam or Ham (Not Spam) using Python, Scikit-learn, and TF-IDF vectorization. It covers text preprocessing, model training, evaluation, and vocabulary visualization with WordClouds.

🚀 Features

Classifies SMS messages as spam or not spam
Uses TF-IDF to convert text to numeric features
Built with the Multinomial Naive Bayes (MultinomialNB) algorithm
Reports accuracy score, confusion matrix, and classification report
WordCloud visualizations for spam and ham messages
Live predictions on custom sample messages

🛠️ Tech Stack

Category	Tools Used
Language	Python
Libraries	Pandas, Scikit-learn, Matplotlib, WordCloud
ML Algorithm	Naive Bayes (MultinomialNB)
Feature Extraction	TF-IDF Vectorizer
IDE	Jupyter Notebook

📂 Dataset

Name: SMS Spam Collection Dataset
Source: Kaggle Datasets
File: spam.csv (included in this repo)

This dataset contains 5,572 SMS messages labeled as either "spam" or "ham".
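To work with this file in pandas, something like the following may help. It is a minimal sketch rather than the notebook's exact code: it assumes the common Kaggle export layout (label in a column named v1, message text in v2, latin-1 encoding, empty trailing columns), so check those names against your copy of spam.csv. The helper name load_sms_dataset and the demo rows are made up for illustration.

```python
import io

import pandas as pd


def load_sms_dataset(source):
    """Return a DataFrame with 'label', 'text', and numeric 'label_num' columns."""
    # Assumption: the CSV uses columns v1 (label) and v2 (message) and is
    # latin-1 encoded; any extra unnamed columns are ignored via usecols.
    df = pd.read_csv(source, encoding="latin-1", usecols=["v1", "v2"])
    df = df.rename(columns={"v1": "label", "v2": "text"})
    # Label conversion as described in this README: ham -> 0, spam -> 1.
    df["label_num"] = df["label"].map({"ham": 0, "spam": 1})
    return df


# Tiny in-memory stand-in for spam.csv (hypothetical rows, not real data).
demo_bytes = (
    "v1,v2,,\n"
    "ham,See you at lunch,,\n"
    "spam,WIN a free prize now,,\n"
    "ham,Call me later,,\n"
).encode("latin-1")

demo = load_sms_dataset(io.BytesIO(demo_bytes))
print(demo)
```

With the real file, the call would be load_sms_dataset("spam.csv").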

📊 WordCloud Visualizations

Figure: spam message vocabulary (black background)
Figure: ham message vocabulary (white background)

📌 How It Works

Load Dataset – Reads and cleans the SMS spam dataset.
Preprocess Data – Converts labels to numbers (ham → 0, spam → 1).
WordClouds – Generates WordClouds for both spam and ham messages.
TF-IDF Vectorization – Converts message text into numeric vectors.
Train-Test Split – Uses 80% of the data for training and 20% for testing.
Train Model – Fits a Naive Bayes (MultinomialNB) classifier.
Evaluate Model – Prints accuracy, confusion matrix, and classification report.
Live Prediction – Tests the model on your own text inputs.
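The vectorize, split, train, and evaluate steps above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the notebook's code: the ten messages are a hypothetical toy corpus standing in for spam.csv, and random_state=42 and stratify are choices made here so the tiny example is repeatable.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus (not rows from spam.csv); 0 = ham, 1 = spam.
texts = [
    "Are you coming to dinner tonight",
    "Can you send me the notes from class",
    "I will call you after work",
    "Happy birthday, see you on Saturday",
    "Running late, be there in ten minutes",
    "You have won a free prize, claim now",
    "Free entry to win cash, text WIN now",
    "Urgent: claim your free reward today",
    "Click the link to claim your cash prize",
    "Winner! Call now to collect your free gift",
]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# TF-IDF vectorization, done before the split to mirror the step order above.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# 80% training, 20% testing; stratify keeps one ham and one spam in the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=42, stratify=labels
)

# Train the Multinomial Naive Bayes classifier.
model = MultinomialNB()
model.fit(X_train, y_train)

# Evaluate: accuracy, confusion matrix, and classification report.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])

print("Accuracy:", accuracy)
print("Confusion matrix:")
print(cm)
print(
    classification_report(
        y_test, y_pred, labels=[0, 1], target_names=["ham", "spam"], zero_division=0
    )
)
```

One design note: fitting the vectorizer on all messages before splitting lets IDF statistics from the test set reach the training features. Fitting it on the training portion only and then calling transform on the test portion is the stricter alternative.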

🔍 Sample Predictions

sample = ["Congratulations! You've won a free iPhone. Click the below link to claim"]
# Output: Spam

sample = ["Hey, are we still meeting at 6 PM?"]
# Output: Not Spam
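For reference, a prediction helper along these lines might look like the sketch below. It is an assumption-laden illustration, not the notebook's code: the training messages are a hypothetical toy corpus, predict_message and LABEL_NAMES are names invented here, and the vectorizer and classifier are bundled with scikit-learn's make_pipeline so raw strings can be passed straight to predict. A model trained on this tiny corpus will not necessarily reproduce the outputs listed above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy training data (not rows from spam.csv); 0 = ham, 1 = spam.
train_texts = [
    "Are we still meeting for lunch",
    "Hey, can you call me later",
    "See you at the office at 6",
    "You have won a free phone, click to claim",
    "Congratulations, claim your free prize now",
    "Click this link to win cash",
]
train_labels = [0, 0, 0, 1, 1, 1]

# Bundle TF-IDF and Naive Bayes so new text gets the same vocabulary and weights.
classifier = make_pipeline(TfidfVectorizer(), MultinomialNB())
classifier.fit(train_texts, train_labels)

# Map numeric predictions back to the labels used in this README.
LABEL_NAMES = {0: "Not Spam", 1: "Spam"}


def predict_message(message):
    """Classify a single message string and return 'Spam' or 'Not Spam'."""
    predicted = classifier.predict([message])[0]
    return LABEL_NAMES[int(predicted)]


samples = [
    "Congratulations! You've won a free iPhone. Click the below link to claim",
    "Hey, are we still meeting at 6 PM?",
]
for sample in samples:
    print(sample, "->", predict_message(sample))
```

Wrapping both stages in one pipeline avoids the common mistake of refitting the vectorizer on new input, which would change the feature columns the model was trained on.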

📝 License

This project is licensed under the MIT License. Feel free to use, modify, and distribute it for personal and commercial purposes.

🙌 Contribution

Contributions, issues, and feature requests are welcome! Feel free to fork this repo and submit a pull request.

💬 Contact

Created with ❤️ by Sushmitha Shettigar
