Haneenabukhater12/Content-moderation-system

Content Moderation System

A machine-learning-based system for automatic content moderation that classifies user-generated text as toxic, spam, or safe.
The system combines rule-based filters with ML classifiers to simulate the real-world moderation tools used by platforms such as Facebook, Twitter, YouTube, and OpenAI.

👉 Open in Colab: Click Here


📌 Project Overview

This project implements a Content Moderation System that automatically classifies and filters user-generated content using machine learning techniques.


⚙ System Architecture

🔹 Text Preprocessing Module

  • Text normalization and cleaning
  • Feature extraction (length, word count, capitalization ratio, etc.)
  • URL, mention, and hashtag detection
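
The preprocessing steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names `extract_features` and `normalize`, the regular expressions, and the `<url>` / `<user>` placeholder tokens are all assumptions made for the example.

```python
import re

# Hypothetical patterns; the project's real detection rules may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def extract_features(text: str) -> dict:
    """Return simple surface features for one message."""
    letters = [c for c in text if c.isalpha()]
    upper = sum(1 for c in letters if c.isupper())
    return {
        "length": len(text),
        "word_count": len(text.split()),
        # Share of alphabetic characters that are uppercase.
        "cap_ratio": upper / len(letters) if letters else 0.0,
        "url_count": len(URL_RE.findall(text)),
        "mention_count": len(MENTION_RE.findall(text)),
        "hashtag_count": len(HASHTAG_RE.findall(text)),
    }


def normalize(text: str) -> str:
    """Replace URLs and mentions with placeholders, lowercase, collapse whitespace."""
    text = URL_RE.sub(" <url> ", text)
    text = MENTION_RE.sub(" <user> ", text)
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()
```

Features are computed on the raw text, before normalization, so that signals such as capitalization are not lost by lowercasing.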

🔹 Rule-Based Filter

  • Offensive keyword blacklist
  • Pattern matching for spam/promotional content
  • Excessive capitalization and character repetition detection
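
A rule-based filter along these lines might look like the sketch below. The blacklist entries, spam patterns, thresholds, and the `rule_flags` function name are placeholders chosen for illustration; the project's own lists and limits are not given in this README.

```python
import re

# Placeholder lists; a real deployment would load curated ones.
BLACKLIST = {"idiot", "stupid"}
SPAM_PATTERNS = [
    re.compile(r"\b(free|win|winner|click here|buy now)\b", re.IGNORECASE),
    re.compile(r"https?://\S+"),
]
# Any character repeated four or more times in a row, e.g. "sooooo".
REPEAT_RE = re.compile(r"(.)\1{3,}")


def rule_flags(text: str, cap_threshold: float = 0.7) -> dict:
    """Return one boolean per rule for a single message."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    letters = [c for c in text if c.isalpha()]
    cap_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    return {
        "offensive_keyword": bool(words & BLACKLIST),
        "spam_pattern": any(p.search(text) for p in SPAM_PATTERNS),
        # Ignore very short messages so "OK" is not flagged as shouting.
        "excessive_caps": len(letters) >= 10 and cap_ratio > cap_threshold,
        "char_repetition": bool(REPEAT_RE.search(text)),
    }
```

Returning a flag per rule, rather than a single verdict, keeps the output usable as input to a later scoring step.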

🔹 Machine Learning Classifier

  • Algorithms: Naive Bayes, Logistic Regression, SVM, Random Forest
  • TF-IDF vectorization
  • Multi-class classification: toxic, spam, safe
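
As a rough sketch of the TF-IDF plus classifier setup, the example below trains a scikit-learn pipeline on a handful of made-up sentences. The toy data, the choice of Logistic Regression from the listed algorithms, the n-gram range, and the `classify` helper are assumptions for illustration only; the real model is trained on the project dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set, only to make the example runnable.
texts = [
    "you are a worthless idiot",
    "i hate you so much",
    "win a free prize click now",
    "buy cheap followers limited offer",
    "see you at the meeting tomorrow",
    "thanks for the helpful answer",
]
labels = ["toxic", "toxic", "spam", "spam", "safe", "safe"]

# TF-IDF features feed a multi-class classifier.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)


def classify(text: str) -> dict:
    """Return a probability per class for one message."""
    probs = model.predict_proba([text])[0]
    return {label: float(p) for label, p in zip(model.classes_, probs)}
```

`MultinomialNB` or `RandomForestClassifier` can be dropped into the same pipeline slot. `LinearSVC` does not provide `predict_proba`, so an SVM would need calibration (for example via `CalibratedClassifierCV`) before its output could be used as a probability.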

🔹 Risk Assessment Engine

  • Weighted scoring combining rule-based + ML predictions
  • Configurable confidence thresholds
  • Dynamic decision making based on risk scores
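
One way the weighted scoring and thresholds could fit together is sketched below. The weights, the threshold values, the three outcomes (`allow`, `review`, `block`), and the function names are hypothetical; the README does not specify the actual formula.

```python
def risk_score(rule_flags: dict, ml_probs: dict,
               rule_weight: float = 0.4, ml_weight: float = 0.6) -> float:
    """Combine rule hits and ML output into one score in [0, 1]."""
    # Fraction of rules that fired.
    rule_score = sum(rule_flags.values()) / len(rule_flags) if rule_flags else 0.0
    # Probability that the message is anything other than "safe".
    ml_score = 1.0 - ml_probs.get("safe", 0.0)
    return rule_weight * rule_score + ml_weight * ml_score


def decide(score: float, review_threshold: float = 0.4,
           block_threshold: float = 0.7) -> str:
    """Map a risk score to an action using configurable thresholds."""
    if score >= block_threshold:
        return "block"
    if score >= review_threshold:
        return "review"
    return "allow"
```

For example, with one of two rules firing and a "safe" probability of 0.2, the score is 0.4 × 0.5 + 0.6 × 0.8 = 0.68, which falls in the review band under these default thresholds.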

🚀 Google Colab Notebook

You can run and explore the project notebook here:
👉 Colab Link


📊 Dataset

The dataset used for training the model is available on Google Drive:
👉 Download Dataset


πŸ‘©β€πŸ’» Team Members

  • Haneen Mohamed
  • Mai Mohamed
  • Farida Montser
