DuoDetect 🧠💬

A lightweight machine learning pipeline that detects duplicate questions using text preprocessing and Bag-of-Words features.




🚀 Getting Started

1. Clone the repository

git clone https://github.com/aldol07/DuoDetect.git
cd DuoDetect


2. Install dependencies with uv

uv venv
uv pip install -r requirements.txt


🧪 Usage

Launch the Streamlit app from the repository root:

streamlit run main.py


🔍 Features
Clean, normalized text preprocessing

Bag-of-Words feature extraction

Model training and persistence (model.pkl)

Cross-validation strategies (cv.pkl)

Jupyter-based experimentation
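
To make the preprocessing and Bag-of-Words features above concrete, here is a minimal sketch in plain Python. It is illustrative only and is not taken from this repository: the function names (`normalize`, `build_vocabulary`, `bag_of_words`, `pair_features`) and the example questions are hypothetical, and the project itself may use a library vectorizer and different cleaning rules.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase the text and keep only alphanumeric tokens (assumed cleaning rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(questions):
    """Map each distinct token to a column index, in order of first appearance."""
    vocab = {}
    for question in questions:
        for token in normalize(question):
            vocab.setdefault(token, len(vocab))
    return vocab


def bag_of_words(text, vocab):
    """Return a count vector for `text`; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocab)
    for token, count in Counter(normalize(text)).items():
        if token in vocab:
            vector[vocab[token]] = count
    return vector


def pair_features(q1, q2, vocab):
    """Concatenate the two count vectors into one feature row for a classifier."""
    return bag_of_words(q1, vocab) + bag_of_words(q2, vocab)


# Hypothetical question pair, used only for illustration.
q1 = "How do I learn Python?"
q2 = "How can I learn python quickly?"
vocab = build_vocabulary([q1, q2])
features = pair_features(q1, q2, vocab)
```

Each question pair becomes a single row whose length is twice the vocabulary size, which a classifier can then be trained on to predict whether the pair is a duplicate.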

Live at: duodetectbyaldol.streamlit.app/
