Movie-Genre-Classifier

Predict one or more movie genres from a film’s plot summary using sentence/document embeddings (BERT, SBERT, Word2Vec, TF-IDF) and a flexible neural classifier. This repo implements a clean, reproducible NLP pipeline for multi-label classification in Python.

What it does

Task: Multi-label text classification (a movie can have several genres).
Pipeline: text cleanup → embedding (BERT / SBERT / Word2Vec) → feed-forward neural network → thresholding.
Models (as implemented):
- BERT [CLS] + MLP (BertPretrained [CLS]-token embeddings → FlexibleNeuralNetwork)
- SBERT + MLP (Sentence-BERT embeddings → FlexibleNeuralNetwork)
- Google News Word2Vec + MLP (Word2Vec 300-d embeddings → FlexibleNeuralNetwork)
Evaluation: Micro/Macro-F1, Precision/Recall, Jaccard, Hamming accuracy; global decision-threshold tuning.
Reproducibility: Seeded runs, requirements.txt, scriptable CLI; metrics saved under analysis/model_comparison/.
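
The global decision-threshold step can be illustrated with a small, dependency-free sketch. It is not taken from the repo's scripts: the function names (`micro_f1`, `tune_global_threshold`), the candidate grid, and the toy data are all assumptions made for illustration. The idea is to apply one shared cutoff to every genre's predicted probability and keep the cutoff that maximizes micro-F1. In practice such a cutoff is usually chosen on a held-out validation split.

```python
from typing import List, Optional, Sequence, Tuple


def micro_f1(y_true: Sequence[Sequence[int]],
             y_prob: Sequence[Sequence[float]],
             threshold: float) -> float:
    """Micro-averaged F1 after binarizing probabilities at one shared threshold."""
    tp = fp = fn = 0
    for true_row, prob_row in zip(y_true, y_prob):
        for t, p in zip(true_row, prob_row):
            pred = p >= threshold
            if pred and t:
                tp += 1
            elif pred and not t:
                fp += 1
            elif t and not pred:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_global_threshold(y_true: Sequence[Sequence[int]],
                          y_prob: Sequence[Sequence[float]],
                          candidates: Optional[List[float]] = None
                          ) -> Tuple[float, float]:
    """Return (best_threshold, best_micro_f1) over a grid of candidate cutoffs."""
    if candidates is None:
        # Hypothetical grid: 0.05, 0.06, ..., 0.95
        candidates = [i / 100 for i in range(5, 96)]
    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        f1 = micro_f1(y_true, y_prob, t)
        if f1 > best_f1:  # strict ">" keeps the lowest threshold on ties
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy example: 3 movies, 3 genres (made-up numbers, not repo data).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_prob = [[0.9, 0.2, 0.4], [0.1, 0.6, 0.3], [0.7, 0.35, 0.2]]
best_t, best_f1 = tune_global_threshold(y_true, y_prob)
print(f"best threshold={best_t:.2f}, micro-F1={best_f1:.3f}")
```

A single global threshold is the simplest option. Per-genre thresholds are a common alternative when rare genres are systematically under-predicted, at the cost of more parameters to tune.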

Results

Top run (test set, best global threshold):

| Model | F1 (best threshold) | Threshold | Precision | Recall | Jaccard | Hamming accuracy |
|---|---|---|---|---|---|---|
| SBERT with NN | 0.660 | 0.271 | 0.642 | 0.676 | 0.491 | 0.905 |
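
For readers unfamiliar with the last two columns, here is a minimal sketch of how Jaccard and Hamming accuracy are commonly computed for multi-label predictions. It assumes a sample-averaged Jaccard score and Hamming accuracy defined as one minus Hamming loss. The repo's exact averaging is not stated here, and the function names and toy data below are hypothetical.

```python
from typing import Sequence


def jaccard_samples(y_true: Sequence[Sequence[int]],
                    y_pred: Sequence[Sequence[int]]) -> float:
    """Mean over samples of |true ∩ pred| / |true ∪ pred|."""
    scores = []
    for t_row, p_row in zip(y_true, y_pred):
        inter = sum(1 for t, p in zip(t_row, p_row) if t and p)
        union = sum(1 for t, p in zip(t_row, p_row) if t or p)
        # Convention chosen for this sketch: two empty label sets count as a perfect match.
        scores.append(inter / union if union else 1.0)
    return sum(scores) / len(scores)


def hamming_accuracy(y_true: Sequence[Sequence[int]],
                     y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of individual (movie, genre) cells predicted correctly."""
    total = correct = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            total += 1
            correct += int(t == p)
    return correct / total


# Toy example: 2 movies, 4 genres (made-up labels, not repo data).
y_true = [[1, 0, 1, 0], [0, 1, 0, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 1, 0]]
jac = jaccard_samples(y_true, y_pred)
ham = hamming_accuracy(y_true, y_pred)
print(f"Jaccard={jac:.3f}, Hamming accuracy={ham:.3f}")
```

Hamming accuracy gives credit for every correctly predicted absent genre, so it tends to sit well above Jaccard when each movie has only a few of the possible genres.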
