Lolya-cloud/Movie-Genre-Classifier

Movie-Genre-Classifier

Predict one or more movie genres from a film’s plot summary using sentence/document embeddings (BERT, SBERT, Word2Vec, TF-IDF) and a flexible neural classifier. This repo implements a clean, reproducible NLP → multi-label classification pipeline in Python.

What it does

  • Task: Multi-label text classification (a movie can have several genres).
  • Pipeline: text cleanup → embedding (BERT / SBERT / Word2Vec) → feed-forward neural network → thresholding.
  • Models (as implemented):
    • BERT with CLS + MLP (BertPretrained embeddings → FlexibleNeuralNetwork)
    • SBERT + MLP (Sentence-BERT embeddings → FlexibleNeuralNetwork)
    • Google News Word2Vec + MLP (Word2Vec 300-d embeddings → FlexibleNeuralNetwork)
  • Evaluation: Micro/Macro-F1, Precision/Recall, Jaccard, Hamming accuracy; global decision-threshold tuning.
  • Reproducibility: Seeded runs, requirements.txt, scriptable CLI; metrics saved under analysis/model_comparison/.
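
The last two pipeline stages (feed-forward network → thresholding) can be sketched as follows. This is a minimal, dependency-free illustration and not the repo's FlexibleNeuralNetwork: the genre list, layer sizes, random untrained weights, and the helper names `linear`, `init_layer`, and `predict_genres` are all assumptions made for the example. A real implementation would use a tensor library and trained weights. The point it shows is that each genre gets its own sigmoid output, so several genres can pass the threshold at once.

```python
import math
import random

# Hypothetical label set; the repo's actual genre list is not shown in this README.
GENRES = ["Action", "Comedy", "Drama", "Romance"]


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict_genres(embedding, hidden, output, threshold=0.5):
    """Embedding -> ReLU hidden layer -> per-genre sigmoid -> thresholding."""
    h = [max(0.0, z) for z in linear(embedding, *hidden)]
    probs = [sigmoid(z) for z in linear(h, *output)]
    picked = [g for g, p in zip(GENRES, probs) if p >= threshold]
    return probs, picked


rng = random.Random(0)  # seeded so repeated runs give the same numbers
embedding = [rng.gauss(0.0, 1.0) for _ in range(8)]  # stand-in for a plot embedding
hidden = init_layer(rng, 8, 16)
output = init_layer(rng, 16, len(GENRES))
probs, picked = predict_genres(embedding, hidden, output, threshold=0.5)
```

Because the outputs are independent sigmoids rather than a softmax, lowering `threshold` can only add genres to a prediction, which is why a single global threshold is a useful knob to tune.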

Results

Top run (test set, best global threshold):

| Model | F1 (best-threshold) | Threshold | Precision | Recall | Jaccard | Hamming acc. |
| --- | --- | --- | --- | --- | --- | --- |
| SBERT with NN | 0.660 | 0.271 | 0.642 | 0.676 | 0.491 | 0.905 |
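
As a rough illustration of how a "best global threshold" row like the one above might be produced, the sketch below sweeps candidate thresholds and keeps the one with the highest micro-F1. It is not the repo's evaluation code. The toy labels and probabilities are invented, the helper names `metrics_at` and `best_global_threshold` are hypothetical, and micro-averaging for Precision, Recall, and Jaccard is an assumption, since the README does not state the averaging mode. Hamming accuracy is taken here as the fraction of individual label slots predicted correctly.

```python
def metrics_at(y_true, y_prob, threshold):
    """Micro-averaged metrics for one global decision threshold."""
    tp = fp = fn = tn = 0
    for truth_row, prob_row in zip(y_true, y_prob):
        for truth, prob in zip(truth_row, prob_row):
            pred = prob >= threshold
            if pred and truth:
                tp += 1
            elif pred and not truth:
                fp += 1
            elif truth:
                fn += 1
            else:
                tn += 1
    return {
        "threshold": threshold,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp else 0.0,
        "jaccard": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
        "hamming_acc": (tp + tn) / (tp + fp + fn + tn),
    }


def best_global_threshold(y_true, y_prob, candidates):
    """Return the metrics dict of the candidate threshold with the highest micro-F1."""
    return max((metrics_at(y_true, y_prob, t) for t in candidates), key=lambda m: m["f1"])


# Toy data (invented): 4 movies x 3 genres, 1 = genre applies.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_prob = [[0.90, 0.20, 0.42], [0.10, 0.60, 0.28], [0.70, 0.37, 0.20], [0.28, 0.10, 0.80]]

candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95
best = best_global_threshold(y_true, y_prob, candidates)
at_half = metrics_at(y_true, y_prob, 0.5)
```

To avoid an optimistic estimate, the threshold would normally be chosen on a validation split and then applied unchanged to the test set.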

About

Multi-label movie genre classification based on movie summary/overview.
