📘 Customer Feedback Analysis using NMF and Sentiment Analysis
Topic Modeling + Sentiment Classification on Customer Reviews
📌 Project Overview
This project analyzes customer feedback automatically using a combination of:
- NLP preprocessing
- TF-IDF vectorization
- NMF (Non-negative Matrix Factorization) topic modeling
- VADER sentiment analysis
- Evaluation metrics (Reconstruction Error, Topic Coherence, Silhouette Score)
- Visualizations such as a topic distribution chart and word clouds
The goal is to understand what customers are talking about and how they feel about different aspects of a product/service.
This project uses a CSV file containing customer review texts.
📂 Dataset Description
The dataset consists of a CSV file with customer reviews:
| Column | Description |
|---|---|
| review_text | The customer review text (string) |
Example review entries:
"The product quality is excellent and very durable."
"Delivery was late and the package was damaged."
"Amazing value for money! Totally worth the price."
🎯 Objectives
- Identify key topics discussed in customer reviews
- Determine sentiment (Positive / Negative / Neutral)
- Cluster reviews based on dominant themes
- Evaluate model performance using topic modeling metrics
- Visualize topic–word distributions and sentiment insights
🧠 Techniques & Algorithms Used
1. Text Preprocessing
- Lowercasing
- Removing punctuation & special symbols
- Removing extra spaces
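The three cleaning steps above can be sketched as a single helper. This is a minimal illustration, not the project's actual code: the function name `clean_text` is hypothetical, and the regular expression assumes English ASCII text, so accented letters would be stripped as "special symbols".

```python
import re


def clean_text(text: str) -> str:
    """Lowercase, strip punctuation/special symbols, and collapse whitespace."""
    text = text.lower()
    # Replace anything that is not a lowercase letter, digit, or whitespace with a space
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse runs of whitespace into one space and trim both ends
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("Amazing value for money!  Totally worth the price."))
```

Replacing symbols with a space rather than deleting them keeps words such as "late,and" from being fused together.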
2. TF-IDF Vectorization
Converts textual reviews into numerical feature vectors.
3. NMF Topic Modeling
Extracts hidden themes by decomposing the TF-IDF matrix into:
- W: Document–Topic matrix
- H: Topic–Word matrix
Outputs:
- Top Words per Topic
- Dominant Topic per Review
4. Sentiment Analysis
Uses VADER to classify each review as:
- Positive
- Negative
- Neutral
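One common way to map VADER's output to these three labels is to threshold the `compound` score at ±0.05, the cutoff suggested in the VADER documentation. The sketch below assumes that convention; the helper names and the threshold default are illustrative, and the project's script may use different cutoffs.

```python
def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"


def classify_reviews(reviews):
    """Label each review using VADER's compound score."""
    # Imported lazily so label_from_compound can be used without VADER installed
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return [
        label_from_compound(analyzer.polarity_scores(review)["compound"])
        for review in reviews
    ]


if __name__ == "__main__":
    print(label_from_compound(0.62))
    print(label_from_compound(-0.31))
    print(label_from_compound(0.0))
```

VADER treats punctuation and capitalization as intensity cues, so it is often applied to the raw review text rather than the cleaned text; whether this project does so is not stated.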
5. Evaluation Metrics
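- Reconstruction Error: Measures how closely the product of W and H approximates the TF-IDF matrix (lower is better)
- Topic Coherence (C_v): Measures the semantic quality of each topic's top words (higher is better)
- Silhouette Score: Measures how well separated the reviews are when grouped by dominant topic (ranges from -1 to 1)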
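Two of these metrics can be computed with scikit-learn alone, as sketched below. The sketch assumes the silhouette score is calculated on the document-topic matrix W with each review's dominant topic as its cluster label; the project may define the clusters differently. Topic coherence (C_v) is typically computed with gensim's `CoherenceModel` and is left out to keep the example short.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score

reviews = [
    "delivery was late and the package was damaged",
    "late delivery and a damaged package again",
    "the product quality is excellent and very durable",
    "excellent quality product very durable",
    "package arrived late delivery was slow",
    "durable product with excellent build quality",
]

X = TfidfVectorizer(stop_words="english").fit_transform(reviews)
nmf = NMF(n_components=2, init="nndsvda", random_state=42, max_iter=500)
W = nmf.fit_transform(X)

# Frobenius norm of (X - W @ H) with the default loss; lower means a closer fit
reconstruction_error = nmf.reconstruction_err_

# Treat each review's dominant topic as its cluster label (an assumption)
labels = W.argmax(axis=1)

# silhouette_score needs at least 2 clusters and fewer clusters than samples
if 1 < len(set(labels.tolist())) < len(reviews):
    silhouette = silhouette_score(W, labels)
else:
    silhouette = float("nan")

print(f"Reconstruction error: {reconstruction_error:.4f}")
print(f"Silhouette score: {silhouette:.4f}")
```

Reconstruction error always falls as topics are added, so it is most useful alongside coherence or silhouette when choosing the number of topics.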
6. Visualizations
- Topic distribution bar chart
- Word clouds for each topic
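Both charts could be produced along the following lines. The function names, figure sizes, and output file names are hypothetical; the sketch assumes `dominant_topics` holds a zero-based topic index per review and `word_weights` maps each word to its weight in one row of H.

```python
from collections import Counter


def topic_counts(dominant_topics, n_topics):
    """Count how many reviews fall under each topic index (0-based)."""
    counts = Counter(dominant_topics)
    return [counts.get(topic, 0) for topic in range(n_topics)]


def plot_topic_distribution(dominant_topics, n_topics, path="topic_distribution.png"):
    """Save a bar chart of reviews per topic; the file name is an assumption."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    counts = topic_counts(dominant_topics, n_topics)
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar([f"Topic {t + 1}" for t in range(n_topics)], counts)
    ax.set_xlabel("Topic")
    ax.set_ylabel("Number of reviews")
    ax.set_title("Topic distribution")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path


def save_topic_wordcloud(word_weights, path):
    """Save a word cloud for one topic from a {word: weight} mapping."""
    from wordcloud import WordCloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(word_weights)
    cloud.to_file(path)
    return path


print(topic_counts([0, 1, 1, 2, 1], n_topics=3))
```

The plotting libraries are imported inside the functions so the counting helper can be reused without them.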
🚀 Project Workflow
- Load CSV containing customer reviews
- Clean and preprocess the text
- Convert text into TF-IDF vectors
- Apply NMF to extract topics
- Extract top words per topic
- Determine dominant topic per review
- Run sentiment analysis
- Calculate evaluation metrics
- Create visualizations
- Save results to customer_feedback_nmf_results.csv
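The first and last steps of the workflow can be sketched with pandas. The `review_text` column and the output file name come from this README; the helper names and the `dominant_topic` and `sentiment` output columns are assumptions made for illustration.

```python
import io

import pandas as pd


def load_reviews(source) -> pd.DataFrame:
    """Read the reviews CSV and keep only rows with non-empty review text."""
    df = pd.read_csv(source)
    if "review_text" not in df.columns:
        raise ValueError("expected a 'review_text' column")
    df = df.dropna(subset=["review_text"]).copy()
    df["review_text"] = df["review_text"].astype(str)
    return df


def save_results(df, dominant_topics, sentiments,
                 path="customer_feedback_nmf_results.csv"):
    """Attach per-review results and write them to CSV.

    The output column names are hypothetical.
    """
    out = df.copy()
    out["dominant_topic"] = list(dominant_topics)
    out["sentiment"] = list(sentiments)
    out.to_csv(path, index=False)
    return out


# Usage with an in-memory CSV in place of a real file
sample = io.StringIO('review_text\n"Great product"\n"Late delivery"\n')
reviews = load_reviews(sample)
buffer = io.StringIO()
results = save_results(reviews, [1, 0], ["Positive", "Negative"], buffer)
print(buffer.getvalue())
```

Failing early on a missing `review_text` column makes a malformed input file easier to diagnose than an error raised later in vectorization.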
📊 Sample Output
Top Words per Topic
- Topic 1: delivery, package, late, damaged
- Topic 2: quality, excellent, durable, product
- Topic 3: service, wrong, disappointed, rude
Dominant Topic Example
- Review: "Delivery was late and package was damaged."
- Dominant Topic: 1 (Delivery Issues)
- Sentiment: Negative
⚙️ How to Run
1. Install dependencies
pip install pandas numpy scikit-learn vaderSentiment gensim wordcloud matplotlib seaborn
2. Run the Python script
python customer_feedback_nmf_sentiment.py
3. View the results
Open the generated customer_feedback_nmf_results.csv file to inspect the results.