Customer Shopping Behavior Analysis

This project is an end-to-end Customer Shopping Analysis transforming raw retail data into actionable business insights through Python-based data cleaning, SQL analytics, and an interactive Power BI dashboard.

📊 Project Overview

The goal of this project is to simulate a corporate-grade, end-to-end data analytics workflow, demonstrating the ability to translate raw data into strategic business intelligence by:

✅ Data Preparation, Modeling & Exploratory Data Analysis (Python): Clean and transform the raw dataset for analysis.

✅ Data Analysis (SQL): Run analytical business queries to extract insights on customer segments, loyalty, and purchase drivers.

✅ Visualization & Insights (Power BI): Interactive dashboard highlighting key patterns and trends for data-driven decision-making.

✅ Report & Presentation: Communicate insights and recommendations in a business-focused manner.

🛠️ How to Use This Project

✅ Clone the Repository

git clone https://github.com/InfiniteLoop360/Customer_Behavior_Analysis.git
cd customer-trends-data-analysis-SQL-Python-PowerBI

✅ Run the Notebook — `Customer_Shopping_Behavior_Analysis.ipynb`

This notebook includes:

📥 Data Import
🔍 Data Exploration (EDA)
🧹 Data Cleaning & Feature Engineering
🔗 SQL Database Connection (PostgreSQL / MySQL / MS SQL Server)
⬆️ Loading Clean Data into SQL Database

✅ SQL Analysis

Create database in SQL Server
Run Python code (from notebook) to populate the database
Execute queries in customer_behavior_sql_queries.sql to answer business questions

✅ Power BI Dashboard

Connect Power BI to SQL Database
Open: customer_behavior_dashboard.pbix
Explore interactive visual insights

✅ Reporting & Presentation

Prepare a business report summarizing findings
Create a presentation deck using Power BI screenshots & business impact (optionally using Gamma AI)

📊 Dashboard Showcase

The final deliverable is an interactive Power BI dashboard that allows business teams to:

Filter insights by season, location, and customer type
Track revenue trends and shopping behavior
Identify high-value segments for marketing

Designed for data-driven decisions and business storytelling.

🧩 Tech Stack

Stage	Technology	Purpose
Data Cleaning	Python (Pandas)	Transform & enrich raw data
Database	PostgreSQL	Scalable data storage & querying
Connector	SQLAlchemy	Seamless Python ➝ SQL integration
Visualization	Power BI	Interactive dashboards & insights

🔄 Project Workflow

This project follows a 3‑phase data pipeline:

✅ 1️⃣ Data Cleaning & Preparation (Python)

Raw dataset: customer_shopping_behavior.csv (3,900 rows)

Key transformations performed:

🧹 Missing Value Imputation
- 37 missing review_rating values filled using median rating per product category
🏷️ Column Standardization
- Converted headers to snake_case for SQL compatibility
🧠 Feature Engineering
- age_group via statistical quartiles → Young Adult / Adult / Middle‑aged / Senior
- purchase_frequency_days converted into numeric values (e.g., Weekly → 7)
🚫 Redundant Column Removal
- Dropped duplicate column promo_code_used

➡️ Final cleaned dataset loaded into PostgreSQL table: customer

📈 2️⃣ Data Analysis (SQL)

Below are the core SQL queries used to derive key business insights:

✅ Q1 — Total Revenue by Gender

SELECT gender, SUM(purchase_amount) AS revenue
FROM customer
GROUP BY gender;

✅ Q2 — Discount Users Spending Above Average

SELECT customer_id, purchase_amount
FROM customer
WHERE discount_applied = 'Yes' AND purchase_amount >= (SELECT AVG(purchase_amount) FROM customer);

✅ Q3 — Top 5 Products by Highest Avg Review Rating

SELECT item_purchased, ROUND(AVG(review_rating::numeric),2) AS "Average Product Rating"
FROM customer
GROUP BY item_purchased
ORDER BY AVG(review_rating) DESC
LIMIT 5;

✅ Q4 — Compare Avg Spend: Standard vs Express Shipping

SELECT shipping_type,
ROUND(AVG(purchase_amount),2)
FROM customer
WHERE shipping_type IN ('Standard','Express')
GROUP BY shipping_type;

✅ Q5 — Subscriber vs Non‑Subscriber Spending & Revenue

SELECT subscription_status,
       COUNT(customer_id) AS total_customers,
       ROUND(AVG(purchase_amount),2) AS avg_spend,
       ROUND(SUM(purchase_amount),2) AS total_revenue
FROM customer
GROUP BY subscription_status
ORDER BY total_revenue, avg_spend DESC;

✅ Q6 — Top 5 Products with Highest Discount Usage

SELECT item_purchased,
       ROUND(100.0 * SUM(CASE WHEN discount_applied = 'Yes' THEN 1 ELSE 0 END) / COUNT(*),2) AS discount_rate
FROM customer
GROUP BY item_purchased
ORDER BY discount_rate DESC
LIMIT 5;

✅ Q7 — Customer Segmentation by Purchase History

WITH customer_type AS (
SELECT customer_id, previous_purchases,
CASE 
    WHEN previous_purchases = 1 THEN 'New'
    WHEN previous_purchases BETWEEN 2 AND 10 THEN 'Returning'
    ELSE 'Loyal'
    END AS customer_segment
FROM customer
)

SELECT customer_segment, COUNT(*) AS "Number of Customers"
FROM customer_type
GROUP BY customer_segment;

✅ Q8 — Top 3 Most Purchased Items per Category

WITH item_counts AS (
    SELECT category,
           item_purchased,
           COUNT(customer_id) AS total_orders,
           ROW_NUMBER() OVER (PARTITION BY category ORDER BY COUNT(customer_id) DESC) AS item_rank
    FROM customer
    GROUP BY category, item_purchased
)
SELECT item_rank, category, item_purchased, total_orders
FROM item_counts
WHERE item_rank <= 3;

✅ Q9 — Subscription Likelihood Among Repeat Buyers

SELECT subscription_status,
       COUNT(customer_id) AS repeat_buyers
FROM customer
WHERE previous_purchases > 5
GROUP BY subscription_status;

✅ Q10 — Revenue Contribution by Age Group

SELECT age_group,
       SUM(purchase_amount) AS total_revenue
FROM customer
GROUP BY age_group
ORDER BY total_revenue DESC;

📊 3️⃣ Data Visualization (Power BI)

Live connection to PostgreSQL database to ensure real‑time, refreshable insights.

🔍 Key Insights & Business Questions Answered

Question	Insight	Value Delivered
Revenue by Gender	Men spent far more	Revenue focus on male audience
High‑Spending Discount Users	839 customers identified	Target for profitable promo strategy
Top Rated Products	Gloves, Sandals, Boots...	Inventory optimization
Shipping Type Impact	Spend nearly identical	Express shipping not revenue driver
Subscriber Value	Non‑subscribers = $170k+ revenue	Big conversion opportunity
Discount‑Driven Products	Hat & Sneakers most influenced	Price sensitivity segmentation
Customer Segmentation	Loyal: 3,116	Retention program potential
Top Products by Category	"Hero" products identified	Perfect for marketing campaigns
Repeat Buyers & Subscription	2,518 not subscribed	Upsell campaign focus
Revenue by Age Group	Young Adult leads	Demographic targeting

🎯 Business Recommendations

Strategic actions based on findings:

✅ Convert 2,518 loyal non‑subscribers → membership program, special onboarding campaigns

✅ Implement loyalty rewards for 3,100+ regular buyers to boost retention

✅ Optimize discount strategy

Promote price‑sensitive items (Hats, Sneakers)
Protect margins on high‑demand products

✅ Marketing campaign focus

"Young Adult" & "Middle-aged" segments → highest spending
Feature top performing products in ads

🚀 How to Run This Project

✅ Prerequisites

Python 3.x
PostgreSQL installed and running
Power BI Desktop

Required Python libraries:

pip install pandas sqlalchemy psycopg2-binary

✅ Setup & Execution

1️⃣ Create a PostgreSQL database → customer_behaviour 2️⃣ Update credentials in main.py 3️⃣ Place customer_shopping_behavior.csv in the same folder 4️⃣ Run the pipeline:

python main.py

✅ Automatically loads cleaned data into PostgreSQL

✅ SQL Analysis

Run queries from analysis.sql in pgAdmin or DBeaver

✅ Dashboard

Open: customer_behaviour.pbix in Power BI Desktop

Update connection path when prompted

📁 Project Structure

├── main.py                # Python data cleaning + PostgreSQL loader
├── analysis.sql           # SQL business insights
├── customer_behaviour.pbix # Power BI dashboard file
├── customer_shopping_behavior.csv # Raw dataset
└── README.md              # Project documentation

🏁 Final Outcome

✅ A complete retail analytics system ✅ SQL‑powered decision intelligence ✅ Dashboard‑based story for stakeholders

This project showcases data engineering + analytics + visualization expertise for real‑world business value.

⭐ Contributions & Feedback!

If you'd like to enhance or explore this project further — feel free to contribute or reach out!

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
Customer Behavior Analysis Assets		Customer Behavior Analysis Assets
Business Problem Document.pdf		Business Problem Document.pdf
Customer Shopping Behavior Analysis.pdf		Customer Shopping Behavior Analysis.pdf
Customer-Shopping-Behavior-Analysis.pptx		Customer-Shopping-Behavior-Analysis.pptx
Customer_Shopping_Behavior_Analysis.ipynb		Customer_Shopping_Behavior_Analysis.ipynb
README.md		README.md
customer_behavior_sql_queries.sql		customer_behavior_sql_queries.sql
customer_behaviour.pbix		customer_behaviour.pbix
customer_shopping_behavior.csv		customer_shopping_behavior.csv

InfiniteLoop360/Customer_Behavior_Analysis

Folders and files

Latest commit

History

Repository files navigation

Customer Shopping Behavior Analysis

📊 Project Overview

🛠️ How to Use This Project

✅ Clone the Repository

✅ Run the Notebook — Customer_Shopping_Behavior_Analysis.ipynb

✅ SQL Analysis

✅ Power BI Dashboard

✅ Reporting & Presentation

📊 Dashboard Showcase

🧩 Tech Stack

🔄 Project Workflow

✅ 1️⃣ Data Cleaning & Preparation (Python)

📈 2️⃣ Data Analysis (SQL)

✅ Q1 — Total Revenue by Gender

✅ Q2 — Discount Users Spending Above Average

✅ Q3 — Top 5 Products by Highest Avg Review Rating

✅ Q4 — Compare Avg Spend: Standard vs Express Shipping

✅ Q5 — Subscriber vs Non‑Subscriber Spending & Revenue

✅ Q6 — Top 5 Products with Highest Discount Usage

✅ Q7 — Customer Segmentation by Purchase History

✅ Q8 — Top 3 Most Purchased Items per Category

✅ Q9 — Subscription Likelihood Among Repeat Buyers

✅ Q10 — Revenue Contribution by Age Group

📊 3️⃣ Data Visualization (Power BI)

🔍 Key Insights & Business Questions Answered

🎯 Business Recommendations

🚀 How to Run This Project

✅ Prerequisites

✅ Setup & Execution

✅ SQL Analysis

✅ Dashboard

📁 Project Structure

🏁 Final Outcome

⭐ Contributions & Feedback!

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

✅ Run the Notebook — `Customer_Shopping_Behavior_Analysis.ipynb`

Packages