A production-style Data Engineering project that simulates a real subscription-based SaaS product and builds a complete modern analytics pipeline — from large-scale event generation to an executive business dashboard.
This project models realistic SaaS behavior and implements a layered warehouse architecture (Bronze → Silver → Gold) using industry-standard tools.
This system simulates a subscription-based SaaS platform (Spotify / Swiggy-style behavioral modeling) and builds an end-to-end analytics stack including:
- Large-scale synthetic event generation (~900K+ events)
- Subscription lifecycle simulation (6 months)
- Recurring billing system
- Churn probability modeling
- Retention cohort modeling
- KPI data marts
- Executive analytics dashboard
The goal is to demonstrate real-world Data Engineering practices such as:
- Event-driven architecture
- Star schema modeling
- Warehouse layering
- Analytical KPI marts
- Production-style project organization
Python Event Generator
↓
PostgreSQL (Bronze Layer - Raw Events)
↓
dbt Transformations (Silver Layer - Star Schema)
↓
Gold Layer KPI Data Marts
↓
Streamlit Business Dashboard
- ~900,000+ simulated user events
- 6-month subscription lifecycle modeling
- Multi-cohort retention tracking
- Recurring monthly & yearly billing simulation
- Payment failure probability
- Reactivation behavior modeling
- Fully normalized star schema warehouse
| Layer | Tool | Purpose |
|---|---|---|
| Event Generation | Python | Synthetic event simulation |
| Data Warehouse | PostgreSQL | Central analytics warehouse |
| Transformations | dbt | Staging, star schema, KPI marts |
| Visualization | Streamlit | Business dashboard |
Append-only immutable event store.
Table: `raw_user_events`
- UUID primary key
- Event timestamp
- Event type
- JSONB `event_properties`
- User identifier
Design Principles:
- No transformations
- Fully replayable
- Audit-friendly
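The Bronze contract above (append-only, replayable, no transformations) can be sketched as a minimal insert-only writer. This is a hypothetical illustration: `sqlite3` stands in for PostgreSQL, the JSONB column becomes JSON-encoded text, and the real schema lives in `db/raw_schema.sql`.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# sqlite3 stands in for PostgreSQL here; JSONB becomes a JSON-encoded TEXT column.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE raw_user_events (
        event_id         TEXT PRIMARY KEY,  -- UUID
        event_ts         TEXT NOT NULL,     -- ISO-8601 event timestamp
        event_type       TEXT NOT NULL,
        event_properties TEXT NOT NULL,     -- JSONB in PostgreSQL
        user_id          TEXT NOT NULL
    )
""")

def append_event(user_id: str, event_type: str, properties: dict) -> str:
    """Insert-only writer: no updates, no deletes, fully replayable."""
    event_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO raw_user_events VALUES (?, ?, ?, ?, ?)",
        (
            event_id,
            datetime.now(timezone.utc).isoformat(),
            event_type,
            json.dumps(properties),
            user_id,
        ),
    )
    return event_id

append_event("user-001", "signup", {"plan": "free", "channel": "organic"})
append_event("user-001", "feature_used", {"feature": "search"})
count = conn.execute("SELECT COUNT(*) FROM raw_user_events").fetchone()[0]
print(count)  # 2
```

Because rows are only ever appended, the downstream Silver layer can be rebuilt from scratch at any time by replaying the table in timestamp order.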
Staging: `stg_raw_user_events`
Dimensions: `dim_users`, `dim_plan`, `dim_date`, `dim_event_type`
Facts: `fact_user_activity`, `fact_payments`, `fact_subscriptions`
Design Principles:
- Surrogate keys
- Clear fact-dimension separation
- Optimized for analytical joins
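The surrogate-key principle above can be shown with a small, hypothetical Python sketch: natural keys from raw events are mapped to compact integer surrogate keys, and fact rows store only those surrogates. (In the actual project this mapping is done in dbt SQL models, not Python.)

```python
from itertools import count

class Dimension:
    """Assigns a stable surrogate key to each natural key on first sight."""
    def __init__(self):
        self._keys = {}
        self._next = count(1)

    def surrogate_key(self, natural_key):
        if natural_key not in self._keys:
            self._keys[natural_key] = next(self._next)
        return self._keys[natural_key]

dim_users = Dimension()
dim_event_type = Dimension()

# Toy raw events; the real pipeline reads these from raw_user_events
raw_events = [
    {"user_id": "u-42", "event_type": "signup"},
    {"user_id": "u-42", "event_type": "feature_used"},
    {"user_id": "u-99", "event_type": "signup"},
]

# Fact rows reference dimensions only through surrogate keys,
# keeping analytical joins narrow and fast.
fact_user_activity = [
    {
        "user_sk": dim_users.surrogate_key(e["user_id"]),
        "event_type_sk": dim_event_type.surrogate_key(e["event_type"]),
    }
    for e in raw_events
]
print(fact_user_activity)
```

The same user always resolves to the same surrogate key, so facts from different sources join consistently against the dimensions.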
Pre-aggregated business-ready models:
`kpi_dau`, `kpi_mau`, `kpi_mrr`, `kpi_churn`, `kpi_ltv`, `kpi_retention_cohort`
Optimized for dashboard consumption.
- Daily Active Users (DAU)
- Monthly Active Users (MAU)
- Stickiness Ratio (DAU / MAU)
- Monthly Recurring Revenue (MRR)
- Customer Lifetime Value (LTV)
- Subscription distribution
- Monthly churn rate
- Cohort retention matrix
- Reactivation analysis
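The core KPI formulas listed above can be sketched in plain Python over toy data (the real marts are dbt SQL models; the sample values here are made up for illustration):

```python
from datetime import date

# (user_id, activity_date) pairs for one month
activity = [
    ("u1", date(2024, 1, 1)), ("u2", date(2024, 1, 1)),
    ("u1", date(2024, 1, 2)),
    ("u3", date(2024, 1, 15)),
]

def dau(day):
    """DAU: distinct users active on a given day."""
    return len({u for u, d in activity if d == day})

# MAU: distinct users active at any point in the month
mau = len({u for u, _ in activity})

# Stickiness ratio: DAU / MAU
stickiness = dau(date(2024, 1, 1)) / mau

# MRR: active subscription prices normalized to a monthly amount
subscriptions = [
    {"user": "u1", "price": 120.0, "cycle": "yearly"},
    {"user": "u2", "price": 15.0, "cycle": "monthly"},
]
mrr = sum(s["price"] / 12 if s["cycle"] == "yearly" else s["price"]
          for s in subscriptions)

# Monthly churn rate: churned subscribers / subscribers at start of month
churn_rate = 1 / 4  # e.g. 1 churn out of 4 starting subscribers

print(dau(date(2024, 1, 1)), mau, round(stickiness, 2), mrr)
```

Note the yearly-plan normalization (price / 12) when computing MRR: mixing billing cycles without normalizing is a common source of inflated recurring-revenue numbers.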
The Streamlit dashboard includes:
- Executive KPI summary cards
- DAU & MAU growth trends
- Revenue trend (MRR over time)
- Churn breakdown (absolute + rate)
- Retention cohort heatmap
- LTV distribution histogram
- Architecture overview section
```bash
python generator/event_generator.py
cd dbt
dbt run
streamlit run dashboard/app.py
```

event-driven-saas-analytics/
├── generator/
│   ├── behaviour_engine.py
│ ├── config.py
│ ├── enums.py
│ ├── main.py
│ ├── postgres_writer.py
│ ├── revenue_engine.py
│ └── models.py
│
├── dbt/
│ ├── models/
│ │ ├── staging/
│ │ ├── marts/
│ │ ├── dimensions/
│ │ ├── facts/
│ │ └── intermediate/
│
├── db/
│   ├── indexes.sql
│   └── raw_schema.sql
│
├── dashboard/
│   ├── app.py
│   ├── database.py
│   └── queries.py
│
└── README.md
The synthetic generator models:
- User acquisition growth curve
- Signup → feature usage funnel
- Upgrade probability modeling
- Monthly vs yearly subscription logic
- Recurring billing cycles
- Churn after minimum subscription tenure
- Reactivation probability
- Payment failure scenarios
Designed to mirror realistic SaaS growth dynamics.
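The lifecycle dynamics above (minimum tenure before churn, payment failures, reactivation) can be sketched as a per-user monthly state machine. The probability values here are made-up assumptions for illustration, not the ones configured in `generator/config.py`.

```python
import random

# Illustrative probabilities (assumptions, not the project's real config)
CHURN_PROB = 0.05           # monthly churn chance after minimum tenure
MIN_TENURE_MONTHS = 2       # no churn before this many billing cycles
PAYMENT_FAILURE_PROB = 0.03
REACTIVATION_PROB = 0.10    # monthly chance a churned user returns

def simulate_user(months: int, rng: random.Random) -> list:
    """Return one user's monthly subscription states over the horizon."""
    states, active, tenure = [], True, 0
    for _ in range(months):
        if active:
            tenure += 1
            if rng.random() < PAYMENT_FAILURE_PROB:
                states.append("payment_failed")
            elif tenure > MIN_TENURE_MONTHS and rng.random() < CHURN_PROB:
                active, tenure = False, 0
                states.append("churned")
            else:
                states.append("active")
        else:
            if rng.random() < REACTIVATION_PROB:
                active = True
                tenure = 1
                states.append("reactivated")
            else:
                states.append("inactive")
    return states

rng = random.Random(7)  # seeded for reproducibility
history = simulate_user(6, rng)
print(history)
```

Running this over many users with a growing signup curve produces the kind of cohort-shaped churn and reactivation patterns the downstream retention marts are built to surface.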
- Event-driven system modeling
- Star schema warehouse design
- Layered architecture (Bronze / Silver / Gold)
- Analytical SQL modeling
- KPI mart construction
- Large-scale synthetic data handling
- Production-style project structure
This architecture enables:
- Real-time KPI visibility
- Subscription revenue tracking
- Cohort-based retention analysis
- Data-driven churn reduction strategy
- Executive-level reporting readiness
Planned upgrades:
- Apache Airflow orchestration
- Docker containerization
- Kafka streaming ingestion
- Data quality tests in dbt
- Incremental model optimization
- ~900K+ events
- 6-month lifecycle coverage
- Subscription events
- Payment transactions
- Feature usage activity
- Upgrade & churn transitions
Fully synthetic data generated using Python.
To design and implement a production-style, end-to-end analytics system that simulates real SaaS business operations and demonstrates modern Data Engineering practices at scale.
This project showcases:
- Event-driven system design
- Layered warehouse architecture (Bronze → Silver → Gold)
- Star schema data modeling
- KPI data mart construction
- Large-scale synthetic data handling (~900K+ events)
- Business-ready analytical reporting
Kavin Kishore
Delhi Technological University (DTU)
Built as a production-style Data Engineering project.