A comprehensive Bayesian Media Mix Modeling system for analyzing marketing channel effectiveness, optimizing budget allocation, and measuring incremental sales impact with MLOps experiment tracking.
This project implements a Bayesian Media Mix Model that outperforms a traditional Ridge regression baseline by combining statistical modeling techniques with marketing domain knowledge.
- Enhanced Seasonality: 3 Fourier terms capturing quarterly business cycles
- Student-t Robustness: Resistant to outliers and noise
- Saturation Transforms: LogisticSaturation modeling diminishing returns
- Adstock Transforms: GeometricAdstock for carryover effects
- Data-informed Priors: Calibrated to actual channel efficiency
- MLOps Experiment Tracking: Systematic parameter optimization and performance logging
# Clone the repository
git clone <repository-url>
cd media-mix-model
# Create virtual environment
python -m venv .env
source .env/bin/activate # On Windows: .env\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Launch Jupyter
jupyter notebook mmm_model_comparison.ipynb
- Load Data: The model works with synthetic data covering 6 marketing channels:
  - TV Spend
  - Paid Social Spend
  - Paid Search Spend
  - Native Spend
  - Display Spend
  - Radio Spend
- Run Model Comparison: Execute the notebook to compare the Bayesian MMM against Ridge regression
- Optimize Allocation: Analyze budget allocation strategies and ROAS optimization
- Bayesian MMM (bayesian_mmm function)
  - PyMC probabilistic programming
  - Student-t likelihood for robustness
  - Hierarchical priors for channel effects
  - Advanced seasonality modeling
- Saturation & Adstock Transforms
  - LogisticSaturation for diminishing returns
  - GeometricAdstock for carryover effects
  - Customizable parameter optimization
- MLOps Experiment Tracking
  - Automated parameter logging
  - Performance comparison dashboard
  - JSON-based experiment persistence
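To make the two transforms concrete, here is a minimal sketch in plain Python. It is not the pymc-marketing implementation: the adstock below is the simple recursive form without a maximum lag or normalization, the saturation uses the logistic form `(1 - exp(-lam * x)) / (1 + exp(-lam * x))`, and the function names and the values `alpha=0.5` and `lam=1.8` are chosen only for illustration.

```python
import math


def geometric_adstock(spend, alpha):
    """Carry a fraction `alpha` of the accumulated effect into the next period."""
    carried = 0.0
    out = []
    for x in spend:
        carried = x + alpha * carried
        out.append(carried)
    return out


def logistic_saturation(x, lam):
    """Map x >= 0 into [0, 1) with diminishing returns; larger lam saturates sooner."""
    return (1.0 - math.exp(-lam * x)) / (1.0 + math.exp(-lam * x))


# Illustrative spend series (arbitrary units): one burst, a pause, a smaller burst.
spend = [1.0, 0.0, 0.0, 0.5]
adstocked = geometric_adstock(spend, alpha=0.5)
saturated = [logistic_saturation(x, lam=1.8) for x in adstocked]
```

Adstock is applied first so that carryover accumulates in spend units, and saturation then compresses the accumulated value. The effect of the first burst is still present in the periods with zero spend.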
# Saturation Parameters (Less Aggressive Configuration)
saturation_params = {
    'tv_spend': {'lam': 1.8, 'contr': 50000},
    'paid_social_spend': {'lam': 1.5, 'contr': 30000},
    'paid_search_spend': {'lam': 2.2, 'contr': 20000},
    'native_spend': {'lam': 1.2, 'contr': 15000},
    'display_spend': {'lam': 1.7, 'contr': 25000},
    'radio_spend': {'lam': 2.0, 'contr': 80000}
}
- Individual channel ROAS calculation
- Contribution percentage analysis
- Coefficient comparison across models
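As a rough illustration of the first two items, ROAS can be read as attributed incremental sales divided by spend, and contribution percentage as each channel's share of the total attributed sales. The helper names and all numbers below are hypothetical and are not results from this project.

```python
def channel_roas(contribution, spend):
    """Incremental sales attributed to a channel divided by its spend."""
    return {ch: contribution[ch] / spend[ch] for ch in spend if spend[ch] > 0}


def contribution_share(contribution):
    """Each channel's percentage of the total attributed contribution."""
    total = sum(contribution.values())
    return {ch: 100.0 * value / total for ch, value in contribution.items()}


# Made-up numbers for illustration only; not results from this project.
spend = {"tv_spend": 1000.0, "radio_spend": 500.0}
contribution = {"tv_spend": 2500.0, "radio_spend": 500.0}

roas = channel_roas(contribution, spend)
share = contribution_share(contribution)
```

Channels with zero spend are skipped in `channel_roas` to avoid a division by zero.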
- MMM-Optimized: Data-driven allocation based on marginal ROAS
- Equal Allocation: Baseline uniform distribution
- Historical Allocation: Current spending patterns
- Lift over baseline scenarios
- Performance gap analysis
- ROI quantification
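One way to picture allocation by marginal ROAS is a greedy loop that hands out the budget in small steps, each time to the channel whose response curve gains the most from the next step. The sketch below assumes a hypothetical saturating response curve per channel, with made-up `(scale, lam)` pairs; it is not necessarily how the notebook performs its optimization.

```python
import math


def response(spend, scale, lam):
    """Hypothetical saturating response: incremental sales at a given spend."""
    return scale * (1.0 - math.exp(-lam * spend)) / (1.0 + math.exp(-lam * spend))


def greedy_allocate(total_budget, curves, step=1.0):
    """Hand out the budget one `step` at a time to the best marginal return."""
    alloc = {ch: 0.0 for ch in curves}
    for _ in range(int(total_budget // step)):

        def gain(ch):
            scale, lam = curves[ch]
            return response(alloc[ch] + step, scale, lam) - response(alloc[ch], scale, lam)

        best = max(curves, key=gain)
        alloc[best] += step
    return alloc


def total_response(alloc, curves):
    """Sum of the per-channel responses for a given allocation."""
    return sum(response(alloc[ch], *curves[ch]) for ch in curves)


# Made-up (scale, lam) pairs for two channels; illustration only.
curves = {"tv_spend": (100.0, 0.02), "paid_search_spend": (60.0, 0.08)}
optimized = greedy_allocate(100.0, curves)
equal = {ch: 100.0 / len(curves) for ch in curves}
```

Because these curves are concave for non-negative spend, the greedy result is at least as good as the equal split on the same step grid. Comparing `total_response(optimized, curves)` with `total_response(equal, curves)` gives the kind of lift-over-baseline figure listed above.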
The project includes a comprehensive experiment tracking system:
# Automatic logging of model runs
log_mmm_experiment(
    saturation_params=saturation_params,
    bayes_metrics=bayes_metrics,
    ridge_metrics=ridge_metrics,
    bayes_roas=bayes_roas,
    ridge_roas=ridge_roas
)
- 6-panel visualization system
- Parameter optimization insights
- Performance trend analysis
- Experiment comparison utilities
- Stored in the mmm_experiments/ directory
- JSON format with timestamps
- Searchable parameter history
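The storage scheme described above can be approximated with the standard library alone. The sketch below is not the project's `log_mmm_experiment`; `save_experiment` and `load_experiments` are hypothetical helpers, and the file naming pattern is an assumption. The demo writes to a temporary directory so that it leaves no files behind.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_experiment(record, directory="mmm_experiments"):
    """Write one experiment record to a timestamped JSON file and return its path."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    path = out_dir / f"experiment_{stamp}.json"
    payload = {"timestamp": stamp, **record}
    path.write_text(json.dumps(payload, indent=2))
    return path


def load_experiments(directory="mmm_experiments"):
    """Read every JSON record in the directory, ordered by file name."""
    return [json.loads(p.read_text()) for p in sorted(Path(directory).glob("*.json"))]


# Demo in a throwaway directory; a real run would use the default directory.
with tempfile.TemporaryDirectory() as tmp:
    save_experiment({"saturation_params": {"tv_spend": {"lam": 1.8}}}, directory=tmp)
    history = load_experiments(tmp)
```

Since the timestamp is part of the file name, sorting by name also sorts runs chronologically, and the history can be filtered with ordinary list comprehensions over the loaded records.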
media-mix-model/
├── README.md
├── requirements.txt
├── mmm_model_comparison.ipynb # Main analysis notebook
├── synthetic_data_generation.ipynb
├── data/
│   ├── synthetic_mmm_data_high_noise.csv
│   ├── synthetic_mmm_data_low_noise.csv
│   └── lift_priors.csv
└── mmm_experiments/ # MLOps experiment logs
    └── *.json # Timestamped experiment files
- Sampling: 3000 samples + 3000 tune (MCMC)
- Seasonality: 3 Fourier terms (annual, semi-annual, quarterly)
- Likelihood: Student-t distribution for robustness
- Priors: Data-informed hierarchical priors, incorporating lift test results
- Channel-specific saturation curves
- Adstock decay parameters
- Prior distributions
- Sampling configuration
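For readers unfamiliar with Fourier seasonality, the sketch below builds sine and cosine features for a time index. It assumes weekly data with a 52-week year and picks harmonics 1, 2, and 4 so that the terms match the annual, semi-annual, and quarterly cycles named above; both choices are assumptions, and the notebook may construct its terms differently. With consecutive orders 1 to 3, the third term would instead have a four-month period.

```python
import math


def fourier_features(t, period, harmonics=(1, 2, 4)):
    """Return [sin, cos] pairs for each harmonic of `period` at time index t."""
    feats = []
    for k in harmonics:
        angle = 2.0 * math.pi * k * t / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


# Two years of weekly rows; harmonic 1 = annual, 2 = semi-annual, 4 = quarterly.
rows = [fourier_features(t, period=52.0) for t in range(104)]
```

Each row has six columns (one sine and one cosine per harmonic), which can be passed to a regression as ordinary covariates.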
Core Libraries:
- pymc (4.0+) - Bayesian modeling
- pymc-marketing - MMM components
- pandas, numpy - Data manipulation
- scikit-learn - ML utilities
- matplotlib, seaborn - Visualization
- jupyter - Notebook hosting
Analysis:
- statsmodels - Statistical modeling
- arviz - Bayesian analysis
- Bayesian MMM superiority: Consistent outperformance in accuracy and business metrics
- Saturation importance: Less aggressive parameters improve realistic ROAS estimates
- Budget optimization: 15-25% efficiency gains through data-driven allocation
This project is open source. Please refer to the LICENSE file for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests and documentation
- Submit a pull request