Bayesian inference for linear models with continuous and discrete responses using MCMC and Variational Inference (VI).
This repository provides a Python package (bayes_linear) implementing Bayesian approaches for regression and classification tasks. All core implementations rely primarily on NumPy, SciPy, and pandas, making them lightweight and easy to understand.
Two inference methods are available:
- Mean-Field Variational Inference (VI): Fast approximate inference via Coordinate Ascent Variational Inference (CAVI)
- Markov Chain Monte Carlo (MCMC): Gibbs sampling with data augmentation for exact posterior inference. Also computes the posterior density of residuals for model diagnostics and outlier detection. See Albert and Chib (1995) for details.
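
The Gibbs sampler with data augmentation can be sketched in a few lines of NumPy/SciPy. This is an illustrative toy implementation under a flat prior on the coefficients, not the package's `BayesProbit_MCMC` code: a latent Gaussian variable is sampled from a truncated normal (positive side if `y = 1`, negative side if `y = 0`), then the coefficients are drawn from their Gaussian conditional.

```python
# Hedged sketch of Albert-Chib data augmentation for the probit model
# (flat prior on beta; variable names are illustrative, not the package API).
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)

# Toy data: binary labels from a true probit model
n, p = 200, 2
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -0.5])
y = (X @ beta_true + rng.normal(size=n) > 0).astype(int)

XtX_inv = np.linalg.inv(X.T @ X)
L = np.linalg.cholesky(XtX_inv)
beta = np.zeros(p)
draws = []
for it in range(500):
    # 1) Latent z_i | beta ~ N(x_i' beta, 1), truncated to z_i > 0 if y_i = 1
    #    and z_i <= 0 if y_i = 0 (bounds expressed on the standard scale).
    mu = X @ beta
    lo = np.where(y == 1, -mu, -np.inf)
    hi = np.where(y == 1, np.inf, -mu)
    z = mu + truncnorm.rvs(lo, hi, random_state=rng)
    # 2) beta | z ~ N((X'X)^{-1} X'z, (X'X)^{-1})
    beta_hat = XtX_inv @ (X.T @ z)
    beta = beta_hat + L @ rng.normal(size=p)
    draws.append(beta)

posterior_mean = np.mean(draws[100:], axis=0)  # discard burn-in draws
```

With enough iterations the posterior mean recovers the sign and rough magnitude of the true coefficients; the package classes wrap this kind of loop with burn-in handling and prediction methods.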
Robust regression model for heavy-tailed response variables using Gibbs sampling. The Student-t distribution provides robustness against outliers compared to standard Gaussian linear regression.
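
The standard Gibbs construction for Student-t errors treats them as a scale mixture of normals: each observation gets a latent precision drawn from a Gamma conditional, turning the coefficient update into weighted least squares. The sketch below illustrates that augmentation with a flat prior on the coefficients; it is not the package's `bstudent` implementation.

```python
# Hedged sketch: Student-t regression via the normal scale-mixture
# augmentation (illustrative only, not the package's bstudent class).
import numpy as np

rng = np.random.default_rng(1)
nu = 5  # degrees of freedom of the Student-t error distribution

# Toy data with heavy-tailed noise
n, p = 300, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([2.0, 1.0])
y = X @ beta_true + rng.standard_t(nu, size=n)

beta = np.zeros(p)
sigma2 = 1.0
draws = []
for it in range(1000):
    # 1) Latent precisions: lambda_i | rest ~ Gamma((nu+1)/2, rate (nu + r_i^2/sigma2)/2)
    r = y - X @ beta
    lam = rng.gamma((nu + 1) / 2, 2.0 / (nu + r**2 / sigma2))
    # 2) beta | rest: weighted least squares with weights lambda_i / sigma2
    W = lam / sigma2
    V = np.linalg.inv(X.T @ (W[:, None] * X))
    m = V @ (X.T @ (W * y))
    beta = rng.multivariate_normal(m, V)
    # 3) sigma2 | rest ~ Inv-Gamma(n/2, sum(lambda_i r_i^2)/2)
    r = y - X @ beta
    sigma2 = 1.0 / rng.gamma(n / 2, 2.0 / np.sum(lam * r**2))
    draws.append(beta)

posterior_mean = np.mean(draws[200:], axis=0)  # discard burn-in draws
```

Outlying observations receive small latent precisions `lam[i]` and are therefore down-weighted automatically, which is the source of the robustness.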
Implementation of Bayesian Active Learning by Disagreement (BALD) for binary classification. Also supports Batch BALD for selecting multiple informative samples simultaneously.
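
For binary classification, the BALD acquisition score is the mutual information between a candidate label and the model parameters: the entropy of the averaged predictive probability minus the average entropy of the per-draw predictions. The sketch below computes it from a matrix of posterior coefficient samples; the function name and signature are illustrative, not the package's `BALD()`.

```python
# Hedged sketch of the BALD score for a Bayesian probit classifier,
# computed from posterior coefficient draws (illustrative names only).
import numpy as np
from scipy.stats import norm

def bald_scores(X_pool, beta_samples):
    """Mutual information between label and parameters for each pool point."""
    def H(p):  # binary entropy, clipped for numerical safety at 0/1
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return -(p * np.log(p) + (1 - p) * np.log(1 - p))
    # Predictive probabilities under each posterior draw: shape (S, N)
    probs = norm.cdf(beta_samples @ X_pool.T)
    # Entropy of the mean prediction minus mean entropy of the predictions
    return H(probs.mean(axis=0)) - H(probs).mean(axis=0)

rng = np.random.default_rng(2)
X_pool = rng.normal(size=(50, 3))          # unlabeled candidate pool
beta_samples = rng.normal(size=(200, 3))   # stand-in for MCMC draws
scores = bald_scores(X_pool, beta_samples)
top10 = np.argsort(scores)[::-1][:10]      # 10 most informative points
```

Points where the posterior draws disagree most about the label get the highest scores; Batch BALD extends this idea to score sets of points jointly rather than greedily picking the top-k.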
A 2-hidden layer Bayesian feedforward neural network using PyMC3 with informative priors.
```
bayes-linear/
├── bayes_linear/                  # Main package
│   ├── __init__.py
│   ├── bprobit.py                 # Bayesian probit model (VI + MCMC) and BALD
│   ├── bstudent.py                # Student-t robust linear regression
│   ├── bnnet.py                   # Bayesian neural network (PyMC3)
│   ├── requirements.txt           # Package dependencies
│   └── setup.py                   # Package setup file
├── example_bayes_probit.ipynb     # Example notebook for probit model
├── AL_BayesProbit.ipynb           # Active learning with BALD example
└── README.md
```
| File | Description |
|---|---|
| `bprobit.py` | Contains the `BayesProbit_MCMC` and `BayesProbit_VI` classes for Bayesian binary classification, plus the `BALD()` function for active learning |
| `bstudent.py` | Contains the `bstudent` class for robust linear regression with Student-t errors |
| `bnnet.py` | Contains the `bnn` class implementing a Bayesian neural network with PyMC3/Theano |
| `example_bayes_probit.ipynb` | Jupyter notebook demonstrating usage of the Bayesian probit model |
| `AL_BayesProbit.ipynb` | Jupyter notebook demonstrating active learning with the BALD algorithm |
```bash
git clone https://github.com/AVoss84/bayes-linear.git
cd bayes-linear
pip install -r bayes_linear/requirements.txt
```

Core dependencies: `numpy`, `scipy`, `pandas`, `scikit-learn`

For the Bayesian neural network (`bnnet.py`): `pymc3`, `theano`
```python
from bayes_linear.bprobit import BayesProbit_MCMC

# Initialize model
model = BayesProbit_MCMC(N_sim=10000, burn_in=5000, verbose=True)

# Fit model
model.fit(X_train, y_train)

# Predict probabilities
probs = model.predict_proba(X_test)

# Predict labels
predictions = model.predict(X_test)
```

```python
from bayes_linear.bprobit import BayesProbit_VI

# Initialize model
model = BayesProbit_VI(max_iter=500, verbose=True)

# Fit model
model.fit(X_train, y_train)

# Predict
predictions = model.predict(X_test)
```

```python
from bayes_linear.bprobit import BayesProbit_MCMC, BALD

# Train initial model
model = BayesProbit_MCMC()
model.fit(X_train, y_train)

# Select most informative samples from unlabeled pool
selected_samples, indices, scores = BALD(X_pool, model.theta_final, b=10)
```

```python
from bayes_linear.bstudent import bstudent

# Initialize with degrees of freedom
model = bstudent(nu=5, verbose=True)

# Fit via Gibbs sampling
model.fit(X, y, MCsim=2000, burnin=1000)
```

- Albert, J. H., & Chib, S. (1995). Bayesian residual analysis for binary response regression models. Biometrika, 82(4), 747-769.
- Houlsby, N., Huszár, F., Ghahramani, Z., & Lengyel, M. (2011). Bayesian active learning for classification and preference learning. arXiv preprint arXiv:1112.5745.
- Kirsch, A., van Amersfoort, J., & Gal, Y. (2019). BatchBALD: Efficient and diverse batch acquisition for deep Bayesian active learning. NeurIPS 2019.
This project is open source. See the repository for license details.