Alpha Research Framework

A rigorous, reproducible framework for testing ML-based alpha generation strategies.

Python 3.8+ License: MIT

🎯 Key Finding

This research demonstrates that apparent alpha from ML models on public market data is primarily overfitting.

The critical comparison between experiments C6 and C7 demonstrates this:

| Metric         | C6 (Weak Regularization) | C7 (Strong Regularization) |
|----------------|--------------------------|----------------------------|
| Features       | 41                       | 8                          |
| Ridge α        | 1.0                      | 100.0                      |
| Mean IC        | +0.036                   | -0.084                     |
| p-value        | 0.14                     | 0.001                      |
| Validation R²  | All negative             | All negative               |

The IC flips from positive to significantly negative when overfitting is controlled.

This is the classic signature of spurious patterns: complex models find noise, simple models reveal truth.
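Throughout this README, IC denotes the information coefficient: the rank correlation between model predictions and subsequently realized returns. A minimal pure-NumPy sketch (the function name is illustrative and not from this repo; ties are ignored for simplicity):

```python
import numpy as np

def spearman_ic(predictions, realized):
    """Information coefficient: Spearman rank correlation between
    model predictions and realized returns (no tie handling)."""
    rank_p = np.argsort(np.argsort(predictions))  # rank of each prediction
    rank_r = np.argsort(np.argsort(realized))     # rank of each realized return
    return float(np.corrcoef(rank_p, rank_r)[0, 1])
```

An IC of +1 means predictions order assets exactly as their realized returns did; values near zero, as found here, mean the ranking carries no information.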


📊 Experiment Results Summary

Equity Experiments (E1-E4)

| ID | Description               | Key Result                     | Status                              |
|----|---------------------------|--------------------------------|-------------------------------------|
| E1 | LSTM Overfitting Demo     | Train IC +0.86 → Val IC -0.06  | ✅ Severe overfitting demonstrated   |
| E2 | Cross-Sectional Targets   | IC ≈ 0.02-0.03, normal decay   | ✅ Weak signal                       |
| E3 | Fundamentals (BIASED)     | IC +0.18, p=0.0000             | ⚠️ FAKE: look-ahead bias             |
| E4 | Fundamentals (Corrected)  | IC ≈ 0.02, p > 0.10            | ✅ No significance after correction  |

Crypto Experiments (C1-C7)

| ID | Description            | Key Result                | Status                  |
|----|------------------------|---------------------------|-------------------------|
| C1 | Time-Series Direction  | Edge ~0.5%, IC ~0.04      | ✅ Marginal, not robust  |
| C2 | Technical Alpha        | IC -0.01, p=0.55          | ✅ Not significant       |
| C3 | Production System      | Sharpe 0.69, p=0.02       | ⚠️ Borderline            |
| C6 | Overfitting Demo       | IC +0.036, Val R² < 0     | ⚠️ SPURIOUS              |
| C7 | Anti-Overfitting       | IC -0.084, p=0.001        | ✅ MAIN RESULT           |

🚀 Quick Start

Installation

git clone https://github.com/BianchiGiacomo/alpha-research-framework.git
cd alpha-research-framework
pip install -r requirements.txt

Run Key Comparison (Recommended)

# Run the critical C6 vs C7 comparison
python run_experiments.py --compare

Run Individual Experiments

# List all experiments
python run_experiments.py --list

# Run specific experiment
python run_experiments.py -e C7    # Main result
python run_experiments.py -e E1    # Overfitting demo
python run_experiments.py -e E3    # Look-ahead bias demo

# Show documented results without running
python run_experiments.py --show

Run All Experiments

# Full test suite (~15-20 minutes)
python run_experiments.py --all

📁 Project Structure

alpha_research_framework/
├── README.md                 # This file
├── requirements.txt          # Dependencies
├── run_experiments.py        # Unified experiment runner
├── experiments/
│   ├── equity/
│   │   ├── E1_single_stock_lstm.py    # LSTM overfitting
│   │   ├── E2_cross_sectional.py      # Cross-sectional targets
│   │   ├── E3_fundamentals_bias.py    # Look-ahead bias demo
│   │   └── E4_annual_fundamentals.py  # Bias-corrected
│   └── crypto/
│       ├── C1_timeseries.py           # Direction prediction
│       ├── C2_technical_alpha.py      # Technical indicators
│       ├── C3_production_system.py    # Full system
│       ├── C6_overfitting_demo.py     # ⚠️ Shows overfitting
│       └── C7_robust_final.py         # ✅ Main result
├── docs/
│   ├── RESEARCH_JOURNEY.md   # Full narrative
│   ├── METHODOLOGY.md        # Technical details
│   └── RESULTS_SUMMARY.md    # All results
└── results/
    └── documented_results.json

🔬 Methodology Highlights

Anti-Overfitting Techniques

  1. Walk-Forward Validation: Train on past, test on future (no shuffling)
  2. Purge Gap: 5-10 day gap between train/test to prevent leakage
  3. Feature Reduction: 8 features max (vs 41 in overfit version)
  4. Strong Regularization: Ridge α = 100 (vs 1.0 in overfit version)
  5. Cross-Sectional Targets: Rank-based to remove market trend
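Techniques 1 and 2 above can be sketched as a generator of purged walk-forward windows. This is an illustrative stand-in, not the repo's actual API; names and defaults are assumptions:

```python
def walk_forward_splits(n_samples, train_size, test_size, purge_gap):
    """Yield (train_idx, test_idx) windows in time order.
    A purge gap of `purge_gap` samples separates train from test,
    so labels that overlap the boundary cannot leak into the score."""
    start = 0
    while start + train_size + purge_gap + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test_start = start + train_size + purge_gap
        test = list(range(test_start, test_start + test_size))
        yield train, test
        start += test_size  # roll the window forward, never shuffle
```

Because every test window lies strictly after its training window (plus the gap), a model can only score well by generalizing forward in time.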

Statistical Rigor

  • IC t-test: Test if mean IC is significantly different from zero
  • Multiple Folds: 40-70 walk-forward windows per experiment
  • Validation RΒ²: Must be non-negative (negative = overfitting)

📈 Key Insights

What We Learned

  1. Neural networks overfit easily on financial data (E1)
  2. Look-ahead bias creates fake alpha - always check data timestamps (E3 vs E4)
  3. Positive IC with negative validation R² = overfitting (C6)
  4. Strong regularization reveals truth (C7)
  5. Public price data has no exploitable alpha with standard ML

What Works

  • Walk-forward validation with purge gaps
  • Cross-sectional (ranking) targets
  • Minimal features (8-10 max)
  • Strong regularization (Ridge Ξ± β‰₯ 100)
  • Statistical significance testing

What Doesn't Work

  • Complex models (LSTM, deep MLP) without heavy regularization
  • Many features (>20) without selection
  • Absolute return targets (includes beta)
  • Using current data for historical predictions

📚 Documentation

  • docs/RESEARCH_JOURNEY.md: full research narrative
  • docs/METHODOLOGY.md: technical details
  • docs/RESULTS_SUMMARY.md: complete results


🎓 Academic Value

This framework demonstrates:

  1. Rigorous methodology for financial ML research
  2. Honest null result - finding no alpha is valid science
  3. Reproducible experiments with clear documentation
  4. Overfitting detection techniques applicable to any ML project

⚠️ Disclaimer

This is research code for educational purposes. Results are based on historical data and do not guarantee future performance. This is not financial advice.


📄 License

MIT License - see LICENSE for details.


🤝 Contributing

Contributions welcome! Please read the methodology documentation first to understand the anti-overfitting principles.


📖 Citation

@software{alpha_research_framework,
  title={Alpha Research Framework: A Rigorous Approach to ML-Based Alpha Generation},
  year={2025},
  url={https://github.com/BianchiGiacomo/alpha-research-framework}
}
