A trading signal generation platform I built to test whether machine learning actually works for algorithmic trading. Spoiler: it's complicated.
┌─────────────────────────────────────────────────────────────────┐
│ WHAT THIS IS                                                    │
├─────────────────────────────────────────────────────────────────┤
│ 200+ Technical Indicators  RSI, MACD, Bollinger, volume, etc.   │
│ 5 ML Models                RF, XGBoost, LightGBM, LSTM, etc.    │
│ Ensemble Methods           Voting, stacking, Bayesian avg       │
│ Backtesting Engine         Walk-forward validation, real costs  │
│ FastAPI Server             REST + WebSocket streaming           │
│ Production Monitoring      Drift detection, auto-retrain        │
│ 6,000+ Lines of Python     Not a toy project                    │
└─────────────────────────────────────────────────────────────────┘
I wanted to find out whether ML could generate profitable trading signals. I'd read a bunch of papers claiming 60-70% accuracy and decided to build the full pipeline myself rather than trust someone else's backtested metrics.
Built this to test:
- Can traditional ML (RF, XGBoost) beat LSTM/Transformers for price prediction?
- Do 200+ features actually help or just overfit?
- How much do transaction costs destroy theoretical edge?
- What's the real Sharpe ratio after slippage?
I used Python because that's where the ML ecosystem lives, FastAPI for the server because it's fast and has first-class WebSocket support, and Redis for caching because repeated feature calculation is expensive.
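That last point matters more than it sounds: a feature matrix takes seconds to compute and milliseconds to fetch. A minimal sketch of the cache-aside pattern with redis-py; the key layout and TTL here are illustrative, not the repo's actual scheme:

import pickle
import redis

r = redis.Redis(host="localhost", port=6379)

def cached_features(symbol, compute_fn, ttl=3600):
    key = f"features:{symbol}"                  # one cached frame per symbol
    raw = r.get(key)
    if raw is not None:
        return pickle.loads(raw)                # cache hit: skip recomputation
    features = compute_fn(symbol)               # cache miss: compute once...
    r.set(key, pickle.dumps(features), ex=ttl)  # ...and store with an expiry
    return features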
After extensive backtesting on 4 years of data (2020-2024):
┌─────────────────────────────────────────────────────────────────┐
│ Metric           Value      Reality Check                       │
├─────────────────────────────────────────────────────────────────┤
│ Sharpe Ratio     1.8-2.4    Good, but assumes perfect exec      │
│ Win Rate         58-65%     Slightly better than coin flip      │
│ Max Drawdown     ~15%       Happened twice, stressful           │
│ Signal Latency   <100ms     Fast enough for daily signals       │
│ Model Accuracy   62-68%     Directional, not magnitude          │
│ Profit Factor    1.8        After 0.1% transaction costs        │
└─────────────────────────────────────────────────────────────────┘
Key learning: Ensemble methods (combining RF + XGBoost + LightGBM) beat individual models. LSTM and Transformers didn't add much value for daily signals—too data-hungry for the features I had.
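For reference, those headline numbers come from the standard definitions. A minimal sketch, assuming `rets` is a pandas Series of daily strategy returns (per-trade returns for the profit factor); these helpers are illustrative, not lifted from engine.py:

import numpy as np

def sharpe(rets, periods_per_year=252):
    # Annualized Sharpe, assuming a zero risk-free rate
    return np.sqrt(periods_per_year) * rets.mean() / rets.std()

def max_drawdown(rets):
    equity = (1 + rets).cumprod()                # equity curve from returns
    return (equity / equity.cummax() - 1).min()  # most negative peak-to-trough dip

def profit_factor(rets):
    # Gross profits / gross losses; > 1 means net profitable
    return rets[rets > 0].sum() / -rets[rets < 0].sum()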
Quick start:

git clone https://github.com/JasonTeixeira/AlphaStream.git
cd AlphaStream
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env
# Train models on AAPL, GOOGL, MSFT
python train_models.py train --symbols AAPL,GOOGL,MSFT
# Start API server
python -m api.main

With Docker:
docker-compose up -d
docker-compose logs -f api

Built in 3 layers:
Data Pipeline          Load OHLCV, validate, cache
        │
        ▼
Feature Engineering    200+ indicators (price, volume, volatility)
        │
        ▼
ML Pipeline            Train 5 models, ensemble predictions
  ├── Random Forest    Baseline, interpretable
  ├── XGBoost          Best single model (usually)
  ├── LightGBM         Fast for large datasets
  ├── LSTM             Sequential patterns (underwhelming)
  └── Transformer      Attention-based (compute-heavy)
        │
        ▼
Backtesting Engine     Walk-forward validation, real costs
        │
        ▼
FastAPI Server         REST + WebSocket streaming
        │
        ▼
Redis Cache            Feature cache (optional but fast)
AlphaStream/
├── ml/
│   ├── models.py           # 5 model implementations
│   ├── features.py         # 200+ technical indicators
│   ├── dataset.py          # Data loading + preprocessing
│   ├── train.py            # Training pipeline
│   ├── validation.py       # Data quality checks
│   └── monitoring.py       # Drift detection
├── backtesting/
│   └── engine.py           # Portfolio simulation
├── api/
│   └── main.py             # FastAPI server
├── config/
│   ├── training.yaml       # Model hyperparameters
│   └── logging.yaml        # Logging config
├── tests/
│   └── test_models.py      # Unit tests
├── train_models.py         # CLI for training
└── docker-compose.yml      # Orchestration
I implemented everything TA-Lib offers, plus some custom ones:
Price-based:
- Moving averages (SMA, EMA, WMA)
- Bollinger Bands
- RSI, MACD, Stochastic
Volume:
- OBV, VWAP, MFI
- Accumulation/Distribution
Volatility:
- ATR, Historical Vol
- Parkinson, Garman-Klass
Market microstructure:
- Bid-ask spread proxy
- Order flow imbalance
- Volume profile
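To give a flavor of what features.py computes, here's how two of these look in pandas (a sketch of the textbook formulas, not the repo's exact code):

import numpy as np
import pandas as pd

def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    # Wilder's RSI: ratio of smoothed average gain to smoothed average loss
    delta = close.diff()
    gain = delta.clip(lower=0).ewm(alpha=1 / period, adjust=False).mean()
    loss = (-delta.clip(upper=0)).ewm(alpha=1 / period, adjust=False).mean()
    return 100 - 100 / (1 + gain / loss)

def parkinson_vol(high: pd.Series, low: pd.Series, window: int = 20) -> pd.Series:
    # Parkinson estimator: volatility from the high-low range, annualized (252 days)
    log_hl_sq = np.log(high / low) ** 2
    return np.sqrt(log_hl_sq.rolling(window).mean() * 252 / (4 * np.log(2)))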
Single models are noisy. Ensembles smooth predictions:
- Voting: Majority wins (simple, works)
- Stacking: Meta-learner on top of base models (better)
- Blending: Weighted combination (tunable)
- Bayesian Averaging: Probabilistic (overkill for this)
Best results: Simple voting of RF + XGBoost + LightGBM.
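That winning combination is only a few lines with scikit-learn. A sketch with assumed hyperparameters (the real ones live in config/training.yaml); X_train and y_train stand in for the feature matrix and 0/1 direction labels:

from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=300, random_state=42)),
        ("xgb", XGBClassifier(n_estimators=300, learning_rate=0.05)),
        ("lgbm", LGBMClassifier(n_estimators=300, learning_rate=0.05)),
    ],
    voting="soft",  # average class probabilities; "hard" would take a majority vote
)
ensemble.fit(X_train, y_train)
proba_up = ensemble.predict_proba(X_test)[:, 1]  # doubles as the confidence score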
Get Prediction:
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{"symbol": "AAPL", "model_type": "ensemble"}'Response:
{
"symbol": "AAPL",
"prediction": 1,
"confidence": 0.72,
"action": "BUY",
"timestamp": "2024-01-01T12:00:00"
}

Batch Signals:
curl -X POST http://localhost:8000/signals \
-H "Content-Type: application/json" \
-d '{"symbols": ["AAPL", "GOOGL", "MSFT"], "threshold": 0.6}'Backtesting:
curl -X POST http://localhost:8000/backtest \
-H "Content-Type: application/json" \
-d '{
"symbol": "AAPL",
"start_date": "2023-01-01",
"end_date": "2024-01-01",
"model_type": "xgboost",
"initial_capital": 100000
}'

WebSocket Streaming:

const ws = new WebSocket('ws://localhost:8000/ws/stream');
ws.send(JSON.stringify({
action: 'subscribe',
symbols: ['AAPL', 'GOOGL']
}));
ws.onmessage = (event) => {
const signal = JSON.parse(event.data);
console.log('Signal:', signal);
};

Production monitoring is critical because models degrade:
- Data Drift: Kolmogorov-Smirnov test on feature distributions (see the sketch below)
- Concept Drift: Track prediction accuracy over rolling windows
- Automated Alerts: Slack/email when drift detected
- Auto-Retrain: Trigger retraining when performance degrades
I learned this the hard way after a model trained in 2022 stopped working in 2023 (regime change).
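The data-drift check is conceptually simple: a two-sample KS test per feature, comparing live inputs against the training sample. A minimal sketch with scipy (the function name and threshold are illustrative, not monitoring.py's API):

from scipy.stats import ks_2samp

def drifted_features(train_sample, live_sample, p_threshold=0.01):
    # Both arguments are DataFrames with the same feature columns
    flagged = []
    for col in train_sample.columns:
        stat, p_value = ks_2samp(train_sample[col].dropna(), live_sample[col].dropna())
        if p_value < p_threshold:  # distributions differ: candidate for retraining
            flagged.append((col, stat, p_value))
    return flagged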
- Data quality: Bad ticks, splits, dividends—had to write extensive validation
- Feature explosion: 200+ features led to overfitting. Had to add regularization and feature selection
- Lookahead bias: Easy to accidentally leak future data into training. Walk-forward validation caught this (see the sketch after this list)
- Transaction costs: Theoretical edge vanished with 0.1% costs. Had to optimize for fewer trades
- Model drift: Models degrade fast. Needed monitoring and auto-retrain logic
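On the lookahead-bias point: the fix is expanding-window validation where every test fold sits strictly after its training data. scikit-learn's TimeSeriesSplit does this out of the box (X, y, and model are placeholders here):

from sklearn.model_selection import TimeSeriesSplit

tscv = TimeSeriesSplit(n_splits=5, test_size=60)  # e.g. 60 trading days per test fold
for train_idx, test_idx in tscv.split(X):
    model.fit(X.iloc[train_idx], y.iloc[train_idx])         # past data only
    print(model.score(X.iloc[test_idx], y.iloc[test_idx]))  # scored on the future block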
Running the tests:

# All tests
pytest tests/ -v
# With coverage
pytest tests/ --cov=ml --cov=backtesting --cov-report=html
# Specific test
pytest tests/test_models.py -k test_random_forest
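For a flavor of what the unit tests assert, here's an illustrative one (not the repo's actual test): train a tiny model on random data and sanity-check its outputs.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def test_random_forest_outputs_valid_probabilities():
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    y = rng.integers(0, 2, size=200)
    model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
    proba = model.predict_proba(X)
    assert proba.shape == (200, 2)
    assert np.allclose(proba.sum(axis=1), 1.0)  # each row is a probability distribution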
Docker (recommended):

docker-compose up -d
docker-compose logs -f api

Production checklist:
- Use Redis for caching (10x faster feature lookups)
- Enable GPU if using LSTM/Transformers (CPU is slow)
- Set up Prometheus/Grafana for monitoring
- Configure alerts for drift detection
- Add API rate limiting (prevent abuse)
- Use load balancer for multiple instances
Future improvements:

- Reinforcement learning: Try RL for position sizing and timing
- Alternative data: Sentiment, news, options flow (expensive to get)
- Multi-timeframe: Combine daily + hourly signals
- Risk management: Position sizing based on volatility (sketched below)
- Database: Persist predictions for historical analysis
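The risk-management item is the most concrete of these: scale position size inversely with recent volatility so every trade risks roughly the same slice of capital. A sketch, with illustrative parameters:

def position_size(capital, atr, risk_fraction=0.01, stop_atr_multiple=2.0):
    # Risk a fixed fraction of capital per trade, stop placed N ATRs from entry
    dollars_at_risk = capital * risk_fraction      # e.g. 1% of a $100k account
    stop_distance = stop_atr_multiple * atr        # wider stops in volatile names
    return max(int(dollars_at_risk / stop_distance), 0)  # shares to trade

# position_size(100_000, atr=3.50) -> 142 shares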
Done:

- Core ML pipeline
- 200+ technical indicators
- Backtesting with real costs
- FastAPI + WebSocket
- Docker deployment
- Drift detection

Not yet:

- Database persistence (PostgreSQL)
- API authentication (JWT)
- Reinforcement learning agents
- Cloud deployment guide (AWS/GCP)
License: MIT
- Build Story: JOURNEY.md (coming soon)
- API Docs: /docs when the server is running
- Config: config/training.yaml