TrendNest is a portfolio-ready data pipeline and dashboard project that ingests, transforms, models, and visualizes data trends over time. It integrates AI summarization using Gemini 1.5 and supports exporting cleaned data to CSV. The project is fully containerized and deployable.
- Data extraction from various sources (e.g. APIs, databases, files)
- Transformation pipeline via configurable "recipe"
- Time-based trend modeling
- AI-generated summaries using Gemini 1.5
- Interactive dashboard built with Streamlit
- CSV downloads of processed data
- Dockerized for deployment
```
TrendNest/
├── dags/                    # Airflow DAGs (optional)
├── dashboard/               # Streamlit dashboard app
│   └── app.py               # Main UI script
├── data/                    # Local and processed data
│   ├── cleaned_data.csv     # Output from pipeline
│   └── sample.csv           # Example input data
├── docker/                  # Containerization setup
│   └── Dockerfile           # Docker build instructions
├── docs/                    # Documentation and notes
│   └── design.md            # System design outline
├── notebooks/               # Jupyter notebooks (EDA, prototyping)
├── sql/                     # BigQuery-compatible SQL queries
│   ├── monthly_averages.sql # Avg monthly close/volume
│   ├── latest_prices.sql    # Most recent close prices
│   └── volume_spikes.sql    # High-volume trading days
├── src/                     # Core data pipeline logic
│   ├── __init__.py
│   ├── config.py            # Config constants
│   ├── extract.py           # Local/CSV data extraction
│   ├── extract_stocks.py    # yfinance stock extractor
│   ├── transform.py         # Data cleaning
│   ├── model.py             # Trend modeling
│   ├── summarize.py         # Gemini AI summaries
│   ├── export.py            # CSV export
│   └── upload.py            # BigQuery uploader
├── test_https.py            # API connectivity test
├── test_upload.py           # BigQuery upload test
├── test_yfinance_fetch.py   # yfinance fetch test
├── tests/                   # Unit tests (placeholder)
├── run_pipeline.py          # Main pipeline runner
├── requirements.txt         # Python dependencies
├── .env.example             # Sample environment variables (copy to .env)
├── .gitignore               # Git exclusions
└── README.md                # This file
```
- Clone the repo:

  ```bash
  git clone https://github.com/yourusername/TrendNest.git
  cd TrendNest
  ```

- Set up your environment:

  ```bash
  python -m venv venv
  source venv/bin/activate
  pip install -r requirements.txt
  ```

- Copy `.env.example` to `.env`, then fill in your own credentials (an illustrative example follows these steps). Keep `.env` out of version control.

- (Optional) Set up observability:
  - `LOG_LEVEL` controls verbosity (default `INFO`).
  - To emit OpenTelemetry traces/metrics to a collector, set `OTEL_EXPORTER_OTLP_ENDPOINT` (HTTP/OTLP) and optional `OTEL_EXPORTER_OTLP_HEADERS` for auth. Without it, spans are printed to stdout and metrics stay local.
  - `ENVIRONMENT` tags spans/metrics (e.g., `dev`, `staging`, `prod`).
  - `TOP_PERFORMERS_LIMIT` and `TICKERS_UNIVERSE` let you tune the ticker selection.
  - Resilience knobs: `MAX_WORKERS`, `FETCH_TIMEOUT_SECONDS`, `FETCH_MAX_RETRIES`, `FETCH_BACKOFF_SECONDS`, `FETCH_PERIOD`, `FETCH_INTERVAL`, and `DEAD_LETTER_PATH` for failed rows.

- Run the pipeline:

  ```bash
  python run_pipeline.py
  ```

- Start the dashboard:

  ```bash
  streamlit run dashboard/app.py
  ```
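For reference, here is what a filled-in `.env` might look like. The variable names are the ones documented above; every value is an illustrative placeholder, not a project default:

```bash
# Logging and environment tagging
LOG_LEVEL=INFO
ENVIRONMENT=dev

# OpenTelemetry (optional; omit to keep spans on stdout and metrics local)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_EXPORTER_OTLP_HEADERS=authorization=Bearer%20<token>

# Ticker selection
TOP_PERFORMERS_LIMIT=10
TICKERS_UNIVERSE=AAPL,MSFT,GOOG

# Resilience knobs
MAX_WORKERS=4
FETCH_TIMEOUT_SECONDS=30
FETCH_MAX_RETRIES=3
FETCH_BACKOFF_SECONDS=2
FETCH_PERIOD=6mo
FETCH_INTERVAL=1d
DEAD_LETTER_PATH=data/failed_rows.csv
```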
Command-line overrides:

```bash
python run_pipeline.py --tickers AAPL,MSFT --limit 5 --period 1mo --interval 1d \
  --export-path /tmp/output.csv --dead-letter-path /tmp/failed.csv
```

Use a YAML config file to override settings:

```bash
python run_pipeline.py --config config.yaml
```
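The config schema isn't spelled out in this README, so the following is only a sketch that assumes the YAML keys mirror the CLI flags above:

```yaml
# config.yaml (hypothetical keys mirroring the CLI flags)
tickers: [AAPL, MSFT]
limit: 5
period: 1mo
interval: 1d
export_path: /tmp/output.csv
dead_letter_path: /tmp/failed.csv
```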
- Tracing: pipeline run → per-ticker spans plus downstream HTTP (requests/yfinance) via OpenTelemetry.
- Metrics: counters for runs, tickers processed, and rows processed (`trendnest.pipeline.*`). They export via OTLP if configured, else stay in-process.
- Logs: structured `logging` with `run_id` on key entries; adjust `LOG_LEVEL` as needed.
- Resilience: bounded retries with jitter, timeouts on fetches, concurrent ticker processing (`MAX_WORKERS`), and a dead-letter CSV for failures (see the sketch after this list).
- Metrics expanded: fetch latency histogram (`trendnest.pipeline.fetch_latency_seconds`) and retry/failure counters.
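The retry logic lives in `src/`, not in this README, so the sketch below only illustrates the named pattern (bounded retries with jittered exponential backoff and a fetch timeout); the function name and constants are illustrative rather than the project's actual API:

```python
import random
import time

import yfinance as yf

# Illustrative defaults; the real values come from .env / config
FETCH_MAX_RETRIES = 3
FETCH_BACKOFF_SECONDS = 2.0
FETCH_TIMEOUT_SECONDS = 30


def fetch_with_retries(ticker: str, period: str = "1mo", interval: str = "1d"):
    """Fetch price history with bounded, jittered retries."""
    for attempt in range(1, FETCH_MAX_RETRIES + 1):
        try:
            return yf.Ticker(ticker).history(
                period=period, interval=interval, timeout=FETCH_TIMEOUT_SECONDS
            )
        except Exception:
            if attempt == FETCH_MAX_RETRIES:
                raise  # caller routes the failed ticker to the dead-letter CSV
            # Exponential backoff plus jitter so concurrent workers
            # don't retry in lockstep
            delay = FETCH_BACKOFF_SECONDS * 2 ** (attempt - 1)
            time.sleep(delay + random.uniform(0, delay / 2))
```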
- Run tests locally: `python -m pip install -r requirements-dev.txt && pytest -q`
- GitHub Actions workflow (`.github/workflows/ci.yml`) runs tests on pushes/PRs to `main` (an illustrative sketch follows).
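The workflow file itself isn't reproduced here; a minimal workflow matching that description might look like the following (the Python version is an assumption):

```yaml
# Illustrative sketch of .github/workflows/ci.yml; see the repo for the real file
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: python -m pip install -r requirements.txt -r requirements-dev.txt
      - run: pytest -q
```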
- Keep secrets out of git; use `.env.example` as a template and prefer cloud secret storage.
- See `SECURITY.md` for reporting guidance and hygiene tips.
TrendNest integrates Gemini 1.5 to generate natural language summaries of key insights in your trend data. This makes the dashboard useful to both technical and non-technical stakeholders.
Example summary output:
"Apple's stock (AAPL) shows a general upward trend from December 2024 to June 2025, increasing from ~$172 to ~$258. Trading volume spiked in June, suggesting heightened investor interest."
TrendNest supports uploading cleaned trend data to Google BigQuery. This enables:
- SQL-based analysis
- Historical trend aggregation
- Integration with Looker Studio or other BI tools
Each run appends to the `trendnest.cleaned_stock_data` table using a service account key.
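The uploader in `src/upload.py` isn't shown in this README; the append behavior described above maps onto the standard `google-cloud-bigquery` client roughly as follows (the key path is a placeholder):

```python
import pandas as pd
from google.cloud import bigquery
from google.oauth2 import service_account


def upload_cleaned_data(df: pd.DataFrame, key_path: str = "service_account.json") -> None:
    """Append cleaned rows to the trendnest.cleaned_stock_data table."""
    creds = service_account.Credentials.from_service_account_file(key_path)
    client = bigquery.Client(credentials=creds, project=creds.project_id)
    job_config = bigquery.LoadJobConfig(write_disposition="WRITE_APPEND")
    job = client.load_table_from_dataframe(
        df, "trendnest.cleaned_stock_data", job_config=job_config
    )
    job.result()  # block until the load job finishes
```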
Once data is in BigQuery, you can run SQL like:

```sql
SELECT
  FORMAT_DATE('%Y-%m', PARSE_DATE('%Y-%m-%d', date)) AS month,
  ROUND(AVG(CAST(Close AS FLOAT64)), 2) AS avg_close,
  ROUND(AVG(CAST(Volume AS INT64))) AS avg_volume
FROM `trendnest-463421.trendnest.cleaned_stock_data`
WHERE Ticker = 'AAPL'
GROUP BY month
ORDER BY month;
```

The `/sql/` directory contains reusable queries for analytics and dashboarding:
- `monthly_averages.sql`: calculates average monthly closing price and trading volume
- `latest_prices.sql`: retrieves the most recent closing price for each ticker
- `volume_spikes.sql`: identifies unusually high trading volume days
These can be run in BigQuery or loaded into the dashboard for insights.
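As one illustration, a query in the spirit of `volume_spikes.sql` could flag days whose volume is well above a ticker's average; the 2x threshold here is illustrative, and the repo's version may differ:

```sql
-- Days trading at more than 2x the ticker's mean volume (threshold is illustrative)
WITH avg_volume AS (
  SELECT Ticker, AVG(CAST(Volume AS INT64)) AS mean_volume
  FROM `trendnest-463421.trendnest.cleaned_stock_data`
  GROUP BY Ticker
)
SELECT t.Ticker, t.date, CAST(t.Volume AS INT64) AS volume
FROM `trendnest-463421.trendnest.cleaned_stock_data` AS t
JOIN avg_volume AS a
  ON t.Ticker = a.Ticker
WHERE CAST(t.Volume AS INT64) > 2 * a.mean_volume
ORDER BY t.Ticker, t.date;
```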
Build and run the container (the Dockerfile lives under `docker/`, per the project layout above):

```bash
docker build -f docker/Dockerfile -t trendnest .
docker run -p 8501:8501 trendnest
```
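The `docker/Dockerfile` itself isn't shown in this README; a minimal build consistent with the Streamlit entry point above might look like this sketch:

```dockerfile
# Illustrative only; see docker/Dockerfile for the real build instructions
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8501
CMD ["streamlit", "run", "dashboard/app.py", "--server.address=0.0.0.0"]
```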
MIT: free to use, modify, and distribute.
- Integrated Gemini 1.5 for AI-generated summaries
- Implemented BigQuery upload via service account
- Enabled SQL querying and Looker Studio compatibility
- Added multi-ticker support with interactive dashboard controls
- Upgraded Streamlit dashboard with Altair charts (line and bar)
- Added dynamic filtering and per-ticker AI summaries
- Enhanced CSV export for selected tickers and date ranges
- Improved dashboard responsiveness and readability