# Fintech-Data-API

This project is a daily pipeline that:
- Fetches transactions from Plaid,
- Stores them in a PostgreSQL database,
- Aggregates daily financial data,
- Sends the data to a Gemini LLM for analysis,
- Stores the LLM’s recommendations.
## How It Works

Connects to the PostgreSQL database using the `DATABASE_URL` environment variable.
Retrieves a Plaid access token from the database or creates a new one (sandbox only) and stores it.
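The sandbox token bootstrap can be sketched with stdlib HTTP calls against Plaid's sandbox REST endpoints (the real code may use the plaid-python client instead; `load_token` and `save_token` are hypothetical stand-ins for however the project reads and writes its token table):

```python
import json
import os
import urllib.request

PLAID_BASE = "https://sandbox.plaid.com"  # token creation is sandbox-only here


def plaid_post(endpoint, payload):
    """POST JSON to a Plaid endpoint with credentials from the environment."""
    body = dict(payload,
                client_id=os.environ["PLAID_CLIENT_ID"],
                secret=os.environ["PLAID_SECRET"])
    req = urllib.request.Request(PLAID_BASE + endpoint,
                                 data=json.dumps(body).encode(),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def get_or_create_access_token(load_token, save_token, post=plaid_post):
    """Reuse a stored access token, or mint one via Plaid's sandbox flow."""
    token = load_token()                      # e.g. SELECT from a tokens table
    if token:
        return token
    public = post("/sandbox/public_token/create", {
        "institution_id": "ins_109508",       # a Plaid sandbox test institution
        "initial_products": ["transactions"],
    })
    exchanged = post("/item/public_token/exchange",
                     {"public_token": public["public_token"]})
    token = exchanged["access_token"]
    save_token(token)                         # persist for subsequent runs
    return token
```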
Fetches new, modified, and removed transactions from Plaid using the access token and an optional cursor for incremental sync.
Inserts new transactions into the plaid_transactions table, skipping duplicates.
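The duplicate skipping hinges on the table's primary key. A runnable sketch using stdlib `sqlite3` as a stand-in for PostgreSQL (the column list is illustrative; with psycopg2 the same `ON CONFLICT ... DO NOTHING` clause applies, with `%s` placeholders instead of `?`):

```python
import sqlite3

# Illustrative stand-in for the plaid_transactions table defined in schema.sql,
# keyed on Plaid's transaction_id so re-inserting a transaction is a no-op.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE plaid_transactions (
    transaction_id TEXT PRIMARY KEY, date TEXT, name TEXT, amount REAL)""")


def insert_transactions(conn, txns):
    """Insert new transactions, silently skipping ones already stored."""
    conn.executemany(
        """INSERT INTO plaid_transactions (transaction_id, date, name, amount)
           VALUES (?, ?, ?, ?)
           ON CONFLICT (transaction_id) DO NOTHING""",
        [(t["transaction_id"], t["date"], t["name"], t["amount"])
         for t in txns])
    conn.commit()
```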
Retrieves the last used Plaid sync cursor for incremental transaction fetching.
Updates the last used Plaid sync cursor in the database.
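The fetch-and-cursor loop can be sketched as below. `post(endpoint, payload)` stands in for whatever HTTP call or client the project uses; the cursor persisted by the previous run is passed in, or omitted on the very first sync:

```python
def sync_transactions(post, access_token, cursor=None):
    """Page through Plaid's /transactions/sync endpoint.

    Returns the accumulated added/modified/removed transactions plus the
    final cursor, which the caller should persist for the next daily run.
    """
    added, modified, removed = [], [], []
    while True:
        payload = {"access_token": access_token}
        if cursor:
            payload["cursor"] = cursor        # omitted on the first ever sync
        page = post("/transactions/sync", payload)
        added.extend(page["added"])
        modified.extend(page["modified"])
        removed.extend(page["removed"])
        cursor = page["next_cursor"]
        if not page.get("has_more"):
            break
    return {"added": added, "modified": modified,
            "removed": removed, "cursor": cursor}
```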
Aggregates daily transaction data (income, spending, top categories, etc.) for a given date, formatted for LLM input.
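A runnable sketch of the aggregation, again using `sqlite3` in place of PostgreSQL. The sign convention follows Plaid (outflows are positive amounts, inflows negative), but the exact columns and fields aggregated here are illustrative:

```python
import sqlite3

# Illustrative stand-in table with a day's worth of sample transactions.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE plaid_transactions (
    transaction_id TEXT PRIMARY KEY, date TEXT, name TEXT,
    amount REAL, category TEXT)""")
conn.executemany(
    "INSERT INTO plaid_transactions VALUES (?, ?, ?, ?, ?)",
    [("t1", "2024-01-15", "Coffee", 4.50, "Food and Drink"),
     ("t2", "2024-01-15", "Groceries", 38.00, "Food and Drink"),
     ("t3", "2024-01-15", "Taxi", 15.00, "Travel"),
     ("t4", "2024-01-15", "Payroll", -2000.00, "Income")])


def daily_summary(conn, day):
    """Aggregate one day's transactions into an LLM-friendly dict."""
    spending, income = conn.execute(
        """SELECT COALESCE(SUM(CASE WHEN amount > 0 THEN amount END), 0),
                  COALESCE(SUM(CASE WHEN amount < 0 THEN -amount END), 0)
           FROM plaid_transactions WHERE date = ?""", (day,)).fetchone()
    top = conn.execute(
        """SELECT category, ROUND(SUM(amount), 2) AS total
           FROM plaid_transactions WHERE date = ? AND amount > 0
           GROUP BY category ORDER BY total DESC LIMIT 3""", (day,)).fetchall()
    return {"date": day, "income": income, "spending": spending,
            "top_categories": [{"category": c, "total": t} for c, t in top]}
```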
Sends the aggregated data to the Gemini LLM and parses its JSON response for summary and recommendations.
Stores the LLM’s summary and recommendations in the daily_recommendations table.
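LLMs often wrap JSON replies in markdown code fences or surrounding chatter, so the parsing step should be tolerant of that. A minimal sketch, assuming the reply carries `summary` and `recommendations` keys as described above:

```python
import json
import re


def parse_llm_json(reply_text):
    """Extract the JSON object from an LLM reply and return its parts.

    Tolerates ```json fences or prose around the object by grabbing the
    span from the first '{' to the last '}' before parsing.
    """
    match = re.search(r"\{.*\}", reply_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in LLM reply")
    data = json.loads(match.group(0))
    return data["summary"], data["recommendations"]
```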
Orchestrates the entire process: fetches transactions, updates the database, aggregates data, gets LLM insights, and stores results.
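The orchestration can be sketched with each stage injected as a callable (the names are illustrative, not the project's actual functions), which also makes the flow testable without live Plaid, PostgreSQL, or Gemini services:

```python
def run_daily_pipeline(fetch, store, aggregate, analyze, save, day):
    """One daily run, wiring the five stages together in order."""
    synced = fetch()                # added/modified/removed + new sync cursor
    store(synced["added"])          # insert new rows, skipping duplicates
    summary = aggregate(day)        # LLM-ready aggregate for the day
    insight = analyze(summary)      # summary + recommendations from the LLM
    save(day, insight)              # persist into daily_recommendations
    return insight
```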
## Requirements

- Python 3.8+
- PostgreSQL database
- Plaid developer account (for API keys)
- Google Gemini API key
## Setup

Install dependencies:

```sh
pip install -r requirements.txt
```

Create a `.env` file in the project root with:

```ini
PLAID_CLIENT_ID=your_plaid_client_id
PLAID_SECRET=your_plaid_secret
PLAID_ENV=sandbox
DATABASE_URL=postgresql://user:password@host:port/dbname
GEMINI_API_KEY=your_gemini_api_key
LLM_MODEL=gemini-1.5-pro-latest
```
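A fail-fast check at startup avoids runs that die halfway through on a missing credential. A minimal stdlib sketch, assuming these five variables are required (adjust the list to the actual code, e.g. if `LLM_MODEL` has no default):

```python
import os

REQUIRED_VARS = ["PLAID_CLIENT_ID", "PLAID_SECRET", "PLAID_ENV",
                 "DATABASE_URL", "GEMINI_API_KEY"]


def check_env():
    """Raise early, listing every missing variable, instead of failing mid-run."""
    missing = [v for v in REQUIRED_VARS if not os.environ.get(v)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
```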
Run the schema SQL to create tables:

```sh
psql $DATABASE_URL -f schema.sql
```

## Usage

Run the pipeline:

```sh
python main.py
```

## Notes

- The pipeline is designed for daily runs (e.g., via cron or a scheduled job).
- Only sandbox Plaid environment is supported for token creation in this demo.
- LLM recommendations are stored for each day in the database.
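For the daily run, a crontab entry along these lines works (the paths are illustrative; point them at your checkout and virtualenv):

```shell
# Run the pipeline every day at 06:00, appending output to a log file
0 6 * * * cd /path/to/project && /path/to/venv/bin/python main.py >> pipeline.log 2>&1
```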