A Streamlit-powered research agent that searches arXiv by topic, summarizes papers using IBM watsonx.ai Granite, generates reviewer-style notes, and exports a consolidated report in TXT, DOCX, and PDF. Built for fast literature reconnaissance and lightweight paper triage.

RohitXJ/Research-Agent

Research Agent (arXiv + IBM watsonx.ai Granite)

by Rohit Gomes

A Streamlit-powered research agent that searches arXiv by topic, summarizes papers using IBM watsonx.ai Granite, generates reviewer-style notes, and exports a consolidated report in TXT, DOCX, and PDF. Built for fast literature reconnaissance and lightweight paper triage.

  • Tech stack: Python, Streamlit, arXiv API, IBM watsonx.ai Granite, python-docx, ReportLab
  • Author: Rohit Gomes
  • GitHub: github.com/RohitXJ

Features

  • Topic-based arXiv search with adjustable number of results
  • LLM-powered summarization in concise, technical bullet points
  • Reviewer-style notes with strengths, weaknesses, questions, and suggestions
  • One-click export of the full report in TXT, DOCX, and PDF
  • Caching for arXiv queries and automatic retries for model calls
  • Secure secret management (no keys committed)
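
The caching and retry behaviour listed above can be sketched as follows. This is a minimal illustration, not the project's code: the project depends on tenacity, while this sketch swaps in a plain standard-library retry loop and `functools.lru_cache` so it runs without extra packages. The names `make_cached_search` and `call_with_retries` are hypothetical.

```python
import time
from functools import lru_cache
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


def make_cached_search(
    search_fn: Callable[[str, int], Iterable[str]],
) -> Callable[[str, int], Tuple[str, ...]]:
    """Wrap a search function so repeated (topic, max_results) calls hit a cache."""

    @lru_cache(maxsize=128)
    def cached(topic: str, max_results: int) -> Tuple[str, ...]:
        # Results are stored as a tuple so callers cannot mutate the cached value.
        return tuple(search_fn(topic, max_results))

    return cached


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

In a Streamlit app the cache would more likely be Streamlit's own `st.cache_data` decorator, and the retry loop would be tenacity's `retry` decorator; the structure stays the same.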

Demo Flow

  1. Enter a topic in the input field (e.g., “Sparse Mixture-of-Experts for Vision Transformers”).
  2. Click “Run Agent”.
  3. Watch the progress as the app fetches papers, summarizes, and generates reviews.
  4. Preview the combined report and download as TXT/DOCX/PDF.
  5. Expand per-paper details for the generated summary and reviewer notes.

Project Structure

  • app/
    • app.py
    • agents/
      • __init__.py
      • fetch.py
      • summarize.py
      • review.py
      • report.py
    • utils/
      • __init__.py
      • ibm_client.py
      • io.py
      • text.py
  • .streamlit/
    • secrets.toml (local only; do NOT commit)
  • requirements.txt
  • .gitignore
  • LICENSE
  • README.md

Requirements

  • Python 3.9+
  • Dependencies in requirements.txt:
    • streamlit
    • arxiv
    • python-docx
    • reportlab
    • ibm-watsonx-ai
    • tenacity

Install:

  • python -m venv .venv
  • Windows: .venv\Scripts\activate (Command Prompt or PowerShell); in Git Bash use . .venv/Scripts/activate
  • macOS/Linux: source .venv/bin/activate
  • pip install -r requirements.txt

Configuration (Secrets)

Create .streamlit/secrets.toml locally with IBM watsonx.ai credentials. Do not commit this file.

Example:

    [ibm]
    apikey = "YOUR_IBM_WATSONX_APIKEY"
    url = "YOUR_IBM_WATSONX_URL"
    project_id = "YOUR_IBM_WATSONX_PROJECT_ID"
    model_id = "ibm/granite-13b-instruct-v2"
    decoding_method = "greedy"
    max_new_tokens = 350
    top_p = 0.9
    temperature = 0.7

Notes:

  • Use least-privilege API keys.
  • For Streamlit Cloud, add the same keys in the app’s Secrets UI instead of uploading secrets.toml.
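
A small check at startup can catch missing or placeholder credentials before the first model call. The sketch below is an assumption about how such a check might look, not the project's `ibm_client.py`; `load_ibm_config`, `REQUIRED_KEYS`, and `DEFAULTS` are hypothetical names, and the default values are copied from the example above.

```python
from typing import Any, Dict, Mapping

# Keys the [ibm] section must provide (taken from the example above).
REQUIRED_KEYS = ("apikey", "url", "project_id", "model_id")

# Optional generation settings, using the example's values as fallbacks.
DEFAULTS: Dict[str, Any] = {
    "decoding_method": "greedy",
    "max_new_tokens": 350,
    "top_p": 0.9,
    "temperature": 0.7,
}


def load_ibm_config(section: Mapping[str, Any]) -> Dict[str, Any]:
    """Validate the [ibm] secrets section and fill in optional defaults."""
    missing = []
    for key in REQUIRED_KEYS:
        value = str(section.get(key, "")).strip()
        # Treat untouched template values such as "YOUR_..." as missing.
        if not value or value.startswith("YOUR_"):
            missing.append(key)
    if missing:
        raise ValueError("Missing or placeholder values in [ibm]: " + ", ".join(missing))
    config = dict(DEFAULTS)
    config.update(dict(section))
    return config


# In the app this would typically be called as:
#     config = load_ibm_config(st.secrets["ibm"])
```

Failing fast with the list of missing keys tends to be easier to debug than an authentication error returned by the service later in the run.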

Run Locally

  • streamlit run app/app.py
  • Open the URL shown in the terminal (usually http://localhost:8501).
  • Enter a topic and click “Run Agent”.

Deployment (Streamlit Community Cloud)

  1. Push the repository to GitHub. Ensure .streamlit/secrets.toml is ignored.
  2. Go to Streamlit Community Cloud and click “New app”.
  3. Select the repo/branch and set the entry point to app/app.py.
  4. In Settings → Secrets, paste the [ibm] block from above.
  5. Deploy.

Tips:

  • Pushing new commits to the deployed branch triggers an automatic redeploy.
  • Update secrets in Settings → Secrets without code changes.

Troubleshooting

  • Import error: “No module named app.agents; 'app' is not a package”
    • Ensure app/, app/agents/, and app/utils/ each contain an __init__.py file.
    • Run from project root: streamlit run app/app.py (not inside app/).
    • If needed, add a sys.path shim at the top of app/app.py.
  • IBM auth errors
    • Verify apikey, url, project_id in secrets.
    • Confirm the model_id exists and access is granted to your project.
  • arXiv returns no papers
    • Try a broader query or remove filters; lower the number of papers while testing.
  • Slow generation or rate limits
    • Reduce “Number of papers” and “Max new tokens”.
    • Use “greedy” decoding for determinism and speed.
  • PDF/DOCX issues
    • Ensure python-docx and reportlab are installed in the active virtual environment.
    • Recreate the virtual environment from scratch if needed.
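
For the import error above, a sys.path shim could look like the following sketch. It assumes the layout shown under Project Structure, where app/app.py sits one level below the project root; the helper name `add_project_root` is hypothetical.

```python
import sys
from pathlib import Path


def add_project_root(script_path: str) -> str:
    """Put the directory that contains app/ at the front of sys.path."""
    # app/app.py -> app/ -> project root
    root = str(Path(script_path).resolve().parent.parent)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# At the top of app/app.py, before any `from app...` import:
#     add_project_root(__file__)
```

The shim is a workaround; running `streamlit run app/app.py` from the project root with the `__init__.py` files in place is the cleaner fix.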

Customization

  • Prompt style:
    • app/agents/summarize.py: tweak summary format (e.g., more/less bullet points).
    • app/agents/review.py: adjust reviewer categories or tone.
  • Controls:
    • app/app.py sidebar: expose more decoding parameters or add filters (e.g., year, author).
    • app/agents/fetch.py: add category filtering or date ranges.
  • Exports:
    • app/utils/io.py: extend to JSON/Markdown exports.
    • Add metadata like authors, publish dates, and categories to the report.

Roadmap Ideas

  • Add year/category filters for arXiv queries.
  • Deduplication and relevance scoring beyond arXiv's default relevance ordering.
  • Persist session outputs to downloadable JSON for programmatic use.
  • Option to include abstracts verbatim in the report.
  • Add a “cost/tokens” estimator and limiter per run.
  • Select models from a dropdown if multiple IBM models are available.
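
The year/category filter idea could start from a small query builder like the sketch below. It assumes the arXiv API's `cat:` and `submittedDate:[... TO ...]` field syntax as described in the arXiv API user manual; verify the exact format against that manual before relying on it. `build_arxiv_query` is a hypothetical helper, not part of the project or of the arxiv library.

```python
from typing import Optional, Tuple


def build_arxiv_query(
    topic: str,
    category: Optional[str] = None,
    years: Optional[Tuple[int, int]] = None,
) -> str:
    """Build an arXiv query string with optional category and year-range filters."""
    # Quoting the topic requests phrase matching; drop the quotes for looser search.
    parts = [f'all:"{topic}"']
    if category:
        parts.append(f"cat:{category}")  # e.g. cs.CV
    if years:
        first, last = years
        # Timestamps use the YYYYMMDDHHMM form, covering whole calendar years.
        parts.append(f"submittedDate:[{first}01010000 TO {last}12312359]")
    return " AND ".join(parts)


# The resulting string would be passed as the query to the arxiv library, e.g.
#     arxiv.Search(query=build_arxiv_query("vision transformers", "cs.CV"), max_results=5)
```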

Security Notes

  • Never commit secrets (secrets.toml is ignored).
  • Use separate IBM keys for dev and deployment.
  • Avoid logging sensitive inputs or outputs.
  • Keep dependency versions updated to patch vulnerabilities.

Acknowledgements

  • arXiv Python library
  • IBM watsonx.ai Granite
  • Streamlit
  • python-docx and ReportLab

Author

  • Name: Rohit Gomes
  • GitHub: github.com/RohitXJ

If this project helps your workflow, consider starring the repo and sharing feedback or feature requests via issues. 1

Footnotes

  1. https://github.com/RohitXJ
