Medgemma gemini agent #10

Merged
kovashikawa merged 16 commits into main from medgemmaGeminiAgent
Dec 15, 2025
Conversation

@kovashikawa

No description provided.

kovashikawa and others added 16 commits December 14, 2025 18:13
- Add GeminiValidator for MedGemma output validation with retry logic
- Add AnnotationSerializer with smart JSON truncation (4000 char limit)
- Add AnnotationPipeline orchestrator (MedGemma → Gemini → Pydantic → DB)
- Refactor GeminiAnnotationAgent to use bulletproof pipeline
- Update /datasets/analyze endpoint for validated DB insertion
- Add comprehensive tests and documentation

Pipeline flow: MedGemma → Gemini Validation → Pydantic → DB
Cost: ~$0.0005/image (validation only)
Budget: $300 = 600k images

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
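The "smart JSON truncation (4000 char limit)" in the AnnotationSerializer could work roughly like the sketch below: serialize the annotation, and if it exceeds the limit, shorten the longest string field rather than cutting the JSON mid-structure. The function name and trimming strategy are assumptions; only the 4000-character limit comes from the commit message.

```python
import json

MAX_CHARS = 4000  # limit stated in the commit message

def truncate_annotation(annotation: dict, max_chars: int = MAX_CHARS) -> str:
    """Serialize an annotation dict, trimming long string values so the
    JSON stays valid and under max_chars (hypothetical sketch)."""
    payload = json.dumps(annotation, ensure_ascii=False)
    if len(payload) <= max_chars:
        return payload
    # Shorten the longest string fields first until the output fits.
    trimmed = dict(annotation)
    for key, value in sorted(annotation.items(),
                             key=lambda kv: len(str(kv[1])), reverse=True):
        if not isinstance(value, str):
            continue
        overshoot = len(payload) - max_chars
        trimmed[key] = value[: max(0, len(value) - overshoot - 3)] + "..."
        payload = json.dumps(trimmed, ensure_ascii=False)
        if len(payload) <= max_chars:
            break
    return payload
```

Trimming values instead of the raw string keeps the result parseable, which matters when the truncated JSON is fed back into Pydantic validation.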
- Keep bulletproof pipeline implementation
- Remove conflicting old code
- Update .gitignore for __pycache__
- Enhanced DB schema with annotation_request staging table
- Added AgenticAnnotationRepo for two-tier operations
- Created ClinicalSummary Pydantic model with validation
- Implemented GeminiSummaryGenerator for polished summaries
- Built AgenticAnnotationPipeline orchestrator
- All tests passing (imports, models, repository)

Architecture:
MedGemma → Gemini Validator → Pydantic → annotation_request (raw)
                                              ↓
                                       Gemini Summary
                                              ↓
                                        annotation (clean)

Benefits:
- Full traceability (raw → summary)
- Pydantic validation at every step
- Clean summaries for frontend
- Pipeline statistics & debugging
- Cost: ~$0.0008/image

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
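The two-tier flow in the diagram above (raw output staged in `annotation_request`, polished summary written to `annotation` with a back-reference) can be sketched with plain sqlite3. The table columns and the `process` helper are assumptions; the table names and the raw → summary traceability link come from the commit message.

```python
import sqlite3

# Minimal two-tier schema sketch: raw validator output is staged in
# annotation_request, the Gemini summary lands in annotation, linked
# back via request_id for traceability.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE annotation_request (
    id INTEGER PRIMARY KEY,
    image_id TEXT NOT NULL,
    raw_output TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE annotation (
    id INTEGER PRIMARY KEY,
    image_id TEXT NOT NULL,
    summary TEXT NOT NULL,
    request_id INTEGER REFERENCES annotation_request(id)
);
""")

def process(image_id: str, raw_output: str, summarize) -> int:
    """Stage the raw output, then write the clean summary (sketch).
    `summarize` stands in for the Gemini summary call."""
    cur = conn.execute(
        "INSERT INTO annotation_request (image_id, raw_output, status) "
        "VALUES (?, ?, 'processed')",
        (image_id, raw_output),
    )
    request_id = cur.lastrowid
    conn.execute(
        "INSERT INTO annotation (image_id, summary, request_id) VALUES (?, ?, ?)",
        (image_id, summarize(raw_output), request_id),
    )
    return request_id
```

Keeping the raw row even after summarization is what gives the "full traceability (raw → summary)" benefit listed above.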
- annotation table now has 6 columns (added request_id)
- Updated save_annotations to explicitly set request_id=NULL
- Maintains backward compatibility with old API code
- Automated backup of existing database
- Creates new database with enhanced schema
- Verifies table and index creation
- Shows helpful next steps

Usage: ./migrate_database.sh
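The migration steps above (backup, new schema, verification) might look like the following Python equivalent of `migrate_database.sh`. The column names and the `migrate` helper are assumptions; the backup-then-verify sequence is from the commit message.

```python
import pathlib
import shutil
import sqlite3

def migrate(db_path: str) -> list[str]:
    """Back up the existing database, add request_id to annotation if
    missing, and return the verified column list (hypothetical sketch)."""
    path = pathlib.Path(db_path)
    if path.exists():
        shutil.copy2(path, path.with_suffix(".bak"))  # automated backup
    conn = sqlite3.connect(path)
    cols = [row[1] for row in conn.execute("PRAGMA table_info(annotation)")]
    if not cols:  # fresh database: create a minimal annotation table
        conn.execute(
            "CREATE TABLE annotation "
            "(id INTEGER PRIMARY KEY, image_id TEXT, summary TEXT)"
        )
        cols = ["id", "image_id", "summary"]
    if "request_id" not in cols:  # idempotent: skip if already migrated
        conn.execute("ALTER TABLE annotation ADD COLUMN request_id INTEGER")
    conn.commit()
    verified = [row[1] for row in conn.execute("PRAGMA table_info(annotation)")]
    conn.close()
    return verified
```

Checking `PRAGMA table_info` before the `ALTER TABLE` makes the migration safe to run more than once.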
- Updated POST /datasets/load to create annotation_request placeholders
- Updated POST /datasets/analyze to use AgenticAnnotationPipeline
- Flow: load → annotation_request (pending) → analyze → update request → process to annotation
- annotation table now only receives processed, refined data with Gemini summaries

Breaking change: /datasets/analyze now requires /datasets/load first
Changes:
- Added force_reanalyze parameter to PromptRequest schema
- Updated POST /datasets/analyze to support re-analyzing processed images
- Improved error messages (distinguish "no images" vs "already processed")
- Added analyze_dataset() support for force_reanalyze in API client
- Added "Analyze Dataset" button to UI with force re-analyze checkbox
- Fixed 404/400 error when trying to analyze already-processed images

Features:
- Users can now re-analyze images with different prompts
- Clear error messages with pipeline stats
- UI form with customizable prompt and force re-analyze option
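The improved error handling above (distinguishing "no images" from "already processed", plus the `force_reanalyze` escape hatch) reduces to a small selection rule. The `select_images` helper and its signature are assumptions; `force_reanalyze` and the two error cases come from the changelog.

```python
def select_images(pending: list[str], processed: list[str],
                  force_reanalyze: bool) -> list[str]:
    """Decide which images POST /datasets/analyze should run on (sketch).

    Distinguishes "nothing loaded" from "everything already processed",
    and lets force_reanalyze re-run processed images.
    """
    if not pending and not processed:
        raise ValueError("No images loaded: call /datasets/load first")
    if not pending and not force_reanalyze:
        raise ValueError(
            f"All {len(processed)} images already processed; "
            "set force_reanalyze=true to re-run"
        )
    return pending + (processed if force_reanalyze else [])
```

Returning both lists when `force_reanalyze` is set is what lets users re-analyze images with a different prompt.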
Critical fixes for concurrent database access:

1. Enabled WAL (Write-Ahead Logging) mode in both repositories
   - Allows multiple readers + one writer simultaneously
   - Prevents "database is locked" errors

2. Added connection timeout and autocommit mode
   - timeout=30.0: Wait up to 30 seconds for lock acquisition
   - isolation_level=None: Enable autocommit mode

3. Reused global agentic_repo instance across endpoints
   - Prevents creating multiple concurrent connections
   - Fixed: POST /datasets/load creating new AgenticAnnotationRepo
   - Fixed: POST /datasets/analyze creating new AgenticAnnotationRepo
   - Fixed: POST /chat creating new AgenticAnnotationRepo

Changes:
- DB/agentic_repository.py: WAL mode + timeout + autocommit
- DB/repository.py: WAL mode + timeout + autocommit
- src/api/main.py: Global agentic_repo reused across endpoints

Root cause: Multiple database connections trying to write simultaneously
in async FastAPI context with SQLite's limited concurrency support.
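The three fixes above can be sketched as one connection factory plus a reused module-level instance (standing in for the global `agentic_repo`). The WAL pragma, the 30-second timeout, and `isolation_level=None` are taken from the commit; the function names are assumptions.

```python
import sqlite3

def open_repo_connection(db_path: str) -> sqlite3.Connection:
    """Open a SQLite connection configured for concurrent access (sketch)."""
    conn = sqlite3.connect(
        db_path,
        timeout=30.0,          # wait up to 30 s for lock acquisition
        isolation_level=None,  # autocommit mode
        check_same_thread=False,
    )
    # WAL lets multiple readers proceed alongside a single writer,
    # avoiding "database is locked" errors under FastAPI concurrency.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn

_repo_conn = None  # single shared connection, like the global agentic_repo

def get_repo_connection(db_path: str) -> sqlite3.Connection:
    """Return the shared connection, creating it on first use."""
    global _repo_conn
    if _repo_conn is None:
        _repo_conn = open_repo_connection(db_path)
    return _repo_conn
```

Reusing one connection per process is the same idea as reusing the global repository instance across the /datasets/load, /datasets/analyze, and /chat endpoints.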
Critical performance fix:

1. Created global agentic_pipeline instance
   - Initialized once at startup
   - Reused across all analyze requests
   - Prevents MedGemma model from reloading every time

2. Fixed lazy loading logic in MedGemmaTool
   - Only set _model_loaded flag AFTER successful load
   - Prevents flag from being set on failed loads
   - Better error handling and logging

Root cause:
- AgenticAnnotationPipeline was created fresh on every POST /datasets/analyze
- Each new pipeline created a new MedGemmaTool() instance
- Each new MedGemmaTool() reset _model_loaded = False
- Model was reloaded from disk on EVERY request (~10-20 seconds each time)

After fix:
- First request: Loads model (~10-20 seconds)
- Subsequent requests: Reuses loaded model (~instant)

Changes:
- src/api/main.py: Global agentic_pipeline initialized at startup
- src/tools/medgemma_tool.py: Fixed lazy loading flag logic
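The lazy-loading flag fix above amounts to setting `_model_loaded` only after the load succeeds, so a failed attempt stays retryable. `MedGemmaTool` and `_model_loaded` are the commit's names; the loader callable and method body are assumptions for illustration.

```python
class MedGemmaTool:
    """Sketch of the fixed lazy-loading logic (body is an assumption)."""

    def __init__(self, loader):
        self._loader = loader        # callable that actually loads the model
        self._model = None
        self._model_loaded = False

    def ensure_loaded(self):
        if self._model_loaded:
            return self._model       # reuse the already-loaded model
        self._model = self._loader() # may raise; flag then stays False
        self._model_loaded = True    # set only AFTER a successful load
        return self._model
```

With the flag set last, a load failure leaves the tool in a clean state, and a later request can retry instead of silently using a missing model.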
@kovashikawa kovashikawa merged commit 408aa6f into main Dec 15, 2025
1 check failed