GigaEvo is a machine learning experiment management system with a microservices architecture, featuring Kafka-based messaging and three-tier service separation.
GigaEvo Platform consists of three main components:

**Master API**
- Role: Experiment orchestration and coordination
- Technology: FastAPI, Kafka, PostgreSQL, Redis
- Features:
  - Kafka integration for async messaging
  - Experiment lifecycle management
  - Configuration storage and retrieval
  - uv-based dependency management

**Runner API**
- Role: Task execution with GigaEvolve integration
- Technology: FastAPI, GigaEvolve tools
- Features:
  - Experiment code execution
  - Results visualization
  - Best program extraction
  - Background task processing

**WebUI**
- Role: Gradio-based user interface
- Technology: Gradio, Plotly, Requests
- Features:
  - Interactive experiment creation
  - Real-time progress monitoring
  - Results visualization
  - System status dashboard
Prerequisites:

- Docker & Docker Compose
- Python 3.12+ (for local development)
- uv (recommended) or pip
The GigaEvo platform reads all LLM settings from a single repo-level file, `llm_models.yml`. Create it from the `llm_models.yml.example` template and fill in your credentials.
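The actual schema is defined by `llm_models.yml.example`. As a rough, hypothetical sketch of how a service might load the file (it makes no assumptions about the keys inside), the settings can be read with PyYAML:

```python
# Hypothetical sketch: load the repo-level LLM settings file with PyYAML.
# The real keys are whatever llm_models.yml.example defines; this snippet
# only parses the file and lists its top-level sections.
from pathlib import Path
import yaml

def load_llm_settings(path: str = "llm_models.yml") -> dict:
    """Parse the repo-level LLM settings into a plain dict."""
    config_file = Path(path)
    if not config_file.exists():
        raise FileNotFoundError(
            f"{path} not found; copy llm_models.yml.example and fill in your credentials"
        )
    with config_file.open() as f:
        return yaml.safe_load(f)

if __name__ == "__main__":
    print(sorted(load_llm_settings()))  # show top-level keys only
```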
GigaEvo Platform uses the `deploy.sh` script with Docker Compose for service orchestration:

```bash
make deploy
# Or directly:
./deploy.sh deploy
```

This will deploy with automated health checks:
- Infrastructure: PostgreSQL, Kafka, Zookeeper, Redis (2 instances), MinIO
- Applications: Master API, Runner API, WebUI
- Networking: Docker network and shared volumes
- Health Monitoring: Automatic service health verification
```bash
make dev        # Run services locally for development (requires infrastructure running)

make master-api # Master API on port 8000
make runner-api # Runner API on port 8001
make web-ui     # WebUI on port 7860

# Check all services status
make status
# Or:
./deploy.sh status

# Stop all services
make stop
# Or:
./deploy.sh stop

# Restart specific service
make restart SERVICE=master-api
make restart SERVICE=runner-api
make restart SERVICE=web-ui
make restart SERVICE=kafka

# View service logs
./deploy.sh logs [service-name]
```

Service URLs:

- WebUI: http://localhost:7860
- Master API: http://localhost:8000
- Runner API: http://localhost:8001
- MinIO Console: http://localhost:9001 (user: minioadmin, pass: minioadmin)
- Kafka Broker: localhost:9092
- Kafka UI: Available in dev mode at http://localhost:9000 (via `make dev`)
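After deployment, a quick reachability check against these endpoints confirms the application services are up. Below is a minimal sketch using `requests`; the `/health` paths are assumptions (FastAPI apps typically expose `/docs`, which can serve as a fallback probe):

```python
# Probe the deployed services over HTTP.
# The /health paths below are assumptions about the APIs; swap in whatever
# endpoints the services actually expose (e.g. /docs for the FastAPI apps).
import requests

SERVICES = {
    "Master API": "http://localhost:8000/health",  # assumed path
    "Runner API": "http://localhost:8001/health",  # assumed path
    "WebUI": "http://localhost:7860/",
}

for name, url in SERVICES.items():
    try:
        response = requests.get(url, timeout=5)
        print(f"{name}: HTTP {response.status_code}")
    except requests.RequestException as exc:
        print(f"{name}: unreachable ({exc})")
```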
**Master API endpoints**

- `POST /api/v1/experiments/` - Initialize experiment
- `GET /api/v1/experiments/` - Get list of experiments
- `GET /api/v1/experiments/{experiment_id}/status` - Request status
- `POST /api/v1/experiments/{experiment_id}/start` - Start experiment
- `POST /api/v1/experiments/{experiment_id}/stop` - Stop experiment
- `GET /api/v1/experiments/{experiment_id}/results` - Get results

**Runner API endpoints**

- `POST /api/v1/experiments/{experiment_id}/upload` - Load experiment code
- `POST /api/v1/experiments/{experiment_id}/start` - Start experiment
- `POST /api/v1/experiments/{experiment_id}/stop` - Stop experiment
- `GET /api/v1/experiments/{experiment_id}/status` - Get execution status
- `GET /api/v1/experiments/{experiment_id}/visualization` - Get visualization
- `GET /api/v1/experiments/{experiment_id}/best-program` - Get best program
- `GET /api/v1/experiments/{experiment_id}/logs` - Get logs (optional)
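To illustrate the experiment lifecycle against the Master API, here is a minimal client sketch built on `requests`. The request payload fields, response field names, and status values are assumptions for illustration only; the authoritative schemas are in the FastAPI docs at http://localhost:8000/docs.

```python
# Sketch of the experiment lifecycle via the Master API.
# Payload fields ("name"), response fields ("id", "status") and status values
# are assumptions; consult http://localhost:8000/docs for the real schemas.
import time
import requests

BASE = "http://localhost:8000/api/v1/experiments"

# 1. Initialize an experiment
created = requests.post(f"{BASE}/", json={"name": "demo-experiment"})
created.raise_for_status()
experiment_id = created.json()["id"]  # assumed field name

# 2. Start it
requests.post(f"{BASE}/{experiment_id}/start").raise_for_status()

# 3. Poll the status until the experiment is no longer running
while True:
    status = requests.get(f"{BASE}/{experiment_id}/status").json()
    print("status:", status)
    if status.get("status") not in ("pending", "running"):  # assumed values
        break
    time.sleep(10)

# 4. Fetch the results
results = requests.get(f"{BASE}/{experiment_id}/results").json()
print(results)
```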
The system uses these Kafka topics for coordination:
- `experiment-config` - Experiment configuration received
- `experiment-prepared` - Experiment prepared for execution
- `experiment-started` - Experiment execution started
- `experiment-stopped` - Experiment execution stopped
- `runner-status` - Runner status updates
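When debugging the coordination flow it can help to tail these topics directly. Below is a minimal sketch using the `kafka-python` package; it assumes messages decode as UTF-8 text, which may not match the platform's actual serialization:

```python
# Tail the coordination topics for debugging.
# Assumes the kafka-python package is installed and that messages decode as
# UTF-8 text; the platform's actual payload format may differ.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "experiment-started",
    "experiment-stopped",
    "runner-status",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: raw.decode("utf-8", errors="replace"),
)

for message in consumer:
    print(f"[{message.topic}] {message.value}")
```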
```bash
# Install all dependencies
make install

# Run services individually (infrastructure must be running first)
make master-api # Master API on port 8000
make runner-api # Runner API on port 8001
make web-ui     # WebUI on port 7860

# Development with hot reload (legacy architecture)
make dev
# Production environment (legacy)
make prod
# Clean up containers and volumes
make docker-clean

make lint       # Run linting with ruff
make format     # Format code with ruff
make test       # Run tests (individual components)

make db-reset   # Drop and recreate database
make db-migrate # Run database migrations
```
Common issues:

- Port Conflicts: Ensure these ports are free:
  - 5432: PostgreSQL
  - 6379, 6380: Redis (2 instances)
  - 7860: WebUI
  - 8000: Master API
  - 8001: Runner API
  - 9000, 9001: MinIO
  - 9092, 29092, 29093: Kafka
  - 2181: Zookeeper
- Deployment Issues:

  ```bash
  # Check deployment status
  ./deploy.sh status
  # Or:
  make status

  # View service logs
  ./deploy.sh logs [service-name]
  # Or for all services:
  ./deploy.sh logs

  # Restart specific service
  make restart SERVICE=master-api
  make restart SERVICE=runner-api
  make restart SERVICE=web-ui
  make restart SERVICE=kafka
  ```
- Service Health Check Failures:

  ```bash
  # The deploy script automatically checks service health
  # If services fail to start, check logs:
  ./deploy.sh logs postgres
  ./deploy.sh logs kafka
  ./deploy.sh logs master-api
  ```
- Database Connection Issues:

  ```bash
  # Reset database (use after schema changes)
  make db-reset
  # Check PostgreSQL logs
  ./deploy.sh logs postgres
  ```
Key environment variables for Master API:
- `DATABASE__URL` - PostgreSQL connection string
- `KAFKA__BOOTSTRAP_SERVERS` - Kafka bootstrap servers
- `REDIS_URL` - Redis connection URL
- `STORAGE__ENDPOINT_URL` - MinIO endpoint
- `STORAGE__ACCESS_KEY` - MinIO access key
- `STORAGE__SECRET_KEY` - MinIO secret key
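The double-underscore names indicate nested settings groups (`DATABASE__URL` maps to a `database.url` field, and so on). As a sketch of how such variables are typically parsed with `pydantic-settings`; the class and field layout below is an assumption, not the platform's actual settings module:

```python
# Illustration of how DATABASE__URL / STORAGE__ACCESS_KEY style variables map
# onto nested settings via pydantic-settings. The class layout and defaults are
# assumptions; the platform's real settings module may differ.
from pydantic import BaseModel
from pydantic_settings import BaseSettings, SettingsConfigDict

class DatabaseSettings(BaseModel):
    url: str = "postgresql://localhost:5432/postgres"  # placeholder default

class KafkaSettings(BaseModel):
    bootstrap_servers: str = "localhost:9092"

class StorageSettings(BaseModel):
    endpoint_url: str = "http://localhost:9000"
    access_key: str = "minioadmin"
    secret_key: str = "minioadmin"

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_nested_delimiter="__")
    database: DatabaseSettings = DatabaseSettings()
    kafka: KafkaSettings = KafkaSettings()
    storage: StorageSettings = StorageSettings()
    redis_url: str = "redis://localhost:6379/0"

settings = Settings()  # DATABASE__URL overrides settings.database.url, etc.
print(settings.kafka.bootstrap_servers)
```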
The platform uses a modern microservices architecture with:
- Kafka Message Broker - Asynchronous service communication with topics for experiment coordination
- Separate Docker Compositions - Modular deployment with infrastructure and application services
- Health Monitoring - Automated service health checks and recovery
- Resource Isolation - Dedicated Redis instances and MinIO storage
- uv Dependency Management - Fast package installation and dependency caching
Key files:

- `deploy.sh`: Main deployment script with health checks and service management
- `docker-compose.kafka.yml`: Core infrastructure services
- `docker-compose.*.yml`: Individual application service configurations
- `Makefile`: Development commands and shortcuts
To contribute:

- Fork the repository
- Create a feature branch
- Make your changes
- Run tests and linting: `make test && make lint`
- Submit a pull request
MIT License - see LICENSE file for details.