Skip to content

AI-powered business intelligence tool that turns anyone into a data analyst. Upload CSV, ask questions in plain English, get executive insights with ROI analysis.

Notifications You must be signed in to change notification settings

cbratkovics/sql-genius-ai

Repository files navigation

SQL Genius AI - Enterprise SaaS Platform

Production-Ready Multi-Tenant SaaS for Intelligent Data Analysis

Deploy on Render Frontend on Vercel API Docs Live Demo

Enterprise Ready Multi-Tenant Security First CI/CD Pipeline


Enterprise-grade SaaS platform that transforms natural language into intelligent SQL queries with advanced security, monitoring, and multi-tenant architecture.


Enterprise Architecture

Core Platform

  • Multi-tenant SaaS with database-per-tenant isolation
  • FastAPI backend with async PostgreSQL and Redis
  • JWT authentication with RSA key rotation and service mesh
  • Role-based access control (RBAC) with fine-grained permissions
  • Multi-factor authentication (TOTP, SMS, email, backup codes)

Security & Compliance

  • Zero-trust architecture with service mesh authentication
  • Client-side encryption with column-level access control
  • Data masking and automated compliance reporting
  • Security scanning integrated in CI/CD pipeline
  • Vulnerability management with automated patching

Observability & Monitoring

  • Prometheus metrics with custom business KPIs
  • Grafana dashboards for system and application monitoring
  • ELK stack for centralized logging and analysis
  • Jaeger tracing for distributed request tracking
  • SLI/SLO monitoring with automated alerting

FinOps & Cost Management

  • AI API cost tracking with per-token pricing models
  • Infrastructure cost monitoring via AWS Cost Explorer
  • Cost anomaly detection with statistical analysis
  • Budget management with configurable alerts
  • Automated optimization recommendations

Disaster Recovery

  • Automated backups (full, incremental, transaction log)
  • Cross-region replication with S3 storage
  • Point-in-time recovery with RTO/RPO targets
  • Backup scheduling and retention policies
  • Recovery plan generation and execution

Technology Stack

Backend Services

API Framework: FastAPI with async/await
Database: PostgreSQL 15 with SQLAlchemy ORM
Cache: Redis 7 with async support
Task Queue: Celery with Redis broker
Authentication: JWT with RSA256 signing

Infrastructure

Containerization: Docker with multi-stage builds
Orchestration: Kubernetes with Helm charts
Monitoring: Prometheus + Grafana + ELK
Service Mesh: Istio with mTLS
Storage: AWS S3 with encryption

AI & Intelligence

SQL Generation: Claude 3.5 Sonnet
Query Optimization: Custom SQL parser
Semantic Caching: Vector similarity matching
Business Intelligence: Automated insights
Anomaly Detection: Statistical analysis

DevOps & Security

CI/CD: GitHub Actions with security scanning
Security: TruffleHog, Snyk, Bandit, Semgrep
Container Security: Trivy, Dockle
Performance: Load testing with Locust
License Compliance: Automated validation

Live Deployment

Production URLs

Quick Deploy

Deploy Backend to Render

  1. Fork this repository
  2. Connect to Render
  3. Create new Blueprint from render.yaml
  4. Set environment variables:
    • ANTHROPIC_API_KEY: Your Claude API key
    • JWT_SECRET_KEY: Auto-generated
    • DATABASE_URL: Auto-configured

Deploy Frontend to Vercel

cd sql-genius-frontend
npx vercel --prod

Demo Features

  • SQL Generation: Natural language to SQL with AI
  • Live Playground: Interactive query builder
  • Real-time Metrics: Performance monitoring dashboard
  • Schema Templates: E-commerce, SaaS, Healthcare, Financial
  • Sandboxed Execution: Safe query testing environment

Verify Deploy

# Check API health
curl -sSf https://sql-genius-api.onrender.com/health

# View API documentation
curl -sSf https://sql-genius-api.onrender.com/docs

For detailed deployment instructions, see the render.yaml Blueprint configuration.


Enterprise Features

Unified Authentication Service

  • JWT with RSA rotation - Automatic key rotation every 24 hours
  • Multi-factor authentication - TOTP, SMS, email, backup codes
  • Single Sign-On (SSO) - SAML 2.0 and OpenID Connect
  • Service mesh auth - mTLS between microservices
  • Session management - Redis-based with automatic cleanup

Advanced SQL Engine

  • Multi-step reasoning - Complex query decomposition
  • Validation pipeline - Syntax and semantic checking
  • Security scanning - SQL injection prevention
  • Query sandboxing - Isolated execution environment
  • Performance optimization - Automatic query tuning

Multi-Tenant Architecture

  • Database isolation - Database-per-tenant pattern
  • Dynamic routing - Tenant-aware request routing
  • Resource quotas - Per-tenant limits and throttling
  • Usage metering - Detailed consumption tracking
  • Billing integration - Stripe subscription management

Business Intelligence

  • Automated insights - AI-powered data analysis
  • Anomaly detection - Statistical outlier identification
  • Natural language reporting - Executive summaries
  • Predictive analytics - Trend forecasting
  • Custom dashboards - Real-time visualizations

Intelligent Caching

  • Semantic similarity - Vector-based query matching
  • TTL optimization - Dynamic cache expiration
  • Distributed cache - Redis cluster support
  • Cache invalidation - Smart dependency tracking
  • Performance metrics - Cache hit/miss analytics

Development Setup

Prerequisites

- Python 3.11+
- Docker & Docker Compose
- PostgreSQL 15
- Redis 7
- Node.js 18+ (for frontend)

Quick Start

# Clone repository
git clone https://github.com/cbratkovics/sql-genius-ai.git
cd sql-genius-ai

# Start infrastructure services
docker-compose -f infrastructure/docker/docker-compose.yml up -d

# Install Python dependencies
cd backend
pip install -r requirements.txt

# Run database migrations
alembic upgrade head

# Start the API server
uvicorn main:app --reload --host 0.0.0.0 --port 8000

# Start monitoring stack
docker-compose -f infrastructure/monitoring/docker-compose.monitoring.yml up -d

Environment Configuration

# .env file
DATABASE_URL=postgresql://user:pass@localhost:5432/sqlgenius
REDIS_URL=redis://localhost:6379
CLAUDE_API_KEY=your_claude_api_key
JWT_SECRET_KEY=your_jwt_secret
STRIPE_SECRET_KEY=your_stripe_key
AWS_ACCESS_KEY_ID=your_aws_key
AWS_SECRET_ACCESS_KEY=your_aws_secret

CI/CD Pipeline

Security Scanning

  • Secret detection - TruffleHog for exposed secrets
  • Dependency scanning - Snyk for vulnerabilities
  • SAST analysis - Bandit and Semgrep
  • Container security - Trivy and Dockle
  • License compliance - Automated validation

Quality Gates

  • Code coverage - 90% minimum threshold
  • Type checking - MyPy static analysis
  • Code formatting - Black and isort
  • Performance testing - Regression detection
  • Security benchmarks - OWASP compliance

Deployment Strategy

  • GitOps workflow - ArgoCD for deployments
  • Blue-green deployment - Zero-downtime releases
  • Automated rollback - Health check failures
  • Progressive delivery - Feature flags and canary
  • Post-deployment testing - Smoke and integration tests

Monitoring & Observability

Metrics & Dashboards

Application Metrics:
  - Request latency and throughput
  - Error rates and success rates
  - SQL query performance
  - Cache hit/miss ratios
  - Business KPIs and conversion

Infrastructure Metrics:
  - CPU, memory, disk usage
  - Database connection pools
  - Queue depth and processing
  - Network I/O and latency
  - Container resource usage

Alerting Rules

Critical Alerts:
  - API error rate > 1%
  - Response time > 2s
  - Database connections > 80%
  - Disk usage > 85%
  - Memory usage > 90%

Warning Alerts:
  - Cache hit rate < 80%
  - Queue processing delay > 30s
  - Cost anomaly detected
  - Backup failure
  - SSL certificate expiry < 30 days

Cost Management

FinOps Features

  • Real-time cost tracking - Per-tenant consumption
  • Budget alerts - Configurable thresholds
  • Cost allocation - Department and project tags
  • Optimization recommendations - AI-powered suggestions
  • Anomaly detection - Statistical analysis of spending

Cost Optimization

AI API Costs:
  - Token usage optimization
  - Model selection automation
  - Batch processing
  - Caching strategies
  - Rate limiting

Infrastructure Costs:
  - Auto-scaling policies
  - Reserved instance planning
  - Storage lifecycle management
  - Spot instance utilization
  - Resource right-sizing

API Documentation

Authentication Endpoints

POST /auth/login          # User authentication
POST /auth/register       # User registration
POST /auth/mfa/setup      # MFA configuration
POST /auth/refresh        # Token refresh
DELETE /auth/logout       # Session termination

SQL Generation Endpoints

POST /sql/generate        # Generate SQL from natural language
POST /sql/execute         # Execute SQL query
GET /sql/history          # Query history
POST /sql/validate        # Validate SQL syntax
GET /sql/schema          # Database schema

Management Endpoints

GET /tenants              # Tenant management
POST /tenants/{id}/backup # Create backup
GET /metrics              # Prometheus metrics
GET /health              # Health check
GET /docs                # OpenAPI documentation

Architecture Diagrams

System Architecture

graph TB
    LB[Load Balancer] --> API[FastAPI Services]
    API --> DB[(PostgreSQL)]
    API --> CACHE[(Redis)]
    API --> QUEUE[Celery Workers]
    
    API --> AUTH[Auth Service]
    API --> SQL[SQL Engine]
    API --> BI[BI Service]
    
    MONITOR[Prometheus] --> GRAFANA[Grafana]
    LOGS[ELK Stack] --> KIBANA[Kibana]
    
    BACKUP[Backup Service] --> S3[(AWS S3)]
    COST[Cost Service] --> AWS[AWS APIs]
Loading

Multi-Tenant Data Flow

sequenceDiagram
    participant C as Client
    participant LB as Load Balancer
    participant API as API Gateway
    participant AUTH as Auth Service
    participant SQL as SQL Engine
    participant DB as Tenant DB
    
    C->>LB: Request with JWT
    LB->>API: Route to service
    API->>AUTH: Validate token
    AUTH->>API: Tenant context
    API->>SQL: Process query
    SQL->>DB: Execute on tenant DB
    DB->>SQL: Results
    SQL->>API: Formatted response
    API->>C: Business insights
Loading

Performance Benchmarks

API Performance

Response Times:
  - SQL Generation: <500ms (p95)
  - Query Execution: <2s (p95)
  - Authentication: <100ms (p95)
  - Dashboard Load: <1s (p95)

Throughput:
  - Concurrent Users: 10,000+
  - Requests/Second: 5,000+
  - SQL Queries/Hour: 1,000,000+
  - Cache Hit Rate: >90%

Scalability

Horizontal Scaling:
  - API Services: Auto-scaling 2-50 instances
  - Database: Read replicas + sharding
  - Cache: Redis cluster (6 nodes)
  - Queue: Celery workers (10-100 instances)

Resource Efficiency:
  - CPU Usage: <70% average
  - Memory Usage: <80% average
  - Storage Growth: <10GB/month/tenant
  - Network I/O: <1Gbps sustained

Security & Compliance

Security Controls

  • Authentication: JWT with RSA256, MFA required
  • Authorization: Fine-grained RBAC permissions
  • Encryption: TLS 1.3 in transit, AES-256 at rest
  • Network: VPC isolation, security groups
  • Monitoring: Real-time threat detection

Compliance Standards

  • SOC 2 Type II - Security and availability
  • GDPR - Data privacy and protection
  • HIPAA - Healthcare data security
  • PCI DSS - Payment card industry
  • ISO 27001 - Information security management

Deployment

Production Deployment

# Build and push images
docker build -t sqlgenius/api:latest -f infrastructure/docker/Dockerfile.backend .
docker push sqlgenius/api:latest

# Deploy with Helm
helm upgrade --install sqlgenius ./charts/sqlgenius \
  --namespace production \
  --values values.production.yaml

# Verify deployment
kubectl get pods -n production
kubectl logs -f deployment/sqlgenius-api -n production

Infrastructure as Code

# Kubernetes manifests
- Deployments and Services
- ConfigMaps and Secrets
- Ingress and NetworkPolicies
- HorizontalPodAutoscaler
- PersistentVolumeClaims

# Helm Charts
- Application deployment
- Configuration management
- Environment-specific values
- Dependency management
- Upgrade strategies

Contributing

Development Guidelines

  • Code Style: Black formatting, PEP 8 compliance
  • Testing: 90%+ coverage, unit and integration tests
  • Documentation: Docstrings, API documentation
  • Security: Security review for all changes
  • Performance: Benchmark critical paths

Contribution Process

  1. Fork the repository
  2. Create feature branch
  3. Implement changes with tests
  4. Run security and quality checks
  5. Submit pull request
  6. Code review and approval
  7. Automated deployment

License

This project is licensed under the MIT License - see the LICENSE file for details.


Enterprise Support

Production-Ready SaaS Platform

PRODUCTION READY

ENTERPRISE GRADE

Multi-tenant • Secure • Scalable • Observable • Cost-optimized


Built by Christopher Bratkovics

Star this repo to support enterprise-grade open source!

About

AI-powered business intelligence tool that turns anyone into a data analyst. Upload CSV, ask questions in plain English, get executive insights with ROI analysis.

Resources

Stars

Watchers

Forks

Packages

No packages published

Contributors 2

  •  
  •