Codebase Swarm

A Multi-Agent AI System for Comprehensive Code Analysis

Quick Start • Features • Architecture • Usage Examples • Contributing

Why I Created This

As a developer, I was tired of juggling 10 different tools to understand my codebase:

Security: Bandit, Snyk, Semgrep

Performance: Profilers, linters, manual code review

Testing: Coverage.py, pytest, mutation testing

Architecture: Graphviz, manual tracing, whiteboard sessions

Refactoring: IDE hints, gut feelings, Stack Overflow

Each tool gave me fragmented insights, but none understood the big picture. I wanted something that could:

Think holistically about my codebase like a senior architect

Connect the dots between security, performance, and design

Generate actual fixes, not just warnings

Learn and adapt to my team's specific patterns

So I built Codebase Swarm—a team of AI agents that collaborate to give you complete codebase intelligence in one place.

What is Codebase Swarm?

Codebase Swarm is a multi-agent AI system that analyzes your codebase using specialized AI agents working together. Think of it as hiring a team of expert consultants (security auditor, performance engineer, test architect, etc.) who:

Collaborate to solve complex problems

Specialize in their domain but understand the big picture

Generate actionable fixes with proof-of-concept exploits

Visualize your architecture in real-time

Predict issues before they hit production

Unlike traditional static analysis tools, Codebase Swarm uses LLMs + AST parsing + Graph analysis to understand intent, not just syntax.

Problems It Solves

1. "I have 50 security warnings, which ones actually matter?"

Problem: Traditional tools flood you with false positives.

Solution: Security Agent generates proof-of-concept exploits and ranks by actual risk, not just pattern matching.

2. "Will this code scale to 1000 RPS?"

Problem: Performance issues only appear in production.

Solution: Performance Agent simulates load and predicts bottlenecks with estimated RPS limits.

3. "What tests should I write?"

Problem: 40% test coverage, but which 40% matters?

Solution: Tester Agent maps critical paths and generates targeted tests for untested error handling.

4. "If I change this function, what breaks?"

Problem: Fear of refactoring due to unknown dependencies.

Solution: Architect Agent builds a call graph and shows exact impact of changes.

5. "How do I fix this vulnerability?"

Problem: Tools tell you what's wrong, but not how to fix it.

Solution: Refactorer Agent generates ready-to-apply patches with before/after code.

Key Features

Feature	Description	Impact
🔒 Security Agent	Finds SQLi, XSS, hardcoded secrets with exploits	Prevents breaches before deployment
⚡ Performance Agent	Predicts RPS limits, detects N+1 queries, blocking calls	Scales confidently
🧪 Tester Agent	Identifies coverage gaps, generates missing tests	Reaches 80%+ coverage efficiently
🏗️ Architect Agent	Maps call graphs, detects circular dependencies	Refactors safely
🔧 Refactorer Agent	Auto-generates patches for all issues	Fixes in minutes, not hours
🕸️ Interactive Graphs	D3.js call graph with clickable nodes	Visualize architecture
📊 Risk Scoring	0-10 risk scores per category	Prioritize work
📝 Git Integration	Analyzes commit history, generates patches	Seamless workflow
🎯 Custom Rules	YAML-based architecture rules	Enforce team standards
🌐 Multi-language	Python + extendable to JS/TS, Go, Rust	Polyglot support

Architecture

Supported Languages

Codebase Swarm includes built-in parsers and scaffolding for multiple languages. Languages with at least lightweight support today:

Python (fully implemented AST parsing)
JavaScript / JSX (heuristic parser stub; recommend integrating tree-sitter or esprima for production)
Go (heuristic parser stub)

Broad language support

The project now includes a generic, heuristic parser that provides basic coverage across many languages (Java, Kotlin, C#, PHP, Ruby, Rust, C/C++, Swift, Scala, Perl, and more). Heuristic parsers can detect simple function and class declarations but are not a substitute for full AST-based parsing.

For production-grade, accurate parsing across all languages, integrate Tree-sitter or language-specific AST tools. We provide a clear hook: add a parser under swarm/tools/parsers/ implementing the BaseParser interface and call register_parser('<language>', parser_instance).

Optional: to enable Tree-sitter parsing, install a Python tree-sitter package and configure compiled language libraries. Example (not included):

pip install tree_sitter
# then build language bundles per Tree-sitter docs

Tree-sitter scaffold

The repository now includes a Tree-sitter integration scaffold at swarm/tools/parsers/tree_sitter_parser.py.

It will register a tree_sitter parser automatically if the tree_sitter Python package is installed and a compiled languages bundle is available (see TREE_SITTER_LANG_DIR env var or vendor/tree_sitter_languages.so).
The scaffold is intentionally minimal — extend it to load specific Language objects and map file extensions to those languages for accurate AST queries.

Example quick-start:

pip install tree_sitter
# Build a combined language bundle (see Tree-sitter docs) and export:
export TREE_SITTER_LANG_DIR=/path/to/compiled/bundle

# Then run the analyzer; CodeParser will prefer tree-sitter when available.
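
The availability check described above might look roughly like the sketch below. The function name `tree_sitter_ready` and its parameters are hypothetical; only the `TREE_SITTER_LANG_DIR` variable and the vendor/tree_sitter_languages.so fallback come from this README. How the scaffold actually performs the check is not shown here, so this is an illustration rather than its implementation.

```python
import importlib.util
import os
from pathlib import Path


def tree_sitter_ready(env=None, fallback="vendor/tree_sitter_languages.so"):
    """Return the bundle path if Tree-sitter looks usable, else None."""
    env = os.environ if env is None else env
    # find_spec returns None when the package is not installed.
    if importlib.util.find_spec("tree_sitter") is None:
        return None
    # Prefer the environment variable, then the vendored bundle.
    for candidate in (env.get("TREE_SITTER_LANG_DIR"), fallback):
        if candidate and Path(candidate).exists():
            return Path(candidate)
    return None


bundle = tree_sitter_ready()
print("tree-sitter" if bundle else "heuristic parsers")
```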

To add a new language parser, implement swarm/tools/parsers/<your_parser>.py following the BaseParser interface and register it via swarm/tools/parsers/__init__.py using register_parser(name, parser_instance).

graph TD
    A[User CLI/Streamlit] --> B[Swarm Orchestrator];
    B --> C[Architect Agent];
    B --> D[Security Agent];
    B --> E[Performance Agent];
    B --> F[Tester Agent];
    B --> G[Refactorer Agent];
    
    C --> H[Call Graph Builder];
    C --> I[Import Analyzer];
    
    D --> J[Security Scanner];
    D --> K[Exploit Generator];
    
    E --> L[Performance Analyzer];
    E --> M[Complexity Profiler];
    
    F --> N[Test Generator];
    F --> O[Coverage Analyzer];
    
    G --> P[Patch Generator];
    G --> Q[AST Transformer];
    
    H --> R[Shared State];
    J --> R;
    L --> R;
    N --> R;
    P --> R;
    
    R --> S[Final Report];
    P --> T[fixes.patch];
    
    style A fill:#f9f,stroke:#333,stroke-width:2px
    style B fill:#bbf,stroke:#333,stroke-width:2px
    style S fill:#bfb,stroke:#333,stroke-width:2px
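
The flow in the diagram (orchestrator fans out to agents, agents write to shared state, the report is read from that state) can be sketched in a few lines. Everything here is a simplified assumption: the three agent functions and `run_swarm` are hypothetical, and the real orchestrator may schedule agents differently. Only the result keys `security`/`vulnerabilities` and `fixes`/`fixes` mirror the Python API example in this README.

```python
from typing import Callable


# Hypothetical agents: each reads the shared state and returns its findings.
def architect(state: dict) -> dict:
    return {"functions": len(state["source"].split("def ")) - 1}


def security(state: dict) -> dict:
    flagged = 'execute(f"' in state["source"]
    return {"vulnerabilities": ["possible SQL injection"] if flagged else []}


def refactorer(state: dict) -> dict:
    # Runs last so it can read what the security agent wrote.
    issues = state["security"]["vulnerabilities"]
    return {"fixes": [f"patch for: {issue}" for issue in issues]}


AGENTS: dict[str, Callable[[dict], dict]] = {
    "architecture": architect,
    "security": security,
    "fixes": refactorer,
}


def run_swarm(source: str) -> dict:
    state = {"source": source}
    for name, agent in AGENTS.items():  # dicts preserve insertion order
        state[name] = agent(state)
    return state


sample = '''
def login(u):
    db.execute(f"SELECT * FROM users WHERE name = '{u}'")
'''
report = run_swarm(sample)
print(report["security"], report["fixes"])
```

Keeping findings in one shared dictionary is what lets a later agent build on an earlier one without the agents calling each other directly.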

Installation

Option 1: From PyPI (Recommended)

pip install codebase-swarm

Option 2: From Source

git clone https://github.com/KunjShah01/codebase-swarm.git
cd codebase-swarm
pip install -r requirements.txt
pip install .

Option 3: Docker

docker pull kunjshah01/codebase-swarm:latest
docker run -v "$(pwd)":/code kunjshah01/codebase-swarm /code

Quick Start

30-Second Test

# Run on the included sample project
swarm examples/sample_project --output report.md

# Apply fixes automatically
swarm examples/sample_project --apply-fixes

2-Minute Deep Dive

# Interactive CLI with beautiful UI
swarm --mode interactive

# Or launch Streamlit dashboard
streamlit run streamlit_app.py

Usage Examples

CLI Mode

# Basic analysis
swarm /path/to/your/project

# Save report
swarm /path/to/project -o security_report.md

# Security only
swarm /path/to/project --agents security

# Auto-fix critical issues
swarm /path/to/project --apply-fixes --severity critical

Streamlit Mode

# Launch web interface
streamlit run streamlit_app.py -- --target /path/to/project

Python API

from swarm.orchestrator import SwarmOrchestrator
from swarm.models import Task

# Initialize
orchestrator = SwarmOrchestrator()

# Create task
task = Task(
    description="Find security issues in auth module",
    target="src/auth.py"
)

# Run analysis
result = orchestrator.solve(task)

# Access results
print(f"Found {len(result['security']['vulnerabilities'])} vulnerabilities")
print(f"Generated {len(result['fixes']['fixes'])} fixes")

The Agents

🔒 Security Agent

Expertise: OWASP Top 10, cryptography, secure coding
Tools: Static analyzer, pattern matcher, exploit generator
Output: CVE-style reports with PoC exploits

# Example: Finds and exploits SQL injection
vulnerability = {
    "type": "SQL Injection",
    "cwe": "CWE-89",
    "severity": "critical",
    "exploit": {
        "payload": "' OR '1'='1'; DROP TABLE users; --",
        "impact": "Complete database compromise",
        "proof_of_concept": "python exploit.py --url http://target.com/login"
    }
}
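
As an illustration of the static-analysis half of this agent, the sketch below flags f-strings that look like interpolated SQL using Python's standard `ast` module. The helper name `find_fstring_sql` and the keyword list are assumptions made for this example; the Security Agent's actual detection rules are not documented here.

```python
import ast

SQL_KEYWORDS = ("select ", "insert ", "update ", "delete ")


def find_fstring_sql(source: str) -> list[int]:
    """Return line numbers of f-strings that look like interpolated SQL."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.JoinedStr):
            continue
        # Literal pieces of an f-string are ast.Constant nodes.
        literal = "".join(
            part.value
            for part in node.values
            if isinstance(part, ast.Constant) and isinstance(part.value, str)
        ).lower()
        has_placeholder = any(
            isinstance(part, ast.FormattedValue) for part in node.values
        )
        if has_placeholder and literal.lstrip().startswith(SQL_KEYWORDS):
            hits.append(node.lineno)
    return sorted(hits)


code = '''
def get_user(user_id):
    query = f"SELECT * FROM users WHERE id = {user_id}"
    safe = "SELECT * FROM users WHERE id = ?"
    return query, safe
'''
print(find_fstring_sql(code))
```

A check like this only finds candidates. Deciding whether the interpolated value is attacker-controlled still needs data-flow analysis or review.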

⚡ Performance Agent

Expertise: Algorithms, concurrency, database optimization
Tools: Profiler, complexity analyzer, load simulator
Output: RPS predictions with optimization patches

# Example: Predicts scaling limit
prediction = {
    "estimated_rps": 500,
    "bottleneck_at": "3 critical bottlenecks",
    "failure_mode": "Database connection pool exhaustion",
    "scaling_limit": "Will fail at ~1000 RPS"
}
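
One simple way to reason about a connection-pool ceiling like the one above is Little's law: if every request holds one connection for a fixed time, throughput cannot exceed pool size divided by hold time. The sketch below applies that bound. The function names and the numbers (20 connections, 40 ms per request) are illustrative assumptions, not values produced by the Performance Agent.

```python
import math


def pool_rps_ceiling(pool_size: int, hold_ms: float) -> float:
    """Upper bound on requests/second when each request holds one connection."""
    if pool_size <= 0 or hold_ms <= 0:
        raise ValueError("pool_size and hold_ms must be positive")
    return pool_size * 1000 / hold_ms


def required_pool(target_rps: float, hold_ms: float) -> int:
    """Connections needed to sustain target_rps, rounded up."""
    return math.ceil(target_rps * hold_ms / 1000)


ceiling = pool_rps_ceiling(pool_size=20, hold_ms=40)
needed = required_pool(target_rps=1000, hold_ms=40)
print(f"ceiling: {ceiling:.0f} RPS, pool needed for 1000 RPS: {needed}")
```

The bound ignores queueing and variance in query time, so real limits tend to be lower than this figure.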

🧪 Tester Agent

Expertise: TDD, pytest, test doubles
Tools: Coverage analyzer, test generator, mutation tester
Output: Generated tests for uncovered paths, aiming for 80%+ coverage

# Example: Generates missing test
generated_test = """
def test_process_payment_raises_on_invalid_amount():
    with pytest.raises(ValueError):
        process_payment(user_id=1, amount=-100)
"""

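To show how "untested error handling" could be located, the sketch below compares functions that contain a `raise` with the names exercised inside `pytest.raises` blocks. Both helper functions are hypothetical and work purely on source text with the standard `ast` module; the Tester Agent's real analysis is likely coverage-based and more precise.

```python
import ast


def functions_that_raise(source: str) -> set[str]:
    """Names of functions whose body contains a `raise` statement."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if any(isinstance(child, ast.Raise) for child in ast.walk(node)):
                names.add(node.name)
    return names


def names_under_pytest_raises(test_source: str) -> set[str]:
    """Names referenced inside `with <something>.raises(...)` blocks."""
    covered = set()
    for node in ast.walk(ast.parse(test_source)):
        if not isinstance(node, ast.With):
            continue
        uses_raises = any(
            isinstance(item.context_expr, ast.Call)
            and isinstance(item.context_expr.func, ast.Attribute)
            and item.context_expr.func.attr == "raises"
            for item in node.items
        )
        if uses_raises:
            for stmt in node.body:
                covered |= {n.id for n in ast.walk(stmt) if isinstance(n, ast.Name)}
    return covered


app = '''
def process_payment(user_id, amount):
    if amount <= 0:
        raise ValueError("amount must be positive")
    return amount

def refund(order_id):
    if order_id is None:
        raise KeyError("missing order")
    return order_id
'''
tests = '''
import pytest

def test_process_payment_raises_on_invalid_amount():
    with pytest.raises(ValueError):
        process_payment(user_id=1, amount=-100)
'''
gaps = functions_that_raise(app) - names_under_pytest_raises(tests)
print(gaps)
```

Here `refund` is reported because nothing asserts on its error path, which is the kind of gap a generated test would target.
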
🏗️ Architect Agent

Expertise: Design patterns, clean architecture, scalability
Tools: Call graph builder, import analyzer, dependency mapper

graph TD
    A[API Layer] --> B[Service Layer];
    B --> C[Database Layer];
    %% Violation: the API layer reaches the database layer directly
    A -.-> C;
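
The "what breaks if I change this function" question reduces to reverse reachability on a call graph. Below is a minimal sketch using the standard `ast` module. `build_call_graph` and `impacted_by` are hypothetical names, and the graph only tracks plain-name calls between top-level functions, which is far less than a production call graph builder would need.

```python
import ast
from collections import defaultdict, deque


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the plain names it calls."""
    graph = {}
    for func in ast.parse(source).body:
        if isinstance(func, ast.FunctionDef):
            graph[func.name] = {
                node.func.id
                for node in ast.walk(func)
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            }
    return graph


def impacted_by(graph: dict[str, set[str]], changed: str) -> set[str]:
    """Every function that reaches `changed` through one or more calls."""
    callers = defaultdict(set)
    for caller, callees in graph.items():
        for callee in callees:
            callers[callee].add(caller)
    seen, queue = set(), deque([changed])
    while queue:  # breadth-first walk over the reversed edges
        for caller in callers[queue.popleft()]:
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


code = '''
def hash_password(p): return p[::-1]
def check(u, p): return hash_password(p) == u
def login(u, p): return check(u, p)
def health(): return "ok"
'''
graph = build_call_graph(code)
print(impacted_by(graph, "hash_password"))
```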

🔧 Refactorer Agent

Expertise: Refactoring, code style, modern Python
Tools: AST transformer, code generator, patch applier
Output: Git-ready patches with before/after

- query = f"SELECT * FROM users WHERE id = {user_id}"
+ query = "SELECT * FROM users WHERE id = ?"
+ cursor.execute(query, (user_id,))
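
The effect of the patch above can be checked with the standard library's `sqlite3` module, which uses the same `?` placeholder style. The table, rows, and payload below are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "1 OR 1=1"

# Before: the payload becomes part of the SQL text and matches every row.
unsafe_rows = conn.execute(
    f"SELECT * FROM users WHERE id = {payload}"
).fetchall()

# After: the driver binds the payload as a single value, so it matches nothing.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE id = ?", (payload,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
conn.close()
```

The parameterized form keeps query structure and user data separate, so the database never interprets the input as SQL.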

Sample Output

CLI Report

🐝 CODEBASE SWARM ANALYSIS
═══════════════════════════════════════════════════════════

🏗️ Architecture: 47 functions, 12 classes
🔒 Security: 3 critical, 5 high, 2 medium vulnerabilities
⚡ Performance: 2 critical bottlenecks (estimated RPS: 500)
🧪 Testing: 45% coverage, 8 test gaps identified
🔧 Fixes: 18 auto-generated patches ready

🚨 CRITICAL ISSUES:
   • SQL Injection in auth.py:42
   • Blocking call in payment.py:67
   • N+1 query in orders.py:23

📊 Risk Score: 7.2/10 (High Risk)

Streamlit Dashboard

diff --git a/src/auth.py b/src/auth.py
--- a/src/auth.py
+++ b/src/auth.py
@@ -42,5 +42,5 @@ def authenticate(username, password):
-    query = f"SELECT * FROM users WHERE username = '{username}'"
-    result = db.execute(query)
+    query = "SELECT * FROM users WHERE username = ?"
+    result = db.execute(query, (username,))
     
     if result:
         return User(**result)

Advanced Configuration

Custom Rules (swarm.yaml)

# Architecture rules
architecture:
  forbidden_patterns:
    - pattern: "api/.*\\.py"
      forbidden_imports: ["database", "models"]
      severity: "error"
  
  max_function_length: 50
  max_class_methods: 10

# Security rules
security:
  custom_vulnerabilities:
    - name: "Internal API Key"
      pattern: 'internal_api_key\s*=\s*["''][^"'']+["'']'
      severity: "high"

# Performance thresholds
performance:
  max_complexity: 10
  min_rps_threshold: 1000
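
A `forbidden_patterns` rule like the one above could be evaluated roughly as follows. This is a sketch under assumptions: the rule is hard-coded as a dict instead of being loaded from swarm.yaml, the path is matched with `re.search`, and only the top-level package of each import is compared. The function names are hypothetical and the tool's real matching semantics may differ.

```python
import ast
import re

# Mirrors the first architecture rule in the YAML above.
RULE = {"pattern": r"api/.*\.py", "forbidden_imports": ["database", "models"]}


def imported_modules(source: str) -> set[str]:
    """Top-level package names imported by `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules |= {alias.name.split(".")[0] for alias in node.names}
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules


def violations(path: str, source: str, rule: dict = RULE) -> list[str]:
    """Forbidden modules imported by a file that the rule applies to."""
    if not re.search(rule["pattern"], path):
        return []
    return sorted(imported_modules(source) & set(rule["forbidden_imports"]))


api_source = "import database.session\nfrom models import User\nimport os\n"
print(violations("src/api/users.py", api_source))
```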

Environment Variables

export OPENAI_API_KEY="sk-..."
export SWARM_CONFIG="swarm.yaml"
export SWARM_OUTPUT_DIR="./reports"
export SWARM_AUTO_APPLY="false"
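
For scripts that wrap the tool, these variables can be read into a small settings object. The `SwarmSettings` class, `load_settings` function, and the default values below are assumptions made for this sketch (the defaults simply reuse the example values above); they are not part of the documented API.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class SwarmSettings:
    api_key: Optional[str]
    config_path: Path
    output_dir: Path
    auto_apply: bool


def load_settings(env=None) -> SwarmSettings:
    env = os.environ if env is None else env
    flag = env.get("SWARM_AUTO_APPLY", "false").strip().lower()
    return SwarmSettings(
        api_key=env.get("OPENAI_API_KEY"),
        config_path=Path(env.get("SWARM_CONFIG", "swarm.yaml")),
        output_dir=Path(env.get("SWARM_OUTPUT_DIR", "./reports")),
        # Only explicit truthy strings enable auto-apply; anything else is off.
        auto_apply=flag in {"1", "true", "yes"},
    )


settings = load_settings({"SWARM_AUTO_APPLY": "TRUE"})
print(settings.auto_apply, settings.output_dir)
```

Treating every unrecognized value as "off" is a deliberate choice here, since auto-applying patches is the riskier default.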

Roadmap

Q2 2025
GitHub Action - Automated PR comments
VS Code Extension - Real-time analysis in IDE
JavaScript/TypeScript Support - Full AST parsing
Enterprise SSO - SAML/OAuth integration

Q3 2025
Machine Learning Model - Custom bug prediction
Cloud Cost Estimator - AWS/GCP cost analysis
Interactive Playground - Fix vulnerabilities in-browser
Team Dashboard - Organization-wide metrics

Q4 2025
Self-Healing Mode - Auto-fix on commit
Multi-repo Analysis - Microservices architecture view
Custom Agent SDK - Build your own agents
Enterprise Edition - On-premise deployment

Contributing

We love contributions! Here's how to help:

Quick Start for Contributors

git clone https://github.com/KunjShah01/codebase-swarm.git
cd codebase-swarm
pip install -r requirements-dev.txt
pre-commit install

Adding a New Agent

# Create swarm/agents/custom_agent.py
from swarm.agents.base_agent import BaseAgent

class CustomAgent(BaseAgent):
    def execute(self, task, context):
        # Your logic here
        return {"custom_metric": 42}

Running Tests

pytest tests/ --cov=swarm --cov-report=html

License

MIT License - see LICENSE file for details.

Acknowledgments

OpenAI - GPT-4 for agent intelligence
Tree-sitter - Blazing-fast AST parsing
Rich - Beautiful CLI interfaces
Streamlit - Interactive dashboards
All contributors - Making this better every day

Support

Discord: Join our community
GitHub Issues: Report bugs
Documentation: Full docs
Email: kunjkshahdeveloper@gmail.com

About

Because your codebase deserves a team of experts

