Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 4 additions & 4 deletions .claude/commands/review-pr.md
Original file line number Diff line number Diff line change
Expand Up @@ -55,10 +55,10 @@ Pull request(s): $ARGUMENTS
- Reviewing large, unfocused PRs is impractical and error-prone; the review cannot provide adequate assurance for such changes

6. **Vision Alignment Check**
- Read the project's README.md and CLAUDE.md to understand the application's core purpose
- Assess whether this PR aligns with the application's intended functionality
- If the changes deviate significantly from the core vision or add functionality that doesn't serve the application's purpose, note this in the review
- This is not a blocker, but should be flagged for the reviewer's consideration
- **VISION.md protection**: First, check whether the PR diff modifies `VISION.md` in any way (edits, deletions, renames). If it does, **stop the review immediately** — verdict is **DON'T MERGE**. VISION.md is immutable and no PR is permitted to alter it. Explain this to the user and skip all remaining steps.
- Read the project's `VISION.md`, `README.md`, and `CLAUDE.md` to understand the application's core purpose and mandatory architectural constraints
- Assess whether this PR aligns with the vision defined in `VISION.md`
- **Vision deviation is a merge blocker.** If the PR introduces functionality, integrations, or architectural changes that conflict with `VISION.md`, the verdict must be **DON'T MERGE**. This is not negotiable — the vision document takes precedence over any PR rationale.

7. **Safety Assessment**
- Provide a review on whether the PR is safe to merge as-is
Expand Down
18 changes: 18 additions & 0 deletions .claude/launch.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
{
"version": "0.0.1",
"configurations": [
{
"name": "backend",
"runtimeExecutable": "python",
"runtimeArgs": ["-m", "uvicorn", "server.main:app", "--host", "127.0.0.1", "--port", "8888", "--reload"],
"port": 8888
},
{
"name": "frontend",
"runtimeExecutable": "cmd",
"runtimeArgs": ["/c", "cd ui && npx vite"],
"port": 5173
}
],
"autoVerify": true
}
22 changes: 22 additions & 0 deletions VISION.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# VISION

This document defines the mandatory project vision for AutoForge. All contributions must align with these principles. PRs that deviate from this vision will be rejected. This file itself is immutable via PR — any PR that modifies VISION.md will be rejected outright.

## Claude Agent SDK Exclusivity

AutoForge is a wrapper around the **Claude Agent SDK**. This is a foundational architectural decision, not a preference.

**What this means:**

- AutoForge only supports providers, models, and integrations that work through the Claude Agent SDK.
- We will not integrate with, accommodate, or add support for other AI SDKs, CLIs, or coding agent platforms (e.g., Codex, OpenCode, Aider, Continue, Cursor agents, or similar tools).

**Why:**

Each platform has its own approach to MCP tools, skills, context management, and feature integration. Attempting to support multiple agent frameworks creates an unsustainable maintenance burden and dilutes the quality of the core experience. By committing to the Claude Agent SDK exclusively, we can build deep, reliable integration rather than shallow compatibility across many targets.

**In practice:**

- PRs adding support for non-Claude agent frameworks will be rejected.
- PRs introducing abstractions designed to make AutoForge "agent-agnostic" will be rejected.
- Alternative API providers (e.g., Vertex AI, AWS Bedrock) are acceptable only when accessed through the Claude Agent SDK's own configuration.
2 changes: 1 addition & 1 deletion package.json
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"name": "autoforge-ai",
"version": "0.1.13",
"version": "0.1.14",
"description": "Autonomous coding agent with web UI - build complete apps with AI",
"license": "AGPL-3.0",
"bin": {
Expand Down
2 changes: 1 addition & 1 deletion requirements-prod.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Production runtime dependencies only
# For development, use requirements.txt (includes ruff, mypy, pytest)
claude-agent-sdk>=0.1.0,<0.2.0
claude-agent-sdk>=0.1.39,<0.2.0
python-dotenv>=1.0.0
sqlalchemy>=2.0.0
fastapi>=0.115.0
Expand Down
2 changes: 1 addition & 1 deletion requirements.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
claude-agent-sdk>=0.1.0,<0.2.0
claude-agent-sdk>=0.1.39,<0.2.0
python-dotenv>=1.0.0
sqlalchemy>=2.0.0
fastapi>=0.115.0
Expand Down
95 changes: 64 additions & 31 deletions server/services/assistant_chat_session.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
but cannot modify any files.
"""

import asyncio
import json
import logging
import os
Expand All @@ -25,7 +26,12 @@
create_conversation,
get_messages,
)
from .chat_constants import ROOT_DIR
from .chat_constants import (
MAX_CHAT_RATE_LIMIT_RETRIES,
ROOT_DIR,
calculate_rate_limit_backoff,
check_rate_limit_error,
)

# Load environment variables from .env file if present
load_dotenv()
Expand Down Expand Up @@ -393,39 +399,66 @@ async def _query_claude(self, message: str) -> AsyncGenerator[dict, None]:

full_response = ""

# Stream the response
async for msg in self.client.receive_response():
msg_type = type(msg).__name__

if msg_type == "AssistantMessage" and hasattr(msg, "content"):
for block in msg.content:
block_type = type(block).__name__

if block_type == "TextBlock" and hasattr(block, "text"):
text = block.text
if text:
full_response += text
yield {"type": "text", "content": text}

elif block_type == "ToolUseBlock" and hasattr(block, "name"):
tool_name = block.name
tool_input = getattr(block, "input", {})
# Stream the response (with rate-limit retry)
for _attempt in range(MAX_CHAT_RATE_LIMIT_RETRIES + 1):
try:
async for msg in self.client.receive_response():
msg_type = type(msg).__name__

if msg_type == "AssistantMessage" and hasattr(msg, "content"):
for block in msg.content:
block_type = type(block).__name__

if block_type == "TextBlock" and hasattr(block, "text"):
text = block.text
if text:
full_response += text
yield {"type": "text", "content": text}

elif block_type == "ToolUseBlock" and hasattr(block, "name"):
tool_name = block.name
tool_input = getattr(block, "input", {})

# Intercept ask_user tool calls -> yield as question message
if tool_name == "mcp__features__ask_user":
questions = tool_input.get("questions", [])
if questions:
yield {
"type": "question",
"questions": questions,
}
continue

# Intercept ask_user tool calls -> yield as question message
if tool_name == "mcp__features__ask_user":
questions = tool_input.get("questions", [])
if questions:
yield {
"type": "question",
"questions": questions,
"type": "tool_call",
"tool": tool_name,
"input": tool_input,
}
continue

yield {
"type": "tool_call",
"tool": tool_name,
"input": tool_input,
}
# Completed successfully — break out of retry loop
break
except Exception as exc:
is_rate_limit, retry_secs = check_rate_limit_error(exc)
if is_rate_limit and _attempt < MAX_CHAT_RATE_LIMIT_RETRIES:
delay = retry_secs if retry_secs else calculate_rate_limit_backoff(_attempt)
logger.warning(f"Rate limited (attempt {_attempt + 1}/{MAX_CHAT_RATE_LIMIT_RETRIES}), retrying in {delay}s")
yield {
"type": "rate_limited",
"retry_in": delay,
"attempt": _attempt + 1,
"max_attempts": MAX_CHAT_RATE_LIMIT_RETRIES,
}
await asyncio.sleep(delay)
await self.client.query(message)
continue
if is_rate_limit:
logger.error("Rate limit retries exhausted for assistant chat")
yield {"type": "error", "content": "Rate limited. Please try again later."}
return
# Non-rate-limit MessageParseError: log and break (don't crash)
if type(exc).__name__ == "MessageParseError":
logger.warning(f"Ignoring unrecognized message from Claude CLI: {exc}")
break
raise

# Store the complete response in the database
if full_response and self.conversation_id:
Expand Down
40 changes: 40 additions & 0 deletions server/services/chat_constants.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
imports (``from .chat_constants import API_ENV_VARS``) continue to work.
"""

import logging
import sys
from pathlib import Path
from typing import AsyncGenerator
Expand All @@ -32,6 +33,45 @@
# imports continue to work unchanged.
# -------------------------------------------------------------------
from env_constants import API_ENV_VARS # noqa: E402, F401
from rate_limit_utils import calculate_rate_limit_backoff, is_rate_limit_error, parse_retry_after # noqa: E402, F401

logger = logging.getLogger(__name__)

# -------------------------------------------------------------------
# Rate-limit handling for chat sessions
# -------------------------------------------------------------------
MAX_CHAT_RATE_LIMIT_RETRIES = 3


def check_rate_limit_error(exc: Exception) -> tuple[bool, int | None]:
"""Inspect an exception and determine if it represents a rate-limit.

Returns ``(is_rate_limit, retry_seconds)``. ``retry_seconds`` is the
parsed Retry-After value when available, otherwise ``None`` (caller
should use exponential backoff).

Handles:
- ``MessageParseError`` whose raw *data* dict has
``type == "rate_limit_event"`` (Claude CLI sends this).
- Any exception whose string representation matches known rate-limit
patterns (via ``rate_limit_utils.is_rate_limit_error``).
"""
exc_str = str(exc)

# Check for MessageParseError with a rate_limit_event payload
cls_name = type(exc).__name__
if cls_name == "MessageParseError":
raw_data = getattr(exc, "data", None)
if isinstance(raw_data, dict) and raw_data.get("type") == "rate_limit_event":
retry = parse_retry_after(str(raw_data)) if raw_data else None
return True, retry

# Fallback: match error text against known rate-limit patterns
if is_rate_limit_error(exc_str):
retry = parse_retry_after(exc_str)
return True, retry

return False, None


async def make_multimodal_message(content_blocks: list[dict]) -> AsyncGenerator[dict, None]:
Expand Down
85 changes: 67 additions & 18 deletions server/services/expand_chat_session.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,13 @@
from dotenv import load_dotenv

from ..schemas import ImageAttachment
from .chat_constants import ROOT_DIR, make_multimodal_message
from .chat_constants import (
MAX_CHAT_RATE_LIMIT_RETRIES,
ROOT_DIR,
calculate_rate_limit_backoff,
check_rate_limit_error,
make_multimodal_message,
)

# Load environment variables from .env file if present
load_dotenv()
Expand Down Expand Up @@ -298,24 +304,67 @@ async def _query_claude(
else:
await self.client.query(message)

# Stream the response
async for msg in self.client.receive_response():
msg_type = type(msg).__name__

if msg_type == "AssistantMessage" and hasattr(msg, "content"):
for block in msg.content:
block_type = type(block).__name__

if block_type == "TextBlock" and hasattr(block, "text"):
text = block.text
if text:
yield {"type": "text", "content": text}

self.messages.append({
"role": "assistant",
"content": text,
"timestamp": datetime.now().isoformat()
# Stream the response (with rate-limit retry)
for _attempt in range(MAX_CHAT_RATE_LIMIT_RETRIES + 1):
try:
async for msg in self.client.receive_response():
msg_type = type(msg).__name__

if msg_type == "AssistantMessage" and hasattr(msg, "content"):
for block in msg.content:
block_type = type(block).__name__

if block_type == "TextBlock" and hasattr(block, "text"):
text = block.text
if text:
yield {"type": "text", "content": text}

self.messages.append({
"role": "assistant",
"content": text,
"timestamp": datetime.now().isoformat()
})
# Completed successfully — break out of retry loop
break
except Exception as exc:
is_rate_limit, retry_secs = check_rate_limit_error(exc)
if is_rate_limit and _attempt < MAX_CHAT_RATE_LIMIT_RETRIES:
delay = retry_secs if retry_secs else calculate_rate_limit_backoff(_attempt)
logger.warning(f"Rate limited (attempt {_attempt + 1}/{MAX_CHAT_RATE_LIMIT_RETRIES}), retrying in {delay}s")
yield {
"type": "rate_limited",
"retry_in": delay,
"attempt": _attempt + 1,
"max_attempts": MAX_CHAT_RATE_LIMIT_RETRIES,
}
await asyncio.sleep(delay)
# Re-send the query before retrying receive_response
if attachments and len(attachments) > 0:
content_blocks_retry: list[dict[str, Any]] = []
if message:
content_blocks_retry.append({"type": "text", "text": message})
for att in attachments:
content_blocks_retry.append({
"type": "image",
"source": {
"type": "base64",
"media_type": att.mimeType,
"data": att.base64Data,
}
})
await self.client.query(make_multimodal_message(content_blocks_retry))
else:
await self.client.query(message)
continue
if is_rate_limit:
logger.error("Rate limit retries exhausted for expand chat")
yield {"type": "error", "content": "Rate limited. Please try again later."}
return
# Non-rate-limit MessageParseError: log and break (don't crash)
if type(exc).__name__ == "MessageParseError":
logger.warning(f"Ignoring unrecognized message from Claude CLI: {exc}")
break
raise

def get_features_created(self) -> int:
"""Get the total number of features created in this session."""
Expand Down
Loading
Loading