Merged
46 commits
02e5210
docs
dimavrem22 Feb 19, 2026
e0ba13a
Add DOM specialist agent with meta/script/hidden-input scanning
dimavrem22 Feb 19, 2026
4abcac7
Checkpoint: exploration pipeline complete with clean specialist separ…
dimavrem22 Feb 20, 2026
9f8d0dd
rm some file
dimavrem22 Feb 20, 2026
24ac6af
wild and crazy orchestation vibe coded from top to bottom!
dimavrem22 Feb 20, 2026
2be9b97
is the the first successful run i seee?? even if it costs a a gazilli…
dimavrem22 Feb 20, 2026
5705442
we got auth to work! lets goooo
dimavrem22 Feb 21, 2026
57de81a
extra errors added to context
dimavrem22 Feb 21, 2026
9700b47
checkpoint
dimavrem22 Feb 21, 2026
738707c
doc review: fix inaccuracies in api_indexing_spec, add spec_v2 improv…
dimavrem22 Feb 22, 2026
4be4bda
clarify PI data loader access: holds loaders but passes through to wo…
dimavrem22 Feb 22, 2026
e0162d8
clarify PI context: agent docs are on-demand file reads, not pre-loaded
dimavrem22 Feb 22, 2026
df471b2
fix worker context docs: no exploration summaries; add proven artifac…
dimavrem22 Feb 22, 2026
75434a9
docs chpt
dimavrem22 Feb 22, 2026
9cade36
add When column to PI context table; add deferred routine planning im…
dimavrem22 Feb 23, 2026
9705851
add screenshot + OCR improvements (#6) to spec_v2 potential improvements
dimavrem22 Feb 23, 2026
1d17972
fix RoutineInspector context table: add spec comparison, doc tools, c…
dimavrem22 Feb 23, 2026
f33363d
add potential improvements #7-13: inspector context, proven artifacts…
dimavrem22 Feb 23, 2026
7898515
add complete agent prompts reference (docs/prompts.md)
dimavrem22 Feb 23, 2026
02d589e
add cleaned potential improvements: deduplicate, organize by theme, a…
dimavrem22 Feb 23, 2026
4f71c64
workspace, abstract agent, refactored. beta discovery removed
dimavrem22 Feb 24, 2026
904363b
stricter-code-sandboxing
dimavrem22 Feb 24, 2026
5e01885
network-specailists uses new workspace and tools
dimavrem22 Feb 24, 2026
1b97d16
mounting data for specialists done
dimavrem22 Feb 25, 2026
44b7c46
fixed bluebox and made agent_workspace
dimavrem22 Feb 25, 2026
7aee2e5
i think the massive agent refactor is now done!!!
dimavrem22 Feb 25, 2026
e7f30bf
pipeline looks to start working?
dimavrem22 Feb 26, 2026
341bf9e
we re soooaring! flyyyyinggit add .
dimavrem22 Feb 26, 2026
6d8a806
cleaned up workers, inspectors
dimavrem22 Feb 27, 2026
4fce7ee
test: add unit tests for API indexing pipeline data models
dimavrem22 Feb 27, 2026
e8a9965
refactor: extract _browser_execute helper to eliminate browser tool b…
dimavrem22 Feb 27, 2026
c49c0b5
docs finishing
dimavrem22 Feb 27, 2026
67002c3
docs
dimavrem22 Feb 27, 2026
78c1631
new diagrams
dimavrem22 Feb 27, 2026
c38753b
new diagrams
dimavrem22 Feb 27, 2026
79bffea
abs agent cleanup, workspace cleanup, bluebox cleanup, aliases removed
dimavrem22 Feb 27, 2026
3bad643
fix: remove references to non-existent follow_up tool in PI system pr…
dimavrem22 Feb 27, 2026
061a419
more cleanup
dimavrem22 Feb 27, 2026
34a64a3
docs: update CLAUDE.md for agent refactor, workspace, and API indexin…
dimavrem22 Feb 27, 2026
932ace4
uni tests
dimavrem22 Feb 27, 2026
ad39e52
anthropic models for api indexing
dimavrem22 Feb 27, 2026
54148f5
refactor: merge run_python_code into execute_python
dimavrem22 Feb 27, 2026
5e30089
fix: track all writable workspace dirs in execute_python, not just ou…
dimavrem22 Feb 27, 2026
2a5971a
fix: remove braindead _after_chat_added, inline callback in _add_chat
dimavrem22 Feb 27, 2026
5a30a5d
fix: snapshot entire workspace for file-tracking, not just writable dirs
dimavrem22 Feb 27, 2026
51cbdd9
claude out
dimavrem22 Feb 27, 2026
27 changes: 0 additions & 27 deletions .github/workflows/claude-code-review.yml

This file was deleted.

5 changes: 4 additions & 1 deletion .gitignore
@@ -222,4 +222,7 @@ downloads/
benchmarks/
routine_output/
bluebox_workspace/
-api_indexing_output/
+api_indexing_output/
+api_indexing_output*/
+agent_workspace/
+agent_workspace*/
62 changes: 54 additions & 8 deletions CLAUDE.md
@@ -24,7 +24,8 @@ This file provides context and guidelines for working with the bluebox codebase.
- `bluebox-monitor --host 127.0.0.1 --port 9222 --output-dir ./cdp_captures --url about:blank --incognito` - Start browser monitoring
- `bluebox-discover --task "your task description" --cdp-captures-dir ./cdp_captures --output-dir ./routine_discovery_output --llm-model gpt-5.2` - Discover routines from captures
- `bluebox-execute --routine-path example_data/example_routines/amtrak_one_way_train_search_routine.json --parameters-path example_data/example_routines/amtrak_one_way_train_search_input.json` - Execute a routine
- `bluebox-agent-adapter --agent RoutineDiscoveryAgentBeta --cdp-captures-dir ./cdp_captures` - Start HTTP adapter for programmatic agent interaction (see Agent HTTP Adapter section below)
- `bluebox-api-index --cdp-captures-dir ./cdp_captures --task "your task" --output-dir ./api_indexing_output --model gpt-5.2 --post-run-analysis` - Run the API indexing pipeline (exploration + routine construction)
- `bluebox-agent-adapter --agent NetworkSpecialist --cdp-captures-dir ./cdp_captures` - Start HTTP adapter for programmatic agent interaction (see Agent HTTP Adapter section below)
- `bluebox-agent-adapter --list-agents` - List all available agents and their required data

### Chrome Debug Mode
@@ -107,23 +108,35 @@ This file provides context and guidelines for working with the bluebox codebase.
- `bluebox/utils/js_utils.py` - JavaScript code generation
- `bluebox/utils/web_socket_utils.py` - WebSocket utilities for CDP
- `bluebox/sdk/client.py` - Main SDK client
- `bluebox/workspace.py` - Agent workspace (artifact-oriented file I/O with provenance tracking)

### Agents

AI agents that power routine discovery and conversational interactions:
AI agents that power routine discovery, API indexing, and conversational interactions. All agents inherit from `AbstractAgent` (`bluebox/agents/abstract_agent.py`).

**Core agents:**
- `bluebox/agents/routine_discovery_agent.py` - Analyzes CDP captures to generate routines (identifies transactions, extracts/resolves variables, constructs operations)
- `bluebox/agents/guide_agent.py` - Conversational agent for guiding users through routine creation/editing (maintains chat history, dynamic tool registration)
- `bluebox/agents/bluebox_agent.py` - General-purpose conversational agent

**API Indexing Pipeline agents:**
- `bluebox/agents/principal_investigator.py` - Orchestrator: plans routine catalog, dispatches experiments to workers, reviews results, assembles and ships routines
- `bluebox/agents/workers/experiment_worker.py` - Browser-capable execution agent: live browser tools + recorded capture lookup tools, executes experiments
- `bluebox/agents/routine_inspector.py` - Independent quality gate: scores routines on 6 dimensions, hard-fails on 4xx/5xx or unresolved placeholders
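
The two hard-fail conditions named for the inspector (a 4xx/5xx response, or a placeholder that was never substituted) can be illustrated with a minimal sketch. This is not the `RoutineInspector` implementation: `hard_fail_reasons`, its arguments, and the shape of `resolved_request` are hypothetical names chosen for illustration, and the six scoring dimensions are not modelled here.

```python
import json
import re

# Hypothetical helper, not part of bluebox. It mirrors only the two
# hard-fail conditions described above.
PLACEHOLDER_RE = re.compile(r"\{\{\s*[^{}]+?\s*\}\}")


def hard_fail_reasons(status_code: int, resolved_request: dict) -> list:
    """Return the reasons a routine attempt should be rejected outright."""
    reasons = []
    if 400 <= status_code <= 599:
        reasons.append(f"HTTP {status_code}")
    # Serialize the request so placeholders nested anywhere are found.
    leftover = PLACEHOLDER_RE.findall(json.dumps(resolved_request))
    if leftover:
        reasons.append("unresolved placeholders: " + ", ".join(sorted(set(leftover))))
    return reasons
```

An empty list means neither hard-fail condition was hit; any scoring would happen separately.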

**Specialists** (domain-specific agents for exploration):
- `bluebox/agents/specialists/network_specialist.py` - Network traffic analysis
- `bluebox/agents/specialists/dom_specialist.py` - DOM structure analysis
- `bluebox/agents/specialists/interaction_specialist.py` - UI interaction analysis
- `bluebox/agents/specialists/js_specialist.py` - JavaScript file analysis
- `bluebox/agents/specialists/value_trace_resolver_specialist.py` - Storage & window property analysis

**Agent HTTP Adapter** (`bluebox/scripts/agent_http_adapter.py`):

HTTP wrapper that exposes any `AbstractAgent` (or `AbstractSpecialist`) subclass as a JSON API, enabling programmatic interaction via curl. Agents are auto-discovered at runtime — adding a new `AbstractSpecialist` subclass makes it available with zero adapter changes.
HTTP wrapper that exposes any `AbstractAgent` subclass as a JSON API, enabling programmatic interaction via curl. Agents are auto-discovered at runtime — adding a new `AbstractAgent` subclass makes it available with zero adapter changes.

```bash
# Start adapter (default: RoutineDiscoveryAgentBeta)
bluebox-agent-adapter --cdp-captures-dir ./cdp_captures --port 8765 -q

# Or pick a specific agent
# Start adapter with a specific agent
bluebox-agent-adapter --agent NetworkSpecialist --cdp-captures-dir ./cdp_captures

# Agents with no data requirements (e.g. BlueBoxAgent) don't need --cdp-captures-dir
@@ -134,7 +147,7 @@ Endpoints:
- `GET /health` — liveness check
- `GET /status` — agent type, chat state, discovery support
- `POST /chat {"message": "..."}` — send a chat message (all agents)
- `POST /discover {"task": "..."}` — run discovery/autonomous mode (specialists + RoutineDiscoveryAgentBeta)
- `POST /discover {"task": "..."}` — run discovery/autonomous mode
- `GET /routine` — retrieve discovered routine JSON
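
For scripted use, the endpoints listed above can also be called from Python instead of curl. The sketch below uses only the standard library. It assumes the adapter is listening on `127.0.0.1:8765` (the port from the `--port 8765` example) and that responses are JSON; `build_request` and `call` are illustrative helper names, not part of bluebox.

```python
import json
import urllib.request
from typing import Optional

# Assumption: host and port of a locally running adapter.
BASE_URL = "http://127.0.0.1:8765"


def build_request(path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET (no payload) or JSON POST request for an adapter endpoint."""
    if payload is None:
        return urllib.request.Request(BASE_URL + path, method="GET")
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(path: str, payload: Optional[dict] = None, timeout: float = 300.0):
    """Send the request and decode the response body (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the adapter running, `call("/health")` checks liveness and `call("/chat", {"message": "..."})` sends a chat message. The long default timeout is a guess that agent calls can be slow.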

**Best practices when calling from Claude Code or scripts:**
@@ -147,6 +160,7 @@ Endpoints:
**LLM Infrastructure:**
- `bluebox/llms/data_loaders/` - Specialized data loaders for CDP capture analysis:
- `NetworkDataLoader` - HTTP request/response transactions
- `DOMDataLoader` - DOM snapshots (string-interning tables, element classification by tag family)
- `JSDataLoader` - JavaScript files
- `StorageDataLoader` - Cookies, localStorage, sessionStorage, IndexedDB
- `WindowPropertyDataLoader` - Window property changes
@@ -156,18 +170,50 @@ Endpoints:

**Import patterns:**
```python
from bluebox.agents.abstract_agent import AbstractAgent, agent_tool, AgentCard
from bluebox.agents.guide_agent import GuideAgent
from bluebox.agents.routine_discovery_agent import RoutineDiscoveryAgent
from bluebox.agents.principal_investigator import PrincipalInvestigator
from bluebox.agents.workers.experiment_worker import ExperimentWorker
from bluebox.agents.routine_inspector import RoutineInspector
from bluebox.workspace import AgentWorkspace, LocalAgentWorkspace
from bluebox.llms.data_loaders.network_data_loader import NetworkDataLoader
from bluebox.llms.data_loaders.dom_data_loader import DOMDataLoader
from bluebox.llms.data_loaders.js_data_loader import JSDataLoader
```

### Workspace

The workspace (`bluebox/workspace.py`) is an artifact-oriented file I/O system attached to agents. Each workspace has a strict directory layout:

- `raw/` (read-only): tool result artifacts and mounted external files
- `output/`: agent-generated deliverables
- `context/`: reusable notes/context saved for later use in the same run
- `meta/`: system-managed metadata (`manifest.jsonl`, `input_mounts.jsonl`) — not editable
- `scratch/`: ephemeral scratch space

External files (e.g. CDP capture JSONL) can be mounted into `raw/` via hardlinks using `attach_input_file()`. The `save_artifact()` API records provenance in `meta/manifest.jsonl` (SHA-256, size, content type, timestamp).
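
To make the provenance and mounting ideas concrete, here is a standalone sketch of what recording an artifact and hardlinking an input could look like. It does not use or reproduce the `AgentWorkspace` API: `record_provenance`, `mount_input`, and the manifest field names are assumptions for illustration, and the real `save_artifact()` / `attach_input_file()` signatures and schema may differ.

```python
import hashlib
import json
import mimetypes
import os
from datetime import datetime, timezone
from pathlib import Path


def record_provenance(workspace: Path, artifact: Path) -> dict:
    """Append one provenance entry for `artifact` to meta/manifest.jsonl."""
    data = artifact.read_bytes()
    entry = {
        # Field names are illustrative; the real manifest schema may differ.
        "path": artifact.relative_to(workspace).as_posix(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "content_type": mimetypes.guess_type(artifact.name)[0] or "application/octet-stream",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    meta = workspace / "meta"
    meta.mkdir(parents=True, exist_ok=True)
    with (meta / "manifest.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry


def mount_input(workspace: Path, source: Path) -> Path:
    """Hardlink an external file into raw/ so no bytes are copied."""
    raw = workspace / "raw"
    raw.mkdir(parents=True, exist_ok=True)
    target = raw / source.name
    os.link(source, target)  # source and workspace must share a filesystem
    return target
```

An append-only JSONL manifest keeps each record independent, so a partially written run still leaves earlier entries readable.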

### API Indexing Pipeline

End-to-end pipeline (`bluebox-api-index`) that turns raw CDP captures into a catalog of executable routines.

**Phase 1 — Exploration** (4 specialists in parallel): Network, Storage, DOM, and UI specialists each produce a structured exploration summary.

**Phase 2 — Routine Construction**: PrincipalInvestigator reads summaries, dispatches ExperimentWorker agents, reviews results, assembles routines, submits to RoutineInspector for quality gating. Incremental persistence to disk. PI crash recovery via DiscoveryLedger.

**Data models:**
- `bluebox/data_models/orchestration/` - `DiscoveryLedger`, `ExperimentEntry`, `RoutineSpec`, `RoutineAttempt`, `RoutineCatalog`, `RoutineInspectionResult`
- `bluebox/data_models/api_indexing/` - `NetworkExplorationSummary`, `StorageExplorationSummary`, `DOMExplorationSummary`, `UIExplorationSummary`

### Important Patterns

- **Routine Execution**: Operations execute sequentially, maintaining state via `RoutineExecutionContext`
- **Placeholder Resolution**: All parameters use `{{paramName}}` format; `Parameter.type` drives coercion at runtime
- **Session Storage**: Use `session_storage_key` to store and retrieve data between operations
- **CDP Sessions**: Use flattened sessions for multiplexing via `session_id`
- **Agent Tools**: Decorate with `@agent_tool()`. Supports `persist` (`NEVER`/`ALWAYS`/`OVERFLOW`), `max_characters`, and `token_optimized` parameters
- **Agent Card**: Every concrete `AbstractAgent` subclass must declare an `AGENT_CARD`
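
The placeholder-resolution pattern above can be sketched as follows. This is a simplified illustration, not the bluebox resolver: the `resolve` function, the `COERCERS` table, and the type names (`string`, `integer`, `number`, `boolean`) are assumptions, and the actual `Parameter.type` values and coercion rules may differ.

```python
import re

PLACEHOLDER_RE = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")

# Hypothetical type names; the real Parameter.type values may differ.
COERCERS = {
    "string": str,
    "integer": int,
    "number": float,
    "boolean": lambda v: str(v).strip().lower() in {"1", "true", "yes"},
}


def resolve(template: str, values: dict, types: dict):
    """Substitute {{name}} placeholders, coercing each value by declared type."""

    def coerce(name: str):
        if name not in values:
            raise KeyError(f"unresolved placeholder: {name}")
        return COERCERS[types.get(name, "string")](values[name])

    whole = PLACEHOLDER_RE.fullmatch(template)
    if whole:
        # A template that is exactly one placeholder keeps its coerced type.
        return coerce(whole.group(1))
    # Otherwise the coerced values are interpolated into the string.
    return PLACEHOLDER_RE.sub(lambda m: str(coerce(m.group(1))), template)
```

Under these assumptions, `resolve("{{page}}", {"page": "2"}, {"page": "integer"})` yields the integer `2`, while a placeholder embedded in a URL is interpolated as text.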

### Common Gotchas

2 changes: 1 addition & 1 deletion README.md
@@ -103,7 +103,7 @@ bluebox-agent --context-file path/to/agent_context.json

## Create your own routines

-To learn about the core technology powering BlueBox, see [routine_discovery.md](routine_discovery.md).
+To learn about the core technology powering BlueBox, see [routine_discovery.md](docs/routine_discovery.md).

## Contributing 🤝

101 changes: 101 additions & 0 deletions bluebox/agent_docs/common-issues/cors-failed-to-fetch.md
@@ -0,0 +1,101 @@
# Fetch Fails with TypeError: Failed to fetch (CORS)

> Fetch operations fail with "TypeError: Failed to fetch" when the browser's origin doesn't match the API server's CORS `Access-Control-Allow-Origin` header. Fix by adding a `navigate` operation to the allowed origin before any `fetch`. Related: [fetch.md](../operations/fetch.md), [navigation.md](../operations/navigation.md)

**Symptom:** Fetch operation returns `TypeError: Failed to fetch` or the response data is `null`/empty despite the endpoint working in experiments.

**Root Cause:** The routine executor starts from `about:blank` (origin = `null`). Many APIs restrict CORS to their own website origin. For example, `api.nasdaq.com` only allows requests from origin `https://www.nasdaq.com`. Without a `navigate` operation first, the browser's origin is `null` and every `fetch` is blocked by CORS.
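
The origin mismatch described above can be reproduced outside the browser with a small sketch. It is a simplification of the browser's check: it compares a page's serialized origin to a single `Access-Control-Allow-Origin` value and ignores credentials mode and preflight handling. `origin_of` and `cors_allows` are illustrative names, not bluebox functions.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> str:
    """Serialize a URL's origin roughly as a browser does for CORS."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS or not parts.hostname:
        return "null"  # about:blank, data:, file: and similar are opaque origins
    port = parts.port
    if port is None or port == DEFAULT_PORTS[parts.scheme]:
        return f"{parts.scheme}://{parts.hostname}"
    return f"{parts.scheme}://{parts.hostname}:{port}"


def cors_allows(page_url: str, allow_origin_header: str) -> bool:
    """True if a page at `page_url` satisfies Access-Control-Allow-Origin."""
    return allow_origin_header == "*" or allow_origin_header == origin_of(page_url)
```

For the nasdaq case, `cors_allows("about:blank", "https://www.nasdaq.com")` is false, while the same check from any `https://www.nasdaq.com/...` page is true.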

**How to detect:** If an experiment confirmed the API works from the site's origin (e.g. `browser_eval_js(fetch(...))` succeeded after navigating to `www.example.com`) but the routine's `fetch` operation fails with `TypeError: Failed to fetch`, the routine is missing a `navigate` step.

**Solutions:**

| Problem | Fix |
|---------|-----|
| API only allows the website's origin (e.g. `api.example.com` allows `https://www.example.com`) | Add `navigate` to the allowed origin before `fetch` |
| API requires `Origin`/`Referer` headers | Add `"Origin"` and `"Referer"` to fetch headers |
| API is on the same domain as the website | Add `navigate` to the website URL first |
| Cloudflare/WAF blocks CORS preflight (OPTIONS → 403) | Set `"credentials": "omit"` on the fetch endpoint so the browser sends no cookies; this often gets the request past the block. Works for public APIs that don't need cookies |
| All else fails | Use `js_evaluate` with `fetch()` instead of a `fetch` operation — JS fetch from the navigated page context has the correct origin |

**RULE:** Every routine that calls an external API SHOULD start with a `navigate` operation to establish the correct browser origin. This is cheap (one page load) and prevents CORS issues.

**Example: Navigate to allowed origin, then fetch from API subdomain**
```json
[
{"type": "navigate", "url": "https://www.example.com"},
{
"type": "fetch",
"endpoint": {
"url": "https://api.example.com/api/data?q={{query}}",
"method": "GET",
"headers": {
"Accept": "application/json, text/plain, */*"
}
},
"session_storage_key": "result"
},
{"type": "return", "session_storage_key": "result"}
]
```

**Example: Navigate + auth token + data fetch (common pattern)**
```json
[
{"type": "navigate", "url": "https://www.example.com"},
{
"type": "fetch",
"endpoint": {
"url": "https://api.example.com/api/token",
"method": "POST",
"headers": {"Content-Type": "application/json"},
"body": {"applicationName": "web"}
},
"session_storage_key": "auth_response"
},
{
"type": "js_evaluate",
"expression": "(function(){ var r = JSON.parse(sessionStorage.getItem('auth_response')); return r.data.token; })()",
"session_storage_key": "bearer_token"
},
{
"type": "fetch",
"endpoint": {
"url": "https://api.example.com/api/data",
"method": "GET",
"headers": {
"Authorization": "Bearer {{sessionStorage.bearer_token}}",
"Accept": "application/json"
}
},
"session_storage_key": "data_result"
},
{"type": "return", "session_storage_key": "data_result"}
]
```
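
The navigate-first rule can be checked mechanically before a routine is run. The sketch below assumes a routine is a list of operation dicts shaped like the JSON examples above, each with a `type` key; `missing_navigate` is a hypothetical helper, not part of bluebox.

```python
def missing_navigate(operations: list) -> bool:
    """True if a `fetch` operation appears before any `navigate` operation."""
    for op in operations:
        op_type = op.get("type")
        if op_type == "navigate":
            return False  # origin is established before the first fetch
        if op_type == "fetch":
            return True  # fetch would run from about:blank (origin null)
    return False  # no fetch at all, so nothing to flag
```

This covers only the ordering. It does not verify that the navigated URL is an origin the API actually allows.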

**Cloudflare / WAF Blocking Preflight Requests**

Some APIs behind Cloudflare or other WAFs block CORS preflight (OPTIONS) requests with 403. This tends to show up when a fetch made with `credentials: "include"` is preceded by a preflight that Cloudflare rejects. The captured network data will show OPTIONS requests returning 403 with `server: cloudflare` and `content-type: text/html`.

**Fix:** If the API does NOT require cookies or session auth, set `"credentials": "omit"` on the fetch endpoint. This tells the browser NOT to send cookies, which often gets the request past the Cloudflare block. Note that browsers also send a preflight for non-safelisted request headers (for example `Content-Type: application/json`) and for methods other than GET, HEAD, and POST, so `credentials: "omit"` alone may not remove the OPTIONS request.

**When to try this:** The experiment shows `TypeError: Failed to fetch` AND the captured network data shows OPTIONS preflight returning 403 from Cloudflare. Try `credentials: "omit"` first — many public search/listing APIs work without cookies.

```json
[
{"type": "navigate", "url": "https://www.example.com"},
{
"type": "fetch",
"endpoint": {
"url": "https://api.example.com/search",
"method": "POST",
"headers": {"Content-Type": "application/json", "Accept": "application/json"},
"body": {"query": "{{search_term}}", "page": "{{page}}"},
"credentials": "omit"
},
"session_storage_key": "search_result"
},
{"type": "return", "session_storage_key": "search_result"}
]
```
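
The detection condition described in this section (an OPTIONS request answered with 403, `server: cloudflare`, and `content-type: text/html`) can be scanned for in captured traffic. The sketch below assumes each captured transaction is a dict with `method`, `url`, `status`, and `response_headers` keys. That record shape and the `blocked_preflights` name are hypothetical, so adapt them to whatever the capture loader actually returns.

```python
def blocked_preflights(transactions: list) -> list:
    """Return URLs whose OPTIONS preflight got a 403 HTML page from Cloudflare."""
    blocked = []
    for tx in transactions:
        # Assumed record shape: {"method", "url", "status", "response_headers"}.
        headers = {
            str(k).lower(): str(v).lower()
            for k, v in tx.get("response_headers", {}).items()
        }
        if (
            str(tx.get("method", "")).upper() == "OPTIONS"
            and tx.get("status") == 403
            and "cloudflare" in headers.get("server", "")
            and headers.get("content-type", "").startswith("text/html")
        ):
            blocked.append(tx["url"])
    return blocked
```

A non-empty result suggests trying `"credentials": "omit"` on the matching fetch endpoints first.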