HTTP-only client with 99% memory reduction #25

Open
mattheworiordan wants to merge 9 commits into SawyerHood:main from mattheworiordan:feat/performance-optimization

mattheworiordan commented Dec 30, 2025

Whilst using dev-browser, especially with concurrent agents, I found it slow to start at times and resource-intensive. This PR fixes that and makes things very snappy and lightweight.

Summary

  • Switch to lightweight HTTP-only client (12MB → 30KB memory per agent)
  • Extract shared routes to eliminate code duplication
  • Add complete HTTP API for all page operations

Changes

Phase 1: Startup Optimizations

  • Map-based page registry for O(1) lookup (was O(n) CDP scan)
  • Conditional npm install (skip if node_modules current)
  • TypeScript pre-compilation for faster startup
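
A minimal sketch of the Map-based registry idea, assuming pages are keyed by name. `PageRegistry` and `PageEntry` are hypothetical names used for illustration, not the PR's actual types, and the real server presumably stores Playwright page handles rather than plain objects.

```typescript
// Hypothetical record for a tracked page (illustrative shape only).
interface PageEntry {
  name: string;
  targetId: string;
}

class PageRegistry {
  // Keyed by page name so lookups avoid scanning every CDP target.
  private byName = new Map<string, PageEntry>();

  register(entry: PageEntry): void {
    this.byName.set(entry.name, entry);
  }

  // Average-case O(1) lookup.
  get(name: string): PageEntry | undefined {
    return this.byName.get(name);
  }

  // Returns true if an entry was removed.
  remove(name: string): boolean {
    return this.byName.delete(name);
  }
}

const registry = new PageRegistry();
registry.register({ name: "checkout", targetId: "T-1" });
```

The first access still has to populate the registry; only subsequent lookups avoid the scan.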

Phase 2: HTTP-Only API

New server endpoints for all page operations:

  • POST /pages/:name/screenshot - capture screenshots
  • POST /pages/:name/set-viewport - set viewport size
  • POST /pages/:name/wait-for-selector - wait for elements
  • GET /pages/:name/info - get URL and title
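
The four routes above share one URL pattern, which a small helper can capture. This is a sketch only: `pageRoute`, `PAGE_OP_METHOD`, and the base URL are assumptions for illustration, and request and response bodies are omitted because the PR description does not specify them.

```typescript
type PageOp = "screenshot" | "set-viewport" | "wait-for-selector" | "info";

// HTTP method per operation, as listed in the PR description.
const PAGE_OP_METHOD: Record<PageOp, "GET" | "POST"> = {
  "screenshot": "POST",
  "set-viewport": "POST",
  "wait-for-selector": "POST",
  "info": "GET",
};

// Build the method and URL for a page operation.
// The page name is percent-encoded so names with spaces stay valid in the path.
function pageRoute(baseUrl: string, pageName: string, op: PageOp) {
  return {
    method: PAGE_OP_METHOD[op],
    url: `${baseUrl}/pages/${encodeURIComponent(pageName)}/${op}`,
  };
}

// Placeholder base URL; the real port is discovered at runtime.
const infoRoute = pageRoute("http://localhost:19222", "my page", "info");
```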

Phase 3: Client-Lite

  • client-lite.ts - HTTP-only client, no Playwright dependency
  • Extract shared routes to http-routes.ts (~400 lines deduplication)
  • Update SKILL.md to document client-lite API
  • Deprecate old Playwright client with migration guide

Memory Benchmark

client-lite: 29.6 KB import overhead
Playwright: 12.4 MB import overhead
Reduction: 99.8%
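
The reduction figure follows from the two measurements above. A quick check, assuming 1 MB = 1024 KB (the benchmark's unit convention is not stated; using 1000 gives the same rounded result). `reductionPercent` is a hypothetical helper name.

```typescript
// Percentage saved when going from `before` to `after` (same units).
function reductionPercent(before: number, after: number): number {
  return (1 - after / before) * 100;
}

const playwrightKb = 12.4 * 1024; // 12.4 MB expressed in KB
const clientLiteKb = 29.6;

const importReduction = reductionPercent(playwrightKb, clientLiteKb);
console.log(importReduction.toFixed(1) + "%"); // prints "99.8%"
```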

Test plan

  • All 54 unit tests pass
  • End-to-end: navigate, screenshot, click, fill, evaluate
  • Memory benchmark confirms savings
  • SKILL.md examples work with client-lite

Note:

This depends on multi-agent-concurrency #24. However, because this PR targets main, it also includes the commits from that branch. This is unavoidable because of GitHub :(

Add serveWithExternalBrowser() that connects to an existing browser via CDP
instead of launching Playwright's Chromium. Key features:

  - Connect to any browser with CDP enabled (Chrome for Testing, Chrome Beta, etc.)
  - Auto-launch browser if not running (with BROWSER_PATH env var)
  - Browser stays open after server stops (user manages lifecycle)
  - No extension required - direct CDP connection

New files:
  - src/external-browser.ts - Core implementation
  - scripts/start-external-browser.ts - Startup script

Use case: Local development with visible browser automation where you want
to inspect results after automation completes.

When multiple AI agents run browser automation tasks in parallel,
they need separate HTTP API ports while potentially sharing the same
browser instance. This adds automatic port allocation to avoid conflicts.

Key changes:
- Add port-manager.ts for dynamic port allocation (range 9222-9300)
- Server tracking via ~/.dev-browser/active-servers.json
- PORT=XXXX output for agent discovery
- Config file support at ~/.dev-browser/config.json
- Update both standalone and external browser modes

Architecture:
  Agent 1 → server (port 9222) ┐
  Agent 2 → server (port 9224) ├→ Shared Browser (CDP 9223)
  Agent 3 → server (port 9226) ┘
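
One way to picture the allocation step is a scan for the first port in the range that nothing else has claimed. This is a sketch under assumptions, not the code in port-manager.ts: `allocatePort` is a hypothetical name, and the set of in-use ports is passed in rather than probed from the OS.

```typescript
// Range taken from this commit message.
const PORT_RANGE = { start: 9222, end: 9300 };

// Return the first port in the range that is not already in use.
function allocatePort(
  inUse: ReadonlySet<number>,
  range: { start: number; end: number } = PORT_RANGE,
): number {
  for (let port = range.start; port <= range.end; port++) {
    if (!inUse.has(port)) return port;
  }
  throw new Error(`No free port in ${range.start}-${range.end}`);
}

// With one server on 9222 and the shared browser's CDP on 9223,
// the next server lands on 9224.
const nextPort = allocatePort(new Set([9222, 9223]));
```

The exact numbering in the diagram may follow a different rule in the real implementation; the sketch only shows the conflict-avoidance idea.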

See docs/CONCURRENCY.md for design decisions and usage examples.
Addresses concerns raised in PR SawyerHood#15 about single-point congestion.

When a dev-browser server crashes, its Chrome browser may still be
running on the CDP port. This adds smart cleanup to detect and
terminate orphaned browsers before launching new ones.

Key changes:
- Enhanced ServerInfo structure to track CDP port and mode
- Added detectOrphanedBrowsers() to find browsers with no registered server
- Added cleanupOrphanedBrowsers() to safely terminate orphans
- Standalone mode now cleans orphans on startup (before launching browser)
- External mode tracks CDP port but doesn't clean (browser is intentionally external)
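
The detection step can be sketched as a set difference: CDP ports that have a live browser but no registered server are orphan candidates. The `ServerInfo` field names and `findOrphanedCdpPorts` below are assumptions for illustration; the PR only states that ServerInfo tracks the CDP port and mode.

```typescript
// Assumed shape; only "CDP port" and "mode" are confirmed by the commit message.
interface ServerInfo {
  pid: number;
  port: number;
  cdpPort: number;
  mode: "standalone" | "external";
}

// Given the CDP ports that currently have a browser listening and the
// registered servers, return the ports that no server claims.
function findOrphanedCdpPorts(
  listeningCdpPorts: number[],
  servers: ServerInfo[],
): number[] {
  const claimed = new Set(servers.map((s) => s.cdpPort));
  return listeningCdpPorts.filter((port) => !claimed.has(port));
}

const orphans = findOrphanedCdpPorts(
  [9223, 9333],
  [{ pid: 101, port: 9222, cdpPort: 9223, mode: "standalone" }],
);
```

How listening ports are discovered and how an orphan is terminated are left out, since the commit message does not describe them.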

This restores crash recovery functionality that was previously in
start-server.ts, but in a smarter way that respects multi-agent scenarios.

  - Add ~/.dev-browser/config.json for browser configuration
  - Auto-detect Chrome for Testing on macOS/Linux/Windows
  - Add --standalone flag to force Playwright mode
  - Skip npm install when dependencies unchanged (hash check)
  - Rename port-manager.ts to config.ts with browser config
  - Let browser use default profile unless userDataDir explicitly set
  - Simplify SKILL.md documentation with single startup flow

  - Add Map-based page lookup in client.ts for O(1) targetId resolution
    (eliminates 11ms CDP session scan per lookup)
  - Add conditional npm install in server.sh using package-lock hash
    (skips 500-2000ms when dependencies unchanged)
  - Add TypeScript pre-compilation with esbuild
    (500ms faster startup using node vs tsx)
  - Include mode in POST /pages response to eliminate extra HTTP round-trip
  - Add benchmark.ts script for measuring performance
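
The conditional install boils down to comparing a hash of the lockfile with the hash recorded after the last install. The PR does this in server.sh; the TypeScript below is only an illustrative sketch, and the SHA-256 choice and the `lockfileHash` / `needsInstall` names are assumptions.

```typescript
import { createHash } from "node:crypto";

// Hash the lockfile contents (algorithm is an assumption).
function lockfileHash(contents: string): string {
  return createHash("sha256").update(contents).digest("hex");
}

// Install is needed when no hash was stored yet or the lockfile changed.
function needsInstall(lockfileContents: string, storedHash: string | null): boolean {
  return storedHash === null || storedHash !== lockfileHash(lockfileContents);
}

const stored = lockfileHash('{"lockfileVersion":3}');
const unchanged = needsInstall('{"lockfileVersion":3}', stored); // false: skip npm install
const changed = needsInstall('{"lockfileVersion":2}', stored);   // true: run npm install
```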

Key improvements:
  - Page lookup: 11ms → 0ms (after first access populates registry)
  - Server startup: ~700ms faster (pre-compiled + conditional npm)
  - HTTP requests: 1 fewer per getPage() call

Enables agents to use dev-browser without Playwright dependency by moving
page operations server-side and providing a thin HTTP client.

Server changes:
  - Add HTTP endpoints for page operations in index.ts and external-browser.ts:
    - POST /pages/:name/navigate - navigate to URL
    - POST /pages/:name/evaluate - execute JavaScript
    - GET /pages/:name/snapshot - get AI-friendly ARIA snapshot
    - POST /pages/:name/select-ref - get element info by ref
    - POST /pages/:name/click - click element by ref
    - POST /pages/:name/fill - fill input by ref
  - Add new API types (EvaluateRequest/Response, NavigateRequest/Response, etc.)

New lightweight client:
  - client-lite.ts: HTTP-only client (~30KB import vs ~12MB for Playwright)
  - No Playwright dependency required on agent side
  - Same interface as full client for easy migration
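
To illustrate why the agent side no longer needs Playwright, here is a sketch of a thin client that only builds HTTP requests and hands them to a transport. It is not the contents of client-lite.ts: `LiteClientSketch`, `Transport`, the JSON body fields (`url`, `ref`), and the base URL are all assumptions. The transport is injected so the sketch can run without a server; in practice it would wrap an HTTP call.

```typescript
interface HttpRequest {
  method: "GET" | "POST";
  url: string;
  body?: unknown;
}

// Anything that can send a request and resolve with the parsed response.
type Transport = (req: HttpRequest) => Promise<unknown>;

class LiteClientSketch {
  private readonly baseUrl: string;
  private readonly transport: Transport;

  constructor(baseUrl: string, transport: Transport) {
    this.baseUrl = baseUrl;
    this.transport = transport;
  }

  private pageUrl(page: string, op: string): string {
    return `${this.baseUrl}/pages/${encodeURIComponent(page)}/${op}`;
  }

  // Body field names below are illustrative guesses.
  navigate(page: string, url: string): Promise<unknown> {
    return this.transport({ method: "POST", url: this.pageUrl(page, "navigate"), body: { url } });
  }

  snapshot(page: string): Promise<unknown> {
    return this.transport({ method: "GET", url: this.pageUrl(page, "snapshot") });
  }

  click(page: string, ref: string): Promise<unknown> {
    return this.transport({ method: "POST", url: this.pageUrl(page, "click"), body: { ref } });
  }
}

// Recording transport: captures requests instead of sending them.
const sent: HttpRequest[] = [];
const demoClient = new LiteClientSketch("http://localhost:19222", (req) => {
  sent.push(req);
  return Promise.resolve({ ok: true });
});
void demoClient.navigate("main", "https://example.com");
void demoClient.snapshot("main");
```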

Testing:
  - http-api.test.ts: 21 tests for request validation and registry logic
  - client-lite.test.ts: 24 tests for HTTP client behavior
  - test-http-api.ts: Manual integration test script
  - memory-benchmark.ts: Measures memory savings (60% heap reduction for 10 agents)

Memory impact (10-agent scenario):
  - Before: 238 MB heap (each agent imports Playwright)
  - After: 95 MB heap (server has Playwright, agents use HTTP)
  - Savings: 143 MB (60% reduction)

Phase 3 of performance optimization: make client-lite the primary client path.

Changes:
  - Extract shared HTTP routes into http-routes.ts (removes ~400 lines duplication)
  - Add HTTP endpoints: screenshot, set-viewport, wait-for-selector, info
  - Update client-lite.ts with methods for all page operations
  - Update SKILL.md to document client-lite API
  - Add deprecation notice to client.ts pointing to client-lite
  - Add note to scraping.md for advanced Playwright usage

Benefits:
  - Client memory: 12.4MB → 30KB (99.8% reduction)
  - No Playwright dependency on client side
  - All page operations via HTTP to server

mattheworiordan force-pushed the feat/performance-optimization branch from b503733 to e4ebba3 on January 1, 2026 13:54

- Change default port range from 9222-9300 to 19222-19300 to avoid
  Chrome CDP port conflicts (9222 is Chrome's default debug port)
- Add automatic port discovery chain in client-lite:
    1. DEV_BROWSER_PORT environment variable
    2. tmp/port file written by server
    3. Most recent server from ~/.dev-browser/active-servers.json
    4. Default port 19222 as fallback
- Write port to tmp/port on server startup for client discovery
- Add 30-minute idle timeout to prevent zombie server accumulation
- Clean up stale server entries on startup
- Update SKILL.md with new configuration options and behavior
- Add start-external-browser.ts to build script
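
The four-step discovery chain in the list above can be sketched as a pure function over its inputs. This is an illustration under assumptions, not the client-lite implementation: `resolvePort`, `DiscoveryInputs`, and the `startedAt` field used to pick the most recent server are hypothetical, and reading the environment and files is left to the caller.

```typescript
// Assumed shape of an entry in ~/.dev-browser/active-servers.json.
interface ActiveServer {
  port: number;
  startedAt: number; // hypothetical timestamp used to pick the most recent
}

interface DiscoveryInputs {
  envPort?: string;               // value of DEV_BROWSER_PORT, if set
  portFile?: string;              // contents of tmp/port, if it exists
  activeServers?: ActiveServer[]; // parsed active-servers.json, if present
}

const DEFAULT_PORT = 19222;

// Accept only integers in the valid TCP port range.
function parsePort(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const n = Number(raw.trim());
  return Number.isInteger(n) && n > 0 && n <= 65535 ? n : undefined;
}

function resolvePort(inputs: DiscoveryInputs): number {
  // 1. Environment variable
  const fromEnv = parsePort(inputs.envPort);
  if (fromEnv !== undefined) return fromEnv;

  // 2. Port file written by the server
  const fromFile = parsePort(inputs.portFile);
  if (fromFile !== undefined) return fromFile;

  // 3. Most recent registered server
  const servers = inputs.activeServers ?? [];
  if (servers.length > 0) {
    return servers.reduce((a, b) => (b.startedAt > a.startedAt ? b : a)).port;
  }

  // 4. Fallback
  return DEFAULT_PORT;
}
```

Treating an unparsable value as absent, rather than as an error, is a design choice of this sketch; the real client may behave differently.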

This fixes the issue where agents couldn't connect to dev-browser
because the client defaulted to port 9222 while the server was
dynamically assigned a different port.

- When browser.path ends with .app, automatically use `open -a` on macOS
- Fail with helpful error instead of silent fallback to standalone mode
- Add path validation for user-specified browser paths
- Document .app bundle behavior in SKILL.md
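
The .app handling amounts to choosing a different launch command on macOS. A sketch, assuming the decision depends only on the platform and the path suffix: `buildLaunchCommand` is a hypothetical name, and passing `--remote-debugging-port` (a standard Chrome flag) to a direct binary is an assumption about what the project does, since the commit message only says launcher apps handle CDP flags themselves.

```typescript
interface LaunchCommand {
  command: string;
  args: string[];
}

function buildLaunchCommand(
  browserPath: string,
  platform: string,
  cdpPort: number,
): LaunchCommand {
  // macOS app bundle: delegate to `open -a`; the launcher app sets CDP flags.
  if (platform === "darwin" && browserPath.endsWith(".app")) {
    return { command: "open", args: ["-a", browserPath] };
  }
  // Direct executable: enable CDP explicitly (assumed behavior).
  return { command: browserPath, args: [`--remote-debugging-port=${cdpPort}`] };
}

const macLaunch = buildLaunchCommand("/Applications/Chrome Launcher.app", "darwin", 9223);
const linuxLaunch = buildLaunchCommand("/usr/bin/google-chrome", "linux", 9223);
```

The app path above is a placeholder, not a path the PR references.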

This ensures consistent browser behavior and proper Dock integration
when using launcher apps that handle CDP flags internally.

mjdaly added a commit to mjdaly/dev-browser that referenced this pull request on Jan 16, 2026
- Switch to lightweight HTTP-only client (12MB → 30KB memory per agent)
- Extract shared routes to eliminate code duplication
- Add complete HTTP API for all page operations
- Map-based page registry for O(1) lookup (was O(n) CDP scan)
- Auto-shutdown after 30 minutes of inactivity
- Stale server cleanup on startup

Source: SawyerHood#25
Co-authored-by: Matthew O'Riordan <matthew@ably.com>