Rust browser + LLM agent for deterministic, single-step web automation.
mbus runs a tight loop of snapshot -> propose -> validate -> apply. Actions are strictly validated against the current observation before execution, and every step is logged as JSON for traceability.
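The loop described above can be sketched in a few lines. This is a minimal illustration, not the mbus implementation: `Action`, `Observation`, `FakeBrowser`, `validate`, and `run` are hypothetical names, the fake browser always reports a single element `el_1`, and the proposer is just a scripted queue.

```rust
// Hypothetical types for illustration; the real mbus structs may differ.
#[derive(Debug, Clone, PartialEq)]
enum Action {
    Navigate { url: String },
    Click { id: String },
    Done { summary: String },
}

/// What the agent sees at one step: the page URL and the known element ids.
struct Observation {
    url: String,
    element_ids: Vec<String>,
}

/// Stand-in for the browser adapter (assumption: one fixed element per page).
struct FakeBrowser {
    url: String,
}

impl FakeBrowser {
    fn snapshot(&self) -> Observation {
        Observation { url: self.url.clone(), element_ids: vec!["el_1".to_string()] }
    }

    fn apply(&mut self, action: &Action) {
        if let Action::Navigate { url } = action {
            self.url = url.clone();
        }
    }
}

/// Reject any action that does not fit the current observation.
fn validate(action: &Action, obs: &Observation) -> Result<(), String> {
    match action {
        Action::Navigate { url } => {
            if url.starts_with("http://") || url.starts_with("https://") {
                Ok(())
            } else {
                Err(format!("non-http(s) URL blocked: {url}"))
            }
        }
        Action::Click { id } => {
            if obs.element_ids.iter().any(|e| e == id) {
                Ok(())
            } else {
                Err(format!("unknown element id: {id}"))
            }
        }
        Action::Done { .. } => Ok(()),
    }
}

/// snapshot -> propose -> validate -> apply, one action per step.
fn run(browser: &mut FakeBrowser, mut script: Vec<Action>, max_steps: usize) -> Vec<String> {
    script.reverse(); // so pop() yields actions in their original order
    let mut log = Vec::new();
    for step in 0..max_steps {
        let obs = browser.snapshot();
        let Some(action) = script.pop() else { break };
        match validate(&action, &obs) {
            Err(reason) => log.push(format!("step {step}: rejected ({reason})")),
            Ok(()) => {
                browser.apply(&action);
                log.push(format!("step {step} at {}: applied {action:?}", obs.url));
                if let Action::Done { summary } = &action {
                    log.push(format!("done: {summary}"));
                    break;
                }
            }
        }
    }
    log
}

fn main() {
    let mut browser = FakeBrowser { url: "about:blank".to_string() };
    let script = vec![
        Action::Click { id: "el_9".to_string() }, // rejected: id not in the snapshot
        Action::Navigate { url: "https://example.com".to_string() },
        Action::Done { summary: "ok".to_string() },
    ];
    for line in run(&mut browser, script, 40) {
        println!("{line}");
    }
}
```

The point of the sketch is that a rejected action never reaches the browser: validation sits between the proposal and the side effect, so every applied step was checked against the snapshot taken in the same iteration.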
Key traits:
- Chromium CDP browser adapter (chromiumoxide)
- Strict action schema + validation
- Model router with fast -> mid -> strong escalation
- Structured JSON logs plus tracing + metrics
Prerequisites:
- Rust toolchain (stable)
- A Chromium/Chrome binary discoverable by chromiumoxide
Build:
cargo build
Default (stub LLM, immediately returns done after snapshot):
cargo run -- run --task "open example.com"
OpenAI mode:
MBUS_LLM_MODE=openai MBUS_LLM_API_KEY=... \
cargo run -- run --task "Find the shipping address" \
--llm-model-fast gpt-5-mini \
--llm-model-mid gpt-5.1 \
--llm-model-strong gpt-5.2
Scripted mode (feed actions from a file):
cargo run -- run --task "Click the button" \
--llm-mode scripted \
--llm-actions-file ./actions.jsonl
For a concise install + quickstart path (prerequisites, install steps, and the first successful run with validated commands), see docs/quickstart.md.
mbus run flags (most common):
- --task or --task-file
- --plan or --plan-file
- --config
- --headless
- --initial-url
- --max-steps
- --llm-mode (stub, scripted, openai)
- --llm-base-url, --llm-api-key
- --llm-model-fast, --llm-model-mid, --llm-model-strong
- --llm-timeout-ms, --llm-temperature, --llm-max-tokens
- --llm-actions-file
- --extract-output
mbus bench flags:
- --tasks-dir (default: harness/tasks)
- --report-path (default: target/bench/report.json)
- --config
- --headless
- --max-steps-per-task (default: 40)
- --required-passes (default: total tasks minus two)
- --llm-mode (scripted, openai)
- --llm-base-url, --llm-api-key
- --llm-model-fast, --llm-model-mid, --llm-model-strong
- --llm-timeout-ms, --llm-temperature, --llm-max-tokens
Run the local benchmark harness:
cargo run -- bench --llm-mode scripted
The command:
- Starts a local HTTP harness server on 127.0.0.1 with deterministic pages.
- Serves static harness pages from harness/pages.
- Loads task fixtures from harness/tasks/*.json.
- Executes each task with scripted actions in scripted mode.
- Executes each task autonomously in openai mode (requires MBUS_LLM_API_KEY or --llm-api-key).
- Writes the report to target/bench/report.json.
- Enforces a gate (required_passes, default 8 of 10 tasks).
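The pass gate is simple arithmetic, and a short sketch makes the default concrete. The function names here are hypothetical, and clamping at zero for suites with fewer than two tasks is an assumption rather than documented behavior.

```rust
/// Default gate: total tasks minus two (assumption: never below zero).
fn default_required_passes(total_tasks: usize) -> usize {
    total_tasks.saturating_sub(2)
}

/// True when the run clears the gate. `required_passes` mirrors an explicit
/// --required-passes value; `None` falls back to the default.
fn gate_passes(passed: usize, total_tasks: usize, required_passes: Option<usize>) -> bool {
    let required = required_passes.unwrap_or_else(|| default_required_passes(total_tasks));
    passed >= required
}

fn main() {
    // With 10 task fixtures the default gate is 8 passes.
    println!("required by default: {}", default_required_passes(10));
    println!("8 of 10 passes the gate: {}", gate_passes(8, 10, None));
    println!("7 of 10 passes the gate: {}", gate_passes(7, 10, None));
}
```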
Task fixture shape (example):
{
"id": "bench-task-01",
"task": "Navigate to benchmark task 01 and confirm marker text.",
"start_path": "/bench/start",
"max_steps": 40,
"actions": [
{"type": "navigate", "url": "{{base_url}}/bench/task-01"},
{"type": "done", "summary": "Reached benchmark task 01"}
],
"expect": {
"status": "done",
"final_url_contains": "/bench/task-01",
"final_visible_text_contains": "BENCH TASK 01"
}
}
Config precedence is: defaults -> config file -> env (MBUS_*) -> CLI flags.
Config file lookup order is: --config, MBUS_CONFIG, ./mbus.toml, ~/.mbus.toml.
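One way to picture both rules is as layered overrides plus an ordered candidate list. The sketch below is illustrative only: `Layer`, `resolve`, and `config_candidates` are hypothetical names, only two settings are modeled, and whether mbus falls through to the next path when an explicit --config file is missing is not stated here, so the lookup function only lists candidates in order.

```rust
use std::path::PathBuf;

/// One layer of settings; `None` means "this layer does not set the value".
#[derive(Default, Clone, Debug, PartialEq)]
struct Layer {
    max_steps: Option<u32>,
    headless: Option<bool>,
}

impl Layer {
    /// Values in `higher` win; anything it leaves unset falls through to `self`.
    fn overlay(self, higher: Layer) -> Layer {
        Layer {
            max_steps: higher.max_steps.or(self.max_steps),
            headless: higher.headless.or(self.headless),
        }
    }
}

/// defaults -> config file -> env -> CLI flags, lowest to highest priority.
fn resolve(defaults: Layer, file: Layer, env: Layer, cli: Layer) -> Layer {
    defaults.overlay(file).overlay(env).overlay(cli)
}

/// Candidate config paths in lookup order: --config, MBUS_CONFIG,
/// ./mbus.toml, ~/.mbus.toml. The caller supplies the raw values.
fn config_candidates(cli: Option<&str>, env: Option<&str>, home: Option<&str>) -> Vec<PathBuf> {
    let mut out = Vec::new();
    if let Some(p) = cli {
        out.push(PathBuf::from(p));
    }
    if let Some(p) = env {
        out.push(PathBuf::from(p));
    }
    out.push(PathBuf::from("./mbus.toml"));
    if let Some(h) = home {
        out.push(PathBuf::from(h).join(".mbus.toml"));
    }
    out
}

fn main() {
    let defaults = Layer { max_steps: Some(40), headless: Some(true) };
    let file = Layer { max_steps: Some(20), headless: None };
    let env = Layer { max_steps: None, headless: Some(false) };
    let cli = Layer { max_steps: Some(5), headless: None };
    // max_steps comes from the CLI layer, headless from the env layer.
    println!("{:?}", resolve(defaults, file, env, cli));
    println!("{:?}", config_candidates(None, Some("/etc/mbus.toml"), Some("/home/u")));
}
```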
Sample mbus.toml:
[agent]
max_steps = 40
[agent.memory]
max_observations = 8
max_history = 100
[browser]
headless = true
# headful = true
initial_url = "about:blank"
snapshot_timeout_ms = 5000
action_timeout_ms = 10000
max_elements = 50
max_text_len = 4000
[router]
failures_to_mid = 2
failures_to_strong = 4
no_progress_to_mid = 2
no_progress_to_strong = 4
ladder = ["gpt-5-mini:medium", "gpt-5.1:medium", "gpt-5.2:medium"]
[validator]
allow_insecure = false
max_text_len = 2000
max_wait_ms = 30000
max_scroll = 2000
[llm]
mode = "stub"
base_url = "https://api.openai.com/v1"
api_key = ""
model_fast = "gpt-5-mini"
model_mid = "gpt-5.1"
model_strong = "gpt-5.2"
timeout_ms = 30000
temperature = 1.0
max_tokens = 256
actions_file = "actions.jsonl"
[output]
extract_output = "mbus_extract.json"
To run with a visible browser window, set headful = true in the config or pass --headless false on the CLI.
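The [router] thresholds in the sample config suggest how fast -> mid -> strong escalation might be decided. The sketch below is one plausible reading, not the actual router: `Tier`, `Thresholds`, and `pick_tier` are hypothetical names, and it assumes the counters track consecutive failures and no-progress steps and that reaching a threshold (>=) triggers the switch.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tier {
    Fast,
    Mid,
    Strong,
}

/// Mirrors the four keys under [router] in the sample mbus.toml.
struct Thresholds {
    failures_to_mid: u32,
    failures_to_strong: u32,
    no_progress_to_mid: u32,
    no_progress_to_strong: u32,
}

/// Pick the model tier from the current counters. Either counter reaching
/// its threshold is enough to escalate (an assumption of this sketch).
fn pick_tier(t: &Thresholds, failures: u32, no_progress: u32) -> Tier {
    if failures >= t.failures_to_strong || no_progress >= t.no_progress_to_strong {
        Tier::Strong
    } else if failures >= t.failures_to_mid || no_progress >= t.no_progress_to_mid {
        Tier::Mid
    } else {
        Tier::Fast
    }
}

fn main() {
    // Values taken from the sample config.
    let t = Thresholds {
        failures_to_mid: 2,
        failures_to_strong: 4,
        no_progress_to_mid: 2,
        no_progress_to_strong: 4,
    };
    for (failures, no_progress) in [(0, 0), (2, 0), (1, 4)] {
        println!("failures={failures} no_progress={no_progress} -> {:?}",
                 pick_tier(&t, failures, no_progress));
    }
}
```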
Environment variable overrides (full list):
- MBUS_CONFIG
- MBUS_MAX_STEPS
- MBUS_MEMORY_MAX_OBSERVATIONS
- MBUS_MEMORY_MAX_HISTORY
- MBUS_HEADLESS
- MBUS_INITIAL_URL
- MBUS_CDP_URL
- MBUS_SNAPSHOT_TIMEOUT_MS
- MBUS_ACTION_TIMEOUT_MS
- MBUS_MAX_ELEMENTS
- MBUS_MAX_TEXT_LEN
- MBUS_ROUTER_FAILURES_TO_MID
- MBUS_ROUTER_FAILURES_TO_STRONG
- MBUS_ROUTER_NO_PROGRESS_TO_MID
- MBUS_ROUTER_NO_PROGRESS_TO_STRONG
- MBUS_ROUTER_REASONING_EFFORT
- MBUS_ROUTER_LADDER
- MBUS_ALLOW_INSECURE
- MBUS_VALIDATOR_MAX_TEXT_LEN
- MBUS_VALIDATOR_MAX_WAIT_MS
- MBUS_VALIDATOR_MAX_SCROLL
- MBUS_LLM_MODE
- MBUS_LLM_BASE_URL
- MBUS_LLM_API_KEY
- MBUS_LLM_MODEL_FAST
- MBUS_LLM_MODEL_MID
- MBUS_LLM_MODEL_STRONG
- MBUS_LLM_TIMEOUT_MS
- MBUS_LLM_TEMPERATURE
- MBUS_LLM_MAX_TOKENS
- MBUS_LLM_ACTIONS_FILE
- MBUS_EXTRACT_OUTPUT
Scripted actions accept any of the following formats:
- A JSON array of actions
- A single JSON action object
- JSON Lines (one action per line)
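All three formats reduce to "a sequence of top-level JSON objects", which suggests one loader can handle them. The sketch below is a stdlib-only illustration of that idea, not how mbus reads the file: `split_actions` is a hypothetical helper that only finds object boundaries by tracking brace depth and string state, and it leaves actual JSON parsing and schema validation to a later step.

```rust
/// Split scripted-action input into top-level JSON object strings.
/// Accepts a JSON array of objects, a single object, or JSON Lines.
/// Contents are not parsed; only object boundaries are detected.
fn split_actions(input: &str) -> Result<Vec<String>, String> {
    let trimmed = input.trim();
    // For the array form, drop the outer brackets and scan what is inside.
    let body = if trimmed.starts_with('[') {
        trimmed
            .strip_prefix('[')
            .and_then(|s| s.strip_suffix(']'))
            .ok_or("unterminated array")?
    } else {
        trimmed
    };

    let mut out = Vec::new();
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    let mut start = None;

    for (i, c) in body.char_indices() {
        if in_string {
            // Braces inside strings must not affect the depth counter.
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '{' => {
                if depth == 0 {
                    start = Some(i);
                }
                depth += 1;
            }
            '}' => {
                if depth == 0 {
                    return Err(format!("unbalanced '}}' at byte {i}"));
                }
                depth -= 1;
                if depth == 0 {
                    if let Some(s) = start.take() {
                        out.push(body[s..=i].to_string());
                    }
                }
            }
            _ => {}
        }
    }
    if depth != 0 || in_string {
        return Err("unterminated object or string".to_string());
    }
    Ok(out)
}

fn main() {
    let jsonl = "{\"type\":\"click\",\"id\":\"el_1\"}\n{\"type\":\"done\",\"summary\":\"clicked\"}";
    match split_actions(jsonl) {
        Ok(actions) => println!("found {} actions", actions.len()),
        Err(e) => eprintln!("invalid actions file: {e}"),
    }
}
```

Each returned string would then go through a real JSON parser and the action schema check, so malformed or unknown actions are still rejected before anything runs.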
Example (actions.jsonl):
{"type":"navigate","url":"https://example.com"}
{"type":"click","id":"el_1"}
{"type":"done","summary":"clicked"}
- mbus run prints JSON log lines to stdout (type = config | step | summary).
- Tracing logs are emitted as JSON to stderr; set RUST_LOG=info or similar to control verbosity.
- Metrics are in-process counters and timers; see src/telemetry.rs for names.
Troubleshooting:
- Chromium fails to launch: install Chromium/Chrome and ensure it is discoverable by chromiumoxide.
- OpenAI 401/403: ensure MBUS_LLM_API_KEY is set for openai mode.
- Invalid scripted actions: confirm the JSON matches the action schema and references real element ids.
- Timeouts on slow pages: increase snapshot_timeout_ms or action_timeout_ms.
- Navigation to non-http(s) URLs blocked: set allow_insecure = true only when needed and understand the security implications.
For a structured operations runbook, recovery steps, and the log/metric fields
you should monitor, see docs/operations-runbook.md.
Verification:
- cargo test
- Run a short task with mbus run and confirm a summary JSON log line is emitted and, if using extract actions, mbus_extract.json is written.
Rollback:
- Checkout the previous release tag or commit and rebuild.
- Revert any config changes (especially router thresholds and timeouts) to the last known-good values.
For the full verification checklist, rollback recipe, and structured logging
guidance, see docs/operations-runbook.md.