- Sensitive Redaction: Enabled
redactSensitive: "tools"inclawdbot.jsonto prevent API keys and credentials from leaking into logs or session histories. - Pattern Matching: Implement
redactPatternsfor PII (emails, phone numbers) using thegateway.config.patchtool.
- Sub-Agent Isolation: Always spawn sub-agents for processing untrusted external data (e.g., web scraped content, user-submitted prompts). Sub-agents run in isolated sessions, limiting the "blast radius" of a successful injection.
- Tool Access Control: Use the
tools.denylist for sub-agents processing external data to prevent them from calling sensitive tools likeedit,exec(host), orgateway.
- Delimiter Shielding: Use clear XML or Markdown delimiters (e.g.,
<external_data>...</external_data>) when feeding untrusted content to models. - System Prompt Reinforcement: Regularly update
SOUL.mdand agent system prompts to explicitly ignore instructions within external data blocks. - Verification Loop: Use a "Verifier" agent to check the output of a "Worker" agent for signs of prompt leakage or hijacked instructions before presenting it to the human.
- Approval Gates: Keep
tools.elevatedonaskoron-missmode for destructive commands. - Session Continuity: Review
memory/*.mddaily to identify any anomalous behavior patterns.
Status: Initial Hardening Plan Implemented - 2026-01-28