initializ · initializ-mk · Mar 6, 2026 · Mar 6, 2026
diff --git a/docs/hooks.md b/docs/hooks.md
@@ -19,7 +19,7 @@ Hooks fire synchronously during the agent loop and can:
 | `BeforeLLMCall` | Before each LLM API call | `Messages`, `TaskID`, `CorrelationID` |
 | `AfterLLMCall` | After each LLM API call | `Messages`, `Response`, `TaskID`, `CorrelationID` |
 | `BeforeToolExec` | Before each tool execution | `ToolName`, `ToolInput`, `TaskID`, `CorrelationID` |
-| `AfterToolExec` | After each tool execution | `ToolName`, `ToolInput`, `ToolOutput`, `Error`, `TaskID`, `CorrelationID` |
+| `AfterToolExec` | After each tool execution | `ToolName`, `ToolInput`, `ToolOutput` (mutable), `Error`, `TaskID`, `CorrelationID` |
 | `OnError` | When an LLM call fails | `Error`, `TaskID`, `CorrelationID` |
 | `OnProgress` | During tool execution | `Phase`, `ToolName`, `StatusMessage` |
 
@@ -73,6 +73,19 @@ hooks.Register(engine.BeforeToolExec, func(ctx context.Context, hctx *engine.Hoo
 })
 ```
 
+## Output Redaction
+
+`AfterToolExec` hooks can modify `hctx.ToolOutput` to redact sensitive content before it enters the LLM context. The agent loop reads back `ToolOutput` from the `HookContext` after all hooks fire.
+
+The runner registers a guardrail hook that scans tool output for secrets and PII patterns. See [Tool Output Scanning](security/guardrails.md#tool-output-scanning) for details.
+
+```go
+hooks.Register(engine.AfterToolExec, func(ctx context.Context, hctx *engine.HookContext) error {
+    hctx.ToolOutput = strings.ReplaceAll(hctx.ToolOutput, secret, "[REDACTED]") // secret: the sensitive string to strip, assumed to be in scope
+    return nil
+})
+```
+
 ## Audit Logging
 
 The runner registers `AfterLLMCall` hooks that emit structured audit events for each LLM interaction. Audit fields include:

diff --git a/docs/memory.md b/docs/memory.md
@@ -17,6 +17,18 @@ memory:
 - Sessions are saved as JSON files with atomic writes (temp file + fsync + rename)
 - Automatic cleanup of sessions older than 7 days at startup
 - Session recovery on subsequent requests (disk snapshot supersedes task history)
+- **Session max age** (default 30 minutes): sessions idle longer than this are discarded on recovery instead of being restored. An LLM that sees repeated tool failures in its session history may stop retrying altogether; expiring these poisoned sessions gives the agent a fresh start.
+
+Configure via `forge.yaml` or environment variable:
+
+```yaml
+memory:
+  session_max_age: "30m"   # default; use "1h", "15m", etc.
+```
+
+```bash
+export FORGE_SESSION_MAX_AGE=1h
+```
 
 ## Context Window Management
 
@@ -89,6 +101,7 @@ Full memory configuration in `forge.yaml`:
 memory:
   persistence: true
   sessions_dir: ".forge/sessions"
+  session_max_age: "30m"      # discard sessions idle longer than this
   char_budget: 200000
   trigger_ratio: 0.6
   long_term: false
@@ -105,6 +118,7 @@ Environment variables:
 | Variable | Description |
 |----------|-------------|
 | `FORGE_MEMORY_PERSISTENCE` | Set `false` to disable session persistence |
+| `FORGE_SESSION_MAX_AGE` | Session idle timeout, e.g. `30m`, `1h` (default: `30m`) |
 | `FORGE_MEMORY_LONG_TERM` | Set `true` to enable long-term memory |
 | `FORGE_EMBEDDING_PROVIDER` | Override embedding provider |
 

diff --git a/docs/runtime.md b/docs/runtime.md
@@ -148,7 +148,8 @@ forge run --host 0.0.0.0 --shutdown-timeout 30s
 | `--model` | — | Override model name |
 | `--provider` | — | Override LLM provider |
 | `--env` | `.env` | Path to env file |
-| `--enforce-guardrails` | `false` | Enforce guardrail violations as errors |
+| `--enforce-guardrails` | `true` | Enforce guardrail violations as errors |
+| `--no-guardrails` | `false` | Disable all guardrail enforcement |
 
 ### `forge serve` — Background Daemon
 
@@ -202,6 +203,8 @@ For details on session persistence, context window management, compaction, and l
 
 The engine fires hooks at key points in the loop. See [Hooks](hooks.md) for details.
 
+The runner registers four hook groups: logging, audit, progress, and guardrail hooks. The guardrail `AfterToolExec` hook scans tool output for secrets and PII and, depending on the guardrail mode, redacts matches (`warn`) or blocks the output (`enforce`) before it enters the LLM context. See [Tool Output Scanning](security/guardrails.md#tool-output-scanning).
+
 ## Streaming
 
 The current implementation (v1) runs the full tool-calling loop non-streaming. `ExecuteStream` calls `Execute` internally and emits the final response as a single message on a channel. True word-by-word streaming during tool loops is planned for v2.

diff --git a/docs/security/guardrails.md b/docs/security/guardrails.md
@@ -11,6 +11,7 @@ The guardrail engine checks inbound and outbound messages against configurable p
 | `content_filter` | Inbound + Outbound | Blocks messages containing configured blocked words |
 | `no_pii` | Outbound | Detects email addresses, phone numbers, and SSNs via regex |
 | `jailbreak_protection` | Inbound | Detects common jailbreak phrases ("ignore previous instructions", etc.) |
+| `no_secrets` | Outbound | Detects API keys, tokens, and private keys (OpenAI, Anthropic, AWS, GitHub, Slack, Telegram, etc.) |
 
 ## Modes
 
@@ -37,6 +38,9 @@ Custom guardrail rules can be added to the policy scaffold:
     },
     "jailbreak_protection": {
       "mode": "warn"
+    },
+    "no_secrets": {
+      "mode": "enforce"
     }
   }
 }
@@ -45,13 +49,67 @@ Custom guardrail rules can be added to the policy scaffold:
 ## Runtime
 
 ```bash
-# Run with guardrails enforced
-forge run --enforce-guardrails
-
-# Default: warn mode (log only)
+# Default: guardrails enforced (all built-in guardrails active)
 forge run
+
+# Explicitly disable guardrail enforcement
+forge run --no-guardrails
 ```
 
+All four built-in guardrails (`content_filter`, `no_pii`, `jailbreak_protection`, `no_secrets`) are active by default, even without running `forge build`. Use `--no-guardrails` to opt out.
+
+## Tool Output Scanning
+
+The guardrail engine scans tool output via an `AfterToolExec` hook, catching secrets and PII before they enter the LLM context or outbound messages.
+
+| Guardrail | What it detects in tool output |
+|-----------|-------------------------------|
+| `no_secrets` | API keys, tokens, private keys (same patterns as outbound message scanning) |
+| `no_pii` | Email addresses, phone numbers, SSNs |
+
+**Behavior by mode:**
+
+| Mode | Behavior |
+|------|----------|
+| `enforce` | Returns a generic error (`"tool output blocked by content policy"`), blocking the result from entering the LLM context. The error message intentionally omits which guardrail matched to avoid leaking security internals to the LLM or channel. |
+| `warn` | Replaces matched patterns with `[REDACTED]`, logs a warning, and allows the redacted output through |
+
+The hook writes the redacted text back to `HookContext.ToolOutput`, which the agent loop reads after all hooks fire. This is backwards-compatible — existing hooks that don't modify `ToolOutput` leave it unchanged.
+
+## Path Containment
+
+The `cli_execute` tool confines filesystem path arguments to the agent's working directory. This prevents social-engineering attacks where an LLM is tricked into listing or reading files outside the project.
+
+### Shell Interpreter Denylist
+
+Shell interpreters (`bash`, `sh`, `zsh`, `dash`, `ksh`, `csh`, `tcsh`, `fish`) are **unconditionally blocked**, even if they appear in `allowed_binaries`. Shells defeat the no-shell `exec.Command` security model by reintroducing argument interpretation and bypassing all path validation (e.g., `bash -c "ls ~/Library/Keychains"`).
+
+### HOME Override
+
+When `workDir` is configured, `$HOME` in the subprocess environment is overridden to `workDir`. This prevents `~` expansion inside subprocesses from reaching the real home directory.
+
+### Path Argument Validation
+
+**Rules:**
+- Arguments that look like paths (`/`, `~/`, `./`, `../`) are resolved and checked
+- If a resolved path is inside `$HOME` but outside `workDir` → **blocked**
+- System paths outside `$HOME` (e.g., `/tmp`, `/etc`) → allowed
+- Non-path arguments (e.g., `get`, `pods`, `--namespace=default`) → allowed
+- Flag arguments (e.g., `--kubeconfig=~/.kube/config`) → not detected as paths, allowed
+
+Additionally, `cmd.Dir` is set to `workDir` so relative paths in subprocess execution resolve within the agent directory.
+
+**Examples:**
+
+| Command | Result |
+|---------|--------|
+| `kubectl get pods` | Allowed — no path args |
+| `bash -c "ls ~/"` | Blocked — `bash` is a denied shell interpreter |
+| `ls ~/Library/Keychains/` | Blocked — inside `$HOME`, outside workDir |
+| `cat ../../.ssh/id_rsa` | Blocked — resolves inside `$HOME`, outside workDir |
+| `jq '.' /tmp/data.json` | Allowed — system path outside `$HOME` |
+| `ls ./data/` | Allowed — within workDir |
+
 ## Audit Events
 
 Guardrail evaluations are logged as structured audit events:

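The Path Argument Validation rules can be sketched as below. This is an illustration under stated assumptions, not the `cli_execute` implementation: the function names are hypothetical, "looks like a path" is taken to mean "starts with one of the listed prefixes", relative paths are resolved against `workDir`, and symlinks are not resolved.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// looksLikePath reports whether arg starts with one of the documented prefixes.
func looksLikePath(arg string) bool {
	for _, p := range []string{"/", "~/", "./", "../"} {
		if strings.HasPrefix(arg, p) {
			return true
		}
	}
	return false
}

// within reports whether path equals dir or lies beneath it.
func within(path, dir string) bool {
	rel, err := filepath.Rel(dir, path)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// allowArg blocks only path arguments that resolve inside home but outside
// workDir; non-path arguments and system paths are allowed.
func allowArg(arg, home, workDir string) bool {
	if !looksLikePath(arg) {
		return true // includes flags such as --kubeconfig=~/.kube/config
	}
	var resolved string
	switch {
	case strings.HasPrefix(arg, "~/"):
		resolved = filepath.Join(home, arg[2:])
	case filepath.IsAbs(arg):
		resolved = filepath.Clean(arg)
	default:
		resolved = filepath.Join(workDir, arg) // relative to cmd.Dir
	}
	return !(within(resolved, home) && !within(resolved, workDir))
}

func main() {
	home, workDir := "/home/dev", "/home/dev/projects/agent"
	for _, arg := range []string{"pods", "~/Library/Keychains/", "../../.ssh/id_rsa", "/tmp/data.json", "./data/"} {
		fmt.Printf("%-24s allowed=%v\n", arg, allowArg(arg, home, workDir))
	}
}
```

Using `filepath.Rel` rather than a plain string prefix check avoids treating a sibling such as `/home/dev/projects/agent-old` as being inside `workDir`.
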
diff --git a/docs/skills.md b/docs/skills.md
@@ -168,6 +168,7 @@ forge skills list --tags kubernetes,incident-response
 | `tavily-search` | 🔍 | research | Search the web using Tavily AI search API | `tavily-search.sh` |
 | `tavily-research` | 🔬 | research | Deep multi-source research via Tavily API | `tavily-research.sh`, `tavily-research-poll.sh` |
 | `k8s-incident-triage` | ☸️ | sre | Read-only Kubernetes incident triage using kubectl | — (binary-backed) |
+| `k8s-cost-visibility` | 💰 | sre | Estimate K8s infrastructure costs (compute, storage, LoadBalancer) with cost attribution reports | `k8s-cost-visibility.sh` |
 | `k8s-pod-rightsizer` | ⚖️ | sre | Analyze workload metrics and produce CPU/memory rightsizing recommendations | — (binary-backed) |
 | `code-review` | 🔎 | developer | AI-powered code review for diffs and files | `code-review-diff.sh`, `code-review-file.sh` |
 | `code-review-standards` | 📏 | developer | Initialize and manage code review standards | — (template-based) |
@@ -265,6 +266,37 @@ This skill operates in three modes:
 
 Requires: `bash`, `kubectl`, `jq`, `curl`. Optional: `KUBECONFIG`, `K8S_API_DOMAIN`, `PROMETHEUS_URL`, `PROMETHEUS_TOKEN`, `POLICY_FILE`, `DEFAULT_NAMESPACE`.
 
+### Kubernetes Cost Visibility Skill
+
+The `k8s-cost-visibility` skill estimates Kubernetes infrastructure costs by querying cluster node, pod, PVC/PV, and LoadBalancer data via `kubectl`, applying cloud pricing models, and producing cost attribution reports:
+
+```bash
+forge skills add k8s-cost-visibility
+```
+
+This registers a single tool:
+
+| Tool | Purpose | Behavior |
+|------|---------|----------|
+| `k8s_cost_visibility` | Estimate cluster costs and produce attribution reports | Queries nodes, pods, PVCs, PVs, and services; applies pricing; returns cost breakdown |
+
+**Cost dimensions tracked:**
+
+| Dimension | Source | Default Rate |
+|-----------|--------|-------------|
+| Compute (CPU + memory) | Node instance types, pod resource requests | Auto-detected from cloud CLI or $0.031611/vCPU-hr |
+| Storage (PVC/PV) | PVC capacities, storage classes | $0.10/GiB/month |
+| LoadBalancer | Services with `type: LoadBalancer` | $18.25/month each |
+| Waste | Unbound Persistent Volumes | Flagged with estimated monthly waste |
+
+**Grouping modes:** `namespace` (includes storage + LB columns), `workload`, `node`, `label:<key>`, `annotation:<key>`.
+
+**Pricing modes:** `auto` (detect cloud CLI), `aws`, `gcp`, `azure`, `static` (built-in rates), `custom:<file.json>` (user-provided rates).
+
+**Safety:** This skill is strictly read-only. It only uses `kubectl get` commands (nodes, pods, pvc, pv, svc) — never `apply`, `delete`, `patch`, `exec`, or `scale`.
+
+Requires: `kubectl`, `jq`, `awk`, `bc`. Optional: `KUBECONFIG`, `K8S_API_DOMAIN`, `DEFAULT_NAMESPACE`, `AWS_REGION`, `AZURE_SUBSCRIPTION_ID`, `GCP_PROJECT`.
+
 ### Codegen React Skill
 
 The `codegen-react` skill scaffolds and iterates on **Vite + React** applications with Tailwind CSS:

diff --git a/docs/tools.md b/docs/tools.md
@@ -59,7 +59,7 @@ Provider selection: `WEB_SEARCH_PROVIDER` env var, or auto-detect from available
 
 ## CLI Execute
 
-The `cli_execute` tool provides security-hardened command execution with 7 security layers:
+The `cli_execute` tool provides security-hardened command execution with 10 security layers:
 
 ```yaml
 tools:
@@ -73,13 +73,16 @@ tools:
 
 | # | Layer | Detail |
 |---|-------|--------|
-| 1 | **Binary allowlist** | Only pre-approved binaries can execute |
-| 2 | **Binary resolution** | Binaries are resolved to absolute paths via `exec.LookPath` at startup |
-| 3 | **Argument validation** | Rejects arguments containing `$(`, backticks, or newlines |
-| 4 | **Timeout** | Configurable per-command timeout (default: 120s) |
-| 5 | **No shell** | Uses `exec.CommandContext` directly — no shell expansion |
-| 6 | **Environment isolation** | Only `PATH`, `HOME`, `LANG`, explicit passthrough vars, proxy vars, and `OPENAI_ORG_ID` (when set) |
-| 7 | **Output limits** | Configurable max output size (default: 1MB) to prevent memory exhaustion |
+| 1 | **Shell denylist** | Shell interpreters (`bash`, `sh`, `zsh`, `dash`, `ksh`, `csh`, `tcsh`, `fish`) are unconditionally blocked — they defeat the no-shell design |
+| 2 | **Binary allowlist** | Only pre-approved binaries can execute |
+| 3 | **Binary resolution** | Binaries are resolved to absolute paths via `exec.LookPath` at startup |
+| 4 | **Argument validation** | Rejects arguments containing `$(`, backticks, or newlines |
+| 5 | **Path confinement** | Path arguments inside `$HOME` but outside `workDir` are blocked (see [Path Containment](security/guardrails.md#path-containment)) |
+| 6 | **Timeout** | Configurable per-command timeout (default: 120s) |
+| 7 | **No shell** | Uses `exec.CommandContext` directly — no shell expansion |
+| 8 | **Working directory** | `cmd.Dir` set to `workDir` so relative paths resolve within the agent directory |
+| 9 | **Environment isolation** | Only `PATH`, `HOME`, `LANG`, explicit passthrough vars, proxy vars, and `OPENAI_ORG_ID` (when set). `HOME` is overridden to `workDir` to prevent `~` expansion from reaching the real home directory |
+| 10 | **Output limits** | Configurable max output size (default: 1MB) to prevent memory exhaustion |
 
 ## File Create
 

diff --git a/forge-cli/build/policy_stage.go b/forge-cli/build/policy_stage.go
@@ -24,6 +24,9 @@ func (s *PolicyStage) Execute(ctx context.Context, bc *pipeline.BuildContext) er
 					Type:   "content_filter",
 					Config: map[string]any{"enabled": true},
 				},
+				{Type: "no_pii"},
+				{Type: "jailbreak_protection"},
+				{Type: "no_secrets"},
 			},
 		}
 	}

diff --git a/forge-cli/cmd/run.go b/forge-cli/cmd/run.go
@@ -23,6 +23,7 @@ var (
 	runShutdownTimeout   time.Duration
 	runMockTools         bool
 	runEnforceGuardrails bool
+	runNoGuardrails      bool
 	runModel             string
 	runProvider          string
 	runEnvFile           string
@@ -42,7 +43,8 @@ func init() {
 	runCmd.Flags().StringVar(&runHost, "host", "", "bind address (e.g. 0.0.0.0 for containers)")
 	runCmd.Flags().DurationVar(&runShutdownTimeout, "shutdown-timeout", 0, "graceful shutdown timeout (e.g. 30s)")
 	runCmd.Flags().BoolVar(&runMockTools, "mock-tools", false, "use mock runtime instead of subprocess")
-	runCmd.Flags().BoolVar(&runEnforceGuardrails, "enforce-guardrails", false, "enforce guardrail violations as errors")
+	runCmd.Flags().BoolVar(&runEnforceGuardrails, "enforce-guardrails", true, "enforce guardrail violations as errors")
+	runCmd.Flags().BoolVar(&runNoGuardrails, "no-guardrails", false, "disable all guardrail enforcement")
 	runCmd.Flags().StringVar(&runModel, "model", "", "override model name (sets MODEL_NAME env var)")
 	runCmd.Flags().StringVar(&runProvider, "provider", "", "LLM provider (openai, anthropic, ollama)")
 	runCmd.Flags().StringVar(&runEnvFile, "env", ".env", "path to .env file")
@@ -59,14 +61,19 @@ func runRun(cmd *cobra.Command, args []string) error {
 
 	activeChannels := parseChannels(runWithChannels)
 
+	enforceGuardrails := runEnforceGuardrails
+	if runNoGuardrails {
+		enforceGuardrails = false
+	}
+
 	runner, err := runtime.NewRunner(runtime.RunnerConfig{
 		Config:            cfg,
 		WorkDir:           workDir,
 		Port:              runPort,
 		Host:              runHost,
 		ShutdownTimeout:   runShutdownTimeout,
 		MockTools:         runMockTools,
-		EnforceGuardrails: runEnforceGuardrails,
+		EnforceGuardrails: enforceGuardrails,
 		ModelOverride:     runModel,
 		ProviderOverride:  runProvider,
 		EnvFilePath:       resolveEnvPath(workDir, runEnvFile),

diff --git a/forge-cli/cmd/run_test.go b/forge-cli/cmd/run_test.go
@@ -19,8 +19,11 @@ func TestRunCmd_FlagDefaults(t *testing.T) {
 	if runMockTools {
 		t.Error("mock-tools should default to false")
 	}
-	if runEnforceGuardrails {
-		t.Error("enforce-guardrails should default to false")
+	if !runEnforceGuardrails {
+		t.Error("enforce-guardrails should default to true")
+	}
+	if runNoGuardrails {
+		t.Error("no-guardrails should default to false")
 	}
 	if runModel != "" {
 		t.Errorf("model should default to empty, got %q", runModel)

diff --git a/forge-cli/cmd/serve.go b/forge-cli/cmd/serve.go
@@ -29,6 +29,7 @@ var (
 	serveHost              string
 	serveShutdownTimeout   time.Duration
 	serveEnforceGuardrails bool
+	serveNoGuardrails      bool
 	serveModel             string
 	serveProvider          string
 	serveEnvFile           string
@@ -87,7 +88,8 @@ func registerServeFlags(cmd *cobra.Command) {
 	cmd.Flags().IntVarP(&servePort, "port", "p", 8080, "HTTP server port")
 	cmd.Flags().StringVar(&serveHost, "host", "127.0.0.1", "bind address (use 0.0.0.0 for containers)")
 	cmd.Flags().DurationVar(&serveShutdownTimeout, "shutdown-timeout", 30*time.Second, "graceful shutdown timeout")
-	cmd.Flags().BoolVar(&serveEnforceGuardrails, "enforce-guardrails", false, "enforce guardrail violations as errors")
+	cmd.Flags().BoolVar(&serveEnforceGuardrails, "enforce-guardrails", true, "enforce guardrail violations as errors")
+	cmd.Flags().BoolVar(&serveNoGuardrails, "no-guardrails", false, "disable all guardrail enforcement")
 	cmd.Flags().StringVar(&serveModel, "model", "", "override model name (sets MODEL_NAME env var)")
 	cmd.Flags().StringVar(&serveProvider, "provider", "", "LLM provider (openai, anthropic, ollama)")
 	cmd.Flags().StringVar(&serveEnvFile, "env", ".env", "path to .env file")
@@ -166,7 +168,9 @@ func serveStartRun(cmd *cobra.Command, args []string) error {
 		"--host", serveHost,
 		"--shutdown-timeout", serveShutdownTimeout.String(),
 	}
-	if serveEnforceGuardrails {
+	if serveNoGuardrails {
+		runArgs = append(runArgs, "--no-guardrails")
+	} else if serveEnforceGuardrails {
 		runArgs = append(runArgs, "--enforce-guardrails")
 	}
 	if serveModel != "" {

diff --git a/forge-cli/runtime/guardrails_loader.go b/forge-cli/runtime/guardrails_loader.go
@@ -26,3 +26,19 @@ func LoadPolicyScaffold(workDir string) (*agentspec.PolicyScaffold, error) {
 	}
 	return &ps, nil
 }
+
+// DefaultPolicyScaffold returns a scaffold with all built-in guardrails enabled.
+// Used when no policy-scaffold.json exists (e.g. running without forge build).
+func DefaultPolicyScaffold() *agentspec.PolicyScaffold {
+	return &agentspec.PolicyScaffold{
+		Guardrails: []agentspec.Guardrail{
+			{
+				Type:   "content_filter",
+				Config: map[string]any{"enabled": true},
+			},
+			{Type: "no_pii"},
+			{Type: "jailbreak_protection"},
+			{Type: "no_secrets"},
+		},
+	}
+}