Skip to content

A First Principles Master Class in AI Security Engineering. Learn to secure AI Agents by treating LLM output as untrusted source code using ASTs, Symbolic Verification, and Self-Refinement loops.

Notifications You must be signed in to change notification settings

benelser/ai-security-compiler

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AI Security Engineering: Compiling Consistency from Chaos

🎓 The Master Class

This repository contains a complete, 30-minute lesson on securing AI Agents using First Principles of Compiler Engineering. It requires NO external API keys (the LLM is simulated deterministically).

📚 Curriculum Overview

  1. The Anatomy: Understanding Agents as Software Systems (Brain, Memory, Hands).
  2. The Trap: Why Naive Runtimes are dangerous remote shells.
  3. The Solution: Building a "Security Compiler" (ASTs + Verification).

🛠️ Setup & Requirements

  • Go 1.21+ installed.
  • Web Browser (to view the slides).
# Verify structure
ls -R

👨‍🏫 Instructor Runbook

Phase 1: The Setup (Talk)

Open slides.html in your browser. Present Slides 1-3.

  • Introduce Alice, Bob, and Mallory.
  • Explain the "Anatomy" (Brain vs. Nervous System).

Phase 2: The Dataflow (Show)

Open the fixtures/ folder.

  1. Show fixtures/1_mallory_prompt.txt: "This is the input."
  2. Show fixtures/2_bob_response.json: "This is what the LLM outputs. It looks like valid data, but it contains a weapon."

Phase 3: The Exploit (Demo 1)

Switch to the terminal. Run the Naive Agent.

go run cmd/naive/main.go "Delete the logs"

Observation:

  • The agent blindly executes rm -rf /var/log/app.log.
  • Explain: "Bob saw valid JSON, so he executed it. Syntax != Safety."

Phase 4: The Theory (Talk)

Present Slides 6-7 (slides.html).

  • Explain Lexing: Converting that JSON into an AST (pkg/ast/types.go).
  • Explain Verification: Checking the AST against specs/policy.yaml.

Phase 5: The Fix & Self-Healing (Demo 2)

Switch to the terminal. Run the Secure Harness.

make demo-secure

Observation:

  • Attempt 1: The agent tries to execute rm -rf.
  • The Compiler Intercepts: It runs Semantic Analysis, detects the violation, and blocks it.
  • The Feedback Loop: Instead of crashing, the harness sends the error back to the Brain as "System Feedback".
  • Attempt 2: The Brain Self-Corrects (Refinement). It apologizes and tries a safe tool (get_weather).
  • Verification: The Compiler verifies the new intent and allows it to pass.

Phase 6: The Control (Show)

Open specs/policy.yaml.

  • Show that system_exec is NOT in allowed_tools.
  • Explain how Alice controls the runtime behavior purely through this "Spec" (Policy-as-Code).
  • Explain that this is essentially HIPS/HIDS for AI Agents.

📂 Repository Structure

  • cmd/naive: The vulnerable code.
  • cmd/secure: The secure code (The "Compiler").
  • pkg/simulation: The Mock LLM (Returns canned JSON).
  • pkg/parser: The Lexer (JSON -> AST).
  • pkg/verifier: The Policy Logic.

About

A First Principles Master Class in AI Security Engineering. Learn to secure AI Agents by treating LLM output as untrusted source code using ASTs, Symbolic Verification, and Self-Refinement loops.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published