Skip to content

septract/acl2-mcp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ACL2 MCP Server

⚠️ Early Beta - Development in Progress This is an early beta version that was rapidly developed. While functional and tested, expect rough edges, potential bugs, and breaking changes. Use at your own risk in production environments.

A Model Context Protocol (MCP) server that provides tools for interacting with the ACL2 theorem prover.

Features

This MCP server exposes 15 tools for working with ACL2, including support for persistent sessions that enable incremental development:

Session Management Tools

  • start_session: Create a persistent ACL2 session for incremental development
  • end_session: End a persistent session and clean up resources
  • list_sessions: List all active sessions with their status

Code-based Tools

  • prove: Submit ACL2 theorems (defthm) for proof
  • evaluate: Evaluate arbitrary ACL2 expressions and definitions
  • check_syntax: Check ACL2 code for syntax errors
  • admit: Test if an ACL2 event would be admitted without error

All code-based tools support an optional session_id parameter for incremental development.

File-based Tools

  • certify_book: Certify an ACL2 book file (loads and verifies all definitions and theorems)
  • include_book: Load an ACL2 book and optionally evaluate additional code
  • check_theorem: Check a specific theorem in an ACL2 file by name

Query and Verification Tools

  • query_event: Query information about a defined function, theorem, or event (uses :pe)
  • verify_guards: Verify guards for a function to ensure efficient execution

Session State Management Tools

  • undo: Undo the last N events in a session
  • save_checkpoint: Save a named checkpoint of the current session state
  • restore_checkpoint: Restore a session to a previously saved checkpoint
  • get_world_state: Display current session state (recent definitions and theorems)
  • retry_proof: Retry a failed proof with different hints

Prerequisites

  • Python 3.10 or later
  • ACL2 installed and available in PATH

Installation

# Clone the repository
cd acl2-mcp

# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install the package
pip install -e .

Usage

Running the Server

The server can be run directly:

acl2-mcp

Or via Python:

python -m acl2_mcp.server

Configuring in Claude Desktop

Add this to your Claude Desktop configuration file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "acl2": {
      "command": "/path/to/acl2-mcp/venv/bin/acl2-mcp"
    }
  }
}

Replace /path/to/acl2-mcp with the actual path to your installation directory.

Configuring in Claude Code

Recommended: Using the CLI (Simplest method)

Claude Code provides a CLI command to add MCP servers:

claude mcp add acl2 /path/to/acl2-mcp/venv/bin/acl2-mcp

Replace /path/to/acl2-mcp with the actual path to your installation directory.

This will automatically configure the server in your Claude Code settings.

Alternative: Manual Configuration

You can also manually edit the Claude Code MCP settings file:

macOS/Linux: ~/.config/claude-code/mcp_settings.json Windows: %APPDATA%\claude-code\mcp_settings.json

Option 1: Using the installed executable (Recommended)

{
  "mcpServers": {
    "acl2": {
      "command": "/path/to/acl2-mcp/venv/bin/acl2-mcp"
    }
  }
}

Option 2: Using Python module

{
  "mcpServers": {
    "acl2": {
      "command": "/path/to/acl2-mcp/venv/bin/python",
      "args": [
        "-m",
        "acl2_mcp.server"
      ]
    }
  }
}

Replace /path/to/acl2-mcp with the actual path to your installation directory.

Example Tool Usage

Persistent Session Workflow (Recommended for Interactive Development)

For incremental development where you build up definitions and theorems step-by-step, use persistent sessions:

1. Start a session:

Tool: start_session
Arguments:
  name: "natural-numbers-proof"  (optional, for easy identification)

Returns: Session ID (e.g., "a1b2c3d4-...")

2. Define functions incrementally:

Tool: evaluate
Arguments:
  session_id: "a1b2c3d4-..."
  code: "(defun plus (x y) (if (zp x) y (plus (1- x) (1+ y))))"

Tool: evaluate
Arguments:
  session_id: "a1b2c3d4-..."
  code: "(plus 2 3)"  // Test the function

3. Build on previous definitions:

Tool: evaluate
Arguments:
  session_id: "a1b2c3d4-..."
  code: "(defun times (x y) (if (zp y) 0 (plus x (times x (1- y)))))"

4. Prove theorems interactively:

Tool: prove
Arguments:
  session_id: "a1b2c3d4-..."
  code: "(defthm plus-commutative (equal (plus x y) (plus y x)))"

5. If proof fails, retry with hints:

Tool: retry_proof
Arguments:
  session_id: "a1b2c3d4-..."
  code: "(defthm plus-commutative
          (equal (plus x y) (plus y x))
          :hints ((\"Goal\" :induct (plus x y))))"

6. Save checkpoints before risky steps:

Tool: save_checkpoint
Arguments:
  session_id: "a1b2c3d4-..."
  checkpoint_name: "before-induction"

7. Restore if needed:

Tool: restore_checkpoint
Arguments:
  session_id: "a1b2c3d4-..."
  checkpoint_name: "before-induction"

8. Inspect session state:

Tool: get_world_state
Arguments:
  session_id: "a1b2c3d4-..."
  limit: 20  (show last 20 events)

9. Undo mistakes:

Tool: undo
Arguments:
  session_id: "a1b2c3d4-..."
  count: 1  (undo last event)

10. End session when done:

Tool: end_session
Arguments:
  session_id: "a1b2c3d4-..."

Benefits of persistent sessions:

  • ✅ No need to wrap everything in progn
  • ✅ Test functions immediately after defining them
  • ✅ Build complex proofs incrementally
  • ✅ Try different proof strategies without re-submitting entire files
  • ✅ Save/restore checkpoints for experimentation
  • ⚡ Sessions auto-timeout after 30 minutes of inactivity

Code-based Tools (One-off Execution)

Prove a Theorem:

(defthm append-nil
  (implies (true-listp x)
           (equal (append x nil) x)))

Evaluate Expressions:

(defun factorial (n)
  (if (zp n)
      1
    (* n (factorial (- n 1)))))

(factorial 5)

Check Syntax:

(defun my-function (x y)
  (+ x y))

File-based Tools

Certify a Book:

Tool: certify_book
Arguments:
  file_path: "path/to/mybook"  (without .lisp extension)
  timeout: 120  (optional)

Include a Book and Run Code:

Tool: include_book
Arguments:
  file_path: "path/to/mybook"  (without .lisp extension)
  code: "(thm (equal (+ 1 1) 2))"  (optional)
  timeout: 60  (optional)

Check a Specific Theorem:

Tool: check_theorem
Arguments:
  file_path: "path/to/myfile.lisp"
  theorem_name: "my-theorem-name"
  timeout: 60  (optional)

Query and Verification Tools

Admit an Event:

Tool: admit
Arguments:
  code: "(defun my-func (x) (+ x 1))"
  timeout: 30  (optional)

Returns whether the event would be admitted successfully.

Query an Event:

Tool: query_event
Arguments:
  name: "append"
  file_path: "path/to/file.lisp"  (optional, if function is in a file)
  timeout: 30  (optional)

Returns the definition and properties of the named event.

Verify Guards:

Tool: verify_guards
Arguments:
  function_name: "my-function"
  file_path: "path/to/file.lisp"  (optional, if function is in a file)
  timeout: 60  (optional)

Verifies that the function's guards are satisfied.

Development

Type Checking

This project uses strict static typing with mypy:

mypy acl2_mcp/

Running Tests

pytest

How It Works

The server supports two execution modes:

One-off Execution (Default)

When no session_id is provided, each tool call:

  1. Writes ACL2 code to a temporary .lisp file
  2. Starts a fresh ACL2 process with the code as input
  3. Captures and returns stdout/stderr
  4. Cleans up the temporary file and terminates ACL2

Persistent Sessions (Incremental Development)

When using sessions:

  1. start_session creates a long-running ACL2 process with persistent stdin/stdout pipes
  2. Each tool call sends commands to the existing process and reads responses
  3. The ACL2 world state accumulates across multiple commands
  4. Sessions auto-cleanup after 30 minutes of inactivity or when explicitly ended
  5. Up to 50 concurrent sessions are supported

Default timeout is 30 seconds per command, configurable per request.

License

BSD 3-Clause License - See LICENSE for details.

About

No description, website, or topics provided.

Resources

License

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages