@nojaf
Contributor

@nojaf nojaf commented Feb 11, 2026

📝 Summary

Closes #8173

🔍 Description of Changes

AI agents working with marimo notebooks via MCP need to understand full notebook state, but the current tools require O(n) individual calls. In a 33-cell notebook, verifying state after a change can require 30+ tool calls that should be one or two.

This PR makes two changes as discussed with @mscolnick in #8173:

1. Enrich get_lightweight_cell_map with runtime info

LightweightCellInfo now includes three new fields:

  • runtime_state: "idle", "running", "queued", etc. (or null if no notification yet)
  • has_output — whether the cell has visual output
  • has_console_output — whether the cell has stdout/stderr

This lets agents quickly scan all cells and decide which ones to drill into, without calling get_cell_runtime_data or get_cell_outputs for every cell.
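A minimal sketch of that scan step, for illustration only. The `LightweightCellInfo` stand-in below carries just the three new fields plus a `cell_id`, and `cells_to_inspect` is a hypothetical helper, not part of this PR:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LightweightCellInfo:
    # Stand-in with only the fields discussed here; the real dataclass has more.
    cell_id: str
    runtime_state: Optional[str]  # "idle", "running", "queued", ... or None
    has_output: bool
    has_console_output: bool


def cells_to_inspect(cells: list[LightweightCellInfo]) -> list[str]:
    """Pick the cells worth a follow-up call (hypothetical triage rule)."""
    return [
        c.cell_id
        for c in cells
        if c.runtime_state is None or c.has_console_output
    ]


cell_map = [
    LightweightCellInfo("a", "idle", True, False),
    LightweightCellInfo("b", None, False, False),
    LightweightCellInfo("c", "idle", False, True),
]
selected = cells_to_inspect(cell_map)
```

The triage rule is an assumption; an agent could just as well filter on `has_output` or on a specific `runtime_state`.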

2. Batch get_cell_runtime_data and get_cell_outputs

Both tools now accept cell_ids: list[CellId_t] instead of a single cell_id. Responses wrap results per cell:

  • get_cell_runtime_data returns data: list[GetCellRuntimeDataData]
  • get_cell_outputs returns cells: list[CellOutputData] (new dataclass with cell_id, visual_output, console_outputs)
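As a rough sketch of the batched response shape: the field names `cell_id`, `visual_output`, and `console_outputs` come from the description above, while the field types and the `index_by_cell_id` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class CellOutputData:
    # Field names follow the PR description; the types are assumed.
    cell_id: str
    visual_output: Optional[Any] = None
    console_outputs: list[Any] = field(default_factory=list)


def index_by_cell_id(cells: list[CellOutputData]) -> dict[str, CellOutputData]:
    """Hypothetical helper: look up one cell's result in a batched response."""
    return {cell.cell_id: cell for cell in cells}


response_cells = [
    CellOutputData("a", visual_output="<div>plot</div>"),
    CellOutputData("b", console_outputs=["warning: deprecated"]),
]
by_id = index_by_cell_id(response_cells)
```

Because results are wrapped per cell, a consumer can request many cells at once and still address each result by its id.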

Manual verification

Tested against the dataflow tutorial (43 cells) with Claude Code connected to the MCP server:

  • get_lightweight_cell_map: all cells returned with runtime_state, has_output, and has_console_output correctly populated
  • get_cell_runtime_data with 3 cell IDs: single response with code, errors, metadata, and variables for all three
  • get_cell_outputs with all 43 cell IDs: single response (59K chars) with visual and console output for every cell

📋 Checklist

  • I have read the contributor guidelines.
  • For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
  • Tests have been added for the changes made.
  • Documentation has been updated where applicable, including docstrings for API changes.
  • Pull request title is a good summary of the changes - it will be used in the release notes.

- Enrich get_lightweight_cell_map with runtime_state, has_output, and
  has_console_output per cell
- Batch get_cell_runtime_data and get_cell_outputs to accept a list of
  cell_ids instead of a single cell_id
@nojaf nojaf requested a review from akshayka as a code owner February 11, 2026 14:25
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Feb 11, 2026
@mscolnick
Contributor

@AlecZorab can you test this out and let us know if it feels like an improvement for the particular concern you raised?

@nojaf
Contributor Author

nojaf commented Feb 11, 2026

@mscolnick Alec is unable to easily access his public GitHub account right now.
He shared his findings via our internal chat:

❯ cool. Collate your feedback into a single message and I'll forward it for you 

● Feedback on Updated Marimo MCP Server Tools

  Tested against the dataflow.py tutorial notebook. Overall the new features are solid — multi-cell batching and the lightweight cell map enrichment are genuine improvements to the workflow.

  What works well

  - get_lightweight_cell_map enrichment: The runtime_state, has_output, and has_console_output fields enable a "scan then drill down" pattern that significantly reduces round-trips. I can triage an entire notebook in one call.
  - Multi-cell get_cell_runtime_data and get_cell_outputs: Both work correctly with multiple cell IDs. What would have been ~12 separate calls collapsed into 2. This is the single biggest efficiency gain.
  - Runtime state lifecycle: Confirmed "running" → "queued" → "idle" transitions work correctly using a 30-second sleep cell with a dependent cell. The null state for structurally broken cells (cycles, duplicate defs) also makes sense once understood.
  - Error surfacing in get_cell_runtime_data: Correctly reports MultipleDefinitionError, CycleError, and MarimoExceptionRaisedError with typed error objects.
  - has_console_output: Correctly flags only the cell that produced stderr output.

  Feedback

  1. Empty cell_ids should return all cells
  Passing cell_ids: [] to get_cell_runtime_data and get_cell_outputs returns empty results. get_tables_and_variables already establishes the convention that an empty list means "return everything." Applying the same convention here would eliminate the need to first fetch the cell map just to collect all IDs for a bulk query.
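  One way the suggested convention could look, as a sketch only. `resolve_cell_ids` is a hypothetical helper, not an existing marimo function:

```python
def resolve_cell_ids(requested: list[str], all_cell_ids: list[str]) -> list[str]:
    """Treat an empty request as "every cell" (hypothetical helper)."""
    # An empty list is falsy, so it falls through to the full id list.
    return list(requested) if requested else list(all_cell_ids)


notebook_ids = ["a", "b", "c"]
everything = resolve_cell_ids([], notebook_ids)
subset = resolve_cell_ids(["b"], notebook_ids)
```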

  2. runtime_state values need documentation in the tool descriptions
  The possible values and their meanings aren't documented anywhere a consumer can see them. Based on testing, the enum is:
  - "idle" — executed and quiescent
  - "running" — actively executing
  - "queued" — waiting on a running dependency
  - null — blocked from execution by a structural error (cycle, duplicate definition)

  Adding this to the tool description for get_lightweight_cell_map and get_cell_runtime_data would help AI consumers interpret the field without needing to infer it.

  3. execution_time changes semantics based on state
  While a cell is "running", execution_time is a Unix epoch timestamp (e.g., 1770823150.68152, the start time). After the cell completes and transitions to "idle", it becomes a duration in milliseconds (e.g., 30031). While "queued", it's null. This dual meaning could confuse consumers. Suggestions (pick one):
  - Split into start_time and execution_duration_ms
  - Always return duration only (null until finished)
  - Document the dual semantics explicitly in the tool description
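  A sketch of the first option, assuming the observed behaviour described above (timestamp while "running", milliseconds once finished, null while "queued"). `split_execution_time` and the two output keys are illustrative names taken from the suggestion, not an existing API:

```python
from typing import Optional, TypedDict


class ExecutionTiming(TypedDict):
    start_time: Optional[float]  # Unix epoch seconds, set only while running
    execution_duration_ms: Optional[float]  # set only once the cell finished


def split_execution_time(
    runtime_state: Optional[str], execution_time: Optional[float]
) -> ExecutionTiming:
    """Map the dual-meaning execution_time onto two unambiguous fields."""
    if execution_time is None:
        return {"start_time": None, "execution_duration_ms": None}
    if runtime_state == "running":
        return {"start_time": float(execution_time), "execution_duration_ms": None}
    return {"start_time": None, "execution_duration_ms": float(execution_time)}


running = split_execution_time("running", 1770823150.68152)
finished = split_execution_time("idle", 30031)
queued = split_execution_time("queued", None)
```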

  4. Consider adding has_errors to the lightweight cell map
  The map already has has_output and has_console_output. A has_errors flag would let an agent quickly identify problem cells without calling get_cell_runtime_data or get_notebook_errors. You can partially infer errors from runtime_state: null, but that misses runtime errors; for example, a cell with a NameError shows runtime_state: "idle" but still has errors in its runtime data.

  5. Error cell visual_output is a Python repr string
  Error output uses mimetype application/vnd.marimo+error but the content is a stringified Python repr like "[MultipleDefinitionError(name='planet', cells=('ZHCJ',))]". Returning structured JSON (e.g., {"error_type": "MultipleDefinitionError", "variable": "planet", "conflicting_cells": ["ZHCJ"]}) would make programmatic handling easier for consumers. The error objects in get_cell_runtime_data are already well-structured, so aligning the output format would be consistent.
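  A sketch of how a dataclass-style error could be serialized instead of stringified. The `MultipleDefinitionError` class below is a stand-in built from the repr shown above (fields `name` and `cells`), and `error_to_json` is a hypothetical helper; the key names therefore follow the repr rather than the example JSON:

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MultipleDefinitionError:
    # Stand-in mirroring the repr in the feedback; not marimo's actual class.
    name: str
    cells: tuple[str, ...]


def error_to_json(error) -> str:
    """Serialize a dataclass error with its type name (hypothetical helper)."""
    payload = {"error_type": type(error).__name__, **asdict(error)}
    return json.dumps(payload)  # tuples are emitted as JSON arrays


encoded = error_to_json(MultipleDefinitionError(name="planet", cells=("ZHCJ",)))
decoded = json.loads(encoded)
```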

  6. Minor: null runtime state is ambiguous without context
  null currently means "structurally blocked" but could also be read as "never executed" or "unknown." If adding more states isn't desirable, documenting this clearly (per point 2) would suffice. Alternatively, a dedicated state like "disabled" or "error" would be more self-describing than null.

Do you agree with these suggestions?
Let me know which ones I can act on.

@mscolnick
Contributor

@nojaf thanks for getting the feedback. These sound like great additions on top of what you have; we can add all of these.

Development

Successfully merging this pull request may close these issues.

MCP server: bulk notebook state visibility
