✨ Unified control-plane MVP: server-authoritative sessions, lifecycle, and naming#5

Merged
Robdel12 merged 46 commits into main from rd/codex-direct on Feb 10, 2026
@Robdel12 Robdel12 commented Feb 10, 2026

Summary

This PR completes the unified control-plane migration so orbitdock-server is the authoritative owner of session and message lifecycle for Codex + Claude, with the macOS app acting as a reactive WebSocket client.

Scope is intentionally broad and includes Workstream 1 (split-brain removal), Workstream 2 (lifecycle reliability), and Workstream 3 (naming consistency) from plans/mvp-unified-control-plane.md.

Why

Before this series, session state could diverge across app-side stores, direct/passive paths, and startup restoration behaviors. That made lifecycle bugs (reactivation, stale names, ghost sessions, stale transcripts) hard to reason about and hard to test end-to-end.

This consolidates state ownership into the server, removes app-side lifecycle mutation paths, and adds regression coverage around the real passive watcher flow.

Architecture Rationale

This server is not just a transport layer; it is now the control plane.

  • Fan-in for many Codex app-servers: multiple Codex app-server processes can feed one orbitdock-server instead of each client maintaining its own local truth.
  • Unified session engine: Codex direct sessions and passive rollout-watched sessions are handled by the same lifecycle/persistence model.
  • Shared real-time stream: server owns session/message state transitions once, then fans updates out over WebSocket to any client(s).
  • Thin clients: macOS UI is now a reactive renderer over server snapshots/deltas rather than a parallel lifecycle engine.

The practical outcome is that we can scale provider/process integrations without re-implementing lifecycle logic in every client.

What Changed

1) Server-authoritative session architecture

  • Migrated app UI/state flow to server snapshots/deltas as the source of truth.
  • Removed remaining app DB/session-store lifecycle dependencies from app entry/shell paths.
  • Added server protocol + adapter/state updates to keep list/detail/in-session state aligned.

2) Passive Codex watcher hardening

  • Added startup seeding + catch-up sweep behavior for rollout files.
  • Added stale path->session mapping rebind by rollout thread-id hint.
  • Fixed ended passive reactivation on new rollout activity (including manual close path).
  • Ensured passive transcript message lines append correctly as chat messages.
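The catch-up sweep mentioned above can be pictured as re-reading each rollout file from a stored byte offset, so appended lines are picked up even when no filesystem event fired. The following is a minimal sketch under assumptions: `read_appended_lines` and its offset bookkeeping are hypothetical names for illustration, not the watcher's actual code.

```rust
/// Hypothetical helper: returns the complete lines appended since `offset`
/// together with the new offset to store for the next sweep.
/// A trailing partial line (no '\n' yet) is left for a later sweep.
fn read_appended_lines(contents: &[u8], offset: usize) -> (Vec<String>, usize) {
    // Assumption: a file shorter than the stored offset was truncated or
    // replaced, so the sweep starts over from the beginning.
    let start = if offset > contents.len() { 0 } else { offset };
    let tail = &contents[start..];

    // Only consume up to the last newline so half-written lines are not parsed.
    let Some(last_newline) = tail.iter().rposition(|&b| b == b'\n') else {
        return (Vec::new(), start);
    };

    let complete = &tail[..=last_newline];
    let lines: Vec<String> = String::from_utf8_lossy(complete)
        .lines()
        .filter(|line| !line.trim().is_empty())
        .map(str::to_string)
        .collect();

    (lines, start + last_newline + 1)
}
```

Running the same function from both the fs-event handler and a periodic sweep keeps the two paths idempotent: whichever fires second sees no new complete lines and leaves the offset unchanged.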

3) Lifecycle reliability + restart behavior

  • Added startup stale-passive cleanup and empty Claude shell cleanup.
  • Kept list/detail consistency after close/reopen and across restart cycles.
  • Preserved reactivation semantics through watcher processing and persisted state updates.
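As a rough illustration of the lifecycle rules in this section, the sketch below models an ended passive session returning to active on new rollout activity, with its ended markers cleared. The type, field names, reason strings, and the idle threshold are all assumptions made for the example; the server's real session model is not shown in this PR description.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Active,
    Ended,
}

/// Hypothetical in-memory view of a passive (rollout-watched) session.
#[derive(Debug)]
struct PassiveSession {
    status: Status,
    ended_at: Option<u64>,
    end_reason: Option<String>,
    last_activity_at: u64,
}

impl PassiveSession {
    fn new(now: u64) -> Self {
        Self { status: Status::Active, ended_at: None, end_reason: None, last_activity_at: now }
    }

    /// Manual close or startup cleanup: mark ended and record why.
    fn close(&mut self, now: u64, reason: &str) {
        self.status = Status::Ended;
        self.ended_at = Some(now);
        self.end_reason = Some(reason.to_string());
    }

    /// New rollout lines arrived. Returns true if the session was reactivated.
    fn on_rollout_activity(&mut self, now: u64) -> bool {
        self.last_activity_at = now;
        if self.status == Status::Ended {
            // Clear the ended markers so persisted state matches the live state.
            self.status = Status::Active;
            self.ended_at = None;
            self.end_reason = None;
            return true;
        }
        false
    }

    /// Startup cleanup: end sessions idle longer than `max_idle` (assumed policy).
    fn end_if_stale(&mut self, now: u64, max_idle: u64) -> bool {
        if self.status == Status::Active && now.saturating_sub(self.last_activity_at) > max_idle {
            self.close(now, "startup_stale");
            return true;
        }
        false
    }
}
```

Keeping both transitions on one type means list and detail views can be derived from the same state, which is the consistency property the regressions in this PR exercise.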

4) Naming consistency (Phase 3 completion)

  • Kept first real prompt as naming source across Claude + Codex.
  • Filtered bootstrap/system payloads from naming (environment_context, permissions, collaboration_mode, skill, turn_aborted, AGENTS boilerplate, etc.).
  • Removed passive generated slug fallback so display fallback is deterministic (custom_name > first prompt > project).
  • Added startup seeded-passive backfill from rollout prompt history.
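The deterministic fallback order (custom_name > first prompt > project) could look roughly like the sketch below. The marker strings and function names are hypothetical: the PR names the payload categories that are filtered, but not the exact matching rules, so the prefix check here is only an assumption.

```rust
/// Hypothetical markers for bootstrap/system payloads. The real filter may
/// match differently (and also covers AGENTS boilerplate, among others).
const BOOTSTRAP_MARKERS: [&str; 5] = [
    "<environment_context",
    "<permissions",
    "<collaboration_mode",
    "<skill",
    "<turn_aborted",
];

/// True when a prompt should never be used as a session name.
fn is_bootstrap_payload(prompt: &str) -> bool {
    let trimmed = prompt.trim_start();
    trimmed.is_empty() || BOOTSTRAP_MARKERS.iter().any(|m| trimmed.starts_with(m))
}

/// First prompt in history that is not a bootstrap/system payload.
fn first_real_prompt<'a>(prompts: &[&'a str]) -> Option<&'a str> {
    prompts.iter().copied().find(|p| !is_bootstrap_payload(p))
}

/// Display name resolution: custom_name > first real prompt > project.
fn display_name<'a>(custom_name: Option<&'a str>, prompts: &[&'a str], project: &'a str) -> String {
    custom_name
        .filter(|name| !name.trim().is_empty())
        .or_else(|| first_real_prompt(prompts))
        .unwrap_or(project)
        .trim()
        .to_string()
}
```

Because no generated slug participates in the chain, the same inputs always yield the same name, for instance after a restart backfills prompts from rollout history.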

5) Tooling/logging/docs/plan updates

  • Updated startup/build embedding behavior and server/app logging improvements.
  • Updated MVP plan progress and completion notes.

Testing

Ran and verified:

  • cargo fmt --all
  • cargo check -p orbitdock-server
  • swift build (CommandCenter/OrbitDockCore)

Targeted regressions for this flow include:

  • passive close -> rollout append -> reactivation
  • stale rollout mapping rebind
  • catch-up sweep processes appended lines without fs event
  • response_item passive message append behavior
  • bootstrap payloads excluded from naming
  • startup passive naming backfill from rollout history
  • Codex send-message naming ignores bootstrap payloads

Notable Impact

  • Branch delta vs main: 33 commits, 108 files.
  • Biggest hotspots are server lifecycle/persistence/watcher paths plus app server-state integration.

Follow-ups

Remaining plan scope is primarily Workstream 4 diagnostics + Workstream 5 MVP verification gate.

User terminal commands and their output were showing as separate
cards. Now they're merged into a single card when the input and
output arrive as consecutive messages.
Implements bidirectional JSON-RPC communication with Codex app-server:
- CodexAppServerClient: Process management and JSON-RPC over stdio
- CodexProtocol: Type-safe protocol definitions for all API methods
- CodexDirectSessionManager: Session lifecycle orchestration
- CodexEventHandler: Real-time event processing and state updates
- CodexInputBar: UI for sending messages to direct sessions
- File logging to ~/.orbitdock/codex-server.log for debugging

Core features working:
- Create and resume Codex threads from OrbitDock
- Send messages and receive streaming responses
- Messages display in conversation view via MessageStore
- Session recovery on app restart

Still TODO:
- Handle approval workflows (exec, patch, questions)
- Token usage and rate limit display
- Silence streaming delta event noise
- Add file logging to ~/.orbitdock/codex-server.log for debugging
- Handle token usage and rate limit events (logged for now)
- Handle MCP server startup events
- Add webSearch item type handling
- Silence streaming delta events (reasoning, agent message)
- Ignore legacy codex/event/* duplicates
- Filter high-frequency events from console output
- Track input/output/cached tokens from app-server events
- Display token badge in session action bar (input ⬇️ / output ⬆️ / cache ⚡)
- Add migration 009 for token columns on sessions table
- Auto-resume ended sessions when sending new messages
- Fix TokenUsageEvent to match actual nested API structure
- Use last turn tokens (not cumulative total) for context percentage
- last.inputTokens / modelContextWindow = actual context fill
- last.cachedInputTokens for accurate cache savings %
- Add detailed logging for token usage debugging
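The token arithmetic described in this commit can be sketched as follows. The struct is a hypothetical mirror of the nested usage payload, and treating cache savings as cached input over total input for the last turn is an assumption; only the `last.inputTokens / modelContextWindow` ratio is stated in the commit message.

```rust
/// Hypothetical mirror of the "last turn" token usage fields.
struct TurnUsage {
    input_tokens: u64,
    cached_input_tokens: u64,
}

/// Context fill: last turn's input tokens over the model context window.
/// Returns None when the window is unknown (zero) to avoid dividing by zero.
fn context_fill_percent(last: &TurnUsage, model_context_window: u64) -> Option<f64> {
    if model_context_window == 0 {
        return None;
    }
    Some(last.input_tokens as f64 / model_context_window as f64 * 100.0)
}

/// Cache savings (assumed definition): share of the last turn's input that
/// was served from cache.
fn cache_savings_percent(last: &TurnUsage) -> Option<f64> {
    if last.input_tokens == 0 {
        return None;
    }
    Some(last.cached_input_tokens as f64 / last.input_tokens as f64 * 100.0)
}
```

Using the last turn rather than the cumulative total matters because cumulative input counts re-sent context on every turn and would quickly exceed 100% of the window.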
- Add CodexFileLogger for structured JSON logs (~/.orbitdock/logs/codex.log)
- Log all Codex events with full payloads for debugging
- Log decode errors with raw JSON for fixing struct mismatches
- Log MCP bridge requests/responses with timing
- Fix agentMessage not displaying: use upsert since text arrives in
  item/updated after item/created is skipped (empty during streaming)
- Add MCPBridge for external tool integration
- Update CLAUDE.md with logging documentation
Allows Claude to interact with the same Codex session visible in OrbitDock:
- send_message: Send prompts to a session
- interrupt_turn: Stop current turn
- approve: Handle tool approvals
- list_sessions: List controllable sessions
- check_connection: Verify OrbitDock is up

Routes through OrbitDock's HTTP bridge (port 19384) to CodexDirectSessionManager,
ensuring MCP and user operate on the same session state.
- Add CodexTurnStateStore for in-memory turn state (diff, plan)
- Add CodexDiffSidebar and CodexDiffView for diff visualization
- Add orbitdock-debug MCP for pair-debugging Codex sessions
- Update ConversationView to skip transcript sync for direct sessions
- Update Session model with Codex-specific fields
- Add codex-app-server documentation
Phase 0-2 of real-time architecture complete:

**Rust Server (orbitdock-server/)**
- Axum WebSocket server on port 4000
- Session management with subscriber broadcasting
- PersistenceWriter with batched SQLite writes
- Protocol types matching Swift client
- Universal binary build script (arm64 + x86_64)

**Swift Integration**
- ServerManager: auto-starts server on app launch
- ServerConnection: WebSocket client with reconnection
- ServerProtocol: matching types for communication
- Debug settings page with server status
- SessionStore: centralized session state management

**Architecture**
- Server writes to ~/.orbitdock/orbitdock.db
- Swift app reads from same database
- Hybrid approach: existing providers still work
- Ready for Phase 3 (Codex connector)

See plans/realtime-architecture.md for full design.
Complete JSON-RPC connector for codex app-server subprocess:

**connectors/codex.rs**
- Process spawning with binary discovery
- JSON-RPC request/response correlation
- Event translation (turns, items, approvals, tokens)
- Approval submissions (exec, patch, question)

**server/codex_session.rs**
- Event loop forwards Codex events → WebSocket subscribers
- Action channel receives commands from WebSocket

**server/websocket.rs**
- CreateSession for Codex spawns CodexSession
- SendMessage, ApproveTool, Interrupt forwarded to connector

Ready for end-to-end testing with real Codex sessions.
- Handle WebSocket ping/pong in Rust server (fixes Swift client timeout)
- Add dual logging: stderr (compact) + JSON file (~/.orbitdock/logs/server.log)
- Fix WebSocket resource timeout (timeoutIntervalForResource = 0 for long-lived connection)
- Auto-subscribe to session list on WebSocket connect
- ServerManager prefers debug binary, sets RUST_LOG=debug for dev
- Add .gitignore for orbitdock-server/target/ build artifacts
- Update realtime-architecture.md plan with Phase 4 progress
Eliminates the entire JSON-RPC over stdio IPC layer by linking
codex-core as a library. Sessions now use ThreadManager/CodexThread
directly — no binary discovery, no subprocess, no serialization
boundary.

Also fixes two Swift serialization bugs:
- Message.message_type no longer renamed to "type" (matches Swift CodingKeys)
- tool_input changed from Value to String (matches Swift decoder)
Deletes CodexAppServerClient, CodexDirectSessionManager,
CodexEventHandler, CodexProtocol, and CodexTurnStateStore — all
replaced by the embedded Rust server with direct codex-core integration.

Adds ServerAppState and ServerTypeAdapters to bridge WebSocket
messages from the Rust server into SwiftUI state.
- Fix TranscriptMessage equality to include content/toolOutput so
  SwiftUI re-renders on streaming updates
- Use atomic counter for unique message IDs (no more UNIQUE constraint
  collisions on user/thinking messages across turns)
- Throttle streaming delta broadcasts to 50ms intervals
- Fix TurnAborted/SessionEnded/Error handlers to persist and broadcast
  status changes (no more stuck "working" state)
- Persist codex_thread_id so rollout watcher skips server-managed
  sessions (fixes duplicate sessions with wrong provider)
- Re-enable CodexRolloutWatcher for passive CLI session monitoring
- Restore active Codex sessions on server restart
- Fix cwd passing to codex-core via ConfigOverrides
- Handle UserMessage events from codex-core
- Fix ActivityBanner subtitle for Codex provider
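The 50ms throttle on streaming delta broadcasts might be implemented along these lines. This is a sketch under assumptions: `DeltaThrottle` is a hypothetical type, and it coalesces buffered text rather than dropping it, which may differ from what the server does.

```rust
use std::time::{Duration, Instant};

/// Hypothetical coalescing throttle: buffers streaming text and releases
/// one batch at most once per `interval`.
struct DeltaThrottle {
    interval: Duration,
    last_sent: Option<Instant>,
    pending: String,
}

impl DeltaThrottle {
    fn new(interval: Duration) -> Self {
        Self { interval, last_sent: None, pending: String::new() }
    }

    /// Buffers `delta`; returns the batch to broadcast if the interval elapsed.
    fn push(&mut self, delta: &str, now: Instant) -> Option<String> {
        self.pending.push_str(delta);
        let due = match self.last_sent {
            None => true,
            Some(sent) => now.duration_since(sent) >= self.interval,
        };
        if due {
            self.last_sent = Some(now);
            Some(std::mem::take(&mut self.pending))
        } else {
            None
        }
    }

    /// Call when the turn ends so the final buffered text is not lost.
    fn flush(&mut self) -> Option<String> {
        if self.pending.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.pending))
        }
    }
}
```

Passing `now` in explicitly keeps the throttle testable without sleeping; a caller would construct it with `Duration::from_millis(50)` and supply `Instant::now()`.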
…ring

- Fix approval keying: use event.id (sub_id) instead of call_id — codex-core
  keys pending approvals by sub_id, not the tool call_id
- Fix approval type dispatch: track pending approval types in SessionHandle
  HashMap so patch vs exec approvals route correctly
- Fix approval UI not clearing: broadcast SessionDelta with pending_approval
  cleared and work_status back to Working after dispatching approval
- Fix streaming text doubling: AgentMessageContentDelta and AgentMessageDelta
  fire simultaneously — skip legacy handler when newer path is active
- Fix edit card "No content": PatchApplyBegin now builds tool_input JSON with
  file_path and unified_diff from FileChange data
- Fix edit card rendering: reuse CodexDiffView for Codex unified diffs instead
  of duplicate ParsedUnifiedDiffView
- Fix session subscriptions: SessionDetailView subscribes/unsubscribes for
  server sessions, with reconnect re-subscription via onConnected callback
- Document Rust server log debugging in CLAUDE.md
Replace binary approve/deny with codex-core's full ReviewDecision:
- Approve (one-time), Allow for Session, Always Allow, Deny, Deny & Stop
- "Always Allow" only shown for exec approvals with proposed amendment
- Thread proposed_execpolicy_amendment through the entire stack

Add live autonomy picker in the codex action bar:
- Change approval policy + sandbox mode mid-session via Op::OverrideTurnContext
- Compact pill menu showing current level (Suggest/Auto Edit/Full Auto/Full Access)
- Extract shared AutonomyLevel enum from NewCodexSessionSheet

MCP bridge updated for backward compat (accepts both approved:bool and decision:string).
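One way to accept both the legacy `approved` boolean and the newer `decision` string is sketched below. The enum mirrors the five UI options listed in this commit, but its variant names and the wire strings are hypothetical; they are not taken from codex-core's ReviewDecision or from the bridge's actual schema.

```rust
/// Hypothetical decision type mirroring the five options in the approval UI.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Decision {
    Approve,
    AllowForSession,
    AlwaysAllow,
    Deny,
    DenyAndStop,
}

/// Prefers the newer `decision` string; falls back to the legacy boolean.
/// The wire strings are assumptions for illustration.
fn resolve_decision(decision: Option<&str>, approved: Option<bool>) -> Option<Decision> {
    if let Some(name) = decision {
        return match name {
            "approve" => Some(Decision::Approve),
            "allow_for_session" => Some(Decision::AllowForSession),
            "always_allow" => Some(Decision::AlwaysAllow),
            "deny" => Some(Decision::Deny),
            "deny_and_stop" => Some(Decision::DenyAndStop),
            _ => None,
        };
    }
    approved.map(|ok| if ok { Decision::Approve } else { Decision::Deny })
}

/// "Always Allow" is only offered for exec approvals with a proposed amendment.
fn offered_decisions(is_exec: bool, has_proposed_amendment: bool) -> Vec<Decision> {
    let mut options = vec![Decision::Approve, Decision::AllowForSession];
    if is_exec && has_proposed_amendment {
        options.push(Decision::AlwaysAllow);
    }
    options.extend([Decision::Deny, Decision::DenyAndStop]);
    options
}
```

Rejecting unknown decision strings instead of falling through to the boolean avoids silently approving a request when a client sends a misspelled value.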
…ches

Was passing "sandbox_policy" as the TOML override key, but codex-core
expects "sandbox_mode". The wrong key was silently ignored, defaulting
to ReadOnly, which caused all patches to require manual approval even
in Auto Edit mode.
Integrate codex-arg0 for proper apply_patch self-invocation dispatch,
persist approval_policy/sandbox_mode in DB, and thread autonomy config
through session restoration so Auto Edit survives server restarts.
Add approval_policy/sandbox_mode columns via migration 011. Fix
projected usage using max(0,...) instead of min(100,...). Tighten
menu bar layout with better spacing and provider section styling.
Reuse ISO8601/RelativeDateTime formatters instead of recreating per-call.
Remove standalone CodexDiffView in favor of CodexDiffSidebar.
Adds ResumeSession protocol message so ended sessions can be restarted
with original project/model/autonomy settings. Loads session + messages
from DB, creates fresh codex-core connector, and reactivates in-place.
Ended direct sessions now visible in sidebar instead of being filtered out.
Mark Session.init and MigrationManager.init as nonisolated to silence
actor-isolation warnings in DatabaseManager. Fix self capture in
ServerManager termination handler for Swift 6 compliance.
Both plans were stale — written for the old Swift→codex-app-server architecture.
Rewrote to match the Rust server with direct codex-core integration that's actually built.

realtime-architecture.md: Phases 0-4 and 7 complete, removed duplicate/stale code blocks
codex-app-server-full-integration.md: Full rewrite with done/next/future verified against codex-core API
9 tasks with step-by-step checkboxes covering Rust protocol/connector/websocket
through Swift state/UI for each feature. Each task is independently shippable.
Wire up codex-core ThreadNameUpdated events and SetThreadName ops through
the full Rust server → Swift UI stack. Sessions can be renamed from the
sidebar or quick switcher, persisted in both codex-core and SQLite.

Also fixes two bugs:
- Duplicate sidebar entries: passive rollout-watcher sessions now filtered
  by thread ID match against server-managed sessions
- Rename routing: AgentListPanel/QuickSwitcher now route through
  serverState.renameSession() for server-managed Codex sessions
Move Codex rollout file ingestion/session lifecycle to orbitdock-server, remove Swift watcher/store, and keep direct/passive session modes explicit in protocol and Swift adapters.
Add provider-aware session discovery and per-session inspection while keeping control actions scoped to direct Codex sessions. This preserves Claude compatibility without enabling unsupported mutation paths.
Prevent dropped/truncated Codex assistant output when a message update arrives before the corresponding create event by upserting a fallback assistant message.
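A minimal sketch of the upsert behavior described here, assuming messages are keyed by an item id: an update for an unknown id inserts a fallback assistant message instead of being dropped. `Transcript` and `upsert_assistant` are hypothetical names, not the server's actual types.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Message {
    id: String,
    role: String,
    content: String,
}

/// Hypothetical transcript that keeps arrival order plus an id index.
#[derive(Default)]
struct Transcript {
    messages: Vec<Message>,
    index: HashMap<String, usize>,
}

impl Transcript {
    /// Applies an assistant update. If no create event was seen for `id`,
    /// a fallback assistant message is inserted. Returns true on insert.
    fn upsert_assistant(&mut self, id: &str, content: &str) -> bool {
        match self.index.get(id).copied() {
            Some(pos) => {
                // Known message: replace content with the latest full text.
                self.messages[pos].content = content.to_string();
                false
            }
            None => {
                // Update arrived before create: insert rather than drop it.
                self.index.insert(id.to_string(), self.messages.len());
                self.messages.push(Message {
                    id: id.to_string(),
                    role: "assistant".to_string(),
                    content: content.to_string(),
                });
                true
            }
        }
    }
}
```

A late create event for the same id can then go through the same path and become a plain update, so event ordering no longer affects the final transcript.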
Prevented MessageStore timestamp decode crashes on legacy rows by using tolerant parsing.
Hydrated autonomy and token usage correctly across server restore/resume so UI reflects actual session config and counters.
Added a unified server-first control plane plan document for next migration phases.
Commits all current app, server, and integration plan changes to create a clean testing baseline before the next debug cycle.
Migrate key Codex session behavior to the server-backed path, add approval history protocol/UI plumbing, and harden session dedup/shadow-write handling.

Also fixes sticky approval pending state with local optimistic resolution + bounded refresh, and updates the integration plan with validated progress.
Unifies runtime behavior around server state, hardens Codex passive restore/reactivation, and adds persistence regression tests to prevent startup/session-state regressions.
Replace hardcoded model options with server-discovered model lists across Rust server, Swift client, and debug MCP.

Require explicit model selection in Codex send/create flows, cache discovered models in app state, and expose model-list endpoints/messages for bridge and MCP validation.
Route CLI sessions fully through orbitdock-server, improve rollout reactivation/list sync, and standardize Claude/Codex naming from first prompts. Also includes build pipeline and logging updates made during this phase.
Supersede the two older plan docs and define one MVP-focused source of truth for finishing server-owned Claude+Codex session lifecycle work.
Covers rollout-backed Codex sessions transitioning ended->active on new activity and verifies ended markers are cleared in persistence.
End startup-only empty Claude shell rows and clean stale shell sessions on new Claude session start so active list only shows real conversations.
Ensure passive Codex sessions reactivate reliably after new rollout activity by hardening watcher ingestion and adding server-side subscribe reactivation fallback.

Also adds end-to-end regression coverage for close->append->reactivate and updates the MVP plan progress.
Remove remaining app-side lifecycle fallbacks in session loading and conversation history, keep lifecycle transitions server-owned, and mark Workstream 1 complete in the MVP plan.
Apply SwiftFormat across CommandCenter, run cargo fmt across orbitdock-server, and resolve remaining clippy warnings so format/lint checks pass cleanly.
Add high-level server tests for list/detail state consistency after passive manual close and startup restore correctness for recent vs stale passive sessions. Mark corresponding Workstream 2 regression checklist items complete.
Add repeated-restart passive lifecycle regression (startup stale end -> live reactivation -> restart stays active), and mark remaining WS2 items complete in the MVP plan.
Unifies passive session lifecycle and message flow under orbitdock-server and removes direct app DB/session store dependencies.

Includes rollout watcher hardening for reactivation + stale mapping recovery, passive Codex message append handling, and follow-up app/server cleanup to keep UI state aligned with server snapshots/deltas.
Finishes naming defaults and guardrails by keeping first real prompt as the naming source across Claude and Codex, filtering bootstrap/system payloads, and removing passive slug fallback behavior.

Also adds regression coverage for startup passive name backfill and bootstrap-safe naming, and marks Workstream 3 complete in the MVP plan.
@Robdel12 Robdel12 merged commit b3b9915 into main Feb 10, 2026
1 check passed
@Robdel12 Robdel12 deleted the rd/codex-direct branch February 10, 2026 08:56
Robdel12 added a commit that referenced this pull request Feb 17, 2026
…, and naming (#5)
