@chetanyb (Contributor) commented Dec 14, 2025

Summary

Adds real-time fork choice tree visualization via the /api/forkchoice/graph endpoint, with a thread-safe snapshot mechanism that doesn't block consensus operations.

Key Features

Visualization API

  • New HTTP endpoint serving Grafana node-graph compatible JSON
  • Configurable history via ?slots=N (default: 50, max: 200)
  • Color-coded arc borders representing consensus states:
    • 🟣 Purple: Finalized blocks (canonical chain)
    • 🔵 Blue: Justified checkpoint
    • 🟠 Orange: Current head
    • 🟢 Green: Normal blocks (see TODO)
    • ⚫ Gray: Orphaned blocks (historical forks)
  • Arc completion represents validator weight
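
As a rough illustration of how arc completion could map to validator weight, here is a minimal sketch. It assumes the arc is the block's weight divided by the total weight, which the description does not state explicitly; `arcFraction` and its parameters are hypothetical names, not taken from the PR.

```zig
const std = @import("std");

/// Fraction of the total validator weight backing a block, clamped to [0, 1].
/// Assumption: the arc is normalized by total weight; the PR may differ.
fn arcFraction(weight: u64, total_weight: u64) f64 {
    // Zig has no implicit default values, so the zero-weight case is explicit.
    if (total_weight == 0) return 0.0;
    const w: f64 = @floatFromInt(weight);
    const t: f64 = @floatFromInt(total_weight);
    return @min(w / t, 1.0);
}
```

Handling the zero-total case up front avoids a division by zero before any votes are counted.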

Thread Safety

  • Added RwLock to ForkChoice for concurrent access
  • Snapshot operation holds shared lock only during memcpy
  • Multiple readers don't block each other
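
The snapshot pattern described above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: `Node`, `ForkChoice`, `push`, and `snapshot` are hypothetical stand-ins, and it assumes a Zig version where `std.Thread.RwLock` is available.

```zig
const std = @import("std");

// Hypothetical stand-in for the real proto node; field names are assumptions.
const Node = struct {
    slot: u64,
    weight: u64,
};

// Hypothetical, simplified fork choice with a small fixed-size node buffer.
const ForkChoice = struct {
    lock: std.Thread.RwLock = .{},
    nodes: [4]Node = undefined,
    len: usize = 0,

    /// Writer path: takes the exclusive lock. Returns false when full.
    fn push(self: *ForkChoice, n: Node) bool {
        self.lock.lock();
        defer self.lock.unlock();
        if (self.len == self.nodes.len) return false;
        self.nodes[self.len] = n;
        self.len += 1;
        return true;
    }

    /// Reader path: holds the shared lock only while copying into `out`.
    fn snapshot(self: *ForkChoice, out: []Node) []Node {
        self.lock.lockShared();
        defer self.lock.unlockShared();
        const n = @min(self.len, out.len);
        @memcpy(out[0..n], self.nodes[0..n]);
        return out[0..n];
    }
};
```

Anything slow, such as JSON serialization, would then run on the returned copy after the shared lock has been released.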

Correctness Improvements

  • Fixed finalized block logic: all ancestors of finalized checkpoint correctly marked
  • Orphaned block detection for forks that diverged before finalization
  • Optimized ancestry checks to avoid redundant traversals

Security & Performance

  • Max slot cap (200) prevents excessive memory/lock time
  • Shared locks minimize contention
  • Lock-free JSON processing on snapshot copy

Screenshot sample

Grafana panel showing fork-choice tree visualization

NOTES

@chetanyb
Contributor Author

@g11tech The current design allows the observability API to briefly hold shared locks. Should we add rate limiting to prevent potential write starvation from excessive requests, or is the max slot cap sufficient?

Comment on lines 8 to 23
var global_chain: ?*node.BeamChain = null;
var chain_mutex = std.Thread.Mutex{};

/// Register the chain for API access
pub fn registerChain(chain: *node.BeamChain) void {
    chain_mutex.lock();
    defer chain_mutex.unlock();
    global_chain = chain;
}

/// Get the global chain reference
fn getChain() ?*node.BeamChain {
    chain_mutex.lock();
    defer chain_mutex.unlock();
    return global_chain;
}

Collaborator

I don't see how the mutex is helping us here; the variable is not being manipulated. Also, I think it would be better to supply the chain in the context of SimpleMetricsServer itself.

// Parse query parameters for max_slots (default: 50)
var max_slots: usize = 50;
if (std.mem.indexOf(u8, request.head.target, "?slots=")) |query_start| {
    const slots_param = request.head.target[query_start + 7 ..];

Collaborator

What is 7? I think you are trying to skip bytes, but there should be a better way to do it, e.g. with a regex.
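
One possible way to avoid the magic number, sketched under assumptions: derive the offset from the key's own length and split the query string on `&`. The names `slots_key`, `default_slots`, and `parseSlots` are hypothetical and not part of the PR.

```zig
const std = @import("std");

// Hypothetical names; the key length replaces the literal 7.
const slots_key = "slots=";
const default_slots: usize = 50;

/// Extracts `slots` from a request target such as
/// "/api/forkchoice/graph?slots=10&x=1". Falls back to the default
/// when the parameter is absent or not a valid integer.
fn parseSlots(target: []const u8) usize {
    const q = std.mem.indexOfScalar(u8, target, '?') orelse return default_slots;
    var it = std.mem.splitScalar(u8, target[q + 1 ..], '&');
    while (it.next()) |pair| {
        if (std.mem.startsWith(u8, pair, slots_key)) {
            return std.fmt.parseInt(usize, pair[slots_key.len..], 10) catch default_slots;
        }
    }
    return default_slots;
}
```

Splitting on `&` also handles the case where `slots` is not the first query parameter, which a search for the literal "?slots=" would miss.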

}

// Cap max_slots to prevent resource exhaustion
if (max_slots > 200) max_slots = 200;

Collaborator

Why 200, when we set it to 50 before? Also, declare these values at the top as constants.
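
A minimal sketch of what named constants might look like here. The identifiers `default_max_slots`, `max_slots_cap`, and `effectiveSlots` are hypothetical; the values 50 and 200 are the ones from the PR description.

```zig
const std = @import("std");

// Hypothetical constant names; values match the PR description.
const default_max_slots: usize = 50; // used when ?slots=N is not supplied
const max_slots_cap: usize = 200; // upper bound to limit memory and lock time

/// Applies the default and then the cap to an optional requested value.
fn effectiveSlots(requested: ?usize) usize {
    return @min(requested orelse default_max_slots, max_slots_cap);
}
```

With both values declared together, the relationship between the default and the cap is visible in one place.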

Comment on lines +225 to +234
const role = if (is_finalized)
    "finalized"
else if (is_justified)
    "justified"
else if (is_head)
    "head"
else if (is_orphaned)
    "orphaned"
else
    "normal";

Collaborator

You can set role directly inside each of the blk blocks above.

Comment on lines +240 to +241
else
0.0;

Collaborator

Won't this be the default value of f64 anyway?

}

/// Recursively builds a tree branch visualization
fn visualizeTreeBranch(allocator: Allocator, tree_lines: *std.ArrayListUnmanaged(u8), nodes: []const fcFactory.ProtoNode, node_idx: usize, depth: usize, prefix: []const u8, max_depth: ?usize) !void {

Collaborator

I think this can lead to unbounded recursion. Even with max_depth, malicious nodes can supply us with blocks at shallow heights, which this function will continue to process. We should also limit the number of nodes it can handle while visualizing the tree branch.
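
The node limit suggested above could look roughly like this. It is a sketch under assumptions, not the PR's `visualizeTreeBranch`: the `Node` type, the `visit` function, and the shared `budget` counter are hypothetical, and the traversal only counts nodes instead of rendering them.

```zig
const std = @import("std");

// Hypothetical minimal node: `parent` is an index into the node slice.
const Node = struct { parent: ?usize };

/// Walks the subtree rooted at `node_idx`, stopping when either the depth
/// limit is exceeded or the shared node budget is exhausted. Each visited
/// node decrements `budget` by one.
fn visit(nodes: []const Node, node_idx: usize, depth: usize, max_depth: usize, budget: *usize) void {
    if (depth > max_depth or budget.* == 0) return;
    budget.* -= 1;
    for (nodes, 0..) |n, i| {
        if (n.parent) |p| {
            if (p == node_idx) visit(nodes, i, depth + 1, max_depth, budget);
        }
    }
}
```

Because the budget is shared across all branches, a wide tree at shallow depth is bounded in the same way as a deep one.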

Member

Those blocks can only be in forkchoice if they are valid, so whatever the prevention mechanism is, it needs to come before that.

@g11tech changed the title from "Fork Choice Visualization" to "forkchoice grafana visualization" on Dec 16, 2025

const node = @import("@zeam/node");

// Global chain reference for API access
var global_chain: ?*node.BeamChain = null;

Member

I don't like the global chain. You should register the routes on the API server using the forkchoice instance, not even the chain.
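
To make the suggestion concrete, a minimal sketch of passing the dependency in at construction time instead of reading a global. `ForkChoice`, `ApiServer`, `init`, and `headSlot` are hypothetical stand-ins and do not reflect the actual zeam types or route registration API.

```zig
const std = @import("std");

// Hypothetical stand-in for the real fork choice type.
const ForkChoice = struct { head_slot: u64 };

// Hypothetical API server that receives its dependency explicitly,
// so no module-level mutable state is needed.
const ApiServer = struct {
    forkchoice: *const ForkChoice,

    fn init(forkchoice: *const ForkChoice) ApiServer {
        return .{ .forkchoice = forkchoice };
    }

    /// Example handler body: reads only through the injected pointer.
    fn headSlot(self: ApiServer) u64 {
        return self.forkchoice.head_slot;
    }
};
```

With this shape, the handler can only reach what it was given, and tests can construct a server around a fake fork choice.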
