@dozreg-toplud dozreg-toplud commented Jan 11, 2026

This draft PR documents an attempt to make road promotion safer with respect to asynchronous signal handling and u3m_bail calls.

  • jets.c
  • nock.c

Problem

Vere has nonlinear control flow in two forms: exceptions, which are raised with u3m_bail and caught in wrappers such as u3m_soft_run, and signal handling via u3m_soft_top/u3m_signal. So there are three ways we can "return" from a wrapped computation:

  1. We can actually return from it, copying the product and some persistent state out of the child road, returning the product to the caller and integrating the produced persistent state into the state of the parent road;
  2. We can jump out via a u3m_bail call. The exception is either turned into a noun, allowing Nock virtualization via metacircular jets, or raised again so that it bubbles up to the top, eventually crashing the event transaction and printing a stack trace. We still promote and integrate the accumulated persistent state;
  3. We can jump out all the way to the top wrapper via a signal. In this case we repeatedly perform road promotion and accumulate the stack traces from all roads. In doing so, we also promote and integrate the accumulated persistent state.

I noticed the latter when experimenting with the Ford Lightning build system: hitting ^C mid-build would interrupt it, and a retried build would take less time than a first build that ran without interruption. This is because some persistent caches had already been accumulated and promoted to the home road during SIGINT handling.

This rang an alarm bell in my head: for this to work properly, we would need to make sure that all functions that modify road state (u3h_*, functions that modify bytecode programs) are async-signal-safe. Otherwise, we might be promoting state that does not uphold its invariants.

After a discussion at core blitz on ~2026.1.9, @joemfb raised a concern that the issue has a broader scope, as signals are not the only piece of nonlinear control flow in the codebase. Functions that modify road state must also be exception-safe: at each u3m_bail callsite the road state needs to conform to its invariants too.

It is worth noting that not all invariants need to be upheld when a bail or a signal is raised: for HAMTs it is sufficient to be walkable and to hold valid values. An example of a HAMT holding an invalid value: if a SIGINT arrives within lines 1502-1506 here:

vere/pkg/noun/jets.c

Lines 1502 to 1506 in 33671ea

sit_u->pog_p = _cj_prog(loc, u3x_at(sit_u->axe, cor));
if ( u3_none != sit_u->bat ) {
  u3z(sit_u->bat);
}
sit_u->bat = u3k(u3h(cor));

If that happens, the site struct will have a bytecode program pointer that does not correspond to the battery stored in the bat field.

(This particular example cannot produce a bug, because these fields are erased anyway when we call a program from a senior road.)
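
The window described above can be modeled with a toy two-field struct. This is a sketch with hypothetical names (`toy_site`, `prog_of`, `toy_site_update`), not the real site struct; the `probe` callback stands in for whatever would observe the state if a signal or bail landed between the two writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; not the real jet-site struct. */
typedef struct {
  int prog_of;  /* id of the battery the cached program was built from */
  int bat;      /* id of the battery currently stored                  */
} toy_site;

/* The invariant: the cached program matches the stored battery. */
static bool toy_site_consistent(const toy_site *sit)
{
  return sit->prog_of == sit->bat;
}

typedef void (*toy_probe)(const toy_site *sit, void *ctx);

/* Two-step update. `probe` runs between the steps, at the point
** where an interrupt would observe the struct. */
static void toy_site_update(toy_site *sit, int new_bat,
                            toy_probe probe, void *ctx)
{
  sit->prog_of = new_bat;   /* step 1: program rebuilt    */
  if ( probe != NULL ) {
    probe(sit, ctx);        /* invariant is broken here   */
  }
  sit->bat = new_bat;       /* step 2: battery replaced   */
}

/* Probe that records whether the invariant held mid-update. */
static void toy_record(const toy_site *sit, void *ctx)
{
  *(bool *)ctx = toy_site_consistent(sit);
}
```

Running `toy_site_update` with `toy_record` as the probe shows the invariant failing between the two writes and holding again afterwards. State promoted from inside that window would carry the mismatch with it.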

Solution

I decided to minimize the surface area of code where async-signal safety is required by only promoting the road state if:

  • the wrapped function returned successfully,
  • OR the error is deterministic.

With that, the only thing we do when an asynchronous signal or a nondeterministic error is raised is copy out nouns.
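
The rule above amounts to a single predicate applied when leaving a road. A minimal sketch, assuming a toy `toy_road` struct with one counter standing in for the persistent state, and a hypothetical `toy_result` classification (neither is Vere's actual representation):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of how the wrapped computation ended. */
typedef enum {
  TOY_OK,             /* returned successfully            */
  TOY_ERR_DET,        /* deterministic error              */
  TOY_ERR_NONDET,     /* nondeterministic error           */
  TOY_SIGNAL          /* interrupted by an async signal   */
} toy_result;

/* Toy stand-in for the persistent state held by a road. */
typedef struct {
  int cache_entries;
} toy_road;

/* Promote only when the child state is known to be well-formed. */
static bool toy_should_promote(toy_result res)
{
  return (TOY_OK == res) || (TOY_ERR_DET == res);
}

/* Leave the child road: integrate its state into the parent, or drop it.
** In the drop case only nouns (product, stack trace) would be copied out. */
static void toy_leave(toy_road *par, const toy_road *kid, toy_result res)
{
  assert(par != 0 && kid != 0);
  if ( toy_should_promote(res) ) {
    par->cache_entries += kid->cache_entries;
  }
}
```

With this shape, the code that must tolerate being interrupted at an arbitrary point shrinks to the noun-copying path, since the child's persistent state is simply discarded on a signal or a nondeterministic error.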

Nouns are (almost) async-signal safe

The only operation that is not async-signal-safe is u3i_edit, when it is performed on a mutable noun (refcount of 1). The only place where it is used is the Nock bytecode interpreter, and in case of any crash the product from the road stack is not promoted anyway, so the stack traces and error reports are safe.
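
To make the "mutable noun" case concrete, here is a toy copy-on-write edit. It is a sketch under assumptions, not u3i_edit itself: `toy_cell` and `toy_edit_head` are hypothetical, and the only point is the branch on the reference count. A uniquely owned cell is overwritten in place, so an interrupt during the write can observe a half-edited value, while a shared cell is left untouched and a fresh copy is built.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy cell with a reference count; not Vere's noun layout. */
typedef struct {
  int refs;
  int head;
  int tail;
} toy_cell;

/* Allocate a cell with a single reference. Returns NULL on failure. */
static toy_cell *toy_cell_new(int head, int tail)
{
  toy_cell *cel = malloc(sizeof(*cel));
  if ( cel != NULL ) {
    cel->refs = 1;
    cel->head = head;
    cel->tail = tail;
  }
  return cel;
}

/* Replace the head, consuming one reference to `cel`.
** refs == 1: mutate in place (the caller is the only owner).
** refs  > 1: build a new cell and leave the shared one intact. */
static toy_cell *toy_edit_head(toy_cell *cel, int new_head)
{
  assert(cel != NULL && cel->refs >= 1);
  if ( 1 == cel->refs ) {
    cel->head = new_head;   /* in-place write: the unsafe window */
    return cel;
  }
  toy_cell *cpy = toy_cell_new(new_head, cel->tail);
  if ( cpy != NULL ) {
    cel->refs--;            /* drop the consumed reference */
  }
  return cpy;
}
```

The copying branch never changes a value that anyone else can see, which is why only the in-place branch needs special attention.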

HAMTs are fine

The only calls to u3m_bail in hashtable.c were in u3_assert (a bail with the c3__oops mote, which is not recoverable) and in the u3h/u3t macros used to disassemble key-value pairs. These pairs are always cells by construction, so no change is necessary, but I replaced the macros with asserting versions for documentation purposes, just to be sure.
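
The difference between a bailing accessor and an asserting one can be sketched as follows. All names are hypothetical (`toy_noun`, `toy_head_bail`, `toy_head_assert`); plain setjmp/longjmp stands in for the bail mechanism, and the asserting version relies on the standard `assert` macro, which is compiled out when `NDEBUG` is defined.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical toy noun: either an atom or a cell of two nouns. */
typedef struct toy_noun {
  int                    is_cell;
  int                    atom;
  const struct toy_noun *head;
  const struct toy_noun *tail;
} toy_noun;

static jmp_buf toy_catch;  /* where a toy bail lands */

/* Bailing accessor: a non-cell causes a recoverable nonlocal exit. */
static const toy_noun *toy_head_bail(const toy_noun *non)
{
  if ( !non->is_cell ) {
    longjmp(toy_catch, 1);
  }
  return non->head;
}

/* Asserting accessor: states that a non-cell is impossible here.
** A violation aborts instead of unwinding through live state. */
static const toy_noun *toy_head_assert(const toy_noun *non)
{
  assert(non->is_cell);
  return non->head;
}

/* Returns 1 if toy_head_bail bailed on `non`, 0 otherwise. */
static int toy_bails_on(const toy_noun *non)
{
  if ( setjmp(toy_catch) != 0 ) {
    return 1;
  }
  (void)toy_head_bail(non);
  return 0;
}
```

Where the argument is a cell by construction, both accessors behave identically. The asserting form simply removes a jump site that a reader would otherwise have to audit.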

Nock interpreter: to verify

A cursory look at nock.c/jets.c didn't raise any alarms, but I might be missing something. In jets.c the u3h/u3t macros are used everywhere to disassemble nouns from jet state and to disassemble cores for Nock 9 calls. The former should be infallible, while the latter could legitimately crash, so more attention is necessary there.

@dozreg-toplud dozreg-toplud requested a review from joemfb January 11, 2026 12:10
@pkova pkova marked this pull request as ready for review January 12, 2026 17:39
@pkova pkova requested a review from a team as a code owner January 12, 2026 17:39
@pkova pkova merged commit 89f50c3 into develop Jan 12, 2026
2 checks passed
@pkova pkova deleted the dozreg/async-signal-bail-safety branch January 12, 2026 17:40