Amasterov/injection probability 17 #898

Draft
a-masterov wants to merge 12 commits into REL_17_STABLE_neon from amasterov/injection-probability-17

@a-masterov a-masterov commented Feb 18, 2026

We need to test SMGR failures.

This PR adds two features:
SMGR_API injection point in CHECK_FOR_INTERRUPTS – Injection can run even when there are no pending interrupts, and is gated by an inside_smgr_api counter for use by Neon’s SMGR layer.
Probability-based injection in the injection_points extension – New error-prob- action so injection points can raise an ERROR with a configurable probability (e.g. for fault-injection tests).
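The two features above can be sketched in isolation as follows. This is a minimal standalone simulation, not the code in this PR: it assumes the injection fires only while the `inside_smgr_api` counter is positive, and the helper names (`prng_seed`, `prng_uniform`, `should_inject_error`, `count_injections`) are hypothetical. A small deterministic PRNG stands in for whatever random source the extension actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the counter the PR description calls inside_smgr_api. */
static int inside_smgr_api = 0;

/* Deterministic xorshift32 PRNG so that test runs are reproducible. */
static uint32_t prng_state = 2463534242u;

static void
prng_seed(uint32_t seed)
{
    /* xorshift state must never be zero */
    prng_state = seed ? seed : 1u;
}

static double
prng_uniform(void)
{
    prng_state ^= prng_state << 13;
    prng_state ^= prng_state >> 17;
    prng_state ^= prng_state << 5;
    /* Scale the 32-bit state into the half-open interval [0, 1). */
    return (double) prng_state / 4294967296.0;
}

/*
 * Hypothetical helper: returns true when a simulated injection point
 * should raise an error.  Assumption: the gate is "counter > 0".
 */
static bool
should_inject_error(double probability)
{
    assert(probability >= 0.0 && probability <= 1.0);

    if (inside_smgr_api <= 0)
        return false;           /* outside the SMGR API: never fire */

    return prng_uniform() < probability;
}

/* Counts how many of `calls` simulated visits would raise an error. */
static int
count_injections(int calls, double probability)
{
    int fired = 0;

    for (int i = 0; i < calls; i++)
    {
        if (should_inject_error(probability))
            fired++;
    }
    return fired;
}
```

With the counter at zero nothing fires regardless of the probability; with the counter raised, a probability of 1.0 fires on every visit and 0.0 never fires. A real test would presumably raise the error through the extension instead of returning a boolean.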

Reference: https://databricks.atlassian.net/browse/LKB-6690

@a-masterov a-masterov force-pushed the amasterov/injection-probability-17 branch from b2c9d45 to 50fdd25 Compare February 27, 2026 14:37
@a-masterov a-masterov requested a review from HaoyuHuang March 3, 2026 15:05
@a-masterov a-masterov marked this pull request as ready for review March 3, 2026 15:06
@mmeent-databricks

Same here: Can’t you install an SMGR that has these changes built in, instead of modifying PG's code? It seems quite simple to me to have one that has these interrupts but otherwise lets the other SMGR resolve the requests.

@a-masterov

> Same here: Can’t you install an SMGR that has these changes built in, instead of modifying PG's code? It seems quite simple to me to have one that has these interrupts but otherwise lets the other SMGR resolve the requests.

We need an arbitrary check for interrupts, not just an error in SMGR itself.
The changes in neon: https://github.com/databricks-eng/hadron/pull/4259/changes

@mmeent-databricks

If you need arbitrary CFI, couldn't you use the hook that's already being installed?

See ProcessInterruptsCallback, which is called every time CHECK_FOR_INTERRUPTS calls into ProcessInterrupts().

It won't be called every single time execution reaches CFI, but you can mostly force that with e.g. a high-frequency timer, or by setting InterruptPending in your callback.
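The re-arming idea can be illustrated with a standalone simulation. This is only a sketch under stated assumptions: every `sim_`/`SIM_` name is a hypothetical stand-in, none of them are the real PostgreSQL or Neon symbols, and the callback signature is assumed to be `void (*)(void)`. It models a check macro that does work only when a pending flag is set, and a callback that sets the flag again so the next check also reaches it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed callback signature; the real hook may differ. */
typedef void (*sim_interrupt_callback) (void);

/* Stands in for the backend's pending-interrupt flag. */
static volatile bool sim_interrupts_pending = false;

/* Stands in for the installed callback hook. */
static sim_interrupt_callback sim_callback = NULL;

static int sim_callback_calls = 0;

/* Simulated interrupt processing: consume the flag, then run the hook. */
static void
sim_process_interrupts(void)
{
    sim_interrupts_pending = false;
    if (sim_callback != NULL)
        sim_callback();
}

/* Simulated check: a cheap flag test, a function call only when set. */
#define SIM_CHECK_FOR_INTERRUPTS() \
    do { \
        if (sim_interrupts_pending) \
            sim_process_interrupts(); \
    } while (0)

/* Callback that re-arms the flag so the next check reaches it again. */
static void
rearming_callback(void)
{
    sim_callback_calls++;
    sim_interrupts_pending = true;
}

/* Runs n simulated checks and returns how many invoked the callback. */
static int
run_checks(int n)
{
    int before = sim_callback_calls;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        SIM_CHECK_FOR_INTERRUPTS();
    return sim_callback_calls - before;
}
```

In this model the callback is never reached until something sets the flag once; after that, each check invokes it. Whether re-arming on every call is acceptable in a real backend (for example, with respect to overhead or held-off interrupts) is not something this sketch addresses.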

@a-masterov a-masterov marked this pull request as draft March 11, 2026 12:16

3 participants