Skip to content

fix(evm): cache fallback decisions and narrow JIT suitability thresholds#357

Open
starwarfan wants to merge 1 commit intoDTVMStack:mainfrom
starwarfan:adjust-fallback-cond
Open

fix(evm): cache fallback decisions and narrow JIT suitability thresholds#357
starwarfan wants to merge 1 commit intoDTVMStack:mainfrom
starwarfan:adjust-fallback-cond

Conversation

@starwarfan
Copy link
Contributor

Two bugs in the ZEN_ENABLE_JIT_PRECOMPILE_FALLBACK mechanism:

  1. Per-call analyzer overhead: EVMAnalyzer was constructed on every execute() call, rebuilding a std::map over the full bytecode each time. For benchmarks running thousands of iterations this added ~245us per call even when no fallback was needed. Fix: cache the ShouldFallback boolean per module key (CRC32 + revision) in a new FallbackCache map on the DTVM VM struct.

  2. Over-conservative fallback thresholds: the BytecodeSize and MirEstimate conditions caused false positives on real contracts (snailtracer regressed 22x) and harmless synth benchmarks (PUSH22-32, MUL/b1). The 5 actually-pathological cases (MUL/b0, SIGNEXTEND/b1, SHL/b1, SHR/b1, SAR/b1) are all caught precisely by MaxConsecutiveExpensive > 128 and MaxBlockExpensiveCount > 256 alone (they each contain ~2000 RA-expensive ops in a single basic block). Remove BytecodeSize and MirEstimate from the fallback condition; keep only pattern-based thresholds.

Result: 221 benchmarks complete with no hangs; 207/216 common cases remain within 20% of the no-fallback JIT reference.

1. Does this PR affect any open issues?(Y/N) and add issue references (e.g. "fix #123", "re #123".):

  • N
  • Y

2. What is the scope of this PR (e.g. component or file name):

3. Provide a description of the PR(e.g. more details, effects, motivations or doc link):

  • Affects user behaviors
  • Contains CI/CD configuration changes
  • Contains documentation changes
  • Contains experimental features
  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Other

4. Are there any breaking changes?(Y/N) and describe the breaking changes(e.g. more details, motivations or doc link):

  • N
  • Y

5. Are there test cases for these changes?(Y/N) select and add more details, references or doc links:

  • Unit test
  • Integration test
  • Benchmark (add benchmark stats below)
  • Manual test (add detailed scripts or steps below)
  • Other

6. Release note

None

Two fixes in the ZEN_ENABLE_JIT_PRECOMPILE_FALLBACK mechanism:

1. Per-call analyzer overhead: EVMAnalyzer was constructed on every
   execute() call, rebuilding a std::map over the full bytecode each
   time. For benchmarks running thousands of iterations this added
   ~245us per call even when no fallback was needed. Fix: cache the
   ShouldFallback boolean per module key (CRC32 + revision) in a new
   FallbackCache map on the DTVM VM struct.

2. Tuned fallback thresholds to eliminate false positives while
   covering all pathological cases:
   - Removed MirEstimate condition (caused false positives on PUSH22-32
     and other harmless synth benchmarks).
   - Raised BytecodeSize limit from 0x6000 (24 576) to 0x10000 (64 KB).
     The old limit overlapped the EIP-170 deployed-code ceiling and
     caused false positives on real contracts (snailtracer, MUL/b1, etc).
     The new limit is safely above both EIP-170 (24 576) and the EIP-3860
     initcode limit (49 152), while still catching extreme synthetic test
     bytecode (402 KB, 6 MB) that would stall LLVM.
   - Pattern-based thresholds (MaxConsecutiveExpensive > 128,
     MaxBlockExpensiveCount > 256, DupFeedbackPatternCount > 64) remain
     and precisely target the 5 pathological benchmark cases.

Verified: all 223 evmonetestsuite multipass tests pass; large synthetic
bytecode cases (evmone_block_gas_cost_overflow_*) correctly fall back to
interpreter; 207/216 common benchmark cases remain within 20% of the
no-fallback JIT reference.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the EVM JIT precompile fallback mechanism by caching per-bytecode fallback decisions in the EVMC VM and refining the analyzer’s fallback criteria to reduce false positives while preserving protection against pathological compilation cases.

Changes:

  • Add a per-module FallbackCache (CRC32 + revision key) to avoid re-running EVMAnalyzer on repeated executions.
  • Adjust JIT suitability thresholds by removing MIR-estimate-based fallback and increasing the bytecode-size threshold; keep pattern-based RA-expensive heuristics.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
src/vm/dt_evmc_vm.cpp Adds FallbackCache and uses it in execute() to avoid repeated analyzer work.
src/compiler/evm_frontend/evm_analyzer.h Updates bytecode-size threshold and simplifies fallback verdict to pattern-based checks plus a size guard.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +82 to +84
#ifdef ZEN_ENABLE_JIT_PRECOMPILE_FALLBACK
std::unordered_map<uint64_t, bool> FallbackCache;
#endif
Copy link

Copilot AI Feb 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FallbackCache grows monotonically for the lifetime of the VM and is never pruned. In long-running processes that execute many distinct contracts (or untrusted inputs), this can become an unbounded memory growth vector. Consider bounding the cache (e.g., LRU / max entries) or clearing entries alongside module unload/retention policy.

Copilot uses AI. Check for mistakes.
Comment on lines +117 to +119
/// Bytecode size limit: 64 KB is well above the EIP-170 deployed-code limit
/// (24 576) and the EIP-3860 initcode limit (49 152), so real-world contracts
/// are never affected, while pathological synthetic tests (400 KB+) are caught.
Copy link

Copilot AI Feb 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims only “pathological synthetic tests (400 KB+)” are caught by this bytecode-size guard, but the threshold is 0x10000 (64 KiB), so it will also catch any bytecode larger than 64 KiB. Please adjust the wording to match the actual cutoff (or adjust the cutoff if 400+ KiB was the intent).

Suggested change
/// Bytecode size limit: 64 KB is well above the EIP-170 deployed-code limit
/// (24 576) and the EIP-3860 initcode limit (49 152), so real-world contracts
/// are never affected, while pathological synthetic tests (400 KB+) are caught.
/// Bytecode size limit: 64 KiB is well above the EIP-170 deployed-code limit
/// (24 576) and the EIP-3860 initcode limit (49 152), so real-world contracts
/// are never affected, while oversized synthetic tests above this threshold
/// (64 KiB+) are caught.

Copilot uses AI. Check for mistakes.
Comment on lines 318 to 321
JITResult.ShouldFallback =
BytecodeSize > MAX_JIT_BYTECODE_SIZE ||
JITResult.MirEstimate > MAX_JIT_MIR_ESTIMATE ||
JITResult.BytecodeSize > MAX_JIT_BYTECODE_SIZE ||
JITResult.MaxConsecutiveExpensive > MAX_CONSECUTIVE_RA_EXPENSIVE ||
JITResult.MaxBlockExpensiveCount > MAX_BLOCK_RA_EXPENSIVE ||
Copy link

Copilot AI Feb 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says the bytecode-size guard was removed from the fallback condition, but ShouldFallback still checks JITResult.BytecodeSize > MAX_JIT_BYTECODE_SIZE. Either update the PR description to reflect the retained size guard, or remove the size guard here if that’s the intended behavior change.

Copilot uses AI. Check for mistakes.
/// (24 576) and the EIP-3860 initcode limit (49 152), so real-world contracts
/// are never affected, while pathological synthetic tests (400 KB+) are caught.
static constexpr size_t MAX_JIT_BYTECODE_SIZE = 0x10000;
static constexpr size_t MAX_JIT_MIR_ESTIMATE = 50000;
Copy link

Copilot AI Feb 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAX_JIT_MIR_ESTIMATE remains defined (and MirEstimate is still computed) but it no longer influences ShouldFallback. If MIR estimate is still meant for diagnostics, consider documenting that it is informational-only; otherwise remove the unused threshold to avoid confusion and dead configuration.

Suggested change
static constexpr size_t MAX_JIT_MIR_ESTIMATE = 50000;

Copilot uses AI. Check for mistakes.
Comment on lines +312 to 322
// Compute final fallback verdict.
// Pattern-based thresholds target pathological LLVM register-allocator
// cases (dense RA-expensive opcodes, DUP feedback loops).
// The bytecode size guard (64 KB) catches extreme synthetic tests whose
// sheer IR volume stalls LLVM, without affecting real contracts (bounded
// by EIP-170 / EIP-3860).
JITResult.ShouldFallback =
BytecodeSize > MAX_JIT_BYTECODE_SIZE ||
JITResult.MirEstimate > MAX_JIT_MIR_ESTIMATE ||
JITResult.BytecodeSize > MAX_JIT_BYTECODE_SIZE ||
JITResult.MaxConsecutiveExpensive > MAX_CONSECUTIVE_RA_EXPENSIVE ||
JITResult.MaxBlockExpensiveCount > MAX_BLOCK_RA_EXPENSIVE ||
JITResult.DupFeedbackPatternCount > MAX_DUP_FEEDBACK_PATTERN;
Copy link

Copilot AI Feb 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback decision logic changed materially (removed MIR-estimate check; adjusted bytecode-size threshold). There are existing GTest suites under src/tests, but none appear to cover EVMAnalyzer suitability decisions. Adding unit tests for representative bytecode patterns (e.g., high RA-expensive density vs. benign PUSH-heavy code) would prevent regressions in these thresholds.

Copilot uses AI. Check for mistakes.
Comment on lines +160 to +168
auto CacheIt = VM->FallbackCache.find(ModKey);
bool NeedFallback;
if (CacheIt != VM->FallbackCache.end()) {
NeedFallback = CacheIt->second;
} else {
COMPILER::EVMAnalyzer Analyzer(Rev);
Analyzer.analyze(Code, CodeSize);
NeedFallback = Analyzer.getJITSuitability().ShouldFallback;
VM->FallbackCache[ModKey] = NeedFallback;
Copy link

Copilot AI Feb 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On cache miss, this does a find() and then inserts via FallbackCache[ModKey] = NeedFallback, which performs a second hash lookup and may allocate a default entry first. Since this code is on a hot path for first-seen contracts, consider using emplace/try_emplace with the existing lookup result to avoid the extra work.

Suggested change
auto CacheIt = VM->FallbackCache.find(ModKey);
bool NeedFallback;
if (CacheIt != VM->FallbackCache.end()) {
NeedFallback = CacheIt->second;
} else {
COMPILER::EVMAnalyzer Analyzer(Rev);
Analyzer.analyze(Code, CodeSize);
NeedFallback = Analyzer.getJITSuitability().ShouldFallback;
VM->FallbackCache[ModKey] = NeedFallback;
auto [CacheIt, Inserted] = VM->FallbackCache.try_emplace(ModKey, false);
bool &NeedFallback = CacheIt->second;
if (Inserted) {
COMPILER::EVMAnalyzer Analyzer(Rev);
Analyzer.analyze(Code, CodeSize);
NeedFallback = Analyzer.getJITSuitability().ShouldFallback;

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants