Ahead microtage index perf by CJ362ff · Pull Request #741 · OpenXiangShan/GEM5

CJ362ff · 2026-01-28T05:19:42Z

Summary by CodeRabbit

New Features
- Added a production-ready MicroTAGE branch predictor with full prediction/update lifecycle, folded-history support, bank-aware handling, and extensive instrumentation.
Configuration Changes
- MicroTAGE is now a selectable branch predictor option.
- Example configurations updated to demonstrate MicroTAGE enabled/disabled; one example includes a new commented toggle and another now shows MicroTAGE enabled by default.

…cases

Change-Id: I44b6175022a3a593ed385407bd95ec0c40c74642

Change-Id: I16ce79a7d8488d9a138d0a26f5500576ae54132e

Change-Id: I8be037a1cdacd7151f4fe8e743a32cca90a85036

Change-Id: I6b9c92c18c574a12c19532ae0b894e64c1187342

Change-Id: I953965b8e6feb1c15a238ac832d65bc16b32f496

Change-Id: I39cd471d452aa343c3dd741a80fdfa7d126e3a9f

Change-Id: I3567ae93652aac218c5b4646003abadddaf7cf32

Change-Id: If7c9d3aa68a23c36dde74d8cf3a286c9c48f3e3c

Change-Id: I83277ae5c801e9d22b594286580459d12cdec69b

Change-Id: I394246af184d3f07e02b85f06e4e5ceed368ec22

Change-Id: I762c11f8d15262fb9f1c9d443f77895fa76bbc79

Change-Id: Ia9bcdc028235447e254889d95e5ea98e7f067664

Change-Id: I56614e8ebc2dd33320d353562087ab456fc452da

Change-Id: Icdc7ac7f047a36dba6733a0e1d11e5e37aa4cdf1

Change-Id: I9cb7e7ef04efefbcbdbf705563563f50f3c83324

coderabbitai · 2026-01-28T05:20:07Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds a full MicroTAGE BTB predictor (header + implementation), exposes it to Python bindings and build system, switches DecoupledBPUWithBTB to use the new MicroTAGE type, and updates two example configs (one enabling MicroTAGE, one comment-only).

Changes

Cohort / File(s)	Summary
Example configs `configs/example/idealkmhv3.py`, `configs/example/kmhv3.py`	`kmhv3.py` enables `cpu.branchPred.microtage.enabled`; `idealkmhv3.py` adds a comment referencing the same flag (no functional change).
Python predictor bindings `src/cpu/pred/BranchPredictor.py`	Introduce full `MicroTAGE` Param class with many public configuration fields and switch `DecoupledBPUWithBTB.microtage` to `Param.MicroTAGE`.
Build rules `src/cpu/pred/SConscript`	Add `MicroTAGE` to BpType list and include `Source('btb/microtage.cc')` in build sources.
Decoupled BPU header `src/cpu/pred/btb/decoupled_bpred.hh`	Change `microtage` member type from `BTBTAGE` to `MicroTAGE` and adjust header includes.
MicroTAGE implementation `src/cpu/pred/btb/microtage.hh`, `src/cpu/pred/btb/microtage.cc`	Add complete MicroTAGE predictor: class, UNIT_TEST/production paths, prediction/update/allocation lifecycle, folded-history, bank-conflict modeling, tracing hooks, extensive stats, and public API types/methods (many new symbols).

Sequence Diagram(s)

sequenceDiagram
    participant DecoupledBPU as DecoupledBPU
    participant MicroTAGE as MicroTAGE
    participant TageTables as TAGE_Tables
    participant FoldHist as FoldedHistory

    rect rgba(100,150,255,0.5)
    Note over DecoupledBPU,MicroTAGE: Prediction Flow
    DecoupledBPU->>MicroTAGE: putPCHistory(startAddr, history)
    MicroTAGE->>FoldHist: compute folded indices/tags
    FoldHist-->>MicroTAGE: folded indices/tags
    MicroTAGE->>TageTables: lookupHelper(entries)
    TageTables-->>MicroTAGE: per-table predictions
    MicroTAGE-->>DecoupledBPU: TagePrediction + meta
    end

    rect rgba(200,100,100,0.5)
    Note over DecoupledBPU,MicroTAGE: Update/Allocation Flow
    DecoupledBPU->>MicroTAGE: prepareUpdateEntries(fetchStream)
    MicroTAGE-->>DecoupledBPU: candidate entries
    DecoupledBPU->>MicroTAGE: updatePredictorStateAndCheckAllocation(...)
    MicroTAGE->>TageTables: handleNewEntryAllocation(...)
    TageTables-->>MicroTAGE: allocation result
    MicroTAGE->>MicroTAGE: updateLRU(), doUpdateHist()
    MicroTAGE-->>DecoupledBPU: resolved update / allocation status
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

cpu-o3: fix bugs https://github.com/OpenXiangShan/GEM5/pull/595 #613 — touches MicroTAGE integration and decoupled BPU wiring (folded-history runtime checks).
cpu-o3: split decoupled bpu stats helpers #624 — modifies DecoupledBPUWithBTB predictor wiring and build entries.
Test align #612 — changes kmhv3 config microtage setting (config alignment with this PR).

Suggested labels

perf

Suggested reviewers

jensen-yan
Yakkhini

Poem

🐰 I hopped through tags and folded bits,
Banks and counters clicked in skits,
I wired a class and added rows,
History folds where prediction grows,
MicroTAGE hums — the branch seed sows.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 56.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Ahead microtage index perf' is vague and lacks clarity about what specific change was made; it uses non-descriptive terms that don't convey meaningful information about the changeset.	Revise the title to be more specific and descriptive, such as 'Implement MicroTAGE branch predictor with folded history and bank-conflict modeling' or 'Add MicroTAGE index folded history recovery optimization' to better reflect the actual changes.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch ahead-microtage-index-perf

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 6

🤖 Fix all issues with AI agents

In `@src/cpu/pred/BranchPredictor.py`:
- Around line 1063-1089: The MicroTAGE Python config defines an unused Param
named baseTableSize causing confusion; remove the baseTableSize
Param.Unsigned(256, ...) declaration from the MicroTAGE class so the Python
config matches the C++ constructor (microtage.cc) which does not consume
baseTableSize; ensure no other references to baseTableSize exist in this file or
nearby configs and run lint/tests to confirm no breakage.

In `@src/cpu/pred/btb/decoupled_bpred.hh`:
- Around line 20-23: The file contains a duplicated include of the header
btb_mgsc.hh in decoupled_bpred.hh; remove the redundant `#include`
"cpu/pred/btb/btb_mgsc.hh" so the header is only included once (leave the single
correct include and delete the duplicate entry).

In `@src/cpu/pred/btb/microtage.cc`:
- Around line 96-104: The code currently calls
tableTagBits.resize(numPredictors) which zero‑fills missing entries and defeats
the later size validation, allowing 0‑bit tags and breaking tag/index math
(e.g., tableTagBits[i]-1 in folding). Before resizing any predictor tables,
validate that the provided tableTagBits configuration has at least numPredictors
entries (or throw/assert with a clear error); only after that validated check,
perform the resize of the predictor containers (e.g., tageTable, tableIndexBits,
tableIndexMasks, tableTagBits, tableTagMasks) or copy values from the validated
config so you never create implicit 0 entries for tableTagBits.
- Around line 17-25: The UNIT_TEST build fails because ceilLog2 and floorLog2
are used in the microtage constructor but base/intmath.hh is only included
inside the `#ifndef` UNIT_TEST block; move the include for "base/intmath.hh" out
of the `#ifndef` UNIT_TEST guard so ceilLog2 and floorLog2 are available in both
UNIT_TEST and non-UNIT_TEST builds, leaving other debug includes (e.g.,
"base/debug_helper.hh", "base/trace.hh", "debug/TAGE.hh") inside the guard as
appropriate.
- Around line 629-635: The code uses predMeta->preds[btb_entry.pc] which can
implicitly insert a default TagePrediction; instead, check for existence with
predMeta->preds.find(btb_entry.pc) and only use the stored prediction if the key
exists, otherwise call generateSinglePrediction(btb_entry, startAddr, predMeta)
(same fallback used when updateOnRead is true); update the branch that assigns
recomputed so it uses find() on predMeta->preds and falls back to
generateSinglePrediction when the find() lookup fails.

In `@src/cpu/pred/btb/microtage.hh`:
- Around line 4-8: The header is missing direct includes for types it uses; add
`#include` <memory>, `#include` <queue>, and `#include` <unordered_map> to the top of
microtage.hh so std::shared_ptr, std::queue and std::unordered_map are declared
rather than relying on transitive includes—this will fix usages of
std::shared_ptr (e.g., pointers to tag entries), std::queue (any queue
members/usages), and std::unordered_map (hash tables used by the predictor)
referenced in the file.

coderabbitai · 2026-01-28T05:32:10Z

src/cpu/pred/BranchPredictor.py

+class MicroTAGE(TimedBaseBTBPredictor):
+    """Micro-sized BTB TAGE predictor used alongside uBTB"""
+    type = 'MicroTAGE'
+    cxx_class = 'gem5::branch_prediction::btb_pred::MicroTAGE'
+    cxx_header = "cpu/pred/btb/microtage.hh"
+
+    needMoreHistories = Param.Bool(True, "MicroTAGE needs more histories")
+    enableSC = Param.Bool(False, "Enable SC or not")
+    updateOnRead = Param.Bool(True,"Enable update on read, no need to save tage meta in FTQ")
+    # Keep vector parameters consistent with numPredictors to avoid constructor asserts.
+    numPredictors = Param.Unsigned(1, "Number of TAGE predictors")
+    tableSizes = VectorParam.Unsigned([512],"the TAGE T0~Tn length")
+    TTagBitSizes = VectorParam.Unsigned([16] ,"the T0~Tn entry's tag bit size")
+    TTagPcShifts = VectorParam.Unsigned([1] ,"when the T0~Tn entry's tag generating, PC right shift")
+    blockSize = Param.Unsigned(32,"tage index function uses 32B aligned block address")
+
+    histLengths = VectorParam.Unsigned([16],"the BTB TAGE T0~Tn history length")
+    maxHistLen = Param.Unsigned(970,"The length of history passed from DBP")
+    numTablesToAlloc = Param.Unsigned(1,"The number of table to allocated each time")
+    numWays = Param.Unsigned(2, "Number of ways per set")
+    baseTableSize = Param.Unsigned(256,"Base table size")
+    maxBranchPositions = Param.Unsigned(32,"Maximum branch positions per 64-byte block")
+    useAltOnNaSize = Param.Unsigned(128,"Size of the useAltOnNa table")
+    useAltOnNaWidth = Param.Unsigned(7,"Width of the useAltOnNa table")
+    numBanks = Param.Unsigned(4,"Number of banks for bank conflict simulation")
+    enableBankConflict = Param.Bool(False,"Enable bank conflict simulation")
+    numDelay = Param.Unsigned(0,"Prediction latency in cycles")


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

rg -n 'baseTableSize' src/cpu/pred/btb/microtage.cc src/cpu/pred/btb/microtage.hh 2>/dev/null || echo "baseTableSize not found in specified files"

Repository: OpenXiangShan/GEM5

Length of output: 104

🏁 Script executed:

# Search for baseTableSize anywhere in the btb directory rg 'baseTableSize' src/cpu/pred/btb/ 2>/dev/null | head -20

Repository: OpenXiangShan/GEM5

Length of output: 44

🏁 Script executed:

# Check if BTBTAGE has baseTableSize parameter rg -A 5 'class BTBTAGE' src/cpu/pred/btb/btbtage.hh 2>/dev/null | head -40

Repository: OpenXiangShan/GEM5

Length of output: 44

🏁 Script executed:

# List files in the btb directory find src/cpu/pred/btb -type f -name "*.cc" -o -name "*.hh" 2>/dev/null

Repository: OpenXiangShan/GEM5

Length of output: 1412

🏁 Script executed:

# Search entire codebase for baseTableSize rg 'baseTableSize' src/ 2>/dev/null

Repository: OpenXiangShan/GEM5

Length of output: 448

🏁 Script executed:

# Check BranchPredictor.py for BTBTAGE and TimedBaseBTBPredictor definitions rg -n 'class BTBTAGE|class TimedBaseBTBPredictor' src/cpu/pred/BranchPredictor.py 2>/dev/null

Repository: OpenXiangShan/GEM5

Length of output: 148

🏁 Script executed:

# Check BTBTAGE class definition in BranchPredictor.py - look at surrounding context sed -n '1038,1065p' src/cpu/pred/BranchPredictor.py

Repository: OpenXiangShan/GEM5

Length of output: 1755

🏁 Script executed:

# Check btb_tage.cc for baseTableSize usage rg -n 'baseTableSize' src/cpu/pred/btb/btb_tage.cc src/cpu/pred/btb/btb_tage.hh 2>/dev/null || echo "Not found in btb_tage"

Repository: OpenXiangShan/GEM5

Length of output: 83

🏁 Script executed:

# Check if baseTableSize is used anywhere in BTB constructors - broaden search rg -B 5 -A 10 'baseTableSize' src/cpu/pred/btb/*.cc 2>/dev/null | head -50

Repository: OpenXiangShan/GEM5

Length of output: 44

🏁 Script executed:

# Check microtage.hh header - look for class definition and members cat src/cpu/pred/btb/microtage.hh | head -80

Repository: OpenXiangShan/GEM5

Length of output: 2374

🏁 Script executed:

# Check microtage.cc constructor - look for parameter handling rg -A 30 'MicroTAGE::MicroTAGE' src/cpu/pred/btb/microtage.cc 2>/dev/null | head -50

Repository: OpenXiangShan/GEM5

Length of output: 1818

Remove the unused baseTableSize parameter from the MicroTAGE Python configuration.

The baseTableSize parameter defined at line 1083 is not consumed by the C++ MicroTAGE constructor. The constructor initializer list (in microtage.cc) only processes numPredictors, tableSizes, TTagBitSizes, TTagPcShifts, histLengths, maxHistLen, numWays, maxBranchPositions, numTablesToAlloc, updateOnRead, numBanks, blockSize, and enableBankConflict. The baseTableSize parameter appears to be a copy-paste artifact from the FTB TAGE implementation, where it is actually used. Remove it from the MicroTAGE class definition to avoid confusion and prevent future developers from expecting this parameter to have an effect.

🧰 Tools

🪛 Ruff (0.14.14)

1069-1069: Param may be undefined, or defined from star imports

(F405)

1070-1070: Param may be undefined, or defined from star imports

(F405)

1071-1071: Param may be undefined, or defined from star imports

(F405)

1073-1073: Param may be undefined, or defined from star imports

(F405)

1074-1074: VectorParam may be undefined, or defined from star imports

(F405)

1075-1075: VectorParam may be undefined, or defined from star imports

(F405)

1076-1076: VectorParam may be undefined, or defined from star imports

(F405)

1077-1077: Param may be undefined, or defined from star imports

(F405)

1079-1079: VectorParam may be undefined, or defined from star imports

(F405)

1080-1080: Param may be undefined, or defined from star imports

(F405)

1081-1081: Param may be undefined, or defined from star imports

(F405)

1082-1082: Param may be undefined, or defined from star imports

(F405)

1083-1083: Param may be undefined, or defined from star imports

(F405)

1084-1084: Param may be undefined, or defined from star imports

(F405)

1085-1085: Param may be undefined, or defined from star imports

(F405)

1086-1086: Param may be undefined, or defined from star imports

(F405)

1087-1087: Param may be undefined, or defined from star imports

(F405)

1088-1088: Param may be undefined, or defined from star imports

(F405)

1089-1089: Param may be undefined, or defined from star imports

(F405)

🤖 Prompt for AI Agents

In `@src/cpu/pred/BranchPredictor.py` around lines 1063 - 1089, The MicroTAGE Python config defines an unused Param named baseTableSize causing confusion; remove the baseTableSize Param.Unsigned(256, ...) declaration from the MicroTAGE class so the Python config matches the C++ constructor (microtage.cc) which does not consume baseTableSize; ensure no other references to baseTableSize exist in this file or nearby configs and run lint/tests to confirm no breakage.

coderabbitai · 2026-01-28T05:32:10Z

src/cpu/pred/btb/decoupled_bpred.hh

+#include "cpu/pred/btb/btb_mgsc.hh"
 #include "cpu/pred/btb/btb_tage.hh"
 #include "cpu/pred/btb/btb_ubtb.hh"
 #include "cpu/pred/btb/btb_mgsc.hh"


⚠️ Potential issue | 🟡 Minor

Remove duplicate #include for btb_mgsc.hh.

The header btb_mgsc.hh is included twice (lines 20 and 23). Remove the duplicate to avoid redundant preprocessing.

Suggested fix

`#include` "cpu/pred/btb/btb_mgsc.hh" `#include` "cpu/pred/btb/btb_tage.hh" `#include` "cpu/pred/btb/btb_ubtb.hh" -#include "cpu/pred/btb/btb_mgsc.hh" `#include` "cpu/pred/btb/jump_ahead_predictor.hh"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

#include "cpu/pred/btb/btb_mgsc.hh"

#include "cpu/pred/btb/btb_tage.hh"

#include "cpu/pred/btb/btb_ubtb.hh"

#include "cpu/pred/btb/btb_mgsc.hh"

`#include` "cpu/pred/btb/btb_mgsc.hh"

`#include` "cpu/pred/btb/btb_tage.hh"

`#include` "cpu/pred/btb/btb_ubtb.hh"

🤖 Prompt for AI Agents

In `@src/cpu/pred/btb/decoupled_bpred.hh` around lines 20 - 23, The file contains a duplicated include of the header btb_mgsc.hh in decoupled_bpred.hh; remove the redundant `#include` "cpu/pred/btb/btb_mgsc.hh" so the header is only included once (leave the single correct include and delete the duplicate entry).

coderabbitai · 2026-01-28T05:32:10Z

src/cpu/pred/btb/microtage.cc

+#ifndef UNIT_TEST
+#include "base/debug_helper.hh"
+#include "base/intmath.hh"
+#include "base/trace.hh"
+#include "base/types.hh"
+#include "cpu/o3/dyn_inst.hh"
+#include "debug/TAGE.hh"
+
+#endif


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Search for ceilLog2/floorLog2 usage in the file echo "=== Searching for ceilLog2/floorLog2 usage ===" rg -n "ceilLog2|floorLog2" src/cpu/pred/btb/microtage.cc echo "" echo "=== Searching for intmath.hh include ===" rg -n "intmath.hh" src/cpu/pred/btb/microtage.cc echo "" echo "=== Checking file structure around includes (lines 1-50) ===" head -50 src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 1706

UNIT_TEST builds fail to compile due to missing ceilLog2/floorLog2 declarations.

The UNIT_TEST constructor (line 45–48) uses ceilLog2 and floorLog2, but base/intmath.hh is only included within the #ifndef UNIT_TEST guard (line 19). When UNIT_TEST is defined, this include is skipped, causing undefined symbol errors at compilation.

Fix

-#ifndef UNIT_TEST -#include "base/debug_helper.hh" -#include "base/intmath.hh" -#include "base/trace.hh" +#include "base/intmath.hh" +#ifndef UNIT_TEST +#include "base/debug_helper.hh" +#include "base/trace.hh" `#include` "base/types.hh" `#include` "cpu/o3/dyn_inst.hh" `#include` "debug/TAGE.hh"

Move base/intmath.hh outside the #ifndef UNIT_TEST block to ensure it's available in both UNIT_TEST and production builds.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

#ifndef UNIT_TEST

#include "base/debug_helper.hh"

#include "base/intmath.hh"

#include "base/trace.hh"

#include "base/types.hh"

#include "cpu/o3/dyn_inst.hh"

#include "debug/TAGE.hh"

#endif

`#include` "base/intmath.hh"

`#ifndef` UNIT_TEST

`#include` "base/debug_helper.hh"

`#include` "base/trace.hh"

`#include` "base/types.hh"

`#include` "cpu/o3/dyn_inst.hh"

`#include` "debug/TAGE.hh"

`#endif`

🤖 Prompt for AI Agents

In `@src/cpu/pred/btb/microtage.cc` around lines 17 - 25, The UNIT_TEST build fails because ceilLog2 and floorLog2 are used in the microtage constructor but base/intmath.hh is only included inside the `#ifndef` UNIT_TEST block; move the include for "base/intmath.hh" out of the `#ifndef` UNIT_TEST guard so ceilLog2 and floorLog2 are available in both UNIT_TEST and non-UNIT_TEST builds, leaving other debug includes (e.g., "base/debug_helper.hh", "base/trace.hh", "debug/TAGE.hh") inside the guard as appropriate.

coderabbitai · 2026-01-28T05:32:10Z

src/cpu/pred/btb/microtage.cc

+    tageTable.resize(numPredictors);
+    tableIndexBits.resize(numPredictors);
+    tableIndexMasks.resize(numPredictors);
+    tableTagBits.resize(numPredictors);
+    tableTagMasks.resize(numPredictors);
+    // Ensure PC shift vector has entries for all predictors (fallback default = 1)
+    if (tablePcShifts.size() < numPredictors) {
+        tablePcShifts.resize(numPredictors, 1);
+    }


⚠️ Potential issue | 🟠 Major

Don’t resize tableTagBits before validating config size.

tableTagBits.resize(numPredictors) zero‑fills missing entries and makes the later size assertion ineffective. That can silently produce 0‑bit tags (and tableTagBits[i]-1 in folding), which breaks tag/index math.

🩹 Proposed fix

- tableTagBits.resize(numPredictors); + // Avoid zero-fill masking config errors; validate instead. + assert(tableTagBits.size() >= numPredictors);

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

tageTable.resize(numPredictors);

tableIndexBits.resize(numPredictors);

tableIndexMasks.resize(numPredictors);

tableTagBits.resize(numPredictors);

tableTagMasks.resize(numPredictors);

// Ensure PC shift vector has entries for all predictors (fallback default = 1)

if (tablePcShifts.size() < numPredictors) {

tablePcShifts.resize(numPredictors, 1);

}

tageTable.resize(numPredictors);

tableIndexBits.resize(numPredictors);

tableIndexMasks.resize(numPredictors);

// Avoid zero-fill masking config errors; validate instead.

assert(tableTagBits.size() >= numPredictors);

tableTagMasks.resize(numPredictors);

// Ensure PC shift vector has entries for all predictors (fallback default = 1)

if (tablePcShifts.size() < numPredictors) {

tablePcShifts.resize(numPredictors, 1);

}

🤖 Prompt for AI Agents

In `@src/cpu/pred/btb/microtage.cc` around lines 96 - 104, The code currently calls tableTagBits.resize(numPredictors) which zero‑fills missing entries and defeats the later size validation, allowing 0‑bit tags and breaking tag/index math (e.g., tableTagBits[i]-1 in folding). Before resizing any predictor tables, validate that the provided tableTagBits configuration has at least numPredictors entries (or throw/assert with a clear error); only after that validated check, perform the resize of the predictor containers (e.g., tageTable, tableIndexBits, tableIndexMasks, tableTagBits, tableTagMasks) or copy values from the validated config so you never create implicit 0 entries for tableTagBits.

src/cpu/pred/btb/microtage.cc

coderabbitai · 2026-01-28T05:32:10Z

src/cpu/pred/btb/microtage.hh

+#include <cstdint>
+#include <deque>
+#include <map>
+#include <utility>
+#include <vector>


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify STL types used in the header and check for local includes. rg -n "std::shared_ptr|std::queue|std::unordered_map" src/cpu/pred/btb/microtage.hh echo "---" rg -nE "#include <(memory|queue|unordered_map)>" src/cpu/pred/btb/microtage.hh

Repository: OpenXiangShan/GEM5

Length of output: 602

🏁 Script executed:

#!/bin/bash # Check if the three headers are already included grep -n "#include <memory>" src/cpu/pred/btb/microtage.hh || echo "No memory header" grep -n "#include <queue>" src/cpu/pred/btb/microtage.hh || echo "No queue header" grep -n "#include <unordered_map>" src/cpu/pred/btb/microtage.hh || echo "No unordered_map header" echo "---" # Show the full include section head -15 src/cpu/pred/btb/microtage.hh

Repository: OpenXiangShan/GEM5

Length of output: 494

Add missing standard headers for std::shared_ptr, std::queue, and std::unordered_map.

This header uses these types (lines 44, 126, 298, 384, 398, 414, 424) but doesn't include the corresponding headers. This relies on transitive includes and can break builds when dependencies change.

Suggested fix

`#include` <cstdint> `#include` <deque> `#include` <map> +#include <memory> +#include <queue> +#include <unordered_map> `#include` <utility> `#include` <vector>

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

#include <cstdint>

#include <deque>

#include <map>

#include <utility>

#include <vector>

`#include` <cstdint>

`#include` <deque>

`#include` <map>

`#include` <memory>

`#include` <queue>

`#include` <unordered_map>

`#include` <utility>

`#include` <vector>

🤖 Prompt for AI Agents

In `@src/cpu/pred/btb/microtage.hh` around lines 4 - 8, The header is missing direct includes for types it uses; add `#include` <memory>, `#include` <queue>, and `#include` <unordered_map> to the top of microtage.hh so std::shared_ptr, std::queue and std::unordered_map are declared rather than relying on transitive includes—this will fix usages of std::shared_ptr (e.g., pointers to tag entries), std::queue (any queue members/usages), and std::unordered_map (hash tables used by the predictor) referenced in the file.

… history queue

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@src/cpu/pred/btb/microtage.cc`:
- Around line 125-127: The code can construct a PathFoldedHist with a negative
width when tableTagBits[i] is 0; update the loop that pushes into
tagFoldedHist/altTagFoldedHist/indexFoldedHist to validate/clamp the tag-bit
counts before constructing PathFoldedHist: compute a non-negative altBits (e.g.,
altBits = max(0, (int)tableTagBits[i] - 1)) or only push an altTagFoldedHist
when tableTagBits[i] > 0, and use that safe value when calling PathFoldedHist;
reference the vectors tagFoldedHist, altTagFoldedHist, indexFoldedHist and the
tableTagBits array to locate the change.
- Around line 724-738: getTageTag (and similarly getTageIndex) currently
computes mask with (1ULL << tableTagBits[t]) - 1 which is undefined when
tableTagBits[t] >= 64; fix by guarding the shift: if tableTagBits[t] is >= 64
(or >= sizeof(Addr)*8) set mask to all-ones for Addr (e.g. ~Addr(0)) otherwise
compute (Addr(1) << tableTagBits[t]) - 1, and use that mask for
pcBits/foldedBits; ensure the mask type matches Addr to avoid narrower-width
issues and apply the same change in getTageIndex.

In `@src/cpu/pred/btb/microtage.hh`:
- Around line 254-256: The declaration of setTag(Addr &dest, Addr src, int
table) in MicroTAGE is missing its implementation; either remove the declaration
or add a definition in the .cc file as MicroTAGE::setTag(Addr &dest, Addr src,
int table). If you implement it, make it compute the same tag used elsewhere
(reuse existing helpers like getTag or computeIndex/computeTag if present) by
extracting the appropriate bits from src for the given table (using the class's
tagBits/tagMask/tagShift members or the same logic as getTag), and assign the
masked value into dest; ensure the signature and class scope match the
declaration.
- Around line 372-373: Implement the missing method
MicroTAGE::recoverFoldedHist(const bitset& history) in the .cc file to match the
declaration in microtage.hh; the implementation should restore the predictor's
internal folded-history state from the supplied bitset by recomputing all
folded-history registers used by MicroTAGE (e.g., the foldedHist / folded
history arrays or structures) using the same folding algorithm/helpers already
present in this class (call existing fold/update helpers if available, otherwise
recompute each folded value by applying the class’s fold/rotate/xor logic across
the bits of history). Ensure the function signature matches exactly and updates
the member variables that other methods (predict/update) rely on so the
predictor recovers correctly after a misprediction.

🧹 Nitpick comments (5)

src/cpu/pred/btb/microtage.cc (3)
328-333: Signed/unsigned comparison in loop condition.

At line 328, int s is compared with stagePreds.size() (which returns size_t). Consider using size_t or casting to avoid potential compiler warnings:
-    for (int s = getDelay(); s < stagePreds.size(); s++) {
+    for (size_t s = static_cast<size_t>(getDelay()); s < stagePreds.size(); s++) {
908-914: Remove unused variables buf2 and buf3.

At line 910, std::string buf2, buf3; are declared but never used.
         for (int type = 0; type < 3; type++) {
-            std::string buf2, buf3;
             auto &foldedHist = type == 0 ? indexFoldedHist[t] : type == 1 ? tagFoldedHist[t] : altTagFoldedHist[t];
             foldedHist.check(hist);
         }
999-1030: updateLRU and getLRUVictim appear unused.

These LRU methods are defined but the allocation logic (line 496) explicitly states "no LFSR, no per-way table gating" and doesn't call them. Consider removing if not needed, or documenting their intended future use.
#!/bin/bash
# Verify if updateLRU or getLRUVictim are called anywhere in the codebase
echo "=== Searching for updateLRU calls ==="
rg -n "updateLRU\s*\(" --type cpp

echo ""
echo "=== Searching for getLRUVictim calls ==="
rg -n "getLRUVictim\s*\(" --type cpp
src/cpu/pred/btb/microtage.hh (2)
100-101: Misleading comment for mainprovided field.

Line 100 states "Whether to use alternative prediction, true if main is weak or no main prediction" but the field is named mainprovided and the code uses it as true when main IS provided (see line 972 in .cc: bool useAlt = pred.mainprovided ? false : true).

Consider fixing the comment:
-        bool mainprovided;    // Whether to use alternative prediction, true if main is weak or no main prediction
+        bool mainprovided;    // Whether main prediction is provided (false falls back to base prediction)
370-375: Redundant public: specifiers.

Lines 370 and 375 both have public: access specifiers in sequence. Remove one:
 public:
 
     // Recover folded history after misprediction
     void recoverFoldedHist(const bitset& history);
 
-public:
-
-
     // Metadata for TAGE prediction

src/cpu/pred/btb/microtage.cc

coderabbitai · 2026-01-28T06:25:59Z

src/cpu/pred/btb/microtage.cc

+Addr
+MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, Addr position)
+{
+    // Create mask for tableTagBits[t] to limit result size
+    Addr mask = (1ULL << tableTagBits[t]) - 1;
+
+    // Extract lower bits of PC directly (remove instruction alignment bits)
+    Addr pcBits = (pc >> bankBaseShift) & mask;
+
+    // Extract and prepare folded history bits
+    Addr foldedBits = foldedHist & mask;
+
+    // XOR all components together, including position (like RTL)
+    return pcBits ^ foldedBits ^ position;
+}


⚠️ Potential issue | 🟠 Major

Undefined behavior when tableTagBits[t] >= 64.

At line 728, (1ULL << tableTagBits[t]) - 1 causes undefined behavior if tableTagBits[t] is 64 or greater. The same issue exists in getTageIndex at line 744.

Consider clamping or validating the bit width:

Proposed fix

Addr MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, Addr position) { // Create mask for tableTagBits[t] to limit result size - Addr mask = (1ULL << tableTagBits[t]) - 1; + Addr mask = tableTagBits[t] >= 64 ? ~0ULL : (1ULL << tableTagBits[t]) - 1;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Addr

MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, Addr position)

{

// Create mask for tableTagBits[t] to limit result size

Addr mask = (1ULL << tableTagBits[t]) - 1;

// Extract lower bits of PC directly (remove instruction alignment bits)

Addr pcBits = (pc >> bankBaseShift) & mask;

// Extract and prepare folded history bits

Addr foldedBits = foldedHist & mask;

// XOR all components together, including position (like RTL)

return pcBits ^ foldedBits ^ position;

}

Addr

MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, Addr position)

{

// Create mask for tableTagBits[t] to limit result size

Addr mask = tableTagBits[t] >= 64 ? ~0ULL : (1ULL << tableTagBits[t]) - 1;

// Extract lower bits of PC directly (remove instruction alignment bits)

Addr pcBits = (pc >> bankBaseShift) & mask;

// Extract and prepare folded history bits

Addr foldedBits = foldedHist & mask;

// XOR all components together, including position (like RTL)

return pcBits ^ foldedBits ^ position;

}

🤖 Prompt for AI Agents

In `@src/cpu/pred/btb/microtage.cc` around lines 724 - 738, getTageTag (and similarly getTageIndex) currently computes mask with (1ULL << tableTagBits[t]) - 1 which is undefined when tableTagBits[t] >= 64; fix by guarding the shift: if tableTagBits[t] is >= 64 (or >= sizeof(Addr)*8) set mask to all-ones for Addr (e.g. ~Addr(0)) otherwise compute (Addr(1) << tableTagBits[t]) - 1, and use that mask for pcBits/foldedBits; ensure the mask type matches Addr to avoid narrower-width issues and apply the same change in getTageIndex.

src/cpu/pred/btb/microtage.hh

coderabbitai · 2026-01-28T06:25:59Z

src/cpu/pred/btb/microtage.hh

+    // Recover folded history after misprediction
+    void recoverFoldedHist(const bitset& history);


⚠️ Potential issue | 🟡 Minor

recoverFoldedHist is declared but not implemented.

The method recoverFoldedHist(const bitset& history) is declared at line 373 but has no implementation in the .cc file.

#!/bin/bash # Verify recoverFoldedHist is not implemented echo "=== Searching for recoverFoldedHist implementation ===" rg -n "MicroTAGE::recoverFoldedHist" --type cpp

🤖 Prompt for AI Agents

In `@src/cpu/pred/btb/microtage.hh` around lines 372 - 373, Implement the missing method MicroTAGE::recoverFoldedHist(const bitset& history) in the .cc file to match the declaration in microtage.hh; the implementation should restore the predictor's internal folded-history state from the supplied bitset by recomputing all folded-history registers used by MicroTAGE (e.g., the foldedHist / folded history arrays or structures) using the same folding algorithm/helpers already present in this class (call existing fold/update helpers if available, otherwise recompute each folded value by applying the class’s fold/rotate/xor logic across the bits of history). Ensure the function signature matches exactly and updates the member variables that other methods (predict/update) rely on so the predictor recovers correctly after a misprediction.

XiangShanRobot · 2026-01-29T02:21:42Z

[Generated by GEM5 Performance Robot]
commit: add8efc
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	20.25	20.27	-0.07 🔴

[Generated by GEM5 Performance Robot]
commit: add8efc
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	20.25	20.13	+0.60 🟢

Change-Id: Iad486579b9ab0df207013348f02c6be30bd10cfd

XiangShanRobot · 2026-01-29T04:56:53Z

[Generated by GEM5 Performance Robot]
commit: 04aa5dc
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	20.14	20.27	-0.64 🔴

[Generated by GEM5 Performance Robot]
commit: 04aa5dc
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	20.14	20.25	-0.57 🔴

Change-Id: I20e9fbfe161cd741b37bb69d46e99ee7755f79e5

XiangShanRobot · 2026-01-29T08:39:15Z

[Generated by GEM5 Performance Robot]
commit: 4eb6b9b
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	20.25	20.27	-0.06 🔴

[Generated by GEM5 Performance Robot]
commit: 4eb6b9b
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	20.25	20.14	+0.58 🟢

Change-Id: If1187310e575a7c485d49fb89215742b1174393a

XiangShanRobot · 2026-01-29T10:24:35Z

[Generated by GEM5 Performance Robot]
commit: 2c69bbc
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	20.26	20.27	-0.05 🔴

[Generated by GEM5 Performance Robot]
commit: 2c69bbc
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	20.26	20.25	+0.02 🟢

Change-Id: I2b9f9216a54405d517538958b717a8e3de2eca8a

Change-Id: I8c70b06b834fd5e713ef9ac4d3b26ec5b01ce2e6

XiangShanRobot · 2026-02-03T10:11:17Z

[Generated by GEM5 Performance Robot]
commit: 5ebc8f1
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	20.12	20.37	-1.20 🔴

[Generated by GEM5 Performance Robot]
commit: 5ebc8f1
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	20.12	20.26	-0.68 🔴

Change-Id: Ic736ea7cc73abebb3af473510c16d837b4f535f4

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (4)

src/cpu/pred/btb/microtage.cc (4)

17-25: ⚠️ Potential issue | 🟠 Major

Move base/intmath.hh out of the UNIT_TEST guard.

Line 45 and Line 46 use ceilLog2/floorLog2, but Line 19 is excluded under UNIT_TEST, which can break UNIT_TEST builds.

💡 Suggested fix

-#ifndef UNIT_TEST
-#include "base/debug_helper.hh"
-#include "base/intmath.hh"
-#include "base/trace.hh"
+#include "base/intmath.hh"
+#ifndef UNIT_TEST
+#include "base/debug_helper.hh"
+#include "base/trace.hh"
 `#include` "base/types.hh"
 `#include` "cpu/o3/dyn_inst.hh"
 `#include` "debug/TAGE.hh"

Also applies to: 45-48

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/microtage.cc` around lines 17 - 25, The include for
base/intmath.hh is currently inside the `#ifndef` UNIT_TEST guard but ceilLog2 and
floorLog2 are used later even in UNIT_TEST builds; move the line including
"base/intmath.hh" out of the UNIT_TEST guard so ceilLog2/floorLog2 are available
for unit tests (update the include block in microtage.cc so base/intmath.hh is
above or outside the `#ifndef` UNIT_TEST region while keeping other debug-only
includes like base/debug_helper.hh and debug/TAGE.hh inside).

645-647: ⚠️ Potential issue | 🟠 Major

Avoid operator[] on predMeta->preds in update path.

Line 646 can implicitly insert a default TagePrediction, which corrupts metadata for entries that were never predicted.

💡 Suggested fix

-        } else { // otherwise, use the prediction from the prediction-time main/alt
-            recomputed = predMeta->preds[btb_entry.pc];
+        } else { // otherwise, use prediction-time main/alt when available
+            auto it = predMeta->preds.find(btb_entry.pc);
+            if (it != predMeta->preds.end()) {
+                recomputed = it->second;
+            } else {
+                recomputed = generateSinglePrediction(btb_entry, startAddr, predMeta);
+            }
         }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/microtage.cc` around lines 645 - 647, The code uses
operator[] on predMeta->preds (predMeta->preds[btb_entry.pc]) which can
implicitly insert a default TagePrediction and corrupt metadata; change this to
a non-mutating lookup (e.g., use find/contains/at depending on container) and
only read into recomputed if an existing entry for btb_entry.pc is present
(otherwise handle the missing-case explicitly instead of allowing insertion),
referencing predMeta, preds, btb_entry.pc, recomputed and TagePrediction to
locate and update the logic.

744-765: ⚠️ Potential issue | 🔴 Critical

Guard variable-width shifts when building masks (getTageTag/getTageIndex).

Line 745 and Line 764 invoke UB when the shift width is >= bit_width(Addr).

💡 Suggested fix

 Addr
 MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, uint64_t altFoldedHist, Addr position)
 {
-    // Create mask for tableTagBits[t] to limit result size
-    Addr mask = (1ULL << tableTagBits[t]) - 1;
+    const unsigned addrBits = sizeof(Addr) * 8;
+    Addr mask = tableTagBits[t] >= addrBits
+        ? ~Addr(0)
+        : ((Addr(1) << tableTagBits[t]) - 1);
@@
 Addr
 MicroTAGE::getTageIndex(Addr pc, int t, uint64_t foldedHist)
 {
-    // Create mask for tableIndexBits[t] to limit result size
-    Addr mask = (1ULL << tableIndexBits[t]) - 1;
+    const unsigned addrBits = sizeof(Addr) * 8;
+    Addr mask = tableIndexBits[t] >= addrBits
+        ? ~Addr(0)
+        : ((Addr(1) << tableIndexBits[t]) - 1);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/microtage.cc` around lines 744 - 765, The mask creation in
getTageTag and getTageIndex is undefined when shifting by the machine word
width; change the mask logic to guard variable-width shifts: compute bits =
tableTagBits[t] (or tableIndexBits[t]) and if bits == 0 set mask = 0, else if
bits >= (sizeof(Addr)*8) set mask = ~static_cast<Addr>(0), otherwise set mask =
(1ULL << bits) - 1; update both getTageTag and getTageIndex (references: mask,
tableTagBits/tableIndexBits, pcBits, foldedBits, altTagBits, bankBaseShift) so
no shift uses a width >= bit_width(Addr).

96-127: ⚠️ Potential issue | 🟠 Major

Validate tableTagBits before use; current resize masks bad config and can create invalid folded widths.

Line 99 (tableTagBits.resize(numPredictors)) zero-fills/truncates, so Line 120 no longer validates the original config. This can flow into Line 126 (tableTagBits[i] - 1) and produce invalid folded-history widths.

💡 Suggested fix

     tageTable.resize(numPredictors);
     tableIndexBits.resize(numPredictors);
     tableIndexMasks.resize(numPredictors);
-    tableTagBits.resize(numPredictors);
+    assert(tableTagBits.size() >= numPredictors);
     tableTagMasks.resize(numPredictors);
@@
-        assert(tableTagBits.size() >= numPredictors);
+        assert(tableTagBits[i] > 0);
         tableTagMasks[i].resize(tableTagBits[i], true);
@@
-        altTagFoldedHist.push_back(PathFoldedHist((int)histLengths[i], (int)tableTagBits[i]-1, 16));
+        altTagFoldedHist.push_back(
+            PathFoldedHist((int)histLengths[i], (int)tableTagBits[i] - 1, 16));

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/microtage.cc` around lines 96 - 127, The code currently
blindly resizes tableTagBits and then uses its entries (including
tableTagBits[i]-1) to construct folded-history objects, which can produce
zero/negative widths; update initialization to ensure tableTagBits contains
valid positive bit-widths before use: after any resize of
tableTagBits/tableIndexBits/tablePcShifts, verify for each predictor index i
that tableTagBits.size() > i and tableTagBits[i] >= 2 (so tableTagBits[i]-1
>=1), and if an entry is missing or invalid, set a safe default (or clamp it)
before creating tableTagMasks[i], tagFoldedHist, altTagFoldedHist, and
indexFoldedHist; similarly validate tableIndexBits[i] (>=1) and tableSizes[i]
presence—use these symbols (tableTagBits, tableTagMasks, tagFoldedHist,
altTagFoldedHist, indexFoldedHist, tableIndexBits) to locate and fix the checks
and clamping.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cpu/pred/btb/microtage.cc`:
- Around line 1118-1124: The commitBranch code dereferences meta returned by
std::static_pointer_cast<TageMeta>(stream.predMetas[getComponentIdx()]) without
checking for null; update commitBranch to guard meta (the local variable named
meta) before using it—i.e., after computing meta (from stream.predMetas and
getComponentIdx()) add a null-check like if (!meta) return/skip the preds logic
so you do not call meta->preds or read its contents (used to compute pc,
pred_taken, pred_hit); mirror the null-check pattern used in update and
recoverPHist to safely handle a failed cast.
- Around line 829-833: The mask in MicroTAGE::getBankId uses an int literal for
the shift which mixes types with Addr; change the mask to use an Addr-typed 1
(e.g. static_cast<Addr>(1) << bankIdWidth or Addr(1) << bankIdWidth) so the
shift and mask operate on Addr and avoid integer-promotion issues when computing
((1 << bankIdWidth) - 1); keep the rest of the expression ((pc >> bankBaseShift)
& mask) intact and ensure mask is of type Addr.

---

Duplicate comments:
In `@src/cpu/pred/btb/microtage.cc`:
- Around line 17-25: The include for base/intmath.hh is currently inside the
`#ifndef` UNIT_TEST guard but ceilLog2 and floorLog2 are used later even in
UNIT_TEST builds; move the line including "base/intmath.hh" out of the UNIT_TEST
guard so ceilLog2/floorLog2 are available for unit tests (update the include
block in microtage.cc so base/intmath.hh is above or outside the `#ifndef`
UNIT_TEST region while keeping other debug-only includes like
base/debug_helper.hh and debug/TAGE.hh inside).
- Around line 645-647: The code uses operator[] on predMeta->preds
(predMeta->preds[btb_entry.pc]) which can implicitly insert a default
TagePrediction and corrupt metadata; change this to a non-mutating lookup (e.g.,
use find/contains/at depending on container) and only read into recomputed if an
existing entry for btb_entry.pc is present (otherwise handle the missing-case
explicitly instead of allowing insertion), referencing predMeta, preds,
btb_entry.pc, recomputed and TagePrediction to locate and update the logic.
- Around line 744-765: The mask creation in getTageTag and getTageIndex is
undefined when shifting by the machine word width; change the mask logic to
guard variable-width shifts: compute bits = tableTagBits[t] (or
tableIndexBits[t]) and if bits == 0 set mask = 0, else if bits >=
(sizeof(Addr)*8) set mask = ~static_cast<Addr>(0), otherwise set mask = (1ULL <<
bits) - 1; update both getTageTag and getTageIndex (references: mask,
tableTagBits/tableIndexBits, pcBits, foldedBits, altTagBits, bankBaseShift) so
no shift uses a width >= bit_width(Addr).
- Around line 96-127: The code currently blindly resizes tableTagBits and then
uses its entries (including tableTagBits[i]-1) to construct folded-history
objects, which can produce zero/negative widths; update initialization to ensure
tableTagBits contains valid positive bit-widths before use: after any resize of
tableTagBits/tableIndexBits/tablePcShifts, verify for each predictor index i
that tableTagBits.size() > i and tableTagBits[i] >= 2 (so tableTagBits[i]-1
>=1), and if an entry is missing or invalid, set a safe default (or clamp it)
before creating tableTagMasks[i], tagFoldedHist, altTagFoldedHist, and
indexFoldedHist; similarly validate tableIndexBits[i] (>=1) and tableSizes[i]
presence—use these symbols (tableTagBits, tableTagMasks, tagFoldedHist,
altTagFoldedHist, indexFoldedHist, tableIndexBits) to locate and fix the checks
and clamping.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ebc8f1 and 2e11646.

📒 Files selected for processing (1)

src/cpu/pred/btb/microtage.cc

coderabbitai · 2026-03-03T07:25:05Z

src/cpu/pred/btb/microtage.cc

+MicroTAGE::getBankId(Addr pc) const
+{
+    // Extract bank ID bits after removing instruction alignment
+    return (pc >> bankBaseShift) & ((1 << bankIdWidth) - 1);
+}


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the specific file and lines mentioned cat -n src/cpu/pred/btb/microtage.cc | head -n 850 | tail -n 30

Repository: OpenXiangShan/GEM5

Length of output: 1262

🏁 Script executed:

# Also need to find the declaration of bankIdWidth to understand its type and constraints rg "bankIdWidth" src/cpu/pred/btb/microtage.cc -B 2 -A 2

Repository: OpenXiangShan/GEM5

Length of output: 587

🏁 Script executed:

# Find the Addr type definition rg "typedef.*Addr|using.*Addr|^\s*Addr\s*=" --max-count=5 -A 1

Repository: OpenXiangShan/GEM5

Length of output: 12176

🏁 Script executed:

# Find ceilLog2 definition and return type rg "ceilLog2" -A 5 | head -n 50

Repository: OpenXiangShan/GEM5

Length of output: 2403

🏁 Script executed:

# Also check for numBanks parameter/configuration constraints rg "numBanks" src/cpu/pred/btb/microtage.hh -B 2 -A 2

Repository: OpenXiangShan/GEM5

Length of output: 997

🏁 Script executed:

# Get the actual ceilLog2 function definition rg "ceilLog2" src/base/intmath.hh -A 15

Repository: OpenXiangShan/GEM5

Length of output: 278

🏁 Script executed:

# Check if there are any parameter constraints or specs for numBanks rg "numBanks" src/cpu/pred/btb/ -B 2 -A 2

Repository: OpenXiangShan/GEM5

Length of output: 8320

🏁 Script executed:

# Check for any parameter validation or constraints on numBanks rg "numBanks" src/cpu/pred/btb/ --max-count=20 | grep -i "param\|constraint\|check\|valid\|range"

Repository: OpenXiangShan/GEM5

Length of output: 44

🏁 Script executed:

# Look at the .py parameter files to see what constraints exist fd ".*\.py" src/cpu/pred/btb/ -exec grep -l "numBanks\|bank" {} \;

Repository: OpenXiangShan/GEM5

Length of output: 270

🏁 Script executed:

# Check what realistic values numBanks could take rg "numBanks.*=" src/cpu/pred/btb/ | head -20

Repository: OpenXiangShan/GEM5

Length of output: 348

🏁 Script executed:

# Let's check the floorLog2 definition to understand return types rg "floorLog2" src/base/intmath.hh -A 12 | head -30

Repository: OpenXiangShan/GEM5

Length of output: 776

🏁 Script executed:

# Check for any existing test cases or usage patterns that show realistic numBanks values rg "MicroTAGE\|BTBTAGE" src/cpu/pred/btb/ -A 2 | grep -E "numBanks|= [0-9]" | head -20

Repository: OpenXiangShan/GEM5

Length of output: 44

🏁 Script executed:

# Let's analyze whether (1 << bankIdWidth) with int can overflow # Create a test to understand the constraint python3 << 'EOF' import math # If numBanks is unsigned type (typically 32-bit at minimum) # and bankIdWidth = ceilLog2(numBanks) # For bankIdWidth to cause UB on "int" shifting: # We need bankIdWidth >= 31 (shifting an int by 31+ is UB) # This means ceilLog2(numBanks) >= 31 # Which means numBanks >= 2^31 max_32bit_signed = 2**31 - 1 print(f"For UB to occur, numBanks must be >= 2^31 = {2**31}") print(f"Max 32-bit signed: {max_32bit_signed}") print(f"This is unrealistic for a branch predictor with ~2 billion banks") # However, the issue is still valid from a type safety perspective # Even at bankIdWidth=20 (1M banks), using int literal is imprecise print(f"\nAt typical bankIdWidth=20 (1M banks):") print(f" 1 << 20 = {1 << 20} (works with int)") print(f" But using untyped '1' for shift is poor practice") # The real risk: if someone later changes this to work with larger shifts # or misuses bankIdWidth elsewhere, it could break silently EOF

Repository: OpenXiangShan/GEM5

Length of output: 338

Improve type safety in getBankId mask operation.

Line 832 uses (1 << bankIdWidth) where 1 is an int literal, then masks a uint64_t (Addr). While realistic UB risk is minimal (would require numBanks ≥ 2³¹), this type inconsistency is poor practice. Use an Addr literal to ensure type correctness:

Suggested fix

unsigned MicroTAGE::getBankId(Addr pc) const { // Extract bank ID bits after removing instruction alignment - return (pc >> bankBaseShift) & ((1 << bankIdWidth) - 1); + return (pc >> bankBaseShift) & ((Addr(1) << bankIdWidth) - 1); }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/cpu/pred/btb/microtage.cc` around lines 829 - 833, The mask in MicroTAGE::getBankId uses an int literal for the shift which mixes types with Addr; change the mask to use an Addr-typed 1 (e.g. static_cast<Addr>(1) << bankIdWidth or Addr(1) << bankIdWidth) so the shift and mask operate on Addr and avoid integer-promotion issues when computing ((1 << bankIdWidth) - 1); keep the rest of the expression ((pc >> bankBaseShift) & mask) intact and ensure mask is of type Addr.

src/cpu/pred/btb/microtage.cc

Change-Id: I94ccc54b623bbf488bdaae9086c3cfff92c1fcdf

Change-Id: Ic7a506f9ee3bbbcaaf6b50f28cf74dc4a407cea8

Change-Id: Iffd21dbdb9f633841d1a9ac33dcd4e7f5f53b992

Change-Id: I93ab507fc32d1096a5fd07f68d85740577eb6ea9

XiangShanRobot · 2026-03-05T06:19:34Z

[Generated by GEM5 Performance Robot]
commit: 82a2364
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	20.26	20.38	-0.58 🔴

[Generated by GEM5 Performance Robot]
commit: 82a2364
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	20.26	20.26	0.00

XiangShanRobot · 2026-03-05T07:46:34Z

[Generated by GEM5 Performance Robot]
commit: 82a2364
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	17.63	17.84	-1.20 🔴

Cao Jiaming added 17 commits January 28, 2026 11:06

cpu-o3: split microtage out of tage

d1b55e7

cpu-o3: enhance branch index calculation in MicroTAGE to handle edge …

bd6087d

…cases

cpu-o3: update blockWidth calculation to handle zero blockSize

542f6cb

Change-Id: I44b6175022a3a593ed385407bd95ec0c40c74642

cpu-o3: adjust MicroTAGE parameters and clean up code for clarity

c23b9a9

Change-Id: I16ce79a7d8488d9a138d0a26f5500576ae54132e

cpu-o3: remove unused variable 'usingBasetable'

a358334

Change-Id: I8be037a1cdacd7151f4fe8e743a32cca90a85036

cpu-o3: MicroTAGE to remove base table and alternative prediction

9a01469

Change-Id: I6b9c92c18c574a12c19532ae0b894e64c1187342

cpu-o3: update MicroTAGE parameters for consistency

05a0f32

Change-Id: I953965b8e6feb1c15a238ac832d65bc16b32f496

cpu-o3: remove alternative prediction logic

bfb1563

Change-Id: I39cd471d452aa343c3dd741a80fdfa7d126e3a9f

cpu-o3: refine MicroTAGE parameters and remove unused variables

f742b1b

Change-Id: I3567ae93652aac218c5b4646003abadddaf7cf32

cpu-o3: update MicroTAGE history length parameter to 16

9404558

Change-Id: If7c9d3aa68a23c36dde74d8cf3a286c9c48f3e3c

cpu-o3: enhance MicroTAGE history management with ahead queues

7a5d5ab

Change-Id: I83277ae5c801e9d22b594286580459d12cdec69b

cpu-o3:open microtage

042d451

Change-Id: I394246af184d3f07e02b85f06e4e5ceed368ec22

cpu-o3:close utage in ideal

7e25f61

Change-Id: I762c11f8d15262fb9f1c9d443f77895fa76bbc79

cpu-o3: update folded history queue size limit

aeff5ff

Change-Id: Ia9bcdc028235447e254889d95e5ea98e7f067664

cpu-o3: disable MicroTAGE in branch prediction configuration

63be8d5

Change-Id: I56614e8ebc2dd33320d353562087ab456fc452da

cpu-o3: refine folded history queue management in doUpdateHist

50728be

Change-Id: Icdc7ac7f047a36dba6733a0e1d11e5e37aa4cdf1

cpu-o3: update .gitignore and modify MicroTAGE predictor includes

ba883a1

Change-Id: I9cb7e7ef04efefbcbdbf705563563f50f3c83324

coderabbitai bot reviewed Jan 28, 2026

View reviewed changes

cpu-o3: refactor MicroTAGE history management by removing ahead index…

add8efc

… history queue

coderabbitai bot reviewed Jan 28, 2026

View reviewed changes

cpu-o3: disable MicroTAGE predictor in setKmhV3Ideal

04aa5dc

Change-Id: Iad486579b9ab0df207013348f02c6be30bd10cfd

Cao Jiaming added 2 commits January 29, 2026 15:28

cpu-o3: enhance MicroTAGE predictor to support alternative tag history

57e8ea4

cpu-o3:enable utage in idael

4eb6b9b

Change-Id: I20e9fbfe161cd741b37bb69d46e99ee7755f79e5

cpu-o3: remove alternative tag history from MicroTAGE index

2c69bbc

Change-Id: If1187310e575a7c485d49fb89215742b1174393a

cpu-o3: implement ahead index history management in MicroTAGE

88017f2

Change-Id: I2b9f9216a54405d517538958b717a8e3de2eca8a

cpu-o3: add debug logging in MicroTAGE updates

5ebc8f1

Change-Id: I8c70b06b834fd5e713ef9ac4d3b26ec5b01ce2e6

cpu-o3: update index folded history recovery in MicroTAGE

2e11646

Change-Id: Ic736ea7cc73abebb3af473510c16d837b4f535f4

coderabbitai bot reviewed Mar 3, 2026

View reviewed changes

Cao Jiaming added 5 commits March 3, 2026 17:08

cpu-o3: update doUpdateHist to handle ahead index folded history

b1d0f6a

Change-Id: I94ccc54b623bbf488bdaae9086c3cfff92c1fcdf

cpu-o3: enhance ahead index history management in MicroTAGE

eeab9d8

Change-Id: Ic7a506f9ee3bbbcaaf6b50f28cf74dc4a407cea8

cpu: Add UTAGE flag and clean up MicroTAGE

38fdd3b

Change-Id: Iffd21dbdb9f633841d1a9ac33dcd4e7f5f53b992

cpu-o3: correct invalidation logic for duplicated entries in AheadBTB

7dbfbde

cpu-o3: update uTAGE parameters *4 for improved configuration

82a2364

Change-Id: I93ab507fc32d1096a5fd07f68d85740577eb6ea9

		// Recover folded history after misprediction
		void recoverFoldedHist(const bitset& history);

Conversation

CJ362ff commented Jan 28, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

XiangShanRobot commented Jan 29, 2026

Ideal BTB Performance

Overall Score

Ideal BTB Performance

Overall Score

Uh oh!

XiangShanRobot commented Jan 29, 2026

Ideal BTB Performance

Overall Score

Ideal BTB Performance

Overall Score

Uh oh!

XiangShanRobot commented Jan 29, 2026

Ideal BTB Performance

Overall Score

Ideal BTB Performance

Overall Score

Uh oh!

XiangShanRobot commented Jan 29, 2026

Ideal BTB Performance

Overall Score

Ideal BTB Performance

Overall Score

Uh oh!

XiangShanRobot commented Feb 3, 2026

Ideal BTB Performance

Overall Score

Ideal BTB Performance

Overall Score

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

CJ362ff commented Jan 28, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 28, 2026 •

edited

Loading