Conversation
Change-Id: I39d54a0621d139cc00a156b02a6d7d888d9b15f0 Co-authored-by: Xu Boran <xuboran@bosc.ac.cn>
Change-Id: Ic203a9694c093034744986309e796b9d66d6f826
Change-Id: I3f0f686000b610c3bf842e62c9b9e91e7188a028
Change-Id: Id42dacf9c990f9beade99a3f39e3ccb2998b4e10
Change-Id: I875ccc0a59f7fff3103bcc4d31740915237fbcc7
Change-Id: Ie2ff36a83d965c4d3f8df51304d7a72c0523cb71
📝 WalkthroughWalkthroughThis PR introduces two-taken (2-taken) and two-fetch (2-fetch) prediction capabilities to the DecoupledBPUWithBTB branch predictor, enabling multiple branch predictions per fetch cycle. Changes span configuration parameters, fetch pipeline dual-branch handling, uBTB secondary prediction support with new training logic, and BPU state machine restructuring. Changes
Sequence DiagramsequenceDiagram
participant Fetch as Fetch<br/>(O3 Pipeline)
participant BPU as DecoupledBPU<br/>(Prediction Req)
participant UBTB as uBTB<br/>(Micro-BTB)
participant MBTB as MBTB<br/>(Main BTB)
participant FSQ as FSQ<br/>(Enqueue Targets)
Fetch->>BPU: requestNewPrediction()
activate BPU
BPU->>UBTB: lookup(startAddr)
UBTB-->>BPU: hit_index
BPU->>MBTB: lookup & predict
MBTB-->>BPU: primary prediction
BPU->>BPU: Store primary to predDFF
alt enable2Taken is true
BPU->>UBTB: putPCHistory2Taken()
UBTB->>UBTB: check secondary prediction
UBTB-->>BPU: (index, has_second)
alt Secondary prediction exists
BPU->>MBTB: getSecondPredictionMeta()
MBTB-->>BPU: metadata_2nd
BPU->>BPU: Store secondary to state
end
end
deactivate BPU
BPU->>BPU: tick() - Process predictions
activate BPU
BPU->>UBTB: trainUbtbFor2Taken()
UBTB->>UBTB: Update 2-taken counters
BPU->>BPU: processNewPrediction(false)
BPU->>FSQ: createFetchTargetEntry(false)
FSQ-->>Fetch: primary target
alt has secondary prediction
BPU->>BPU: processNewPrediction(true)
BPU->>FSQ: createFetchTargetEntry(true)
FSQ-->>Fetch: secondary target
end
deactivate BPU
Fetch->>Fetch: Check 2-fetch conditions
alt enable2Fetch && predicted_taken<br/>&& span_ok && buffer_ok
Fetch->>Fetch: mark do_2fetch = true
else
Fetch->>Fetch: Invalidate fetch targets
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/cpu/o3/fetch.cc (1)
1881-1882: UseFetchBuffer::reset()instead of partial invalidation.Line 1882 invalidates only
valid; usingreset()keeps buffer metadata consistent and avoids stalestartPC.♻️ Suggested refactor
- fetchBuffer[tid].valid = false; + fetchBuffer[tid].reset();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cpu/o3/fetch.cc` around lines 1881 - 1882, The code currently invalidates the fetch buffer by setting fetchBuffer[tid].valid = false, which leaves other metadata (e.g., startPC) stale; replace this partial invalidation with a full reset by calling fetchBuffer[tid].reset() so the buffer's metadata and state are cleared consistently (locate the usage near where fetchBuffer[tid].valid is set and swap to the FetchBuffer::reset() call).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cpu/pred/btb/btb_ubtb.cc`:
- Around line 326-350: The early return when secondPred == nullptr in
UBTB::addSecondPredictionToEntry leaves per-entry second-prediction state stale;
change the branch so that before returning you explicitly clear the
second-prediction fields on the target entry (set entry.valid_2nd = false;
entry.pt_2nd = false; entry.branch_info_2nd = BranchInfo()) so stale
valid_2nd/pt_2nd/branch_info_2nd cannot be replayed later; locate the logic in
UBTB::addSecondPredictionToEntry and update the nullptr path to clear those
fields on the referenced entry (use ubtb[entryIndex] as the target).
- Around line 437-462: train1Taken and train2Taken currently perform training
regardless of the usingS3Pred flag; add an early guard in UBTB::train1Taken and
UBTB::train2Taken to return immediately when the instance/member usingS3Pred is
false (the same config gate enforced by updateUsingS3Pred()), so these direct
decoupled-BTB entrypoints do not bypass the config; reference the
UBTB::train1Taken and UBTB::train2Taken methods and the usingS3Pred
member/updateUsingS3Pred() helper when making the change.
In `@src/cpu/pred/btb/mbtb.cc`:
- Around line 321-325: The MBTB::getSecondPredictionMeta() implementation
returns a non-null empty BTBMeta which prevents the fallback used elsewhere;
change MBTB::getSecondPredictionMeta() to return nullptr (matching UBTB and the
base class) so that code like entry.predMetas[i] = second_meta ? second_meta :
components[i]->getPredictionMeta() will fall back to getPredictionMeta() and
preserve proper hit_entries behavior.
---
Nitpick comments:
In `@src/cpu/o3/fetch.cc`:
- Around line 1881-1882: The code currently invalidates the fetch buffer by
setting fetchBuffer[tid].valid = false, which leaves other metadata (e.g.,
startPC) stale; replace this partial invalidation with a full reset by calling
fetchBuffer[tid].reset() so the buffer's metadata and state are cleared
consistently (locate the usage near where fetchBuffer[tid].valid is set and swap
to the FetchBuffer::reset() call).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 778f80bf-c4ae-4624-935f-efcfe8295c61
📒 Files selected for processing (14)
configs/example/idealkmhv3.pyconfigs/example/kmhv3.pysrc/cpu/o3/fetch.ccsrc/cpu/o3/fetch.hhsrc/cpu/pred/BranchPredictor.pysrc/cpu/pred/btb/btb_ubtb.ccsrc/cpu/pred/btb/btb_ubtb.hhsrc/cpu/pred/btb/common.hhsrc/cpu/pred/btb/decoupled_bpred.ccsrc/cpu/pred/btb/decoupled_bpred.hhsrc/cpu/pred/btb/decoupled_bpred_stats.ccsrc/cpu/pred/btb/mbtb.ccsrc/cpu/pred/btb/mbtb.hhsrc/cpu/pred/btb/timed_base_pred.hh
| UBTB::addSecondPredictionToEntry(int entryIndex, | ||
| FullBTBPrediction* secondPred) | ||
| { | ||
| if (!secondPred) { | ||
| return; | ||
| } | ||
| assert(entryIndex >= 0 && entryIndex < static_cast<int>(ubtb.size())); | ||
| auto &entry = ubtb[entryIndex]; | ||
| if (!entry.valid) { | ||
| return; | ||
| } | ||
| entry.valid_2nd = true; | ||
| entry.pt_2nd = shouldSetPtSecond(*secondPred); | ||
| if (entry.pt_2nd) { | ||
| auto second_taken = secondPred->getTakenEntry(); | ||
| if (!second_taken.valid) { | ||
| entry.valid_2nd = false; | ||
| return; | ||
| } | ||
| entry.branch_info_2nd = second_taken; | ||
| entry.branch_info_2nd.target = secondPred->getTarget(predictWidth); | ||
| } else { | ||
| entry.branch_info_2nd = BranchInfo(); | ||
| } | ||
| } |
There was a problem hiding this comment.
Clear second-prediction state when training without a second prediction.
Line 329 returns early for secondPred == nullptr, which leaves valid_2nd/pt_2nd/branch_info_2nd stale. That can replay obsolete second predictions in later putPCHistory2Taken() calls.
🐛 Proposed fix
void
UBTB::addSecondPredictionToEntry(int entryIndex,
FullBTBPrediction* secondPred)
{
- if (!secondPred) {
- return;
- }
assert(entryIndex >= 0 && entryIndex < static_cast<int>(ubtb.size()));
auto &entry = ubtb[entryIndex];
if (!entry.valid) {
return;
}
+ if (!secondPred) {
+ entry.valid_2nd = false;
+ entry.pt_2nd = false;
+ entry.branch_info_2nd = BranchInfo();
+ return;
+ }
entry.valid_2nd = true;
entry.pt_2nd = shouldSetPtSecond(*secondPred);
if (entry.pt_2nd) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| UBTB::addSecondPredictionToEntry(int entryIndex, | |
| FullBTBPrediction* secondPred) | |
| { | |
| if (!secondPred) { | |
| return; | |
| } | |
| assert(entryIndex >= 0 && entryIndex < static_cast<int>(ubtb.size())); | |
| auto &entry = ubtb[entryIndex]; | |
| if (!entry.valid) { | |
| return; | |
| } | |
| entry.valid_2nd = true; | |
| entry.pt_2nd = shouldSetPtSecond(*secondPred); | |
| if (entry.pt_2nd) { | |
| auto second_taken = secondPred->getTakenEntry(); | |
| if (!second_taken.valid) { | |
| entry.valid_2nd = false; | |
| return; | |
| } | |
| entry.branch_info_2nd = second_taken; | |
| entry.branch_info_2nd.target = secondPred->getTarget(predictWidth); | |
| } else { | |
| entry.branch_info_2nd = BranchInfo(); | |
| } | |
| } | |
| void | |
| UBTB::addSecondPredictionToEntry(int entryIndex, | |
| FullBTBPrediction* secondPred) | |
| { | |
| assert(entryIndex >= 0 && entryIndex < static_cast<int>(ubtb.size())); | |
| auto &entry = ubtb[entryIndex]; | |
| if (!entry.valid) { | |
| return; | |
| } | |
| if (!secondPred) { | |
| entry.valid_2nd = false; | |
| entry.pt_2nd = false; | |
| entry.branch_info_2nd = BranchInfo(); | |
| return; | |
| } | |
| entry.valid_2nd = true; | |
| entry.pt_2nd = shouldSetPtSecond(*secondPred); | |
| if (entry.pt_2nd) { | |
| auto second_taken = secondPred->getTakenEntry(); | |
| if (!second_taken.valid) { | |
| entry.valid_2nd = false; | |
| return; | |
| } | |
| entry.branch_info_2nd = second_taken; | |
| entry.branch_info_2nd.target = secondPred->getTarget(predictWidth); | |
| } else { | |
| entry.branch_info_2nd = BranchInfo(); | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cpu/pred/btb/btb_ubtb.cc` around lines 326 - 350, The early return when
secondPred == nullptr in UBTB::addSecondPredictionToEntry leaves per-entry
second-prediction state stale; change the branch so that before returning you
explicitly clear the second-prediction fields on the target entry (set
entry.valid_2nd = false; entry.pt_2nd = false; entry.branch_info_2nd =
BranchInfo()) so stale valid_2nd/pt_2nd/branch_info_2nd cannot be replayed
later; locate the logic in UBTB::addSecondPredictionToEntry and update the
nullptr path to clear those fields on the referenced entry (use ubtb[entryIndex]
as the target).
| UBTB::train1Taken(FullBTBPrediction &s3Pred) | ||
| { | ||
| auto takenEntry = s3Pred.getTakenEntry(); | ||
| if (takenEntry.valid) { | ||
| ubtbStats.s3UpdateHits++; | ||
| }else { | ||
| } else { | ||
| ubtbStats.s3UpdateMisses++; | ||
| } | ||
| auto startAddr = s3Pred.bbStart; | ||
| UBTBIter oldEntryIter = lastPred.hit_entry; | ||
| takenEntry.source = getComponentIdx(); | ||
| updateNewEntry(oldEntryIter, takenEntry, startAddr); | ||
| trainCommon(lastPred.hit_index, s3Pred, nullptr); | ||
| } | ||
|
|
||
| void | ||
| UBTB::train2Taken(FullBTBPrediction &dffPred, | ||
| FullBTBPrediction &s3Pred, | ||
| int hitIndex) | ||
| { | ||
| if (dffPred.getTarget(predictWidth) != s3Pred.bbStart) { | ||
| trainCommon(hitIndex, dffPred, nullptr); | ||
| return; | ||
| } | ||
| if (!check2TakenConditions(dffPred, s3Pred)) { | ||
| trainCommon(hitIndex, dffPred, nullptr); | ||
| return; | ||
| } | ||
| trainCommon(hitIndex, dffPred, &s3Pred); | ||
| } |
There was a problem hiding this comment.
Respect usingS3Pred in the new direct training entrypoints.
Lines 437-462 train even when usingS3Pred is false. Since these methods are now called directly from decoupled BTB flow, this bypasses the existing config gate in updateUsingS3Pred().
🐛 Proposed fix
void
UBTB::train1Taken(FullBTBPrediction &s3Pred)
{
+ if (!usingS3Pred) {
+ return;
+ }
auto takenEntry = s3Pred.getTakenEntry();
if (takenEntry.valid) {
ubtbStats.s3UpdateHits++;
@@
void
UBTB::train2Taken(FullBTBPrediction &dffPred,
FullBTBPrediction &s3Pred,
int hitIndex)
{
+ if (!usingS3Pred) {
+ return;
+ }
if (dffPred.getTarget(predictWidth) != s3Pred.bbStart) {
trainCommon(hitIndex, dffPred, nullptr);
return;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| UBTB::train1Taken(FullBTBPrediction &s3Pred) | |
| { | |
| auto takenEntry = s3Pred.getTakenEntry(); | |
| if (takenEntry.valid) { | |
| ubtbStats.s3UpdateHits++; | |
| }else { | |
| } else { | |
| ubtbStats.s3UpdateMisses++; | |
| } | |
| auto startAddr = s3Pred.bbStart; | |
| UBTBIter oldEntryIter = lastPred.hit_entry; | |
| takenEntry.source = getComponentIdx(); | |
| updateNewEntry(oldEntryIter, takenEntry, startAddr); | |
| trainCommon(lastPred.hit_index, s3Pred, nullptr); | |
| } | |
| void | |
| UBTB::train2Taken(FullBTBPrediction &dffPred, | |
| FullBTBPrediction &s3Pred, | |
| int hitIndex) | |
| { | |
| if (dffPred.getTarget(predictWidth) != s3Pred.bbStart) { | |
| trainCommon(hitIndex, dffPred, nullptr); | |
| return; | |
| } | |
| if (!check2TakenConditions(dffPred, s3Pred)) { | |
| trainCommon(hitIndex, dffPred, nullptr); | |
| return; | |
| } | |
| trainCommon(hitIndex, dffPred, &s3Pred); | |
| } | |
| void | |
| UBTB::train1Taken(FullBTBPrediction &s3Pred) | |
| { | |
| if (!usingS3Pred) { | |
| return; | |
| } | |
| auto takenEntry = s3Pred.getTakenEntry(); | |
| if (takenEntry.valid) { | |
| ubtbStats.s3UpdateHits++; | |
| } else { | |
| ubtbStats.s3UpdateMisses++; | |
| } | |
| trainCommon(lastPred.hit_index, s3Pred, nullptr); | |
| } | |
| void | |
| UBTB::train2Taken(FullBTBPrediction &dffPred, | |
| FullBTBPrediction &s3Pred, | |
| int hitIndex) | |
| { | |
| if (!usingS3Pred) { | |
| return; | |
| } | |
| if (dffPred.getTarget(predictWidth) != s3Pred.bbStart) { | |
| trainCommon(hitIndex, dffPred, nullptr); | |
| return; | |
| } | |
| if (!check2TakenConditions(dffPred, s3Pred)) { | |
| trainCommon(hitIndex, dffPred, nullptr); | |
| return; | |
| } | |
| trainCommon(hitIndex, dffPred, &s3Pred); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cpu/pred/btb/btb_ubtb.cc` around lines 437 - 462, train1Taken and
train2Taken currently perform training regardless of the usingS3Pred flag; add
an early guard in UBTB::train1Taken and UBTB::train2Taken to return immediately
when the instance/member usingS3Pred is false (the same config gate enforced by
updateUsingS3Pred()), so these direct decoupled-BTB entrypoints do not bypass
the config; reference the UBTB::train1Taken and UBTB::train2Taken methods and
the usingS3Pred member/updateUsingS3Pred() helper when making the change.
| std::shared_ptr<void> | ||
| MBTB::getSecondPredictionMeta() | ||
| { | ||
| return std::make_shared<BTBMeta>(); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
rg -n "getSecondPredictionMeta" --type=cpp -A 3Repository: OpenXiangShan/GEM5
Length of output: 1642
🏁 Script executed:
rg -n "class.*BranchPredictor" --type=cpp -A 10 | head -50Repository: OpenXiangShan/GEM5
Length of output: 1069
🏁 Script executed:
fd -e "ubtb" --type f | grep -E "\.(cc|h)$"Repository: OpenXiangShan/GEM5
Length of output: 44
🏁 Script executed:
rg -n "auto second_meta = components" --type=cpp -A 3Repository: OpenXiangShan/GEM5
Length of output: 477
🏁 Script executed:
rg -n "hit_entries" --type=cpp src/cpu/pred/btb/mbtb.ccRepository: OpenXiangShan/GEM5
Length of output: 293
🏁 Script executed:
rg -n "checkPredictionHit|hit_entries" --type=cpp src/cpu/pred/btb/mbtb.cc -B 2 -A 5Repository: OpenXiangShan/GEM5
Length of output: 1899
🏁 Script executed:
rg -n "getSecondPredictionMeta\(\)" --type=cpp -B 5 -A 10 | grep -E "(class|override|return|virtual)" | head -30Repository: OpenXiangShan/GEM5
Length of output: 2923
🏁 Script executed:
rg -n "getSecondPrediction|second.*prediction|second.*meta" --type=cpp src/cpu/pred/btb/mbtb.cc src/cpu/pred/btb/mbtb.hh -B 2 -A 2 | grep -E "(^[0-9]|//|/\*)" | head -40Repository: OpenXiangShan/GEM5
Length of output: 106
🏁 Script executed:
cat -n src/cpu/pred/btb/mbtb.hh | sed -n '140,160p'Repository: OpenXiangShan/GEM5
Length of output: 1065
Inconsistent with other BTB predictors: getSecondPredictionMeta() should return nullptr
MBTB is the only component returning a non-null empty BTBMeta, while UBTB and the base class both return nullptr. This breaks the fallback mechanism in decoupled_bpred.cc (line 807):
entry.predMetas[i] = second_meta ? second_meta : components[i]->getPredictionMeta();Since MBTB returns non-null, this ternary never triggers the fallback to getPredictionMeta(). Second predictions will have empty hit_entries, affecting checkPredictionHit() and commitBranch() statistics.
Return nullptr instead to match the established pattern and allow fallback to the first prediction's metadata (as UBTB does).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cpu/pred/btb/mbtb.cc` around lines 321 - 325, The
MBTB::getSecondPredictionMeta() implementation returns a non-null empty BTBMeta
which prevents the fallback used elsewhere; change
MBTB::getSecondPredictionMeta() to return nullptr (matching UBTB and the base
class) so that code like entry.predMetas[i] = second_meta ? second_meta :
components[i]->getPredictionMeta() will fall back to getPredictionMeta() and
preserve proper hit_entries behavior.
Summary by CodeRabbit
Release Notes
New Features
Configuration