Skip to content

pin_exec_driven: fix Inst_Info lookup aliasing across cores#20

Open
iconstan01 wants to merge 2 commits intolitz-lab:mainfrom
iconstan01:PR-pinexec-inst-info-cmp-lookup
Open

pin_exec_driven: fix Inst_Info lookup aliasing across cores#20
iconstan01 wants to merge 2 commits intolitz-lab:mainfrom
iconstan01:PR-pinexec-inst-info-cmp-lookup

Conversation

@iconstan01
Copy link

  • Fixes static instruction lookup mismatch in pin_exec_driven caused by address representation aliasing with CMP-tagged addresses.
  • Normalizes lookup and keeps consistency checks for per-core tagged instruction addresses.
  • Validated on multicore workloads where the issue previously reproduced.

Copilot AI review requested due to automatic review settings March 11, 2026 15:21
@hlitz hlitz requested review from yql5510718 and removed request for Copilot March 11, 2026 15:24
Copy link
Contributor

@yql5510718 yql5510718 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix.

void convert_pinuop_to_t_uop(uns8 proc_id, ctype_pin_inst* pi, Trace_Uop** trace_uop) {
Flag new_entry = FALSE;
Inst_Info* info;
// Keep instruction-cache lookup independent of per-core CMP address tagging.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is a bit misleading.
maybe:
// Use canonical address (proc_id=0) for consistent hash key within per-core table

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I updated the comment to clarify canonical address lookup.

Copilot AI review requested due to automatic review settings March 12, 2026 07:53
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes multicore static instruction (Inst_Info) lookup mismatches in pin_exec_driven by normalizing CMP-tagged instruction addresses to a canonical form for hash lookup, while preserving per-core tagging for stored entries.

Changes:

  • Introduced a canonicalized lookup_addr for cpp_hash_table_access_create to avoid CMP-tag aliasing.
  • Replaced strict info->addr equality assert with a canonical-address consistency check and rebinding.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +786 to +788
// Use canonical address (proc_id=0) for consistent hash key within per-core table.
// The uop cache key should use the original program address bits only.
Addr lookup_addr = convert_to_cmp_addr(0, pi->instruction_addr);
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convert_to_cmp_addr(0, ...) reads like it adds CMP tagging rather than stripping/normalizing it, which makes the canonicalization intent ambiguous and easy to misuse later. Consider switching to (or introducing) an explicitly-named helper that extracts the untagged program address bits (e.g., a strip_cmp_tag()/canonicalize_inst_addr()), then (if needed) re-tagging in a clearly separate step.

Copilot uses AI. Check for mistakes.
Comment on lines +786 to +787
// Use canonical address (proc_id=0) for consistent hash key within per-core table.
// The uop cache key should use the original program address bits only.
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says the key should use the original program address bits only, but the code explicitly stores the per-core tagged address into info->addr. If info->addr is intended to remain tagged (as the previous assert implied), please adjust the comment to avoid implying that Inst_Info::addr is untagged/canonical. If Inst_Info::addr is intended to be canonical, then rebinding it to a tagged address contradicts that and should be reworked (e.g., store canonical and tagged addresses in distinct fields).

Suggested change
// Use canonical address (proc_id=0) for consistent hash key within per-core table.
// The uop cache key should use the original program address bits only.
// Use canonical address (proc_id=0) for the uop cache key so it depends only on
// the original program address bits, independent of any per-core address tagging.

Copilot uses AI. Check for mistakes.
trace_uop[ii]->eom = FALSE;
ASSERT(proc_id, info->addr == pi->instruction_addr);
if (info->addr != pi->instruction_addr) {
// A stale entry can differ only by CMP core bits; rebind it to this core's tagged address.
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says the key should use the original program address bits only, but the code explicitly stores the per-core tagged address into info->addr. If info->addr is intended to remain tagged (as the previous assert implied), please adjust the comment to avoid implying that Inst_Info::addr is untagged/canonical. If Inst_Info::addr is intended to be canonical, then rebinding it to a tagged address contradicts that and should be reworked (e.g., store canonical and tagged addresses in distinct fields).

Suggested change
// A stale entry can differ only by CMP core bits; rebind it to this core's tagged address.
// A stale Inst_Info entry may differ from the current instance only in CMP core-tag bits.
// In that case, we intentionally rebind Inst_Info::addr to this core's tagged address so
// that Inst_Info::addr may hold a per-core tagged address rather than a strictly canonical one.

Copilot uses AI. Check for mistakes.
ASSERT(proc_id, info->addr == pi->instruction_addr);
if (info->addr != pi->instruction_addr) {
// A stale entry can differ only by CMP core bits; rebind it to this core's tagged address.
ASSERT(proc_id, convert_to_cmp_addr(0, info->addr) == convert_to_cmp_addr(0, pi->instruction_addr));
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This computes convert_to_cmp_addr(0, ...) twice in a per-uop loop. Since pi->instruction_addr is constant for the instruction and info->addr is also available, consider precomputing the canonicalized values once (e.g., canonical pi address before the loop, and/or canonical info address just before the assert) to reduce repeated work in hot code.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants