pin_exec_driven: fix Inst_Info lookup aliasing across cores by iconstan01 · Pull Request #20 · litz-lab/scarab

iconstan01 · 2026-03-11T15:21:55Z

Fixes static instruction lookup mismatch in pin_exec_driven caused by address representation aliasing with CMP-tagged addresses.
Normalizes lookup and keeps consistency checks for per-core tagged instruction addresses.
Validated on multicore workloads where the issue previously reproduced.

yql5510718

Thank you for the fix.

yql5510718 · 2026-03-11T16:52:09Z

src/pin/pin_lib/uop_generator.c

 void convert_pinuop_to_t_uop(uns8 proc_id, ctype_pin_inst* pi, Trace_Uop** trace_uop) {
  Flag new_entry = FALSE;
  Inst_Info* info;
+  // Keep instruction-cache lookup independent of per-core CMP address tagging.


The comment is a bit misleading.
maybe:
// Use canonical address (proc_id=0) for consistent hash key within per-core table

Agreed. I updated the comment to clarify canonical address lookup.

Copilot

Pull request overview

Fixes multicore static instruction (Inst_Info) lookup mismatches in pin_exec_driven by normalizing CMP-tagged instruction addresses to a canonical form for hash lookup, while preserving per-core tagging for stored entries.

Changes:

Introduced a canonicalized lookup_addr for cpp_hash_table_access_create to avoid CMP-tag aliasing.
Replaced strict info->addr equality assert with a canonical-address consistency check and rebinding.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-12T07:58:49Z

src/pin/pin_lib/uop_generator.c

+  // Use canonical address (proc_id=0) for consistent hash key within per-core table.
+  // The uop cache key should use the original program address bits only.
+  Addr lookup_addr = convert_to_cmp_addr(0, pi->instruction_addr);


convert_to_cmp_addr(0, ...) reads like it adds CMP tagging rather than stripping/normalizing it, which makes the canonicalization intent ambiguous and easy to misuse later. Consider switching to (or introducing) an explicitly-named helper that extracts the untagged program address bits (e.g., a strip_cmp_tag()/canonicalize_inst_addr()), then (if needed) re-tagging in a clearly separate step.

Copilot · 2026-03-12T07:58:49Z

src/pin/pin_lib/uop_generator.c

+  // Use canonical address (proc_id=0) for consistent hash key within per-core table.
+  // The uop cache key should use the original program address bits only.


The comment says the key should use the original program address bits only, but the code explicitly stores the per-core tagged address into info->addr. If info->addr is intended to remain tagged (as the previous assert implied), please adjust the comment to avoid implying that Inst_Info::addr is untagged/canonical. If Inst_Info::addr is intended to be canonical, then rebinding it to a tagged address contradicts that and should be reworked (e.g., store canonical and tagged addresses in distinct fields).

Suggested change

// Use canonical address (proc_id=0) for consistent hash key within per-core table.

// The uop cache key should use the original program address bits only.

// Use canonical address (proc_id=0) for the uop cache key so it depends only on

// the original program address bits, independent of any per-core address tagging.

Copilot · 2026-03-12T07:58:49Z

src/pin/pin_lib/uop_generator.c

      trace_uop[ii]->eom = FALSE;
-      ASSERT(proc_id, info->addr == pi->instruction_addr);
+      if (info->addr != pi->instruction_addr) {
+        // A stale entry can differ only by CMP core bits; rebind it to this core's tagged address.


The comment says the key should use the original program address bits only, but the code explicitly stores the per-core tagged address into info->addr. If info->addr is intended to remain tagged (as the previous assert implied), please adjust the comment to avoid implying that Inst_Info::addr is untagged/canonical. If Inst_Info::addr is intended to be canonical, then rebinding it to a tagged address contradicts that and should be reworked (e.g., store canonical and tagged addresses in distinct fields).

Suggested change

// A stale entry can differ only by CMP core bits; rebind it to this core's tagged address.

// A stale Inst_Info entry may differ from the current instance only in CMP core-tag bits.

// In that case, we intentionally rebind Inst_Info::addr to this core's tagged address so

// that Inst_Info::addr may hold a per-core tagged address rather than a strictly canonical one.

Copilot · 2026-03-12T07:58:49Z

src/pin/pin_lib/uop_generator.c

-      ASSERT(proc_id, info->addr == pi->instruction_addr);
+      if (info->addr != pi->instruction_addr) {
+        // A stale entry can differ only by CMP core bits; rebind it to this core's tagged address.
+        ASSERT(proc_id, convert_to_cmp_addr(0, info->addr) == convert_to_cmp_addr(0, pi->instruction_addr));


This computes convert_to_cmp_addr(0, ...) twice in a per-uop loop. Since pi->instruction_addr is constant for the instruction and info->addr is also available, consider precomputing the canonicalized values once (e.g., canonical pi address before the loop, and/or canonical info address just before the assert) to reduce repeated work in hot code.

pin_exec_driven: fix Inst_Info lookup aliasing across cores

6c15d84

Copilot AI review requested due to automatic review settings March 11, 2026 15:21

hlitz requested review from yql5510718 and removed request for Copilot March 11, 2026 15:24

yql5510718 requested changes Mar 11, 2026

View reviewed changes

pin_exec_driven: clarify canonical lookup comment

1f96562

Copilot AI review requested due to automatic review settings March 12, 2026 07:53

Copilot AI reviewed Mar 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pin_exec_driven: fix Inst_Info lookup aliasing across cores#20

pin_exec_driven: fix Inst_Info lookup aliasing across cores#20
iconstan01 wants to merge 2 commits intolitz-lab:mainfrom
iconstan01:PR-pinexec-inst-info-cmp-lookup

iconstan01 commented Mar 11, 2026

Uh oh!

yql5510718 left a comment

Uh oh!

yql5510718 Mar 11, 2026

Uh oh!

iconstan01 Mar 12, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 12, 2026

Uh oh!

Copilot AI Mar 12, 2026

Uh oh!

Copilot AI Mar 12, 2026

Uh oh!

Copilot AI Mar 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		// Use canonical address (proc_id=0) for consistent hash key within per-core table.
		// The uop cache key should use the original program address bits only.

-        // A stale entry can differ only by CMP core bits; rebind it to this core's tagged address.
+        // A stale Inst_Info entry may differ from the current instance only in CMP core-tag bits.
+        // In that case, we intentionally rebind Inst_Info::addr to this core's tagged address so
+        // that Inst_Info::addr may hold a per-core tagged address rather than a strictly canonical one.

Conversation

iconstan01 commented Mar 11, 2026

Uh oh!

yql5510718 left a comment

Choose a reason for hiding this comment

Uh oh!

yql5510718 Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

iconstan01 Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants