The Linux Kernel Mailing List
* [Patch v3 0/7] Enable core PMU for DMR and NVL
@ 2026-01-14  1:17 Dapeng Mi
  2026-01-14  1:17 ` [Patch v3 1/7] perf/x86/intel: Support the 4 new OMR MSRs introduced in " Dapeng Mi
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Dapeng Mi @ 2026-01-14  1:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

Changes:
 v2 -> v3:
 * Refine intel_alt_er() to align with intel_fixup_er() (Peter).
 * Simplify DMR core PMU enabling code (Peter).
 * Mention that rdpmc is global on hybrid platforms (Ian).
 v1 -> v2:
 * Rebase to latest perf/core base (6.19-rc1 based).
 * Refine intel_fixup_er() helper (Zide).
 * Optimize commit message.

This patch-set enables the core PMU functionality for Diamond Rapids
(DMR) and Nova Lake (NVL).

Compared with previous platforms, there are 3 main changes to the core
PMU functionality.

1. Introduce OFF-MODULE RESPONSE (OMR) facility to replace Off-Core
Response (OCR) facility

Legacy microarchitectures used the OCR facility to evaluate off-core and
multi-core off-module transactions. The aptly renamed OMR facility
improves on the OCR capability, providing scalable coverage of the new
memory systems of multi-core module designs.

Along with the introduction of OMR, 4 new, functionally equivalent MSRs
(OFFMODULE_RSP_0 ~ OFFMODULE_RSP_3) are introduced to specify the
attributes of off-module transactions, and the 2 legacy OFFCORE_RSP MSRs
are retired.

For more details about OMR events and the OFFMODULE_RSP_x MSRs, please
refer to section 16.1 "OFF-MODULE RESPONSE (OMR) FACILITY" in the latest
ISE documentation [1].

2. New PEBS data source encoding layout

Diamond Rapids and Nova Lake include PEBS Load Latency and Store Latency
support similar to previous platforms but with a different data source
encoding layout.

In brief, the new data source encoding is selected by bit[8] of the
memory auxiliary info field, which indicates whether an L2 cache miss
occurred for a memory load or store instruction. If bit[8] is 0, it
signifies no L2 cache miss, and bits[7:0] specify the exact cache data
source (up to the L2 cache level). If bit[8] is 1, bits[7:0] represent
the OMR encoding, indicating the specific L3 cache or memory region
involved in the memory access.
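
The decode flow boils down to the sketch below (a minimal illustration;
the table and helper names follow the code added in patch 2):

	/* aux is the PEBS memory auxiliary info field */
	static u64 decode_mem_aux(u64 aux)
	{
		u8 dse = aux & 0xff;			/* bits[7:0] */

		if (!(aux & BIT(8)))			/* bit[8]: L2 miss? */
			return pnc_pebs_l2_hit_data_source[dse & 0xf];

		return parse_omr_data_source(dse);	/* OMR encoding */
	}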

A significant enhancement of the OMR encoding is the ability to report
up to 8 fine-grained memory regions in addition to the cache region,
offering more detailed insight into memory access regions.

For more details about the new data source layout, please refer to
section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in the
latest ISE documentation.

3. Support "rdpmc user disable" feature

Currently, executing RDPMC at CPL > 0 is allowed whenever the CR4.PCE
flag (performance-monitoring counter enable) is set. This poses a
security risk: as long as CR4.PCE = 1, any user space process can read
the count of any PMU counter, even one that belongs to a system-wide
event.
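
For example, with CR4.PCE = 1 any ring-3 code can read a counter
directly (illustrative user space snippet, reading GP counter 0):

	unsigned int lo, hi;

	/* RDPMC: ECX selects the counter, result is returned in EDX:EAX */
	asm volatile("rdpmc" : "=a" (lo), "=d" (hi) : "c" (0));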

To mitigate this security risk, the rdpmc user disable feature is
introduced to provide per-counter rdpmc control.

'rdpmc user disable' introduces a new RDPMC_USR_DISABLE bit for each GP
and fixed counter, controlling whether that counter can be read from
user space via the rdpmc instruction.

The details are:
- A new RDPMC_USR_DISABLE bit (bit 37) in each EVNTSELx MSR indicates
  that the GP counter can't be read by RDPMC in ring 3.
- New RDPMC_USR_DISABLE bits at bits 33, 37, 41, 45, etc. of the
  IA32_FIXED_CTR_CTRL MSR cover fixed counters 0, 1, 2, 3, etc.
- On RDPMC of counter x, the returned value is selected as:
  (!CPL0 && RDPMC_USR_DISABLE[x] == 1) ? 0 : counter_value
- RDPMC_USR_DISABLE is enumerated by CPUID.0x23.0.EBX[2].
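
Expressed as code (the macro and helper names below are illustrative,
not part of this series):

	#define EVNTSEL_RDPMC_USR_DISABLE	BIT_ULL(37)

	/* Fixed counter i: bit 33 + 4 * i of IA32_FIXED_CTR_CTRL */
	static inline u64 fixed_rdpmc_usr_disable(int i)
	{
		return BIT_ULL(33 + 4 * i);
	}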

For more details about "rdpmc user disable", please refer to chapter 15
"RDPMC USER DISABLE" in latest ISE.

This patch-set adds support for these 3 new features. In addition, the
DMR and NVL specific counter constraints are supported.

Tests:

The following tests pass on DMR and NVL (both P-core and E-core).
 a) Perf counting tests pass.
 b) Perf sampling tests pass.
 c) Perf PEBS based sampling tests pass.
 d) "rdpmc user disable" functionality tests pass.

Ref:

[1] ISE (version 60): https://www.intel.com/content/www/us/en/content-details/869288/intel-architecture-instruction-set-extensions-programming-reference.html

History:
v1: https://lore.kernel.org/all/20251120053431.491677-1-dapeng1.mi@linux.intel.com/
v2: https://lore.kernel.org/all/20260112051649.1113435-1-dapeng1.mi@linux.intel.com/

Dapeng Mi (7):
  perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL
  perf/x86/intel: Add support for PEBS memory auxiliary info field in
    DMR
  perf/x86/intel: Add core PMU support for DMR
  perf/x86/intel: Add support for PEBS memory auxiliary info field in
    NVL
  perf/x86/intel: Add core PMU support for Novalake
  perf/x86: Use macros to replace magic numbers in attr_rdpmc
  perf/x86/intel: Add support for rdpmc user disable feature

 .../sysfs-bus-event_source-devices-rdpmc      |  44 +++
 arch/x86/events/core.c                        |  28 +-
 arch/x86/events/intel/core.c                  | 364 +++++++++++++++++-
 arch/x86/events/intel/ds.c                    | 261 +++++++++++++
 arch/x86/events/intel/p6.c                    |   2 +-
 arch/x86/events/perf_event.h                  |  26 ++
 arch/x86/include/asm/msr-index.h              |   5 +
 arch/x86/include/asm/perf_event.h             |   8 +-
 include/uapi/linux/perf_event.h               |  27 +-
 tools/include/uapi/linux/perf_event.h         |  27 +-
 10 files changed, 762 insertions(+), 30 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc


base-commit: 01122b89361e565b3c88b9fbebe92dc5c7420cb7
-- 
2.34.1



* [Patch v3 1/7] perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL
  2026-01-14  1:17 [Patch v3 0/7] Enable core PMU for DMR and NVL Dapeng Mi
@ 2026-01-14  1:17 ` Dapeng Mi
  2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2026-01-14  1:17 ` [Patch v3 2/7] perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR Dapeng Mi
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Dapeng Mi @ 2026-01-14  1:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

Diamond Rapids (DMR) and Nova Lake (NVL) introduce an enhanced
Off-Module Response (OMR) facility, replacing the Off-Core Response
(OCR) facility of previous processors.

Legacy microarchitectures used the OCR facility to evaluate off-core and
multi-core off-module transactions. The newly named OMR facility improves
on the OCR capabilities, providing scalable coverage of the new memory
systems in multi-core module designs.

Similar to OCR, 4 additional off-module configuration MSRs
(OFFMODULE_RSP_0 to OFFMODULE_RSP_3) are introduced to specify the
attributes of off-module transactions. When multiple identical OMR
events are created, they would all contend for the same OFFMODULE_RSP_x
MSR. To let such identical OMR events work simultaneously, the
intel_alt_er() and intel_fixup_er() helpers are enhanced to rotate OMR
events across the different OFFMODULE_RSP_* MSRs, just as is done for
OCR events.

For more details about OMR, please refer to section 16.1 "OFF-MODULE
RESPONSE (OMR) FACILITY" in the ISE documentation.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---

v3: Enhance intel_alt_er() to align with intel_fixup_er() on code style.
v2: Optimize intel_fixup_er().

 arch/x86/events/intel/core.c     | 59 +++++++++++++++++++++++---------
 arch/x86/events/perf_event.h     |  5 +++
 arch/x86/include/asm/msr-index.h |  5 +++
 3 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 1840ca1918d1..3578c660a904 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3532,17 +3532,32 @@ static int intel_alt_er(struct cpu_hw_events *cpuc,
 	struct extra_reg *extra_regs = hybrid(cpuc->pmu, extra_regs);
 	int alt_idx = idx;
 
-	if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1))
-		return idx;
-
-	if (idx == EXTRA_REG_RSP_0)
-		alt_idx = EXTRA_REG_RSP_1;
+	switch (idx) {
+	case EXTRA_REG_RSP_0 ... EXTRA_REG_RSP_1:
+		if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1))
+			return idx;
+		if (++alt_idx > EXTRA_REG_RSP_1)
+			alt_idx = EXTRA_REG_RSP_0;
+		if (config & ~extra_regs[alt_idx].valid_mask)
+			return idx;
+		break;
 
-	if (idx == EXTRA_REG_RSP_1)
-		alt_idx = EXTRA_REG_RSP_0;
+	case EXTRA_REG_OMR_0 ... EXTRA_REG_OMR_3:
+		if (!(x86_pmu.flags & PMU_FL_HAS_OMR))
+			return idx;
+		if (++alt_idx > EXTRA_REG_OMR_3)
+			alt_idx = EXTRA_REG_OMR_0;
+		/*
+		 * Subtracting EXTRA_REG_OMR_0 ensures to get correct
+		 * OMR extra_reg entries which start from 0.
+		 */
+		if (config & ~extra_regs[alt_idx - EXTRA_REG_OMR_0].valid_mask)
+			return idx;
+		break;
 
-	if (config & ~extra_regs[alt_idx].valid_mask)
-		return idx;
+	default:
+		break;
+	}
 
 	return alt_idx;
 }
@@ -3550,16 +3565,26 @@ static int intel_alt_er(struct cpu_hw_events *cpuc,
 static void intel_fixup_er(struct perf_event *event, int idx)
 {
 	struct extra_reg *extra_regs = hybrid(event->pmu, extra_regs);
-	event->hw.extra_reg.idx = idx;
+	int er_idx;
 
-	if (idx == EXTRA_REG_RSP_0) {
-		event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
-		event->hw.config |= extra_regs[EXTRA_REG_RSP_0].event;
-		event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0;
-	} else if (idx == EXTRA_REG_RSP_1) {
+	event->hw.extra_reg.idx = idx;
+	switch (idx) {
+	case EXTRA_REG_RSP_0 ... EXTRA_REG_RSP_1:
+		er_idx = idx - EXTRA_REG_RSP_0;
 		event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
-		event->hw.config |= extra_regs[EXTRA_REG_RSP_1].event;
-		event->hw.extra_reg.reg = MSR_OFFCORE_RSP_1;
+		event->hw.config |= extra_regs[er_idx].event;
+		event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0 + er_idx;
+		break;
+
+	case EXTRA_REG_OMR_0 ... EXTRA_REG_OMR_3:
+		er_idx = idx - EXTRA_REG_OMR_0;
+		event->hw.config &= ~ARCH_PERFMON_EVENTSEL_UMASK;
+		event->hw.config |= 1ULL << (8 + er_idx);
+		event->hw.extra_reg.reg = MSR_OMR_0 + er_idx;
+		break;
+
+	default:
+		pr_warn("The extra reg idx %d is not supported.\n", idx);
 	}
 }
 
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 3161ec0a3416..586e3fdfe6d8 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -45,6 +45,10 @@ enum extra_reg_type {
 	EXTRA_REG_FE		= 4,  /* fe_* */
 	EXTRA_REG_SNOOP_0	= 5,  /* snoop response 0 */
 	EXTRA_REG_SNOOP_1	= 6,  /* snoop response 1 */
+	EXTRA_REG_OMR_0		= 7,  /* OMR 0 */
+	EXTRA_REG_OMR_1		= 8,  /* OMR 1 */
+	EXTRA_REG_OMR_2		= 9,  /* OMR 2 */
+	EXTRA_REG_OMR_3		= 10,  /* OMR 3 */
 
 	EXTRA_REG_MAX		      /* number of entries needed */
 };
@@ -1099,6 +1103,7 @@ do {									\
 #define PMU_FL_RETIRE_LATENCY	0x200 /* Support Retire Latency in PEBS */
 #define PMU_FL_BR_CNTR		0x400 /* Support branch counter logging */
 #define PMU_FL_DYN_CONSTRAINT	0x800 /* Needs dynamic constraint */
+#define PMU_FL_HAS_OMR		0x1000 /* has 4 equivalent OMR regs */
 
 #define EVENT_VAR(_id)  event_attr_##_id
 #define EVENT_PTR(_id) &event_attr_##_id.attr.attr
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3d0a0950d20a..6d1b69ea01c2 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -263,6 +263,11 @@
 #define MSR_SNOOP_RSP_0			0x00001328
 #define MSR_SNOOP_RSP_1			0x00001329
 
+#define MSR_OMR_0			0x000003e0
+#define MSR_OMR_1			0x000003e1
+#define MSR_OMR_2			0x000003e2
+#define MSR_OMR_3			0x000003e3
+
 #define MSR_LBR_SELECT			0x000001c8
 #define MSR_LBR_TOS			0x000001c9
 
-- 
2.34.1



* [Patch v3 2/7] perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
  2026-01-14  1:17 [Patch v3 0/7] Enable core PMU for DMR and NVL Dapeng Mi
  2026-01-14  1:17 ` [Patch v3 1/7] perf/x86/intel: Support the 4 new OMR MSRs introduced in " Dapeng Mi
@ 2026-01-14  1:17 ` Dapeng Mi
  2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2026-01-14  1:17 ` [Patch v3 3/7] perf/x86/intel: Add core PMU support for DMR Dapeng Mi
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Dapeng Mi @ 2026-01-14  1:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

With the introduction of the OMR feature, the PEBS memory auxiliary info
field for load and store latency events has been restructured for DMR.

The memory auxiliary info field's bit[8] indicates whether an L2 cache
miss occurred for a memory load or store instruction. If bit[8] is 0,
it signifies no L2 cache miss, and bits[7:0] specify the exact cache data
source (up to the L2 cache level). If bit[8] is 1, bits[7:0] represent
the OMR encoding, indicating the specific L3 cache or memory region
involved in the memory access.

A significant enhancement of the OMR encoding is the ability to report
up to 8 fine-grained memory regions in addition to the cache region,
offering more detailed insight into memory access regions.

For detailed information on the memory auxiliary info encoding, please
refer to section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in
the ISE documentation.

This patch ensures that the PEBS memory auxiliary info field is correctly
interpreted and utilized in DMR.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/ds.c            | 140 ++++++++++++++++++++++++++
 arch/x86/events/perf_event.h          |   2 +
 include/uapi/linux/perf_event.h       |  27 ++++-
 tools/include/uapi/linux/perf_event.h |  27 ++++-
 4 files changed, 190 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index feb1c3cf63e4..272e652f25fc 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -34,6 +34,17 @@ struct pebs_record_32 {
 
  */
 
+union omr_encoding {
+	struct {
+		u8 omr_source : 4;
+		u8 omr_remote : 1;
+		u8 omr_hitm : 1;
+		u8 omr_snoop : 1;
+		u8 omr_promoted : 1;
+	};
+	u8 omr_full;
+};
+
 union intel_x86_pebs_dse {
 	u64 val;
 	struct {
@@ -73,6 +84,18 @@ union intel_x86_pebs_dse {
 		unsigned int lnc_addr_blk:1;
 		unsigned int ld_reserved6:18;
 	};
+	struct {
+		unsigned int pnc_dse: 8;
+		unsigned int pnc_l2_miss:1;
+		unsigned int pnc_stlb_clean_hit:1;
+		unsigned int pnc_stlb_any_hit:1;
+		unsigned int pnc_stlb_miss:1;
+		unsigned int pnc_locked:1;
+		unsigned int pnc_data_blk:1;
+		unsigned int pnc_addr_blk:1;
+		unsigned int pnc_fb_full:1;
+		unsigned int ld_reserved8:16;
+	};
 };
 
 
@@ -228,6 +251,85 @@ void __init intel_pmu_pebs_data_source_lnl(void)
 	__intel_pmu_pebs_data_source_cmt(data_source);
 }
 
+/* Version for Panthercove and later */
+
+/* L2 hit */
+#define PNC_PEBS_DATA_SOURCE_MAX	16
+static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = {
+	P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),	/* 0x00: non-cache access */
+	OP_LH               | LEVEL(L0) | P(SNOOP, NONE),	/* 0x01: L0 hit */
+	OP_LH | P(LVL, L1)  | LEVEL(L1) | P(SNOOP, NONE),	/* 0x02: L1 hit */
+	OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE),	/* 0x03: L1 Miss Handling Buffer hit */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, NONE),	/* 0x04: L2 Hit Clean */
+	0,							/* 0x05: Reserved */
+	0,							/* 0x06: Reserved */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HIT),	/* 0x07: L2 Hit Snoop HIT */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HITM),	/* 0x08: L2 Hit Snoop Hit Modified */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),	/* 0x09: Prefetch Promotion */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),	/* 0x0a: Cross Core Prefetch Promotion */
+	0,							/* 0x0b: Reserved */
+	0,							/* 0x0c: Reserved */
+	0,							/* 0x0d: Reserved */
+	0,							/* 0x0e: Reserved */
+	OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE),	/* 0x0f: uncached */
+};
+
+/* L2 miss */
+#define OMR_DATA_SOURCE_MAX		16
+static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = {
+	P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),	/* 0x00: invalid */
+	0,							/* 0x01: Reserved */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_SHARE),	/* 0x02: local CA shared cache */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_NON_SHARE),/* 0x03: local CA non-shared cache */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_IO),	/* 0x04: other CA IO agent */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_SHARE),	/* 0x05: other CA shared cache */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_NON_SHARE),/* 0x06: other CA non-shared cache */
+	OP_LH | LEVEL(RAM) | P(REGION, MMIO),			/* 0x07: MMIO */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM0),			/* 0x08: Memory region 0 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM1),			/* 0x09: Memory region 1 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM2),			/* 0x0a: Memory region 2 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM3),			/* 0x0b: Memory region 3 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM4),			/* 0x0c: Memory region 4 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM5),			/* 0x0d: Memory region 5 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM6),			/* 0x0e: Memory region 6 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM7),			/* 0x0f: Memory region 7 */
+};
+
+static u64 parse_omr_data_source(u8 dse)
+{
+	union omr_encoding omr;
+	u64 val = 0;
+
+	omr.omr_full = dse;
+	val = omr_data_source[omr.omr_source];
+	if (omr.omr_source > 0x1 && omr.omr_source < 0x7)
+		val |= omr.omr_remote ? P(LVL, REM_CCE1) : 0;
+	else if (omr.omr_source > 0x7)
+		val |= omr.omr_remote ? P(LVL, REM_RAM1) : P(LVL, LOC_RAM);
+
+	if (omr.omr_remote)
+		val |= REM;
+
+	val |= omr.omr_hitm ? P(SNOOP, HITM) : P(SNOOP, HIT);
+
+	if (omr.omr_source == 0x2) {
+		u8 snoop = omr.omr_snoop | omr.omr_promoted;
+
+		if (snoop == 0x0)
+			val |= P(SNOOP, NA);
+		else if (snoop == 0x1)
+			val |= P(SNOOP, MISS);
+		else if (snoop == 0x2)
+			val |= P(SNOOP, HIT);
+		else if (snoop == 0x3)
+			val |= P(SNOOP, NONE);
+	} else if (omr.omr_source > 0x2 && omr.omr_source < 0x7) {
+		val |= omr.omr_snoop ? P(SNOOPX, FWD) : 0;
+	}
+
+	return val;
+}
+
 static u64 precise_store_data(u64 status)
 {
 	union intel_x86_pebs_dse dse;
@@ -411,6 +513,44 @@ u64 arl_h_latency_data(struct perf_event *event, u64 status)
 	return lnl_latency_data(event, status);
 }
 
+u64 pnc_latency_data(struct perf_event *event, u64 status)
+{
+	union intel_x86_pebs_dse dse;
+	union perf_mem_data_src src;
+	u64 val;
+
+	dse.val = status;
+
+	if (!dse.pnc_l2_miss)
+		val = pnc_pebs_l2_hit_data_source[dse.pnc_dse & 0xf];
+	else
+		val = parse_omr_data_source(dse.pnc_dse);
+
+	if (!val)
+		val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA);
+
+	if (dse.pnc_stlb_miss)
+		val |= P(TLB, MISS) | P(TLB, L2);
+	else
+		val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2);
+
+	if (dse.pnc_locked)
+		val |= P(LOCK, LOCKED);
+
+	if (dse.pnc_data_blk)
+		val |= P(BLK, DATA);
+	if (dse.pnc_addr_blk)
+		val |= P(BLK, ADDR);
+	if (!dse.pnc_data_blk && !dse.pnc_addr_blk)
+		val |= P(BLK, NA);
+
+	src.val = val;
+	if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
+		src.mem_op = P(OP, STORE);
+
+	return src.val;
+}
+
 static u64 load_latency_data(struct perf_event *event, u64 status)
 {
 	union intel_x86_pebs_dse dse;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 586e3fdfe6d8..bd501c2a0f73 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1664,6 +1664,8 @@ u64 lnl_latency_data(struct perf_event *event, u64 status);
 
 u64 arl_h_latency_data(struct perf_event *event, u64 status);
 
+u64 pnc_latency_data(struct perf_event *event, u64 status);
+
 extern struct event_constraint intel_core2_pebs_event_constraints[];
 
 extern struct event_constraint intel_atom_pebs_event_constraints[];
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index c44a8fb3e418..533393ec94d0 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1330,14 +1330,16 @@ union perf_mem_data_src {
 			mem_snoopx  :  2, /* Snoop mode, ext */
 			mem_blk     :  3, /* Access blocked */
 			mem_hops    :  3, /* Hop level */
-			mem_rsvd    : 18;
+			mem_region  :  5, /* cache/memory regions */
+			mem_rsvd    : 13;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64	mem_rsvd    : 18,
+		__u64	mem_rsvd    : 13,
+			mem_region  :  5, /* cache/memory regions */
 			mem_hops    :  3, /* Hop level */
 			mem_blk     :  3, /* Access blocked */
 			mem_snoopx  :  2, /* Snoop mode, ext */
@@ -1394,7 +1396,7 @@ union perf_mem_data_src {
 #define PERF_MEM_LVLNUM_L4			0x0004 /* L4 */
 #define PERF_MEM_LVLNUM_L2_MHB			0x0005 /* L2 Miss Handling Buffer */
 #define PERF_MEM_LVLNUM_MSC			0x0006 /* Memory-side Cache */
-/* 0x007 available */
+#define PERF_MEM_LVLNUM_L0			0x0007 /* L0 */
 #define PERF_MEM_LVLNUM_UNC			0x0008 /* Uncached */
 #define PERF_MEM_LVLNUM_CXL			0x0009 /* CXL */
 #define PERF_MEM_LVLNUM_IO			0x000a /* I/O */
@@ -1447,6 +1449,25 @@ union perf_mem_data_src {
 /* 5-7 available */
 #define PERF_MEM_HOPS_SHIFT			43
 
+/* Cache/Memory region */
+#define PERF_MEM_REGION_NA		0x0  /* Invalid */
+#define PERF_MEM_REGION_RSVD		0x01 /* Reserved */
+#define PERF_MEM_REGION_L_SHARE		0x02 /* Local CA shared cache */
+#define PERF_MEM_REGION_L_NON_SHARE	0x03 /* Local CA non-shared cache */
+#define PERF_MEM_REGION_O_IO		0x04 /* Other CA IO agent */
+#define PERF_MEM_REGION_O_SHARE		0x05 /* Other CA shared cache */
+#define PERF_MEM_REGION_O_NON_SHARE	0x06 /* Other CA non-shared cache */
+#define PERF_MEM_REGION_MMIO		0x07 /* MMIO */
+#define PERF_MEM_REGION_MEM0		0x08 /* Memory region 0 */
+#define PERF_MEM_REGION_MEM1		0x09 /* Memory region 1 */
+#define PERF_MEM_REGION_MEM2		0x0a /* Memory region 2 */
+#define PERF_MEM_REGION_MEM3		0x0b /* Memory region 3 */
+#define PERF_MEM_REGION_MEM4		0x0c /* Memory region 4 */
+#define PERF_MEM_REGION_MEM5		0x0d /* Memory region 5 */
+#define PERF_MEM_REGION_MEM6		0x0e /* Memory region 6 */
+#define PERF_MEM_REGION_MEM7		0x0f /* Memory region 7 */
+#define PERF_MEM_REGION_SHIFT		46
+
 #define PERF_MEM_S(a, s) \
 	(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
 
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index c44a8fb3e418..d4b99610a3b0 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -1330,14 +1330,16 @@ union perf_mem_data_src {
 			mem_snoopx  :  2, /* Snoop mode, ext */
 			mem_blk     :  3, /* Access blocked */
 			mem_hops    :  3, /* Hop level */
-			mem_rsvd    : 18;
+			mem_region  :  5, /* cache/memory regions */
+			mem_rsvd    : 13;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64	mem_rsvd    : 18,
+		__u64	mem_rsvd    : 13,
+			mem_region  :  5, /* cache/memory regions */
 			mem_hops    :  3, /* Hop level */
 			mem_blk     :  3, /* Access blocked */
 			mem_snoopx  :  2, /* Snoop mode, ext */
@@ -1394,7 +1396,7 @@ union perf_mem_data_src {
 #define PERF_MEM_LVLNUM_L4			0x0004 /* L4 */
 #define PERF_MEM_LVLNUM_L2_MHB			0x0005 /* L2 Miss Handling Buffer */
 #define PERF_MEM_LVLNUM_MSC			0x0006 /* Memory-side Cache */
-/* 0x007 available */
+#define PERF_MEM_LVLNUM_L0			0x0007   /* L0 */
 #define PERF_MEM_LVLNUM_UNC			0x0008 /* Uncached */
 #define PERF_MEM_LVLNUM_CXL			0x0009 /* CXL */
 #define PERF_MEM_LVLNUM_IO			0x000a /* I/O */
@@ -1447,6 +1449,25 @@ union perf_mem_data_src {
 /* 5-7 available */
 #define PERF_MEM_HOPS_SHIFT			43
 
+/* Cache/Memory region */
+#define PERF_MEM_REGION_NA		0x0  /* Invalid */
+#define PERF_MEM_REGION_RSVD		0x01 /* Reserved */
+#define PERF_MEM_REGION_L_SHARE		0x02 /* Local CA shared cache */
+#define PERF_MEM_REGION_L_NON_SHARE	0x03 /* Local CA non-shared cache */
+#define PERF_MEM_REGION_O_IO		0x04 /* Other CA IO agent */
+#define PERF_MEM_REGION_O_SHARE		0x05 /* Other CA shared cache */
+#define PERF_MEM_REGION_O_NON_SHARE	0x06 /* Other CA non-shared cache */
+#define PERF_MEM_REGION_MMIO		0x07 /* MMIO */
+#define PERF_MEM_REGION_MEM0		0x08 /* Memory region 0 */
+#define PERF_MEM_REGION_MEM1		0x09 /* Memory region 1 */
+#define PERF_MEM_REGION_MEM2		0x0a /* Memory region 2 */
+#define PERF_MEM_REGION_MEM3		0x0b /* Memory region 3 */
+#define PERF_MEM_REGION_MEM4		0x0c /* Memory region 4 */
+#define PERF_MEM_REGION_MEM5		0x0d /* Memory region 5 */
+#define PERF_MEM_REGION_MEM6		0x0e /* Memory region 6 */
+#define PERF_MEM_REGION_MEM7		0x0f /* Memory region 7 */
+#define PERF_MEM_REGION_SHIFT		46
+
 #define PERF_MEM_S(a, s) \
 	(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
 
-- 
2.34.1



* [Patch v3 3/7] perf/x86/intel: Add core PMU support for DMR
  2026-01-14  1:17 [Patch v3 0/7] Enable core PMU for DMR and NVL Dapeng Mi
  2026-01-14  1:17 ` [Patch v3 1/7] perf/x86/intel: Support the 4 new OMR MSRs introduced in " Dapeng Mi
  2026-01-14  1:17 ` [Patch v3 2/7] perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR Dapeng Mi
@ 2026-01-14  1:17 ` Dapeng Mi
  2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2026-01-14  1:17 ` [Patch v3 4/7] perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL Dapeng Mi
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Dapeng Mi @ 2026-01-14  1:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

This patch enables core PMU features for Diamond Rapids (Panther Cove
microarchitecture), including the Panther Cove specific counter and PEBS
constraints, a new cache event ID table, and the model-specific OMR
event extra registers table.

For detailed information about counter constraints, please refer to
section 16.3 "COUNTER RESTRICTIONS" in the ISE documentation.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---

v3: Simplify DMR enabling code in intel_pmu_init() by reusing the
enabling code of previous platforms. 

 arch/x86/events/intel/core.c | 179 ++++++++++++++++++++++++++++++++++-
 arch/x86/events/intel/ds.c   |  27 ++++++
 arch/x86/events/perf_event.h |   2 +
 3 files changed, 207 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 3578c660a904..b2f99d47292b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -435,6 +435,62 @@ static struct extra_reg intel_lnc_extra_regs[] __read_mostly = {
 	EVENT_EXTRA_END
 };
 
+static struct event_constraint intel_pnc_event_constraints[] = {
+	FIXED_EVENT_CONSTRAINT(0x00c0, 0),	/* INST_RETIRED.ANY */
+	FIXED_EVENT_CONSTRAINT(0x0100, 0),	/* INST_RETIRED.PREC_DIST */
+	FIXED_EVENT_CONSTRAINT(0x003c, 1),	/* CPU_CLK_UNHALTED.CORE */
+	FIXED_EVENT_CONSTRAINT(0x0300, 2),	/* CPU_CLK_UNHALTED.REF */
+	FIXED_EVENT_CONSTRAINT(0x013c, 2),	/* CPU_CLK_UNHALTED.REF_TSC_P */
+	FIXED_EVENT_CONSTRAINT(0x0400, 3),	/* SLOTS */
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_RETIRING, 0),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BAD_SPEC, 1),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_FE_BOUND, 2),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BE_BOUND, 3),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_HEAVY_OPS, 4),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BR_MISPREDICT, 5),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_FETCH_LAT, 6),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_MEM_BOUND, 7),
+
+	INTEL_EVENT_CONSTRAINT(0x20, 0xf),
+	INTEL_EVENT_CONSTRAINT(0x79, 0xf),
+
+	INTEL_UEVENT_CONSTRAINT(0x0275, 0xf),
+	INTEL_UEVENT_CONSTRAINT(0x0176, 0xf),
+	INTEL_UEVENT_CONSTRAINT(0x04a4, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x08a4, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x01cd, 0xfc),
+	INTEL_UEVENT_CONSTRAINT(0x02cd, 0x3),
+
+	INTEL_EVENT_CONSTRAINT(0xd0, 0xf),
+	INTEL_EVENT_CONSTRAINT(0xd1, 0xf),
+	INTEL_EVENT_CONSTRAINT(0xd4, 0xf),
+	INTEL_EVENT_CONSTRAINT(0xd6, 0xf),
+	INTEL_EVENT_CONSTRAINT(0xdf, 0xf),
+	INTEL_EVENT_CONSTRAINT(0xce, 0x1),
+
+	INTEL_UEVENT_CONSTRAINT(0x01b1, 0x8),
+	INTEL_UEVENT_CONSTRAINT(0x0847, 0xf),
+	INTEL_UEVENT_CONSTRAINT(0x0446, 0xf),
+	INTEL_UEVENT_CONSTRAINT(0x0846, 0xf),
+	INTEL_UEVENT_CONSTRAINT(0x0148, 0xf),
+
+	EVENT_CONSTRAINT_END
+};
+
+static struct extra_reg intel_pnc_extra_regs[] __read_mostly = {
+	/* must define OMR_X first, see intel_alt_er() */
+	INTEL_UEVENT_EXTRA_REG(0x012a, MSR_OMR_0, 0x40ffffff0000ffffull, OMR_0),
+	INTEL_UEVENT_EXTRA_REG(0x022a, MSR_OMR_1, 0x40ffffff0000ffffull, OMR_1),
+	INTEL_UEVENT_EXTRA_REG(0x042a, MSR_OMR_2, 0x40ffffff0000ffffull, OMR_2),
+	INTEL_UEVENT_EXTRA_REG(0x082a, MSR_OMR_3, 0x40ffffff0000ffffull, OMR_3),
+	INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd),
+	INTEL_UEVENT_EXTRA_REG(0x02c6, MSR_PEBS_FRONTEND, 0x9, FE),
+	INTEL_UEVENT_EXTRA_REG(0x03c6, MSR_PEBS_FRONTEND, 0x7fff1f, FE),
+	INTEL_UEVENT_EXTRA_REG(0x40ad, MSR_PEBS_FRONTEND, 0xf, FE),
+	INTEL_UEVENT_EXTRA_REG(0x04c2, MSR_PEBS_FRONTEND, 0x8, FE),
+	EVENT_EXTRA_END
+};
+
 EVENT_ATTR_STR(mem-loads,	mem_ld_nhm,	"event=0x0b,umask=0x10,ldlat=3");
 EVENT_ATTR_STR(mem-loads,	mem_ld_snb,	"event=0xcd,umask=0x1,ldlat=3");
 EVENT_ATTR_STR(mem-stores,	mem_st_snb,	"event=0xcd,umask=0x2");
@@ -650,6 +706,102 @@ static __initconst const u64 glc_hw_cache_extra_regs
  },
 };
 
+static __initconst const u64 pnc_hw_cache_event_ids
+				[PERF_COUNT_HW_CACHE_MAX]
+				[PERF_COUNT_HW_CACHE_OP_MAX]
+				[PERF_COUNT_HW_CACHE_RESULT_MAX] =
+{
+ [ C(L1D ) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = 0x81d0,
+		[ C(RESULT_MISS)   ] = 0xe124,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = 0x82d0,
+	},
+ },
+ [ C(L1I ) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_MISS)   ] = 0xe424,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+ },
+ [ C(LL  ) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = 0x12a,
+		[ C(RESULT_MISS)   ] = 0x12a,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = 0x12a,
+		[ C(RESULT_MISS)   ] = 0x12a,
+	},
+ },
+ [ C(DTLB) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = 0x81d0,
+		[ C(RESULT_MISS)   ] = 0xe12,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = 0x82d0,
+		[ C(RESULT_MISS)   ] = 0xe13,
+	},
+ },
+ [ C(ITLB) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = 0xe11,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+	[ C(OP_PREFETCH) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+ },
+ [ C(BPU ) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = 0x4c4,
+		[ C(RESULT_MISS)   ] = 0x4c5,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+	[ C(OP_PREFETCH) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+ },
+ [ C(NODE) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+ },
+};
+
+static __initconst const u64 pnc_hw_cache_extra_regs
+				[PERF_COUNT_HW_CACHE_MAX]
+				[PERF_COUNT_HW_CACHE_OP_MAX]
+				[PERF_COUNT_HW_CACHE_RESULT_MAX] =
+{
+ [ C(LL  ) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = 0x4000000000000001,
+		[ C(RESULT_MISS)   ] = 0xFFFFF000000001,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = 0x4000000000000002,
+		[ C(RESULT_MISS)   ] = 0xFFFFF000000002,
+	},
+ },
+};
+
 /*
  * Notes on the events:
  * - data reads do not include code reads (comparable to earlier tables)
@@ -7236,6 +7388,20 @@ static __always_inline void intel_pmu_init_lnc(struct pmu *pmu)
 	hybrid(pmu, extra_regs) = intel_lnc_extra_regs;
 }
 
+static __always_inline void intel_pmu_init_pnc(struct pmu *pmu)
+{
+	intel_pmu_init_glc(pmu);
+	x86_pmu.flags &= ~PMU_FL_HAS_RSP_1;
+	x86_pmu.flags |= PMU_FL_HAS_OMR;
+	memcpy(hybrid_var(pmu, hw_cache_event_ids),
+	       pnc_hw_cache_event_ids, sizeof(hw_cache_event_ids));
+	memcpy(hybrid_var(pmu, hw_cache_extra_regs),
+	       pnc_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
+	hybrid(pmu, event_constraints) = intel_pnc_event_constraints;
+	hybrid(pmu, pebs_constraints) = intel_pnc_pebs_event_constraints;
+	hybrid(pmu, extra_regs) = intel_pnc_extra_regs;
+}
+
 static __always_inline void intel_pmu_init_skt(struct pmu *pmu)
 {
 	intel_pmu_init_grt(pmu);
@@ -7897,9 +8063,21 @@ __init int intel_pmu_init(void)
 		x86_pmu.extra_regs = intel_rwc_extra_regs;
 		pr_cont("Granite Rapids events, ");
 		name = "granite_rapids";
+		goto glc_common;
+
+	case INTEL_DIAMONDRAPIDS_X:
+		intel_pmu_init_pnc(NULL);
+		x86_pmu.pebs_latency_data = pnc_latency_data;
+
+		pr_cont("Panthercove events, ");
+		name = "panthercove";
+		goto glc_base;
 
 	glc_common:
 		intel_pmu_init_glc(NULL);
+		intel_pmu_pebs_data_source_skl(true);
+
+	glc_base:
 		x86_pmu.pebs_ept = 1;
 		x86_pmu.hw_config = hsw_hw_config;
 		x86_pmu.get_event_constraints = glc_get_event_constraints;
@@ -7909,7 +8087,6 @@ __init int intel_pmu_init(void)
 		mem_attr = glc_events_attrs;
 		td_attr = glc_td_events_attrs;
 		tsx_attr = glc_tsx_events_attrs;
-		intel_pmu_pebs_data_source_skl(true);
 		break;
 
 	case INTEL_ALDERLAKE:
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 272e652f25fc..06e42ac33749 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1425,6 +1425,33 @@ struct event_constraint intel_lnc_pebs_event_constraints[] = {
 	EVENT_CONSTRAINT_END
 };
 
+struct event_constraint intel_pnc_pebs_event_constraints[] = {
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x100, 0x100000000ULL),	/* INST_RETIRED.PREC_DIST */
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x0400, 0x800000000ULL),
+
+	INTEL_HYBRID_LDLAT_CONSTRAINT(0x1cd, 0xfc),
+	INTEL_HYBRID_STLAT_CONSTRAINT(0x2cd, 0x3),
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf),	/* MEM_INST_RETIRED.STLB_MISS_LOADS */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x12d0, 0xf),	/* MEM_INST_RETIRED.STLB_MISS_STORES */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x21d0, 0xf),	/* MEM_INST_RETIRED.LOCK_LOADS */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x41d0, 0xf),	/* MEM_INST_RETIRED.SPLIT_LOADS */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x42d0, 0xf),	/* MEM_INST_RETIRED.SPLIT_STORES */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x81d0, 0xf),	/* MEM_INST_RETIRED.ALL_LOADS */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x82d0, 0xf),	/* MEM_INST_RETIRED.ALL_STORES */
+
+	INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD_RANGE(0xd1, 0xd4, 0xf),
+
+	INTEL_FLAGS_EVENT_CONSTRAINT(0xd0, 0xf),
+	INTEL_FLAGS_EVENT_CONSTRAINT(0xd6, 0xf),
+
+	/*
+	 * Everything else is handled by PMU_FL_PEBS_ALL, because we
+	 * need the full constraints from the main table.
+	 */
+
+	EVENT_CONSTRAINT_END
+};
+
 struct event_constraint *intel_pebs_constraints(struct perf_event *event)
 {
 	struct event_constraint *pebs_constraints = hybrid(event->pmu, pebs_constraints);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index bd501c2a0f73..cbca1888e8f7 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1698,6 +1698,8 @@ extern struct event_constraint intel_glc_pebs_event_constraints[];
 
 extern struct event_constraint intel_lnc_pebs_event_constraints[];
 
+extern struct event_constraint intel_pnc_pebs_event_constraints[];
+
 struct event_constraint *intel_pebs_constraints(struct perf_event *event);
 
 void intel_pmu_pebs_add(struct perf_event *event);
-- 
2.34.1



* [Patch v3 4/7] perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL
  2026-01-14  1:17 [Patch v3 0/7] Enable core PMU for DMR and NVL Dapeng Mi
                   ` (2 preceding siblings ...)
  2026-01-14  1:17 ` [Patch v3 3/7] perf/x86/intel: Add core PMU support for DMR Dapeng Mi
@ 2026-01-14  1:17 ` Dapeng Mi
  2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2026-01-14  1:17 ` [Patch v3 5/7] perf/x86/intel: Add core PMU support for Novalake Dapeng Mi
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Dapeng Mi @ 2026-01-14  1:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

Similar to DMR (Panther Cove uarch), both P-core (Coyote Cove uarch) and
E-core (Arctic Wolf uarch) of NVL adopt the new PEBS memory auxiliary
info layout.

The Coyote Cove microarchitecture shares the same PMU capabilities,
including the memory auxiliary info layout, with Panther Cove. The
Arctic Wolf microarchitecture has a similar layout, the only difference
being the specific data source encoding for L2 hit cases (up to the L2
cache level); the OMR encoding remains the same as on Panther Cove.

For detailed information on the memory auxiliary info encoding, please
refer to section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in
the latest ISE documentation.

This patch defines the Arctic Wolf specific data source encoding and
adds support for the PEBS memory auxiliary info field on NVL.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/ds.c   | 83 ++++++++++++++++++++++++++++++++++++
 arch/x86/events/perf_event.h |  2 +
 2 files changed, 85 insertions(+)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 06e42ac33749..a47f173d411b 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -96,6 +96,18 @@ union intel_x86_pebs_dse {
 		unsigned int pnc_fb_full:1;
 		unsigned int ld_reserved8:16;
 	};
+	struct {
+		unsigned int arw_dse:8;
+		unsigned int arw_l2_miss:1;
+		unsigned int arw_xq_promotion:1;
+		unsigned int arw_reissue:1;
+		unsigned int arw_stlb_miss:1;
+		unsigned int arw_locked:1;
+		unsigned int arw_data_blk:1;
+		unsigned int arw_addr_blk:1;
+		unsigned int arw_fb_full:1;
+		unsigned int ld_reserved9:16;
+	};
 };
 
 
@@ -274,6 +286,29 @@ static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = {
 	OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE),	/* 0x0f: uncached */
 };
 
+/* Version for Arctic Wolf and later */
+
+/* L2 hit */
+#define ARW_PEBS_DATA_SOURCE_MAX	16
+static u64 arw_pebs_l2_hit_data_source[ARW_PEBS_DATA_SOURCE_MAX] = {
+	P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),	/* 0x00: non-cache access */
+	OP_LH | P(LVL, L1)  | LEVEL(L1) | P(SNOOP, NONE),	/* 0x01: L1 hit */
+	OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE),	/* 0x02: WCB Hit */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, NONE),	/* 0x03: L2 Hit Clean */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HIT),	/* 0x04: L2 Hit Snoop HIT */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HITM),	/* 0x05: L2 Hit Snoop Hit Modified */
+	OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE),	/* 0x06: uncached */
+	0,							/* 0x07: Reserved */
+	0,							/* 0x08: Reserved */
+	0,							/* 0x09: Reserved */
+	0,							/* 0x0a: Reserved */
+	0,							/* 0x0b: Reserved */
+	0,							/* 0x0c: Reserved */
+	0,							/* 0x0d: Reserved */
+	0,							/* 0x0e: Reserved */
+	0,							/* 0x0f: Reserved */
+};
+
 /* L2 miss */
 #define OMR_DATA_SOURCE_MAX		16
 static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = {
@@ -458,6 +493,44 @@ u64 cmt_latency_data(struct perf_event *event, u64 status)
 				  dse.mtl_fwd_blk);
 }
 
+static u64 arw_latency_data(struct perf_event *event, u64 status)
+{
+	union intel_x86_pebs_dse dse;
+	union perf_mem_data_src src;
+	u64 val;
+
+	dse.val = status;
+
+	if (!dse.arw_l2_miss)
+		val = arw_pebs_l2_hit_data_source[dse.arw_dse & 0xf];
+	else
+		val = parse_omr_data_source(dse.arw_dse);
+
+	if (!val)
+		val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA);
+
+	if (dse.arw_stlb_miss)
+		val |= P(TLB, MISS) | P(TLB, L2);
+	else
+		val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2);
+
+	if (dse.arw_locked)
+		val |= P(LOCK, LOCKED);
+
+	if (dse.arw_data_blk)
+		val |= P(BLK, DATA);
+	if (dse.arw_addr_blk)
+		val |= P(BLK, ADDR);
+	if (!dse.arw_data_blk && !dse.arw_addr_blk)
+		val |= P(BLK, NA);
+
+	src.val = val;
+	if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
+		src.mem_op = P(OP, STORE);
+
+	return src.val;
+}
+
 static u64 lnc_latency_data(struct perf_event *event, u64 status)
 {
 	union intel_x86_pebs_dse dse;
@@ -551,6 +624,16 @@ u64 pnc_latency_data(struct perf_event *event, u64 status)
 	return src.val;
 }
 
+u64 nvl_latency_data(struct perf_event *event, u64 status)
+{
+	struct x86_hybrid_pmu *pmu = hybrid_pmu(event->pmu);
+
+	if (pmu->pmu_type == hybrid_small)
+		return arw_latency_data(event, status);
+
+	return pnc_latency_data(event, status);
+}
+
 static u64 load_latency_data(struct perf_event *event, u64 status)
 {
 	union intel_x86_pebs_dse dse;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index cbca1888e8f7..aedc1a7762c2 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1666,6 +1666,8 @@ u64 arl_h_latency_data(struct perf_event *event, u64 status);
 
 u64 pnc_latency_data(struct perf_event *event, u64 status);
 
+u64 nvl_latency_data(struct perf_event *event, u64 status);
+
 extern struct event_constraint intel_core2_pebs_event_constraints[];
 
 extern struct event_constraint intel_atom_pebs_event_constraints[];
-- 
2.34.1



* [Patch v3 5/7] perf/x86/intel: Add core PMU support for Novalake
  2026-01-14  1:17 [Patch v3 0/7] Enable core PMU for DMR and NVL Dapeng Mi
                   ` (3 preceding siblings ...)
  2026-01-14  1:17 ` [Patch v3 4/7] perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL Dapeng Mi
@ 2026-01-14  1:17 ` Dapeng Mi
  2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2026-01-14  1:17 ` [Patch v3 6/7] perf/x86: Use macros to replace magic numbers in attr_rdpmc Dapeng Mi
  2026-01-14  1:17 ` [Patch v3 7/7] perf/x86/intel: Add support for rdpmc user disable feature Dapeng Mi
  6 siblings, 1 reply; 21+ messages in thread
From: Dapeng Mi @ 2026-01-14  1:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

This patch enables core PMU support for Novalake, covering both P-core
and E-core. It includes the Arctic Wolf specific counter and PEBS
constraints, and the model-specific OMR extra registers table.

Since Coyote Cove shares the same PMU capabilities as Panther Cove, the
existing Panther Cove PMU enabling functions are reused for Coyote Cove.

For detailed information about counter constraints, please refer to
section 16.3 "COUNTER RESTRICTIONS" in the ISE documentation.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c | 99 ++++++++++++++++++++++++++++++++++++
 arch/x86/events/intel/ds.c   | 11 ++++
 arch/x86/events/perf_event.h |  2 +
 3 files changed, 112 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index b2f99d47292b..d6bdbb7e449a 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -232,6 +232,29 @@ static struct event_constraint intel_skt_event_constraints[] __read_mostly = {
 	EVENT_CONSTRAINT_END
 };
 
+static struct event_constraint intel_arw_event_constraints[] __read_mostly = {
+	FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
+	FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */
+	FIXED_EVENT_CONSTRAINT(0x0300, 2), /* pseudo CPU_CLK_UNHALTED.REF */
+	FIXED_EVENT_CONSTRAINT(0x013c, 2), /* CPU_CLK_UNHALTED.REF_TSC_P */
+	FIXED_EVENT_CONSTRAINT(0x0073, 4), /* TOPDOWN_BAD_SPECULATION.ALL */
+	FIXED_EVENT_CONSTRAINT(0x019c, 5), /* TOPDOWN_FE_BOUND.ALL */
+	FIXED_EVENT_CONSTRAINT(0x02c2, 6), /* TOPDOWN_RETIRING.ALL */
+	INTEL_UEVENT_CONSTRAINT(0x01b7, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x02b7, 0x2),
+	INTEL_UEVENT_CONSTRAINT(0x04b7, 0x4),
+	INTEL_UEVENT_CONSTRAINT(0x08b7, 0x8),
+	INTEL_UEVENT_CONSTRAINT(0x01d4, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x02d4, 0x2),
+	INTEL_UEVENT_CONSTRAINT(0x04d4, 0x4),
+	INTEL_UEVENT_CONSTRAINT(0x08d4, 0x8),
+	INTEL_UEVENT_CONSTRAINT(0x0175, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x0275, 0x2),
+	INTEL_UEVENT_CONSTRAINT(0x21d3, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x22d3, 0x1),
+	EVENT_CONSTRAINT_END
+};
+
 static struct event_constraint intel_skl_event_constraints[] = {
 	FIXED_EVENT_CONSTRAINT(0x00c0, 0),	/* INST_RETIRED.ANY */
 	FIXED_EVENT_CONSTRAINT(0x003c, 1),	/* CPU_CLK_UNHALTED.CORE */
@@ -2319,6 +2342,26 @@ static __initconst const u64 tnt_hw_cache_extra_regs
 	},
 };
 
+static __initconst const u64 arw_hw_cache_extra_regs
+				[PERF_COUNT_HW_CACHE_MAX]
+				[PERF_COUNT_HW_CACHE_OP_MAX]
+				[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+	[C(LL)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)]	= 0x4000000000000001,
+			[C(RESULT_MISS)]	= 0xFFFFF000000001,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)]	= 0x4000000000000002,
+			[C(RESULT_MISS)]	= 0xFFFFF000000002,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)]	= 0x0,
+			[C(RESULT_MISS)]	= 0x0,
+		},
+	},
+};
+
 EVENT_ATTR_STR(topdown-fe-bound,       td_fe_bound_tnt,        "event=0x71,umask=0x0");
 EVENT_ATTR_STR(topdown-retiring,       td_retiring_tnt,        "event=0xc2,umask=0x0");
 EVENT_ATTR_STR(topdown-bad-spec,       td_bad_spec_tnt,        "event=0x73,umask=0x6");
@@ -2377,6 +2420,22 @@ static struct extra_reg intel_cmt_extra_regs[] __read_mostly = {
 	EVENT_EXTRA_END
 };
 
+static struct extra_reg intel_arw_extra_regs[] __read_mostly = {
+	/* must define OMR_X first, see intel_alt_er() */
+	INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OMR_0, 0xc0ffffffffffffffull, OMR_0),
+	INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OMR_1, 0xc0ffffffffffffffull, OMR_1),
+	INTEL_UEVENT_EXTRA_REG(0x04b7, MSR_OMR_2, 0xc0ffffffffffffffull, OMR_2),
+	INTEL_UEVENT_EXTRA_REG(0x08b7, MSR_OMR_3, 0xc0ffffffffffffffull, OMR_3),
+	INTEL_UEVENT_EXTRA_REG(0x01d4, MSR_OMR_0, 0xc0ffffffffffffffull, OMR_0),
+	INTEL_UEVENT_EXTRA_REG(0x02d4, MSR_OMR_1, 0xc0ffffffffffffffull, OMR_1),
+	INTEL_UEVENT_EXTRA_REG(0x04d4, MSR_OMR_2, 0xc0ffffffffffffffull, OMR_2),
+	INTEL_UEVENT_EXTRA_REG(0x08d4, MSR_OMR_3, 0xc0ffffffffffffffull, OMR_3),
+	INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x5d0),
+	INTEL_UEVENT_EXTRA_REG(0x0127, MSR_SNOOP_RSP_0, 0xffffffffffffffffull, SNOOP_0),
+	INTEL_UEVENT_EXTRA_REG(0x0227, MSR_SNOOP_RSP_1, 0xffffffffffffffffull, SNOOP_1),
+	EVENT_EXTRA_END
+};
+
 EVENT_ATTR_STR(topdown-fe-bound,       td_fe_bound_skt,        "event=0x9c,umask=0x01");
 EVENT_ATTR_STR(topdown-retiring,       td_retiring_skt,        "event=0xc2,umask=0x02");
 EVENT_ATTR_STR(topdown-be-bound,       td_be_bound_skt,        "event=0xa4,umask=0x02");
@@ -7410,6 +7469,19 @@ static __always_inline void intel_pmu_init_skt(struct pmu *pmu)
 	static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
 }
 
+static __always_inline void intel_pmu_init_arw(struct pmu *pmu)
+{
+	intel_pmu_init_grt(pmu);
+	x86_pmu.flags &= ~PMU_FL_HAS_RSP_1;
+	x86_pmu.flags |= PMU_FL_HAS_OMR;
+	memcpy(hybrid_var(pmu, hw_cache_extra_regs),
+	       arw_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
+	hybrid(pmu, event_constraints) = intel_arw_event_constraints;
+	hybrid(pmu, pebs_constraints) = intel_arw_pebs_event_constraints;
+	hybrid(pmu, extra_regs) = intel_arw_extra_regs;
+	static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
+}
+
 __init int intel_pmu_init(void)
 {
 	struct attribute **extra_skl_attr = &empty_attrs;
@@ -8250,6 +8322,33 @@ __init int intel_pmu_init(void)
 		name = "arrowlake_h_hybrid";
 		break;
 
+	case INTEL_NOVALAKE:
+	case INTEL_NOVALAKE_L:
+		pr_cont("Novalake Hybrid events, ");
+		name = "novalake_hybrid";
+		intel_pmu_init_hybrid(hybrid_big_small);
+
+		x86_pmu.pebs_latency_data = nvl_latency_data;
+		x86_pmu.get_event_constraints = mtl_get_event_constraints;
+		x86_pmu.hw_config = adl_hw_config;
+
+		td_attr = lnl_hybrid_events_attrs;
+		mem_attr = mtl_hybrid_mem_attrs;
+		tsx_attr = adl_hybrid_tsx_attrs;
+		extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
+			mtl_hybrid_extra_attr_rtm : mtl_hybrid_extra_attr;
+
+		/* Initialize big core specific PerfMon capabilities.*/
+		pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_CORE_IDX];
+		intel_pmu_init_pnc(&pmu->pmu);
+
+		/* Initialize Atom core specific PerfMon capabilities.*/
+		pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_ATOM_IDX];
+		intel_pmu_init_arw(&pmu->pmu);
+
+		intel_pmu_pebs_data_source_lnl();
+		break;
+
 	default:
 		switch (x86_pmu.version) {
 		case 1:
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index a47f173d411b..5027afc97b65 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1293,6 +1293,17 @@ struct event_constraint intel_grt_pebs_event_constraints[] = {
 	EVENT_CONSTRAINT_END
 };
 
+struct event_constraint intel_arw_pebs_event_constraints[] = {
+	/* Allow all events as PEBS with no flags */
+	INTEL_HYBRID_LAT_CONSTRAINT(0x5d0, 0xff),
+	INTEL_HYBRID_LAT_CONSTRAINT(0x6d0, 0xff),
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x01d4, 0x1),
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x02d4, 0x2),
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x04d4, 0x4),
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x08d4, 0x8),
+	EVENT_CONSTRAINT_END
+};
+
 struct event_constraint intel_nehalem_pebs_event_constraints[] = {
 	INTEL_PLD_CONSTRAINT(0x100b, 0xf),      /* MEM_INST_RETIRED.* */
 	INTEL_FLAGS_EVENT_CONSTRAINT(0x0f, 0xf),    /* MEM_UNCORE_RETIRED.* */
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index aedc1a7762c2..f7caabc5d487 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1680,6 +1680,8 @@ extern struct event_constraint intel_glp_pebs_event_constraints[];
 
 extern struct event_constraint intel_grt_pebs_event_constraints[];
 
+extern struct event_constraint intel_arw_pebs_event_constraints[];
+
 extern struct event_constraint intel_nehalem_pebs_event_constraints[];
 
 extern struct event_constraint intel_westmere_pebs_event_constraints[];
-- 
2.34.1



* [Patch v3 6/7] perf/x86: Use macros to replace magic numbers in attr_rdpmc
  2026-01-14  1:17 [Patch v3 0/7] Enable core PMU for DMR and NVL Dapeng Mi
                   ` (4 preceding siblings ...)
  2026-01-14  1:17 ` [Patch v3 5/7] perf/x86/intel: Add core PMU support for Novalake Dapeng Mi
@ 2026-01-14  1:17 ` Dapeng Mi
  2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2026-03-09 23:56   ` [Patch v3 6/7] " Ian Rogers
  2026-01-14  1:17 ` [Patch v3 7/7] perf/x86/intel: Add support for rdpmc user disable feature Dapeng Mi
  6 siblings, 2 replies; 21+ messages in thread
From: Dapeng Mi @ 2026-01-14  1:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

Replace magic numbers in attr_rdpmc with macros to improve readability
and make their meanings clearer for users.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/core.c       | 7 ++++---
 arch/x86/events/intel/p6.c   | 2 +-
 arch/x86/events/perf_event.h | 7 +++++++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 0ecac9495d74..c2717cb5034f 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2163,7 +2163,8 @@ static int __init init_hw_perf_events(void)
 
 	pr_cont("%s PMU driver.\n", x86_pmu.name);
 
-	x86_pmu.attr_rdpmc = 1; /* enable userspace RDPMC usage by default */
+	/* enable userspace RDPMC usage by default */
+	x86_pmu.attr_rdpmc = X86_USER_RDPMC_CONDITIONAL_ENABLE;
 
 	for (quirk = x86_pmu.quirks; quirk; quirk = quirk->next)
 		quirk->func();
@@ -2643,12 +2644,12 @@ static ssize_t set_attr_rdpmc(struct device *cdev,
 		 */
 		if (val == 0)
 			static_branch_inc(&rdpmc_never_available_key);
-		else if (x86_pmu.attr_rdpmc == 0)
+		else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_NEVER_ENABLE)
 			static_branch_dec(&rdpmc_never_available_key);
 
 		if (val == 2)
 			static_branch_inc(&rdpmc_always_available_key);
-		else if (x86_pmu.attr_rdpmc == 2)
+		else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE)
 			static_branch_dec(&rdpmc_always_available_key);
 
 		on_each_cpu(cr4_update_pce, NULL, 1);
diff --git a/arch/x86/events/intel/p6.c b/arch/x86/events/intel/p6.c
index 6e41de355bd8..fb991e0ac614 100644
--- a/arch/x86/events/intel/p6.c
+++ b/arch/x86/events/intel/p6.c
@@ -243,7 +243,7 @@ static __init void p6_pmu_rdpmc_quirk(void)
 		 */
 		pr_warn("Userspace RDPMC support disabled due to a CPU erratum\n");
 		x86_pmu.attr_rdpmc_broken = 1;
-		x86_pmu.attr_rdpmc = 0;
+		x86_pmu.attr_rdpmc = X86_USER_RDPMC_NEVER_ENABLE;
 	}
 }
 
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index f7caabc5d487..24a81d2916e9 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -187,6 +187,13 @@ struct amd_nb {
 	 (1ULL << PERF_REG_X86_R14)   | \
 	 (1ULL << PERF_REG_X86_R15))
 
+/* user space rdpmc control values */
+enum {
+	X86_USER_RDPMC_NEVER_ENABLE		= 0,
+	X86_USER_RDPMC_CONDITIONAL_ENABLE	= 1,
+	X86_USER_RDPMC_ALWAYS_ENABLE		= 2,
+};
+
 /*
  * Per register state.
  */
-- 
2.34.1



* [Patch v3 7/7] perf/x86/intel: Add support for rdpmc user disable feature
  2026-01-14  1:17 [Patch v3 0/7] Enable core PMU for DMR and NVL Dapeng Mi
                   ` (5 preceding siblings ...)
  2026-01-14  1:17 ` [Patch v3 6/7] perf/x86: Use macros to replace magic numbers in attr_rdpmc Dapeng Mi
@ 2026-01-14  1:17 ` Dapeng Mi
  2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  6 siblings, 1 reply; 21+ messages in thread
From: Dapeng Mi @ 2026-01-14  1:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

Starting with Panther Cove, the rdpmc user disable feature is supported.
This feature allows the perf system to disable user space rdpmc reads at
the counter level.

Currently, when a global counter is active, any user with rdpmc rights
can read it, even if perf access permissions forbid it (e.g., disallow
reading ring 0 counters). The rdpmc user disable feature mitigates this
security concern.

Details:

- A new RDPMC_USR_DISABLE bit (bit 37) in each EVNTSELx MSR indicates
  that the GP counter cannot be read by RDPMC in ring 3.
- New RDPMC_USR_DISABLE bits in IA32_FIXED_CTR_CTRL MSR (bits 33, 37,
  41, 45, etc.) for fixed counters 0, 1, 2, 3, etc.
- When the rdpmc instruction is executed for counter x, the following
  pseudo code shows how the returned value is selected:
  	(!CPL0 && RDPMC_USR_DISABLE[x] == 1) ? 0 : counter_value;
- RDPMC_USR_DISABLE is enumerated by CPUID.0x23.0.EBX[2].

This patch extends the current global user space rdpmc control logic via
the sysfs interface (/sys/devices/cpu/rdpmc) as follows:

- rdpmc = 0:
  Global user space rdpmc and counter-level user space rdpmc for all
  counters are both disabled.
- rdpmc = 1:
  Global user space rdpmc is enabled during the mmap-enabled time window,
  and counter-level user space rdpmc is enabled only for non-system-wide
  events. This prevents counter data leaks as count data is cleared
  during context switches.
- rdpmc = 2:
  Global user space rdpmc and counter-level user space rdpmc for all
  counters are enabled unconditionally.

The new rdpmc settings only affect newly activated perf events; currently
active perf events remain unaffected. This simplifies and cleans up the
code. The default value of rdpmc remains unchanged at 1.
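
A minimal user-space sketch of the rdpmc = 1 read path (simplified:
the full protocol also rechecks pc->lock and applies pc->offset,
omitted here for brevity):

	#include <stdint.h>
	#include <linux/perf_event.h>

	static uint64_t rdpmc(uint32_t idx)
	{
		uint32_t lo, hi;

		asm volatile("rdpmc" : "=a" (lo), "=d" (hi) : "c" (idx));
		return ((uint64_t)hi << 32) | lo;
	}

	/* pc points at the event's mmap'ed perf_event_mmap_page */
	static uint64_t read_counter(struct perf_event_mmap_page *pc)
	{
		if (!pc->cap_user_rdpmc || !pc->index)
			return 0;	/* user space rdpmc not usable */

		return rdpmc(pc->index - 1) &
		       (((uint64_t)1 << pc->pmc_width) - 1);
	}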

For more details about rdpmc user disable, please refer to chapter 15
"RDPMC USER DISABLE" in ISE documentation.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---

v3: Mention in the documentation that the rdpmc attribute is global
even on hybrid platforms.

 .../sysfs-bus-event_source-devices-rdpmc      | 44 +++++++++++++++++++
 arch/x86/events/core.c                        | 21 +++++++++
 arch/x86/events/intel/core.c                  | 27 ++++++++++++
 arch/x86/events/perf_event.h                  |  6 +++
 arch/x86/include/asm/perf_event.h             |  8 +++-
 5 files changed, 104 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc b/Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc
new file mode 100644
index 000000000000..59ec18bbb418
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc
@@ -0,0 +1,44 @@
+What:           /sys/bus/event_source/devices/cpu.../rdpmc
+Date:           November 2011
+KernelVersion:  3.10
+Contact:        Linux kernel mailing list linux-kernel@vger.kernel.org
+Description:    The /sys/bus/event_source/devices/cpu.../rdpmc attribute
+                is used to show and manage whether the rdpmc instruction
+                can be executed in user space. It accepts 3 values.
+                - rdpmc = 0
+                user space rdpmc is globally disabled for all PMU
+                counters.
+                - rdpmc = 1
+                user space rdpmc is globally enabled only within the
+                time window in which an event is mmap'ed. Once the mmap
+                region is unmapped, user space rdpmc is disabled again.
+                - rdpmc = 2
+                user space rdpmc is globally enabled for all PMU
+                counters.
+
+                On Intel platforms supporting the counter-level user
+                space rdpmc disable feature (CPUID.23H.EBX[2] = 1), the
+                meaning of the 3 values is extended to
+                - rdpmc = 0
+                global user space rdpmc and counter-level user space
+                rdpmc of all counters are both disabled.
+                - rdpmc = 1
+                No change in the behavior of global user space rdpmc.
+                Counter-level rdpmc of system-wide events is disabled,
+                but counter-level rdpmc of non-system-wide events is
+                enabled.
+                - rdpmc = 2
+                global user space rdpmc and counter-level user space
+                rdpmc of all counters are both enabled unconditionally.
+
+                The default value of rdpmc is 1.
+
+                Please note:
+                - the behavior of global user space rdpmc changes
+                immediately when the rdpmc value changes, but a new
+                counter-level user space rdpmc setting does not take
+                effect until the affected event is reactivated or
+                recreated.
+                - The rdpmc attribute is global, even for x86 hybrid
+                platforms. For example, changing cpu_core/rdpmc will
+                also change cpu_atom/rdpmc.
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index c2717cb5034f..6df73e8398cd 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2616,6 +2616,27 @@ static ssize_t get_attr_rdpmc(struct device *cdev,
 	return snprintf(buf, 40, "%d\n", x86_pmu.attr_rdpmc);
 }
 
+/*
+ * Behaviors of the rdpmc value:
+ * - rdpmc = 0
+ *    global user space rdpmc and counter-level user space rdpmc of all
+ *    counters are both disabled.
+ * - rdpmc = 1
+ *    global user space rdpmc is enabled within the mmap-enabled time
+ *    window, and counter-level user space rdpmc is enabled only for
+ *    non-system-wide events. Counter-level user space rdpmc of
+ *    system-wide events is still disabled by default. This doesn't leak
+ *    counter data for non-system-wide events since their count data is
+ *    cleared on context switches.
+ * - rdpmc = 2
+ *    global user space rdpmc and counter-level user space rdpmc of all
+ *    counters are enabled unconditionally.
+ *
+ * Assuming the rdpmc value is not changed frequently, don't dynamically
+ * reschedule events to make a new rdpmc value take effect on active perf
+ * events immediately; a new rdpmc value only impacts newly activated
+ * perf events. This makes the code simpler and cleaner.
+ */
 static ssize_t set_attr_rdpmc(struct device *cdev,
 			      struct device_attribute *attr,
 			      const char *buf, size_t count)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index d6bdbb7e449a..f3ae1f8ee3cd 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3128,6 +3128,8 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
 		bits |= INTEL_FIXED_0_USER;
 	if (hwc->config & ARCH_PERFMON_EVENTSEL_OS)
 		bits |= INTEL_FIXED_0_KERNEL;
+	if (hwc->config & ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE)
+		bits |= INTEL_FIXED_0_RDPMC_USER_DISABLE;
 
 	/*
 	 * ANY bit is supported in v3 and up
@@ -3263,6 +3265,27 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
 		__intel_pmu_update_event_ext(hwc->idx, ext);
 }
 
+static void intel_pmu_update_rdpmc_user_disable(struct perf_event *event)
+{
+	if (!x86_pmu_has_rdpmc_user_disable(event->pmu))
+		return;
+
+	/*
+	 * Counter-level user-space rdpmc is disabled by default
+	 * except in two cases:
+	 * a. rdpmc = 2 (user space rdpmc enabled unconditionally)
+	 * b. rdpmc = 1 and the event is not a system-wide event.
+	 *    The count of non-system-wide events is cleared on
+	 *    context switches, so no count data is leaked.
+	 */
+	if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE ||
+	    (x86_pmu.attr_rdpmc == X86_USER_RDPMC_CONDITIONAL_ENABLE &&
+	     event->ctx->task))
+		event->hw.config &= ~ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
+	else
+		event->hw.config |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
+}
+
 DEFINE_STATIC_CALL_NULL(intel_pmu_enable_event_ext, intel_pmu_enable_event_ext);
 
 static void intel_pmu_enable_event(struct perf_event *event)
@@ -3271,6 +3294,8 @@ static void intel_pmu_enable_event(struct perf_event *event)
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
+	intel_pmu_update_rdpmc_user_disable(event);
+
 	if (unlikely(event->attr.precise_ip))
 		static_call(x86_pmu_pebs_enable)(event);
 
@@ -5869,6 +5894,8 @@ static void update_pmu_cap(struct pmu *pmu)
 		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
 	if (ebx_0.split.eq)
 		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;
+	if (ebx_0.split.rdpmc_user_disable)
+		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
 
 	if (eax_0.split.cntr_subleaf) {
 		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 24a81d2916e9..cd337f3ffd01 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1333,6 +1333,12 @@ static inline u64 x86_pmu_get_event_config(struct perf_event *event)
 	return event->attr.config & hybrid(event->pmu, config_mask);
 }
 
+static inline bool x86_pmu_has_rdpmc_user_disable(struct pmu *pmu)
+{
+	return !!(hybrid(pmu, config_mask) &
+		 ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE);
+}
+
 extern struct event_constraint emptyconstraint;
 
 extern struct event_constraint unconstrained;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 0d9af4135e0a..ff5acb8b199b 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -33,6 +33,7 @@
 #define ARCH_PERFMON_EVENTSEL_CMASK			0xFF000000ULL
 #define ARCH_PERFMON_EVENTSEL_BR_CNTR			(1ULL << 35)
 #define ARCH_PERFMON_EVENTSEL_EQ			(1ULL << 36)
+#define ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE	(1ULL << 37)
 #define ARCH_PERFMON_EVENTSEL_UMASK2			(0xFFULL << 40)
 
 #define INTEL_FIXED_BITS_STRIDE			4
@@ -40,6 +41,7 @@
 #define INTEL_FIXED_0_USER				(1ULL << 1)
 #define INTEL_FIXED_0_ANYTHREAD			(1ULL << 2)
 #define INTEL_FIXED_0_ENABLE_PMI			(1ULL << 3)
+#define INTEL_FIXED_0_RDPMC_USER_DISABLE		(1ULL << 33)
 #define INTEL_FIXED_3_METRICS_CLEAR			(1ULL << 2)
 
 #define HSW_IN_TX					(1ULL << 32)
@@ -50,7 +52,7 @@
 #define INTEL_FIXED_BITS_MASK					\
 	(INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER |		\
 	 INTEL_FIXED_0_ANYTHREAD | INTEL_FIXED_0_ENABLE_PMI |	\
-	 ICL_FIXED_0_ADAPTIVE)
+	 ICL_FIXED_0_ADAPTIVE | INTEL_FIXED_0_RDPMC_USER_DISABLE)
 
 #define intel_fixed_bits_by_idx(_idx, _bits)			\
 	((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE))
@@ -226,7 +228,9 @@ union cpuid35_ebx {
 		unsigned int    umask2:1;
 		/* EQ-bit Supported */
 		unsigned int    eq:1;
-		unsigned int	reserved:30;
+		/* rdpmc user disable Supported */
+		unsigned int    rdpmc_user_disable:1;
+		unsigned int	reserved:29;
 	} split;
 	unsigned int            full;
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip: perf/core] perf/x86/intel: Add support for rdpmc user disable feature
  2026-01-14  1:17 ` [Patch v3 7/7] perf/x86/intel: Add support for rdpmc user disable feature Dapeng Mi
@ 2026-01-15 21:44   ` tip-bot2 for Dapeng Mi
  0 siblings, 0 replies; 21+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2026-01-15 21:44 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     59af95e028d4114991b9bd96a39ad855b399cc07
Gitweb:        https://git.kernel.org/tip/59af95e028d4114991b9bd96a39ad855b399cc07
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 14 Jan 2026 09:17:50 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 15 Jan 2026 10:04:28 +01:00

perf/x86/intel: Add support for rdpmc user disable feature

Starting with Panther Cove, the rdpmc user disable feature is supported.
This feature allows the perf system to disable user space rdpmc reads at
the counter level.

Currently, when a global counter is active, any user with rdpmc rights
can read it, even if perf access permissions forbid it (e.g., disallow
reading ring 0 counters). The rdpmc user disable feature mitigates this
security concern.

Details:

- A new RDPMC_USR_DISABLE bit (bit 37) in each EVNTSELx MSR indicates
  that the GP counter cannot be read by RDPMC in ring 3.
- New RDPMC_USR_DISABLE bits in IA32_FIXED_CTR_CTRL MSR (bits 33, 37,
  41, 45, etc.) for fixed counters 0, 1, 2, 3, etc.
- When the rdpmc instruction is executed for counter x, the following
  pseudo code demonstrates how the returned value is obtained:
  	value = (!CPL0 && RDPMC_USR_DISABLE[x] == 1) ? 0 : counter_value;
- RDPMC_USR_DISABLE is enumerated by CPUID.0x23.0.EBX[2].

This patch extends the current global user space rdpmc control logic via
the sysfs interface (/sys/devices/cpu/rdpmc) as follows:

- rdpmc = 0:
  Global user space rdpmc and counter-level user space rdpmc for all
  counters are both disabled.
- rdpmc = 1:
  Global user space rdpmc is enabled during the mmap-enabled time window,
  and counter-level user space rdpmc is enabled only for non-system-wide
  events. This prevents counter data leaks as count data is cleared
  during context switches.
- rdpmc = 2:
  Global user space rdpmc and counter-level user space rdpmc for all
  counters are enabled unconditionally.

The new rdpmc settings only affect newly activated perf events; currently
active perf events remain unaffected. This simplifies and cleans up the
code. The default value of rdpmc remains unchanged at 1.

For more details about rdpmc user disable, please refer to chapter 15
"RDPMC USER DISABLE" in ISE documentation.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114011750.350569-8-dapeng1.mi@linux.intel.com
---
 Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc | 44 +++++++-
 arch/x86/events/core.c                                         | 21 +++-
 arch/x86/events/intel/core.c                                   | 27 ++++-
 arch/x86/events/perf_event.h                                   |  6 +-
 arch/x86/include/asm/perf_event.h                              |  8 +-
 5 files changed, 104 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc b/Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc
new file mode 100644
index 0000000..59ec18b
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc
@@ -0,0 +1,44 @@
+What:           /sys/bus/event_source/devices/cpu.../rdpmc
+Date:           November 2011
+KernelVersion:  3.10
+Contact:        Linux kernel mailing list linux-kernel@vger.kernel.org
+Description:    The /sys/bus/event_source/devices/cpu.../rdpmc attribute
+                is used to show and manage whether the rdpmc instruction
+                can be executed in user space. It accepts 3 values.
+                - rdpmc = 0
+                user space rdpmc is globally disabled for all PMU
+                counters.
+                - rdpmc = 1
+                user space rdpmc is globally enabled only within the
+                time window in which an event is mmap'ed. Once the mmap
+                region is unmapped, user space rdpmc is disabled again.
+                - rdpmc = 2
+                user space rdpmc is globally enabled for all PMU
+                counters.
+
+                On Intel platforms supporting the counter-level user
+                space rdpmc disable feature (CPUID.23H.EBX[2] = 1), the
+                meaning of the 3 values is extended to
+                - rdpmc = 0
+                global user space rdpmc and counter-level user space
+                rdpmc of all counters are both disabled.
+                - rdpmc = 1
+                No change in the behavior of global user space rdpmc.
+                Counter-level rdpmc of system-wide events is disabled,
+                but counter-level rdpmc of non-system-wide events is
+                enabled.
+                - rdpmc = 2
+                global user space rdpmc and counter-level user space
+                rdpmc of all counters are both enabled unconditionally.
+
+                The default value of rdpmc is 1.
+
+                Please note:
+                - the behavior of global user space rdpmc changes
+                immediately when the rdpmc value changes, but a new
+                counter-level user space rdpmc setting does not take
+                effect until the affected event is reactivated or
+                recreated.
+                - The rdpmc attribute is global, even for x86 hybrid
+                platforms. For example, changing cpu_core/rdpmc will
+                also change cpu_atom/rdpmc.
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index c2717cb..6df73e8 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2616,6 +2616,27 @@ static ssize_t get_attr_rdpmc(struct device *cdev,
 	return snprintf(buf, 40, "%d\n", x86_pmu.attr_rdpmc);
 }
 
+/*
+ * Behaviors of the rdpmc value:
+ * - rdpmc = 0
+ *    global user space rdpmc and counter-level user space rdpmc of all
+ *    counters are both disabled.
+ * - rdpmc = 1
+ *    global user space rdpmc is enabled within the mmap-enabled time
+ *    window, and counter-level user space rdpmc is enabled only for
+ *    non-system-wide events. Counter-level user space rdpmc of
+ *    system-wide events is still disabled by default. This doesn't leak
+ *    counter data for non-system-wide events since their count data is
+ *    cleared on context switches.
+ * - rdpmc = 2
+ *    global user space rdpmc and counter-level user space rdpmc of all
+ *    counters are enabled unconditionally.
+ *
+ * Assuming the rdpmc value is not changed frequently, don't dynamically
+ * reschedule events to make a new rdpmc value take effect on active perf
+ * events immediately; a new rdpmc value only impacts newly activated
+ * perf events. This makes the code simpler and cleaner.
+ */
 static ssize_t set_attr_rdpmc(struct device *cdev,
 			      struct device_attribute *attr,
 			      const char *buf, size_t count)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index d6bdbb7..f3ae1f8 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3128,6 +3128,8 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
 		bits |= INTEL_FIXED_0_USER;
 	if (hwc->config & ARCH_PERFMON_EVENTSEL_OS)
 		bits |= INTEL_FIXED_0_KERNEL;
+	if (hwc->config & ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE)
+		bits |= INTEL_FIXED_0_RDPMC_USER_DISABLE;
 
 	/*
 	 * ANY bit is supported in v3 and up
@@ -3263,6 +3265,27 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
 		__intel_pmu_update_event_ext(hwc->idx, ext);
 }
 
+static void intel_pmu_update_rdpmc_user_disable(struct perf_event *event)
+{
+	if (!x86_pmu_has_rdpmc_user_disable(event->pmu))
+		return;
+
+	/*
+	 * Counter-level user-space rdpmc is disabled by default
+	 * except in two cases:
+	 * a. rdpmc = 2 (user space rdpmc enabled unconditionally)
+	 * b. rdpmc = 1 and the event is not a system-wide event.
+	 *    The count of non-system-wide events is cleared on
+	 *    context switches, so no count data is leaked.
+	 */
+	if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE ||
+	    (x86_pmu.attr_rdpmc == X86_USER_RDPMC_CONDITIONAL_ENABLE &&
+	     event->ctx->task))
+		event->hw.config &= ~ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
+	else
+		event->hw.config |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
+}
+
 DEFINE_STATIC_CALL_NULL(intel_pmu_enable_event_ext, intel_pmu_enable_event_ext);
 
 static void intel_pmu_enable_event(struct perf_event *event)
@@ -3271,6 +3294,8 @@ static void intel_pmu_enable_event(struct perf_event *event)
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
+	intel_pmu_update_rdpmc_user_disable(event);
+
 	if (unlikely(event->attr.precise_ip))
 		static_call(x86_pmu_pebs_enable)(event);
 
@@ -5869,6 +5894,8 @@ static void update_pmu_cap(struct pmu *pmu)
 		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
 	if (ebx_0.split.eq)
 		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;
+	if (ebx_0.split.rdpmc_user_disable)
+		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
 
 	if (eax_0.split.cntr_subleaf) {
 		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 24a81d2..cd337f3 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1333,6 +1333,12 @@ static inline u64 x86_pmu_get_event_config(struct perf_event *event)
 	return event->attr.config & hybrid(event->pmu, config_mask);
 }
 
+static inline bool x86_pmu_has_rdpmc_user_disable(struct pmu *pmu)
+{
+	return !!(hybrid(pmu, config_mask) &
+		 ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE);
+}
+
 extern struct event_constraint emptyconstraint;
 
 extern struct event_constraint unconstrained;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 0d9af41..ff5acb8 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -33,6 +33,7 @@
 #define ARCH_PERFMON_EVENTSEL_CMASK			0xFF000000ULL
 #define ARCH_PERFMON_EVENTSEL_BR_CNTR			(1ULL << 35)
 #define ARCH_PERFMON_EVENTSEL_EQ			(1ULL << 36)
+#define ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE	(1ULL << 37)
 #define ARCH_PERFMON_EVENTSEL_UMASK2			(0xFFULL << 40)
 
 #define INTEL_FIXED_BITS_STRIDE			4
@@ -40,6 +41,7 @@
 #define INTEL_FIXED_0_USER				(1ULL << 1)
 #define INTEL_FIXED_0_ANYTHREAD			(1ULL << 2)
 #define INTEL_FIXED_0_ENABLE_PMI			(1ULL << 3)
+#define INTEL_FIXED_0_RDPMC_USER_DISABLE		(1ULL << 33)
 #define INTEL_FIXED_3_METRICS_CLEAR			(1ULL << 2)
 
 #define HSW_IN_TX					(1ULL << 32)
@@ -50,7 +52,7 @@
 #define INTEL_FIXED_BITS_MASK					\
 	(INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER |		\
 	 INTEL_FIXED_0_ANYTHREAD | INTEL_FIXED_0_ENABLE_PMI |	\
-	 ICL_FIXED_0_ADAPTIVE)
+	 ICL_FIXED_0_ADAPTIVE | INTEL_FIXED_0_RDPMC_USER_DISABLE)
 
 #define intel_fixed_bits_by_idx(_idx, _bits)			\
 	((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE))
@@ -226,7 +228,9 @@ union cpuid35_ebx {
 		unsigned int    umask2:1;
 		/* EQ-bit Supported */
 		unsigned int    eq:1;
-		unsigned int	reserved:30;
+		/* rdpmc user disable Supported */
+		unsigned int    rdpmc_user_disable:1;
+		unsigned int	reserved:29;
 	} split;
 	unsigned int            full;
 };

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip: perf/core] perf/x86: Use macros to replace magic numbers in attr_rdpmc
  2026-01-14  1:17 ` [Patch v3 6/7] perf/x86: Use macros to replace magic numbers in attr_rdpmc Dapeng Mi
@ 2026-01-15 21:44   ` tip-bot2 for Dapeng Mi
  2026-03-09 23:56   ` [Patch v3 6/7] " Ian Rogers
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2026-01-15 21:44 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     8c74e4e3e0596950554962229582260f1501d899
Gitweb:        https://git.kernel.org/tip/8c74e4e3e0596950554962229582260f1501d899
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 14 Jan 2026 09:17:49 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 15 Jan 2026 10:04:27 +01:00

perf/x86: Use macros to replace magic numbers in attr_rdpmc

Replace magic numbers in attr_rdpmc with macros to improve readability
and make their meanings clearer for users.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114011750.350569-7-dapeng1.mi@linux.intel.com
---
 arch/x86/events/core.c       | 7 ++++---
 arch/x86/events/intel/p6.c   | 2 +-
 arch/x86/events/perf_event.h | 7 +++++++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 0ecac94..c2717cb 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2163,7 +2163,8 @@ static int __init init_hw_perf_events(void)
 
 	pr_cont("%s PMU driver.\n", x86_pmu.name);
 
-	x86_pmu.attr_rdpmc = 1; /* enable userspace RDPMC usage by default */
+	/* enable userspace RDPMC usage by default */
+	x86_pmu.attr_rdpmc = X86_USER_RDPMC_CONDITIONAL_ENABLE;
 
 	for (quirk = x86_pmu.quirks; quirk; quirk = quirk->next)
 		quirk->func();
@@ -2643,12 +2644,12 @@ static ssize_t set_attr_rdpmc(struct device *cdev,
 		 */
 		if (val == 0)
 			static_branch_inc(&rdpmc_never_available_key);
-		else if (x86_pmu.attr_rdpmc == 0)
+		else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_NEVER_ENABLE)
 			static_branch_dec(&rdpmc_never_available_key);
 
 		if (val == 2)
 			static_branch_inc(&rdpmc_always_available_key);
-		else if (x86_pmu.attr_rdpmc == 2)
+		else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE)
 			static_branch_dec(&rdpmc_always_available_key);
 
 		on_each_cpu(cr4_update_pce, NULL, 1);
diff --git a/arch/x86/events/intel/p6.c b/arch/x86/events/intel/p6.c
index 6e41de3..fb991e0 100644
--- a/arch/x86/events/intel/p6.c
+++ b/arch/x86/events/intel/p6.c
@@ -243,7 +243,7 @@ static __init void p6_pmu_rdpmc_quirk(void)
 		 */
 		pr_warn("Userspace RDPMC support disabled due to a CPU erratum\n");
 		x86_pmu.attr_rdpmc_broken = 1;
-		x86_pmu.attr_rdpmc = 0;
+		x86_pmu.attr_rdpmc = X86_USER_RDPMC_NEVER_ENABLE;
 	}
 }
 
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index f7caabc..24a81d2 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -187,6 +187,13 @@ struct amd_nb {
 	 (1ULL << PERF_REG_X86_R14)   | \
 	 (1ULL << PERF_REG_X86_R15))
 
+/* user space rdpmc control values */
+enum {
+	X86_USER_RDPMC_NEVER_ENABLE		= 0,
+	X86_USER_RDPMC_CONDITIONAL_ENABLE	= 1,
+	X86_USER_RDPMC_ALWAYS_ENABLE		= 2,
+};
+
 /*
  * Per register state.
  */

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip: perf/core] perf/x86/intel: Add core PMU support for Novalake
  2026-01-14  1:17 ` [Patch v3 5/7] perf/x86/intel: Add core PMU support for Novalake Dapeng Mi
@ 2026-01-15 21:44   ` tip-bot2 for Dapeng Mi
  0 siblings, 0 replies; 21+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2026-01-15 21:44 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     c847a208f43bfeb56943f2ca6fe2baf1db9dee7a
Gitweb:        https://git.kernel.org/tip/c847a208f43bfeb56943f2ca6fe2baf1db9dee7a
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 14 Jan 2026 09:17:48 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 15 Jan 2026 10:04:27 +01:00

perf/x86/intel: Add core PMU support for Novalake

This patch enables core PMU support for Novalake, covering both P-core
and E-core. It includes Arctic Wolf-specific counters and PEBS
constraints, and the model-specific OMR extra registers table.

Since Coyote Cove shares the same PMU capabilities as Panther Cove, the
existing Panther Cove PMU enabling functions are reused for Coyote Cove.

For detailed information about counter constraints, please refer to
section 16.3 "COUNTER RESTRICTIONS" in the ISE documentation.
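
As a reading aid for the constraint tables below (these are the
standard semantics of the existing macros, not new behavior):

	/* event 0xb7, umask 0x01: allowed only on GP counter 0 */
	INTEL_UEVENT_CONSTRAINT(0x01b7, 0x1)
	/* INST_RETIRED.ANY: pinned to fixed counter 0 */
	FIXED_EVENT_CONSTRAINT(0x00c0, 0)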

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114011750.350569-6-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/core.c |  99 ++++++++++++++++++++++++++++++++++-
 arch/x86/events/intel/ds.c   |  11 ++++-
 arch/x86/events/perf_event.h |   2 +-
 3 files changed, 112 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index b2f99d4..d6bdbb7 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -232,6 +232,29 @@ static struct event_constraint intel_skt_event_constraints[] __read_mostly = {
 	EVENT_CONSTRAINT_END
 };
 
+static struct event_constraint intel_arw_event_constraints[] __read_mostly = {
+	FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
+	FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */
+	FIXED_EVENT_CONSTRAINT(0x0300, 2), /* pseudo CPU_CLK_UNHALTED.REF */
+	FIXED_EVENT_CONSTRAINT(0x013c, 2), /* CPU_CLK_UNHALTED.REF_TSC_P */
+	FIXED_EVENT_CONSTRAINT(0x0073, 4), /* TOPDOWN_BAD_SPECULATION.ALL */
+	FIXED_EVENT_CONSTRAINT(0x019c, 5), /* TOPDOWN_FE_BOUND.ALL */
+	FIXED_EVENT_CONSTRAINT(0x02c2, 6), /* TOPDOWN_RETIRING.ALL */
+	INTEL_UEVENT_CONSTRAINT(0x01b7, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x02b7, 0x2),
+	INTEL_UEVENT_CONSTRAINT(0x04b7, 0x4),
+	INTEL_UEVENT_CONSTRAINT(0x08b7, 0x8),
+	INTEL_UEVENT_CONSTRAINT(0x01d4, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x02d4, 0x2),
+	INTEL_UEVENT_CONSTRAINT(0x04d4, 0x4),
+	INTEL_UEVENT_CONSTRAINT(0x08d4, 0x8),
+	INTEL_UEVENT_CONSTRAINT(0x0175, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x0275, 0x2),
+	INTEL_UEVENT_CONSTRAINT(0x21d3, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x22d3, 0x1),
+	EVENT_CONSTRAINT_END
+};
+
 static struct event_constraint intel_skl_event_constraints[] = {
 	FIXED_EVENT_CONSTRAINT(0x00c0, 0),	/* INST_RETIRED.ANY */
 	FIXED_EVENT_CONSTRAINT(0x003c, 1),	/* CPU_CLK_UNHALTED.CORE */
@@ -2319,6 +2342,26 @@ static __initconst const u64 tnt_hw_cache_extra_regs
 	},
 };
 
+static __initconst const u64 arw_hw_cache_extra_regs
+				[PERF_COUNT_HW_CACHE_MAX]
+				[PERF_COUNT_HW_CACHE_OP_MAX]
+				[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+	[C(LL)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)]	= 0x4000000000000001,
+			[C(RESULT_MISS)]	= 0xFFFFF000000001,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)]	= 0x4000000000000002,
+			[C(RESULT_MISS)]	= 0xFFFFF000000002,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)]	= 0x0,
+			[C(RESULT_MISS)]	= 0x0,
+		},
+	},
+};
+
 EVENT_ATTR_STR(topdown-fe-bound,       td_fe_bound_tnt,        "event=0x71,umask=0x0");
 EVENT_ATTR_STR(topdown-retiring,       td_retiring_tnt,        "event=0xc2,umask=0x0");
 EVENT_ATTR_STR(topdown-bad-spec,       td_bad_spec_tnt,        "event=0x73,umask=0x6");
@@ -2377,6 +2420,22 @@ static struct extra_reg intel_cmt_extra_regs[] __read_mostly = {
 	EVENT_EXTRA_END
 };
 
+static struct extra_reg intel_arw_extra_regs[] __read_mostly = {
+	/* must define OMR_X first, see intel_alt_er() */
+	INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OMR_0, 0xc0ffffffffffffffull, OMR_0),
+	INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OMR_1, 0xc0ffffffffffffffull, OMR_1),
+	INTEL_UEVENT_EXTRA_REG(0x04b7, MSR_OMR_2, 0xc0ffffffffffffffull, OMR_2),
+	INTEL_UEVENT_EXTRA_REG(0x08b7, MSR_OMR_3, 0xc0ffffffffffffffull, OMR_3),
+	INTEL_UEVENT_EXTRA_REG(0x01d4, MSR_OMR_0, 0xc0ffffffffffffffull, OMR_0),
+	INTEL_UEVENT_EXTRA_REG(0x02d4, MSR_OMR_1, 0xc0ffffffffffffffull, OMR_1),
+	INTEL_UEVENT_EXTRA_REG(0x04d4, MSR_OMR_2, 0xc0ffffffffffffffull, OMR_2),
+	INTEL_UEVENT_EXTRA_REG(0x08d4, MSR_OMR_3, 0xc0ffffffffffffffull, OMR_3),
+	INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x5d0),
+	INTEL_UEVENT_EXTRA_REG(0x0127, MSR_SNOOP_RSP_0, 0xffffffffffffffffull, SNOOP_0),
+	INTEL_UEVENT_EXTRA_REG(0x0227, MSR_SNOOP_RSP_1, 0xffffffffffffffffull, SNOOP_1),
+	EVENT_EXTRA_END
+};
+
 EVENT_ATTR_STR(topdown-fe-bound,       td_fe_bound_skt,        "event=0x9c,umask=0x01");
 EVENT_ATTR_STR(topdown-retiring,       td_retiring_skt,        "event=0xc2,umask=0x02");
 EVENT_ATTR_STR(topdown-be-bound,       td_be_bound_skt,        "event=0xa4,umask=0x02");
@@ -7410,6 +7469,19 @@ static __always_inline void intel_pmu_init_skt(struct pmu *pmu)
 	static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
 }
 
+static __always_inline void intel_pmu_init_arw(struct pmu *pmu)
+{
+	intel_pmu_init_grt(pmu);
+	x86_pmu.flags &= ~PMU_FL_HAS_RSP_1;
+	x86_pmu.flags |= PMU_FL_HAS_OMR;
+	memcpy(hybrid_var(pmu, hw_cache_extra_regs),
+	       arw_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
+	hybrid(pmu, event_constraints) = intel_arw_event_constraints;
+	hybrid(pmu, pebs_constraints) = intel_arw_pebs_event_constraints;
+	hybrid(pmu, extra_regs) = intel_arw_extra_regs;
+	static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
+}
+
 __init int intel_pmu_init(void)
 {
 	struct attribute **extra_skl_attr = &empty_attrs;
@@ -8250,6 +8322,33 @@ __init int intel_pmu_init(void)
 		name = "arrowlake_h_hybrid";
 		break;
 
+	case INTEL_NOVALAKE:
+	case INTEL_NOVALAKE_L:
+		pr_cont("Novalake Hybrid events, ");
+		name = "novalake_hybrid";
+		intel_pmu_init_hybrid(hybrid_big_small);
+
+		x86_pmu.pebs_latency_data = nvl_latency_data;
+		x86_pmu.get_event_constraints = mtl_get_event_constraints;
+		x86_pmu.hw_config = adl_hw_config;
+
+		td_attr = lnl_hybrid_events_attrs;
+		mem_attr = mtl_hybrid_mem_attrs;
+		tsx_attr = adl_hybrid_tsx_attrs;
+		extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
+			mtl_hybrid_extra_attr_rtm : mtl_hybrid_extra_attr;
+
+		/* Initialize big core specific PerfMon capabilities. */
+		pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_CORE_IDX];
+		intel_pmu_init_pnc(&pmu->pmu);
+
+		/* Initialize Atom core specific PerfMon capabilities. */
+		pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_ATOM_IDX];
+		intel_pmu_init_arw(&pmu->pmu);
+
+		intel_pmu_pebs_data_source_lnl();
+		break;
+
 	default:
 		switch (x86_pmu.version) {
 		case 1:
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index a47f173..5027afc 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1293,6 +1293,17 @@ struct event_constraint intel_grt_pebs_event_constraints[] = {
 	EVENT_CONSTRAINT_END
 };
 
+struct event_constraint intel_arw_pebs_event_constraints[] = {
+	/* Allow all events as PEBS with no flags */
+	INTEL_HYBRID_LAT_CONSTRAINT(0x5d0, 0xff),
+	INTEL_HYBRID_LAT_CONSTRAINT(0x6d0, 0xff),
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x01d4, 0x1),
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x02d4, 0x2),
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x04d4, 0x4),
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x08d4, 0x8),
+	EVENT_CONSTRAINT_END
+};
+
 struct event_constraint intel_nehalem_pebs_event_constraints[] = {
 	INTEL_PLD_CONSTRAINT(0x100b, 0xf),      /* MEM_INST_RETIRED.* */
 	INTEL_FLAGS_EVENT_CONSTRAINT(0x0f, 0xf),    /* MEM_UNCORE_RETIRED.* */
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index aedc1a7..f7caabc 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1680,6 +1680,8 @@ extern struct event_constraint intel_glp_pebs_event_constraints[];
 
 extern struct event_constraint intel_grt_pebs_event_constraints[];
 
+extern struct event_constraint intel_arw_pebs_event_constraints[];
+
 extern struct event_constraint intel_nehalem_pebs_event_constraints[];
 
 extern struct event_constraint intel_westmere_pebs_event_constraints[];

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip: perf/core] perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL
  2026-01-14  1:17 ` [Patch v3 4/7] perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL Dapeng Mi
@ 2026-01-15 21:44   ` tip-bot2 for Dapeng Mi
  0 siblings, 0 replies; 21+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2026-01-15 21:44 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     7cd264d1972d13177acc1ac9fb11ee0a7003e2e6
Gitweb:        https://git.kernel.org/tip/7cd264d1972d13177acc1ac9fb11ee0a7003e2e6
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 14 Jan 2026 09:17:47 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 15 Jan 2026 10:04:27 +01:00

perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL

Similar to DMR (Panther Cove uarch), both P-core (Coyote Cove uarch) and
E-core (Arctic Wolf uarch) of NVL adopt the new PEBS memory auxiliary
info layout.

The Coyote Cove microarchitecture shares the same PMU capabilities,
including the memory auxiliary info layout, with Panther Cove. The
Arctic Wolf microarchitecture has a similar layout to Panther Cove,
with the only difference being the specific data source encoding for
L2 hit cases (up to the L2 cache level). The OMR encoding remains the
same as in Panther Cove.

For detailed information on the memory auxiliary info encoding, please
refer to section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in
the latest ISE documentation.

This patch defines Arctic Wolf specific data source encoding and then
supports PEBS memory auxiliary info field for NVL.
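
A worked example using the tables in the diff below; the same
auxiliary info value decodes differently per uarch for L2-hit cases,
while OMR encodings (bit[8] = 1) decode identically:

	/* bits[8:0] = 0x001 (bit[8] = 0, dse = 0x01):
	 *	Arctic Wolf                -> L1 hit
	 *	Coyote Cove (pnc decoding) -> L0 hit
	 * bits[8:0] = 0x102 (bit[8] = 1, OMR source = 0x2):
	 *	both uarchs                -> local CA shared cache (L3)
	 */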

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114011750.350569-5-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/ds.c   | 83 +++++++++++++++++++++++++++++++++++-
 arch/x86/events/perf_event.h |  2 +-
 2 files changed, 85 insertions(+)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 06e42ac..a47f173 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -96,6 +96,18 @@ union intel_x86_pebs_dse {
 		unsigned int pnc_fb_full:1;
 		unsigned int ld_reserved8:16;
 	};
+	struct {
+		unsigned int arw_dse:8;
+		unsigned int arw_l2_miss:1;
+		unsigned int arw_xq_promotion:1;
+		unsigned int arw_reissue:1;
+		unsigned int arw_stlb_miss:1;
+		unsigned int arw_locked:1;
+		unsigned int arw_data_blk:1;
+		unsigned int arw_addr_blk:1;
+		unsigned int arw_fb_full:1;
+		unsigned int ld_reserved9:16;
+	};
 };
 
 
@@ -274,6 +286,29 @@ static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = {
 	OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE),	/* 0x0f: uncached */
 };
 
+/* Version for Arctic Wolf and later */
+
+/* L2 hit */
+#define ARW_PEBS_DATA_SOURCE_MAX	16
+static u64 arw_pebs_l2_hit_data_source[ARW_PEBS_DATA_SOURCE_MAX] = {
+	P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),	/* 0x00: non-cache access */
+	OP_LH | P(LVL, L1)  | LEVEL(L1) | P(SNOOP, NONE),	/* 0x01: L1 hit */
+	OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE),	/* 0x02: WCB Hit */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, NONE),	/* 0x03: L2 Hit Clean */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HIT),	/* 0x04: L2 Hit Snoop HIT */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HITM),	/* 0x05: L2 Hit Snoop Hit Modified */
+	OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE),	/* 0x06: uncached */
+	0,							/* 0x07: Reserved */
+	0,							/* 0x08: Reserved */
+	0,							/* 0x09: Reserved */
+	0,							/* 0x0a: Reserved */
+	0,							/* 0x0b: Reserved */
+	0,							/* 0x0c: Reserved */
+	0,							/* 0x0d: Reserved */
+	0,							/* 0x0e: Reserved */
+	0,							/* 0x0f: Reserved */
+};
+
 /* L2 miss */
 #define OMR_DATA_SOURCE_MAX		16
 static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = {
@@ -458,6 +493,44 @@ u64 cmt_latency_data(struct perf_event *event, u64 status)
 				  dse.mtl_fwd_blk);
 }
 
+static u64 arw_latency_data(struct perf_event *event, u64 status)
+{
+	union intel_x86_pebs_dse dse;
+	union perf_mem_data_src src;
+	u64 val;
+
+	dse.val = status;
+
+	if (!dse.arw_l2_miss)
+		val = arw_pebs_l2_hit_data_source[dse.arw_dse & 0xf];
+	else
+		val = parse_omr_data_source(dse.arw_dse);
+
+	if (!val)
+		val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA);
+
+	if (dse.arw_stlb_miss)
+		val |= P(TLB, MISS) | P(TLB, L2);
+	else
+		val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2);
+
+	if (dse.arw_locked)
+		val |= P(LOCK, LOCKED);
+
+	if (dse.arw_data_blk)
+		val |= P(BLK, DATA);
+	if (dse.arw_addr_blk)
+		val |= P(BLK, ADDR);
+	if (!dse.arw_data_blk && !dse.arw_addr_blk)
+		val |= P(BLK, NA);
+
+	src.val = val;
+	if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
+		src.mem_op = P(OP, STORE);
+
+	return src.val;
+}
+
 static u64 lnc_latency_data(struct perf_event *event, u64 status)
 {
 	union intel_x86_pebs_dse dse;
@@ -551,6 +624,16 @@ u64 pnc_latency_data(struct perf_event *event, u64 status)
 	return src.val;
 }
 
+u64 nvl_latency_data(struct perf_event *event, u64 status)
+{
+	struct x86_hybrid_pmu *pmu = hybrid_pmu(event->pmu);
+
+	if (pmu->pmu_type == hybrid_small)
+		return arw_latency_data(event, status);
+
+	return pnc_latency_data(event, status);
+}
+
 static u64 load_latency_data(struct perf_event *event, u64 status)
 {
 	union intel_x86_pebs_dse dse;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index cbca188..aedc1a7 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1666,6 +1666,8 @@ u64 arl_h_latency_data(struct perf_event *event, u64 status);
 
 u64 pnc_latency_data(struct perf_event *event, u64 status);
 
+u64 nvl_latency_data(struct perf_event *event, u64 status);
+
 extern struct event_constraint intel_core2_pebs_event_constraints[];
 
 extern struct event_constraint intel_atom_pebs_event_constraints[];

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip: perf/core] perf/x86/intel: Add core PMU support for DMR
  2026-01-14  1:17 ` [Patch v3 3/7] perf/x86/intel: Add core PMU support for DMR Dapeng Mi
@ 2026-01-15 21:44   ` tip-bot2 for Dapeng Mi
  0 siblings, 0 replies; 21+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2026-01-15 21:44 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     d345b6bb886004ac1018da0348b5da7d9906071b
Gitweb:        https://git.kernel.org/tip/d345b6bb886004ac1018da0348b5da7d9906071b
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 14 Jan 2026 09:17:46 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 15 Jan 2026 10:04:27 +01:00

perf/x86/intel: Add core PMU support for DMR

This patch enables core PMU features for Diamond Rapids (Panther Cove
microarchitecture), including Panther Cove specific counter and PEBS
constraints, a new cache events ID table, and the model-specific OMR
events extra registers table.

For detailed information about counter constraints, please refer to
section 16.3 "COUNTER RESTRICTIONS" in the ISE documentation.
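
A minimal sketch of how an OMR event could be requested from user
space, assuming the OMR value is passed through attr.config1 as with
the legacy OCR extra registers (the config1 value is a placeholder,
not a real OMR encoding):

	struct perf_event_attr attr = {
		.type    = PERF_TYPE_RAW,
		.size    = sizeof(attr),
		.config  = 0x012a,	/* event 0x2a, umask 0x01 -> MSR_OMR_0 */
		.config1 = 0x1,		/* placeholder OMR attribute bits */
	};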

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114011750.350569-4-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/core.c | 179 +++++++++++++++++++++++++++++++++-
 arch/x86/events/intel/ds.c   |  27 +++++-
 arch/x86/events/perf_event.h |   2 +-
 3 files changed, 207 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 3578c66..b2f99d4 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -435,6 +435,62 @@ static struct extra_reg intel_lnc_extra_regs[] __read_mostly = {
 	EVENT_EXTRA_END
 };
 
+static struct event_constraint intel_pnc_event_constraints[] = {
+	FIXED_EVENT_CONSTRAINT(0x00c0, 0),	/* INST_RETIRED.ANY */
+	FIXED_EVENT_CONSTRAINT(0x0100, 0),	/* INST_RETIRED.PREC_DIST */
+	FIXED_EVENT_CONSTRAINT(0x003c, 1),	/* CPU_CLK_UNHALTED.CORE */
+	FIXED_EVENT_CONSTRAINT(0x0300, 2),	/* CPU_CLK_UNHALTED.REF */
+	FIXED_EVENT_CONSTRAINT(0x013c, 2),	/* CPU_CLK_UNHALTED.REF_TSC_P */
+	FIXED_EVENT_CONSTRAINT(0x0400, 3),	/* SLOTS */
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_RETIRING, 0),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BAD_SPEC, 1),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_FE_BOUND, 2),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BE_BOUND, 3),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_HEAVY_OPS, 4),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BR_MISPREDICT, 5),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_FETCH_LAT, 6),
+	METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_MEM_BOUND, 7),
+
+	INTEL_EVENT_CONSTRAINT(0x20, 0xf),
+	INTEL_EVENT_CONSTRAINT(0x79, 0xf),
+
+	INTEL_UEVENT_CONSTRAINT(0x0275, 0xf),
+	INTEL_UEVENT_CONSTRAINT(0x0176, 0xf),
+	INTEL_UEVENT_CONSTRAINT(0x04a4, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x08a4, 0x1),
+	INTEL_UEVENT_CONSTRAINT(0x01cd, 0xfc),
+	INTEL_UEVENT_CONSTRAINT(0x02cd, 0x3),
+
+	INTEL_EVENT_CONSTRAINT(0xd0, 0xf),
+	INTEL_EVENT_CONSTRAINT(0xd1, 0xf),
+	INTEL_EVENT_CONSTRAINT(0xd4, 0xf),
+	INTEL_EVENT_CONSTRAINT(0xd6, 0xf),
+	INTEL_EVENT_CONSTRAINT(0xdf, 0xf),
+	INTEL_EVENT_CONSTRAINT(0xce, 0x1),
+
+	INTEL_UEVENT_CONSTRAINT(0x01b1, 0x8),
+	INTEL_UEVENT_CONSTRAINT(0x0847, 0xf),
+	INTEL_UEVENT_CONSTRAINT(0x0446, 0xf),
+	INTEL_UEVENT_CONSTRAINT(0x0846, 0xf),
+	INTEL_UEVENT_CONSTRAINT(0x0148, 0xf),
+
+	EVENT_CONSTRAINT_END
+};
+
+static struct extra_reg intel_pnc_extra_regs[] __read_mostly = {
+	/* must define OMR_X first, see intel_alt_er() */
+	INTEL_UEVENT_EXTRA_REG(0x012a, MSR_OMR_0, 0x40ffffff0000ffffull, OMR_0),
+	INTEL_UEVENT_EXTRA_REG(0x022a, MSR_OMR_1, 0x40ffffff0000ffffull, OMR_1),
+	INTEL_UEVENT_EXTRA_REG(0x042a, MSR_OMR_2, 0x40ffffff0000ffffull, OMR_2),
+	INTEL_UEVENT_EXTRA_REG(0x082a, MSR_OMR_3, 0x40ffffff0000ffffull, OMR_3),
+	INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd),
+	INTEL_UEVENT_EXTRA_REG(0x02c6, MSR_PEBS_FRONTEND, 0x9, FE),
+	INTEL_UEVENT_EXTRA_REG(0x03c6, MSR_PEBS_FRONTEND, 0x7fff1f, FE),
+	INTEL_UEVENT_EXTRA_REG(0x40ad, MSR_PEBS_FRONTEND, 0xf, FE),
+	INTEL_UEVENT_EXTRA_REG(0x04c2, MSR_PEBS_FRONTEND, 0x8, FE),
+	EVENT_EXTRA_END
+};
+
 EVENT_ATTR_STR(mem-loads,	mem_ld_nhm,	"event=0x0b,umask=0x10,ldlat=3");
 EVENT_ATTR_STR(mem-loads,	mem_ld_snb,	"event=0xcd,umask=0x1,ldlat=3");
 EVENT_ATTR_STR(mem-stores,	mem_st_snb,	"event=0xcd,umask=0x2");
@@ -650,6 +706,102 @@ static __initconst const u64 glc_hw_cache_extra_regs
  },
 };
 
+static __initconst const u64 pnc_hw_cache_event_ids
+				[PERF_COUNT_HW_CACHE_MAX]
+				[PERF_COUNT_HW_CACHE_OP_MAX]
+				[PERF_COUNT_HW_CACHE_RESULT_MAX] =
+{
+ [ C(L1D ) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = 0x81d0,
+		[ C(RESULT_MISS)   ] = 0xe124,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = 0x82d0,
+	},
+ },
+ [ C(L1I ) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_MISS)   ] = 0xe424,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+ },
+ [ C(LL  ) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = 0x12a,
+		[ C(RESULT_MISS)   ] = 0x12a,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = 0x12a,
+		[ C(RESULT_MISS)   ] = 0x12a,
+	},
+ },
+ [ C(DTLB) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = 0x81d0,
+		[ C(RESULT_MISS)   ] = 0xe12,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = 0x82d0,
+		[ C(RESULT_MISS)   ] = 0xe13,
+	},
+ },
+ [ C(ITLB) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = 0xe11,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+	[ C(OP_PREFETCH) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+ },
+ [ C(BPU ) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = 0x4c4,
+		[ C(RESULT_MISS)   ] = 0x4c5,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+	[ C(OP_PREFETCH) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+ },
+ [ C(NODE) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = -1,
+		[ C(RESULT_MISS)   ] = -1,
+	},
+ },
+};
+
+static __initconst const u64 pnc_hw_cache_extra_regs
+				[PERF_COUNT_HW_CACHE_MAX]
+				[PERF_COUNT_HW_CACHE_OP_MAX]
+				[PERF_COUNT_HW_CACHE_RESULT_MAX] =
+{
+ [ C(LL  ) ] = {
+	[ C(OP_READ) ] = {
+		[ C(RESULT_ACCESS) ] = 0x4000000000000001,
+		[ C(RESULT_MISS)   ] = 0xFFFFF000000001,
+	},
+	[ C(OP_WRITE) ] = {
+		[ C(RESULT_ACCESS) ] = 0x4000000000000002,
+		[ C(RESULT_MISS)   ] = 0xFFFFF000000002,
+	},
+ },
+};
+
 /*
  * Notes on the events:
  * - data reads do not include code reads (comparable to earlier tables)
@@ -7236,6 +7388,20 @@ static __always_inline void intel_pmu_init_lnc(struct pmu *pmu)
 	hybrid(pmu, extra_regs) = intel_lnc_extra_regs;
 }
 
+static __always_inline void intel_pmu_init_pnc(struct pmu *pmu)
+{
+	intel_pmu_init_glc(pmu);
+	x86_pmu.flags &= ~PMU_FL_HAS_RSP_1;
+	x86_pmu.flags |= PMU_FL_HAS_OMR;
+	memcpy(hybrid_var(pmu, hw_cache_event_ids),
+	       pnc_hw_cache_event_ids, sizeof(hw_cache_event_ids));
+	memcpy(hybrid_var(pmu, hw_cache_extra_regs),
+	       pnc_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
+	hybrid(pmu, event_constraints) = intel_pnc_event_constraints;
+	hybrid(pmu, pebs_constraints) = intel_pnc_pebs_event_constraints;
+	hybrid(pmu, extra_regs) = intel_pnc_extra_regs;
+}
+
 static __always_inline void intel_pmu_init_skt(struct pmu *pmu)
 {
 	intel_pmu_init_grt(pmu);
@@ -7897,9 +8063,21 @@ __init int intel_pmu_init(void)
 		x86_pmu.extra_regs = intel_rwc_extra_regs;
 		pr_cont("Granite Rapids events, ");
 		name = "granite_rapids";
+		goto glc_common;
+
+	case INTEL_DIAMONDRAPIDS_X:
+		intel_pmu_init_pnc(NULL);
+		x86_pmu.pebs_latency_data = pnc_latency_data;
+
+		pr_cont("Panthercove events, ");
+		name = "panthercove";
+		goto glc_base;
 
 	glc_common:
 		intel_pmu_init_glc(NULL);
+		intel_pmu_pebs_data_source_skl(true);
+
+	glc_base:
 		x86_pmu.pebs_ept = 1;
 		x86_pmu.hw_config = hsw_hw_config;
 		x86_pmu.get_event_constraints = glc_get_event_constraints;
@@ -7909,7 +8087,6 @@ __init int intel_pmu_init(void)
 		mem_attr = glc_events_attrs;
 		td_attr = glc_td_events_attrs;
 		tsx_attr = glc_tsx_events_attrs;
-		intel_pmu_pebs_data_source_skl(true);
 		break;
 
 	case INTEL_ALDERLAKE:
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 272e652..06e42ac 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1425,6 +1425,33 @@ struct event_constraint intel_lnc_pebs_event_constraints[] = {
 	EVENT_CONSTRAINT_END
 };
 
+struct event_constraint intel_pnc_pebs_event_constraints[] = {
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x100, 0x100000000ULL),	/* INST_RETIRED.PREC_DIST */
+	INTEL_FLAGS_UEVENT_CONSTRAINT(0x0400, 0x800000000ULL),
+
+	INTEL_HYBRID_LDLAT_CONSTRAINT(0x1cd, 0xfc),
+	INTEL_HYBRID_STLAT_CONSTRAINT(0x2cd, 0x3),
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf),	/* MEM_INST_RETIRED.STLB_MISS_LOADS */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x12d0, 0xf),	/* MEM_INST_RETIRED.STLB_MISS_STORES */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x21d0, 0xf),	/* MEM_INST_RETIRED.LOCK_LOADS */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x41d0, 0xf),	/* MEM_INST_RETIRED.SPLIT_LOADS */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x42d0, 0xf),	/* MEM_INST_RETIRED.SPLIT_STORES */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x81d0, 0xf),	/* MEM_INST_RETIRED.ALL_LOADS */
+	INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x82d0, 0xf),	/* MEM_INST_RETIRED.ALL_STORES */
+
+	INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD_RANGE(0xd1, 0xd4, 0xf),
+
+	INTEL_FLAGS_EVENT_CONSTRAINT(0xd0, 0xf),
+	INTEL_FLAGS_EVENT_CONSTRAINT(0xd6, 0xf),
+
+	/*
+	 * Everything else is handled by PMU_FL_PEBS_ALL, because we
+	 * need the full constraints from the main table.
+	 */
+
+	EVENT_CONSTRAINT_END
+};
+
 struct event_constraint *intel_pebs_constraints(struct perf_event *event)
 {
 	struct event_constraint *pebs_constraints = hybrid(event->pmu, pebs_constraints);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index bd501c2..cbca188 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1698,6 +1698,8 @@ extern struct event_constraint intel_glc_pebs_event_constraints[];
 
 extern struct event_constraint intel_lnc_pebs_event_constraints[];
 
+extern struct event_constraint intel_pnc_pebs_event_constraints[];
+
 struct event_constraint *intel_pebs_constraints(struct perf_event *event);
 
 void intel_pmu_pebs_add(struct perf_event *event);

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip: perf/core] perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
  2026-01-14  1:17 ` [Patch v3 2/7] perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR Dapeng Mi
@ 2026-01-15 21:44   ` tip-bot2 for Dapeng Mi
  2026-03-09 23:47     ` Ian Rogers
  0 siblings, 1 reply; 21+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2026-01-15 21:44 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     d2bdcde9626cbea0c44a6aaa33b440c8adf81e09
Gitweb:        https://git.kernel.org/tip/d2bdcde9626cbea0c44a6aaa33b440c8adf81e09
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 14 Jan 2026 09:17:45 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 15 Jan 2026 10:04:26 +01:00

perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR

With the introduction of the OMR feature, the PEBS memory auxiliary info
field for load and store latency events has been restructured for DMR.

The memory auxiliary info field's bit[8] indicates whether an L2 cache
miss occurred for a memory load or store instruction. If bit[8] is 0,
it signifies no L2 cache miss, and bits[7:0] specify the exact cache data
source (up to the L2 cache level). If bit[8] is 1, bits[7:0] represent
the OMR encoding, indicating the specific L3 cache or memory region
involved in the memory access.

A significant enhancement for OMR encoding is the ability to provide
up to 8 fine-grained memory regions in addition to the cache region,
offering more detailed insights into memory access regions.
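
A worked decode example, using the encoding tables added below:

	/* aux info = 0x004: bit[8] = 0, dse = 0x04  -> L2 Hit Clean
	 * aux info = 0x103: bit[8] = 1, OMR source = 0x3
	 *				-> local CA non-shared cache (L3)
	 */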

For detailed information on the memory auxiliary info encoding, please
refer to section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in
the ISE documentation.

This patch ensures that the PEBS memory auxiliary info field is correctly
interpreted and utilized in DMR.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114011750.350569-3-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/ds.c            | 140 +++++++++++++++++++++++++-
 arch/x86/events/perf_event.h          |   2 +-
 include/uapi/linux/perf_event.h       |  27 ++++-
 tools/include/uapi/linux/perf_event.h |  27 ++++-
 4 files changed, 190 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index feb1c3c..272e652 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -34,6 +34,17 @@ struct pebs_record_32 {
 
  */
 
+union omr_encoding {
+	struct {
+		u8 omr_source : 4;
+		u8 omr_remote : 1;
+		u8 omr_hitm : 1;
+		u8 omr_snoop : 1;
+		u8 omr_promoted : 1;
+	};
+	u8 omr_full;
+};
+
 union intel_x86_pebs_dse {
 	u64 val;
 	struct {
@@ -73,6 +84,18 @@ union intel_x86_pebs_dse {
 		unsigned int lnc_addr_blk:1;
 		unsigned int ld_reserved6:18;
 	};
+	struct {
+		unsigned int pnc_dse: 8;
+		unsigned int pnc_l2_miss:1;
+		unsigned int pnc_stlb_clean_hit:1;
+		unsigned int pnc_stlb_any_hit:1;
+		unsigned int pnc_stlb_miss:1;
+		unsigned int pnc_locked:1;
+		unsigned int pnc_data_blk:1;
+		unsigned int pnc_addr_blk:1;
+		unsigned int pnc_fb_full:1;
+		unsigned int ld_reserved8:16;
+	};
 };
 
 
@@ -228,6 +251,85 @@ void __init intel_pmu_pebs_data_source_lnl(void)
 	__intel_pmu_pebs_data_source_cmt(data_source);
 }
 
+/* Version for Panthercove and later */
+
+/* L2 hit */
+#define PNC_PEBS_DATA_SOURCE_MAX	16
+static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = {
+	P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),	/* 0x00: non-cache access */
+	OP_LH               | LEVEL(L0) | P(SNOOP, NONE),	/* 0x01: L0 hit */
+	OP_LH | P(LVL, L1)  | LEVEL(L1) | P(SNOOP, NONE),	/* 0x02: L1 hit */
+	OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE),	/* 0x03: L1 Miss Handling Buffer hit */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, NONE),	/* 0x04: L2 Hit Clean */
+	0,							/* 0x05: Reserved */
+	0,							/* 0x06: Reserved */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HIT),	/* 0x07: L2 Hit Snoop HIT */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HITM),	/* 0x08: L2 Hit Snoop Hit Modified */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),	/* 0x09: Prefetch Promotion */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),	/* 0x0a: Cross Core Prefetch Promotion */
+	0,							/* 0x0b: Reserved */
+	0,							/* 0x0c: Reserved */
+	0,							/* 0x0d: Reserved */
+	0,							/* 0x0e: Reserved */
+	OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE),	/* 0x0f: uncached */
+};
+
+/* L2 miss */
+#define OMR_DATA_SOURCE_MAX		16
+static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = {
+	P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),	/* 0x00: invalid */
+	0,							/* 0x01: Reserved */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_SHARE),	/* 0x02: local CA shared cache */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_NON_SHARE),/* 0x03: local CA non-shared cache */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_IO),	/* 0x04: other CA IO agent */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_SHARE),	/* 0x05: other CA shared cache */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_NON_SHARE),/* 0x06: other CA non-shared cache */
+	OP_LH | LEVEL(RAM) | P(REGION, MMIO),			/* 0x07: MMIO */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM0),			/* 0x08: Memory region 0 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM1),			/* 0x09: Memory region 1 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM2),			/* 0x0a: Memory region 2 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM3),			/* 0x0b: Memory region 3 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM4),			/* 0x0c: Memory region 4 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM5),			/* 0x0d: Memory region 5 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM6),			/* 0x0e: Memory region 6 */
+	OP_LH | LEVEL(RAM) | P(REGION, MEM7),			/* 0x0f: Memory region 7 */
+};
+
+static u64 parse_omr_data_source(u8 dse)
+{
+	union omr_encoding omr;
+	u64 val = 0;
+
+	omr.omr_full = dse;
+	val = omr_data_source[omr.omr_source];
+	if (omr.omr_source > 0x1 && omr.omr_source < 0x7)
+		val |= omr.omr_remote ? P(LVL, REM_CCE1) : 0;
+	else if (omr.omr_source > 0x7)
+		val |= omr.omr_remote ? P(LVL, REM_RAM1) : P(LVL, LOC_RAM);
+
+	if (omr.omr_remote)
+		val |= REM;
+
+	val |= omr.omr_hitm ? P(SNOOP, HITM) : P(SNOOP, HIT);
+
+	if (omr.omr_source == 0x2) {
+		u8 snoop = omr.omr_snoop | omr.omr_promoted;
+
+		if (snoop == 0x0)
+			val |= P(SNOOP, NA);
+		else if (snoop == 0x1)
+			val |= P(SNOOP, MISS);
+		else if (snoop == 0x2)
+			val |= P(SNOOP, HIT);
+		else if (snoop == 0x3)
+			val |= P(SNOOP, NONE);
+	} else if (omr.omr_source > 0x2 && omr.omr_source < 0x7) {
+		val |= omr.omr_snoop ? P(SNOOPX, FWD) : 0;
+	}
+
+	return val;
+}
+
 static u64 precise_store_data(u64 status)
 {
 	union intel_x86_pebs_dse dse;
@@ -411,6 +513,44 @@ u64 arl_h_latency_data(struct perf_event *event, u64 status)
 	return lnl_latency_data(event, status);
 }
 
+u64 pnc_latency_data(struct perf_event *event, u64 status)
+{
+	union intel_x86_pebs_dse dse;
+	union perf_mem_data_src src;
+	u64 val;
+
+	dse.val = status;
+
+	if (!dse.pnc_l2_miss)
+		val = pnc_pebs_l2_hit_data_source[dse.pnc_dse & 0xf];
+	else
+		val = parse_omr_data_source(dse.pnc_dse);
+
+	if (!val)
+		val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA);
+
+	if (dse.pnc_stlb_miss)
+		val |= P(TLB, MISS) | P(TLB, L2);
+	else
+		val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2);
+
+	if (dse.pnc_locked)
+		val |= P(LOCK, LOCKED);
+
+	if (dse.pnc_data_blk)
+		val |= P(BLK, DATA);
+	if (dse.pnc_addr_blk)
+		val |= P(BLK, ADDR);
+	if (!dse.pnc_data_blk && !dse.pnc_addr_blk)
+		val |= P(BLK, NA);
+
+	src.val = val;
+	if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
+		src.mem_op = P(OP, STORE);
+
+	return src.val;
+}
+
 static u64 load_latency_data(struct perf_event *event, u64 status)
 {
 	union intel_x86_pebs_dse dse;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 586e3fd..bd501c2 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1664,6 +1664,8 @@ u64 lnl_latency_data(struct perf_event *event, u64 status);
 
 u64 arl_h_latency_data(struct perf_event *event, u64 status);
 
+u64 pnc_latency_data(struct perf_event *event, u64 status);
+
 extern struct event_constraint intel_core2_pebs_event_constraints[];
 
 extern struct event_constraint intel_atom_pebs_event_constraints[];
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index c44a8fb..533393e 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1330,14 +1330,16 @@ union perf_mem_data_src {
 			mem_snoopx  :  2, /* Snoop mode, ext */
 			mem_blk     :  3, /* Access blocked */
 			mem_hops    :  3, /* Hop level */
-			mem_rsvd    : 18;
+			mem_region  :  5, /* cache/memory regions */
+			mem_rsvd    : 13;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64	mem_rsvd    : 18,
+		__u64	mem_rsvd    : 13,
+			mem_region  :  5, /* cache/memory regions */
 			mem_hops    :  3, /* Hop level */
 			mem_blk     :  3, /* Access blocked */
 			mem_snoopx  :  2, /* Snoop mode, ext */
@@ -1394,7 +1396,7 @@ union perf_mem_data_src {
 #define PERF_MEM_LVLNUM_L4			0x0004 /* L4 */
 #define PERF_MEM_LVLNUM_L2_MHB			0x0005 /* L2 Miss Handling Buffer */
 #define PERF_MEM_LVLNUM_MSC			0x0006 /* Memory-side Cache */
-/* 0x007 available */
+#define PERF_MEM_LVLNUM_L0			0x0007 /* L0 */
 #define PERF_MEM_LVLNUM_UNC			0x0008 /* Uncached */
 #define PERF_MEM_LVLNUM_CXL			0x0009 /* CXL */
 #define PERF_MEM_LVLNUM_IO			0x000a /* I/O */
@@ -1447,6 +1449,25 @@ union perf_mem_data_src {
 /* 5-7 available */
 #define PERF_MEM_HOPS_SHIFT			43
 
+/* Cache/Memory region */
+#define PERF_MEM_REGION_NA		0x0  /* Invalid */
+#define PERF_MEM_REGION_RSVD		0x01 /* Reserved */
+#define PERF_MEM_REGION_L_SHARE		0x02 /* Local CA shared cache */
+#define PERF_MEM_REGION_L_NON_SHARE	0x03 /* Local CA non-shared cache */
+#define PERF_MEM_REGION_O_IO		0x04 /* Other CA IO agent */
+#define PERF_MEM_REGION_O_SHARE		0x05 /* Other CA shared cache */
+#define PERF_MEM_REGION_O_NON_SHARE	0x06 /* Other CA non-shared cache */
+#define PERF_MEM_REGION_MMIO		0x07 /* MMIO */
+#define PERF_MEM_REGION_MEM0		0x08 /* Memory region 0 */
+#define PERF_MEM_REGION_MEM1		0x09 /* Memory region 1 */
+#define PERF_MEM_REGION_MEM2		0x0a /* Memory region 2 */
+#define PERF_MEM_REGION_MEM3		0x0b /* Memory region 3 */
+#define PERF_MEM_REGION_MEM4		0x0c /* Memory region 4 */
+#define PERF_MEM_REGION_MEM5		0x0d /* Memory region 5 */
+#define PERF_MEM_REGION_MEM6		0x0e /* Memory region 6 */
+#define PERF_MEM_REGION_MEM7		0x0f /* Memory region 7 */
+#define PERF_MEM_REGION_SHIFT		46
+
 #define PERF_MEM_S(a, s) \
 	(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
 
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index c44a8fb..d4b9961 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -1330,14 +1330,16 @@ union perf_mem_data_src {
 			mem_snoopx  :  2, /* Snoop mode, ext */
 			mem_blk     :  3, /* Access blocked */
 			mem_hops    :  3, /* Hop level */
-			mem_rsvd    : 18;
+			mem_region  :  5, /* cache/memory regions */
+			mem_rsvd    : 13;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64	mem_rsvd    : 18,
+		__u64	mem_rsvd    : 13,
+			mem_region  :  5, /* cache/memory regions */
 			mem_hops    :  3, /* Hop level */
 			mem_blk     :  3, /* Access blocked */
 			mem_snoopx  :  2, /* Snoop mode, ext */
@@ -1394,7 +1396,7 @@ union perf_mem_data_src {
 #define PERF_MEM_LVLNUM_L4			0x0004 /* L4 */
 #define PERF_MEM_LVLNUM_L2_MHB			0x0005 /* L2 Miss Handling Buffer */
 #define PERF_MEM_LVLNUM_MSC			0x0006 /* Memory-side Cache */
-/* 0x007 available */
+#define PERF_MEM_LVLNUM_L0			0x0007 /* L0 */
 #define PERF_MEM_LVLNUM_UNC			0x0008 /* Uncached */
 #define PERF_MEM_LVLNUM_CXL			0x0009 /* CXL */
 #define PERF_MEM_LVLNUM_IO			0x000a /* I/O */
@@ -1447,6 +1449,25 @@ union perf_mem_data_src {
 /* 5-7 available */
 #define PERF_MEM_HOPS_SHIFT			43
 
+/* Cache/Memory region */
+#define PERF_MEM_REGION_NA		0x0  /* Invalid */
+#define PERF_MEM_REGION_RSVD		0x01 /* Reserved */
+#define PERF_MEM_REGION_L_SHARE		0x02 /* Local CA shared cache */
+#define PERF_MEM_REGION_L_NON_SHARE	0x03 /* Local CA non-shared cache */
+#define PERF_MEM_REGION_O_IO		0x04 /* Other CA IO agent */
+#define PERF_MEM_REGION_O_SHARE		0x05 /* Other CA shared cache */
+#define PERF_MEM_REGION_O_NON_SHARE	0x06 /* Other CA non-shared cache */
+#define PERF_MEM_REGION_MMIO		0x07 /* MMIO */
+#define PERF_MEM_REGION_MEM0		0x08 /* Memory region 0 */
+#define PERF_MEM_REGION_MEM1		0x09 /* Memory region 1 */
+#define PERF_MEM_REGION_MEM2		0x0a /* Memory region 2 */
+#define PERF_MEM_REGION_MEM3		0x0b /* Memory region 3 */
+#define PERF_MEM_REGION_MEM4		0x0c /* Memory region 4 */
+#define PERF_MEM_REGION_MEM5		0x0d /* Memory region 5 */
+#define PERF_MEM_REGION_MEM6		0x0e /* Memory region 6 */
+#define PERF_MEM_REGION_MEM7		0x0f /* Memory region 7 */
+#define PERF_MEM_REGION_SHIFT		46
+
 #define PERF_MEM_S(a, s) \
 	(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
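
For illustration, a minimal consumer-side decode of the new mem_region
bits might look like the sketch below. This is not part of the patch: it
assumes a data_src value obtained via PERF_SAMPLE_DATA_SRC and uapi
headers that already carry the PERF_MEM_REGION_* defines added above.

#include <linux/perf_event.h>

static const char *mem_region_str(__u64 data_src)
{
	/* mem_region is the new 5-bit field at bit 46 (PERF_MEM_REGION_SHIFT). */
	switch ((data_src >> PERF_MEM_REGION_SHIFT) & 0x1f) {
	case PERF_MEM_REGION_L_SHARE:		return "local CA shared cache";
	case PERF_MEM_REGION_L_NON_SHARE:	return "local CA non-shared cache";
	case PERF_MEM_REGION_O_IO:		return "other CA IO agent";
	case PERF_MEM_REGION_O_SHARE:		return "other CA shared cache";
	case PERF_MEM_REGION_O_NON_SHARE:	return "other CA non-shared cache";
	case PERF_MEM_REGION_MMIO:		return "MMIO";
	case PERF_MEM_REGION_MEM0 ... PERF_MEM_REGION_MEM7:
		return "fine-grained memory region";
	default:
		return "N/A";
	}
}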
 


* [tip: perf/core] perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL
  2026-01-14  1:17 ` [Patch v3 1/7] perf/x86/intel: Support the 4 new OMR MSRs introduced in " Dapeng Mi
@ 2026-01-15 21:44   ` tip-bot2 for Dapeng Mi
  0 siblings, 0 replies; 21+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2026-01-15 21:44 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     4e955c08d6dc76fb60cda9af955ddcebedaa7f69
Gitweb:        https://git.kernel.org/tip/4e955c08d6dc76fb60cda9af955ddcebedaa7f69
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 14 Jan 2026 09:17:44 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 15 Jan 2026 10:04:26 +01:00

perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL

Diamond Rapids (DMR) and Nova Lake (NVL) introduce an enhanced
Off-Module Response (OMR) facility, replacing the Off-Core Response (OCR)
performance monitoring facility of previous processors.

Legacy microarchitectures used the OCR facility to evaluate off-core and
multi-core off-module transactions. The newly named OMR facility improves
OCR capabilities for scalable coverage of new memory systems in
multi-core module systems.

Similar to OCR, 4 additional off-module configuration MSRs
(OFFMODULE_RSP_0 to OFFMODULE_RSP_3) are introduced to specify attributes
of off-module transactions. When multiple identical OMR events are
created, they all contend for the same OFFMODULE_RSP_x MSR. To ensure
such identical OMR events can count simultaneously, the intel_alt_er()
and intel_fixup_er() helpers are enhanced to rotate them across the
different OFFMODULE_RSP_* MSRs, as is already done for OCR events.

For more details about OMR, please refer to section 16.1 "OFF-MODULE
RESPONSE (OMR) FACILITY" in the ISE documentation.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114011750.350569-2-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/core.c     | 59 ++++++++++++++++++++++---------
 arch/x86/events/perf_event.h     |  5 +++-
 arch/x86/include/asm/msr-index.h |  5 +++-
 3 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 1840ca1..3578c66 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3532,17 +3532,32 @@ static int intel_alt_er(struct cpu_hw_events *cpuc,
 	struct extra_reg *extra_regs = hybrid(cpuc->pmu, extra_regs);
 	int alt_idx = idx;
 
-	if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1))
-		return idx;
-
-	if (idx == EXTRA_REG_RSP_0)
-		alt_idx = EXTRA_REG_RSP_1;
+	switch (idx) {
+	case EXTRA_REG_RSP_0 ... EXTRA_REG_RSP_1:
+		if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1))
+			return idx;
+		if (++alt_idx > EXTRA_REG_RSP_1)
+			alt_idx = EXTRA_REG_RSP_0;
+		if (config & ~extra_regs[alt_idx].valid_mask)
+			return idx;
+		break;
 
-	if (idx == EXTRA_REG_RSP_1)
-		alt_idx = EXTRA_REG_RSP_0;
+	case EXTRA_REG_OMR_0 ... EXTRA_REG_OMR_3:
+		if (!(x86_pmu.flags & PMU_FL_HAS_OMR))
+			return idx;
+		if (++alt_idx > EXTRA_REG_OMR_3)
+			alt_idx = EXTRA_REG_OMR_0;
+		/*
+		 * Subtract EXTRA_REG_OMR_0 to index the OMR extra_reg
+		 * entries, which start from 0.
+		 */
+		if (config & ~extra_regs[alt_idx - EXTRA_REG_OMR_0].valid_mask)
+			return idx;
+		break;
 
-	if (config & ~extra_regs[alt_idx].valid_mask)
-		return idx;
+	default:
+		break;
+	}
 
 	return alt_idx;
 }
@@ -3550,16 +3565,26 @@ static int intel_alt_er(struct cpu_hw_events *cpuc,
 static void intel_fixup_er(struct perf_event *event, int idx)
 {
 	struct extra_reg *extra_regs = hybrid(event->pmu, extra_regs);
-	event->hw.extra_reg.idx = idx;
+	int er_idx;
 
-	if (idx == EXTRA_REG_RSP_0) {
-		event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
-		event->hw.config |= extra_regs[EXTRA_REG_RSP_0].event;
-		event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0;
-	} else if (idx == EXTRA_REG_RSP_1) {
+	event->hw.extra_reg.idx = idx;
+	switch (idx) {
+	case EXTRA_REG_RSP_0 ... EXTRA_REG_RSP_1:
+		er_idx = idx - EXTRA_REG_RSP_0;
 		event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
-		event->hw.config |= extra_regs[EXTRA_REG_RSP_1].event;
-		event->hw.extra_reg.reg = MSR_OFFCORE_RSP_1;
+		event->hw.config |= extra_regs[er_idx].event;
+		event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0 + er_idx;
+		break;
+
+	case EXTRA_REG_OMR_0 ... EXTRA_REG_OMR_3:
+		er_idx = idx - EXTRA_REG_OMR_0;
+		event->hw.config &= ~ARCH_PERFMON_EVENTSEL_UMASK;
+		event->hw.config |= 1ULL << (8 + er_idx);
+		event->hw.extra_reg.reg = MSR_OMR_0 + er_idx;
+		break;
+
+	default:
+		pr_warn("The extra reg idx %d is not supported.\n", idx);
 	}
 }
 
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 3161ec0..586e3fd 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -45,6 +45,10 @@ enum extra_reg_type {
 	EXTRA_REG_FE		= 4,  /* fe_* */
 	EXTRA_REG_SNOOP_0	= 5,  /* snoop response 0 */
 	EXTRA_REG_SNOOP_1	= 6,  /* snoop response 1 */
+	EXTRA_REG_OMR_0		= 7,  /* OMR 0 */
+	EXTRA_REG_OMR_1		= 8,  /* OMR 1 */
+	EXTRA_REG_OMR_2		= 9,  /* OMR 2 */
+	EXTRA_REG_OMR_3		= 10,  /* OMR 3 */
 
 	EXTRA_REG_MAX		      /* number of entries needed */
 };
@@ -1099,6 +1103,7 @@ do {									\
 #define PMU_FL_RETIRE_LATENCY	0x200 /* Support Retire Latency in PEBS */
 #define PMU_FL_BR_CNTR		0x400 /* Support branch counter logging */
 #define PMU_FL_DYN_CONSTRAINT	0x800 /* Needs dynamic constraint */
+#define PMU_FL_HAS_OMR		0x1000 /* has 4 equivalent OMR regs */
 
 #define EVENT_VAR(_id)  event_attr_##_id
 #define EVENT_PTR(_id) &event_attr_##_id.attr.attr
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3d0a095..6d1b69e 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -263,6 +263,11 @@
 #define MSR_SNOOP_RSP_0			0x00001328
 #define MSR_SNOOP_RSP_1			0x00001329
 
+#define MSR_OMR_0			0x000003e0
+#define MSR_OMR_1			0x000003e1
+#define MSR_OMR_2			0x000003e2
+#define MSR_OMR_3			0x000003e3
+
 #define MSR_LBR_SELECT			0x000001c8
 #define MSR_LBR_TOS			0x000001c9
 


* Re: [tip: perf/core] perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
  2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
@ 2026-03-09 23:47     ` Ian Rogers
  2026-03-10  3:32       ` Mi, Dapeng
  0 siblings, 1 reply; 21+ messages in thread
From: Ian Rogers @ 2026-03-09 23:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-tip-commits, Dapeng Mi, Peter Zijlstra (Intel), x86

On Thu, Jan 15, 2026 at 1:46 PM tip-bot2 for Dapeng Mi
<tip-bot2@linutronix.de> wrote:
>
> The following commit has been merged into the perf/core branch of tip:
>
> Commit-ID:     d2bdcde9626cbea0c44a6aaa33b440c8adf81e09
> Gitweb:        https://git.kernel.org/tip/d2bdcde9626cbea0c44a6aaa33b440c8adf81e09
> Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
> AuthorDate:    Wed, 14 Jan 2026 09:17:45 +08:00
> Committer:     Peter Zijlstra <peterz@infradead.org>
> CommitterDate: Thu, 15 Jan 2026 10:04:26 +01:00
>
> perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
>
> With the introduction of the OMR feature, the PEBS memory auxiliary info
> field for load and store latency events has been restructured for DMR.
>
The memory auxiliary info field's bit[8] indicates whether an L2 cache
> miss occurred for a memory load or store instruction. If bit[8] is 0,
> it signifies no L2 cache miss, and bits[7:0] specify the exact cache data
> source (up to the L2 cache level). If bit[8] is 1, bits[7:0] represent
> the OMR encoding, indicating the specific L3 cache or memory region
involved in the memory access.
>
> A significant enhancement for OMR encoding is the ability to provide
> up to 8 fine-grained memory regions in addition to the cache region,
> offering more detailed insights into memory access regions.
>
> For detailed information on the memory auxiliary info encoding, please
> refer to section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in
> the ISE documentation.
>
> This patch ensures that the PEBS memory auxiliary info field is correctly
> interpreted and utilized in DMR.
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: https://patch.msgid.link/20260114011750.350569-3-dapeng1.mi@linux.intel.com
> ---
>  arch/x86/events/intel/ds.c            | 140 +++++++++++++++++++++++++-
>  arch/x86/events/perf_event.h          |   2 +-
>  include/uapi/linux/perf_event.h       |  27 ++++-
>  tools/include/uapi/linux/perf_event.h |  27 ++++-
>  4 files changed, 190 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index feb1c3c..272e652 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -34,6 +34,17 @@ struct pebs_record_32 {
>
>   */
>
> +union omr_encoding {
> +       struct {
> +               u8 omr_source : 4;
> +               u8 omr_remote : 1;
> +               u8 omr_hitm : 1;
> +               u8 omr_snoop : 1;
> +               u8 omr_promoted : 1;

Hi Dapeng,

omr_snoop and omr_promoted are 1 bit fields here.

> +       };
> +       u8 omr_full;
> +};
> +
>  union intel_x86_pebs_dse {
>         u64 val;
>         struct {
> @@ -73,6 +84,18 @@ union intel_x86_pebs_dse {
>                 unsigned int lnc_addr_blk:1;
>                 unsigned int ld_reserved6:18;
>         };
> +       struct {
> +               unsigned int pnc_dse: 8;
> +               unsigned int pnc_l2_miss:1;
> +               unsigned int pnc_stlb_clean_hit:1;
> +               unsigned int pnc_stlb_any_hit:1;
> +               unsigned int pnc_stlb_miss:1;
> +               unsigned int pnc_locked:1;
> +               unsigned int pnc_data_blk:1;
> +               unsigned int pnc_addr_blk:1;
> +               unsigned int pnc_fb_full:1;
> +               unsigned int ld_reserved8:16;
> +       };
>  };
>
>
> @@ -228,6 +251,85 @@ void __init intel_pmu_pebs_data_source_lnl(void)
>         __intel_pmu_pebs_data_source_cmt(data_source);
>  }
>
> +/* Version for Panthercove and later */
> +
> +/* L2 hit */
> +#define PNC_PEBS_DATA_SOURCE_MAX       16
> +static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = {
> +       P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),    /* 0x00: non-cache access */
> +       OP_LH               | LEVEL(L0) | P(SNOOP, NONE),       /* 0x01: L0 hit */
> +       OP_LH | P(LVL, L1)  | LEVEL(L1) | P(SNOOP, NONE),       /* 0x02: L1 hit */
> +       OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE),      /* 0x03: L1 Miss Handling Buffer hit */
> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, NONE),       /* 0x04: L2 Hit Clean */
> +       0,                                                      /* 0x05: Reserved */
> +       0,                                                      /* 0x06: Reserved */
> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HIT),        /* 0x07: L2 Hit Snoop HIT */
> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HITM),       /* 0x08: L2 Hit Snoop Hit Modified */
> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),       /* 0x09: Prefetch Promotion */
> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),       /* 0x0a: Cross Core Prefetch Promotion */
> +       0,                                                      /* 0x0b: Reserved */
> +       0,                                                      /* 0x0c: Reserved */
> +       0,                                                      /* 0x0d: Reserved */
> +       0,                                                      /* 0x0e: Reserved */
> +       OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE),       /* 0x0f: uncached */
> +};
> +
> +/* L2 miss */
> +#define OMR_DATA_SOURCE_MAX            16
> +static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = {
> +       P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),    /* 0x00: invalid */
> +       0,                                                      /* 0x01: Reserved */
> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_SHARE),    /* 0x02: local CA shared cache */
> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_NON_SHARE),/* 0x03: local CA non-shared cache */
> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_IO),       /* 0x04: other CA IO agent */
> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_SHARE),    /* 0x05: other CA shared cache */
> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_NON_SHARE),/* 0x06: other CA non-shared cache */
> +       OP_LH | LEVEL(RAM) | P(REGION, MMIO),                   /* 0x07: MMIO */
> +       OP_LH | LEVEL(RAM) | P(REGION, MEM0),                   /* 0x08: Memory region 0 */
> +       OP_LH | LEVEL(RAM) | P(REGION, MEM1),                   /* 0x09: Memory region 1 */
> +       OP_LH | LEVEL(RAM) | P(REGION, MEM2),                   /* 0x0a: Memory region 2 */
> +       OP_LH | LEVEL(RAM) | P(REGION, MEM3),                   /* 0x0b: Memory region 3 */
> +       OP_LH | LEVEL(RAM) | P(REGION, MEM4),                   /* 0x0c: Memory region 4 */
> +       OP_LH | LEVEL(RAM) | P(REGION, MEM5),                   /* 0x0d: Memory region 5 */
> +       OP_LH | LEVEL(RAM) | P(REGION, MEM6),                   /* 0x0e: Memory region 6 */
> +       OP_LH | LEVEL(RAM) | P(REGION, MEM7),                   /* 0x0f: Memory region 7 */
> +};
> +
> +static u64 parse_omr_data_source(u8 dse)
> +{
> +       union omr_encoding omr;
> +       u64 val = 0;
> +
> +       omr.omr_full = dse;
> +       val = omr_data_source[omr.omr_source];
> +       if (omr.omr_source > 0x1 && omr.omr_source < 0x7)
> +               val |= omr.omr_remote ? P(LVL, REM_CCE1) : 0;
> +       else if (omr.omr_source > 0x7)
> +               val |= omr.omr_remote ? P(LVL, REM_RAM1) : P(LVL, LOC_RAM);
> +
> +       if (omr.omr_remote)
> +               val |= REM;
> +
> +       val |= omr.omr_hitm ? P(SNOOP, HITM) : P(SNOOP, HIT);
> +
> +       if (omr.omr_source == 0x2) {
> +               u8 snoop = omr.omr_snoop | omr.omr_promoted;

Or-ing the values together should mean snoop is only ever 0 or 1.

> +
> +               if (snoop == 0x0)
> +                       val |= P(SNOOP, NA);
> +               else if (snoop == 0x1)
> +                       val |= P(SNOOP, MISS);
> +               else if (snoop == 0x2)
> +                       val |= P(SNOOP, HIT);
> +               else if (snoop == 0x3)
> +                       val |= P(SNOOP, NONE);

How can snoop equal 0x2 or 0x3 here? Should snoop be "(omr.omr_snoop
<< 1) | omr.omr_promoted" ?

Thanks,
Ian

> +       } else if (omr.omr_source > 0x2 && omr.omr_source < 0x7) {
> +               val |= omr.omr_snoop ? P(SNOOPX, FWD) : 0;
> +       }
> +
> +       return val;
> +}


* Re: [Patch v3 6/7] perf/x86: Use macros to replace magic numbers in attr_rdpmc
  2026-01-14  1:17 ` [Patch v3 6/7] perf/x86: Use macros to replace magic numbers in attr_rdpmc Dapeng Mi
  2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
@ 2026-03-09 23:56   ` Ian Rogers
  2026-03-10  3:14     ` Mi, Dapeng
  1 sibling, 1 reply; 21+ messages in thread
From: Ian Rogers @ 2026-03-09 23:56 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Zide Chen, Falcon Thomas, Xudong Hao

On Tue, Jan 13, 2026 at 5:22 PM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>
> Replace magic numbers in attr_rdpmc with macros to improve readability
> and make their meanings clearer for users.
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
>  arch/x86/events/core.c       | 7 ++++---
>  arch/x86/events/intel/p6.c   | 2 +-
>  arch/x86/events/perf_event.h | 7 +++++++
>  3 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 0ecac9495d74..c2717cb5034f 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2163,7 +2163,8 @@ static int __init init_hw_perf_events(void)
>
>         pr_cont("%s PMU driver.\n", x86_pmu.name);
>
> -       x86_pmu.attr_rdpmc = 1; /* enable userspace RDPMC usage by default */
> +       /* enable userspace RDPMC usage by default */
> +       x86_pmu.attr_rdpmc = X86_USER_RDPMC_CONDITIONAL_ENABLE;
>
>         for (quirk = x86_pmu.quirks; quirk; quirk = quirk->next)
>                 quirk->func();
> @@ -2643,12 +2644,12 @@ static ssize_t set_attr_rdpmc(struct device *cdev,
>                  */
>                 if (val == 0)

nit: Could 0 here be X86_USER_RDPMC_NEVER_ENABLE to eliminate even
more magic numbers?

>                         static_branch_inc(&rdpmc_never_available_key);
> -               else if (x86_pmu.attr_rdpmc == 0)
> +               else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_NEVER_ENABLE)
>                         static_branch_dec(&rdpmc_never_available_key);
>
>                 if (val == 2)

Similarly could 2 be X86_USER_RDPMC_ALWAYS_ENABLE ?

Thanks,
Ian

>                         static_branch_inc(&rdpmc_always_available_key);
> -               else if (x86_pmu.attr_rdpmc == 2)
> +               else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE)
>                         static_branch_dec(&rdpmc_always_available_key);
>
>                 on_each_cpu(cr4_update_pce, NULL, 1);
> diff --git a/arch/x86/events/intel/p6.c b/arch/x86/events/intel/p6.c
> index 6e41de355bd8..fb991e0ac614 100644
> --- a/arch/x86/events/intel/p6.c
> +++ b/arch/x86/events/intel/p6.c
> @@ -243,7 +243,7 @@ static __init void p6_pmu_rdpmc_quirk(void)
>                  */
>                 pr_warn("Userspace RDPMC support disabled due to a CPU erratum\n");
>                 x86_pmu.attr_rdpmc_broken = 1;
> -               x86_pmu.attr_rdpmc = 0;
> +               x86_pmu.attr_rdpmc = X86_USER_RDPMC_NEVER_ENABLE;
>         }
>  }
>
> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> index f7caabc5d487..24a81d2916e9 100644
> --- a/arch/x86/events/perf_event.h
> +++ b/arch/x86/events/perf_event.h
> @@ -187,6 +187,13 @@ struct amd_nb {
>          (1ULL << PERF_REG_X86_R14)   | \
>          (1ULL << PERF_REG_X86_R15))
>
> +/* user space rdpmc control values */
> +enum {
> +       X86_USER_RDPMC_NEVER_ENABLE             = 0,
> +       X86_USER_RDPMC_CONDITIONAL_ENABLE       = 1,
> +       X86_USER_RDPMC_ALWAYS_ENABLE            = 2,
> +};
> +
>  /*
>   * Per register state.
>   */
> --
> 2.34.1
>


* Re: [Patch v3 6/7] perf/x86: Use macros to replace magic numbers in attr_rdpmc
  2026-03-09 23:56   ` [Patch v3 6/7] " Ian Rogers
@ 2026-03-10  3:14     ` Mi, Dapeng
  0 siblings, 0 replies; 21+ messages in thread
From: Mi, Dapeng @ 2026-03-10  3:14 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Zide Chen, Falcon Thomas, Xudong Hao


On 3/10/2026 7:56 AM, Ian Rogers wrote:
> On Tue, Jan 13, 2026 at 5:22 PM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>> Replace magic numbers in attr_rdpmc with macros to improve readability
>> and make their meanings clearer for users.
>>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>>  arch/x86/events/core.c       | 7 ++++---
>>  arch/x86/events/intel/p6.c   | 2 +-
>>  arch/x86/events/perf_event.h | 7 +++++++
>>  3 files changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
>> index 0ecac9495d74..c2717cb5034f 100644
>> --- a/arch/x86/events/core.c
>> +++ b/arch/x86/events/core.c
>> @@ -2163,7 +2163,8 @@ static int __init init_hw_perf_events(void)
>>
>>         pr_cont("%s PMU driver.\n", x86_pmu.name);
>>
>> -       x86_pmu.attr_rdpmc = 1; /* enable userspace RDPMC usage by default */
>> +       /* enable userspace RDPMC usage by default */
>> +       x86_pmu.attr_rdpmc = X86_USER_RDPMC_CONDITIONAL_ENABLE;
>>
>>         for (quirk = x86_pmu.quirks; quirk; quirk = quirk->next)
>>                 quirk->func();
>> @@ -2643,12 +2644,12 @@ static ssize_t set_attr_rdpmc(struct device *cdev,
>>                  */
>>                 if (val == 0)
> nit: Could 0 here be X86_USER_RDPMC_NEVER_ENABLE to eliminate even
> more magic numbers?

IMO, the "val" directly comes from user, we'd better keep it as the hard
coded number, then readers would be easy to know which values would be
mapped to "X86_USER_RDPMC_NEVER_ENABLE",  "X86_USER_RDPMC_ALWAYS_ENABLE" or
"X86_USER_RDPMC_CONDITIONAL_ENABLE". They are not 1:1 mapped. Thanks.


>
>>                         static_branch_inc(&rdpmc_never_available_key);
>> -               else if (x86_pmu.attr_rdpmc == 0)
>> +               else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_NEVER_ENABLE)
>>                         static_branch_dec(&rdpmc_never_available_key);
>>
>>                 if (val == 2)
> Similarly could 2 be X86_USER_RDPMC_ALWAYS_ENABLE ?
>
> Thanks,
> Ian
>
>>                         static_branch_inc(&rdpmc_always_available_key);
>> -               else if (x86_pmu.attr_rdpmc == 2)
>> +               else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE)
>>                         static_branch_dec(&rdpmc_always_available_key);
>>
>>                 on_each_cpu(cr4_update_pce, NULL, 1);
>> diff --git a/arch/x86/events/intel/p6.c b/arch/x86/events/intel/p6.c
>> index 6e41de355bd8..fb991e0ac614 100644
>> --- a/arch/x86/events/intel/p6.c
>> +++ b/arch/x86/events/intel/p6.c
>> @@ -243,7 +243,7 @@ static __init void p6_pmu_rdpmc_quirk(void)
>>                  */
>>                 pr_warn("Userspace RDPMC support disabled due to a CPU erratum\n");
>>                 x86_pmu.attr_rdpmc_broken = 1;
>> -               x86_pmu.attr_rdpmc = 0;
>> +               x86_pmu.attr_rdpmc = X86_USER_RDPMC_NEVER_ENABLE;
>>         }
>>  }
>>
>> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
>> index f7caabc5d487..24a81d2916e9 100644
>> --- a/arch/x86/events/perf_event.h
>> +++ b/arch/x86/events/perf_event.h
>> @@ -187,6 +187,13 @@ struct amd_nb {
>>          (1ULL << PERF_REG_X86_R14)   | \
>>          (1ULL << PERF_REG_X86_R15))
>>
>> +/* user space rdpmc control values */
>> +enum {
>> +       X86_USER_RDPMC_NEVER_ENABLE             = 0,
>> +       X86_USER_RDPMC_CONDITIONAL_ENABLE       = 1,
>> +       X86_USER_RDPMC_ALWAYS_ENABLE            = 2,
>> +};
>> +
>>  /*
>>   * Per register state.
>>   */
>> --
>> 2.34.1
>>


* Re: [tip: perf/core] perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
  2026-03-09 23:47     ` Ian Rogers
@ 2026-03-10  3:32       ` Mi, Dapeng
  2026-03-10  4:38         ` Ian Rogers
  0 siblings, 1 reply; 21+ messages in thread
From: Mi, Dapeng @ 2026-03-10  3:32 UTC (permalink / raw)
  To: Ian Rogers, linux-kernel; +Cc: linux-tip-commits, Peter Zijlstra (Intel), x86


On 3/10/2026 7:47 AM, Ian Rogers wrote:
> On Thu, Jan 15, 2026 at 1:46 PM tip-bot2 for Dapeng Mi
> <tip-bot2@linutronix.de> wrote:
>> The following commit has been merged into the perf/core branch of tip:
>>
>> Commit-ID:     d2bdcde9626cbea0c44a6aaa33b440c8adf81e09
>> Gitweb:        https://git.kernel.org/tip/d2bdcde9626cbea0c44a6aaa33b440c8adf81e09
>> Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
>> AuthorDate:    Wed, 14 Jan 2026 09:17:45 +08:00
>> Committer:     Peter Zijlstra <peterz@infradead.org>
>> CommitterDate: Thu, 15 Jan 2026 10:04:26 +01:00
>>
>> perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
>>
>> With the introduction of the OMR feature, the PEBS memory auxiliary info
>> field for load and store latency events has been restructured for DMR.
>>
>> The memory auxiliary info field's bit[8] indicates whether a L2 cache
>> miss occurred for a memory load or store instruction. If bit[8] is 0,
>> it signifies no L2 cache miss, and bits[7:0] specify the exact cache data
>> source (up to the L2 cache level). If bit[8] is 1, bits[7:0] represent
>> the OMR encoding, indicating the specific L3 cache or memory region
>> involved in the memory access. A significant enhancement is OMR encoding
>> provides up to 8 fine-grained memory regions besides the cache region.
>>
>> A significant enhancement for OMR encoding is the ability to provide
>> up to 8 fine-grained memory regions in addition to the cache region,
>> offering more detailed insights into memory access regions.
>>
>> For detailed information on the memory auxiliary info encoding, please
>> refer to section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in
>> the ISE documentation.
>>
>> This patch ensures that the PEBS memory auxiliary info field is correctly
>> interpreted and utilized in DMR.
>>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>> Link: https://patch.msgid.link/20260114011750.350569-3-dapeng1.mi@linux.intel.com
>> ---
>>  arch/x86/events/intel/ds.c            | 140 +++++++++++++++++++++++++-
>>  arch/x86/events/perf_event.h          |   2 +-
>>  include/uapi/linux/perf_event.h       |  27 ++++-
>>  tools/include/uapi/linux/perf_event.h |  27 ++++-
>>  4 files changed, 190 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index feb1c3c..272e652 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -34,6 +34,17 @@ struct pebs_record_32 {
>>
>>   */
>>
>> +union omr_encoding {
>> +       struct {
>> +               u8 omr_source : 4;
>> +               u8 omr_remote : 1;
>> +               u8 omr_hitm : 1;
>> +               u8 omr_snoop : 1;
>> +               u8 omr_promoted : 1;
> Hi Dapeng,
>
> omr_snoop and omr_promoted are 1 bit fields here.

Yes. According to the OMR encoding layout in "Table 16-5. OMR Encoding
for P-Core and E-Core Microarchitectures" of the ISE doc, bit[6]
represents the snoop information and bit[7] represents a promoted
prefetch in most cases. Bit[7] and bit[6] are combined to represent the
snoop information only when the omr_source field is 0x2, which is an
exception. So bit[6] is named omr_snoop and bit[7] is named
omr_promoted here. Thanks.
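
For reference, the decode being asked about (treating bits [7:6] as one
2-bit snoop code when omr_source is 0x2) would look roughly like the
sketch below; which of bit[6]/bit[7] is the high bit would still need
to be confirmed against Table 16-5:

	if (omr.omr_source == 0x2) {
		/* Combine the two 1-bit fields into a 2-bit snoop code. */
		u8 snoop = (omr.omr_snoop << 1) | omr.omr_promoted;

		switch (snoop) {
		case 0x0: val |= P(SNOOP, NA);   break;
		case 0x1: val |= P(SNOOP, MISS); break;
		case 0x2: val |= P(SNOOP, HIT);  break;
		case 0x3: val |= P(SNOOP, NONE); break;
		}
	}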

>
>> +       };
>> +       u8 omr_full;
>> +};
>> +
>>  union intel_x86_pebs_dse {
>>         u64 val;
>>         struct {
>> @@ -73,6 +84,18 @@ union intel_x86_pebs_dse {
>>                 unsigned int lnc_addr_blk:1;
>>                 unsigned int ld_reserved6:18;
>>         };
>> +       struct {
>> +               unsigned int pnc_dse: 8;
>> +               unsigned int pnc_l2_miss:1;
>> +               unsigned int pnc_stlb_clean_hit:1;
>> +               unsigned int pnc_stlb_any_hit:1;
>> +               unsigned int pnc_stlb_miss:1;
>> +               unsigned int pnc_locked:1;
>> +               unsigned int pnc_data_blk:1;
>> +               unsigned int pnc_addr_blk:1;
>> +               unsigned int pnc_fb_full:1;
>> +               unsigned int ld_reserved8:16;
>> +       };
>>  };
>>
>>
>> @@ -228,6 +251,85 @@ void __init intel_pmu_pebs_data_source_lnl(void)
>>         __intel_pmu_pebs_data_source_cmt(data_source);
>>  }
>>
>> +/* Version for Panthercove and later */
>> +
>> +/* L2 hit */
>> +#define PNC_PEBS_DATA_SOURCE_MAX       16
>> +static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = {
>> +       P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),    /* 0x00: non-cache access */
>> +       OP_LH               | LEVEL(L0) | P(SNOOP, NONE),       /* 0x01: L0 hit */
>> +       OP_LH | P(LVL, L1)  | LEVEL(L1) | P(SNOOP, NONE),       /* 0x02: L1 hit */
>> +       OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE),      /* 0x03: L1 Miss Handling Buffer hit */
>> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, NONE),       /* 0x04: L2 Hit Clean */
>> +       0,                                                      /* 0x05: Reserved */
>> +       0,                                                      /* 0x06: Reserved */
>> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HIT),        /* 0x07: L2 Hit Snoop HIT */
>> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HITM),       /* 0x08: L2 Hit Snoop Hit Modified */
>> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),       /* 0x09: Prefetch Promotion */
>> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),       /* 0x0a: Cross Core Prefetch Promotion */
>> +       0,                                                      /* 0x0b: Reserved */
>> +       0,                                                      /* 0x0c: Reserved */
>> +       0,                                                      /* 0x0d: Reserved */
>> +       0,                                                      /* 0x0e: Reserved */
>> +       OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE),       /* 0x0f: uncached */
>> +};
>> +
>> +/* L2 miss */
>> +#define OMR_DATA_SOURCE_MAX            16
>> +static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = {
>> +       P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),    /* 0x00: invalid */
>> +       0,                                                      /* 0x01: Reserved */
>> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_SHARE),    /* 0x02: local CA shared cache */
>> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_NON_SHARE),/* 0x03: local CA non-shared cache */
>> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_IO),       /* 0x04: other CA IO agent */
>> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_SHARE),    /* 0x05: other CA shared cache */
>> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_NON_SHARE),/* 0x06: other CA non-shared cache */
>> +       OP_LH | LEVEL(RAM) | P(REGION, MMIO),                   /* 0x07: MMIO */
>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM0),                   /* 0x08: Memory region 0 */
>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM1),                   /* 0x09: Memory region 1 */
>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM2),                   /* 0x0a: Memory region 2 */
>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM3),                   /* 0x0b: Memory region 3 */
>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM4),                   /* 0x0c: Memory region 4 */
>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM5),                   /* 0x0d: Memory region 5 */
>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM6),                   /* 0x0e: Memory region 6 */
>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM7),                   /* 0x0f: Memory region 7 */
>> +};
>> +
>> +static u64 parse_omr_data_source(u8 dse)
>> +{
>> +       union omr_encoding omr;
>> +       u64 val = 0;
>> +
>> +       omr.omr_full = dse;
>> +       val = omr_data_source[omr.omr_source];
>> +       if (omr.omr_source > 0x1 && omr.omr_source < 0x7)
>> +               val |= omr.omr_remote ? P(LVL, REM_CCE1) : 0;
>> +       else if (omr.omr_source > 0x7)
>> +               val |= omr.omr_remote ? P(LVL, REM_RAM1) : P(LVL, LOC_RAM);
>> +
>> +       if (omr.omr_remote)
>> +               val |= REM;
>> +
>> +       val |= omr.omr_hitm ? P(SNOOP, HITM) : P(SNOOP, HIT);
>> +
>> +       if (omr.omr_source == 0x2) {
>> +               u8 snoop = omr.omr_snoop | omr.omr_promoted;
> Or-ing the values together should mean snoop is only ever 0 or 1.
>
>> +
>> +               if (snoop == 0x0)
>> +                       val |= P(SNOOP, NA);
>> +               else if (snoop == 0x1)
>> +                       val |= P(SNOOP, MISS);
>> +               else if (snoop == 0x2)
>> +                       val |= P(SNOOP, HIT);
>> +               else if (snoop == 0x3)
>> +                       val |= P(SNOOP, NONE);
> How can snoop equal 0x2 or 0x3 here? Should snoop be "(omr.omr_snoop
> << 1) | omr.omr_promoted" ?
>
> Thanks,
> Ian
>
>> +       } else if (omr.omr_source > 0x2 && omr.omr_source < 0x7) {
>> +               val |= omr.omr_snoop ? P(SNOOPX, FWD) : 0;
>> +       }
>> +
>> +       return val;
>> +}


* Re: [tip: perf/core] perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
  2026-03-10  3:32       ` Mi, Dapeng
@ 2026-03-10  4:38         ` Ian Rogers
  2026-03-10  5:04           ` Mi, Dapeng
  0 siblings, 1 reply; 21+ messages in thread
From: Ian Rogers @ 2026-03-10  4:38 UTC (permalink / raw)
  To: Mi, Dapeng; +Cc: linux-kernel, linux-tip-commits, Peter Zijlstra (Intel), x86

On Mon, Mar 9, 2026 at 8:32 PM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>
>
> On 3/10/2026 7:47 AM, Ian Rogers wrote:
> > On Thu, Jan 15, 2026 at 1:46 PM tip-bot2 for Dapeng Mi
> > <tip-bot2@linutronix.de> wrote:
> >> The following commit has been merged into the perf/core branch of tip:
> >>
> >> Commit-ID:     d2bdcde9626cbea0c44a6aaa33b440c8adf81e09
> >> Gitweb:        https://git.kernel.org/tip/d2bdcde9626cbea0c44a6aaa33b440c8adf81e09
> >> Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
> >> AuthorDate:    Wed, 14 Jan 2026 09:17:45 +08:00
> >> Committer:     Peter Zijlstra <peterz@infradead.org>
> >> CommitterDate: Thu, 15 Jan 2026 10:04:26 +01:00
> >>
> >> perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
> >>
> >> With the introduction of the OMR feature, the PEBS memory auxiliary info
> >> field for load and store latency events has been restructured for DMR.
> >>
> >> The memory auxiliary info field's bit[8] indicates whether an L2 cache
> >> miss occurred for a memory load or store instruction. If bit[8] is 0,
> >> it signifies no L2 cache miss, and bits[7:0] specify the exact cache data
> >> source (up to the L2 cache level). If bit[8] is 1, bits[7:0] represent
> >> the OMR encoding, indicating the specific L3 cache or memory region
> >> involved in the memory access.
> >>
> >> A significant enhancement for OMR encoding is the ability to provide
> >> up to 8 fine-grained memory regions in addition to the cache region,
> >> offering more detailed insights into memory access regions.
> >>
> >> For detailed information on the memory auxiliary info encoding, please
> >> refer to section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in
> >> the ISE documentation.
> >>
> >> This patch ensures that the PEBS memory auxiliary info field is correctly
> >> interpreted and utilized in DMR.
> >>
> >> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> >> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> >> Link: https://patch.msgid.link/20260114011750.350569-3-dapeng1.mi@linux.intel.com
> >> ---
> >>  arch/x86/events/intel/ds.c            | 140 +++++++++++++++++++++++++-
> >>  arch/x86/events/perf_event.h          |   2 +-
> >>  include/uapi/linux/perf_event.h       |  27 ++++-
> >>  tools/include/uapi/linux/perf_event.h |  27 ++++-
> >>  4 files changed, 190 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> >> index feb1c3c..272e652 100644
> >> --- a/arch/x86/events/intel/ds.c
> >> +++ b/arch/x86/events/intel/ds.c
> >> @@ -34,6 +34,17 @@ struct pebs_record_32 {
> >>
> >>   */
> >>
> >> +union omr_encoding {
> >> +       struct {
> >> +               u8 omr_source : 4;
> >> +               u8 omr_remote : 1;
> >> +               u8 omr_hitm : 1;
> >> +               u8 omr_snoop : 1;
> >> +               u8 omr_promoted : 1;
> > Hi Dapeng,
> >
> > omr_snoop and omr_promoted are 1 bit fields here.
>
> Yes. According to the OMR encoding layout in "Table 16-5. OMR Encoding for
> P-Core and E-Core Microarchitectures" of the ISE doc, bit[6] represents
> the snoop information and bit[7] represents promoted prefetch in most
> cases. Bit[7] and bit[6] are combined to represent the snoop information
> only when the omr_source field is 0x2, but that is an exception. So
> bit[6] is named omr_snoop and bit[7] is named omr_promoted here. Thanks.

Yep, there were more comments below.

> >
> >> +       };
> >> +       u8 omr_full;
> >> +};
> >> +
> >>  union intel_x86_pebs_dse {
> >>         u64 val;
> >>         struct {
> >> @@ -73,6 +84,18 @@ union intel_x86_pebs_dse {
> >>                 unsigned int lnc_addr_blk:1;
> >>                 unsigned int ld_reserved6:18;
> >>         };
> >> +       struct {
> >> +               unsigned int pnc_dse: 8;
> >> +               unsigned int pnc_l2_miss:1;
> >> +               unsigned int pnc_stlb_clean_hit:1;
> >> +               unsigned int pnc_stlb_any_hit:1;
> >> +               unsigned int pnc_stlb_miss:1;
> >> +               unsigned int pnc_locked:1;
> >> +               unsigned int pnc_data_blk:1;
> >> +               unsigned int pnc_addr_blk:1;
> >> +               unsigned int pnc_fb_full:1;
> >> +               unsigned int ld_reserved8:16;
> >> +       };
> >>  };
> >>
> >>
> >> @@ -228,6 +251,85 @@ void __init intel_pmu_pebs_data_source_lnl(void)
> >>         __intel_pmu_pebs_data_source_cmt(data_source);
> >>  }
> >>
> >> +/* Version for Panthercove and later */
> >> +
> >> +/* L2 hit */
> >> +#define PNC_PEBS_DATA_SOURCE_MAX       16
> >> +static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = {
> >> +       P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),    /* 0x00: non-cache access */
> >> +       OP_LH               | LEVEL(L0) | P(SNOOP, NONE),       /* 0x01: L0 hit */
> >> +       OP_LH | P(LVL, L1)  | LEVEL(L1) | P(SNOOP, NONE),       /* 0x02: L1 hit */
> >> +       OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE),      /* 0x03: L1 Miss Handling Buffer hit */
> >> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, NONE),       /* 0x04: L2 Hit Clean */
> >> +       0,                                                      /* 0x05: Reserved */
> >> +       0,                                                      /* 0x06: Reserved */
> >> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HIT),        /* 0x07: L2 Hit Snoop HIT */
> >> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HITM),       /* 0x08: L2 Hit Snoop Hit Modified */
> >> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),       /* 0x09: Prefetch Promotion */
> >> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),       /* 0x0a: Cross Core Prefetch Promotion */
> >> +       0,                                                      /* 0x0b: Reserved */
> >> +       0,                                                      /* 0x0c: Reserved */
> >> +       0,                                                      /* 0x0d: Reserved */
> >> +       0,                                                      /* 0x0e: Reserved */
> >> +       OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE),       /* 0x0f: uncached */
> >> +};
> >> +
> >> +/* L2 miss */
> >> +#define OMR_DATA_SOURCE_MAX            16
> >> +static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = {
> >> +       P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),    /* 0x00: invalid */
> >> +       0,                                                      /* 0x01: Reserved */
> >> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_SHARE),    /* 0x02: local CA shared cache */
> >> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_NON_SHARE),/* 0x03: local CA non-shared cache */
> >> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_IO),       /* 0x04: other CA IO agent */
> >> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_SHARE),    /* 0x05: other CA shared cache */
> >> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_NON_SHARE),/* 0x06: other CA non-shared cache */
> >> +       OP_LH | LEVEL(RAM) | P(REGION, MMIO),                   /* 0x07: MMIO */
> >> +       OP_LH | LEVEL(RAM) | P(REGION, MEM0),                   /* 0x08: Memory region 0 */
> >> +       OP_LH | LEVEL(RAM) | P(REGION, MEM1),                   /* 0x09: Memory region 1 */
> >> +       OP_LH | LEVEL(RAM) | P(REGION, MEM2),                   /* 0x0a: Memory region 2 */
> >> +       OP_LH | LEVEL(RAM) | P(REGION, MEM3),                   /* 0x0b: Memory region 3 */
> >> +       OP_LH | LEVEL(RAM) | P(REGION, MEM4),                   /* 0x0c: Memory region 4 */
> >> +       OP_LH | LEVEL(RAM) | P(REGION, MEM5),                   /* 0x0d: Memory region 5 */
> >> +       OP_LH | LEVEL(RAM) | P(REGION, MEM6),                   /* 0x0e: Memory region 6 */
> >> +       OP_LH | LEVEL(RAM) | P(REGION, MEM7),                   /* 0x0f: Memory region 7 */
> >> +};
> >> +
> >> +static u64 parse_omr_data_source(u8 dse)
> >> +{
> >> +       union omr_encoding omr;
> >> +       u64 val = 0;
> >> +
> >> +       omr.omr_full = dse;
> >> +       val = omr_data_source[omr.omr_source];
> >> +       if (omr.omr_source > 0x1 && omr.omr_source < 0x7)
> >> +               val |= omr.omr_remote ? P(LVL, REM_CCE1) : 0;
> >> +       else if (omr.omr_source > 0x7)
> >> +               val |= omr.omr_remote ? P(LVL, REM_RAM1) : P(LVL, LOC_RAM);
> >> +
> >> +       if (omr.omr_remote)
> >> +               val |= REM;
> >> +
> >> +       val |= omr.omr_hitm ? P(SNOOP, HITM) : P(SNOOP, HIT);
> >> +
> >> +       if (omr.omr_source == 0x2) {
> >> +               u8 snoop = omr.omr_snoop | omr.omr_promoted;
> > Or-ing the values together should mean snoop is only ever 0 or 1.

This comment about the OR only yielding 0 or 1.
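Since both are single-bit fields, even with both bits set the OR
collapses to a boolean:

        omr.omr_snoop = 1;
        omr.omr_promoted = 1;
        snoop = omr.omr_snoop | omr.omr_promoted;       /* == 1, never 3 */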

> >> +
> >> +               if (snoop == 0x0)
> >> +                       val |= P(SNOOP, NA);
> >> +               else if (snoop == 0x1)
> >> +                       val |= P(SNOOP, MISS);
> >> +               else if (snoop == 0x2)
> >> +                       val |= P(SNOOP, HIT);
> >> +               else if (snoop == 0x3)
> >> +                       val |= P(SNOOP, NONE);
> > How can snoop equal 0x2 or 0x3 here? Should snoop be "(omr.omr_snoop
> > << 1) | omr.omr_promoted" ?

And then this comment: the values 0x2 and 0x3 seem unreachable.
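If the two bits were meant to form a 2-bit code, presumably the intent
(just my guess, untested) was:

        u8 snoop = (omr.omr_snoop << 1) | omr.omr_promoted;

which would make all four arms of the if/else chain reachable.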

Thanks,
Ian

> >
> > Thanks,
> > Ian
> >
> >> +       } else if (omr.omr_source > 0x2 && omr.omr_source < 0x7) {
> >> +               val |= omr.omr_snoop ? P(SNOOPX, FWD) : 0;
> >> +       }
> >> +
> >> +       return val;
> >> +}
> >> +
> >>  static u64 precise_store_data(u64 status)
> >>  {
> >>         union intel_x86_pebs_dse dse;
> >> @@ -411,6 +513,44 @@ u64 arl_h_latency_data(struct perf_event *event, u64 status)
> >>         return lnl_latency_data(event, status);
> >>  }
> >>
> >> +u64 pnc_latency_data(struct perf_event *event, u64 status)
> >> +{
> >> +       union intel_x86_pebs_dse dse;
> >> +       union perf_mem_data_src src;
> >> +       u64 val;
> >> +
> >> +       dse.val = status;
> >> +
> >> +       if (!dse.pnc_l2_miss)
> >> +               val = pnc_pebs_l2_hit_data_source[dse.pnc_dse & 0xf];
> >> +       else
> >> +               val = parse_omr_data_source(dse.pnc_dse);
> >> +
> >> +       if (!val)
> >> +               val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA);
> >> +
> >> +       if (dse.pnc_stlb_miss)
> >> +               val |= P(TLB, MISS) | P(TLB, L2);
> >> +       else
> >> +               val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2);
> >> +
> >> +       if (dse.pnc_locked)
> >> +               val |= P(LOCK, LOCKED);
> >> +
> >> +       if (dse.pnc_data_blk)
> >> +               val |= P(BLK, DATA);
> >> +       if (dse.pnc_addr_blk)
> >> +               val |= P(BLK, ADDR);
> >> +       if (!dse.pnc_data_blk && !dse.pnc_addr_blk)
> >> +               val |= P(BLK, NA);
> >> +
> >> +       src.val = val;
> >> +       if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
> >> +               src.mem_op = P(OP, STORE);
> >> +
> >> +       return src.val;
> >> +}
> >> +
> >>  static u64 load_latency_data(struct perf_event *event, u64 status)
> >>  {
> >>         union intel_x86_pebs_dse dse;
> >> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> >> index 586e3fd..bd501c2 100644
> >> --- a/arch/x86/events/perf_event.h
> >> +++ b/arch/x86/events/perf_event.h
> >> @@ -1664,6 +1664,8 @@ u64 lnl_latency_data(struct perf_event *event, u64 status);
> >>
> >>  u64 arl_h_latency_data(struct perf_event *event, u64 status);
> >>
> >> +u64 pnc_latency_data(struct perf_event *event, u64 status);
> >> +
> >>  extern struct event_constraint intel_core2_pebs_event_constraints[];
> >>
> >>  extern struct event_constraint intel_atom_pebs_event_constraints[];
> >> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> >> index c44a8fb..533393e 100644
> >> --- a/include/uapi/linux/perf_event.h
> >> +++ b/include/uapi/linux/perf_event.h
> >> @@ -1330,14 +1330,16 @@ union perf_mem_data_src {
> >>                         mem_snoopx  :  2, /* Snoop mode, ext */
> >>                         mem_blk     :  3, /* Access blocked */
> >>                         mem_hops    :  3, /* Hop level */
> >> -                       mem_rsvd    : 18;
> >> +                       mem_region  :  5, /* cache/memory regions */
> >> +                       mem_rsvd    : 13;
> >>         };
> >>  };
> >>  #elif defined(__BIG_ENDIAN_BITFIELD)
> >>  union perf_mem_data_src {
> >>         __u64 val;
> >>         struct {
> >> -               __u64   mem_rsvd    : 18,
> >> +               __u64   mem_rsvd    : 13,
> >> +                       mem_region  :  5, /* cache/memory regions */
> >>                         mem_hops    :  3, /* Hop level */
> >>                         mem_blk     :  3, /* Access blocked */
> >>                         mem_snoopx  :  2, /* Snoop mode, ext */
> >> @@ -1394,7 +1396,7 @@ union perf_mem_data_src {
> >>  #define PERF_MEM_LVLNUM_L4                     0x0004 /* L4 */
> >>  #define PERF_MEM_LVLNUM_L2_MHB                 0x0005 /* L2 Miss Handling Buffer */
> >>  #define PERF_MEM_LVLNUM_MSC                    0x0006 /* Memory-side Cache */
> >> -/* 0x007 available */
> >> +#define PERF_MEM_LVLNUM_L0                     0x0007 /* L0 */
> >>  #define PERF_MEM_LVLNUM_UNC                    0x0008 /* Uncached */
> >>  #define PERF_MEM_LVLNUM_CXL                    0x0009 /* CXL */
> >>  #define PERF_MEM_LVLNUM_IO                     0x000a /* I/O */
> >> @@ -1447,6 +1449,25 @@ union perf_mem_data_src {
> >>  /* 5-7 available */
> >>  #define PERF_MEM_HOPS_SHIFT                    43
> >>
> >> +/* Cache/Memory region */
> >> +#define PERF_MEM_REGION_NA             0x0  /* Invalid */
> >> +#define PERF_MEM_REGION_RSVD           0x01 /* Reserved */
> >> +#define PERF_MEM_REGION_L_SHARE                0x02 /* Local CA shared cache */
> >> +#define PERF_MEM_REGION_L_NON_SHARE    0x03 /* Local CA non-shared cache */
> >> +#define PERF_MEM_REGION_O_IO           0x04 /* Other CA IO agent */
> >> +#define PERF_MEM_REGION_O_SHARE                0x05 /* Other CA shared cache */
> >> +#define PERF_MEM_REGION_O_NON_SHARE    0x06 /* Other CA non-shared cache */
> >> +#define PERF_MEM_REGION_MMIO           0x07 /* MMIO */
> >> +#define PERF_MEM_REGION_MEM0           0x08 /* Memory region 0 */
> >> +#define PERF_MEM_REGION_MEM1           0x09 /* Memory region 1 */
> >> +#define PERF_MEM_REGION_MEM2           0x0a /* Memory region 2 */
> >> +#define PERF_MEM_REGION_MEM3           0x0b /* Memory region 3 */
> >> +#define PERF_MEM_REGION_MEM4           0x0c /* Memory region 4 */
> >> +#define PERF_MEM_REGION_MEM5           0x0d /* Memory region 5 */
> >> +#define PERF_MEM_REGION_MEM6           0x0e /* Memory region 6 */
> >> +#define PERF_MEM_REGION_MEM7           0x0f /* Memory region 7 */
> >> +#define PERF_MEM_REGION_SHIFT          46
> >> +
> >>  #define PERF_MEM_S(a, s) \
> >>         (((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
> >>
> >> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
> >> index c44a8fb..d4b9961 100644
> >> --- a/tools/include/uapi/linux/perf_event.h
> >> +++ b/tools/include/uapi/linux/perf_event.h
> >> @@ -1330,14 +1330,16 @@ union perf_mem_data_src {
> >>                         mem_snoopx  :  2, /* Snoop mode, ext */
> >>                         mem_blk     :  3, /* Access blocked */
> >>                         mem_hops    :  3, /* Hop level */
> >> -                       mem_rsvd    : 18;
> >> +                       mem_region  :  5, /* cache/memory regions */
> >> +                       mem_rsvd    : 13;
> >>         };
> >>  };
> >>  #elif defined(__BIG_ENDIAN_BITFIELD)
> >>  union perf_mem_data_src {
> >>         __u64 val;
> >>         struct {
> >> -               __u64   mem_rsvd    : 18,
> >> +               __u64   mem_rsvd    : 13,
> >> +                       mem_region  :  5, /* cache/memory regions */
> >>                         mem_hops    :  3, /* Hop level */
> >>                         mem_blk     :  3, /* Access blocked */
> >>                         mem_snoopx  :  2, /* Snoop mode, ext */
> >> @@ -1394,7 +1396,7 @@ union perf_mem_data_src {
> >>  #define PERF_MEM_LVLNUM_L4                     0x0004 /* L4 */
> >>  #define PERF_MEM_LVLNUM_L2_MHB                 0x0005 /* L2 Miss Handling Buffer */
> >>  #define PERF_MEM_LVLNUM_MSC                    0x0006 /* Memory-side Cache */
> >> -/* 0x007 available */
> >> +#define PERF_MEM_LVLNUM_L0                     0x0007 /* L0 */
> >>  #define PERF_MEM_LVLNUM_UNC                    0x0008 /* Uncached */
> >>  #define PERF_MEM_LVLNUM_CXL                    0x0009 /* CXL */
> >>  #define PERF_MEM_LVLNUM_IO                     0x000a /* I/O */
> >> @@ -1447,6 +1449,25 @@ union perf_mem_data_src {
> >>  /* 5-7 available */
> >>  #define PERF_MEM_HOPS_SHIFT                    43
> >>
> >> +/* Cache/Memory region */
> >> +#define PERF_MEM_REGION_NA             0x0  /* Invalid */
> >> +#define PERF_MEM_REGION_RSVD           0x01 /* Reserved */
> >> +#define PERF_MEM_REGION_L_SHARE                0x02 /* Local CA shared cache */
> >> +#define PERF_MEM_REGION_L_NON_SHARE    0x03 /* Local CA non-shared cache */
> >> +#define PERF_MEM_REGION_O_IO           0x04 /* Other CA IO agent */
> >> +#define PERF_MEM_REGION_O_SHARE                0x05 /* Other CA shared cache */
> >> +#define PERF_MEM_REGION_O_NON_SHARE    0x06 /* Other CA non-shared cache */
> >> +#define PERF_MEM_REGION_MMIO           0x07 /* MMIO */
> >> +#define PERF_MEM_REGION_MEM0           0x08 /* Memory region 0 */
> >> +#define PERF_MEM_REGION_MEM1           0x09 /* Memory region 1 */
> >> +#define PERF_MEM_REGION_MEM2           0x0a /* Memory region 2 */
> >> +#define PERF_MEM_REGION_MEM3           0x0b /* Memory region 3 */
> >> +#define PERF_MEM_REGION_MEM4           0x0c /* Memory region 4 */
> >> +#define PERF_MEM_REGION_MEM5           0x0d /* Memory region 5 */
> >> +#define PERF_MEM_REGION_MEM6           0x0e /* Memory region 6 */
> >> +#define PERF_MEM_REGION_MEM7           0x0f /* Memory region 7 */
> >> +#define PERF_MEM_REGION_SHIFT          46
> >> +
> >>  #define PERF_MEM_S(a, s) \
> >>         (((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
> >>
> >>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [tip: perf/core] perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
  2026-03-10  4:38         ` Ian Rogers
@ 2026-03-10  5:04           ` Mi, Dapeng
  0 siblings, 0 replies; 21+ messages in thread
From: Mi, Dapeng @ 2026-03-10  5:04 UTC (permalink / raw)
  To: Ian Rogers; +Cc: linux-kernel, linux-tip-commits, Peter Zijlstra (Intel), x86


On 3/10/2026 12:38 PM, Ian Rogers wrote:
> On Mon, Mar 9, 2026 at 8:32 PM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>>
>> On 3/10/2026 7:47 AM, Ian Rogers wrote:
>>> On Thu, Jan 15, 2026 at 1:46 PM tip-bot2 for Dapeng Mi
>>> <tip-bot2@linutronix.de> wrote:
>>>> The following commit has been merged into the perf/core branch of tip:
>>>>
>>>> Commit-ID:     d2bdcde9626cbea0c44a6aaa33b440c8adf81e09
>>>> Gitweb:        https://git.kernel.org/tip/d2bdcde9626cbea0c44a6aaa33b440c8adf81e09
>>>> Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
>>>> AuthorDate:    Wed, 14 Jan 2026 09:17:45 +08:00
>>>> Committer:     Peter Zijlstra <peterz@infradead.org>
>>>> CommitterDate: Thu, 15 Jan 2026 10:04:26 +01:00
>>>>
>>>> perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
>>>>
>>>> With the introduction of the OMR feature, the PEBS memory auxiliary info
>>>> field for load and store latency events has been restructured for DMR.
>>>>
>>>> The memory auxiliary info field's bit[8] indicates whether an L2 cache
>>>> miss occurred for a memory load or store instruction. If bit[8] is 0,
>>>> it signifies no L2 cache miss, and bits[7:0] specify the exact cache data
>>>> source (up to the L2 cache level). If bit[8] is 1, bits[7:0] represent
>>>> the OMR encoding, indicating the specific L3 cache or memory region
>>>> involved in the memory access.
>>>>
>>>> A significant enhancement for OMR encoding is the ability to provide
>>>> up to 8 fine-grained memory regions in addition to the cache region,
>>>> offering more detailed insights into memory access regions.
>>>>
>>>> For detailed information on the memory auxiliary info encoding, please
>>>> refer to section 16.2 "PEBS LOAD LATENCY AND STORE LATENCY FACILITY" in
>>>> the ISE documentation.
>>>>
>>>> This patch ensures that the PEBS memory auxiliary info field is correctly
>>>> interpreted and utilized in DMR.
>>>>
>>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>>>> Link: https://patch.msgid.link/20260114011750.350569-3-dapeng1.mi@linux.intel.com
>>>> ---
>>>>  arch/x86/events/intel/ds.c            | 140 +++++++++++++++++++++++++-
>>>>  arch/x86/events/perf_event.h          |   2 +-
>>>>  include/uapi/linux/perf_event.h       |  27 ++++-
>>>>  tools/include/uapi/linux/perf_event.h |  27 ++++-
>>>>  4 files changed, 190 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>>>> index feb1c3c..272e652 100644
>>>> --- a/arch/x86/events/intel/ds.c
>>>> +++ b/arch/x86/events/intel/ds.c
>>>> @@ -34,6 +34,17 @@ struct pebs_record_32 {
>>>>
>>>>   */
>>>>
>>>> +union omr_encoding {
>>>> +       struct {
>>>> +               u8 omr_source : 4;
>>>> +               u8 omr_remote : 1;
>>>> +               u8 omr_hitm : 1;
>>>> +               u8 omr_snoop : 1;
>>>> +               u8 omr_promoted : 1;
>>> Hi Dapeng,
>>>
>>> omr_snoop and omr_promoted are 1 bit fields here.
>> Yes. According to the OMR encoding layout in "Table 16-5. OMR Encoding for
>> P-Core and E-Core Microarchitectures" of the ISE doc, bit[6] represents
>> the snoop information and bit[7] represents promoted prefetch in most
>> cases. Bit[7] and bit[6] are combined to represent the snoop information
>> only when the omr_source field is 0x2, but that is an exception. So
>> bit[6] is named omr_snoop and bit[7] is named omr_promoted here. Thanks.
> Yep, there were more comments below.
>
>>>> +       };
>>>> +       u8 omr_full;
>>>> +};
>>>> +
>>>>  union intel_x86_pebs_dse {
>>>>         u64 val;
>>>>         struct {
>>>> @@ -73,6 +84,18 @@ union intel_x86_pebs_dse {
>>>>                 unsigned int lnc_addr_blk:1;
>>>>                 unsigned int ld_reserved6:18;
>>>>         };
>>>> +       struct {
>>>> +               unsigned int pnc_dse: 8;
>>>> +               unsigned int pnc_l2_miss:1;
>>>> +               unsigned int pnc_stlb_clean_hit:1;
>>>> +               unsigned int pnc_stlb_any_hit:1;
>>>> +               unsigned int pnc_stlb_miss:1;
>>>> +               unsigned int pnc_locked:1;
>>>> +               unsigned int pnc_data_blk:1;
>>>> +               unsigned int pnc_addr_blk:1;
>>>> +               unsigned int pnc_fb_full:1;
>>>> +               unsigned int ld_reserved8:16;
>>>> +       };
>>>>  };
>>>>
>>>>
>>>> @@ -228,6 +251,85 @@ void __init intel_pmu_pebs_data_source_lnl(void)
>>>>         __intel_pmu_pebs_data_source_cmt(data_source);
>>>>  }
>>>>
>>>> +/* Version for Panthercove and later */
>>>> +
>>>> +/* L2 hit */
>>>> +#define PNC_PEBS_DATA_SOURCE_MAX       16
>>>> +static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = {
>>>> +       P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),    /* 0x00: non-cache access */
>>>> +       OP_LH               | LEVEL(L0) | P(SNOOP, NONE),       /* 0x01: L0 hit */
>>>> +       OP_LH | P(LVL, L1)  | LEVEL(L1) | P(SNOOP, NONE),       /* 0x02: L1 hit */
>>>> +       OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE),      /* 0x03: L1 Miss Handling Buffer hit */
>>>> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, NONE),       /* 0x04: L2 Hit Clean */
>>>> +       0,                                                      /* 0x05: Reserved */
>>>> +       0,                                                      /* 0x06: Reserved */
>>>> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HIT),        /* 0x07: L2 Hit Snoop HIT */
>>>> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, HITM),       /* 0x08: L2 Hit Snoop Hit Modified */
>>>> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),       /* 0x09: Prefetch Promotion */
>>>> +       OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, MISS),       /* 0x0a: Cross Core Prefetch Promotion */
>>>> +       0,                                                      /* 0x0b: Reserved */
>>>> +       0,                                                      /* 0x0c: Reserved */
>>>> +       0,                                                      /* 0x0d: Reserved */
>>>> +       0,                                                      /* 0x0e: Reserved */
>>>> +       OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE),       /* 0x0f: uncached */
>>>> +};
>>>> +
>>>> +/* L2 miss */
>>>> +#define OMR_DATA_SOURCE_MAX            16
>>>> +static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = {
>>>> +       P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA),    /* 0x00: invalid */
>>>> +       0,                                                      /* 0x01: Reserved */
>>>> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_SHARE),    /* 0x02: local CA shared cache */
>>>> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_NON_SHARE),/* 0x03: local CA non-shared cache */
>>>> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_IO),       /* 0x04: other CA IO agent */
>>>> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_SHARE),    /* 0x05: other CA shared cache */
>>>> +       OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_NON_SHARE),/* 0x06: other CA non-shared cache */
>>>> +       OP_LH | LEVEL(RAM) | P(REGION, MMIO),                   /* 0x07: MMIO */
>>>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM0),                   /* 0x08: Memory region 0 */
>>>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM1),                   /* 0x09: Memory region 1 */
>>>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM2),                   /* 0x0a: Memory region 2 */
>>>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM3),                   /* 0x0b: Memory region 3 */
>>>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM4),                   /* 0x0c: Memory region 4 */
>>>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM5),                   /* 0x0d: Memory region 5 */
>>>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM6),                   /* 0x0e: Memory region 6 */
>>>> +       OP_LH | LEVEL(RAM) | P(REGION, MEM7),                   /* 0x0f: Memory region 7 */
>>>> +};
>>>> +
>>>> +static u64 parse_omr_data_source(u8 dse)
>>>> +{
>>>> +       union omr_encoding omr;
>>>> +       u64 val = 0;
>>>> +
>>>> +       omr.omr_full = dse;
>>>> +       val = omr_data_source[omr.omr_source];
>>>> +       if (omr.omr_source > 0x1 && omr.omr_source < 0x7)
>>>> +               val |= omr.omr_remote ? P(LVL, REM_CCE1) : 0;
>>>> +       else if (omr.omr_source > 0x7)
>>>> +               val |= omr.omr_remote ? P(LVL, REM_RAM1) : P(LVL, LOC_RAM);
>>>> +
>>>> +       if (omr.omr_remote)
>>>> +               val |= REM;
>>>> +
>>>> +       val |= omr.omr_hitm ? P(SNOOP, HITM) : P(SNOOP, HIT);
>>>> +
>>>> +       if (omr.omr_source == 0x2) {
>>>> +               u8 snoop = omr.omr_snoop | omr.omr_promoted;
>>> Or-ing the values together should mean snoop is only ever 0 or 1.
> This comment about the OR only yielding 0 or 1.

Oh, yes, it's a bug. I'll submit a patch to fix it. Thanks a lot.
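
The fix would presumably be along these lines -- an untested sketch that
keeps the existing mapping of the four snoop values, to be double-checked
against Table 16-5 before posting:

        if (omr.omr_source == 0x2) {
                /* bits[7:6] form a single 2-bit snoop code in this case */
                u8 snoop = (omr.omr_snoop << 1) | omr.omr_promoted;

                if (snoop == 0x0)
                        val |= P(SNOOP, NA);
                else if (snoop == 0x1)
                        val |= P(SNOOP, MISS);
                else if (snoop == 0x2)
                        val |= P(SNOOP, HIT);
                else
                        val |= P(SNOOP, NONE);
        }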


>
>>>> +
>>>> +               if (snoop == 0x0)
>>>> +                       val |= P(SNOOP, NA);
>>>> +               else if (snoop == 0x1)
>>>> +                       val |= P(SNOOP, MISS);
>>>> +               else if (snoop == 0x2)
>>>> +                       val |= P(SNOOP, HIT);
>>>> +               else if (snoop == 0x3)
>>>> +                       val |= P(SNOOP, NONE);
>>> How can snoop equal 0x2 or 0x3 here? Should snoop be "(omr.omr_snoop
>>> << 1) | omr.omr_promoted" ?
> And then this comment: the values 0x2 and 0x3 seem unreachable.
>
> Thanks,
> Ian
>
>>> Thanks,
>>> Ian
>>>
>>>> +       } else if (omr.omr_source > 0x2 && omr.omr_source < 0x7) {
>>>> +               val |= omr.omr_snoop ? P(SNOOPX, FWD) : 0;
>>>> +       }
>>>> +
>>>> +       return val;
>>>> +}
>>>> +
>>>>  static u64 precise_store_data(u64 status)
>>>>  {
>>>>         union intel_x86_pebs_dse dse;
>>>> @@ -411,6 +513,44 @@ u64 arl_h_latency_data(struct perf_event *event, u64 status)
>>>>         return lnl_latency_data(event, status);
>>>>  }
>>>>
>>>> +u64 pnc_latency_data(struct perf_event *event, u64 status)
>>>> +{
>>>> +       union intel_x86_pebs_dse dse;
>>>> +       union perf_mem_data_src src;
>>>> +       u64 val;
>>>> +
>>>> +       dse.val = status;
>>>> +
>>>> +       if (!dse.pnc_l2_miss)
>>>> +               val = pnc_pebs_l2_hit_data_source[dse.pnc_dse & 0xf];
>>>> +       else
>>>> +               val = parse_omr_data_source(dse.pnc_dse);
>>>> +
>>>> +       if (!val)
>>>> +               val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA);
>>>> +
>>>> +       if (dse.pnc_stlb_miss)
>>>> +               val |= P(TLB, MISS) | P(TLB, L2);
>>>> +       else
>>>> +               val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2);
>>>> +
>>>> +       if (dse.pnc_locked)
>>>> +               val |= P(LOCK, LOCKED);
>>>> +
>>>> +       if (dse.pnc_data_blk)
>>>> +               val |= P(BLK, DATA);
>>>> +       if (dse.pnc_addr_blk)
>>>> +               val |= P(BLK, ADDR);
>>>> +       if (!dse.pnc_data_blk && !dse.pnc_addr_blk)
>>>> +               val |= P(BLK, NA);
>>>> +
>>>> +       src.val = val;
>>>> +       if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
>>>> +               src.mem_op = P(OP, STORE);
>>>> +
>>>> +       return src.val;
>>>> +}
>>>> +
>>>>  static u64 load_latency_data(struct perf_event *event, u64 status)
>>>>  {
>>>>         union intel_x86_pebs_dse dse;
>>>> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
>>>> index 586e3fd..bd501c2 100644
>>>> --- a/arch/x86/events/perf_event.h
>>>> +++ b/arch/x86/events/perf_event.h
>>>> @@ -1664,6 +1664,8 @@ u64 lnl_latency_data(struct perf_event *event, u64 status);
>>>>
>>>>  u64 arl_h_latency_data(struct perf_event *event, u64 status);
>>>>
>>>> +u64 pnc_latency_data(struct perf_event *event, u64 status);
>>>> +
>>>>  extern struct event_constraint intel_core2_pebs_event_constraints[];
>>>>
>>>>  extern struct event_constraint intel_atom_pebs_event_constraints[];
>>>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>>>> index c44a8fb..533393e 100644
>>>> --- a/include/uapi/linux/perf_event.h
>>>> +++ b/include/uapi/linux/perf_event.h
>>>> @@ -1330,14 +1330,16 @@ union perf_mem_data_src {
>>>>                         mem_snoopx  :  2, /* Snoop mode, ext */
>>>>                         mem_blk     :  3, /* Access blocked */
>>>>                         mem_hops    :  3, /* Hop level */
>>>> -                       mem_rsvd    : 18;
>>>> +                       mem_region  :  5, /* cache/memory regions */
>>>> +                       mem_rsvd    : 13;
>>>>         };
>>>>  };
>>>>  #elif defined(__BIG_ENDIAN_BITFIELD)
>>>>  union perf_mem_data_src {
>>>>         __u64 val;
>>>>         struct {
>>>> -               __u64   mem_rsvd    : 18,
>>>> +               __u64   mem_rsvd    : 13,
>>>> +                       mem_region  :  5, /* cache/memory regions */
>>>>                         mem_hops    :  3, /* Hop level */
>>>>                         mem_blk     :  3, /* Access blocked */
>>>>                         mem_snoopx  :  2, /* Snoop mode, ext */
>>>> @@ -1394,7 +1396,7 @@ union perf_mem_data_src {
>>>>  #define PERF_MEM_LVLNUM_L4                     0x0004 /* L4 */
>>>>  #define PERF_MEM_LVLNUM_L2_MHB                 0x0005 /* L2 Miss Handling Buffer */
>>>>  #define PERF_MEM_LVLNUM_MSC                    0x0006 /* Memory-side Cache */
>>>> -/* 0x007 available */
>>>> +#define PERF_MEM_LVLNUM_L0                     0x0007 /* L0 */
>>>>  #define PERF_MEM_LVLNUM_UNC                    0x0008 /* Uncached */
>>>>  #define PERF_MEM_LVLNUM_CXL                    0x0009 /* CXL */
>>>>  #define PERF_MEM_LVLNUM_IO                     0x000a /* I/O */
>>>> @@ -1447,6 +1449,25 @@ union perf_mem_data_src {
>>>>  /* 5-7 available */
>>>>  #define PERF_MEM_HOPS_SHIFT                    43
>>>>
>>>> +/* Cache/Memory region */
>>>> +#define PERF_MEM_REGION_NA             0x0  /* Invalid */
>>>> +#define PERF_MEM_REGION_RSVD           0x01 /* Reserved */
>>>> +#define PERF_MEM_REGION_L_SHARE                0x02 /* Local CA shared cache */
>>>> +#define PERF_MEM_REGION_L_NON_SHARE    0x03 /* Local CA non-shared cache */
>>>> +#define PERF_MEM_REGION_O_IO           0x04 /* Other CA IO agent */
>>>> +#define PERF_MEM_REGION_O_SHARE                0x05 /* Other CA shared cache */
>>>> +#define PERF_MEM_REGION_O_NON_SHARE    0x06 /* Other CA non-shared cache */
>>>> +#define PERF_MEM_REGION_MMIO           0x07 /* MMIO */
>>>> +#define PERF_MEM_REGION_MEM0           0x08 /* Memory region 0 */
>>>> +#define PERF_MEM_REGION_MEM1           0x09 /* Memory region 1 */
>>>> +#define PERF_MEM_REGION_MEM2           0x0a /* Memory region 2 */
>>>> +#define PERF_MEM_REGION_MEM3           0x0b /* Memory region 3 */
>>>> +#define PERF_MEM_REGION_MEM4           0x0c /* Memory region 4 */
>>>> +#define PERF_MEM_REGION_MEM5           0x0d /* Memory region 5 */
>>>> +#define PERF_MEM_REGION_MEM6           0x0e /* Memory region 6 */
>>>> +#define PERF_MEM_REGION_MEM7           0x0f /* Memory region 7 */
>>>> +#define PERF_MEM_REGION_SHIFT          46
>>>> +
>>>>  #define PERF_MEM_S(a, s) \
>>>>         (((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
>>>>
>>>> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
>>>> index c44a8fb..d4b9961 100644
>>>> --- a/tools/include/uapi/linux/perf_event.h
>>>> +++ b/tools/include/uapi/linux/perf_event.h
>>>> @@ -1330,14 +1330,16 @@ union perf_mem_data_src {
>>>>                         mem_snoopx  :  2, /* Snoop mode, ext */
>>>>                         mem_blk     :  3, /* Access blocked */
>>>>                         mem_hops    :  3, /* Hop level */
>>>> -                       mem_rsvd    : 18;
>>>> +                       mem_region  :  5, /* cache/memory regions */
>>>> +                       mem_rsvd    : 13;
>>>>         };
>>>>  };
>>>>  #elif defined(__BIG_ENDIAN_BITFIELD)
>>>>  union perf_mem_data_src {
>>>>         __u64 val;
>>>>         struct {
>>>> -               __u64   mem_rsvd    : 18,
>>>> +               __u64   mem_rsvd    : 13,
>>>> +                       mem_region  :  5, /* cache/memory regions */
>>>>                         mem_hops    :  3, /* Hop level */
>>>>                         mem_blk     :  3, /* Access blocked */
>>>>                         mem_snoopx  :  2, /* Snoop mode, ext */
>>>> @@ -1394,7 +1396,7 @@ union perf_mem_data_src {
>>>>  #define PERF_MEM_LVLNUM_L4                     0x0004 /* L4 */
>>>>  #define PERF_MEM_LVLNUM_L2_MHB                 0x0005 /* L2 Miss Handling Buffer */
>>>>  #define PERF_MEM_LVLNUM_MSC                    0x0006 /* Memory-side Cache */
>>>> -/* 0x007 available */
>>>> +#define PERF_MEM_LVLNUM_L0                     0x0007 /* L0 */
>>>>  #define PERF_MEM_LVLNUM_UNC                    0x0008 /* Uncached */
>>>>  #define PERF_MEM_LVLNUM_CXL                    0x0009 /* CXL */
>>>>  #define PERF_MEM_LVLNUM_IO                     0x000a /* I/O */
>>>> @@ -1447,6 +1449,25 @@ union perf_mem_data_src {
>>>>  /* 5-7 available */
>>>>  #define PERF_MEM_HOPS_SHIFT                    43
>>>>
>>>> +/* Cache/Memory region */
>>>> +#define PERF_MEM_REGION_NA             0x0  /* Invalid */
>>>> +#define PERF_MEM_REGION_RSVD           0x01 /* Reserved */
>>>> +#define PERF_MEM_REGION_L_SHARE                0x02 /* Local CA shared cache */
>>>> +#define PERF_MEM_REGION_L_NON_SHARE    0x03 /* Local CA non-shared cache */
>>>> +#define PERF_MEM_REGION_O_IO           0x04 /* Other CA IO agent */
>>>> +#define PERF_MEM_REGION_O_SHARE                0x05 /* Other CA shared cache */
>>>> +#define PERF_MEM_REGION_O_NON_SHARE    0x06 /* Other CA non-shared cache */
>>>> +#define PERF_MEM_REGION_MMIO           0x07 /* MMIO */
>>>> +#define PERF_MEM_REGION_MEM0           0x08 /* Memory region 0 */
>>>> +#define PERF_MEM_REGION_MEM1           0x09 /* Memory region 1 */
>>>> +#define PERF_MEM_REGION_MEM2           0x0a /* Memory region 2 */
>>>> +#define PERF_MEM_REGION_MEM3           0x0b /* Memory region 3 */
>>>> +#define PERF_MEM_REGION_MEM4           0x0c /* Memory region 4 */
>>>> +#define PERF_MEM_REGION_MEM5           0x0d /* Memory region 5 */
>>>> +#define PERF_MEM_REGION_MEM6           0x0e /* Memory region 6 */
>>>> +#define PERF_MEM_REGION_MEM7           0x0f /* Memory region 7 */
>>>> +#define PERF_MEM_REGION_SHIFT          46
>>>> +
>>>>  #define PERF_MEM_S(a, s) \
>>>>         (((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
>>>>
>>>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2026-03-10  5:04 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-14  1:17 [Patch v3 0/7] Enable core PMU for DMR and NVL Dapeng Mi
2026-01-14  1:17 ` [Patch v3 1/7] perf/x86/intel: Support the 4 new OMR MSRs introduced in " Dapeng Mi
2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2026-01-14  1:17 ` [Patch v3 2/7] perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR Dapeng Mi
2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2026-03-09 23:47     ` Ian Rogers
2026-03-10  3:32       ` Mi, Dapeng
2026-03-10  4:38         ` Ian Rogers
2026-03-10  5:04           ` Mi, Dapeng
2026-01-14  1:17 ` [Patch v3 3/7] perf/x86/intel: Add core PMU support for DMR Dapeng Mi
2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2026-01-14  1:17 ` [Patch v3 4/7] perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL Dapeng Mi
2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2026-01-14  1:17 ` [Patch v3 5/7] perf/x86/intel: Add core PMU support for Novalake Dapeng Mi
2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2026-01-14  1:17 ` [Patch v3 6/7] perf/x86: Use macros to replace magic numbers in attr_rdpmc Dapeng Mi
2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2026-03-09 23:56   ` [Patch v3 6/7] " Ian Rogers
2026-03-10  3:14     ` Mi, Dapeng
2026-01-14  1:17 ` [Patch v3 7/7] perf/x86/intel: Add support for rdpmc user disable feature Dapeng Mi
2026-01-15 21:44   ` [tip: perf/core] " tip-bot2 for Dapeng Mi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox