public inbox for linux-kernel@vger.kernel.org
* [Patch v9 00/12] arch-PEBS enabling for Intel platforms
@ 2025-10-29 10:21 Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 01/12] perf/x86: Remove redundant is_x86_event() prototype Dapeng Mi
                   ` (11 more replies)
  0 siblings, 12 replies; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

Changes:
v8 -> v9:
  * Move cpuc->events[] clearing from x86_pmu_stop() to x86_pmu_del() to
    fix the NULL event access issue caused by event throttling (Patch 02/12)
  * Add macro counter_mask() to simplify the counter mask generation
    code (Patch 05/12)
  * Replace the "inline" attribute with "__always_inline" to ensure the
    setup_fn callback won't be invoked via an indirect call (Patch 06/12)
  * Code style refinements (Patch 06/12)

v7 -> v8:
  * Fix the warning reported by Kernel test robot (Patch 02/12)
  * Rebase code to 6.18-rc1.

v6 -> v7:
  * Rebase code to the latest tip perf/core tree.
  * Opportunistically remove the redundant is_x86_event() prototype.
    (Patch 01/12)
  * Fix PEBS handler NULL event access and record loss issue.
    (Patch 02/12)
  * Reset MSR_IA32_PEBS_INDEX at the head of _drain_arch_pebs() instead
    of at the end. This avoids already-processed PEBS records being
    processed again in some corner cases like event throttling. (Patch 08/12)

v5 -> v6:
  * Rebase code to the latest tip perf/core tree + the "x86 perf bug
    fixes and optimization" patchset
 
v4 -> v5:
  * Rebase code to 6.16-rc3
  * Allocate/free arch-PEBS buffer in callbacks *prepare_cpu/*dead_cpu
    (patch 07/10, Peter)
  * Code and comments refine (patch 09/10, Peter)


This patchset introduces architectural PEBS support for Intel platforms
like Clearwater Forest (CWF) and Panther Lake (PTL). The detailed
information about arch-PEBS can be found in chapter 11
"architectural PEBS" of "Intel Architecture Instruction Set Extensions
and Future Features".

This patch set doesn't include SSP and SIMD regs (OPMASK/YMM/ZMM)
sampling support for arch-PEBS, to avoid a dependency on the basic
SIMD regs sampling support patch series[1]. Once basic SIMD regs
sampling is supported, arch-PEBS based SSP and SIMD regs
(OPMASK/YMM/ZMM) sampling will be added in a later patch set.

Tests:
  Ran the below tests on Clearwater Forest and Panther Lake; no issues
  were found.

  1. Basic perf counting case.
    perf stat -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}' sleep 1

  2. Basic PMI based perf sampling case.
    perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}' sleep 1

  3. Basic PEBS based perf sampling case.
    perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}:p' sleep 1

  4. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
    perf record -e branches:p -Iax,bx,ip,xmm0 -b -c 10000 sleep 1

  5. User space PEBS sampling case with basic, GPRs and LBR groups
    perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 sleep 1

  6. PEBS sampling case with auxiliary (memory info) group
    perf mem record sleep 1

  7. PEBS sampling case with counter group
    perf record -e '{branches:p,branches,cycles}:S' -c 10000 sleep 1

  8. Perf stat and record test
    perf test 100; perf test 131


History:
  v8: https://lore.kernel.org/all/20251015064422.47437-1-dapeng1.mi@linux.intel.com/
  v7: https://lore.kernel.org/all/20250828013435.1528459-1-dapeng1.mi@linux.intel.com/
  v6: https://lore.kernel.org/all/20250821035805.159494-1-dapeng1.mi@linux.intel.com/ 
  v5: https://lore.kernel.org/all/20250623223546.112465-1-dapeng1.mi@linux.intel.com/
  v4: https://lore.kernel.org/all/20250620103909.1586595-1-dapeng1.mi@linux.intel.com/
  v3: https://lore.kernel.org/all/20250415114428.341182-1-dapeng1.mi@linux.intel.com/
  v2: https://lore.kernel.org/all/20250218152818.158614-1-dapeng1.mi@linux.intel.com/
  v1: https://lore.kernel.org/all/20250123140721.2496639-1-dapeng1.mi@linux.intel.com/

Ref:
  [1]: https://lore.kernel.org/all/20250925061213.178796-1-dapeng1.mi@linux.intel.com/


Dapeng Mi (12):
  perf/x86: Remove redundant is_x86_event() prototype
  perf/x86: Fix NULL event access and potential PEBS record loss
  perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
  perf/x86/intel: Correct large PEBS flag check
  perf/x86/intel: Initialize architectural PEBS
  perf/x86/intel/ds: Factor out PEBS record processing code to functions
  perf/x86/intel/ds: Factor out PEBS group processing code to functions
  perf/x86/intel: Process arch-PEBS records or record fragments
  perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  perf/x86/intel: Update dyn_constraint based on PEBS event precise level
  perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  perf/x86/intel: Add counter group support for arch-PEBS

 arch/x86/events/core.c            |  26 +-
 arch/x86/events/intel/core.c      | 273 ++++++++++++--
 arch/x86/events/intel/ds.c        | 602 ++++++++++++++++++++++++------
 arch/x86/events/perf_event.h      |  41 +-
 arch/x86/include/asm/intel_ds.h   |  10 +-
 arch/x86/include/asm/msr-index.h  |  20 +
 arch/x86/include/asm/perf_event.h | 116 +++++-
 7 files changed, 943 insertions(+), 145 deletions(-)


base-commit: 45e1dccc0653c50e377dae57ef086a8d0f71061d
-- 
2.34.1


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Patch v9 01/12] perf/x86: Remove redundant is_x86_event() prototype
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 02/12] perf/x86: Fix NULL event access and potential PEBS record loss Dapeng Mi
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

Two is_x86_event() prototypes are defined in perf_event.h. Remove the
redundant one.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/perf_event.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 2b969386dcdd..285779c73479 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1124,7 +1124,6 @@ static struct perf_pmu_format_hybrid_attr format_attr_hybrid_##_name = {\
 	.pmu_type	= _pmu,						\
 }
 
-int is_x86_event(struct perf_event *event);
 struct pmu *x86_get_pmu(unsigned int cpu);
 extern struct x86_pmu x86_pmu __read_mostly;
 
-- 
2.34.1



* [Patch v9 02/12] perf/x86: Fix NULL event access and potential PEBS record loss
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 01/12] perf/x86: Remove redundant is_x86_event() prototype Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-06 14:19   ` Peter Zijlstra
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 03/12] perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call Dapeng Mi
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi, kernel test robot

When intel_pmu_drain_pebs_icl() is called to drain PEBS records,
perf_event_overflow() may be called to process the last PEBS record.

However, perf_event_overflow() can trigger the interrupt throttle and
stop all events of the group, as the below call chain shows.

perf_event_overflow()
  -> __perf_event_overflow()
    ->__perf_event_account_interrupt()
      -> perf_event_throttle_group()
        -> perf_event_throttle()
          -> event->pmu->stop()
            -> x86_pmu_stop()

The side effect of stopping the events is that all corresponding event
pointers in cpuc->events[] array are cleared to NULL.

Assume there are two PEBS events (event a and event b) in a group. When
intel_pmu_drain_pebs_icl() calls perf_event_overflow() to process the
last PEBS record of PEBS event a, interrupt throttle is triggered and
all pointers of event a and event b are cleared to NULL. Then
intel_pmu_drain_pebs_icl() tries to process the last PEBS record of
event b and encounters NULL pointer access.

To avoid this issue, move the cpuc->events[] clearing from
x86_pmu_stop() to x86_pmu_del(). This is safe since cpuc->active_mask
or cpuc->pebs_enabled is always checked before accessing the event
pointer in cpuc->events[].

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202507042103.a15d2923-lkp@intel.com
Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 745caa6c15a3..74479f9d6eed 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1344,6 +1344,7 @@ static void x86_pmu_enable(struct pmu *pmu)
 				hwc->state |= PERF_HES_ARCH;
 
 			x86_pmu_stop(event, PERF_EF_UPDATE);
+			cpuc->events[hwc->idx] = NULL;
 		}
 
 		/*
@@ -1365,6 +1366,7 @@ static void x86_pmu_enable(struct pmu *pmu)
 			 * if cpuc->enabled = 0, then no wrmsr as
 			 * per x86_pmu_enable_event()
 			 */
+			cpuc->events[hwc->idx] = event;
 			x86_pmu_start(event, PERF_EF_RELOAD);
 		}
 		cpuc->n_added = 0;
@@ -1531,7 +1533,6 @@ static void x86_pmu_start(struct perf_event *event, int flags)
 
 	event->hw.state = 0;
 
-	cpuc->events[idx] = event;
 	__set_bit(idx, cpuc->active_mask);
 	static_call(x86_pmu_enable)(event);
 	perf_event_update_userpage(event);
@@ -1610,7 +1611,6 @@ void x86_pmu_stop(struct perf_event *event, int flags)
 	if (test_bit(hwc->idx, cpuc->active_mask)) {
 		static_call(x86_pmu_disable)(event);
 		__clear_bit(hwc->idx, cpuc->active_mask);
-		cpuc->events[hwc->idx] = NULL;
 		WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
 		hwc->state |= PERF_HES_STOPPED;
 	}
@@ -1648,6 +1648,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
 	 * Not a TXN, therefore cleanup properly.
 	 */
 	x86_pmu_stop(event, PERF_EF_UPDATE);
+	cpuc->events[event->hw.idx] = NULL;
 
 	for (i = 0; i < cpuc->n_events; i++) {
 		if (event == cpuc->event_list[i])
-- 
2.34.1



* [Patch v9 03/12] perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 01/12] perf/x86: Remove redundant is_x86_event() prototype Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 02/12] perf/x86: Fix NULL event access and potential PEBS record loss Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 04/12] perf/x86/intel: Correct large PEBS flag check Dapeng Mi
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

Use the x86_pmu_drain_pebs static call instead of calling the
x86_pmu.drain_pebs function pointer directly.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 28f5468a6ea3..46a000eb0bb3 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3269,7 +3269,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 		 * The PEBS buffer has to be drained before handling the A-PMI
 		 */
 		if (is_pebs_counter_event_group(event))
-			x86_pmu.drain_pebs(regs, &data);
+			static_call(x86_pmu_drain_pebs)(regs, &data);
 
 		last_period = event->hw.last_period;
 
-- 
2.34.1



* [Patch v9 04/12] perf/x86/intel: Correct large PEBS flag check
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
                   ` (2 preceding siblings ...)
  2025-10-29 10:21 ` [Patch v9 03/12] perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 05/12] perf/x86/intel: Initialize architectural PEBS Dapeng Mi
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

The current large PEBS flag check only verifies whether
sample_regs_user contains unsupported GPRs, but doesn't check whether
sample_regs_intr contains unsupported GPRs.

Currently PEBS HW supports sampling all perf-supported GPRs, so the
missed check doesn't cause a real issue. But that won't hold once
subsequent patches add SSP register sampling: SSP sampling is not
supported by adaptive PEBS HW and only becomes available with arch-PEBS
HW. So correct this issue.

Fixes: a47ba4d77e12 ("perf/x86: Enable free running PEBS for REGS_USER/INTR")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 46a000eb0bb3..c88bcd5d2bc4 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4029,7 +4029,9 @@ static unsigned long intel_pmu_large_pebs_flags(struct perf_event *event)
 	if (!event->attr.exclude_kernel)
 		flags &= ~PERF_SAMPLE_REGS_USER;
 	if (event->attr.sample_regs_user & ~PEBS_GP_REGS)
-		flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR);
+		flags &= ~PERF_SAMPLE_REGS_USER;
+	if (event->attr.sample_regs_intr & ~PEBS_GP_REGS)
+		flags &= ~PERF_SAMPLE_REGS_INTR;
 	return flags;
 }
 
-- 
2.34.1



* [Patch v9 05/12] perf/x86/intel: Initialize architectural PEBS
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
                   ` (3 preceding siblings ...)
  2025-10-29 10:21 ` [Patch v9 04/12] perf/x86/intel: Correct large PEBS flag check Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2026-03-05  0:50   ` [Patch v9 05/12] " Ian Rogers
  2025-10-29 10:21 ` [Patch v9 06/12] perf/x86/intel/ds: Factor out PEBS record processing code to functions Dapeng Mi
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

arch-PEBS leverages the CPUID.23H.4/5 sub-leaves to enumerate the
supported arch-PEBS capabilities and counter bitmaps. Parse these two
sub-leaves and initialize the arch-PEBS capabilities and corresponding
structures.

Since the IA32_PEBS_ENABLE and MSR_PEBS_DATA_CFG MSRs no longer exist
for arch-PEBS, arch-PEBS doesn't need to manipulate these MSRs. Thus
add a simple pair of __intel_pmu_pebs_enable/disable() callbacks for
arch-PEBS.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/core.c            | 21 ++++++++---
 arch/x86/events/intel/core.c      | 60 ++++++++++++++++++++++---------
 arch/x86/events/intel/ds.c        | 52 ++++++++++++++++++++++-----
 arch/x86/events/perf_event.h      | 25 +++++++++++--
 arch/x86/include/asm/perf_event.h |  7 +++-
 5 files changed, 132 insertions(+), 33 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 74479f9d6eed..f2402ae3ffa0 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -554,14 +554,22 @@ static inline int precise_br_compat(struct perf_event *event)
 	return m == b;
 }
 
-int x86_pmu_max_precise(void)
+int x86_pmu_max_precise(struct pmu *pmu)
 {
 	int precise = 0;
 
-	/* Support for constant skid */
 	if (x86_pmu.pebs_active && !x86_pmu.pebs_broken) {
-		precise++;
+		/* arch PEBS */
+		if (x86_pmu.arch_pebs) {
+			precise = 2;
+			if (hybrid(pmu, arch_pebs_cap).pdists)
+				precise++;
+
+			return precise;
+		}
 
+		/* legacy PEBS - support for constant skid */
+		precise++;
 		/* Support for IP fixup */
 		if (x86_pmu.lbr_nr || x86_pmu.intel_cap.pebs_format >= 2)
 			precise++;
@@ -569,13 +577,14 @@ int x86_pmu_max_precise(void)
 		if (x86_pmu.pebs_prec_dist)
 			precise++;
 	}
+
 	return precise;
 }
 
 int x86_pmu_hw_config(struct perf_event *event)
 {
 	if (event->attr.precise_ip) {
-		int precise = x86_pmu_max_precise();
+		int precise = x86_pmu_max_precise(event->pmu);
 
 		if (event->attr.precise_ip > precise)
 			return -EOPNOTSUPP;
@@ -2630,7 +2639,9 @@ static ssize_t max_precise_show(struct device *cdev,
 				  struct device_attribute *attr,
 				  char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise());
+	struct pmu *pmu = dev_get_drvdata(cdev);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise(pmu));
 }
 
 static DEVICE_ATTR_RO(max_precise);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c88bcd5d2bc4..9ce27b326923 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5271,34 +5271,59 @@ static inline bool intel_pmu_broken_perf_cap(void)
 	return false;
 }
 
+#define counter_mask(_gp, _fixed) ((_gp) | ((u64)(_fixed) << INTEL_PMC_IDX_FIXED))
+
 static void update_pmu_cap(struct pmu *pmu)
 {
-	unsigned int cntr, fixed_cntr, ecx, edx;
-	union cpuid35_eax eax;
-	union cpuid35_ebx ebx;
+	unsigned int eax, ebx, ecx, edx;
+	union cpuid35_eax eax_0;
+	union cpuid35_ebx ebx_0;
+	u64 cntrs_mask = 0;
+	u64 pebs_mask = 0;
+	u64 pdists_mask = 0;
 
-	cpuid(ARCH_PERFMON_EXT_LEAF, &eax.full, &ebx.full, &ecx, &edx);
+	cpuid(ARCH_PERFMON_EXT_LEAF, &eax_0.full, &ebx_0.full, &ecx, &edx);
 
-	if (ebx.split.umask2)
+	if (ebx_0.split.umask2)
 		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
-	if (ebx.split.eq)
+	if (ebx_0.split.eq)
 		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;
 
-	if (eax.split.cntr_subleaf) {
+	if (eax_0.split.cntr_subleaf) {
 		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
-			    &cntr, &fixed_cntr, &ecx, &edx);
-		hybrid(pmu, cntr_mask64) = cntr;
-		hybrid(pmu, fixed_cntr_mask64) = fixed_cntr;
+			    &eax, &ebx, &ecx, &edx);
+		hybrid(pmu, cntr_mask64) = eax;
+		hybrid(pmu, fixed_cntr_mask64) = ebx;
+		cntrs_mask = counter_mask(eax, ebx);
 	}
 
-	if (eax.split.acr_subleaf) {
+	if (eax_0.split.acr_subleaf) {
 		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_ACR_LEAF,
-			    &cntr, &fixed_cntr, &ecx, &edx);
+			    &eax, &ebx, &ecx, &edx);
 		/* The mask of the counters which can be reloaded */
-		hybrid(pmu, acr_cntr_mask64) = cntr | ((u64)fixed_cntr << INTEL_PMC_IDX_FIXED);
-
+		hybrid(pmu, acr_cntr_mask64) = counter_mask(eax, ebx);
 		/* The mask of the counters which can cause a reload of reloadable counters */
-		hybrid(pmu, acr_cause_mask64) = ecx | ((u64)edx << INTEL_PMC_IDX_FIXED);
+		hybrid(pmu, acr_cause_mask64) = counter_mask(ecx, edx);
+	}
+
+	/* Bits[5:4] should be set simultaneously if arch-PEBS is supported */
+	if (eax_0.split.pebs_caps_subleaf && eax_0.split.pebs_cnts_subleaf) {
+		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_CAP_LEAF,
+			    &eax, &ebx, &ecx, &edx);
+		hybrid(pmu, arch_pebs_cap).caps = (u64)ebx << 32;
+
+		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_COUNTER_LEAF,
+			    &eax, &ebx, &ecx, &edx);
+		pebs_mask   = counter_mask(eax, ecx);
+		pdists_mask = counter_mask(ebx, edx);
+		hybrid(pmu, arch_pebs_cap).counters = pebs_mask;
+		hybrid(pmu, arch_pebs_cap).pdists = pdists_mask;
+
+		if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask))
+			x86_pmu.arch_pebs = 0;
+	} else {
+		WARN_ON(x86_pmu.arch_pebs == 1);
+		x86_pmu.arch_pebs = 0;
 	}
 
 	if (!intel_pmu_broken_perf_cap()) {
@@ -6252,7 +6277,7 @@ tsx_is_visible(struct kobject *kobj, struct attribute *attr, int i)
 static umode_t
 pebs_is_visible(struct kobject *kobj, struct attribute *attr, int i)
 {
-	return x86_pmu.ds_pebs ? attr->mode : 0;
+	return intel_pmu_has_pebs() ? attr->mode : 0;
 }
 
 static umode_t
@@ -7728,6 +7753,9 @@ __init int intel_pmu_init(void)
 	if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
 		update_pmu_cap(NULL);
 
+	if (x86_pmu.arch_pebs)
+		pr_cont("Architectural PEBS, ");
+
 	intel_pmu_check_counters_mask(&x86_pmu.cntr_mask64,
 				      &x86_pmu.fixed_cntr_mask64,
 				      &x86_pmu.intel_ctrl);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c0b7ac1c7594..26e485eca0a0 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1531,6 +1531,15 @@ static inline void intel_pmu_drain_large_pebs(struct cpu_hw_events *cpuc)
 		intel_pmu_drain_pebs_buffer();
 }
 
+static void __intel_pmu_pebs_enable(struct perf_event *event)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
+	cpuc->pebs_enabled |= 1ULL << hwc->idx;
+}
+
 void intel_pmu_pebs_enable(struct perf_event *event)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -1539,9 +1548,7 @@ void intel_pmu_pebs_enable(struct perf_event *event)
 	struct debug_store *ds = cpuc->ds;
 	unsigned int idx = hwc->idx;
 
-	hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
-
-	cpuc->pebs_enabled |= 1ULL << hwc->idx;
+	__intel_pmu_pebs_enable(event);
 
 	if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
 		cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
@@ -1603,14 +1610,22 @@ void intel_pmu_pebs_del(struct perf_event *event)
 	pebs_update_state(needed_cb, cpuc, event, false);
 }
 
-void intel_pmu_pebs_disable(struct perf_event *event)
+static void __intel_pmu_pebs_disable(struct perf_event *event)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	intel_pmu_drain_large_pebs(cpuc);
-
 	cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
+	hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
+}
+
+void intel_pmu_pebs_disable(struct perf_event *event)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	__intel_pmu_pebs_disable(event);
 
 	if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) &&
 	    (x86_pmu.version < 5))
@@ -1622,8 +1637,6 @@ void intel_pmu_pebs_disable(struct perf_event *event)
 
 	if (cpuc->enabled)
 		wrmsrq(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
-
-	hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
 }
 
 void intel_pmu_pebs_enable_all(void)
@@ -2669,11 +2682,26 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 	}
 }
 
+static void __init intel_arch_pebs_init(void)
+{
+	/*
+	 * Current hybrid platforms always both support arch-PEBS or not
+	 * on all kinds of cores. So directly set x86_pmu.arch_pebs flag
+	 * if boot cpu supports arch-PEBS.
+	 */
+	x86_pmu.arch_pebs = 1;
+	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
+	x86_pmu.pebs_capable = ~0ULL;
+
+	x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
+	x86_pmu.pebs_disable = __intel_pmu_pebs_disable;
+}
+
 /*
  * PEBS probe and setup
  */
 
-void __init intel_pebs_init(void)
+static void __init intel_ds_pebs_init(void)
 {
 	/*
 	 * No support for 32bit formats
@@ -2788,6 +2816,14 @@ void __init intel_pebs_init(void)
 	}
 }
 
+void __init intel_pebs_init(void)
+{
+	if (x86_pmu.intel_cap.pebs_format == 0xf)
+		intel_arch_pebs_init();
+	else
+		intel_ds_pebs_init();
+}
+
 void perf_restore_debug_store(void)
 {
 	struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 285779c73479..ca5289980b52 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -708,6 +708,12 @@ enum hybrid_pmu_type {
 	hybrid_big_small_tiny	= hybrid_big   | hybrid_small_tiny,
 };
 
+struct arch_pebs_cap {
+	u64 caps;
+	u64 counters;
+	u64 pdists;
+};
+
 struct x86_hybrid_pmu {
 	struct pmu			pmu;
 	const char			*name;
@@ -752,6 +758,8 @@ struct x86_hybrid_pmu {
 					mid_ack		:1,
 					enabled_ack	:1;
 
+	struct arch_pebs_cap		arch_pebs_cap;
+
 	u64				pebs_data_source[PERF_PEBS_DATA_SOURCE_MAX];
 };
 
@@ -906,7 +914,7 @@ struct x86_pmu {
 	union perf_capabilities intel_cap;
 
 	/*
-	 * Intel DebugStore bits
+	 * Intel DebugStore and PEBS bits
 	 */
 	unsigned int	bts			:1,
 			bts_active		:1,
@@ -917,7 +925,8 @@ struct x86_pmu {
 			pebs_no_tlb		:1,
 			pebs_no_isolation	:1,
 			pebs_block		:1,
-			pebs_ept		:1;
+			pebs_ept		:1,
+			arch_pebs		:1;
 	int		pebs_record_size;
 	int		pebs_buffer_size;
 	u64		pebs_events_mask;
@@ -929,6 +938,11 @@ struct x86_pmu {
 	u64		rtm_abort_event;
 	u64		pebs_capable;
 
+	/*
+	 * Intel Architectural PEBS
+	 */
+	struct arch_pebs_cap arch_pebs_cap;
+
 	/*
 	 * Intel LBR
 	 */
@@ -1216,7 +1230,7 @@ int x86_reserve_hardware(void);
 
 void x86_release_hardware(void);
 
-int x86_pmu_max_precise(void);
+int x86_pmu_max_precise(struct pmu *pmu);
 
 void hw_perf_lbr_event_destroy(struct perf_event *event);
 
@@ -1791,6 +1805,11 @@ static inline int intel_pmu_max_num_pebs(struct pmu *pmu)
 	return fls((u32)hybrid(pmu, pebs_events_mask));
 }
 
+static inline bool intel_pmu_has_pebs(void)
+{
+	return x86_pmu.ds_pebs || x86_pmu.arch_pebs;
+}
+
 #else /* CONFIG_CPU_SUP_INTEL */
 
 static inline void reserve_ds_buffers(void)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 49a4d442f3fc..0dfa06722bab 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -200,6 +200,8 @@ union cpuid10_edx {
 #define ARCH_PERFMON_EXT_LEAF			0x00000023
 #define ARCH_PERFMON_NUM_COUNTER_LEAF		0x1
 #define ARCH_PERFMON_ACR_LEAF			0x2
+#define ARCH_PERFMON_PEBS_CAP_LEAF		0x4
+#define ARCH_PERFMON_PEBS_COUNTER_LEAF		0x5
 
 union cpuid35_eax {
 	struct {
@@ -210,7 +212,10 @@ union cpuid35_eax {
 		unsigned int    acr_subleaf:1;
 		/* Events Sub-Leaf */
 		unsigned int    events_subleaf:1;
-		unsigned int	reserved:28;
+		/* arch-PEBS Sub-Leaves */
+		unsigned int	pebs_caps_subleaf:1;
+		unsigned int	pebs_cnts_subleaf:1;
+		unsigned int	reserved:26;
 	} split;
 	unsigned int            full;
 };
-- 
2.34.1



* [Patch v9 06/12] perf/x86/intel/ds: Factor out PEBS record processing code to functions
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
                   ` (4 preceding siblings ...)
  2025-10-29 10:21 ` [Patch v9 05/12] perf/x86/intel: Initialize architectural PEBS Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 07/12] perf/x86/intel/ds: Factor out PEBS group " Dapeng Mi
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi, Kan Liang

Besides some PEBS record layout differences, arch-PEBS can share most
of the PEBS record processing code with adaptive PEBS. Thus, factor
out this common processing code into independent inline functions so
it can be reused by the subsequent arch-PEBS handler.

Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/ds.c | 83 ++++++++++++++++++++++++++------------
 1 file changed, 58 insertions(+), 25 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 26e485eca0a0..c8aa72db86d9 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2614,6 +2614,57 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 	}
 }
 
+static __always_inline void
+__intel_pmu_handle_pebs_record(struct pt_regs *iregs,
+			       struct pt_regs *regs,
+			       struct perf_sample_data *data,
+			       void *at, u64 pebs_status,
+			       short *counts, void **last,
+			       setup_fn setup_sample)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct perf_event *event;
+	int bit;
+
+	for_each_set_bit(bit, (unsigned long *)&pebs_status, X86_PMC_IDX_MAX) {
+		event = cpuc->events[bit];
+
+		if (WARN_ON_ONCE(!event) ||
+		    WARN_ON_ONCE(!event->attr.precise_ip))
+			continue;
+
+		if (counts[bit]++) {
+			__intel_pmu_pebs_event(event, iregs, regs, data,
+					       last[bit], setup_sample);
+		}
+
+		last[bit] = at;
+	}
+}
+
+static __always_inline void
+__intel_pmu_handle_last_pebs_record(struct pt_regs *iregs,
+				    struct pt_regs *regs,
+				    struct perf_sample_data *data,
+				    u64 mask, short *counts, void **last,
+				    setup_fn setup_sample)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct perf_event *event;
+	int bit;
+
+	for_each_set_bit(bit, (unsigned long *)&mask, X86_PMC_IDX_MAX) {
+		if (!counts[bit])
+			continue;
+
+		event = cpuc->events[bit];
+
+		__intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
+					    counts[bit], setup_sample);
+	}
+
+}
+
 static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_data *data)
 {
 	short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
@@ -2623,9 +2674,7 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 	struct x86_perf_regs perf_regs;
 	struct pt_regs *regs = &perf_regs.regs;
 	struct pebs_basic *basic;
-	struct perf_event *event;
 	void *base, *at, *top;
-	int bit;
 	u64 mask;
 
 	if (!x86_pmu.pebs_active)
@@ -2638,6 +2687,7 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 
 	mask = hybrid(cpuc->pmu, pebs_events_mask) |
 	       (hybrid(cpuc->pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED);
+	mask &= cpuc->pebs_enabled;
 
 	if (unlikely(base >= top)) {
 		intel_pmu_pebs_event_update_no_drain(cpuc, mask);
@@ -2655,31 +2705,14 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 		if (basic->format_size != cpuc->pebs_record_size)
 			continue;
 
-		pebs_status = basic->applicable_counters & cpuc->pebs_enabled & mask;
-		for_each_set_bit(bit, (unsigned long *)&pebs_status, X86_PMC_IDX_MAX) {
-			event = cpuc->events[bit];
-
-			if (WARN_ON_ONCE(!event) ||
-			    WARN_ON_ONCE(!event->attr.precise_ip))
-				continue;
-
-			if (counts[bit]++) {
-				__intel_pmu_pebs_event(event, iregs, regs, data, last[bit],
-						       setup_pebs_adaptive_sample_data);
-			}
-			last[bit] = at;
-		}
+		pebs_status = mask & basic->applicable_counters;
+		__intel_pmu_handle_pebs_record(iregs, regs, data, at,
+					       pebs_status, counts, last,
+					       setup_pebs_adaptive_sample_data);
 	}
 
-	for_each_set_bit(bit, (unsigned long *)&mask, X86_PMC_IDX_MAX) {
-		if (!counts[bit])
-			continue;
-
-		event = cpuc->events[bit];
-
-		__intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
-					    counts[bit], setup_pebs_adaptive_sample_data);
-	}
+	__intel_pmu_handle_last_pebs_record(iregs, regs, data, mask, counts, last,
+					    setup_pebs_adaptive_sample_data);
 }
 
 static void __init intel_arch_pebs_init(void)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Patch v9 07/12] perf/x86/intel/ds: Factor out PEBS group processing code to functions
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
                   ` (5 preceding siblings ...)
  2025-10-29 10:21 ` [Patch v9 06/12] perf/x86/intel/ds: Factor out PEBS record processing code to functions Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 08/12] perf/x86/intel: Process arch-PEBS records or record fragments Dapeng Mi
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

Adaptive PEBS and arch-PEBS share a lot of the same code to process
PEBS groups such as the basic, GPR and meminfo groups. Extract this
shared code into generic helper functions to avoid duplication.
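
One of the factored-out pieces, the meminfo latency selection, can be
sketched in plain C; the flag value and helper name below are
illustrative stand-ins, not the kernel's real definitions:

```c
#include <assert.h>
#include <stdint.h>

#define PMU_FL_INSTR_LATENCY	0x1	/* illustrative flag value only */

/* Sketch of the latency choice hoisted into the caller of
 * __setup_pebs_meminfo_group(): PMUs with the instruction-latency
 * capability report cache latency in meminfo, older ones report
 * memory latency. */
static uint64_t pick_latency(uint64_t flags, uint64_t cache_lat,
			     uint64_t mem_lat)
{
	return (flags & PMU_FL_INSTR_LATENCY) ? cache_lat : mem_lat;
}
```

With the selection done by the caller, the shared helper itself stays
identical for adaptive PEBS and arch-PEBS.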

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/ds.c | 170 +++++++++++++++++++++++--------------
 1 file changed, 104 insertions(+), 66 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c8aa72db86d9..68664526443f 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2072,6 +2072,90 @@ static inline void __setup_pebs_counter_group(struct cpu_hw_events *cpuc,
 
 #define PEBS_LATENCY_MASK			0xffff
 
+static inline void __setup_perf_sample_data(struct perf_event *event,
+					    struct pt_regs *iregs,
+					    struct perf_sample_data *data)
+{
+	perf_sample_data_init(data, 0, event->hw.last_period);
+
+	/*
+	 * We must however always use iregs for the unwinder to stay sane; the
+	 * record BP,SP,IP can point into thin air when the record is from a
+	 * previous PMI context or an (I)RET happened between the record and
+	 * PMI.
+	 */
+	perf_sample_save_callchain(data, event, iregs);
+}
+
+static inline void __setup_pebs_basic_group(struct perf_event *event,
+					    struct pt_regs *regs,
+					    struct perf_sample_data *data,
+					    u64 sample_type, u64 ip,
+					    u64 tsc, u16 retire)
+{
+	/* The ip in basic is EventingIP */
+	set_linear_ip(regs, ip);
+	regs->flags = PERF_EFLAGS_EXACT;
+	setup_pebs_time(event, data, tsc);
+
+	if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT)
+		data->weight.var3_w = retire;
+}
+
+static inline void __setup_pebs_gpr_group(struct perf_event *event,
+					  struct pt_regs *regs,
+					  struct pebs_gprs *gprs,
+					  u64 sample_type)
+{
+	if (event->attr.precise_ip < 2) {
+		set_linear_ip(regs, gprs->ip);
+		regs->flags &= ~PERF_EFLAGS_EXACT;
+	}
+
+	if (sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER))
+		adaptive_pebs_save_regs(regs, gprs);
+}
+
+static inline void __setup_pebs_meminfo_group(struct perf_event *event,
+					      struct perf_sample_data *data,
+					      u64 sample_type, u64 latency,
+					      u16 instr_latency, u64 address,
+					      u64 aux, u64 tsx_tuning, u64 ax)
+{
+	if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
+		u64 tsx_latency = intel_get_tsx_weight(tsx_tuning);
+
+		data->weight.var2_w = instr_latency;
+
+		/*
+		 * Although meminfo::latency is defined as a u64,
+		 * only the lower 32 bits include the valid data
+		 * in practice on Ice Lake and earlier platforms.
+		 */
+		if (sample_type & PERF_SAMPLE_WEIGHT)
+			data->weight.full = latency ?: tsx_latency;
+		else
+			data->weight.var1_dw = (u32)latency ?: tsx_latency;
+
+		data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+	}
+
+	if (sample_type & PERF_SAMPLE_DATA_SRC) {
+		data->data_src.val = get_data_src(event, aux);
+		data->sample_flags |= PERF_SAMPLE_DATA_SRC;
+	}
+
+	if (sample_type & PERF_SAMPLE_ADDR_TYPE) {
+		data->addr = address;
+		data->sample_flags |= PERF_SAMPLE_ADDR;
+	}
+
+	if (sample_type & PERF_SAMPLE_TRANSACTION) {
+		data->txn = intel_get_tsx_transaction(tsx_tuning, ax);
+		data->sample_flags |= PERF_SAMPLE_TRANSACTION;
+	}
+}
+
 /*
  * With adaptive PEBS the layout depends on what fields are configured.
  */
@@ -2081,12 +2165,14 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 					    struct pt_regs *regs)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	u64 sample_type = event->attr.sample_type;
 	struct pebs_basic *basic = __pebs;
 	void *next_record = basic + 1;
-	u64 sample_type, format_group;
 	struct pebs_meminfo *meminfo = NULL;
 	struct pebs_gprs *gprs = NULL;
 	struct x86_perf_regs *perf_regs;
+	u64 format_group;
+	u16 retire;
 
 	if (basic == NULL)
 		return;
@@ -2094,31 +2180,17 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 	perf_regs = container_of(regs, struct x86_perf_regs, regs);
 	perf_regs->xmm_regs = NULL;
 
-	sample_type = event->attr.sample_type;
 	format_group = basic->format_group;
-	perf_sample_data_init(data, 0, event->hw.last_period);
 
-	setup_pebs_time(event, data, basic->tsc);
-
-	/*
-	 * We must however always use iregs for the unwinder to stay sane; the
-	 * record BP,SP,IP can point into thin air when the record is from a
-	 * previous PMI context or an (I)RET happened between the record and
-	 * PMI.
-	 */
-	perf_sample_save_callchain(data, event, iregs);
+	__setup_perf_sample_data(event, iregs, data);
 
 	*regs = *iregs;
-	/* The ip in basic is EventingIP */
-	set_linear_ip(regs, basic->ip);
-	regs->flags = PERF_EFLAGS_EXACT;
 
-	if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) {
-		if (x86_pmu.flags & PMU_FL_RETIRE_LATENCY)
-			data->weight.var3_w = basic->retire_latency;
-		else
-			data->weight.var3_w = 0;
-	}
+	/* basic group */
+	retire = x86_pmu.flags & PMU_FL_RETIRE_LATENCY ?
+			basic->retire_latency : 0;
+	__setup_pebs_basic_group(event, regs, data, sample_type,
+				 basic->ip, basic->tsc, retire);
 
 	/*
 	 * The record for MEMINFO is in front of GP
@@ -2134,54 +2206,20 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 		gprs = next_record;
 		next_record = gprs + 1;
 
-		if (event->attr.precise_ip < 2) {
-			set_linear_ip(regs, gprs->ip);
-			regs->flags &= ~PERF_EFLAGS_EXACT;
-		}
-
-		if (sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER))
-			adaptive_pebs_save_regs(regs, gprs);
+		__setup_pebs_gpr_group(event, regs, gprs, sample_type);
 	}
 
 	if (format_group & PEBS_DATACFG_MEMINFO) {
-		if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
-			u64 latency = x86_pmu.flags & PMU_FL_INSTR_LATENCY ?
-					meminfo->cache_latency : meminfo->mem_latency;
-
-			if (x86_pmu.flags & PMU_FL_INSTR_LATENCY)
-				data->weight.var2_w = meminfo->instr_latency;
-
-			/*
-			 * Although meminfo::latency is defined as a u64,
-			 * only the lower 32 bits include the valid data
-			 * in practice on Ice Lake and earlier platforms.
-			 */
-			if (sample_type & PERF_SAMPLE_WEIGHT) {
-				data->weight.full = latency ?:
-					intel_get_tsx_weight(meminfo->tsx_tuning);
-			} else {
-				data->weight.var1_dw = (u32)latency ?:
-					intel_get_tsx_weight(meminfo->tsx_tuning);
-			}
-
-			data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
-		}
-
-		if (sample_type & PERF_SAMPLE_DATA_SRC) {
-			data->data_src.val = get_data_src(event, meminfo->aux);
-			data->sample_flags |= PERF_SAMPLE_DATA_SRC;
-		}
-
-		if (sample_type & PERF_SAMPLE_ADDR_TYPE) {
-			data->addr = meminfo->address;
-			data->sample_flags |= PERF_SAMPLE_ADDR;
-		}
-
-		if (sample_type & PERF_SAMPLE_TRANSACTION) {
-			data->txn = intel_get_tsx_transaction(meminfo->tsx_tuning,
-							  gprs ? gprs->ax : 0);
-			data->sample_flags |= PERF_SAMPLE_TRANSACTION;
-		}
+		u64 latency = x86_pmu.flags & PMU_FL_INSTR_LATENCY ?
+				meminfo->cache_latency : meminfo->mem_latency;
+		u64 instr_latency = x86_pmu.flags & PMU_FL_INSTR_LATENCY ?
+				meminfo->instr_latency : 0;
+		u64 ax = gprs ? gprs->ax : 0;
+
+		__setup_pebs_meminfo_group(event, data, sample_type, latency,
+					   instr_latency, meminfo->address,
+					   meminfo->aux, meminfo->tsx_tuning,
+					   ax);
 	}
 
 	if (format_group & PEBS_DATACFG_XMMS) {
-- 
2.34.1


* [Patch v9 08/12] perf/x86/intel: Process arch-PEBS records or record fragments
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
                   ` (6 preceding siblings ...)
  2025-10-29 10:21 ` [Patch v9 07/12] perf/x86/intel/ds: Factor out PEBS group " Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2026-03-03  0:20   ` [Patch v9 08/12] " Chun-Tse Shao
  2025-10-29 10:21 ` [Patch v9 09/12] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR Dapeng Mi
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

A significant difference from adaptive PEBS is that an arch-PEBS
record supports fragments: a single record can be split into several
independent fragments, each of which carries its own arch-PEBS header.

This patch defines the architectural PEBS record layout structures and
adds helpers to process arch-PEBS records and fragments. Only the
legacy PEBS groups, i.e. the basic, GPR, XMM and LBR groups, are
supported by this patch; capturing the newly added YMM/ZMM/OPMASK
vector registers will be supported in the future.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c      |  13 +++
 arch/x86/events/intel/ds.c        | 184 ++++++++++++++++++++++++++++++
 arch/x86/include/asm/msr-index.h  |   6 +
 arch/x86/include/asm/perf_event.h |  96 ++++++++++++++++
 4 files changed, 299 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 9ce27b326923..de4dbde28adc 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3215,6 +3215,19 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 			status &= ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT;
 	}
 
+	/*
+	 * Arch PEBS sets bit 54 in the global status register
+	 */
+	if (__test_and_clear_bit(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT,
+				 (unsigned long *)&status)) {
+		handled++;
+		static_call(x86_pmu_drain_pebs)(regs, &data);
+
+		if (cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS] &&
+		    is_pebs_counter_event_group(cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS]))
+			status &= ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT;
+	}
+
 	/*
 	 * Intel PT
 	 */
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 68664526443f..fe1bf373409e 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2270,6 +2270,117 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 			format_group);
 }
 
+static inline bool arch_pebs_record_continued(struct arch_pebs_header *header)
+{
+	/* Continue bit or null PEBS record indicates fragment follows. */
+	return header->cont || !(header->format & GENMASK_ULL(63, 16));
+}
+
+static void setup_arch_pebs_sample_data(struct perf_event *event,
+					struct pt_regs *iregs,
+					void *__pebs,
+					struct perf_sample_data *data,
+					struct pt_regs *regs)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	u64 sample_type = event->attr.sample_type;
+	struct arch_pebs_header *header = NULL;
+	struct arch_pebs_aux *meminfo = NULL;
+	struct arch_pebs_gprs *gprs = NULL;
+	struct x86_perf_regs *perf_regs;
+	void *next_record;
+	void *at = __pebs;
+
+	if (at == NULL)
+		return;
+
+	perf_regs = container_of(regs, struct x86_perf_regs, regs);
+	perf_regs->xmm_regs = NULL;
+
+	__setup_perf_sample_data(event, iregs, data);
+
+	*regs = *iregs;
+
+again:
+	header = at;
+	next_record = at + sizeof(struct arch_pebs_header);
+	if (header->basic) {
+		struct arch_pebs_basic *basic = next_record;
+		u16 retire = 0;
+
+		next_record = basic + 1;
+
+		if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT)
+			retire = basic->valid ? basic->retire : 0;
+		__setup_pebs_basic_group(event, regs, data, sample_type,
+				 basic->ip, basic->tsc, retire);
+	}
+
+	/*
+	 * The record for MEMINFO is in front of GP
+	 * But PERF_SAMPLE_TRANSACTION needs gprs->ax.
+	 * Save the pointer here but process later.
+	 */
+	if (header->aux) {
+		meminfo = next_record;
+		next_record = meminfo + 1;
+	}
+
+	if (header->gpr) {
+		gprs = next_record;
+		next_record = gprs + 1;
+
+		__setup_pebs_gpr_group(event, regs,
+				       (struct pebs_gprs *)gprs,
+				       sample_type);
+	}
+
+	if (header->aux) {
+		u64 ax = gprs ? gprs->ax : 0;
+
+		__setup_pebs_meminfo_group(event, data, sample_type,
+					   meminfo->cache_latency,
+					   meminfo->instr_latency,
+					   meminfo->address, meminfo->aux,
+					   meminfo->tsx_tuning, ax);
+	}
+
+	if (header->xmm) {
+		struct pebs_xmm *xmm;
+
+		next_record += sizeof(struct arch_pebs_xer_header);
+
+		xmm = next_record;
+		perf_regs->xmm_regs = xmm->xmm;
+		next_record = xmm + 1;
+	}
+
+	if (header->lbr) {
+		struct arch_pebs_lbr_header *lbr_header = next_record;
+		struct lbr_entry *lbr;
+		int num_lbr;
+
+		next_record = lbr_header + 1;
+		lbr = next_record;
+
+		num_lbr = header->lbr == ARCH_PEBS_LBR_NUM_VAR ?
+				lbr_header->depth :
+				header->lbr * ARCH_PEBS_BASE_LBR_ENTRIES;
+		next_record += num_lbr * sizeof(struct lbr_entry);
+
+		if (has_branch_stack(event)) {
+			intel_pmu_store_pebs_lbrs(lbr);
+			intel_pmu_lbr_save_brstack(data, cpuc, event);
+		}
+	}
+
+	/* Parse the following fragments, if any. */
+	if (arch_pebs_record_continued(header)) {
+		at = at + header->size;
+		goto again;
+	}
+}
+
 static inline void *
 get_next_pebs_record_by_bit(void *base, void *top, int bit)
 {
@@ -2753,6 +2864,78 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 					    setup_pebs_adaptive_sample_data);
 }
 
+static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
+				      struct perf_sample_data *data)
+{
+	short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
+	void *last[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS];
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	union arch_pebs_index index;
+	struct x86_perf_regs perf_regs;
+	struct pt_regs *regs = &perf_regs.regs;
+	void *base, *at, *top;
+	u64 mask;
+
+	rdmsrq(MSR_IA32_PEBS_INDEX, index.whole);
+
+	if (unlikely(!index.wr)) {
+		intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX);
+		return;
+	}
+
+	base = cpuc->ds_pebs_vaddr;
+	top = (void *)((u64)cpuc->ds_pebs_vaddr +
+		       (index.wr << ARCH_PEBS_INDEX_WR_SHIFT));
+
+	index.wr = 0;
+	index.full = 0;
+	wrmsrq(MSR_IA32_PEBS_INDEX, index.whole);
+
+	mask = hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled;
+
+	if (!iregs)
+		iregs = &dummy_iregs;
+
+	/* Process all but the last event for each counter. */
+	for (at = base; at < top;) {
+		struct arch_pebs_header *header;
+		struct arch_pebs_basic *basic;
+		u64 pebs_status;
+
+		header = at;
+
+		if (WARN_ON_ONCE(!header->size))
+			break;
+
+		/* 1st fragment or single record must have basic group */
+		if (!header->basic) {
+			at += header->size;
+			continue;
+		}
+
+		basic = at + sizeof(struct arch_pebs_header);
+		pebs_status = mask & basic->applicable_counters;
+		__intel_pmu_handle_pebs_record(iregs, regs, data, at,
+					       pebs_status, counts, last,
+					       setup_arch_pebs_sample_data);
+
+		/* Skip non-last fragments */
+		while (arch_pebs_record_continued(header)) {
+			if (!header->size)
+				break;
+			at += header->size;
+			header = at;
+		}
+
+		/* Skip last fragment or the single record */
+		at += header->size;
+	}
+
+	__intel_pmu_handle_last_pebs_record(iregs, regs, data, mask,
+					    counts, last,
+					    setup_arch_pebs_sample_data);
+}
+
 static void __init intel_arch_pebs_init(void)
 {
 	/*
@@ -2762,6 +2945,7 @@ static void __init intel_arch_pebs_init(void)
 	 */
 	x86_pmu.arch_pebs = 1;
 	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
+	x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
 	x86_pmu.pebs_capable = ~0ULL;
 
 	x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 9e1720d73244..fc7a4e7c718d 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -327,6 +327,12 @@
 					 PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
 					 PERF_CAP_PEBS_TIMING_INFO)
 
+/* Arch PEBS */
+#define MSR_IA32_PEBS_BASE		0x000003f4
+#define MSR_IA32_PEBS_INDEX		0x000003f5
+#define ARCH_PEBS_OFFSET_MASK		0x7fffff
+#define ARCH_PEBS_INDEX_WR_SHIFT	4
+
 #define MSR_IA32_RTIT_CTL		0x00000570
 #define RTIT_CTL_TRACEEN		BIT(0)
 #define RTIT_CTL_CYCLEACC		BIT(1)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 0dfa06722bab..3b3848f0d339 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -437,6 +437,8 @@ static inline bool is_topdown_idx(int idx)
 #define GLOBAL_STATUS_LBRS_FROZEN		BIT_ULL(GLOBAL_STATUS_LBRS_FROZEN_BIT)
 #define GLOBAL_STATUS_TRACE_TOPAPMI_BIT		55
 #define GLOBAL_STATUS_TRACE_TOPAPMI		BIT_ULL(GLOBAL_STATUS_TRACE_TOPAPMI_BIT)
+#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT	54
+#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD	BIT_ULL(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT)
 #define GLOBAL_STATUS_PERF_METRICS_OVF_BIT	48
 
 #define GLOBAL_CTRL_EN_PERF_METRICS		BIT_ULL(48)
@@ -507,6 +509,100 @@ struct pebs_cntr_header {
 
 #define INTEL_CNTR_METRICS		0x3
 
+/*
+ * Arch PEBS
+ */
+union arch_pebs_index {
+	struct {
+		u64 rsvd:4,
+		    wr:23,
+		    rsvd2:4,
+		    full:1,
+		    en:1,
+		    rsvd3:3,
+		    thresh:23,
+		    rsvd4:5;
+	};
+	u64 whole;
+};
+
+struct arch_pebs_header {
+	union {
+		u64 format;
+		struct {
+			u64 size:16,	/* Record size */
+			    rsvd:14,
+			    mode:1,	/* 64BIT_MODE */
+			    cont:1,
+			    rsvd2:3,
+			    cntr:5,
+			    lbr:2,
+			    rsvd3:7,
+			    xmm:1,
+			    ymmh:1,
+			    rsvd4:2,
+			    opmask:1,
+			    zmmh:1,
+			    h16zmm:1,
+			    rsvd5:5,
+			    gpr:1,
+			    aux:1,
+			    basic:1;
+		};
+	};
+	u64 rsvd6;
+};
+
+struct arch_pebs_basic {
+	u64 ip;
+	u64 applicable_counters;
+	u64 tsc;
+	u64 retire	:16,	/* Retire Latency */
+	    valid	:1,
+	    rsvd	:47;
+	u64 rsvd2;
+	u64 rsvd3;
+};
+
+struct arch_pebs_aux {
+	u64 address;
+	u64 rsvd;
+	u64 rsvd2;
+	u64 rsvd3;
+	u64 rsvd4;
+	u64 aux;
+	u64 instr_latency	:16,
+	    pad2		:16,
+	    cache_latency	:16,
+	    pad3		:16;
+	u64 tsx_tuning;
+};
+
+struct arch_pebs_gprs {
+	u64 flags, ip, ax, cx, dx, bx, sp, bp, si, di;
+	u64 r8, r9, r10, r11, r12, r13, r14, r15, ssp;
+	u64 rsvd;
+};
+
+struct arch_pebs_xer_header {
+	u64 xstate;
+	u64 rsvd;
+};
+
+#define ARCH_PEBS_LBR_NAN		0x0
+#define ARCH_PEBS_LBR_NUM_8		0x1
+#define ARCH_PEBS_LBR_NUM_16		0x2
+#define ARCH_PEBS_LBR_NUM_VAR		0x3
+#define ARCH_PEBS_BASE_LBR_ENTRIES	8
+struct arch_pebs_lbr_header {
+	u64 rsvd;
+	u64 ctl;
+	u64 depth;
+	u64 ler_from;
+	u64 ler_to;
+	u64 ler_info;
+};
+
 /*
  * AMD Extended Performance Monitoring and Debug cpuid feature detection
  */
-- 
2.34.1


* [Patch v9 09/12] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
                   ` (7 preceding siblings ...)
  2025-10-29 10:21 ` [Patch v9 08/12] perf/x86/intel: Process arch-PEBS records or record fragments Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
                     ` (2 more replies)
  2025-10-29 10:21 ` [Patch v9 10/12] perf/x86/intel: Update dyn_constraint base on PEBS event precise level Dapeng Mi
                   ` (2 subsequent siblings)
  11 siblings, 3 replies; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi, Kan Liang

Arch-PEBS introduces a new MSR, IA32_PEBS_BASE, to store the physical
address of the arch-PEBS buffer. This patch allocates the arch-PEBS
buffer and then initializes the IA32_PEBS_BASE MSR with the buffer's
physical address.
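
The value written to the MSR combines the page-aligned physical
address with the size exponent, per the comment block above the wrmsr
in init_arch_pebs_on_cpu() (Buffer Size = 4 KiB * 2^SIZE). A minimal
sketch, with the helper names being my own:

```c
#include <assert.h>
#include <stdint.h>

#define PEBS_BUFFER_SHIFT	4	/* 4 KiB << 4 = 64 KiB buffer */

/* Sketch of the IA32_PEBS_BASE encoding: the buffer's 4 KiB-aligned
 * physical address OR'ed with the size exponent in the low bits. */
static uint64_t encode_pebs_base(uint64_t buf_phys)
{
	return buf_phys | PEBS_BUFFER_SHIFT;
}

/* Split into the (lo, hi) halves that wrmsr_on_cpu() takes. */
static void split_msr(uint64_t v, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)v;
	*hi = (uint32_t)(v >> 32);
}
```

Because the address is page aligned, its low 12 bits are free to carry
the size exponent without any loss of information.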

Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c    | 11 ++++-
 arch/x86/events/intel/ds.c      | 82 ++++++++++++++++++++++++++++-----
 arch/x86/events/perf_event.h    | 11 ++++-
 arch/x86/include/asm/intel_ds.h |  3 +-
 4 files changed, 92 insertions(+), 15 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index de4dbde28adc..6e04d73dfae5 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5227,7 +5227,13 @@ int intel_cpuc_prepare(struct cpu_hw_events *cpuc, int cpu)
 
 static int intel_pmu_cpu_prepare(int cpu)
 {
-	return intel_cpuc_prepare(&per_cpu(cpu_hw_events, cpu), cpu);
+	int ret;
+
+	ret = intel_cpuc_prepare(&per_cpu(cpu_hw_events, cpu), cpu);
+	if (ret)
+		return ret;
+
+	return alloc_arch_pebs_buf_on_cpu(cpu);
 }
 
 static void flip_smm_bit(void *data)
@@ -5458,6 +5464,7 @@ static void intel_pmu_cpu_starting(int cpu)
 		return;
 
 	init_debug_store_on_cpu(cpu);
+	init_arch_pebs_on_cpu(cpu);
 	/*
 	 * Deal with CPUs that don't clear their LBRs on power-up, and that may
 	 * even boot with LBRs enabled.
@@ -5555,6 +5562,7 @@ static void free_excl_cntrs(struct cpu_hw_events *cpuc)
 static void intel_pmu_cpu_dying(int cpu)
 {
 	fini_debug_store_on_cpu(cpu);
+	fini_arch_pebs_on_cpu(cpu);
 }
 
 void intel_cpuc_finish(struct cpu_hw_events *cpuc)
@@ -5575,6 +5583,7 @@ static void intel_pmu_cpu_dead(int cpu)
 {
 	struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
 
+	release_arch_pebs_buf_on_cpu(cpu);
 	intel_cpuc_finish(cpuc);
 
 	if (is_hybrid() && cpuc->pmu)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index fe1bf373409e..5c26a5235f94 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -625,13 +625,18 @@ static int alloc_pebs_buffer(int cpu)
 	int max, node = cpu_to_node(cpu);
 	void *buffer, *insn_buff, *cea;
 
-	if (!x86_pmu.ds_pebs)
+	if (!intel_pmu_has_pebs())
 		return 0;
 
 	buffer = dsalloc_pages(bsiz, GFP_KERNEL, cpu);
 	if (unlikely(!buffer))
 		return -ENOMEM;
 
+	if (x86_pmu.arch_pebs) {
+		hwev->pebs_vaddr = buffer;
+		return 0;
+	}
+
 	/*
 	 * HSW+ already provides us the eventing ip; no need to allocate this
 	 * buffer then.
@@ -644,7 +649,7 @@ static int alloc_pebs_buffer(int cpu)
 		}
 		per_cpu(insn_buffer, cpu) = insn_buff;
 	}
-	hwev->ds_pebs_vaddr = buffer;
+	hwev->pebs_vaddr = buffer;
 	/* Update the cpu entry area mapping */
 	cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer;
 	ds->pebs_buffer_base = (unsigned long) cea;
@@ -660,17 +665,20 @@ static void release_pebs_buffer(int cpu)
 	struct cpu_hw_events *hwev = per_cpu_ptr(&cpu_hw_events, cpu);
 	void *cea;
 
-	if (!x86_pmu.ds_pebs)
+	if (!intel_pmu_has_pebs())
 		return;
 
-	kfree(per_cpu(insn_buffer, cpu));
-	per_cpu(insn_buffer, cpu) = NULL;
+	if (x86_pmu.ds_pebs) {
+		kfree(per_cpu(insn_buffer, cpu));
+		per_cpu(insn_buffer, cpu) = NULL;
 
-	/* Clear the fixmap */
-	cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer;
-	ds_clear_cea(cea, x86_pmu.pebs_buffer_size);
-	dsfree_pages(hwev->ds_pebs_vaddr, x86_pmu.pebs_buffer_size);
-	hwev->ds_pebs_vaddr = NULL;
+		/* Clear the fixmap */
+		cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer;
+		ds_clear_cea(cea, x86_pmu.pebs_buffer_size);
+	}
+
+	dsfree_pages(hwev->pebs_vaddr, x86_pmu.pebs_buffer_size);
+	hwev->pebs_vaddr = NULL;
 }
 
 static int alloc_bts_buffer(int cpu)
@@ -823,6 +831,56 @@ void reserve_ds_buffers(void)
 	}
 }
 
+inline int alloc_arch_pebs_buf_on_cpu(int cpu)
+{
+	if (!x86_pmu.arch_pebs)
+		return 0;
+
+	return alloc_pebs_buffer(cpu);
+}
+
+inline void release_arch_pebs_buf_on_cpu(int cpu)
+{
+	if (!x86_pmu.arch_pebs)
+		return;
+
+	release_pebs_buffer(cpu);
+}
+
+void init_arch_pebs_on_cpu(int cpu)
+{
+	struct cpu_hw_events *cpuc = per_cpu_ptr(&cpu_hw_events, cpu);
+	u64 arch_pebs_base;
+
+	if (!x86_pmu.arch_pebs)
+		return;
+
+	if (!cpuc->pebs_vaddr) {
+		WARN(1, "Failed to allocate PEBS buffer on CPU %d\n", cpu);
+		x86_pmu.pebs_active = 0;
+		return;
+	}
+
+	/*
+	 * 4KB-aligned pointer of the output buffer
+	 * (__alloc_pages_node() return page aligned address)
+	 * Buffer Size = 4KB * 2^SIZE
+	 * contiguous physical buffer (__alloc_pages_node() with order)
+	 */
+	arch_pebs_base = virt_to_phys(cpuc->pebs_vaddr) | PEBS_BUFFER_SHIFT;
+	wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, (u32)arch_pebs_base,
+		     (u32)(arch_pebs_base >> 32));
+	x86_pmu.pebs_active = 1;
+}
+
+inline void fini_arch_pebs_on_cpu(int cpu)
+{
+	if (!x86_pmu.arch_pebs)
+		return;
+
+	wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, 0, 0);
+}
+
 /*
  * BTS
  */
@@ -2883,8 +2941,8 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
 		return;
 	}
 
-	base = cpuc->ds_pebs_vaddr;
-	top = (void *)((u64)cpuc->ds_pebs_vaddr +
+	base = cpuc->pebs_vaddr;
+	top = (void *)((u64)cpuc->pebs_vaddr +
 		       (index.wr << ARCH_PEBS_INDEX_WR_SHIFT));
 
 	index.wr = 0;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index ca5289980b52..13f411bca6bc 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -283,8 +283,9 @@ struct cpu_hw_events {
 	 * Intel DebugStore bits
 	 */
 	struct debug_store	*ds;
-	void			*ds_pebs_vaddr;
 	void			*ds_bts_vaddr;
+	/* DS based PEBS or arch-PEBS buffer address */
+	void			*pebs_vaddr;
 	u64			pebs_enabled;
 	int			n_pebs;
 	int			n_large_pebs;
@@ -1617,6 +1618,14 @@ extern void intel_cpuc_finish(struct cpu_hw_events *cpuc);
 
 int intel_pmu_init(void);
 
+int alloc_arch_pebs_buf_on_cpu(int cpu);
+
+void release_arch_pebs_buf_on_cpu(int cpu);
+
+void init_arch_pebs_on_cpu(int cpu);
+
+void fini_arch_pebs_on_cpu(int cpu);
+
 void init_debug_store_on_cpu(int cpu);
 
 void fini_debug_store_on_cpu(int cpu);
diff --git a/arch/x86/include/asm/intel_ds.h b/arch/x86/include/asm/intel_ds.h
index 5dbeac48a5b9..023c2883f9f3 100644
--- a/arch/x86/include/asm/intel_ds.h
+++ b/arch/x86/include/asm/intel_ds.h
@@ -4,7 +4,8 @@
 #include <linux/percpu-defs.h>
 
 #define BTS_BUFFER_SIZE		(PAGE_SIZE << 4)
-#define PEBS_BUFFER_SIZE	(PAGE_SIZE << 4)
+#define PEBS_BUFFER_SHIFT	4
+#define PEBS_BUFFER_SIZE	(PAGE_SIZE << PEBS_BUFFER_SHIFT)
 
 /* The maximal number of PEBS events: */
 #define MAX_PEBS_EVENTS_FMT4	8
-- 
2.34.1


* [Patch v9 10/12] perf/x86/intel: Update dyn_constraint base on PEBS event precise level
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
                   ` (8 preceding siblings ...)
  2025-10-29 10:21 ` [Patch v9 09/12] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-06 14:52   ` Peter Zijlstra
  2025-11-11 11:37   ` [tip: perf/core] perf/x86/intel: Update dyn_constraint " tip-bot2 for Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 11/12] perf/x86/intel: Setup PEBS data configuration and enable legacy groups Dapeng Mi
  2025-10-29 10:21 ` [Patch v9 12/12] perf/x86/intel: Add counter group support for arch-PEBS Dapeng Mi
  11 siblings, 2 replies; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

arch-PEBS provides CPUID leaves to enumerate which counters support
PEBS sampling and precise-distribution PEBS sampling. Thus PEBS
constraints should be configured dynamically based on these counter
and precise-distribution bitmaps instead of being defined statically.

Update the event's dyn_constraint based on the PEBS event's precise
level.
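
The mask selection can be sketched as follows; this is a hedged
illustration of the intel_pmu_hw_config() hunk in this patch, with the
helper name and flat argument list being my own simplification:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the dyn_constraint update: precise_ip >= 3 needs a
 * precise-distribution (pdist) capable counter, otherwise any
 * PEBS-capable counter may be used. The constraint is only narrowed
 * when the PEBS bitmap differs from the full counter mask. */
static uint64_t narrow_constraint(uint64_t dyn_constraint,
				  uint64_t cntr_mask,
				  unsigned int precise_ip,
				  uint64_t pebs_counters,
				  uint64_t pebs_pdists)
{
	uint64_t pebs_mask = precise_ip >= 3 ? pebs_pdists : pebs_counters;

	if (cntr_mask != pebs_mask)
		dyn_constraint &= pebs_mask;
	return dyn_constraint;
}
```

When every counter is PEBS capable the masks match and the event keeps
its full scheduling freedom; otherwise the event is pinned to the
enumerated subset.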

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c | 11 +++++++++++
 arch/x86/events/intel/ds.c   |  1 +
 2 files changed, 12 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 6e04d73dfae5..40ccfd80d554 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4252,6 +4252,8 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	}
 
 	if (event->attr.precise_ip) {
+		struct arch_pebs_cap pebs_cap = hybrid(event->pmu, arch_pebs_cap);
+
 		if ((event->attr.config & INTEL_ARCH_EVENT_MASK) == INTEL_FIXED_VLBR_EVENT)
 			return -EINVAL;
 
@@ -4265,6 +4267,15 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		}
 		if (x86_pmu.pebs_aliases)
 			x86_pmu.pebs_aliases(event);
+
+		if (x86_pmu.arch_pebs) {
+			u64 cntr_mask = hybrid(event->pmu, intel_ctrl) &
+						~GLOBAL_CTRL_EN_PERF_METRICS;
+			u64 pebs_mask = event->attr.precise_ip >= 3 ?
+						pebs_cap.pdists : pebs_cap.counters;
+			if (cntr_mask != pebs_mask)
+				event->hw.dyn_constraint &= pebs_mask;
+		}
 	}
 
 	if (needs_branch_stack(event)) {
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 5c26a5235f94..1179980f795b 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -3005,6 +3005,7 @@ static void __init intel_arch_pebs_init(void)
 	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
 	x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
 	x86_pmu.pebs_capable = ~0ULL;
+	x86_pmu.flags |= PMU_FL_PEBS_ALL;
 
 	x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
 	x86_pmu.pebs_disable = __intel_pmu_pebs_disable;
-- 
2.34.1


* [Patch v9 11/12] perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
                   ` (9 preceding siblings ...)
  2025-10-29 10:21 ` [Patch v9 10/12] perf/x86/intel: Update dyn_constraint base on PEBS event precise level Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2026-03-05  1:20   ` [Patch v9 11/12] " Ian Rogers
  2025-10-29 10:21 ` [Patch v9 12/12] perf/x86/intel: Add counter group support for arch-PEBS Dapeng Mi
  11 siblings, 2 replies; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi, Kan Liang

Unlike legacy PEBS, arch-PEBS supports per-counter PEBS data
configuration by programming the IA32_PMC_GPx/FXx_CFG_C MSRs.

Obtain the PEBS data configuration from the event attributes, then
write it to the IA32_PMC_GPx/FXx_CFG_C MSRs and enable the
corresponding PEBS groups.

Please note this patch only enables XMM SIMD register sampling for
arch-PEBS; sampling of the other SIMD registers (OPMASK/YMM/ZMM) will
be supported once PMI-based OPMASK/YMM/ZMM register sampling is
supported.

Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c     | 136 ++++++++++++++++++++++++++++++-
 arch/x86/events/intel/ds.c       |  17 ++++
 arch/x86/events/perf_event.h     |   4 +
 arch/x86/include/asm/intel_ds.h  |   7 ++
 arch/x86/include/asm/msr-index.h |   8 ++
 5 files changed, 171 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 40ccfd80d554..75cba28b86d5 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2563,6 +2563,45 @@ static void intel_pmu_disable_fixed(struct perf_event *event)
 	cpuc->fixed_ctrl_val &= ~mask;
 }
 
+static inline void __intel_pmu_update_event_ext(int idx, u64 ext)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	u32 msr;
+
+	if (idx < INTEL_PMC_IDX_FIXED) {
+		msr = MSR_IA32_PMC_V6_GP0_CFG_C +
+		      x86_pmu.addr_offset(idx, false);
+	} else {
+		msr = MSR_IA32_PMC_V6_FX0_CFG_C +
+		      x86_pmu.addr_offset(idx - INTEL_PMC_IDX_FIXED, false);
+	}
+
+	cpuc->cfg_c_val[idx] = ext;
+	wrmsrq(msr, ext);
+}
+
+static void intel_pmu_disable_event_ext(struct perf_event *event)
+{
+	if (!x86_pmu.arch_pebs)
+		return;
+
+	/*
+	 * Only clear the CFG_C MSR for PEBS counter group events;
+	 * this avoids the HW counter's value being added into
+	 * other PEBS records incorrectly after the PEBS counter
+	 * group events are disabled.
+	 *
+	 * For other events it's unnecessary to clear the CFG_C
+	 * MSRs, since CFG_C takes no effect while the counter is
+	 * disabled. This helps reduce the WRMSR overhead on
+	 * context switches.
+	 */
+	if (!is_pebs_counter_event_group(event))
+		return;
+
+	__intel_pmu_update_event_ext(event->hw.idx, 0);
+}
+
 static void intel_pmu_disable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
@@ -2571,9 +2610,12 @@ static void intel_pmu_disable_event(struct perf_event *event)
 	switch (idx) {
 	case 0 ... INTEL_PMC_IDX_FIXED - 1:
 		intel_clear_masks(event, idx);
+		intel_pmu_disable_event_ext(event);
 		x86_pmu_disable_event(event);
 		break;
 	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
+		intel_pmu_disable_event_ext(event);
+		fallthrough;
 	case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
 		intel_pmu_disable_fixed(event);
 		break;
@@ -2940,6 +2982,66 @@ static void intel_pmu_enable_acr(struct perf_event *event)
 
 DEFINE_STATIC_CALL_NULL(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
 
+static void intel_pmu_enable_event_ext(struct perf_event *event)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+	union arch_pebs_index old, new;
+	struct arch_pebs_cap cap;
+	u64 ext = 0;
+
+	if (!x86_pmu.arch_pebs)
+		return;
+
+	cap = hybrid(cpuc->pmu, arch_pebs_cap);
+
+	if (event->attr.precise_ip) {
+		u64 pebs_data_cfg = intel_get_arch_pebs_data_config(event);
+
+		ext |= ARCH_PEBS_EN;
+		if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD)
+			ext |= (-hwc->sample_period) & ARCH_PEBS_RELOAD;
+
+		if (pebs_data_cfg && cap.caps) {
+			if (pebs_data_cfg & PEBS_DATACFG_MEMINFO)
+				ext |= ARCH_PEBS_AUX & cap.caps;
+
+			if (pebs_data_cfg & PEBS_DATACFG_GP)
+				ext |= ARCH_PEBS_GPR & cap.caps;
+
+			if (pebs_data_cfg & PEBS_DATACFG_XMMS)
+				ext |= ARCH_PEBS_VECR_XMM & cap.caps;
+
+			if (pebs_data_cfg & PEBS_DATACFG_LBRS)
+				ext |= ARCH_PEBS_LBR & cap.caps;
+		}
+
+		if (cpuc->n_pebs == cpuc->n_large_pebs)
+			new.thresh = ARCH_PEBS_THRESH_MULTI;
+		else
+			new.thresh = ARCH_PEBS_THRESH_SINGLE;
+
+		rdmsrq(MSR_IA32_PEBS_INDEX, old.whole);
+		if (new.thresh != old.thresh || !old.en) {
+			if (old.thresh == ARCH_PEBS_THRESH_MULTI && old.wr > 0) {
+				/*
+				 * Large PEBS was enabled; drain the PEBS
+				 * buffer before switching to single PEBS.
+				 */
+				intel_pmu_drain_pebs_buffer();
+			} else {
+				new.wr = 0;
+				new.full = 0;
+				new.en = 1;
+				wrmsrq(MSR_IA32_PEBS_INDEX, new.whole);
+			}
+		}
+	}
+
+	if (cpuc->cfg_c_val[hwc->idx] != ext)
+		__intel_pmu_update_event_ext(hwc->idx, ext);
+}
+
 static void intel_pmu_enable_event(struct perf_event *event)
 {
 	u64 enable_mask = ARCH_PERFMON_EVENTSEL_ENABLE;
@@ -2955,10 +3057,12 @@ static void intel_pmu_enable_event(struct perf_event *event)
 			enable_mask |= ARCH_PERFMON_EVENTSEL_BR_CNTR;
 		intel_set_masks(event, idx);
 		static_call_cond(intel_pmu_enable_acr_event)(event);
+		intel_pmu_enable_event_ext(event);
 		__x86_pmu_enable_event(hwc, enable_mask);
 		break;
 	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
 		static_call_cond(intel_pmu_enable_acr_event)(event);
+		intel_pmu_enable_event_ext(event);
 		fallthrough;
 	case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
 		intel_pmu_enable_fixed(event);
@@ -5301,6 +5405,30 @@ static inline bool intel_pmu_broken_perf_cap(void)
 	return false;
 }
 
+static inline void __intel_update_pmu_caps(struct pmu *pmu)
+{
+	struct pmu *dest_pmu = pmu ? pmu : x86_get_pmu(smp_processor_id());
+
+	if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_VECR_XMM)
+		dest_pmu->capabilities |= PERF_PMU_CAP_EXTENDED_REGS;
+}
+
+static inline void __intel_update_large_pebs_flags(struct pmu *pmu)
+{
+	u64 caps = hybrid(pmu, arch_pebs_cap).caps;
+
+	x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME;
+	if (caps & ARCH_PEBS_LBR)
+		x86_pmu.large_pebs_flags |= PERF_SAMPLE_BRANCH_STACK;
+
+	if (!(caps & ARCH_PEBS_AUX))
+		x86_pmu.large_pebs_flags &= ~PERF_SAMPLE_DATA_SRC;
+	if (!(caps & ARCH_PEBS_GPR)) {
+		x86_pmu.large_pebs_flags &=
+			~(PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER);
+	}
+}
+
 #define counter_mask(_gp, _fixed) ((_gp) | ((u64)(_fixed) << INTEL_PMC_IDX_FIXED))
 
 static void update_pmu_cap(struct pmu *pmu)
@@ -5349,8 +5477,12 @@ static void update_pmu_cap(struct pmu *pmu)
 		hybrid(pmu, arch_pebs_cap).counters = pebs_mask;
 		hybrid(pmu, arch_pebs_cap).pdists = pdists_mask;
 
-		if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask))
+		if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask)) {
 			x86_pmu.arch_pebs = 0;
+		} else {
+			__intel_update_pmu_caps(pmu);
+			__intel_update_large_pebs_flags(pmu);
+		}
 	} else {
 		WARN_ON(x86_pmu.arch_pebs == 1);
 		x86_pmu.arch_pebs = 0;
@@ -5514,6 +5646,8 @@ static void intel_pmu_cpu_starting(int cpu)
 		}
 	}
 
+	__intel_update_pmu_caps(cpuc->pmu);
+
 	if (!cpuc->shared_regs)
 		return;
 
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 1179980f795b..c66e9b562de3 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1528,6 +1528,18 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc,
 	}
 }
 
+u64 intel_get_arch_pebs_data_config(struct perf_event *event)
+{
+	u64 pebs_data_cfg = 0;
+
+	if (WARN_ON(event->hw.idx < 0 || event->hw.idx >= X86_PMC_IDX_MAX))
+		return 0;
+
+	pebs_data_cfg |= pebs_update_adaptive_cfg(event);
+
+	return pebs_data_cfg;
+}
+
 void intel_pmu_pebs_add(struct perf_event *event)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -2947,6 +2959,11 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
 
 	index.wr = 0;
 	index.full = 0;
+	index.en = 1;
+	if (cpuc->n_pebs == cpuc->n_large_pebs)
+		index.thresh = ARCH_PEBS_THRESH_MULTI;
+	else
+		index.thresh = ARCH_PEBS_THRESH_SINGLE;
 	wrmsrq(MSR_IA32_PEBS_INDEX, index.whole);
 
 	mask = hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 13f411bca6bc..3161ec0a3416 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -304,6 +304,8 @@ struct cpu_hw_events {
 	/* Intel ACR configuration */
 	u64			acr_cfg_b[X86_PMC_IDX_MAX];
 	u64			acr_cfg_c[X86_PMC_IDX_MAX];
+	/* Cached CFG_C values */
+	u64			cfg_c_val[X86_PMC_IDX_MAX];
 
 	/*
 	 * Intel LBR bits
@@ -1782,6 +1784,8 @@ void intel_pmu_pebs_data_source_cmt(void);
 
 void intel_pmu_pebs_data_source_lnl(void);
 
+u64 intel_get_arch_pebs_data_config(struct perf_event *event);
+
 int intel_pmu_setup_lbr_filter(struct perf_event *event);
 
 void intel_pt_interrupt(void);
diff --git a/arch/x86/include/asm/intel_ds.h b/arch/x86/include/asm/intel_ds.h
index 023c2883f9f3..695f87efbeb8 100644
--- a/arch/x86/include/asm/intel_ds.h
+++ b/arch/x86/include/asm/intel_ds.h
@@ -7,6 +7,13 @@
 #define PEBS_BUFFER_SHIFT	4
 #define PEBS_BUFFER_SIZE	(PAGE_SIZE << PEBS_BUFFER_SHIFT)
 
+/*
+ * The largest PEBS record can consume a page; ensure at least
+ * one record can still be written after the PMI triggers.
+ */
+#define ARCH_PEBS_THRESH_MULTI	((PEBS_BUFFER_SIZE - PAGE_SIZE) >> PEBS_BUFFER_SHIFT)
+#define ARCH_PEBS_THRESH_SINGLE	1
+
 /* The maximal number of PEBS events: */
 #define MAX_PEBS_EVENTS_FMT4	8
 #define MAX_PEBS_EVENTS		32
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index fc7a4e7c718d..f1ef9ac38bfb 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -333,6 +333,14 @@
 #define ARCH_PEBS_OFFSET_MASK		0x7fffff
 #define ARCH_PEBS_INDEX_WR_SHIFT	4
 
+#define ARCH_PEBS_RELOAD		0xffffffff
+#define ARCH_PEBS_LBR_SHIFT		40
+#define ARCH_PEBS_LBR			(0x3ull << ARCH_PEBS_LBR_SHIFT)
+#define ARCH_PEBS_VECR_XMM		BIT_ULL(49)
+#define ARCH_PEBS_GPR			BIT_ULL(61)
+#define ARCH_PEBS_AUX			BIT_ULL(62)
+#define ARCH_PEBS_EN			BIT_ULL(63)
+
 #define MSR_IA32_RTIT_CTL		0x00000570
 #define RTIT_CTL_TRACEEN		BIT(0)
 #define RTIT_CTL_CYCLEACC		BIT(1)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Patch v9 12/12] perf/x86/intel: Add counter group support for arch-PEBS
  2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
                   ` (10 preceding siblings ...)
  2025-10-29 10:21 ` [Patch v9 11/12] perf/x86/intel: Setup PEBS data configuration and enable legacy groups Dapeng Mi
@ 2025-10-29 10:21 ` Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2026-03-09 22:59   ` [Patch v9 12/12] " Ian Rogers
  11 siblings, 2 replies; 48+ messages in thread
From: Dapeng Mi @ 2025-10-29 10:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

Based on the previous adaptive PEBS counter snapshot support, add
counter group support for architectural PEBS. Since arch-PEBS shares
the same counter group layout with adaptive PEBS, directly reuse the
__setup_pebs_counter_group() helper to process arch-PEBS counter
groups.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c      | 38 ++++++++++++++++++++++++++++---
 arch/x86/events/intel/ds.c        | 29 ++++++++++++++++++++---
 arch/x86/include/asm/msr-index.h  |  6 +++++
 arch/x86/include/asm/perf_event.h | 13 ++++++++---
 4 files changed, 77 insertions(+), 9 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 75cba28b86d5..cb64018321dd 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3014,6 +3014,17 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
 
 			if (pebs_data_cfg & PEBS_DATACFG_LBRS)
 				ext |= ARCH_PEBS_LBR & cap.caps;
+
+			if (pebs_data_cfg &
+			    (PEBS_DATACFG_CNTR_MASK << PEBS_DATACFG_CNTR_SHIFT))
+				ext |= ARCH_PEBS_CNTR_GP & cap.caps;
+
+			if (pebs_data_cfg &
+			    (PEBS_DATACFG_FIX_MASK << PEBS_DATACFG_FIX_SHIFT))
+				ext |= ARCH_PEBS_CNTR_FIXED & cap.caps;
+
+			if (pebs_data_cfg & PEBS_DATACFG_METRICS)
+				ext |= ARCH_PEBS_CNTR_METRICS & cap.caps;
 		}
 
 		if (cpuc->n_pebs == cpuc->n_large_pebs)
@@ -3038,6 +3049,9 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
 		}
 	}
 
+	if (is_pebs_counter_event_group(event))
+		ext |= ARCH_PEBS_CNTR_ALLOW;
+
 	if (cpuc->cfg_c_val[hwc->idx] != ext)
 		__intel_pmu_update_event_ext(hwc->idx, ext);
 }
@@ -4323,6 +4337,20 @@ static bool intel_pmu_is_acr_group(struct perf_event *event)
 	return false;
 }
 
+static inline bool intel_pmu_has_pebs_counter_group(struct pmu *pmu)
+{
+	u64 caps;
+
+	if (x86_pmu.intel_cap.pebs_format >= 6 && x86_pmu.intel_cap.pebs_baseline)
+		return true;
+
+	caps = hybrid(pmu, arch_pebs_cap).caps;
+	if (x86_pmu.arch_pebs && (caps & ARCH_PEBS_CNTR_MASK))
+		return true;
+
+	return false;
+}
+
 static inline void intel_pmu_set_acr_cntr_constr(struct perf_event *event,
 						 u64 *cause_mask, int *num)
 {
@@ -4471,8 +4499,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	}
 
 	if ((event->attr.sample_type & PERF_SAMPLE_READ) &&
-	    (x86_pmu.intel_cap.pebs_format >= 6) &&
-	    x86_pmu.intel_cap.pebs_baseline &&
+	    intel_pmu_has_pebs_counter_group(event->pmu) &&
 	    is_sampling_event(event) &&
 	    event->attr.precise_ip)
 		event->group_leader->hw.flags |= PERF_X86_EVENT_PEBS_CNTR;
@@ -5420,6 +5447,8 @@ static inline void __intel_update_large_pebs_flags(struct pmu *pmu)
 	x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME;
 	if (caps & ARCH_PEBS_LBR)
 		x86_pmu.large_pebs_flags |= PERF_SAMPLE_BRANCH_STACK;
+	if (caps & ARCH_PEBS_CNTR_MASK)
+		x86_pmu.large_pebs_flags |= PERF_SAMPLE_READ;
 
 	if (!(caps & ARCH_PEBS_AUX))
 		x86_pmu.large_pebs_flags &= ~PERF_SAMPLE_DATA_SRC;
@@ -7134,8 +7163,11 @@ __init int intel_pmu_init(void)
 	 * Many features on and after V6 require dynamic constraint,
 	 * e.g., Arch PEBS, ACR.
 	 */
-	if (version >= 6)
+	if (version >= 6) {
 		x86_pmu.flags |= PMU_FL_DYN_CONSTRAINT;
+		x86_pmu.late_setup = intel_pmu_late_setup;
+	}
+
 	/*
 	 * Install the hw-cache-events table:
 	 */
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c66e9b562de3..c93bf971d97b 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1530,13 +1530,20 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc,
 
 u64 intel_get_arch_pebs_data_config(struct perf_event *event)
 {
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	u64 pebs_data_cfg = 0;
+	u64 cntr_mask;
 
 	if (WARN_ON(event->hw.idx < 0 || event->hw.idx >= X86_PMC_IDX_MAX))
 		return 0;
 
 	pebs_data_cfg |= pebs_update_adaptive_cfg(event);
 
+	cntr_mask = (PEBS_DATACFG_CNTR_MASK << PEBS_DATACFG_CNTR_SHIFT) |
+		    (PEBS_DATACFG_FIX_MASK << PEBS_DATACFG_FIX_SHIFT) |
+		    PEBS_DATACFG_CNTR | PEBS_DATACFG_METRICS;
+	pebs_data_cfg |= cpuc->pebs_data_cfg & cntr_mask;
+
 	return pebs_data_cfg;
 }
 
@@ -2444,6 +2451,24 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
 		}
 	}
 
+	if (header->cntr) {
+		struct arch_pebs_cntr_header *cntr = next_record;
+		unsigned int nr;
+
+		next_record += sizeof(struct arch_pebs_cntr_header);
+
+		if (is_pebs_counter_event_group(event)) {
+			__setup_pebs_counter_group(cpuc, event,
+				(struct pebs_cntr_header *)cntr, next_record);
+			data->sample_flags |= PERF_SAMPLE_READ;
+		}
+
+		nr = hweight32(cntr->cntr) + hweight32(cntr->fixed);
+		if (cntr->metrics == INTEL_CNTR_METRICS)
+			nr += 2;
+		next_record += nr * sizeof(u64);
+	}
+
 	/* Parse followed fragments if there are. */
 	if (arch_pebs_record_continued(header)) {
 		at = at + header->size;
@@ -3094,10 +3119,8 @@ static void __init intel_ds_pebs_init(void)
 			break;
 
 		case 6:
-			if (x86_pmu.intel_cap.pebs_baseline) {
+			if (x86_pmu.intel_cap.pebs_baseline)
 				x86_pmu.large_pebs_flags |= PERF_SAMPLE_READ;
-				x86_pmu.late_setup = intel_pmu_late_setup;
-			}
 			fallthrough;
 		case 5:
 			x86_pmu.pebs_ept = 1;
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index f1ef9ac38bfb..65cc528fbad8 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -334,12 +334,18 @@
 #define ARCH_PEBS_INDEX_WR_SHIFT	4
 
 #define ARCH_PEBS_RELOAD		0xffffffff
+#define ARCH_PEBS_CNTR_ALLOW		BIT_ULL(35)
+#define ARCH_PEBS_CNTR_GP		BIT_ULL(36)
+#define ARCH_PEBS_CNTR_FIXED		BIT_ULL(37)
+#define ARCH_PEBS_CNTR_METRICS		BIT_ULL(38)
 #define ARCH_PEBS_LBR_SHIFT		40
 #define ARCH_PEBS_LBR			(0x3ull << ARCH_PEBS_LBR_SHIFT)
 #define ARCH_PEBS_VECR_XMM		BIT_ULL(49)
 #define ARCH_PEBS_GPR			BIT_ULL(61)
 #define ARCH_PEBS_AUX			BIT_ULL(62)
 #define ARCH_PEBS_EN			BIT_ULL(63)
+#define ARCH_PEBS_CNTR_MASK		(ARCH_PEBS_CNTR_GP | ARCH_PEBS_CNTR_FIXED | \
+					 ARCH_PEBS_CNTR_METRICS)
 
 #define MSR_IA32_RTIT_CTL		0x00000570
 #define RTIT_CTL_TRACEEN		BIT(0)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 3b3848f0d339..7276ba70c88a 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -141,16 +141,16 @@
 #define ARCH_PERFMON_EVENTS_COUNT			7
 
 #define PEBS_DATACFG_MEMINFO	BIT_ULL(0)
-#define PEBS_DATACFG_GP	BIT_ULL(1)
+#define PEBS_DATACFG_GP		BIT_ULL(1)
 #define PEBS_DATACFG_XMMS	BIT_ULL(2)
 #define PEBS_DATACFG_LBRS	BIT_ULL(3)
-#define PEBS_DATACFG_LBR_SHIFT	24
 #define PEBS_DATACFG_CNTR	BIT_ULL(4)
+#define PEBS_DATACFG_METRICS	BIT_ULL(5)
+#define PEBS_DATACFG_LBR_SHIFT	24
 #define PEBS_DATACFG_CNTR_SHIFT	32
 #define PEBS_DATACFG_CNTR_MASK	GENMASK_ULL(15, 0)
 #define PEBS_DATACFG_FIX_SHIFT	48
 #define PEBS_DATACFG_FIX_MASK	GENMASK_ULL(7, 0)
-#define PEBS_DATACFG_METRICS	BIT_ULL(5)
 
 /* Steal the highest bit of pebs_data_cfg for SW usage */
 #define PEBS_UPDATE_DS_SW	BIT_ULL(63)
@@ -603,6 +603,13 @@ struct arch_pebs_lbr_header {
 	u64 ler_info;
 };
 
+struct arch_pebs_cntr_header {
+	u32 cntr;
+	u32 fixed;
+	u32 metrics;
+	u32 reserved;
+};
+
 /*
  * AMD Extended Performance Monitoring and Debug cpuid feature detection
  */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [Patch v9 02/12] perf/x86: Fix NULL event access and potential PEBS record loss
  2025-10-29 10:21 ` [Patch v9 02/12] perf/x86: Fix NULL event access and potential PEBS record loss Dapeng Mi
@ 2025-11-06 14:19   ` Peter Zijlstra
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  1 sibling, 0 replies; 48+ messages in thread
From: Peter Zijlstra @ 2025-11-06 14:19 UTC (permalink / raw)
  To: Dapeng Mi, george.kennedy
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, kernel test robot, ravi.bangoria


George, it just occurred to me that the below might also fix the root
cause of your 866cf36bfee4 ("perf/x86/amd: Check event before enable to avoid GPF")
and thus we can revert that again.

Specifically, this moves the clearing of cpuc->events[] out to
x86_pmu_del() time.

On Wed, Oct 29, 2025 at 06:21:26PM +0800, Dapeng Mi wrote:
> When intel_pmu_drain_pebs_icl() is called to drain PEBS records,
> perf_event_overflow() may be called to process the last PEBS record.
> 
> However, perf_event_overflow() can trigger interrupt throttling and
> stop all events of the group, as the call-chain below shows.
> 
> perf_event_overflow()
>   -> __perf_event_overflow()
>     ->__perf_event_account_interrupt()
>       -> perf_event_throttle_group()
>         -> perf_event_throttle()
>           -> event->pmu->stop()
>             -> x86_pmu_stop()
> 
> The side effect of stopping the events is that all corresponding event
> pointers in the cpuc->events[] array are cleared to NULL.
> 
> Assume there are two PEBS events (event a and event b) in a group. When
> intel_pmu_drain_pebs_icl() calls perf_event_overflow() to process the
> last PEBS record of PEBS event a, interrupt throttle is triggered and
> all pointers of event a and event b are cleared to NULL. Then
> intel_pmu_drain_pebs_icl() tries to process the last PEBS record of
> event b and encounters a NULL pointer access.
> 
> To avoid this issue, move the cpuc->events[] clearing from x86_pmu_stop()
> to x86_pmu_del(). This is safe since cpuc->active_mask or
> cpuc->pebs_enabled is always checked before accessing the event pointer
> in cpuc->events[].
> 
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202507042103.a15d2923-lkp@intel.com
> Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
>  arch/x86/events/core.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 745caa6c15a3..74479f9d6eed 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -1344,6 +1344,7 @@ static void x86_pmu_enable(struct pmu *pmu)
>  				hwc->state |= PERF_HES_ARCH;
>  
>  			x86_pmu_stop(event, PERF_EF_UPDATE);
> +			cpuc->events[hwc->idx] = NULL;
>  		}
>  
>  		/*
> @@ -1365,6 +1366,7 @@ static void x86_pmu_enable(struct pmu *pmu)
>  			 * if cpuc->enabled = 0, then no wrmsr as
>  			 * per x86_pmu_enable_event()
>  			 */
> +			cpuc->events[hwc->idx] = event;
>  			x86_pmu_start(event, PERF_EF_RELOAD);
>  		}
>  		cpuc->n_added = 0;
> @@ -1531,7 +1533,6 @@ static void x86_pmu_start(struct perf_event *event, int flags)
>  
>  	event->hw.state = 0;
>  
> -	cpuc->events[idx] = event;
>  	__set_bit(idx, cpuc->active_mask);
>  	static_call(x86_pmu_enable)(event);
>  	perf_event_update_userpage(event);
> @@ -1610,7 +1611,6 @@ void x86_pmu_stop(struct perf_event *event, int flags)
>  	if (test_bit(hwc->idx, cpuc->active_mask)) {
>  		static_call(x86_pmu_disable)(event);
>  		__clear_bit(hwc->idx, cpuc->active_mask);
> -		cpuc->events[hwc->idx] = NULL;
>  		WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
>  		hwc->state |= PERF_HES_STOPPED;
>  	}
> @@ -1648,6 +1648,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
>  	 * Not a TXN, therefore cleanup properly.
>  	 */
>  	x86_pmu_stop(event, PERF_EF_UPDATE);
> +	cpuc->events[event->hw.idx] = NULL;
>  
>  	for (i = 0; i < cpuc->n_events; i++) {
>  		if (event == cpuc->event_list[i])
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  2025-10-29 10:21 ` [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level Dapeng Mi
@ 2025-11-06 14:52   ` Peter Zijlstra
  2025-11-07  6:11     ` Mi, Dapeng
  2025-11-11 11:37   ` [tip: perf/core] perf/x86/intel: Update dyn_constraint " tip-bot2 for Dapeng Mi
  1 sibling, 1 reply; 48+ messages in thread
From: Peter Zijlstra @ 2025-11-06 14:52 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao

On Wed, Oct 29, 2025 at 06:21:34PM +0800, Dapeng Mi wrote:
> arch-PEBS provides CPUIDs to enumerate which counters support PEBS
> sampling and precise distribution PEBS sampling. Thus PEBS constraints
> should be dynamically configured based on these counter and precise
> distribution bitmaps instead of being defined statically.
> 
> Update the event's dyn_constraint based on the PEBS event's precise level.

What happened to this:

  https://lore.kernel.org/all/e0b25b3e-aec0-4c43-9ab2-907186b56c71@linux.intel.com/


> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
>  arch/x86/events/intel/core.c | 11 +++++++++++
>  arch/x86/events/intel/ds.c   |  1 +
>  2 files changed, 12 insertions(+)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 6e04d73dfae5..40ccfd80d554 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4252,6 +4252,8 @@ static int intel_pmu_hw_config(struct perf_event *event)
>  	}
>  
>  	if (event->attr.precise_ip) {
> +		struct arch_pebs_cap pebs_cap = hybrid(event->pmu, arch_pebs_cap);
> +
>  		if ((event->attr.config & INTEL_ARCH_EVENT_MASK) == INTEL_FIXED_VLBR_EVENT)
>  			return -EINVAL;
>  
> @@ -4265,6 +4267,15 @@ static int intel_pmu_hw_config(struct perf_event *event)
>  		}
>  		if (x86_pmu.pebs_aliases)
>  			x86_pmu.pebs_aliases(event);
> +
> +		if (x86_pmu.arch_pebs) {
> +			u64 cntr_mask = hybrid(event->pmu, intel_ctrl) &
> +						~GLOBAL_CTRL_EN_PERF_METRICS;
> +			u64 pebs_mask = event->attr.precise_ip >= 3 ?
> +						pebs_cap.pdists : pebs_cap.counters;
> +			if (cntr_mask != pebs_mask)
> +				event->hw.dyn_constraint &= pebs_mask;
> +		}
>  	}
>  
>  	if (needs_branch_stack(event)) {
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index 5c26a5235f94..1179980f795b 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -3005,6 +3005,7 @@ static void __init intel_arch_pebs_init(void)
>  	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
>  	x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
>  	x86_pmu.pebs_capable = ~0ULL;
> +	x86_pmu.flags |= PMU_FL_PEBS_ALL;
>  
>  	x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
>  	x86_pmu.pebs_disable = __intel_pmu_pebs_disable;
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  2025-11-06 14:52   ` Peter Zijlstra
@ 2025-11-07  6:11     ` Mi, Dapeng
  2025-11-07  8:28       ` Peter Zijlstra
  2025-11-07 13:05       ` Peter Zijlstra
  0 siblings, 2 replies; 48+ messages in thread
From: Mi, Dapeng @ 2025-11-07  6:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao


On 11/6/2025 10:52 PM, Peter Zijlstra wrote:
> On Wed, Oct 29, 2025 at 06:21:34PM +0800, Dapeng Mi wrote:
>> arch-PEBS provides CPUIDs to enumerate which counters support PEBS
>> sampling and precise distribution PEBS sampling. Thus PEBS constraints
>> should be dynamically configured based on these counter and precise
>> distribution bitmaps instead of being defined statically.
>>
>> Update the event's dyn_constraint based on the PEBS event's precise level.
> What happened to this:
>
>   https://lore.kernel.org/all/e0b25b3e-aec0-4c43-9ab2-907186b56c71@linux.intel.com/

Regarding this issue, Kan posted a patch earlier to mitigate the risk, but
it seems the patch has not been merged yet.

https://lore.kernel.org/all/20250512175542.2000708-1-kan.liang@linux.intel.com/


>
>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>>  arch/x86/events/intel/core.c | 11 +++++++++++
>>  arch/x86/events/intel/ds.c   |  1 +
>>  2 files changed, 12 insertions(+)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 6e04d73dfae5..40ccfd80d554 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -4252,6 +4252,8 @@ static int intel_pmu_hw_config(struct perf_event *event)
>>  	}
>>  
>>  	if (event->attr.precise_ip) {
>> +		struct arch_pebs_cap pebs_cap = hybrid(event->pmu, arch_pebs_cap);
>> +
>>  		if ((event->attr.config & INTEL_ARCH_EVENT_MASK) == INTEL_FIXED_VLBR_EVENT)
>>  			return -EINVAL;
>>  
>> @@ -4265,6 +4267,15 @@ static int intel_pmu_hw_config(struct perf_event *event)
>>  		}
>>  		if (x86_pmu.pebs_aliases)
>>  			x86_pmu.pebs_aliases(event);
>> +
>> +		if (x86_pmu.arch_pebs) {
>> +			u64 cntr_mask = hybrid(event->pmu, intel_ctrl) &
>> +						~GLOBAL_CTRL_EN_PERF_METRICS;
>> +			u64 pebs_mask = event->attr.precise_ip >= 3 ?
>> +						pebs_cap.pdists : pebs_cap.counters;
>> +			if (cntr_mask != pebs_mask)
>> +				event->hw.dyn_constraint &= pebs_mask;
>> +		}
>>  	}
>>  
>>  	if (needs_branch_stack(event)) {
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index 5c26a5235f94..1179980f795b 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -3005,6 +3005,7 @@ static void __init intel_arch_pebs_init(void)
>>  	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
>>  	x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
>>  	x86_pmu.pebs_capable = ~0ULL;
>> +	x86_pmu.flags |= PMU_FL_PEBS_ALL;
>>  
>>  	x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
>>  	x86_pmu.pebs_disable = __intel_pmu_pebs_disable;
>> -- 
>> 2.34.1
>>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  2025-11-07  6:11     ` Mi, Dapeng
@ 2025-11-07  8:28       ` Peter Zijlstra
  2025-11-07  8:36         ` Mi, Dapeng
  2025-11-07 13:05       ` Peter Zijlstra
  1 sibling, 1 reply; 48+ messages in thread
From: Peter Zijlstra @ 2025-11-07  8:28 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao

On Fri, Nov 07, 2025 at 02:11:09PM +0800, Mi, Dapeng wrote:
> 
> On 11/6/2025 10:52 PM, Peter Zijlstra wrote:
> > On Wed, Oct 29, 2025 at 06:21:34PM +0800, Dapeng Mi wrote:
> >> arch-PEBS provides CPUIDs to enumerate which counters support PEBS
> >> sampling and precise distribution PEBS sampling. Thus PEBS constraints
> >> should be dynamically configured based on these counter and precise
> >> distribution bitmaps instead of being defined statically.
> >>
> >> Update the event's dyn_constraint based on the PEBS event's precise level.
> > What happened to this:
> >
> >   https://lore.kernel.org/all/e0b25b3e-aec0-4c43-9ab2-907186b56c71@linux.intel.com/
> 
> About the issue, Kan ever posted a patch to mitigate the risk, but it seems
> the patch is not merged yet.
> 
> https://lore.kernel.org/all/20250512175542.2000708-1-kan.liang@linux.intel.com/
> 

Clearly it became a victim of some scatter brained maintainer or
something.

Let me stick that near this set and go read the last few patches.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  2025-11-07  8:28       ` Peter Zijlstra
@ 2025-11-07  8:36         ` Mi, Dapeng
  0 siblings, 0 replies; 48+ messages in thread
From: Mi, Dapeng @ 2025-11-07  8:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao


On 11/7/2025 4:28 PM, Peter Zijlstra wrote:
> On Fri, Nov 07, 2025 at 02:11:09PM +0800, Mi, Dapeng wrote:
>> On 11/6/2025 10:52 PM, Peter Zijlstra wrote:
>>> On Wed, Oct 29, 2025 at 06:21:34PM +0800, Dapeng Mi wrote:
>>>> arch-PEBS provides CPUIDs to enumerate which counters support PEBS
>>>> sampling and precise distribution PEBS sampling. Thus PEBS constraints
>>>> should be dynamically configured based on these counter and precise
>>>> distribution bitmaps instead of defining them statically.
>>>>
>>>> Update event dyn_constraint based on the PEBS event precise level.
>>> What happened to this:
>>>
>>>   https://lore.kernel.org/all/e0b25b3e-aec0-4c43-9ab2-907186b56c71@linux.intel.com/
>> About the issue, Kan posted a patch earlier to mitigate the risk, but it
>> seems the patch is not merged yet.
>>
>> https://lore.kernel.org/all/20250512175542.2000708-1-kan.liang@linux.intel.com/
>>
> Clearly it became a victim of some scatter-brained maintainer or
> something.
>
> Let me stick that near this set and go read the last few patches.

Thanks.


>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  2025-11-07  6:11     ` Mi, Dapeng
  2025-11-07  8:28       ` Peter Zijlstra
@ 2025-11-07 13:05       ` Peter Zijlstra
  2025-11-10  0:23         ` Mi, Dapeng
  1 sibling, 1 reply; 48+ messages in thread
From: Peter Zijlstra @ 2025-11-07 13:05 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao

On Fri, Nov 07, 2025 at 02:11:09PM +0800, Mi, Dapeng wrote:
> 
> On 11/6/2025 10:52 PM, Peter Zijlstra wrote:
> > On Wed, Oct 29, 2025 at 06:21:34PM +0800, Dapeng Mi wrote:
> >> arch-PEBS provides CPUIDs to enumerate which counters support PEBS
> >> sampling and precise distribution PEBS sampling. Thus PEBS constraints
> >> should be dynamically configured based on these counter and precise
> >> distribution bitmaps instead of defining them statically.
> >>
> >> Update event dyn_constraint based on the PEBS event precise level.
> > What happened to this:
> >
> >   https://lore.kernel.org/all/e0b25b3e-aec0-4c43-9ab2-907186b56c71@linux.intel.com/
> 
> About the issue, Kan posted a patch earlier to mitigate the risk, but it
> seems the patch is not merged yet.
> 
> https://lore.kernel.org/all/20250512175542.2000708-1-kan.liang@linux.intel.com/

IIUC the below is what is required to handle this new dynamic case, right?

--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5423,6 +5423,8 @@ enum dyn_constr_type {
 	DYN_CONSTR_BR_CNTR,
 	DYN_CONSTR_ACR_CNTR,
 	DYN_CONSTR_ACR_CAUSE,
+	DYN_CONSTR_PEBS,
+	DYN_CONSTR_PDIST,
 
 	DYN_CONSTR_MAX,
 };
@@ -5432,6 +5434,8 @@ static const char * const dyn_constr_typ
 	[DYN_CONSTR_BR_CNTR] = "a branch counter logging event",
 	[DYN_CONSTR_ACR_CNTR] = "an auto-counter reload event",
 	[DYN_CONSTR_ACR_CAUSE] = "an auto-counter reload cause event",
+	[DYN_CONSTR_PEBS] = "a PEBS event",
+	[DYN_CONSTR_PDIST] = "a PEBS PDIST event",
 };
 
 static void __intel_pmu_check_dyn_constr(struct event_constraint *constr,
@@ -5536,6 +5540,14 @@ static void intel_pmu_check_dyn_constr(s
 				continue;
 			mask = hybrid(pmu, acr_cause_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
 			break;
+		case DYN_CONSTR_PEBS:
+			if (x86_pmu.arch_pebs)
+				mask = hybrid(pmu, arch_pebs_cap).counters;
+			break;
+		case DYN_CONSTR_PDIST:
+			if (x86_pmu.arch_pebs)
+				mask = hybrid(pmu, arch_pebs_cap).pdists;
+			break;
 		default:
 			pr_warn("Unsupported dynamic constraint type %d\n", i);
 		}

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  2025-11-07 13:05       ` Peter Zijlstra
@ 2025-11-10  0:23         ` Mi, Dapeng
  2025-11-10  9:03           ` Peter Zijlstra
  0 siblings, 1 reply; 48+ messages in thread
From: Mi, Dapeng @ 2025-11-10  0:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao


On 11/7/2025 9:05 PM, Peter Zijlstra wrote:
> On Fri, Nov 07, 2025 at 02:11:09PM +0800, Mi, Dapeng wrote:
>> On 11/6/2025 10:52 PM, Peter Zijlstra wrote:
>>> On Wed, Oct 29, 2025 at 06:21:34PM +0800, Dapeng Mi wrote:
>>>> arch-PEBS provides CPUIDs to enumerate which counters support PEBS
>>>> sampling and precise distribution PEBS sampling. Thus PEBS constraints
>>>> should be dynamically configured based on these counter and precise
>>>> distribution bitmaps instead of defining them statically.
>>>>
>>>> Update event dyn_constraint based on the PEBS event precise level.
>>> What happened to this:
>>>
>>>   https://lore.kernel.org/all/e0b25b3e-aec0-4c43-9ab2-907186b56c71@linux.intel.com/
>> About the issue, Kan posted a patch earlier to mitigate the risk, but it
>> seems the patch is not merged yet.
>>
>> https://lore.kernel.org/all/20250512175542.2000708-1-kan.liang@linux.intel.com/
> IIUC the below is what is required to handle this new dynamic case, right?
>
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -5423,6 +5423,8 @@ enum dyn_constr_type {
>  	DYN_CONSTR_BR_CNTR,
>  	DYN_CONSTR_ACR_CNTR,
>  	DYN_CONSTR_ACR_CAUSE,
> +	DYN_CONSTR_PEBS,
> +	DYN_CONSTR_PDIST,
>  
>  	DYN_CONSTR_MAX,
>  };
> @@ -5432,6 +5434,8 @@ static const char * const dyn_constr_typ
>  	[DYN_CONSTR_BR_CNTR] = "a branch counter logging event",
>  	[DYN_CONSTR_ACR_CNTR] = "an auto-counter reload event",
>  	[DYN_CONSTR_ACR_CAUSE] = "an auto-counter reload cause event",
> +	[DYN_CONSTR_PEBS] = "a PEBS event",
> +	[DYN_CONSTR_PDIST] = "a PEBS PDIST event",
>  };
>  
>  static void __intel_pmu_check_dyn_constr(struct event_constraint *constr,
> @@ -5536,6 +5540,14 @@ static void intel_pmu_check_dyn_constr(s
>  				continue;
>  			mask = hybrid(pmu, acr_cause_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
>  			break;
> +		case DYN_CONSTR_PEBS:
> +			if (x86_pmu.arch_pebs)
> +				mask = hybrid(pmu, arch_pebs_cap).counters;
> +			break;
> +		case DYN_CONSTR_PDIST:
> +			if (x86_pmu.arch_pebs)
> +				mask = hybrid(pmu, arch_pebs_cap).pdists;
> +			break;
>  		default:
>  			pr_warn("Unsupported dynamic constraint type %d\n", i);
>  		}

Yes, exactly. Thanks.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  2025-11-10  0:23         ` Mi, Dapeng
@ 2025-11-10  9:03           ` Peter Zijlstra
  2025-11-10  9:15             ` Mi, Dapeng
  0 siblings, 1 reply; 48+ messages in thread
From: Peter Zijlstra @ 2025-11-10  9:03 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao

On Mon, Nov 10, 2025 at 08:23:55AM +0800, Mi, Dapeng wrote:

> > @@ -5536,6 +5540,14 @@ static void intel_pmu_check_dyn_constr(s
> >  				continue;
> >  			mask = hybrid(pmu, acr_cause_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
> >  			break;
> > +		case DYN_CONSTR_PEBS:
> > +			if (x86_pmu.arch_pebs)
> > +				mask = hybrid(pmu, arch_pebs_cap).counters;
> > +			break;
> > +		case DYN_CONSTR_PDIST:
> > +			if (x86_pmu.arch_pebs)
> > +				mask = hybrid(pmu, arch_pebs_cap).pdists;
> > +			break;
> >  		default:
> >  			pr_warn("Unsupported dynamic constraint type %d\n", i);
> >  		}
> 
> Yes, exactly. Thanks.

Excellent. Could you please double check and try the bits I have in
queue/perf/core ? I don't think I've got v6 hardware at hand.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  2025-11-10  9:03           ` Peter Zijlstra
@ 2025-11-10  9:15             ` Mi, Dapeng
  2025-11-11  5:41               ` Mi, Dapeng
  0 siblings, 1 reply; 48+ messages in thread
From: Mi, Dapeng @ 2025-11-10  9:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao


On 11/10/2025 5:03 PM, Peter Zijlstra wrote:
> On Mon, Nov 10, 2025 at 08:23:55AM +0800, Mi, Dapeng wrote:
>
>>> @@ -5536,6 +5540,14 @@ static void intel_pmu_check_dyn_constr(s
>>>  				continue;
>>>  			mask = hybrid(pmu, acr_cause_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
>>>  			break;
>>> +		case DYN_CONSTR_PEBS:
>>> +			if (x86_pmu.arch_pebs)
>>> +				mask = hybrid(pmu, arch_pebs_cap).counters;
>>> +			break;
>>> +		case DYN_CONSTR_PDIST:
>>> +			if (x86_pmu.arch_pebs)
>>> +				mask = hybrid(pmu, arch_pebs_cap).pdists;
>>> +			break;
>>>  		default:
>>>  			pr_warn("Unsupported dynamic constraint type %d\n", i);
>>>  		}
>> Yes, exactly. Thanks.
> Excellent. Could you please double check and try the bits I have in
> queue/perf/core ? I don't think I've got v6 hardware at hand.

Sure. I will post test results tomorrow.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  2025-11-10  9:15             ` Mi, Dapeng
@ 2025-11-11  5:41               ` Mi, Dapeng
  2025-11-11 11:37                 ` Peter Zijlstra
  0 siblings, 1 reply; 48+ messages in thread
From: Mi, Dapeng @ 2025-11-11  5:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao


On 11/10/2025 5:15 PM, Mi, Dapeng wrote:
> On 11/10/2025 5:03 PM, Peter Zijlstra wrote:
>> On Mon, Nov 10, 2025 at 08:23:55AM +0800, Mi, Dapeng wrote:
>>
>>>> @@ -5536,6 +5540,14 @@ static void intel_pmu_check_dyn_constr(s
>>>>  				continue;
>>>>  			mask = hybrid(pmu, acr_cause_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
>>>>  			break;
>>>> +		case DYN_CONSTR_PEBS:
>>>> +			if (x86_pmu.arch_pebs)
>>>> +				mask = hybrid(pmu, arch_pebs_cap).counters;
>>>> +			break;
>>>> +		case DYN_CONSTR_PDIST:
>>>> +			if (x86_pmu.arch_pebs)
>>>> +				mask = hybrid(pmu, arch_pebs_cap).pdists;
>>>> +			break;
>>>>  		default:
>>>>  			pr_warn("Unsupported dynamic constraint type %d\n", i);
>>>>  		}
>>> Yes, exactly. Thanks.
>> Excellent. Could you please double check and try the bits I have in
>> queue/perf/core ? I don't think I've got v6 hardware at hand.
> Sure. I will post test results tomorrow.

Hi Peter,

I tested the queue/perf/core code with a slight code refinement on
SPR/CWF/PTL. In summary, everything looks good. The constraint validation
passes on all three platforms and no overlapping constraints are reported.
Besides, perf counting/sampling (both legacy PEBS and arch-PEBS) works
well; no issues were found.

I made a slight change to the intel_pmu_check_dyn_constr() helper: it
should be sufficient to validate only the GP counters for the PEBS counter
and PDIST constraint checks. Besides, the code style is refined
opportunistically. Thanks.

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index aad89c9d9514..81e6c8bcabde 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5506,7 +5506,7 @@ static void __intel_pmu_check_dyn_constr(struct event_constraint *constr,
 			}
 
 			if (check_fail) {
-				pr_info("The two events 0x%llx and 0x%llx may not be "
+				pr_warn("The two events 0x%llx and 0x%llx may not be "
 					"fully scheduled under some circumstances as "
 					"%s.\n",
 					c1->code, c2->code, dyn_constr_type_name[type]);
@@ -5519,6 +5519,7 @@ static void intel_pmu_check_dyn_constr(struct pmu *pmu,
 				       struct event_constraint *constr,
 				       u64 cntr_mask)
 {
+	u64 gp_mask = GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
 	enum dyn_constr_type i;
 	u64 mask;
 
@@ -5533,20 +5534,25 @@ static void intel_pmu_check_dyn_constr(struct pmu *pmu,
 				mask = x86_pmu.lbr_counters;
 			break;
 		case DYN_CONSTR_ACR_CNTR:
-			mask = hybrid(pmu, acr_cntr_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
+			mask = hybrid(pmu, acr_cntr_mask64) & gp_mask;
 			break;
 		case DYN_CONSTR_ACR_CAUSE:
-			if (hybrid(pmu, acr_cntr_mask64) == hybrid(pmu, acr_cause_mask64))
+			if (hybrid(pmu, acr_cntr_mask64) ==
+					hybrid(pmu, acr_cause_mask64))
 				continue;
-			mask = hybrid(pmu, acr_cause_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
+			mask = hybrid(pmu, acr_cause_mask64) & gp_mask;
 			break;
 		case DYN_CONSTR_PEBS:
-			if (x86_pmu.arch_pebs)
-				mask = hybrid(pmu, arch_pebs_cap).counters;
+			if (x86_pmu.arch_pebs) {
+				mask = hybrid(pmu, arch_pebs_cap).counters &
+				       gp_mask;
+			}
 			break;
 		case DYN_CONSTR_PDIST:
-			if (x86_pmu.arch_pebs)
-				mask = hybrid(pmu, arch_pebs_cap).pdists;
+			if (x86_pmu.arch_pebs) {
+				mask = hybrid(pmu, arch_pebs_cap).pdists &
+				       gp_mask;
+			}
 			break;
 		default:
 			pr_warn("Unsupported dynamic constraint type %d\n", i);


>
>
>

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel: Add counter group support for arch-PEBS
  2025-10-29 10:21 ` [Patch v9 12/12] perf/x86/intel: Add counter group support for arch-PEBS Dapeng Mi
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  2026-03-09 22:59   ` [Patch v9 12/12] " Ian Rogers
  1 sibling, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     bb5f13df3c455110c4468a31a5b21954268108c9
Gitweb:        https://git.kernel.org/tip/bb5f13df3c455110c4468a31a5b21954268108c9
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:36 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:22 +01:00

perf/x86/intel: Add counter group support for arch-PEBS

Based on the previous adaptive PEBS counter snapshot support, add counter
group support for architectural PEBS. Since arch-PEBS shares the same
counter group layout with adaptive PEBS, directly reuse the
__setup_pebs_counter_group() helper to process arch-PEBS counter groups.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-13-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/core.c      | 38 +++++++++++++++++++++++++++---
 arch/x86/events/intel/ds.c        | 29 ++++++++++++++++++++---
 arch/x86/include/asm/msr-index.h  |  6 +++++-
 arch/x86/include/asm/perf_event.h | 13 +++++++---
 4 files changed, 77 insertions(+), 9 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 75cba28..cb64018 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3014,6 +3014,17 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
 
 			if (pebs_data_cfg & PEBS_DATACFG_LBRS)
 				ext |= ARCH_PEBS_LBR & cap.caps;
+
+			if (pebs_data_cfg &
+			    (PEBS_DATACFG_CNTR_MASK << PEBS_DATACFG_CNTR_SHIFT))
+				ext |= ARCH_PEBS_CNTR_GP & cap.caps;
+
+			if (pebs_data_cfg &
+			    (PEBS_DATACFG_FIX_MASK << PEBS_DATACFG_FIX_SHIFT))
+				ext |= ARCH_PEBS_CNTR_FIXED & cap.caps;
+
+			if (pebs_data_cfg & PEBS_DATACFG_METRICS)
+				ext |= ARCH_PEBS_CNTR_METRICS & cap.caps;
 		}
 
 		if (cpuc->n_pebs == cpuc->n_large_pebs)
@@ -3038,6 +3049,9 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
 		}
 	}
 
+	if (is_pebs_counter_event_group(event))
+		ext |= ARCH_PEBS_CNTR_ALLOW;
+
 	if (cpuc->cfg_c_val[hwc->idx] != ext)
 		__intel_pmu_update_event_ext(hwc->idx, ext);
 }
@@ -4323,6 +4337,20 @@ static bool intel_pmu_is_acr_group(struct perf_event *event)
 	return false;
 }
 
+static inline bool intel_pmu_has_pebs_counter_group(struct pmu *pmu)
+{
+	u64 caps;
+
+	if (x86_pmu.intel_cap.pebs_format >= 6 && x86_pmu.intel_cap.pebs_baseline)
+		return true;
+
+	caps = hybrid(pmu, arch_pebs_cap).caps;
+	if (x86_pmu.arch_pebs && (caps & ARCH_PEBS_CNTR_MASK))
+		return true;
+
+	return false;
+}
+
 static inline void intel_pmu_set_acr_cntr_constr(struct perf_event *event,
 						 u64 *cause_mask, int *num)
 {
@@ -4471,8 +4499,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	}
 
 	if ((event->attr.sample_type & PERF_SAMPLE_READ) &&
-	    (x86_pmu.intel_cap.pebs_format >= 6) &&
-	    x86_pmu.intel_cap.pebs_baseline &&
+	    intel_pmu_has_pebs_counter_group(event->pmu) &&
 	    is_sampling_event(event) &&
 	    event->attr.precise_ip)
 		event->group_leader->hw.flags |= PERF_X86_EVENT_PEBS_CNTR;
@@ -5420,6 +5447,8 @@ static inline void __intel_update_large_pebs_flags(struct pmu *pmu)
 	x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME;
 	if (caps & ARCH_PEBS_LBR)
 		x86_pmu.large_pebs_flags |= PERF_SAMPLE_BRANCH_STACK;
+	if (caps & ARCH_PEBS_CNTR_MASK)
+		x86_pmu.large_pebs_flags |= PERF_SAMPLE_READ;
 
 	if (!(caps & ARCH_PEBS_AUX))
 		x86_pmu.large_pebs_flags &= ~PERF_SAMPLE_DATA_SRC;
@@ -7134,8 +7163,11 @@ __init int intel_pmu_init(void)
 	 * Many features on and after V6 require dynamic constraint,
 	 * e.g., Arch PEBS, ACR.
 	 */
-	if (version >= 6)
+	if (version >= 6) {
 		x86_pmu.flags |= PMU_FL_DYN_CONSTRAINT;
+		x86_pmu.late_setup = intel_pmu_late_setup;
+	}
+
 	/*
 	 * Install the hw-cache-events table:
 	 */
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c66e9b5..c93bf97 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1530,13 +1530,20 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc,
 
 u64 intel_get_arch_pebs_data_config(struct perf_event *event)
 {
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	u64 pebs_data_cfg = 0;
+	u64 cntr_mask;
 
 	if (WARN_ON(event->hw.idx < 0 || event->hw.idx >= X86_PMC_IDX_MAX))
 		return 0;
 
 	pebs_data_cfg |= pebs_update_adaptive_cfg(event);
 
+	cntr_mask = (PEBS_DATACFG_CNTR_MASK << PEBS_DATACFG_CNTR_SHIFT) |
+		    (PEBS_DATACFG_FIX_MASK << PEBS_DATACFG_FIX_SHIFT) |
+		    PEBS_DATACFG_CNTR | PEBS_DATACFG_METRICS;
+	pebs_data_cfg |= cpuc->pebs_data_cfg & cntr_mask;
+
 	return pebs_data_cfg;
 }
 
@@ -2444,6 +2451,24 @@ again:
 		}
 	}
 
+	if (header->cntr) {
+		struct arch_pebs_cntr_header *cntr = next_record;
+		unsigned int nr;
+
+		next_record += sizeof(struct arch_pebs_cntr_header);
+
+		if (is_pebs_counter_event_group(event)) {
+			__setup_pebs_counter_group(cpuc, event,
+				(struct pebs_cntr_header *)cntr, next_record);
+			data->sample_flags |= PERF_SAMPLE_READ;
+		}
+
+		nr = hweight32(cntr->cntr) + hweight32(cntr->fixed);
+		if (cntr->metrics == INTEL_CNTR_METRICS)
+			nr += 2;
+		next_record += nr * sizeof(u64);
+	}
+
 	/* Parse followed fragments if there are. */
 	if (arch_pebs_record_continued(header)) {
 		at = at + header->size;
@@ -3094,10 +3119,8 @@ static void __init intel_ds_pebs_init(void)
 			break;
 
 		case 6:
-			if (x86_pmu.intel_cap.pebs_baseline) {
+			if (x86_pmu.intel_cap.pebs_baseline)
 				x86_pmu.large_pebs_flags |= PERF_SAMPLE_READ;
-				x86_pmu.late_setup = intel_pmu_late_setup;
-			}
 			fallthrough;
 		case 5:
 			x86_pmu.pebs_ept = 1;
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index f1ef9ac..65cc528 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -334,12 +334,18 @@
 #define ARCH_PEBS_INDEX_WR_SHIFT	4
 
 #define ARCH_PEBS_RELOAD		0xffffffff
+#define ARCH_PEBS_CNTR_ALLOW		BIT_ULL(35)
+#define ARCH_PEBS_CNTR_GP		BIT_ULL(36)
+#define ARCH_PEBS_CNTR_FIXED		BIT_ULL(37)
+#define ARCH_PEBS_CNTR_METRICS		BIT_ULL(38)
 #define ARCH_PEBS_LBR_SHIFT		40
 #define ARCH_PEBS_LBR			(0x3ull << ARCH_PEBS_LBR_SHIFT)
 #define ARCH_PEBS_VECR_XMM		BIT_ULL(49)
 #define ARCH_PEBS_GPR			BIT_ULL(61)
 #define ARCH_PEBS_AUX			BIT_ULL(62)
 #define ARCH_PEBS_EN			BIT_ULL(63)
+#define ARCH_PEBS_CNTR_MASK		(ARCH_PEBS_CNTR_GP | ARCH_PEBS_CNTR_FIXED | \
+					 ARCH_PEBS_CNTR_METRICS)
 
 #define MSR_IA32_RTIT_CTL		0x00000570
 #define RTIT_CTL_TRACEEN		BIT(0)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 3b3848f..7276ba7 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -141,16 +141,16 @@
 #define ARCH_PERFMON_EVENTS_COUNT			7
 
 #define PEBS_DATACFG_MEMINFO	BIT_ULL(0)
-#define PEBS_DATACFG_GP	BIT_ULL(1)
+#define PEBS_DATACFG_GP		BIT_ULL(1)
 #define PEBS_DATACFG_XMMS	BIT_ULL(2)
 #define PEBS_DATACFG_LBRS	BIT_ULL(3)
-#define PEBS_DATACFG_LBR_SHIFT	24
 #define PEBS_DATACFG_CNTR	BIT_ULL(4)
+#define PEBS_DATACFG_METRICS	BIT_ULL(5)
+#define PEBS_DATACFG_LBR_SHIFT	24
 #define PEBS_DATACFG_CNTR_SHIFT	32
 #define PEBS_DATACFG_CNTR_MASK	GENMASK_ULL(15, 0)
 #define PEBS_DATACFG_FIX_SHIFT	48
 #define PEBS_DATACFG_FIX_MASK	GENMASK_ULL(7, 0)
-#define PEBS_DATACFG_METRICS	BIT_ULL(5)
 
 /* Steal the highest bit of pebs_data_cfg for SW usage */
 #define PEBS_UPDATE_DS_SW	BIT_ULL(63)
@@ -603,6 +603,13 @@ struct arch_pebs_lbr_header {
 	u64 ler_info;
 };
 
+struct arch_pebs_cntr_header {
+	u32 cntr;
+	u32 fixed;
+	u32 metrics;
+	u32 reserved;
+};
+
 /*
  * AMD Extended Performance Monitoring and Debug cpuid feature detection
  */

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  2025-10-29 10:21 ` [Patch v9 11/12] perf/x86/intel: Setup PEBS data configuration and enable legacy groups Dapeng Mi
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  2026-03-05  1:20   ` [Patch v9 11/12] " Ian Rogers
  1 sibling, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kan Liang, Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     52448a0a739002eca3d051a6ec314a0b178949a1
Gitweb:        https://git.kernel.org/tip/52448a0a739002eca3d051a6ec314a0b178949a1
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:35 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:22 +01:00

perf/x86/intel: Setup PEBS data configuration and enable legacy groups

Different from legacy PEBS, arch-PEBS provides per-counter PEBS data
configuration by programming the IA32_PMC_GPx/FXx_CFG_C MSRs.

This patch obtains the PEBS data configuration from the event attributes,
writes it to the IA32_PMC_GPx/FXx_CFG_C MSRs, and enables the
corresponding PEBS groups.

Please notice this patch only enables XMM SIMD regs sampling for
arch-PEBS; sampling of the other SIMD regs (OPMASK/YMM/ZMM) on arch-PEBS
will be supported after PMI-based SIMD regs (OPMASK/YMM/ZMM) sampling
is supported.

Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-12-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/core.c     | 136 +++++++++++++++++++++++++++++-
 arch/x86/events/intel/ds.c       |  17 ++++-
 arch/x86/events/perf_event.h     |   4 +-
 arch/x86/include/asm/intel_ds.h  |   7 ++-
 arch/x86/include/asm/msr-index.h |   8 ++-
 5 files changed, 171 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 40ccfd8..75cba28 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2563,6 +2563,45 @@ static void intel_pmu_disable_fixed(struct perf_event *event)
 	cpuc->fixed_ctrl_val &= ~mask;
 }
 
+static inline void __intel_pmu_update_event_ext(int idx, u64 ext)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	u32 msr;
+
+	if (idx < INTEL_PMC_IDX_FIXED) {
+		msr = MSR_IA32_PMC_V6_GP0_CFG_C +
+		      x86_pmu.addr_offset(idx, false);
+	} else {
+		msr = MSR_IA32_PMC_V6_FX0_CFG_C +
+		      x86_pmu.addr_offset(idx - INTEL_PMC_IDX_FIXED, false);
+	}
+
+	cpuc->cfg_c_val[idx] = ext;
+	wrmsrq(msr, ext);
+}
+
+static void intel_pmu_disable_event_ext(struct perf_event *event)
+{
+	if (!x86_pmu.arch_pebs)
+		return;
+
+	/*
+	 * Only clear CFG_C MSR for PEBS counter group events,
+	 * it avoids the HW counter's value to be added into
+	 * other PEBS records incorrectly after PEBS counter
+	 * group events are disabled.
+	 *
+	 * For other events, it's unnecessary to clear CFG_C MSRs
+	 * since CFG_C doesn't take effect if counter is in
+	 * disabled state. That helps to reduce the WRMSR overhead
+	 * in context switches.
+	 */
+	if (!is_pebs_counter_event_group(event))
+		return;
+
+	__intel_pmu_update_event_ext(event->hw.idx, 0);
+}
+
 static void intel_pmu_disable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
@@ -2571,9 +2610,12 @@ static void intel_pmu_disable_event(struct perf_event *event)
 	switch (idx) {
 	case 0 ... INTEL_PMC_IDX_FIXED - 1:
 		intel_clear_masks(event, idx);
+		intel_pmu_disable_event_ext(event);
 		x86_pmu_disable_event(event);
 		break;
 	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
+		intel_pmu_disable_event_ext(event);
+		fallthrough;
 	case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
 		intel_pmu_disable_fixed(event);
 		break;
@@ -2940,6 +2982,66 @@ static void intel_pmu_enable_acr(struct perf_event *event)
 
 DEFINE_STATIC_CALL_NULL(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
 
+static void intel_pmu_enable_event_ext(struct perf_event *event)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+	union arch_pebs_index old, new;
+	struct arch_pebs_cap cap;
+	u64 ext = 0;
+
+	if (!x86_pmu.arch_pebs)
+		return;
+
+	cap = hybrid(cpuc->pmu, arch_pebs_cap);
+
+	if (event->attr.precise_ip) {
+		u64 pebs_data_cfg = intel_get_arch_pebs_data_config(event);
+
+		ext |= ARCH_PEBS_EN;
+		if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD)
+			ext |= (-hwc->sample_period) & ARCH_PEBS_RELOAD;
+
+		if (pebs_data_cfg && cap.caps) {
+			if (pebs_data_cfg & PEBS_DATACFG_MEMINFO)
+				ext |= ARCH_PEBS_AUX & cap.caps;
+
+			if (pebs_data_cfg & PEBS_DATACFG_GP)
+				ext |= ARCH_PEBS_GPR & cap.caps;
+
+			if (pebs_data_cfg & PEBS_DATACFG_XMMS)
+				ext |= ARCH_PEBS_VECR_XMM & cap.caps;
+
+			if (pebs_data_cfg & PEBS_DATACFG_LBRS)
+				ext |= ARCH_PEBS_LBR & cap.caps;
+		}
+
+		if (cpuc->n_pebs == cpuc->n_large_pebs)
+			new.thresh = ARCH_PEBS_THRESH_MULTI;
+		else
+			new.thresh = ARCH_PEBS_THRESH_SINGLE;
+
+		rdmsrq(MSR_IA32_PEBS_INDEX, old.whole);
+		if (new.thresh != old.thresh || !old.en) {
+			if (old.thresh == ARCH_PEBS_THRESH_MULTI && old.wr > 0) {
+				/*
+				 * Large PEBS was enabled.
+				 * Drain PEBS buffer before applying the single PEBS.
+				 */
+				intel_pmu_drain_pebs_buffer();
+			} else {
+				new.wr = 0;
+				new.full = 0;
+				new.en = 1;
+				wrmsrq(MSR_IA32_PEBS_INDEX, new.whole);
+			}
+		}
+	}
+
+	if (cpuc->cfg_c_val[hwc->idx] != ext)
+		__intel_pmu_update_event_ext(hwc->idx, ext);
+}
+
 static void intel_pmu_enable_event(struct perf_event *event)
 {
 	u64 enable_mask = ARCH_PERFMON_EVENTSEL_ENABLE;
@@ -2955,10 +3057,12 @@ static void intel_pmu_enable_event(struct perf_event *event)
 			enable_mask |= ARCH_PERFMON_EVENTSEL_BR_CNTR;
 		intel_set_masks(event, idx);
 		static_call_cond(intel_pmu_enable_acr_event)(event);
+		intel_pmu_enable_event_ext(event);
 		__x86_pmu_enable_event(hwc, enable_mask);
 		break;
 	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
 		static_call_cond(intel_pmu_enable_acr_event)(event);
+		intel_pmu_enable_event_ext(event);
 		fallthrough;
 	case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
 		intel_pmu_enable_fixed(event);
@@ -5301,6 +5405,30 @@ static inline bool intel_pmu_broken_perf_cap(void)
 	return false;
 }
 
+static inline void __intel_update_pmu_caps(struct pmu *pmu)
+{
+	struct pmu *dest_pmu = pmu ? pmu : x86_get_pmu(smp_processor_id());
+
+	if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_VECR_XMM)
+		dest_pmu->capabilities |= PERF_PMU_CAP_EXTENDED_REGS;
+}
+
+static inline void __intel_update_large_pebs_flags(struct pmu *pmu)
+{
+	u64 caps = hybrid(pmu, arch_pebs_cap).caps;
+
+	x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME;
+	if (caps & ARCH_PEBS_LBR)
+		x86_pmu.large_pebs_flags |= PERF_SAMPLE_BRANCH_STACK;
+
+	if (!(caps & ARCH_PEBS_AUX))
+		x86_pmu.large_pebs_flags &= ~PERF_SAMPLE_DATA_SRC;
+	if (!(caps & ARCH_PEBS_GPR)) {
+		x86_pmu.large_pebs_flags &=
+			~(PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER);
+	}
+}
+
 #define counter_mask(_gp, _fixed) ((_gp) | ((u64)(_fixed) << INTEL_PMC_IDX_FIXED))
 
 static void update_pmu_cap(struct pmu *pmu)
@@ -5349,8 +5477,12 @@ static void update_pmu_cap(struct pmu *pmu)
 		hybrid(pmu, arch_pebs_cap).counters = pebs_mask;
 		hybrid(pmu, arch_pebs_cap).pdists = pdists_mask;
 
-		if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask))
+		if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask)) {
 			x86_pmu.arch_pebs = 0;
+		} else {
+			__intel_update_pmu_caps(pmu);
+			__intel_update_large_pebs_flags(pmu);
+		}
 	} else {
 		WARN_ON(x86_pmu.arch_pebs == 1);
 		x86_pmu.arch_pebs = 0;
@@ -5514,6 +5646,8 @@ static void intel_pmu_cpu_starting(int cpu)
 		}
 	}
 
+	__intel_update_pmu_caps(cpuc->pmu);
+
 	if (!cpuc->shared_regs)
 		return;
 
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 1179980..c66e9b5 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1528,6 +1528,18 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc,
 	}
 }
 
+u64 intel_get_arch_pebs_data_config(struct perf_event *event)
+{
+	u64 pebs_data_cfg = 0;
+
+	if (WARN_ON(event->hw.idx < 0 || event->hw.idx >= X86_PMC_IDX_MAX))
+		return 0;
+
+	pebs_data_cfg |= pebs_update_adaptive_cfg(event);
+
+	return pebs_data_cfg;
+}
+
 void intel_pmu_pebs_add(struct perf_event *event)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -2947,6 +2959,11 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
 
 	index.wr = 0;
 	index.full = 0;
+	index.en = 1;
+	if (cpuc->n_pebs == cpuc->n_large_pebs)
+		index.thresh = ARCH_PEBS_THRESH_MULTI;
+	else
+		index.thresh = ARCH_PEBS_THRESH_SINGLE;
 	wrmsrq(MSR_IA32_PEBS_INDEX, index.whole);
 
 	mask = hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 13f411b..3161ec0 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -304,6 +304,8 @@ struct cpu_hw_events {
 	/* Intel ACR configuration */
 	u64			acr_cfg_b[X86_PMC_IDX_MAX];
 	u64			acr_cfg_c[X86_PMC_IDX_MAX];
+	/* Cached CFG_C values */
+	u64			cfg_c_val[X86_PMC_IDX_MAX];
 
 	/*
 	 * Intel LBR bits
@@ -1782,6 +1784,8 @@ void intel_pmu_pebs_data_source_cmt(void);
 
 void intel_pmu_pebs_data_source_lnl(void);
 
+u64 intel_get_arch_pebs_data_config(struct perf_event *event);
+
 int intel_pmu_setup_lbr_filter(struct perf_event *event);
 
 void intel_pt_interrupt(void);
diff --git a/arch/x86/include/asm/intel_ds.h b/arch/x86/include/asm/intel_ds.h
index 023c288..695f87e 100644
--- a/arch/x86/include/asm/intel_ds.h
+++ b/arch/x86/include/asm/intel_ds.h
@@ -7,6 +7,13 @@
 #define PEBS_BUFFER_SHIFT	4
 #define PEBS_BUFFER_SIZE	(PAGE_SIZE << PEBS_BUFFER_SHIFT)
 
+/*
+ * The largest PEBS record could consume a page, ensure
+ * a record at least can be written after triggering PMI.
+ */
+#define ARCH_PEBS_THRESH_MULTI	((PEBS_BUFFER_SIZE - PAGE_SIZE) >> PEBS_BUFFER_SHIFT)
+#define ARCH_PEBS_THRESH_SINGLE	1
+
 /* The maximal number of PEBS events: */
 #define MAX_PEBS_EVENTS_FMT4	8
 #define MAX_PEBS_EVENTS		32
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index fc7a4e7..f1ef9ac 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -333,6 +333,14 @@
 #define ARCH_PEBS_OFFSET_MASK		0x7fffff
 #define ARCH_PEBS_INDEX_WR_SHIFT	4
 
+#define ARCH_PEBS_RELOAD		0xffffffff
+#define ARCH_PEBS_LBR_SHIFT		40
+#define ARCH_PEBS_LBR			(0x3ull << ARCH_PEBS_LBR_SHIFT)
+#define ARCH_PEBS_VECR_XMM		BIT_ULL(49)
+#define ARCH_PEBS_GPR			BIT_ULL(61)
+#define ARCH_PEBS_AUX			BIT_ULL(62)
+#define ARCH_PEBS_EN			BIT_ULL(63)
+
 #define MSR_IA32_RTIT_CTL		0x00000570
 #define RTIT_CTL_TRACEEN		BIT(0)
 #define RTIT_CTL_CYCLEACC		BIT(1)

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel: Update dyn_constraint base on PEBS event precise level
  2025-10-29 10:21 ` [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level Dapeng Mi
  2025-11-06 14:52   ` Peter Zijlstra
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  1 sibling, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     e89c5d1f290e8915e0aad10014f2241086ea95e4
Gitweb:        https://git.kernel.org/tip/e89c5d1f290e8915e0aad10014f2241086ea95e4
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:34 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:22 +01:00

perf/x86/intel: Update dyn_constraint base on PEBS event precise level

arch-PEBS provides CPUID leaves to enumerate which counters support PEBS
sampling and precise-distribution PEBS sampling. Thus PEBS constraints
should be configured dynamically based on these counter and precise
distribution bitmaps instead of being defined statically.

Update the event's dyn_constraint based on the PEBS event's precise level.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-11-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/core.c | 11 +++++++++++
 arch/x86/events/intel/ds.c   |  1 +
 2 files changed, 12 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 6e04d73..40ccfd8 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4252,6 +4252,8 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	}
 
 	if (event->attr.precise_ip) {
+		struct arch_pebs_cap pebs_cap = hybrid(event->pmu, arch_pebs_cap);
+
 		if ((event->attr.config & INTEL_ARCH_EVENT_MASK) == INTEL_FIXED_VLBR_EVENT)
 			return -EINVAL;
 
@@ -4265,6 +4267,15 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		}
 		if (x86_pmu.pebs_aliases)
 			x86_pmu.pebs_aliases(event);
+
+		if (x86_pmu.arch_pebs) {
+			u64 cntr_mask = hybrid(event->pmu, intel_ctrl) &
+						~GLOBAL_CTRL_EN_PERF_METRICS;
+			u64 pebs_mask = event->attr.precise_ip >= 3 ?
+						pebs_cap.pdists : pebs_cap.counters;
+			if (cntr_mask != pebs_mask)
+				event->hw.dyn_constraint &= pebs_mask;
+		}
 	}
 
 	if (needs_branch_stack(event)) {
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 5c26a52..1179980 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -3005,6 +3005,7 @@ static void __init intel_arch_pebs_init(void)
 	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
 	x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
 	x86_pmu.pebs_capable = ~0ULL;
+	x86_pmu.flags |= PMU_FL_PEBS_ALL;
 
 	x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
 	x86_pmu.pebs_disable = __intel_pmu_pebs_disable;

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  2025-10-29 10:21 ` [Patch v9 09/12] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR Dapeng Mi
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  2025-11-12 10:03   ` [tip: perf/core] perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use tip-bot2 for Ingo Molnar
  2025-11-12 11:18   ` tip-bot2 for Ingo Molnar
  2 siblings, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kan Liang, Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     2721e8da2de7271533ac36285332219f700d16ca
Gitweb:        https://git.kernel.org/tip/2721e8da2de7271533ac36285332219f700d16ca
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:33 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:22 +01:00

perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR

Arch-PEBS introduces a new MSR, IA32_PEBS_BASE, to store the arch-PEBS
buffer's physical address. This patch allocates the arch-PEBS buffer and
then initializes the IA32_PEBS_BASE MSR with the buffer's physical address.

Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-10-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/core.c    | 11 +++-
 arch/x86/events/intel/ds.c      | 82 +++++++++++++++++++++++++++-----
 arch/x86/events/perf_event.h    | 11 +++-
 arch/x86/include/asm/intel_ds.h |  3 +-
 4 files changed, 92 insertions(+), 15 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index de4dbde..6e04d73 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5227,7 +5227,13 @@ err:
 
 static int intel_pmu_cpu_prepare(int cpu)
 {
-	return intel_cpuc_prepare(&per_cpu(cpu_hw_events, cpu), cpu);
+	int ret;
+
+	ret = intel_cpuc_prepare(&per_cpu(cpu_hw_events, cpu), cpu);
+	if (ret)
+		return ret;
+
+	return alloc_arch_pebs_buf_on_cpu(cpu);
 }
 
 static void flip_smm_bit(void *data)
@@ -5458,6 +5464,7 @@ static void intel_pmu_cpu_starting(int cpu)
 		return;
 
 	init_debug_store_on_cpu(cpu);
+	init_arch_pebs_on_cpu(cpu);
 	/*
 	 * Deal with CPUs that don't clear their LBRs on power-up, and that may
 	 * even boot with LBRs enabled.
@@ -5555,6 +5562,7 @@ static void free_excl_cntrs(struct cpu_hw_events *cpuc)
 static void intel_pmu_cpu_dying(int cpu)
 {
 	fini_debug_store_on_cpu(cpu);
+	fini_arch_pebs_on_cpu(cpu);
 }
 
 void intel_cpuc_finish(struct cpu_hw_events *cpuc)
@@ -5575,6 +5583,7 @@ static void intel_pmu_cpu_dead(int cpu)
 {
 	struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
 
+	release_arch_pebs_buf_on_cpu(cpu);
 	intel_cpuc_finish(cpuc);
 
 	if (is_hybrid() && cpuc->pmu)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index fe1bf37..5c26a52 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -625,13 +625,18 @@ static int alloc_pebs_buffer(int cpu)
 	int max, node = cpu_to_node(cpu);
 	void *buffer, *insn_buff, *cea;
 
-	if (!x86_pmu.ds_pebs)
+	if (!intel_pmu_has_pebs())
 		return 0;
 
 	buffer = dsalloc_pages(bsiz, GFP_KERNEL, cpu);
 	if (unlikely(!buffer))
 		return -ENOMEM;
 
+	if (x86_pmu.arch_pebs) {
+		hwev->pebs_vaddr = buffer;
+		return 0;
+	}
+
 	/*
 	 * HSW+ already provides us the eventing ip; no need to allocate this
 	 * buffer then.
@@ -644,7 +649,7 @@ static int alloc_pebs_buffer(int cpu)
 		}
 		per_cpu(insn_buffer, cpu) = insn_buff;
 	}
-	hwev->ds_pebs_vaddr = buffer;
+	hwev->pebs_vaddr = buffer;
 	/* Update the cpu entry area mapping */
 	cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer;
 	ds->pebs_buffer_base = (unsigned long) cea;
@@ -660,17 +665,20 @@ static void release_pebs_buffer(int cpu)
 	struct cpu_hw_events *hwev = per_cpu_ptr(&cpu_hw_events, cpu);
 	void *cea;
 
-	if (!x86_pmu.ds_pebs)
+	if (!intel_pmu_has_pebs())
 		return;
 
-	kfree(per_cpu(insn_buffer, cpu));
-	per_cpu(insn_buffer, cpu) = NULL;
+	if (x86_pmu.ds_pebs) {
+		kfree(per_cpu(insn_buffer, cpu));
+		per_cpu(insn_buffer, cpu) = NULL;
 
-	/* Clear the fixmap */
-	cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer;
-	ds_clear_cea(cea, x86_pmu.pebs_buffer_size);
-	dsfree_pages(hwev->ds_pebs_vaddr, x86_pmu.pebs_buffer_size);
-	hwev->ds_pebs_vaddr = NULL;
+		/* Clear the fixmap */
+		cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer;
+		ds_clear_cea(cea, x86_pmu.pebs_buffer_size);
+	}
+
+	dsfree_pages(hwev->pebs_vaddr, x86_pmu.pebs_buffer_size);
+	hwev->pebs_vaddr = NULL;
 }
 
 static int alloc_bts_buffer(int cpu)
@@ -823,6 +831,56 @@ void reserve_ds_buffers(void)
 	}
 }
 
+inline int alloc_arch_pebs_buf_on_cpu(int cpu)
+{
+	if (!x86_pmu.arch_pebs)
+		return 0;
+
+	return alloc_pebs_buffer(cpu);
+}
+
+inline void release_arch_pebs_buf_on_cpu(int cpu)
+{
+	if (!x86_pmu.arch_pebs)
+		return;
+
+	release_pebs_buffer(cpu);
+}
+
+void init_arch_pebs_on_cpu(int cpu)
+{
+	struct cpu_hw_events *cpuc = per_cpu_ptr(&cpu_hw_events, cpu);
+	u64 arch_pebs_base;
+
+	if (!x86_pmu.arch_pebs)
+		return;
+
+	if (!cpuc->pebs_vaddr) {
+		WARN(1, "Fail to allocate PEBS buffer on CPU %d\n", cpu);
+		x86_pmu.pebs_active = 0;
+		return;
+	}
+
+	/*
+	 * 4KB-aligned pointer of the output buffer
+	 * (__alloc_pages_node() return page aligned address)
+	 * Buffer Size = 4KB * 2^SIZE
+	 * contiguous physical buffer (__alloc_pages_node() with order)
+	 */
+	arch_pebs_base = virt_to_phys(cpuc->pebs_vaddr) | PEBS_BUFFER_SHIFT;
+	wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, (u32)arch_pebs_base,
+		     (u32)(arch_pebs_base >> 32));
+	x86_pmu.pebs_active = 1;
+}
+
+inline void fini_arch_pebs_on_cpu(int cpu)
+{
+	if (!x86_pmu.arch_pebs)
+		return;
+
+	wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, 0, 0);
+}
+
 /*
  * BTS
  */
@@ -2883,8 +2941,8 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
 		return;
 	}
 
-	base = cpuc->ds_pebs_vaddr;
-	top = (void *)((u64)cpuc->ds_pebs_vaddr +
+	base = cpuc->pebs_vaddr;
+	top = (void *)((u64)cpuc->pebs_vaddr +
 		       (index.wr << ARCH_PEBS_INDEX_WR_SHIFT));
 
 	index.wr = 0;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index ca52899..13f411b 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -283,8 +283,9 @@ struct cpu_hw_events {
 	 * Intel DebugStore bits
 	 */
 	struct debug_store	*ds;
-	void			*ds_pebs_vaddr;
 	void			*ds_bts_vaddr;
+	/* DS based PEBS or arch-PEBS buffer address */
+	void			*pebs_vaddr;
 	u64			pebs_enabled;
 	int			n_pebs;
 	int			n_large_pebs;
@@ -1617,6 +1618,14 @@ extern void intel_cpuc_finish(struct cpu_hw_events *cpuc);
 
 int intel_pmu_init(void);
 
+int alloc_arch_pebs_buf_on_cpu(int cpu);
+
+void release_arch_pebs_buf_on_cpu(int cpu);
+
+void init_arch_pebs_on_cpu(int cpu);
+
+void fini_arch_pebs_on_cpu(int cpu);
+
 void init_debug_store_on_cpu(int cpu);
 
 void fini_debug_store_on_cpu(int cpu);
diff --git a/arch/x86/include/asm/intel_ds.h b/arch/x86/include/asm/intel_ds.h
index 5dbeac4..023c288 100644
--- a/arch/x86/include/asm/intel_ds.h
+++ b/arch/x86/include/asm/intel_ds.h
@@ -4,7 +4,8 @@
 #include <linux/percpu-defs.h>
 
 #define BTS_BUFFER_SIZE		(PAGE_SIZE << 4)
-#define PEBS_BUFFER_SIZE	(PAGE_SIZE << 4)
+#define PEBS_BUFFER_SHIFT	4
+#define PEBS_BUFFER_SIZE	(PAGE_SIZE << PEBS_BUFFER_SHIFT)
 
 /* The maximal number of PEBS events: */
 #define MAX_PEBS_EVENTS_FMT4	8

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel: Process arch-PEBS records or record fragments
  2025-10-29 10:21 ` [Patch v9 08/12] perf/x86/intel: Process arch-PEBS records or record fragments Dapeng Mi
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  2026-03-03  0:20   ` [Patch v9 08/12] " Chun-Tse Shao
  1 sibling, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     d21954c8a0ffbc94ffdd65106fb6da5b59042e0a
Gitweb:        https://git.kernel.org/tip/d21954c8a0ffbc94ffdd65106fb6da5b59042e0a
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:32 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:21 +01:00

perf/x86/intel: Process arch-PEBS records or record fragments

A significant difference from adaptive PEBS is that an arch-PEBS record
supports fragments: a record can be split into several independent
fragments, each carrying its own arch-PEBS header.

This patch defines the architectural PEBS record layout structures and
adds helpers to process arch-PEBS records or fragments. Only the legacy
PEBS groups (basic, GPR, XMM and LBR) are supported in this patch;
capturing the newly added YMM/ZMM/OPMASK vector registers will be
supported in the future.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-9-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/core.c      |  13 ++-
 arch/x86/events/intel/ds.c        | 184 +++++++++++++++++++++++++++++-
 arch/x86/include/asm/msr-index.h  |   6 +-
 arch/x86/include/asm/perf_event.h |  96 +++++++++++++++-
 4 files changed, 299 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 9ce27b3..de4dbde 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3216,6 +3216,19 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 	}
 
 	/*
+	 * Arch PEBS sets bit 54 in the global status register
+	 */
+	if (__test_and_clear_bit(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT,
+				 (unsigned long *)&status)) {
+		handled++;
+		static_call(x86_pmu_drain_pebs)(regs, &data);
+
+		if (cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS] &&
+		    is_pebs_counter_event_group(cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS]))
+			status &= ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT;
+	}
+
+	/*
 	 * Intel PT
 	 */
 	if (__test_and_clear_bit(GLOBAL_STATUS_TRACE_TOPAPMI_BIT, (unsigned long *)&status)) {
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 6866452..fe1bf37 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2270,6 +2270,117 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 			format_group);
 }
 
+static inline bool arch_pebs_record_continued(struct arch_pebs_header *header)
+{
+	/* Continue bit or null PEBS record indicates fragment follows. */
+	return header->cont || !(header->format & GENMASK_ULL(63, 16));
+}
+
+static void setup_arch_pebs_sample_data(struct perf_event *event,
+					struct pt_regs *iregs,
+					void *__pebs,
+					struct perf_sample_data *data,
+					struct pt_regs *regs)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	u64 sample_type = event->attr.sample_type;
+	struct arch_pebs_header *header = NULL;
+	struct arch_pebs_aux *meminfo = NULL;
+	struct arch_pebs_gprs *gprs = NULL;
+	struct x86_perf_regs *perf_regs;
+	void *next_record;
+	void *at = __pebs;
+
+	if (at == NULL)
+		return;
+
+	perf_regs = container_of(regs, struct x86_perf_regs, regs);
+	perf_regs->xmm_regs = NULL;
+
+	__setup_perf_sample_data(event, iregs, data);
+
+	*regs = *iregs;
+
+again:
+	header = at;
+	next_record = at + sizeof(struct arch_pebs_header);
+	if (header->basic) {
+		struct arch_pebs_basic *basic = next_record;
+		u16 retire = 0;
+
+		next_record = basic + 1;
+
+		if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT)
+			retire = basic->valid ? basic->retire : 0;
+		__setup_pebs_basic_group(event, regs, data, sample_type,
+				 basic->ip, basic->tsc, retire);
+	}
+
+	/*
+	 * The record for MEMINFO is in front of GP
+	 * But PERF_SAMPLE_TRANSACTION needs gprs->ax.
+	 * Save the pointer here but process later.
+	 */
+	if (header->aux) {
+		meminfo = next_record;
+		next_record = meminfo + 1;
+	}
+
+	if (header->gpr) {
+		gprs = next_record;
+		next_record = gprs + 1;
+
+		__setup_pebs_gpr_group(event, regs,
+				       (struct pebs_gprs *)gprs,
+				       sample_type);
+	}
+
+	if (header->aux) {
+		u64 ax = gprs ? gprs->ax : 0;
+
+		__setup_pebs_meminfo_group(event, data, sample_type,
+					   meminfo->cache_latency,
+					   meminfo->instr_latency,
+					   meminfo->address, meminfo->aux,
+					   meminfo->tsx_tuning, ax);
+	}
+
+	if (header->xmm) {
+		struct pebs_xmm *xmm;
+
+		next_record += sizeof(struct arch_pebs_xer_header);
+
+		xmm = next_record;
+		perf_regs->xmm_regs = xmm->xmm;
+		next_record = xmm + 1;
+	}
+
+	if (header->lbr) {
+		struct arch_pebs_lbr_header *lbr_header = next_record;
+		struct lbr_entry *lbr;
+		int num_lbr;
+
+		next_record = lbr_header + 1;
+		lbr = next_record;
+
+		num_lbr = header->lbr == ARCH_PEBS_LBR_NUM_VAR ?
+				lbr_header->depth :
+				header->lbr * ARCH_PEBS_BASE_LBR_ENTRIES;
+		next_record += num_lbr * sizeof(struct lbr_entry);
+
+		if (has_branch_stack(event)) {
+			intel_pmu_store_pebs_lbrs(lbr);
+			intel_pmu_lbr_save_brstack(data, cpuc, event);
+		}
+	}
+
+	/* Parse followed fragments if there are. */
+	if (arch_pebs_record_continued(header)) {
+		at = at + header->size;
+		goto again;
+	}
+}
+
 static inline void *
 get_next_pebs_record_by_bit(void *base, void *top, int bit)
 {
@@ -2753,6 +2864,78 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 					    setup_pebs_adaptive_sample_data);
 }
 
+static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
+				      struct perf_sample_data *data)
+{
+	short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
+	void *last[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS];
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	union arch_pebs_index index;
+	struct x86_perf_regs perf_regs;
+	struct pt_regs *regs = &perf_regs.regs;
+	void *base, *at, *top;
+	u64 mask;
+
+	rdmsrq(MSR_IA32_PEBS_INDEX, index.whole);
+
+	if (unlikely(!index.wr)) {
+		intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX);
+		return;
+	}
+
+	base = cpuc->ds_pebs_vaddr;
+	top = (void *)((u64)cpuc->ds_pebs_vaddr +
+		       (index.wr << ARCH_PEBS_INDEX_WR_SHIFT));
+
+	index.wr = 0;
+	index.full = 0;
+	wrmsrq(MSR_IA32_PEBS_INDEX, index.whole);
+
+	mask = hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled;
+
+	if (!iregs)
+		iregs = &dummy_iregs;
+
+	/* Process all but the last event for each counter. */
+	for (at = base; at < top;) {
+		struct arch_pebs_header *header;
+		struct arch_pebs_basic *basic;
+		u64 pebs_status;
+
+		header = at;
+
+		if (WARN_ON_ONCE(!header->size))
+			break;
+
+		/* 1st fragment or single record must have basic group */
+		if (!header->basic) {
+			at += header->size;
+			continue;
+		}
+
+		basic = at + sizeof(struct arch_pebs_header);
+		pebs_status = mask & basic->applicable_counters;
+		__intel_pmu_handle_pebs_record(iregs, regs, data, at,
+					       pebs_status, counts, last,
+					       setup_arch_pebs_sample_data);
+
+		/* Skip non-last fragments */
+		while (arch_pebs_record_continued(header)) {
+			if (!header->size)
+				break;
+			at += header->size;
+			header = at;
+		}
+
+		/* Skip last fragment or the single record */
+		at += header->size;
+	}
+
+	__intel_pmu_handle_last_pebs_record(iregs, regs, data, mask,
+					    counts, last,
+					    setup_arch_pebs_sample_data);
+}
+
 static void __init intel_arch_pebs_init(void)
 {
 	/*
@@ -2762,6 +2945,7 @@ static void __init intel_arch_pebs_init(void)
 	 */
 	x86_pmu.arch_pebs = 1;
 	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
+	x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
 	x86_pmu.pebs_capable = ~0ULL;
 
 	x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 9e1720d..fc7a4e7 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -327,6 +327,12 @@
 					 PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
 					 PERF_CAP_PEBS_TIMING_INFO)
 
+/* Arch PEBS */
+#define MSR_IA32_PEBS_BASE		0x000003f4
+#define MSR_IA32_PEBS_INDEX		0x000003f5
+#define ARCH_PEBS_OFFSET_MASK		0x7fffff
+#define ARCH_PEBS_INDEX_WR_SHIFT	4
+
 #define MSR_IA32_RTIT_CTL		0x00000570
 #define RTIT_CTL_TRACEEN		BIT(0)
 #define RTIT_CTL_CYCLEACC		BIT(1)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 0dfa067..3b3848f 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -437,6 +437,8 @@ static inline bool is_topdown_idx(int idx)
 #define GLOBAL_STATUS_LBRS_FROZEN		BIT_ULL(GLOBAL_STATUS_LBRS_FROZEN_BIT)
 #define GLOBAL_STATUS_TRACE_TOPAPMI_BIT		55
 #define GLOBAL_STATUS_TRACE_TOPAPMI		BIT_ULL(GLOBAL_STATUS_TRACE_TOPAPMI_BIT)
+#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT	54
+#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD	BIT_ULL(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT)
 #define GLOBAL_STATUS_PERF_METRICS_OVF_BIT	48
 
 #define GLOBAL_CTRL_EN_PERF_METRICS		BIT_ULL(48)
@@ -508,6 +510,100 @@ struct pebs_cntr_header {
 #define INTEL_CNTR_METRICS		0x3
 
 /*
+ * Arch PEBS
+ */
+union arch_pebs_index {
+	struct {
+		u64 rsvd:4,
+		    wr:23,
+		    rsvd2:4,
+		    full:1,
+		    en:1,
+		    rsvd3:3,
+		    thresh:23,
+		    rsvd4:5;
+	};
+	u64 whole;
+};
+
+struct arch_pebs_header {
+	union {
+		u64 format;
+		struct {
+			u64 size:16,	/* Record size */
+			    rsvd:14,
+			    mode:1,	/* 64BIT_MODE */
+			    cont:1,
+			    rsvd2:3,
+			    cntr:5,
+			    lbr:2,
+			    rsvd3:7,
+			    xmm:1,
+			    ymmh:1,
+			    rsvd4:2,
+			    opmask:1,
+			    zmmh:1,
+			    h16zmm:1,
+			    rsvd5:5,
+			    gpr:1,
+			    aux:1,
+			    basic:1;
+		};
+	};
+	u64 rsvd6;
+};
+
+struct arch_pebs_basic {
+	u64 ip;
+	u64 applicable_counters;
+	u64 tsc;
+	u64 retire	:16,	/* Retire Latency */
+	    valid	:1,
+	    rsvd	:47;
+	u64 rsvd2;
+	u64 rsvd3;
+};
+
+struct arch_pebs_aux {
+	u64 address;
+	u64 rsvd;
+	u64 rsvd2;
+	u64 rsvd3;
+	u64 rsvd4;
+	u64 aux;
+	u64 instr_latency	:16,
+	    pad2		:16,
+	    cache_latency	:16,
+	    pad3		:16;
+	u64 tsx_tuning;
+};
+
+struct arch_pebs_gprs {
+	u64 flags, ip, ax, cx, dx, bx, sp, bp, si, di;
+	u64 r8, r9, r10, r11, r12, r13, r14, r15, ssp;
+	u64 rsvd;
+};
+
+struct arch_pebs_xer_header {
+	u64 xstate;
+	u64 rsvd;
+};
+
+#define ARCH_PEBS_LBR_NAN		0x0
+#define ARCH_PEBS_LBR_NUM_8		0x1
+#define ARCH_PEBS_LBR_NUM_16		0x2
+#define ARCH_PEBS_LBR_NUM_VAR		0x3
+#define ARCH_PEBS_BASE_LBR_ENTRIES	8
+struct arch_pebs_lbr_header {
+	u64 rsvd;
+	u64 ctl;
+	u64 depth;
+	u64 ler_from;
+	u64 ler_to;
+	u64 ler_info;
+};
+
+/*
  * AMD Extended Performance Monitoring and Debug cpuid feature detection
  */
 #define EXT_PERFMON_DEBUG_FEATURES		0x80000022

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel/ds: Factor out PEBS group processing code to functions
  2025-10-29 10:21 ` [Patch v9 07/12] perf/x86/intel/ds: Factor out PEBS group " Dapeng Mi
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  0 siblings, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     167cde7dc9b36b7a88f3c29d836fabce13023327
Gitweb:        https://git.kernel.org/tip/167cde7dc9b36b7a88f3c29d836fabce13023327
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:31 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:21 +01:00

perf/x86/intel/ds: Factor out PEBS group processing code to functions

Adaptive PEBS and arch-PEBS share a lot of code to process the PEBS
groups, such as the basic, GPR and meminfo groups. Extract this shared
code into generic functions to avoid duplication.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-8-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/ds.c | 170 ++++++++++++++++++++++--------------
 1 file changed, 104 insertions(+), 66 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c8aa72d..6866452 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2072,6 +2072,90 @@ static inline void __setup_pebs_counter_group(struct cpu_hw_events *cpuc,
 
 #define PEBS_LATENCY_MASK			0xffff
 
+static inline void __setup_perf_sample_data(struct perf_event *event,
+					    struct pt_regs *iregs,
+					    struct perf_sample_data *data)
+{
+	perf_sample_data_init(data, 0, event->hw.last_period);
+
+	/*
+	 * We must however always use iregs for the unwinder to stay sane; the
+	 * record BP,SP,IP can point into thin air when the record is from a
+	 * previous PMI context or an (I)RET happened between the record and
+	 * PMI.
+	 */
+	perf_sample_save_callchain(data, event, iregs);
+}
+
+static inline void __setup_pebs_basic_group(struct perf_event *event,
+					    struct pt_regs *regs,
+					    struct perf_sample_data *data,
+					    u64 sample_type, u64 ip,
+					    u64 tsc, u16 retire)
+{
+	/* The ip in basic is EventingIP */
+	set_linear_ip(regs, ip);
+	regs->flags = PERF_EFLAGS_EXACT;
+	setup_pebs_time(event, data, tsc);
+
+	if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT)
+		data->weight.var3_w = retire;
+}
+
+static inline void __setup_pebs_gpr_group(struct perf_event *event,
+					  struct pt_regs *regs,
+					  struct pebs_gprs *gprs,
+					  u64 sample_type)
+{
+	if (event->attr.precise_ip < 2) {
+		set_linear_ip(regs, gprs->ip);
+		regs->flags &= ~PERF_EFLAGS_EXACT;
+	}
+
+	if (sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER))
+		adaptive_pebs_save_regs(regs, gprs);
+}
+
+static inline void __setup_pebs_meminfo_group(struct perf_event *event,
+					      struct perf_sample_data *data,
+					      u64 sample_type, u64 latency,
+					      u16 instr_latency, u64 address,
+					      u64 aux, u64 tsx_tuning, u64 ax)
+{
+	if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
+		u64 tsx_latency = intel_get_tsx_weight(tsx_tuning);
+
+		data->weight.var2_w = instr_latency;
+
+		/*
+		 * Although meminfo::latency is defined as a u64,
+		 * only the lower 32 bits include the valid data
+		 * in practice on Ice Lake and earlier platforms.
+		 */
+		if (sample_type & PERF_SAMPLE_WEIGHT)
+			data->weight.full = latency ?: tsx_latency;
+		else
+			data->weight.var1_dw = (u32)latency ?: tsx_latency;
+
+		data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+	}
+
+	if (sample_type & PERF_SAMPLE_DATA_SRC) {
+		data->data_src.val = get_data_src(event, aux);
+		data->sample_flags |= PERF_SAMPLE_DATA_SRC;
+	}
+
+	if (sample_type & PERF_SAMPLE_ADDR_TYPE) {
+		data->addr = address;
+		data->sample_flags |= PERF_SAMPLE_ADDR;
+	}
+
+	if (sample_type & PERF_SAMPLE_TRANSACTION) {
+		data->txn = intel_get_tsx_transaction(tsx_tuning, ax);
+		data->sample_flags |= PERF_SAMPLE_TRANSACTION;
+	}
+}
+
 /*
  * With adaptive PEBS the layout depends on what fields are configured.
  */
@@ -2081,12 +2165,14 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 					    struct pt_regs *regs)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	u64 sample_type = event->attr.sample_type;
 	struct pebs_basic *basic = __pebs;
 	void *next_record = basic + 1;
-	u64 sample_type, format_group;
 	struct pebs_meminfo *meminfo = NULL;
 	struct pebs_gprs *gprs = NULL;
 	struct x86_perf_regs *perf_regs;
+	u64 format_group;
+	u16 retire;
 
 	if (basic == NULL)
 		return;
@@ -2094,31 +2180,17 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 	perf_regs = container_of(regs, struct x86_perf_regs, regs);
 	perf_regs->xmm_regs = NULL;
 
-	sample_type = event->attr.sample_type;
 	format_group = basic->format_group;
-	perf_sample_data_init(data, 0, event->hw.last_period);
 
-	setup_pebs_time(event, data, basic->tsc);
-
-	/*
-	 * We must however always use iregs for the unwinder to stay sane; the
-	 * record BP,SP,IP can point into thin air when the record is from a
-	 * previous PMI context or an (I)RET happened between the record and
-	 * PMI.
-	 */
-	perf_sample_save_callchain(data, event, iregs);
+	__setup_perf_sample_data(event, iregs, data);
 
 	*regs = *iregs;
-	/* The ip in basic is EventingIP */
-	set_linear_ip(regs, basic->ip);
-	regs->flags = PERF_EFLAGS_EXACT;
 
-	if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) {
-		if (x86_pmu.flags & PMU_FL_RETIRE_LATENCY)
-			data->weight.var3_w = basic->retire_latency;
-		else
-			data->weight.var3_w = 0;
-	}
+	/* basic group */
+	retire = x86_pmu.flags & PMU_FL_RETIRE_LATENCY ?
+			basic->retire_latency : 0;
+	__setup_pebs_basic_group(event, regs, data, sample_type,
+				 basic->ip, basic->tsc, retire);
 
 	/*
 	 * The record for MEMINFO is in front of GP
@@ -2134,54 +2206,20 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 		gprs = next_record;
 		next_record = gprs + 1;
 
-		if (event->attr.precise_ip < 2) {
-			set_linear_ip(regs, gprs->ip);
-			regs->flags &= ~PERF_EFLAGS_EXACT;
-		}
-
-		if (sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER))
-			adaptive_pebs_save_regs(regs, gprs);
+		__setup_pebs_gpr_group(event, regs, gprs, sample_type);
 	}
 
 	if (format_group & PEBS_DATACFG_MEMINFO) {
-		if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
-			u64 latency = x86_pmu.flags & PMU_FL_INSTR_LATENCY ?
-					meminfo->cache_latency : meminfo->mem_latency;
-
-			if (x86_pmu.flags & PMU_FL_INSTR_LATENCY)
-				data->weight.var2_w = meminfo->instr_latency;
-
-			/*
-			 * Although meminfo::latency is defined as a u64,
-			 * only the lower 32 bits include the valid data
-			 * in practice on Ice Lake and earlier platforms.
-			 */
-			if (sample_type & PERF_SAMPLE_WEIGHT) {
-				data->weight.full = latency ?:
-					intel_get_tsx_weight(meminfo->tsx_tuning);
-			} else {
-				data->weight.var1_dw = (u32)latency ?:
-					intel_get_tsx_weight(meminfo->tsx_tuning);
-			}
-
-			data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
-		}
-
-		if (sample_type & PERF_SAMPLE_DATA_SRC) {
-			data->data_src.val = get_data_src(event, meminfo->aux);
-			data->sample_flags |= PERF_SAMPLE_DATA_SRC;
-		}
-
-		if (sample_type & PERF_SAMPLE_ADDR_TYPE) {
-			data->addr = meminfo->address;
-			data->sample_flags |= PERF_SAMPLE_ADDR;
-		}
-
-		if (sample_type & PERF_SAMPLE_TRANSACTION) {
-			data->txn = intel_get_tsx_transaction(meminfo->tsx_tuning,
-							  gprs ? gprs->ax : 0);
-			data->sample_flags |= PERF_SAMPLE_TRANSACTION;
-		}
+		u64 latency = x86_pmu.flags & PMU_FL_INSTR_LATENCY ?
+				meminfo->cache_latency : meminfo->mem_latency;
+		u64 instr_latency = x86_pmu.flags & PMU_FL_INSTR_LATENCY ?
+				meminfo->instr_latency : 0;
+		u64 ax = gprs ? gprs->ax : 0;
+
+		__setup_pebs_meminfo_group(event, data, sample_type, latency,
+					   instr_latency, meminfo->address,
+					   meminfo->aux, meminfo->tsx_tuning,
+					   ax);
 	}
 
 	if (format_group & PEBS_DATACFG_XMMS) {

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel/ds: Factor out PEBS record processing code to functions
  2025-10-29 10:21 ` [Patch v9 06/12] perf/x86/intel/ds: Factor out PEBS record processing code to functions Dapeng Mi
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  0 siblings, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kan Liang, Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     8807d922705f0a137d8de5f636b50e7b4fbef155
Gitweb:        https://git.kernel.org/tip/8807d922705f0a137d8de5f636b50e7b4fbef155
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:30 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:21 +01:00

perf/x86/intel/ds: Factor out PEBS record processing code to functions

Besides some PEBS record layout differences, arch-PEBS can share most of
the PEBS record processing code with adaptive PEBS. Thus, factor out this
common processing code into independent inline functions, so it can be
reused by the subsequent arch-PEBS handler.

Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-7-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/ds.c | 83 +++++++++++++++++++++++++------------
 1 file changed, 58 insertions(+), 25 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 26e485e..c8aa72d 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2614,6 +2614,57 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 	}
 }
 
+static __always_inline void
+__intel_pmu_handle_pebs_record(struct pt_regs *iregs,
+			       struct pt_regs *regs,
+			       struct perf_sample_data *data,
+			       void *at, u64 pebs_status,
+			       short *counts, void **last,
+			       setup_fn setup_sample)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct perf_event *event;
+	int bit;
+
+	for_each_set_bit(bit, (unsigned long *)&pebs_status, X86_PMC_IDX_MAX) {
+		event = cpuc->events[bit];
+
+		if (WARN_ON_ONCE(!event) ||
+		    WARN_ON_ONCE(!event->attr.precise_ip))
+			continue;
+
+		if (counts[bit]++) {
+			__intel_pmu_pebs_event(event, iregs, regs, data,
+					       last[bit], setup_sample);
+		}
+
+		last[bit] = at;
+	}
+}
+
+static __always_inline void
+__intel_pmu_handle_last_pebs_record(struct pt_regs *iregs,
+				    struct pt_regs *regs,
+				    struct perf_sample_data *data,
+				    u64 mask, short *counts, void **last,
+				    setup_fn setup_sample)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct perf_event *event;
+	int bit;
+
+	for_each_set_bit(bit, (unsigned long *)&mask, X86_PMC_IDX_MAX) {
+		if (!counts[bit])
+			continue;
+
+		event = cpuc->events[bit];
+
+		__intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
+					    counts[bit], setup_sample);
+	}
+
+}
+
 static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_data *data)
 {
 	short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
@@ -2623,9 +2674,7 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 	struct x86_perf_regs perf_regs;
 	struct pt_regs *regs = &perf_regs.regs;
 	struct pebs_basic *basic;
-	struct perf_event *event;
 	void *base, *at, *top;
-	int bit;
 	u64 mask;
 
 	if (!x86_pmu.pebs_active)
@@ -2638,6 +2687,7 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 
 	mask = hybrid(cpuc->pmu, pebs_events_mask) |
 	       (hybrid(cpuc->pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED);
+	mask &= cpuc->pebs_enabled;
 
 	if (unlikely(base >= top)) {
 		intel_pmu_pebs_event_update_no_drain(cpuc, mask);
@@ -2655,31 +2705,14 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 		if (basic->format_size != cpuc->pebs_record_size)
 			continue;
 
-		pebs_status = basic->applicable_counters & cpuc->pebs_enabled & mask;
-		for_each_set_bit(bit, (unsigned long *)&pebs_status, X86_PMC_IDX_MAX) {
-			event = cpuc->events[bit];
-
-			if (WARN_ON_ONCE(!event) ||
-			    WARN_ON_ONCE(!event->attr.precise_ip))
-				continue;
-
-			if (counts[bit]++) {
-				__intel_pmu_pebs_event(event, iregs, regs, data, last[bit],
-						       setup_pebs_adaptive_sample_data);
-			}
-			last[bit] = at;
-		}
+		pebs_status = mask & basic->applicable_counters;
+		__intel_pmu_handle_pebs_record(iregs, regs, data, at,
+					       pebs_status, counts, last,
+					       setup_pebs_adaptive_sample_data);
 	}
 
-	for_each_set_bit(bit, (unsigned long *)&mask, X86_PMC_IDX_MAX) {
-		if (!counts[bit])
-			continue;
-
-		event = cpuc->events[bit];
-
-		__intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
-					    counts[bit], setup_pebs_adaptive_sample_data);
-	}
+	__intel_pmu_handle_last_pebs_record(iregs, regs, data, mask, counts, last,
+					    setup_pebs_adaptive_sample_data);
 }
 
 static void __init intel_arch_pebs_init(void)

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel: Initialize architectural PEBS
  2025-10-29 10:21 ` [Patch v9 05/12] perf/x86/intel: Initialize architectural PEBS Dapeng Mi
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  2026-03-05  0:50   ` [Patch v9 05/12] " Ian Rogers
  1 sibling, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     d243d0bb64af1e90ec18ac2fa6e7cadfe8895913
Gitweb:        https://git.kernel.org/tip/d243d0bb64af1e90ec18ac2fa6e7cadfe8895913
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:29 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:20 +01:00

perf/x86/intel: Initialize architectural PEBS

arch-PEBS leverages the CPUID.23H.4/5 sub-leaves to enumerate the
arch-PEBS supported capabilities and counter bitmaps. Parse these 2
sub-leaves and initialize the arch-PEBS capabilities and corresponding
structures.

Since the IA32_PEBS_ENABLE and MSR_PEBS_DATA_CFG MSRs no longer exist
for arch-PEBS, arch-PEBS doesn't need to manipulate them. Thus add a
simple pair of __intel_pmu_pebs_enable/disable() callbacks for
arch-PEBS.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-6-dapeng1.mi@linux.intel.com
---
 arch/x86/events/core.c            | 21 ++++++++---
 arch/x86/events/intel/core.c      | 60 ++++++++++++++++++++++--------
 arch/x86/events/intel/ds.c        | 52 ++++++++++++++++++++++----
 arch/x86/events/perf_event.h      | 25 +++++++++++--
 arch/x86/include/asm/perf_event.h |  7 +++-
 5 files changed, 132 insertions(+), 33 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index b2868fe..5d0d5e4 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -554,14 +554,22 @@ static inline int precise_br_compat(struct perf_event *event)
 	return m == b;
 }
 
-int x86_pmu_max_precise(void)
+int x86_pmu_max_precise(struct pmu *pmu)
 {
 	int precise = 0;
 
-	/* Support for constant skid */
 	if (x86_pmu.pebs_active && !x86_pmu.pebs_broken) {
-		precise++;
+		/* arch PEBS */
+		if (x86_pmu.arch_pebs) {
+			precise = 2;
+			if (hybrid(pmu, arch_pebs_cap).pdists)
+				precise++;
+
+			return precise;
+		}
 
+		/* legacy PEBS - support for constant skid */
+		precise++;
 		/* Support for IP fixup */
 		if (x86_pmu.lbr_nr || x86_pmu.intel_cap.pebs_format >= 2)
 			precise++;
@@ -569,13 +577,14 @@ int x86_pmu_max_precise(void)
 		if (x86_pmu.pebs_prec_dist)
 			precise++;
 	}
+
 	return precise;
 }
 
 int x86_pmu_hw_config(struct perf_event *event)
 {
 	if (event->attr.precise_ip) {
-		int precise = x86_pmu_max_precise();
+		int precise = x86_pmu_max_precise(event->pmu);
 
 		if (event->attr.precise_ip > precise)
 			return -EOPNOTSUPP;
@@ -2630,7 +2639,9 @@ static ssize_t max_precise_show(struct device *cdev,
 				  struct device_attribute *attr,
 				  char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise());
+	struct pmu *pmu = dev_get_drvdata(cdev);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise(pmu));
 }
 
 static DEVICE_ATTR_RO(max_precise);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c88bcd5..9ce27b3 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5271,34 +5271,59 @@ static inline bool intel_pmu_broken_perf_cap(void)
 	return false;
 }
 
+#define counter_mask(_gp, _fixed) ((_gp) | ((u64)(_fixed) << INTEL_PMC_IDX_FIXED))
+
 static void update_pmu_cap(struct pmu *pmu)
 {
-	unsigned int cntr, fixed_cntr, ecx, edx;
-	union cpuid35_eax eax;
-	union cpuid35_ebx ebx;
+	unsigned int eax, ebx, ecx, edx;
+	union cpuid35_eax eax_0;
+	union cpuid35_ebx ebx_0;
+	u64 cntrs_mask = 0;
+	u64 pebs_mask = 0;
+	u64 pdists_mask = 0;
 
-	cpuid(ARCH_PERFMON_EXT_LEAF, &eax.full, &ebx.full, &ecx, &edx);
+	cpuid(ARCH_PERFMON_EXT_LEAF, &eax_0.full, &ebx_0.full, &ecx, &edx);
 
-	if (ebx.split.umask2)
+	if (ebx_0.split.umask2)
 		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
-	if (ebx.split.eq)
+	if (ebx_0.split.eq)
 		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;
 
-	if (eax.split.cntr_subleaf) {
+	if (eax_0.split.cntr_subleaf) {
 		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
-			    &cntr, &fixed_cntr, &ecx, &edx);
-		hybrid(pmu, cntr_mask64) = cntr;
-		hybrid(pmu, fixed_cntr_mask64) = fixed_cntr;
+			    &eax, &ebx, &ecx, &edx);
+		hybrid(pmu, cntr_mask64) = eax;
+		hybrid(pmu, fixed_cntr_mask64) = ebx;
+		cntrs_mask = counter_mask(eax, ebx);
 	}
 
-	if (eax.split.acr_subleaf) {
+	if (eax_0.split.acr_subleaf) {
 		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_ACR_LEAF,
-			    &cntr, &fixed_cntr, &ecx, &edx);
+			    &eax, &ebx, &ecx, &edx);
 		/* The mask of the counters which can be reloaded */
-		hybrid(pmu, acr_cntr_mask64) = cntr | ((u64)fixed_cntr << INTEL_PMC_IDX_FIXED);
-
+		hybrid(pmu, acr_cntr_mask64) = counter_mask(eax, ebx);
 		/* The mask of the counters which can cause a reload of reloadable counters */
-		hybrid(pmu, acr_cause_mask64) = ecx | ((u64)edx << INTEL_PMC_IDX_FIXED);
+		hybrid(pmu, acr_cause_mask64) = counter_mask(ecx, edx);
+	}
+
+	/* Bits[5:4] should be set simultaneously if arch-PEBS is supported */
+	if (eax_0.split.pebs_caps_subleaf && eax_0.split.pebs_cnts_subleaf) {
+		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_CAP_LEAF,
+			    &eax, &ebx, &ecx, &edx);
+		hybrid(pmu, arch_pebs_cap).caps = (u64)ebx << 32;
+
+		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_COUNTER_LEAF,
+			    &eax, &ebx, &ecx, &edx);
+		pebs_mask   = counter_mask(eax, ecx);
+		pdists_mask = counter_mask(ebx, edx);
+		hybrid(pmu, arch_pebs_cap).counters = pebs_mask;
+		hybrid(pmu, arch_pebs_cap).pdists = pdists_mask;
+
+		if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask))
+			x86_pmu.arch_pebs = 0;
+	} else {
+		WARN_ON(x86_pmu.arch_pebs == 1);
+		x86_pmu.arch_pebs = 0;
 	}
 
 	if (!intel_pmu_broken_perf_cap()) {
@@ -6252,7 +6277,7 @@ tsx_is_visible(struct kobject *kobj, struct attribute *attr, int i)
 static umode_t
 pebs_is_visible(struct kobject *kobj, struct attribute *attr, int i)
 {
-	return x86_pmu.ds_pebs ? attr->mode : 0;
+	return intel_pmu_has_pebs() ? attr->mode : 0;
 }
 
 static umode_t
@@ -7728,6 +7753,9 @@ __init int intel_pmu_init(void)
 	if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
 		update_pmu_cap(NULL);
 
+	if (x86_pmu.arch_pebs)
+		pr_cont("Architectural PEBS, ");
+
 	intel_pmu_check_counters_mask(&x86_pmu.cntr_mask64,
 				      &x86_pmu.fixed_cntr_mask64,
 				      &x86_pmu.intel_ctrl);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c0b7ac1..26e485e 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1531,6 +1531,15 @@ static inline void intel_pmu_drain_large_pebs(struct cpu_hw_events *cpuc)
 		intel_pmu_drain_pebs_buffer();
 }
 
+static void __intel_pmu_pebs_enable(struct perf_event *event)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
+	cpuc->pebs_enabled |= 1ULL << hwc->idx;
+}
+
 void intel_pmu_pebs_enable(struct perf_event *event)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -1539,9 +1548,7 @@ void intel_pmu_pebs_enable(struct perf_event *event)
 	struct debug_store *ds = cpuc->ds;
 	unsigned int idx = hwc->idx;
 
-	hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
-
-	cpuc->pebs_enabled |= 1ULL << hwc->idx;
+	__intel_pmu_pebs_enable(event);
 
 	if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
 		cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
@@ -1603,14 +1610,22 @@ void intel_pmu_pebs_del(struct perf_event *event)
 	pebs_update_state(needed_cb, cpuc, event, false);
 }
 
-void intel_pmu_pebs_disable(struct perf_event *event)
+static void __intel_pmu_pebs_disable(struct perf_event *event)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	intel_pmu_drain_large_pebs(cpuc);
-
 	cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
+	hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
+}
+
+void intel_pmu_pebs_disable(struct perf_event *event)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	__intel_pmu_pebs_disable(event);
 
 	if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) &&
 	    (x86_pmu.version < 5))
@@ -1622,8 +1637,6 @@ void intel_pmu_pebs_disable(struct perf_event *event)
 
 	if (cpuc->enabled)
 		wrmsrq(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
-
-	hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
 }
 
 void intel_pmu_pebs_enable_all(void)
@@ -2669,11 +2682,26 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 	}
 }
 
+static void __init intel_arch_pebs_init(void)
+{
+	/*
+	 * Current hybrid platforms always both support arch-PEBS or not
+	 * on all kinds of cores. So directly set x86_pmu.arch_pebs flag
+	 * if boot cpu supports arch-PEBS.
+	 */
+	x86_pmu.arch_pebs = 1;
+	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
+	x86_pmu.pebs_capable = ~0ULL;
+
+	x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
+	x86_pmu.pebs_disable = __intel_pmu_pebs_disable;
+}
+
 /*
  * PEBS probe and setup
  */
 
-void __init intel_pebs_init(void)
+static void __init intel_ds_pebs_init(void)
 {
 	/*
 	 * No support for 32bit formats
@@ -2788,6 +2816,14 @@ void __init intel_pebs_init(void)
 	}
 }
 
+void __init intel_pebs_init(void)
+{
+	if (x86_pmu.intel_cap.pebs_format == 0xf)
+		intel_arch_pebs_init();
+	else
+		intel_ds_pebs_init();
+}
+
 void perf_restore_debug_store(void)
 {
 	struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 285779c..ca52899 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -708,6 +708,12 @@ enum hybrid_pmu_type {
 	hybrid_big_small_tiny	= hybrid_big   | hybrid_small_tiny,
 };
 
+struct arch_pebs_cap {
+	u64 caps;
+	u64 counters;
+	u64 pdists;
+};
+
 struct x86_hybrid_pmu {
 	struct pmu			pmu;
 	const char			*name;
@@ -752,6 +758,8 @@ struct x86_hybrid_pmu {
 					mid_ack		:1,
 					enabled_ack	:1;
 
+	struct arch_pebs_cap		arch_pebs_cap;
+
 	u64				pebs_data_source[PERF_PEBS_DATA_SOURCE_MAX];
 };
 
@@ -906,7 +914,7 @@ struct x86_pmu {
 	union perf_capabilities intel_cap;
 
 	/*
-	 * Intel DebugStore bits
+	 * Intel DebugStore and PEBS bits
 	 */
 	unsigned int	bts			:1,
 			bts_active		:1,
@@ -917,7 +925,8 @@ struct x86_pmu {
 			pebs_no_tlb		:1,
 			pebs_no_isolation	:1,
 			pebs_block		:1,
-			pebs_ept		:1;
+			pebs_ept		:1,
+			arch_pebs		:1;
 	int		pebs_record_size;
 	int		pebs_buffer_size;
 	u64		pebs_events_mask;
@@ -930,6 +939,11 @@ struct x86_pmu {
 	u64		pebs_capable;
 
 	/*
+	 * Intel Architectural PEBS
+	 */
+	struct arch_pebs_cap arch_pebs_cap;
+
+	/*
 	 * Intel LBR
 	 */
 	unsigned int	lbr_tos, lbr_from, lbr_to,
@@ -1216,7 +1230,7 @@ int x86_reserve_hardware(void);
 
 void x86_release_hardware(void);
 
-int x86_pmu_max_precise(void);
+int x86_pmu_max_precise(struct pmu *pmu);
 
 void hw_perf_lbr_event_destroy(struct perf_event *event);
 
@@ -1791,6 +1805,11 @@ static inline int intel_pmu_max_num_pebs(struct pmu *pmu)
 	return fls((u32)hybrid(pmu, pebs_events_mask));
 }
 
+static inline bool intel_pmu_has_pebs(void)
+{
+	return x86_pmu.ds_pebs || x86_pmu.arch_pebs;
+}
+
 #else /* CONFIG_CPU_SUP_INTEL */
 
 static inline void reserve_ds_buffers(void)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 49a4d44..0dfa067 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -200,6 +200,8 @@ union cpuid10_edx {
 #define ARCH_PERFMON_EXT_LEAF			0x00000023
 #define ARCH_PERFMON_NUM_COUNTER_LEAF		0x1
 #define ARCH_PERFMON_ACR_LEAF			0x2
+#define ARCH_PERFMON_PEBS_CAP_LEAF		0x4
+#define ARCH_PERFMON_PEBS_COUNTER_LEAF		0x5
 
 union cpuid35_eax {
 	struct {
@@ -210,7 +212,10 @@ union cpuid35_eax {
 		unsigned int    acr_subleaf:1;
 		/* Events Sub-Leaf */
 		unsigned int    events_subleaf:1;
-		unsigned int	reserved:28;
+		/* arch-PEBS Sub-Leaves */
+		unsigned int	pebs_caps_subleaf:1;
+		unsigned int	pebs_cnts_subleaf:1;
+		unsigned int	reserved:26;
 	} split;
 	unsigned int            full;
 };

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel: Correct large PEBS flag check
  2025-10-29 10:21 ` [Patch v9 04/12] perf/x86/intel: Correct large PEBS flag check Dapeng Mi
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  0 siblings, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     5e4e355ae7cdeb0fef5dbe908866e1f895abfacc
Gitweb:        https://git.kernel.org/tip/5e4e355ae7cdeb0fef5dbe908866e1f895abfacc
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:28 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:20 +01:00

perf/x86/intel: Correct large PEBS flag check

The current large PEBS flag check only checks whether sample_regs_user
contains unsupported GPRs but doesn't check whether sample_regs_intr
does.

Since current PEBS HW supports sampling all perf-supported GPRs, the
missed check doesn't cause a real issue today. But that won't be true
any more once subsequent patches add support for sampling the SSP
register: SSP sampling is not supported by adaptive PEBS HW and only
becomes available with arch-PEBS HW. So correct this issue.

Fixes: a47ba4d77e12 ("perf/x86: Enable free running PEBS for REGS_USER/INTR")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-5-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 46a000e..c88bcd5 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4029,7 +4029,9 @@ static unsigned long intel_pmu_large_pebs_flags(struct perf_event *event)
 	if (!event->attr.exclude_kernel)
 		flags &= ~PERF_SAMPLE_REGS_USER;
 	if (event->attr.sample_regs_user & ~PEBS_GP_REGS)
-		flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR);
+		flags &= ~PERF_SAMPLE_REGS_USER;
+	if (event->attr.sample_regs_intr & ~PEBS_GP_REGS)
+		flags &= ~PERF_SAMPLE_REGS_INTR;
 	return flags;
 }
 

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
  2025-10-29 10:21 ` [Patch v9 03/12] perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call Dapeng Mi
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  0 siblings, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Peter Zijlstra, Dapeng Mi, x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     ee98b8bfc7c4baca69a6852c4ecc399794f7e53b
Gitweb:        https://git.kernel.org/tip/ee98b8bfc7c4baca69a6852c4ecc399794f7e53b
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:27 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:20 +01:00

perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call

Use x86_pmu_drain_pebs static call to replace calling x86_pmu.drain_pebs
function pointer.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-4-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 28f5468..46a000e 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3269,7 +3269,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 		 * The PEBS buffer has to be drained before handling the A-PMI
 		 */
 		if (is_pebs_counter_event_group(event))
-			x86_pmu.drain_pebs(regs, &data);
+			static_call(x86_pmu_drain_pebs)(regs, &data);
 
 		last_period = event->hw.last_period;
 

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86: Fix NULL event access and potential PEBS record loss
  2025-10-29 10:21 ` [Patch v9 02/12] perf/x86: Fix NULL event access and potential PEBS record loss Dapeng Mi
  2025-11-06 14:19   ` Peter Zijlstra
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  1 sibling, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kernel test robot, Peter Zijlstra, Dapeng Mi, x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     7e772a93eb61cb6265bdd1c5bde17d0f2718b452
Gitweb:        https://git.kernel.org/tip/7e772a93eb61cb6265bdd1c5bde17d0f2718b452
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:26 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:19 +01:00

perf/x86: Fix NULL event access and potential PEBS record loss

When intel_pmu_drain_pebs_icl() is called to drain PEBS records, the
perf_event_overflow() could be called to process the last PEBS record.

However, perf_event_overflow() could trigger the interrupt throttle and
stop all events of the group, as the call-chain below shows.

perf_event_overflow()
  -> __perf_event_overflow()
    ->__perf_event_account_interrupt()
      -> perf_event_throttle_group()
        -> perf_event_throttle()
          -> event->pmu->stop()
            -> x86_pmu_stop()

The side effect of stopping the events is that all corresponding event
pointers in the cpuc->events[] array are cleared to NULL.

Assume there are two PEBS events (event a and event b) in a group. When
intel_pmu_drain_pebs_icl() calls perf_event_overflow() to process the
last PEBS record of PEBS event a, the interrupt throttle is triggered
and the pointers of both event a and event b are cleared to NULL. Then
intel_pmu_drain_pebs_icl() tries to process the last PEBS record of
event b and encounters a NULL pointer access.

To avoid this issue, move the cpuc->events[] clearing from
x86_pmu_stop() to x86_pmu_del(). It's safe since cpuc->active_mask or
cpuc->pebs_enabled is always checked before accessing the event pointer
in cpuc->events[].

Closes: https://lore.kernel.org/oe-lkp/202507042103.a15d2923-lkp@intel.com
Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
Reported-by: kernel test robot <oliver.sang@intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-3-dapeng1.mi@linux.intel.com
---
 arch/x86/events/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 0cf68ad..b2868fe 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1344,6 +1344,7 @@ static void x86_pmu_enable(struct pmu *pmu)
 				hwc->state |= PERF_HES_ARCH;
 
 			x86_pmu_stop(event, PERF_EF_UPDATE);
+			cpuc->events[hwc->idx] = NULL;
 		}
 
 		/*
@@ -1365,6 +1366,7 @@ static void x86_pmu_enable(struct pmu *pmu)
 			 * if cpuc->enabled = 0, then no wrmsr as
 			 * per x86_pmu_enable_event()
 			 */
+			cpuc->events[hwc->idx] = event;
 			x86_pmu_start(event, PERF_EF_RELOAD);
 		}
 		cpuc->n_added = 0;
@@ -1531,7 +1533,6 @@ static void x86_pmu_start(struct perf_event *event, int flags)
 
 	event->hw.state = 0;
 
-	cpuc->events[idx] = event;
 	__set_bit(idx, cpuc->active_mask);
 	static_call(x86_pmu_enable)(event);
 	perf_event_update_userpage(event);
@@ -1610,7 +1611,6 @@ void x86_pmu_stop(struct perf_event *event, int flags)
 	if (test_bit(hwc->idx, cpuc->active_mask)) {
 		static_call(x86_pmu_disable)(event);
 		__clear_bit(hwc->idx, cpuc->active_mask);
-		cpuc->events[hwc->idx] = NULL;
 		WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
 		hwc->state |= PERF_HES_STOPPED;
 	}
@@ -1648,6 +1648,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
 	 * Not a TXN, therefore cleanup properly.
 	 */
 	x86_pmu_stop(event, PERF_EF_UPDATE);
+	cpuc->events[event->hw.idx] = NULL;
 
 	for (i = 0; i < cpuc->n_events; i++) {
 		if (event == cpuc->event_list[i])

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86: Remove redundant is_x86_event() prototype
  2025-10-29 10:21 ` [Patch v9 01/12] perf/x86: Remove redundant is_x86_event() prototype Dapeng Mi
@ 2025-11-11 11:37   ` tip-bot2 for Dapeng Mi
  0 siblings, 0 replies; 48+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-11-11 11:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dapeng Mi, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     c7f69dc073e51f1c448713320ccd2e2be63fb1f6
Gitweb:        https://git.kernel.org/tip/c7f69dc073e51f1c448713320ccd2e2be63fb1f6
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Wed, 29 Oct 2025 18:21:25 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 07 Nov 2025 15:08:19 +01:00

perf/x86: Remove redundant is_x86_event() prototype

Two is_x86_event() prototypes are defined in perf_event.h. Remove the
redundant one.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-2-dapeng1.mi@linux.intel.com
---
 arch/x86/events/perf_event.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 2b96938..285779c 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1124,7 +1124,6 @@ static struct perf_pmu_format_hybrid_attr format_attr_hybrid_##_name = {\
 	.pmu_type	= _pmu,						\
 }
 
-int is_x86_event(struct perf_event *event);
 struct pmu *x86_get_pmu(unsigned int cpu);
 extern struct x86_pmu x86_pmu __read_mostly;
 

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constraint based on PEBS event precise level
  2025-11-11  5:41               ` Mi, Dapeng
@ 2025-11-11 11:37                 ` Peter Zijlstra
  2025-11-12  0:16                   ` Mi, Dapeng
  0 siblings, 1 reply; 48+ messages in thread
From: Peter Zijlstra @ 2025-11-11 11:37 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao

On Tue, Nov 11, 2025 at 01:41:05PM +0800, Mi, Dapeng wrote:

> I tested the queue/perf/core code with a slight code refine on SPR/CWF/PTL.
> In summary, all things look good. The constraints validation passes on all
> these 3 platforms, no overlapped constraints are reported. Besides, perf
> counting/sampling (both legacy PEBS and arch-PEBS) works well, no issue is
> found.

Excellent, I pushed out to tip/perf/core.

> I did a slight change for the intel_pmu_check_dyn_constr() helper. It
> should be good enough to only validate the GP counters for the PEBS counter
> and PDIST constraint check. Besides, the code style is refined
> opportunistically. Thanks.

If you could send that as a proper patch -- the thing was horribly
whitespace mangled.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 10/12] perf/x86/intel: Update dyn_constraint based on PEBS event precise level
  2025-11-11 11:37                 ` Peter Zijlstra
@ 2025-11-12  0:16                   ` Mi, Dapeng
  0 siblings, 0 replies; 48+ messages in thread
From: Mi, Dapeng @ 2025-11-12  0:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao


On 11/11/2025 7:37 PM, Peter Zijlstra wrote:
> On Tue, Nov 11, 2025 at 01:41:05PM +0800, Mi, Dapeng wrote:
>
>> I tested the queue/perf/core code with a slight code refine on SPR/CWF/PTL.
>> In summary, all things look good. The constraints validation passes on all
>> these 3 platforms, no overlapped constraints are reported. Besides, perf
>> counting/sampling (both legacy PEBS and arch-PEBS) works well, no issue is
>> found.
> Excellent, I pushed out to tip/perf/core.
>
>> I did a slight change for the intel_pmu_check_dyn_constr() helper. It
>> should be good enough to only validate the GP counters for the PEBS counter
>> and PDIST constraint check. Besides, the code style is refined
>> opportunistically. Thanks.
> If you could send that as a proper patch -- the thing was horribly
> whitespace mangled.

Sure. Would send the patch soon.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use
  2025-10-29 10:21 ` [Patch v9 09/12] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
@ 2025-11-12 10:03   ` tip-bot2 for Ingo Molnar
  2025-11-12 11:18   ` tip-bot2 for Ingo Molnar
  2 siblings, 0 replies; 48+ messages in thread
From: tip-bot2 for Ingo Molnar @ 2025-11-12 10:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Ingo Molnar, Peter Zijlstra (Intel), Kan Liang, Dapeng Mi, x86,
	linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     60f9f1d437201f6c457fc8a56f9df6d8a6d0bea6
Gitweb:        https://git.kernel.org/tip/60f9f1d437201f6c457fc8a56f9df6d8a6d0bea6
Author:        Ingo Molnar <mingo@kernel.org>
AuthorDate:    Wed, 12 Nov 2025 10:40:26 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 12 Nov 2025 10:49:35 +01:00

perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use

The following commit introduced a build failure on x86-32:

  2721e8da2de7 ("perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR")

  ...

  arch/x86/events/intel/ds.c:2983:24: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]

The forced type conversions to 'u64' and 'void *' are not 32-bit clean,
but they are also entirely unnecessary: ->pebs_vaddr is 'void *' already,
and plain pointer arithmetic works just fine on it.

Fix & simplify the code.

Fixes: 2721e8da2de7 ("perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://patch.msgid.link/20251029102136.61364-10-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/ds.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c93bf97..2e170f2 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2979,8 +2979,7 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
 	}
 
 	base = cpuc->pebs_vaddr;
-	top = (void *)((u64)cpuc->pebs_vaddr +
-		       (index.wr << ARCH_PEBS_INDEX_WR_SHIFT));
+	top = cpuc->pebs_vaddr + (index.wr << ARCH_PEBS_INDEX_WR_SHIFT);
 
 	index.wr = 0;
 	index.full = 0;

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [tip: perf/core] perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use
  2025-10-29 10:21 ` [Patch v9 09/12] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
  2025-11-12 10:03   ` [tip: perf/core] perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use tip-bot2 for Ingo Molnar
@ 2025-11-12 11:18   ` tip-bot2 for Ingo Molnar
  2 siblings, 0 replies; 48+ messages in thread
From: tip-bot2 for Ingo Molnar @ 2025-11-12 11:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Stephen Rothwell, Ingo Molnar, Peter Zijlstra (Intel), Dapeng Mi,
	Kan Liang, x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     9929dffce5ed7e2988e0274f4db98035508b16d9
Gitweb:        https://git.kernel.org/tip/9929dffce5ed7e2988e0274f4db98035508b16d9
Author:        Ingo Molnar <mingo@kernel.org>
AuthorDate:    Wed, 12 Nov 2025 10:40:26 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 12 Nov 2025 12:12:28 +01:00

perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use

The following commit introduced a build failure on x86-32:

  d21954c8a0ff ("perf/x86/intel: Process arch-PEBS records or record fragments")

  ...

  arch/x86/events/intel/ds.c:2983:24: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]

The forced type conversions to 'u64' and 'void *' are not 32-bit clean,
but they are also entirely unnecessary: ->pebs_vaddr is 'void *' already,
and plain pointer arithmetic works just fine on it.

Fix & simplify the code.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: d21954c8a0ff ("perf/x86/intel: Process arch-PEBS records or record fragments")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://patch.msgid.link/20251029102136.61364-10-dapeng1.mi@linux.intel.com
---
 arch/x86/events/intel/ds.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c93bf97..2e170f2 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2979,8 +2979,7 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
 	}
 
 	base = cpuc->pebs_vaddr;
-	top = (void *)((u64)cpuc->pebs_vaddr +
-		       (index.wr << ARCH_PEBS_INDEX_WR_SHIFT));
+	top = cpuc->pebs_vaddr + (index.wr << ARCH_PEBS_INDEX_WR_SHIFT);
 
 	index.wr = 0;
 	index.full = 0;

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [Patch v9 08/12] perf/x86/intel: Process arch-PEBS records or record fragments
  2025-10-29 10:21 ` [Patch v9 08/12] perf/x86/intel: Process arch-PEBS records or record fragments Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
@ 2026-03-03  0:20   ` Chun-Tse Shao
  2026-03-06  1:20     ` Mi, Dapeng
  1 sibling, 1 reply; 48+ messages in thread
From: Chun-Tse Shao @ 2026-03-03  0:20 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane, linux-kernel, linux-perf-users,
	Dapeng Mi, Zide Chen, Falcon Thomas, Xudong Hao

On Wed, Oct 29, 2025 at 3:39 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>
> A significant difference from adaptive PEBS is that an arch-PEBS record
> supports fragments: a record can be split into several independent
> fragments, each carrying its own arch-PEBS header.
> 
> This patch defines the architectural PEBS record layout structures and
> adds helpers to process arch-PEBS records and fragments. Only the legacy
> PEBS groups (basic, GPR, XMM and LBR) are supported in this patch;
> capturing the newly added YMM/ZMM/OPMASK vector registers will be
> supported in the future.
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
>  arch/x86/events/intel/core.c      |  13 +++
>  arch/x86/events/intel/ds.c        | 184 ++++++++++++++++++++++++++++++
>  arch/x86/include/asm/msr-index.h  |   6 +
>  arch/x86/include/asm/perf_event.h |  96 ++++++++++++++++
>  4 files changed, 299 insertions(+)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 9ce27b326923..de4dbde28adc 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3215,6 +3215,19 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
>                         status &= ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT;
>         }
>
> +       /*
> +        * Arch PEBS sets bit 54 in the global status register
> +        */
> +       if (__test_and_clear_bit(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT,
> +                                (unsigned long *)&status)) {
> +               handled++;
> +               static_call(x86_pmu_drain_pebs)(regs, &data);
> +
> +               if (cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS] &&
> +                   is_pebs_counter_event_group(cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS]))
> +                       status &= ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT;
> +       }
> +
>         /*
>          * Intel PT
>          */
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index 68664526443f..fe1bf373409e 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -2270,6 +2270,117 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
>                         format_group);
>  }
>
> +static inline bool arch_pebs_record_continued(struct arch_pebs_header *header)
> +{
> +       /* Continue bit or null PEBS record indicates fragment follows. */
> +       return header->cont || !(header->format & GENMASK_ULL(63, 16));
> +}
> +
> +static void setup_arch_pebs_sample_data(struct perf_event *event,
> +                                       struct pt_regs *iregs,
> +                                       void *__pebs,
> +                                       struct perf_sample_data *data,
> +                                       struct pt_regs *regs)
> +{
> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +       u64 sample_type = event->attr.sample_type;
> +       struct arch_pebs_header *header = NULL;
> +       struct arch_pebs_aux *meminfo = NULL;
> +       struct arch_pebs_gprs *gprs = NULL;
> +       struct x86_perf_regs *perf_regs;
> +       void *next_record;
> +       void *at = __pebs;
> +
> +       if (at == NULL)
> +               return;
> +
> +       perf_regs = container_of(regs, struct x86_perf_regs, regs);
> +       perf_regs->xmm_regs = NULL;
> +
> +       __setup_perf_sample_data(event, iregs, data);
> +
> +       *regs = *iregs;
> +
> +again:
> +       header = at;
> +       next_record = at + sizeof(struct arch_pebs_header);
> +       if (header->basic) {
> +               struct arch_pebs_basic *basic = next_record;
> +               u16 retire = 0;
> +
> +               next_record = basic + 1;
> +
> +               if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT)
> +                       retire = basic->valid ? basic->retire : 0;
> +               __setup_pebs_basic_group(event, regs, data, sample_type,
> +                                basic->ip, basic->tsc, retire);
> +       }
> +
> +       /*
> +        * The record for MEMINFO is in front of GP
> +        * But PERF_SAMPLE_TRANSACTION needs gprs->ax.
> +        * Save the pointer here but process later.
> +        */
> +       if (header->aux) {
> +               meminfo = next_record;
> +               next_record = meminfo + 1;
> +       }
> +
> +       if (header->gpr) {
> +               gprs = next_record;
> +               next_record = gprs + 1;
> +
> +               __setup_pebs_gpr_group(event, regs,
> +                                      (struct pebs_gprs *)gprs,
> +                                      sample_type);
> +       }
> +
> +       if (header->aux) {
> +               u64 ax = gprs ? gprs->ax : 0;
> +
> +               __setup_pebs_meminfo_group(event, data, sample_type,
> +                                          meminfo->cache_latency,
> +                                          meminfo->instr_latency,
> +                                          meminfo->address, meminfo->aux,
> +                                          meminfo->tsx_tuning, ax);
> +       }
> +
> +       if (header->xmm) {
> +               struct pebs_xmm *xmm;
> +
> +               next_record += sizeof(struct arch_pebs_xer_header);
> +
> +               xmm = next_record;
> +               perf_regs->xmm_regs = xmm->xmm;
> +               next_record = xmm + 1;
> +       }
> +
> +       if (header->lbr) {
> +               struct arch_pebs_lbr_header *lbr_header = next_record;
> +               struct lbr_entry *lbr;
> +               int num_lbr;
> +
> +               next_record = lbr_header + 1;
> +               lbr = next_record;
> +
> +               num_lbr = header->lbr == ARCH_PEBS_LBR_NUM_VAR ?
> +                               lbr_header->depth :
> +                               header->lbr * ARCH_PEBS_BASE_LBR_ENTRIES;
> +               next_record += num_lbr * sizeof(struct lbr_entry);
> +
> +               if (has_branch_stack(event)) {
> +                       intel_pmu_store_pebs_lbrs(lbr);
> +                       intel_pmu_lbr_save_brstack(data, cpuc, event);
> +               }
> +       }
> +
> +       /* Parse followed fragments if there are. */
> +       if (arch_pebs_record_continued(header)) {
> +               at = at + header->size;

If header->size is 0, will it cause an infinite loop?
I can see a 0 check below but not here.

Thanks,
CT

> +               goto again;
> +       }
> +}
> +
>  static inline void *
>  get_next_pebs_record_by_bit(void *base, void *top, int bit)
>  {
> @@ -2753,6 +2864,78 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
>                                             setup_pebs_adaptive_sample_data);
>  }
>
> +static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
> +                                     struct perf_sample_data *data)
> +{
> +       short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
> +       void *last[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS];
> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +       union arch_pebs_index index;
> +       struct x86_perf_regs perf_regs;
> +       struct pt_regs *regs = &perf_regs.regs;
> +       void *base, *at, *top;
> +       u64 mask;
> +
> +       rdmsrq(MSR_IA32_PEBS_INDEX, index.whole);
> +
> +       if (unlikely(!index.wr)) {
> +               intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX);
> +               return;
> +       }
> +
> +       base = cpuc->ds_pebs_vaddr;
> +       top = (void *)((u64)cpuc->ds_pebs_vaddr +
> +                      (index.wr << ARCH_PEBS_INDEX_WR_SHIFT));
> +
> +       index.wr = 0;
> +       index.full = 0;
> +       wrmsrq(MSR_IA32_PEBS_INDEX, index.whole);
> +
> +       mask = hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled;
> +
> +       if (!iregs)
> +               iregs = &dummy_iregs;
> +
> +       /* Process all but the last event for each counter. */
> +       for (at = base; at < top;) {
> +               struct arch_pebs_header *header;
> +               struct arch_pebs_basic *basic;
> +               u64 pebs_status;
> +
> +               header = at;
> +
> +               if (WARN_ON_ONCE(!header->size))
> +                       break;
> +
> +               /* 1st fragment or single record must have basic group */
> +               if (!header->basic) {
> +                       at += header->size;
> +                       continue;
> +               }
> +
> +               basic = at + sizeof(struct arch_pebs_header);
> +               pebs_status = mask & basic->applicable_counters;
> +               __intel_pmu_handle_pebs_record(iregs, regs, data, at,
> +                                              pebs_status, counts, last,
> +                                              setup_arch_pebs_sample_data);
> +
> +               /* Skip non-last fragments */
> +               while (arch_pebs_record_continued(header)) {
> +                       if (!header->size)
> +                               break;
> +                       at += header->size;
> +                       header = at;
> +               }
> +
> +               /* Skip last fragment or the single record */
> +               at += header->size;
> +       }
> +
> +       __intel_pmu_handle_last_pebs_record(iregs, regs, data, mask,
> +                                           counts, last,
> +                                           setup_arch_pebs_sample_data);
> +}
> +
>  static void __init intel_arch_pebs_init(void)
>  {
>         /*
> @@ -2762,6 +2945,7 @@ static void __init intel_arch_pebs_init(void)
>          */
>         x86_pmu.arch_pebs = 1;
>         x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
> +       x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
>         x86_pmu.pebs_capable = ~0ULL;
>
>         x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 9e1720d73244..fc7a4e7c718d 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -327,6 +327,12 @@
>                                          PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
>                                          PERF_CAP_PEBS_TIMING_INFO)
>
> +/* Arch PEBS */
> +#define MSR_IA32_PEBS_BASE             0x000003f4
> +#define MSR_IA32_PEBS_INDEX            0x000003f5
> +#define ARCH_PEBS_OFFSET_MASK          0x7fffff
> +#define ARCH_PEBS_INDEX_WR_SHIFT       4
> +
>  #define MSR_IA32_RTIT_CTL              0x00000570
>  #define RTIT_CTL_TRACEEN               BIT(0)
>  #define RTIT_CTL_CYCLEACC              BIT(1)
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index 0dfa06722bab..3b3848f0d339 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -437,6 +437,8 @@ static inline bool is_topdown_idx(int idx)
>  #define GLOBAL_STATUS_LBRS_FROZEN              BIT_ULL(GLOBAL_STATUS_LBRS_FROZEN_BIT)
>  #define GLOBAL_STATUS_TRACE_TOPAPMI_BIT                55
>  #define GLOBAL_STATUS_TRACE_TOPAPMI            BIT_ULL(GLOBAL_STATUS_TRACE_TOPAPMI_BIT)
> +#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT  54
> +#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD      BIT_ULL(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT)
>  #define GLOBAL_STATUS_PERF_METRICS_OVF_BIT     48
>
>  #define GLOBAL_CTRL_EN_PERF_METRICS            BIT_ULL(48)
> @@ -507,6 +509,100 @@ struct pebs_cntr_header {
>
>  #define INTEL_CNTR_METRICS             0x3
>
> +/*
> + * Arch PEBS
> + */
> +union arch_pebs_index {
> +       struct {
> +               u64 rsvd:4,
> +                   wr:23,
> +                   rsvd2:4,
> +                   full:1,
> +                   en:1,
> +                   rsvd3:3,
> +                   thresh:23,
> +                   rsvd4:5;
> +       };
> +       u64 whole;
> +};
> +
> +struct arch_pebs_header {
> +       union {
> +               u64 format;
> +               struct {
> +                       u64 size:16,    /* Record size */
> +                           rsvd:14,
> +                           mode:1,     /* 64BIT_MODE */
> +                           cont:1,
> +                           rsvd2:3,
> +                           cntr:5,
> +                           lbr:2,
> +                           rsvd3:7,
> +                           xmm:1,
> +                           ymmh:1,
> +                           rsvd4:2,
> +                           opmask:1,
> +                           zmmh:1,
> +                           h16zmm:1,
> +                           rsvd5:5,
> +                           gpr:1,
> +                           aux:1,
> +                           basic:1;
> +               };
> +       };
> +       u64 rsvd6;
> +};
> +
> +struct arch_pebs_basic {
> +       u64 ip;
> +       u64 applicable_counters;
> +       u64 tsc;
> +       u64 retire      :16,    /* Retire Latency */
> +           valid       :1,
> +           rsvd        :47;
> +       u64 rsvd2;
> +       u64 rsvd3;
> +};
> +
> +struct arch_pebs_aux {
> +       u64 address;
> +       u64 rsvd;
> +       u64 rsvd2;
> +       u64 rsvd3;
> +       u64 rsvd4;
> +       u64 aux;
> +       u64 instr_latency       :16,
> +           pad2                :16,
> +           cache_latency       :16,
> +           pad3                :16;
> +       u64 tsx_tuning;
> +};
> +
> +struct arch_pebs_gprs {
> +       u64 flags, ip, ax, cx, dx, bx, sp, bp, si, di;
> +       u64 r8, r9, r10, r11, r12, r13, r14, r15, ssp;
> +       u64 rsvd;
> +};
> +
> +struct arch_pebs_xer_header {
> +       u64 xstate;
> +       u64 rsvd;
> +};
> +
> +#define ARCH_PEBS_LBR_NAN              0x0
> +#define ARCH_PEBS_LBR_NUM_8            0x1
> +#define ARCH_PEBS_LBR_NUM_16           0x2
> +#define ARCH_PEBS_LBR_NUM_VAR          0x3
> +#define ARCH_PEBS_BASE_LBR_ENTRIES     8
> +struct arch_pebs_lbr_header {
> +       u64 rsvd;
> +       u64 ctl;
> +       u64 depth;
> +       u64 ler_from;
> +       u64 ler_to;
> +       u64 ler_info;
> +};
> +
>  /*
>   * AMD Extended Performance Monitoring and Debug cpuid feature detection
>   */
> --
> 2.34.1
>
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 05/12] perf/x86/intel: Initialize architectural PEBS
  2025-10-29 10:21 ` [Patch v9 05/12] perf/x86/intel: Initialize architectural PEBS Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
@ 2026-03-05  0:50   ` Ian Rogers
  2026-03-06  1:38     ` Mi, Dapeng
  1 sibling, 1 reply; 48+ messages in thread
From: Ian Rogers @ 2026-03-05  0:50 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Zide Chen, Falcon Thomas, Xudong Hao

On Wed, Oct 29, 2025 at 3:24 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>
> arch-PEBS leverages the CPUID.23H.4/5 sub-leaves to enumerate the
> supported arch-PEBS capabilities and counter bitmaps. This patch parses
> these 2 sub-leaves and initializes the arch-PEBS capabilities and
> corresponding structures.
> 
> Since the IA32_PEBS_ENABLE and MSR_PEBS_DATA_CFG MSRs no longer exist
> with arch-PEBS, there is no need to manipulate them. Thus add a simple
> pair of __intel_pmu_pebs_enable/disable() callbacks for arch-PEBS.
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
>  arch/x86/events/core.c            | 21 ++++++++---
>  arch/x86/events/intel/core.c      | 60 ++++++++++++++++++++++---------
>  arch/x86/events/intel/ds.c        | 52 ++++++++++++++++++++++-----
>  arch/x86/events/perf_event.h      | 25 +++++++++++--
>  arch/x86/include/asm/perf_event.h |  7 +++-
>  5 files changed, 132 insertions(+), 33 deletions(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 74479f9d6eed..f2402ae3ffa0 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -554,14 +554,22 @@ static inline int precise_br_compat(struct perf_event *event)
>         return m == b;
>  }
>
> -int x86_pmu_max_precise(void)
> +int x86_pmu_max_precise(struct pmu *pmu)
>  {
>         int precise = 0;
>
> -       /* Support for constant skid */
>         if (x86_pmu.pebs_active && !x86_pmu.pebs_broken) {
> -               precise++;
> +               /* arch PEBS */
> +               if (x86_pmu.arch_pebs) {
> +                       precise = 2;
> +                       if (hybrid(pmu, arch_pebs_cap).pdists)
> +                               precise++;
> +
> +                       return precise;
> +               }
>
> +               /* legacy PEBS - support for constant skid */
> +               precise++;
>                 /* Support for IP fixup */
>                 if (x86_pmu.lbr_nr || x86_pmu.intel_cap.pebs_format >= 2)
>                         precise++;
> @@ -569,13 +577,14 @@ int x86_pmu_max_precise(void)
>                 if (x86_pmu.pebs_prec_dist)
>                         precise++;
>         }
> +
>         return precise;
>  }
>
>  int x86_pmu_hw_config(struct perf_event *event)
>  {
>         if (event->attr.precise_ip) {
> -               int precise = x86_pmu_max_precise();
> +               int precise = x86_pmu_max_precise(event->pmu);
>
>                 if (event->attr.precise_ip > precise)
>                         return -EOPNOTSUPP;
> @@ -2630,7 +2639,9 @@ static ssize_t max_precise_show(struct device *cdev,
>                                   struct device_attribute *attr,
>                                   char *buf)
>  {
> -       return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise());
> +       struct pmu *pmu = dev_get_drvdata(cdev);
> +
> +       return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise(pmu));
>  }
>
>  static DEVICE_ATTR_RO(max_precise);
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index c88bcd5d2bc4..9ce27b326923 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -5271,34 +5271,59 @@ static inline bool intel_pmu_broken_perf_cap(void)
>         return false;
>  }
>
> +#define counter_mask(_gp, _fixed) ((_gp) | ((u64)(_fixed) << INTEL_PMC_IDX_FIXED))
> +
>  static void update_pmu_cap(struct pmu *pmu)
>  {
> -       unsigned int cntr, fixed_cntr, ecx, edx;
> -       union cpuid35_eax eax;
> -       union cpuid35_ebx ebx;
> +       unsigned int eax, ebx, ecx, edx;
> +       union cpuid35_eax eax_0;
> +       union cpuid35_ebx ebx_0;
> +       u64 cntrs_mask = 0;
> +       u64 pebs_mask = 0;
> +       u64 pdists_mask = 0;
>
> -       cpuid(ARCH_PERFMON_EXT_LEAF, &eax.full, &ebx.full, &ecx, &edx);
> +       cpuid(ARCH_PERFMON_EXT_LEAF, &eax_0.full, &ebx_0.full, &ecx, &edx);
>
> -       if (ebx.split.umask2)
> +       if (ebx_0.split.umask2)
>                 hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
> -       if (ebx.split.eq)
> +       if (ebx_0.split.eq)
>                 hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;
>
> -       if (eax.split.cntr_subleaf) {
> +       if (eax_0.split.cntr_subleaf) {
>                 cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
> -                           &cntr, &fixed_cntr, &ecx, &edx);
> -               hybrid(pmu, cntr_mask64) = cntr;
> -               hybrid(pmu, fixed_cntr_mask64) = fixed_cntr;
> +                           &eax, &ebx, &ecx, &edx);
> +               hybrid(pmu, cntr_mask64) = eax;
> +               hybrid(pmu, fixed_cntr_mask64) = ebx;
> +               cntrs_mask = counter_mask(eax, ebx);
>         }
>
> -       if (eax.split.acr_subleaf) {
> +       if (eax_0.split.acr_subleaf) {
>                 cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_ACR_LEAF,
> -                           &cntr, &fixed_cntr, &ecx, &edx);
> +                           &eax, &ebx, &ecx, &edx);
>                 /* The mask of the counters which can be reloaded */
> -               hybrid(pmu, acr_cntr_mask64) = cntr | ((u64)fixed_cntr << INTEL_PMC_IDX_FIXED);
> -
> +               hybrid(pmu, acr_cntr_mask64) = counter_mask(eax, ebx);
>                 /* The mask of the counters which can cause a reload of reloadable counters */
> -               hybrid(pmu, acr_cause_mask64) = ecx | ((u64)edx << INTEL_PMC_IDX_FIXED);
> +               hybrid(pmu, acr_cause_mask64) = counter_mask(ecx, edx);
> +       }
> +
> +       /* Bits[5:4] should be set simultaneously if arch-PEBS is supported */
> +       if (eax_0.split.pebs_caps_subleaf && eax_0.split.pebs_cnts_subleaf) {
> +               cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_CAP_LEAF,
> +                           &eax, &ebx, &ecx, &edx);
> +               hybrid(pmu, arch_pebs_cap).caps = (u64)ebx << 32;

nit: It seems strange to use a u64 for caps but only use the top 32
bits. Did you intend to use the low 32-bits for eax?

Thanks,
Ian

> +
> +               cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_COUNTER_LEAF,
> +                           &eax, &ebx, &ecx, &edx);
> +               pebs_mask   = counter_mask(eax, ecx);
> +               pdists_mask = counter_mask(ebx, edx);
> +               hybrid(pmu, arch_pebs_cap).counters = pebs_mask;
> +               hybrid(pmu, arch_pebs_cap).pdists = pdists_mask;
> +
> +               if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask))
> +                       x86_pmu.arch_pebs = 0;
> +       } else {
> +               WARN_ON(x86_pmu.arch_pebs == 1);
> +               x86_pmu.arch_pebs = 0;
>         }
>
>         if (!intel_pmu_broken_perf_cap()) {
> @@ -6252,7 +6277,7 @@ tsx_is_visible(struct kobject *kobj, struct attribute *attr, int i)
>  static umode_t
>  pebs_is_visible(struct kobject *kobj, struct attribute *attr, int i)
>  {
> -       return x86_pmu.ds_pebs ? attr->mode : 0;
> +       return intel_pmu_has_pebs() ? attr->mode : 0;
>  }
>
>  static umode_t
> @@ -7728,6 +7753,9 @@ __init int intel_pmu_init(void)
>         if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
>                 update_pmu_cap(NULL);
>
> +       if (x86_pmu.arch_pebs)
> +               pr_cont("Architectural PEBS, ");
> +
>         intel_pmu_check_counters_mask(&x86_pmu.cntr_mask64,
>                                       &x86_pmu.fixed_cntr_mask64,
>                                       &x86_pmu.intel_ctrl);
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index c0b7ac1c7594..26e485eca0a0 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -1531,6 +1531,15 @@ static inline void intel_pmu_drain_large_pebs(struct cpu_hw_events *cpuc)
>                 intel_pmu_drain_pebs_buffer();
>  }
>
> +static void __intel_pmu_pebs_enable(struct perf_event *event)
> +{
> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +       struct hw_perf_event *hwc = &event->hw;
> +
> +       hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
> +       cpuc->pebs_enabled |= 1ULL << hwc->idx;
> +}
> +
>  void intel_pmu_pebs_enable(struct perf_event *event)
>  {
>         struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> @@ -1539,9 +1548,7 @@ void intel_pmu_pebs_enable(struct perf_event *event)
>         struct debug_store *ds = cpuc->ds;
>         unsigned int idx = hwc->idx;
>
> -       hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
> -
> -       cpuc->pebs_enabled |= 1ULL << hwc->idx;
> +       __intel_pmu_pebs_enable(event);
>
>         if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
>                 cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
> @@ -1603,14 +1610,22 @@ void intel_pmu_pebs_del(struct perf_event *event)
>         pebs_update_state(needed_cb, cpuc, event, false);
>  }
>
> -void intel_pmu_pebs_disable(struct perf_event *event)
> +static void __intel_pmu_pebs_disable(struct perf_event *event)
>  {
>         struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>         struct hw_perf_event *hwc = &event->hw;
>
>         intel_pmu_drain_large_pebs(cpuc);
> -
>         cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
> +       hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
> +}
> +
> +void intel_pmu_pebs_disable(struct perf_event *event)
> +{
> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +       struct hw_perf_event *hwc = &event->hw;
> +
> +       __intel_pmu_pebs_disable(event);
>
>         if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) &&
>             (x86_pmu.version < 5))
> @@ -1622,8 +1637,6 @@ void intel_pmu_pebs_disable(struct perf_event *event)
>
>         if (cpuc->enabled)
>                 wrmsrq(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
> -
> -       hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
>  }
>
>  void intel_pmu_pebs_enable_all(void)
> @@ -2669,11 +2682,26 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
>         }
>  }
>
> +static void __init intel_arch_pebs_init(void)
> +{
> +       /*
> +        * Current hybrid platforms either support arch-PEBS on all
> +        * core types or on none of them. So directly set the
> +        * x86_pmu.arch_pebs flag if the boot CPU supports arch-PEBS.
> +        */
> +       x86_pmu.arch_pebs = 1;
> +       x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
> +       x86_pmu.pebs_capable = ~0ULL;
> +
> +       x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
> +       x86_pmu.pebs_disable = __intel_pmu_pebs_disable;
> +}
> +
>  /*
>   * PEBS probe and setup
>   */
>
> -void __init intel_pebs_init(void)
> +static void __init intel_ds_pebs_init(void)
>  {
>         /*
>          * No support for 32bit formats
> @@ -2788,6 +2816,14 @@ void __init intel_pebs_init(void)
>         }
>  }
>
> +void __init intel_pebs_init(void)
> +{
> +       if (x86_pmu.intel_cap.pebs_format == 0xf)
> +               intel_arch_pebs_init();
> +       else
> +               intel_ds_pebs_init();
> +}
> +
>  void perf_restore_debug_store(void)
>  {
>         struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> index 285779c73479..ca5289980b52 100644
> --- a/arch/x86/events/perf_event.h
> +++ b/arch/x86/events/perf_event.h
> @@ -708,6 +708,12 @@ enum hybrid_pmu_type {
>         hybrid_big_small_tiny   = hybrid_big   | hybrid_small_tiny,
>  };
>
> +struct arch_pebs_cap {
> +       u64 caps;
> +       u64 counters;
> +       u64 pdists;
> +};
> +
>  struct x86_hybrid_pmu {
>         struct pmu                      pmu;
>         const char                      *name;
> @@ -752,6 +758,8 @@ struct x86_hybrid_pmu {
>                                         mid_ack         :1,
>                                         enabled_ack     :1;
>
> +       struct arch_pebs_cap            arch_pebs_cap;
> +
>         u64                             pebs_data_source[PERF_PEBS_DATA_SOURCE_MAX];
>  };
>
> @@ -906,7 +914,7 @@ struct x86_pmu {
>         union perf_capabilities intel_cap;
>
>         /*
> -        * Intel DebugStore bits
> +        * Intel DebugStore and PEBS bits
>          */
>         unsigned int    bts                     :1,
>                         bts_active              :1,
> @@ -917,7 +925,8 @@ struct x86_pmu {
>                         pebs_no_tlb             :1,
>                         pebs_no_isolation       :1,
>                         pebs_block              :1,
> -                       pebs_ept                :1;
> +                       pebs_ept                :1,
> +                       arch_pebs               :1;
>         int             pebs_record_size;
>         int             pebs_buffer_size;
>         u64             pebs_events_mask;
> @@ -929,6 +938,11 @@ struct x86_pmu {
>         u64             rtm_abort_event;
>         u64             pebs_capable;
>
> +       /*
> +        * Intel Architectural PEBS
> +        */
> +       struct arch_pebs_cap arch_pebs_cap;
> +
>         /*
>          * Intel LBR
>          */
> @@ -1216,7 +1230,7 @@ int x86_reserve_hardware(void);
>
>  void x86_release_hardware(void);
>
> -int x86_pmu_max_precise(void);
> +int x86_pmu_max_precise(struct pmu *pmu);
>
>  void hw_perf_lbr_event_destroy(struct perf_event *event);
>
> @@ -1791,6 +1805,11 @@ static inline int intel_pmu_max_num_pebs(struct pmu *pmu)
>         return fls((u32)hybrid(pmu, pebs_events_mask));
>  }
>
> +static inline bool intel_pmu_has_pebs(void)
> +{
> +       return x86_pmu.ds_pebs || x86_pmu.arch_pebs;
> +}
> +
>  #else /* CONFIG_CPU_SUP_INTEL */
>
>  static inline void reserve_ds_buffers(void)
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index 49a4d442f3fc..0dfa06722bab 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -200,6 +200,8 @@ union cpuid10_edx {
>  #define ARCH_PERFMON_EXT_LEAF                  0x00000023
>  #define ARCH_PERFMON_NUM_COUNTER_LEAF          0x1
>  #define ARCH_PERFMON_ACR_LEAF                  0x2
> +#define ARCH_PERFMON_PEBS_CAP_LEAF             0x4
> +#define ARCH_PERFMON_PEBS_COUNTER_LEAF         0x5
>
>  union cpuid35_eax {
>         struct {
> @@ -210,7 +212,10 @@ union cpuid35_eax {
>                 unsigned int    acr_subleaf:1;
>                 /* Events Sub-Leaf */
>                 unsigned int    events_subleaf:1;
> -               unsigned int    reserved:28;
> +               /* arch-PEBS Sub-Leaves */
> +               unsigned int    pebs_caps_subleaf:1;
> +               unsigned int    pebs_cnts_subleaf:1;
> +               unsigned int    reserved:26;
>         } split;
>         unsigned int            full;
>  };
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 11/12] perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  2025-10-29 10:21 ` [Patch v9 11/12] perf/x86/intel: Setup PEBS data configuration and enable legacy groups Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
@ 2026-03-05  1:20   ` Ian Rogers
  2026-03-06  2:17     ` Mi, Dapeng
  1 sibling, 1 reply; 48+ messages in thread
From: Ian Rogers @ 2026-03-05  1:20 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Zide Chen, Falcon Thomas, Xudong Hao, Kan Liang

On Wed, Oct 29, 2025 at 3:25 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>
> Unlike legacy PEBS, arch-PEBS provides per-counter PEBS data
> configuration by programming the IA32_PMC_GPx/FXx_CFG_C MSRs.
>
> This patch obtains the PEBS data configuration from the event
> attributes, writes it to the IA32_PMC_GPx/FXx_CFG_C MSRs and enables
> the corresponding PEBS groups.
>
> Note that this patch only enables XMM SIMD register sampling for
> arch-PEBS; sampling of the other SIMD registers (OPMASK/YMM/ZMM) will
> be supported once PMI-based sampling of those registers is available.
>
> Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
>  arch/x86/events/intel/core.c     | 136 ++++++++++++++++++++++++++++++-
>  arch/x86/events/intel/ds.c       |  17 ++++
>  arch/x86/events/perf_event.h     |   4 +
>  arch/x86/include/asm/intel_ds.h  |   7 ++
>  arch/x86/include/asm/msr-index.h |   8 ++
>  5 files changed, 171 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 40ccfd80d554..75cba28b86d5 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2563,6 +2563,45 @@ static void intel_pmu_disable_fixed(struct perf_event *event)
>         cpuc->fixed_ctrl_val &= ~mask;
>  }
>
> +static inline void __intel_pmu_update_event_ext(int idx, u64 ext)
> +{
> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +       u32 msr;
> +
> +       if (idx < INTEL_PMC_IDX_FIXED) {
> +               msr = MSR_IA32_PMC_V6_GP0_CFG_C +
> +                     x86_pmu.addr_offset(idx, false);
> +       } else {
> +               msr = MSR_IA32_PMC_V6_FX0_CFG_C +
> +                     x86_pmu.addr_offset(idx - INTEL_PMC_IDX_FIXED, false);
> +       }
> +
> +       cpuc->cfg_c_val[idx] = ext;
> +       wrmsrq(msr, ext);
> +}
> +
> +static void intel_pmu_disable_event_ext(struct perf_event *event)
> +{
> +       if (!x86_pmu.arch_pebs)
> +               return;
> +
> +       /*
> +        * Only clear the CFG_C MSR for PEBS counter group events;
> +        * this avoids the HW counter's value being incorrectly
> +        * added into other PEBS records after the PEBS counter
> +        * group events are disabled.
> +        *
> +        * For other events it's unnecessary to clear the CFG_C
> +        * MSRs since CFG_C has no effect while the counter is
> +        * disabled. This helps reduce the WRMSR overhead on
> +        * context switches.
> +        */
> +       if (!is_pebs_counter_event_group(event))
> +               return;
> +
> +       __intel_pmu_update_event_ext(event->hw.idx, 0);
> +}
> +
>  static void intel_pmu_disable_event(struct perf_event *event)
>  {
>         struct hw_perf_event *hwc = &event->hw;
> @@ -2571,9 +2610,12 @@ static void intel_pmu_disable_event(struct perf_event *event)
>         switch (idx) {
>         case 0 ... INTEL_PMC_IDX_FIXED - 1:
>                 intel_clear_masks(event, idx);
> +               intel_pmu_disable_event_ext(event);
>                 x86_pmu_disable_event(event);
>                 break;
>         case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
> +               intel_pmu_disable_event_ext(event);
> +               fallthrough;
>         case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
>                 intel_pmu_disable_fixed(event);
>                 break;
> @@ -2940,6 +2982,66 @@ static void intel_pmu_enable_acr(struct perf_event *event)
>
>  DEFINE_STATIC_CALL_NULL(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
>
> +static void intel_pmu_enable_event_ext(struct perf_event *event)
> +{
> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +       struct hw_perf_event *hwc = &event->hw;
> +       union arch_pebs_index old, new;
> +       struct arch_pebs_cap cap;
> +       u64 ext = 0;
> +
> +       if (!x86_pmu.arch_pebs)
> +               return;
> +
> +       cap = hybrid(cpuc->pmu, arch_pebs_cap);
> +
> +       if (event->attr.precise_ip) {
> +               u64 pebs_data_cfg = intel_get_arch_pebs_data_config(event);
> +
> +               ext |= ARCH_PEBS_EN;
> +               if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD)
> +                       ext |= (-hwc->sample_period) & ARCH_PEBS_RELOAD;

Nit: Perhaps there should be a warning if "hwc->sample_period >
ARCH_PEBS_RELOAD"?

Thanks,
Ian

> +
> +               if (pebs_data_cfg && cap.caps) {
> +                       if (pebs_data_cfg & PEBS_DATACFG_MEMINFO)
> +                               ext |= ARCH_PEBS_AUX & cap.caps;
> +
> +                       if (pebs_data_cfg & PEBS_DATACFG_GP)
> +                               ext |= ARCH_PEBS_GPR & cap.caps;
> +
> +                       if (pebs_data_cfg & PEBS_DATACFG_XMMS)
> +                               ext |= ARCH_PEBS_VECR_XMM & cap.caps;
> +
> +                       if (pebs_data_cfg & PEBS_DATACFG_LBRS)
> +                               ext |= ARCH_PEBS_LBR & cap.caps;
> +               }
> +
> +               if (cpuc->n_pebs == cpuc->n_large_pebs)
> +                       new.thresh = ARCH_PEBS_THRESH_MULTI;
> +               else
> +                       new.thresh = ARCH_PEBS_THRESH_SINGLE;
> +
> +               rdmsrq(MSR_IA32_PEBS_INDEX, old.whole);
> +               if (new.thresh != old.thresh || !old.en) {
> +                       if (old.thresh == ARCH_PEBS_THRESH_MULTI && old.wr > 0) {
> +                               /*
> +                                * Large PEBS was enabled.
> +                                * Drain PEBS buffer before applying the single PEBS.
> +                                */
> +                               intel_pmu_drain_pebs_buffer();
> +                       } else {
> +                               new.wr = 0;
> +                               new.full = 0;
> +                               new.en = 1;
> +                               wrmsrq(MSR_IA32_PEBS_INDEX, new.whole);
> +                       }
> +               }
> +       }
> +
> +       if (cpuc->cfg_c_val[hwc->idx] != ext)
> +               __intel_pmu_update_event_ext(hwc->idx, ext);
> +}
> +
>  static void intel_pmu_enable_event(struct perf_event *event)
>  {
>         u64 enable_mask = ARCH_PERFMON_EVENTSEL_ENABLE;
> @@ -2955,10 +3057,12 @@ static void intel_pmu_enable_event(struct perf_event *event)
>                         enable_mask |= ARCH_PERFMON_EVENTSEL_BR_CNTR;
>                 intel_set_masks(event, idx);
>                 static_call_cond(intel_pmu_enable_acr_event)(event);
> +               intel_pmu_enable_event_ext(event);
>                 __x86_pmu_enable_event(hwc, enable_mask);
>                 break;
>         case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
>                 static_call_cond(intel_pmu_enable_acr_event)(event);
> +               intel_pmu_enable_event_ext(event);
>                 fallthrough;
>         case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
>                 intel_pmu_enable_fixed(event);
> @@ -5301,6 +5405,30 @@ static inline bool intel_pmu_broken_perf_cap(void)
>         return false;
>  }
>
> +static inline void __intel_update_pmu_caps(struct pmu *pmu)
> +{
> +       struct pmu *dest_pmu = pmu ? pmu : x86_get_pmu(smp_processor_id());
> +
> +       if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_VECR_XMM)
> +               dest_pmu->capabilities |= PERF_PMU_CAP_EXTENDED_REGS;
> +}
> +
> +static inline void __intel_update_large_pebs_flags(struct pmu *pmu)
> +{
> +       u64 caps = hybrid(pmu, arch_pebs_cap).caps;
> +
> +       x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME;
> +       if (caps & ARCH_PEBS_LBR)
> +               x86_pmu.large_pebs_flags |= PERF_SAMPLE_BRANCH_STACK;
> +
> +       if (!(caps & ARCH_PEBS_AUX))
> +               x86_pmu.large_pebs_flags &= ~PERF_SAMPLE_DATA_SRC;
> +       if (!(caps & ARCH_PEBS_GPR)) {
> +               x86_pmu.large_pebs_flags &=
> +                       ~(PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER);
> +       }
> +}
> +
>  #define counter_mask(_gp, _fixed) ((_gp) | ((u64)(_fixed) << INTEL_PMC_IDX_FIXED))
>
>  static void update_pmu_cap(struct pmu *pmu)
> @@ -5349,8 +5477,12 @@ static void update_pmu_cap(struct pmu *pmu)
>                 hybrid(pmu, arch_pebs_cap).counters = pebs_mask;
>                 hybrid(pmu, arch_pebs_cap).pdists = pdists_mask;
>
> -               if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask))
> +               if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask)) {
>                         x86_pmu.arch_pebs = 0;
> +               } else {
> +                       __intel_update_pmu_caps(pmu);
> +                       __intel_update_large_pebs_flags(pmu);
> +               }
>         } else {
>                 WARN_ON(x86_pmu.arch_pebs == 1);
>                 x86_pmu.arch_pebs = 0;
> @@ -5514,6 +5646,8 @@ static void intel_pmu_cpu_starting(int cpu)
>                 }
>         }
>
> +       __intel_update_pmu_caps(cpuc->pmu);
> +
>         if (!cpuc->shared_regs)
>                 return;
>
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index 1179980f795b..c66e9b562de3 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -1528,6 +1528,18 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc,
>         }
>  }
>
> +u64 intel_get_arch_pebs_data_config(struct perf_event *event)
> +{
> +       u64 pebs_data_cfg = 0;
> +
> +       if (WARN_ON(event->hw.idx < 0 || event->hw.idx >= X86_PMC_IDX_MAX))
> +               return 0;
> +
> +       pebs_data_cfg |= pebs_update_adaptive_cfg(event);
> +
> +       return pebs_data_cfg;
> +}
> +
>  void intel_pmu_pebs_add(struct perf_event *event)
>  {
>         struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> @@ -2947,6 +2959,11 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
>
>         index.wr = 0;
>         index.full = 0;
> +       index.en = 1;
> +       if (cpuc->n_pebs == cpuc->n_large_pebs)
> +               index.thresh = ARCH_PEBS_THRESH_MULTI;
> +       else
> +               index.thresh = ARCH_PEBS_THRESH_SINGLE;
>         wrmsrq(MSR_IA32_PEBS_INDEX, index.whole);
>
>         mask = hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled;
> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> index 13f411bca6bc..3161ec0a3416 100644
> --- a/arch/x86/events/perf_event.h
> +++ b/arch/x86/events/perf_event.h
> @@ -304,6 +304,8 @@ struct cpu_hw_events {
>         /* Intel ACR configuration */
>         u64                     acr_cfg_b[X86_PMC_IDX_MAX];
>         u64                     acr_cfg_c[X86_PMC_IDX_MAX];
> +       /* Cached CFG_C values */
> +       u64                     cfg_c_val[X86_PMC_IDX_MAX];
>
>         /*
>          * Intel LBR bits
> @@ -1782,6 +1784,8 @@ void intel_pmu_pebs_data_source_cmt(void);
>
>  void intel_pmu_pebs_data_source_lnl(void);
>
> +u64 intel_get_arch_pebs_data_config(struct perf_event *event);
> +
>  int intel_pmu_setup_lbr_filter(struct perf_event *event);
>
>  void intel_pt_interrupt(void);
> diff --git a/arch/x86/include/asm/intel_ds.h b/arch/x86/include/asm/intel_ds.h
> index 023c2883f9f3..695f87efbeb8 100644
> --- a/arch/x86/include/asm/intel_ds.h
> +++ b/arch/x86/include/asm/intel_ds.h
> @@ -7,6 +7,13 @@
>  #define PEBS_BUFFER_SHIFT      4
>  #define PEBS_BUFFER_SIZE       (PAGE_SIZE << PEBS_BUFFER_SHIFT)
>
> +/*
> + * The largest PEBS record could consume a page; ensure that
> + * at least one record can still be written after the PMI fires.
> + */
> +#define ARCH_PEBS_THRESH_MULTI ((PEBS_BUFFER_SIZE - PAGE_SIZE) >> PEBS_BUFFER_SHIFT)
> +#define ARCH_PEBS_THRESH_SINGLE        1
> +
>  /* The maximal number of PEBS events: */
>  #define MAX_PEBS_EVENTS_FMT4   8
>  #define MAX_PEBS_EVENTS                32
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index fc7a4e7c718d..f1ef9ac38bfb 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -333,6 +333,14 @@
>  #define ARCH_PEBS_OFFSET_MASK          0x7fffff
>  #define ARCH_PEBS_INDEX_WR_SHIFT       4
>
> +#define ARCH_PEBS_RELOAD               0xffffffff
> +#define ARCH_PEBS_LBR_SHIFT            40
> +#define ARCH_PEBS_LBR                  (0x3ull << ARCH_PEBS_LBR_SHIFT)
> +#define ARCH_PEBS_VECR_XMM             BIT_ULL(49)
> +#define ARCH_PEBS_GPR                  BIT_ULL(61)
> +#define ARCH_PEBS_AUX                  BIT_ULL(62)
> +#define ARCH_PEBS_EN                   BIT_ULL(63)
> +
>  #define MSR_IA32_RTIT_CTL              0x00000570
>  #define RTIT_CTL_TRACEEN               BIT(0)
>  #define RTIT_CTL_CYCLEACC              BIT(1)
> --
> 2.34.1
>


* Re: [Patch v9 08/12] perf/x86/intel: Process arch-PEBS records or record fragments
  2026-03-03  0:20   ` [Patch v9 08/12] " Chun-Tse Shao
@ 2026-03-06  1:20     ` Mi, Dapeng
  0 siblings, 0 replies; 48+ messages in thread
From: Mi, Dapeng @ 2026-03-06  1:20 UTC (permalink / raw)
  To: Chun-Tse Shao
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane, linux-kernel, linux-perf-users,
	Dapeng Mi, Zide Chen, Falcon Thomas, Xudong Hao


On 3/3/2026 8:20 AM, Chun-Tse Shao wrote:
> On Wed, Oct 29, 2025 at 3:39 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>> A significant difference from adaptive PEBS is that an arch-PEBS
>> record supports fragments: a record can be split into several
>> independent fragments, each with its own arch-PEBS header.
>>
>> This patch defines the architectural PEBS record layout structures and
>> adds helpers to process arch-PEBS records or fragments. Only the legacy
>> PEBS groups (basic, GPR, XMM and LBR) are supported by this patch; the
>> newly added YMM/ZMM/OPMASK vector register capturing will be supported
>> in the future.
>>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>>  arch/x86/events/intel/core.c      |  13 +++
>>  arch/x86/events/intel/ds.c        | 184 ++++++++++++++++++++++++++++++
>>  arch/x86/include/asm/msr-index.h  |   6 +
>>  arch/x86/include/asm/perf_event.h |  96 ++++++++++++++++
>>  4 files changed, 299 insertions(+)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 9ce27b326923..de4dbde28adc 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -3215,6 +3215,19 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
>>                         status &= ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT;
>>         }
>>
>> +       /*
>> +        * Arch PEBS sets bit 54 in the global status register
>> +        */
>> +       if (__test_and_clear_bit(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT,
>> +                                (unsigned long *)&status)) {
>> +               handled++;
>> +               static_call(x86_pmu_drain_pebs)(regs, &data);
>> +
>> +               if (cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS] &&
>> +                   is_pebs_counter_event_group(cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS]))
>> +                       status &= ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT;
>> +       }
>> +
>>         /*
>>          * Intel PT
>>          */
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index 68664526443f..fe1bf373409e 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -2270,6 +2270,117 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
>>                         format_group);
>>  }
>>
>> +static inline bool arch_pebs_record_continued(struct arch_pebs_header *header)
>> +{
>> +       /* A continue bit or null PEBS record indicates a fragment follows. */
>> +       return header->cont || !(header->format & GENMASK_ULL(63, 16));
>> +}
>> +
>> +static void setup_arch_pebs_sample_data(struct perf_event *event,
>> +                                       struct pt_regs *iregs,
>> +                                       void *__pebs,
>> +                                       struct perf_sample_data *data,
>> +                                       struct pt_regs *regs)
>> +{
>> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +       u64 sample_type = event->attr.sample_type;
>> +       struct arch_pebs_header *header = NULL;
>> +       struct arch_pebs_aux *meminfo = NULL;
>> +       struct arch_pebs_gprs *gprs = NULL;
>> +       struct x86_perf_regs *perf_regs;
>> +       void *next_record;
>> +       void *at = __pebs;
>> +
>> +       if (at == NULL)
>> +               return;
>> +
>> +       perf_regs = container_of(regs, struct x86_perf_regs, regs);
>> +       perf_regs->xmm_regs = NULL;
>> +
>> +       __setup_perf_sample_data(event, iregs, data);
>> +
>> +       *regs = *iregs;
>> +
>> +again:
>> +       header = at;
>> +       next_record = at + sizeof(struct arch_pebs_header);
>> +       if (header->basic) {
>> +               struct arch_pebs_basic *basic = next_record;
>> +               u16 retire = 0;
>> +
>> +               next_record = basic + 1;
>> +
>> +               if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT)
>> +                       retire = basic->valid ? basic->retire : 0;
>> +               __setup_pebs_basic_group(event, regs, data, sample_type,
>> +                                basic->ip, basic->tsc, retire);
>> +       }
>> +
>> +       /*
>> +        * The MEMINFO record precedes the GP record, but
>> +        * PERF_SAMPLE_TRANSACTION needs gprs->ax. Save the
>> +        * pointer here and process it later.
>> +        */
>> +       if (header->aux) {
>> +               meminfo = next_record;
>> +               next_record = meminfo + 1;
>> +       }
>> +
>> +       if (header->gpr) {
>> +               gprs = next_record;
>> +               next_record = gprs + 1;
>> +
>> +               __setup_pebs_gpr_group(event, regs,
>> +                                      (struct pebs_gprs *)gprs,
>> +                                      sample_type);
>> +       }
>> +
>> +       if (header->aux) {
>> +               u64 ax = gprs ? gprs->ax : 0;
>> +
>> +               __setup_pebs_meminfo_group(event, data, sample_type,
>> +                                          meminfo->cache_latency,
>> +                                          meminfo->instr_latency,
>> +                                          meminfo->address, meminfo->aux,
>> +                                          meminfo->tsx_tuning, ax);
>> +       }
>> +
>> +       if (header->xmm) {
>> +               struct pebs_xmm *xmm;
>> +
>> +               next_record += sizeof(struct arch_pebs_xer_header);
>> +
>> +               xmm = next_record;
>> +               perf_regs->xmm_regs = xmm->xmm;
>> +               next_record = xmm + 1;
>> +       }
>> +
>> +       if (header->lbr) {
>> +               struct arch_pebs_lbr_header *lbr_header = next_record;
>> +               struct lbr_entry *lbr;
>> +               int num_lbr;
>> +
>> +               next_record = lbr_header + 1;
>> +               lbr = next_record;
>> +
>> +               num_lbr = header->lbr == ARCH_PEBS_LBR_NUM_VAR ?
>> +                               lbr_header->depth :
>> +                               header->lbr * ARCH_PEBS_BASE_LBR_ENTRIES;
>> +               next_record += num_lbr * sizeof(struct lbr_entry);
>> +
>> +               if (has_branch_stack(event)) {
>> +                       intel_pmu_store_pebs_lbrs(lbr);
>> +                       intel_pmu_lbr_save_brstack(data, cpuc, event);
>> +               }
>> +       }
>> +
>> +       /* Parse the following fragments, if any. */
>> +       if (arch_pebs_record_continued(header)) {
>> +               at = at + header->size;
> If header->size is 0, will it cause an infinite loop?
> I can see a 0 check below but not here.

No, there are two places in intel_pmu_drain_arch_pebs() that check the
header size. They break out of the while loop if any record or fragment
has zero size. Thanks.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/intel/ds.c?h=v7.0-rc2#n3268

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/intel/ds.c?h=v7.0-rc2#n3285


>
> Thanks,
> CT
>
>> +               goto again;
>> +       }
>> +}
>> +
>>  static inline void *
>>  get_next_pebs_record_by_bit(void *base, void *top, int bit)
>>  {
>> @@ -2753,6 +2864,78 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
>>                                             setup_pebs_adaptive_sample_data);
>>  }
>>
>> +static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
>> +                                     struct perf_sample_data *data)
>> +{
>> +       short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
>> +       void *last[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS];
>> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +       union arch_pebs_index index;
>> +       struct x86_perf_regs perf_regs;
>> +       struct pt_regs *regs = &perf_regs.regs;
>> +       void *base, *at, *top;
>> +       u64 mask;
>> +
>> +       rdmsrq(MSR_IA32_PEBS_INDEX, index.whole);
>> +
>> +       if (unlikely(!index.wr)) {
>> +               intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX);
>> +               return;
>> +       }
>> +
>> +       base = cpuc->ds_pebs_vaddr;
>> +       top = (void *)((u64)cpuc->ds_pebs_vaddr +
>> +                      (index.wr << ARCH_PEBS_INDEX_WR_SHIFT));
>> +
>> +       index.wr = 0;
>> +       index.full = 0;
>> +       wrmsrq(MSR_IA32_PEBS_INDEX, index.whole);
>> +
>> +       mask = hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled;
>> +
>> +       if (!iregs)
>> +               iregs = &dummy_iregs;
>> +
>> +       /* Process all but the last event for each counter. */
>> +       for (at = base; at < top;) {
>> +               struct arch_pebs_header *header;
>> +               struct arch_pebs_basic *basic;
>> +               u64 pebs_status;
>> +
>> +               header = at;
>> +
>> +               if (WARN_ON_ONCE(!header->size))
>> +                       break;
>> +
>> +               /* 1st fragment or single record must have basic group */
>> +               if (!header->basic) {
>> +                       at += header->size;
>> +                       continue;
>> +               }
>> +
>> +               basic = at + sizeof(struct arch_pebs_header);
>> +               pebs_status = mask & basic->applicable_counters;
>> +               __intel_pmu_handle_pebs_record(iregs, regs, data, at,
>> +                                              pebs_status, counts, last,
>> +                                              setup_arch_pebs_sample_data);
>> +
>> +               /* Skip non-last fragments */
>> +               while (arch_pebs_record_continued(header)) {
>> +                       if (!header->size)
>> +                               break;
>> +                       at += header->size;
>> +                       header = at;
>> +               }
>> +
>> +               /* Skip last fragment or the single record */
>> +               at += header->size;
>> +       }
>> +
>> +       __intel_pmu_handle_last_pebs_record(iregs, regs, data, mask,
>> +                                           counts, last,
>> +                                           setup_arch_pebs_sample_data);
>> +}
>> +
>>  static void __init intel_arch_pebs_init(void)
>>  {
>>         /*
>> @@ -2762,6 +2945,7 @@ static void __init intel_arch_pebs_init(void)
>>          */
>>         x86_pmu.arch_pebs = 1;
>>         x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
>> +       x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
>>         x86_pmu.pebs_capable = ~0ULL;
>>
>>         x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index 9e1720d73244..fc7a4e7c718d 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -327,6 +327,12 @@
>>                                          PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
>>                                          PERF_CAP_PEBS_TIMING_INFO)
>>
>> +/* Arch PEBS */
>> +#define MSR_IA32_PEBS_BASE             0x000003f4
>> +#define MSR_IA32_PEBS_INDEX            0x000003f5
>> +#define ARCH_PEBS_OFFSET_MASK          0x7fffff
>> +#define ARCH_PEBS_INDEX_WR_SHIFT       4
>> +
>>  #define MSR_IA32_RTIT_CTL              0x00000570
>>  #define RTIT_CTL_TRACEEN               BIT(0)
>>  #define RTIT_CTL_CYCLEACC              BIT(1)
>> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
>> index 0dfa06722bab..3b3848f0d339 100644
>> --- a/arch/x86/include/asm/perf_event.h
>> +++ b/arch/x86/include/asm/perf_event.h
>> @@ -437,6 +437,8 @@ static inline bool is_topdown_idx(int idx)
>>  #define GLOBAL_STATUS_LBRS_FROZEN              BIT_ULL(GLOBAL_STATUS_LBRS_FROZEN_BIT)
>>  #define GLOBAL_STATUS_TRACE_TOPAPMI_BIT                55
>>  #define GLOBAL_STATUS_TRACE_TOPAPMI            BIT_ULL(GLOBAL_STATUS_TRACE_TOPAPMI_BIT)
>> +#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT  54
>> +#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD      BIT_ULL(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT)
>>  #define GLOBAL_STATUS_PERF_METRICS_OVF_BIT     48
>>
>>  #define GLOBAL_CTRL_EN_PERF_METRICS            BIT_ULL(48)
>> @@ -507,6 +509,100 @@ struct pebs_cntr_header {
>>
>>  #define INTEL_CNTR_METRICS             0x3
>>
>> +/*
>> + * Arch PEBS
>> + */
>> +union arch_pebs_index {
>> +       struct {
>> +               u64 rsvd:4,
>> +                   wr:23,
>> +                   rsvd2:4,
>> +                   full:1,
>> +                   en:1,
>> +                   rsvd3:3,
>> +                   thresh:23,
>> +                   rsvd4:5;
>> +       };
>> +       u64 whole;
>> +};
>> +
>> +struct arch_pebs_header {
>> +       union {
>> +               u64 format;
>> +               struct {
>> +                       u64 size:16,    /* Record size */
>> +                           rsvd:14,
>> +                           mode:1,     /* 64BIT_MODE */
>> +                           cont:1,
>> +                           rsvd2:3,
>> +                           cntr:5,
>> +                           lbr:2,
>> +                           rsvd3:7,
>> +                           xmm:1,
>> +                           ymmh:1,
>> +                           rsvd4:2,
>> +                           opmask:1,
>> +                           zmmh:1,
>> +                           h16zmm:1,
>> +                           rsvd5:5,
>> +                           gpr:1,
>> +                           aux:1,
>> +                           basic:1;
>> +               };
>> +       };
>> +       u64 rsvd6;
>> +};
>> +
>> +struct arch_pebs_basic {
>> +       u64 ip;
>> +       u64 applicable_counters;
>> +       u64 tsc;
>> +       u64 retire      :16,    /* Retire Latency */
>> +           valid       :1,
>> +           rsvd        :47;
>> +       u64 rsvd2;
>> +       u64 rsvd3;
>> +};
>> +
>> +struct arch_pebs_aux {
>> +       u64 address;
>> +       u64 rsvd;
>> +       u64 rsvd2;
>> +       u64 rsvd3;
>> +       u64 rsvd4;
>> +       u64 aux;
>> +       u64 instr_latency       :16,
>> +           pad2                :16,
>> +           cache_latency       :16,
>> +           pad3                :16;
>> +       u64 tsx_tuning;
>> +};
>> +
>> +struct arch_pebs_gprs {
>> +       u64 flags, ip, ax, cx, dx, bx, sp, bp, si, di;
>> +       u64 r8, r9, r10, r11, r12, r13, r14, r15, ssp;
>> +       u64 rsvd;
>> +};
>> +
>> +struct arch_pebs_xer_header {
>> +       u64 xstate;
>> +       u64 rsvd;
>> +};
>> +
>> +#define ARCH_PEBS_LBR_NAN              0x0
>> +#define ARCH_PEBS_LBR_NUM_8            0x1
>> +#define ARCH_PEBS_LBR_NUM_16           0x2
>> +#define ARCH_PEBS_LBR_NUM_VAR          0x3
>> +#define ARCH_PEBS_BASE_LBR_ENTRIES     8
>> +struct arch_pebs_lbr_header {
>> +       u64 rsvd;
>> +       u64 ctl;
>> +       u64 depth;
>> +       u64 ler_from;
>> +       u64 ler_to;
>> +       u64 ler_info;
>> +};
>> +
>>  /*
>>   * AMD Extended Performance Monitoring and Debug cpuid feature detection
>>   */
>> --
>> 2.34.1
>>
>>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Patch v9 05/12] perf/x86/intel: Initialize architectural PEBS
  2026-03-05  0:50   ` [Patch v9 05/12] " Ian Rogers
@ 2026-03-06  1:38     ` Mi, Dapeng
  0 siblings, 0 replies; 48+ messages in thread
From: Mi, Dapeng @ 2026-03-06  1:38 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Zide Chen, Falcon Thomas, Xudong Hao


On 3/5/2026 8:50 AM, Ian Rogers wrote:
> On Wed, Oct 29, 2025 at 3:24 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>> arch-PEBS leverages CPUID.23H.4/5 sub-leaves to enumerate the
>> arch-PEBS supported capabilities and counter bitmaps. This patch
>> parses these two sub-leaves and initializes the arch-PEBS capabilities
>> and corresponding structures.
>>
>> Since the IA32_PEBS_ENABLE and MSR_PEBS_DATA_CFG MSRs no longer exist
>> with arch-PEBS, arch-PEBS doesn't need to manipulate these MSRs. Thus
>> add a simple pair of __intel_pmu_pebs_enable/disable() callbacks for
>> arch-PEBS.
>>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>>  arch/x86/events/core.c            | 21 ++++++++---
>>  arch/x86/events/intel/core.c      | 60 ++++++++++++++++++++++---------
>>  arch/x86/events/intel/ds.c        | 52 ++++++++++++++++++++++-----
>>  arch/x86/events/perf_event.h      | 25 +++++++++++--
>>  arch/x86/include/asm/perf_event.h |  7 +++-
>>  5 files changed, 132 insertions(+), 33 deletions(-)
>>
>> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
>> index 74479f9d6eed..f2402ae3ffa0 100644
>> --- a/arch/x86/events/core.c
>> +++ b/arch/x86/events/core.c
>> @@ -554,14 +554,22 @@ static inline int precise_br_compat(struct perf_event *event)
>>         return m == b;
>>  }
>>
>> -int x86_pmu_max_precise(void)
>> +int x86_pmu_max_precise(struct pmu *pmu)
>>  {
>>         int precise = 0;
>>
>> -       /* Support for constant skid */
>>         if (x86_pmu.pebs_active && !x86_pmu.pebs_broken) {
>> -               precise++;
>> +               /* arch PEBS */
>> +               if (x86_pmu.arch_pebs) {
>> +                       precise = 2;
>> +                       if (hybrid(pmu, arch_pebs_cap).pdists)
>> +                               precise++;
>> +
>> +                       return precise;
>> +               }
>>
>> +               /* legacy PEBS - support for constant skid */
>> +               precise++;
>>                 /* Support for IP fixup */
>>                 if (x86_pmu.lbr_nr || x86_pmu.intel_cap.pebs_format >= 2)
>>                         precise++;
>> @@ -569,13 +577,14 @@ int x86_pmu_max_precise(void)
>>                 if (x86_pmu.pebs_prec_dist)
>>                         precise++;
>>         }
>> +
>>         return precise;
>>  }
>>
>>  int x86_pmu_hw_config(struct perf_event *event)
>>  {
>>         if (event->attr.precise_ip) {
>> -               int precise = x86_pmu_max_precise();
>> +               int precise = x86_pmu_max_precise(event->pmu);
>>
>>                 if (event->attr.precise_ip > precise)
>>                         return -EOPNOTSUPP;
>> @@ -2630,7 +2639,9 @@ static ssize_t max_precise_show(struct device *cdev,
>>                                   struct device_attribute *attr,
>>                                   char *buf)
>>  {
>> -       return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise());
>> +       struct pmu *pmu = dev_get_drvdata(cdev);
>> +
>> +       return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise(pmu));
>>  }
>>
>>  static DEVICE_ATTR_RO(max_precise);
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index c88bcd5d2bc4..9ce27b326923 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -5271,34 +5271,59 @@ static inline bool intel_pmu_broken_perf_cap(void)
>>         return false;
>>  }
>>
>> +#define counter_mask(_gp, _fixed) ((_gp) | ((u64)(_fixed) << INTEL_PMC_IDX_FIXED))
>> +
>>  static void update_pmu_cap(struct pmu *pmu)
>>  {
>> -       unsigned int cntr, fixed_cntr, ecx, edx;
>> -       union cpuid35_eax eax;
>> -       union cpuid35_ebx ebx;
>> +       unsigned int eax, ebx, ecx, edx;
>> +       union cpuid35_eax eax_0;
>> +       union cpuid35_ebx ebx_0;
>> +       u64 cntrs_mask = 0;
>> +       u64 pebs_mask = 0;
>> +       u64 pdists_mask = 0;
>>
>> -       cpuid(ARCH_PERFMON_EXT_LEAF, &eax.full, &ebx.full, &ecx, &edx);
>> +       cpuid(ARCH_PERFMON_EXT_LEAF, &eax_0.full, &ebx_0.full, &ecx, &edx);
>>
>> -       if (ebx.split.umask2)
>> +       if (ebx_0.split.umask2)
>>                 hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
>> -       if (ebx.split.eq)
>> +       if (ebx_0.split.eq)
>>                 hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;
>>
>> -       if (eax.split.cntr_subleaf) {
>> +       if (eax_0.split.cntr_subleaf) {
>>                 cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
>> -                           &cntr, &fixed_cntr, &ecx, &edx);
>> -               hybrid(pmu, cntr_mask64) = cntr;
>> -               hybrid(pmu, fixed_cntr_mask64) = fixed_cntr;
>> +                           &eax, &ebx, &ecx, &edx);
>> +               hybrid(pmu, cntr_mask64) = eax;
>> +               hybrid(pmu, fixed_cntr_mask64) = ebx;
>> +               cntrs_mask = counter_mask(eax, ebx);
>>         }
>>
>> -       if (eax.split.acr_subleaf) {
>> +       if (eax_0.split.acr_subleaf) {
>>                 cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_ACR_LEAF,
>> -                           &cntr, &fixed_cntr, &ecx, &edx);
>> +                           &eax, &ebx, &ecx, &edx);
>>                 /* The mask of the counters which can be reloaded */
>> -               hybrid(pmu, acr_cntr_mask64) = cntr | ((u64)fixed_cntr << INTEL_PMC_IDX_FIXED);
>> -
>> +               hybrid(pmu, acr_cntr_mask64) = counter_mask(eax, ebx);
>>                 /* The mask of the counters which can cause a reload of reloadable counters */
>> -               hybrid(pmu, acr_cause_mask64) = ecx | ((u64)edx << INTEL_PMC_IDX_FIXED);
>> +               hybrid(pmu, acr_cause_mask64) = counter_mask(ecx, edx);
>> +       }
>> +
>> +       /* Bits[5:4] should be set simultaneously if arch-PEBS is supported */
>> +       if (eax_0.split.pebs_caps_subleaf && eax_0.split.pebs_cnts_subleaf) {
>> +               cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_CAP_LEAF,
>> +                           &eax, &ebx, &ecx, &edx);
>> +               hybrid(pmu, arch_pebs_cap).caps = (u64)ebx << 32;
> nit: It seems strange to use a u64 for caps but only use the top 32
> bits. Did you intend to use the low 32-bits for eax?

The intent of left-shifting the caps by 32 bits is to keep the same layout
for the caps as the XXX_CFG_C MSRs and the PEBS record format, both of
which put the caps field in the upper 32 bits. That makes it easy to
manipulate the caps field uniformly in these 3 places. Thanks.


>
> Thanks,
> Ian
>
>> +
>> +               cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_COUNTER_LEAF,
>> +                           &eax, &ebx, &ecx, &edx);
>> +               pebs_mask   = counter_mask(eax, ecx);
>> +               pdists_mask = counter_mask(ebx, edx);
>> +               hybrid(pmu, arch_pebs_cap).counters = pebs_mask;
>> +               hybrid(pmu, arch_pebs_cap).pdists = pdists_mask;
>> +
>> +               if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask))
>> +                       x86_pmu.arch_pebs = 0;
>> +       } else {
>> +               WARN_ON(x86_pmu.arch_pebs == 1);
>> +               x86_pmu.arch_pebs = 0;
>>         }
>>
>>         if (!intel_pmu_broken_perf_cap()) {
>> @@ -6252,7 +6277,7 @@ tsx_is_visible(struct kobject *kobj, struct attribute *attr, int i)
>>  static umode_t
>>  pebs_is_visible(struct kobject *kobj, struct attribute *attr, int i)
>>  {
>> -       return x86_pmu.ds_pebs ? attr->mode : 0;
>> +       return intel_pmu_has_pebs() ? attr->mode : 0;
>>  }
>>
>>  static umode_t
>> @@ -7728,6 +7753,9 @@ __init int intel_pmu_init(void)
>>         if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
>>                 update_pmu_cap(NULL);
>>
>> +       if (x86_pmu.arch_pebs)
>> +               pr_cont("Architectural PEBS, ");
>> +
>>         intel_pmu_check_counters_mask(&x86_pmu.cntr_mask64,
>>                                       &x86_pmu.fixed_cntr_mask64,
>>                                       &x86_pmu.intel_ctrl);
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index c0b7ac1c7594..26e485eca0a0 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -1531,6 +1531,15 @@ static inline void intel_pmu_drain_large_pebs(struct cpu_hw_events *cpuc)
>>                 intel_pmu_drain_pebs_buffer();
>>  }
>>
>> +static void __intel_pmu_pebs_enable(struct perf_event *event)
>> +{
>> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +       struct hw_perf_event *hwc = &event->hw;
>> +
>> +       hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
>> +       cpuc->pebs_enabled |= 1ULL << hwc->idx;
>> +}
>> +
>>  void intel_pmu_pebs_enable(struct perf_event *event)
>>  {
>>         struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> @@ -1539,9 +1548,7 @@ void intel_pmu_pebs_enable(struct perf_event *event)
>>         struct debug_store *ds = cpuc->ds;
>>         unsigned int idx = hwc->idx;
>>
>> -       hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
>> -
>> -       cpuc->pebs_enabled |= 1ULL << hwc->idx;
>> +       __intel_pmu_pebs_enable(event);
>>
>>         if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
>>                 cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
>> @@ -1603,14 +1610,22 @@ void intel_pmu_pebs_del(struct perf_event *event)
>>         pebs_update_state(needed_cb, cpuc, event, false);
>>  }
>>
>> -void intel_pmu_pebs_disable(struct perf_event *event)
>> +static void __intel_pmu_pebs_disable(struct perf_event *event)
>>  {
>>         struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>>         struct hw_perf_event *hwc = &event->hw;
>>
>>         intel_pmu_drain_large_pebs(cpuc);
>> -
>>         cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
>> +       hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
>> +}
>> +
>> +void intel_pmu_pebs_disable(struct perf_event *event)
>> +{
>> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +       struct hw_perf_event *hwc = &event->hw;
>> +
>> +       __intel_pmu_pebs_disable(event);
>>
>>         if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) &&
>>             (x86_pmu.version < 5))
>> @@ -1622,8 +1637,6 @@ void intel_pmu_pebs_disable(struct perf_event *event)
>>
>>         if (cpuc->enabled)
>>                 wrmsrq(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
>> -
>> -       hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
>>  }
>>
>>  void intel_pmu_pebs_enable_all(void)
>> @@ -2669,11 +2682,26 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
>>         }
>>  }
>>
>> +static void __init intel_arch_pebs_init(void)
>> +{
>> +       /*
>> +        * On current hybrid platforms, either all core types support
>> +        * arch-PEBS or none do. So directly set the x86_pmu.arch_pebs
>> +        * flag if the boot CPU supports arch-PEBS.
>> +        */
>> +       x86_pmu.arch_pebs = 1;
>> +       x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
>> +       x86_pmu.pebs_capable = ~0ULL;
>> +
>> +       x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
>> +       x86_pmu.pebs_disable = __intel_pmu_pebs_disable;
>> +}
>> +
>>  /*
>>   * PEBS probe and setup
>>   */
>>
>> -void __init intel_pebs_init(void)
>> +static void __init intel_ds_pebs_init(void)
>>  {
>>         /*
>>          * No support for 32bit formats
>> @@ -2788,6 +2816,14 @@ void __init intel_pebs_init(void)
>>         }
>>  }
>>
>> +void __init intel_pebs_init(void)
>> +{
>> +       if (x86_pmu.intel_cap.pebs_format == 0xf)
>> +               intel_arch_pebs_init();
>> +       else
>> +               intel_ds_pebs_init();
>> +}
>> +
>>  void perf_restore_debug_store(void)
>>  {
>>         struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
>> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
>> index 285779c73479..ca5289980b52 100644
>> --- a/arch/x86/events/perf_event.h
>> +++ b/arch/x86/events/perf_event.h
>> @@ -708,6 +708,12 @@ enum hybrid_pmu_type {
>>         hybrid_big_small_tiny   = hybrid_big   | hybrid_small_tiny,
>>  };
>>
>> +struct arch_pebs_cap {
>> +       u64 caps;
>> +       u64 counters;
>> +       u64 pdists;
>> +};
>> +
>>  struct x86_hybrid_pmu {
>>         struct pmu                      pmu;
>>         const char                      *name;
>> @@ -752,6 +758,8 @@ struct x86_hybrid_pmu {
>>                                         mid_ack         :1,
>>                                         enabled_ack     :1;
>>
>> +       struct arch_pebs_cap            arch_pebs_cap;
>> +
>>         u64                             pebs_data_source[PERF_PEBS_DATA_SOURCE_MAX];
>>  };
>>
>> @@ -906,7 +914,7 @@ struct x86_pmu {
>>         union perf_capabilities intel_cap;
>>
>>         /*
>> -        * Intel DebugStore bits
>> +        * Intel DebugStore and PEBS bits
>>          */
>>         unsigned int    bts                     :1,
>>                         bts_active              :1,
>> @@ -917,7 +925,8 @@ struct x86_pmu {
>>                         pebs_no_tlb             :1,
>>                         pebs_no_isolation       :1,
>>                         pebs_block              :1,
>> -                       pebs_ept                :1;
>> +                       pebs_ept                :1,
>> +                       arch_pebs               :1;
>>         int             pebs_record_size;
>>         int             pebs_buffer_size;
>>         u64             pebs_events_mask;
>> @@ -929,6 +938,11 @@ struct x86_pmu {
>>         u64             rtm_abort_event;
>>         u64             pebs_capable;
>>
>> +       /*
>> +        * Intel Architectural PEBS
>> +        */
>> +       struct arch_pebs_cap arch_pebs_cap;
>> +
>>         /*
>>          * Intel LBR
>>          */
>> @@ -1216,7 +1230,7 @@ int x86_reserve_hardware(void);
>>
>>  void x86_release_hardware(void);
>>
>> -int x86_pmu_max_precise(void);
>> +int x86_pmu_max_precise(struct pmu *pmu);
>>
>>  void hw_perf_lbr_event_destroy(struct perf_event *event);
>>
>> @@ -1791,6 +1805,11 @@ static inline int intel_pmu_max_num_pebs(struct pmu *pmu)
>>         return fls((u32)hybrid(pmu, pebs_events_mask));
>>  }
>>
>> +static inline bool intel_pmu_has_pebs(void)
>> +{
>> +       return x86_pmu.ds_pebs || x86_pmu.arch_pebs;
>> +}
>> +
>>  #else /* CONFIG_CPU_SUP_INTEL */
>>
>>  static inline void reserve_ds_buffers(void)
>> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
>> index 49a4d442f3fc..0dfa06722bab 100644
>> --- a/arch/x86/include/asm/perf_event.h
>> +++ b/arch/x86/include/asm/perf_event.h
>> @@ -200,6 +200,8 @@ union cpuid10_edx {
>>  #define ARCH_PERFMON_EXT_LEAF                  0x00000023
>>  #define ARCH_PERFMON_NUM_COUNTER_LEAF          0x1
>>  #define ARCH_PERFMON_ACR_LEAF                  0x2
>> +#define ARCH_PERFMON_PEBS_CAP_LEAF             0x4
>> +#define ARCH_PERFMON_PEBS_COUNTER_LEAF         0x5
>>
>>  union cpuid35_eax {
>>         struct {
>> @@ -210,7 +212,10 @@ union cpuid35_eax {
>>                 unsigned int    acr_subleaf:1;
>>                 /* Events Sub-Leaf */
>>                 unsigned int    events_subleaf:1;
>> -               unsigned int    reserved:28;
>> +               /* arch-PEBS Sub-Leaves */
>> +               unsigned int    pebs_caps_subleaf:1;
>> +               unsigned int    pebs_cnts_subleaf:1;
>> +               unsigned int    reserved:26;
>>         } split;
>>         unsigned int            full;
>>  };
>> --
>> 2.34.1
>>


* Re: [Patch v9 11/12] perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  2026-03-05  1:20   ` [Patch v9 11/12] " Ian Rogers
@ 2026-03-06  2:17     ` Mi, Dapeng
  0 siblings, 0 replies; 48+ messages in thread
From: Mi, Dapeng @ 2026-03-06  2:17 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Zide Chen, Falcon Thomas, Xudong Hao, Kan Liang


On 3/5/2026 9:20 AM, Ian Rogers wrote:
> On Wed, Oct 29, 2025 at 3:25 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>> Unlike legacy PEBS, arch-PEBS provides per-counter PEBS data
>> configuration by programming the IA32_PMC_GPx/FXx_CFG_C MSRs.
>>
>> This patch obtains the PEBS data configuration from the event attribute,
>> writes it to the IA32_PMC_GPx/FXx_CFG_C MSRs and enables the
>> corresponding PEBS groups.
>>
>> Please note this patch only enables XMM SIMD register sampling for
>> arch-PEBS; sampling of the other SIMD registers (OPMASK/YMM/ZMM) will
>> be supported once PMI-based sampling of those registers is supported.
>>
>> Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>>  arch/x86/events/intel/core.c     | 136 ++++++++++++++++++++++++++++++-
>>  arch/x86/events/intel/ds.c       |  17 ++++
>>  arch/x86/events/perf_event.h     |   4 +
>>  arch/x86/include/asm/intel_ds.h  |   7 ++
>>  arch/x86/include/asm/msr-index.h |   8 ++
>>  5 files changed, 171 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 40ccfd80d554..75cba28b86d5 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -2563,6 +2563,45 @@ static void intel_pmu_disable_fixed(struct perf_event *event)
>>         cpuc->fixed_ctrl_val &= ~mask;
>>  }
>>
>> +static inline void __intel_pmu_update_event_ext(int idx, u64 ext)
>> +{
>> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +       u32 msr;
>> +
>> +       if (idx < INTEL_PMC_IDX_FIXED) {
>> +               msr = MSR_IA32_PMC_V6_GP0_CFG_C +
>> +                     x86_pmu.addr_offset(idx, false);
>> +       } else {
>> +               msr = MSR_IA32_PMC_V6_FX0_CFG_C +
>> +                     x86_pmu.addr_offset(idx - INTEL_PMC_IDX_FIXED, false);
>> +       }
>> +
>> +       cpuc->cfg_c_val[idx] = ext;
>> +       wrmsrq(msr, ext);
>> +}
>> +
>> +static void intel_pmu_disable_event_ext(struct perf_event *event)
>> +{
>> +       if (!x86_pmu.arch_pebs)
>> +               return;
>> +
>> +       /*
>> +        * Only clear the CFG_C MSR for PEBS counter group events;
>> +        * this prevents the HW counter's value from being added
>> +        * into other PEBS records incorrectly after PEBS counter
>> +        * group events are disabled.
>> +        *
>> +        * For other events, it's unnecessary to clear CFG_C MSRs
>> +        * since CFG_C doesn't take effect if counter is in
>> +        * disabled state. That helps to reduce the WRMSR overhead
>> +        * in context switches.
>> +        */
>> +       if (!is_pebs_counter_event_group(event))
>> +               return;
>> +
>> +       __intel_pmu_update_event_ext(event->hw.idx, 0);
>> +}
>> +
>>  static void intel_pmu_disable_event(struct perf_event *event)
>>  {
>>         struct hw_perf_event *hwc = &event->hw;
>> @@ -2571,9 +2610,12 @@ static void intel_pmu_disable_event(struct perf_event *event)
>>         switch (idx) {
>>         case 0 ... INTEL_PMC_IDX_FIXED - 1:
>>                 intel_clear_masks(event, idx);
>> +               intel_pmu_disable_event_ext(event);
>>                 x86_pmu_disable_event(event);
>>                 break;
>>         case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
>> +               intel_pmu_disable_event_ext(event);
>> +               fallthrough;
>>         case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
>>                 intel_pmu_disable_fixed(event);
>>                 break;
>> @@ -2940,6 +2982,66 @@ static void intel_pmu_enable_acr(struct perf_event *event)
>>
>>  DEFINE_STATIC_CALL_NULL(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
>>
>> +static void intel_pmu_enable_event_ext(struct perf_event *event)
>> +{
>> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +       struct hw_perf_event *hwc = &event->hw;
>> +       union arch_pebs_index old, new;
>> +       struct arch_pebs_cap cap;
>> +       u64 ext = 0;
>> +
>> +       if (!x86_pmu.arch_pebs)
>> +               return;
>> +
>> +       cap = hybrid(cpuc->pmu, arch_pebs_cap);
>> +
>> +       if (event->attr.precise_ip) {
>> +               u64 pebs_data_cfg = intel_get_arch_pebs_data_config(event);
>> +
>> +               ext |= ARCH_PEBS_EN;
>> +               if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD)
>> +                       ext |= (-hwc->sample_period) & ARCH_PEBS_RELOAD;
> Nit: Perhaps there should be a warning if "hwc->sample_period >
> ARCH_PEBS_RELOAD"?

Hmm, strictly speaking, we should not check whether hwc->sample_period is
larger than ARCH_PEBS_RELOAD, since a period larger than ARCH_PEBS_RELOAD
is still allowed as long as it does not exceed x86_pmu.max_period (which
is "x86_pmu.cntval_mask >> 1" on modern Intel platforms). But we do need
a sample_period check for arch-PEBS, just like what ACR does below.

```

        /* The reload value cannot exceeds the max period */
        if (event->attr.sample_period > x86_pmu.max_period)
            return -EINVAL;

```

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/intel/core.c?h=v7.0-rc2#n4789

I would add a patch to do this check. Thanks.


>
> Thanks,
> Ian
>
>> +
>> +               if (pebs_data_cfg && cap.caps) {
>> +                       if (pebs_data_cfg & PEBS_DATACFG_MEMINFO)
>> +                               ext |= ARCH_PEBS_AUX & cap.caps;
>> +
>> +                       if (pebs_data_cfg & PEBS_DATACFG_GP)
>> +                               ext |= ARCH_PEBS_GPR & cap.caps;
>> +
>> +                       if (pebs_data_cfg & PEBS_DATACFG_XMMS)
>> +                               ext |= ARCH_PEBS_VECR_XMM & cap.caps;
>> +
>> +                       if (pebs_data_cfg & PEBS_DATACFG_LBRS)
>> +                               ext |= ARCH_PEBS_LBR & cap.caps;
>> +               }
>> +
>> +               if (cpuc->n_pebs == cpuc->n_large_pebs)
>> +                       new.thresh = ARCH_PEBS_THRESH_MULTI;
>> +               else
>> +                       new.thresh = ARCH_PEBS_THRESH_SINGLE;
>> +
>> +               rdmsrq(MSR_IA32_PEBS_INDEX, old.whole);
>> +               if (new.thresh != old.thresh || !old.en) {
>> +                       if (old.thresh == ARCH_PEBS_THRESH_MULTI && old.wr > 0) {
>> +                               /*
>> +                                * Large PEBS was enabled.
>> +                                * Drain PEBS buffer before applying the single PEBS.
>> +                                */
>> +                               intel_pmu_drain_pebs_buffer();
>> +                       } else {
>> +                               new.wr = 0;
>> +                               new.full = 0;
>> +                               new.en = 1;
>> +                               wrmsrq(MSR_IA32_PEBS_INDEX, new.whole);
>> +                       }
>> +               }
>> +       }
>> +
>> +       if (cpuc->cfg_c_val[hwc->idx] != ext)
>> +               __intel_pmu_update_event_ext(hwc->idx, ext);
>> +}
>> +
>>  static void intel_pmu_enable_event(struct perf_event *event)
>>  {
>>         u64 enable_mask = ARCH_PERFMON_EVENTSEL_ENABLE;
>> @@ -2955,10 +3057,12 @@ static void intel_pmu_enable_event(struct perf_event *event)
>>                         enable_mask |= ARCH_PERFMON_EVENTSEL_BR_CNTR;
>>                 intel_set_masks(event, idx);
>>                 static_call_cond(intel_pmu_enable_acr_event)(event);
>> +               intel_pmu_enable_event_ext(event);
>>                 __x86_pmu_enable_event(hwc, enable_mask);
>>                 break;
>>         case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
>>                 static_call_cond(intel_pmu_enable_acr_event)(event);
>> +               intel_pmu_enable_event_ext(event);
>>                 fallthrough;
>>         case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
>>                 intel_pmu_enable_fixed(event);
>> @@ -5301,6 +5405,30 @@ static inline bool intel_pmu_broken_perf_cap(void)
>>         return false;
>>  }
>>
>> +static inline void __intel_update_pmu_caps(struct pmu *pmu)
>> +{
>> +       struct pmu *dest_pmu = pmu ? pmu : x86_get_pmu(smp_processor_id());
>> +
>> +       if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_VECR_XMM)
>> +               dest_pmu->capabilities |= PERF_PMU_CAP_EXTENDED_REGS;
>> +}
>> +
>> +static inline void __intel_update_large_pebs_flags(struct pmu *pmu)
>> +{
>> +       u64 caps = hybrid(pmu, arch_pebs_cap).caps;
>> +
>> +       x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME;
>> +       if (caps & ARCH_PEBS_LBR)
>> +               x86_pmu.large_pebs_flags |= PERF_SAMPLE_BRANCH_STACK;
>> +
>> +       if (!(caps & ARCH_PEBS_AUX))
>> +               x86_pmu.large_pebs_flags &= ~PERF_SAMPLE_DATA_SRC;
>> +       if (!(caps & ARCH_PEBS_GPR)) {
>> +               x86_pmu.large_pebs_flags &=
>> +                       ~(PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER);
>> +       }
>> +}
>> +
>>  #define counter_mask(_gp, _fixed) ((_gp) | ((u64)(_fixed) << INTEL_PMC_IDX_FIXED))
>>
>>  static void update_pmu_cap(struct pmu *pmu)
>> @@ -5349,8 +5477,12 @@ static void update_pmu_cap(struct pmu *pmu)
>>                 hybrid(pmu, arch_pebs_cap).counters = pebs_mask;
>>                 hybrid(pmu, arch_pebs_cap).pdists = pdists_mask;
>>
>> -               if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask))
>> +               if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask)) {
>>                         x86_pmu.arch_pebs = 0;
>> +               } else {
>> +                       __intel_update_pmu_caps(pmu);
>> +                       __intel_update_large_pebs_flags(pmu);
>> +               }
>>         } else {
>>                 WARN_ON(x86_pmu.arch_pebs == 1);
>>                 x86_pmu.arch_pebs = 0;
>> @@ -5514,6 +5646,8 @@ static void intel_pmu_cpu_starting(int cpu)
>>                 }
>>         }
>>
>> +       __intel_update_pmu_caps(cpuc->pmu);
>> +
>>         if (!cpuc->shared_regs)
>>                 return;
>>
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index 1179980f795b..c66e9b562de3 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -1528,6 +1528,18 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc,
>>         }
>>  }
>>
>> +u64 intel_get_arch_pebs_data_config(struct perf_event *event)
>> +{
>> +       u64 pebs_data_cfg = 0;
>> +
>> +       if (WARN_ON(event->hw.idx < 0 || event->hw.idx >= X86_PMC_IDX_MAX))
>> +               return 0;
>> +
>> +       pebs_data_cfg |= pebs_update_adaptive_cfg(event);
>> +
>> +       return pebs_data_cfg;
>> +}
>> +
>>  void intel_pmu_pebs_add(struct perf_event *event)
>>  {
>>         struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> @@ -2947,6 +2959,11 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs,
>>
>>         index.wr = 0;
>>         index.full = 0;
>> +       index.en = 1;
>> +       if (cpuc->n_pebs == cpuc->n_large_pebs)
>> +               index.thresh = ARCH_PEBS_THRESH_MULTI;
>> +       else
>> +               index.thresh = ARCH_PEBS_THRESH_SINGLE;
>>         wrmsrq(MSR_IA32_PEBS_INDEX, index.whole);
>>
>>         mask = hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled;
>> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
>> index 13f411bca6bc..3161ec0a3416 100644
>> --- a/arch/x86/events/perf_event.h
>> +++ b/arch/x86/events/perf_event.h
>> @@ -304,6 +304,8 @@ struct cpu_hw_events {
>>         /* Intel ACR configuration */
>>         u64                     acr_cfg_b[X86_PMC_IDX_MAX];
>>         u64                     acr_cfg_c[X86_PMC_IDX_MAX];
>> +       /* Cached CFG_C values */
>> +       u64                     cfg_c_val[X86_PMC_IDX_MAX];
>>
>>         /*
>>          * Intel LBR bits
>> @@ -1782,6 +1784,8 @@ void intel_pmu_pebs_data_source_cmt(void);
>>
>>  void intel_pmu_pebs_data_source_lnl(void);
>>
>> +u64 intel_get_arch_pebs_data_config(struct perf_event *event);
>> +
>>  int intel_pmu_setup_lbr_filter(struct perf_event *event);
>>
>>  void intel_pt_interrupt(void);
>> diff --git a/arch/x86/include/asm/intel_ds.h b/arch/x86/include/asm/intel_ds.h
>> index 023c2883f9f3..695f87efbeb8 100644
>> --- a/arch/x86/include/asm/intel_ds.h
>> +++ b/arch/x86/include/asm/intel_ds.h
>> @@ -7,6 +7,13 @@
>>  #define PEBS_BUFFER_SHIFT      4
>>  #define PEBS_BUFFER_SIZE       (PAGE_SIZE << PEBS_BUFFER_SHIFT)
>>
>> +/*
>> + * The largest PEBS record could consume a page, ensure
>> + * a record at least can be written after triggering PMI.
>> + */
>> +#define ARCH_PEBS_THRESH_MULTI ((PEBS_BUFFER_SIZE - PAGE_SIZE) >> PEBS_BUFFER_SHIFT)
>> +#define ARCH_PEBS_THRESH_SINGLE        1
>> +
>>  /* The maximal number of PEBS events: */
>>  #define MAX_PEBS_EVENTS_FMT4   8
>>  #define MAX_PEBS_EVENTS                32
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index fc7a4e7c718d..f1ef9ac38bfb 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -333,6 +333,14 @@
>>  #define ARCH_PEBS_OFFSET_MASK          0x7fffff
>>  #define ARCH_PEBS_INDEX_WR_SHIFT       4
>>
>> +#define ARCH_PEBS_RELOAD               0xffffffff
>> +#define ARCH_PEBS_LBR_SHIFT            40
>> +#define ARCH_PEBS_LBR                  (0x3ull << ARCH_PEBS_LBR_SHIFT)
>> +#define ARCH_PEBS_VECR_XMM             BIT_ULL(49)
>> +#define ARCH_PEBS_GPR                  BIT_ULL(61)
>> +#define ARCH_PEBS_AUX                  BIT_ULL(62)
>> +#define ARCH_PEBS_EN                   BIT_ULL(63)
>> +
>>  #define MSR_IA32_RTIT_CTL              0x00000570
>>  #define RTIT_CTL_TRACEEN               BIT(0)
>>  #define RTIT_CTL_CYCLEACC              BIT(1)
>> --
>> 2.34.1
>>


* Re: [Patch v9 12/12] perf/x86/intel: Add counter group support for arch-PEBS
  2025-10-29 10:21 ` [Patch v9 12/12] perf/x86/intel: Add counter group support for arch-PEBS Dapeng Mi
  2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
@ 2026-03-09 22:59   ` Ian Rogers
  2026-03-10  2:06     ` Mi, Dapeng
  1 sibling, 1 reply; 48+ messages in thread
From: Ian Rogers @ 2026-03-09 22:59 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Zide Chen, Falcon Thomas, Xudong Hao

On Wed, Oct 29, 2025 at 3:25 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>
> Based on the previous adaptive PEBS counter snapshot support, add
> counter group support for architectural PEBS. Since arch-PEBS shares
> the same counter group layout as adaptive PEBS, directly reuse the
> __setup_pebs_counter_group() helper to process the arch-PEBS counter
> group.
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
>  arch/x86/events/intel/core.c      | 38 ++++++++++++++++++++++++++++---
>  arch/x86/events/intel/ds.c        | 29 ++++++++++++++++++++---
>  arch/x86/include/asm/msr-index.h  |  6 +++++
>  arch/x86/include/asm/perf_event.h | 13 ++++++++---
>  4 files changed, 77 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 75cba28b86d5..cb64018321dd 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3014,6 +3014,17 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
>
>                         if (pebs_data_cfg & PEBS_DATACFG_LBRS)
>                                 ext |= ARCH_PEBS_LBR & cap.caps;
> +
> +                       if (pebs_data_cfg &
> +                           (PEBS_DATACFG_CNTR_MASK << PEBS_DATACFG_CNTR_SHIFT))
> +                               ext |= ARCH_PEBS_CNTR_GP & cap.caps;
> +
> +                       if (pebs_data_cfg &
> +                           (PEBS_DATACFG_FIX_MASK << PEBS_DATACFG_FIX_SHIFT))
> +                               ext |= ARCH_PEBS_CNTR_FIXED & cap.caps;
> +
> +                       if (pebs_data_cfg & PEBS_DATACFG_METRICS)
> +                               ext |= ARCH_PEBS_CNTR_METRICS & cap.caps;
>                 }
>
>                 if (cpuc->n_pebs == cpuc->n_large_pebs)
> @@ -3038,6 +3049,9 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
>                 }
>         }
>
> +       if (is_pebs_counter_event_group(event))
> +               ext |= ARCH_PEBS_CNTR_ALLOW;
> +
>         if (cpuc->cfg_c_val[hwc->idx] != ext)
>                 __intel_pmu_update_event_ext(hwc->idx, ext);
>  }
> @@ -4323,6 +4337,20 @@ static bool intel_pmu_is_acr_group(struct perf_event *event)
>         return false;
>  }
>
> +static inline bool intel_pmu_has_pebs_counter_group(struct pmu *pmu)
> +{
> +       u64 caps;
> +
> +       if (x86_pmu.intel_cap.pebs_format >= 6 && x86_pmu.intel_cap.pebs_baseline)
> +               return true;
> +
> +       caps = hybrid(pmu, arch_pebs_cap).caps;
> +       if (x86_pmu.arch_pebs && (caps & ARCH_PEBS_CNTR_MASK))
> +               return true;
> +
> +       return false;
> +}
> +
>  static inline void intel_pmu_set_acr_cntr_constr(struct perf_event *event,
>                                                  u64 *cause_mask, int *num)
>  {
> @@ -4471,8 +4499,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
>         }
>
>         if ((event->attr.sample_type & PERF_SAMPLE_READ) &&
> -           (x86_pmu.intel_cap.pebs_format >= 6) &&
> -           x86_pmu.intel_cap.pebs_baseline &&
> +           intel_pmu_has_pebs_counter_group(event->pmu) &&
>             is_sampling_event(event) &&
>             event->attr.precise_ip)
>                 event->group_leader->hw.flags |= PERF_X86_EVENT_PEBS_CNTR;
> @@ -5420,6 +5447,8 @@ static inline void __intel_update_large_pebs_flags(struct pmu *pmu)
>         x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME;
>         if (caps & ARCH_PEBS_LBR)
>                 x86_pmu.large_pebs_flags |= PERF_SAMPLE_BRANCH_STACK;
> +       if (caps & ARCH_PEBS_CNTR_MASK)
> +               x86_pmu.large_pebs_flags |= PERF_SAMPLE_READ;
>
>         if (!(caps & ARCH_PEBS_AUX))
>                 x86_pmu.large_pebs_flags &= ~PERF_SAMPLE_DATA_SRC;
> @@ -7134,8 +7163,11 @@ __init int intel_pmu_init(void)
>          * Many features on and after V6 require dynamic constraint,
>          * e.g., Arch PEBS, ACR.
>          */
> -       if (version >= 6)
> +       if (version >= 6) {
>                 x86_pmu.flags |= PMU_FL_DYN_CONSTRAINT;
> +               x86_pmu.late_setup = intel_pmu_late_setup;
> +       }
> +
>         /*
>          * Install the hw-cache-events table:
>          */
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index c66e9b562de3..c93bf971d97b 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -1530,13 +1530,20 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc,
>
>  u64 intel_get_arch_pebs_data_config(struct perf_event *event)
>  {
> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>         u64 pebs_data_cfg = 0;
> +       u64 cntr_mask;
>
>         if (WARN_ON(event->hw.idx < 0 || event->hw.idx >= X86_PMC_IDX_MAX))
>                 return 0;
>
>         pebs_data_cfg |= pebs_update_adaptive_cfg(event);
>
> +       cntr_mask = (PEBS_DATACFG_CNTR_MASK << PEBS_DATACFG_CNTR_SHIFT) |
> +                   (PEBS_DATACFG_FIX_MASK << PEBS_DATACFG_FIX_SHIFT) |
> +                   PEBS_DATACFG_CNTR | PEBS_DATACFG_METRICS;
> +       pebs_data_cfg |= cpuc->pebs_data_cfg & cntr_mask;
> +
>         return pebs_data_cfg;
>  }
>
> @@ -2444,6 +2451,24 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
>                 }
>         }
>
> +       if (header->cntr) {
> +               struct arch_pebs_cntr_header *cntr = next_record;
> +               unsigned int nr;
> +
> +               next_record += sizeof(struct arch_pebs_cntr_header);
> +
> +               if (is_pebs_counter_event_group(event)) {
> +                       __setup_pebs_counter_group(cpuc, event,
> +                               (struct pebs_cntr_header *)cntr, next_record);
> +                       data->sample_flags |= PERF_SAMPLE_READ;
> +               }
> +
> +               nr = hweight32(cntr->cntr) + hweight32(cntr->fixed);
> +               if (cntr->metrics == INTEL_CNTR_METRICS)
> +                       nr += 2;
> +               next_record += nr * sizeof(u64);
> +       }
> +
>         /* Parse followed fragments if there are. */
>         if (arch_pebs_record_continued(header)) {
>                 at = at + header->size;
> @@ -3094,10 +3119,8 @@ static void __init intel_ds_pebs_init(void)
>                         break;
>
>                 case 6:
> -                       if (x86_pmu.intel_cap.pebs_baseline) {
> +                       if (x86_pmu.intel_cap.pebs_baseline)
>                                 x86_pmu.large_pebs_flags |= PERF_SAMPLE_READ;
> -                               x86_pmu.late_setup = intel_pmu_late_setup;
> -                       }

Hi Dapeng,

I'm trying to understand why the late_setup initialization was changed
here and its connection with counter group support. I couldn't find a
mention in the commit message.

Thanks,
Ian

>                         fallthrough;
>                 case 5:
>                         x86_pmu.pebs_ept = 1;
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index f1ef9ac38bfb..65cc528fbad8 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -334,12 +334,18 @@
>  #define ARCH_PEBS_INDEX_WR_SHIFT       4
>
>  #define ARCH_PEBS_RELOAD               0xffffffff
> +#define ARCH_PEBS_CNTR_ALLOW           BIT_ULL(35)
> +#define ARCH_PEBS_CNTR_GP              BIT_ULL(36)
> +#define ARCH_PEBS_CNTR_FIXED           BIT_ULL(37)
> +#define ARCH_PEBS_CNTR_METRICS         BIT_ULL(38)
>  #define ARCH_PEBS_LBR_SHIFT            40
>  #define ARCH_PEBS_LBR                  (0x3ull << ARCH_PEBS_LBR_SHIFT)
>  #define ARCH_PEBS_VECR_XMM             BIT_ULL(49)
>  #define ARCH_PEBS_GPR                  BIT_ULL(61)
>  #define ARCH_PEBS_AUX                  BIT_ULL(62)
>  #define ARCH_PEBS_EN                   BIT_ULL(63)
> +#define ARCH_PEBS_CNTR_MASK            (ARCH_PEBS_CNTR_GP | ARCH_PEBS_CNTR_FIXED | \
> +                                        ARCH_PEBS_CNTR_METRICS)
>
>  #define MSR_IA32_RTIT_CTL              0x00000570
>  #define RTIT_CTL_TRACEEN               BIT(0)
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index 3b3848f0d339..7276ba70c88a 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -141,16 +141,16 @@
>  #define ARCH_PERFMON_EVENTS_COUNT                      7
>
>  #define PEBS_DATACFG_MEMINFO   BIT_ULL(0)
> -#define PEBS_DATACFG_GP        BIT_ULL(1)
> +#define PEBS_DATACFG_GP                BIT_ULL(1)
>  #define PEBS_DATACFG_XMMS      BIT_ULL(2)
>  #define PEBS_DATACFG_LBRS      BIT_ULL(3)
> -#define PEBS_DATACFG_LBR_SHIFT 24
>  #define PEBS_DATACFG_CNTR      BIT_ULL(4)
> +#define PEBS_DATACFG_METRICS   BIT_ULL(5)
> +#define PEBS_DATACFG_LBR_SHIFT 24
>  #define PEBS_DATACFG_CNTR_SHIFT        32
>  #define PEBS_DATACFG_CNTR_MASK GENMASK_ULL(15, 0)
>  #define PEBS_DATACFG_FIX_SHIFT 48
>  #define PEBS_DATACFG_FIX_MASK  GENMASK_ULL(7, 0)
> -#define PEBS_DATACFG_METRICS   BIT_ULL(5)
>
>  /* Steal the highest bit of pebs_data_cfg for SW usage */
>  #define PEBS_UPDATE_DS_SW      BIT_ULL(63)
> @@ -603,6 +603,13 @@ struct arch_pebs_lbr_header {
>         u64 ler_info;
>  };
>
> +struct arch_pebs_cntr_header {
> +       u32 cntr;
> +       u32 fixed;
> +       u32 metrics;
> +       u32 reserved;
> +};
> +
>  /*
>   * AMD Extended Performance Monitoring and Debug cpuid feature detection
>   */
> --
> 2.34.1
>


* Re: [Patch v9 12/12] perf/x86/intel: Add counter group support for arch-PEBS
  2026-03-09 22:59   ` [Patch v9 12/12] " Ian Rogers
@ 2026-03-10  2:06     ` Mi, Dapeng
  2026-03-10  4:36       ` Ian Rogers
  0 siblings, 1 reply; 48+ messages in thread
From: Mi, Dapeng @ 2026-03-10  2:06 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Zide Chen, Falcon Thomas, Xudong Hao


On 3/10/2026 6:59 AM, Ian Rogers wrote:
> On Wed, Oct 29, 2025 at 3:25 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>> [...]
>>
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> @@ -3094,10 +3119,8 @@ static void __init intel_ds_pebs_init(void)
>>                         break;
>>
>>                 case 6:
>> -                       if (x86_pmu.intel_cap.pebs_baseline) {
>> +                       if (x86_pmu.intel_cap.pebs_baseline)
>>                                 x86_pmu.large_pebs_flags |= PERF_SAMPLE_READ;
>> -                               x86_pmu.late_setup = intel_pmu_late_setup;
>> -                       }
> Hi Dapeng,
>
> I'm trying to understand why the late_setup initialization was changed
> here and its connection with counter group support. I couldn't find a
> mention in the commit message.

It's because arch-PEBS also supports counter group sampling, not just
legacy PEBS with PEBS format 6. Both ACR (auto counter reload) and PEBS
counter group sampling need late_setup, and both features (for legacy
PEBS as well as arch-PEBS) were introduced with Perfmon v6, so the
late_setup initialization is moved to the common Perfmon v6
initialization path. Thanks.


>
> Thanks,
> Ian
>
>>                         fallthrough;
>>                 case 5:
>>                         x86_pmu.pebs_ept = 1;
>> [...]


* Re: [Patch v9 12/12] perf/x86/intel: Add counter group support for arch-PEBS
  2026-03-10  2:06     ` Mi, Dapeng
@ 2026-03-10  4:36       ` Ian Rogers
  0 siblings, 0 replies; 48+ messages in thread
From: Ian Rogers @ 2026-03-10  4:36 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Zide Chen, Falcon Thomas, Xudong Hao

On Mon, Mar 9, 2026 at 7:06 PM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>
>
> On 3/10/2026 6:59 AM, Ian Rogers wrote:
> > On Wed, Oct 29, 2025 at 3:25 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
> >> [...]
> >>
> >> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> >> @@ -3094,10 +3119,8 @@ static void __init intel_ds_pebs_init(void)
> >>                         break;
> >>
> >>                 case 6:
> >> -                       if (x86_pmu.intel_cap.pebs_baseline) {
> >> +                       if (x86_pmu.intel_cap.pebs_baseline)
> >>                                 x86_pmu.large_pebs_flags |= PERF_SAMPLE_READ;
> >> -                               x86_pmu.late_setup = intel_pmu_late_setup;
> >> -                       }
> > Hi Dapeng,
> >
> > I'm trying to understand why the late_setup initialization was changed
> > here and its connection with counter group support. I couldn't find a
> > mention in the commit message.
>
> It's because arch-PEBS also supports counter group sampling, not just
> legacy PEBS with PEBS format 6. Both ACR (auto counter reload) and PEBS
> counter group sampling need late_setup, and both features (for legacy
> PEBS as well as arch-PEBS) were introduced with Perfmon v6, so the
> late_setup initialization is moved to the common Perfmon v6
> initialization path. Thanks.

Ah, okay. I think it would have been clearer if those changes and the
counter group changes were kept in separate patches. Anyway, this code
has landed and nothing broke with this change.

Thanks,
Ian

>
> [...]


end of thread, other threads:[~2026-03-10  4:36 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-29 10:21 [Patch v9 00/12] arch-PEBS enabling for Intel platforms Dapeng Mi
2025-10-29 10:21 ` [Patch v9 01/12] perf/x86: Remove redundant is_x86_event() prototype Dapeng Mi
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-10-29 10:21 ` [Patch v9 02/12] perf/x86: Fix NULL event access and potential PEBS record loss Dapeng Mi
2025-11-06 14:19   ` Peter Zijlstra
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-10-29 10:21 ` [Patch v9 03/12] perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call Dapeng Mi
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-10-29 10:21 ` [Patch v9 04/12] perf/x86/intel: Correct large PEBS flag check Dapeng Mi
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-10-29 10:21 ` [Patch v9 05/12] perf/x86/intel: Initialize architectural PEBS Dapeng Mi
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2026-03-05  0:50   ` [Patch v9 05/12] " Ian Rogers
2026-03-06  1:38     ` Mi, Dapeng
2025-10-29 10:21 ` [Patch v9 06/12] perf/x86/intel/ds: Factor out PEBS record processing code to functions Dapeng Mi
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-10-29 10:21 ` [Patch v9 07/12] perf/x86/intel/ds: Factor out PEBS group " Dapeng Mi
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-10-29 10:21 ` [Patch v9 08/12] perf/x86/intel: Process arch-PEBS records or record fragments Dapeng Mi
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2026-03-03  0:20   ` [Patch v9 08/12] " Chun-Tse Shao
2026-03-06  1:20     ` Mi, Dapeng
2025-10-29 10:21 ` [Patch v9 09/12] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR Dapeng Mi
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-11-12 10:03   ` [tip: perf/core] perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use tip-bot2 for Ingo Molnar
2025-11-12 11:18   ` tip-bot2 for Ingo Molnar
2025-10-29 10:21 ` [Patch v9 10/12] perf/x86/intel: Update dyn_constranit base on PEBS event precise level Dapeng Mi
2025-11-06 14:52   ` Peter Zijlstra
2025-11-07  6:11     ` Mi, Dapeng
2025-11-07  8:28       ` Peter Zijlstra
2025-11-07  8:36         ` Mi, Dapeng
2025-11-07 13:05       ` Peter Zijlstra
2025-11-10  0:23         ` Mi, Dapeng
2025-11-10  9:03           ` Peter Zijlstra
2025-11-10  9:15             ` Mi, Dapeng
2025-11-11  5:41               ` Mi, Dapeng
2025-11-11 11:37                 ` Peter Zijlstra
2025-11-12  0:16                   ` Mi, Dapeng
2025-11-11 11:37   ` [tip: perf/core] perf/x86/intel: Update dyn_constraint " tip-bot2 for Dapeng Mi
2025-10-29 10:21 ` [Patch v9 11/12] perf/x86/intel: Setup PEBS data configuration and enable legacy groups Dapeng Mi
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2026-03-05  1:20   ` [Patch v9 11/12] " Ian Rogers
2026-03-06  2:17     ` Mi, Dapeng
2025-10-29 10:21 ` [Patch v9 12/12] perf/x86/intel: Add counter group support for arch-PEBS Dapeng Mi
2025-11-11 11:37   ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2026-03-09 22:59   ` [Patch v9 12/12] " Ian Rogers
2026-03-10  2:06     ` Mi, Dapeng
2026-03-10  4:36       ` Ian Rogers
