* [Patch v3 0/7] x86 perf bug fixes and optimization
@ 2025-08-20 2:30 Dapeng Mi
2025-08-20 2:30 ` [Patch v3 1/7] perf/x86/intel: Use early_initcall() to hook bts_init() Dapeng Mi
` (7 more replies)
0 siblings, 8 replies; 25+ messages in thread
From: Dapeng Mi @ 2025-08-20 2:30 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Changes:
v2 -> v3:
* Rebase to latest tip perf/core tree.
* Rewrite commit message to explain why NULL access happens and
refine code (Patch 3/7)
* Refine commit message of patch 6/7
* Dump counters bitmap instead of absolute counter in boot message
(patch 7/7)
v1 -> v2:
* Rebase to 6.17-rc1.
* No code changes.
Tests:
* Run perf stat/record commands on an Intel Sapphire Rapids platform; no
issues were found.
History:
v2: https://lore.kernel.org/all/20250811090034.51249-1-dapeng1.mi@linux.intel.com/
v1:
* Patch 1/6: https://lore.kernel.org/all/20250606111606.84350-1-dapeng1.mi@linux.intel.com/
* Patch 2/6: https://lore.kernel.org/all/20250529080236.2552247-1-dapeng1.mi@linux.intel.com/
* Patch 3/6: https://lore.kernel.org/all/20250718062602.21444-1-dapeng1.mi@linux.intel.com/
* Patches 4-6/6: https://lore.kernel.org/all/20250717090302.11316-1-dapeng1.mi@linux.intel.com/
Dapeng Mi (7):
perf/x86/intel: Use early_initcall() to hook bts_init()
perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error
perf/x86: Check if cpuc->events[*] pointer exists before accessing it
perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag
perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to
BIT_ULL(48)
perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into
INTEL_FIXED_BITS_MASK
perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap()
arch/x86/events/core.c | 16 +++++++++-------
arch/x86/events/intel/bts.c | 2 +-
arch/x86/events/intel/core.c | 21 +++++++++------------
arch/x86/events/intel/ds.c | 10 ++++++++++
arch/x86/include/asm/msr-index.h | 14 ++++++++------
arch/x86/include/asm/perf_event.h | 8 ++++++--
arch/x86/kvm/pmu.h | 2 +-
tools/arch/x86/include/asm/msr-index.h | 14 ++++++++------
8 files changed, 52 insertions(+), 35 deletions(-)
base-commit: 448f97fba9013ffa13f5dd82febd18836b189499
--
2.34.1
* [Patch v3 1/7] perf/x86/intel: Use early_initcall() to hook bts_init()
2025-08-20 2:30 [Patch v3 0/7] x86 perf bug fixes and optimization Dapeng Mi
@ 2025-08-20 2:30 ` Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 2:30 ` [Patch v3 2/7] perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error Dapeng Mi
` (6 subsequent siblings)
7 siblings, 1 reply; 25+ messages in thread
From: Dapeng Mi @ 2025-08-20 2:30 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
Since commit d971342d38bf ("perf/x86/intel: Decouple BTS initialization
from PEBS initialization") was introduced, x86_pmu.bts is initialized in
bts_init(), which is hooked by arch_initcall(), whereas
init_hw_perf_events() is hooked by early_initcall(). Once the core PMU is
initialized, the NMI watchdog initialization is called immediately, before
bts_init() has run. As a result the BTS buffer is not really initialized,
since bts_init() has not been called yet and x86_pmu.bts is still false at
that point. Worse, the BTS buffer would never be initialized afterwards
unless all core PMU events are freed and reserve_ds_buffers() is called
again.

Thus, aligning with init_hw_perf_events(), hook bts_init() with
early_initcall() to ensure x86_pmu.bts is initialized before the NMI
watchdog initialization.
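For reference, a simplified view of the initcall ordering involved (based
on init/main.c, shown only to make the ordering explicit):

    kernel_init_freeable()
        do_pre_smp_initcalls()   /* early_initcall() callbacks, incl. init_hw_perf_events() */
        ...
        do_basic_setup()
            do_initcalls()       /* numbered levels: core, postcore, arch (the old bts_init() slot), ... */

With this change, bts_init() runs in the early_initcall() phase as well, so
x86_pmu.bts is already set by the time the NMI watchdog initialization runs.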
Fixes: d971342d38bf ("perf/x86/intel: Decouple BTS initialization from PEBS initialization")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/bts.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 61da6b8a3d51..cbac54cb3a9e 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -643,4 +643,4 @@ static __init int bts_init(void)
return perf_pmu_register(&bts_pmu, "intel_bts", -1);
}
-arch_initcall(bts_init);
+early_initcall(bts_init);
--
2.34.1
* [Patch v3 2/7] perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error
2025-08-20 2:30 [Patch v3 0/7] x86 perf bug fixes and optimization Dapeng Mi
2025-08-20 2:30 ` [Patch v3 1/7] perf/x86/intel: Use early_initcall() to hook bts_init() Dapeng Mi
@ 2025-08-20 2:30 ` Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 2:30 ` [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it Dapeng Mi
` (5 subsequent siblings)
7 siblings, 1 reply; 25+ messages in thread
From: Dapeng Mi @ 2025-08-20 2:30 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
When running perf_fuzzer on PTL, sometimes the below "unchecked MSR
access error" is seen when accessing IA32_PMC_x_CFG_B MSRs.
[ 55.611268] unchecked MSR access error: WRMSR to 0x1986 (tried to write 0x0000000200000001) at rIP: 0xffffffffac564b28 (native_write_msr+0x8/0x30)
[ 55.611280] Call Trace:
[ 55.611282] <TASK>
[ 55.611284] ? intel_pmu_config_acr+0x87/0x160
[ 55.611289] intel_pmu_enable_acr+0x6d/0x80
[ 55.611291] intel_pmu_enable_event+0xce/0x460
[ 55.611293] x86_pmu_start+0x78/0xb0
[ 55.611297] x86_pmu_enable+0x218/0x3a0
[ 55.611300] ? x86_pmu_enable+0x121/0x3a0
[ 55.611302] perf_pmu_enable+0x40/0x50
[ 55.611307] ctx_resched+0x19d/0x220
[ 55.611309] __perf_install_in_context+0x284/0x2f0
[ 55.611311] ? __pfx_remote_function+0x10/0x10
[ 55.611314] remote_function+0x52/0x70
[ 55.611317] ? __pfx_remote_function+0x10/0x10
[ 55.611319] generic_exec_single+0x84/0x150
[ 55.611323] smp_call_function_single+0xc5/0x1a0
[ 55.611326] ? __pfx_remote_function+0x10/0x10
[ 55.611329] perf_install_in_context+0xd1/0x1e0
[ 55.611331] ? __pfx___perf_install_in_context+0x10/0x10
[ 55.611333] __do_sys_perf_event_open+0xa76/0x1040
[ 55.611336] __x64_sys_perf_event_open+0x26/0x30
[ 55.611337] x64_sys_call+0x1d8e/0x20c0
[ 55.611339] do_syscall_64+0x4f/0x120
[ 55.611343] entry_SYSCALL_64_after_hwframe+0x76/0x7e
On PTL, GP counters 0 and 1 don't support the auto counter reload feature,
so writing 1 to bit 0 of the CFG_B MSR, which requests auto counter reload
on GP counter 0, triggers a #GP.
The root cause is that the check of the auto counter reload (ACR) counter
mask supplied from user space is incorrect in the intel_pmu_acr_late_setup()
helper. As a result, an invalid ACR counter mask from user space can be set
into hw.config1, written into the CFG_B MSRs, and trigger the MSR access
warning.

E.g., a user may create a single perf event with an ACR counter mask
(config2=0xcb), so "cpuc->n_events" is 1.

The correct check condition should be "i + idx >= cpuc->n_events" instead
of "i + idx > cpuc->n_events" (it looks like a typo). Otherwise the bit
loop walks one entry too far, the stale "cpuc->assign[1]" value (bit 0) is
set into hw.config1, and the MSR access error follows.
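A minimal walk-through of the example above (hypothetical values, for
illustration only; the indices follow the loop in intel_pmu_acr_late_setup()):

    config2 = 0xcb -> set bits {0, 1, 3, 6, 7},  i = 0,  cpuc->n_events = 1

    idx = 0: old check "0 + 0 > 1" is false -> cpuc->assign[0] set into hw.config1 (valid)
    idx = 1: old check "0 + 1 > 1" is false -> cpuc->assign[1] set into hw.config1 (stale slot)
    idx = 3: old check "0 + 3 > 1" is true  -> return, but the damage is already done

With ">=", the loop already returns at idx = 1, before the unassigned slot
is touched.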
Besides, also check whether the events referenced by the ACR counter mask
are themselves ACR events, and filter out the mask bits whose events are
not. If an event is not an ACR event, it could be scheduled on a HW counter
which doesn't support ACR, and it's invalid to add its counter index to the
ACR counter mask.

Furthermore, remove the WARN_ON_ONCE(), since it is easily triggered (user
space can pass an arbitrary, invalid ACR counter mask) and the warning
message could mislead users.
Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
---
arch/x86/events/intel/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c2fb729c270e..15da60cf69f2 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2997,7 +2997,8 @@ static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
if (event->group_leader != leader->group_leader)
break;
for_each_set_bit(idx, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
- if (WARN_ON_ONCE(i + idx > cpuc->n_events))
+ if (i + idx >= cpuc->n_events ||
+ !is_acr_event_group(cpuc->event_list[i + idx]))
return;
__set_bit(cpuc->assign[i + idx], (unsigned long *)&event->hw.config1);
}
--
2.34.1
* [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it
2025-08-20 2:30 [Patch v3 0/7] x86 perf bug fixes and optimization Dapeng Mi
2025-08-20 2:30 ` [Patch v3 1/7] perf/x86/intel: Use early_initcall() to hook bts_init() Dapeng Mi
2025-08-20 2:30 ` [Patch v3 2/7] perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error Dapeng Mi
@ 2025-08-20 2:30 ` Dapeng Mi
2025-08-20 3:41 ` Andi Kleen
2025-08-21 13:35 ` Peter Zijlstra
2025-08-20 2:30 ` [Patch v3 4/7] perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag Dapeng Mi
` (4 subsequent siblings)
7 siblings, 2 replies; 25+ messages in thread
From: Dapeng Mi @ 2025-08-20 2:30 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi,
kernel test robot
When intel_pmu_drain_pebs_icl() is called to drain PEBS records,
perf_event_overflow() may be called to process the last PEBS record.
perf_event_overflow() in turn can trigger the interrupt throttle and stop
all events of the group, as the call-chain below shows.
perf_event_overflow()
-> __perf_event_overflow()
->__perf_event_account_interrupt()
-> perf_event_throttle_group()
-> perf_event_throttle()
-> event->pmu->stop()
-> x86_pmu_stop()
The side effect of stopping the events is that all the corresponding event
pointers in the cpuc->events[] array are cleared to NULL.

Assume there are two PEBS events (event a and event b) in a group. When
intel_pmu_drain_pebs_icl() calls perf_event_overflow() to process the last
PEBS record of event a, the interrupt throttle is triggered and the
pointers of both event a and event b are cleared to NULL. Then
intel_pmu_drain_pebs_icl() tries to process the last PEBS record of event b
and hits a NULL pointer access.

Since the remaining PEBS records have already been processed when the event
was stopped, check cpuc->events[*] and skip processing the last PEBS record
if it is NULL.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202507042103.a15d2923-lkp@intel.com
Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: kernel test robot <oliver.sang@intel.com>
---
arch/x86/events/intel/ds.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c0b7ac1c7594..dcf29c099ad2 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2663,6 +2663,16 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
continue;
event = cpuc->events[bit];
+ /*
+ * perf_event_overflow() called by __intel_pmu_pebs_last_event() below
+ * could trigger the interrupt throttle and clear all event pointers of
+ * the group in cpuc->events[] to NULL. So re-check whether cpuc->events[*]
+ * is NULL; if so, the event has been throttled (stopped) and its last
+ * PEBS records have already been processed while stopping the event, so
+ * there is no need to process them again.
+ */
+ if (!event)
+ continue;
__intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
counts[bit], setup_pebs_adaptive_sample_data);
--
2.34.1
* [Patch v3 4/7] perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag
2025-08-20 2:30 [Patch v3 0/7] x86 perf bug fixes and optimization Dapeng Mi
` (2 preceding siblings ...)
2025-08-20 2:30 ` [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it Dapeng Mi
@ 2025-08-20 2:30 ` Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 2:30 ` [Patch v3 5/7] perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) Dapeng Mi
` (3 subsequent siblings)
7 siblings, 1 reply; 25+ messages in thread
From: Dapeng Mi @ 2025-08-20 2:30 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi, Yi Lai
IA32_PERF_CAPABILITIES.PEBS_TIMING_INFO[bit 17] is introduced to indicate
whether timed PEBS is supported. Timed PEBS adds a new "retired latency"
field to the basic info group to carry the timing info. Detailed
information about timed PEBS can be found in section 8.4.1 "Timed Processor
Event Based Sampling" of "Intel Architecture Instruction Set Extensions
and Future Features".

Add the PERF_CAP_PEBS_TIMING_INFO flag so that the KVM module can leverage
it to expose the timed PEBS feature to guests.
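As a rough illustration only (not part of this patch), the new bit can be
probed the same way as the existing PEBS capability bits, e.g.:

    u64 caps;

    rdmsrq(MSR_IA32_PERF_CAPABILITIES, caps);
    if (caps & PERF_CAP_PEBS_TIMING_INFO)
        pr_info("timed PEBS (retired latency) is supported\n");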
Moreover, opportunistically refine the indentation so that the macros share
consistent indents.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
arch/x86/include/asm/msr-index.h | 14 ++++++++------
tools/arch/x86/include/asm/msr-index.h | 14 ++++++++------
2 files changed, 16 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b65c3ba5fa14..f627196eb796 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -315,12 +315,14 @@
#define PERF_CAP_PT_IDX 16
#define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6
-#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
-#define PERF_CAP_ARCH_REG BIT_ULL(7)
-#define PERF_CAP_PEBS_FORMAT 0xf00
-#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
-#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
- PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
+#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
+#define PERF_CAP_ARCH_REG BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT 0xf00
+#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
+#define PERF_CAP_PEBS_TIMING_INFO BIT_ULL(17)
+#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
+ PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
+ PERF_CAP_PEBS_TIMING_INFO)
#define MSR_IA32_RTIT_CTL 0x00000570
#define RTIT_CTL_TRACEEN BIT(0)
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 5cfb5d74dd5f..daebfd926f08 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -315,12 +315,14 @@
#define PERF_CAP_PT_IDX 16
#define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6
-#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
-#define PERF_CAP_ARCH_REG BIT_ULL(7)
-#define PERF_CAP_PEBS_FORMAT 0xf00
-#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
-#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
- PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
+#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
+#define PERF_CAP_ARCH_REG BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT 0xf00
+#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
+#define PERF_CAP_PEBS_TIMING_INFO BIT_ULL(17)
+#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
+ PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
+ PERF_CAP_PEBS_TIMING_INFO)
#define MSR_IA32_RTIT_CTL 0x00000570
#define RTIT_CTL_TRACEEN BIT(0)
--
2.34.1
* [Patch v3 5/7] perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48)
2025-08-20 2:30 [Patch v3 0/7] x86 perf bug fixes and optimization Dapeng Mi
` (3 preceding siblings ...)
2025-08-20 2:30 ` [Patch v3 4/7] perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag Dapeng Mi
@ 2025-08-20 2:30 ` Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 2:30 ` [Patch v3 6/7] perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK Dapeng Mi
` (2 subsequent siblings)
7 siblings, 1 reply; 25+ messages in thread
From: Dapeng Mi @ 2025-08-20 2:30 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi, Yi Lai
The macro GLOBAL_CTRL_EN_PERF_METRICS is defined as 48 instead of
BIT_ULL(48), which is inconsistent with the other similar macros. This
makes the macro easy to misuse, since users assume it is a bit mask just
like the others.

Thus change GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) and eliminate this
potential misuse.
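To illustrate the hazard (hypothetical snippet, not taken from the tree):

    /* old: #define GLOBAL_CTRL_EN_PERF_METRICS 48 */
    intel_ctrl |= GLOBAL_CTRL_EN_PERF_METRICS;          /* wrong: 48 == 0x30, sets bits 4 and 5 */
    intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;  /* correct, but the shift is easy to forget */

    /* new: #define GLOBAL_CTRL_EN_PERF_METRICS BIT_ULL(48) */
    intel_ctrl |= GLOBAL_CTRL_EN_PERF_METRICS;          /* sets bit 48 as intended */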
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
arch/x86/events/intel/core.c | 8 ++++----
arch/x86/include/asm/perf_event.h | 2 +-
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 15da60cf69f2..f88a99d8d125 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5319,9 +5319,9 @@ static void intel_pmu_check_hybrid_pmus(struct x86_hybrid_pmu *pmu)
0, x86_pmu_num_counters(&pmu->pmu), 0, 0);
if (pmu->intel_cap.perf_metrics)
- pmu->intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;
+ pmu->intel_ctrl |= GLOBAL_CTRL_EN_PERF_METRICS;
else
- pmu->intel_ctrl &= ~(1ULL << GLOBAL_CTRL_EN_PERF_METRICS);
+ pmu->intel_ctrl &= ~GLOBAL_CTRL_EN_PERF_METRICS;
intel_pmu_check_event_constraints(pmu->event_constraints,
pmu->cntr_mask64,
@@ -5456,7 +5456,7 @@ static void intel_pmu_cpu_starting(int cpu)
rdmsrq(MSR_IA32_PERF_CAPABILITIES, perf_cap.capabilities);
if (!perf_cap.perf_metrics) {
x86_pmu.intel_cap.perf_metrics = 0;
- x86_pmu.intel_ctrl &= ~(1ULL << GLOBAL_CTRL_EN_PERF_METRICS);
+ x86_pmu.intel_ctrl &= ~GLOBAL_CTRL_EN_PERF_METRICS;
}
}
@@ -7790,7 +7790,7 @@ __init int intel_pmu_init(void)
}
if (!is_hybrid() && x86_pmu.intel_cap.perf_metrics)
- x86_pmu.intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;
+ x86_pmu.intel_ctrl |= GLOBAL_CTRL_EN_PERF_METRICS;
if (x86_pmu.intel_cap.pebs_timing_info)
x86_pmu.flags |= PMU_FL_RETIRE_LATENCY;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 70d1d94aca7e..f8247ac276c4 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -430,7 +430,7 @@ static inline bool is_topdown_idx(int idx)
#define GLOBAL_STATUS_TRACE_TOPAPMI BIT_ULL(GLOBAL_STATUS_TRACE_TOPAPMI_BIT)
#define GLOBAL_STATUS_PERF_METRICS_OVF_BIT 48
-#define GLOBAL_CTRL_EN_PERF_METRICS 48
+#define GLOBAL_CTRL_EN_PERF_METRICS BIT_ULL(48)
/*
* We model guest LBR event tracing as another fixed-mode PMC like BTS.
*
--
2.34.1
* [Patch v3 6/7] perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK
2025-08-20 2:30 [Patch v3 0/7] x86 perf bug fixes and optimization Dapeng Mi
` (4 preceding siblings ...)
2025-08-20 2:30 ` [Patch v3 5/7] perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) Dapeng Mi
@ 2025-08-20 2:30 ` Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 2:30 ` [Patch v3 7/7] perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap() Dapeng Mi
2025-08-20 15:55 ` [Patch v3 0/7] x86 perf bug fixes and optimization Liang, Kan
7 siblings, 1 reply; 25+ messages in thread
From: Dapeng Mi @ 2025-08-20 2:30 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi, Yi Lai
The ICL_FIXED_0_ADAPTIVE bit is missing from INTEL_FIXED_BITS_MASK; add it.

With the help of this new INTEL_FIXED_BITS_MASK, intel_pmu_enable_fixed()
can be optimized: the old fixed counter control bits can be unconditionally
cleared with INTEL_FIXED_BITS_MASK and the new control bits then set based
on the new configuration.
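For reference, a sketch of what the redefined mask expands to (bit
positions taken from the existing macro definitions in perf_event.h; shown
for illustration only):

    INTEL_FIXED_BITS_MASK = INTEL_FIXED_0_KERNEL     /* bit 0  */
                          | INTEL_FIXED_0_USER       /* bit 1  */
                          | INTEL_FIXED_0_ANYTHREAD  /* bit 2  */
                          | INTEL_FIXED_0_ENABLE_PMI /* bit 3  */
                          | ICL_FIXED_0_ADAPTIVE     /* bit 32 */
                          = 0x10000000f              /* vs. the old 0xf, which lost the adaptive bit */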
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
arch/x86/events/intel/core.c | 10 +++-------
arch/x86/include/asm/perf_event.h | 6 +++++-
arch/x86/kvm/pmu.h | 2 +-
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index f88a99d8d125..28f5468a6ea3 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2845,8 +2845,8 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
- u64 mask, bits = 0;
int idx = hwc->idx;
+ u64 bits = 0;
if (is_topdown_idx(idx)) {
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -2885,14 +2885,10 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
idx -= INTEL_PMC_IDX_FIXED;
bits = intel_fixed_bits_by_idx(idx, bits);
- mask = intel_fixed_bits_by_idx(idx, INTEL_FIXED_BITS_MASK);
-
- if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip) {
+ if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip)
bits |= intel_fixed_bits_by_idx(idx, ICL_FIXED_0_ADAPTIVE);
- mask |= intel_fixed_bits_by_idx(idx, ICL_FIXED_0_ADAPTIVE);
- }
- cpuc->fixed_ctrl_val &= ~mask;
+ cpuc->fixed_ctrl_val &= ~intel_fixed_bits_by_idx(idx, INTEL_FIXED_BITS_MASK);
cpuc->fixed_ctrl_val |= bits;
}
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index f8247ac276c4..49a4d442f3fc 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -35,7 +35,6 @@
#define ARCH_PERFMON_EVENTSEL_EQ (1ULL << 36)
#define ARCH_PERFMON_EVENTSEL_UMASK2 (0xFFULL << 40)
-#define INTEL_FIXED_BITS_MASK 0xFULL
#define INTEL_FIXED_BITS_STRIDE 4
#define INTEL_FIXED_0_KERNEL (1ULL << 0)
#define INTEL_FIXED_0_USER (1ULL << 1)
@@ -48,6 +47,11 @@
#define ICL_EVENTSEL_ADAPTIVE (1ULL << 34)
#define ICL_FIXED_0_ADAPTIVE (1ULL << 32)
+#define INTEL_FIXED_BITS_MASK \
+ (INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER | \
+ INTEL_FIXED_0_ANYTHREAD | INTEL_FIXED_0_ENABLE_PMI | \
+ ICL_FIXED_0_ADAPTIVE)
+
#define intel_fixed_bits_by_idx(_idx, _bits) \
((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE))
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index ad89d0bd6005..103604c4b33b 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -13,7 +13,7 @@
#define MSR_IA32_MISC_ENABLE_PMU_RO_MASK (MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL | \
MSR_IA32_MISC_ENABLE_BTS_UNAVAIL)
-/* retrieve the 4 bits for EN and PMI out of IA32_FIXED_CTR_CTRL */
+/* retrieve a fixed counter's bits out of IA32_FIXED_CTR_CTRL */
#define fixed_ctrl_field(ctrl_reg, idx) \
(((ctrl_reg) >> ((idx) * INTEL_FIXED_BITS_STRIDE)) & INTEL_FIXED_BITS_MASK)
--
2.34.1
* [Patch v3 7/7] perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap()
2025-08-20 2:30 [Patch v3 0/7] x86 perf bug fixes and optimization Dapeng Mi
` (5 preceding siblings ...)
2025-08-20 2:30 ` [Patch v3 6/7] perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK Dapeng Mi
@ 2025-08-20 2:30 ` Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 15:55 ` [Patch v3 0/7] x86 perf bug fixes and optimization Liang, Kan
7 siblings, 1 reply; 25+ messages in thread
From: Dapeng Mi @ 2025-08-20 2:30 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Dapeng Mi
With the introduction of Perfmon v6, PMU counters can be discontiguous. For
example, on CWF only fixed counters 0-3 and 5-7 are supported; there is no
fixed counter 4. To accommodate this change, the archPerfmonExt CPUID
(0x23) leaves were introduced to enumerate the true view of the counter
bitmaps.

The current perf code already supports the archPerfmonExt CPUID and uses
the counter bitmaps to enumerate the counters the HW really supports, but
x86_pmu_show_pmu_cap() still dumps only the absolute counter numbers
instead of the true-view bitmaps, which is outdated and may mislead
readers.

So dump the counters' true-view bitmaps in x86_pmu_show_pmu_cap() and
opportunistically adjust the dump order and wording.
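For example (hypothetical values, derived only from the pr_info() format
strings below): a PMU with fixed counters 0-3 and 5-7 would now report

    ... fixed-purpose counters:   7
    ... fixed-purpose bitmap:     00000000000000ef

instead of only the bare counter number.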
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/core.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 7610f26dfbd9..745caa6c15a3 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2069,13 +2069,15 @@ static void _x86_pmu_read(struct perf_event *event)
void x86_pmu_show_pmu_cap(struct pmu *pmu)
{
- pr_info("... version: %d\n", x86_pmu.version);
- pr_info("... bit width: %d\n", x86_pmu.cntval_bits);
- pr_info("... generic registers: %d\n", x86_pmu_num_counters(pmu));
- pr_info("... value mask: %016Lx\n", x86_pmu.cntval_mask);
- pr_info("... max period: %016Lx\n", x86_pmu.max_period);
- pr_info("... fixed-purpose events: %d\n", x86_pmu_num_counters_fixed(pmu));
- pr_info("... event mask: %016Lx\n", hybrid(pmu, intel_ctrl));
+ pr_info("... version: %d\n", x86_pmu.version);
+ pr_info("... bit width: %d\n", x86_pmu.cntval_bits);
+ pr_info("... generic counters: %d\n", x86_pmu_num_counters(pmu));
+ pr_info("... generic bitmap: %016llx\n", hybrid(pmu, cntr_mask64));
+ pr_info("... fixed-purpose counters: %d\n", x86_pmu_num_counters_fixed(pmu));
+ pr_info("... fixed-purpose bitmap: %016llx\n", hybrid(pmu, fixed_cntr_mask64));
+ pr_info("... value mask: %016llx\n", x86_pmu.cntval_mask);
+ pr_info("... max period: %016llx\n", x86_pmu.max_period);
+ pr_info("... global_ctrl mask: %016llx\n", hybrid(pmu, intel_ctrl));
}
static int __init init_hw_perf_events(void)
--
2.34.1
* Re: [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it
2025-08-20 2:30 ` [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it Dapeng Mi
@ 2025-08-20 3:41 ` Andi Kleen
2025-08-20 5:33 ` Mi, Dapeng
2025-08-21 13:35 ` Peter Zijlstra
1 sibling, 1 reply; 25+ messages in thread
From: Andi Kleen @ 2025-08-20 3:41 UTC (permalink / raw)
To: Dapeng Mi
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Eranian Stephane, linux-kernel, linux-perf-users,
Dapeng Mi, kernel test robot
> event = cpuc->events[bit];
> + /*
> + * perf_event_overflow() called by below __intel_pmu_pebs_last_event()
> + * could trigger interrupt throttle and clear all event pointers of the
> + * group in cpuc->events[] to NULL. So need to re-check if cpuc->events[*]
> + * is NULL, if so it indicates the event has been throttled (stopped) and
> + * the corresponding last PEBS records have been processed in stopping
> + * event, don't need to process it again.
> + */
> + if (!event)
> + continue;
Then we silently ignore the overflow. Would be better to log at least an overflow
packet or something like that.
-Andi
* Re: [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it
2025-08-20 3:41 ` Andi Kleen
@ 2025-08-20 5:33 ` Mi, Dapeng
2025-08-20 5:44 ` Andi Kleen
0 siblings, 1 reply; 25+ messages in thread
From: Mi, Dapeng @ 2025-08-20 5:33 UTC (permalink / raw)
To: Andi Kleen
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Eranian Stephane, linux-kernel, linux-perf-users,
Dapeng Mi, kernel test robot
On 8/20/2025 11:41 AM, Andi Kleen wrote:
>> event = cpuc->events[bit];
>> + /*
>> + * perf_event_overflow() called by below __intel_pmu_pebs_last_event()
>> + * could trigger interrupt throttle and clear all event pointers of the
>> + * group in cpuc->events[] to NULL. So need to re-check if cpuc->events[*]
>> + * is NULL, if so it indicates the event has been throttled (stopped) and
>> + * the corresponding last PEBS records have been processed in stopping
>> + * event, don't need to process it again.
>> + */
>> + if (!event)
>> + continue;
> Then we silently ignore the overflow. Would be better to log at least an overflow
> packet or something like that.
Andi, I didn't fully get the exact meaning of "log" here. When the throttle
is triggered, perf_event_throttle() has already called perf_log_throttle()
to log the throttle event, although only for the group leader. Is that
enough?
>
> -Andi
>
* Re: [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it
2025-08-20 5:33 ` Mi, Dapeng
@ 2025-08-20 5:44 ` Andi Kleen
2025-08-20 5:54 ` Mi, Dapeng
0 siblings, 1 reply; 25+ messages in thread
From: Andi Kleen @ 2025-08-20 5:44 UTC (permalink / raw)
To: Mi, Dapeng
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Eranian Stephane, linux-kernel, linux-perf-users,
Dapeng Mi, kernel test robot
> Andi, I didn't fully get the exact meaning about the "log" here. When
> throttle is triggered, perf_event_throttle() has already called
> perf_log_throttle() to log the throttle event although only for the group
> leader. Is it enough?
Throttle normally doesn't involve data loss, just fewer samples. But this
is data loss, so it's an overflow.
-Andi
* Re: [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it
2025-08-20 5:44 ` Andi Kleen
@ 2025-08-20 5:54 ` Mi, Dapeng
2025-08-21 1:51 ` Andi Kleen
0 siblings, 1 reply; 25+ messages in thread
From: Mi, Dapeng @ 2025-08-20 5:54 UTC (permalink / raw)
To: Andi Kleen
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Eranian Stephane, linux-kernel, linux-perf-users,
Dapeng Mi, kernel test robot
On 8/20/2025 1:44 PM, Andi Kleen wrote:
>> Andi, I didn't fully get the exact meaning about the "log" here. When
>> throttle is triggered, perf_event_throttle() has already called
>> perf_log_throttle() to log the throttle event although only for the group
>> leader. Is it enough?
> Throttle normally doesn't involve data loss, just less samples. But this
> is data loss, so it's an overflow.
IIUC, there should be no data loss; the unprocessed PEBS records of these
throttled events would still be processed eventually by calling
intel_pmu_drain_pebs_buffer() when the event is stopped.
>
> -Andi
>
* Re: [Patch v3 0/7] x86 perf bug fixes and optimization
2025-08-20 2:30 [Patch v3 0/7] x86 perf bug fixes and optimization Dapeng Mi
` (6 preceding siblings ...)
2025-08-20 2:30 ` [Patch v3 7/7] perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap() Dapeng Mi
@ 2025-08-20 15:55 ` Liang, Kan
2025-08-21 13:39 ` Peter Zijlstra
7 siblings, 1 reply; 25+ messages in thread
From: Liang, Kan @ 2025-08-20 15:55 UTC (permalink / raw)
To: Dapeng Mi, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi
On 2025-08-19 7:30 p.m., Dapeng Mi wrote:
> Changes:
> v2 -> v3:
> * Rebase to latest tip perf/core tree.
> * Rewrite commit message to explain why NULL access happens and
> refine code (Patch 3/7)
> * Refine commit message of patch 6/7
> * Dump counters bitmap instead of absolute counter in boot message
> (patch 7/7)
>
> v1 -> v2:
> * Rebase to 6.17-rc1.
> * No code changes.
>
> Tests:
> * Run perf stats/record commands on Intel Sapphire Rapids platform, no
> issue is found.
>
> History:
> v2: https://lore.kernel.org/all/20250811090034.51249-1-dapeng1.mi@linux.intel.com/
> v1:
> * Patch 1/6: https://lore.kernel.org/all/20250606111606.84350-1-dapeng1.mi@linux.intel.com/
> * Patch 2/6: https://lore.kernel.org/all/20250529080236.2552247-1-dapeng1.mi@linux.intel.com/
> * Patch 3/6: https://lore.kernel.org/all/20250718062602.21444-1-dapeng1.mi@linux.intel.com/
> * Patches 4-6/6: https://lore.kernel.org/all/20250717090302.11316-1-dapeng1.mi@linux.intel.com/
>
> Dapeng Mi (7):
> perf/x86/intel: Use early_initcall() to hook bts_init()
> perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error
> perf/x86: Check if cpuc->events[*] pointer exists before accessing it
> perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag
> perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to
> BIT_ULL(48)
> perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into
> INTEL_FIXED_BITS_MASK
> perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap()
>
The series looks good to me.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Thanks,
Kan
> arch/x86/events/core.c | 16 +++++++++-------
> arch/x86/events/intel/bts.c | 2 +-
> arch/x86/events/intel/core.c | 21 +++++++++------------
> arch/x86/events/intel/ds.c | 10 ++++++++++
> arch/x86/include/asm/msr-index.h | 14 ++++++++------
> arch/x86/include/asm/perf_event.h | 8 ++++++--
> arch/x86/kvm/pmu.h | 2 +-
> tools/arch/x86/include/asm/msr-index.h | 14 ++++++++------
> 8 files changed, 52 insertions(+), 35 deletions(-)
>
>
> base-commit: 448f97fba9013ffa13f5dd82febd18836b189499
* Re: [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it
2025-08-20 5:54 ` Mi, Dapeng
@ 2025-08-21 1:51 ` Andi Kleen
0 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2025-08-21 1:51 UTC (permalink / raw)
To: Mi, Dapeng
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang, Eranian Stephane, linux-kernel, linux-perf-users,
Dapeng Mi, kernel test robot
On Wed, Aug 20, 2025 at 01:54:17PM +0800, Mi, Dapeng wrote:
>
> On 8/20/2025 1:44 PM, Andi Kleen wrote:
> >> Andi, I didn't fully get the exact meaning about the "log" here. When
> >> throttle is triggered, perf_event_throttle() has already called
> >> perf_log_throttle() to log the throttle event although only for the group
> >> leader. Is it enough?
> > Throttle normally doesn't involve data loss, just less samples. But this
> > is data loss, so it's an overflow.
>
> IIUC, there should be no data loss, the unprocessed PEBS records of these
> throttled events would be still processed eventually by calling
> intel_pmu_drain_pebs_buffer() when stopping the event.
Makes sense. Thanks,
-Andi
* Re: [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it
2025-08-20 2:30 ` [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it Dapeng Mi
2025-08-20 3:41 ` Andi Kleen
@ 2025-08-21 13:35 ` Peter Zijlstra
2025-08-22 5:26 ` Mi, Dapeng
1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2025-08-21 13:35 UTC (permalink / raw)
To: Dapeng Mi
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
kernel test robot
On Wed, Aug 20, 2025 at 10:30:28AM +0800, Dapeng Mi wrote:
> When intel_pmu_drain_pebs_icl() is called to drain PEBS records, the
> perf_event_overflow() could be called to process the last PEBS record.
>
> While perf_event_overflow() could trigger the interrupt throttle and
> stop all events of the group, like what the below call-chain shows.
>
> perf_event_overflow()
> -> __perf_event_overflow()
> ->__perf_event_account_interrupt()
> -> perf_event_throttle_group()
> -> perf_event_throttle()
> -> event->pmu->stop()
> -> x86_pmu_stop()
>
> The side effect of stopping the events is that all corresponding event
> pointers in cpuc->events[] array are cleared to NULL.
>
> Assume there are two PEBS events (event a and event b) in a group. When
> intel_pmu_drain_pebs_icl() calls perf_event_overflow() to process the
> last PEBS record of PEBS event a, interrupt throttle is triggered and
> all pointers of event a and event b are cleared to NULL. Then
> intel_pmu_drain_pebs_icl() tries to process the last PEBS record of
> event b and encounters NULL pointer access.
>
> Since the left PEBS records have been processed when stopping the event,
> check and skip to process the last PEBS record if cpuc->events[*] is
> NULL.
>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202507042103.a15d2923-lkp@intel.com
> Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Tested-by: kernel test robot <oliver.sang@intel.com>
> ---
> arch/x86/events/intel/ds.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index c0b7ac1c7594..dcf29c099ad2 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -2663,6 +2663,16 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
> continue;
>
> event = cpuc->events[bit];
> + /*
> + * perf_event_overflow() called by below __intel_pmu_pebs_last_event()
> + * could trigger interrupt throttle and clear all event pointers of the
> + * group in cpuc->events[] to NULL. So need to re-check if cpuc->events[*]
> + * is NULL, if so it indicates the event has been throttled (stopped) and
> + * the corresponding last PEBS records have been processed in stopping
> + * event, don't need to process it again.
> + */
> + if (!event)
> + continue;
>
> __intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
> counts[bit], setup_pebs_adaptive_sample_data);
So if this is due to __intel_pmu_pebs_last_event() calling into
perf_event_overflow(), then isn't intel_pmu_drain_pebs_nhm() similarly
affected?
And worse, the _nhm() version would lose all events for that counter,
not just the last.
I'm really thinking this isn't the right thing to do.
How about we audit the entirety of arch/x86/events/ for cpuc->events[]
usage and see if we can get away with changing x86_pmu_stop() to simply
not clear that field.
Or perhaps move the setting and clearing into x86_pmu_{add,del}() rather
than x86_pmu_{start,stop}(). After all, the latter don't affect the
counter placement, they just stop/start the event.
* Re: [Patch v3 0/7] x86 perf bug fixes and optimization
2025-08-20 15:55 ` [Patch v3 0/7] x86 perf bug fixes and optimization Liang, Kan
@ 2025-08-21 13:39 ` Peter Zijlstra
2025-08-22 5:29 ` Mi, Dapeng
0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2025-08-21 13:39 UTC (permalink / raw)
To: Liang, Kan
Cc: Dapeng Mi, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
Ian Rogers, Adrian Hunter, Alexander Shishkin, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi
On Wed, Aug 20, 2025 at 08:55:24AM -0700, Liang, Kan wrote:
> > Dapeng Mi (7):
> > perf/x86/intel: Use early_initcall() to hook bts_init()
> > perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error
> > perf/x86: Check if cpuc->events[*] pointer exists before accessing it
> > perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag
> > perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to
> > BIT_ULL(48)
> > perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into
> > INTEL_FIXED_BITS_MASK
> > perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap()
> >
>
> The series looks good to me.
>
> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
I've picked up all but patch 3 -- I really don't think that does the
right thing.
* Re: [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it
2025-08-21 13:35 ` Peter Zijlstra
@ 2025-08-22 5:26 ` Mi, Dapeng
2025-08-26 3:47 ` Mi, Dapeng
0 siblings, 1 reply; 25+ messages in thread
From: Mi, Dapeng @ 2025-08-22 5:26 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
kernel test robot
On 8/21/2025 9:35 PM, Peter Zijlstra wrote:
> On Wed, Aug 20, 2025 at 10:30:28AM +0800, Dapeng Mi wrote:
>> When intel_pmu_drain_pebs_icl() is called to drain PEBS records, the
>> perf_event_overflow() could be called to process the last PEBS record.
>>
>> While perf_event_overflow() could trigger the interrupt throttle and
>> stop all events of the group, like what the below call-chain shows.
>>
>> perf_event_overflow()
>> -> __perf_event_overflow()
>> ->__perf_event_account_interrupt()
>> -> perf_event_throttle_group()
>> -> perf_event_throttle()
>> -> event->pmu->stop()
>> -> x86_pmu_stop()
>>
>> The side effect of stopping the events is that all corresponding event
>> pointers in cpuc->events[] array are cleared to NULL.
>>
>> Assume there are two PEBS events (event a and event b) in a group. When
>> intel_pmu_drain_pebs_icl() calls perf_event_overflow() to process the
>> last PEBS record of PEBS event a, interrupt throttle is triggered and
>> all pointers of event a and event b are cleared to NULL. Then
>> intel_pmu_drain_pebs_icl() tries to process the last PEBS record of
>> event b and encounters NULL pointer access.
>>
>> Since the left PEBS records have been processed when stopping the event,
>> check and skip to process the last PEBS record if cpuc->events[*] is
>> NULL.
>>
>> Reported-by: kernel test robot <oliver.sang@intel.com>
>> Closes: https://lore.kernel.org/oe-lkp/202507042103.a15d2923-lkp@intel.com
>> Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> Tested-by: kernel test robot <oliver.sang@intel.com>
>> ---
>> arch/x86/events/intel/ds.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index c0b7ac1c7594..dcf29c099ad2 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -2663,6 +2663,16 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
>> continue;
>>
>> event = cpuc->events[bit];
>> + /*
>> + * perf_event_overflow() called by below __intel_pmu_pebs_last_event()
>> + * could trigger interrupt throttle and clear all event pointers of the
>> + * group in cpuc->events[] to NULL. So need to re-check if cpuc->events[*]
>> + * is NULL, if so it indicates the event has been throttled (stopped) and
>> + * the corresponding last PEBS records have been processed in stopping
>> + * event, don't need to process it again.
>> + */
>> + if (!event)
>> + continue;
>>
>> __intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
>> counts[bit], setup_pebs_adaptive_sample_data);
>
> So if this is due to __intel_pmu_pebs_last_event() calling into
> perf_event_overflow(); then isn't intel_pmu_drain_pebs_nhm() similarly
> affected?
>
> And worse, the _nhm() version would loose all events for that counter,
> not just the last.
Hmm, yes. After double checking, I suppose I made a mistake in my answer to
Andi. There is indeed data loss, since "ds->pebs_index" is reset at the
head of the _nhm()/_icl() drain_pebs helpers instead of at the end. :(
> I'm really thinking this isn't the right thing to do.
>
>
> How about we audit the entirety of arch/x86/events/ for cpuc->events[]
> usage and see if we can get away with changing x86_pmu_stop() to simply
> not clearing that field.
Checking the current code, I suppose it's fine not to clear cpuc->events[]
in x86_pmu_stop(), since we already have another variable,
"cpuc->active_mask", which indicates whether the corresponding
cpuc->events[*] entry is active. But in the current code cpuc->active_mask
is not always checked.

So if we choose not to clear cpuc->events[] in x86_pmu_stop(), then it's a
must to check cpuc->active_mask before really accessing the event
referenced by cpuc->events[]. Maybe we can add an inline helper to check
this:
static inline bool x86_pmu_cntr_event_active(int idx)
{
	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);

	return cpuc->events[idx] && test_bit(idx, cpuc->active_mask);
}
>
> Or perhaps move the setting and clearing into x86_pmu_{add,del}() rather
> than x86_pmu_{start,stop}(). After all, the latter don't affect the
> counter placement, they just stop/start the event.
IIUC, we cannot move the setting into x86_pmu_add(), since the counter
index is not yet finalized when x86_pmu_add() is called. The counter index
could change each time a new event is added.
>
>
* Re: [Patch v3 0/7] x86 perf bug fixes and optimization
2025-08-21 13:39 ` Peter Zijlstra
@ 2025-08-22 5:29 ` Mi, Dapeng
0 siblings, 0 replies; 25+ messages in thread
From: Mi, Dapeng @ 2025-08-22 5:29 UTC (permalink / raw)
To: Peter Zijlstra, Liang, Kan
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
linux-kernel, linux-perf-users, Dapeng Mi
On 8/21/2025 9:39 PM, Peter Zijlstra wrote:
> On Wed, Aug 20, 2025 at 08:55:24AM -0700, Liang, Kan wrote:
>
>>> Dapeng Mi (7):
>>> perf/x86/intel: Use early_initcall() to hook bts_init()
>>> perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error
>>> perf/x86: Check if cpuc->events[*] pointer exists before accessing it
>>> perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag
>>> perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to
>>> BIT_ULL(48)
>>> perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into
>>> INTEL_FIXED_BITS_MASK
>>> perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap()
>>>
>> The series looks good to me.
>>
>> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> I've picked up all but patch 3 -- I really don't think that does the
> right thing.
Thanks. I will rewrite patch 3 and fold it into the arch-PEBS enabling
series.
* [tip: perf/core] perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap()
2025-08-20 2:30 ` [Patch v3 7/7] perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap() Dapeng Mi
@ 2025-08-25 10:24 ` tip-bot2 for Dapeng Mi
0 siblings, 0 replies; 25+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-08-25 10:24 UTC (permalink / raw)
To: linux-tip-commits
Cc: Dapeng Mi, Peter Zijlstra (Intel), Kan Liang, x86, linux-kernel
The following commit has been merged into the perf/core branch of tip:
Commit-ID: f49e1be19542487921e82b29004908966cb99d7c
Gitweb: https://git.kernel.org/tip/f49e1be19542487921e82b29004908966cb99d7c
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate: Wed, 20 Aug 2025 10:30:32 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 21 Aug 2025 20:09:28 +02:00
perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap()
With the introduction of Perfmon v6, PMU counters can be discontiguous. For
example, on CWF only fixed counters 0-3 and 5-7 are supported; there is no
fixed counter 4. To accommodate this change, the archPerfmonExt CPUID
(0x23) leaves were introduced to enumerate the true view of the counter
bitmaps.

The current perf code already supports the archPerfmonExt CPUID and uses
the counter bitmaps to enumerate the counters the HW really supports, but
x86_pmu_show_pmu_cap() still dumps only the absolute counter numbers
instead of the true-view bitmaps, which is outdated and may mislead
readers.

So dump the counters' true-view bitmaps in x86_pmu_show_pmu_cap() and
opportunistically adjust the dump order and wording.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-8-dapeng1.mi@linux.intel.com
---
arch/x86/events/core.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 7610f26..745caa6 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2069,13 +2069,15 @@ static void _x86_pmu_read(struct perf_event *event)
void x86_pmu_show_pmu_cap(struct pmu *pmu)
{
- pr_info("... version: %d\n", x86_pmu.version);
- pr_info("... bit width: %d\n", x86_pmu.cntval_bits);
- pr_info("... generic registers: %d\n", x86_pmu_num_counters(pmu));
- pr_info("... value mask: %016Lx\n", x86_pmu.cntval_mask);
- pr_info("... max period: %016Lx\n", x86_pmu.max_period);
- pr_info("... fixed-purpose events: %d\n", x86_pmu_num_counters_fixed(pmu));
- pr_info("... event mask: %016Lx\n", hybrid(pmu, intel_ctrl));
+ pr_info("... version: %d\n", x86_pmu.version);
+ pr_info("... bit width: %d\n", x86_pmu.cntval_bits);
+ pr_info("... generic counters: %d\n", x86_pmu_num_counters(pmu));
+ pr_info("... generic bitmap: %016llx\n", hybrid(pmu, cntr_mask64));
+ pr_info("... fixed-purpose counters: %d\n", x86_pmu_num_counters_fixed(pmu));
+ pr_info("... fixed-purpose bitmap: %016llx\n", hybrid(pmu, fixed_cntr_mask64));
+ pr_info("... value mask: %016llx\n", x86_pmu.cntval_mask);
+ pr_info("... max period: %016llx\n", x86_pmu.max_period);
+ pr_info("... global_ctrl mask: %016llx\n", hybrid(pmu, intel_ctrl));
}
static int __init init_hw_perf_events(void)
* [tip: perf/core] perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK
2025-08-20 2:30 ` [Patch v3 6/7] perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK Dapeng Mi
@ 2025-08-25 10:24 ` tip-bot2 for Dapeng Mi
0 siblings, 0 replies; 25+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-08-25 10:24 UTC (permalink / raw)
To: linux-tip-commits
Cc: Dapeng Mi, Peter Zijlstra (Intel), Kan Liang, Yi Lai, x86,
linux-kernel
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 2676dbf9f4fb7f6739d1207c0f1deaf63124642a
Gitweb: https://git.kernel.org/tip/2676dbf9f4fb7f6739d1207c0f1deaf63124642a
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate: Wed, 20 Aug 2025 10:30:31 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 21 Aug 2025 20:09:27 +02:00
perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK
The ICL_FIXED_0_ADAPTIVE bit is missing from INTEL_FIXED_BITS_MASK; add it.

With the help of this new INTEL_FIXED_BITS_MASK, intel_pmu_enable_fixed()
can be optimized: the old fixed counter control bits can be unconditionally
cleared with INTEL_FIXED_BITS_MASK and the new control bits then set based
on the new configuration.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-7-dapeng1.mi@linux.intel.com
---
arch/x86/events/intel/core.c | 10 +++-------
arch/x86/include/asm/perf_event.h | 6 +++++-
arch/x86/kvm/pmu.h | 2 +-
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index f88a99d..28f5468 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2845,8 +2845,8 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
- u64 mask, bits = 0;
int idx = hwc->idx;
+ u64 bits = 0;
if (is_topdown_idx(idx)) {
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -2885,14 +2885,10 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
idx -= INTEL_PMC_IDX_FIXED;
bits = intel_fixed_bits_by_idx(idx, bits);
- mask = intel_fixed_bits_by_idx(idx, INTEL_FIXED_BITS_MASK);
-
- if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip) {
+ if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip)
bits |= intel_fixed_bits_by_idx(idx, ICL_FIXED_0_ADAPTIVE);
- mask |= intel_fixed_bits_by_idx(idx, ICL_FIXED_0_ADAPTIVE);
- }
- cpuc->fixed_ctrl_val &= ~mask;
+ cpuc->fixed_ctrl_val &= ~intel_fixed_bits_by_idx(idx, INTEL_FIXED_BITS_MASK);
cpuc->fixed_ctrl_val |= bits;
}
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index f8247ac..49a4d44 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -35,7 +35,6 @@
#define ARCH_PERFMON_EVENTSEL_EQ (1ULL << 36)
#define ARCH_PERFMON_EVENTSEL_UMASK2 (0xFFULL << 40)
-#define INTEL_FIXED_BITS_MASK 0xFULL
#define INTEL_FIXED_BITS_STRIDE 4
#define INTEL_FIXED_0_KERNEL (1ULL << 0)
#define INTEL_FIXED_0_USER (1ULL << 1)
@@ -48,6 +47,11 @@
#define ICL_EVENTSEL_ADAPTIVE (1ULL << 34)
#define ICL_FIXED_0_ADAPTIVE (1ULL << 32)
+#define INTEL_FIXED_BITS_MASK \
+ (INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER | \
+ INTEL_FIXED_0_ANYTHREAD | INTEL_FIXED_0_ENABLE_PMI | \
+ ICL_FIXED_0_ADAPTIVE)
+
#define intel_fixed_bits_by_idx(_idx, _bits) \
((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE))
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index ad89d0b..103604c 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -13,7 +13,7 @@
#define MSR_IA32_MISC_ENABLE_PMU_RO_MASK (MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL | \
MSR_IA32_MISC_ENABLE_BTS_UNAVAIL)
-/* retrieve the 4 bits for EN and PMI out of IA32_FIXED_CTR_CTRL */
+/* retrieve a fixed counter's bits out of IA32_FIXED_CTR_CTRL */
#define fixed_ctrl_field(ctrl_reg, idx) \
(((ctrl_reg) >> ((idx) * INTEL_FIXED_BITS_STRIDE)) & INTEL_FIXED_BITS_MASK)
* [tip: perf/core] perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48)
2025-08-20 2:30 ` [Patch v3 5/7] perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) Dapeng Mi
@ 2025-08-25 10:24 ` tip-bot2 for Dapeng Mi
0 siblings, 0 replies; 25+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-08-25 10:24 UTC (permalink / raw)
To: linux-tip-commits
Cc: Dapeng Mi, Peter Zijlstra (Intel), Kan Liang, Yi Lai, x86,
linux-kernel
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 9b3e119784bc3671fde5043001a5c9a607c7d920
Gitweb: https://git.kernel.org/tip/9b3e119784bc3671fde5043001a5c9a607c7d920
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate: Wed, 20 Aug 2025 10:30:30 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 21 Aug 2025 20:09:27 +02:00
perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48)
The macro GLOBAL_CTRL_EN_PERF_METRICS is defined as 48 instead of
BIT_ULL(48), which is inconsistent with the other similar macros. This
makes the macro easy to misuse, since users assume it is a bit mask just
like the others.

Thus change GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) and eliminate this
potential misuse.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-6-dapeng1.mi@linux.intel.com
---
arch/x86/events/intel/core.c | 8 ++++----
arch/x86/include/asm/perf_event.h | 2 +-
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 15da60c..f88a99d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5319,9 +5319,9 @@ static void intel_pmu_check_hybrid_pmus(struct x86_hybrid_pmu *pmu)
0, x86_pmu_num_counters(&pmu->pmu), 0, 0);
if (pmu->intel_cap.perf_metrics)
- pmu->intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;
+ pmu->intel_ctrl |= GLOBAL_CTRL_EN_PERF_METRICS;
else
- pmu->intel_ctrl &= ~(1ULL << GLOBAL_CTRL_EN_PERF_METRICS);
+ pmu->intel_ctrl &= ~GLOBAL_CTRL_EN_PERF_METRICS;
intel_pmu_check_event_constraints(pmu->event_constraints,
pmu->cntr_mask64,
@@ -5456,7 +5456,7 @@ static void intel_pmu_cpu_starting(int cpu)
rdmsrq(MSR_IA32_PERF_CAPABILITIES, perf_cap.capabilities);
if (!perf_cap.perf_metrics) {
x86_pmu.intel_cap.perf_metrics = 0;
- x86_pmu.intel_ctrl &= ~(1ULL << GLOBAL_CTRL_EN_PERF_METRICS);
+ x86_pmu.intel_ctrl &= ~GLOBAL_CTRL_EN_PERF_METRICS;
}
}
@@ -7790,7 +7790,7 @@ __init int intel_pmu_init(void)
}
if (!is_hybrid() && x86_pmu.intel_cap.perf_metrics)
- x86_pmu.intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;
+ x86_pmu.intel_ctrl |= GLOBAL_CTRL_EN_PERF_METRICS;
if (x86_pmu.intel_cap.pebs_timing_info)
x86_pmu.flags |= PMU_FL_RETIRE_LATENCY;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 70d1d94..f8247ac 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -430,7 +430,7 @@ static inline bool is_topdown_idx(int idx)
#define GLOBAL_STATUS_TRACE_TOPAPMI BIT_ULL(GLOBAL_STATUS_TRACE_TOPAPMI_BIT)
#define GLOBAL_STATUS_PERF_METRICS_OVF_BIT 48
-#define GLOBAL_CTRL_EN_PERF_METRICS 48
+#define GLOBAL_CTRL_EN_PERF_METRICS BIT_ULL(48)
/*
* We model guest LBR event tracing as another fixed-mode PMC like BTS.
*
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [tip: perf/core] perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag
2025-08-20 2:30 ` [Patch v3 4/7] perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag Dapeng Mi
@ 2025-08-25 10:24 ` tip-bot2 for Dapeng Mi
0 siblings, 0 replies; 25+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-08-25 10:24 UTC (permalink / raw)
To: linux-tip-commits
Cc: Dapeng Mi, Peter Zijlstra (Intel), Kan Liang, Yi Lai, x86,
linux-kernel
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 0c5caea762de31a85cbcce65d978cec83449f699
Gitweb: https://git.kernel.org/tip/0c5caea762de31a85cbcce65d978cec83449f699
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate: Wed, 20 Aug 2025 10:30:29 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 21 Aug 2025 20:09:27 +02:00
perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag
IA32_PERF_CAPABILITIES.PEBS_TIMING_INFO[bit 17] is introduced to
indicate whether timed PEBS is supported. Timed PEBS adds a new "retired
latency" field to the basic info group to report the timing info. Please
find detailed information about timed PEBS in section 8.4.1 "Timed
Processor Event Based Sampling" of "Intel Architecture Instruction Set
Extensions and Future Features".
This patch adds the PERF_CAP_PEBS_TIMING_INFO flag, which the KVM module
leverages to expose the timed PEBS feature to guests.
Moreover, opportunistically refine the indentation so the macros share
consistent indents.
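For illustration, a hypothetical host-side check of the new capability bit (a sketch only, not part of the patch; it mirrors how other PERF_CAP_* bits are tested against IA32_PERF_CAPABILITIES):

	u64 caps;

	rdmsrq(MSR_IA32_PERF_CAPABILITIES, caps);
	if (caps & PERF_CAP_PEBS_TIMING_INFO) {
		/* timed PEBS is supported: the PEBS basic info group
		 * carries the "retired latency" field */
	}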
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-5-dapeng1.mi@linux.intel.com
---
arch/x86/include/asm/msr-index.h | 14 ++++++++------
tools/arch/x86/include/asm/msr-index.h | 14 ++++++++------
2 files changed, 16 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b65c3ba..f627196 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -315,12 +315,14 @@
#define PERF_CAP_PT_IDX 16
#define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6
-#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
-#define PERF_CAP_ARCH_REG BIT_ULL(7)
-#define PERF_CAP_PEBS_FORMAT 0xf00
-#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
-#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
- PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
+#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
+#define PERF_CAP_ARCH_REG BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT 0xf00
+#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
+#define PERF_CAP_PEBS_TIMING_INFO BIT_ULL(17)
+#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
+ PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
+ PERF_CAP_PEBS_TIMING_INFO)
#define MSR_IA32_RTIT_CTL 0x00000570
#define RTIT_CTL_TRACEEN BIT(0)
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 5cfb5d7..daebfd9 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -315,12 +315,14 @@
#define PERF_CAP_PT_IDX 16
#define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6
-#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
-#define PERF_CAP_ARCH_REG BIT_ULL(7)
-#define PERF_CAP_PEBS_FORMAT 0xf00
-#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
-#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
- PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
+#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
+#define PERF_CAP_ARCH_REG BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT 0xf00
+#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
+#define PERF_CAP_PEBS_TIMING_INFO BIT_ULL(17)
+#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
+ PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \
+ PERF_CAP_PEBS_TIMING_INFO)
#define MSR_IA32_RTIT_CTL 0x00000570
#define RTIT_CTL_TRACEEN BIT(0)
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [tip: perf/core] perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error
2025-08-20 2:30 ` [Patch v3 2/7] perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error Dapeng Mi
@ 2025-08-25 10:24 ` tip-bot2 for Dapeng Mi
0 siblings, 0 replies; 25+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-08-25 10:24 UTC (permalink / raw)
To: linux-tip-commits
Cc: Dapeng Mi, Peter Zijlstra (Intel), Kan Liang, x86, linux-kernel
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 43796f30507802d93ead2dc44fc9637f34671a89
Gitweb: https://git.kernel.org/tip/43796f30507802d93ead2dc44fc9637f34671a89
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate: Wed, 20 Aug 2025 10:30:27 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 21 Aug 2025 20:09:27 +02:00
perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error
When running perf_fuzzer on PTL, sometimes the below "unchecked MSR
access error" is seen when accessing IA32_PMC_x_CFG_B MSRs.
[ 55.611268] unchecked MSR access error: WRMSR to 0x1986 (tried to write 0x0000000200000001) at rIP: 0xffffffffac564b28 (native_write_msr+0x8/0x30)
[ 55.611280] Call Trace:
[ 55.611282] <TASK>
[ 55.611284] ? intel_pmu_config_acr+0x87/0x160
[ 55.611289] intel_pmu_enable_acr+0x6d/0x80
[ 55.611291] intel_pmu_enable_event+0xce/0x460
[ 55.611293] x86_pmu_start+0x78/0xb0
[ 55.611297] x86_pmu_enable+0x218/0x3a0
[ 55.611300] ? x86_pmu_enable+0x121/0x3a0
[ 55.611302] perf_pmu_enable+0x40/0x50
[ 55.611307] ctx_resched+0x19d/0x220
[ 55.611309] __perf_install_in_context+0x284/0x2f0
[ 55.611311] ? __pfx_remote_function+0x10/0x10
[ 55.611314] remote_function+0x52/0x70
[ 55.611317] ? __pfx_remote_function+0x10/0x10
[ 55.611319] generic_exec_single+0x84/0x150
[ 55.611323] smp_call_function_single+0xc5/0x1a0
[ 55.611326] ? __pfx_remote_function+0x10/0x10
[ 55.611329] perf_install_in_context+0xd1/0x1e0
[ 55.611331] ? __pfx___perf_install_in_context+0x10/0x10
[ 55.611333] __do_sys_perf_event_open+0xa76/0x1040
[ 55.611336] __x64_sys_perf_event_open+0x26/0x30
[ 55.611337] x64_sys_call+0x1d8e/0x20c0
[ 55.611339] do_syscall_64+0x4f/0x120
[ 55.611343] entry_SYSCALL_64_after_hwframe+0x76/0x7e
On PTL, GP counters 0 and 1 don't support the auto counter reload
feature, so a #GP is triggered when trying to write 1 to bit 0 of a
CFG_B MSR, which asks to enable auto counter reload on GP counter 0.
The root cause of this issue is that the check of the auto counter
reload (ACR) counter mask coming from user space is incorrect in the
intel_pmu_acr_late_setup() helper. As a result, an invalid ACR counter
mask from user space can be set into hw.config1 and then written into
the CFG_B MSRs, triggering the MSR access warning.
e.g., a user may create a perf event with an ACR counter mask of
config2=0xcb, with only 1 event created, so "cpuc->n_events" is 1.
The correct check condition should be "i + idx >= cpuc->n_events"
instead of "i + idx > cpuc->n_events" (it looks like a typo). Otherwise,
the loop over the counter mask runs one iteration too far and an invalid
"cpuc->assign[1]" bit (bit 0) is set into hw.config1, causing the MSR
access error.
Besides, also check whether the events corresponding to the ACR counter
mask are ACR events; if not, filter those bits out of the counter mask.
If an event is not an ACR event, it could be scheduled onto a HW counter
which doesn't support ACR, and it's invalid to add its counter index to
the ACR counter mask.
Furthermore, remove the WARN_ON_ONCE() since it's easily triggered: a
user could set any invalid ACR counter mask, and the warning message
could mislead users.
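A short walk-through of the example above (illustration only, following the loop in intel_pmu_acr_late_setup()):

	config2 = 0xcb -> set bits {0, 1, 3, 6, 7}, cpuc->n_events = 1, i = 0

	old check "i + idx > cpuc->n_events":
	  idx = 0: 0 > 1 ?  no -> __set_bit(cpuc->assign[0], ...)  (valid)
	  idx = 1: 1 > 1 ?  no -> __set_bit(cpuc->assign[1], ...)  (stale slot, only 1 event)

	new check "i + idx >= cpuc->n_events":
	  idx = 1: 1 >= 1   -> return, the stale slot is never touched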
Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-3-dapeng1.mi@linux.intel.com
---
arch/x86/events/intel/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c2fb729..15da60c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2997,7 +2997,8 @@ static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
if (event->group_leader != leader->group_leader)
break;
for_each_set_bit(idx, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
- if (WARN_ON_ONCE(i + idx > cpuc->n_events))
+ if (i + idx >= cpuc->n_events ||
+ !is_acr_event_group(cpuc->event_list[i + idx]))
return;
__set_bit(cpuc->assign[i + idx], (unsigned long *)&event->hw.config1);
}
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [tip: perf/core] perf/x86/intel: Use early_initcall() to hook bts_init()
2025-08-20 2:30 ` [Patch v3 1/7] perf/x86/intel: Use early_initcall() to hook bts_init() Dapeng Mi
@ 2025-08-25 10:24 ` tip-bot2 for Dapeng Mi
0 siblings, 0 replies; 25+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2025-08-25 10:24 UTC (permalink / raw)
To: linux-tip-commits
Cc: Dapeng Mi, Peter Zijlstra (Intel), Kan Liang, x86, linux-kernel
The following commit has been merged into the perf/core branch of tip:
Commit-ID: d9cf9c6884d21e01483c4e17479d27636ea4bb50
Gitweb: https://git.kernel.org/tip/d9cf9c6884d21e01483c4e17479d27636ea4bb50
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate: Wed, 20 Aug 2025 10:30:26 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 21 Aug 2025 20:09:26 +02:00
perf/x86/intel: Use early_initcall() to hook bts_init()
After commit d971342d38bf ("perf/x86/intel: Decouple BTS initialization
from PEBS initialization"), x86_pmu.bts is initialized in bts_init(),
which is hooked by arch_initcall(), whereas init_hw_perf_events() is
hooked by early_initcall(). Once the core PMU is initialized, the NMI
watchdog initialization runs immediately, before bts_init() is called.
As a consequence, the BTS buffer is not really initialized, since
bts_init() has not run and x86_pmu.bts is still false at that time.
Worse, the BTS buffer would never be initialized afterwards unless all
core PMU events are freed and reserve_ds_buffers() is called again.
Thus, aligning with init_hw_perf_events(), use early_initcall() to hook
bts_init() so that x86_pmu.bts is initialized before the NMI watchdog
initialization.
Fixes: d971342d38bf ("perf/x86/intel: Decouple BTS initialization from PEBS initialization")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-2-dapeng1.mi@linux.intel.com
---
arch/x86/events/intel/bts.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 61da6b8..cbac54c 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -643,4 +643,4 @@ static __init int bts_init(void)
return perf_pmu_register(&bts_pmu, "intel_bts", -1);
}
-arch_initcall(bts_init);
+early_initcall(bts_init);
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it
2025-08-22 5:26 ` Mi, Dapeng
@ 2025-08-26 3:47 ` Mi, Dapeng
0 siblings, 0 replies; 25+ messages in thread
From: Mi, Dapeng @ 2025-08-26 3:47 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
kernel test robot
On 8/22/2025 1:26 PM, Mi, Dapeng wrote:
> On 8/21/2025 9:35 PM, Peter Zijlstra wrote:
>> On Wed, Aug 20, 2025 at 10:30:28AM +0800, Dapeng Mi wrote:
>>> When intel_pmu_drain_pebs_icl() is called to drain PEBS records,
>>> perf_event_overflow() may be called to process the last PEBS record.
>>>
>>> However, perf_event_overflow() can trigger interrupt throttling and
>>> stop all events of the group, as the call chain below shows.
>>>
>>> perf_event_overflow()
>>> -> __perf_event_overflow()
>>> ->__perf_event_account_interrupt()
>>> -> perf_event_throttle_group()
>>> -> perf_event_throttle()
>>> -> event->pmu->stop()
>>> -> x86_pmu_stop()
>>>
>>> The side effect of stopping the events is that all corresponding event
>>> pointers in the cpuc->events[] array are cleared to NULL.
>>>
>>> Assume there are two PEBS events (event a and event b) in a group. When
>>> intel_pmu_drain_pebs_icl() calls perf_event_overflow() to process the
>>> last PEBS record of PEBS event a, interrupt throttling is triggered and
>>> the pointers of both event a and event b are cleared to NULL. Then
>>> intel_pmu_drain_pebs_icl() tries to process the last PEBS record of
>>> event b and hits a NULL pointer access.
>>>
>>> Since the remaining PEBS records have already been processed when the
>>> event was stopped, check cpuc->events[*] and skip processing the last
>>> PEBS record if it is NULL.
>>>
>>> Reported-by: kernel test robot <oliver.sang@intel.com>
>>> Closes: https://lore.kernel.org/oe-lkp/202507042103.a15d2923-lkp@intel.com
>>> Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>> Tested-by: kernel test robot <oliver.sang@intel.com>
>>> ---
>>> arch/x86/events/intel/ds.c | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>>>
>>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>>> index c0b7ac1c7594..dcf29c099ad2 100644
>>> --- a/arch/x86/events/intel/ds.c
>>> +++ b/arch/x86/events/intel/ds.c
>>> @@ -2663,6 +2663,16 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
>>> continue;
>>>
>>> event = cpuc->events[bit];
>>> +	/*
>>> +	 * perf_event_overflow(), called by __intel_pmu_pebs_last_event()
>>> +	 * below, could trigger interrupt throttling and clear all event
>>> +	 * pointers of the group in cpuc->events[] to NULL. So re-check
>>> +	 * whether cpuc->events[*] is NULL; if so, the event has been
>>> +	 * throttled (stopped) and its remaining PEBS records were already
>>> +	 * processed when the event was stopped, so don't process it again.
>>> +	 */
>>> + if (!event)
>>> + continue;
>>>
>>> __intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
>>> counts[bit], setup_pebs_adaptive_sample_data);
>> So if this is due to __intel_pmu_pebs_last_event() calling into
>> perf_event_overflow(); then isn't intel_pmu_drain_pebs_nhm() similarly
>> affected?
>>
>> And worse, the _nhm() version would lose all events for that counter,
>> not just the last.
> hmm, yes. After double-checking, I suppose I made a mistake in my answer to
> Andi. There is indeed data loss, since "ds->pebs_index" is reset at the
> head of the _nhm()/_icl() drain_pebs helpers instead of at the end of the
> drain_pebs helpers. :(
>
>> I'm really thinking this isn't the right thing to do.
>>
>>
>> How about we audit the entirety of arch/x86/events/ for cpuc->events[]
>> usage and see if we can get away with changing x86_pmu_stop() to simply
>> not clearing that field.
> Checking the current code, I suppose it's fine not to clear
> cpuc->events[] in x86_pmu_stop(), since we already have another variable,
> "cpuc->active_mask", which indicates whether the corresponding
> cpuc->events[*] is active. But in the current code cpuc->active_mask is
> not always checked.
>
> So if we choose not to clear cpuc->events[] in x86_pmu_stop(), then we
> must check cpuc->active_mask before actually accessing the event that
> cpuc->events[] points to. Maybe we can add an inline function to check this.
>
> static inline bool x86_pmu_cntr_event_active(int idx)
> {
> 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>
> 	return cpuc->events[idx] && test_bit(idx, cpuc->active_mask);
> }
Thinking twice about this method, it breaks the original logic around
cpuc->events[] (set in x86_pmu_start() and cleared in x86_pmu_stop())
and leaves cpuc->events[] never cleared, which is ambiguous.
Besides, checking cpuc->events[idx] and cpuc->active_mask is not atomic,
which may bring potential risks, especially considering that
cpuc->events[] is used broadly in the x86/perf code.
I talked with Kan offline; he suggests taking an events[] snapshot of
cpuc->events[] before calling perf_event_overflow() and then using that
snapshot to process all remaining PEBS records. That seems a better and
safer way to fix this issue to me; a rough sketch follows below.
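For concreteness, a rough sketch of the snapshot idea (assuming the structure of intel_pmu_drain_pebs_icl(); the variable names and placement here are approximate, not a final patch):

	struct perf_event *events[X86_PMC_IDX_MAX];
	int bit;

	/* Snapshot the event pointers before any perf_event_overflow() call
	 * can throttle the group and clear cpuc->events[]. */
	for_each_set_bit(bit, (unsigned long *)&mask, size)
		events[bit] = cpuc->events[bit];

	/* ... walk the PEBS buffer; perf_event_overflow() may run here ... */

	/* Drain the last record per counter from the snapshot, not from
	 * cpuc->events[], which may have been cleared meanwhile. */
	for_each_set_bit(bit, (unsigned long *)&mask, size) {
		if (!events[bit])
			continue;
		__intel_pmu_pebs_last_event(events[bit], iregs, regs, data,
					    last[bit], counts[bit],
					    setup_pebs_adaptive_sample_data);
	}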
Peter, if you don't object to this method, I will follow Kan's suggestion.
Thanks.
>
>> Or perhaps move the setting and clearing into x86_pmu_{add,del}() rather
>> than x86_pmu_{start,stop}(). After all, the latter don't affect the
>> counter placement, they just stop/start the event.
> IIUC, we cannot move the setting from x86_pmu_start() into x86_pmu_add(),
> since the counter index is not yet finalized when x86_pmu_add() is called.
> The counter index can change each time a new event is added.
>
>
>>
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread
Thread overview: 25+ messages
2025-08-20 2:30 [Patch v3 0/7] x86 perf bug fixes and optimization Dapeng Mi
2025-08-20 2:30 ` [Patch v3 1/7] perf/x86/intel: Use early_initcall() to hook bts_init() Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 2:30 ` [Patch v3 2/7] perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 2:30 ` [Patch v3 3/7] perf/x86: Check if cpuc->events[*] pointer exists before accessing it Dapeng Mi
2025-08-20 3:41 ` Andi Kleen
2025-08-20 5:33 ` Mi, Dapeng
2025-08-20 5:44 ` Andi Kleen
2025-08-20 5:54 ` Mi, Dapeng
2025-08-21 1:51 ` Andi Kleen
2025-08-21 13:35 ` Peter Zijlstra
2025-08-22 5:26 ` Mi, Dapeng
2025-08-26 3:47 ` Mi, Dapeng
2025-08-20 2:30 ` [Patch v3 4/7] perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 2:30 ` [Patch v3 5/7] perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 2:30 ` [Patch v3 6/7] perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 2:30 ` [Patch v3 7/7] perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap() Dapeng Mi
2025-08-25 10:24 ` [tip: perf/core] " tip-bot2 for Dapeng Mi
2025-08-20 15:55 ` [Patch v3 0/7] x86 perf bug fixes and optimization Liang, Kan
2025-08-21 13:39 ` Peter Zijlstra
2025-08-22 5:29 ` Mi, Dapeng