* [Patch v2 0/9] perf/x86: Miscellaneous PMU bug fixes
@ 2026-06-09 5:02 Dapeng Mi
2026-06-09 5:02 ` [Patch v2 1/9] perf/x86/intel: Remove anythread_deprecated bit from perf_capabilities Dapeng Mi
` (8 more replies)
0 siblings, 9 replies; 14+ messages in thread
From: Dapeng Mi @ 2026-06-09 5:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
Falcon Thomas, Xudong Hao, Dapeng Mi
This series groups several independent PMU fixes to simplify review and
backporting.
Changes:
v1 -> v2:
- Fall back to software branch type decoding if hardware decoding is not
supported (Sashiko, patch 4/9).
- Drop kernel IP for PERF_SAMPLE_IP if the exclude_kernel attribute is
set (Sashiko, patch 8/9).
- Add kernel access check when kernel callchains are requested
(Sashiko, patch 9/9)
- Address Zide and Thomas's comments.
- Collect Reviewed-bys.
Patch layout:
- Patch 1/9: Fix the anythread_deprecated bit being overwritten.
- Patches 2-3/9: Fix cap_user_rdpmc not being updated correctly.
- Patch 4/9: Fall back to software branch type decoding if hardware
decoding is unavailable.
- Patch 5/9: Fix a kernel address leak in the LBR stack.
- Patch 6/9: Fix the return value of intel_pmu_init_hybrid() not being
validated.
- Patch 7/9: Fix an "unchecked MSR access error" on the PEBS_ENABLE MSR.
- Patch 8/9: Prevent a theoretical kernel register data leak in sampling.
- Patch 9/9: Add kernel access check when kernel callchains are
requested.
History:
v1: https://lore.kernel.org/all/20260605011136.2043393-1-dapeng1.mi@linux.intel.com/
Dapeng Mi (8):
perf/x86/intel: Remove anythread_deprecated bit from perf_capabilities
perf/x86: Update cap_user_rdpmc base on rdpmc user disable state
perf/x86/intel: Fallback to sw branch type decoding if no hw decoding
perf/x86/intel: Drop LBR entries whose privilege level mismatches
br_sel
perf/x86/intel: Validate return value of intel_pmu_init_hybrid()
perf/x86/intel: Drop fixed-counter PEBS constraints for baseline PEBS
perf/core: Fix kernel register info leak via hardware skid
perf/core: Check kernel access when kernel callchains are requested
Ian Rogers (1):
perf/x86: Introduce is_x86_pmu() helper
arch/x86/events/core.c | 19 +++-------------
arch/x86/events/intel/core.c | 43 ++++++++++++++++++++++++------------
arch/x86/events/intel/ds.c | 13 -----------
arch/x86/events/intel/lbr.c | 15 ++++++++++---
arch/x86/events/perf_event.h | 25 +++++++++++++++++----
kernel/events/core.c | 41 +++++++++++++++++++++++++++-------
6 files changed, 98 insertions(+), 58 deletions(-)
--
2.34.1
* [Patch v2 1/9] perf/x86/intel: Remove anythread_deprecated bit from perf_capabilities
From: Dapeng Mi @ 2026-06-09 5:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
Falcon Thomas, Xudong Hao, Dapeng Mi, stable
AnyThread mode deprecation is enumerated by CPUID.0AH:EDX[15], not by the
PERF_CAPABILITIES MSR, so defining a bit in perf_capabilities to represent
"anythread deprecation" is not good practice. The anythread_deprecated bit
can be overwritten by the real value of the PERF_CAPABILITIES MSR, as the
code below in update_pmu_cap() does.
```
if (!intel_pmu_broken_perf_cap()) {
/* Perf Metric (Bit 15) and PEBS via PT (Bit 16) are hybrid enumeration */
rdmsrq(MSR_IA32_PERF_CAPABILITIES, hybrid(pmu, intel_cap).capabilities);
}
```
As a result, the anythread_deprecated bit is cleared to 0 and the "any"
attribute is incorrectly shown in the /sys/devices/cpu/format/ folder on
platforms that support Perfmon v6, such as Clearwater Forest.
```
$ grep . /sys/devices/cpu/format/*
/sys/devices/cpu/format/acr_mask:config2:0-63
/sys/devices/cpu/format/any:config:21
/sys/devices/cpu/format/cmask:config:24-31
```
So remove the anythread_deprecated bit from the perf_capabilities structure
and rely directly on CPUID.0AH:EDX[15] to determine whether AnyThread is
deprecated.
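The overwrite described above can be reproduced in a small user-space
sketch. The union is a hypothetical, simplified stand-in for union
perf_capabilities (caps_sketch, hw_bits and sw_anythread_dep are
illustrative names, not kernel identifiers), and anythread_deprecated() only
mirrors the CPUID.0AH:EDX[15] test that this patch switches to:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for union perf_capabilities. */
union caps_sketch {
	struct {
		uint64_t hw_bits:17;         /* fields that mirror the MSR   */
		uint64_t sw_anythread_dep:1; /* software-only flag (assumed) */
	};
	uint64_t capabilities;               /* raw PERF_CAPABILITIES value  */
};

/* Value of the software flag after the raw MSR value is stored again. */
static int flag_after_msr_reload(uint64_t raw_msr)
{
	union caps_sketch c = { .capabilities = 0 };

	c.sw_anythread_dep = 1;   /* set from CPUID during init         */
	c.capabilities = raw_msr; /* later refresh clobbers every field */
	return (int)c.sw_anythread_dep;
}

/* CPUID.0AH:EDX[15] is only consulted for arch perfmon v5 or later. */
static int anythread_deprecated(int version, uint32_t cpuid_0a_edx)
{
	assert(version >= 0);
	return version >= 5 && ((cpuid_0a_edx >> 15) & 1u);
}
```

In this sketch, reloading a raw value that has the flag position clear
loses the flag, whereas anythread_deprecated() depends only on its CPUID
input and cannot be clobbered by a later MSR read.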
Cc: stable@vger.kernel.org
Reported-by: Namhyung Kim <namhyung@kernel.org>
Fixes: cadbaa039b99 ("perf/x86/intel: Make anythread filter support conditional")
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
---
arch/x86/events/intel/core.c | 10 +++-------
arch/x86/events/perf_event.h | 2 +-
2 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 0217e701aeeb..ea3ab3050a3b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -7946,12 +7946,6 @@ __init int intel_pmu_init(void)
x86_add_quirk(intel_arch_events_quirk); /* Install first, so it runs last */
- if (version >= 5) {
- x86_pmu.intel_cap.anythread_deprecated = edx.split.anythread_deprecated;
- if (x86_pmu.intel_cap.anythread_deprecated)
- pr_cont(" AnyThread deprecated, ");
- }
-
/* The perf side of core PMU is ready to support the mediated vPMU. */
x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_MEDIATED_VPMU;
@@ -8828,8 +8822,10 @@ __init int intel_pmu_init(void)
&x86_pmu.intel_ctrl);
/* AnyThread may be deprecated on arch perfmon v5 or later */
- if (x86_pmu.intel_cap.anythread_deprecated)
+ if (version >= 5 && edx.split.anythread_deprecated) {
x86_pmu.format_attrs = intel_arch_formats_attr;
+ pr_cont("AnyThread deprecated, ");
+ }
intel_pmu_check_event_constraints_all(NULL);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index eae24bb35dc1..5902a297daa1 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -668,7 +668,7 @@ union perf_capabilities {
u64 perf_metrics:1;
u64 pebs_output_pt_available:1;
u64 pebs_timing_info:1;
- u64 anythread_deprecated:1;
+ u64 __reserved:1;
u64 rdpmc_metrics_clear:1;
};
u64 capabilities;
--
2.34.1
* [Patch v2 2/9] perf/x86: Introduce is_x86_pmu() helper
From: Dapeng Mi @ 2026-06-09 5:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
Falcon Thomas, Xudong Hao, Dapeng Mi
From: Ian Rogers <irogers@google.com>
To facilitate the detection of x86 PMU structures in upcoming patches,
the is_x86_pmu() helper is introduced. Additionally, the is_x86_event()
helper has been refactored to utilize is_x86_pmu().
No functional change intended.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
---
arch/x86/events/core.c | 16 ----------------
arch/x86/events/perf_event.h | 18 +++++++++++++++++-
2 files changed, 17 insertions(+), 17 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 4b9e105309c6..3bd0522afe6d 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -774,22 +774,6 @@ void x86_pmu_enable_all(int added)
}
}
-int is_x86_event(struct perf_event *event)
-{
- /*
- * For a non-hybrid platforms, the type of X86 pmu is
- * always PERF_TYPE_RAW.
- * For a hybrid platform, the PERF_PMU_CAP_EXTENDED_HW_TYPE
- * is a unique capability for the X86 PMU.
- * Use them to detect a X86 event.
- */
- if (event->pmu->type == PERF_TYPE_RAW ||
- event->pmu->capabilities & PERF_PMU_CAP_EXTENDED_HW_TYPE)
- return true;
-
- return false;
-}
-
struct pmu *x86_get_pmu(unsigned int cpu)
{
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 5902a297daa1..dbb5c8e8a8ea 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -115,7 +115,23 @@ static inline bool is_topdown_event(struct perf_event *event)
return is_metric_event(event) || is_slots_event(event);
}
-int is_x86_event(struct perf_event *event);
+static inline bool is_x86_pmu(struct pmu *pmu)
+{
+ /*
+ * For a non-hybrid platforms, the type of X86 pmu is
+ * always PERF_TYPE_RAW.
+ * For a hybrid platform, the PERF_PMU_CAP_EXTENDED_HW_TYPE
+ * is a unique capability for the X86 PMU.
+ * Use them to detect a X86 event.
+ */
+ return pmu->type == PERF_TYPE_RAW ||
+ pmu->capabilities & PERF_PMU_CAP_EXTENDED_HW_TYPE;
+}
+
+static inline bool is_x86_event(struct perf_event *event)
+{
+ return is_x86_pmu(event->pmu);
+}
static inline bool check_leader_group(struct perf_event *leader, int flags)
{
--
2.34.1
* [Patch v2 3/9] perf/x86: Update cap_user_rdpmc base on rdpmc user disable state
From: Dapeng Mi @ 2026-06-09 5:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
Falcon Thomas, Xudong Hao, Dapeng Mi
Since the RDPMC user disable feature was introduced, user-space RDPMC may
return 0 instead of the actual event count. This is inconsistent with
cap_user_rdpmc: the capability bit is set, yet user-space RDPMC only
returns 0.
To accurately represent the user-space RDPMC capability, update
cap_user_rdpmc based on the RDPMC user disable state. If RDPMC user
disable is enabled, cap_user_rdpmc is set to false, allowing user-space
programs to fall back to the read() syscall to obtain the real event
count.
Since arch_perf_update_userpage() could be called for software events,
enhance x86_pmu_has_rdpmc_user_disable() to only check the x86 PMUs.
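From the user-space side, the intended fallback could look roughly like the
sketch below. It is not the real perf_event_mmap_page protocol (the seqlock
retry loop, the index check and the pmc_width sign extension are omitted);
struct userpage_sketch, count_fn, the fake_* stand-ins and demo_count() are
hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the mmap'ed user page. */
struct userpage_sketch {
	unsigned int cap_user_rdpmc; /* 1 if user-space RDPMC is usable */
	int64_t offset;              /* value to add to the raw counter */
};

typedef uint64_t (*count_fn)(void *ctx);

/* Use the fast path only when the kernel advertises it. */
static uint64_t read_event_count(const struct userpage_sketch *pg,
				 count_fn fast_rdpmc, count_fn slow_read,
				 void *ctx)
{
	assert(pg && fast_rdpmc && slow_read);
	if (pg->cap_user_rdpmc)
		return (uint64_t)pg->offset + fast_rdpmc(ctx);
	return slow_read(ctx); /* stands in for the read() syscall */
}

/* Stand-ins: RDPMC reads 0 while RDPMC user disable is active. */
static uint64_t fake_rdpmc(void *ctx) { (void)ctx; return 0; }
static uint64_t fake_read(void *ctx) { return *(uint64_t *)ctx; }

/* Count seen by user space for a given cap_user_rdpmc value. */
static uint64_t demo_count(unsigned int cap_user_rdpmc)
{
	uint64_t real_count = 1234;
	struct userpage_sketch pg = {
		.cap_user_rdpmc = cap_user_rdpmc,
		.offset = 0,
	};

	return read_event_count(&pg, fake_rdpmc, fake_read, &real_count);
}
```

With the capability bit left set, demo_count(1) yields the bogus 0 that
RDPMC returns; with the bit cleared, demo_count(0) takes the slow path and
yields the real count.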
Fixes: 59af95e028d4 ("perf/x86/intel: Add support for rdpmc user disable feature")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
---
arch/x86/events/core.c | 3 +++
arch/x86/events/perf_event.h | 5 +++--
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 3bd0522afe6d..6cd95b8e31cb 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2797,6 +2797,9 @@ void arch_perf_update_userpage(struct perf_event *event,
userpg->cap_user_time_zero = 0;
userpg->cap_user_rdpmc =
!!(event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT);
+ if (x86_pmu_has_rdpmc_user_disable(event->pmu) &&
+ event->hw.config & ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE)
+ userpg->cap_user_rdpmc = 0;
userpg->pmc_width = x86_pmu.cntval_bits;
if (!using_native_sched_clock() || !sched_clock_stable())
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index dbb5c8e8a8ea..4003e2e0aa9c 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1359,8 +1359,9 @@ static inline u64 x86_pmu_get_event_config(struct perf_event *event)
static inline bool x86_pmu_has_rdpmc_user_disable(struct pmu *pmu)
{
- return !!(hybrid(pmu, config_mask) &
- ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE);
+ return is_x86_pmu(pmu) &&
+ (hybrid(pmu, config_mask) &
+ ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE);
}
extern struct event_constraint emptyconstraint;
--
2.34.1
* [Patch v2 4/9] perf/x86/intel: Fallback to sw branch type decoding if no hw decoding
From: Dapeng Mi @ 2026-06-09 5:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
Falcon Thomas, Xudong Hao, Dapeng Mi
intel_pmu_lbr_filter() currently assumes Arch LBR provides hardware
branch-type decoding and skips software decoding on that path.
However, Arch LBR may not always expose branch-type information. In
that case, treating entries as hardware-decoded can misclassify sampled
branches (for example, defaulting to JCC), which breaks branch-type
filtering results.
Fix this by using software branch-type decoding when hardware
branch-type decoding is unavailable (that is, when x86_lbr_type is not
enabled). This keeps branch classification and filtering behavior
correct across Arch LBR configurations.
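A minimal sketch of the decision this patch changes, with hypothetical
names throughout (the SK_* constants, sk_hw_map[], sw_decode_stub() and
both classify functions are illustrative; the real code works on LBR
entries and arch_lbr_br_type_map[]). It assumes that a hardware type field
which was never populated reads as 0, which is the encoding taken for a
conditional jump:

```c
#include <assert.h>

/* Hypothetical hardware type encodings and decoded branch classes. */
enum { SK_HW_JCC = 0, SK_HW_NEAR_RET = 1, SK_HW_KNOWN_MAX = SK_HW_NEAR_RET };
enum { SK_BR_JCC = 1, SK_BR_RET = 2, SK_BR_SYSRET = 3 };

static const int sk_hw_map[] = {
	[SK_HW_JCC]      = SK_BR_JCC,
	[SK_HW_NEAR_RET] = SK_BR_RET,
};

/* Stand-in for decoding the branch instruction in software. */
static int sw_decode_stub(void)
{
	return SK_BR_SYSRET;
}

/* Pre-patch logic: trusts the hardware type whenever Arch LBR exists. */
static int classify_old(int has_arch_lbr, int hw_type)
{
	assert(hw_type >= 0);
	if (has_arch_lbr && hw_type <= SK_HW_KNOWN_MAX)
		return sk_hw_map[hw_type];
	return sw_decode_stub();
}

/* Patched logic: also requires hardware type decoding to be enabled. */
static int classify(int has_arch_lbr, int hw_type_enabled, int hw_type)
{
	assert(hw_type >= 0);
	if (has_arch_lbr && hw_type_enabled && hw_type <= SK_HW_KNOWN_MAX)
		return sk_hw_map[hw_type];
	return sw_decode_stub();
}
```

In the sketch, classify_old(1, 0) reports a conditional jump for an entry
whose type was never filled in, while classify(1, 0, 0) defers to the
software decoder.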
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/lbr.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 72f2adcda7c6..d4c0ed85e1fb 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1232,6 +1232,7 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
* OTHER_BRANCH branch type still rely on software decoding.
*/
if (static_cpu_has(X86_FEATURE_ARCH_LBR) &&
+ static_branch_likely(&x86_lbr_type) &&
type <= ARCH_LBR_BR_TYPE_KNOWN_MAX) {
to_plm = kernel_ip(to) ? X86_BR_KERNEL : X86_BR_USER;
type = arch_lbr_br_type_map[type] | to_plm;
--
2.34.1
* [Patch v2 5/9] perf/x86/intel: Drop LBR entries whose privilege level mismatches br_sel
From: Dapeng Mi @ 2026-06-09 5:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
Falcon Thomas, Xudong Hao, Dapeng Mi, stable
Before Arch LBR gained CPL filtering support, a user-only branch stack
could still contain kernel addresses. As a result, kernel branch records
may be exposed to user space even when PERF_SAMPLE_BRANCH_USER is
requested.
For example, on Intel Tiger Lake, the following command can still report
SYSRET/ERET entries with kernel-space from addresses:
```
$ ./perf record -e cycles:p -o - --branch-filter any,save_type,u -- \
./perf bench syscall basic --loop 1000 | \
./perf script -i - --fields brstack | tr ' ' '\n' | \
grep -E '0x[89a-f][0-9a-f]{15}'
Total time: 0.000 [sec]
0.219000 usecs/op
4,566,210 ops/sec
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.551 MB - ]
0xffffffff93c001c8/0x7f12a2b1d647/P/-/-/16959/SYSRET/-
0xffffffff93c001c8/0x7f12a2b1d5c2/P/-/-/17535/SYSRET/-
0xffffffff93c01928/0x7f12a2861000/P/-/-/6719/ERET/-
0xffffffff93c01928/0x7f12a297a000/P/-/-/8575/ERET/-
```
The problem is that intel_pmu_lbr_filter() does not fully validate the
privilege level of sampled entries. It filters some mismatches based on
the branch type and the to address, but it does not reject entries whose
from address violates the requested branch privilege filter.
Fix this by extending software filtering to validate both from and to
addresses against br_sel. Any LBR entry whose privilege level does not
match the requested user/kernel filter is dropped. This prevents kernel
addresses from appearing in user-only branch stacks, and likewise drops
user entries from kernel-only stacks.
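The privilege check can be sketched in isolation as follows. This is a
simplification that ignores branch-type matching and the compaction of the
LBR stack; SK_BR_USER, SK_BR_KERNEL, plm_of() and keep_entry() are
hypothetical names, and the address test assumes x86-64, where kernel
addresses have the top bit set:

```c
#include <assert.h>
#include <stdint.h>

#define SK_BR_USER   (1u << 0) /* hypothetical mirror of X86_BR_USER   */
#define SK_BR_KERNEL (1u << 1) /* hypothetical mirror of X86_BR_KERNEL */

/* x86-64 assumption: kernel addresses live in the upper half. */
static unsigned int plm_of(uint64_t ip)
{
	return (ip >> 63) ? SK_BR_KERNEL : SK_BR_USER;
}

/* Return 1 to keep the entry, 0 to drop it. */
static int keep_entry(unsigned int br_sel, uint64_t from, uint64_t to)
{
	/* At least one privilege level must have been requested. */
	assert(br_sel & (SK_BR_USER | SK_BR_KERNEL));

	/* Both ends of the branch must match a requested level. */
	return (br_sel & plm_of(from)) && (br_sel & plm_of(to));
}
```

Applied to the SYSRET entry from the output above,
keep_entry(SK_BR_USER, 0xffffffff93c001c8, 0x7f12a2b1d647) returns 0, so
the kernel-space from address would no longer reach a user-only branch
stack.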
Cc: stable@vger.kernel.org
Reported-by: Ian Rogers <irogers@google.com>
Fixes: 47125db27e47 ("perf/x86/intel/lbr: Support Architectural LBR")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/lbr.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index d4c0ed85e1fb..807ce903c972 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1212,7 +1212,7 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
{
u64 from, to;
int br_sel = cpuc->br_sel;
- int i, j, type, to_plm;
+ int i, j, type, to_plm, from_plm;
bool compress = false;
/* if sampling all branches, then nothing to filter */
@@ -1245,8 +1245,16 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
type |= X86_BR_NO_TX;
}
- /* if type does not correspond, then discard */
- if (type == X86_BR_NONE || (br_sel & type) != type) {
+ from_plm = kernel_ip(from) ? X86_BR_KERNEL : X86_BR_USER;
+ /*
+ * If type does not correspond, then discard.
+ * Especially filter out the entries whose from or to address is
+ * a kernel address while only X86_BR_USER is set. This prevents
+ * kernel address from being leaked into a user-space-only LBR stack.
+ */
+ if (type == X86_BR_NONE || (br_sel & type) != type ||
+ (!(br_sel & X86_BR_KERNEL) && (from_plm & X86_BR_KERNEL)) ||
+ (!(br_sel & X86_BR_USER) && (from_plm & X86_BR_USER))) {
cpuc->lbr_entries[i].from = 0;
compress = true;
}
--
2.34.1
* [Patch v2 6/9] perf/x86/intel: Validate return value of intel_pmu_init_hybrid()
From: Dapeng Mi @ 2026-06-09 5:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
Falcon Thomas, Xudong Hao, Dapeng Mi
The memory allocation for the x86_pmu.hybrid_pmu[] array in
intel_pmu_init_hybrid() can theoretically fail due to memory shortages.
If this occurs, the initialization of the x86 hybrid PMU would fail.
Currently, the code does not check the return value of the
intel_pmu_init_hybrid() function, which could lead to attempts to access
the uninitialized x86_pmu.hybrid_pmu[] array, potentially causing a
system panic.
So, add a check for the return value of intel_pmu_init_hybrid() to
prevent invalid memory access in such scenarios. Also free the created
kmem cache when an error occurs.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
---
arch/x86/events/intel/core.c | 33 ++++++++++++++++++++++++++-------
1 file changed, 26 insertions(+), 7 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ea3ab3050a3b..efd9caa3502c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -7870,6 +7870,7 @@ __init int intel_pmu_init(void)
int version, i;
char *name;
struct x86_hybrid_pmu *pmu;
+ int ret;
/* Architectural Perfmon was introduced starting with Core "Yonah" */
if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
@@ -8539,7 +8540,9 @@ __init int intel_pmu_init(void)
*
* Initialize the common PerfMon capabilities here.
*/
- intel_pmu_init_hybrid(hybrid_big_small);
+ ret = intel_pmu_init_hybrid(hybrid_big_small);
+ if (ret < 0)
+ goto err;
x86_pmu.pebs_latency_data = grt_latency_data;
x86_pmu.get_event_constraints = adl_get_event_constraints;
@@ -8597,7 +8600,9 @@ __init int intel_pmu_init(void)
case INTEL_METEORLAKE:
case INTEL_METEORLAKE_L:
case INTEL_ARROWLAKE_U:
- intel_pmu_init_hybrid(hybrid_big_small);
+ ret = intel_pmu_init_hybrid(hybrid_big_small);
+ if (ret < 0)
+ goto err;
x86_pmu.pebs_latency_data = cmt_latency_data;
x86_pmu.get_event_constraints = mtl_get_event_constraints;
@@ -8628,7 +8633,9 @@ __init int intel_pmu_init(void)
pr_cont("Pantherlake Hybrid events, ");
name = "pantherlake_hybrid";
- intel_pmu_init_hybrid(hybrid_big_small);
+ ret = intel_pmu_init_hybrid(hybrid_big_small);
+ if (ret < 0)
+ goto err;
/* Initialize big core specific PerfMon capabilities.*/
pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_CORE_IDX];
@@ -8643,7 +8650,9 @@ __init int intel_pmu_init(void)
pr_cont("Arrowlake Hybrid events, ");
name = "arrowlake_hybrid";
- intel_pmu_init_hybrid(hybrid_big_small);
+ ret = intel_pmu_init_hybrid(hybrid_big_small);
+ if (ret < 0)
+ goto err;
/* Initialize big core specific PerfMon capabilities.*/
pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_CORE_IDX];
@@ -8660,7 +8669,9 @@ __init int intel_pmu_init(void)
pr_cont("Lunarlake Hybrid events, ");
name = "lunarlake_hybrid";
- intel_pmu_init_hybrid(hybrid_big_small);
+ ret = intel_pmu_init_hybrid(hybrid_big_small);
+ if (ret < 0)
+ goto err;
/* Initialize big core specific PerfMon capabilities.*/
pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_CORE_IDX];
@@ -8685,7 +8696,9 @@ __init int intel_pmu_init(void)
break;
case INTEL_ARROWLAKE_H:
- intel_pmu_init_hybrid(hybrid_big_small_tiny);
+ ret = intel_pmu_init_hybrid(hybrid_big_small_tiny);
+ if (ret < 0)
+ goto err;
x86_pmu.pebs_latency_data = arl_h_latency_data;
x86_pmu.get_event_constraints = arl_h_get_event_constraints;
@@ -8720,7 +8733,9 @@ __init int intel_pmu_init(void)
case INTEL_NOVALAKE_L:
pr_cont("Novalake Hybrid events, ");
name = "novalake_hybrid";
- intel_pmu_init_hybrid(hybrid_big_small);
+ ret = intel_pmu_init_hybrid(hybrid_big_small);
+ if (ret < 0)
+ goto err;
x86_pmu.pebs_latency_data = nvl_latency_data;
x86_pmu.get_event_constraints = mtl_get_event_constraints;
@@ -8885,6 +8900,10 @@ __init int intel_pmu_init(void)
intel_aux_output_init();
return 0;
+
+err:
+ kmem_cache_destroy(x86_get_pmu(smp_processor_id())->task_ctx_cache);
+ return ret;
}
/*
--
2.34.1
* [Patch v2 7/9] perf/x86/intel: Drop fixed-counter PEBS constraints for baseline PEBS
From: Dapeng Mi @ 2026-06-09 5:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
Falcon Thomas, Xudong Hao, Dapeng Mi, Yi Lai
On SPR guests where pebs_baseline is not advertised, running:
$ ./perf record -e cpu/event=0x00,umask=0x01,i\
name=INST_RETIRED.PREC_DIST/p -c 10000 sleep 1
can trigger:
unchecked MSR access error: WRMSR to 0x3f1 ... in\
intel_pmu_pebs_enable_all()
Root cause:
SPR-specific PEBS constraints allow fixed-counter scheduling,
for example INST_RETIRED.PREC_DIST on fixed counter 0. In guests without
pebs_baseline, KVM does not support PEBS sampling on fixed counters,
so enabling such events reaches an invalid MSR programming path.
Fix:
Drop fixed-counter entries from the PEBS constraint table. Without
pebs_baseline, those fixed-counter PEBS events now resolve to empty
constraints and are not scheduled/enabled, avoiding the warning and the
broken guest PEBS path.
This is safe because, in pebs_baseline-capable cases, PEBS constraint
lookup already falls back to non-PEBS constraints when needed, and
fixed-counter constraints are effectively shared there.
Reported-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/events/intel/ds.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index cb72af9b61ce..5db15a92017a 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1447,10 +1447,6 @@ struct event_constraint intel_skl_pebs_event_constraints[] = {
};
struct event_constraint intel_icl_pebs_event_constraints[] = {
- INTEL_FLAGS_UEVENT_CONSTRAINT(0x01c0, 0x100000000ULL), /* old INST_RETIRED.PREC_DIST */
- INTEL_FLAGS_UEVENT_CONSTRAINT(0x0100, 0x100000000ULL), /* INST_RETIRED.PREC_DIST */
- INTEL_FLAGS_UEVENT_CONSTRAINT(0x0400, 0x800000000ULL), /* SLOTS */
-
INTEL_PLD_CONSTRAINT(0x1cd, 0xff), /* MEM_TRANS_RETIRED.LOAD_LATENCY */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_LOADS */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x12d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_STORES */
@@ -1473,9 +1469,6 @@ struct event_constraint intel_icl_pebs_event_constraints[] = {
};
struct event_constraint intel_glc_pebs_event_constraints[] = {
- INTEL_FLAGS_UEVENT_CONSTRAINT(0x100, 0x100000000ULL), /* INST_RETIRED.PREC_DIST */
- INTEL_FLAGS_UEVENT_CONSTRAINT(0x0400, 0x800000000ULL),
-
INTEL_FLAGS_EVENT_CONSTRAINT(0xc0, 0xfe),
INTEL_PLD_CONSTRAINT(0x1cd, 0xfe),
INTEL_PSD_CONSTRAINT(0x2cd, 0x1),
@@ -1500,9 +1493,6 @@ struct event_constraint intel_glc_pebs_event_constraints[] = {
};
struct event_constraint intel_lnc_pebs_event_constraints[] = {
- INTEL_FLAGS_UEVENT_CONSTRAINT(0x100, 0x100000000ULL), /* INST_RETIRED.PREC_DIST */
- INTEL_FLAGS_UEVENT_CONSTRAINT(0x0400, 0x800000000ULL),
-
INTEL_FLAGS_UEVENT_CONSTRAINT(0x012a, 0x1), /* OCR.* events */
INTEL_FLAGS_UEVENT_CONSTRAINT(0x012b, 0x1), /* OCR.* events */
@@ -1534,9 +1524,6 @@ struct event_constraint intel_lnc_pebs_event_constraints[] = {
};
struct event_constraint intel_pnc_pebs_event_constraints[] = {
- INTEL_FLAGS_UEVENT_CONSTRAINT(0x100, 0x100000000ULL), /* INST_RETIRED.PREC_DIST */
- INTEL_FLAGS_UEVENT_CONSTRAINT(0x0400, 0x800000000ULL),
-
INTEL_HYBRID_LDLAT_CONSTRAINT(0x1cd, 0xfc),
INTEL_HYBRID_STLAT_CONSTRAINT(0x2cd, 0x3),
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_LOADS */
--
2.34.1
* [Patch v2 8/9] perf/core: Fix kernel register info leak via hardware skid
From: Dapeng Mi @ 2026-06-09 5:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
Falcon Thomas, Xudong Hao, Dapeng Mi, Mark Rutland
An unprivileged hardware perf event using exclude_kernel=1 can leak kernel
register data to user space via PERF_SAMPLE_REGS_INTR or PERF_SAMPLE_IP.
Due to hardware skid, a PMI may trigger after the CPU has already entered
kernel space (Ring 0), bypassing the perf_allow_kernel() privilege
barrier.
This security vulnerability is severely exacerbated by upcoming support
for SIMD register sampling via XSAVES, which could expose sensitive kernel
FPU states (such as active cryptographic keys).
Fix this by ensuring that sampled register data is dropped if the event's
exclude_kernel attribute is set but the PMI catches the CPU in kernel mode.
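Reduced to its decision logic, the fix could be sketched as below. The
function names and the in_user_mode parameter (standing in for
user_mode(regs)) are hypothetical, and guest sampling is left out:

```c
#include <assert.h>
#include <stdint.h>

/* IP to report: 0 when a user-only event was interrupted in kernel mode. */
static uint64_t sampled_ip(int exclude_kernel, int in_user_mode, uint64_t ip)
{
	/* exclude_kernel models a 1-bit attribute. */
	assert(exclude_kernel == 0 || exclude_kernel == 1);
	return (exclude_kernel && !in_user_mode) ? 0 : ip;
}

/* 1 if interrupt-time registers may be copied into the sample. */
static int may_dump_intr_regs(int exclude_kernel, int in_user_mode)
{
	assert(exclude_kernel == 0 || exclude_kernel == 1);
	return !(exclude_kernel && !in_user_mode);
}
```

A skidded sample then carries an IP of 0 and no register dump, while
samples taken in user mode, or for events that may observe the kernel, are
unaffected.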
Link: https://lore.kernel.org/all/20260529085613.CCAFB1F00893@smtp.kernel.org/
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
kernel/events/core.c | 37 ++++++++++++++++++++++++++++++-------
1 file changed, 30 insertions(+), 7 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7935d5663944..1bde029eeca7 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7763,10 +7763,20 @@ unsigned long perf_misc_flags(struct perf_event *event,
unsigned long perf_instruction_pointer(struct perf_event *event,
struct pt_regs *regs)
{
- if (should_sample_guest(event))
- return perf_guest_get_ip();
+ /*
+ * Hardware skid can lead to a scenario where a PMI is
+ * delivered after the CPU has already entered kernel mode.
+ * In that case, user-space sampling must not expose kernel
+ * register state.
+ */
+ if (should_sample_guest(event)) {
+ return event->attr.exclude_kernel &&
+ !(perf_guest_state() & PERF_GUEST_USER) ?
+ 0 : perf_guest_get_ip();
+ }
- return perf_arch_instruction_pointer(regs);
+ return event->attr.exclude_kernel && !user_mode(regs) ?
+ 0 : perf_arch_instruction_pointer(regs);
}
static void
@@ -7800,10 +7810,22 @@ static void perf_sample_regs_user(struct perf_regs *regs_user,
}
static void perf_sample_regs_intr(struct perf_regs *regs_intr,
- struct pt_regs *regs)
+ struct pt_regs *regs,
+ bool exclude_kernel)
{
- regs_intr->regs = regs;
- regs_intr->abi = perf_reg_abi(current);
+ /*
+ * Hardware skid can lead to a scenario where a PMI is
+ * delivered after the CPU has already entered kernel mode.
+ * In that case, user-space sampling must not expose kernel
+ * register state.
+ */
+ if (exclude_kernel && !user_mode(regs)) {
+ regs_intr->abi = PERF_SAMPLE_REGS_ABI_NONE;
+ regs_intr->regs = NULL;
+ } else {
+ regs_intr->regs = regs;
+ regs_intr->abi = perf_reg_abi(current);
+ }
}
@@ -8694,7 +8716,8 @@ void perf_prepare_sample(struct perf_sample_data *data,
/* regs dump ABI info */
int size = sizeof(u64);
- perf_sample_regs_intr(&data->regs_intr, regs);
+ perf_sample_regs_intr(&data->regs_intr, regs,
+ event->attr.exclude_kernel);
if (data->regs_intr.regs) {
u64 mask = event->attr.sample_regs_intr;
--
2.34.1
* [Patch v2 9/9] perf/core: Check kernel access when kernel callchains are requested
From: Dapeng Mi @ 2026-06-09 5:02 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
Falcon Thomas, Xudong Hao, Dapeng Mi, Mark Rutland
perf_event_open() currently gates perf_allow_kernel() only on
!attr.exclude_kernel.
However, users can still request kernel callchain collection with
attr.exclude_callchain_kernel == 0 even when attr.exclude_kernel == 1.
That still requires kernel profiling privilege, but the existing check
does not enforce it.
Update the permission check to call perf_allow_kernel() when either
kernel sampling is requested or kernel callchains are requested.
This keeps permission checks aligned with requested data and prevents
unprivileged use of kernel callchain capture.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
kernel/events/core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1bde029eeca7..57c9fd640cf5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -13895,7 +13895,9 @@ SYSCALL_DEFINE5(perf_event_open,
if (err)
return err;
- if (!attr.exclude_kernel) {
+ if (!attr.exclude_kernel ||
+ ((attr.sample_type & PERF_SAMPLE_CALLCHAIN) &&
+ !attr.exclude_callchain_kernel)) {
err = perf_allow_kernel();
if (err)
return err;
--
2.34.1
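
The condition in this patch can be read as a single predicate over the attribute fields. The sketch below is a user-space model under stated assumptions: struct sketch_attr, SKETCH_SAMPLE_CALLCHAIN and sketch_needs_kernel_access() are hypothetical names, and the struct carries only the three fields the check reads rather than the real perf_event_attr layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PERF_SAMPLE_CALLCHAIN; the value is illustrative only. */
#define SKETCH_SAMPLE_CALLCHAIN (1ULL << 5)

/* Hypothetical subset of perf_event_attr: only the fields the check reads. */
struct sketch_attr {
	uint64_t sample_type;
	bool exclude_kernel;
	bool exclude_callchain_kernel;
};

/* True when the request would expose kernel data and so needs the privilege check. */
bool sketch_needs_kernel_access(const struct sketch_attr *attr)
{
	bool wants_kernel_samples = !attr->exclude_kernel;
	bool wants_kernel_callchain =
		(attr->sample_type & SKETCH_SAMPLE_CALLCHAIN) &&
		!attr->exclude_callchain_kernel;

	return wants_kernel_samples || wants_kernel_callchain;
}
```

In this model, a request with exclude_kernel == 1 still needs the privilege check as long as callchains are sampled and exclude_callchain_kernel is left at 0.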
* Re: [Patch v2 5/9] perf/x86/intel: Drop LBR entries whose privilege level mismatches br_sel
2026-06-09 5:02 ` [Patch v2 5/9] perf/x86/intel: Drop LBR entries whose privilege level mismatches br_sel Dapeng Mi
@ 2026-06-09 5:21 ` sashiko-bot
0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-09 5:21 UTC (permalink / raw)
To: Dapeng Mi; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] The kernel address leak in LBR stacks fixed here for Intel PMUs still exists in AMD PMUs (both AMD LBR and AMD BRS), allowing KASLR bypass.
--
commit 955fb4f61b3305a8e4972ee94bb41e752a3dd25c
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
perf/x86/intel: Drop LBR entries whose privilege level mismatches br_sel
This commit fixes an issue where kernel branch addresses could be exposed
to user space in user-only branch stacks. It extends the software filter
in intel_pmu_lbr_filter() to validate both the from and to addresses
against the requested branch privilege filter.
> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
> index 171fdf1962ade..9a63209cc7878 100644
> --- a/arch/x86/events/intel/lbr.c
> +++ b/arch/x86/events/intel/lbr.c
[ ... ]
> @@ -1246,8 +1246,16 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
> type |= X86_BR_NO_TX;
> }
>
> - /* if type does not correspond, then discard */
> - if (type == X86_BR_NONE || (br_sel & type) != type) {
> + from_plm = kernel_ip(from) ? X86_BR_KERNEL : X86_BR_USER;
> + /*
> + * If type does not correspond, then discard.
> + * Especially filter out the entries whose from or to address is
> + * a kernel address while only X86_BR_USER is set. This prevents
> + * kernel address from being leaked into a user-space-only LBR stack.
> + */
> + if (type == X86_BR_NONE || (br_sel & type) != type ||
> + (!(br_sel & X86_BR_KERNEL) && (from_plm & X86_BR_KERNEL)) ||
> + (!(br_sel & X86_BR_USER) && (from_plm & X86_BR_USER))) {
> cpuc->lbr_entries[i].from = 0;
> compress = true;
> }
[Severity: High]
This is a pre-existing issue, but does the AMD PMU code suffer from this
same vulnerability?
Looking at the AMD LBR software filter, it appears to only check the
target privilege level and is missing the from_plm privilege validation:
arch/x86/events/amd/lbr.c:amd_pmu_lbr_filter() {
...
/* If type does not correspond, then discard */
if (type == X86_BR_NONE || (br_sel & type) != type) {
cpuc->lbr_entries[i].from = 0; /* mark invalid */
...
}
Similarly, the AMD BRS filter explicitly only validates the to address
and ignores the from address:
arch/x86/events/amd/brs.c:amd_brs_match_plm() {
...
if (!(type & plm_k) && kernel_ip(to))
return 0;
if (!(type & plm_u) && !kernel_ip(to))
return 0;
return 1;
}
Could kernel addresses from kernel-to-user branches leak into user-space
LBR stacks on AMD systems through these paths, bypassing KASLR in the
same way?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260609050222.2458129-1-dapeng1.mi@linux.intel.com?part=5
* Re: [Patch v2 4/9] perf/x86/intel: Fallback to sw branch type decoding if no hw decoding
2026-06-09 5:02 ` [Patch v2 4/9] perf/x86/intel: Fallback to sw branch type decoding if no hw decoding Dapeng Mi
@ 2026-06-09 5:24 ` sashiko-bot
0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-09 5:24 UTC (permalink / raw)
To: Dapeng Mi; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues:
- [High] The fallback to software branch decoding causes massive NMI overhead and drops valid LBR entries when user memory is paged out.
Pre-existing issues:
- [Critical] Page fault in NMI context due to an out-of-bounds direct memory read into unmapped guard pages during LBR software branch type decoding.
--
commit 11f8cb2dc702b7450f8fbb74de9eaa30a63a9248
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
Subject: perf/x86/intel: Fallback to sw branch type decoding if no hw decoding
This commit updates intel_pmu_lbr_filter() to use software branch-type
decoding when hardware branch-type decoding is unavailable (i.e.
x86_lbr_type is not enabled). This aims to keep branch classification and
filtering behavior correct across Arch LBR configurations.
> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
> index cae2e02fe6cc4..171fdf1962ade 100644
> --- a/arch/x86/events/intel/lbr.c
> +++ b/arch/x86/events/intel/lbr.c
> @@ -1233,6 +1233,7 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
> * OTHER_BRANCH branch type still rely on software decoding.
> */
> if (static_cpu_has(X86_FEATURE_ARCH_LBR) &&
> + static_branch_likely(&x86_lbr_type) &&
[Severity: High]
When x86_lbr_type is not enabled, this falls back to software decoding via
branch_type(). Because intel_pmu_setup_hw_lbr_filter() unconditionally sets
X86_BR_TYPE_SAVE for Arch LBR CPUs:
arch/x86/events/intel/lbr.c:intel_pmu_setup_hw_lbr_filter() {
...
if (static_cpu_has(X86_FEATURE_ARCH_LBR)) {
...
reg->reg |= X86_BR_TYPE_SAVE;
...
}
the early return fast-path in intel_pmu_lbr_filter() is bypassed. Does this
cause intel_pmu_lbr_filter() to iterate over all LBR entries and perform
high-overhead instruction decoding in NMI context even for generic branch
requests?
Additionally, if user memory is paged out, copy_from_user_nmi() will fail,
causing branch_type() to return X86_BR_NONE. Does this result in valid LBR
entries being silently dropped?
[Severity: Critical]
This is a pre-existing issue, but falling back to software decoding exposes
a vulnerability in get_branch_type() when processing LBR entries in NMI context:
arch/x86/events/utils.c:get_branch_type() {
...
if (kernel_text_address(from) && !in_gate_area_no_mm(from)) {
addr = (void *)from;
bytes_read = MAX_INSN_SIZE;
...
insn_init(&insn, addr, bytes_read, is64);
ret = decode_branch_type(&insn);
...
}
If the from address (which is controlled by the LBR hardware recording a
faulting RIP) points to the very last byte of a mapped page, bytes_read is
set to 15. The instruction decoder will blindly read past the page boundary
into an unmapped guard page without exception table handlers.
Can this trigger a kernel page fault (#PF) and Oops in NMI context when
processing the PMU interrupt?
> type <= ARCH_LBR_BR_TYPE_KNOWN_MAX) {
> to_plm = kernel_ip(to) ? X86_BR_KERNEL : X86_BR_USER;
> type = arch_lbr_br_type_map[type] | to_plm;
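
The page-boundary concern above comes down to simple arithmetic. The sketch below is a user-space illustration only: sketch_safe_insn_bytes() is a hypothetical helper, the 4096-byte page size is an assumption, and nothing here claims to be how get_branch_type() is or should be fixed.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE     4096u	/* assumed page size */
#define SKETCH_MAX_INSN_SIZE 15u	/* longest x86 instruction, in bytes */

/*
 * Number of bytes that can be read starting at addr without crossing
 * into the next page, capped at the maximum instruction length.
 */
size_t sketch_safe_insn_bytes(uintptr_t addr)
{
	size_t to_page_end = SKETCH_PAGE_SIZE - (addr & (SKETCH_PAGE_SIZE - 1));

	return to_page_end < SKETCH_MAX_INSN_SIZE ? to_page_end
						  : SKETCH_MAX_INSN_SIZE;
}
```

For an address on the last byte of a page this yields 1 rather than 15. Clamping alone would truncate an instruction that legitimately spans two mapped pages, so a fault-tolerant read of the full 15 bytes may be the better fit; the sketch only shows where the out-of-page bytes come from.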
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260609050222.2458129-1-dapeng1.mi@linux.intel.com?part=4
* Re: [Patch v2 9/9] perf/core: Check kernel access when kernel callchains are requested
2026-06-09 5:02 ` [Patch v2 9/9] perf/core: Check kernel access when kernel callchains are requested Dapeng Mi
@ 2026-06-09 5:24 ` sashiko-bot
0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-09 5:24 UTC (permalink / raw)
To: Dapeng Mi; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Critical] The patch breaks `perf record -g` for unprivileged users with existing `perf` binaries.
--
commit ac223ffdf5e79c1aa96a91deecdff94004bbe30f
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
perf/core: Check kernel access when kernel callchains are requested
This commit updates perf_event_open to verify kernel access permissions
when kernel callchains are requested, even if exclude_kernel is set.
This is intended to prevent unprivileged capture of kernel callchains.
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
[ ... ]
> @@ -13904,7 +13904,9 @@ SYSCALL_DEFINE5(perf_event_open,
> if (err)
> return err;
>
> - if (!attr.exclude_kernel) {
> + if (!attr.exclude_kernel ||
> + ((attr.sample_type & PERF_SAMPLE_CALLCHAIN) &&
> + !attr.exclude_callchain_kernel)) {
> err = perf_allow_kernel();
[Severity: Critical]
Does this change break perf record -g for unprivileged users using
existing perf binaries?
When an unprivileged user runs perf record -g, the initial syscall
request fails with -EACCES. The perf tool's existing fallback logic
retries by setting exclude_kernel = 1, but it leaves
exclude_callchain_kernel = 0.
Because this new check enforces the privilege requirement for callchains
even when exclude_kernel is set, the retried syscall will fail a second
time with -EACCES.
Looking at the perf tool's fallback handler:
tools/perf/util/evsel.c:evsel__fallback() {
...
} else if (err == EACCES && !evsel->core.attr.exclude_kernel &&
(paranoid = perf_event_paranoid()) > 1) {
...
}
Since exclude_kernel was set to 1 during the first fallback, the tool
cannot handle the second -EACCES failure. Will this cause the tool to
abort entirely with a permission denied error rather than falling back
to user-only callchains?
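
For illustration, a fallback that would survive the new check needs to relax both attributes before retrying. The sketch below is a hypothetical user-space model, not the code in tools/perf/util/evsel.c: struct sketch_fallback_attr and sketch_fallback() are invented names, and the perf_event_paranoid() test from the real handler is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical subset of the event attributes touched by the fallback. */
struct sketch_fallback_attr {
	bool exclude_kernel;
	bool exclude_callchain_kernel;
};

/*
 * On EACCES, drop kernel samples and kernel callchains together and
 * return true so the caller retries. Return false when the error is
 * something else or nothing is left to relax.
 */
bool sketch_fallback(struct sketch_fallback_attr *attr, int err)
{
	if (err != EACCES)
		return false;

	if (attr->exclude_kernel && attr->exclude_callchain_kernel)
		return false;

	attr->exclude_kernel = true;
	attr->exclude_callchain_kernel = true;
	return true;
}
```

This does not help perf binaries that are already deployed, which is the compatibility problem raised above; it only shows what a retry that passes the new check would have to set.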
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260609050222.2458129-1-dapeng1.mi@linux.intel.com?part=9
* Re: [Patch v2 6/9] perf/x86/intel: Validate return value of intel_pmu_init_hybrid()
2026-06-09 5:02 ` [Patch v2 6/9] perf/x86/intel: Validate return value of intel_pmu_init_hybrid() Dapeng Mi
@ 2026-06-09 5:25 ` sashiko-bot
0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-09 5:25 UTC (permalink / raw)
To: Dapeng Mi; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] Unnecessary error handling logic for memory allocation failure in early boot (__init) function
--
commit 3dd0c7770a2df131c0ee92d24ad34bc579754b8e
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
perf/x86/intel: Validate return value of intel_pmu_init_hybrid()
This commit adds checks for the return value of intel_pmu_init_hybrid()
to prevent accessing the uninitialized x86_pmu.hybrid_pmu[] array on
allocation failure. It also introduces an error path to free the created
task_ctx_cache kmem cache if this failure occurs.
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 532a089e16762..57b1a4bd3facf 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
[ ... ]
> @@ -8540,7 +8541,9 @@ __init int intel_pmu_init(void)
> *
> * Initialize the common PerfMon capabilities here.
> */
> - intel_pmu_init_hybrid(hybrid_big_small);
> + ret = intel_pmu_init_hybrid(hybrid_big_small);
> + if (ret < 0)
> + goto err;
[ ... ]
> @@ -8886,6 +8901,10 @@ __init int intel_pmu_init(void)
> intel_aux_output_init();
>
> return 0;
> +
> +err:
> + kmem_cache_destroy(x86_get_pmu(smp_processor_id())->task_ctx_cache);
> + return ret;
> }
[Severity: Low]
Is this cleanup logic necessary for an __init function?
Memory allocation failures during early boot generally mean the system cannot
boot anyway. Functions executed only at this stage typically do not need to
handle memory allocation failures gracefully, making this kind of error
handling and cleanup logic effectively dead code.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260609050222.2458129-1-dapeng1.mi@linux.intel.com?part=6