public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
       [not found] <20260228053320.140406-1-dapeng1.mi@linux.intel.com>
@ 2026-02-28  5:33 ` Dapeng Mi
  2026-03-07  1:27   ` Chen, Zide
                     ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Dapeng Mi @ 2026-02-28  5:33 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi, stable

When running the command:
'perf record -e "{instructions,instructions:p}" -j any,counter sleep 1',
a "shift-out-of-bounds" warning is reported on CWF.

[ 5231.981423][   C17] UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
[ 5231.981428][   C17] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[ 5231.981436][   C17] CPU: 17 UID: 0 PID: 211871 Comm: sleep Tainted: G S      W           6.18.0-2025-12-09-intel-next-48166-g6cf574943ba3 #1 PREEMPT(none)
[ 5231.981445][   C17] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
[ 5231.981447][   C17] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.IPC.3544.P98.2508260307 08/26/2025
[ 5231.981449][   C17] Call Trace:
[ 5231.981453][   C17]  <NMI>
[ 5231.981455][   C17]  dump_stack_lvl+0x4b/0x70
[ 5231.981463][   C17]  ubsan_epilogue+0x5/0x2b
[ 5231.981468][   C17]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
[ 5231.981472][   C17]  ? __entry_text_end+0x158b/0x102259
[ 5231.981475][   C17]  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
[ 5231.981480][   C17]  ? __task_pid_nr_ns+0x134/0x2a0
[ 5231.981483][   C17]  ? __pfx_intel_pmu_lbr_counters_reorder.isra.0+0x10/0x10
[ 5231.981486][   C17]  ? __pfx_perf_output_sample+0x10/0x10
[ 5231.981489][   C17]  ? arch_perf_update_userpage+0x293/0x310
[ 5231.981491][   C17]  ? __pfx_arch_perf_update_userpage+0x10/0x10
[ 5231.981494][   C17]  ? local_clock_noinstr+0xd/0x100
[ 5231.981498][   C17]  ? calc_timer_values+0x2cb/0x860
[ 5231.981501][   C17]  ? perf_event_update_userpage+0x399/0x5b0
[ 5231.981505][   C17]  ? __pfx_perf_event_update_userpage+0x10/0x10
[ 5231.981508][   C17]  ? local_clock_noinstr+0xd/0x100
[ 5231.981511][   C17]  ? __perf_event_account_interrupt+0x11c/0x540
[ 5231.981514][   C17]  intel_pmu_lbr_save_brstack+0xc0/0x4c0
[ 5231.981518][   C17]  setup_arch_pebs_sample_data+0x114b/0x2400
[ 5231.981522][   C17]  ? __pfx_x86_perf_event_set_period+0x10/0x10
[ 5231.981526][   C17]  intel_pmu_drain_arch_pebs+0x64d/0xcc0
[ 5231.981530][   C17]  ? __pfx_intel_pmu_drain_arch_pebs+0x10/0x10
[ 5231.981534][   C17]  ? unwind_next_frame+0x11c5/0x1df0
[ 5231.981541][   C17]  ? intel_pmu_drain_bts_buffer+0xbf/0x6e0
[ 5231.981545][   C17]  ? __pfx_intel_pmu_drain_bts_buffer+0x10/0x10
[ 5231.981550][   C17]  handle_pmi_common+0x5c5/0xcb0
[ 5231.981553][   C17]  ? __pfx_handle_pmi_common+0x10/0x10
[ 5231.981556][   C17]  ? intel_idle+0x64/0xb0
[ 5231.981560][   C17]  ? intel_bts_interrupt+0xe5/0x4c0
[ 5231.981562][   C17]  ? __pfx_intel_bts_interrupt+0x10/0x10
[ 5231.981565][   C17]  ? intel_pmu_lbr_filter+0x27f/0x910
[ 5231.981568][   C17]  intel_pmu_handle_irq+0x2ed/0x600
[ 5231.981571][   C17]  perf_event_nmi_handler+0x219/0x280
[ 5231.981575][   C17]  ? __pfx_perf_event_nmi_handler+0x10/0x10
[ 5231.981579][   C17]  ? unwind_next_frame+0x11c5/0x1df0
[ 5231.981582][   C17]  nmi_handle.part.0+0x11b/0x3a0
[ 5231.981585][   C17]  ? unwind_next_frame+0x11c5/0x1df0
[ 5231.981588][   C17]  default_do_nmi+0x6b/0x180
[ 5231.981591][   C17]  fred_exc_nmi+0x3e/0x80
[ 5231.981594][   C17]  asm_fred_entrypoint_kernel+0x41/0x60
[ 5231.981596][   C17] RIP: 0010:unwind_next_frame+0x11c5/0x1df0
......

The warning occurs because the second "instructions:p" event, which
involves branch counters sampling, is incorrectly programmed on fixed
counter 0 instead of the general-purpose (GP) counters 0-3 that support
branch counters sampling. Currently only GP counters 0-3 support branch
counters sampling on CWF, so any event involving branch counters
sampling must be programmed on GP counters 0-3. Since the counter index
of fixed counter 0 is 32, the "src" value in the code below is
right-shifted by 64 bits, which triggers the "shift-out-of-bounds"
warning.

cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;

The root cause is that the branch counters constraint is missing for the
last event in the branch counters sampling event group. This results in
the second "instructions:p" event incorrectly being programmed on fixed
counter 0 instead of the appropriate GP counters 0-3.

To address this, we apply the missing branch counters constraint for
the last event in the group. Additionally, we introduce a new function,
`intel_set_branch_counter_constr()`, to apply the branch counters
constraint and avoid code duplication.

Cc: stable@vger.kernel.org
Reported-by: Xudong Hao <xudong.hao@intel.com>
Fixes: 33744916196b ("perf/x86/intel: Support branch counters logging")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 4768236c054b..4b042d71104f 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
 		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
 }
 
+static inline int intel_set_branch_counter_constr(struct perf_event *event,
+						  int *num)
+{
+	if (branch_sample_call_stack(event))
+		return -EINVAL;
+	if (branch_sample_counters(event)) {
+		(*num)++;
+		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
+	}
+
+	return 0;
+}
+
 static int intel_pmu_hw_config(struct perf_event *event)
 {
 	int ret = x86_pmu_hw_config(event);
@@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		 * group, which requires the extra space to store the counters.
 		 */
 		leader = event->group_leader;
-		if (branch_sample_call_stack(leader))
+		if (intel_set_branch_counter_constr(leader, &num))
 			return -EINVAL;
-		if (branch_sample_counters(leader)) {
-			num++;
-			leader->hw.dyn_constraint &= x86_pmu.lbr_counters;
-		}
 		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
 
 		for_each_sibling_event(sibling, leader) {
-			if (branch_sample_call_stack(sibling))
+			if (intel_set_branch_counter_constr(sibling, &num))
+				return -EINVAL;
+		}
+
+		if (event != leader) {
+			if (intel_set_branch_counter_constr(event, &num))
 				return -EINVAL;
-			if (branch_sample_counters(sibling)) {
-				num++;
-				sibling->hw.dyn_constraint &= x86_pmu.lbr_counters;
-			}
 		}
 
 		if (num > fls(x86_pmu.lbr_counters))
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-02-28  5:33 ` [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply Dapeng Mi
@ 2026-03-07  1:27   ` Chen, Zide
  2026-03-11 20:03     ` Peter Zijlstra
  2026-03-11 20:16   ` Peter Zijlstra
  2026-03-16  9:50   ` [tip: perf/urgent] " tip-bot2 for Dapeng Mi
  2 siblings, 1 reply; 10+ messages in thread
From: Chen, Zide @ 2026-03-07  1:27 UTC (permalink / raw)
  To: Dapeng Mi, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Falcon Thomas,
	Xudong Hao, stable



On 2/27/2026 9:33 PM, Dapeng Mi wrote:
> When running the command:
> 'perf record -e "{instructions,instructions:p}" -j any,counter sleep 1',
> a "shift-out-of-bounds" warning is reported on CWF.
> 
> [ 5231.981423][   C17] UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
> [ 5231.981428][   C17] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
> [ 5231.981436][   C17] CPU: 17 UID: 0 PID: 211871 Comm: sleep Tainted: G S      W           6.18.0-2025-12-09-intel-next-48166-g6cf574943ba3 #1 PREEMPT(none)
> [ 5231.981445][   C17] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
> [ 5231.981447][   C17] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.IPC.3544.P98.2508260307 08/26/2025
> [ 5231.981449][   C17] Call Trace:
> [ 5231.981453][   C17]  <NMI>
> [ 5231.981455][   C17]  dump_stack_lvl+0x4b/0x70
> [ 5231.981463][   C17]  ubsan_epilogue+0x5/0x2b
> [ 5231.981468][   C17]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
> [ 5231.981472][   C17]  ? __entry_text_end+0x158b/0x102259
> [ 5231.981475][   C17]  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
> [ 5231.981480][   C17]  ? __task_pid_nr_ns+0x134/0x2a0
> [ 5231.981483][   C17]  ? __pfx_intel_pmu_lbr_counters_reorder.isra.0+0x10/0x10
> [ 5231.981486][   C17]  ? __pfx_perf_output_sample+0x10/0x10
> [ 5231.981489][   C17]  ? arch_perf_update_userpage+0x293/0x310
> [ 5231.981491][   C17]  ? __pfx_arch_perf_update_userpage+0x10/0x10
> [ 5231.981494][   C17]  ? local_clock_noinstr+0xd/0x100
> [ 5231.981498][   C17]  ? calc_timer_values+0x2cb/0x860
> [ 5231.981501][   C17]  ? perf_event_update_userpage+0x399/0x5b0
> [ 5231.981505][   C17]  ? __pfx_perf_event_update_userpage+0x10/0x10
> [ 5231.981508][   C17]  ? local_clock_noinstr+0xd/0x100
> [ 5231.981511][   C17]  ? __perf_event_account_interrupt+0x11c/0x540
> [ 5231.981514][   C17]  intel_pmu_lbr_save_brstack+0xc0/0x4c0
> [ 5231.981518][   C17]  setup_arch_pebs_sample_data+0x114b/0x2400
> [ 5231.981522][   C17]  ? __pfx_x86_perf_event_set_period+0x10/0x10
> [ 5231.981526][   C17]  intel_pmu_drain_arch_pebs+0x64d/0xcc0
> [ 5231.981530][   C17]  ? __pfx_intel_pmu_drain_arch_pebs+0x10/0x10
> [ 5231.981534][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> [ 5231.981541][   C17]  ? intel_pmu_drain_bts_buffer+0xbf/0x6e0
> [ 5231.981545][   C17]  ? __pfx_intel_pmu_drain_bts_buffer+0x10/0x10
> [ 5231.981550][   C17]  handle_pmi_common+0x5c5/0xcb0
> [ 5231.981553][   C17]  ? __pfx_handle_pmi_common+0x10/0x10
> [ 5231.981556][   C17]  ? intel_idle+0x64/0xb0
> [ 5231.981560][   C17]  ? intel_bts_interrupt+0xe5/0x4c0
> [ 5231.981562][   C17]  ? __pfx_intel_bts_interrupt+0x10/0x10
> [ 5231.981565][   C17]  ? intel_pmu_lbr_filter+0x27f/0x910
> [ 5231.981568][   C17]  intel_pmu_handle_irq+0x2ed/0x600
> [ 5231.981571][   C17]  perf_event_nmi_handler+0x219/0x280
> [ 5231.981575][   C17]  ? __pfx_perf_event_nmi_handler+0x10/0x10
> [ 5231.981579][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> [ 5231.981582][   C17]  nmi_handle.part.0+0x11b/0x3a0
> [ 5231.981585][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> [ 5231.981588][   C17]  default_do_nmi+0x6b/0x180
> [ 5231.981591][   C17]  fred_exc_nmi+0x3e/0x80
> [ 5231.981594][   C17]  asm_fred_entrypoint_kernel+0x41/0x60
> [ 5231.981596][   C17] RIP: 0010:unwind_next_frame+0x11c5/0x1df0
> ......
> 
> The warning occurs because the second "instructions:p" event, which
> involves branch counters sampling, is incorrectly programmed to fixed
> counter 0 instead of the general-purpose (GP) counters 0-3 that support
> branch counters sampling. Currently only GP counters 0~3 support branch
> counters sampling on CWF, any event involving branch counters sampling
> should be programed on GP counters 0~3.
> Since the counter index of fixed
> counter 0 is 32, it leads to the "src" value in below code is right
> shifted 64 bits and trigger the "shift-out-of-bounds" warning.
> 
> cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;
> 
> The root cause is the loss of the branch counters constraint for the
> last event in the branch counters sampling event group. This results in
> the second "instructions:p" event being programmed on fixed counter 0
> incorrectly instead of the appropriate GP counters 0~3.
> 
> To address this, we apply the missing branch counters constraint for
> the last event in the group. Additionally, we introduce a new function,
> `intel_set_branch_counter_constr()`, to apply the branch counters
> constraint and avoid code duplication.
> 
> Cc: stable@vger.kernel.org
> Reported-by: Xudong Hao <xudong.hao@intel.com>
> Fixes: 33744916196b ("perf/x86/intel: Support branch counters logging")
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---

Reviewed-by: Zide Chen <zide.chen@intel.com>

>  arch/x86/events/intel/core.c | 30 ++++++++++++++++++++----------
>  1 file changed, 20 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 4768236c054b..4b042d71104f 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
>  		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
>  }
>  
> +static inline int intel_set_branch_counter_constr(struct perf_event *event,
> +						  int *num)
> +{
> +	if (branch_sample_call_stack(event))
> +		return -EINVAL;
> +	if (branch_sample_counters(event)) {
> +		(*num)++;
> +		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
> +	}
> +
> +	return 0;
> +}
> +
>  static int intel_pmu_hw_config(struct perf_event *event)
>  {
>  	int ret = x86_pmu_hw_config(event);
> @@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
>  		 * group, which requires the extra space to store the counters.
>  		 */
>  		leader = event->group_leader;
> -		if (branch_sample_call_stack(leader))
> +		if (intel_set_branch_counter_constr(leader, &num))
>  			return -EINVAL;
> -		if (branch_sample_counters(leader)) {
> -			num++;
> -			leader->hw.dyn_constraint &= x86_pmu.lbr_counters;
> -		}
>  		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
>  
>  		for_each_sibling_event(sibling, leader) {
> -			if (branch_sample_call_stack(sibling))
> +			if (intel_set_branch_counter_constr(sibling, &num))
> +				return -EINVAL;
> +		}
> +
> +		if (event != leader) {
> +			if (intel_set_branch_counter_constr(event, &num))
>  				return -EINVAL;
> -			if (branch_sample_counters(sibling)) {
> -				num++;
> -				sibling->hw.dyn_constraint &= x86_pmu.lbr_counters;
> -			}
>  		}
>  
>  		if (num > fls(x86_pmu.lbr_counters))


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-07  1:27   ` Chen, Zide
@ 2026-03-11 20:03     ` Peter Zijlstra
  2026-03-12  2:02       ` Mi, Dapeng
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2026-03-11 20:03 UTC (permalink / raw)
  To: Chen, Zide
  Cc: Dapeng Mi, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Falcon Thomas, Xudong Hao, stable

On Fri, Mar 06, 2026 at 05:27:35PM -0800, Chen, Zide wrote:
> 
> 
> On 2/27/2026 9:33 PM, Dapeng Mi wrote:
> > When running the command:
> > 'perf record -e "{instructions,instructions:p}" -j any,counter sleep 1',
> > a "shift-out-of-bounds" warning is reported on CWF.
> > 
> > [ 5231.981423][   C17] UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
> > [ 5231.981428][   C17] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
> > [ 5231.981436][   C17] CPU: 17 UID: 0 PID: 211871 Comm: sleep Tainted: G S      W           6.18.0-2025-12-09-intel-next-48166-g6cf574943ba3 #1 PREEMPT(none)
> > [ 5231.981445][   C17] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
> > [ 5231.981447][   C17] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.IPC.3544.P98.2508260307 08/26/2025
> > [ 5231.981449][   C17] Call Trace:
> > [ 5231.981453][   C17]  <NMI>
> > [ 5231.981455][   C17]  dump_stack_lvl+0x4b/0x70
> > [ 5231.981463][   C17]  ubsan_epilogue+0x5/0x2b
> > [ 5231.981468][   C17]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
> > [ 5231.981472][   C17]  ? __entry_text_end+0x158b/0x102259
> > [ 5231.981475][   C17]  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
> > [ 5231.981480][   C17]  ? __task_pid_nr_ns+0x134/0x2a0
> > [ 5231.981483][   C17]  ? __pfx_intel_pmu_lbr_counters_reorder.isra.0+0x10/0x10
> > [ 5231.981486][   C17]  ? __pfx_perf_output_sample+0x10/0x10
> > [ 5231.981489][   C17]  ? arch_perf_update_userpage+0x293/0x310
> > [ 5231.981491][   C17]  ? __pfx_arch_perf_update_userpage+0x10/0x10
> > [ 5231.981494][   C17]  ? local_clock_noinstr+0xd/0x100
> > [ 5231.981498][   C17]  ? calc_timer_values+0x2cb/0x860
> > [ 5231.981501][   C17]  ? perf_event_update_userpage+0x399/0x5b0
> > [ 5231.981505][   C17]  ? __pfx_perf_event_update_userpage+0x10/0x10
> > [ 5231.981508][   C17]  ? local_clock_noinstr+0xd/0x100
> > [ 5231.981511][   C17]  ? __perf_event_account_interrupt+0x11c/0x540
> > [ 5231.981514][   C17]  intel_pmu_lbr_save_brstack+0xc0/0x4c0
> > [ 5231.981518][   C17]  setup_arch_pebs_sample_data+0x114b/0x2400
> > [ 5231.981522][   C17]  ? __pfx_x86_perf_event_set_period+0x10/0x10
> > [ 5231.981526][   C17]  intel_pmu_drain_arch_pebs+0x64d/0xcc0
> > [ 5231.981530][   C17]  ? __pfx_intel_pmu_drain_arch_pebs+0x10/0x10
> > [ 5231.981534][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> > [ 5231.981541][   C17]  ? intel_pmu_drain_bts_buffer+0xbf/0x6e0
> > [ 5231.981545][   C17]  ? __pfx_intel_pmu_drain_bts_buffer+0x10/0x10
> > [ 5231.981550][   C17]  handle_pmi_common+0x5c5/0xcb0
> > [ 5231.981553][   C17]  ? __pfx_handle_pmi_common+0x10/0x10
> > [ 5231.981556][   C17]  ? intel_idle+0x64/0xb0
> > [ 5231.981560][   C17]  ? intel_bts_interrupt+0xe5/0x4c0
> > [ 5231.981562][   C17]  ? __pfx_intel_bts_interrupt+0x10/0x10
> > [ 5231.981565][   C17]  ? intel_pmu_lbr_filter+0x27f/0x910
> > [ 5231.981568][   C17]  intel_pmu_handle_irq+0x2ed/0x600
> > [ 5231.981571][   C17]  perf_event_nmi_handler+0x219/0x280
> > [ 5231.981575][   C17]  ? __pfx_perf_event_nmi_handler+0x10/0x10
> > [ 5231.981579][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> > [ 5231.981582][   C17]  nmi_handle.part.0+0x11b/0x3a0
> > [ 5231.981585][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> > [ 5231.981588][   C17]  default_do_nmi+0x6b/0x180
> > [ 5231.981591][   C17]  fred_exc_nmi+0x3e/0x80
> > [ 5231.981594][   C17]  asm_fred_entrypoint_kernel+0x41/0x60
> > [ 5231.981596][   C17] RIP: 0010:unwind_next_frame+0x11c5/0x1df0
> > ......

That trace could be reduced to:

  UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
  shift exponent 64 is too large for 64-bit type 'long long unsigned int'
  ......
  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
  intel_pmu_lbr_save_brstack+0xc0/0x4c0
  setup_arch_pebs_sample_data+0x114b/0x2400

Without losing anything valuable.


> > The warning occurs because the second "instructions:p" event, which
> > involves branch counters sampling, is incorrectly programmed to fixed
> > counter 0 instead of the general-purpose (GP) counters 0-3 that support

So here you have 0-3, the normal 'range' notation, but then you go all
funny and use ~ instead:

> > branch counters sampling. Currently only GP counters 0~3 support branch
> > counters sampling on CWF, any event involving branch counters sampling
> > should be programed on GP counters 0~3.
> > Since the counter index of fixed
> > counter 0 is 32, it leads to the "src" value in below code is right
> > shifted 64 bits and trigger the "shift-out-of-bounds" warning.
> > 
> > cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;
> > 
> > The root cause is the loss of the branch counters constraint for the
> > last event in the branch counters sampling event group. This results in
> > the second "instructions:p" event being programmed on fixed counter 0
> > incorrectly instead of the appropriate GP counters 0~3.

s/0~3/0-3/ ?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-02-28  5:33 ` [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply Dapeng Mi
  2026-03-07  1:27   ` Chen, Zide
@ 2026-03-11 20:16   ` Peter Zijlstra
  2026-03-12  2:31     ` Mi, Dapeng
  2026-03-16  9:50   ` [tip: perf/urgent] " tip-bot2 for Dapeng Mi
  2 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2026-03-11 20:16 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, stable

On Sat, Feb 28, 2026 at 01:33:20PM +0800, Dapeng Mi wrote:
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 4768236c054b..4b042d71104f 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
>  		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
>  }
>  
> +static inline int intel_set_branch_counter_constr(struct perf_event *event,
> +						  int *num)
> +{
> +	if (branch_sample_call_stack(event))
> +		return -EINVAL;
> +	if (branch_sample_counters(event)) {
> +		(*num)++;
> +		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
> +	}
> +
> +	return 0;
> +}
> +
>  static int intel_pmu_hw_config(struct perf_event *event)
>  {
>  	int ret = x86_pmu_hw_config(event);
> @@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
>  		 * group, which requires the extra space to store the counters.
>  		 */
>  		leader = event->group_leader;
> +		if (intel_set_branch_counter_constr(leader, &num))
>  			return -EINVAL;
>  		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
>  
>  		for_each_sibling_event(sibling, leader) {
> +			if (intel_set_branch_counter_constr(sibling, &num))
> +				return -EINVAL;
> +		}
> +

So the new bit is this, right?

> +		if (event != leader) {
> +			if (intel_set_branch_counter_constr(event, &num))
>  				return -EINVAL;
>  		}

The point being that for_each_sibling_event() will not have iterated the
event because it's not on the list yet?

That wasn't really clear from the changelog and I think that deserves a
comment as well.

Let me go fix that.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-11 20:03     ` Peter Zijlstra
@ 2026-03-12  2:02       ` Mi, Dapeng
  0 siblings, 0 replies; 10+ messages in thread
From: Mi, Dapeng @ 2026-03-12  2:02 UTC (permalink / raw)
  To: Peter Zijlstra, Chen, Zide
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Falcon Thomas,
	Xudong Hao, stable


On 3/12/2026 4:03 AM, Peter Zijlstra wrote:
> On Fri, Mar 06, 2026 at 05:27:35PM -0800, Chen, Zide wrote:
>>
>> On 2/27/2026 9:33 PM, Dapeng Mi wrote:
>>> When running the command:
>>> 'perf record -e "{instructions,instructions:p}" -j any,counter sleep 1',
>>> a "shift-out-of-bounds" warning is reported on CWF.
>>>
>>> [ 5231.981423][   C17] UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
>>> [ 5231.981428][   C17] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
>>> [ 5231.981436][   C17] CPU: 17 UID: 0 PID: 211871 Comm: sleep Tainted: G S      W           6.18.0-2025-12-09-intel-next-48166-g6cf574943ba3 #1 PREEMPT(none)
>>> [ 5231.981445][   C17] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
>>> [ 5231.981447][   C17] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.IPC.3544.P98.2508260307 08/26/2025
>>> [ 5231.981449][   C17] Call Trace:
>>> [ 5231.981453][   C17]  <NMI>
>>> [ 5231.981455][   C17]  dump_stack_lvl+0x4b/0x70
>>> [ 5231.981463][   C17]  ubsan_epilogue+0x5/0x2b
>>> [ 5231.981468][   C17]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
>>> [ 5231.981472][   C17]  ? __entry_text_end+0x158b/0x102259
>>> [ 5231.981475][   C17]  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
>>> [ 5231.981480][   C17]  ? __task_pid_nr_ns+0x134/0x2a0
>>> [ 5231.981483][   C17]  ? __pfx_intel_pmu_lbr_counters_reorder.isra.0+0x10/0x10
>>> [ 5231.981486][   C17]  ? __pfx_perf_output_sample+0x10/0x10
>>> [ 5231.981489][   C17]  ? arch_perf_update_userpage+0x293/0x310
>>> [ 5231.981491][   C17]  ? __pfx_arch_perf_update_userpage+0x10/0x10
>>> [ 5231.981494][   C17]  ? local_clock_noinstr+0xd/0x100
>>> [ 5231.981498][   C17]  ? calc_timer_values+0x2cb/0x860
>>> [ 5231.981501][   C17]  ? perf_event_update_userpage+0x399/0x5b0
>>> [ 5231.981505][   C17]  ? __pfx_perf_event_update_userpage+0x10/0x10
>>> [ 5231.981508][   C17]  ? local_clock_noinstr+0xd/0x100
>>> [ 5231.981511][   C17]  ? __perf_event_account_interrupt+0x11c/0x540
>>> [ 5231.981514][   C17]  intel_pmu_lbr_save_brstack+0xc0/0x4c0
>>> [ 5231.981518][   C17]  setup_arch_pebs_sample_data+0x114b/0x2400
>>> [ 5231.981522][   C17]  ? __pfx_x86_perf_event_set_period+0x10/0x10
>>> [ 5231.981526][   C17]  intel_pmu_drain_arch_pebs+0x64d/0xcc0
>>> [ 5231.981530][   C17]  ? __pfx_intel_pmu_drain_arch_pebs+0x10/0x10
>>> [ 5231.981534][   C17]  ? unwind_next_frame+0x11c5/0x1df0
>>> [ 5231.981541][   C17]  ? intel_pmu_drain_bts_buffer+0xbf/0x6e0
>>> [ 5231.981545][   C17]  ? __pfx_intel_pmu_drain_bts_buffer+0x10/0x10
>>> [ 5231.981550][   C17]  handle_pmi_common+0x5c5/0xcb0
>>> [ 5231.981553][   C17]  ? __pfx_handle_pmi_common+0x10/0x10
>>> [ 5231.981556][   C17]  ? intel_idle+0x64/0xb0
>>> [ 5231.981560][   C17]  ? intel_bts_interrupt+0xe5/0x4c0
>>> [ 5231.981562][   C17]  ? __pfx_intel_bts_interrupt+0x10/0x10
>>> [ 5231.981565][   C17]  ? intel_pmu_lbr_filter+0x27f/0x910
>>> [ 5231.981568][   C17]  intel_pmu_handle_irq+0x2ed/0x600
>>> [ 5231.981571][   C17]  perf_event_nmi_handler+0x219/0x280
>>> [ 5231.981575][   C17]  ? __pfx_perf_event_nmi_handler+0x10/0x10
>>> [ 5231.981579][   C17]  ? unwind_next_frame+0x11c5/0x1df0
>>> [ 5231.981582][   C17]  nmi_handle.part.0+0x11b/0x3a0
>>> [ 5231.981585][   C17]  ? unwind_next_frame+0x11c5/0x1df0
>>> [ 5231.981588][   C17]  default_do_nmi+0x6b/0x180
>>> [ 5231.981591][   C17]  fred_exc_nmi+0x3e/0x80
>>> [ 5231.981594][   C17]  asm_fred_entrypoint_kernel+0x41/0x60
>>> [ 5231.981596][   C17] RIP: 0010:unwind_next_frame+0x11c5/0x1df0
>>> ......
> That trace could be reduced to:
>
>   UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
>   shift exponent 64 is too large for 64-bit type 'long long unsigned int'
>   ......
>   intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
>   intel_pmu_lbr_save_brstack+0xc0/0x4c0
>   setup_arch_pebs_sample_data+0x114b/0x2400
>
> Without loosing anything valuable.

Sure.


>
>
>>> The warning occurs because the second "instructions:p" event, which
>>> involves branch counters sampling, is incorrectly programmed to fixed
>>> counter 0 instead of the general-purpose (GP) counters 0-3 that support
> So here you have 0-3, the normal 'range' notation, but then you go all
> funny and use ~ instead:

😂


>
>>> branch counters sampling. Currently only GP counters 0~3 support branch
>>> counters sampling on CWF, any event involving branch counters sampling
>>> should be programed on GP counters 0~3.
>>> Since the counter index of fixed
>>> counter 0 is 32, it leads to the "src" value in below code is right
>>> shifted 64 bits and trigger the "shift-out-of-bounds" warning.
>>>
>>> cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;
>>>
>>> The root cause is the loss of the branch counters constraint for the
>>> last event in the branch counters sampling event group. This results in
>>> the second "instructions:p" event being programmed on fixed counter 0
>>> incorrectly instead of the appropriate GP counters 0~3.
> s/0~3/0-3/ ?

Sure.


>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-11 20:16   ` Peter Zijlstra
@ 2026-03-12  2:31     ` Mi, Dapeng
  2026-03-12  6:41       ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Mi, Dapeng @ 2026-03-12  2:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, stable


On 3/12/2026 4:16 AM, Peter Zijlstra wrote:
> On Sat, Feb 28, 2026 at 01:33:20PM +0800, Dapeng Mi wrote:
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 4768236c054b..4b042d71104f 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
>>  		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
>>  }
>>  
>> +static inline int intel_set_branch_counter_constr(struct perf_event *event,
>> +						  int *num)
>> +{
>> +	if (branch_sample_call_stack(event))
>> +		return -EINVAL;
>> +	if (branch_sample_counters(event)) {
>> +		(*num)++;
>> +		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>  static int intel_pmu_hw_config(struct perf_event *event)
>>  {
>>  	int ret = x86_pmu_hw_config(event);
>> @@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
>>  		 * group, which requires the extra space to store the counters.
>>  		 */
>>  		leader = event->group_leader;
>> +		if (intel_set_branch_counter_constr(leader, &num))
>>  			return -EINVAL;
>>  		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
>>  
>>  		for_each_sibling_event(sibling, leader) {
>> +			if (intel_set_branch_counter_constr(sibling, &num))
>> +				return -EINVAL;
>> +		}
>> +
> Do the new bit is this, right?

Actually no, the key change is the one below. The branch counter
constraint is not applied to the last event in the group.

Assume we have an event group {cycles,instructions,branches}. When the
3rd event "branches" is created, intel_pmu_hw_config() is called for it
to check the config. At that point the group leader is "cycles" and the
sibling list contains only the "instructions" event, since the 3rd event
"branches" is still being created and has not yet been added to the
sibling_list. So for_each_sibling_event() cannot iterate the "branches"
event.


>
>> +		if (event != leader) {
>> +			if (intel_set_branch_counter_constr(event, &num))
>>  				return -EINVAL;
>>  		}
> The point being that for_each_sibling_event() will not have iterated the
> event because its not on the list yet?

Yes. 


>
> That wasn't really clear from the changelog and I think that deserves a
> comment as well.

Sure. I will add a comment and enhance the changelog to make it clearer. Thanks.


>
> Let me go fix that.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-12  2:31     ` Mi, Dapeng
@ 2026-03-12  6:41       ` Peter Zijlstra
  2026-03-12  6:52         ` Mi, Dapeng
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2026-03-12  6:41 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, stable

On Thu, Mar 12, 2026 at 10:31:28AM +0800, Mi, Dapeng wrote:
> 
> On 3/12/2026 4:16 AM, Peter Zijlstra wrote:
> > On Sat, Feb 28, 2026 at 01:33:20PM +0800, Dapeng Mi wrote:
> >> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> >> index 4768236c054b..4b042d71104f 100644
> >> --- a/arch/x86/events/intel/core.c
> >> +++ b/arch/x86/events/intel/core.c
> >> @@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
> >>  		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
> >>  }
> >>  
> >> +static inline int intel_set_branch_counter_constr(struct perf_event *event,
> >> +						  int *num)
> >> +{
> >> +	if (branch_sample_call_stack(event))
> >> +		return -EINVAL;
> >> +	if (branch_sample_counters(event)) {
> >> +		(*num)++;
> >> +		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >>  static int intel_pmu_hw_config(struct perf_event *event)
> >>  {
> >>  	int ret = x86_pmu_hw_config(event);
> >> @@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
> >>  		 * group, which requires the extra space to store the counters.
> >>  		 */
> >>  		leader = event->group_leader;
> >> +		if (intel_set_branch_counter_constr(leader, &num))
> >>  			return -EINVAL;
> >>  		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
> >>  
> >>  		for_each_sibling_event(sibling, leader) {
> >> +			if (intel_set_branch_counter_constr(sibling, &num))
> >> +				return -EINVAL;
> >> +		}
> >> +
> > So the new bit is this, right?
> 
> Actually not; the key change is the one below. The branch counter
> constraint is not applied to the last event in the group.
> 
> Assume we have an event group {cycles,instructions,branches}. When the
> 3rd event "branches" is created, intel_pmu_hw_config() is called for
> the "branches" event to check its config. At that point the event
> leader is "cycles" and the sibling list contains only the
> "instructions" event, since the 3rd event "branches" is still being
> created and has not yet been added to the sibling_list. So
> for_each_sibling_event() cannot iterate over the "branches" event.
> 
> 
> >
> >> +		if (event != leader) {
> >> +			if (intel_set_branch_counter_constr(event, &num))
> >>  				return -EINVAL;
> >>  		}
> > The point being that for_each_sibling_event() will not have iterated the
> > event because its not on the list yet?
> 
> Yes. 
> 
> 
> >
> > That wasn't really clear from the changelog and I think that deserves a
> > comment as well.
> 
> Sure. I will add a comment and enhance the changelog to make it clearer. Thanks.
> 

I already fixed everything up. Should be in queue/perf/urgent.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-12  6:41       ` Peter Zijlstra
@ 2026-03-12  6:52         ` Mi, Dapeng
  2026-03-12  7:40           ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Mi, Dapeng @ 2026-03-12  6:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, stable


On 3/12/2026 2:41 PM, Peter Zijlstra wrote:
> On Thu, Mar 12, 2026 at 10:31:28AM +0800, Mi, Dapeng wrote:
>> On 3/12/2026 4:16 AM, Peter Zijlstra wrote:
>>> On Sat, Feb 28, 2026 at 01:33:20PM +0800, Dapeng Mi wrote:
>>>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>>>> index 4768236c054b..4b042d71104f 100644
>>>> --- a/arch/x86/events/intel/core.c
>>>> +++ b/arch/x86/events/intel/core.c
>>>> @@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
>>>>  		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
>>>>  }
>>>>  
>>>> +static inline int intel_set_branch_counter_constr(struct perf_event *event,
>>>> +						  int *num)
>>>> +{
>>>> +	if (branch_sample_call_stack(event))
>>>> +		return -EINVAL;
>>>> +	if (branch_sample_counters(event)) {
>>>> +		(*num)++;
>>>> +		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>>  static int intel_pmu_hw_config(struct perf_event *event)
>>>>  {
>>>>  	int ret = x86_pmu_hw_config(event);
>>>> @@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
>>>>  		 * group, which requires the extra space to store the counters.
>>>>  		 */
>>>>  		leader = event->group_leader;
>>>> +		if (intel_set_branch_counter_constr(leader, &num))
>>>>  			return -EINVAL;
>>>>  		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
>>>>  
>>>>  		for_each_sibling_event(sibling, leader) {
>>>> +			if (intel_set_branch_counter_constr(sibling, &num))
>>>> +				return -EINVAL;
>>>> +		}
>>>> +
>>> So the new bit is this, right?
>> Actually not; the key change is the one below. The branch counter
>> constraint is not applied to the last event in the group.
>>
>> Assume we have an event group {cycles,instructions,branches}. When the
>> 3rd event "branches" is created, intel_pmu_hw_config() is called for
>> the "branches" event to check its config. At that point the event
>> leader is "cycles" and the sibling list contains only the
>> "instructions" event, since the 3rd event "branches" is still being
>> created and has not yet been added to the sibling_list. So
>> for_each_sibling_event() cannot iterate over the "branches" event.
>>
>>
>>>> +		if (event != leader) {
>>>> +			if (intel_set_branch_counter_constr(event, &num))
>>>>  				return -EINVAL;
>>>>  		}
>>> The point being that for_each_sibling_event() will not have iterated the
>>> event because its not on the list yet?
>> Yes. 
>>
>>
>>> That wasn't really clear from the changelog and I think that deserves a
>>> comment as well.
>> Sure. I will add a comment and enhance the changelog to make it clearer. Thanks.
>>
> I already fixed everything up. Should be in queue/perf/urgent.

Thanks. 

Peter, as Ian points out, the patch "perf/x86: Update cap_user_rdpmc base
on rdpmc user disable state" has a bug
(https://lore.kernel.org/all/CAP-5=fWr2L6miiFZ6Km3HYEdmqp3T0NBL=WY3buKdKztW+HvmA@mail.gmail.com/).
I will post a patch to fix the issue. Thanks.



>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-12  6:52         ` Mi, Dapeng
@ 2026-03-12  7:40           ` Peter Zijlstra
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2026-03-12  7:40 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, stable

On Thu, Mar 12, 2026 at 02:52:43PM +0800, Mi, Dapeng wrote:
> 
> Peter, as Ian points out, the patch "perf/x86: Update cap_user_rdpmc base
> on rdpmc user disable state" has a bug
> (https://lore.kernel.org/all/CAP-5=fWr2L6miiFZ6Km3HYEdmqp3T0NBL=WY3buKdKztW+HvmA@mail.gmail.com/).
> I will post a patch to fix the issue. Thanks.
> 

Yep, saw that, it's gone :-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [tip: perf/urgent] perf/x86/intel: Add missing branch counters constraint apply
  2026-02-28  5:33 ` [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply Dapeng Mi
  2026-03-07  1:27   ` Chen, Zide
  2026-03-11 20:16   ` Peter Zijlstra
@ 2026-03-16  9:50   ` tip-bot2 for Dapeng Mi
  2 siblings, 0 replies; 10+ messages in thread
From: tip-bot2 for Dapeng Mi @ 2026-03-16  9:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Xudong Hao, Dapeng Mi, Peter Zijlstra (Intel), stable, x86,
	linux-kernel

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     1d07bbd7ea36ea0b8dfa8068dbe67eb3a32d9590
Gitweb:        https://git.kernel.org/tip/1d07bbd7ea36ea0b8dfa8068dbe67eb3a32d9590
Author:        Dapeng Mi <dapeng1.mi@linux.intel.com>
AuthorDate:    Sat, 28 Feb 2026 13:33:20 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 12 Mar 2026 11:29:16 +01:00

perf/x86/intel: Add missing branch counters constraint apply

When running the command:
'perf record -e "{instructions,instructions:p}" -j any,counter sleep 1',
a "shift-out-of-bounds" warning is reported on CWF.

  UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
  shift exponent 64 is too large for 64-bit type 'long long unsigned int'
  ......
  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
  intel_pmu_lbr_save_brstack+0xc0/0x4c0
  setup_arch_pebs_sample_data+0x114b/0x2400

The warning occurs because the second "instructions:p" event, which
involves branch counters sampling, is incorrectly programmed on fixed
counter 0 instead of one of the general-purpose (GP) counters 0-3. On
CWF only GP counters 0-3 support branch counters sampling, so any
event involving branch counters sampling must be programmed on them.
Since the counter index of fixed counter 0 is 32, the "src" value in
the code below is right-shifted by 64 bits, triggering the
"shift-out-of-bounds" warning.

cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;

The root cause is that the branch counters constraint is lost for the
new event in a branch counters sampling event group, because that
event is not yet part of the sibling list. As a result, the second
"instructions:p" event is incorrectly programmed on fixed counter 0
instead of the appropriate GP counters 0-3.

To address this, we apply the missing branch counters constraint to
the last event in the group. Additionally, we introduce a new function,
`intel_set_branch_counter_constr()`, to apply the branch counters
constraint and avoid code duplication.

Fixes: 33744916196b ("perf/x86/intel: Support branch counters logging")
Reported-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260228053320.140406-2-dapeng1.mi@linux.intel.com
Cc: stable@vger.kernel.org
---
 arch/x86/events/intel/core.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index cf3a4fe..36c6821 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
 		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
 }
 
+static inline int intel_set_branch_counter_constr(struct perf_event *event,
+						  int *num)
+{
+	if (branch_sample_call_stack(event))
+		return -EINVAL;
+	if (branch_sample_counters(event)) {
+		(*num)++;
+		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
+	}
+
+	return 0;
+}
+
 static int intel_pmu_hw_config(struct perf_event *event)
 {
 	int ret = x86_pmu_hw_config(event);
@@ -4698,21 +4711,19 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		 * group, which requires the extra space to store the counters.
 		 */
 		leader = event->group_leader;
-		if (branch_sample_call_stack(leader))
+		if (intel_set_branch_counter_constr(leader, &num))
 			return -EINVAL;
-		if (branch_sample_counters(leader)) {
-			num++;
-			leader->hw.dyn_constraint &= x86_pmu.lbr_counters;
-		}
 		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
 
 		for_each_sibling_event(sibling, leader) {
-			if (branch_sample_call_stack(sibling))
+			if (intel_set_branch_counter_constr(sibling, &num))
+				return -EINVAL;
+		}
+
+		/* event isn't installed as a sibling yet. */
+		if (event != leader) {
+			if (intel_set_branch_counter_constr(event, &num))
 				return -EINVAL;
-			if (branch_sample_counters(sibling)) {
-				num++;
-				sibling->hw.dyn_constraint &= x86_pmu.lbr_counters;
-			}
 		}
 
 		if (num > fls(x86_pmu.lbr_counters))

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-03-16  9:50 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20260228053320.140406-1-dapeng1.mi@linux.intel.com>
2026-02-28  5:33 ` [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply Dapeng Mi
2026-03-07  1:27   ` Chen, Zide
2026-03-11 20:03     ` Peter Zijlstra
2026-03-12  2:02       ` Mi, Dapeng
2026-03-11 20:16   ` Peter Zijlstra
2026-03-12  2:31     ` Mi, Dapeng
2026-03-12  6:41       ` Peter Zijlstra
2026-03-12  6:52         ` Mi, Dapeng
2026-03-12  7:40           ` Peter Zijlstra
2026-03-16  9:50   ` [tip: perf/urgent] " tip-bot2 for Dapeng Mi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox