public inbox for linux-perf-users@vger.kernel.org
 help / color / mirror / Atom feed
* [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation
@ 2026-02-28  5:33 Dapeng Mi
  2026-02-28  5:33 ` [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply Dapeng Mi
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Dapeng Mi @ 2026-02-28  5:33 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi

It's sufficient to check only the GP counters for PEBS constraints
validation since constraint overlap can only happen on GP counters.

Besides, opportunistically refine the code style and replace pr_info()
with pr_warn(), as the message itself is a warning.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index cf3a4fe06ff2..4768236c054b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5770,7 +5770,7 @@ static void __intel_pmu_check_dyn_constr(struct event_constraint *constr,
 			}
 
 			if (check_fail) {
-				pr_info("The two events 0x%llx and 0x%llx may not be "
+				pr_warn("The two events 0x%llx and 0x%llx may not be "
 					"fully scheduled under some circumstances as "
 					"%s.\n",
 					c1->code, c2->code, dyn_constr_type_name[type]);
@@ -5783,6 +5783,7 @@ static void intel_pmu_check_dyn_constr(struct pmu *pmu,
 				       struct event_constraint *constr,
 				       u64 cntr_mask)
 {
+	u64 gp_mask = GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
 	enum dyn_constr_type i;
 	u64 mask;
 
@@ -5797,20 +5798,25 @@ static void intel_pmu_check_dyn_constr(struct pmu *pmu,
 				mask = x86_pmu.lbr_counters;
 			break;
 		case DYN_CONSTR_ACR_CNTR:
-			mask = hybrid(pmu, acr_cntr_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
+			mask = hybrid(pmu, acr_cntr_mask64) & gp_mask;
 			break;
 		case DYN_CONSTR_ACR_CAUSE:
-			if (hybrid(pmu, acr_cntr_mask64) == hybrid(pmu, acr_cause_mask64))
+			if (hybrid(pmu, acr_cntr_mask64) ==
+					hybrid(pmu, acr_cause_mask64))
 				continue;
-			mask = hybrid(pmu, acr_cause_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
+			mask = hybrid(pmu, acr_cause_mask64) & gp_mask;
 			break;
 		case DYN_CONSTR_PEBS:
-			if (x86_pmu.arch_pebs)
-				mask = hybrid(pmu, arch_pebs_cap).counters;
+			if (x86_pmu.arch_pebs) {
+				mask = hybrid(pmu, arch_pebs_cap).counters &
+				       gp_mask;
+			}
 			break;
 		case DYN_CONSTR_PDIST:
-			if (x86_pmu.arch_pebs)
-				mask = hybrid(pmu, arch_pebs_cap).pdists;
+			if (x86_pmu.arch_pebs) {
+				mask = hybrid(pmu, arch_pebs_cap).pdists &
+				       gp_mask;
+			}
 			break;
 		default:
 			pr_warn("Unsupported dynamic constraint type %d\n", i);

base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-02-28  5:33 [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation Dapeng Mi
@ 2026-02-28  5:33 ` Dapeng Mi
  2026-03-07  1:27   ` Chen, Zide
  2026-03-11 20:16   ` Peter Zijlstra
  2026-03-07  1:27 ` [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation Chen, Zide
  2026-03-12  8:28 ` Mi, Dapeng
  2 siblings, 2 replies; 14+ messages in thread
From: Dapeng Mi @ 2026-02-28  5:33 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, Dapeng Mi, stable

When running the command:
'perf record -e "{instructions,instructions:p}" -j any,counter sleep 1',
a "shift-out-of-bounds" warning is reported on CWF.

[ 5231.981423][   C17] UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
[ 5231.981428][   C17] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[ 5231.981436][   C17] CPU: 17 UID: 0 PID: 211871 Comm: sleep Tainted: G S      W           6.18.0-2025-12-09-intel-next-48166-g6cf574943ba3 #1 PREEMPT(none)
[ 5231.981445][   C17] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
[ 5231.981447][   C17] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.IPC.3544.P98.2508260307 08/26/2025
[ 5231.981449][   C17] Call Trace:
[ 5231.981453][   C17]  <NMI>
[ 5231.981455][   C17]  dump_stack_lvl+0x4b/0x70
[ 5231.981463][   C17]  ubsan_epilogue+0x5/0x2b
[ 5231.981468][   C17]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
[ 5231.981472][   C17]  ? __entry_text_end+0x158b/0x102259
[ 5231.981475][   C17]  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
[ 5231.981480][   C17]  ? __task_pid_nr_ns+0x134/0x2a0
[ 5231.981483][   C17]  ? __pfx_intel_pmu_lbr_counters_reorder.isra.0+0x10/0x10
[ 5231.981486][   C17]  ? __pfx_perf_output_sample+0x10/0x10
[ 5231.981489][   C17]  ? arch_perf_update_userpage+0x293/0x310
[ 5231.981491][   C17]  ? __pfx_arch_perf_update_userpage+0x10/0x10
[ 5231.981494][   C17]  ? local_clock_noinstr+0xd/0x100
[ 5231.981498][   C17]  ? calc_timer_values+0x2cb/0x860
[ 5231.981501][   C17]  ? perf_event_update_userpage+0x399/0x5b0
[ 5231.981505][   C17]  ? __pfx_perf_event_update_userpage+0x10/0x10
[ 5231.981508][   C17]  ? local_clock_noinstr+0xd/0x100
[ 5231.981511][   C17]  ? __perf_event_account_interrupt+0x11c/0x540
[ 5231.981514][   C17]  intel_pmu_lbr_save_brstack+0xc0/0x4c0
[ 5231.981518][   C17]  setup_arch_pebs_sample_data+0x114b/0x2400
[ 5231.981522][   C17]  ? __pfx_x86_perf_event_set_period+0x10/0x10
[ 5231.981526][   C17]  intel_pmu_drain_arch_pebs+0x64d/0xcc0
[ 5231.981530][   C17]  ? __pfx_intel_pmu_drain_arch_pebs+0x10/0x10
[ 5231.981534][   C17]  ? unwind_next_frame+0x11c5/0x1df0
[ 5231.981541][   C17]  ? intel_pmu_drain_bts_buffer+0xbf/0x6e0
[ 5231.981545][   C17]  ? __pfx_intel_pmu_drain_bts_buffer+0x10/0x10
[ 5231.981550][   C17]  handle_pmi_common+0x5c5/0xcb0
[ 5231.981553][   C17]  ? __pfx_handle_pmi_common+0x10/0x10
[ 5231.981556][   C17]  ? intel_idle+0x64/0xb0
[ 5231.981560][   C17]  ? intel_bts_interrupt+0xe5/0x4c0
[ 5231.981562][   C17]  ? __pfx_intel_bts_interrupt+0x10/0x10
[ 5231.981565][   C17]  ? intel_pmu_lbr_filter+0x27f/0x910
[ 5231.981568][   C17]  intel_pmu_handle_irq+0x2ed/0x600
[ 5231.981571][   C17]  perf_event_nmi_handler+0x219/0x280
[ 5231.981575][   C17]  ? __pfx_perf_event_nmi_handler+0x10/0x10
[ 5231.981579][   C17]  ? unwind_next_frame+0x11c5/0x1df0
[ 5231.981582][   C17]  nmi_handle.part.0+0x11b/0x3a0
[ 5231.981585][   C17]  ? unwind_next_frame+0x11c5/0x1df0
[ 5231.981588][   C17]  default_do_nmi+0x6b/0x180
[ 5231.981591][   C17]  fred_exc_nmi+0x3e/0x80
[ 5231.981594][   C17]  asm_fred_entrypoint_kernel+0x41/0x60
[ 5231.981596][   C17] RIP: 0010:unwind_next_frame+0x11c5/0x1df0
......

The warning occurs because the second "instructions:p" event, which
involves branch counters sampling, is incorrectly programmed to fixed
counter 0 instead of the general-purpose (GP) counters 0-3 that support
branch counters sampling. Currently only GP counters 0~3 support branch
counters sampling on CWF, so any event involving branch counters
sampling must be programmed on GP counters 0~3. Since the counter index
of fixed counter 0 is 32, the "src" value in the code below is
right-shifted by 64 bits, triggering the "shift-out-of-bounds" warning.

cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;

The root cause is the loss of the branch counters constraint for the
last event in the branch counters sampling event group. This results in
the second "instructions:p" event being programmed on fixed counter 0
incorrectly instead of the appropriate GP counters 0~3.

To address this, we apply the missing branch counters constraint for
the last event in the group. Additionally, we introduce a new function,
`intel_set_branch_counter_constr()`, to apply the branch counters
constraint and avoid code duplication.

Cc: stable@vger.kernel.org
Reported-by: Xudong Hao <xudong.hao@intel.com>
Fixes: 33744916196b ("perf/x86/intel: Support branch counters logging")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 4768236c054b..4b042d71104f 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
 		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
 }
 
+static inline int intel_set_branch_counter_constr(struct perf_event *event,
+						  int *num)
+{
+	if (branch_sample_call_stack(event))
+		return -EINVAL;
+	if (branch_sample_counters(event)) {
+		(*num)++;
+		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
+	}
+
+	return 0;
+}
+
 static int intel_pmu_hw_config(struct perf_event *event)
 {
 	int ret = x86_pmu_hw_config(event);
@@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		 * group, which requires the extra space to store the counters.
 		 */
 		leader = event->group_leader;
-		if (branch_sample_call_stack(leader))
+		if (intel_set_branch_counter_constr(leader, &num))
 			return -EINVAL;
-		if (branch_sample_counters(leader)) {
-			num++;
-			leader->hw.dyn_constraint &= x86_pmu.lbr_counters;
-		}
 		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
 
 		for_each_sibling_event(sibling, leader) {
-			if (branch_sample_call_stack(sibling))
+			if (intel_set_branch_counter_constr(sibling, &num))
+				return -EINVAL;
+		}
+
+		if (event != leader) {
+			if (intel_set_branch_counter_constr(event, &num))
 				return -EINVAL;
-			if (branch_sample_counters(sibling)) {
-				num++;
-				sibling->hw.dyn_constraint &= x86_pmu.lbr_counters;
-			}
 		}
 
 		if (num > fls(x86_pmu.lbr_counters))
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation
  2026-02-28  5:33 [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation Dapeng Mi
  2026-02-28  5:33 ` [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply Dapeng Mi
@ 2026-03-07  1:27 ` Chen, Zide
  2026-03-12  8:28 ` Mi, Dapeng
  2 siblings, 0 replies; 14+ messages in thread
From: Chen, Zide @ 2026-03-07  1:27 UTC (permalink / raw)
  To: Dapeng Mi, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Falcon Thomas,
	Xudong Hao



On 2/27/2026 9:33 PM, Dapeng Mi wrote:
> It's good enough to only check GP counters for PEBS constraints
> validation since constraints overlap can only happen on GP counters.
> 
> Besides opportunistically refine the code style and use pr_warn() to
> replace pr_info() as the message itself is a warning message.
> 
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---

LGTM.

Reviewed-by: Zide Chen <zide.chen@intel.com>

>  arch/x86/events/intel/core.c | 22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index cf3a4fe06ff2..4768236c054b 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -5770,7 +5770,7 @@ static void __intel_pmu_check_dyn_constr(struct event_constraint *constr,
>  			}
>  
>  			if (check_fail) {
> -				pr_info("The two events 0x%llx and 0x%llx may not be "
> +				pr_warn("The two events 0x%llx and 0x%llx may not be "
>  					"fully scheduled under some circumstances as "
>  					"%s.\n",
>  					c1->code, c2->code, dyn_constr_type_name[type]);
> @@ -5783,6 +5783,7 @@ static void intel_pmu_check_dyn_constr(struct pmu *pmu,
>  				       struct event_constraint *constr,
>  				       u64 cntr_mask)
>  {
> +	u64 gp_mask = GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
>  	enum dyn_constr_type i;
>  	u64 mask;
>  
> @@ -5797,20 +5798,25 @@ static void intel_pmu_check_dyn_constr(struct pmu *pmu,
>  				mask = x86_pmu.lbr_counters;
>  			break;
>  		case DYN_CONSTR_ACR_CNTR:
> -			mask = hybrid(pmu, acr_cntr_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
> +			mask = hybrid(pmu, acr_cntr_mask64) & gp_mask;
>  			break;
>  		case DYN_CONSTR_ACR_CAUSE:
> -			if (hybrid(pmu, acr_cntr_mask64) == hybrid(pmu, acr_cause_mask64))
> +			if (hybrid(pmu, acr_cntr_mask64) ==
> +					hybrid(pmu, acr_cause_mask64))
>  				continue;
> -			mask = hybrid(pmu, acr_cause_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
> +			mask = hybrid(pmu, acr_cause_mask64) & gp_mask;
>  			break;
>  		case DYN_CONSTR_PEBS:
> -			if (x86_pmu.arch_pebs)
> -				mask = hybrid(pmu, arch_pebs_cap).counters;
> +			if (x86_pmu.arch_pebs) {
> +				mask = hybrid(pmu, arch_pebs_cap).counters &
> +				       gp_mask;
> +			}
>  			break;
>  		case DYN_CONSTR_PDIST:
> -			if (x86_pmu.arch_pebs)
> -				mask = hybrid(pmu, arch_pebs_cap).pdists;
> +			if (x86_pmu.arch_pebs) {
> +				mask = hybrid(pmu, arch_pebs_cap).pdists &
> +				       gp_mask;
> +			}
>  			break;
>  		default:
>  			pr_warn("Unsupported dynamic constraint type %d\n", i);
> 
> base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-02-28  5:33 ` [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply Dapeng Mi
@ 2026-03-07  1:27   ` Chen, Zide
  2026-03-11 20:03     ` Peter Zijlstra
  2026-03-11 20:16   ` Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: Chen, Zide @ 2026-03-07  1:27 UTC (permalink / raw)
  To: Dapeng Mi, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Falcon Thomas,
	Xudong Hao, stable



On 2/27/2026 9:33 PM, Dapeng Mi wrote:
> When running the command:
> 'perf record -e "{instructions,instructions:p}" -j any,counter sleep 1',
> a "shift-out-of-bounds" warning is reported on CWF.
> 
> [ 5231.981423][   C17] UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
> [ 5231.981428][   C17] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
> [ 5231.981436][   C17] CPU: 17 UID: 0 PID: 211871 Comm: sleep Tainted: G S      W           6.18.0-2025-12-09-intel-next-48166-g6cf574943ba3 #1 PREEMPT(none)
> [ 5231.981445][   C17] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
> [ 5231.981447][   C17] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.IPC.3544.P98.2508260307 08/26/2025
> [ 5231.981449][   C17] Call Trace:
> [ 5231.981453][   C17]  <NMI>
> [ 5231.981455][   C17]  dump_stack_lvl+0x4b/0x70
> [ 5231.981463][   C17]  ubsan_epilogue+0x5/0x2b
> [ 5231.981468][   C17]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
> [ 5231.981472][   C17]  ? __entry_text_end+0x158b/0x102259
> [ 5231.981475][   C17]  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
> [ 5231.981480][   C17]  ? __task_pid_nr_ns+0x134/0x2a0
> [ 5231.981483][   C17]  ? __pfx_intel_pmu_lbr_counters_reorder.isra.0+0x10/0x10
> [ 5231.981486][   C17]  ? __pfx_perf_output_sample+0x10/0x10
> [ 5231.981489][   C17]  ? arch_perf_update_userpage+0x293/0x310
> [ 5231.981491][   C17]  ? __pfx_arch_perf_update_userpage+0x10/0x10
> [ 5231.981494][   C17]  ? local_clock_noinstr+0xd/0x100
> [ 5231.981498][   C17]  ? calc_timer_values+0x2cb/0x860
> [ 5231.981501][   C17]  ? perf_event_update_userpage+0x399/0x5b0
> [ 5231.981505][   C17]  ? __pfx_perf_event_update_userpage+0x10/0x10
> [ 5231.981508][   C17]  ? local_clock_noinstr+0xd/0x100
> [ 5231.981511][   C17]  ? __perf_event_account_interrupt+0x11c/0x540
> [ 5231.981514][   C17]  intel_pmu_lbr_save_brstack+0xc0/0x4c0
> [ 5231.981518][   C17]  setup_arch_pebs_sample_data+0x114b/0x2400
> [ 5231.981522][   C17]  ? __pfx_x86_perf_event_set_period+0x10/0x10
> [ 5231.981526][   C17]  intel_pmu_drain_arch_pebs+0x64d/0xcc0
> [ 5231.981530][   C17]  ? __pfx_intel_pmu_drain_arch_pebs+0x10/0x10
> [ 5231.981534][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> [ 5231.981541][   C17]  ? intel_pmu_drain_bts_buffer+0xbf/0x6e0
> [ 5231.981545][   C17]  ? __pfx_intel_pmu_drain_bts_buffer+0x10/0x10
> [ 5231.981550][   C17]  handle_pmi_common+0x5c5/0xcb0
> [ 5231.981553][   C17]  ? __pfx_handle_pmi_common+0x10/0x10
> [ 5231.981556][   C17]  ? intel_idle+0x64/0xb0
> [ 5231.981560][   C17]  ? intel_bts_interrupt+0xe5/0x4c0
> [ 5231.981562][   C17]  ? __pfx_intel_bts_interrupt+0x10/0x10
> [ 5231.981565][   C17]  ? intel_pmu_lbr_filter+0x27f/0x910
> [ 5231.981568][   C17]  intel_pmu_handle_irq+0x2ed/0x600
> [ 5231.981571][   C17]  perf_event_nmi_handler+0x219/0x280
> [ 5231.981575][   C17]  ? __pfx_perf_event_nmi_handler+0x10/0x10
> [ 5231.981579][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> [ 5231.981582][   C17]  nmi_handle.part.0+0x11b/0x3a0
> [ 5231.981585][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> [ 5231.981588][   C17]  default_do_nmi+0x6b/0x180
> [ 5231.981591][   C17]  fred_exc_nmi+0x3e/0x80
> [ 5231.981594][   C17]  asm_fred_entrypoint_kernel+0x41/0x60
> [ 5231.981596][   C17] RIP: 0010:unwind_next_frame+0x11c5/0x1df0
> ......
> 
> The warning occurs because the second "instructions:p" event, which
> involves branch counters sampling, is incorrectly programmed to fixed
> counter 0 instead of the general-purpose (GP) counters 0-3 that support
> branch counters sampling. Currently only GP counters 0~3 support branch
> counters sampling on CWF, any event involving branch counters sampling
> should be programed on GP counters 0~3. Since the counter index of fixed
> counter 0 is 32, it leads to the "src" value in below code is right
> shifted 64 bits and trigger the "shift-out-of-bounds" warning.
> 
> cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;
> 
> The root cause is the loss of the branch counters constraint for the
> last event in the branch counters sampling event group. This results in
> the second "instructions:p" event being programmed on fixed counter 0
> incorrectly instead of the appropriate GP counters 0~3.
> 
> To address this, we apply the missing branch counters constraint for
> the last event in the group. Additionally, we introduce a new function,
> `intel_set_branch_counter_constr()`, to apply the branch counters
> constraint and avoid code duplication.
> 
> Cc: stable@vger.kernel.org
> Reported-by: Xudong Hao <xudong.hao@intel.com>
> Fixes: 33744916196b ("perf/x86/intel: Support branch counters logging")
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---

Reviewed-by: Zide Chen <zide.chen@intel.com>

>  arch/x86/events/intel/core.c | 30 ++++++++++++++++++++----------
>  1 file changed, 20 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 4768236c054b..4b042d71104f 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
>  		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
>  }
>  
> +static inline int intel_set_branch_counter_constr(struct perf_event *event,
> +						  int *num)
> +{
> +	if (branch_sample_call_stack(event))
> +		return -EINVAL;
> +	if (branch_sample_counters(event)) {
> +		(*num)++;
> +		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
> +	}
> +
> +	return 0;
> +}
> +
>  static int intel_pmu_hw_config(struct perf_event *event)
>  {
>  	int ret = x86_pmu_hw_config(event);
> @@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
>  		 * group, which requires the extra space to store the counters.
>  		 */
>  		leader = event->group_leader;
> -		if (branch_sample_call_stack(leader))
> +		if (intel_set_branch_counter_constr(leader, &num))
>  			return -EINVAL;
> -		if (branch_sample_counters(leader)) {
> -			num++;
> -			leader->hw.dyn_constraint &= x86_pmu.lbr_counters;
> -		}
>  		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
>  
>  		for_each_sibling_event(sibling, leader) {
> -			if (branch_sample_call_stack(sibling))
> +			if (intel_set_branch_counter_constr(sibling, &num))
> +				return -EINVAL;
> +		}
> +
> +		if (event != leader) {
> +			if (intel_set_branch_counter_constr(event, &num))
>  				return -EINVAL;
> -			if (branch_sample_counters(sibling)) {
> -				num++;
> -				sibling->hw.dyn_constraint &= x86_pmu.lbr_counters;
> -			}
>  		}
>  
>  		if (num > fls(x86_pmu.lbr_counters))


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-07  1:27   ` Chen, Zide
@ 2026-03-11 20:03     ` Peter Zijlstra
  2026-03-12  2:02       ` Mi, Dapeng
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2026-03-11 20:03 UTC (permalink / raw)
  To: Chen, Zide
  Cc: Dapeng Mi, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane, linux-kernel, linux-perf-users, Dapeng Mi,
	Falcon Thomas, Xudong Hao, stable

On Fri, Mar 06, 2026 at 05:27:35PM -0800, Chen, Zide wrote:
> 
> 
> On 2/27/2026 9:33 PM, Dapeng Mi wrote:
> > When running the command:
> > 'perf record -e "{instructions,instructions:p}" -j any,counter sleep 1',
> > a "shift-out-of-bounds" warning is reported on CWF.
> > 
> > [ 5231.981423][   C17] UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
> > [ 5231.981428][   C17] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
> > [ 5231.981436][   C17] CPU: 17 UID: 0 PID: 211871 Comm: sleep Tainted: G S      W           6.18.0-2025-12-09-intel-next-48166-g6cf574943ba3 #1 PREEMPT(none)
> > [ 5231.981445][   C17] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
> > [ 5231.981447][   C17] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.IPC.3544.P98.2508260307 08/26/2025
> > [ 5231.981449][   C17] Call Trace:
> > [ 5231.981453][   C17]  <NMI>
> > [ 5231.981455][   C17]  dump_stack_lvl+0x4b/0x70
> > [ 5231.981463][   C17]  ubsan_epilogue+0x5/0x2b
> > [ 5231.981468][   C17]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
> > [ 5231.981472][   C17]  ? __entry_text_end+0x158b/0x102259
> > [ 5231.981475][   C17]  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
> > [ 5231.981480][   C17]  ? __task_pid_nr_ns+0x134/0x2a0
> > [ 5231.981483][   C17]  ? __pfx_intel_pmu_lbr_counters_reorder.isra.0+0x10/0x10
> > [ 5231.981486][   C17]  ? __pfx_perf_output_sample+0x10/0x10
> > [ 5231.981489][   C17]  ? arch_perf_update_userpage+0x293/0x310
> > [ 5231.981491][   C17]  ? __pfx_arch_perf_update_userpage+0x10/0x10
> > [ 5231.981494][   C17]  ? local_clock_noinstr+0xd/0x100
> > [ 5231.981498][   C17]  ? calc_timer_values+0x2cb/0x860
> > [ 5231.981501][   C17]  ? perf_event_update_userpage+0x399/0x5b0
> > [ 5231.981505][   C17]  ? __pfx_perf_event_update_userpage+0x10/0x10
> > [ 5231.981508][   C17]  ? local_clock_noinstr+0xd/0x100
> > [ 5231.981511][   C17]  ? __perf_event_account_interrupt+0x11c/0x540
> > [ 5231.981514][   C17]  intel_pmu_lbr_save_brstack+0xc0/0x4c0
> > [ 5231.981518][   C17]  setup_arch_pebs_sample_data+0x114b/0x2400
> > [ 5231.981522][   C17]  ? __pfx_x86_perf_event_set_period+0x10/0x10
> > [ 5231.981526][   C17]  intel_pmu_drain_arch_pebs+0x64d/0xcc0
> > [ 5231.981530][   C17]  ? __pfx_intel_pmu_drain_arch_pebs+0x10/0x10
> > [ 5231.981534][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> > [ 5231.981541][   C17]  ? intel_pmu_drain_bts_buffer+0xbf/0x6e0
> > [ 5231.981545][   C17]  ? __pfx_intel_pmu_drain_bts_buffer+0x10/0x10
> > [ 5231.981550][   C17]  handle_pmi_common+0x5c5/0xcb0
> > [ 5231.981553][   C17]  ? __pfx_handle_pmi_common+0x10/0x10
> > [ 5231.981556][   C17]  ? intel_idle+0x64/0xb0
> > [ 5231.981560][   C17]  ? intel_bts_interrupt+0xe5/0x4c0
> > [ 5231.981562][   C17]  ? __pfx_intel_bts_interrupt+0x10/0x10
> > [ 5231.981565][   C17]  ? intel_pmu_lbr_filter+0x27f/0x910
> > [ 5231.981568][   C17]  intel_pmu_handle_irq+0x2ed/0x600
> > [ 5231.981571][   C17]  perf_event_nmi_handler+0x219/0x280
> > [ 5231.981575][   C17]  ? __pfx_perf_event_nmi_handler+0x10/0x10
> > [ 5231.981579][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> > [ 5231.981582][   C17]  nmi_handle.part.0+0x11b/0x3a0
> > [ 5231.981585][   C17]  ? unwind_next_frame+0x11c5/0x1df0
> > [ 5231.981588][   C17]  default_do_nmi+0x6b/0x180
> > [ 5231.981591][   C17]  fred_exc_nmi+0x3e/0x80
> > [ 5231.981594][   C17]  asm_fred_entrypoint_kernel+0x41/0x60
> > [ 5231.981596][   C17] RIP: 0010:unwind_next_frame+0x11c5/0x1df0
> > ......

That trace could be reduced to:

  UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
  shift exponent 64 is too large for 64-bit type 'long long unsigned int'
  ......
  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
  intel_pmu_lbr_save_brstack+0xc0/0x4c0
  setup_arch_pebs_sample_data+0x114b/0x2400

Without losing anything valuable.


> > The warning occurs because the second "instructions:p" event, which
> > involves branch counters sampling, is incorrectly programmed to fixed
> > counter 0 instead of the general-purpose (GP) counters 0-3 that support

So here you have 0-3, the normal 'range' notation, but then you go all
funny and use ~ instead:

> > branch counters sampling. Currently only GP counters 0~3 support branch
> > counters sampling on CWF, any event involving branch counters sampling
> > should be programed on GP counters 0~3. Since the counter index of fixed
> > counter 0 is 32, it leads to the "src" value in below code is right
> > shifted 64 bits and trigger the "shift-out-of-bounds" warning.
> > 
> > cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;
> > 
> > The root cause is the loss of the branch counters constraint for the
> > last event in the branch counters sampling event group. This results in
> > the second "instructions:p" event being programmed on fixed counter 0
> > incorrectly instead of the appropriate GP counters 0~3.

s/0~3/0-3/ ?


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-02-28  5:33 ` [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply Dapeng Mi
  2026-03-07  1:27   ` Chen, Zide
@ 2026-03-11 20:16   ` Peter Zijlstra
  2026-03-12  2:31     ` Mi, Dapeng
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2026-03-11 20:16 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, stable

On Sat, Feb 28, 2026 at 01:33:20PM +0800, Dapeng Mi wrote:
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 4768236c054b..4b042d71104f 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
>  		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
>  }
>  
> +static inline int intel_set_branch_counter_constr(struct perf_event *event,
> +						  int *num)
> +{
> +	if (branch_sample_call_stack(event))
> +		return -EINVAL;
> +	if (branch_sample_counters(event)) {
> +		(*num)++;
> +		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
> +	}
> +
> +	return 0;
> +}
> +
>  static int intel_pmu_hw_config(struct perf_event *event)
>  {
>  	int ret = x86_pmu_hw_config(event);
> @@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
>  		 * group, which requires the extra space to store the counters.
>  		 */
>  		leader = event->group_leader;
> +		if (intel_set_branch_counter_constr(leader, &num))
>  			return -EINVAL;
>  		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
>  
>  		for_each_sibling_event(sibling, leader) {
> +			if (intel_set_branch_counter_constr(sibling, &num))
> +				return -EINVAL;
> +		}
> +

So the new bit is this, right?

> +		if (event != leader) {
> +			if (intel_set_branch_counter_constr(event, &num))
>  				return -EINVAL;
>  		}

The point being that for_each_sibling_event() will not have iterated the
event because it's not on the list yet?

That wasn't really clear from the changelog and I think that deserves a
comment as well.

Let me go fix that.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-11 20:03     ` Peter Zijlstra
@ 2026-03-12  2:02       ` Mi, Dapeng
  0 siblings, 0 replies; 14+ messages in thread
From: Mi, Dapeng @ 2026-03-12  2:02 UTC (permalink / raw)
  To: Peter Zijlstra, Chen, Zide
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Falcon Thomas,
	Xudong Hao, stable


On 3/12/2026 4:03 AM, Peter Zijlstra wrote:
> On Fri, Mar 06, 2026 at 05:27:35PM -0800, Chen, Zide wrote:
>>
>> On 2/27/2026 9:33 PM, Dapeng Mi wrote:
>>> When running the command:
>>> 'perf record -e "{instructions,instructions:p}" -j any,counter sleep 1',
>>> a "shift-out-of-bounds" warning is reported on CWF.
>>>
>>> [ 5231.981423][   C17] UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
>>> [ 5231.981428][   C17] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
>>> [ 5231.981436][   C17] CPU: 17 UID: 0 PID: 211871 Comm: sleep Tainted: G S      W           6.18.0-2025-12-09-intel-next-48166-g6cf574943ba3 #1 PREEMPT(none)
>>> [ 5231.981445][   C17] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
>>> [ 5231.981447][   C17] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.IPC.3544.P98.2508260307 08/26/2025
>>> [ 5231.981449][   C17] Call Trace:
>>> [ 5231.981453][   C17]  <NMI>
>>> [ 5231.981455][   C17]  dump_stack_lvl+0x4b/0x70
>>> [ 5231.981463][   C17]  ubsan_epilogue+0x5/0x2b
>>> [ 5231.981468][   C17]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
>>> [ 5231.981472][   C17]  ? __entry_text_end+0x158b/0x102259
>>> [ 5231.981475][   C17]  intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
>>> [ 5231.981480][   C17]  ? __task_pid_nr_ns+0x134/0x2a0
>>> [ 5231.981483][   C17]  ? __pfx_intel_pmu_lbr_counters_reorder.isra.0+0x10/0x10
>>> [ 5231.981486][   C17]  ? __pfx_perf_output_sample+0x10/0x10
>>> [ 5231.981489][   C17]  ? arch_perf_update_userpage+0x293/0x310
>>> [ 5231.981491][   C17]  ? __pfx_arch_perf_update_userpage+0x10/0x10
>>> [ 5231.981494][   C17]  ? local_clock_noinstr+0xd/0x100
>>> [ 5231.981498][   C17]  ? calc_timer_values+0x2cb/0x860
>>> [ 5231.981501][   C17]  ? perf_event_update_userpage+0x399/0x5b0
>>> [ 5231.981505][   C17]  ? __pfx_perf_event_update_userpage+0x10/0x10
>>> [ 5231.981508][   C17]  ? local_clock_noinstr+0xd/0x100
>>> [ 5231.981511][   C17]  ? __perf_event_account_interrupt+0x11c/0x540
>>> [ 5231.981514][   C17]  intel_pmu_lbr_save_brstack+0xc0/0x4c0
>>> [ 5231.981518][   C17]  setup_arch_pebs_sample_data+0x114b/0x2400
>>> [ 5231.981522][   C17]  ? __pfx_x86_perf_event_set_period+0x10/0x10
>>> [ 5231.981526][   C17]  intel_pmu_drain_arch_pebs+0x64d/0xcc0
>>> [ 5231.981530][   C17]  ? __pfx_intel_pmu_drain_arch_pebs+0x10/0x10
>>> [ 5231.981534][   C17]  ? unwind_next_frame+0x11c5/0x1df0
>>> [ 5231.981541][   C17]  ? intel_pmu_drain_bts_buffer+0xbf/0x6e0
>>> [ 5231.981545][   C17]  ? __pfx_intel_pmu_drain_bts_buffer+0x10/0x10
>>> [ 5231.981550][   C17]  handle_pmi_common+0x5c5/0xcb0
>>> [ 5231.981553][   C17]  ? __pfx_handle_pmi_common+0x10/0x10
>>> [ 5231.981556][   C17]  ? intel_idle+0x64/0xb0
>>> [ 5231.981560][   C17]  ? intel_bts_interrupt+0xe5/0x4c0
>>> [ 5231.981562][   C17]  ? __pfx_intel_bts_interrupt+0x10/0x10
>>> [ 5231.981565][   C17]  ? intel_pmu_lbr_filter+0x27f/0x910
>>> [ 5231.981568][   C17]  intel_pmu_handle_irq+0x2ed/0x600
>>> [ 5231.981571][   C17]  perf_event_nmi_handler+0x219/0x280
>>> [ 5231.981575][   C17]  ? __pfx_perf_event_nmi_handler+0x10/0x10
>>> [ 5231.981579][   C17]  ? unwind_next_frame+0x11c5/0x1df0
>>> [ 5231.981582][   C17]  nmi_handle.part.0+0x11b/0x3a0
>>> [ 5231.981585][   C17]  ? unwind_next_frame+0x11c5/0x1df0
>>> [ 5231.981588][   C17]  default_do_nmi+0x6b/0x180
>>> [ 5231.981591][   C17]  fred_exc_nmi+0x3e/0x80
>>> [ 5231.981594][   C17]  asm_fred_entrypoint_kernel+0x41/0x60
>>> [ 5231.981596][   C17] RIP: 0010:unwind_next_frame+0x11c5/0x1df0
>>> ......
> That trace could be reduced to:
>
>   UBSAN: shift-out-of-bounds in /kbuild/src/consumer/arch/x86/events/intel/lbr.c:970:15
>   shift exponent 64 is too large for 64-bit type 'long long unsigned int'
>   ......
>   intel_pmu_lbr_counters_reorder.isra.0.cold+0x2a/0xa7
>   intel_pmu_lbr_save_brstack+0xc0/0x4c0
>   setup_arch_pebs_sample_data+0x114b/0x2400
>
> Without losing anything valuable.

Sure.


>
>
>>> The warning occurs because the second "instructions:p" event, which
>>> involves branch counters sampling, is incorrectly programmed to fixed
>>> counter 0 instead of the general-purpose (GP) counters 0-3 that support
> So here you have 0-3, the normal 'range' notation, but then you go all
> funny and use ~ instead:

😂


>
>>> branch counters sampling. Currently only GP counters 0~3 support branch
>>> counters sampling on CWF; any event involving branch counters sampling
>>> should be programmed on GP counters 0~3.
>>> Since the counter index of fixed counter 0 is 32, the "src" value in the
>>> code below is right shifted by 64 bits, triggering the "shift-out-of-bounds"
>>> warning.
>>>
>>> cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;
>>>
>>> The root cause is the loss of the branch counters constraint for the
>>> last event in the branch counters sampling event group. This results in
>>> the second "instructions:p" event being programmed on fixed counter 0
>>> incorrectly instead of the appropriate GP counters 0~3.
> s/0~3/0-3/ ?

Sure.


>
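For reference, the extraction the trace flags can be sketched in plain C. The constant names below mirror the kernel's, but the 2-bit field width is inferred from the trace's 32 * 2 = 64 shift rather than taken from the kernel headers, so treat the values as illustrative:

```c
#include <stdint.h>

#define LBR_INFO_BR_CNTR_BITS 2      /* inferred field width, illustrative */
#define LBR_INFO_BR_CNTR_MASK 0x3ULL

/* Extract branch-counter field order_j from src. A fixed counter index of
 * 32 yields a shift of 32 * 2 = 64, which is undefined behaviour for a
 * 64-bit type -- hence the UBSAN report. The guard below merely avoids the
 * UB; the real fix is to never schedule branch-counter sampling events
 * outside GP counters 0-3. */
static uint64_t br_cntr_field(uint64_t src, unsigned int order_j)
{
	unsigned int shift = order_j * LBR_INFO_BR_CNTR_BITS;

	if (shift >= 64)	/* would be shift-out-of-bounds */
		return 0;
	return (src >> shift) & LBR_INFO_BR_CNTR_MASK;
}
```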

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-11 20:16   ` Peter Zijlstra
@ 2026-03-12  2:31     ` Mi, Dapeng
  2026-03-12  6:41       ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Mi, Dapeng @ 2026-03-12  2:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, stable


On 3/12/2026 4:16 AM, Peter Zijlstra wrote:
> On Sat, Feb 28, 2026 at 01:33:20PM +0800, Dapeng Mi wrote:
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 4768236c054b..4b042d71104f 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
>>  		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
>>  }
>>  
>> +static inline int intel_set_branch_counter_constr(struct perf_event *event,
>> +						  int *num)
>> +{
>> +	if (branch_sample_call_stack(event))
>> +		return -EINVAL;
>> +	if (branch_sample_counters(event)) {
>> +		(*num)++;
>> +		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>  static int intel_pmu_hw_config(struct perf_event *event)
>>  {
>>  	int ret = x86_pmu_hw_config(event);
>> @@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
>>  		 * group, which requires the extra space to store the counters.
>>  		 */
>>  		leader = event->group_leader;
>> +		if (intel_set_branch_counter_constr(leader, &num))
>>  			return -EINVAL;
>>  		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
>>  
>>  		for_each_sibling_event(sibling, leader) {
>> +			if (intel_set_branch_counter_constr(sibling, &num))
>> +				return -EINVAL;
>> +		}
>> +
> So the new bit is this, right?

Actually no, the key change is the one below. The branch counter constraint
was not being applied to the last event in the group.

Assume we have an event group {cycles,instructions,branches}. When the 3rd
event "branches" is created, intel_pmu_hw_config() is called for the
"branches" event to check its config. At that point the group leader is
"cycles" and the sibling list contains only the "instructions" event, since
the 3rd event "branches" is still being created and has not yet been added
to the sibling_list. So for_each_sibling_event() cannot iterate over the
"branches" event.
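The ordering described above can be sketched with a simplified user-space model (hypothetical types, not the kernel's struct perf_event): the event under creation is reachable only through the `event` pointer itself, so the constraint must be applied to it explicitly, in addition to the leader and the already-linked siblings:

```c
#include <stdint.h>

struct fake_event {
	uint64_t dyn_constraint;
};

/* Mirror of the fixed flow: constrain the leader, every sibling already on
 * the list, and -- the previously missing piece -- the event being created,
 * which a sibling-list walk cannot reach because it is not linked yet. */
static void apply_group_constraint(struct fake_event *leader,
				   struct fake_event **siblings, int nr,
				   struct fake_event *event,
				   uint64_t lbr_counters)
{
	int i;

	leader->dyn_constraint &= lbr_counters;
	for (i = 0; i < nr; i++)
		siblings[i]->dyn_constraint &= lbr_counters;
	if (event != leader)	/* event may be the leader itself */
		event->dyn_constraint &= lbr_counters;
}
```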


>
>> +		if (event != leader) {
>> +			if (intel_set_branch_counter_constr(event, &num))
>>  				return -EINVAL;
>>  		}
> The point being that for_each_sibling_event() will not have iterated the
> event because it's not on the list yet?

Yes. 


>
> That wasn't really clear from the changelog and I think that deserves a
> comment as well.

Sure. I will add a comment and enhance the changelog to make it clearer. Thanks.


>
> Let me go fix that.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-12  2:31     ` Mi, Dapeng
@ 2026-03-12  6:41       ` Peter Zijlstra
  2026-03-12  6:52         ` Mi, Dapeng
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2026-03-12  6:41 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, stable

On Thu, Mar 12, 2026 at 10:31:28AM +0800, Mi, Dapeng wrote:
> 
> On 3/12/2026 4:16 AM, Peter Zijlstra wrote:
> > On Sat, Feb 28, 2026 at 01:33:20PM +0800, Dapeng Mi wrote:
> >> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> >> index 4768236c054b..4b042d71104f 100644
> >> --- a/arch/x86/events/intel/core.c
> >> +++ b/arch/x86/events/intel/core.c
> >> @@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
> >>  		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
> >>  }
> >>  
> >> +static inline int intel_set_branch_counter_constr(struct perf_event *event,
> >> +						  int *num)
> >> +{
> >> +	if (branch_sample_call_stack(event))
> >> +		return -EINVAL;
> >> +	if (branch_sample_counters(event)) {
> >> +		(*num)++;
> >> +		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >>  static int intel_pmu_hw_config(struct perf_event *event)
> >>  {
> >>  	int ret = x86_pmu_hw_config(event);
> >> @@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
> >>  		 * group, which requires the extra space to store the counters.
> >>  		 */
> >>  		leader = event->group_leader;
> >> +		if (intel_set_branch_counter_constr(leader, &num))
> >>  			return -EINVAL;
> >>  		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
> >>  
> >>  		for_each_sibling_event(sibling, leader) {
> >> +			if (intel_set_branch_counter_constr(sibling, &num))
> >> +				return -EINVAL;
> >> +		}
> >> +
> > So the new bit is this, right?
> 
> Actually no, the key change is the one below. The branch counter constraint
> was not being applied to the last event in the group.
> 
> Assume we have an event group {cycles,instructions,branches}. When the 3rd
> event "branches" is created, intel_pmu_hw_config() is called for the
> "branches" event to check its config. At that point the group leader is
> "cycles" and the sibling list contains only the "instructions" event, since
> the 3rd event "branches" is still being created and has not yet been added
> to the sibling_list. So for_each_sibling_event() cannot iterate over the
> "branches" event.
> 
> 
> >
> >> +		if (event != leader) {
> >> +			if (intel_set_branch_counter_constr(event, &num))
> >>  				return -EINVAL;
> >>  		}
> > The point being that for_each_sibling_event() will not have iterated the
> > event because it's not on the list yet?
> 
> Yes. 
> 
> 
> >
> > That wasn't really clear from the changelog and I think that deserves a
> > comment as well.
> 
> Sure. I will add a comment and enhance the changelog to make it clearer. Thanks.
> 

I already fixed everything up. Should be in queue/perf/urgent.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-12  6:41       ` Peter Zijlstra
@ 2026-03-12  6:52         ` Mi, Dapeng
  2026-03-12  7:40           ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Mi, Dapeng @ 2026-03-12  6:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, stable


On 3/12/2026 2:41 PM, Peter Zijlstra wrote:
> On Thu, Mar 12, 2026 at 10:31:28AM +0800, Mi, Dapeng wrote:
>> On 3/12/2026 4:16 AM, Peter Zijlstra wrote:
>>> On Sat, Feb 28, 2026 at 01:33:20PM +0800, Dapeng Mi wrote:
>>>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>>>> index 4768236c054b..4b042d71104f 100644
>>>> --- a/arch/x86/events/intel/core.c
>>>> +++ b/arch/x86/events/intel/core.c
>>>> @@ -4628,6 +4628,19 @@ static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
>>>>  		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
>>>>  }
>>>>  
>>>> +static inline int intel_set_branch_counter_constr(struct perf_event *event,
>>>> +						  int *num)
>>>> +{
>>>> +	if (branch_sample_call_stack(event))
>>>> +		return -EINVAL;
>>>> +	if (branch_sample_counters(event)) {
>>>> +		(*num)++;
>>>> +		event->hw.dyn_constraint &= x86_pmu.lbr_counters;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>>  static int intel_pmu_hw_config(struct perf_event *event)
>>>>  {
>>>>  	int ret = x86_pmu_hw_config(event);
>>>> @@ -4698,21 +4711,18 @@ static int intel_pmu_hw_config(struct perf_event *event)
>>>>  		 * group, which requires the extra space to store the counters.
>>>>  		 */
>>>>  		leader = event->group_leader;
>>>> +		if (intel_set_branch_counter_constr(leader, &num))
>>>>  			return -EINVAL;
>>>>  		leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
>>>>  
>>>>  		for_each_sibling_event(sibling, leader) {
>>>> +			if (intel_set_branch_counter_constr(sibling, &num))
>>>> +				return -EINVAL;
>>>> +		}
>>>> +
>>> So the new bit is this, right?
>> Actually no, the key change is the one below. The branch counter constraint
>> was not being applied to the last event in the group.
>>
>> Assume we have an event group {cycles,instructions,branches}. When the 3rd
>> event "branches" is created, intel_pmu_hw_config() is called for the
>> "branches" event to check its config. At that point the group leader is
>> "cycles" and the sibling list contains only the "instructions" event, since
>> the 3rd event "branches" is still being created and has not yet been added
>> to the sibling_list. So for_each_sibling_event() cannot iterate over the
>> "branches" event.
>>
>>
>>>> +		if (event != leader) {
>>>> +			if (intel_set_branch_counter_constr(event, &num))
>>>>  				return -EINVAL;
>>>>  		}
>>> The point being that for_each_sibling_event() will not have iterated the
>>> event because it's not on the list yet?
>> Yes. 
>>
>>
>>> That wasn't really clear from the changelog and I think that deserves a
>>> comment as well.
>> Sure. I will add a comment and enhance the changelog to make it clearer. Thanks.
>>
> I already fixed everything up. Should be in queue/perf/urgent.

Thanks. 

Peter, as Ian points out, the patch "perf/x86: Update cap_user_rdpmc base
on rdpmc user disable state" has a bug
(https://lore.kernel.org/all/CAP-5=fWr2L6miiFZ6Km3HYEdmqp3T0NBL=WY3buKdKztW+HvmA@mail.gmail.com/).
I will post a patch to fix the issue. Thanks.



>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply
  2026-03-12  6:52         ` Mi, Dapeng
@ 2026-03-12  7:40           ` Peter Zijlstra
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2026-03-12  7:40 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao, stable

On Thu, Mar 12, 2026 at 02:52:43PM +0800, Mi, Dapeng wrote:
> 
> Peter, as Ian points out, the patch "perf/x86: Update cap_user_rdpmc base
> on rdpmc user disable state" has a bug
> (https://lore.kernel.org/all/CAP-5=fWr2L6miiFZ6Km3HYEdmqp3T0NBL=WY3buKdKztW+HvmA@mail.gmail.com/).
> I will post a patch to fix the issue. Thanks.
> 

Yep, saw that, it's gone :-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation
  2026-02-28  5:33 [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation Dapeng Mi
  2026-02-28  5:33 ` [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply Dapeng Mi
  2026-03-07  1:27 ` [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation Chen, Zide
@ 2026-03-12  8:28 ` Mi, Dapeng
  2026-03-12  8:42   ` Peter Zijlstra
  2 siblings, 1 reply; 14+ messages in thread
From: Mi, Dapeng @ 2026-03-12  8:28 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao

Hi Peter,

Could you please review and merge this change? It is supposed to be low
risk. Without it, the dynamic constraints check would trigger false
positives on DMR/NVL. Thanks.


On 2/28/2026 1:33 PM, Dapeng Mi wrote:
> It's good enough to only check GP counters for PEBS constraints
> validation since constraints overlap can only happen on GP counters.
>
> Besides opportunistically refine the code style and use pr_warn() to
> replace pr_info() as the message itself is a warning message.
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
>  arch/x86/events/intel/core.c | 22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index cf3a4fe06ff2..4768236c054b 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -5770,7 +5770,7 @@ static void __intel_pmu_check_dyn_constr(struct event_constraint *constr,
>  			}
>  
>  			if (check_fail) {
> -				pr_info("The two events 0x%llx and 0x%llx may not be "
> +				pr_warn("The two events 0x%llx and 0x%llx may not be "
>  					"fully scheduled under some circumstances as "
>  					"%s.\n",
>  					c1->code, c2->code, dyn_constr_type_name[type]);
> @@ -5783,6 +5783,7 @@ static void intel_pmu_check_dyn_constr(struct pmu *pmu,
>  				       struct event_constraint *constr,
>  				       u64 cntr_mask)
>  {
> +	u64 gp_mask = GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
>  	enum dyn_constr_type i;
>  	u64 mask;
>  
> @@ -5797,20 +5798,25 @@ static void intel_pmu_check_dyn_constr(struct pmu *pmu,
>  				mask = x86_pmu.lbr_counters;
>  			break;
>  		case DYN_CONSTR_ACR_CNTR:
> -			mask = hybrid(pmu, acr_cntr_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
> +			mask = hybrid(pmu, acr_cntr_mask64) & gp_mask;
>  			break;
>  		case DYN_CONSTR_ACR_CAUSE:
> -			if (hybrid(pmu, acr_cntr_mask64) == hybrid(pmu, acr_cause_mask64))
> +			if (hybrid(pmu, acr_cntr_mask64) ==
> +					hybrid(pmu, acr_cause_mask64))
>  				continue;
> -			mask = hybrid(pmu, acr_cause_mask64) & GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
> +			mask = hybrid(pmu, acr_cause_mask64) & gp_mask;
>  			break;
>  		case DYN_CONSTR_PEBS:
> -			if (x86_pmu.arch_pebs)
> -				mask = hybrid(pmu, arch_pebs_cap).counters;
> +			if (x86_pmu.arch_pebs) {
> +				mask = hybrid(pmu, arch_pebs_cap).counters &
> +				       gp_mask;
> +			}
>  			break;
>  		case DYN_CONSTR_PDIST:
> -			if (x86_pmu.arch_pebs)
> -				mask = hybrid(pmu, arch_pebs_cap).pdists;
> +			if (x86_pmu.arch_pebs) {
> +				mask = hybrid(pmu, arch_pebs_cap).pdists &
> +				       gp_mask;
> +			}
>  			break;
>  		default:
>  			pr_warn("Unsupported dynamic constraint type %d\n", i);
>
> base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f
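For reference, the gp_mask the patch factors out comes from GENMASK_ULL(h, l), which sets bits l..h inclusive. A standalone sketch follows; GENMASK_ULL is redefined here only so the snippet builds outside the kernel, with INTEL_PMC_MAX_GENERIC at its x86 header value of 32:

```c
#include <stdint.h>

/* Standalone stand-in for the kernel's GENMASK_ULL(): bits l..h set. */
#define GENMASK_ULL(h, l) \
	((~0ULL << (l)) & (~0ULL >> (63 - (h))))

#define INTEL_PMC_MAX_GENERIC 32	/* x86 value; fixed counters start at bit 32 */

/* Mask selecting only the GP counters, as used in the patch to drop the
 * fixed-counter bits from acr_cntr_mask64 / arch_pebs_cap before the
 * dynamic-constraint overlap check. */
static uint64_t gp_counter_mask(void)
{
	return GENMASK_ULL(INTEL_PMC_MAX_GENERIC - 1, 0);
}
```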

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation
  2026-03-12  8:28 ` Mi, Dapeng
@ 2026-03-12  8:42   ` Peter Zijlstra
  2026-03-12  8:44     ` Mi, Dapeng
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2026-03-12  8:42 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao

On Thu, Mar 12, 2026 at 04:28:45PM +0800, Mi, Dapeng wrote:
> Hi Peter,
> 
> Could you please review and merge this change? It is supposed to be low
> risk. Without it, the dynamic constraints check would trigger false
> positives on DMR/NVL. Thanks.

Ah, so I have it in queue/perf/core; it didn't have a Fixes tag on it, so
I didn't think it urgent.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation
  2026-03-12  8:42   ` Peter Zijlstra
@ 2026-03-12  8:44     ` Mi, Dapeng
  0 siblings, 0 replies; 14+ messages in thread
From: Mi, Dapeng @ 2026-03-12  8:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Eranian Stephane,
	linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen,
	Falcon Thomas, Xudong Hao


On 3/12/2026 4:42 PM, Peter Zijlstra wrote:
> On Thu, Mar 12, 2026 at 04:28:45PM +0800, Mi, Dapeng wrote:
>> Hi Peter,
>>
>> Could you please review and merge this change? It is supposed to be low
>> risk. Without it, the dynamic constraints check would trigger false
>> positives on DMR/NVL. Thanks.
> Ah, so I have it in queue/perf/core; it didn't have a Fixes tag on it, so
> I didn't think it urgent.

Oh, just saw it. Thanks a lot! 😊


>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2026-03-12  8:45 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-28  5:33 [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation Dapeng Mi
2026-02-28  5:33 ` [RESEND Patch 2/2] perf/x86/intel: Add missing branch counters constraint apply Dapeng Mi
2026-03-07  1:27   ` Chen, Zide
2026-03-11 20:03     ` Peter Zijlstra
2026-03-12  2:02       ` Mi, Dapeng
2026-03-11 20:16   ` Peter Zijlstra
2026-03-12  2:31     ` Mi, Dapeng
2026-03-12  6:41       ` Peter Zijlstra
2026-03-12  6:52         ` Mi, Dapeng
2026-03-12  7:40           ` Peter Zijlstra
2026-03-07  1:27 ` [RESEND Patch 1/2] perf/x86/intel: Only check GP counters for PEBS constraints validation Chen, Zide
2026-03-12  8:28 ` Mi, Dapeng
2026-03-12  8:42   ` Peter Zijlstra
2026-03-12  8:44     ` Mi, Dapeng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox