From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: sashiko@lists.linux.dev
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [Patch v3 1/4] perf/x86/intel: Improve validation and configuration of ACR masks
Date: Tue, 28 Apr 2026 17:51:30 +0800
Message-ID: <4e48eacf-484e-4cbc-aa95-47c9ded1161c@linux.intel.com>
In-Reply-To: <20260427101337.5FB32C19425@smtp.kernel.org>
On 4/27/2026 6:13 PM, sashiko-bot@kernel.org wrote:
> Sashiko AI review found 3 potential issue(s):
> - [High] The ACR feature incorrectly maps logical sibling indices from user space to physical array indices in `event_list`, which silently corrupts the configuration if any sibling is disabled.
> - [High] The updated ACR hardware configuration (`hw.config1`) is not synchronized to the PMU hardware register if the event itself does not change counters, leaving the hardware state stale.
> - [Low] The patch uses a complex and brittle hardware counter bitmask to verify group membership instead of simply checking array bounds.
> --
>
> commit 0bc290b3247e55dbfb3b7cee24f6926da6f5adfb
> Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Subject: perf/x86/intel: Improve validation and configuration of ACR masks
>
> This commit addresses several issues with user space ACR mask validation and
> configuration. It calculates an index bitmap for each ACR group to drop invalid
> user-space mask bits, avoids early returns on invalid bits, and explicitly
> clears stale hardware ACR masks.
>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index d9488ade0f8e..dafa233dd757 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
> [ ... ]
>> @@ -3332,23 +3332,41 @@ static void intel_pmu_enable_event(struct perf_event *event)
>>  static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
>>  {
>>  	struct perf_event *event, *leader;
>> -	int i, j, idx;
>> +	int i, j, k, bit, idx;
>> +	u64 group_mask;
>>
>>  	for (i = 0; i < cpuc->n_events; i++) {
>>  		leader = cpuc->event_list[i];
>>  		if (!is_acr_event_group(leader))
>>  			continue;
>>
>> -		/* The ACR events must be contiguous. */
>> +		/* Find the last event of the ACR group. */
>>  		for (j = i; j < cpuc->n_events; j++) {
>>  			event = cpuc->event_list[j];
>>  			if (event->group_leader != leader->group_leader)
>>  				break;
>> -			for_each_set_bit(idx, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
>> -				if (i + idx >= cpuc->n_events ||
>> -				    !is_acr_event_group(cpuc->event_list[i + idx]))
>> -					return;
>> -				__set_bit(cpuc->assign[i + idx], (unsigned long *)&event->hw.config1);
>> -			}
>> +		}
>> +
>> +		/* Figure out the group indices bitmap. */
>> +		group_mask = 0;
>> +		for (k = i; k < j; k++)
>> +			group_mask |= BIT_ULL(cpuc->assign[k]);
> Since x86_schedule_events() guarantees strictly unique hardware counters for
> all concurrently active core PMU events, a counter is present in group_mask
> if and only if the event it is assigned to has an index in [i, j).
>
> Given that idx = i + bit guarantees idx >= i, is it possible to simplify this
> by just checking if idx >= j instead of constructing and checking the
> group_mask?
Yes, since the ACR events must be contiguous, we can directly check the
event indices. The code can be simplified like this:
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 013e6e02706d..f38cf245a894 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3343,7 +3343,6 @@ static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
 {
 	struct perf_event *event, *leader;
 	int i, j, k, bit, idx;
-	u64 group_mask;
 
 	for (i = 0; i < cpuc->n_events; i++) {
 		leader = cpuc->event_list[i];
@@ -3357,11 +3356,6 @@ static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
 				break;
 		}
 
-		/* Figure out the group indices bitmap. */
-		group_mask = 0;
-		for (k = i; k < j; k++)
-			group_mask |= BIT_ULL(cpuc->assign[k]);
-
 		/*
 		 * Translate the user-space ACR mask (attr.config2) into the physical
 		 * counter bitmask (hw.config1) for each ACR event in the group.
@@ -3372,9 +3366,8 @@ static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
 			event->hw.config1 = 0;
 			for_each_set_bit(bit, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
 				idx = i + bit;
-				if (idx >= cpuc->n_events ||
-				    !(BIT_ULL(cpuc->assign[idx]) & group_mask) ||
-				    !is_acr_event_group(cpuc->event_list[idx]))
+				/* The event index of an ACR group member must be in [i, j). */
+				if (idx >= j || !is_acr_event_group(cpuc->event_list[idx]))
 					continue;
 				__set_bit(cpuc->assign[idx], (unsigned long *)&event->hw.config1);
 			}
>
>> +
>> +		/*
>> +		 * Translate the user-space ACR mask (attr.config2) into the physical
>> +		 * counter bitmask (hw.config1) for each ACR event in the group.
>> +		 * NOTE: ACR event contiguity is guaranteed by intel_pmu_hw_config().
>> +		 */
>> +		for (k = i; k < j; k++) {
>> +			event = cpuc->event_list[k];
>> +			event->hw.config1 = 0;
> If Event A moves to a new counter but Event B itself does not move,
> match_prev_assignment() would evaluate to true for Event B, and
> x86_pmu_start() would be skipped.
>
> Does this mean the updated hw.config1 for Event B is never written to the
> physical hardware MSR, breaking the Auto Counter Reload functionality?
Yeah, this is indeed a potential issue. I suppose the simplest fix would be
to make match_prev_assignment() always return false for ACR events.
Maybe like this:
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 03ce1bc7ef2e..e766621f9449 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1294,13 +1294,16 @@ int x86_perf_rdpmc_index(struct perf_event *event)
 	return event->hw.event_base_rdpmc;
 }
 
-static inline int match_prev_assignment(struct hw_perf_event *hwc,
+static inline int match_prev_assignment(struct perf_event *event,
 					struct cpu_hw_events *cpuc,
 					int i)
 {
+	struct hw_perf_event *hwc = &event->hw;
+
 	return hwc->idx == cpuc->assign[i] &&
-	       hwc->last_cpu == smp_processor_id() &&
-	       hwc->last_tag == cpuc->tags[i];
+	       hwc->last_cpu == smp_processor_id() &&
+	       hwc->last_tag == cpuc->tags[i] &&
+	       !is_acr_event_group(event);
 }
 
 static void x86_pmu_start(struct perf_event *event, int flags);
@@ -1346,7 +1349,7 @@ static void x86_pmu_enable(struct pmu *pmu)
 		 * - no other event has used the counter since
 		 */
 		if (hwc->idx == -1 ||
-		    match_prev_assignment(hwc, cpuc, i))
+		    match_prev_assignment(event, cpuc, i))
 			continue;
 
 		/*
@@ -1367,7 +1370,7 @@ static void x86_pmu_enable(struct pmu *pmu)
 		event = cpuc->event_list[i];
 		hwc = &event->hw;
 
-		if (!match_prev_assignment(hwc, cpuc, i))
+		if (!match_prev_assignment(event, cpuc, i))
 			x86_assign_hw_event(event, cpuc, i);
 		else if (i < n_running)
 			continue;
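If I read x86_pmu_enable() correctly, this forces ACR events to always be
stopped, reassigned via x86_assign_hw_event() and restarted via
x86_pmu_start() on each x86_pmu_enable(), so the recomputed hw.config1 is
rewritten to the hardware even when the event keeps its counter. ACR
groups should be rare enough that the extra reprogramming is negligible.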
>
>> +			for_each_set_bit(bit, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
>> +				idx = i + bit;
> User space provides the ACR mask (attr.config2) using absolute logical bit
> indices (e.g., bit 0 = leader, bit 1 = first sibling). However,
> collect_events() skips events that are in PERF_EVENT_STATE_OFF.
>
> If an earlier sibling is disabled, it is not added to event_list, causing all
> subsequent active siblings to shift down in the array.
>
> Does idx point to the wrong event in the active array if any sibling is
> disabled? Should the mask resolve against the logical sibling_list rather
> than the compressed active array?
In theory, it's indeed an issue, but it should be hard to trigger in real
applications. The events in an ACR group are interdependent, and disabling
any member would break the expected behavior of ACR sampling, so users
should not disable individual events in an ACR group. Even if some
non-leader member is disabled, the acr_mask validation code above would
filter out the invalid bits in most cases.

It's infeasible to resolve the counter indices against cpuc->events[],
which is ordered by HW counter index, and there is no easy way to identify
an ACR event group from it. So there doesn't seem to be an easy way to
thoroughly fix this issue. Since it would rarely happen in practice, I would
just add a "FIXME" comment to highlight the risk.
Thanks.
>
>> +				if (idx >= cpuc->n_events ||
>> +				    !(BIT_ULL(cpuc->assign[idx]) & group_mask) ||
>> +				    !is_acr_event_group(cpuc->event_list[idx]))
>> +					continue;
>> +				__set_bit(cpuc->assign[idx], (unsigned long *)&event->hw.config1);
>>  			}
>>  		}