* Re: [Patch v4 2/5] perf/x86/intel: Always reprogram ACR events to prevent stale masks
From: Mi, Dapeng @ 2026-04-30 3:02 UTC
To: sashiko; +Cc: linux-perf-users
On 4/30/2026 10:04 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue to consider:
> - [Medium] Unconditional hardware MSR reprogramming for ACR events introduces performance regression and increases measurement blind spots.
> --
>
> commit a8f30026401c4d7de00c3f1a7543204056d7235c
> Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
> perf/x86/intel: Always reprogram ACR events to prevent stale masks
>
> This commit modifies PMU event scheduling to force reprogramming of all ACR
> events during rescheduling. It does this by making match_prev_assignment()
> always return false for ACR events.
>
>> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
>> index 810ab21ffd99..4b9e105309c6 100644
>> --- a/arch/x86/events/core.c
>> +++ b/arch/x86/events/core.c
>> @@ -1294,13 +1294,16 @@ int x86_perf_rdpmc_index(struct perf_event *event)
>>  	return event->hw.event_base_rdpmc;
>>  }
>>
>> -static inline int match_prev_assignment(struct hw_perf_event *hwc,
>> +static inline int match_prev_assignment(struct perf_event *event,
>>  					struct cpu_hw_events *cpuc,
>>  					int i)
>>  {
>> +	struct hw_perf_event *hwc = &event->hw;
>> +
>>  	return hwc->idx == cpuc->assign[i] &&
>> -	       hwc->last_cpu == smp_processor_id() &&
>> -	       hwc->last_tag == cpuc->tags[i];
>> +	       hwc->last_cpu == smp_processor_id() &&
>> +	       hwc->last_tag == cpuc->tags[i] &&
>> +	       !is_acr_event_group(event);
>>  }
> Does this unconditional hardware MSR reprogramming for ACR events introduce a
> performance regression and increase measurement blind spots?
>
> By unconditionally returning false for all ACR events, every active ACR event
> will undergo a full x86_pmu_stop() and x86_pmu_start() cycle during PMU
> re-enablement (e.g., in x86_pmu_enable() when cpuc->n_added > 0).
>
> These full stop/start cycles invoke multiple wrmsr and rdpmc instructions,
> adding significant cycle overhead to the hot scheduling path. Since this
> occurs while the PMU is globally disabled, the added latency extends the
> global PMU off-time, unnecessarily widening the measurement blind spot
> during which no events on the CPU are counted.
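>
> For reference, the rescheduling path this check gates, abridged and
> paraphrased from x86_pmu_enable() in arch/x86/events/core.c (a sketch,
> not a verbatim quote):
>
>	if (cpuc->n_added) {
>		int n_running = cpuc->n_events - cpuc->n_added;
>
>		/* step 1: stop events whose counter assignment changed */
>		for (i = 0; i < n_running; i++) {
>			event = cpuc->event_list[i];
>			if (event->hw.idx == -1 ||
>			    match_prev_assignment(event, cpuc, i))
>				continue;	/* same counter, skip */
>			x86_pmu_stop(event, PERF_EF_UPDATE);
>		}
>
>		/* step 2: reprogram moved events into their new counters */
>		for (i = 0; i < cpuc->n_events; i++) {
>			event = cpuc->event_list[i];
>			if (!match_prev_assignment(event, cpuc, i))
>				x86_assign_hw_event(event, cpuc, i);
>			else if (i < n_running)
>				continue;
>			x86_pmu_start(event, PERF_EF_RELOAD);
>		}
>	}
>
> With match_prev_assignment() unconditionally failing for ACR events,
> every ACR event takes both the stop and the start leg on each reschedule.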
>
> Could this overhead be avoided by checking if the mask actually changed?
>
> x86_pmu_late_setup() calculates the new physical mask and stores it in
> event->hw.config1 before match_prev_assignment() is evaluated. Would it be
> better to compare event->hw.config1 against the previously programmed mask
> stored in cpuc->acr_cfg_b[hwc->idx] and only force a reprogram when they
> differ?
Nice idea, this can be optimized further. I'm not sure it's worth posting
a new version of the patchset for this single optimization; if not, I'll
send it as a standalone follow-up patch.
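
A rough sketch of what that check could look like, assuming the
cpuc->acr_cfg_b[] bookkeeping introduced earlier in this series
(untested, for illustration only):

	static inline int match_prev_assignment(struct perf_event *event,
						struct cpu_hw_events *cpuc,
						int i)
	{
		struct hw_perf_event *hwc = &event->hw;

		if (hwc->idx != cpuc->assign[i] ||
		    hwc->last_cpu != smp_processor_id() ||
		    hwc->last_tag != cpuc->tags[i])
			return 0;

		/* Force a reprogram only when the ACR mask changed. */
		if (is_acr_event_group(event) &&
		    hwc->config1 != cpuc->acr_cfg_b[hwc->idx])
			return 0;

		return 1;
	}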
Thanks.
>
> [ ... ]
>