Sashiko discussions
* Re: [Patch v4 2/5] perf/x86/intel: Always reprogram ACR events to prevent stale masks
       [not found] ` <20260430020402.60C32C19425@smtp.kernel.org>
@ 2026-04-30  3:02   ` Mi, Dapeng
  0 siblings, 0 replies; only message in thread
From: Mi, Dapeng @ 2026-04-30  3:02 UTC (permalink / raw)
  To: sashiko; +Cc: linux-perf-users


On 4/30/2026 10:04 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [Medium] Unconditional hardware MSR reprogramming for ACR events introduces performance regression and increases measurement blind spots.
> --
>
> commit a8f30026401c4d7de00c3f1a7543204056d7235c
> Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
> perf/x86/intel: Always reprogram ACR events to prevent stale masks
>
> This commit modifies PMU event scheduling to force reprogramming of all ACR
> events during rescheduling. It does this by making match_prev_assignment()
> always return false for ACR events.
>
>> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
>> index 810ab21ffd99..4b9e105309c6 100644
>> --- a/arch/x86/events/core.c
>> +++ b/arch/x86/events/core.c
>> @@ -1294,13 +1294,16 @@ int x86_perf_rdpmc_index(struct perf_event *event)
>>  	return event->hw.event_base_rdpmc;
>>  }
>>  
>> -static inline int match_prev_assignment(struct hw_perf_event *hwc,
>> +static inline int match_prev_assignment(struct perf_event *event,
>>  					struct cpu_hw_events *cpuc,
>>  					int i)
>>  {
>> +	struct hw_perf_event *hwc = &event->hw;
>> +
>>  	return hwc->idx == cpuc->assign[i] &&
>> -		hwc->last_cpu == smp_processor_id() &&
>> -		hwc->last_tag == cpuc->tags[i];
>> +	       hwc->last_cpu == smp_processor_id() &&
>> +	       hwc->last_tag == cpuc->tags[i] &&
>> +	       !is_acr_event_group(event);
>>  }
> Does this unconditional hardware MSR reprogramming for ACR events introduce a
> performance regression and increase measurement blind spots?
>
> By unconditionally returning false for all ACR events, every active ACR event
> will undergo a full x86_pmu_stop() and x86_pmu_start() cycle during PMU
> re-enablement (e.g., in x86_pmu_enable() when cpuc->n_added > 0).
>
> These full stop/start cycles invoke multiple wrmsr and rdpmc instructions,
> adding significant cycle overhead to the hot scheduling path. Since this
> occurs while the PMU is globally disabled, the added latency extends the
> global PMU off-time, unnecessarily increasing the measurement blind spot
> where events are not counted across the CPU.
>
> Could this overhead be avoided by checking if the mask actually changed?
>
> x86_pmu_late_setup() calculates the new physical mask and stores it in
> event->hw.config1 before match_prev_assignment() is evaluated. Would it be
> better to compare event->hw.config1 against the previously programmed mask
> stored in cpuc->acr_cfg_b[hwc->idx] and only force a reprogram when they
> differ?

Nice idea. This can be further optimized. I'm not sure it's worth posting a
new patchset just for this single optimization. If not, I would send it as a
standalone optimization patch.

Thanks.


>
> [ ... ]
>
