From: Ingo Molnar <mingo@kernel.org>
To: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Kan Liang <kan.liang@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Eranian Stephane <eranian@google.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Dapeng Mi <dapeng1.mi@intel.com>
Subject: Re: [PATCH 1/2] perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error
Date: Sat, 31 May 2025 10:01:15 +0200 [thread overview]
Message-ID: <aDq3S9lwO1YadCKT@gmail.com> (raw)
In-Reply-To: <20250529080236.2552247-1-dapeng1.mi@linux.intel.com>
* Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
> When running perf_fuzzer on PTL, sometimes the below "unchecked MSR
> access error" is seen when accessing IA32_PMC_x_CFG_B MSRs.
>
> [ 55.611268] unchecked MSR access error: WRMSR to 0x1986 (tried to write 0x0000000200000001) at rIP: 0xffffffffac564b28 (native_write_msr+0x8/0x30)
> [ 55.611280] Call Trace:
> [ 55.611282] <TASK>
> [ 55.611284] ? intel_pmu_config_acr+0x87/0x160
> [ 55.611289] intel_pmu_enable_acr+0x6d/0x80
> [ 55.611291] intel_pmu_enable_event+0xce/0x460
> [ 55.611293] x86_pmu_start+0x78/0xb0
> [ 55.611297] x86_pmu_enable+0x218/0x3a0
> [ 55.611300] ? x86_pmu_enable+0x121/0x3a0
> [ 55.611302] perf_pmu_enable+0x40/0x50
> [ 55.611307] ctx_resched+0x19d/0x220
> [ 55.611309] __perf_install_in_context+0x284/0x2f0
> [ 55.611311] ? __pfx_remote_function+0x10/0x10
> [ 55.611314] remote_function+0x52/0x70
> [ 55.611317] ? __pfx_remote_function+0x10/0x10
> [ 55.611319] generic_exec_single+0x84/0x150
> [ 55.611323] smp_call_function_single+0xc5/0x1a0
> [ 55.611326] ? __pfx_remote_function+0x10/0x10
> [ 55.611329] perf_install_in_context+0xd1/0x1e0
> [ 55.611331] ? __pfx___perf_install_in_context+0x10/0x10
> [ 55.611333] __do_sys_perf_event_open+0xa76/0x1040
> [ 55.611336] __x64_sys_perf_event_open+0x26/0x30
> [ 55.611337] x64_sys_call+0x1d8e/0x20c0
> [ 55.611339] do_syscall_64+0x4f/0x120
> [ 55.611343] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> On PTL, GP counter 0 and 1 doesn't support auto counter reload feature,
> thus it would trigger a #GP when trying to write 1 on bit 0 of CFG_B MSR
> which requires to enable auto counter reload on GP counter 0.
>
> The root cause of causing this issue is the check for auto counter
> reload (ACR) counter mask from user space is incorrect in
> intel_pmu_acr_late_setup() helper. It leads to an invalid ACR counter
> mask from user space could be set into hw.config1 and then written into
> CFG_B MSRs and trigger the MSR access warning.
>
> e.g., User may create a perf event with ACR counter mask (config2=0xcb),
> and there is only 1 event created, so "cpuc->n_events" is 1.
>
> The correct check condition should be "i + idx >= cpuc->n_events"
> instead of "i + idx > cpuc->n_events" (it looks a typo). Otherwise,
> the counter mask would traverse twice and an invalid "cpuc->assign[1]"
> bit (bit 0) is set into hw.config1 and cause MSR accessing error.
>
> Besides, also check if the ACR counter mask corresponding events are
> ACR events. If not, filter out these counter mask. If a event is not a
> ACR event, it could be scheduled to an HW counter which doesn't support
> ACR. It's invalid to add their counter index in ACR counter mask.
>
> Furthermore, remove the WARN_ON_ONCE() since it's easily triggered as
> user could set any invalid ACR counter mask and the warning message
> could mislead users.
>
> Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
> arch/x86/events/intel/core.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 3a319cf6d364..8d046b1a237e 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2994,7 +2994,8 @@ static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
> if (event->group_leader != leader->group_leader)
> break;
> for_each_set_bit(idx, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
> - if (WARN_ON_ONCE(i + idx > cpuc->n_events))
> + if (i + idx >= cpuc->n_events ||
> + !is_acr_event_group(cpuc->event_list[i + idx]))
> return;
Is this a normal condition?
- If it's normal then the 'return' is destructive, isn't it? Shouldn't
it be a 'break', if this condition is legitimate?
- If it's not normal, then the WARN_ON_ONCE() was justified, right?
Thanks,
Ingo
next prev parent reply other threads:[~2025-05-31 8:01 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-29 8:02 [PATCH 1/2] perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error Dapeng Mi
2025-05-29 8:02 ` [PATCH 2/2] perf/x86/intel: Fix wrong index calculation in intel_pmu_config_acr() Dapeng Mi
2025-05-29 13:55 ` Liang, Kan
2025-05-31 8:25 ` [tip: perf/urgent] perf/x86/intel: Fix incorrect MSR index calculations " tip-bot2 for Dapeng Mi
2025-05-29 13:54 ` [PATCH 1/2] perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error Liang, Kan
2025-05-31 8:01 ` Ingo Molnar [this message]
2025-06-03 2:14 ` Mi, Dapeng
2025-06-19 1:11 ` Mi, Dapeng
2025-07-11 2:40 ` Mi, Dapeng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aDq3S9lwO1YadCKT@gmail.com \
--to=mingo@kernel.org \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=ak@linux.intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=dapeng1.mi@intel.com \
--cc=dapeng1.mi@linux.intel.com \
--cc=eranian@google.com \
--cc=irogers@google.com \
--cc=kan.liang@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.