* [PATCH] perf/x86/amd/core: Always use the NMI latency mitigation
@ 2026-06-01 14:58 Sandipan Das
2026-06-01 15:13 ` sashiko-bot
0 siblings, 1 reply; 2+ messages in thread
From: Sandipan Das @ 2026-06-01 14:58 UTC (permalink / raw)
To: linux-perf-users, linux-kernel
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, James Clark, Thomas Gleixner,
Borislav Petkov, Dave Hansen, x86, H . Peter Anvin, Ravi Bangoria,
Ananth Narayan, Sandipan Das
Commit df4d29732fda ("perf/x86/amd: Change/fix NMI latency mitigation
to use a timestamp") fixed handling of late-arriving NMIs but limited
the mitigation to processors having X86_FEATURE_PERFCTR_CORE. However,
it is unclear if processors without this feature are also affected.
When Mediated vPMU is enabled on affected hardware, it is also possible
to bypass the fix inside KVM guests if X86_FEATURE_PERFCTR_CORE is
removed from the guest CPUID (e.g. using "-cpu host,-perfctr-core" with
QEMU). Hence, use the mitigation at all times.
Fixes: df4d29732fda ("perf/x86/amd: Change/fix NMI latency mitigation to use a timestamp")
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
arch/x86/events/amd/core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 6a4dfc9ecabe..6569048a8c1c 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -1414,12 +1414,12 @@ static int __init amd_core_pmu_init(void)
 	u64 even_ctr_mask = 0ULL;
 	int i;
 
-	if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
-		return 0;
-
 	/* Avoid calculating the value each time in the NMI handler */
 	perf_nmi_window = msecs_to_jiffies(100);
 
+	if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
+		return 0;
+
 	/*
 	 * If core performance counter extensions exists, we must use
 	 * MSR_F15H_PERF_CTL/MSR_F15H_PERF_CTR msrs. See also
--
2.53.0
* Re: [PATCH] perf/x86/amd/core: Always use the NMI latency mitigation
2026-06-01 14:58 [PATCH] perf/x86/amd/core: Always use the NMI latency mitigation Sandipan Das
@ 2026-06-01 15:13 ` sashiko-bot
0 siblings, 0 replies; 2+ messages in thread
From: sashiko-bot @ 2026-06-01 15:13 UTC (permalink / raw)
To: Sandipan Das; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Uninitialized `perf_nmi_tstamp` combined with `INITIAL_JIFFIES` causes the AMD PMU to silently swallow unhandled NMIs during the first ~5 minutes of system uptime on 32-bit x86.
--
commit 71c3691b558a0c49eaeaca25067c0e97960a98e6
Author: Sandipan Das <sandipan.das@amd.com>
perf/x86/amd/core: Always use the NMI latency mitigation
This commit unconditionally enables the NMI latency mitigation by setting up
the perf_nmi_window during initialization. Previously, this was restricted
to processors with X86_FEATURE_PERFCTR_CORE.
> diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
> index 44656d2fb5550..ec0b74b9da4fa 100644
> --- a/arch/x86/events/amd/core.c
> +++ b/arch/x86/events/amd/core.c
> @@ -1412,12 +1412,12 @@ static int __init amd_core_pmu_init(void)
>  	u64 even_ctr_mask = 0ULL;
>  	int i;
>
> -	if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
> -		return 0;
> -
>  	/* Avoid calculating the value each time in the NMI handler */
>  	perf_nmi_window = msecs_to_jiffies(100);
[Severity: High]
This is a pre-existing issue, but does enabling this mitigation globally
expose 32-bit systems without PERFCTR_CORE to swallowed NMIs during
early boot?
The per-CPU variable perf_nmi_tstamp is zero-initialized:
arch/x86/events/amd/core.c:
static DEFINE_PER_CPU(unsigned long, perf_nmi_tstamp);
On 32-bit x86, jiffies starts at INITIAL_JIFFIES, which is -300 * HZ cast to
an unsigned 32-bit value, i.e. 300 * HZ ticks (5 minutes) short of wrapping
to 0. If an unhandled NMI occurs before any PMU event sets perf_nmi_tstamp,
amd_pmu_adjust_nmi_window() checks:
arch/x86/events/amd/core.c:amd_pmu_adjust_nmi_window() {
	...
	if (time_after(jiffies, this_cpu_read(perf_nmi_tstamp)))
		return NMI_DONE;

	return NMI_HANDLED;
}
time_after(a, b) is computed as (long)(b - a) < 0. With a 32-bit long,
0 - INITIAL_JIFFIES wraps to 300 * HZ, which is positive, so
time_after(INITIAL_JIFFIES, 0) evaluates to false for the first ~5 minutes
of uptime.
Could this cause the AMD PMU NMI handler to incorrectly return NMI_HANDLED
instead of NMI_DONE, silently dropping external hardware watchdog and PCI
error NMIs until jiffies wraps past 0?
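The arithmetic behind this concern can be checked with a small user-space
sketch. This is not kernel code: the model_* helpers and MODEL_HZ are
hypothetical names made up for illustration, fixed-width types stand in for
a 32-bit and a 64-bit unsigned long, and HZ = 250 is only an assumed tick
rate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick rate for illustration; the real HZ is a kernel config option. */
#define MODEL_HZ 250

/* Models INITIAL_JIFFIES: -300 * HZ converted to an unsigned 32-bit value. */
static uint32_t model_initial_jiffies(void)
{
	return (uint32_t)(-300 * MODEL_HZ);
}

/* Models time_after(a, b) where unsigned long is 32 bits wide. */
static bool model_time_after32(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Models time_after(a, b) where unsigned long is 64 bits wide. */
static bool model_time_after64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Models the unhandled-NMI branch quoted above: the NMI is claimed
 * (NMI_HANDLED) unless jiffies is already after the recorded timestamp.
 */
static bool model_nmi_claimed32(uint32_t jiffies_now, uint32_t tstamp)
{
	return !model_time_after32(jiffies_now, tstamp);
}

static bool model_nmi_claimed64(uint64_t jiffies_now, uint64_t tstamp)
{
	return !model_time_after64(jiffies_now, tstamp);
}
```

Under these assumptions, model_nmi_claimed32(model_initial_jiffies(), 0) is
true, so a zero timestamp claims the NMI at boot, and it stays true until
the 32-bit counter has wrapped past 0. The 64-bit variant is false for the
same inputs because the initial value is not close to the 64-bit wrap point.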
>
> +	if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
> +		return 0;
> +
--
Sashiko AI review · https://sashiko.dev/#/patchset/29a3c970da289ab8f24282933bdb36545c0403e8.1780325517.git.sandipan.das@amd.com?part=1