From: Breno Leitao <leitao@debian.org>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
James Clark <james.clark@linaro.org>,
Thomas Gleixner <tglx@kernel.org>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
kernel-team@meta.com, Breno Leitao <leitao@debian.org>
Subject: [PATCH] perf/x86: Restore event pointer setup in x86_pmu_start()
Date: Mon, 09 Mar 2026 07:40:56 -0700 [thread overview]
Message-ID: <20260309-perf-v1-1-601ffb531893@debian.org> (raw)
A production AMD EPYC system crashed with a NULL pointer dereference
in the PMU NMI handler:
BUG: kernel NULL pointer dereference, address: 0000000000000198
RIP: x86_perf_event_update+0xc/0xa0
Call Trace:
<NMI>
amd_pmu_v2_handle_irq+0x1a6/0x390
perf_event_nmi_handler+0x24/0x40
The faulting instruction is `cmpq $0x0, 0x198(%rdi)` with RDI=0,
corresponding to the `if (unlikely(!hwc->event_base))` check in
x86_perf_event_update() where hwc = &event->hw and event is NULL.
drgn inspection of the vmcore on CPU 106 showed a mismatch between
cpuc->active_mask and cpuc->events[]:
active_mask: 0x1e (bits 1, 2, 3, 4)
events[1]: 0xff1100136cbd4f38 (valid)
events[2]: 0x0 (NULL, but active_mask bit 2 set)
events[3]: 0xff1100076fd2cf38 (valid)
events[4]: 0xff1100079e990a90 (valid)
The event that should occupy events[2] was found in event_list[2]
with hw.idx=2 and hw.state=0x0, confirming x86_pmu_start() had run
(which clears hw.state and sets active_mask) but events[2] was
never populated.
Another event (event_list[0]) had hw.state=0x7 (STOPPED|UPTODATE|ARCH),
showing it was stopped when the PMU rescheduled events, confirming the
throttle-then-reschedule sequence occurred.
The root cause is commit 7e772a93eb61 ("perf/x86: Fix NULL event access
and potential PEBS record loss") which moved the cpuc->events[idx]
assignment out of x86_pmu_start() and into x86_pmu_enable(). This
broke any path that calls pmu->start() without going through
x86_pmu_enable() -- specifically the unthrottle path:
perf_adjust_freq_unthr_events()
-> perf_event_unthrottle_group()
-> perf_event_unthrottle()
-> event->pmu->start(event, 0)
-> x86_pmu_start() // sets active_mask but not events[]
The race sequence is:
1. A group of perf events overflows, triggering group throttle via
perf_event_throttle_group(). All events are stopped: active_mask
bits cleared, events[] preserved (x86_pmu_stop no longer clears
events[] after commit 7e772a93eb61).
2. While still throttled (PERF_HES_STOPPED), x86_pmu_enable() runs
due to other scheduling activity. Stopped events that need to
move counters get PERF_HES_ARCH set and events[old_idx] cleared.
In step 2 of x86_pmu_enable(), PERF_HES_ARCH causes these events
to be skipped -- events[new_idx] is never set.
3. The timer tick unthrottles the group via pmu->start(). Since
commit 7e772a93eb61 removed the events[] assignment from
x86_pmu_start(), active_mask[new_idx] is set but events[new_idx]
remains NULL.
4. A PMC overflow NMI fires. The handler iterates active counters,
finds active_mask[2] set, reads events[2] which is NULL, and
crashes dereferencing it.
Restore cpuc->events[idx] = event in x86_pmu_start() so that every
caller of pmu->start() correctly populates events[] before setting
active_mask. This does not reintroduce the PEBS issue that commit
7e772a93eb61 fixed, because that fix also moved the events[] = NULL
clearing from x86_pmu_stop() to x86_pmu_del() -- throttle/unthrottle
cycles no longer clear events[].
Fixes: 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss")
Signed-off-by: Breno Leitao <leitao@debian.org>
---
arch/x86/events/core.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 03ce1bc7ef2ea..fd82d1427b335 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1546,6 +1546,11 @@ static void x86_pmu_start(struct perf_event *event, int flags)
event->hw.state = 0;
+ /*
+ * Ensure events[idx] is set before active_mask, so NMI handlers
+ * never see an active counter with a NULL event pointer.
+ */
+ cpuc->events[idx] = event;
__set_bit(idx, cpuc->active_mask);
static_call(x86_pmu_enable)(event);
perf_event_update_userpage(event);
---
base-commit: 0bcac7b11262557c990da1ac564d45777eb6b005
change-id: 20260309-perf-fd32da0317a8
Best regards,
--
Breno Leitao <leitao@debian.org>
next reply other threads:[~2026-03-09 14:41 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-09 14:40 Breno Leitao [this message]
2026-03-09 16:38 ` [PATCH] perf/x86: Restore event pointer setup in x86_pmu_start() Peter Zijlstra
2026-03-09 17:00 ` Breno Leitao
2026-03-10 1:45 ` Mi, Dapeng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260309-perf-v1-1-601ffb531893@debian.org \
--to=leitao@debian.org \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=bp@alien8.de \
--cc=dapeng1.mi@linux.intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=hpa@zytor.com \
--cc=irogers@google.com \
--cc=james.clark@linaro.org \
--cc=jolsa@kernel.org \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox