From: sashiko-bot@kernel.org
To: "Leo Yan" <leo.yan@arm.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH] perf/core: Use AUX pause/resume for throttling
Date: Thu, 14 May 2026 21:10:48 +0000 [thread overview]
Message-ID: <20260514211048.BFB21C2BCB3@smtp.kernel.org> (raw)
In-Reply-To: <20260514-aux-throttle-fix-v1-1-92bdc84f5922@arm.com>
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] The event->hw.aux_paused flag is permanently leaked when an event is unthrottled during a context switch, leaving the event paused forever.
- [High] Throttling overwrites and breaks the user-configured aux-action=pause state, overriding user intent.
- [High] Throttle requests from NMIs can be silently dropped due to the recursion guard in perf_event_aux_pause().
- [High] Throttling silently fails to stop the PMU if the event does not have a mapped ring buffer.
- [High] The commit restricts its fix to PMUs with PERF_PMU_CAP_AUX_PAUSE, leaving other AUX PMUs broken and vulnerable to the same WARN_ON_ONCE(nest).
--
commit 1ca7736295ce051e131ce2cca7c64a9e4d034b68
Author: Leo Yan <leo.yan@arm.com>
perf/core: Use AUX pause/resume for throttling
Since commit 9734e25fbf5a ("perf: Fix the throttle logic for a group"), perf
throttles events in a group with pmu->stop(event, 0) and unthrottles them
with pmu->start(event, 0). For AUX trace PMUs, these callbacks also manage
AUX output state, which can trigger a kernel warning. This commit uses
perf_event_aux_pause() for throttling when the PMU supports it.
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 7935d5663944e..33d049b9cccfa 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
[ ... ]
> @@ -2763,8 +2764,14 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
> return;
>
> event->hw.interrupts = 0;
> - if (start)
> - event->pmu->start(event, 0);
> +
> + if (start) {
> + if (event->pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE)
> + perf_event_aux_pause(event, false);
If a task is scheduled out, the throttled event is stopped. When scheduled
back in, event_sched_in() detects the throttle state and calls
perf_event_unthrottle(event, false). Because start is false, the call to
perf_event_aux_pause(event, false) is skipped, meaning event->hw.aux_paused
is never cleared.
When event_sched_in() subsequently calls event->pmu->add(event, PERF_EF_START),
PMU drivers check the leaked event->hw.aux_paused flag. Seeing it as 1, does
this cause a regression where the hardware starts in a paused state and never
resumes tracing?
Additionally, if a user or BPF program explicitly pauses an AUX event (e.g.,
via aux-action=pause), event->hw.aux_paused is set to 1. If a sibling event
overflows, the kernel throttles the group.
When the group is subsequently unthrottled, this calls
perf_event_aux_pause(event, false) which clears hw.aux_paused and resumes the
PMU. Could this unconditionally override the user's explicit pause request,
starting tracing when it was intentionally halted?
> + else
> + event->pmu->start(event, 0);
This isn't a bug introduced by this patch, but doesn't restricting this fix to
PMUs with PERF_PMU_CAP_AUX_PAUSE leave other AUX PMUs broken?
They still fall back to the original event->pmu->stop(event, 0) logic and will
suffer from the exact same unmatched perf_aux_output_begin() cycle and
subsequent kernel warning when unthrottled.
> + }
> +
> if (event == event->group_leader)
> perf_log_throttle(event, 1);
> }
[ ... ]
> @@ -2775,7 +2782,12 @@ static void perf_event_throttle(struct perf_event *event)
> return;
>
> event->hw.interrupts = MAX_INTERRUPTS;
> - event->pmu->stop(event, 0);
> +
> + if (event->pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE)
> + perf_event_aux_pause(event, true);
If an event group is unthrottled in timer context (e.g., sched_tick()) and is
executing perf_event_aux_pause(), an NMI from a sibling event overflow can
interrupt it.
The NMI handler attempts to throttle the group and calls
perf_event_aux_pause(). Looking at perf_event_aux_pause(), it has a recursion
guard:
if (READ_ONCE(rb->aux_in_pause_resume))
break;
Since this breaks out early without pausing the hardware, and the fallback to
pmu->stop() was removed here, does the hardware event continue running despite
hw.interrupts == MAX_INTERRUPTS? Could this defeat the throttling mechanism
and cause NMI storms?
Furthermore, if an AUX event is active but has no mapped ring buffer
(event->rb == NULL), perf_event_aux_pause() immediately returns early.
Since the unconditional call to event->pmu->stop(event, 0) was removed, does
this silently fail to stop the PMU if there's no mapped ring buffer?
> + else
> + event->pmu->stop(event, 0);
> +
> if (event == event->group_leader)
> perf_log_throttle(event, 0);
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260514-aux-throttle-fix-v1-1-92bdc84f5922@arm.com?part=1
prev parent reply other threads:[~2026-05-14 21:10 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-14 15:06 [PATCH] perf/core: Use AUX pause/resume for throttling Leo Yan
2026-05-14 21:10 ` sashiko-bot [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260514211048.BFB21C2BCB3@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=leo.yan@arm.com \
--cc=linux-perf-users@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox