Linux Perf Users
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Leo Yan" <leo.yan@arm.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH] perf/core: Use AUX pause/resume for throttling
Date: Thu, 14 May 2026 21:10:48 +0000	[thread overview]
Message-ID: <20260514211048.BFB21C2BCB3@smtp.kernel.org> (raw)
In-Reply-To: <20260514-aux-throttle-fix-v1-1-92bdc84f5922@arm.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] The event->hw.aux_paused flag is permanently leaked when an event is unthrottled during a context switch, leaving the event paused forever.
- [High] Throttling overwrites and breaks the user-configured aux-action=pause state, overriding user intent.
- [High] Throttle requests from NMIs can be silently dropped due to the recursion guard in perf_event_aux_pause().
- [High] Throttling silently fails to stop the PMU if the event does not have a mapped ring buffer.
- [High] The commit restricts its fix to PMUs with PERF_PMU_CAP_AUX_PAUSE, leaving other AUX PMUs broken and vulnerable to the same WARN_ON_ONCE(nest).
--

commit 1ca7736295ce051e131ce2cca7c64a9e4d034b68
Author: Leo Yan <leo.yan@arm.com>

perf/core: Use AUX pause/resume for throttling

Since commit 9734e25fbf5a ("perf: Fix the throttle logic for a group"), perf
throttles events in a group with pmu->stop(event, 0) and unthrottles them
with pmu->start(event, 0). For AUX trace PMUs, these callbacks also manage
AUX output state, which can trigger a kernel warning. This commit uses
perf_event_aux_pause() for throttling when the PMU supports it.

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 7935d5663944e..33d049b9cccfa 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c

[ ... ]

> @@ -2763,8 +2764,14 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
>  		return;
>  
>  	event->hw.interrupts = 0;
> -	if (start)
> -		event->pmu->start(event, 0);
> +
> +	if (start) {
> +		if (event->pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE)
> +			perf_event_aux_pause(event, false);

If a task is scheduled out, the throttled event is stopped. When scheduled
back in, event_sched_in() detects the throttle state and calls
perf_event_unthrottle(event, false). Because start is false, the call to
perf_event_aux_pause(event, false) is skipped, meaning event->hw.aux_paused
is never cleared.

When event_sched_in() subsequently calls event->pmu->add(event, PERF_EF_START),
PMU drivers check the leaked event->hw.aux_paused flag. Seeing it as 1, does
this cause a regression where the hardware starts in a paused state and never
resumes tracing?

Additionally, if a user or BPF program explicitly pauses an AUX event (e.g.,
via aux-action=pause), event->hw.aux_paused is set to 1. If a sibling event
overflows, the kernel throttles the group.

When the group is subsequently unthrottled, this calls
perf_event_aux_pause(event, false) which clears hw.aux_paused and resumes the
PMU. Could this unconditionally override the user's explicit pause request,
starting tracing when it was intentionally halted?

> +		else
> +			event->pmu->start(event, 0);

This isn't a bug introduced by this patch, but doesn't restricting this fix to
PMUs with PERF_PMU_CAP_AUX_PAUSE leave other AUX PMUs broken?

They still fall back to the original event->pmu->stop(event, 0) logic and will
suffer from the exact same unmatched perf_aux_output_begin() cycle and
subsequent kernel warning when unthrottled.

> +	}
> +
>  	if (event == event->group_leader)
>  		perf_log_throttle(event, 1);
>  }

[ ... ]

> @@ -2775,7 +2782,12 @@ static void perf_event_throttle(struct perf_event *event)
>  		return;
>  
>  	event->hw.interrupts = MAX_INTERRUPTS;
> -	event->pmu->stop(event, 0);
> +
> +	if (event->pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE)
> +		perf_event_aux_pause(event, true);

If an event group is unthrottled in timer context (e.g., sched_tick()) and is
executing perf_event_aux_pause(), an NMI from a sibling event overflow can
interrupt it.

The NMI handler attempts to throttle the group and calls
perf_event_aux_pause(). Looking at perf_event_aux_pause(), it has a recursion
guard:

    if (READ_ONCE(rb->aux_in_pause_resume))
        break;

Since this breaks out early without pausing the hardware, and the fallback to
pmu->stop() was removed here, does the hardware event continue running despite
hw.interrupts == MAX_INTERRUPTS? Could this defeat the throttling mechanism
and cause NMI storms?

Furthermore, if an AUX event is active but has no mapped ring buffer
(event->rb == NULL), perf_event_aux_pause() immediately returns early.

Since the unconditional call to event->pmu->stop(event, 0) was removed, does
this silently fail to stop the PMU if there's no mapped ring buffer?

> +	else
> +		event->pmu->stop(event, 0);
> +
>  	if (event == event->group_leader)
>  		perf_log_throttle(event, 0);
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260514-aux-throttle-fix-v1-1-92bdc84f5922@arm.com?part=1

      reply	other threads:[~2026-05-14 21:10 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-14 15:06 [PATCH] perf/core: Use AUX pause/resume for throttling Leo Yan
2026-05-14 21:10 ` sashiko-bot [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260514211048.BFB21C2BCB3@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=leo.yan@arm.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox