All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Leo Yan" <leo.yan@arm.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH] perf/core: Use AUX pause/resume for throttling
Date: Thu, 14 May 2026 21:10:48 +0000	[thread overview]
Message-ID: <20260514211048.BFB21C2BCB3@smtp.kernel.org> (raw)
In-Reply-To: <20260514-aux-throttle-fix-v1-1-92bdc84f5922@arm.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] The event->hw.aux_paused flag is permanently leaked when an event is unthrottled during a context switch, leaving the event paused forever.
- [High] Throttling overwrites and breaks the user-configured aux-action=pause state, overriding user intent.
- [High] Throttle requests from NMIs can be silently dropped due to the recursion guard in perf_event_aux_pause().
- [High] Throttling silently fails to stop the PMU if the event does not have a mapped ring buffer.
- [High] The commit restricts its fix to PMUs with PERF_PMU_CAP_AUX_PAUSE, leaving other AUX PMUs broken and vulnerable to the same WARN_ON_ONCE(nest).
--

commit 1ca7736295ce051e131ce2cca7c64a9e4d034b68
Author: Leo Yan <leo.yan@arm.com>

perf/core: Use AUX pause/resume for throttling

Since commit 9734e25fbf5a ("perf: Fix the throttle logic for a group"), perf
throttles events in a group with pmu->stop(event, 0) and unthrottles them
with pmu->start(event, 0). For AUX trace PMUs, these callbacks also manage
AUX output state, which can trigger a kernel warning. This commit uses
perf_event_aux_pause() for throttling when the PMU supports it.

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 7935d5663944e..33d049b9cccfa 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c

[ ... ]

> @@ -2763,8 +2764,14 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
>  		return;
>  
>  	event->hw.interrupts = 0;
> -	if (start)
> -		event->pmu->start(event, 0);
> +
> +	if (start) {
> +		if (event->pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE)
> +			perf_event_aux_pause(event, false);

If a task is scheduled out, the throttled event is stopped. When scheduled
back in, event_sched_in() detects the throttle state and calls
perf_event_unthrottle(event, false). Because start is false, the call to
perf_event_aux_pause(event, false) is skipped, meaning event->hw.aux_paused
is never cleared.

When event_sched_in() subsequently calls event->pmu->add(event, PERF_EF_START),
PMU drivers check the leaked event->hw.aux_paused flag. Seeing it as 1, does
this cause a regression where the hardware starts in a paused state and never
resumes tracing?

Additionally, if a user or BPF program explicitly pauses an AUX event (e.g.,
via aux-action=pause), event->hw.aux_paused is set to 1. If a sibling event
overflows, the kernel throttles the group.

When the group is subsequently unthrottled, this calls
perf_event_aux_pause(event, false) which clears hw.aux_paused and resumes the
PMU. Could this unconditionally override the user's explicit pause request,
starting tracing when it was intentionally halted?

> +		else
> +			event->pmu->start(event, 0);

This isn't a bug introduced by this patch, but doesn't restricting this fix to
PMUs with PERF_PMU_CAP_AUX_PAUSE leave other AUX PMUs broken?

They still fall back to the original event->pmu->stop(event, 0) logic and will
suffer from the exact same unmatched perf_aux_output_begin() cycle and
subsequent kernel warning when unthrottled.

> +	}
> +
>  	if (event == event->group_leader)
>  		perf_log_throttle(event, 1);
>  }

[ ... ]

> @@ -2775,7 +2782,12 @@ static void perf_event_throttle(struct perf_event *event)
>  		return;
>  
>  	event->hw.interrupts = MAX_INTERRUPTS;
> -	event->pmu->stop(event, 0);
> +
> +	if (event->pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE)
> +		perf_event_aux_pause(event, true);

If an event group is unthrottled in timer context (e.g., sched_tick()) and is
executing perf_event_aux_pause(), an NMI from a sibling event overflow can
interrupt it.

The NMI handler attempts to throttle the group and calls
perf_event_aux_pause(). Looking at perf_event_aux_pause(), it has a recursion
guard:

    if (READ_ONCE(rb->aux_in_pause_resume))
        break;

Since this breaks out early without pausing the hardware, and the fallback to
pmu->stop() was removed here, does the hardware event continue running despite
hw.interrupts == MAX_INTERRUPTS? Could this defeat the throttling mechanism
and cause NMI storms?

Furthermore, if an AUX event is active but has no mapped ring buffer
(event->rb == NULL), perf_event_aux_pause() immediately returns early.

Since the unconditional call to event->pmu->stop(event, 0) was removed, does
this silently fail to stop the PMU if there's no mapped ring buffer?

> +	else
> +		event->pmu->stop(event, 0);
> +
>  	if (event == event->group_leader)
>  		perf_log_throttle(event, 0);
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260514-aux-throttle-fix-v1-1-92bdc84f5922@arm.com?part=1

      reply	other threads:[~2026-05-14 21:10 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-14 15:06 [PATCH] perf/core: Use AUX pause/resume for throttling Leo Yan
2026-05-14 21:10 ` sashiko-bot [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260514211048.BFB21C2BCB3@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=leo.yan@arm.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.