From: sashiko-bot@kernel.org
To: "Saurav Sachidanand" <sauravsc@amazon.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 4/4] perf: nvidia_t410_c2c: handle PERF_EF_UPDATE in stop
Date: Thu, 14 May 2026 17:34:32 +0000 [thread overview]
Message-ID: <20260514173433.B51DAC2BCB3@smtp.kernel.org> (raw)
In-Reply-To: <20260514010629.76558-4-sauravsc@amazon.com>
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [Critical] Calling nv_c2c_pmu_event_update() in nv_c2c_pmu_stop() causes double-counting and massive counter wrap-around corruption.
- [High] Logically stopping the event fails to exclude counts because the hardware counter continues running and start logic is unmodified.
- [High] nv_c2c_pmu_stop() incorrectly uses an early return on PERF_HES_STOPPED, permanently dropping final counter updates.
- [Medium] nv_c2c_pmu_stop() fails to set PERF_HES_UPTODATE after updating the event.
--
commit 1d56e58b1d1592eb3b08b91d9c3d09bd821ef6ad
Author: Saurav Sachidanand <sauravsc@amazon.com>
perf: nvidia_t410_c2c: handle PERF_EF_UPDATE in stop
This commit updates nv_c2c_pmu_stop() to read the final counter value when
called with PERF_EF_UPDATE, aiming to prevent lost counter deltas when the
perf core removes the event.
> diff --git a/drivers/perf/nvidia_t410_c2c_pmu.c b/drivers/perf/nvidia_t410_c2c_pmu.c
> index 662fa1bc833a5..39d0e661e4c5c 100644
> --- a/drivers/perf/nvidia_t410_c2c_pmu.c
> +++ b/drivers/perf/nvidia_t410_c2c_pmu.c
> @@ -394,6 +394,12 @@ static void nv_c2c_pmu_start(struct perf_event *event, int pmu_flags)
>
> static void nv_c2c_pmu_stop(struct perf_event *event, int pmu_flags)
> {
> + if (event->hw.state & PERF_HES_STOPPED)
> + return;
Does this early return permanently drop final counter updates?
If the core stops the event without requesting an update first (for example,
via pmu->stop(event, 0) during event throttling), the event is marked
PERF_HES_STOPPED.
When the core later attempts to drain the final unread counts by calling
pmu->stop(event, PERF_EF_UPDATE), the early return will trigger and the
final update will be entirely skipped.
> +
> + if (pmu_flags & PERF_EF_UPDATE)
> + nv_c2c_pmu_event_update(event);
Can this cause double-counting and massive counter wrap-around?
The perf core wraps event teardowns within perf_pmu_disable() and
perf_pmu_enable(). For this PMU, nv_c2c_pmu_disable() halts global
counters, calls nv_c2c_pmu_event_update() to accumulate final counts,
and then explicitly sets event->hw.prev_count to 0.
Subsequently, pmu->del() calls nv_c2c_pmu_stop() with PERF_EF_UPDATE, which
redundantly calls nv_c2c_pmu_event_update() again. Because prev_count was
just reset to 0 but the stopped hardware counter still retains its final
value V, the second update computes V - 0 = V and erroneously adds the full
lifetime count to event->count a second time.
Additionally, this second update overwrites prev_count with V. When the PMU
restarts, the physical counters clear to 0. The next update will read a
small new value W, calculate W - V, and cause a massive unsigned 64-bit
integer underflow.
Also, if the event is stopped logically without disabling the entire global
PMU, the hardware counter continues incrementing. Since nv_c2c_pmu_start()
is unmodified and does not synchronize prev_count to the new running
value, won't all counts accumulated while the event was logically stopped
still be incorrectly added to event->count upon the next update?
> +
> event->hw.state |= PERF_HES_STOPPED;
Should this also set PERF_HES_UPTODATE?
The standard PMU stop pattern usually sets PERF_HES_UPTODATE after an
update so the core knows the software counter is synchronized with the
hardware.
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260514010629.76558-1-sauravsc@amazon.com?part=4
next prev parent reply other threads:[~2026-05-14 17:34 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-14 1:06 [PATCH 1/4] perf: nvidia_t410_cmem_latency: fix cpuhp state leak on init failure Saurav Sachidanand
2026-05-14 1:06 ` [PATCH 2/4] perf: nvidia_t410_cmem_latency: handle PERF_EF_UPDATE in stop Saurav Sachidanand
2026-05-14 12:49 ` sashiko-bot
2026-05-14 15:59 ` Besar Wicaksono
2026-05-14 1:06 ` [PATCH 3/4] perf: nvidia_t410_c2c: fix cpuhp state leak on init failure Saurav Sachidanand
2026-05-14 15:57 ` Besar Wicaksono
2026-05-14 1:06 ` [PATCH 4/4] perf: nvidia_t410_c2c: handle PERF_EF_UPDATE in stop Saurav Sachidanand
2026-05-14 16:00 ` Besar Wicaksono
2026-05-14 17:34 ` sashiko-bot [this message]
2026-05-14 15:55 ` [PATCH 1/4] perf: nvidia_t410_cmem_latency: fix cpuhp state leak on init failure Besar Wicaksono
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260514173433.B51DAC2BCB3@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
--cc=sauravsc@amazon.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox