* [PATCH] perf/x86/amd/uncore: Add group validation
@ 2026-06-23 10:49 Sandipan Das
2026-06-23 11:04 ` sashiko-bot
2026-06-23 17:16 ` Ian Rogers
0 siblings, 2 replies; 3+ messages in thread
From: Sandipan Das @ 2026-06-23 10:49 UTC (permalink / raw)
To: linux-perf-users, linux-kernel
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, James Clark, Thomas Gleixner,
Borislav Petkov, Dave Hansen, x86, H . Peter Anvin, Ravi Bangoria,
Ananth Narayan, Sandipan Das
The amd_uncore driver currently does not validate event groups and
allows creation of groups with more events than the number of available
hardware counters. Because of this, pmu->event_init() succeeds but
counter assignment fails later in pmu->add() which returns -EBUSY once
all counters are exhausted.
Address this by introducing group validation in the pmu->event_init()
path. Since the uncore PMUs have no per-event constraints and all
counters of a PMU are interchangeable, validation is reduced to just
counting the group members that target a PMU and ensuring that they fit
within the available set of counters.
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
arch/x86/events/amd/uncore.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index dbc00b6dd69e..9715cfb8cce3 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -265,6 +265,29 @@ static void amd_uncore_del(struct perf_event *event, int flags)
hwc->idx = -1;
}
+static bool amd_uncore_group_valid(struct perf_event *event)
+{
+ struct amd_uncore_pmu *pmu = event_to_amd_uncore_pmu(event);
+ struct perf_event *leader = event->group_leader;
+ struct perf_event *sibling;
+ int counters = 0;
+
+ /* Ignore software events as they do not occupy hardware counters */
+ if (leader->pmu == event->pmu && !is_software_event(leader))
+ counters++;
+
+ for_each_sibling_event(sibling, leader) {
+ if (sibling->pmu == event->pmu && !is_software_event(sibling))
+ counters++;
+ }
+
+ /*
+ * When pmu->event_init() is called, the event is yet to be linked to
+ * its leader's sibling list, so it is counted separately
+ */
+ return (counters + 1) <= pmu->num_counters;
+}
+
static int amd_uncore_event_init(struct perf_event *event)
{
struct amd_uncore_pmu *pmu;
@@ -282,6 +305,15 @@ static int amd_uncore_event_init(struct perf_event *event)
if (!ctx)
return -ENODEV;
+ /*
+ * Ensure that all events in a group can be scheduled together so that
+ * a failure can be reported at perf_event_open() time rather than
+ * silently at pmu->add() time when no free counter is found
+ */
+ if (event->group_leader != event &&
+ !amd_uncore_group_valid(event))
+ return -EINVAL;
+
/*
* NB and Last level cache counters (MSRs) are shared across all cores
* that share the same NB / Last level cache. On family 16h and below,
--
2.53.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] perf/x86/amd/uncore: Add group validation
2026-06-23 10:49 [PATCH] perf/x86/amd/uncore: Add group validation Sandipan Das
@ 2026-06-23 11:04 ` sashiko-bot
2026-06-23 17:16 ` Ian Rogers
1 sibling, 0 replies; 3+ messages in thread
From: sashiko-bot @ 2026-06-23 11:04 UTC (permalink / raw)
To: Sandipan Das; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Group validation incorrectly counts disabled or errored siblings.
--
commit a3401bf46a6300cee39613401a4f3512876a47f1
Author: Sandipan Das <sandipan.das@amd.com>
perf/x86/amd/uncore: Add group validation
This patch introduces group validation in the pmu->event_init() path for
the AMD uncore driver. It counts the hardware events targeting the PMU
in an event group and ensures they do not exceed the available hardware
counters, preventing failures later in pmu->add().
> diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
> index 9293ce50574da..15330cbab6a69 100644
> --- a/arch/x86/events/amd/uncore.c
> +++ b/arch/x86/events/amd/uncore.c
[ ... ]
> +static bool amd_uncore_group_valid(struct perf_event *event)
> +{
> + struct amd_uncore_pmu *pmu = event_to_amd_uncore_pmu(event);
> + struct perf_event *leader = event->group_leader;
> + struct perf_event *sibling;
> + int counters = 0;
> +
> + /* Ignore software events as they do not occupy hardware counters */
> + if (leader->pmu == event->pmu && !is_software_event(leader))
> + counters++;
> +
> + for_each_sibling_event(sibling, leader) {
> + if (sibling->pmu == event->pmu && !is_software_event(sibling))
> + counters++;
[Severity: Medium]
Is it possible that disabled or errored siblings are incorrectly counted here?
Users might create oversized groups and manually multiplex them by toggling
events on and off. If this loop unconditionally increments the hardware
counter usage for all siblings, could it reject valid requests from userspace?
Unlike the core x86 and Intel uncore PMU drivers, it appears this fails to
ignore events with state <= PERF_EVENT_STATE_OFF.
> + }
> +
> + /*
> + * When pmu->event_init() is called, the event is yet to be linked to
> + * its leader's sibling list, so it is counted separately
> + */
> + return (counters + 1) <= pmu->num_counters;
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/0f2cff983d79521e153d211ce412fedbc4ed942c.1782211518.git.sandipan.das@amd.com?part=1
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] perf/x86/amd/uncore: Add group validation
2026-06-23 10:49 [PATCH] perf/x86/amd/uncore: Add group validation Sandipan Das
2026-06-23 11:04 ` sashiko-bot
@ 2026-06-23 17:16 ` Ian Rogers
1 sibling, 0 replies; 3+ messages in thread
From: Ian Rogers @ 2026-06-23 17:16 UTC (permalink / raw)
To: Sandipan Das
Cc: linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, James Clark,
Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
H . Peter Anvin, Ravi Bangoria, Ananth Narayan
On Tue, Jun 23, 2026 at 3:49 AM Sandipan Das <sandipan.das@amd.com> wrote:
>
> The amd_uncore driver currently does not validate event groups and
> allows creation of groups with more events than the number of available
> hardware counters. Because of this, pmu->event_init() succeeds but
> counter assignment fails later in pmu->add() which returns -EBUSY once
> all counters are exhausted.
>
> Address this by introducing group validation in the pmu->event_init()
> path. Since the uncore PMUs have no per-event constraints and all
> counters of a PMU are interchangeable, validation is reduced to just
> counting the group members that target a PMU and ensuring that they fit
> within the available set of counters.
>
> Signed-off-by: Sandipan Das <sandipan.das@amd.com>
This is great Sandipan! Thanks for addressing this! I'd been wondering
if in the perf tool if we could test hardware PMUs for not supporting
failing at open properly. This is a problem for weak groups, as used
by metrics, because they try to group all events and then break the
group when the open fails. I'd observed that AMD uncore events
supposedly opened but then failed during reading. I suspect other PMUs
also suffer this.
Peter mentioned a behavior in the past: opening events in a group in a
disabled state, with more events than counters, and then the software
enables and disables events in the group to control counter
allocation. The perf tool doesn't currently utilize this behavior but
I think it explains some of the Sashiko feedback.
Would it be possible to get a Fixes tag for stable backports?
Thanks,
Ian
> ---
> arch/x86/events/amd/uncore.c | 32 ++++++++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
> index dbc00b6dd69e..9715cfb8cce3 100644
> --- a/arch/x86/events/amd/uncore.c
> +++ b/arch/x86/events/amd/uncore.c
> @@ -265,6 +265,29 @@ static void amd_uncore_del(struct perf_event *event, int flags)
> hwc->idx = -1;
> }
>
> +static bool amd_uncore_group_valid(struct perf_event *event)
> +{
> + struct amd_uncore_pmu *pmu = event_to_amd_uncore_pmu(event);
> + struct perf_event *leader = event->group_leader;
> + struct perf_event *sibling;
> + int counters = 0;
> +
> + /* Ignore software events as they do not occupy hardware counters */
> + if (leader->pmu == event->pmu && !is_software_event(leader))
> + counters++;
> +
> + for_each_sibling_event(sibling, leader) {
> + if (sibling->pmu == event->pmu && !is_software_event(sibling))
> + counters++;
> + }
> +
> + /*
> + * When pmu->event_init() is called, the event is yet to be linked to
> + * its leader's sibling list, so it is counted separately
> + */
> + return (counters + 1) <= pmu->num_counters;
> +}
> +
> static int amd_uncore_event_init(struct perf_event *event)
> {
> struct amd_uncore_pmu *pmu;
> @@ -282,6 +305,15 @@ static int amd_uncore_event_init(struct perf_event *event)
> if (!ctx)
> return -ENODEV;
>
> + /*
> + * Ensure that all events in a group can be scheduled together so that
> + * a failure can be reported at perf_event_open() time rather than
> + * silently at pmu->add() time when no free counter is found
> + */
> + if (event->group_leader != event &&
> + !amd_uncore_group_valid(event))
> + return -EINVAL;
> +
> /*
> * NB and Last level cache counters (MSRs) are shared across all cores
> * that share the same NB / Last level cache. On family 16h and below,
> --
> 2.53.0
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-06-23 17:16 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-23 10:49 [PATCH] perf/x86/amd/uncore: Add group validation Sandipan Das
2026-06-23 11:04 ` sashiko-bot
2026-06-23 17:16 ` Ian Rogers
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.