public inbox for linux-perf-users@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] perf stat: Fix crash on arm64
@ 2026-03-25 10:24 Breno Leitao
  2026-03-25 18:25 ` Ian Rogers
  2026-03-25 20:59 ` Leo Yan
  0 siblings, 2 replies; 9+ messages in thread
From: Breno Leitao @ 2026-03-25 10:24 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark
  Cc: linux-perf-users, linux-kernel, kernel-team, Denis Yaroshevskiy,
	Dmitry Ilvokhin, Breno Leitao

Perf stat is crashing on arm64 hosts with the following issue:

	# make -C tools/perf DEBUG=1
	# perf stat sleep 1
	perf: util/evsel.c:2034: get_group_fd: Assertion `!(!leader->core.fd)' failed.
	[1]    1220794 IOT instruction (core dumped)  ./perf stat

The sorting function introduced by commit a745c0831c15c ("perf stat:
Sort default events/metrics") compares events based on their individual
properties. This can cause events from different groups to be
interleaved, resulting in group members appearing before their leaders
in the sorted evlist.

When the iterator opens events in list order, a group member may be
processed before its leader has been opened.

For example, CPU_CYCLES (idx=32) with leader STALL_SLOT_BACKEND (idx=37)
could be sorted before its leader, causing the crash when CPU_CYCLES
tries to get its group fd from the not-yet-opened leader.

Fix this by comparing events based on their leader's attributes instead
of their own attributes when the events are in different groups. This
ensures all members of a group share the same sort key as their leader,
keeping groups together and guaranteeing leaders are opened before their
members.

Reported-by: Denis Yaroshevskiy <dyaroshev@meta.com>
Fixes: a745c0831c15c ("perf stat: Sort default events/metrics")
Tested-by: Dmitry Ilvokhin <d@ilvokhin.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
Cc; linux-arm-kernel@lists.infradead.org
---
Changes in v2:
- No changes from V1, just resending the exact same patch.
- Link to v1: https://patch.msgid.link/20260205-perf_stat-v1-1-e433b0c918af@debian.org
---
 tools/perf/builtin-stat.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 73c2ba7e30760..a97b9d0de3f58 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1917,25 +1917,33 @@ static int default_evlist_evsel_cmp(void *priv __maybe_unused,
 	const struct evsel *lhs = container_of(lhs_core, struct evsel, core);
 	const struct perf_evsel *rhs_core = container_of(r, struct perf_evsel, node);
 	const struct evsel *rhs = container_of(rhs_core, struct evsel, core);
+	const struct evsel *lhs_leader = evsel__leader(lhs);
+	const struct evsel *rhs_leader = evsel__leader(rhs);
 
-	if (evsel__leader(lhs) == evsel__leader(rhs)) {
+	if (lhs_leader == rhs_leader) {
 		/* Within the same group, respect the original order. */
 		return lhs_core->idx - rhs_core->idx;
 	}
 
+	/*
+	 * Compare using leader's attributes so that all members of a group
+	 * stay together. This ensures leaders are opened before their members.
+	 */
+
 	/* Sort default metrics evsels first, and default show events before those. */
-	if (lhs->default_metricgroup != rhs->default_metricgroup)
-		return lhs->default_metricgroup ? -1 : 1;
+	if (lhs_leader->default_metricgroup != rhs_leader->default_metricgroup)
+		return lhs_leader->default_metricgroup ? -1 : 1;
 
-	if (lhs->default_show_events != rhs->default_show_events)
-		return lhs->default_show_events ? -1 : 1;
+	if (lhs_leader->default_show_events != rhs_leader->default_show_events)
+		return lhs_leader->default_show_events ? -1 : 1;
 
 	/* Sort by PMU type (prefers legacy types first). */
-	if (lhs->pmu != rhs->pmu)
-		return lhs->pmu->type - rhs->pmu->type;
+	if (lhs_leader->pmu != rhs_leader->pmu)
+		return lhs_leader->pmu->type - rhs_leader->pmu->type;
 
-	/* Sort by name. */
-	return strcmp(evsel__name((struct evsel *)lhs), evsel__name((struct evsel *)rhs));
+	/* Sort by leader's name. */
+	return strcmp(evsel__name((struct evsel *)lhs_leader),
+		      evsel__name((struct evsel *)rhs_leader));
 }
 
 /*

---
base-commit: 85964cdcad0fac9a0eb7b87a0f9d88cc074b854c
change-id: 20260205-perf_stat-a0a2a37e21c5

Best regards,
--  
Breno Leitao <leitao@debian.org>


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] perf stat: Fix crash on arm64
  2026-03-25 10:24 [PATCH v2] perf stat: Fix crash on arm64 Breno Leitao
@ 2026-03-25 18:25 ` Ian Rogers
  2026-03-25 20:59 ` Leo Yan
  1 sibling, 0 replies; 9+ messages in thread
From: Ian Rogers @ 2026-03-25 18:25 UTC (permalink / raw)
  To: Breno Leitao
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Adrian Hunter, James Clark, linux-perf-users, linux-kernel,
	kernel-team, Denis Yaroshevskiy, Dmitry Ilvokhin

On Wed, Mar 25, 2026 at 3:25 AM Breno Leitao <leitao@debian.org> wrote:
>
> Perf stat is crashing on arm64 hosts with the following issue:
>
>         # make -C tools/perf DEBUG=1
>         # perf stat sleep 1
>         perf: util/evsel.c:2034: get_group_fd: Assertion `!(!leader->core.fd)' failed.
>         [1]    1220794 IOT instruction (core dumped)  ./perf stat
>
> The sorting function introduced by commit a745c0831c15c ("perf stat:
> Sort default events/metrics") compares events based on their individual
> properties. This can cause events from different groups to be
> interleaved, resulting in group members appearing before their leaders
> in the sorted evlist.
>
> When the iterator opens events in list order, a group member may be
> processed before its leader has been opened.
>
> For example, CPU_CYCLES (idx=32) with leader STALL_SLOT_BACKEND (idx=37)
> could be sorted before its leader, causing the crash when CPU_CYCLES
> tries to get its group fd from the not-yet-opened leader.
>
> Fix this by comparing events based on their leader's attributes instead
> of their own attributes when the events are in different groups. This
> ensures all members of a group share the same sort key as their leader,
> keeping groups together and guaranteeing leaders are opened before their
> members.
>
> Reported-by: Denis Yaroshevskiy <dyaroshev@meta.com>
> Fixes: a745c0831c15c ("perf stat: Sort default events/metrics")
> Tested-by: Dmitry Ilvokhin <d@ilvokhin.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>

Ah, this is the separate sorting for perf stat output and not the
parse-events sorting. This sorting happens after we've regrouped for
PMUs, etc. and so the invariants I expect shouldn't be broken by the
change. Fwiw, the Intel hybrid sorting isn't impacted:

Before:
```
$ perf stat -a sleep 1

 Performance counter stats for 'system wide':

            22,355      context-switches                 #    792.6
cs/sec  cs_per_second
         28,205.79 msec cpu-clock                        #     27.6
CPUs  CPUs_utilized
               477      cpu-migrations                   #     16.9
migrations/sec  migrations_per_second
               294      page-faults                      #     10.4
faults/sec  page_faults_per_second
         1,796,969      cpu_core/branch-misses/          #      2.0 %
branch_miss_rate
        90,819,614      cpu_core/branches/               #      3.2
M/sec  branch_frequency
       465,688,306      cpu_core/cpu-cycles/             #      0.0
GHz  cycles_frequency
       435,799,882      cpu_core/instructions/           #      0.9
instructions  insn_per_cycle
         2,239,373      cpu_atom/branch-misses/          #      4.6 %
branch_miss_rate         (49.78%)
        48,356,802      cpu_atom/branches/               #      1.7
M/sec  branch_frequency     (50.17%)
       477,205,378      cpu_atom/cpu-cycles/             #      0.0
GHz  cycles_frequency       (50.37%)
       236,677,090      cpu_atom/instructions/           #      0.5
instructions  insn_per_cycle  (50.35%)
             TopdownL1 (cpu_core)                        #      7.1 %
tma_bad_speculation
                                                         #     40.2 %
tma_frontend_bound
                                                         #     35.7 %
tma_backend_bound
                                                         #     17.1 %
tma_retiring
             TopdownL1 (cpu_atom)                        #     32.7 %
tma_backend_bound        (59.74%)
                                                         #     37.8 %
tma_frontend_bound       (59.54%)
                                                         #     17.2 %
tma_bad_speculation
                                                         #     12.3 %
tma_retiring             (59.62%)

       1.006767726 seconds time elapsed
```

After:
```
$ perf stat -a sleep 1

 Performance counter stats for 'system wide':

            21,329      context-switches                 #    758.5
cs/sec  cs_per_second
         28,120.76 msec cpu-clock                        #     27.7
CPUs  CPUs_utilized
               482      cpu-migrations                   #     17.1
migrations/sec  migrations_per_second
               217      page-faults                      #      7.7
faults/sec  page_faults_per_second
         1,606,877      cpu_core/branch-misses/          #      1.8 %
branch_miss_rate
        90,472,412      cpu_core/branches/               #      3.2
M/sec  branch_frequency
       459,566,033      cpu_core/cpu-cycles/             #      0.0
GHz  cycles_frequency
       482,577,042      cpu_core/instructions/           #      1.1
instructions  insn_per_cycle
         2,430,046      cpu_atom/branch-misses/          #      7.0 %
branch_miss_rate         (49.85%)
        35,031,345      cpu_atom/branches/               #      1.2
M/sec  branch_frequency     (50.20%)
       494,493,558      cpu_atom/cpu-cycles/             #      0.0
GHz  cycles_frequency       (50.22%)
       190,397,854      cpu_atom/instructions/           #      0.4
instructions  insn_per_cycle  (50.24%)
             TopdownL1 (cpu_core)                        #      6.6 %
tma_bad_speculation
                                                         #     35.9 %
tma_frontend_bound
                                                         #     38.2 %
tma_backend_bound
                                                         #     19.3 %
tma_retiring
             TopdownL1 (cpu_atom)                        #     30.2 %
tma_backend_bound        (59.74%)
                                                         #     38.8 %
tma_frontend_bound       (59.71%)
                                                         #     21.7 %
tma_bad_speculation
                                                         #      9.3 %
tma_retiring             (59.68%)

       1.005096844 seconds time elapsed
```

Tested-by: Ian Rogers <irogers@google.com>

Thanks,
Ian


> ---
> Cc; linux-arm-kernel@lists.infradead.org
> ---
> Changes in v2:
> - No changes from V1, just resending the exact same patch.
> - Link to v1: https://patch.msgid.link/20260205-perf_stat-v1-1-e433b0c918af@debian.org
> ---
>  tools/perf/builtin-stat.c | 26 +++++++++++++++++---------
>  1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 73c2ba7e30760..a97b9d0de3f58 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1917,25 +1917,33 @@ static int default_evlist_evsel_cmp(void *priv __maybe_unused,
>         const struct evsel *lhs = container_of(lhs_core, struct evsel, core);
>         const struct perf_evsel *rhs_core = container_of(r, struct perf_evsel, node);
>         const struct evsel *rhs = container_of(rhs_core, struct evsel, core);
> +       const struct evsel *lhs_leader = evsel__leader(lhs);
> +       const struct evsel *rhs_leader = evsel__leader(rhs);
>
> -       if (evsel__leader(lhs) == evsel__leader(rhs)) {
> +       if (lhs_leader == rhs_leader) {
>                 /* Within the same group, respect the original order. */
>                 return lhs_core->idx - rhs_core->idx;
>         }
>
> +       /*
> +        * Compare using leader's attributes so that all members of a group
> +        * stay together. This ensures leaders are opened before their members.
> +        */
> +
>         /* Sort default metrics evsels first, and default show events before those. */
> -       if (lhs->default_metricgroup != rhs->default_metricgroup)
> -               return lhs->default_metricgroup ? -1 : 1;
> +       if (lhs_leader->default_metricgroup != rhs_leader->default_metricgroup)
> +               return lhs_leader->default_metricgroup ? -1 : 1;
>
> -       if (lhs->default_show_events != rhs->default_show_events)
> -               return lhs->default_show_events ? -1 : 1;
> +       if (lhs_leader->default_show_events != rhs_leader->default_show_events)
> +               return lhs_leader->default_show_events ? -1 : 1;
>
>         /* Sort by PMU type (prefers legacy types first). */
> -       if (lhs->pmu != rhs->pmu)
> -               return lhs->pmu->type - rhs->pmu->type;
> +       if (lhs_leader->pmu != rhs_leader->pmu)
> +               return lhs_leader->pmu->type - rhs_leader->pmu->type;
>
> -       /* Sort by name. */
> -       return strcmp(evsel__name((struct evsel *)lhs), evsel__name((struct evsel *)rhs));
> +       /* Sort by leader's name. */
> +       return strcmp(evsel__name((struct evsel *)lhs_leader),
> +                     evsel__name((struct evsel *)rhs_leader));
>  }
>
>  /*
>
> ---
> base-commit: 85964cdcad0fac9a0eb7b87a0f9d88cc074b854c
> change-id: 20260205-perf_stat-a0a2a37e21c5
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] perf stat: Fix crash on arm64
  2026-03-25 10:24 [PATCH v2] perf stat: Fix crash on arm64 Breno Leitao
  2026-03-25 18:25 ` Ian Rogers
@ 2026-03-25 20:59 ` Leo Yan
  2026-03-25 22:27   ` Ian Rogers
  1 sibling, 1 reply; 9+ messages in thread
From: Leo Yan @ 2026-03-25 20:59 UTC (permalink / raw)
  To: Breno Leitao
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, linux-perf-users,
	linux-kernel, kernel-team, Denis Yaroshevskiy, Dmitry Ilvokhin

On Wed, Mar 25, 2026 at 03:24:30AM -0700, Breno Leitao wrote:
> Perf stat is crashing on arm64 hosts with the following issue:
> 
> 	# make -C tools/perf DEBUG=1
> 	# perf stat sleep 1
> 	perf: util/evsel.c:2034: get_group_fd: Assertion `!(!leader->core.fd)' failed.
> 	[1]    1220794 IOT instruction (core dumped)  ./perf stat
> 
> The sorting function introduced by commit a745c0831c15c ("perf stat:
> Sort default events/metrics") compares events based on their individual
> properties. This can cause events from different groups to be
> interleaved, resulting in group members appearing before their leaders
> in the sorted evlist.
> 
> When the iterator opens events in list order, a group member may be
> processed before its leader has been opened.
> 
> For example, CPU_CYCLES (idx=32) with leader STALL_SLOT_BACKEND (idx=37)
> could be sorted before its leader, causing the crash when CPU_CYCLES
> tries to get its group fd from the not-yet-opened leader.
> 
> Fix this by comparing events based on their leader's attributes instead
> of their own attributes when the events are in different groups. This
> ensures all members of a group share the same sort key as their leader,
> keeping groups together and guaranteeing leaders are opened before their
> members.
> 
> Reported-by: Denis Yaroshevskiy <dyaroshev@meta.com>
> Fixes: a745c0831c15c ("perf stat: Sort default events/metrics")
> Tested-by: Dmitry Ilvokhin <d@ilvokhin.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>

As Arnaldo mentioned in v1, I also found Segmentation fault when
testing this patch:

 Program received signal SIGSEGV, Segmentation fault.
 metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
 1662                    evsel = evlist__find_evsel(evlist, old_me->evsel->core.idx);
 (gdb) bt
 #0  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
 #1  0x0000aaaaaab05870 in add_default_events () at builtin-stat.c:2110
 #2  0x0000aaaaaab08300 in cmd_stat (argc=0, argv=0xfffffffffaa0) at builtin-stat.c:2838
 #3  0x0000aaaaaab40998 in run_builtin (p=0xaaaaaaf9d428 <commands+360>, argc=4, argv=0xfffffffffaa0) at perf.c:348
 #4  0x0000aaaaaab40c14 in handle_internal_command (argc=4, argv=0xfffffffffaa0) at perf.c:398
 #5  0x0000aaaaaab40ddc in run_argv (argcp=0xfffffffff8bc, argv=0xfffffffff8b0) at perf.c:442
 #6  0x0000aaaaaab41110 in main (argc=4, argv=0xfffffffffaa0) at perf.c:549

Last week I tested v1 and confirmed the issue was gone with the change,
I will dig a bit in tomorrow and share back if any finding.

Apologies for my lazy, as I should double check once Arnaldo
pointed out in v1.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] perf stat: Fix crash on arm64
  2026-03-25 20:59 ` Leo Yan
@ 2026-03-25 22:27   ` Ian Rogers
  2026-03-26 16:39     ` Leo Yan
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Rogers @ 2026-03-25 22:27 UTC (permalink / raw)
  To: Leo Yan
  Cc: Breno Leitao, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, James Clark,
	linux-perf-users, linux-kernel, kernel-team, Denis Yaroshevskiy,
	Dmitry Ilvokhin

On Wed, Mar 25, 2026 at 1:59 PM Leo Yan <leo.yan@arm.com> wrote:
>
> On Wed, Mar 25, 2026 at 03:24:30AM -0700, Breno Leitao wrote:
> > Perf stat is crashing on arm64 hosts with the following issue:
> >
> >       # make -C tools/perf DEBUG=1
> >       # perf stat sleep 1
> >       perf: util/evsel.c:2034: get_group_fd: Assertion `!(!leader->core.fd)' failed.
> >       [1]    1220794 IOT instruction (core dumped)  ./perf stat
> >
> > The sorting function introduced by commit a745c0831c15c ("perf stat:
> > Sort default events/metrics") compares events based on their individual
> > properties. This can cause events from different groups to be
> > interleaved, resulting in group members appearing before their leaders
> > in the sorted evlist.
> >
> > When the iterator opens events in list order, a group member may be
> > processed before its leader has been opened.
> >
> > For example, CPU_CYCLES (idx=32) with leader STALL_SLOT_BACKEND (idx=37)
> > could be sorted before its leader, causing the crash when CPU_CYCLES
> > tries to get its group fd from the not-yet-opened leader.
> >
> > Fix this by comparing events based on their leader's attributes instead
> > of their own attributes when the events are in different groups. This
> > ensures all members of a group share the same sort key as their leader,
> > keeping groups together and guaranteeing leaders are opened before their
> > members.
> >
> > Reported-by: Denis Yaroshevskiy <dyaroshev@meta.com>
> > Fixes: a745c0831c15c ("perf stat: Sort default events/metrics")
> > Tested-by: Dmitry Ilvokhin <d@ilvokhin.com>
> > Signed-off-by: Breno Leitao <leitao@debian.org>
>
> As Arnaldo mentioned in v1, I also found Segmentation fault when
> testing this patch:
>
>  Program received signal SIGSEGV, Segmentation fault.
>  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
>  1662                    evsel = evlist__find_evsel(evlist, old_me->evsel->core.idx);
>  (gdb) bt
>  #0  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
>  #1  0x0000aaaaaab05870 in add_default_events () at builtin-stat.c:2110
>  #2  0x0000aaaaaab08300 in cmd_stat (argc=0, argv=0xfffffffffaa0) at builtin-stat.c:2838
>  #3  0x0000aaaaaab40998 in run_builtin (p=0xaaaaaaf9d428 <commands+360>, argc=4, argv=0xfffffffffaa0) at perf.c:348
>  #4  0x0000aaaaaab40c14 in handle_internal_command (argc=4, argv=0xfffffffffaa0) at perf.c:398
>  #5  0x0000aaaaaab40ddc in run_argv (argcp=0xfffffffff8bc, argv=0xfffffffff8b0) at perf.c:442
>  #6  0x0000aaaaaab41110 in main (argc=4, argv=0xfffffffffaa0) at perf.c:549
>
> Last week I tested v1 and confirmed the issue was gone with the change,
> I will dig a bit in tomorrow and share back if any finding.
>
> Apologies for my lazy, as I should double check once Arnaldo
> pointed out in v1.


From the stack trace I don't see anything that should allow this.
Either the list of metrics is broken, or the struct metric_event's
evsel is NULL. But that should never happen as we wouldn't have a
metric at that point. The sorting shouldn't affect that. If you can
reproduce the issue, verbose logs may help.

Thanks,
Ian

> Thanks,
> Leo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] perf stat: Fix crash on arm64
  2026-03-25 22:27   ` Ian Rogers
@ 2026-03-26 16:39     ` Leo Yan
  2026-03-26 21:21       ` Ian Rogers
  2026-03-27 10:35       ` Breno Leitao
  0 siblings, 2 replies; 9+ messages in thread
From: Leo Yan @ 2026-03-26 16:39 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Breno Leitao, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, James Clark,
	linux-perf-users, linux-kernel, kernel-team, Denis Yaroshevskiy,
	Dmitry Ilvokhin

On Wed, Mar 25, 2026 at 03:27:05PM -0700, Ian Rogers wrote:

[...]

> > As Arnaldo mentioned in v1, I also found Segmentation fault when
> > testing this patch:
> >
> >  Program received signal SIGSEGV, Segmentation fault.
> >  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
> >  1662                    evsel = evlist__find_evsel(evlist, old_me->evsel->core.idx);
> >  (gdb) bt
> >  #0  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
> >  #1  0x0000aaaaaab05870 in add_default_events () at builtin-stat.c:2110
> >  #2  0x0000aaaaaab08300 in cmd_stat (argc=0, argv=0xfffffffffaa0) at builtin-stat.c:2838
> >  #3  0x0000aaaaaab40998 in run_builtin (p=0xaaaaaaf9d428 <commands+360>, argc=4, argv=0xfffffffffaa0) at perf.c:348
> >  #4  0x0000aaaaaab40c14 in handle_internal_command (argc=4, argv=0xfffffffffaa0) at perf.c:398
> >  #5  0x0000aaaaaab40ddc in run_argv (argcp=0xfffffffff8bc, argv=0xfffffffff8b0) at perf.c:442
> >  #6  0x0000aaaaaab41110 in main (argc=4, argv=0xfffffffffaa0) at perf.c:549
> >
> > Last week I tested v1 and confirmed the issue was gone with the change,
> > I will dig a bit in tomorrow and share back if any finding.
> >
> > Apologies for my lazy, as I should double check once Arnaldo
> > pointed out in v1.
> 
> 
> From the stack trace I don't see anything that should allow this.
> Either the list of metrics is broken, or the struct metric_event's
> evsel is NULL. But that should never happen as we wouldn't have a
> metric at that point. The sorting shouldn't affect that. If you can
> reproduce the issue, verbose logs may help.

I believe I encountered a different issue, which is irrelevant to this
patch.  So Breno's patch is fine for me.

On one of my board, I can see the log:

  Events in 'frontend_bound' fully contained within 'retiring'
  Events in 'bad_speculation' fully contained within 'retiring'
  Events in 'backend_bound' fully contained within 'retiring'

Then, in parse_groups(), it selects fully contained list:

  list_for_each_entry(n, &metric_list, nd) {
      ...
      if (expr__subset_of_ids(n->pctx, m->pctx)) {
          metric_evlist = n->evlist;
          break;
      }
  }

  // As metric_evlist has been set, so below does not run
  if (!metric_evlist) {
        ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
                        m->group_events, tool_events, &m->evlist);
        if (ret)
                goto out;

        metric_evlist = m->evlist;
  }

The problem is it skips to call parse_ids() and m->pctx is not
initialized.  This leads to pick_display_evsel() returns NULL and
pass NULL pointer to metricgroup__lookup().   In the end, the NULL
pointer is consumed in metricgroup__copy_metric_events().

How about below change ?  I tested it which can dismiss the issue.

Subject: [PATCH] perf metricgroup: Fix segment fault in
 metricgroup__copy_metric_events()

In parse_groups(), when find a event is fully contained by a previous
event, it skips to call parse_ids(), as a result, m->pctx->ids is not
initialized.  Then, setup_metric_events() returns an empty metric
events, pick_display_evsel() consumes the returned metric_events and
feeds to metricgroup__lookup() with passing "evsel = NULL".

metricgroup__copy_metric_events() access old_me->evsel->core.idx, due
to the evsel is set to NULL in metricgroup__lookup(), it causes
segment fault.

This commit corrects the fully contained case to allow it to parse IDs,
thus this can effectively avoid setup NULL pointer in
metricgroup__lookup().

Fixes: 5ecd5a0c7d1c ("perf metrics: Modify setup and deduplication")
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/metricgroup.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 7e39d469111b..528a6462876b 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -1431,7 +1431,7 @@ static int parse_groups(struct evlist *perf_evlist,
 	list_for_each_entry(m, &metric_list, nd) {
 		struct metric_event *me;
 		struct evsel **metric_events;
-		struct evlist *metric_evlist = NULL;
+		struct evlist *metric_evlist = NULL, *contained = NULL;
 		struct metric *n;
 		struct metric_expr *expr;
 
@@ -1463,19 +1463,18 @@ static int parse_groups(struct evlist *perf_evlist,
 				if (expr__subset_of_ids(n->pctx, m->pctx)) {
 					pr_debug("Events in '%s' fully contained within '%s'\n",
 						 m->metric_name, n->metric_name);
-					metric_evlist = n->evlist;
+					contained = n->evlist;
 					break;
 				}
-
 			}
 		}
 		if (!metric_evlist) {
+			metric_evlist = contained ? contained : m->evlist;
+
 			ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
-					m->group_events, tool_events, &m->evlist);
+					m->group_events, tool_events, &metric_evlist);
 			if (ret)
 				goto out;
-
-			metric_evlist = m->evlist;
 		}
 		ret = setup_metric_events(fake_pmu ? "all" : m->pmu, m->pctx->ids,
 					  metric_evlist, &metric_events);
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] perf stat: Fix crash on arm64
  2026-03-26 16:39     ` Leo Yan
@ 2026-03-26 21:21       ` Ian Rogers
  2026-03-27 17:57         ` Leo Yan
  2026-03-27 10:35       ` Breno Leitao
  1 sibling, 1 reply; 9+ messages in thread
From: Ian Rogers @ 2026-03-26 21:21 UTC (permalink / raw)
  To: Leo Yan
  Cc: Breno Leitao, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, James Clark,
	linux-perf-users, linux-kernel, kernel-team, Denis Yaroshevskiy,
	Dmitry Ilvokhin

On Thu, Mar 26, 2026 at 9:39 AM Leo Yan <leo.yan@arm.com> wrote:
>
> On Wed, Mar 25, 2026 at 03:27:05PM -0700, Ian Rogers wrote:
>
> [...]
>
> > > As Arnaldo mentioned in v1, I also found Segmentation fault when
> > > testing this patch:
> > >
> > >  Program received signal SIGSEGV, Segmentation fault.
> > >  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
> > >  1662                    evsel = evlist__find_evsel(evlist, old_me->evsel->core.idx);
> > >  (gdb) bt
> > >  #0  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
> > >  #1  0x0000aaaaaab05870 in add_default_events () at builtin-stat.c:2110
> > >  #2  0x0000aaaaaab08300 in cmd_stat (argc=0, argv=0xfffffffffaa0) at builtin-stat.c:2838
> > >  #3  0x0000aaaaaab40998 in run_builtin (p=0xaaaaaaf9d428 <commands+360>, argc=4, argv=0xfffffffffaa0) at perf.c:348
> > >  #4  0x0000aaaaaab40c14 in handle_internal_command (argc=4, argv=0xfffffffffaa0) at perf.c:398
> > >  #5  0x0000aaaaaab40ddc in run_argv (argcp=0xfffffffff8bc, argv=0xfffffffff8b0) at perf.c:442
> > >  #6  0x0000aaaaaab41110 in main (argc=4, argv=0xfffffffffaa0) at perf.c:549
> > >
> > > Last week I tested v1 and confirmed the issue was gone with the change,
> > > I will dig a bit in tomorrow and share back if any finding.
> > >
> > > Apologies for my lazy, as I should double check once Arnaldo
> > > pointed out in v1.
> >
> >
> > From the stack trace I don't see anything that should allow this.
> > Either the list of metrics is broken, or the struct metric_event's
> > evsel is NULL. But that should never happen as we wouldn't have a
> > metric at that point. The sorting shouldn't affect that. If you can
> > reproduce the issue, verbose logs may help.
>
> I believe I encountered a different issue, which is irrelevant to this
> patch.  So Breno's patch is fine for me.
>
> On one of my board, I can see the log:
>
>   Events in 'frontend_bound' fully contained within 'retiring'
>   Events in 'bad_speculation' fully contained within 'retiring'
>   Events in 'backend_bound' fully contained within 'retiring'

I looked at nvidia/t410/metrics.json but I'm not clear on where the
issue is coming from.

>
> Then, in parse_groups(), it selects fully contained list:
>
>   list_for_each_entry(n, &metric_list, nd) {
>       ...
>       if (expr__subset_of_ids(n->pctx, m->pctx)) {
>           metric_evlist = n->evlist;
>           break;
>       }
>   }
>
>   // As metric_evlist has been set, so below does not run
>   if (!metric_evlist) {
>         ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
>                         m->group_events, tool_events, &m->evlist);
>         if (ret)
>                 goto out;
>
>         metric_evlist = m->evlist;
>   }
>
> The problem is it skips to call parse_ids() and m->pctx is not
> initialized.  This leads to pick_display_evsel() returns NULL and
> pass NULL pointer to metricgroup__lookup().   In the end, the NULL
> pointer is consumed in metricgroup__copy_metric_events().
>
> How about below change ?  I tested it which can dismiss the issue.
>
> Subject: [PATCH] perf metricgroup: Fix segment fault in
>  metricgroup__copy_metric_events()
>
> In parse_groups(), when find a event is fully contained by a previous
> event, it skips to call parse_ids(), as a result, m->pctx->ids is not
> initialized.  Then, setup_metric_events() returns an empty metric
> events, pick_display_evsel() consumes the returned metric_events and
> feeds to metricgroup__lookup() with passing "evsel = NULL".

Fully contained groups exist on x86, why isn't this problem breaking
whenever this happens? Stepping through "contained" examples, I see
that the ids aren't initialized. I think something else must be
happening.

> metricgroup__copy_metric_events() access old_me->evsel->core.idx, due
> to the evsel is set to NULL in metricgroup__lookup(), it causes
> segment fault.
>
> This commit corrects the fully contained case to allow it to parse IDs,
> thus this can effectively avoid setup NULL pointer in
> metricgroup__lookup().
>
> Fixes: 5ecd5a0c7d1c ("perf metrics: Modify setup and deduplication")
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  tools/perf/util/metricgroup.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index 7e39d469111b..528a6462876b 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -1431,7 +1431,7 @@ static int parse_groups(struct evlist *perf_evlist,
>         list_for_each_entry(m, &metric_list, nd) {
>                 struct metric_event *me;
>                 struct evsel **metric_events;
> -               struct evlist *metric_evlist = NULL;
> +               struct evlist *metric_evlist = NULL, *contained = NULL;
>                 struct metric *n;
>                 struct metric_expr *expr;
>
> @@ -1463,19 +1463,18 @@ static int parse_groups(struct evlist *perf_evlist,
>                                 if (expr__subset_of_ids(n->pctx, m->pctx)) {
>                                         pr_debug("Events in '%s' fully contained within '%s'\n",
>                                                  m->metric_name, n->metric_name);
> -                                       metric_evlist = n->evlist;
> +                                       contained = n->evlist;
>                                         break;
>                                 }
> -
>                         }
>                 }
>                 if (!metric_evlist) {
> +                       metric_evlist = contained ? contained : m->evlist;
> +
>                         ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
> -                                       m->group_events, tool_events, &m->evlist);
> +                                       m->group_events, tool_events, &metric_evlist);

Won't this match the behavior of metric_no_merge/--metric-no-merge,
since for every metric the events for that metric are being appended
to the evlist?

Thanks,
Ian


>                         if (ret)
>                                 goto out;
> -
> -                       metric_evlist = m->evlist;
>                 }
>                 ret = setup_metric_events(fake_pmu ? "all" : m->pmu, m->pctx->ids,
>                                           metric_evlist, &metric_events);
> --
> 2.34.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] perf stat: Fix crash on arm64
  2026-03-26 16:39     ` Leo Yan
  2026-03-26 21:21       ` Ian Rogers
@ 2026-03-27 10:35       ` Breno Leitao
  2026-03-27 13:48         ` Arnaldo Melo
  1 sibling, 1 reply; 9+ messages in thread
From: Breno Leitao @ 2026-03-27 10:35 UTC (permalink / raw)
  To: Leo Yan, irogers, cme
  Cc: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Adrian Hunter, James Clark, linux-perf-users, linux-kernel,
	kernel-team, Denis Yaroshevskiy, Dmitry Ilvokhin

On Thu, Mar 26, 2026 at 04:39:26PM +0000, Leo Yan wrote:
> On Wed, Mar 25, 2026 at 03:27:05PM -0700, Ian Rogers wrote:
>
> > From the stack trace I don't see anything that should allow this.
> > Either the list of metrics is broken, or the struct metric_event's
> > evsel is NULL. But that should never happen as we wouldn't have a
> > metric at that point. The sorting shouldn't affect that. If you can
> > reproduce the issue, verbose logs may help.
>
> I believe I encountered a different issue, which is irrelevant to this
> patch.  So Breno's patch is fine for me.

That said, are there any concerns about merging this patch in its
current form?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] perf stat: Fix crash on arm64
  2026-03-27 10:35       ` Breno Leitao
@ 2026-03-27 13:48         ` Arnaldo Melo
  0 siblings, 0 replies; 9+ messages in thread
From: Arnaldo Melo @ 2026-03-27 13:48 UTC (permalink / raw)
  To: Breno Leitao, Leo Yan, irogers, cme
  Cc: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Adrian Hunter, James Clark, linux-perf-users, linux-kernel,
	kernel-team, Denis Yaroshevskiy, Dmitry Ilvokhin



On March 27, 2026 7:35:19 AM GMT-03:00, Breno Leitao <leitao@debian.org> wrote:
>On Thu, Mar 26, 2026 at 04:39:26PM +0000, Leo Yan wrote:
>> On Wed, Mar 25, 2026 at 03:27:05PM -0700, Ian Rogers wrote:
>>
>> > From the stack trace I don't see anything that should allow this.
>> > Either the list of metrics is broken, or the struct metric_event's
>> > evsel is NULL. But that should never happen as we wouldn't have a
>> > metric at that point. The sorting shouldn't affect that. If you can
>> > reproduce the issue, verbose logs may help.
>>
>> I believe I encountered a different issue, which is irrelevant to this
>> patch.  So Breno's patch is fine for me.
>
>That said, are there any concerns about merging this patch in its
>current form?

I guess not, will try and merge it later today for 7
0.

Thanks,

- Arnaldo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] perf stat: Fix crash on arm64
  2026-03-26 21:21       ` Ian Rogers
@ 2026-03-27 17:57         ` Leo Yan
  0 siblings, 0 replies; 9+ messages in thread
From: Leo Yan @ 2026-03-27 17:57 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Breno Leitao, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, James Clark,
	linux-perf-users, linux-kernel, kernel-team, Denis Yaroshevskiy,
	Dmitry Ilvokhin

On Thu, Mar 26, 2026 at 02:21:00PM -0700, Ian Rogers wrote:

[...]

> > On one of my board, I can see the log:
> >
> >   Events in 'frontend_bound' fully contained within 'retiring'
> >   Events in 'bad_speculation' fully contained within 'retiring'
> >   Events in 'backend_bound' fully contained within 'retiring'
> 
> I looked at nvidia/t410/metrics.json but I'm not clear on where the
> issue is coming from.

Digging a bit, all these metrics have syntax errors because #slots is
zero when returned from tool_pmu__cpu_slots_per_cycle().
See the log: https://termbin.com/e9ue

After that, executes parse_groups() treats all these problematic metrics
as being contained by "retiring".

I have sent a patch to make __add_metrics() respect the returned syntax
error, so the program can exit early.


> > In parse_groups(), when find a event is fully contained by a previous
> > event, it skips to call parse_ids(), as a result, m->pctx->ids is not
> > initialized.  Then, setup_metric_events() returns an empty metric
> > events, pick_display_evsel() consumes the returned metric_events and
> > feeds to metricgroup__lookup() with passing "evsel = NULL".
> 
> Fully contained groups exist on x86, why isn't this problem breaking
> whenever this happens? Stepping through "contained" examples, I see
> that the ids aren't initialized. I think something else must be
> happening.

If you change tool_pmu__cpu_slots_per_cycle() to always return zero,
this might can reproduce the issue.

[...]

> > @@ -1463,19 +1463,18 @@ static int parse_groups(struct evlist *perf_evlist,
> >                                 if (expr__subset_of_ids(n->pctx, m->pctx)) {
> >                                         pr_debug("Events in '%s' fully contained within '%s'\n",
> >                                                  m->metric_name, n->metric_name);
> > -                                       metric_evlist = n->evlist;
> > +                                       contained = n->evlist;
> >                                         break;
> >                                 }
> > -
> >                         }
> >                 }
> >                 if (!metric_evlist) {
> > +                       metric_evlist = contained ? contained : m->evlist;
> > +
> >                         ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
> > -                                       m->group_events, tool_events, &m->evlist);
> > +                                       m->group_events, tool_events, &metric_evlist);
> 
> Won't this match the behavior of metric_no_merge/--metric-no-merge,
> since for every metric the events for that metric are being appended
> to the evlist?

TBH, I still cannot understand well parse_groups().

The change above passes &metric_evlist to parse_ids(), as this will only
change the value of metric_evlist itself but not update m->evlist or
n->evlist, this is not right for metric_no_merge case ?

Here I am not sure how to use parse_ids() to parse "m->pctx" but avoid
to overwrite "n->evlist" for the fully contained case.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-03-27 18:05 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-25 10:24 [PATCH v2] perf stat: Fix crash on arm64 Breno Leitao
2026-03-25 18:25 ` Ian Rogers
2026-03-25 20:59 ` Leo Yan
2026-03-25 22:27   ` Ian Rogers
2026-03-26 16:39     ` Leo Yan
2026-03-26 21:21       ` Ian Rogers
2026-03-27 17:57         ` Leo Yan
2026-03-27 10:35       ` Breno Leitao
2026-03-27 13:48         ` Arnaldo Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox