perf stat issue with 7.0.0rc3

public inbox for linux-perf-users@vger.kernel.org

* perf stat issue with 7.0.0rc3
@ 2026-03-13 13:13 Thomas Richter
  2026-03-13 15:13 ` Leo Yan
  2026-03-13 15:19 ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 18+ messages in thread
From: Thomas Richter @ 2026-03-13 13:13 UTC (permalink / raw)
  To: Ian Rogers; +Cc: linux-perf-use., Jan Polensky

Ian, 

I just discovered a strange behavior on linux 7.0.0rc3.

I run these commands on my x86 virtual machine:

bash-5.3# uname -m
x86_64
bash-5.3# perf -v
perf version 6.18.13-200.fc43.x86_64
bash-5.3# perf stat -- true

 Performance counter stats for 'true':

         1.108.929      task-clock                       #    0,437 CPUs utilized             
                 0      context-switches                 #    0,000 /sec                      
                 0      cpu-migrations                   #    0,000 /sec                      
                55      page-faults                      #   49,597 K/sec                     
   <not supported>      cycles                                                                

       0,002536439 seconds time elapsed

       0,001363000 seconds user
       0,001393000 seconds sys

bash-5.3#

This is the expected output. However, when I use perf version 7.0.0rc3:

bash-5.3# ./perf -v
perf version 7.0.rc3.g1f318b96cc84
bash-5.3# ./perf stat -- true
Error:
No supported events found.
trace.args_alignment
bash-5.3# 

Same happens on my s390 systems (LPAR and z/VM).
Is this already a known issue?
I guess this is a perf tool issue; the result is the same when you execute this
on a 7.0.0rc3 kernel.

Any ideas where to start debugging?
-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
IBM Deutschland Research & Development GmbH

Vorsitzender des Aufsichtsrats: Wolfgang Wendt

Geschäftsführung: David Faller

Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: perf stat issue with 7.0.0rc3
  2026-03-13 13:13 perf stat issue with 7.0.0rc3 Thomas Richter
@ 2026-03-13 15:13 ` Leo Yan
  2026-03-13 15:19 ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 18+ messages in thread
From: Leo Yan @ 2026-03-13 15:13 UTC (permalink / raw)
  To: Thomas Richter; +Cc: Ian Rogers, linux-perf-use., Jan Polensky

On Fri, Mar 13, 2026 at 02:13:18PM +0100, Thomas Richter wrote:
> Ian, 
> 
> I just discovered a strange behavior on linux 7.0.0rc3.
> 
> I run these commands on my x86 virtual machine:
> 
> bash-5.3# uname -m
> x86_64
> bash-5.3# perf -v
> perf version 6.18.13-200.fc43.x86_64
> bash-5.3# perf stat -- true
> 
>  Performance counter stats for 'true':
> 
>          1.108.929      task-clock                       #    0,437 CPUs utilized             
>                  0      context-switches                 #    0,000 /sec                      
>                  0      cpu-migrations                   #    0,000 /sec                      
>                 55      page-faults                      #   49,597 K/sec                     
>    <not supported>      cycles                                                                
> 
>        0,002536439 seconds time elapsed
> 
>        0,001363000 seconds user
>        0,001393000 seconds sys
> 
> bash-5.3#
> 
> This is the epected output, however when I use the perf version 7.0.0rc3:
> 
> bash-5.3# ./perf -v
> perf version 7.0.rc3.g1f318b96cc84
> bash-5.3# ./perf stat -- true
> Error:
> No supported events found.
> trace.args_alignment
> bash-5.3# 
> 
> Same happens on my s390 systems (LPAR and z/VM).
> Is this a known already?

I assume you are missing the patch below.  It has been picked up in
perf-tools-next but has not yet landed in mainline.

Author: Ian Rogers <irogers@google.com>
Date:   Fri Feb 6 16:49:56 2026 -0800

    perf metricgroup: Fix metricgroup__has_metric_or_groups

    Use metricgroup__for_each_metric rather than
    pmu_metrics_table__for_each_metric that combines the default metric
    table with, a potentially empty, CPUID table.

    Fixes: cee275edcdb1 ("perf metricgroup: Don't early exit if no CPUID table exists")
    Signed-off-by: Ian Rogers <irogers@google.com>
    Reviewed-by: Leo Yan <leo.yan@arm.com>
    Tested-by: Leo Yan <leo.yan@arm.com>
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 46bf4dfeebc8..7e39d469111b 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -1605,9 +1605,9 @@ bool metricgroup__has_metric_or_groups(const char *pmu, const char *metric_or_gr
                .metric_or_groups = metric_or_groups,
        };

-       return pmu_metrics_table__for_each_metric(table,
-                                                 metricgroup__has_metric_or_groups_callback,
-                                                 &data)
+       return metricgroup__for_each_metric(table,
+                                           metricgroup__has_metric_or_groups_callback,
+                                           &data)
                ? true : false;
 }
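
One way to check whether a given tree already carries this fix is sketched
below. The helper name has_fix is hypothetical; the commit ID and subject are
the ones quoted in this thread. It first asks git whether the commit itself is
an ancestor of the ref, then falls back to searching for the subject line,
since a cherry-picked copy would have a different commit ID. This is only a
sketch, assuming it is run inside a kernel git checkout.

```shell
# Commit ID and subject of the fix, as quoted in this thread.
FIX_SHA=c5a244bf17caf2de22f9e100832b75f72b31d3e6
FIX_SUBJECT='perf metricgroup: Fix metricgroup__has_metric_or_groups'

# has_fix [REF]: hypothetical helper. Succeeds if REF (default HEAD) contains
# the fix, either as the very same commit or as a commit whose message
# contains the same subject line (e.g. a cherry-pick).
has_fix() {
    ref=${1:-HEAD}
    # Same commit reachable from REF?
    git merge-base --is-ancestor "$FIX_SHA" "$ref" 2>/dev/null && return 0
    # Otherwise look for a commit message containing the subject.
    [ -n "$(git log -1 --format=%H --fixed-strings --grep="$FIX_SUBJECT" "$ref" -- 2>/dev/null)" ]
}
```

For example, `has_fix HEAD && echo present || echo missing` run in the tree
that perf was built from.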

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: perf stat issue with 7.0.0rc3
  2026-03-13 13:13 perf stat issue with 7.0.0rc3 Thomas Richter
  2026-03-13 15:13 ` Leo Yan
@ 2026-03-13 15:19 ` Arnaldo Carvalho de Melo
  2026-03-13 15:41   ` Ian Rogers
  1 sibling, 1 reply; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-03-13 15:19 UTC (permalink / raw)
  To: Thomas Richter; +Cc: Ian Rogers, linux-perf-use., Jan Polensky

On Fri, Mar 13, 2026 at 02:13:18PM +0100, Thomas Richter wrote:
> Ian, 
> 
> I just discovered a strange behavior on linux 7.0.0rc3.
> 
> I run these commands on my x86 virtual machine:
> 
> bash-5.3# uname -m
> x86_64
> bash-5.3# perf -v
> perf version 6.18.13-200.fc43.x86_64
> bash-5.3# perf stat -- true
> 
>  Performance counter stats for 'true':
> 
>          1.108.929      task-clock                       #    0,437 CPUs utilized             
>                  0      context-switches                 #    0,000 /sec                      
>                  0      cpu-migrations                   #    0,000 /sec                      
>                 55      page-faults                      #   49,597 K/sec                     
>    <not supported>      cycles                                                                
> 
>        0,002536439 seconds time elapsed
> 
>        0,001363000 seconds user
>        0,001393000 seconds sys
> 
> bash-5.3#
> 
> This is the epected output, however when I use the perf version 7.0.0rc3:
> 
> bash-5.3# ./perf -v
> perf version 7.0.rc3.g1f318b96cc84
> bash-5.3# ./perf stat -- true
> Error:
> No supported events found.
> trace.args_alignment

And this last line is even stranger, in my case I get something else,
also on x86_64:

root@number:~#  perf stat -- true
Error:
No supported events found.
addr2line.style
root@number:~#

On an ARM machine:

acme@raspberrypi:~/git/perf-tools $ perf stat -- true
Error:
No supported events found.

acme@raspberrypi:~/git/perf-tools $ uname -a
Linux raspberrypi 6.12.62+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.12.62-1+rpt1 (2025-12-18) aarch64 GNU/Linux
acme@raspberrypi:~/git/perf-tools $

I'll try to bisect this later, thanks for the report!

- Arnaldo


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: perf stat issue with 7.0.0rc3
  2026-03-13 15:19 ` Arnaldo Carvalho de Melo
@ 2026-03-13 15:41   ` Ian Rogers
  2026-03-13 15:56     ` Arnaldo Melo
  2026-03-17 19:39     ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 18+ messages in thread
From: Ian Rogers @ 2026-03-13 15:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Thomas Richter, linux-perf-use., Jan Polensky

On Fri, Mar 13, 2026 at 8:19 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Fri, Mar 13, 2026 at 02:13:18PM +0100, Thomas Richter wrote:
> > Ian,
> >
> > I just discovered a strange behavior on linux 7.0.0rc3.
> >
> > I run these commands on my x86 virtual machine:
> >
> > bash-5.3# uname -m
> > x86_64
> > bash-5.3# perf -v
> > perf version 6.18.13-200.fc43.x86_64
> > bash-5.3# perf stat -- true
> >
> >  Performance counter stats for 'true':
> >
> >          1.108.929      task-clock                       #    0,437 CPUs utilized
> >                  0      context-switches                 #    0,000 /sec
> >                  0      cpu-migrations                   #    0,000 /sec
> >                 55      page-faults                      #   49,597 K/sec
> >    <not supported>      cycles
> >
> >        0,002536439 seconds time elapsed
> >
> >        0,001363000 seconds user
> >        0,001393000 seconds sys
> >
> > bash-5.3#
> >
> > This is the epected output, however when I use the perf version 7.0.0rc3:
> >
> > bash-5.3# ./perf -v
> > perf version 7.0.rc3.g1f318b96cc84
> > bash-5.3# ./perf stat -- true
> > Error:
> > No supported events found.
> > trace.args_alignment
>
> And this last line is even stranger, in my case I get something else,
> also on x86_64:
>
> root@number:~#  perf stat -- true
> Error:
> No supported events found.
> addr2line.style
> root@number:~#
>
> On an ARM machine:
>
> acme@raspberrypi:~/git/perf-tools $ perf stat -- true
> Error:
> No supported events found.
>
> acme@raspberrypi:~/git/perf-tools $ uname -a
> Linux raspberrypi 6.12.62+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.12.62-1+rpt1 (2025-12-18) aarch64 GNU/Linux
> acme@raspberrypi:~/git/perf-tools $
>
> I'll try to bisect this later, thanks for the report!

Hi Arnaldo,

Leo has pointed at the fix:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/tools/perf/util/metricgroup.c?h=perf-tools-next&id=c5a244bf17caf2de22f9e100832b75f72b31d3e6
This also showed up as missing for the LTS backports:
https://lore.kernel.org/lkml/ad95d781-7eb2-4c0c-a9e9-aaabae8eb602@kernel.org/
so I thought it was flagged as a fix for the next PR. I don't see it in there:
https://lore.kernel.org/lkml/20260313151434.1695228-1-acme@kernel.org/
Could we get it in?

Thanks,
Ian




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: perf stat issue with 7.0.0rc3
  2026-03-13 15:41   ` Ian Rogers
@ 2026-03-13 15:56     ` Arnaldo Melo
  2026-03-13 16:10       ` Ian Rogers
  2026-03-17 19:39     ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 18+ messages in thread
From: Arnaldo Melo @ 2026-03-13 15:56 UTC (permalink / raw)
  To: Ian Rogers, Arnaldo Carvalho de Melo
  Cc: Thomas Richter, linux-perf-use., Jan Polensky



On March 13, 2026 12:41:48 PM GMT-03:00, Ian Rogers <irogers@google.com> wrote:
>On Fri, Mar 13, 2026 at 8:19 AM Arnaldo Carvalho de Melo
><acme@kernel.org> wrote:
>>
>> On Fri, Mar 13, 2026 at 02:13:18PM +0100, Thomas Richter wrote:
>> > Ian,
>> >
>> > I just discovered a strange behavior on linux 7.0.0rc3.
>> >
>> > I run these commands on my x86 virtual machine:
>> >
>> > bash-5.3# uname -m
>> > x86_64
>> > bash-5.3# perf -v
>> > perf version 6.18.13-200.fc43.x86_64
>> > bash-5.3# perf stat -- true
>> >
>> >  Performance counter stats for 'true':
>> >
>> >          1.108.929      task-clock                       #    0,437 CPUs utilized
>> >                  0      context-switches                 #    0,000 /sec
>> >                  0      cpu-migrations                   #    0,000 /sec
>> >                 55      page-faults                      #   49,597 K/sec
>> >    <not supported>      cycles
>> >
>> >        0,002536439 seconds time elapsed
>> >
>> >        0,001363000 seconds user
>> >        0,001393000 seconds sys
>> >
>> > bash-5.3#
>> >
>> > This is the epected output, however when I use the perf version 7.0.0rc3:
>> >
>> > bash-5.3# ./perf -v
>> > perf version 7.0.rc3.g1f318b96cc84
>> > bash-5.3# ./perf stat -- true
>> > Error:
>> > No supported events found.
>> > trace.args_alignment
>>
>> And this last line is even stranger, in my case I get something else,
>> also on x86_64:
>>
>> root@number:~#  perf stat -- true
>> Error:
>> No supported events found.
>> addr2line.style
>> root@number:~#
>>
>> On an ARM machine:
>>
>> acme@raspberrypi:~/git/perf-tools $ perf stat -- true
>> Error:
>> No supported events found.
>>
>> acme@raspberrypi:~/git/perf-tools $ uname -a
>> Linux raspberrypi 6.12.62+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.12.62-1+rpt1 (2025-12-18) aarch64 GNU/Linux
>> acme@raspberrypi:~/git/perf-tools $
>>
>> I'll try to bisect this later, thanks for the report!
>
>Hi Arnaldo,
>
>Leo has pointed at the fix:
>https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/tools/perf/util/metricgroup.c?h=perf-tools-next&id=c5a244bf17caf2de22f9e100832b75f72b31d3e6
>This also showed up as missing for the LTS backports:
>https://lore.kernel.org/lkml/ad95d781-7eb2-4c0c-a9e9-aaabae8eb602@kernel.org/
>so I thought it was flagged as a fix for the next PR. I don't see it in there:
>https://lore.kernel.org/lkml/20260313151434.1695228-1-acme@kernel.org/
>Could we get it in?
>

Sure, I'll process it for the next PR. Thanks for pointing it out to me; I'll test it later.

- Arnaldo



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: perf stat issue with 7.0.0rc3
  2026-03-13 15:56     ` Arnaldo Melo
@ 2026-03-13 16:10       ` Ian Rogers
  2026-03-13 17:01         ` Arnaldo Melo
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Rogers @ 2026-03-13 16:10 UTC (permalink / raw)
  To: Arnaldo Melo
  Cc: Arnaldo Carvalho de Melo, Thomas Richter, linux-perf-use.,
	Jan Polensky

On Fri, Mar 13, 2026 at 8:56 AM Arnaldo Melo <arnaldo.melo@gmail.com> wrote:
>
> On March 13, 2026 12:41:48 PM GMT-03:00, Ian Rogers <irogers@google.com> wrote:
> >On Fri, Mar 13, 2026 at 8:19 AM Arnaldo Carvalho de Melo
> ><acme@kernel.org> wrote:
> >>
> >> On Fri, Mar 13, 2026 at 02:13:18PM +0100, Thomas Richter wrote:
> >> > Ian,
> >> >
> >> > I just discovered a strange behavior on linux 7.0.0rc3.
> >> >
> >> > I run these commands on my x86 virtual machine:
> >> >
> >> > bash-5.3# uname -m
> >> > x86_64
> >> > bash-5.3# perf -v
> >> > perf version 6.18.13-200.fc43.x86_64
> >> > bash-5.3# perf stat -- true
> >> >
> >> >  Performance counter stats for 'true':
> >> >
> >> >          1.108.929      task-clock                       #    0,437 CPUs utilized
> >> >                  0      context-switches                 #    0,000 /sec
> >> >                  0      cpu-migrations                   #    0,000 /sec
> >> >                 55      page-faults                      #   49,597 K/sec
> >> >    <not supported>      cycles
> >> >
> >> >        0,002536439 seconds time elapsed
> >> >
> >> >        0,001363000 seconds user
> >> >        0,001393000 seconds sys
> >> >
> >> > bash-5.3#
> >> >
> >> > This is the epected output, however when I use the perf version 7.0.0rc3:
> >> >
> >> > bash-5.3# ./perf -v
> >> > perf version 7.0.rc3.g1f318b96cc84
> >> > bash-5.3# ./perf stat -- true
> >> > Error:
> >> > No supported events found.
> >> > trace.args_alignment
> >>
> >> And this last line is even stranger, in my case I get something else,
> >> also on x86_64:
> >>
> >> root@number:~#  perf stat -- true
> >> Error:
> >> No supported events found.
> >> addr2line.style
> >> root@number:~#
> >>
> >> On an ARM machine:
> >>
> >> acme@raspberrypi:~/git/perf-tools $ perf stat -- true
> >> Error:
> >> No supported events found.
> >>
> >> acme@raspberrypi:~/git/perf-tools $ uname -a
> >> Linux raspberrypi 6.12.62+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.12.62-1+rpt1 (2025-12-18) aarch64 GNU/Linux
> >> acme@raspberrypi:~/git/perf-tools $
> >>
> >> I'll try to bisect this later, thanks for the report!
> >
> >Hi Arnaldo,
> >
> >Leo has pointed at the fix:
> >https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/tools/perf/util/metricgroup.c?h=perf-tools-next&id=c5a244bf17caf2de22f9e100832b75f72b31d3e6
> >This also showed up as missing for the LTS backports:
> >https://lore.kernel.org/lkml/ad95d781-7eb2-4c0c-a9e9-aaabae8eb602@kernel.org/
> >so I thought it was flagged as a fix for the next PR. I don't see it in there:
> >https://lore.kernel.org/lkml/20260313151434.1695228-1-acme@kernel.org/
> >Could we get it in?
> >
>
> Sure, I'll process it for the next PR, thanks for pointing it to me, I'll test it later,

No worries, presumably if I'd been monitoring:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git/log/?h=tmp.perf-tools
I'd have seen it was missing. Perhaps you can give a heads up to check
for missing patches when assembling the PRs. I rarely look at
perf-tools.git as perf-tools-next.git is where all the action is :-)
Actually, I vibe coded a script that may work to automate this:
```
#!/bin/bash

# Configuration
STABLE_REMOTE="perf-tools"
STABLE_BRANCH="perf-tools" # Adjust if your stable branch is named differently
NEXT_REMOTE="perf-tools-next"
NEXT_BRANCH="perf-tools-next"

echo "Checking for fix candidates in $NEXT_REMOTE/$NEXT_BRANCH..."
echo "Targeting stable release: $STABLE_REMOTE/$STABLE_BRANCH"
echo "-------------------------------------------------------"

# 1. Get all commits in 'next' that aren't in 'stable'
# 2. Search their bodies for the "Fixes:" tag
# 3. Extract the SHA from the Fixes tag
git log "${STABLE_REMOTE}/${STABLE_BRANCH}..${NEXT_REMOTE}/${NEXT_BRANCH}" \
    --format="%H" | while read -r next_sha; do

    # Extract the SHA mentioned in the Fixes: line
    # Matches 'Fixes: <sha> ("subject")'
    fixed_sha=$(git log -1 --format="%b" "$next_sha" | grep -i "Fixes:" |
        sed -E 's/.*Fixes: ([a-f0-9]+).*/\1/')

    if [ -n "$fixed_sha" ]; then
        # Check if the fixed_sha exists in the stable remote's history
        if git merge-base --is-ancestor "$fixed_sha" \
            "${STABLE_REMOTE}/${STABLE_BRANCH}" 2>/dev/null; then
            subject=$(git log -1 --format="%s" "$next_sha")
            echo "Candidate: $next_sha"
            echo "  Subject: $subject"
            echo "  Fixes:   $fixed_sha (Found in $STABLE_REMOTE)"
            echo "  Command: git cherry-pick -x $next_sha"
            echo ""
        fi
    fi
done
```
Running this showed:
```
Checking for fix candidates in perf-tools-next/perf-tools-next...
Targeting stable release: perf-tools/perf-tools
-------------------------------------------------------
Candidate: 6910944bf0b92fea63d5a7aeed69e4b9c14fd01b
  Subject: perf test type profiling: Remote typedef on struct
  Fixes:   335047109d7d (Found in perf-tools)
  Command: git cherry-pick -x 6910944bf0b92fea63d5a7aeed69e4b9c14fd01b

Candidate: 06ec44c2aa2ef15fd56f9808b6cf7495e1fbd8ec
  Subject: perf kvm stat: Fix relative paths for including headers
  Fixes:   a724a8fce5e2 (Found in perf-tools)
  Command: git cherry-pick -x 06ec44c2aa2ef15fd56f9808b6cf7495e1fbd8ec

Candidate: 96f202eab8133f94479b14a32902c636e9bdf6af
  Subject: perf trace: Fix IS_ERR() vs NULL check bug
  Fixes:   ef2da619b132c6f74 (Found in perf-tools)
  Command: git cherry-pick -x 96f202eab8133f94479b14a32902c636e9bdf6af

Candidate: c5a244bf17caf2de22f9e100832b75f72b31d3e6
  Subject: perf metricgroup: Fix metricgroup__has_metric_or_groups
  Fixes:   cee275edcdb1 (Found in perf-tools)
  Command: git cherry-pick -x c5a244bf17caf2de22f9e100832b75f72b31d3e6

Candidate: aa6a6a2d16c1e2e27e986936369959d70316199f
  Subject: perf parse-events: Fix big-endian 'overwrite' by writing correct union member
  Fixes:   159ca97cd97c (Found in perf-tools)
  Command: git cherry-pick -x aa6a6a2d16c1e2e27e986936369959d70316199f
```
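
A possible refinement, offered only as a sketch: the script above stores the
whole grep/sed output in fixed_sha, so a commit carrying more than one Fixes:
trailer would produce a multi-line value and fail the ancestor check. The
helper below (extract_fixes_shas is a hypothetical name, not part of the
original script) reads a commit body on stdin and prints each referenced SHA
on its own line, so the caller can test them one at a time.

```shell
# extract_fixes_shas: hypothetical helper.
# Reads a commit message body on stdin and prints the SHA from every
# "Fixes: <sha> (...)" line, one per line. Lines without such a trailer
# produce no output. Leading indentation is tolerated.
extract_fixes_shas() {
    sed -nE 's/^[[:space:]]*[Ff]ixes:[[:space:]]*([0-9a-fA-F]{7,40}).*/\1/p'
}

# Possible use inside the loop above (not executed here):
#   git log -1 --format=%b "$next_sha" | extract_fixes_shas |
#       while read -r fixed_sha; do
#           git merge-base --is-ancestor "$fixed_sha" \
#               "${STABLE_REMOTE}/${STABLE_BRANCH}" 2>/dev/null &&
#               echo "Candidate: $next_sha (fixes $fixed_sha)"
#       done
```

Anchoring the match at the start of the line also avoids picking up the word
"Fixes:" when it merely appears in the middle of a sentence in the commit body.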

Thanks,
Ian


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: perf stat issue with 7.0.0rc3
  2026-03-13 16:10       ` Ian Rogers
@ 2026-03-13 17:01         ` Arnaldo Melo
  2026-03-13 18:27           ` Ian Rogers
  0 siblings, 1 reply; 18+ messages in thread
From: Arnaldo Melo @ 2026-03-13 17:01 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Thomas Richter, linux-perf-use.,
	Jan Polensky



On March 13, 2026 1:10:03 PM GMT-03:00, Ian Rogers <irogers@google.com> wrote:
>On Fri, Mar 13, 2026 at 8:56 AM Arnaldo Melo <arnaldo.melo@gmail.com> wrote:
>>
>> On March 13, 2026 12:41:48 PM GMT-03:00, Ian Rogers <irogers@google.com> wrote:
>> >On Fri, Mar 13, 2026 at 8:19 AM Arnaldo Carvalho de Melo
>> ><acme@kernel.org> wrote:
>> >>
>> >> On Fri, Mar 13, 2026 at 02:13:18PM +0100, Thomas Richter wrote:
>> >> > Ian,
>> >> >
>> >> > I just discovered a strange behavior on linux 7.0.0rc3.
>> >> >
>> >> > I run these commands on my x86 virtual machine:
>> >> >
>> >> > bash-5.3# uname -m
>> >> > x86_64
>> >> > bash-5.3# perf -v
>> >> > perf version 6.18.13-200.fc43.x86_64
>> >> > bash-5.3# perf stat -- true
>> >> >
>> >> >  Performance counter stats for 'true':
>> >> >
>> >> >          1.108.929      task-clock                       #    0,437 CPUs utilized
>> >> >                  0      context-switches                 #    0,000 /sec
>> >> >                  0      cpu-migrations                   #    0,000 /sec
>> >> >                 55      page-faults                      #   49,597 K/sec
>> >> >    <not supported>      cycles
>> >> >
>> >> >        0,002536439 seconds time elapsed
>> >> >
>> >> >        0,001363000 seconds user
>> >> >        0,001393000 seconds sys
>> >> >
>> >> > bash-5.3#
>> >> >
>> >> > This is the epected output, however when I use the perf version 7.0.0rc3:
>> >> >
>> >> > bash-5.3# ./perf -v
>> >> > perf version 7.0.rc3.g1f318b96cc84
>> >> > bash-5.3# ./perf stat -- true
>> >> > Error:
>> >> > No supported events found.
>> >> > trace.args_alignment
>> >>
>> >> And this last line is even stranger, in my case I get something else,
>> >> also on x86_64:
>> >>
>> >> root@number:~#  perf stat -- true
>> >> Error:
>> >> No supported events found.
>> >> addr2line.style
>> >> root@number:~#
>> >>
>> >> On an ARM machine:
>> >>
>> >> acme@raspberrypi:~/git/perf-tools $ perf stat -- true
>> >> Error:
>> >> No supported events found.
>> >>
>> >> acme@raspberrypi:~/git/perf-tools $ uname -a
>> >> Linux raspberrypi 6.12.62+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.12.62-1+rpt1 (2025-12-18) aarch64 GNU/Linux
>> >> acme@raspberrypi:~/git/perf-tools $
>> >>
>> >> I'll try to bisect this later, thanks for the report!
>> >
>> >Hi Arnaldo,
>> >
>> >Leo has pointed at the fix:
>> >https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/tools/perf/util/metricgroup.c?h=perf-tools-next&id=c5a244bf17caf2de22f9e100832b75f72b31d3e6
>> >This also showed up as missing for the LTS backports:
>> >https://lore.kernel.org/lkml/ad95d781-7eb2-4c0c-a9e9-aaabae8eb602@kernel.org/
>> >so I thought it was flagged as a fix for the next PR. I don't see it in there:
>> >https://lore.kernel.org/lkml/20260313151434.1695228-1-acme@kernel.org/
>> >Could we get it in?
>> >
>>
>> Sure, I'll process it for the next PR, thanks for pointing it to me, I'll test it later,
>
>No worries, presumably if I'd been monitoring:
>https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git/log/?h=tmp.perf-tools
>I'd have seen it was missing. Perhaps you can give a heads up to check
>for missing patches when assembling the PRs. I rarely look at
>perf-tools.git as perf-tools-next.git is where all the action is :-)
>Actually, I vibe coded a script that may work to automate this:
>```


Well, if those patches are OK for perf-tools, i.e. they are fixes for bugs introduced in the current merge window or urgent fixes for longstanding bugs, they shouldn't be in perf-tools-next, they should be in perf-tools, to be merged in the current merge window.

What we should strive for is coordination in triaging: an Acked-by/Reviewed-by that points out a patch is something for the current merge window, which is something I do.

When I notice something that's needed for the current window that's been processed by Namhyung for perf-tools-next, I merge it as well (as will be in this case and maybe in the others that you pointed out, thanks, I'll check each one).

No problem with that, better be in both branches than to get missed :-)
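
Since fixes can legitimately sit in both branches, a range such as A..B, which
compares commit IDs, would keep reporting a fix even after it has been
cherry-picked. A minimal sketch of a listing that compares patch content
instead is below; unpicked_commits is a hypothetical name, and the branch
names in the usage line are the ones from the script quoted in this thread.

```shell
# unpicked_commits STABLE NEXT: hypothetical helper.
# Lists commits reachable from NEXT but not from STABLE, omitting those whose
# patch already exists on STABLE (for example via git cherry-pick).
# --cherry-pick compares patch content across the symmetric difference and
# --right-only keeps only the NEXT side of it.
unpicked_commits() {
    git log --cherry-pick --right-only --no-merges --format='%H' "$1...$2"
}

# Possible use (not executed here):
#   unpicked_commits perf-tools/perf-tools perf-tools-next/perf-tools-next
```

A cherry-pick that needed conflict resolution changes the patch content, so
such a commit would still be listed and has to be checked by hand.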

Thanks, 

- Arnaldo


>#!/bin/bash
>
># Configuration
>STABLE_REMOTE="perf-tools"
>STABLE_BRANCH="perf-tools" # Adjust if your stable branch is named differently
>NEXT_REMOTE="perf-tools-next"
>NEXT_BRANCH="perf-tools-next"
>
>echo "Checking for fix candidates in $NEXT_REMOTE/$NEXT_BRANCH..."
>echo "Targeting stable release: $STABLE_REMOTE/$STABLE_BRANCH"
>echo "-------------------------------------------------------"
>
># 1. Get all commits in 'next' that aren't in 'stable'
># 2. Search their bodies for the "Fixes:" tag
># 3. Extract the SHA from the Fixes tag
>git log "${STABLE_REMOTE}/${STABLE_BRANCH}..${NEXT_REMOTE}/${NEXT_BRANCH}" \
>    --format="%H" | while read -r next_sha; do
>
>    # Extract the SHA mentioned in the Fixes: line
>    # Matches 'Fixes: <sha> ("subject")'
>    fixed_sha=$(git log -1 --format="%b" "$next_sha" | grep -i "Fixes:" |
>        sed -E 's/.*Fixes: ([a-f0-9]+).*/\1/')
>
>    if [ -n "$fixed_sha" ]; then
>        # Check if the fixed_sha exists in the stable remote's history
>        if git merge-base --is-ancestor "$fixed_sha" \
>            "${STABLE_REMOTE}/${STABLE_BRANCH}" 2>/dev/null; then
>            subject=$(git log -1 --format="%s" "$next_sha")
>            echo "Candidate: $next_sha"
>            echo "  Subject: $subject"
>            echo "  Fixes:   $fixed_sha (Found in $STABLE_REMOTE)"
>            echo "  Command: git cherry-pick -x $next_sha"
>            echo ""
>        fi
>    fi
>done
>```
>Running this showed:
>```
>Checking for fix candidates in perf-tools-next/perf-tools-next...
>Targeting stable release: perf-tools/perf-tools
>-------------------------------------------------------
>Candidate: 6910944bf0b92fea63d5a7aeed69e4b9c14fd01b
>  Subject: perf test type profiling: Remote typedef on struct
>  Fixes:   335047109d7d (Found in perf-tools)
>  Command: git cherry-pick -x 6910944bf0b92fea63d5a7aeed69e4b9c14fd01b
>
>Candidate: 06ec44c2aa2ef15fd56f9808b6cf7495e1fbd8ec
>  Subject: perf kvm stat: Fix relative paths for including headers
>  Fixes:   a724a8fce5e2 (Found in perf-tools)
>  Command: git cherry-pick -x 06ec44c2aa2ef15fd56f9808b6cf7495e1fbd8ec
>
>Candidate: 96f202eab8133f94479b14a32902c636e9bdf6af
>  Subject: perf trace: Fix IS_ERR() vs NULL check bug
>  Fixes:   ef2da619b132c6f74 (Found in perf-tools)
>  Command: git cherry-pick -x 96f202eab8133f94479b14a32902c636e9bdf6af
>
>Candidate: c5a244bf17caf2de22f9e100832b75f72b31d3e6
>  Subject: perf metricgroup: Fix metricgroup__has_metric_or_groups
>  Fixes:   cee275edcdb1 (Found in perf-tools)
>  Command: git cherry-pick -x c5a244bf17caf2de22f9e100832b75f72b31d3e6
>
>Candidate: aa6a6a2d16c1e2e27e986936369959d70316199f
>  Subject: perf parse-events: Fix big-endian 'overwrite' by writing correct union member
>  Fixes:   159ca97cd97c (Found in perf-tools)
>  Command: git cherry-pick -x aa6a6a2d16c1e2e27e986936369959d70316199f
>```
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: perf stat issue with 7.0.0rc3
  2026-03-13 17:01         ` Arnaldo Melo
@ 2026-03-13 18:27           ` Ian Rogers
  2026-03-13 21:10             ` Namhyung Kim
  2026-03-17 20:19             ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 18+ messages in thread
From: Ian Rogers @ 2026-03-13 18:27 UTC (permalink / raw)
  To: Arnaldo Melo
  Cc: Arnaldo Carvalho de Melo, Thomas Richter, linux-perf-use.,
	Jan Polensky

> >No worries, presumably if I'd been monitoring:
> >https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git/log/?h=tmp.perf-tools
> >I'd have seen it was missing. Perhaps you can give a heads up to check
> >for missing patches when assembling the PRs. I rarely look at
> >perf-tools.git as perf-tools-next.git is where all the action is :-)
> >Actually, I vibe coded a script that may work to automate this:
> >```
>
>
> Well, if those patches are ok for perf-tools, i.e. are fixes for bugs introduced in the current merge window or urgent fixes for longstanding bugs, they shouldn't be in perf-tools-next, they should be in perf-tools to be merged in the current merge window.
>
> Coordination in triaging via an Acked-by/Reviewed-by pointing out it's something for the current merge window, something I do, is what we should strive to do.
>
> When I notice something that's needed for the current window that's been processed by Namhyung for perf-tools-next, I merge it as well (as will be in this case and maybe in the others that you pointed out, thanks, I'll check each one).
>
> No problem with that, better be in both branches than to get missed :-)

Thanks! Of the patches tagged as fixes, I think these two also need picking up:

commit aa6a6a2d16c1e2e27e986936369959d70316199f ("perf parse-events:
Fix big-endian 'overwrite' by writing correct union member") to fix s390.
commit 06ec44c2aa2ef15fd56f9808b6cf7495e1fbd8ec ("perf kvm stat: Fix
relative paths for including headers") for non-x86 builds.

The other patches are test fixes or address malloc failures that
shouldn't really happen, so they are not a priority.
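
For reference, the selection rule behind a check like the one quoted
below can be sketched in Python. This is a hypothetical illustration,
not the script mentioned above: the function names, the input shapes and
the placeholder hashes in the usage lines are all assumptions. A commit
from perf-tools-next is treated as a candidate when it carries a Fixes:
trailer naming a commit that is already in perf-tools and its subject is
not on perf-tools yet.

```python
import re

# Matches a kernel-style 'Fixes: <hash> ("subject")' trailer at line start.
FIXES_RE = re.compile(r"^Fixes:\s*([0-9a-f]{7,40})\b", re.MULTILINE)


def fixes_tags(message):
    """Return the (possibly abbreviated) hashes named in Fixes: trailers."""
    return FIXES_RE.findall(message)


def fix_candidates(next_commits, stable_hashes, stable_subjects):
    """Select commits from the -next branch that look like missing fixes.

    next_commits:    iterable of (hash, subject, message) tuples
    stable_hashes:   full hashes reachable from the fixes branch
    stable_subjects: subjects already on the fixes branch; cherry-picks
                     get new hashes, so subjects are compared instead
    """
    candidates = []
    for commit_hash, subject, message in next_commits:
        if subject in stable_subjects:
            continue  # already picked up
        fixed = [
            abbrev
            for abbrev in fixes_tags(message)
            if any(full.startswith(abbrev) for full in stable_hashes)
        ]
        if fixed:
            candidates.append((commit_hash, subject, fixed))
    return candidates


# Placeholder data only; these are not real commit ids.
stable = {"ab" * 20}
message = 'perf foo: Fix bar\n\nFixes: abababababab ("perf foo: Add bar")\n'
print(fix_candidates([("1" * 40, "perf foo: Fix bar", message)], stable, set()))
```

Comparing subjects rather than hashes is a simplification; a real check
would probably also look at the "cherry picked from commit" lines that
git cherry-pick -x leaves behind.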

Thanks,
Ian

> >Checking for fix candidates in perf-tools-next/perf-tools-next...
> >Targeting stable release: perf-tools/perf-tools
> >-------------------------------------------------------
> >Candidate: 6910944bf0b92fea63d5a7aeed69e4b9c14fd01b
> >  Subject: perf test type profiling: Remote typedef on struct
> >  Fixes:   335047109d7d (Found in perf-tools)
> >  Command: git cherry-pick -x 6910944bf0b92fea63d5a7aeed69e4b9c14fd01b
> >
> >Candidate: 06ec44c2aa2ef15fd56f9808b6cf7495e1fbd8ec
> >  Subject: perf kvm stat: Fix relative paths for including headers
> >  Fixes:   a724a8fce5e2 (Found in perf-tools)
> >  Command: git cherry-pick -x 06ec44c2aa2ef15fd56f9808b6cf7495e1fbd8ec
> >
> >Candidate: 96f202eab8133f94479b14a32902c636e9bdf6af
> >  Subject: perf trace: Fix IS_ERR() vs NULL check bug
> >  Fixes:   ef2da619b132c6f74 (Found in perf-tools)
> >  Command: git cherry-pick -x 96f202eab8133f94479b14a32902c636e9bdf6af
> >
> >Candidate: c5a244bf17caf2de22f9e100832b75f72b31d3e6
> >  Subject: perf metricgroup: Fix metricgroup__has_metric_or_groups
> >  Fixes:   cee275edcdb1 (Found in perf-tools)
> >  Command: git cherry-pick -x c5a244bf17caf2de22f9e100832b75f72b31d3e6
> >
> >Candidate: aa6a6a2d16c1e2e27e986936369959d70316199f
> >  Subject: perf parse-events: Fix big-endian 'overwrite' by writing
> >correct union member
> >  Fixes:   159ca97cd97c (Found in perf-tools)
> >  Command: git cherry-pick -x aa6a6a2d16c1e2e27e986936369959d70316199f
> >```
> >
> >Thanks,
> >Ian
> >
> >> - Arnaldo
> >>
> >>
> >> >Thanks,
> >> >Ian
> >> >
> >> >
> >> >
> >> >> - Arnaldo
> >> >>
> >> >> > bash-5.3#
> >> >> >
> >> >> > Same happens on my s390 systems (LPAR and z/VM).
> >> >> > Is this a known already?
> >> >> > I guess this is a perf tool issue, same result when you execute this on a 7.0.0rc3
> >> >> > kernel.
> >> >> >
> >> >> > Any ideas where to start debugging
> >> >> > --
> >> >> > Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
> >> >> > --
> >> >> > IBM Deutschland Research & Development GmbH
> >> >> >
> >> >> > Vorsitzender des Aufsichtsrats: Wolfgang Wendt
> >> >> >
> >> >> > Geschäftsführung: David Faller
> >> >> >
> >> >> > Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294
> >> >> >
> >>
> >> - Arnaldo
>
> - Arnaldo


* Re: perf stat issue with 7.0.0rc3
  2026-03-13 18:27           ` Ian Rogers
@ 2026-03-13 21:10             ` Namhyung Kim
  2026-03-17 20:19             ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 18+ messages in thread
From: Namhyung Kim @ 2026-03-13 21:10 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Melo, Arnaldo Carvalho de Melo, Thomas Richter,
	linux-perf-use., Jan Polensky

Hello,

On Fri, Mar 13, 2026 at 11:27:30AM -0700, Ian Rogers wrote:
> > >No worries, presumably if I'd been monitoring:
> > >https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git/log/?h=tmp.perf-tools
> > >I'd have seen it was missing. Perhaps you can give a heads up to check
> > >for missing patches when assembling the PRs. I rarely look at
> > >perf-tools.git as perf-tools-next.git is where all the action is :-)
> > >Actually, I vibe coded a script that may work to automate this:
> > >```
> >
> >
> > Well, if those patches are ok for perf-tools, i.e. are fixes for bugs introduced in the current merge window or urgent fixes for longstanding bugs, they shouldn't be in perf-tools-next, they should be in perf-tools to be merger in the current merge window.
> >
> > Coordination in triaging via an Acked-by/Reviewed-by pointing out it's something for the current merge window, something I do, is what we should strive to do.
> >
> > When I notice something that's needed for the current window that's been processed by Namhyung for perf-tools-next, I merge it as well (as will be in this case and maybe in the others that you pointed out, thanks, I'll check each one).
> >
> > No problem with that, better be in both branches than to get missed :-)
> 
> Thanks! In the patches tagged as "fixes," I think these 2 also need picking up:
> 
> commit aa6a6a2d16c1e2e27e986936369959d70316199f ("perf parse-events:
> Fix big-endian 'overwrite' by writing") to fix s390.
> commit 06ec44c2aa2ef15fd56f9808b6cf7495e1fbd8ec ("perf kvm stat: Fix
> relative paths for including headers") for non-x86 builds.
> 
> The other patches are test fixes or address malloc failures that
> shouldn't really happen, so they are not a priority.

Great!  Thanks for taking care of this.  I'll be more careful when
merging fixes.

Thanks,
Namhyung



* Re: perf stat issue with 7.0.0rc3
  2026-03-13 15:41   ` Ian Rogers
  2026-03-13 15:56     ` Arnaldo Melo
@ 2026-03-17 19:39     ` Arnaldo Carvalho de Melo
  2026-03-17 19:56       ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-03-17 19:39 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Thomas Richter, linux-perf-users, Jan Polensky,
	Linux Kernel Mailing List, Namhyung Kim

On Fri, Mar 13, 2026 at 08:41:48AM -0700, Ian Rogers wrote:
> On Fri, Mar 13, 2026 at 8:19 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > On Fri, Mar 13, 2026 at 02:13:18PM +0100, Thomas Richter wrote:
> > > I just discovered a strange behavior on linux 7.0.0rc3.
> > > This is the epected output, however when I use the perf version 7.0.0rc3:

> > > bash-5.3# ./perf -v
> > > perf version 7.0.rc3.g1f318b96cc84
> > > bash-5.3# ./perf stat -- true
> > > Error:
> > > No supported events found.
> > > trace.args_alignment

> > And this last line is even stranger, in my case I get something else,
> > also on x86_64:

> > root@number:~#  perf stat -- true
> > Error:
> > No supported events found.
> > addr2line.style
> > root@number:~#

> > On an ARM machine:

> > acme@raspberrypi:~/git/perf-tools $ perf stat -- true
> > Error:
> > No supported events found.

> > acme@raspberrypi:~/git/perf-tools $ uname -a
> > Linux raspberrypi 6.12.62+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.12.62-1+rpt1 (2025-12-18) aarch64 GNU/Linux
> > acme@raspberrypi:~/git/perf-tools $

> > I'll try to bisect this later, thanks for the report!
 
> Hi Arnaldo,
 
> Leo has pointed at the fix:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/tools/perf/util/metricgroup.c?h=perf-tools-next&id=c5a244bf17caf2de22f9e100832b75f72b31d3e6
> This also showed up as missing for the LTS backports:
> https://lore.kernel.org/lkml/ad95d781-7eb2-4c0c-a9e9-aaabae8eb602@kernel.org/
> so I thought it was flagged as a fix for the next PR. I don't see it in there:
> https://lore.kernel.org/lkml/20260313151434.1695228-1-acme@kernel.org/
> Could we get it in?

So, I applied the one Leo pointed out and got this:

root@number:~# perf stat sleep 1

 Performance counter stats for 'sleep 1':

                 1      context-switches                 #   3509.1 cs/sec  cs_per_second
                 0      cpu-migrations                   #      0.0 migrations/sec  migrations_per_second
                75      page-faults                      # 263181.0 faults/sec  page_faults_per_second
              0.28 msec task-clock                       #      0.0 CPUs  CPUs_utilized
             8,018      branch-misses                    #      4.2 %  branch_miss_rate
           191,108      branches                         #    670.6 M/sec  branch_frequency
           870,234      cpu-cycles                       #      3.1 GHz  cycles_frequency
     <not counted>      instructions                     #      nan instructions  insn_per_cycle  (0.00%)
     <not counted>      stalled-cycles-frontend          #      nan frontend_cycles_idle        (0.00%)

       1.001839832 seconds time elapsed

       0.000727000 seconds user
       0.000000000 seconds sys


Some events weren't counted. Try disabling the NMI watchdog:
	echo 0 > /proc/sys/kernel/nmi_watchdog
	perf stat ...
	echo 1 > /proc/sys/kernel/nmi_watchdog
root@number:~#  echo 0 > /proc/sys/kernel/nmi_watchdog


Which is strange: in the past (see below), both 'instructions' and
'stalled-cycles-frontend' were counted in the default (no -e) 'perf stat'
set of events. Now, if I disable the nmi_watchdog, I get 'instructions'
back, but not 'stalled-cycles-frontend':


root@number:~# perf stat sleep 1

 Performance counter stats for 'sleep 1':

                 1      context-switches                 #   6524.7 cs/sec  cs_per_second
                 0      cpu-migrations                   #      0.0 migrations/sec  migrations_per_second
                71      page-faults                      # 463256.0 faults/sec  page_faults_per_second
              0.15 msec task-clock                       #      0.0 CPUs  CPUs_utilized
             7,484      branch-misses                    #      4.0 %  branch_miss_rate
           186,108      branches                         #   1214.3 M/sec  branch_frequency
           810,475      cpu-cycles                       #      5.3 GHz  cycles_frequency
           893,290      instructions                     #      1.1 instructions  insn_per_cycle
     <not counted>      stalled-cycles-frontend          #      nan frontend_cycles_idle        (0.00%)

       1.000382345 seconds time elapsed

       0.000443000 seconds user
       0.000000000 seconds sys


If I go back to v6.18 (will try a bisect to see exactly where this
happens), I get the events above, with a different printing order, and
'instructions' and 'stalled-cycles-frontend' are both counted successfully:

root@number:~# perf -v
perf version 6.18.g7d0a66e4bb90
root@number:~# perf stat sleep 1

 Performance counter stats for 'sleep 1':

           202,103      task-clock                       #    0.000 CPUs utilized
                 1      context-switches                 #    4.948 K/sec
                 0      cpu-migrations                   #    0.000 /sec
                71      page-faults                      #  351.306 K/sec
           895,485      instructions                     #    0.85  insn per cycle
                                                  #    0.56  stalled cycles per insn
         1,048,898      cycles                           #    5.190 GHz
           504,794      stalled-cycles-frontend          #   48.13% frontend cycles idle
           186,002      branches                         #  920.333 M/sec
             7,857      branch-misses                    #    4.22% of all branches

       1.000537221 seconds time elapsed

       0.000542000 seconds user
       0.000000000 seconds sys


root@number:~#
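
As a cross-check of the '#' column in the v6.18 output above, the ratios
can be recomputed from the raw counts. The Python sketch below is only
an illustration: the dictionary keys and the function name are made up
for this example, and it assumes the task-clock value is in nanoseconds,
which is consistent with one context switch being printed as 4.948 K/sec.

```python
def derived_metrics(counts):
    """Recompute the '#' column ratios from raw counter values."""
    seconds = counts["task_clock_ns"] / 1e9
    return {
        "insn_per_cycle": counts["instructions"] / counts["cycles"],
        "stalled_cycles_per_insn":
            counts["stalled_cycles_frontend"] / counts["instructions"],
        "frontend_cycles_idle_pct":
            100.0 * counts["stalled_cycles_frontend"] / counts["cycles"],
        "branch_miss_pct": 100.0 * counts["branch_misses"] / counts["branches"],
        "cycles_ghz": counts["cycles"] / seconds / 1e9,
    }


# Raw values copied from the v6.18 'perf stat sleep 1' run above.
V6_18_COUNTS = {
    "task_clock_ns": 202_103,
    "instructions": 895_485,
    "cycles": 1_048_898,
    "stalled_cycles_frontend": 504_794,
    "branches": 186_002,
    "branch_misses": 7_857,
}

for name, value in derived_metrics(V6_18_COUNTS).items():
    print(f"{name:26s} {value:8.3f}")
```

Rounded, these come out as 0.85 insn per cycle, 0.56 stalled cycles per
insn, 48.13% frontend cycles idle, 4.22% branch misses and 5.190 GHz,
matching the printed column.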

Trying bisection now.

- Arnaldo


* Re: perf stat issue with 7.0.0rc3
  2026-03-17 19:39     ` Arnaldo Carvalho de Melo
@ 2026-03-17 19:56       ` Arnaldo Carvalho de Melo
  2026-03-17 20:12         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-03-17 19:56 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Thomas Richter, linux-perf-users, Jan Polensky,
	Linux Kernel Mailing List, Namhyung Kim

On Tue, Mar 17, 2026 at 04:39:24PM -0300, Arnaldo Carvalho de Melo wrote:
> On Fri, Mar 13, 2026 at 08:41:48AM -0700, Ian Rogers wrote:
> > On Fri, Mar 13, 2026 at 8:19 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > On Fri, Mar 13, 2026 at 02:13:18PM +0100, Thomas Richter wrote:
> > > > I just discovered a strange behavior on linux 7.0.0rc3.
> > > > This is the epected output, however when I use the perf version 7.0.0rc3:
> 
> > > > bash-5.3# ./perf -v
> > > > perf version 7.0.rc3.g1f318b96cc84
> > > > bash-5.3# ./perf stat -- true
> > > > Error:
> > > > No supported events found.
> > > > trace.args_alignment
> 
> > > And this last line is even stranger, in my case I get something else,
> > > also on x86_64:
> 
> > > root@number:~#  perf stat -- true
> > > Error:
> > > No supported events found.
> > > addr2line.style
> > > root@number:~#
> 
> > > On an ARM machine:
> 
> > > acme@raspberrypi:~/git/perf-tools $ perf stat -- true
> > > Error:
> > > No supported events found.
> 
> > > acme@raspberrypi:~/git/perf-tools $ uname -a
> > > Linux raspberrypi 6.12.62+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.12.62-1+rpt1 (2025-12-18) aarch64 GNU/Linux
> > > acme@raspberrypi:~/git/perf-tools $
> 
> > > I'll try to bisect this later, thanks for the report!
>  
> > Hi Arnaldo,
>  
> > Leo has pointed at the fix:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/tools/perf/util/metricgroup.c?h=perf-tools-next&id=c5a244bf17caf2de22f9e100832b75f72b31d3e6
> > This also showed up as missing for the LTS backports:
> > https://lore.kernel.org/lkml/ad95d781-7eb2-4c0c-a9e9-aaabae8eb602@kernel.org/
> > so I thought it was flagged as a fix for the next PR. I don't see it in there:
> > https://lore.kernel.org/lkml/20260313151434.1695228-1-acme@kernel.org/
> > Could we get it in?
> 
> So, I applied the one Leo pointed out and got this:
> 
> root@number:~# perf stat sleep 1
> 
>  Performance counter stats for 'sleep 1':
> 
>                  1      context-switches                 #   3509.1 cs/sec  cs_per_second
>                  0      cpu-migrations                   #      0.0 migrations/sec  migrations_per_second
>                 75      page-faults                      # 263181.0 faults/sec  page_faults_per_second
>               0.28 msec task-clock                       #      0.0 CPUs  CPUs_utilized
>              8,018      branch-misses                    #      4.2 %  branch_miss_rate
>            191,108      branches                         #    670.6 M/sec  branch_frequency
>            870,234      cpu-cycles                       #      3.1 GHz  cycles_frequency
>      <not counted>      instructions                     #      nan instructions  insn_per_cycle  (0.00%)
>      <not counted>      stalled-cycles-frontend          #      nan frontend_cycles_idle        (0.00%)
> 
>        1.001839832 seconds time elapsed
> 
>        0.000727000 seconds user
>        0.000000000 seconds sys
> 
> 
> Some events weren't counted. Try disabling the NMI watchdog:
> 	echo 0 > /proc/sys/kernel/nmi_watchdog
> 	perf stat ...
> 	echo 1 > /proc/sys/kernel/nmi_watchdog
> root@number:~#  echo 0 > /proc/sys/kernel/nmi_watchdog
> 
> 
> Which is strange, in the past (see below) instructions and
> stalled-cycles-frontend were counted in the default (no -e) 'perf stat'
> set of events, now if I disable the nmi_watchdog I get 'instructions'
> back, but not 'stalled-cycles-frontend':
> 
> 
> root@number:~# perf stat sleep 1
> 
>  Performance counter stats for 'sleep 1':
> 
>                  1      context-switches                 #   6524.7 cs/sec  cs_per_second
>                  0      cpu-migrations                   #      0.0 migrations/sec  migrations_per_second
>                 71      page-faults                      # 463256.0 faults/sec  page_faults_per_second
>               0.15 msec task-clock                       #      0.0 CPUs  CPUs_utilized
>              7,484      branch-misses                    #      4.0 %  branch_miss_rate
>            186,108      branches                         #   1214.3 M/sec  branch_frequency
>            810,475      cpu-cycles                       #      5.3 GHz  cycles_frequency
>            893,290      instructions                     #      1.1 instructions  insn_per_cycle
>      <not counted>      stalled-cycles-frontend          #      nan frontend_cycles_idle        (0.00%)
> 
>        1.000382345 seconds time elapsed
> 
>        0.000443000 seconds user
>        0.000000000 seconds sys
> 
> 
> If I go back to v6.18 (will try a bisect to see exactly where this
> happens), I get the events above, with a different printing order, but
> 'instructions' and 'stalled-cycles-frontend' successfully counts:
> 
> root@number:~# perf -v
> perf version 6.18.g7d0a66e4bb90
> root@number:~# perf stat sleep 1
> 
>  Performance counter stats for 'sleep 1':
> 
>            202,103      task-clock                       #    0.000 CPUs utilized
>                  1      context-switches                 #    4.948 K/sec
>                  0      cpu-migrations                   #    0.000 /sec
>                 71      page-faults                      #  351.306 K/sec
>            895,485      instructions                     #    0.85  insn per cycle
>                                                   #    0.56  stalled cycles per insn
>          1,048,898      cycles                           #    5.190 GHz
>            504,794      stalled-cycles-frontend          #   48.13% frontend cycles idle
>            186,002      branches                         #  920.333 M/sec
>              7,857      branch-misses                    #    4.22% of all branches
> 
>        1.000537221 seconds time elapsed
> 
>        0.000542000 seconds user
>        0.000000000 seconds sys
> 
> 
> root@number:~#
> 
> Trying bisection now.

It's related to this:

a3248b5b5427dc2126c19aa9c32f1e840b65024f is the first bad commit
commit a3248b5b5427dc2126c19aa9c32f1e840b65024f
Author: Ian Rogers <irogers@google.com>
Date:   Tue Nov 11 13:21:52 2025 -0800

    perf jevents: Add metric DefaultShowEvents

    Some Default group metrics require their events showing for
    consistency with perf's previous behavior. Add a flag to indicate when
    this is the case and use it in stat-display.

root@number:~# strace -e perf_event_open perf stat sleep 1
perf_event_open({type=PERF_TYPE_SOFTWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_SW_CONTEXT_SWITCHES, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
perf_event_open({type=PERF_TYPE_SOFTWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_SW_CPU_MIGRATIONS, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 4
perf_event_open({type=PERF_TYPE_SOFTWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_SW_PAGE_FAULTS, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 5
perf_event_open({type=PERF_TYPE_SOFTWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_SW_TASK_CLOCK, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 7
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xc3, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 8
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_BRANCH_INSTRUCTIONS, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, 8, PERF_FLAG_FD_CLOEXEC) = 9
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_BRANCH_INSTRUCTIONS, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 10
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 11
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 12
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xc0, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, 12, PERF_FLAG_FD_CLOEXEC) = 13
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 14
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, 14, PERF_FLAG_FD_CLOEXEC) = 15
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 16
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, 16, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=249821, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

 Performance counter stats for 'sleep 1':

                 1      context-switches                 #   3356.1 cs/sec  cs_per_second     
                 0      cpu-migrations                   #      0.0 migrations/sec  migrations_per_second
                74      page-faults                      # 248352.1 faults/sec  page_faults_per_second
              0.30 msec task-clock                       #      0.0 CPUs  CPUs_utilized       
             7,566      branch-misses                    #      4.0 %  branch_miss_rate       
           189,307      branches                         #    635.3 M/sec  branch_frequency   
           798,790      cpu-cycles                       #      2.7 GHz  cycles_frequency     
           908,746      instructions                     #      1.1 instructions  insn_per_cycle
     <not counted>      stalled-cycles-frontend          #      nan frontend_cycles_idle        (0.00%)

       1.000722079 seconds time elapsed

       0.000000000 seconds user
       0.000714000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=249820, si_uid=0} ---
+++ exited with 0 +++
root@number:~#

It is not trying PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, it is asking for
PERF_COUNT_HW_STALLED_CYCLES_BACKEND instead...
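
To make traces like the one above easier to scan, the perf_event_open
lines can be reduced to type, config, group leader and result. The
Python sketch below is an illustration only: the regular expression is
fitted to the strace formatting shown in this mail, the function names
are made up, and the sample lines are abridged from the trace above
(attribute fields shortened to strace's own "...").

```python
import re

# Fitted to the 'strace -e perf_event_open' line format shown above.
CALL_RE = re.compile(
    r"perf_event_open\(\{type=(?P<type>\w+),.*?config=(?P<config>\w+),.*?\},"
    r"\s*(?P<pid>-?\d+),\s*(?P<cpu>-?\d+),\s*(?P<group_fd>-?\d+),\s*[\w|]+\)"
    r"\s*=\s*(?P<ret>-?\d+)(?:\s+(?P<errno>[A-Z]+))?"
)


def parse_calls(trace):
    """Return one dict per perf_event_open line; other lines are skipped."""
    calls = []
    for line in trace.splitlines():
        match = CALL_RE.search(line)
        if match is None:
            continue
        call = match.groupdict()
        for key in ("pid", "cpu", "group_fd", "ret"):
            call[key] = int(call[key])
        calls.append(call)
    return calls


def failed(calls):
    """List (type, config, errno) for every call that returned an error."""
    return [(c["type"], c["config"], c["errno"]) for c in calls if c["ret"] < 0]


SAMPLE = """\
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, ...}, 249821, -1, -1, PERF_FLAG_FD_CLOEXEC) = 16
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, ...}, 249821, -1, 16, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=249821, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
"""

for event_type, config, errno_name in failed(parse_calls(SAMPLE)):
    print(event_type, config, errno_name)
```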

- Arnaldo


* Re: perf stat issue with 7.0.0rc3
  2026-03-17 19:56       ` Arnaldo Carvalho de Melo
@ 2026-03-17 20:12         ` Arnaldo Carvalho de Melo
  2026-03-17 20:50           ` Ian Rogers
  0 siblings, 1 reply; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-03-17 20:12 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Thomas Richter, linux-perf-users, Jan Polensky,
	Linux Kernel Mailing List, Namhyung Kim

On Tue, Mar 17, 2026 at 04:56:51PM -0300, Arnaldo Carvalho de Melo wrote:
> It is not trying PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, is asking for
> PERF_COUNT_HW_STALLED_CYCLES_BACKEND instead...

If I instead ask just for stalled-cycles-frontend and
stalled-cycles-backend:

root@number:~# strace -e perf_event_open perf stat -e stalled-cycles-frontend,stalled-cycles-backend sleep 1
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=250619, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

 Performance counter stats for 'sleep 1':

           409,276      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend

       1.000428804 seconds time elapsed

       0.000439000 seconds user
       0.000000000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=250618, si_uid=0} ---
+++ exited with 0 +++
root@number:~#

It used type=PERF_TYPE_RAW, config=0xa9 for stalled-cycles-frontend but
type=PERF_TYPE_HARDWARE, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND for
stalled-cycles-backend.

⬢ [acme@toolbx perf-tools]$ git grep stalled-cycles-frontend tools
tools/bpf/bpftool/link.c:       [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = "stalled-cycles-frontend",
tools/perf/builtin-stat.c:     3,856,436,920 stalled-cycles-frontend   #   74.09% frontend cycles idle
tools/perf/pmu-events/arch/common/common/legacy-hardware.json:    "EventName": "stalled-cycles-frontend",
tools/perf/pmu-events/empty-pmu-events.c:/* offset=122795 */ "stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
tools/perf/pmu-events/empty-pmu-events.c:/* offset=122945 */ "idle-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of stalled-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
tools/perf/pmu-events/empty-pmu-events.c:{ 122795 }, /* stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000 */
tools/perf/tests/shell/stat+std_output.sh:event_name=(cpu-clock task-clock context-switches cpu-migrations page-faults stalled-cycles-frontend stalled-cycles-backend cycles instructions branches branch-misses)
tools/perf/util/evsel.c:        "stalled-cycles-frontend",
⬢ [acme@toolbx perf-tools]$

This machine is:

⬢ [acme@toolbx perf-tools]$ grep -m1 "model name" /proc/cpuinfo
model name	: AMD Ryzen 9 9950X3D 16-Core Processor
⬢ [acme@toolbx perf-tools]$

It doesn't have PERF_COUNT_HW_STALLED_CYCLES_BACKEND, but it does have
PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, which gets configured using
PERF_TYPE_RAW and 0xa9 because:

root@number:~# cat /sys/devices/cpu/events/stalled-cycles-frontend
event=0xa9
root@number:~#
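
The sysfs string above maps onto the raw config seen in the strace
output. A minimal Python sketch of that mapping follows; the function
names are made up for illustration, and it assumes the PMU's 'event'
format field starts at bit 0 of config (the format/ directory is not
shown in this thread), so a lone event=0xa9 term gives config=0xa9.

```python
from pathlib import Path


def parse_event_terms(text):
    """Split a sysfs event string such as 'event=0xa9' into named terms."""
    terms = {}
    for term in text.strip().split(","):
        if not term:
            continue
        key, sep, value = term.partition("=")
        # Terms without '=value' are treated as flags set to 1.
        terms[key.strip()] = int(value, 0) if sep else 1
    return terms


def read_sysfs_event(name, pmu="cpu"):
    """Read and parse /sys/devices/<pmu>/events/<name>."""
    return parse_event_terms(Path(f"/sys/devices/{pmu}/events/{name}").read_text())


terms = parse_event_terms("event=0xa9\n")
print(hex(terms["event"]))
```

On the machine above, read_sysfs_event("stalled-cycles-frontend") should
return the same single term.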

But so far I can't explain why, in the default case, it is asking for
PERF_COUNT_HW_STALLED_CYCLES_BACKEND, when it should be asking for
PERF_COUNT_HW_STALLED_CYCLES_FRONTEND or PERF_TYPE_RAW+config=0xa9...

- Arnaldo


* Re: perf stat issue with 7.0.0rc3
  2026-03-13 18:27           ` Ian Rogers
  2026-03-13 21:10             ` Namhyung Kim
@ 2026-03-17 20:19             ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-03-17 20:19 UTC (permalink / raw)
  To: Ian Rogers; +Cc: Arnaldo Melo, Thomas Richter, linux-perf-use., Jan Polensky

On Fri, Mar 13, 2026 at 11:27:30AM -0700, Ian Rogers wrote:
> > >No worries, presumably if I'd been monitoring:
> > >https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git/log/?h=tmp.perf-tools
> > >I'd have seen it was missing. Perhaps you can give a heads up to check
> > >for missing patches when assembling the PRs. I rarely look at
> > >perf-tools.git as perf-tools-next.git is where all the action is :-)
> > >Actually, I vibe coded a script that may work to automate this:
> > >```

> > Well, if those patches are ok for perf-tools, i.e. are fixes for
> > bugs introduced in the current merge window or urgent fixes for
> > longstanding bugs, they shouldn't be in perf-tools-next, they should
> > be in perf-tools to be merger in the current merge window.

> > Coordination in triaging via an Acked-by/Reviewed-by pointing out
> > it's something for the current merge window, something I do, is what
> > we should strive to do.

> > When I notice something that's needed for the current window that's
> > been processed by Namhyung for perf-tools-next, I merge it as well
> > (as will be in this case and maybe in the others that you pointed
> > out, thanks, I'll check each one).

> > No problem with that, better be in both branches than to get missed
> > :-)
 
> Thanks! In the patches tagged as "fixes," I think these 2 also need picking up:
 
> commit aa6a6a2d16c1e2e27e986936369959d70316199f ("perf parse-events:
> Fix big-endian 'overwrite' by writing") to fix s390.
> commit 06ec44c2aa2ef15fd56f9808b6cf7495e1fbd8ec ("perf kvm stat: Fix
> relative paths for including headers") for non-x86 builds.

Picked the two above in addition to the original one that Leo pointed
out to me at the start of this thread.

Thanks,

- Arnaldo
 
> The other patches are test fixes or address malloc failures that
> shouldn't really happen, so they are not a priority.


* Re: perf stat issue with 7.0.0rc3
  2026-03-17 20:12         ` Arnaldo Carvalho de Melo
@ 2026-03-17 20:50           ` Ian Rogers
  2026-03-18  0:37             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Rogers @ 2026-03-17 20:50 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Thomas Richter, linux-perf-users, Jan Polensky,
	Linux Kernel Mailing List, Namhyung Kim

On Tue, Mar 17, 2026 at 1:12 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Tue, Mar 17, 2026 at 04:56:51PM -0300, Arnaldo Carvalho de Melo wrote:
> > It is not trying PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, is asking for
> > PERF_COUNT_HW_STALLED_CYCLES_BACKEND instead...
>
> If I instead ask just for stalled-cycles-frontend and
> stalled-cycles-backend:
>
> root@number:~# strace -e perf_event_open perf stat -e stalled-cycles-frontend,stalled-cycles-backend sleep 1

I think you intend for this to be system wide '-a'.

> perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=250619, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
>
>  Performance counter stats for 'sleep 1':
>
>            409,276      stalled-cycles-frontend
>    <not supported>      stalled-cycles-backend
>
>        1.000428804 seconds time elapsed
>
>        0.000439000 seconds user
>        0.000000000 seconds sys
>
>
> --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=250618, si_uid=0} ---
> +++ exited with 0 +++
> root@number:~#
>
> It used type=PERF_TYPE_RAW, config=0xa9 for  stalled-cycles-frontend but
> type=PERF_TYPE_HARDWARE, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND.
>
> ⬢ [acme@toolbx perf-tools]$ git grep stalled-cycles-frontend tools
> tools/bpf/bpftool/link.c:       [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = "stalled-cycles-frontend",
> tools/perf/builtin-stat.c:     3,856,436,920 stalled-cycles-frontend   #   74.09% frontend cycles idle
> tools/perf/pmu-events/arch/common/common/legacy-hardware.json:    "EventName": "stalled-cycles-frontend",
> tools/perf/pmu-events/empty-pmu-events.c:/* offset=122795 */ "stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
> tools/perf/pmu-events/empty-pmu-events.c:/* offset=122945 */ "idle-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of stalled-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
> tools/perf/pmu-events/empty-pmu-events.c:{ 122795 }, /* stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000 */
> tools/perf/tests/shell/stat+std_output.sh:event_name=(cpu-clock task-clock context-switches cpu-migrations page-faults stalled-cycles-frontend stalled-cycles-backend cycles instructions branches branch-misses)
> tools/perf/util/evsel.c:        "stalled-cycles-frontend",
> ⬢ [acme@toolbx perf-tools]$
>
> This machine is:
>
> ⬢ [acme@toolbx perf-tools]$ grep -m1 "model name" /proc/cpuinfo
> model name      : AMD Ryzen 9 9950X3D 16-Core Processor

Lots of missing legacy events on AMD. The problem is worse with -dd and -ddd.

> ⬢ [acme@toolbx perf-tools]
>
> And doesn't have PERF_COUNT_HW_STALLED_CYCLES_BACKEND, but has
> PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, that gets configured using
> PERF_TYPE_RAW and 0xa9 because:
>
> root@number:~# cat /sys/devices/cpu/events/stalled-cycles-frontend
> event=0xa9
> root@number:~#
>
> But I couldn't so far explain why in the default case it is asking for
> PERF_COUNT_HW_STALLED_CYCLES_BACKEND, when it should be asking for
> PERF_COUNT_HW_STALLED_CYCLES_FRONTEND or PERF_TYPE_RAW+config=0xa9...

So the default events/metrics are now in json:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/common/common/metrics.json?h=perf-tools-next
Relating to the stalls there are:
```
    {
        "BriefDescription": "Max front or backend stalls per instruction",
        "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
        "MetricGroup": "Default",
        "MetricName": "stalled_cycles_per_instruction",
        "DefaultShowEvents": "1"
    },
    {
        "BriefDescription": "Frontend stalls per cycle",
        "MetricExpr": "stalled\\-cycles\\-frontend / cpu\\-cycles",
        "MetricGroup": "Default",
        "MetricName": "frontend_cycles_idle",
        "MetricThreshold": "frontend_cycles_idle > 0.1",
        "DefaultShowEvents": "1"
    },
    {
        "BriefDescription": "Backend stalls per cycle",
        "MetricExpr": "stalled\\-cycles\\-backend / cpu\\-cycles",
        "MetricGroup": "Default",
        "MetricName": "backend_cycles_idle",
        "MetricThreshold": "backend_cycles_idle > 0.2",
        "DefaultShowEvents": "1"
    },
```
The stalled_cycles_per_instruction and backend_cycles_idle metrics should
fail as the stalled-cycles-backend event is missing. frontend_cycles_idle
should work; I wonder if the 0 counts relate to trouble scheduling
groups of events. I'll need more verbose output to understand. Perhaps
for stalled_cycles_per_instruction, we should modify the metric to
tolerate missing events:

max(stalled\\-cycles\\-frontend if
have_event(stalled\\-cycles\\-frontend) else 0,
stalled\\-cycles\\-backend if have_event(stalled\\-cycles\\-backend)
else 0) / instructions

Thanks,
Ian

> - Arnaldo


* Re: perf stat issue with 7.0.0rc3
  2026-03-17 20:50           ` Ian Rogers
@ 2026-03-18  0:37             ` Arnaldo Carvalho de Melo
  2026-03-18  2:25               ` Ian Rogers
  0 siblings, 1 reply; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-03-18  0:37 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Thomas Richter, linux-perf-users, Jan Polensky,
	Linux Kernel Mailing List, Namhyung Kim

On Tue, Mar 17, 2026 at 01:50:21PM -0700, Ian Rogers wrote:
> On Tue, Mar 17, 2026 at 1:12 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > On Tue, Mar 17, 2026 at 04:56:51PM -0300, Arnaldo Carvalho de Melo wrote:
> > > It is not trying PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, is asking for
> > > PERF_COUNT_HW_STALLED_CYCLES_BACKEND instead...

> > If I instead ask just for stalled-cycles-frontend and
> > stalled-cycles-backend:

> > root@number:~# strace -e perf_event_open perf stat -e stalled-cycles-frontend,stalled-cycles-backend sleep 1

> I think you intend for this to be system wide '-a'.

> > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> > perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=250619, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
> >
> >  Performance counter stats for 'sleep 1':
> >
> >            409,276      stalled-cycles-frontend
> >    <not supported>      stalled-cycles-backend
> >
> >        1.000428804 seconds time elapsed
> >
> >        0.000439000 seconds user
> >        0.000000000 seconds sys
> >
> >
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=250618, si_uid=0} ---
> > +++ exited with 0 +++
> > root@number:~#
> >
> > It used type=PERF_TYPE_RAW, config=0xa9 for  stalled-cycles-frontend but
> > type=PERF_TYPE_HARDWARE, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND.
> >
> > ⬢ [acme@toolbx perf-tools]$ git grep stalled-cycles-frontend tools
> > tools/bpf/bpftool/link.c:       [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = "stalled-cycles-frontend",
> > tools/perf/builtin-stat.c:     3,856,436,920 stalled-cycles-frontend   #   74.09% frontend cycles idle
> > tools/perf/pmu-events/arch/common/common/legacy-hardware.json:    "EventName": "stalled-cycles-frontend",
> > tools/perf/pmu-events/empty-pmu-events.c:/* offset=122795 */ "stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
> > tools/perf/pmu-events/empty-pmu-events.c:/* offset=122945 */ "idle-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of stalled-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
> > tools/perf/pmu-events/empty-pmu-events.c:{ 122795 }, /* stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000 */
> > tools/perf/tests/shell/stat+std_output.sh:event_name=(cpu-clock task-clock context-switches cpu-migrations page-faults stalled-cycles-frontend stalled-cycles-backend cycles instructions branches branch-misses)
> > tools/perf/util/evsel.c:        "stalled-cycles-frontend",
> > ⬢ [acme@toolbx perf-tools]$
> >
> > This machine is:
> >
> > ⬢ [acme@toolbx perf-tools]$ grep -m1 "model name" /proc/cpuinfo
> > model name      : AMD Ryzen 9 9950X3D 16-Core Processor
> 
> Lots of missing legacy events on AMD. The problem is worse with -dd and -ddd.
> 
> > ⬢ [acme@toolbx perf-tools]
> >
> > And doesn't have PERF_COUNT_HW_STALLED_CYCLES_BACKEND, but has
> > PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, that gets configured using
> > PERF_TYPE_RAW and 0xa9 because:
> >
> > root@number:~# cat /sys/devices/cpu/events/stalled-cycles-frontend
> > event=0xa9
> > root@number:~#
> >
> > But I couldn't so far explain why in the default case it is asking for
> > PERF_COUNT_HW_STALLED_CYCLES_BACKEND, when it should be asking for
> > PERF_COUNT_HW_STALLED_CYCLES_FRONTEND or PERF_TYPE_RAW+config=0xa9...

So you mean that it goes on to try this:

  {
        "BriefDescription": "Max front or backend stalls per instruction",
        "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
        "MetricGroup": "Default",
        "MetricName": "stalled_cycles_per_instruction",
        "DefaultShowEvents": "1"
    },

Yeah, it tries both:

perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 14, PERF_FLAG_FD_CLOEXEC) = 15
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, -1, PERF_FLAG_FD_CLOEXEC) = 16
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 16, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)

The RAW one is the equivalent of PERF_COUNT_HW_STALLED_CYCLES_FRONTEND,
as I see now that I have looked again at the 'strace perf stat sleep 1' output.
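
For anyone following along, a minimal sketch (not perf's actual parser) of how
a sysfs alias like the one quoted above could turn into the raw config seen in
the strace output. parse_alias is a hypothetical helper, and the sketch assumes
the 'event' term lands directly in the low bits of config, which is what the
config=0xa9 in the strace lines suggests:

```python
def parse_alias(text):
    """Split a sysfs event alias such as 'event=0xa9' into a {term: value} dict."""
    terms = {}
    for term in text.strip().split(","):
        name, _, value = term.partition("=")
        # Assumption for this sketch: a bare term without '=value' counts as 1.
        terms[name] = int(value, 0) if value else 1
    return terms


# Contents of /sys/devices/cpu/events/stalled-cycles-frontend as shown above.
alias = "event=0xa9\n"
terms = parse_alias(alias)

# Assumption: 'event' occupies the low bits of config, so the value is used as-is.
config = terms["event"]
print(hex(config))
```

The real mapping goes through the PMU's format files, so treat this only as an
illustration of why the open uses PERF_TYPE_RAW rather than PERF_TYPE_HARDWARE.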

But in the output it also says:

     <not counted>      stalled-cycles-frontend          #      nan frontend_cycles_idle        (0.00%)

And if I try just this one:

root@number:~# strace -e perf_event_open perf stat -e stalled-cycles-frontend sleep 1
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865273, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865273, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

 Performance counter stats for 'sleep 1':

           422,404      stalled-cycles-frontend

       1.000432524 seconds time elapsed

       0.000438000 seconds user
       0.000000000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865272, si_uid=0} ---
+++ exited with 0 +++
root@number:~#

It works, so that line with stalled-cycles-frontend could have produced
the value, not '<not counted>', as this call succeeded:

perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 14, PERF_FLAG_FD_CLOEXEC) = 15
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, -1, PERF_FLAG_FD_CLOEXEC) = 16

Maybe the explanation is that it tries the metric that uses both
frontend and backend, fails at backend, and then discards the
frontend?

 
> So the default events/metrics are now in json:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/common/common/metrics.json?h=perf-tools-next
> Relating to the stalls there are:
> ```
>     {
>         "BriefDescription": "Max front or backend stalls per instruction",
>         "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
>         "MetricGroup": "Default",
>         "MetricName": "stalled_cycles_per_instruction",
>         "DefaultShowEvents": "1"
>     },

root@number:~# strace -e perf_event_open perf stat -M stalled_cycles_per_instruction sleep 1
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865362, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
Error:
No supported events found.
The stalled-cycles-backend event is not supported.
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=865362, si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} ---
+++ exited with 1 +++
root@number:~#

>     {
>         "BriefDescription": "Frontend stalls per cycle",
>         "MetricExpr": "stalled\\-cycles\\-frontend / cpu\\-cycles",
>         "MetricGroup": "Default",
>         "MetricName": "frontend_cycles_idle",
>         "MetricThreshold": "frontend_cycles_idle > 0.1",
>         "DefaultShowEvents": "1"
>     },

root@number:~# strace -e perf_event_open perf stat -M frontend_cycles_idle sleep 1
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865414, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865414, -1, 3, PERF_FLAG_FD_CLOEXEC) = 4
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865414, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

 Performance counter stats for 'sleep 1':

           881,022      cpu-cycles                       #     0.48 frontend_cycles_idle
           422,386      stalled-cycles-frontend

       1.000468505 seconds time elapsed

       0.000504000 seconds user
       0.000000000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865413, si_uid=0} ---
+++ exited with 0 +++
root@number:~#
>     {
>         "BriefDescription": "Backend stalls per cycle",
>         "MetricExpr": "stalled\\-cycles\\-backend / cpu\\-cycles",
>         "MetricGroup": "Default",
>         "MetricName": "backend_cycles_idle",
>         "MetricThreshold": "backend_cycles_idle > 0.2",
>         "DefaultShowEvents": "1"
>     },

root@number:~# strace -e perf_event_open perf stat -M backend_cycles_idle sleep 1
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865442, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865442, -1, 3, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865442, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

 Performance counter stats for 'sleep 1':

     <not counted>      cpu-cycles                       #      nan backend_cycles_idle       
   <not supported>      stalled-cycles-backend                                                

       1.000739264 seconds time elapsed

       0.000675000 seconds user
       0.000000000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865441, si_uid=0} ---
+++ exited with 0 +++
root@number:~#

> ```
> The stalled_cycles_per_instruction and backend_cycles_idle should fail
> as the stalled-cycles-backend event is missing. frontend_cycles_idle
> should work, I wonder if the 0 counts relate to trouble scheduling
> groups of events. I'll need more verbose output to understand. Perhaps
> for stalled_cycles_per_instruction, we should modify the metric to
> tolerate missing events:
> 
> max(stalled\\-cycles\\-frontend if
> have_event(stalled\\-cycles\\-frontend) else 0,
> stalled\\-cycles\\-backend if have_event(stalled\\-cycles\\-backend)
> else 0) / instructions

That have_event() part also has to be implemented, right?

- Arnaldo


* Re: perf stat issue with 7.0.0rc3
  2026-03-18  0:37             ` Arnaldo Carvalho de Melo
@ 2026-03-18  2:25               ` Ian Rogers
  2026-03-19  1:01                 ` [PATCH v1] perf metrics: Make common stalled metrics conditional on having the event Ian Rogers
  2026-03-24  4:19                 ` perf stat issue with 7.0.0rc3 Ian Rogers
  0 siblings, 2 replies; 18+ messages in thread
From: Ian Rogers @ 2026-03-18  2:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Thomas Richter, linux-perf-users, Jan Polensky,
	Linux Kernel Mailing List, Namhyung Kim

On Tue, Mar 17, 2026 at 5:37 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Tue, Mar 17, 2026 at 01:50:21PM -0700, Ian Rogers wrote:
> > On Tue, Mar 17, 2026 at 1:12 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > On Tue, Mar 17, 2026 at 04:56:51PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > It is not trying PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, is asking for
> > > > PERF_COUNT_HW_STALLED_CYCLES_BACKEND instead...
>
> > > If I instead ask just for stalled-cycles-frontend and
> > > stalled-cycles-backend:
>
> > > root@number:~# strace -e perf_event_open perf stat -e stalled-cycles-frontend,stalled-cycles-backend sleep 1
>
> > I think you intend for this to be system wide '-a'.
>
> > > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> > > perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
> > > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=250619, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
> > >
> > >  Performance counter stats for 'sleep 1':
> > >
> > >            409,276      stalled-cycles-frontend
> > >    <not supported>      stalled-cycles-backend
> > >
> > >        1.000428804 seconds time elapsed
> > >
> > >        0.000439000 seconds user
> > >        0.000000000 seconds sys
> > >
> > >
> > > --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=250618, si_uid=0} ---
> > > +++ exited with 0 +++
> > > root@number:~#
> > >
> > > It used type=PERF_TYPE_RAW, config=0xa9 for  stalled-cycles-frontend but
> > > type=PERF_TYPE_HARDWARE, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND.
> > >
> > > ⬢ [acme@toolbx perf-tools]$ git grep stalled-cycles-frontend tools
> > > tools/bpf/bpftool/link.c:       [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = "stalled-cycles-frontend",
> > > tools/perf/builtin-stat.c:     3,856,436,920 stalled-cycles-frontend   #   74.09% frontend cycles idle
> > > tools/perf/pmu-events/arch/common/common/legacy-hardware.json:    "EventName": "stalled-cycles-frontend",
> > > tools/perf/pmu-events/empty-pmu-events.c:/* offset=122795 */ "stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
> > > tools/perf/pmu-events/empty-pmu-events.c:/* offset=122945 */ "idle-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of stalled-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
> > > tools/perf/pmu-events/empty-pmu-events.c:{ 122795 }, /* stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000 */
> > > tools/perf/tests/shell/stat+std_output.sh:event_name=(cpu-clock task-clock context-switches cpu-migrations page-faults stalled-cycles-frontend stalled-cycles-backend cycles instructions branches branch-misses)
> > > tools/perf/util/evsel.c:        "stalled-cycles-frontend",
> > > ⬢ [acme@toolbx perf-tools]$
> > >
> > > This machine is:
> > >
> > > ⬢ [acme@toolbx perf-tools]$ grep -m1 "model name" /proc/cpuinfo
> > > model name      : AMD Ryzen 9 9950X3D 16-Core Processor
> >
> > Lots of missing legacy events on AMD. The problem is worse with -dd and -ddd.
> >
> > > ⬢ [acme@toolbx perf-tools]
> > >
> > > And doesn't have PERF_COUNT_HW_STALLED_CYCLES_BACKEND, but has
> > > PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, that gets configured using
> > > PERF_TYPE_RAW and 0xa9 because:
> > >
> > > root@number:~# cat /sys/devices/cpu/events/stalled-cycles-frontend
> > > event=0xa9
> > > root@number:~#
> > >
> > > But I couldn't so far explain why in the default case it is asking for
> > > PERF_COUNT_HW_STALLED_CYCLES_BACKEND, when it should be asking for
> > > PERF_COUNT_HW_STALLED_CYCLES_FRONTEND or PERF_TYPE_RAW+config=0xa9...
>
> So you mean that it goes on to try this:
>
>   {
>         "BriefDescription": "Max front or backend stalls per instruction",
>         "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
>         "MetricGroup": "Default",
>         "MetricName": "stalled_cycles_per_instruction",
>         "DefaultShowEvents": "1"
>     },
>
> Yeah, it tries both:
>
> perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 14, PERF_FLAG_FD_CLOEXEC) = 15
> perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, -1, PERF_FLAG_FD_CLOEXEC) = 16
> perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 16, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)

perf trace? The formatting on perf_event_open is better :-)

> The RAW one is the equivalent to PERF_COUNT_HW_STALLED_CYCLES_FRONTEND,
> I see now that I looked again at the 'strace perf stat sleep 1'
>
> But in the output it also says:
>
>      <not counted>      stalled-cycles-frontend          #      nan frontend_cycles_idle        (0.00%)
>
> And if I try just this one:
>
> root@number:~# strace -e perf_event_open perf stat -e stalled-cycles-frontend sleep 1
> perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865273, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865273, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
>
>  Performance counter stats for 'sleep 1':
>
>            422,404      stalled-cycles-frontend
>
>        1.000432524 seconds time elapsed
>
>        0.000438000 seconds user
>        0.000000000 seconds sys
>
>
> --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865272, si_uid=0} ---
> +++ exited with 0 +++
> root@number:~#
>
> It works, so that line with stalled-cycles-frontend could have produced
> the value, not '<not counted>', as this call succeeded:
>
> perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 14, PERF_FLAG_FD_CLOEXEC) = 15
> perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, -1, PERF_FLAG_FD_CLOEXEC) = 16
>
> Maybe the explanation is that it tries the metric, that uses both
> frontend and backend, it fails at backend and then it discards the
> frontend?

The metric logic tries to be smart, creating groups of events and then
sharing events between groups to avoid programming more events. I
suspect the frontend and backend are in a group, maybe the backend is
the leader. The backend fails to open, resulting in no group_fd and a
broken group for both the metrics.
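
To make that suspicion concrete, here is a toy model, not perf's code and not a
claim about what the tool really does. It assumes the first event in a list is
the group leader and that siblings can only be opened against the leader's fd.
open_group and the fd numbering are made up for illustration:

```python
def open_group(events, supported):
    """Return {event: fd or None}; events[0] is treated as the group leader."""
    fds = {}
    leader_fd = None
    next_fd = 3
    for index, event in enumerate(events):
        is_leader = index == 0
        # A sibling has nothing to attach to when the leader failed to open.
        if event not in supported or (not is_leader and leader_fd is None):
            fds[event] = None
            continue
        fds[event] = next_fd
        if is_leader:
            leader_fd = next_fd
        next_fd += 1
    return fds


supported = {"stalled-cycles-frontend"}

# Backend as leader: its failure leaves the frontend without a group_fd.
print(open_group(["stalled-cycles-backend", "stalled-cycles-frontend"], supported))

# Frontend as leader: only the unsupported sibling is lost.
print(open_group(["stalled-cycles-frontend", "stalled-cycles-backend"], supported))
```

If the real grouping behaves like the first case, a supported event would end up
uncounted purely because of where it sits in the group.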

> > So the default events/metrics are now in json:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/common/common/metrics.json?h=perf-tools-next
> > Relating to the stalls there are:
> > ```
> >     {
> >         "BriefDescription": "Max front or backend stalls per instruction",
> >         "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
> >         "MetricGroup": "Default",
> >         "MetricName": "stalled_cycles_per_instruction",
> >         "DefaultShowEvents": "1"
> >     },
>
> root@number:~# strace -e perf_event_open perf stat -M stalled_cycles_per_instruction sleep 1
> perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865362, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
> Error:
> No supported events found.
> The stalled-cycles-backend event is not supported.
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=865362, si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} ---
> +++ exited with 1 +++
> root@number:~#
>
> >     {
> >         "BriefDescription": "Frontend stalls per cycle",
> >         "MetricExpr": "stalled\\-cycles\\-frontend / cpu\\-cycles",
> >         "MetricGroup": "Default",
> >         "MetricName": "frontend_cycles_idle",
> >         "MetricThreshold": "frontend_cycles_idle > 0.1",
> >         "DefaultShowEvents": "1"
> >     },
>
> root@number:~# strace -e perf_event_open perf stat -M frontend_cycles_idle sleep 1
> perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865414, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865414, -1, 3, PERF_FLAG_FD_CLOEXEC) = 4
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865414, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
>
>  Performance counter stats for 'sleep 1':
>
>            881,022      cpu-cycles                       #     0.48 frontend_cycles_idle
>            422,386      stalled-cycles-frontend
>
>        1.000468505 seconds time elapsed
>
>        0.000504000 seconds user
>        0.000000000 seconds sys
>
>
> --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865413, si_uid=0} ---
> +++ exited with 0 +++
> root@number:~#
> >     {
> >         "BriefDescription": "Backend stalls per cycle",
> >         "MetricExpr": "stalled\\-cycles\\-backend / cpu\\-cycles",
> >         "MetricGroup": "Default",
> >         "MetricName": "backend_cycles_idle",
> >         "MetricThreshold": "backend_cycles_idle > 0.2",
> >         "DefaultShowEvents": "1"
> >     },
>
> root@number:~# strace -e perf_event_open perf stat -M backend_cycles_idle sleep 1
> perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865442, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865442, -1, 3, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865442, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
>
>  Performance counter stats for 'sleep 1':
>
>      <not counted>      cpu-cycles                       #      nan backend_cycles_idle
>    <not supported>      stalled-cycles-backend
>
>        1.000739264 seconds time elapsed
>
>        0.000675000 seconds user
>        0.000000000 seconds sys
>
>
> --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865441, si_uid=0} ---
> +++ exited with 0 +++
> root@number:~#
>
> > ```
> > The stalled_cycles_per_instruction and backend_cycles_idle should fail
> > as the stalled-cycles-backend event is missing. frontend_cycles_idle
> > should work, I wonder if the 0 counts relate to trouble scheduling
> > groups of events. I'll need more verbose output to understand. Perhaps
> > for stalled_cycles_per_instruction, we should modify the metric to
> > tolerate missing events:
> >
> > max(stalled\\-cycles\\-frontend if
> > have_event(stalled\\-cycles\\-frontend) else 0,
> > stalled\\-cycles\\-backend if have_event(stalled\\-cycles\\-backend)
> > else 0) / instructions
>
> That have_event() part would also have to be implemented, right?

I mistyped, has_event not have_event :-)

Thanks,
Ian

> - Arnaldo


* [PATCH v1] perf metrics: Make common stalled metrics conditional on having the event
  2026-03-18  2:25               ` Ian Rogers
@ 2026-03-19  1:01                 ` Ian Rogers
  2026-03-24  4:19                 ` perf stat issue with 7.0.0rc3 Ian Rogers
  1 sibling, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2026-03-19  1:01 UTC (permalink / raw)
  To: acme; +Cc: irogers, japo, linux-kernel, linux-perf-users, namhyung, tmricht

The metric code uses the event parsing code but generally assumes
that all events are supported. Arnaldo reported an AMD system that
supports stalled-cycles-frontend but not stalled-cycles-backend [1].
The problem is that, before parsing happens, the metric code tries to
share events within groups to reduce the number of events and
multiplexing. If a group contains both supported and unsupported
events, the whole group is broken. To avoid this situation, add
has_event tests to the metrics that use stalled-cycles-frontend and
stalled-cycles-backend. has_event is evaluated when parsing the
metric and its result is constant propagated (through the if-elses)
to reduce the number of events. This means that when the metric code
considers sharing events, only supported events will be shared.

Note for backporting: this change updates
tools/perf/pmu-events/empty-pmu-events.c, a convenience file for builds
on systems without python present. While the metrics.json change should
backport easily, there can be conflicts in empty-pmu-events.c. In that
case the build will have left a file, test-empty-pmu-events.c, that can
be copied over empty-pmu-events.c to resolve the conflicts and produce
an empty-pmu-events.c that matches the json in the source tree at the
time of the build.

[1] https://lore.kernel.org/lkml/abm1nR-2xjOUBroD@x1/

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Closes: https://lore.kernel.org/lkml/abm1nR-2xjOUBroD@x1/
Fixes: c7adeb0974f1 ("perf jevents: Add set of common metrics based on default ones")
Signed-off-by: Ian Rogers <irogers@google.com>
---
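To illustrate the constant propagation described in the commit message,
here is a rough Python sketch. It is not the perf implementation (the
real metric parser is C and is structured differently). The node
classes, simplify(), events() and the amd_like set are all hypothetical
names made up for this note. The only thing taken from the patch is the
shape of the stalled_cycles_per_instruction expression.

```python
from dataclasses import dataclass

# Hypothetical expression nodes, for illustration only.
@dataclass(frozen=True)
class Event:
    name: str

@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class HasEvent:
    name: str

@dataclass(frozen=True)
class BinOp:
    op: str  # "/", "max" or "&"
    lhs: object
    rhs: object

@dataclass(frozen=True)
class IfElse:
    then: object
    cond: object
    otherwise: object


def simplify(node, supported):
    """Fold has_event() to a constant and drop the branch not taken."""
    if isinstance(node, HasEvent):
        return Const(1.0 if node.name in supported else 0.0)
    if isinstance(node, IfElse):
        cond = simplify(node.cond, supported)
        if isinstance(cond, Const):
            taken = node.then if cond.value else node.otherwise
            return simplify(taken, supported)
        return IfElse(simplify(node.then, supported), cond,
                      simplify(node.otherwise, supported))
    if isinstance(node, BinOp):
        lhs = simplify(node.lhs, supported)
        rhs = simplify(node.rhs, supported)
        if node.op == "&" and isinstance(lhs, Const) and isinstance(rhs, Const):
            return Const(1.0 if lhs.value and rhs.value else 0.0)
        return BinOp(node.op, lhs, rhs)
    return node  # Event and Const are already as simple as they get


def events(node):
    """Collect the event names that remain in an expression."""
    if isinstance(node, Event):
        return {node.name}
    if isinstance(node, BinOp):
        return events(node.lhs) | events(node.rhs)
    if isinstance(node, IfElse):
        return events(node.then) | events(node.cond) | events(node.otherwise)
    return set()


FE = Event("stalled-cycles-frontend")
BE = Event("stalled-cycles-backend")
INSN = Event("instructions")

# Same shape as the new MetricExpr for stalled_cycles_per_instruction.
stalled_cycles_per_instruction = IfElse(
    BinOp("/", BinOp("max", FE, BE), INSN),
    BinOp("&", HasEvent(FE.name), HasEvent(BE.name)),
    IfElse(BinOp("/", FE, INSN), HasEvent(FE.name),
           IfElse(BinOp("/", BE, INSN), HasEvent(BE.name), Const(0.0))),
)

# Assumption: a machine like the one in the report, with frontend stalls
# but no backend stalls event.
amd_like = {"stalled-cycles-frontend", "instructions", "cpu-cycles"}
needed = events(simplify(stalled_cycles_per_instruction, amd_like))
print(sorted(needed))  # ['instructions', 'stalled-cycles-frontend']
```

With the unsupported event folded away before events are collected,
stalled-cycles-backend never reaches the grouping step, so in this
sketch it cannot end up in a group with supported events.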
 .../arch/common/common/metrics.json           |   6 +-
 tools/perf/pmu-events/empty-pmu-events.c      | 108 +++++++++---------
 2 files changed, 57 insertions(+), 57 deletions(-)

diff --git a/tools/perf/pmu-events/arch/common/common/metrics.json b/tools/perf/pmu-events/arch/common/common/metrics.json
index 0d010b3ebc6d..cefc8bfe7830 100644
--- a/tools/perf/pmu-events/arch/common/common/metrics.json
+++ b/tools/perf/pmu-events/arch/common/common/metrics.json
@@ -46,14 +46,14 @@
     },
     {
         "BriefDescription": "Max front or backend stalls per instruction",
-        "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
+        "MetricExpr": "(max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions) if (has_event(stalled\\-cycles\\-frontend) & has_event(stalled\\-cycles\\-backend)) else ((stalled\\-cycles\\-frontend / instructions) if has_event(stalled\\-cycles\\-frontend) else ((stalled\\-cycles\\-backend / instructions) if has_event(stalled\\-cycles\\-backend) else 0))",
         "MetricGroup": "Default",
         "MetricName": "stalled_cycles_per_instruction",
         "DefaultShowEvents": "1"
     },
     {
         "BriefDescription": "Frontend stalls per cycle",
-        "MetricExpr": "stalled\\-cycles\\-frontend / cpu\\-cycles",
+        "MetricExpr": "(stalled\\-cycles\\-frontend / cpu\\-cycles) if has_event(stalled\\-cycles\\-frontend) else 0",
         "MetricGroup": "Default",
         "MetricName": "frontend_cycles_idle",
         "MetricThreshold": "frontend_cycles_idle > 0.1",
@@ -61,7 +61,7 @@
     },
     {
         "BriefDescription": "Backend stalls per cycle",
-        "MetricExpr": "stalled\\-cycles\\-backend / cpu\\-cycles",
+        "MetricExpr": "(stalled\\-cycles\\-backend / cpu\\-cycles) if has_event(stalled\\-cycles\\-backend) else 0",
         "MetricGroup": "Default",
         "MetricName": "backend_cycles_idle",
         "MetricThreshold": "backend_cycles_idle > 0.2",
diff --git a/tools/perf/pmu-events/empty-pmu-events.c b/tools/perf/pmu-events/empty-pmu-events.c
index 76c395cf513c..a92dd0424f79 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -1310,33 +1310,33 @@ static const char *const big_c_string =
 /* offset=128375 */ "migrations_per_second\000Default\000software@cpu\\-migrations\\,name\\=cpu\\-migrations@ * 1e9 / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Process migrations to a new CPU per CPU second\000\0001migrations/sec\000\000\000\000011"
 /* offset=128635 */ "page_faults_per_second\000Default\000software@page\\-faults\\,name\\=page\\-faults@ * 1e9 / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Page faults per CPU second\000\0001faults/sec\000\000\000\000011"
 /* offset=128866 */ "insn_per_cycle\000Default\000instructions / cpu\\-cycles\000insn_per_cycle < 1\000Instructions Per Cycle\000\0001instructions\000\000\000\000001"
-/* offset=128979 */ "stalled_cycles_per_instruction\000Default\000max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions\000\000Max front or backend stalls per instruction\000\000\000\000\000\000001"
-/* offset=129143 */ "frontend_cycles_idle\000Default\000stalled\\-cycles\\-frontend / cpu\\-cycles\000frontend_cycles_idle > 0.1\000Frontend stalls per cycle\000\000\000\000\000\000001"
-/* offset=129273 */ "backend_cycles_idle\000Default\000stalled\\-cycles\\-backend / cpu\\-cycles\000backend_cycles_idle > 0.2\000Backend stalls per cycle\000\000\000\000\000\000001"
-/* offset=129399 */ "cycles_frequency\000Default\000cpu\\-cycles / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Cycles per CPU second\000\0001GHz\000\000\000\000011"
-/* offset=129575 */ "branch_frequency\000Default\000branches / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Branches per CPU second\000\0001000M/sec\000\000\000\000011"
-/* offset=129755 */ "branch_miss_rate\000Default\000branch\\-misses / branches\000branch_miss_rate > 0.05\000Branch miss rate\000\000100%\000\000\000\000001"
-/* offset=129859 */ "l1d_miss_rate\000Default2\000L1\\-dcache\\-load\\-misses / L1\\-dcache\\-loads\000l1d_miss_rate > 0.05\000L1D  miss rate\000\000100%\000\000\000\000001"
-/* offset=129975 */ "llc_miss_rate\000Default2\000LLC\\-load\\-misses / LLC\\-loads\000llc_miss_rate > 0.05\000LLC miss rate\000\000100%\000\000\000\000001"
-/* offset=130076 */ "l1i_miss_rate\000Default3\000L1\\-icache\\-load\\-misses / L1\\-icache\\-loads\000l1i_miss_rate > 0.05\000L1I miss rate\000\000100%\000\000\000\000001"
-/* offset=130191 */ "dtlb_miss_rate\000Default3\000dTLB\\-load\\-misses / dTLB\\-loads\000dtlb_miss_rate > 0.05\000dTLB miss rate\000\000100%\000\000\000\000001"
-/* offset=130297 */ "itlb_miss_rate\000Default3\000iTLB\\-load\\-misses / iTLB\\-loads\000itlb_miss_rate > 0.05\000iTLB miss rate\000\000100%\000\000\000\000001"
-/* offset=130403 */ "l1_prefetch_miss_rate\000Default4\000L1\\-dcache\\-prefetch\\-misses / L1\\-dcache\\-prefetches\000l1_prefetch_miss_rate > 0.05\000L1 prefetch miss rate\000\000100%\000\000\000\000001"
-/* offset=130551 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\000000"
-/* offset=130574 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\000000"
-/* offset=130638 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\000000"
-/* offset=130805 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\000000"
-/* offset=130870 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\000000"
-/* offset=130938 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\000000"
-/* offset=131010 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\000000"
-/* offset=131105 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\000000"
-/* offset=131240 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\000000"
-/* offset=131305 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\000000"
-/* offset=131374 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\000000"
-/* offset=131445 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\000000"
-/* offset=131468 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\000000"
-/* offset=131491 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\000000"
-/* offset=131512 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\000000"
+/* offset=128979 */ "stalled_cycles_per_instruction\000Default\000(max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions if has_event(stalled\\-cycles\\-frontend) & has_event(stalled\\-cycles\\-backend) else (stalled\\-cycles\\-frontend / instructions if has_event(stalled\\-cycles\\-frontend) else (stalled\\-cycles\\-backend / instructions if has_event(stalled\\-cycles\\-backend) else 0)))\000\000Max front or backend stalls per instruction\000\000\000\000\000\000001"
+/* offset=129404 */ "frontend_cycles_idle\000Default\000(stalled\\-cycles\\-frontend / cpu\\-cycles if has_event(stalled\\-cycles\\-frontend) else 0)\000frontend_cycles_idle > 0.1\000Frontend stalls per cycle\000\000\000\000\000\000001"
+/* offset=129583 */ "backend_cycles_idle\000Default\000(stalled\\-cycles\\-backend / cpu\\-cycles if has_event(stalled\\-cycles\\-backend) else 0)\000backend_cycles_idle > 0.2\000Backend stalls per cycle\000\000\000\000\000\000001"
+/* offset=129757 */ "cycles_frequency\000Default\000cpu\\-cycles / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Cycles per CPU second\000\0001GHz\000\000\000\000011"
+/* offset=129933 */ "branch_frequency\000Default\000branches / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Branches per CPU second\000\0001000M/sec\000\000\000\000011"
+/* offset=130113 */ "branch_miss_rate\000Default\000branch\\-misses / branches\000branch_miss_rate > 0.05\000Branch miss rate\000\000100%\000\000\000\000001"
+/* offset=130217 */ "l1d_miss_rate\000Default2\000L1\\-dcache\\-load\\-misses / L1\\-dcache\\-loads\000l1d_miss_rate > 0.05\000L1D  miss rate\000\000100%\000\000\000\000001"
+/* offset=130333 */ "llc_miss_rate\000Default2\000LLC\\-load\\-misses / LLC\\-loads\000llc_miss_rate > 0.05\000LLC miss rate\000\000100%\000\000\000\000001"
+/* offset=130434 */ "l1i_miss_rate\000Default3\000L1\\-icache\\-load\\-misses / L1\\-icache\\-loads\000l1i_miss_rate > 0.05\000L1I miss rate\000\000100%\000\000\000\000001"
+/* offset=130549 */ "dtlb_miss_rate\000Default3\000dTLB\\-load\\-misses / dTLB\\-loads\000dtlb_miss_rate > 0.05\000dTLB miss rate\000\000100%\000\000\000\000001"
+/* offset=130655 */ "itlb_miss_rate\000Default3\000iTLB\\-load\\-misses / iTLB\\-loads\000itlb_miss_rate > 0.05\000iTLB miss rate\000\000100%\000\000\000\000001"
+/* offset=130761 */ "l1_prefetch_miss_rate\000Default4\000L1\\-dcache\\-prefetch\\-misses / L1\\-dcache\\-prefetches\000l1_prefetch_miss_rate > 0.05\000L1 prefetch miss rate\000\000100%\000\000\000\000001"
+/* offset=130909 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\000000"
+/* offset=130932 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\000000"
+/* offset=130996 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\000000"
+/* offset=131163 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\000000"
+/* offset=131228 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\000000"
+/* offset=131296 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\000000"
+/* offset=131368 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\000000"
+/* offset=131463 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\000000"
+/* offset=131598 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\000000"
+/* offset=131663 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\000000"
+/* offset=131732 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\000000"
+/* offset=131803 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\000000"
+/* offset=131826 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\000000"
+/* offset=131849 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\000000"
+/* offset=131870 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\000000"
 ;
 
 static const struct compact_pmu_event pmu_events__common_default_core[] = {
@@ -2626,22 +2626,22 @@ static const struct pmu_table_entry pmu_events__common[] = {
 
 static const struct compact_pmu_event pmu_metrics__common_default_core[] = {
 { 127956 }, /* CPUs_utilized\000Default\000(software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@) / (duration_time * 1e9)\000\000Average CPU utilization\000\0001CPUs\000\000\000\000011 */
-{ 129273 }, /* backend_cycles_idle\000Default\000stalled\\-cycles\\-backend / cpu\\-cycles\000backend_cycles_idle > 0.2\000Backend stalls per cycle\000\000\000\000\000\000001 */
-{ 129575 }, /* branch_frequency\000Default\000branches / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Branches per CPU second\000\0001000M/sec\000\000\000\000011 */
-{ 129755 }, /* branch_miss_rate\000Default\000branch\\-misses / branches\000branch_miss_rate > 0.05\000Branch miss rate\000\000100%\000\000\000\000001 */
+{ 129583 }, /* backend_cycles_idle\000Default\000(stalled\\-cycles\\-backend / cpu\\-cycles if has_event(stalled\\-cycles\\-backend) else 0)\000backend_cycles_idle > 0.2\000Backend stalls per cycle\000\000\000\000\000\000001 */
+{ 129933 }, /* branch_frequency\000Default\000branches / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Branches per CPU second\000\0001000M/sec\000\000\000\000011 */
+{ 130113 }, /* branch_miss_rate\000Default\000branch\\-misses / branches\000branch_miss_rate > 0.05\000Branch miss rate\000\000100%\000\000\000\000001 */
 { 128142 }, /* cs_per_second\000Default\000software@context\\-switches\\,name\\=context\\-switches@ * 1e9 / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Context switches per CPU second\000\0001cs/sec\000\000\000\000011 */
-{ 129399 }, /* cycles_frequency\000Default\000cpu\\-cycles / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Cycles per CPU second\000\0001GHz\000\000\000\000011 */
-{ 130191 }, /* dtlb_miss_rate\000Default3\000dTLB\\-load\\-misses / dTLB\\-loads\000dtlb_miss_rate > 0.05\000dTLB miss rate\000\000100%\000\000\000\000001 */
-{ 129143 }, /* frontend_cycles_idle\000Default\000stalled\\-cycles\\-frontend / cpu\\-cycles\000frontend_cycles_idle > 0.1\000Frontend stalls per cycle\000\000\000\000\000\000001 */
+{ 129757 }, /* cycles_frequency\000Default\000cpu\\-cycles / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Cycles per CPU second\000\0001GHz\000\000\000\000011 */
+{ 130549 }, /* dtlb_miss_rate\000Default3\000dTLB\\-load\\-misses / dTLB\\-loads\000dtlb_miss_rate > 0.05\000dTLB miss rate\000\000100%\000\000\000\000001 */
+{ 129404 }, /* frontend_cycles_idle\000Default\000(stalled\\-cycles\\-frontend / cpu\\-cycles if has_event(stalled\\-cycles\\-frontend) else 0)\000frontend_cycles_idle > 0.1\000Frontend stalls per cycle\000\000\000\000\000\000001 */
 { 128866 }, /* insn_per_cycle\000Default\000instructions / cpu\\-cycles\000insn_per_cycle < 1\000Instructions Per Cycle\000\0001instructions\000\000\000\000001 */
-{ 130297 }, /* itlb_miss_rate\000Default3\000iTLB\\-load\\-misses / iTLB\\-loads\000itlb_miss_rate > 0.05\000iTLB miss rate\000\000100%\000\000\000\000001 */
-{ 130403 }, /* l1_prefetch_miss_rate\000Default4\000L1\\-dcache\\-prefetch\\-misses / L1\\-dcache\\-prefetches\000l1_prefetch_miss_rate > 0.05\000L1 prefetch miss rate\000\000100%\000\000\000\000001 */
-{ 129859 }, /* l1d_miss_rate\000Default2\000L1\\-dcache\\-load\\-misses / L1\\-dcache\\-loads\000l1d_miss_rate > 0.05\000L1D  miss rate\000\000100%\000\000\000\000001 */
-{ 130076 }, /* l1i_miss_rate\000Default3\000L1\\-icache\\-load\\-misses / L1\\-icache\\-loads\000l1i_miss_rate > 0.05\000L1I miss rate\000\000100%\000\000\000\000001 */
-{ 129975 }, /* llc_miss_rate\000Default2\000LLC\\-load\\-misses / LLC\\-loads\000llc_miss_rate > 0.05\000LLC miss rate\000\000100%\000\000\000\000001 */
+{ 130655 }, /* itlb_miss_rate\000Default3\000iTLB\\-load\\-misses / iTLB\\-loads\000itlb_miss_rate > 0.05\000iTLB miss rate\000\000100%\000\000\000\000001 */
+{ 130761 }, /* l1_prefetch_miss_rate\000Default4\000L1\\-dcache\\-prefetch\\-misses / L1\\-dcache\\-prefetches\000l1_prefetch_miss_rate > 0.05\000L1 prefetch miss rate\000\000100%\000\000\000\000001 */
+{ 130217 }, /* l1d_miss_rate\000Default2\000L1\\-dcache\\-load\\-misses / L1\\-dcache\\-loads\000l1d_miss_rate > 0.05\000L1D  miss rate\000\000100%\000\000\000\000001 */
+{ 130434 }, /* l1i_miss_rate\000Default3\000L1\\-icache\\-load\\-misses / L1\\-icache\\-loads\000l1i_miss_rate > 0.05\000L1I miss rate\000\000100%\000\000\000\000001 */
+{ 130333 }, /* llc_miss_rate\000Default2\000LLC\\-load\\-misses / LLC\\-loads\000llc_miss_rate > 0.05\000LLC miss rate\000\000100%\000\000\000\000001 */
 { 128375 }, /* migrations_per_second\000Default\000software@cpu\\-migrations\\,name\\=cpu\\-migrations@ * 1e9 / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Process migrations to a new CPU per CPU second\000\0001migrations/sec\000\000\000\000011 */
 { 128635 }, /* page_faults_per_second\000Default\000software@page\\-faults\\,name\\=page\\-faults@ * 1e9 / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)\000\000Page faults per CPU second\000\0001faults/sec\000\000\000\000011 */
-{ 128979 }, /* stalled_cycles_per_instruction\000Default\000max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions\000\000Max front or backend stalls per instruction\000\000\000\000\000\000001 */
+{ 128979 }, /* stalled_cycles_per_instruction\000Default\000(max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions if has_event(stalled\\-cycles\\-frontend) & has_event(stalled\\-cycles\\-backend) else (stalled\\-cycles\\-frontend / instructions if has_event(stalled\\-cycles\\-frontend) else (stalled\\-cycles\\-backend / instructions if has_event(stalled\\-cycles\\-backend) else 0)))\000\000Max front or backend stalls per instruction\000\000\000\000\000\000001 */
 
 };
 
@@ -2714,21 +2714,21 @@ static const struct pmu_table_entry pmu_events__test_soc_cpu[] = {
 };
 
 static const struct compact_pmu_event pmu_metrics__test_soc_cpu_default_core[] = {
-{ 130551 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\000000 */
-{ 131240 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\000000 */
-{ 131010 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\000000 */
-{ 131105 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\000000 */
-{ 131305 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\000000 */
-{ 131374 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\000000 */
-{ 130638 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\000000 */
-{ 130574 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\000000 */
-{ 131512 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\000000 */
-{ 131445 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\000000 */
-{ 131468 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\000000 */
-{ 131491 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\000000 */
-{ 130938 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\000000 */
-{ 130805 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\000000 */
-{ 130870 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\000000 */
+{ 130909 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\000000 */
+{ 131598 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\000000 */
+{ 131368 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\000000 */
+{ 131463 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\000000 */
+{ 131663 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\000000 */
+{ 131732 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\000000 */
+{ 130996 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\000000 */
+{ 130932 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\000000 */
+{ 131870 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\000000 */
+{ 131803 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\000000 */
+{ 131826 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\000000 */
+{ 131849 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\000000 */
+{ 131296 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\000000 */
+{ 131163 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\000000 */
+{ 131228 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\000000 */
 
 };
 
-- 
2.53.0.851.ga537e3e6e9-goog



* Re: perf stat issue with 7.0.0rc3
  2026-03-18  2:25               ` Ian Rogers
  2026-03-19  1:01                 ` [PATCH v1] perf metrics: Make common stalled metrics conditional on having the event Ian Rogers
@ 2026-03-24  4:19                 ` Ian Rogers
  1 sibling, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2026-03-24  4:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Thomas Richter, linux-perf-users, Jan Polensky,
	Linux Kernel Mailing List, Namhyung Kim

On Tue, Mar 17, 2026 at 7:25 PM Ian Rogers <irogers@google.com> wrote:
>
> On Tue, Mar 17, 2026 at 5:37 PM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > On Tue, Mar 17, 2026 at 01:50:21PM -0700, Ian Rogers wrote:
> > > On Tue, Mar 17, 2026 at 1:12 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > On Tue, Mar 17, 2026 at 04:56:51PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > It is not trying PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, it is asking for
> > > > > PERF_COUNT_HW_STALLED_CYCLES_BACKEND instead...
> >
> > > > If I instead ask just for stalled-cycles-frontend and
> > > > stalled-cycles-backend:
> >
> > > > root@number:~# strace -e perf_event_open perf stat -e stalled-cycles-frontend,stalled-cycles-backend sleep 1
> >
> > > I think you intend for this to be system wide '-a'.
> >
> > > > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> > > > perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
> > > > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=250619, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
> > > >
> > > >  Performance counter stats for 'sleep 1':
> > > >
> > > >            409,276      stalled-cycles-frontend
> > > >    <not supported>      stalled-cycles-backend
> > > >
> > > >        1.000428804 seconds time elapsed
> > > >
> > > >        0.000439000 seconds user
> > > >        0.000000000 seconds sys
> > > >
> > > >
> > > > --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=250618, si_uid=0} ---
> > > > +++ exited with 0 +++
> > > > root@number:~#
> > > >
> > > > It used type=PERF_TYPE_RAW, config=0xa9 for  stalled-cycles-frontend but
> > > > type=PERF_TYPE_HARDWARE, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND.
> > > >
> > > > ⬢ [acme@toolbx perf-tools]$ git grep stalled-cycles-frontend tools
> > > > tools/bpf/bpftool/link.c:       [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = "stalled-cycles-frontend",
> > > > tools/perf/builtin-stat.c:     3,856,436,920 stalled-cycles-frontend   #   74.09% frontend cycles idle
> > > > tools/perf/pmu-events/arch/common/common/legacy-hardware.json:    "EventName": "stalled-cycles-frontend",
> > > > tools/perf/pmu-events/empty-pmu-events.c:/* offset=122795 */ "stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
> > > > tools/perf/pmu-events/empty-pmu-events.c:/* offset=122945 */ "idle-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of stalled-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
> > > > tools/perf/pmu-events/empty-pmu-events.c:{ 122795 }, /* stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000 */
> > > > tools/perf/tests/shell/stat+std_output.sh:event_name=(cpu-clock task-clock context-switches cpu-migrations page-faults stalled-cycles-frontend stalled-cycles-backend cycles instructions branches branch-misses)
> > > > tools/perf/util/evsel.c:        "stalled-cycles-frontend",
> > > > ⬢ [acme@toolbx perf-tools]$
> > > >
> > > > This machine is:
> > > >
> > > > ⬢ [acme@toolbx perf-tools]$ grep -m1 "model name" /proc/cpuinfo
> > > > model name      : AMD Ryzen 9 9950X3D 16-Core Processor
> > >
> > > Lots of missing legacy events on AMD. The problem is worse with -dd and -ddd.
> > >
> > > > ⬢ [acme@toolbx perf-tools]
> > > >
> > > > And doesn't have PERF_COUNT_HW_STALLED_CYCLES_BACKEND, but has
> > > > PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, that gets configured using
> > > > PERF_TYPE_RAW and 0xa9 because:
> > > >
> > > > root@number:~# cat /sys/devices/cpu/events/stalled-cycles-frontend
> > > > event=0xa9
> > > > root@number:~#
> > > >
> > > > But I couldn't so far explain why in the default case it is asking for
> > > > PERF_COUNT_HW_STALLED_CYCLES_BACKEND, when it should be asking for
> > > > PERF_COUNT_HW_STALLED_CYCLES_FRONTEND or PERF_TYPE_RAW+config=0xa9...
> >
> > So you mean that it goes on to try this:
> >
> >   {
> >         "BriefDescription": "Max front or backend stalls per instruction",
> >         "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
> >         "MetricGroup": "Default",
> >         "MetricName": "stalled_cycles_per_instruction",
> >         "DefaultShowEvents": "1"
> >     },
> >
> > Yeah, it tries both:
> >
> > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 14, PERF_FLAG_FD_CLOEXEC) = 15
> > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, -1, PERF_FLAG_FD_CLOEXEC) = 16
> > perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 16, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
>
> perf trace? The formatting on perf_event_open is better :-)
>
> > The RAW one is the equivalent to PERF_COUNT_HW_STALLED_CYCLES_FRONTEND,
> > I see now that I looked again at the 'strace perf stat sleep 1'
> >
> > But in the output it also says:
> >
> >      <not counted>      stalled-cycles-frontend          #      nan frontend_cycles_idle        (0.00%)
> >
> > And if I try just this one:
> >
> > root@number:~# strace -e perf_event_open perf stat -e stalled-cycles-frontend sleep 1
> > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865273, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865273, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
> >
> >  Performance counter stats for 'sleep 1':
> >
> >            422,404      stalled-cycles-frontend
> >
> >        1.000432524 seconds time elapsed
> >
> >        0.000438000 seconds user
> >        0.000000000 seconds sys
> >
> >
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865272, si_uid=0} ---
> > +++ exited with 0 +++
> > root@number:~#
> >
> > It works, so that line with stalled-cycles-frontend could have produced
> > the value, not '<not counted>', as this call succeeded:
> >
> > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 14, PERF_FLAG_FD_CLOEXEC) = 15
> > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, -1, PERF_FLAG_FD_CLOEXEC) = 16
> >
> > Maybe the explanation is that it tries the metric, that uses both
> > frontend and backend, it fails at backend and then it discards the
> > frontend?
>
> The metric logic tries to be smart, creating groups of events and then
> sharing events between groups to avoid programming more events. I
> suspect the frontend and backend are in a group, maybe the backend is
> the leader. The backend fails to open, resulting in no group_fd and a
> broken group for both the metrics.
>
> > > So the default events/metrics are now in json:
> > > https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/common/common/metrics.json?h=perf-tools-next
> > > Relating to the stalls there are:
> > > ```
> > >     {
> > >         "BriefDescription": "Max front or backend stalls per instruction",
> > >         "MetricExpr": "max(stalled\\-cycles\\-frontend,
> > > stalled\\-cycles\\-backend) / instructions",
> > >         "MetricGroup": "Default",
> > >         "MetricName": "stalled_cycles_per_instruction",
> > >         "DefaultShowEvents": "1"
> > >     },
> >
> > root@number:~# strace -e perf_event_open perf stat -M stalled_cycles_per_instruction sleep 1
> > perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865362, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
> > Error:
> > No supported events found.
> > The stalled-cycles-backend event is not supported.
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=865362, si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} ---
> > +++ exited with 1 +++
> > root@number:~#
> >
> > >     {
> > >         "BriefDescription": "Frontend stalls per cycle",
> > >         "MetricExpr": "stalled\\-cycles\\-frontend / cpu\\-cycles",
> > >         "MetricGroup": "Default",
> > >         "MetricName": "frontend_cycles_idle",
> > >         "MetricThreshold": "frontend_cycles_idle > 0.1",
> > >         "DefaultShowEvents": "1"
> > >     },
> >
> > root@number:~# strace -e perf_event_open perf stat -M frontend_cycles_idle sleep 1
> > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865414, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865414, -1, 3, PERF_FLAG_FD_CLOEXEC) = 4
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865414, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
> >
> >  Performance counter stats for 'sleep 1':
> >
> >            881,022      cpu-cycles                       #     0.48 frontend_cycles_idle
> >            422,386      stalled-cycles-frontend
> >
> >        1.000468505 seconds time elapsed
> >
> >        0.000504000 seconds user
> >        0.000000000 seconds sys
> >
> >
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865413, si_uid=0} ---
> > +++ exited with 0 +++
> > root@number:~#
> > >     {
> > >         "BriefDescription": "Backend stalls per cycle",
> > >         "MetricExpr": "stalled\\-cycles\\-backend / cpu\\-cycles",
> > >         "MetricGroup": "Default",
> > >         "MetricName": "backend_cycles_idle",
> > >         "MetricThreshold": "backend_cycles_idle > 0.2",
> > >         "DefaultShowEvents": "1"
> > >     },
> >
> > root@number:~# strace -e perf_event_open perf stat -M backend_cycles_idle sleep 1
> > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865442, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> > perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865442, -1, 3, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865442, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
> >
> >  Performance counter stats for 'sleep 1':
> >
> >      <not counted>      cpu-cycles                       #      nan backend_cycles_idle
> >    <not supported>      stalled-cycles-backend
> >
> >        1.000739264 seconds time elapsed
> >
> >        0.000675000 seconds user
> >        0.000000000 seconds sys
> >
> >
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865441, si_uid=0} ---
> > +++ exited with 0 +++
> > root@number:~#
> >
> > > ```
> > > The stalled_cycles_per_instruction and backend_cycles_idle should fail
> > > as the stalled-cycles-backend event is missing. frontend_cycles_idle
> > > should work, I wonder if the 0 counts relate to trouble scheduling
> > > groups of events. I'll need more verbose output to understand. Perhaps
> > > for stalled_cycles_per_instruction, we should modify the metric to
> > > tolerate missing events:
> > >
> > > max(stalled\\-cycles\\-frontend if
> > > have_event(stalled\\-cycles\\-frontend) else 0,
> > > stalled\\-cycles\\-backend if have_event(stalled\\-cycles\\-backend)
> > > else 0) / instructions
> >
> > That have_event() part also has to be implemented, right?
>
> I mistyped, has_event not have_event :-)

I posted:
https://lore.kernel.org/lkml/20260319010103.834106-1-irogers@google.com/
that implements the has_event approach.
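
For illustration only, here is a minimal Python sketch of the arithmetic
the conditional metric is meant to perform. It is not the code in the
patch. The function name, the dictionary of counts and the instruction
count are all made up for the example; only the frontend count (422,386)
is taken from the output quoted above.

```python
def stalled_cycles_per_instruction(counts, instructions):
    """Evaluate max(frontend, backend) / instructions, treating a missing event as 0.

    `counts` is a hypothetical mapping of event name to counter value; an
    event the PMU does not provide is simply absent from it. This mirrors:
      max(fe if has_event(fe) else 0, be if has_event(be) else 0) / instructions
    """
    def has_event(name):
        # Stand-in for the metric expression's has_event(): here it only
        # checks whether a count was supplied for the event.
        return name in counts

    fe = "stalled-cycles-frontend"
    be = "stalled-cycles-backend"
    frontend = counts[fe] if has_event(fe) else 0
    backend = counts[be] if has_event(be) else 0
    return max(frontend, backend) / instructions


# AMD-like case from this thread: frontend present, backend missing.
amd_counts = {"stalled-cycles-frontend": 422_386}
# The instruction count is an invented number, not a measurement.
ratio = stalled_cycles_per_instruction(amd_counts, instructions=1_000_000)
print(f"{ratio:.6f}")
```

With the backend event absent, the metric falls back to the frontend
count alone rather than failing to open the whole group.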

Thanks,
Ian

> Thanks,
> Ian
>
> > - Arnaldo


end of thread, other threads:[~2026-03-24  4:19 UTC | newest]

Thread overview: 18+ messages
2026-03-13 13:13 perf stat issue with 7.0.0rc3 Thomas Richter
2026-03-13 15:13 ` Leo Yan
2026-03-13 15:19 ` Arnaldo Carvalho de Melo
2026-03-13 15:41   ` Ian Rogers
2026-03-13 15:56     ` Arnaldo Melo
2026-03-13 16:10       ` Ian Rogers
2026-03-13 17:01         ` Arnaldo Melo
2026-03-13 18:27           ` Ian Rogers
2026-03-13 21:10             ` Namhyung Kim
2026-03-17 20:19             ` Arnaldo Carvalho de Melo
2026-03-17 19:39     ` Arnaldo Carvalho de Melo
2026-03-17 19:56       ` Arnaldo Carvalho de Melo
2026-03-17 20:12         ` Arnaldo Carvalho de Melo
2026-03-17 20:50           ` Ian Rogers
2026-03-18  0:37             ` Arnaldo Carvalho de Melo
2026-03-18  2:25               ` Ian Rogers
2026-03-19  1:01                 ` [PATCH v1] perf metrics: Make common stalled metrics conditional on having the event Ian Rogers
2026-03-24  4:19                 ` perf stat issue with 7.0.0rc3 Ian Rogers
