From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>,
	linux-perf-users@vger.kernel.org,
	Jan Polensky <japo@linux.ibm.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Namhyung Kim <namhyung@kernel.org>
Subject: Re: perf stat issue with 7.0.0rc3
Date: Tue, 17 Mar 2026 21:37:38 -0300
Message-ID: <abnz0pxixB7L7-7R@x1>
In-Reply-To: <CAP-5=fU76bLLgorJJ0CVXwTanaZTbjtV=EWA2evy8UjGE8Sw4Q@mail.gmail.com>

On Tue, Mar 17, 2026 at 01:50:21PM -0700, Ian Rogers wrote:
> On Tue, Mar 17, 2026 at 1:12 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > On Tue, Mar 17, 2026 at 04:56:51PM -0300, Arnaldo Carvalho de Melo wrote:
> > > It is not trying PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, it is asking for
> > > PERF_COUNT_HW_STALLED_CYCLES_BACKEND instead...

> > If I instead ask just for stalled-cycles-frontend and
> > stalled-cycles-backend:

> > root@number:~# strace -e perf_event_open perf stat -e stalled-cycles-frontend,stalled-cycles-backend sleep 1

> I think you intend for this to be system wide '-a'.

> > perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> > perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 250619, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=250619, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
> >
> >  Performance counter stats for 'sleep 1':
> >
> >            409,276      stalled-cycles-frontend
> >    <not supported>      stalled-cycles-backend
> >
> >        1.000428804 seconds time elapsed
> >
> >        0.000439000 seconds user
> >        0.000000000 seconds sys
> >
> >
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=250618, si_uid=0} ---
> > +++ exited with 0 +++
> > root@number:~#
> >
> > It used type=PERF_TYPE_RAW, config=0xa9 for stalled-cycles-frontend but
> > type=PERF_TYPE_HARDWARE, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND
> > for stalled-cycles-backend.
> >
> > ⬢ [acme@toolbx perf-tools]$ git grep stalled-cycles-frontend tools
> > tools/bpf/bpftool/link.c:       [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = "stalled-cycles-frontend",
> > tools/perf/builtin-stat.c:     3,856,436,920 stalled-cycles-frontend   #   74.09% frontend cycles idle
> > tools/perf/pmu-events/arch/common/common/legacy-hardware.json:    "EventName": "stalled-cycles-frontend",
> > tools/perf/pmu-events/empty-pmu-events.c:/* offset=122795 */ "stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
> > tools/perf/pmu-events/empty-pmu-events.c:/* offset=122945 */ "idle-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of stalled-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000"
> > tools/perf/pmu-events/empty-pmu-events.c:{ 122795 }, /* stalled-cycles-frontend\000legacy hardware\000Stalled cycles during issue [This event is an alias of idle-cycles-frontend]\000legacy-hardware-config=7\000\00000\000\000\000\000\000 */
> > tools/perf/tests/shell/stat+std_output.sh:event_name=(cpu-clock task-clock context-switches cpu-migrations page-faults stalled-cycles-frontend stalled-cycles-backend cycles instructions branches branch-misses)
> > tools/perf/util/evsel.c:        "stalled-cycles-frontend",
> > ⬢ [acme@toolbx perf-tools]$
> >
> > This machine is:
> >
> > ⬢ [acme@toolbx perf-tools]$ grep -m1 "model name" /proc/cpuinfo
> > model name      : AMD Ryzen 9 9950X3D 16-Core Processor
> 
> Lots of missing legacy events on AMD. The problem is worse with -dd and -ddd.
> 
> > ⬢ [acme@toolbx perf-tools]
> >
> > And doesn't have PERF_COUNT_HW_STALLED_CYCLES_BACKEND, but has
> > PERF_COUNT_HW_STALLED_CYCLES_FRONTEND, that gets configured using
> > PERF_TYPE_RAW and 0xa9 because:
> >
> > root@number:~# cat /sys/devices/cpu/events/stalled-cycles-frontend
> > event=0xa9
> > root@number:~#
> >
> > But I couldn't so far explain why in the default case it is asking for
> > PERF_COUNT_HW_STALLED_CYCLES_BACKEND, when it should be asking for
> > PERF_COUNT_HW_STALLED_CYCLES_FRONTEND or PERF_TYPE_RAW+config=0xa9...

So you mean that it goes on to try this:

  {
        "BriefDescription": "Max front or backend stalls per instruction",
        "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
        "MetricGroup": "Default",
        "MetricName": "stalled_cycles_per_instruction",
        "DefaultShowEvents": "1"
    },

Yeah, it tries both:

perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 14, PERF_FLAG_FD_CLOEXEC) = 15
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, -1, PERF_FLAG_FD_CLOEXEC) = 16
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 16, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)

The RAW one is the equivalent of PERF_COUNT_HW_STALLED_CYCLES_FRONTEND,
as I see now that I have looked again at the 'strace perf stat sleep 1'
output.
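
As a side note, the step from the sysfs alias (event=0xa9) to the raw
config can be sketched in a few lines of Python. This is only an
illustration: the helper names are made up, and the format string
"config:0-7,32-35" is my assumption about what
/sys/devices/cpu/format/event contains on this machine, so check it
locally before relying on it.

```python
def parse_terms(text):
    """Parse a sysfs event string such as 'event=0xa9,umask=0x01' into a dict."""
    terms = {}
    for item in text.strip().split(","):
        name, _, value = item.partition("=")
        # A bare term with no '=value' is treated as 1.
        terms[name] = int(value, 0) if value else 1
    return terms


def place_bits(value, fmt):
    """Spread `value` over the bit ranges of a format spec like 'config:0-7,32-35'."""
    field, _, ranges = fmt.strip().partition(":")
    out = 0
    for part in ranges.split(","):
        lo, _, hi = part.partition("-")
        lo = int(lo)
        hi = int(hi) if hi else lo
        width = hi - lo + 1
        # Low bits of the value go into the first range, the rest into later ranges.
        out |= (value & ((1 << width) - 1)) << lo
        value >>= width
    return field, out


# Assumed contents of the sysfs files (the first one is quoted earlier in the thread).
terms = parse_terms("event=0xa9\n")
field, config = place_bits(terms["event"], "config:0-7,32-35")
print(field, hex(config))  # prints: config 0xa9
```

With only the low eight bits set, the event number lands unchanged in
config, which matches the config=0xa9 seen in the strace output.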

But in the output it also says:

     <not counted>      stalled-cycles-frontend          #      nan frontend_cycles_idle        (0.00%)

And if I try just this one:

root@number:~# strace -e perf_event_open perf stat -e stalled-cycles-frontend sleep 1
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865273, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865273, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

 Performance counter stats for 'sleep 1':

           422,404      stalled-cycles-frontend

       1.000432524 seconds time elapsed

       0.000438000 seconds user
       0.000000000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865272, si_uid=0} ---
+++ exited with 0 +++
root@number:~#

It works, so in the default run the stalled-cycles-frontend line could
have shown the value instead of '<not counted>', since these calls
succeeded:

perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, 14, PERF_FLAG_FD_CLOEXEC) = 15
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865157, -1, -1, PERF_FLAG_FD_CLOEXEC) = 16

Maybe the explanation is that it tries the metric, which uses both
frontend and backend, fails at backend, and then discards the frontend
as well?
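
To compare which opens succeed and which group each one joins, the
strace lines can be tabulated with a small script. This is a rough
sketch, not a tool from the perf tree: the regex and the summarize()
helper are made up for this mail, and the sample lines are abbreviated
copies of the ones above with most attr fields dropped.

```python
import re

# Pull out the fields of interest from a strace'd perf_event_open() line.
LINE = re.compile(
    r"perf_event_open\(\{type=(?P<type>\w+), .*?config=(?P<config>\w+), .*\}, "
    r"(?P<pid>-?\d+), (?P<cpu>-?\d+), (?P<group_fd>-?\d+), [\w|]+\)"
    r" = (?P<ret>-?\d+)(?: (?P<errno>[A-Z]+))?"
)


def summarize(line):
    """Return type, config, group_fd, return value and errno name, or None."""
    m = LINE.search(line)
    if m is None:
        return None
    return {
        "type": m["type"],
        "config": m["config"],
        "group_fd": int(m["group_fd"]),
        "ret": int(m["ret"]),
        "errno": m["errno"],
    }


# Abbreviated sample lines (assumption: real lines carry many more attr fields).
SAMPLE = [
    "perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, "
    "inherit=1, ...}, 865157, -1, 14, PERF_FLAG_FD_CLOEXEC) = 15",
    "perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, "
    "config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, inherit=1, ...}, 865157, -1, 16, "
    "PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)",
]

for line in SAMPLE:
    print(summarize(line))
```

A group_fd of -1 marks a group leader, any other value is the fd of the
leader the event is attached to, so the table shows at a glance which
group loses a member when an open fails.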

 
> So the default events/metrics are now in json:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/common/common/metrics.json?h=perf-tools-next
> Relating to the stalls there are:
> ```
>     {
>         "BriefDescription": "Max front or backend stalls per instruction",
>         "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
>         "MetricGroup": "Default",
>         "MetricName": "stalled_cycles_per_instruction",
>         "DefaultShowEvents": "1"
>     },

root@number:~# strace -e perf_event_open perf stat -M stalled_cycles_per_instruction sleep 1
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865362, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
Error:
No supported events found.
The stalled-cycles-backend event is not supported.
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=865362, si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} ---
+++ exited with 1 +++
root@number:~#

>     {
>         "BriefDescription": "Frontend stalls per cycle",
>         "MetricExpr": "stalled\\-cycles\\-frontend / cpu\\-cycles",
>         "MetricGroup": "Default",
>         "MetricName": "frontend_cycles_idle",
>         "MetricThreshold": "frontend_cycles_idle > 0.1",
>         "DefaultShowEvents": "1"
>     },

root@number:~# strace -e perf_event_open perf stat -M frontend_cycles_idle sleep 1
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865414, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0xa9, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865414, -1, 3, PERF_FLAG_FD_CLOEXEC) = 4
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865414, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

 Performance counter stats for 'sleep 1':

           881,022      cpu-cycles                       #     0.48 frontend_cycles_idle
           422,386      stalled-cycles-frontend

       1.000468505 seconds time elapsed

       0.000504000 seconds user
       0.000000000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865413, si_uid=0} ---
+++ exited with 0 +++
root@number:~#
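
As a sanity check on the 0.48 above, the metric is just the ratio of the
two counts, compared against the 0.1 threshold from the json. A minimal
sketch follows; ratio() is a made-up helper, and returning NaN when a
count is missing is only my way of modelling the 'nan' that shows up in
the output, not a description of how perf computes it.

```python
import math


def ratio(numerator, denominator):
    """Return numerator/denominator, or NaN if a count is missing or the divisor is 0."""
    if numerator is None or denominator is None or denominator == 0:
        return math.nan
    return numerator / denominator


frontend = 422_386  # stalled-cycles-frontend, from the run above
cycles = 881_022    # cpu-cycles, from the run above

value = ratio(frontend, cycles)
print(f"{value:.2f}", value > 0.1)  # prints: 0.48 True
```

So nearly half of the cycles are counted as frontend stalls in this
short run, well above the MetricThreshold.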
>     {
>         "BriefDescription": "Backend stalls per cycle",
>         "MetricExpr": "stalled\\-cycles\\-backend / cpu\\-cycles",
>         "MetricGroup": "Default",
>         "MetricName": "backend_cycles_idle",
>         "MetricThreshold": "backend_cycles_idle > 0.2",
>         "DefaultShowEvents": "1"
>     },

root@number:~# strace -e perf_event_open perf stat -M backend_cycles_idle sleep 1
perf_event_open({type=PERF_TYPE_RAW, size=PERF_ATTR_SIZE_VER9, config=0x76, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, ...}, 865442, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
perf_event_open({type=PERF_TYPE_HARDWARE, size=PERF_ATTR_SIZE_VER9, config=PERF_COUNT_HW_STALLED_CYCLES_BACKEND, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP, inherit=1, precise_ip=0 /* arbitrary skid */, ...}, 865442, -1, 3, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=865442, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

 Performance counter stats for 'sleep 1':

     <not counted>      cpu-cycles                       #      nan backend_cycles_idle       
   <not supported>      stalled-cycles-backend                                                

       1.000739264 seconds time elapsed

       0.000675000 seconds user
       0.000000000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=865441, si_uid=0} ---
+++ exited with 0 +++
root@number:~#

> ```
> The stalled_cycles_per_instruction and backend_cycles_idle should fail
> as the stalled-cycles-backend event is missing. frontend_cycles_idle
> should work, I wonder if the 0 counts relate to trouble scheduling
> groups of events. I'll need more verbose output to understand. Perhaps
> for stalled_cycles_per_instruction, we should modify the metric to
> tolerate missing events:
> 
> max(stalled\\-cycles\\-frontend if have_event(stalled\\-cycles\\-frontend) else 0,
>     stalled\\-cycles\\-backend if have_event(stalled\\-cycles\\-backend) else 0) / instructions

That have_event() part also has to be implemented, right?
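
Just to make the intent of that expression concrete, here is a toy
Python model of it. Everything in it is an assumption for illustration:
have_event() is a stand-in that treats an event as present when a count
was obtained, and the instruction count is a made-up number since it is
not in the output above. It says nothing about how the metric parser
would actually resolve the helper.

```python
def stalled_cycles_per_instruction(counts, instructions):
    """counts maps event name -> count, with None for an event that failed to open."""

    def have_event(name):
        # Stand-in for the proposed helper: "present" means a count was obtained.
        return counts.get(name) is not None

    fe = "stalled-cycles-frontend"
    be = "stalled-cycles-backend"
    # Same shape as the proposed MetricExpr: a missing event contributes 0.
    return max(counts[fe] if have_event(fe) else 0,
               counts[be] if have_event(be) else 0) / instructions


counts = {
    "stalled-cycles-frontend": 422_404,  # from the -e run above
    "stalled-cycles-backend": None,      # ENOENT on this machine
}
# Hypothetical instruction count, chosen only to make the arithmetic easy.
print(stalled_cycles_per_instruction(counts, instructions=1_000_000))
```

With the backend event missing, the metric degrades to frontend stalls
per instruction instead of failing outright.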

- Arnaldo


Thread overview: 18+ messages
2026-03-13 13:13 perf stat issue with 7.0.0rc3 Thomas Richter
2026-03-13 15:13 ` Leo Yan
2026-03-13 15:19 ` Arnaldo Carvalho de Melo
2026-03-13 15:41   ` Ian Rogers
2026-03-13 15:56     ` Arnaldo Melo
2026-03-13 16:10       ` Ian Rogers
2026-03-13 17:01         ` Arnaldo Melo
2026-03-13 18:27           ` Ian Rogers
2026-03-13 21:10             ` Namhyung Kim
2026-03-17 20:19             ` Arnaldo Carvalho de Melo
2026-03-17 19:39     ` Arnaldo Carvalho de Melo
2026-03-17 19:56       ` Arnaldo Carvalho de Melo
2026-03-17 20:12         ` Arnaldo Carvalho de Melo
2026-03-17 20:50           ` Ian Rogers
2026-03-18  0:37             ` Arnaldo Carvalho de Melo [this message]
2026-03-18  2:25               ` Ian Rogers
2026-03-19  1:01                 ` [PATCH v1] perf metrics: Make common stalled metrics conditional on having the event Ian Rogers
2026-03-24  4:19                 ` perf stat issue with 7.0.0rc3 Ian Rogers
