From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>,
"linux-perf-use." <linux-perf-users@vger.kernel.org>,
Sumanth Korikkar <sumanthk@linux.ibm.com>,
James Clark <james.clark@arm.com>, Leo Yan <leo.yan@linaro.org>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Mike Leach <mike.leach@linaro.org>,
Mark Rutland <mark.rutland@arm.com>,
John Garry <john.g.garry@oracle.com>,
Will Deacon <will@kernel.org>
Subject: Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390
Date: Fri, 16 Jun 2023 13:53:41 -0300 [thread overview]
Message-ID: <ZIyTlR+jJ27YKPVx@kernel.org> (raw)
In-Reply-To: <CAP-5=fVpU6Bm1REcANEu2qsE5eQH2VOx2uz+X0VSoH76r_d7uQ@mail.gmail.com>
Em Fri, Jun 16, 2023 at 09:28:12AM -0700, Ian Rogers escreveu:
> On Fri, Jun 16, 2023 at 7:44 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Em Fri, Jun 16, 2023 at 11:36:27AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > I noticed this on an arm64 board:
> > >
> > > > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls
> > > > > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt
> > >
> > > > > Performance counter stats for 'ls':
> > >
> > > > > <not supported> armv8_cortex_a72/cycles:u/
> > > > > <not supported> armv8_cortex_a53/cycles:u/
> > > > > <not supported> armv8_cortex_a72/instructions:u/
> > > > > <not supported> armv8_cortex_a53/instructions:u/
> > >
> > > > I tested on a raspberry pi and perf-tools-next is working there. I
> > > > suspect the issue here is the heterogeneous PMU. The cycles event is
> > > > converted into a perf_event_attr with type 0 and config 0. When there
> > > > are heterogeneous PMUs then we try to use the extended type to say we
> > > > want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say
> > > > the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9
> > > > and 10 respectively. With heterogeneous encodings the type in the
> > >
> > > The numbers are 8 and 7, PERF_TYPE_HW (thus zero, thus not printed):
> > >
> > > root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1
> > > Using CPUID 0x00000000410fd080
> > > Control descriptor is not initialized
> > > ------------------------------------------------------------
> > > perf_event_attr:
> > > size 136
> > > config 0x800000000
> > > ------------------------------------------------------------
> > > sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8
> > > sys_perf_event_open failed, error -2
> > > Warning:
> > > cycles event is not supported by the kernel.
> > > ------------------------------------------------------------
> > > perf_event_attr:
> > > size 136
> > > config 0x700000000
> > > ------------------------------------------------------------
> > > sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8
> > > sys_perf_event_open failed, error -2
> > > Warning:
> > > cycles event is not supported by the kernel.
> > > failed to read counter cycles
> > > failed to read counter cycles
> > >
> > > Performance counter stats for 'sleep 1':
> > >
> > > <not supported> armv8_cortex_a72/cycles/
> > > <not supported> armv8_cortex_a53/cycles/
> > >
> > > 1.011406938 seconds time elapsed
> > >
> > > 0.000000000 seconds user
> > > 0.010886000 seconds sys
> > >
> > >
> > > root@roc-rk3399-pc:~#
> > >
> > > > perf_event_attr remains as 0 and the config becomes 9 << 32 and 10 <<
> > > > 32. I suspect your kernel is seeing the extended type information and
> > > > not handling it, hence the error.
> > >
> > > looks this is the case indeed
> > >
> > > > We add in the extended type for hardware and legacy cache events in
> > > > the parse events code:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511
> > > >
> > > > The addition of the extended type happens if
> > > > perf_pmus__supports_extended_type() returns true, its implementation
> > > > is:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480
> > > > bool perf_pmus__supports_extended_type(void)
> > > > {
> > > > return perf_pmus__num_core_pmus() > 1;
> > > > }
> > > >
> > > > Previously on heterogeneous ARM the extended type wouldn't be encoded
> > > > and I believe the event was opened on the PMU of the current CPU only.
> > >
> > > I think that is the case, haven't checked so far tho.
> > >
> > > > This is a bug because you will not count events on all PMUs. We can
> > > > make perf_pmus__supports_extended_type return false on ARM which
> > > > should bring back the previous behavior - or do some kind of dynamic
> > >
> > > simplest first step, trying it.
> >
> > Spot on:
> >
> > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat ls
> > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net rust samples scripts security sound tools usr virt
> >
> > Performance counter stats for 'ls':
> >
> > 9.01 msec task-clock:u # 0.401 CPUs utilized
> > 0 context-switches:u # 0.000 /sec
> > 0 cpu-migrations:u # 0.000 /sec
> > 84 page-faults:u # 9.320 K/sec
> > 1188641 cycles:u # 0.132 GHz
> > 601132 instructions:u # 0.51 insn per cycle
> > 64768 branches:u # 7.186 M/sec
> > 11680 branch-misses:u # 18.03% of all branches
> >
> > 0.022502514 seconds time elapsed
> >
> > 0.000000000 seconds user
> > 0.022946000 seconds sys
> >
> >
> > acme@roc-rk3399-pc:~/git/perf-tools-next$
> > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf record ls
> > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.003 MB perf.data (18 samples) ]
> > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf evlist
> > cycles:Pu
> > dummy:HGu
> > acme@roc-rk3399-pc:~/git/perf-tools-next$ ldd ~/bin/perf | grep asan
> > libasan.so.6 => /lib/aarch64-linux-gnu/libasan.so.6 (0x0000ffffa5a00000)
> > acme@roc-rk3399-pc:~/git/perf-tools-next$
> >
> > With the following patch. Do you want to submit it or may I add it as is
> > using an edited discussion in this thread as the commit log message?
> >
> > Thanks!
> >
> > - Arnaldo
>
> Hi Arnaldo,
>
> presumably with the #ifdef you just get 1 PMU - shame. I think rather
> than do an #ifdef we can do something like call is_event_supported:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232
> so:
> bool perf_pmus__supports_extended_type(void)
> struct perf_pmu *pmu = NULL;
> if (perf_pmus__num_core_pmus() <= 1)
> return false;
> while((pmu = perf_pmus__scan_core(pmu) != NULL) {
> return is_event_supported(PERF_TYPE_HARDWARE,
> PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT);
> }
> return false;
> }
> We probably don't want to do this for each call of
> perf_pmus__supports_extended_type so you could use a static and
> pthread_once, etc.
>
> This would mean if this regression is introduced elsewhere than ARM it
> will self heal. It will also mean that when ARM support extended types
> in the kernel, they will get the normal heterogeneous behavior.
That looks better, I'll try it when I get back to my office.
- Arnaldo
> Thanks,
> Ian
>
> > diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> > index a2032c1b7644..9af961105a64 100644
> > --- a/tools/perf/util/pmus.c
> > +++ b/tools/perf/util/pmus.c
> > @@ -494,7 +494,13 @@ int perf_pmus__num_core_pmus(void)
> >
> > bool perf_pmus__supports_extended_type(void)
> > {
> > +#if defined(__aarch64__)
> > + // We can't use the extended type information where the PMU number
> > + // is encoded in the upper perf_event_attr::type bits. (<< 32).
> > + return false;
> > +#else
> > return perf_pmus__num_core_pmus() > 1;
> > +#endif
> > }
> >
> > struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)
--
- Arnaldo
next prev parent reply other threads:[~2023-06-16 16:55 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-13 12:54 perf test failures in linux-next on s390 Thomas Richter
2023-06-13 14:32 ` Ian Rogers
2023-06-14 8:31 ` Thomas Richter
2023-06-14 14:57 ` Ian Rogers
2023-06-15 8:57 ` Thomas Richter
2023-06-15 9:39 ` Thomas Richter
2023-06-15 14:34 ` Arnaldo Carvalho de Melo
2023-06-16 14:23 ` Ian Rogers
2023-06-16 14:36 ` Hybrid PMU issues on aarch64. was: " Arnaldo Carvalho de Melo
2023-06-16 14:44 ` Arnaldo Carvalho de Melo
2023-06-16 16:28 ` Ian Rogers
2023-06-16 16:53 ` Arnaldo Carvalho de Melo [this message]
2023-06-16 21:47 ` Arnaldo Carvalho de Melo
2023-06-16 22:09 ` Ian Rogers
2023-06-19 10:04 ` Thomas Richter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZIyTlR+jJ27YKPVx@kernel.org \
--to=acme@kernel.org \
--cc=irogers@google.com \
--cc=james.clark@arm.com \
--cc=john.g.garry@oracle.com \
--cc=leo.yan@linaro.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mike.leach@linaro.org \
--cc=sumanthk@linux.ibm.com \
--cc=suzuki.poulose@arm.com \
--cc=tmricht@linux.ibm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.