From: Marc Zyngier <maz@kernel.org>
To: Mark Rutland <mark.rutland@arm.com>,
	Hector Martin <marcan@marcan.st>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Ian Rogers <irogers@google.com>,
	James Clark <james.clark@arm.com>
Cc: linux-perf-users@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Asahi Linux <asahi@lists.linux.dev>
Subject: Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
Date: Tue, 21 Nov 2023 15:24:25 +0000
Message-ID: <86o7fnyvrq.wl-maz@kernel.org>
In-Reply-To: <86pm03z0kw.wl-maz@kernel.org>

On Tue, 21 Nov 2023 13:40:31 +0000,
Marc Zyngier <maz@kernel.org> wrote:
> 
> [Adding key people on Cc]
> 
> On Tue, 21 Nov 2023 12:08:48 +0000,
> Hector Martin <marcan@marcan.st> wrote:
> > 
> > Perf broke on all Apple ARM64 systems (tested almost everything), and
> > according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> 
> I can confirm that at least on 6.7-rc2, perf is pretty busted on any
> asymmetric ARM platform. It isn't clear what criteria is used to pick
> the PMU, but nothing works anymore.
> 
> The saving grace in my case is that Debian still ships a 6.1 perftool
> package, but that's obviously not going to last.
> 
> I'm happy to test potential fixes.

At Mark's request, I've dumped a couple of perf (as of -rc2) runs with
-vvv.  And it is quite entertaining (this is taskset to an 'icestorm'
CPU):

<quote>
maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 0 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e apple_firestorm_pmu/cycles/ -e cycles ls
Using CPUID 0x00000000612f0280
Attempt to add: apple_icestorm_pmu/cycles=0/
..after resolving event: apple_icestorm_pmu/cycles=0/
Opening: unknown-hardware:HG
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0xb00000000
  disabled                         1
------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -95
Attempt to add: apple_firestorm_pmu/cycles=0/
..after resolving event: apple_firestorm_pmu/cycles=0/
Control descriptor is not initialized
Opening: apple_icestorm_pmu/cycles/
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 3
Opening: apple_firestorm_pmu/cycles/
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 4
Opening: cycles
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 5
arch			builtin-diff.o      builtin-mem.o	 common-cmds.h    perf-completion.sh
bench			builtin-evlist.c    builtin-probe.c	 CREDITS	  perf.h
Build			builtin-evlist.o    builtin-probe.o	 design.txt	  perf-in.o
builtin-annotate.c	builtin-ftrace.c    builtin-record.c	 dlfilters	  perf-iostat
builtin-annotate.o	builtin-ftrace.o    builtin-record.o	 Documentation    perf-iostat.sh
builtin-bench.c		builtin.h	    builtin-report.c	 FEATURE-DUMP	  perf.o
builtin-bench.o		builtin-help.c      builtin-report.o	 include	  perf-read-vdso.c
builtin-buildid-cache.c  builtin-help.o      builtin-sched.c	 jvmti		  perf-sys.h
builtin-buildid-cache.o  builtin-inject.c    builtin-script.c	 libapi	  PERF-VERSION-FILE
builtin-buildid-list.c	builtin-inject.o    builtin-script.o	 libperf	  perf-with-kcore
builtin-buildid-list.o	builtin-kallsyms.c  builtin-stat.c	 libsubcmd	  pmu-events
builtin-c2c.c		builtin-kallsyms.o  builtin-stat.o	 libsymbol	  python
builtin-c2c.o		builtin-kmem.c      builtin-timechart.c  Makefile	  python_ext_build
builtin-config.c	builtin-kvm.c	    builtin-top.c	 Makefile.config  scripts
builtin-config.o	builtin-kvm.o	    builtin-top.o	 Makefile.perf    tests
builtin-daemon.c	builtin-kwork.c     builtin-trace.c	 MANIFEST	  trace
builtin-daemon.o	builtin-list.c      builtin-version.c	 perf		  ui
builtin-data.c		builtin-list.o      builtin-version.o	 perf-archive	  util
builtin-data.o		builtin-lock.c      check-headers.sh	 perf-archive.sh
builtin-diff.c		builtin-mem.c	    command-list.txt	 perf.c
apple_icestorm_pmu/cycles/: -1: 0 873709 0
apple_firestorm_pmu/cycles/: -1: 0 873709 0
cycles: -1: 0 873709 0
apple_icestorm_pmu/cycles/: 0 873709 0
apple_firestorm_pmu/cycles/: 0 873709 0
cycles: 0 873709 0

 Performance counter stats for 'ls':

     <not counted>      apple_icestorm_pmu/cycles/                                              (0.00%)
     <not counted>      apple_firestorm_pmu/cycles/                                             (0.00%)
     <not counted>      cycles                                                                  (0.00%)

       0.000002250 seconds time elapsed

       0.000000000 seconds user
       0.000000000 seconds sys
</quote>
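
[Editorial aside, not part of the original message.] A minimal sketch of how the raw count lines above can be read, assuming the three numbers after each event name are the counter value, the time enabled, and the time running (the two fields requested by read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING). The helper names are hypothetical and are not part of perf. Under that assumption, a running time of 0 means the event never got onto a counter, which would be consistent with the "<not counted>" and "(0.00%)" output.

```python
def parse_count_line(line):
    """Split 'name: value enabled running' into its parts (hypothetical helper)."""
    name, _, rest = line.rpartition(": ")
    value, enabled, running = (int(field) for field in rest.split())
    return name, value, enabled, running


def running_fraction(enabled, running):
    """Fraction of the enabled time during which the event was actually counting."""
    return running / enabled if enabled else 0.0


def scaled_value(value, enabled, running):
    """Scale the raw value by enabled/running; None if the event never ran."""
    if running == 0:
        return None
    return value * enabled / running


if __name__ == "__main__":
    # Line copied from the 'icestorm' run above.
    name, value, enabled, running = parse_count_line(
        "apple_icestorm_pmu/cycles/: 0 873709 0"
    )
    print(name, scaled_value(value, enabled, running),
          running_fraction(enabled, running))
```

With the line from the 'icestorm' run, scaled_value() returns None and running_fraction() returns 0.0: the event was enabled for the whole run but was never scheduled on a counter.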

If I run the same thing on another CPU cluster (firestorm), I get
this:

<quote>
maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 2 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e apple_firestorm_pmu/cycles/ -e cycles ls
Using CPUID 0x00000000612f0280
Attempt to add: apple_icestorm_pmu/cycles=0/
..after resolving event: apple_icestorm_pmu/cycles=0/
Opening: unknown-hardware:HG
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0xb00000000
  disabled                         1
------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -95
Attempt to add: apple_firestorm_pmu/cycles=0/
..after resolving event: apple_firestorm_pmu/cycles=0/
Control descriptor is not initialized
Opening: apple_icestorm_pmu/cycles/
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045925  cpu -1  group_fd -1  flags 0x8 = 3
Opening: apple_firestorm_pmu/cycles/
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045925  cpu -1  group_fd -1  flags 0x8 = 4
Opening: cycles
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045925  cpu -1  group_fd -1  flags 0x8 = 5
arch			builtin-diff.o      builtin-mem.o	 common-cmds.h    perf-completion.sh
bench			builtin-evlist.c    builtin-probe.c	 CREDITS	  perf.h
Build			builtin-evlist.o    builtin-probe.o	 design.txt	  perf-in.o
builtin-annotate.c	builtin-ftrace.c    builtin-record.c	 dlfilters	  perf-iostat
builtin-annotate.o	builtin-ftrace.o    builtin-record.o	 Documentation    perf-iostat.sh
builtin-bench.c		builtin.h	    builtin-report.c	 FEATURE-DUMP	  perf.o
builtin-bench.o		builtin-help.c      builtin-report.o	 include	  perf-read-vdso.c
builtin-buildid-cache.c  builtin-help.o      builtin-sched.c	 jvmti		  perf-sys.h
builtin-buildid-cache.o  builtin-inject.c    builtin-script.c	 libapi	  PERF-VERSION-FILE
builtin-buildid-list.c	builtin-inject.o    builtin-script.o	 libperf	  perf-with-kcore
builtin-buildid-list.o	builtin-kallsyms.c  builtin-stat.c	 libsubcmd	  pmu-events
builtin-c2c.c		builtin-kallsyms.o  builtin-stat.o	 libsymbol	  python
builtin-c2c.o		builtin-kmem.c      builtin-timechart.c  Makefile	  python_ext_build
builtin-config.c	builtin-kvm.c	    builtin-top.c	 Makefile.config  scripts
builtin-config.o	builtin-kvm.o	    builtin-top.o	 Makefile.perf    tests
builtin-daemon.c	builtin-kwork.c     builtin-trace.c	 MANIFEST	  trace
builtin-daemon.o	builtin-list.c      builtin-version.c	 perf		  ui
builtin-data.c		builtin-list.o      builtin-version.o	 perf-archive	  util
builtin-data.o		builtin-lock.c      check-headers.sh	 perf-archive.sh
builtin-diff.c		builtin-mem.c	    command-list.txt	 perf.c
apple_icestorm_pmu/cycles/: -1: 1035101 469125 469125
apple_firestorm_pmu/cycles/: -1: 1035035 469125 469125
cycles: -1: 1034653 469125 469125
apple_icestorm_pmu/cycles/: 1035101 469125 469125
apple_firestorm_pmu/cycles/: 1035035 469125 469125
cycles: 1034653 469125 469125

 Performance counter stats for 'ls':

         1,035,101      apple_icestorm_pmu/cycles/                                            
         1,035,035      apple_firestorm_pmu/cycles/                                           
         1,034,653      cycles                                                                

       0.000001333 seconds time elapsed

       0.000000000 seconds user
       0.000000000 seconds sys
</quote>

which doesn't make any sense either. I really don't understand what
this PERF_TYPE_HARDWARE does here (the *real* types are 10 and 11),
nor what this 'cycles=0' stuff is.
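
[Editorial aside, not part of the original message.] One possible reading of the config value 0xb00000000 in the first "Opening:" block, sketched under the assumption that it uses the extended PERF_TYPE_HARDWARE encoding from the perf_event uapi header: the PMU type ID sits in the upper 32 bits and the hardware event ID in the lower 32 bits. The two constants mirror PERF_PMU_TYPE_SHIFT and PERF_HW_EVENT_MASK from that header. The function names are hypothetical.

```python
# Constants mirroring include/uapi/linux/perf_event.h (extended HW encoding).
PERF_PMU_TYPE_SHIFT = 32
PERF_HW_EVENT_MASK = 0xFFFFFFFF


def decode_hw_config(config):
    """Return (pmu_type, hw_event_id) from an extended PERF_TYPE_HARDWARE config."""
    return config >> PERF_PMU_TYPE_SHIFT, config & PERF_HW_EVENT_MASK


def encode_hw_config(pmu_type, hw_event_id):
    """Inverse of decode_hw_config (hypothetical helper)."""
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | (hw_event_id & PERF_HW_EVENT_MASK)


if __name__ == "__main__":
    pmu_type, hw_event_id = decode_hw_config(0xB00000000)
    print(pmu_type, hw_event_id)
```

Read this way, 0xb00000000 decodes to PMU type 11 with hardware event 0 (PERF_COUNT_HW_CPU_CYCLES), whereas the later attrs with config 0 carry no PMU type at all. Which PMU owns type 11 on this machine is not shown in the logs; it could be checked against /sys/bus/event_source/devices/<pmu>/type.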

/puzzled

	M.

-- 
Without deviation from the norm, progress is not possible.
