* [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
From: Hector Martin @ 2023-11-21 12:08 UTC
To: linux-perf-users, LKML; +Cc: Marc Zyngier, Asahi Linux
Perf broke on all Apple ARM64 systems (tested almost everything), and
according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
Test command:
sudo taskset -c 0 ./perf stat -e apple_icestorm_pmu/cycles/ \
    -e apple_firestorm_pmu/cycles/ -e cycles ls
Since this is taskset to CPU #0 (LITTLE core, icestorm), only events for
icestorm are expected.
I bisected the breakage to two distinct points:
5ea8f2ccffb is the first bad commit. With its parent, the output is as
expected (same as v6.4):
3,297,462 apple_icestorm_pmu/cycles/
<not counted> apple_firestorm_pmu/cycles/ (0.00%)
<not counted> cycles (0.00%)
With 5ea8f2ccffb everything breaks:
<not supported> apple_icestorm_pmu/cycles/
<not supported> apple_firestorm_pmu/cycles/
<not counted> cycles (0.00%)
Somewhere along the way to 82fe2e45cdb00 things get even worse (didn't
bother bisecting this range). With its parent:
<not supported> apple_icestorm_pmu/cycles/
<not supported> apple_firestorm_pmu/cycles/
<not supported> apple_icestorm_pmu/cycles/
<not supported> apple_firestorm_pmu/cycles/
Then 82fe2e45cdb00 leads to the current v6.5 behavior:
<not counted> apple_icestorm_pmu/cycles/ (0.00%)
<not counted> apple_firestorm_pmu/cycles/ (0.00%)
<not counted> cycles (0.00%)
If I taskset the task to CPU#2 (big core, firestorm), I get events:
1,454,858 apple_icestorm_pmu/cycles/
1,454,760 apple_firestorm_pmu/cycles/
1,454,384 cycles
So the current behavior is that all output seems to come from the
firestorm PMU event counter, regardless of requested event.
This is all unchanged and still broken in v6.7-rc2.
- Hector
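To check which PMU covers which CPUs on an asymmetric system like the one in this report, the kernel exposes each PMU's numeric type id (and, for CPU PMUs on Arm, a CPU list) under /sys/bus/event_source/devices. The following is only a minimal Python sketch, assuming that sysfs layout; the helper names and the optional `root` parameter are made up for illustration and are not part of perf.

```python
import os

# Default sysfs location of the perf event sources (assumed layout).
SYSFS_PMU_ROOT = "/sys/bus/event_source/devices"


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3,6" into [0, 1, 2, 3, 6]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def list_pmus(root=SYSFS_PMU_ROOT):
    """Map PMU name -> (type id, CPU list or None) for every PMU under root."""
    pmus = {}
    for name in sorted(os.listdir(root)):
        type_path = os.path.join(root, name, "type")
        if not os.path.isfile(type_path):
            continue
        with open(type_path) as f:
            pmu_type = int(f.read().strip())
        # Not every PMU has a "cpus" file; report None when it is absent.
        cpus = None
        cpus_path = os.path.join(root, name, "cpus")
        if os.path.isfile(cpus_path):
            with open(cpus_path) as f:
                cpus = parse_cpu_list(f.read())
        pmus[name] = (pmu_type, cpus)
    return pmus
```

Calling list_pmus() with no argument on the affected machine should list apple_icestorm_pmu and apple_firestorm_pmu with distinct type ids, which makes it easy to see whether CPU 0 or CPU 2 belongs to the PMU named on the perf command line.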
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
From: Marc Zyngier @ 2023-11-21 13:40 UTC
To: Hector Martin, Arnaldo Carvalho de Melo, Ian Rogers, James Clark
Cc: linux-perf-users, LKML, Asahi Linux, Mark Rutland

[Adding key people on Cc]

On Tue, 21 Nov 2023 12:08:48 +0000,
Hector Martin <marcan@marcan.st> wrote:
>
> Perf broke on all Apple ARM64 systems (tested almost everything), and
> according to maz also on Juno (so, probably all big.LITTLE) since v6.5.

I can confirm that at least on 6.7-rc2, perf is pretty busted on any
asymmetric ARM platform. It isn't clear what criteria is used to pick
the PMU, but nothing works anymore.

The saving grace in my case is that Debian still ships a 6.1 perftool
package, but that's obviously not going to last.

I'm happy to test potential fixes.

M.

--
Without deviation from the norm, progress is not possible.
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
From: Marc Zyngier @ 2023-11-21 15:24 UTC
To: Mark Rutland, Hector Martin, Arnaldo Carvalho de Melo, Ian Rogers, James Clark
Cc: linux-perf-users, LKML, Asahi Linux

On Tue, 21 Nov 2023 13:40:31 +0000,
Marc Zyngier <maz@kernel.org> wrote:
>
> [Adding key people on Cc]
>
> On Tue, 21 Nov 2023 12:08:48 +0000,
> Hector Martin <marcan@marcan.st> wrote:
> >
> > Perf broke on all Apple ARM64 systems (tested almost everything), and
> > according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
>
> I can confirm that at least on 6.7-rc2, perf is pretty busted on any
> asymmetric ARM platform. It isn't clear what criteria is used to pick
> the PMU, but nothing works anymore.
>
> The saving grace in my case is that Debian still ships a 6.1 perftool
> package, but that's obviously not going to last.
>
> I'm happy to test potential fixes.

At Mark's request, I've dumped a couple of perf (as of -rc2) runs with
-vvv. And it is quite entertaining (this is taskset to an 'icestorm'
CPU):

<quote>
maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 0 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e apple_firestorm_pmu/cycles/ -e cycles ls
Using CPUID 0x00000000612f0280
Attempt to add: apple_icestorm_pmu/cycles=0/
..after resolving event: apple_icestorm_pmu/cycles=0/
Opening: unknown-hardware:HG
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0xb00000000
  disabled                         1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -95
Attempt to add: apple_firestorm_pmu/cycles=0/
..after resolving event: apple_firestorm_pmu/cycles=0/
Control descriptor is not initialized
Opening: apple_icestorm_pmu/cycles/
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 3
Opening: apple_firestorm_pmu/cycles/
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4
Opening: cycles
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 5
arch  builtin-diff.o  builtin-mem.o  common-cmds.h  perf-completion.sh
bench  builtin-evlist.c  builtin-probe.c  CREDITS  perf.h
Build  builtin-evlist.o  builtin-probe.o  design.txt  perf-in.o
builtin-annotate.c  builtin-ftrace.c  builtin-record.c  dlfilters  perf-iostat
builtin-annotate.o  builtin-ftrace.o  builtin-record.o  Documentation  perf-iostat.sh
builtin-bench.c  builtin.h  builtin-report.c  FEATURE-DUMP  perf.o
builtin-bench.o  builtin-help.c  builtin-report.o  include  perf-read-vdso.c
builtin-buildid-cache.c  builtin-help.o  builtin-sched.c  jvmti  perf-sys.h
builtin-buildid-cache.o  builtin-inject.c  builtin-script.c  libapi  PERF-VERSION-FILE
builtin-buildid-list.c  builtin-inject.o  builtin-script.o  libperf  perf-with-kcore
builtin-buildid-list.o  builtin-kallsyms.c  builtin-stat.c  libsubcmd  pmu-events
builtin-c2c.c  builtin-kallsyms.o  builtin-stat.o  libsymbol  python
builtin-c2c.o  builtin-kmem.c  builtin-timechart.c  Makefile  python_ext_build
builtin-config.c  builtin-kvm.c  builtin-top.c  Makefile.config  scripts
builtin-config.o  builtin-kvm.o  builtin-top.o  Makefile.perf  tests
builtin-daemon.c  builtin-kwork.c  builtin-trace.c  MANIFEST  trace
builtin-daemon.o  builtin-list.c  builtin-version.c  perf  ui
builtin-data.c  builtin-list.o  builtin-version.o  perf-archive  util
builtin-data.o  builtin-lock.c  check-headers.sh  perf-archive.sh
builtin-diff.c  builtin-mem.c  command-list.txt  perf.c
apple_icestorm_pmu/cycles/: -1: 0 873709 0
apple_firestorm_pmu/cycles/: -1: 0 873709 0
cycles: -1: 0 873709 0
apple_icestorm_pmu/cycles/: 0 873709 0
apple_firestorm_pmu/cycles/: 0 873709 0
cycles: 0 873709 0

 Performance counter stats for 'ls':

     <not counted>      apple_icestorm_pmu/cycles/     (0.00%)
     <not counted>      apple_firestorm_pmu/cycles/    (0.00%)
     <not counted>      cycles                         (0.00%)

       0.000002250 seconds time elapsed

       0.000000000 seconds user
       0.000000000 seconds sys
</quote>

If I run the same thing on another CPU cluster (firestorm), I get
this:

<quote>
maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 2 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e apple_firestorm_pmu/cycles/ -e cycles ls
Using CPUID 0x00000000612f0280
Attempt to add: apple_icestorm_pmu/cycles=0/
..after resolving event: apple_icestorm_pmu/cycles=0/
Opening: unknown-hardware:HG
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0xb00000000
  disabled                         1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -95
Attempt to add: apple_firestorm_pmu/cycles=0/
..after resolving event: apple_firestorm_pmu/cycles=0/
Control descriptor is not initialized
Opening: apple_icestorm_pmu/cycles/
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 3
Opening: apple_firestorm_pmu/cycles/
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 4
Opening: cycles
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 5
arch  builtin-diff.o  builtin-mem.o  common-cmds.h  perf-completion.sh
bench  builtin-evlist.c  builtin-probe.c  CREDITS  perf.h
Build  builtin-evlist.o  builtin-probe.o  design.txt  perf-in.o
builtin-annotate.c  builtin-ftrace.c  builtin-record.c  dlfilters  perf-iostat
builtin-annotate.o  builtin-ftrace.o  builtin-record.o  Documentation  perf-iostat.sh
builtin-bench.c  builtin.h  builtin-report.c  FEATURE-DUMP  perf.o
builtin-bench.o  builtin-help.c  builtin-report.o  include  perf-read-vdso.c
builtin-buildid-cache.c  builtin-help.o  builtin-sched.c  jvmti  perf-sys.h
builtin-buildid-cache.o  builtin-inject.c  builtin-script.c  libapi  PERF-VERSION-FILE
builtin-buildid-list.c  builtin-inject.o  builtin-script.o  libperf  perf-with-kcore
builtin-buildid-list.o  builtin-kallsyms.c  builtin-stat.c  libsubcmd  pmu-events
builtin-c2c.c  builtin-kallsyms.o  builtin-stat.o  libsymbol  python
builtin-c2c.o  builtin-kmem.c  builtin-timechart.c  Makefile  python_ext_build
builtin-config.c  builtin-kvm.c  builtin-top.c  Makefile.config  scripts
builtin-config.o  builtin-kvm.o  builtin-top.o  Makefile.perf  tests
builtin-daemon.c  builtin-kwork.c  builtin-trace.c  MANIFEST  trace
builtin-daemon.o  builtin-list.c  builtin-version.c  perf  ui
builtin-data.c  builtin-list.o  builtin-version.o  perf-archive  util
builtin-data.o  builtin-lock.c  check-headers.sh  perf-archive.sh
builtin-diff.c  builtin-mem.c  command-list.txt  perf.c
apple_icestorm_pmu/cycles/: -1: 1035101 469125 469125
apple_firestorm_pmu/cycles/: -1: 1035035 469125 469125
cycles: -1: 1034653 469125 469125
apple_icestorm_pmu/cycles/: 1035101 469125 469125
apple_firestorm_pmu/cycles/: 1035035 469125 469125
cycles: 1034653 469125 469125

 Performance counter stats for 'ls':

         1,035,101      apple_icestorm_pmu/cycles/
         1,035,035      apple_firestorm_pmu/cycles/
         1,034,653      cycles

       0.000001333 seconds time elapsed

       0.000000000 seconds user
       0.000000000 seconds sys
</quote>

which doesn't make any sense either. I really don't understand what
this PERF_TYPE_HARDWARE does here (the *real* types are 10 and 11),
nor what this 'cycle=0' stuff is.

/puzzled

M.

--
Without deviation from the norm, progress is not possible.
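For readers puzzling over the same dump: the first attribute block shows type 0 (PERF_TYPE_HARDWARE) with config 0xb00000000. That value is consistent with the extended hardware-event encoding in the perf UAPI header, where the upper 32 bits of config carry a PMU type id and the low 32 bits carry the generic hardware event id (PERF_PMU_TYPE_SHIFT is 32, PERF_HW_EVENT_MASK is 0xffffffff). A small Python sketch of that decoding follows; the function names are illustrative only and do not exist in perf.

```python
# Constants mirrored from include/uapi/linux/perf_event.h.
PERF_PMU_TYPE_SHIFT = 32          # PMU type id lives in the upper 32 bits
PERF_HW_EVENT_MASK = 0xFFFFFFFF   # generic hardware event id in the low 32 bits
PERF_COUNT_HW_CPU_CYCLES = 0


def decode_hw_config(config):
    """Split a PERF_TYPE_HARDWARE config into (pmu_type, hw_event)."""
    pmu_type = config >> PERF_PMU_TYPE_SHIFT
    hw_event = config & PERF_HW_EVENT_MASK
    return pmu_type, hw_event


def encode_hw_config(pmu_type, hw_event):
    """Build an extended PERF_TYPE_HARDWARE config for a specific PMU."""
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | (hw_event & PERF_HW_EVENT_MASK)
```

decode_hw_config(0xb00000000) yields PMU type 11 and event 0 (cycles), so the probe that fails with error -95 is a generic cycles event aimed at PMU type 11. The three events opened afterwards all carry config 0, with no PMU type in the upper bits, which is why they are indistinguishable from a plain `cycles` event.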
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
From: Mark Rutland @ 2023-11-21 15:40 UTC
To: Marc Zyngier
Cc: Hector Martin, Arnaldo Carvalho de Melo, Ian Rogers, James Clark, linux-perf-users, LKML, Asahi Linux

On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote:
> On Tue, 21 Nov 2023 13:40:31 +0000,
> Marc Zyngier <maz@kernel.org> wrote:
> >
> > [Adding key people on Cc]
> >
> > On Tue, 21 Nov 2023 12:08:48 +0000,
> > Hector Martin <marcan@marcan.st> wrote:
> > >
> > > Perf broke on all Apple ARM64 systems (tested almost everything), and
> > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> >
> > I can confirm that at least on 6.7-rc2, perf is pretty busted on any
> > asymmetric ARM platform. It isn't clear what criteria is used to pick
> > the PMU, but nothing works anymore.
> >
> > The saving grace in my case is that Debian still ships a 6.1 perftool
> > package, but that's obviously not going to last.
> >
> > I'm happy to test potential fixes.
>
> At Mark's request, I've dumped a couple of perf (as of -rc2) runs with
> -vvv. And it is quite entertaining (this is taskset to an 'icestorm'
> CPU):

IIUC the tool is doing the wrong thing here and overriding explicit
${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using
that ${pmu}'s type and event namespace.

Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be
targetted to a specific PMU, it's semantically wrong to rewrite events like
this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named
PERF_COUNT_HW_${EVENT}.

Mark.
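The distinction Mark draws, between an event in a PMU's own namespace and a generic PERF_COUNT_HW_* event, can be made concrete. A PMU publishes named events under its sysfs events/ directory as term strings (for example "event=0x02") and describes where each term lands in the attribute under format/ (for example "config:0-7"). The Python sketch below resolves such an alias into a native (type, config) pair. It is a simplified illustration, not perf's parser: it handles only single-range fields that live in `config`, and the sample alias, format, and type id used in the usage note are hypothetical placeholders rather than values read from real hardware.

```python
def parse_format(spec):
    """Parse a sysfs format spec such as "config:0-7" into (field, lo, hi)."""
    field, bits = spec.strip().split(":")
    lo, _, hi = bits.partition("-")
    lo = int(lo)
    hi = int(hi) if hi else lo   # "config:8" describes a single bit
    return field, lo, hi


def native_config(alias, formats):
    """Resolve an alias like "event=0x02" into a config value.

    `formats` maps term names to format specs. Only single-range fields
    stored in `config` are supported in this sketch.
    """
    config = 0
    for term in alias.strip().split(","):
        name, _, value = term.partition("=")
        field, lo, hi = parse_format(formats[name])
        if field != "config":
            raise ValueError(f"unsupported attribute field: {field!r}")
        val = int(value, 0) if value else 1   # a bare term means "set to 1"
        if val >= (1 << (hi - lo + 1)):
            raise ValueError(f"value {val:#x} does not fit in {name}")
        config |= val << lo
    return config


def native_attr(pmu_type, alias, formats):
    """Attribute for a PMU-namespace event: the PMU's own type and event code."""
    return {"type": pmu_type, "config": native_config(alias, formats)}
```

With the hypothetical inputs native_attr(10, "event=0x02", {"event": "config:0-7"}), the result has type 10 and config 2, whereas the attributes in the dumps quoted in this thread have type 0 and config 0. The two only count the same thing if the PMU driver happens to map the generic event onto that same native code, which is the non-equivalence Mark points out.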
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
From: Ian Rogers @ 2023-11-21 15:46 UTC
To: Mark Rutland
Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux

On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote:
> > On Tue, 21 Nov 2023 13:40:31 +0000,
> > Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > [Adding key people on Cc]
> > >
> > > On Tue, 21 Nov 2023 12:08:48 +0000,
> > > Hector Martin <marcan@marcan.st> wrote:
> > > >
> > > > Perf broke on all Apple ARM64 systems (tested almost everything), and
> > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> > >
> > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any
> > > asymmetric ARM platform. It isn't clear what criteria is used to pick
> > > the PMU, but nothing works anymore.
> > >
> > > The saving grace in my case is that Debian still ships a 6.1 perftool
> > > package, but that's obviously not going to last.
> > >
> > > I'm happy to test potential fixes.
> >
> > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with
> > -vvv. And it is quite entertaining (this is taskset to an 'icestorm'
> > CPU):
>
> IIUC the tool is doing the wrong thing here and overriding explicit
> ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using
> that ${pmu}'s type and event namespace.
>
> Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be
> targetted to a specific PMU, it's semantically wrong to rewrite events like
> this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named
> PERF_COUNT_HW_${EVENT}.

If you name a PMU and an event then the event should only be opened on
that PMU, 100% agree. There's a bunch of output, but when the legacy
cycles event is opened it appears to be because it was explicitly
requested.

Thanks,
Ian
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
2023-11-21 15:46 ` Ian Rogers
@ 2023-11-21 16:02 ` Mark Rutland
2023-11-21 16:09 ` Ian Rogers
0 siblings, 1 reply; 53+ messages in thread
From: Mark Rutland @ 2023-11-21 16:02 UTC (permalink / raw)
To: Ian Rogers
Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux

On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote:
> On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote:
> > > On Tue, 21 Nov 2023 13:40:31 +0000,
> > > Marc Zyngier <maz@kernel.org> wrote:
> > > >
> > > > [Adding key people on Cc]
> > > >
> > > > On Tue, 21 Nov 2023 12:08:48 +0000,
> > > > Hector Martin <marcan@marcan.st> wrote:
> > > > >
> > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and
> > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> > > >
> > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any
> > > > asymmetric ARM platform. It isn't clear what criteria is used to pick
> > > > the PMU, but nothing works anymore.
> > > >
> > > > The saving grace in my case is that Debian still ships a 6.1 perftool
> > > > package, but that's obviously not going to last.
> > > >
> > > > I'm happy to test potential fixes.
> > >
> > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with
> > > -vvv. And it is quite entertaining (this is taskset to an 'icestorm'
> > > CPU):
> >
> > IIUC the tool is doing the wrong thing here and overriding explicit
> > ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using
> > that ${pmu}'s type and event namespace.
> >
> > Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be
> > targeted to a specific PMU, it's semantically wrong to rewrite events like
> > this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named
> > PERF_COUNT_HW_${EVENT}.
>
> If you name a PMU and an event then the event should only be opened on
> that PMU, 100% agree. There's a bunch of output, but when the legacy
> cycles event is opened it appears to be because it was explicitly
> requested.

I think you've missed that the named PMU events are being erroneously transformed
into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g.

  Opening: apple_firestorm_pmu/cycles/
  ------------------------------------------------------------
  perf_event_attr:
    type                             0 (PERF_TYPE_HARDWARE)
    size                             136
    config                           0 (PERF_COUNT_HW_CPU_CYCLES)
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 4

... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES.

Marc said that he bisected the issue down to commit:

  5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms")

... so it looks like something is going wrong when the events are being parsed,
e.g. losing the HW PMU information?

Thanks,
Mark.

>
>
> Thanks,
> Ian
>
> > Mark.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
2023-11-21 16:02 ` Mark Rutland
@ 2023-11-21 16:09 ` Ian Rogers
2023-11-21 16:15 ` Mark Rutland
0 siblings, 1 reply; 53+ messages in thread
From: Ian Rogers @ 2023-11-21 16:09 UTC (permalink / raw)
To: Mark Rutland
Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux

On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote:
> > On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote:
> > > > On Tue, 21 Nov 2023 13:40:31 +0000,
> > > > Marc Zyngier <maz@kernel.org> wrote:
> > > > >
> > > > > [Adding key people on Cc]
> > > > >
> > > > > On Tue, 21 Nov 2023 12:08:48 +0000,
> > > > > Hector Martin <marcan@marcan.st> wrote:
> > > > > >
> > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and
> > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> > > > >
> > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any
> > > > > asymmetric ARM platform. It isn't clear what criteria is used to pick
> > > > > the PMU, but nothing works anymore.
> > > > >
> > > > > The saving grace in my case is that Debian still ships a 6.1 perftool
> > > > > package, but that's obviously not going to last.
> > > > >
> > > > > I'm happy to test potential fixes.
> > > >
> > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with
> > > > -vvv. And it is quite entertaining (this is taskset to an 'icestorm'
> > > > CPU):
> > >
> > > IIUC the tool is doing the wrong thing here and overriding explicit
> > > ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using
> > > that ${pmu}'s type and event namespace.
> > >
> > > Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be
> > > targeted to a specific PMU, it's semantically wrong to rewrite events like
> > > this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named
> > > PERF_COUNT_HW_${EVENT}.
> >
> > If you name a PMU and an event then the event should only be opened on
> > that PMU, 100% agree. There's a bunch of output, but when the legacy
> > cycles event is opened it appears to be because it was explicitly
> > requested.
>
> I think you've missed that the named PMU events are being erroneously transformed
> into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g.
>
>   Opening: apple_firestorm_pmu/cycles/
>   ------------------------------------------------------------
>   perf_event_attr:
>     type                             0 (PERF_TYPE_HARDWARE)
>     size                             136
>     config                           0 (PERF_COUNT_HW_CPU_CYCLES)
>     sample_type                      IDENTIFIER
>     read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>     disabled                         1
>     inherit                          1
>     enable_on_exec                   1
>     exclude_guest                    1
>   ------------------------------------------------------------
>   sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 4
>
> ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES.
>
> Marc said that he bisected the issue down to commit:
>
>   5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms")
>
> ... so it looks like something is going wrong when the events are being parsed,
> e.g. losing the HW PMU information?

Ok, I think I'm getting confused by other things. This looks like the issue.

I think it may be working as intended, but not how you intended :-) If
a core PMU is listed and then a legacy event, the legacy event should
be opened on the core PMU as a legacy event with the extended type
set. This is to allow things like legacy cache events to be opened on
a specified PMU. Legacy event names match with a higher priority than
those in sysfs or json as they are hard coded.
Presumably the expectation was that by advertising a cycles event, presumably
in sysfs, then this is what would be matched.

Thanks,
Ian

^ permalink raw reply [flat|nested] 53+ messages in thread
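The "legacy event with the extended type set" encoding mentioned above can be illustrated with a small sketch. It assumes the PERF_PMU_TYPE_SHIFT and PERF_HW_EVENT_MASK constants from the kernel UAPI header include/uapi/linux/perf_event.h; the helper names are hypothetical and are not part of the perf tool. Under that assumption, the config 0xb00000000 seen in the failing probe in the -vvv dumps would decode to PMU type 11 with hardware event 0, while the three later opens carry config 0 and so name no PMU at all.

```python
# Constants mirrored from include/uapi/linux/perf_event.h (assumed values,
# copied here so the sketch is self-contained).
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_CPU_CYCLES = 0
PERF_PMU_TYPE_SHIFT = 32
PERF_HW_EVENT_MASK = 0xFFFFFFFF


def encode_extended_hw_config(pmu_type, hw_event):
    """Hypothetical helper: pack a PMU type into the upper 32 bits of config."""
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | (hw_event & PERF_HW_EVENT_MASK)


def decode_extended_hw_config(config):
    """Hypothetical helper: split config into (pmu_type, hw_event)."""
    return config >> PERF_PMU_TYPE_SHIFT, config & PERF_HW_EVENT_MASK


if __name__ == "__main__":
    # The probe in the dumps: type PERF_TYPE_HARDWARE, config 0xb00000000.
    print(decode_extended_hw_config(0xB00000000))
    # The later opens: config 0, i.e. no PMU type in the upper bits.
    print(decode_extended_hw_config(0))
```

Which of the two Apple PMUs has type 11 is not stated in the thread; Marc only notes that the real types are 10 and 11.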
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
2023-11-21 16:09 ` Ian Rogers
@ 2023-11-21 16:15 ` Mark Rutland
2023-11-21 16:38 ` Ian Rogers
0 siblings, 1 reply; 53+ messages in thread
From: Mark Rutland @ 2023-11-21 16:15 UTC (permalink / raw)
To: Ian Rogers
Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux

On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote:
> On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote:
> > > On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > > >
> > > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote:
> > > > > On Tue, 21 Nov 2023 13:40:31 +0000,
> > > > > Marc Zyngier <maz@kernel.org> wrote:
> > > > > >
> > > > > > [Adding key people on Cc]
> > > > > >
> > > > > > On Tue, 21 Nov 2023 12:08:48 +0000,
> > > > > > Hector Martin <marcan@marcan.st> wrote:
> > > > > > >
> > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and
> > > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> > > > > >
> > > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any
> > > > > > asymmetric ARM platform. It isn't clear what criteria is used to pick
> > > > > > the PMU, but nothing works anymore.
> > > > > >
> > > > > > The saving grace in my case is that Debian still ships a 6.1 perftool
> > > > > > package, but that's obviously not going to last.
> > > > > >
> > > > > > I'm happy to test potential fixes.
> > > > >
> > > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with
> > > > > -vvv. And it is quite entertaining (this is taskset to an 'icestorm'
> > > > > CPU):
> > > >
> > > > IIUC the tool is doing the wrong thing here and overriding explicit
> > > > ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using
> > > > that ${pmu}'s type and event namespace.
> > > >
> > > > Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be
> > > > targeted to a specific PMU, it's semantically wrong to rewrite events like
> > > > this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named
> > > > PERF_COUNT_HW_${EVENT}.
> > >
> > > If you name a PMU and an event then the event should only be opened on
> > > that PMU, 100% agree. There's a bunch of output, but when the legacy
> > > cycles event is opened it appears to be because it was explicitly
> > > requested.
> >
> > I think you've missed that the named PMU events are being erroneously transformed
> > into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g.
> >
> >   Opening: apple_firestorm_pmu/cycles/
> >   ------------------------------------------------------------
> >   perf_event_attr:
> >     type                             0 (PERF_TYPE_HARDWARE)
> >     size                             136
> >     config                           0 (PERF_COUNT_HW_CPU_CYCLES)
> >     sample_type                      IDENTIFIER
> >     read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> >     disabled                         1
> >     inherit                          1
> >     enable_on_exec                   1
> >     exclude_guest                    1
> >   ------------------------------------------------------------
> >   sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 4
> >
> > ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES.
> >
> > Marc said that he bisected the issue down to commit:
> >
> >   5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms")
> >
> > ... so it looks like something is going wrong when the events are being parsed,
> > e.g. losing the HW PMU information?
>
> Ok, I think I'm getting confused by other things. This looks like the issue.
>
> I think it may be working as intended, but not how you intended :-) If
> a core PMU is listed and then a legacy event, the legacy event should
> be opened on the core PMU as a legacy event with the extended type
> set. This is to allow things like legacy cache events to be opened on
> a specified PMU. Legacy event names match with a higher priority than
> those in sysfs or json as they are hard coded.

That has never been the case previously, so this is user-visible breakage, and
it prevents users from being able to do the right thing, so I think that's a
broken design.

> Presumably the expectation was that by advertising a cycles event, presumably
> in sysfs, then this is what would be matched.

I expect that if I ask for ${pmu}/${event}/, that PMU is used, and the event
*in that PMU's namespace* is used. Overriding that breaks long-established
practice and provides users with no recourse to get the behaviour they expect
(and previously had).

I do think that (regardless of whether this was the semantic you intended)
silently overriding events with legacy events is a bug, and one we should fix.
As I mentioned in another reply, just because the events have the same name
does not mean that they are semantically the same, so we're liable to give
people the wrong numbers anyhow.

Can we fix this?

Mark.

^ permalink raw reply [flat|nested] 53+ messages in thread
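The expectation stated above, that ${pmu}/${event}/ resolves to the PMU's own type and an event from that PMU's namespace, can be sketched roughly as follows. The sketch assumes the usual sysfs layout under /sys/bus/event_source/devices/<pmu>/ (a "type" file and an "events/<name>" alias file). The function names are hypothetical, and the mapping of a single "event=" term straight into config is a simplifying assumption; the real mapping is described by the PMU's format/ files and is not modelled here.

```python
from pathlib import Path

SYSFS_PMU_ROOT = Path("/sys/bus/event_source/devices")


def parse_event_alias(text):
    """Parse a sysfs event alias such as 'event=0x02' into {'event': 2}."""
    terms = {}
    for term in text.strip().split(","):
        name, _, value = term.partition("=")
        # A bare term with no '=value' part is treated as a flag set to 1.
        terms[name] = int(value, 0) if value else 1
    return terms


def expected_attr(pmu, event, root=SYSFS_PMU_ROOT):
    """Hypothetical helper: the attr a named PMU event would be expected to use."""
    pmu_dir = Path(root) / pmu
    # The PMU's dynamic type, rather than PERF_TYPE_HARDWARE (0).
    pmu_type = int((pmu_dir / "type").read_text())
    terms = parse_event_alias((pmu_dir / "events" / event).read_text())
    # Assumption: the alias carries one 'event' term that lands in the low
    # bits of config; real PMUs describe this layout in format/ files.
    return {"type": pmu_type, "config": terms["event"]}


if __name__ == "__main__":
    # Only meaningful on a machine that exposes this PMU in sysfs.
    print(expected_attr("apple_icestorm_pmu", "cycles"))
```

With this resolution the attr type would be the PMU's own (10 or 11 on the machine in the dumps, per Marc's earlier message) instead of 0 (PERF_TYPE_HARDWARE).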
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-21 16:15 ` Mark Rutland @ 2023-11-21 16:38 ` Ian Rogers 2023-11-22 3:23 ` Hector Martin 2023-11-22 13:03 ` Mark Rutland 0 siblings, 2 replies; 53+ messages in thread From: Ian Rogers @ 2023-11-21 16:38 UTC (permalink / raw) To: Mark Rutland Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > > On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > > > > On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > > > > > On Tue, 21 Nov 2023 13:40:31 +0000, > > > > > > Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > > > > > > > [Adding key people on Cc] > > > > > > > > > > > > > > On Tue, 21 Nov 2023 12:08:48 +0000, > > > > > > > Hector Martin <marcan@marcan.st> wrote: > > > > > > > > > > > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > > > > > > > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > > > > > asymmetric ARM platform. It isn't clear what criteria is used to pick > > > > > > > the PMU, but nothing works anymore. > > > > > > > > > > > > > > The saving grace in my case is that Debian still ships a 6.1 perftool > > > > > > > package, but that's obviously not going to last. > > > > > > > > > > > > > > I'm happy to test potential fixes. > > > > > > > > > > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > > > > > -vvv. 
And it is quite entertaining (this is taskset to an 'icestorm' > > > > > > CPU): > > > > > > > > > > IIUC the tool is doing the wrong thing here and overriding explicit > > > > > ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > > > > > that ${pmu}'s type and event namespace. > > > > > > > > > > Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > > > > > targetted to a specific PMU, it's semantically wrong to rewrite events like > > > > > this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > > > > > PERF_COUNT_HW_${EVENT}. > > > > > > > > If you name a PMU and an event then the event should only be opened on > > > > that PMU, 100% agree. There's a bunch of output, but when the legacy > > > > cycles event is opened it appears to be because it was explicitly > > > > requested. > > > > > > I think you've missed that the named PMU events are being erreously transformed > > > into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. > > > > > > Opening: apple_firestorm_pmu/cycles/ > > > ------------------------------------------------------------ > > > perf_event_attr: > > > type 0 (PERF_TYPE_HARDWARE) > > > size 136 > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > sample_type IDENTIFIER > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > disabled 1 > > > inherit 1 > > > enable_on_exec 1 > > > exclude_guest 1 > > > ------------------------------------------------------------ > > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > > > > > ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. > > > > > > Marc said that he bisected the issue down to commit: > > > > > > 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") > > > > > > ... so it looks like something is going wrong when the events are being parsed, > > > e.g. losing the HW PMU information? > > > > Ok, I think I'm getting confused by other things. 
> > This looks like the issue.
> >
> > I think it may be working as intended, but not how you intended :-) If
> > a core PMU is listed and then a legacy event, the legacy event should
> > be opened on the core PMU as a legacy event with the extended type
> > set. This is to allow things like legacy cache events to be opened on
> > a specified PMU. Legacy event names match with a higher priority than
> > those in sysfs or json as they are hard coded.
>
> That has never been the case previously, so this is user-visible breakage, and
> it prevents users from being able to do the right thing, so I think that's a
> broken design.

So the problem was caused by ARM and Intel doing two different things.
Intel did at least contribute to the perf tool in support for their
BIG.little/hybrid, so that's why the semantics match their approach.

> > Presumably the expectation was that by advertising a cycles event, presumably
> > in sysfs, then this is what would be matched.
>
> I expect that if I ask for ${pmu}/${event}/, that PMU is used, and the event
> *in that PMU's namespace* is used. Overriding that breaks long-established
> practice and provides users with no recourse to get the behaviour they expect
> (and previously had).

On ARM but not Intel.

> I do think that (regardless of whether this was the semantic you intended)
> silently overriding events with legacy events is a bug, and one we should fix.
> As I mentioned in another reply, just because the events have the same name
> does not mean that they are semantically the same, so we're liable to give
> people the wrong numbers anyhow.
>
> Can we fix this?

So I'd like to fix this, some things from various conversations:

1) we lack testing. Our testing relies on the sysfs of the machine
being run on, which is better than nothing. I think ideally we'd have
a collection of zipped up sysfs directories and then we could have a
test that asserts on ARM you get the behavior you want.

2) for RISC-V they want to make the legacy event matching something in
user land to simplify the PMU driver.

3) I'd like to get rid of the PMU json interface. My idea is to
convert json events/metrics into sysfs style files, zip these up and
then link them into the perf binary. On Intel the json is 70% of the
binary (7MB out of 10MB) and we may get this down to 3MB with this
approach. The json lookup would need to incorporate the cpuid matching
that currently exists. When we look up an event I'd like the approach
to be like unionfs with a specified but configurable order. Users
could provide directories of their own events/metrics for various
PMUs, and then this approach could be used to help with (1).

Those proposals are not something to add as a -rc fix, so what I think
you're asking for here is a "if ARM" fix somewhere in the event
parsing. That's of course possible but it will cause problems if you
did say:

perf stat -e arm_pmu/LLC-load-misses/ ...

as I doubt the PMU driver is advertising this legacy event in sysfs
and the "if ARM" logic would presumably be trying to disable legacy
events in the term list for the ARM PMU.

Given all of this, is anything actually broken and needing a fix for 6.7?

Thanks,
Ian

> Mark.

^ permalink raw reply [flat|nested] 53+ messages in thread
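The unionfs-like lookup from point 3) in the message above could look roughly like the following Python sketch: event directories are searched in a configurable order and the first layer that provides the event wins. Nothing in this thread shows such code existing in perf; the function names lookup_event and write_event, the directory layout, the PMU name example_pmu and the event values are all hypothetical and exist only for the demo.

```python
import os
import tempfile


def lookup_event(layers, pmu, event):
    """Search each layer directory in order; return the first definition found, else None."""
    for layer in layers:
        path = os.path.join(layer, pmu, "events", event)
        if os.path.isfile(path):
            with open(path) as f:
                return f.read().strip()
    return None


def write_event(layer, pmu, event, text):
    """Demo helper: create <layer>/<pmu>/events/<event> containing text."""
    events_dir = os.path.join(layer, pmu, "events")
    os.makedirs(events_dir, exist_ok=True)
    with open(os.path.join(events_dir, event), "w") as f:
        f.write(text + "\n")


# A user-supplied layer placed ahead of a built-in layer (made-up values).
with tempfile.TemporaryDirectory() as user_dir, \
        tempfile.TemporaryDirectory() as builtin_dir:
    write_event(builtin_dir, "example_pmu", "cycles", "event=0x11")
    write_event(builtin_dir, "example_pmu", "instructions", "event=0x08")
    write_event(user_dir, "example_pmu", "cycles", "event=0x22")

    order = [user_dir, builtin_dir]
    overridden = lookup_event(order, "example_pmu", "cycles")
    fallthrough = lookup_event(order, "example_pmu", "instructions")
    missing = lookup_event(order, "example_pmu", "no_such_event")

print(overridden, fallthrough, missing)
```

With an explicit order like this, a captured sysfs tree from an ARM machine could be dropped in as one more layer, which is how the approach might help with the testing gap in (1).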
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-21 16:38 ` Ian Rogers @ 2023-11-22 3:23 ` Hector Martin 2023-11-22 13:06 ` Arnaldo Carvalho de Melo 2023-11-22 13:03 ` Mark Rutland 1 sibling, 1 reply; 53+ messages in thread From: Hector Martin @ 2023-11-22 3:23 UTC (permalink / raw) To: Ian Rogers, Mark Rutland Cc: Marc Zyngier, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On 2023/11/22 1:38, Ian Rogers wrote: > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: >> >> On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: >>> On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: >>>> >>>> On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: >>>>> On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: >>>>>> >>>>>> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: >>>>>>> On Tue, 21 Nov 2023 13:40:31 +0000, >>>>>>> Marc Zyngier <maz@kernel.org> wrote: >>>>>>>> >>>>>>>> [Adding key people on Cc] >>>>>>>> >>>>>>>> On Tue, 21 Nov 2023 12:08:48 +0000, >>>>>>>> Hector Martin <marcan@marcan.st> wrote: >>>>>>>>> >>>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and >>>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. >>>>>>>> >>>>>>>> I can confirm that at least on 6.7-rc2, perf is pretty busted on any >>>>>>>> asymmetric ARM platform. It isn't clear what criteria is used to pick >>>>>>>> the PMU, but nothing works anymore. >>>>>>>> >>>>>>>> The saving grace in my case is that Debian still ships a 6.1 perftool >>>>>>>> package, but that's obviously not going to last. >>>>>>>> >>>>>>>> I'm happy to test potential fixes. >>>>>>> >>>>>>> At Mark's request, I've dumped a couple of perf (as of -rc2) runs with >>>>>>> -vvv. 
And it is quite entertaining (this is taskset to an 'icestorm' >>>>>>> CPU): >>>>>> >>>>>> IIUC the tool is doing the wrong thing here and overriding explicit >>>>>> ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using >>>>>> that ${pmu}'s type and event namespace. >>>>>> >>>>>> Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be >>>>>> targetted to a specific PMU, it's semantically wrong to rewrite events like >>>>>> this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named >>>>>> PERF_COUNT_HW_${EVENT}. >>>>> >>>>> If you name a PMU and an event then the event should only be opened on >>>>> that PMU, 100% agree. There's a bunch of output, but when the legacy >>>>> cycles event is opened it appears to be because it was explicitly >>>>> requested. >>>> >>>> I think you've missed that the named PMU events are being erreously transformed >>>> into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. >>>> >>>> Opening: apple_firestorm_pmu/cycles/ >>>> ------------------------------------------------------------ >>>> perf_event_attr: >>>> type 0 (PERF_TYPE_HARDWARE) >>>> size 136 >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) >>>> sample_type IDENTIFIER >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING >>>> disabled 1 >>>> inherit 1 >>>> enable_on_exec 1 >>>> exclude_guest 1 >>>> ------------------------------------------------------------ >>>> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 >>>> >>>> ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. >>>> >>>> Marc said that he bisected the issue down to commit: >>>> >>>> 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") >>>> >>>> ... so it looks like something is going wrong when the events are being parsed, >>>> e.g. losing the HW PMU information? >>> >>> Ok, I think I'm getting confused by other things. This looks like the issue. 
>>> >>> I think it may be working as intended, but not how you intended :-) If >>> a core PMU is listed and then a legacy event, the legacy event should >>> be opened on the core PMU as a legacy event with the extended type >>> set. This is to allow things like legacy cache events to be opened on >>> a specified PMU. Legacy event names match with a higher priority than >>> those in sysfs or json as they are hard coded. >> >> That has never been the case previously, so this is user-visible breakage, and >> it prevents users from being able to do the right thing, so I think that's a >> broken design. > > So the problem was caused by ARM and Intel doing two different things. > Intel did at least contribute to the perf tool in support for their > BIG.little/hybrid, so that's why the semantics match their approach. > >>> Presumably the expectation was that by advertising a cycles event, presumably >>> in sysfs, then this is what would be matched. >> >> I expect that if I ask for ${pmu}/${event}/, that PMU is used, and the event >> *in that PMU's namespace* is used. Overriding that breaks long-established >> practice and provides users with no recourse to get the behavioru they expect >> (and previosuly had). > > On ARM but not Intel. > >> I do think that (regardless of whther this was the sematnic you intended) >> silently overriding events with legacy events is a bug, and one we should fix. >> As I mentioned in another reply, just because the events have the same name >> does not mean that they are semantically the same, so we're liable to give >> people the wrong numbers anyhow. >> >> Can we fix this? > > So I'd like to fix this, some things from various conversations: > > 1) we lack testing. Our testing relies on the sysfs of the machine > being run on, which is better than nothing. I think ideally we'd have > a collection of zipped up sysfs directories and then we could have a > test that asserts on ARM you get the behavior you want. 
>
> 2) for RISC-V they want to make the legacy event matching something in
> user land to simplify the PMU driver.
>
> 3) I'd like to get rid of the PMU json interface. My idea is to
> convert json events/metrics into sysfs style files, zip these up and
> then link them into the perf binary. On Intel the json is 70% of the
> binary (7MB out of 10MB) and we may get this down to 3MB with this
> approach. The json lookup would need to incorporate the cpuid matching
> that currently exists. When we look up an event I'd like the approach
> to be like unionfs with a specified but configurable order. Users
> could provide directories of their own events/metrics for various
> PMUs, and then this approach could be used to help with (1).
>
> Those proposals are not something to add as a -rc fix, so what I think
> you're asking for here is a "if ARM" fix somewhere in the event
> parsing. That's of course possible but it will cause problems if you
> did say:
>
> perf stat -e arm_pmu/LLC-load-misses/ ...
>
> as I doubt the PMU driver is advertising this legacy event in sysfs
> and the "if ARM" logic would presumably be trying to disable legacy
> events in the term list for the ARM PMU.
>
> Given all of this, is anything actually broken and needing a fix for 6.7?

You literally cannot use perf correctly on ARM big.LITTLE systems since
6.5, while it worked fine on 6.4. So, yes, it's broken and it needs
fixing. This is a major regression.

$ taskset -c 0 perf stat -e apple_icestorm_pmu/cycles/ echo

 Performance counter stats for 'echo':

     <not counted>      apple_icestorm_pmu/cycles/u                    (0.00%)

       0.001385544 seconds time elapsed

       0.001375000 seconds user
       0.000000000 seconds sys

$ taskset -c 2 perf stat -e apple_firestorm_pmu/cycles/ echo

 Performance counter stats for 'echo':

           169,965      apple_firestorm_pmu/cycles/u

       0.000466667 seconds time elapsed

       0.000475000 seconds user
       0.000000000 seconds sys

Both of those should return counts. One does not, and it doesn't even
seem to be predictable which one you get.

*On my particular system, it is currently impossible to get any
performance counter data from the E cores, as far as I can tell, no
matter how you invoke perf*.

Feel free to argue semantics as to what went wrong or how it should be
fixed, but there is no question that this is a regression that requires
a fix. Perf is currently simply broken here, where it wasn't in 6.4.

- Hector

^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 3:23 ` Hector Martin @ 2023-11-22 13:06 ` Arnaldo Carvalho de Melo 2023-11-22 15:33 ` Ian Rogers 2023-11-22 15:49 ` Mark Rutland 0 siblings, 2 replies; 53+ messages in thread From: Arnaldo Carvalho de Melo @ 2023-11-22 13:06 UTC (permalink / raw) To: Hector Martin Cc: Ian Rogers, Mark Rutland, Marc Zyngier, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux Em Wed, Nov 22, 2023 at 12:23:27PM +0900, Hector Martin escreveu: > On 2023/11/22 1:38, Ian Rogers wrote: > > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > >> On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > >>> On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > >>>> On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > >>>>> On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > >>>>>> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > >>>>>>> On Tue, 21 Nov 2023 13:40:31 +0000, > >>>>>>> Marc Zyngier <maz@kernel.org> wrote: > >>>>>>>> > >>>>>>>> [Adding key people on Cc] > >>>>>>>> > >>>>>>>> On Tue, 21 Nov 2023 12:08:48 +0000, > >>>>>>>> Hector Martin <marcan@marcan.st> wrote: > >>>>>>>>> > >>>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and > >>>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > >>>>>>>> > >>>>>>>> I can confirm that at least on 6.7-rc2, perf is pretty busted on any > >>>>>>>> asymmetric ARM platform. It isn't clear what criteria is used to pick > >>>>>>>> the PMU, but nothing works anymore. > >>>>>>>> > >>>>>>>> The saving grace in my case is that Debian still ships a 6.1 perftool > >>>>>>>> package, but that's obviously not going to last. > >>>>>>>> > >>>>>>>> I'm happy to test potential fixes. > >>>>>>> > >>>>>>> At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > >>>>>>> -vvv. 
And it is quite entertaining (this is taskset to an 'icestorm' > >>>>>>> CPU): > >>>>>> > >>>>>> IIUC the tool is doing the wrong thing here and overriding explicit > >>>>>> ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > >>>>>> that ${pmu}'s type and event namespace. > >>>>>> > >>>>>> Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > >>>>>> targetted to a specific PMU, it's semantically wrong to rewrite events like > >>>>>> this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > >>>>>> PERF_COUNT_HW_${EVENT}. > >>>>> > >>>>> If you name a PMU and an event then the event should only be opened on > >>>>> that PMU, 100% agree. There's a bunch of output, but when the legacy > >>>>> cycles event is opened it appears to be because it was explicitly > >>>>> requested. > >>>> > >>>> I think you've missed that the named PMU events are being erreously transformed > >>>> into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. > >>>> > >>>> Opening: apple_firestorm_pmu/cycles/ > >>>> ------------------------------------------------------------ > >>>> perf_event_attr: > >>>> type 0 (PERF_TYPE_HARDWARE) > >>>> size 136 > >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) > >>>> sample_type IDENTIFIER > >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > >>>> disabled 1 > >>>> inherit 1 > >>>> enable_on_exec 1 > >>>> exclude_guest 1 > >>>> ------------------------------------------------------------ > >>>> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > >>>> > >>>> ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. > >>>> > >>>> Marc said that he bisected the issue down to commit: > >>>> > >>>> 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") > >>>> > >>>> ... so it looks like something is going wrong when the events are being parsed, > >>>> e.g. losing the HW PMU information? 
> >>>
> >>> Ok, I think I'm getting confused by other things. This looks like the issue.
> >>>
> >>> I think it may be working as intended, but not how you intended :-) If
> >>> a core PMU is listed and then a legacy event, the legacy event should

The point is that "cycles" when prefixed with "pmu/" shouldn't be
considered "cycles" as HW/0, in that setting it is "cycles" for that
PMU. (but we only have "cpu_cycles" for at least the a53 and a72 PMUs I
have access to on a Libre Computer rockchip 3399-pc hybrid board, if we
use it, then we get what we want/had before, see below):

And there is an attempt at using the specified PMU, see the first
perf_event_open:

root@roc-rk3399-pc:~# strace -e perf_event_open perf stat -vv -e cycles,armv8_cortex_a53/cycles/,armv8_cortex_a72/cycles/ echo
Using CPUID 0x00000000410fd082
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0x700000000
  disabled                         1
------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES, sample_period=0, sample_type=0, read_format=0, disabled=1, precise_ip=0 /* arbitrary skid */, ...}, 0, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)

//// HERE: it tries config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES taking into
//account the PMU number 0x7

root@roc-rk3399-pc:~# cat /sys/devices/armv8_cortex_a53/type
7
root@roc-rk3399-pc:~#

But then we don't have "cycles" in that PMU:

root@roc-rk3399-pc:~# ls -la /sys/devices/armv8_cortex_a53/events/cycles
ls: cannot access '/sys/devices/armv8_cortex_a53/events/cycles': No such file or directory
root@roc-rk3399-pc:~#

Maybe:

root@roc-rk3399-pc:~# taskset -c 5,6 perf stat -v -e armv8_cortex_a53/cpu_cycles/,armv8_cortex_a72/cpu_cycles/ echo
Using CPUID 0x00000000410fd034
Control descriptor is not initialized

armv8_cortex_a53/cpu_cycles/: 0 2079000 0
armv8_cortex_a72/cpu_cycles/: 2488961 2079000 2079000

 Performance counter stats for 'echo':

     <not counted>      armv8_cortex_a53/cpu_cycles/                    (0.00%)
           2488961      armv8_cortex_a72/cpu_cycles/

       0.003449266 seconds time elapsed

       0.003502000 seconds user
       0.000000000 seconds sys


root@roc-rk3399-pc:~# taskset -c 0,1,2,3,4 perf stat -v -e armv8_cortex_a53/cpu_cycles/,armv8_cortex_a72/cpu_cycles/ echo
Using CPUID 0x00000000410fd034
Control descriptor is not initialized

armv8_cortex_a53/cpu_cycles/: 2986601 6999416 6999416
armv8_cortex_a72/cpu_cycles/: 0 6999416 0

 Performance counter stats for 'echo':

           2986601      armv8_cortex_a53/cpu_cycles/
     <not counted>      armv8_cortex_a72/cpu_cycles/                    (0.00%)

       0.011434508 seconds time elapsed

       0.003911000 seconds user
       0.007454000 seconds sys


root@roc-rk3399-pc:~#

root@roc-rk3399-pc:~# cat /sys/devices/armv8_cortex_a53/events/cpu_cycles
event=0x0011
root@roc-rk3399-pc:~# cat /sys/devices/armv8_cortex_a72/events/cpu_cycles
event=0x0011
root@roc-rk3399-pc:~#

And the syscalls seem sane:

root@roc-rk3399-pc:~# strace -e perf_event_open taskset -c 0,1,2,3,4 perf stat -v -e armv8_cortex_a53/cpu_cycles/,armv8_cortex_a72/cpu_cycles/ echo
Using CPUID 0x00000000410fd034
Control descriptor is not initialized
perf_event_open({type=0x7 /* PERF_TYPE_??? */, size=0x88 /* PERF_ATTR_SIZE_??? */, config=0x11, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, ...}, 14573, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
perf_event_open({type=0x8 /* PERF_TYPE_??? */, size=0x88 /* PERF_ATTR_SIZE_??? */, config=0x11, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, ...}, 14573, -1, -1, PERF_FLAG_FD_CLOEXEC) = 4

--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=14573, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
armv8_cortex_a53/cpu_cycles/: 3227098 4480875 4480875
armv8_cortex_a72/cpu_cycles/: 0 4480875 0

 Performance counter stats for 'echo':

           3227098      armv8_cortex_a53/cpu_cycles/
     <not counted>      armv8_cortex_a72/cpu_cycles/                    (0.00%)

       0.008381759 seconds time elapsed

       0.004064000 seconds user
       0.004121000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=14572, si_uid=0} ---
+++ exited with 0 +++
root@roc-rk3399-pc:~#

As:

root@roc-rk3399-pc:~# cat /sys/devices/armv8_cortex_a53/type
7
root@roc-rk3399-pc:~# cat /sys/devices/armv8_cortex_a72/type
8
root@roc-rk3399-pc:~#

See the type=0x7 and type=0x8.

So what we need here seems to be to translate the generic term "cycles"
to "cpu_cycles" when a PMU is explicitly passed in the event name and
it doesn't have "cycles" and then just retry.

- Arnaldo

^ permalink raw reply [flat|nested] 53+ messages in thread
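The config value 0x700000000 in the first strace output above packs the PMU type into the upper 32 bits of a PERF_TYPE_HARDWARE config. Below is a small Python sketch of that packing, assuming the PERF_PMU_TYPE_SHIFT (32) and PERF_HW_EVENT_MASK (0xffffffff) definitions as I understand them from the uapi perf_event.h header. The helper names encode_extended_hw and decode_extended_hw are made up for this illustration.

```python
# Constants mirrored from include/uapi/linux/perf_event.h (as understood here).
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_CPU_CYCLES = 0
PERF_PMU_TYPE_SHIFT = 32
PERF_HW_EVENT_MASK = 0xFFFFFFFF


def encode_extended_hw(pmu_type, hw_id):
    """Pack a PMU type and a legacy hardware event id into one config value."""
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | (hw_id & PERF_HW_EVENT_MASK)


def decode_extended_hw(config):
    """Split an extended PERF_TYPE_HARDWARE config into (pmu_type, hw_id)."""
    return config >> PERF_PMU_TYPE_SHIFT, config & PERF_HW_EVENT_MASK


# PMU type 7 is the armv8_cortex_a53 type value shown in the message above.
config = encode_extended_hw(7, PERF_COUNT_HW_CPU_CYCLES)
print(hex(config))
print(decode_extended_hw(config))
```

This is only the arithmetic behind the value in the log. Whether the kernel accepts such a config for a given PMU is a separate question, and the ENOENT in the strace output shows it was rejected in that run.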
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 13:06 ` Arnaldo Carvalho de Melo @ 2023-11-22 15:33 ` Ian Rogers 2023-11-22 15:49 ` Mark Rutland 1 sibling, 0 replies; 53+ messages in thread From: Ian Rogers @ 2023-11-22 15:33 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Hector Martin, Mark Rutland, Marc Zyngier, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Wed, Nov 22, 2023 at 5:06 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Wed, Nov 22, 2023 at 12:23:27PM +0900, Hector Martin escreveu: > > On 2023/11/22 1:38, Ian Rogers wrote: > > > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > > >> On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > > >>> On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > > >>>> On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > > >>>>> On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > > >>>>>> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > >>>>>>> On Tue, 21 Nov 2023 13:40:31 +0000, > > >>>>>>> Marc Zyngier <maz@kernel.org> wrote: > > >>>>>>>> > > >>>>>>>> [Adding key people on Cc] > > >>>>>>>> > > >>>>>>>> On Tue, 21 Nov 2023 12:08:48 +0000, > > >>>>>>>> Hector Martin <marcan@marcan.st> wrote: > > >>>>>>>>> > > >>>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and > > >>>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > >>>>>>>> > > >>>>>>>> I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > >>>>>>>> asymmetric ARM platform. It isn't clear what criteria is used to pick > > >>>>>>>> the PMU, but nothing works anymore. > > >>>>>>>> > > >>>>>>>> The saving grace in my case is that Debian still ships a 6.1 perftool > > >>>>>>>> package, but that's obviously not going to last. > > >>>>>>>> > > >>>>>>>> I'm happy to test potential fixes. 
> > >>>>>>> > > >>>>>>> At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > >>>>>>> -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > >>>>>>> CPU): > > >>>>>> > > >>>>>> IIUC the tool is doing the wrong thing here and overriding explicit > > >>>>>> ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > > >>>>>> that ${pmu}'s type and event namespace. > > >>>>>> > > >>>>>> Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > > >>>>>> targetted to a specific PMU, it's semantically wrong to rewrite events like > > >>>>>> this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > > >>>>>> PERF_COUNT_HW_${EVENT}. > > >>>>> > > >>>>> If you name a PMU and an event then the event should only be opened on > > >>>>> that PMU, 100% agree. There's a bunch of output, but when the legacy > > >>>>> cycles event is opened it appears to be because it was explicitly > > >>>>> requested. > > >>>> > > >>>> I think you've missed that the named PMU events are being erreously transformed > > >>>> into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. > > >>>> > > >>>> Opening: apple_firestorm_pmu/cycles/ > > >>>> ------------------------------------------------------------ > > >>>> perf_event_attr: > > >>>> type 0 (PERF_TYPE_HARDWARE) > > >>>> size 136 > > >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) > > >>>> sample_type IDENTIFIER > > >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > >>>> disabled 1 > > >>>> inherit 1 > > >>>> enable_on_exec 1 > > >>>> exclude_guest 1 > > >>>> ------------------------------------------------------------ > > >>>> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > >>>> > > >>>> ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. 
> > >>>> > > >>>> Marc said that he bisected the issue down to commit: > > >>>> > > >>>> 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") > > >>>> > > >>>> ... so it looks like something is going wrong when the events are being parsed, > > >>>> e.g. losing the HW PMU information? > > >>> > > >>> Ok, I think I'm getting confused by other things. This looks like the issue. > > >>> > > >>> I think it may be working as intended, but not how you intended :-) If > > >>> a core PMU is listed and then a legacy event, the legacy event should > > The point is that "cycles" when prefixed with "pmu/" shouldn't be > considered "cycles" as HW/0, in that setting it is "cycles" for that > PMU. (but we only have "cpu_cycles" for at least the a53 and a72 PMUs I > have access in a Libre Computer rockchip 3399-pc hybrid board, if we use > it, then we get what we want/had before, see below): > > And there is an attempt at using the specified PMU, see the first > perf_event_open: > > root@roc-rk3399-pc:~# strace -e perf_event_open perf stat -vv -e cycles,armv8_cortex_a53/cycles/,armv8_cortex_a72/cycles/ echo > Using CPUID 0x00000000410fd082 > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > config 0x700000000 > disabled 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? 
*/, config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES, sample_period=0, sample_type=0, read_format=0, disabled=1, precise_ip=0 /* arbitrary skid */, ...}, 0, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory) > > //// HERE: it tries config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES taking into > //account the PMU number 0x7 > > root@roc-rk3399-pc:~# cat /sys/devices/armv8_cortex_a53/type > 7 > root@roc-rk3399-pc:~# > > But then we don't have "cycles" in that PMU: > > root@roc-rk3399-pc:~# ls -la /sys/devices/armv8_cortex_a53/events/cycles > ls: cannot access '/sys/devices/armv8_cortex_a53/events/cycles': No such file or directory > root@roc-rk3399-pc:~# > > Maybe: > > root@roc-rk3399-pc:~# taskset -c 5,6 perf stat -v -e armv8_cortex_a53/cpu_cycles/,armv8_cortex_a72/cpu_cycles/ echo > Using CPUID 0x00000000410fd034 > Control descriptor is not initialized > > armv8_cortex_a53/cpu_cycles/: 0 2079000 0 > armv8_cortex_a72/cpu_cycles/: 2488961 2079000 2079000 > > Performance counter stats for 'echo': > > <not counted> armv8_cortex_a53/cpu_cycles/ (0.00%) > 2488961 armv8_cortex_a72/cpu_cycles/ > > 0.003449266 seconds time elapsed > > 0.003502000 seconds user > 0.000000000 seconds sys > > > root@roc-rk3399-pc:~# taskset -c 0,1,2,3,4 perf stat -v -e armv8_cortex_a53/cpu_cycles/,armv8_cortex_a72/cpu_cycles/ echo > Using CPUID 0x00000000410fd034 > Control descriptor is not initialized > > armv8_cortex_a53/cpu_cycles/: 2986601 6999416 6999416 > armv8_cortex_a72/cpu_cycles/: 0 6999416 0 > > Performance counter stats for 'echo': > > 2986601 armv8_cortex_a53/cpu_cycles/ > <not counted> armv8_cortex_a72/cpu_cycles/ (0.00%) > > 0.011434508 seconds time elapsed > > 0.003911000 seconds user > 0.007454000 seconds sys > > > root@roc-rk3399-pc:~# > > root@roc-rk3399-pc:~# cat /sys/devices/armv8_cortex_a53/events/cpu_cycles > event=0x0011 > root@roc-rk3399-pc:~# cat /sys/devices/armv8_cortex_a72/events/cpu_cycles > event=0x0011 > root@roc-rk3399-pc:~# > > And the syscalls seem sane: > > 
root@roc-rk3399-pc:~# strace -e perf_event_open taskset -c 0,1,2,3,4 perf stat -v -e armv8_cortex_a53/cpu_cycles/,armv8_cortex_a72/cpu_cycles/ echo > Using CPUID 0x00000000410fd034 > Control descriptor is not initialized > perf_event_open({type=0x7 /* PERF_TYPE_??? */, size=0x88 /* PERF_ATTR_SIZE_??? */, config=0x11, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, ...}, 14573, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3 > perf_event_open({type=0x8 /* PERF_TYPE_??? */, size=0x88 /* PERF_ATTR_SIZE_??? */, config=0x11, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, ...}, 14573, -1, -1, PERF_FLAG_FD_CLOEXEC) = 4 > > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=14573, si_uid=0, si_status=0, si_utime=0, si_stime=0} --- > armv8_cortex_a53/cpu_cycles/: 3227098 4480875 4480875 > armv8_cortex_a72/cpu_cycles/: 0 4480875 0 > > Performance counter stats for 'echo': > > 3227098 armv8_cortex_a53/cpu_cycles/ > <not counted> armv8_cortex_a72/cpu_cycles/ (0.00%) > > 0.008381759 seconds time elapsed > > 0.004064000 seconds user > 0.004121000 seconds sys > > > --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=14572, si_uid=0} --- > +++ exited with 0 +++ > root@roc-rk3399-pc:~# > > As: > > root@roc-rk3399-pc:~# cat /sys/devices/armv8_cortex_a53/type > 7 > root@roc-rk3399-pc:~# cat /sys/devices/armv8_cortex_a72/type > 8 > root@roc-rk3399-pc:~# > > See the type=0x7 and type=0x8. > > So what we need here seems to be to translate the generic term "cycles" > to "cpu_cycles" when a PMU is explicitely passed in the event name and > it doesn't have "cycles" and then just retry. 
The PMU driver does the legacy-to-raw encoding translation; this is an
assumption the tool makes about core PMUs. You can see ARM's PMU driver doing
the mapping here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/perf/arm_pmuv3.c#n40

Thanks,
Ian

> - Arnaldo
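For readers following the strace output quoted above, here is a minimal sketch of how the config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES value is put together, and of the kind of legacy-to-raw mapping a core PMU driver is expected to perform. The constant values are the ones from the perf_event uapi header; the helper names and the one-entry mapping table are assumptions made for illustration (the 0x11 value is the cpu_cycles event=0x0011 shown in the sysfs output in this thread), not a copy of the driver code.

```python
# Constants as defined in the perf_event uapi header.
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_CPU_CYCLES = 0
PERF_PMU_TYPE_SHIFT = 32
PERF_HW_EVENT_MASK = 0xFFFFFFFF

# Illustrative one-entry table: legacy hardware id -> raw PMUv3 event number.
# 0x11 matches "event=0x0011" reported for cpu_cycles in the thread.
LEGACY_TO_RAW = {PERF_COUNT_HW_CPU_CYCLES: 0x11}


def encode_extended_hw_config(pmu_type, hw_event):
    """Pack a PMU type into the upper 32 bits of a PERF_TYPE_HARDWARE config."""
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | (hw_event & PERF_HW_EVENT_MASK)


def decode_extended_hw_config(config):
    """Split a config back into (pmu_type, legacy hardware event id)."""
    return config >> PERF_PMU_TYPE_SHIFT, config & PERF_HW_EVENT_MASK


def driver_translate(config):
    """Hypothetical helper: what the driver does with the legacy id it receives."""
    _, hw_event = decode_extended_hw_config(config)
    return LEGACY_TO_RAW[hw_event]


if __name__ == "__main__":
    cfg = encode_extended_hw_config(7, PERF_COUNT_HW_CPU_CYCLES)
    print(hex(cfg), decode_extended_hw_config(cfg), hex(driver_translate(cfg)))
```

With PMU type 7 (armv8_cortex_a53 on the rk3399 board above), the packed value is 0x700000000, which is what strace prints as 0x7<<32|PERF_COUNT_HW_CPU_CYCLES.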
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 13:06 ` Arnaldo Carvalho de Melo 2023-11-22 15:33 ` Ian Rogers @ 2023-11-22 15:49 ` Mark Rutland 2023-11-22 16:04 ` Ian Rogers 2023-11-22 16:19 ` Arnaldo Carvalho de Melo 1 sibling, 2 replies; 53+ messages in thread From: Mark Rutland @ 2023-11-22 15:49 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Hector Martin, Ian Rogers, Marc Zyngier, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Wed, Nov 22, 2023 at 10:06:23AM -0300, Arnaldo Carvalho de Melo wrote: > Em Wed, Nov 22, 2023 at 12:23:27PM +0900, Hector Martin escreveu: > > On 2023/11/22 1:38, Ian Rogers wrote: > > > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > > >> On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > > >>> On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > > >>>> On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > > >>>>> On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > > >>>>>> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > >>>>>>> On Tue, 21 Nov 2023 13:40:31 +0000, > > >>>>>>> Marc Zyngier <maz@kernel.org> wrote: > > >>>>>>>> > > >>>>>>>> [Adding key people on Cc] > > >>>>>>>> > > >>>>>>>> On Tue, 21 Nov 2023 12:08:48 +0000, > > >>>>>>>> Hector Martin <marcan@marcan.st> wrote: > > >>>>>>>>> > > >>>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and > > >>>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > >>>>>>>> > > >>>>>>>> I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > >>>>>>>> asymmetric ARM platform. It isn't clear what criteria is used to pick > > >>>>>>>> the PMU, but nothing works anymore. > > >>>>>>>> > > >>>>>>>> The saving grace in my case is that Debian still ships a 6.1 perftool > > >>>>>>>> package, but that's obviously not going to last. 
> > >>>>>>>> > > >>>>>>>> I'm happy to test potential fixes. > > >>>>>>> > > >>>>>>> At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > >>>>>>> -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > >>>>>>> CPU): > > >>>>>> > > >>>>>> IIUC the tool is doing the wrong thing here and overriding explicit > > >>>>>> ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > > >>>>>> that ${pmu}'s type and event namespace. > > >>>>>> > > >>>>>> Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > > >>>>>> targetted to a specific PMU, it's semantically wrong to rewrite events like > > >>>>>> this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > > >>>>>> PERF_COUNT_HW_${EVENT}. > > >>>>> > > >>>>> If you name a PMU and an event then the event should only be opened on > > >>>>> that PMU, 100% agree. There's a bunch of output, but when the legacy > > >>>>> cycles event is opened it appears to be because it was explicitly > > >>>>> requested. > > >>>> > > >>>> I think you've missed that the named PMU events are being erreously transformed > > >>>> into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. > > >>>> > > >>>> Opening: apple_firestorm_pmu/cycles/ > > >>>> ------------------------------------------------------------ > > >>>> perf_event_attr: > > >>>> type 0 (PERF_TYPE_HARDWARE) > > >>>> size 136 > > >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) > > >>>> sample_type IDENTIFIER > > >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > >>>> disabled 1 > > >>>> inherit 1 > > >>>> enable_on_exec 1 > > >>>> exclude_guest 1 > > >>>> ------------------------------------------------------------ > > >>>> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > >>>> > > >>>> ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. 
> > >>>>
> > >>>> Marc said that he bisected the issue down to commit:
> > >>>>
> > >>>>   5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms")
> > >>>>
> > >>>> ... so it looks like something is going wrong when the events are being parsed,
> > >>>> e.g. losing the HW PMU information?
> > >>>
> > >>> Ok, I think I'm getting confused by other things. This looks like the issue.
> > >>>
> > >>> I think it may be working as intended, but not how you intended :-) If
> > >>> a core PMU is listed and then a legacy event, the legacy event should
>
> The point is that "cycles" when prefixed with "pmu/" shouldn't be
> considered "cycles" as HW/0, in that setting it is "cycles" for that
> PMU.

Exactly.

> (but we only have "cpu_cycles" for at least the a53 and a72 PMUs I
> have access to on a Libre Computer rockchip 3399-pc hybrid board, if we use
> it, then we get what we want/had before, see below):

Both Cortex-A53 and Cortex-A72 have the common PMUv3 events, so they have
"cpu_cycles" and "bus_cycles".

The Apple PMUs that Hector and Marc are using don't follow the PMUv3
architecture, and just have a "cycles" event.

[...]

> So what we need here seems to be to translate the generic term "cycles"
> to "cpu_cycles" when a PMU is explicitly passed in the event name and
> it doesn't have "cycles" and then just retry.

I'm not sure we need to map that.

My thinking is:

* If the user asks for "cycles" without a PMU name, that should use the
  PERF_TYPE_HARDWARE cycles event. The ARM PMUs handle that correctly when the
  event is directed to them.

* If the user asks for "${pmu}/cycles/", that should only use the "cycles"
  event in that PMU's namespace, not PERF_TYPE_HARDWARE.

* If we need a way to say "use the PERF_TYPE_HARDWARE cycles event on ${pmu}",
  then we should have a new syntax for that (e.g. as we have for raw events),
  e.g. it would be possible to have "pmu/hw:cycles/" or something like that.

That way there's no ambiguity.

Mark.
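The three rules Mark lists can be sketched as a small resolver. This is only an illustration of the proposed semantics, not how the perf tool parses events: the `resolve` function and the in-memory `PMUS` table are hypothetical, the "hw:" prefix is the syntax floated in the message (it does not exist in perf), and the PMU type numbers and the cpu_cycles encoding are the rk3399 values quoted elsewhere in this thread.

```python
PERF_TYPE_HARDWARE = 0
PERF_PMU_TYPE_SHIFT = 32

# Legacy names understood without a PMU prefix (small illustrative subset).
LEGACY_HW_EVENTS = {"cycles": 0}  # PERF_COUNT_HW_CPU_CYCLES

# Stand-in for sysfs, using the rk3399 values quoted in the thread.
PMUS = {
    "armv8_cortex_a53": {"type": 7, "events": {"cpu_cycles": 0x11}},
    "armv8_cortex_a72": {"type": 8, "events": {"cpu_cycles": 0x11}},
}


def resolve(spec):
    """Return (attr.type, attr.config) for an event spec under the three rules."""
    if "/" not in spec:
        # Rule 1: a bare name is a PERF_TYPE_HARDWARE event.
        return PERF_TYPE_HARDWARE, LEGACY_HW_EVENTS[spec]

    pmu_name, event = spec.rstrip("/").split("/", 1)
    pmu = PMUS[pmu_name]

    if event.startswith("hw:"):
        # Rule 3 (hypothetical syntax): legacy event directed at one PMU.
        hw_id = LEGACY_HW_EVENTS[event[len("hw:"):]]
        return PERF_TYPE_HARDWARE, (pmu["type"] << PERF_PMU_TYPE_SHIFT) | hw_id

    # Rule 2: only the PMU's own namespace is consulted, with no fallback.
    if event not in pmu["events"]:
        raise LookupError(f"{pmu_name} has no event named {event!r}")
    return pmu["type"], pmu["events"][event]


if __name__ == "__main__":
    for spec in ("cycles", "armv8_cortex_a53/cpu_cycles/", "armv8_cortex_a72/hw:cycles/"):
        print(spec, "->", resolve(spec))
```

Under these rules "armv8_cortex_a53/cycles/" raises an error rather than silently becoming a legacy event, which is exactly the behaviour that Ian's reply below points out would break cpu_core/LLC-load-misses/ on Intel hybrid.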
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 15:49 ` Mark Rutland @ 2023-11-22 16:04 ` Ian Rogers 2023-11-22 16:26 ` Arnaldo Carvalho de Melo 2023-11-22 16:19 ` Arnaldo Carvalho de Melo 1 sibling, 1 reply; 53+ messages in thread From: Ian Rogers @ 2023-11-22 16:04 UTC (permalink / raw) To: Mark Rutland Cc: Arnaldo Carvalho de Melo, Hector Martin, Marc Zyngier, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Wed, Nov 22, 2023 at 7:49 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Wed, Nov 22, 2023 at 10:06:23AM -0300, Arnaldo Carvalho de Melo wrote: > > Em Wed, Nov 22, 2023 at 12:23:27PM +0900, Hector Martin escreveu: > > > On 2023/11/22 1:38, Ian Rogers wrote: > > > > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > >> On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > > > >>> On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > >>>> On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > > > >>>>> On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > >>>>>> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > > >>>>>>> On Tue, 21 Nov 2023 13:40:31 +0000, > > > >>>>>>> Marc Zyngier <maz@kernel.org> wrote: > > > >>>>>>>> > > > >>>>>>>> [Adding key people on Cc] > > > >>>>>>>> > > > >>>>>>>> On Tue, 21 Nov 2023 12:08:48 +0000, > > > >>>>>>>> Hector Martin <marcan@marcan.st> wrote: > > > >>>>>>>>> > > > >>>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and > > > >>>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > >>>>>>>> > > > >>>>>>>> I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > >>>>>>>> asymmetric ARM platform. It isn't clear what criteria is used to pick > > > >>>>>>>> the PMU, but nothing works anymore. 
> > > >>>>>>>> > > > >>>>>>>> The saving grace in my case is that Debian still ships a 6.1 perftool > > > >>>>>>>> package, but that's obviously not going to last. > > > >>>>>>>> > > > >>>>>>>> I'm happy to test potential fixes. > > > >>>>>>> > > > >>>>>>> At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > > >>>>>>> -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > > >>>>>>> CPU): > > > >>>>>> > > > >>>>>> IIUC the tool is doing the wrong thing here and overriding explicit > > > >>>>>> ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > > > >>>>>> that ${pmu}'s type and event namespace. > > > >>>>>> > > > >>>>>> Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > > > >>>>>> targetted to a specific PMU, it's semantically wrong to rewrite events like > > > >>>>>> this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > > > >>>>>> PERF_COUNT_HW_${EVENT}. > > > >>>>> > > > >>>>> If you name a PMU and an event then the event should only be opened on > > > >>>>> that PMU, 100% agree. There's a bunch of output, but when the legacy > > > >>>>> cycles event is opened it appears to be because it was explicitly > > > >>>>> requested. > > > >>>> > > > >>>> I think you've missed that the named PMU events are being erreously transformed > > > >>>> into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. 
> > > >>>> > > > >>>> Opening: apple_firestorm_pmu/cycles/ > > > >>>> ------------------------------------------------------------ > > > >>>> perf_event_attr: > > > >>>> type 0 (PERF_TYPE_HARDWARE) > > > >>>> size 136 > > > >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > >>>> sample_type IDENTIFIER > > > >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > >>>> disabled 1 > > > >>>> inherit 1 > > > >>>> enable_on_exec 1 > > > >>>> exclude_guest 1 > > > >>>> ------------------------------------------------------------ > > > >>>> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > > >>>> > > > >>>> ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. > > > >>>> > > > >>>> Marc said that he bisected the issue down to commit: > > > >>>> > > > >>>> 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") > > > >>>> > > > >>>> ... so it looks like something is going wrong when the events are being parsed, > > > >>>> e.g. losing the HW PMU information? > > > >>> > > > >>> Ok, I think I'm getting confused by other things. This looks like the issue. > > > >>> > > > >>> I think it may be working as intended, but not how you intended :-) If > > > >>> a core PMU is listed and then a legacy event, the legacy event should > > > > The point is that "cycles" when prefixed with "pmu/" shouldn't be > > considered "cycles" as HW/0, in that setting it is "cycles" for that > > PMU. > > Exactly. > > > (but we only have "cpu_cycles" for at least the a53 and a72 PMUs I > > have access in a Libre Computer rockchip 3399-pc hybrid board, if we use > > it, then we get what we want/had before, see below): > > Both Cortex-A53 and Cortex-A72 have the common PMUv3 events, so they have > "cpu_cycles" and "bus_cycles". > > The Apple PMUs that Hector and Marc anre using don't follow the PMUv3 > architecture, and just have a "cycles" event. > > [...] 
> > > So what we need here seems to be to translate the generic term "cycles" > > to "cpu_cycles" when a PMU is explicitely passed in the event name and > > it doesn't have "cycles" and then just retry. > > I'm not sure we need to map that. > > My thinking is: > > * If the user asks for "cycles" without a PMU name, that should use the > PERF_TYPE_HARDWARE cycles event. The ARM PMUs handle that correctly when the > event is directed to them. > > * If the user asks for "${pmu}/cycles/", that should only use the "cycles" > event in that PMU's namespace, not PERF_TYPE_HARDWARE. > > * If we need a way so say "use the PERF_TYPE_HARDWARE cycles event on ${pmu}", > then we should have a new syntax for that (e.g. as we have for raw events), > e.g. it would be possible to have "pmu/hw:cycles/" or something like that. > > That way there's no ambiguity. This would break cpu_core/LLC-load-misses/ on Intel hybrid as the LLC-load-misses event is legacy and not advertised in either sysfs or in json. Thanks, Ian > Mark. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 16:04 ` Ian Rogers @ 2023-11-22 16:26 ` Arnaldo Carvalho de Melo 2023-11-22 16:33 ` Ian Rogers 0 siblings, 1 reply; 53+ messages in thread From: Arnaldo Carvalho de Melo @ 2023-11-22 16:26 UTC (permalink / raw) To: Ian Rogers Cc: Mark Rutland, Hector Martin, Marc Zyngier, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux Em Wed, Nov 22, 2023 at 08:04:26AM -0800, Ian Rogers escreveu: > On Wed, Nov 22, 2023 at 7:49 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Wed, Nov 22, 2023 at 10:06:23AM -0300, Arnaldo Carvalho de Melo wrote: > > > Em Wed, Nov 22, 2023 at 12:23:27PM +0900, Hector Martin escreveu: > > > > On 2023/11/22 1:38, Ian Rogers wrote: > > > > > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > >> On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > > > > >>> On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > >>>> On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > > > > >>>>> On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > >>>>>> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > > > >>>>>>> On Tue, 21 Nov 2023 13:40:31 +0000, > > > > >>>>>>> Marc Zyngier <maz@kernel.org> wrote: > > > > >>>>>>>> > > > > >>>>>>>> [Adding key people on Cc] > > > > >>>>>>>> > > > > >>>>>>>> On Tue, 21 Nov 2023 12:08:48 +0000, > > > > >>>>>>>> Hector Martin <marcan@marcan.st> wrote: > > > > >>>>>>>>> > > > > >>>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > >>>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > >>>>>>>> > > > > >>>>>>>> I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > > >>>>>>>> asymmetric ARM platform. It isn't clear what criteria is used to pick > > > > >>>>>>>> the PMU, but nothing works anymore. 
> > > > >>>>>>>> > > > > >>>>>>>> The saving grace in my case is that Debian still ships a 6.1 perftool > > > > >>>>>>>> package, but that's obviously not going to last. > > > > >>>>>>>> > > > > >>>>>>>> I'm happy to test potential fixes. > > > > >>>>>>> > > > > >>>>>>> At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > > > >>>>>>> -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > > > >>>>>>> CPU): > > > > >>>>>> > > > > >>>>>> IIUC the tool is doing the wrong thing here and overriding explicit > > > > >>>>>> ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > > > > >>>>>> that ${pmu}'s type and event namespace. > > > > >>>>>> > > > > >>>>>> Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > > > > >>>>>> targetted to a specific PMU, it's semantically wrong to rewrite events like > > > > >>>>>> this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > > > > >>>>>> PERF_COUNT_HW_${EVENT}. > > > > >>>>> > > > > >>>>> If you name a PMU and an event then the event should only be opened on > > > > >>>>> that PMU, 100% agree. There's a bunch of output, but when the legacy > > > > >>>>> cycles event is opened it appears to be because it was explicitly > > > > >>>>> requested. > > > > >>>> > > > > >>>> I think you've missed that the named PMU events are being erreously transformed > > > > >>>> into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. 
> > > > >>>> > > > > >>>> Opening: apple_firestorm_pmu/cycles/ > > > > >>>> ------------------------------------------------------------ > > > > >>>> perf_event_attr: > > > > >>>> type 0 (PERF_TYPE_HARDWARE) > > > > >>>> size 136 > > > > >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > > >>>> sample_type IDENTIFIER > > > > >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > > >>>> disabled 1 > > > > >>>> inherit 1 > > > > >>>> enable_on_exec 1 > > > > >>>> exclude_guest 1 > > > > >>>> ------------------------------------------------------------ > > > > >>>> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > > > >>>> > > > > >>>> ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. > > > > >>>> > > > > >>>> Marc said that he bisected the issue down to commit: > > > > >>>> > > > > >>>> 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") > > > > >>>> > > > > >>>> ... so it looks like something is going wrong when the events are being parsed, > > > > >>>> e.g. losing the HW PMU information? > > > > >>> > > > > >>> Ok, I think I'm getting confused by other things. This looks like the issue. > > > > >>> > > > > >>> I think it may be working as intended, but not how you intended :-) If > > > > >>> a core PMU is listed and then a legacy event, the legacy event should > > > > > > The point is that "cycles" when prefixed with "pmu/" shouldn't be > > > considered "cycles" as HW/0, in that setting it is "cycles" for that > > > PMU. > > > > Exactly. > > > > > (but we only have "cpu_cycles" for at least the a53 and a72 PMUs I > > > have access in a Libre Computer rockchip 3399-pc hybrid board, if we use > > > it, then we get what we want/had before, see below): > > > > Both Cortex-A53 and Cortex-A72 have the common PMUv3 events, so they have > > "cpu_cycles" and "bus_cycles". > > > > The Apple PMUs that Hector and Marc anre using don't follow the PMUv3 > > architecture, and just have a "cycles" event. 
> > > > [...] > > > > > So what we need here seems to be to translate the generic term "cycles" > > > to "cpu_cycles" when a PMU is explicitely passed in the event name and > > > it doesn't have "cycles" and then just retry. > > > > I'm not sure we need to map that. > > > > My thinking is: > > > > * If the user asks for "cycles" without a PMU name, that should use the > > PERF_TYPE_HARDWARE cycles event. The ARM PMUs handle that correctly when the > > event is directed to them. > > > > * If the user asks for "${pmu}/cycles/", that should only use the "cycles" > > event in that PMU's namespace, not PERF_TYPE_HARDWARE. > > > > * If we need a way so say "use the PERF_TYPE_HARDWARE cycles event on ${pmu}", > > then we should have a new syntax for that (e.g. as we have for raw events), > > e.g. it would be possible to have "pmu/hw:cycles/" or something like that. > > > > That way there's no ambiguity. > > This would break cpu_core/LLC-load-misses/ on Intel hybrid as the > LLC-load-misses event is legacy and not advertised in either sysfs or > in json. Indeed: [root@quaco ~]# ls /sys/devices/cpu/events/ branch-instructions bus-cycles cache-references instructions mem-stores topdown-fetch-bubbles topdown-recovery-bubbles.scale topdown-slots-retired topdown-total-slots.scale branch-misses cache-misses cpu-cycles mem-loads ref-cycles topdown-recovery-bubbles topdown-slots-issued topdown-total-slots [root@quaco ~]# strace -e perf_event_open perf stat -e cpu/LLC-load-misses/ echo perf_event_open({type=PERF_TYPE_HW_CACHE, size=0x88 /* PERF_ATTR_SIZE_??? 
*/, config=PERF_COUNT_HW_CACHE_RESULT_MISS<<16|PERF_COUNT_HW_CACHE_OP_READ<<8|PERF_COUNT_HW_CACHE_LL, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, ...}, 41467, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=41467, si_uid=0, si_status=0, si_utime=0, si_stime=0} --- Performance counter stats for 'echo': 1,015 cpu/LLC-load-misses/ 0.005167119 seconds time elapsed 0.000821000 seconds user 0.004105000 seconds sys --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=41466, si_uid=0} --- +++ exited with 0 +++ [root@quaco ~]# Is it difficult to before doing the current expansion to PERF_TYPE_HARDWARE/PERF_HW_CPU_CYCLES just check if there is an event with the name specified in the PMU specified, if there is, use that. - Arnaldo ^ permalink raw reply [flat|nested] 53+ messages in thread
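The check Arnaldo asks about, looking for the named event in the named PMU before expanding to a legacy encoding, can be sketched as follows. The function name, the returned labels and the `LEGACY_NAMES` subset are assumptions made for illustration; only the /sys/devices/<pmu>/events/<name> layout comes from the outputs shown in this thread.

```python
import os

# Small illustrative subset of names the tool treats as legacy events.
LEGACY_NAMES = {"cycles", "LLC-load-misses"}


def choose_encoding(pmu, event, sysfs_root="/sys/devices"):
    """Prefer an event the PMU publishes in sysfs; otherwise use the legacy name.

    Returns "pmu" when <sysfs_root>/<pmu>/events/<event> exists and "legacy"
    when only the legacy table knows the name.
    """
    if os.path.exists(os.path.join(sysfs_root, pmu, "events", event)):
        return "pmu"
    if event in LEGACY_NAMES:
        return "legacy"
    raise LookupError(f"{pmu}/{event}/ is neither a sysfs nor a legacy event")


if __name__ == "__main__":
    # On an Intel machine like the one above this is expected to fall back to
    # the legacy encoding, since LLC-load-misses is not listed in sysfs.
    print(choose_encoding("cpu", "LLC-load-misses"))
```

With this ordering apple_icestorm_pmu/cycles/ would use the PMU's own "cycles" event, while cpu/LLC-load-misses/ would keep its legacy encoding. Any name that exists in both places (for example cpu-cycles on x86) would switch to the sysfs encoding.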
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 16:26 ` Arnaldo Carvalho de Melo @ 2023-11-22 16:33 ` Ian Rogers 0 siblings, 0 replies; 53+ messages in thread From: Ian Rogers @ 2023-11-22 16:33 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Mark Rutland, Hector Martin, Marc Zyngier, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Wed, Nov 22, 2023 at 8:26 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Wed, Nov 22, 2023 at 08:04:26AM -0800, Ian Rogers escreveu: > > On Wed, Nov 22, 2023 at 7:49 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > On Wed, Nov 22, 2023 at 10:06:23AM -0300, Arnaldo Carvalho de Melo wrote: > > > > Em Wed, Nov 22, 2023 at 12:23:27PM +0900, Hector Martin escreveu: > > > > > On 2023/11/22 1:38, Ian Rogers wrote: > > > > > > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > >> On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > > > > > >>> On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > >>>> On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > > > > > >>>>> On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > >>>>>> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > > > > >>>>>>> On Tue, 21 Nov 2023 13:40:31 +0000, > > > > > >>>>>>> Marc Zyngier <maz@kernel.org> wrote: > > > > > >>>>>>>> > > > > > >>>>>>>> [Adding key people on Cc] > > > > > >>>>>>>> > > > > > >>>>>>>> On Tue, 21 Nov 2023 12:08:48 +0000, > > > > > >>>>>>>> Hector Martin <marcan@marcan.st> wrote: > > > > > >>>>>>>>> > > > > > >>>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > > >>>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > >>>>>>>> > > > > > >>>>>>>> I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > > > >>>>>>>> asymmetric ARM platform. 
It isn't clear what criteria is used to pick > > > > > >>>>>>>> the PMU, but nothing works anymore. > > > > > >>>>>>>> > > > > > >>>>>>>> The saving grace in my case is that Debian still ships a 6.1 perftool > > > > > >>>>>>>> package, but that's obviously not going to last. > > > > > >>>>>>>> > > > > > >>>>>>>> I'm happy to test potential fixes. > > > > > >>>>>>> > > > > > >>>>>>> At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > > > > >>>>>>> -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > > > > >>>>>>> CPU): > > > > > >>>>>> > > > > > >>>>>> IIUC the tool is doing the wrong thing here and overriding explicit > > > > > >>>>>> ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > > > > > >>>>>> that ${pmu}'s type and event namespace. > > > > > >>>>>> > > > > > >>>>>> Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > > > > > >>>>>> targetted to a specific PMU, it's semantically wrong to rewrite events like > > > > > >>>>>> this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > > > > > >>>>>> PERF_COUNT_HW_${EVENT}. > > > > > >>>>> > > > > > >>>>> If you name a PMU and an event then the event should only be opened on > > > > > >>>>> that PMU, 100% agree. There's a bunch of output, but when the legacy > > > > > >>>>> cycles event is opened it appears to be because it was explicitly > > > > > >>>>> requested. > > > > > >>>> > > > > > >>>> I think you've missed that the named PMU events are being erreously transformed > > > > > >>>> into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. 
> > > > > >>>> > > > > > >>>> Opening: apple_firestorm_pmu/cycles/ > > > > > >>>> ------------------------------------------------------------ > > > > > >>>> perf_event_attr: > > > > > >>>> type 0 (PERF_TYPE_HARDWARE) > > > > > >>>> size 136 > > > > > >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > > > >>>> sample_type IDENTIFIER > > > > > >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > > > >>>> disabled 1 > > > > > >>>> inherit 1 > > > > > >>>> enable_on_exec 1 > > > > > >>>> exclude_guest 1 > > > > > >>>> ------------------------------------------------------------ > > > > > >>>> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > > > > >>>> > > > > > >>>> ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. > > > > > >>>> > > > > > >>>> Marc said that he bisected the issue down to commit: > > > > > >>>> > > > > > >>>> 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") > > > > > >>>> > > > > > >>>> ... so it looks like something is going wrong when the events are being parsed, > > > > > >>>> e.g. losing the HW PMU information? > > > > > >>> > > > > > >>> Ok, I think I'm getting confused by other things. This looks like the issue. > > > > > >>> > > > > > >>> I think it may be working as intended, but not how you intended :-) If > > > > > >>> a core PMU is listed and then a legacy event, the legacy event should > > > > > > > > The point is that "cycles" when prefixed with "pmu/" shouldn't be > > > > considered "cycles" as HW/0, in that setting it is "cycles" for that > > > > PMU. > > > > > > Exactly. > > > > > > > (but we only have "cpu_cycles" for at least the a53 and a72 PMUs I > > > > have access in a Libre Computer rockchip 3399-pc hybrid board, if we use > > > > it, then we get what we want/had before, see below): > > > > > > Both Cortex-A53 and Cortex-A72 have the common PMUv3 events, so they have > > > "cpu_cycles" and "bus_cycles". 
> > > > > > The Apple PMUs that Hector and Marc anre using don't follow the PMUv3 > > > architecture, and just have a "cycles" event. > > > > > > [...] > > > > > > > So what we need here seems to be to translate the generic term "cycles" > > > > to "cpu_cycles" when a PMU is explicitely passed in the event name and > > > > it doesn't have "cycles" and then just retry. > > > > > > I'm not sure we need to map that. > > > > > > My thinking is: > > > > > > * If the user asks for "cycles" without a PMU name, that should use the > > > PERF_TYPE_HARDWARE cycles event. The ARM PMUs handle that correctly when the > > > event is directed to them. > > > > > > * If the user asks for "${pmu}/cycles/", that should only use the "cycles" > > > event in that PMU's namespace, not PERF_TYPE_HARDWARE. > > > > > > * If we need a way so say "use the PERF_TYPE_HARDWARE cycles event on ${pmu}", > > > then we should have a new syntax for that (e.g. as we have for raw events), > > > e.g. it would be possible to have "pmu/hw:cycles/" or something like that. > > > > > > That way there's no ambiguity. > > > > This would break cpu_core/LLC-load-misses/ on Intel hybrid as the > > LLC-load-misses event is legacy and not advertised in either sysfs or > > in json. > > Indeed: > > [root@quaco ~]# ls /sys/devices/cpu/events/ > branch-instructions bus-cycles cache-references instructions mem-stores topdown-fetch-bubbles topdown-recovery-bubbles.scale topdown-slots-retired topdown-total-slots.scale > branch-misses cache-misses cpu-cycles mem-loads ref-cycles topdown-recovery-bubbles topdown-slots-issued topdown-total-slots > [root@quaco ~]# strace -e perf_event_open perf stat -e cpu/LLC-load-misses/ echo > perf_event_open({type=PERF_TYPE_HW_CACHE, size=0x88 /* PERF_ATTR_SIZE_??? 
*/, config=PERF_COUNT_HW_CACHE_RESULT_MISS<<16|PERF_COUNT_HW_CACHE_OP_READ<<8|PERF_COUNT_HW_CACHE_LL, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, ...}, 41467, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3 > > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=41467, si_uid=0, si_status=0, si_utime=0, si_stime=0} --- > > Performance counter stats for 'echo': > > 1,015 cpu/LLC-load-misses/ > > 0.005167119 seconds time elapsed > > 0.000821000 seconds user > 0.004105000 seconds sys > > > --- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=41466, si_uid=0} --- > +++ exited with 0 +++ > [root@quaco ~]# > > Is it difficult to before doing the current expansion to > PERF_TYPE_HARDWARE/PERF_HW_CPU_CYCLES just check if there is an event > with the name specified in the PMU specified, if there is, use that. Agreed and I've sent an early cut of this. The issue is that then we end up changing the encoding on Intel. I also don't see why ARM doesn't just fix their PMU. Thanks, Ian > - Arnaldo ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 15:49 ` Mark Rutland 2023-11-22 16:04 ` Ian Rogers @ 2023-11-22 16:19 ` Arnaldo Carvalho de Melo 1 sibling, 0 replies; 53+ messages in thread From: Arnaldo Carvalho de Melo @ 2023-11-22 16:19 UTC (permalink / raw) To: Mark Rutland Cc: Hector Martin, Ian Rogers, Marc Zyngier, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux Em Wed, Nov 22, 2023 at 03:49:18PM +0000, Mark Rutland escreveu: > On Wed, Nov 22, 2023 at 10:06:23AM -0300, Arnaldo Carvalho de Melo wrote: > > The point is that "cycles" when prefixed with "pmu/" shouldn't be > > considered "cycles" as HW/0, in that setting it is "cycles" for that > > PMU. > Exactly. > > (but we only have "cpu_cycles" for at least the a53 and a72 PMUs I > > have access in a Libre Computer rockchip 3399-pc hybrid board, if we use > > it, then we get what we want/had before, see below): > Both Cortex-A53 and Cortex-A72 have the common PMUv3 events, so they have > "cpu_cycles" and "bus_cycles". root@roc-rk3399-pc:~# ls -la /sys/devices/*/events/*cycles -r--r--r-- 1 root root 4096 Nov 22 12:35 /sys/devices/armv8_cortex_a53/events/bus_cycles -r--r--r-- 1 root root 4096 Nov 22 12:35 /sys/devices/armv8_cortex_a53/events/cpu_cycles -r--r--r-- 1 root root 4096 Nov 22 12:35 /sys/devices/armv8_cortex_a72/events/bus_cycles -r--r--r-- 1 root root 4096 Nov 22 12:35 /sys/devices/armv8_cortex_a72/events/cpu_cycles root@roc-rk3399-pc:~# But on x86, on a AMD machine: ⬢[acme@toolbox ~]$ ls -la /sys/devices/*/events/*cycles -r--r--r--. 1 nobody nobody 4096 Nov 22 12:48 /sys/devices/cpu/events/cpu-cycles ⬢[acme@toolbox ~]$ And an Intel: [acme@quaco asahi]$ ls -la /sys/devices/*/events/*cycles -r--r--r--. 1 root root 4096 Nov 22 13:11 /sys/devices/cpu/events/bus-cycles -r--r--r--. 1 root root 4096 Nov 22 13:11 /sys/devices/cpu/events/cpu-cycles -r--r--r--. 
1 root root 4096 Nov 22 13:11 /sys/devices/cpu/events/ref-cycles [acme@quaco asahi]$ Slight difference with those - and _. > The Apple PMUs that Hector and Marc anre using don't follow the PMUv3 > architecture, and just have a "cycles" event. I see, and even being prefixed with the PMU name, as "apple_icestorm_pmu/cycles/" it ends up trumping that and moving that to (PERF_TYPE_HARDWARE, PERF_HW_CPU_CYCLES) instead of (/sys/devices/apple_icestorm_pmu/events/type, /sys/devices/apple_icestorm_pmu/events/cycles) as I noticed with: sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES, sample_period=0, sample_type=0, read_format=0, disabled=1, precise_ip=0 /* arbitrary skid */, ...}, 0, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory) I.e.: type=PERF_TYPE_HARDWARE, config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES It should be: type=/sys/devices/apple_icestorm_pmu/events/type, config=/sys/devices/apple_icestorm_pmu/events/cycles That is the minimal patch to address the regression reported, even if using some kludge to buy time for a longer term more elegant solution, Ian? > [...] > > So what we need here seems to be to translate the generic term "cycles" > > to "cpu_cycles" when a PMU is explicitely passed in the event name and > > it doesn't have "cycles" and then just retry. > > I'm not sure we need to map that. > > My thinking is: > > * If the user asks for "cycles" without a PMU name, that should use the > PERF_TYPE_HARDWARE cycles event. The ARM PMUs handle that correctly when the > event is directed to them. > > * If the user asks for "${pmu}/cycles/", that should only use the "cycles" > event in that PMU's namespace, not PERF_TYPE_HARDWARE. And thus, armv8_cortex_a53/cycles/ and armv8_cortex_a72/cycles/ should just fail as there is no "cycles" for that PMU, no fallback. 
> * If we need a way to say "use the PERF_TYPE_HARDWARE cycles event on ${pmu}", > then we should have a new syntax for that (e.g. as we have for raw events), > e.g. it would be possible to have "pmu/hw:cycles/" or something like that. > > That way there's no ambiguity. - Arnaldo ^ permalink raw reply [flat|nested] 53+ messages in thread
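The resolution rules quoted in the message above (a bare name uses the legacy hardware event; "${pmu}/name/" uses only that PMU's own namespace, with no fallback) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how perf's parser is implemented: resolve(), LEGACY_HW and the sysfs_root parameter are hypothetical names, the legacy table is a tiny subset, extra terms after the event name are not handled, and the proposed "pmu/hw:cycles/" syntax is left out. The per-PMU type id is read from <pmu>/type and the event encoding from <pmu>/events/<name>.

```python
import os

PERF_TYPE_HARDWARE = 0  # value from the perf_event UAPI header

# Tiny, illustrative subset of the legacy PERF_COUNT_HW_* ids.
LEGACY_HW = {"cycles": 0, "cpu-cycles": 0, "instructions": 1}


def resolve(spec, sysfs_root="/sys/bus/event_source/devices"):
    """Map an event spec to (type, config).

    For a bare name, config is the legacy PERF_COUNT_HW_* id.
    For "pmu/name/", config is the raw term string stored in the PMU's
    sysfs events file (for example "event=0x02"); turning that string
    into a number needs the PMU's format/ files and is not shown here.
    """
    if "/" not in spec:
        # Bare name: legacy hardware event, never a sysfs lookup.
        if spec not in LEGACY_HW:
            raise LookupError(f"unknown legacy event: {spec}")
        return PERF_TYPE_HARDWARE, LEGACY_HW[spec]

    # Named PMU: only that PMU's namespace is consulted.
    pmu, name = spec.rstrip("/").split("/", 1)
    pmu_dir = os.path.join(sysfs_root, pmu)
    if not os.path.isdir(pmu_dir):
        raise LookupError(f"no such PMU: {pmu}")
    with open(os.path.join(pmu_dir, "type")) as f:
        pmu_type = int(f.read())

    event_path = os.path.join(pmu_dir, "events", name)
    if not os.path.isfile(event_path):
        # No fallback to a legacy event with a similar name.
        raise LookupError(f"{pmu} has no event named {name}")
    with open(event_path) as f:
        return pmu_type, f.read().strip()
```

Under these rules a request such as armv8_cortex_a53/cycles/ raises an error rather than quietly turning into a legacy event, because that PMU only advertises cpu_cycles and bus_cycles.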
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-21 16:38 ` Ian Rogers 2023-11-22 3:23 ` Hector Martin @ 2023-11-22 13:03 ` Mark Rutland 2023-11-22 15:29 ` Ian Rogers 1 sibling, 1 reply; 53+ messages in thread From: Mark Rutland @ 2023-11-22 13:03 UTC (permalink / raw) To: Ian Rogers Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Tue, Nov 21, 2023 at 08:38:45AM -0800, Ian Rogers wrote: > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > > > On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > > > > > On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > > > > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > > > > > > On Tue, 21 Nov 2023 13:40:31 +0000, > > > > > > > Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > > > > > > > > > [Adding key people on Cc] > > > > > > > > > > > > > > > > On Tue, 21 Nov 2023 12:08:48 +0000, > > > > > > > > Hector Martin <marcan@marcan.st> wrote: > > > > > > > > > > > > > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > > > > > > > > > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > > > > > > asymmetric ARM platform. It isn't clear what criteria is used to pick > > > > > > > > the PMU, but nothing works anymore. > > > > > > > > > > > > > > > > The saving grace in my case is that Debian still ships a 6.1 perftool > > > > > > > > package, but that's obviously not going to last. > > > > > > > > > > > > > > > > I'm happy to test potential fixes. 
> > > > > > > > > > > > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > > > > > > -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > > > > > > CPU): > > > > > > > > > > > > IIUC the tool is doing the wrong thing here and overriding explicit > > > > > > ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > > > > > > that ${pmu}'s type and event namespace. > > > > > > > > > > > > Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > > > > > > targetted to a specific PMU, it's semantically wrong to rewrite events like > > > > > > this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > > > > > > PERF_COUNT_HW_${EVENT}. > > > > > > > > > > If you name a PMU and an event then the event should only be opened on > > > > > that PMU, 100% agree. There's a bunch of output, but when the legacy > > > > > cycles event is opened it appears to be because it was explicitly > > > > > requested. > > > > > > > > I think you've missed that the named PMU events are being erreously transformed > > > > into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. > > > > > > > > Opening: apple_firestorm_pmu/cycles/ > > > > ------------------------------------------------------------ > > > > perf_event_attr: > > > > type 0 (PERF_TYPE_HARDWARE) > > > > size 136 > > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > > sample_type IDENTIFIER > > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > > disabled 1 > > > > inherit 1 > > > > enable_on_exec 1 > > > > exclude_guest 1 > > > > ------------------------------------------------------------ > > > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > > > > > > > ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. 
> > > > > > > > Marc said that he bisected the issue down to commit: > > > > > > > > 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") > > > > > > > > ... so it looks like something is going wrong when the events are being parsed, > > > > e.g. losing the HW PMU information? > > > > > > Ok, I think I'm getting confused by other things. This looks like the issue. > > > > > > I think it may be working as intended, but not how you intended :-) If > > > a core PMU is listed and then a legacy event, the legacy event should > > > be opened on the core PMU as a legacy event with the extended type > > > set. This is to allow things like legacy cache events to be opened on > > > a specified PMU. Legacy event names match with a higher priority than > > > those in sysfs or json as they are hard coded. > > > > That has never been the case previously, so this is user-visible breakage, and > > it prevents users from being able to do the right thing, so I think that's a > > broken design. > > So the problem was caused by ARM and Intel doing two different things. > Intel did at least contribute to the perf tool in support for their > BIG.little/hybrid, so that's why the semantics match their approach. I appreciate that, and I agree that from the Arm side we haven't been as engaged with userspace on this front (please understand I'm the messenger here, this is something I've repeatedly asked for within Arm). Regardless, I don't think that changes the substance of the bug, which is that we're converting named-pmu events into entirely different PERF_TYPE_HARDWARE events. I agree that expanding plain legacy event names to a set of PMU-targeted legacy events makes sense (and even for Arm, that's the right thing to do, IMO). If I ask for 'cycles' and that gets expanded to multiple legacy cycles events that target specific CPU PMUs, that's good. The thing that doesn't make sense here is converting named-pmu events into legacy events.
If I ask for 'apple_firestorm_pmu/cycles/', that should be the 'cycles' event in the apple_firestorm_pmu's event namespace, and *shouldn't* be converted to a (potentially semantically different) PERF_TYPE_HARDWARE event, even if that's targeted towards the apple_firestorm_pmu. I think that should be true for *any* PMU, whether that's an arm/x86/whatever CPU PMU or a system PMU. > > > Presumably the expectation was that by advertising a cycles event, presumably > > > in sysfs, then this is what would be matched. Yes. That's how this has always worked prior to the changes Marc referenced. Note that this can *also* be expanded to events from json databases, but was *never* previously silently converted to a PERF_TYPE_HARDWARE event. Please note that the events in sysfs are *namespaced* to the PMU (specifically, when using that PMU's dynamic type); they are not necessarily the same as legacy events (though they may have similar or matching names in some cases), they may be semantically distinct from the legacy events even if the names match, and it is incorrect to conflate the two. > > I expect that if I ask for ${pmu}/${event}/, that PMU is used, and the event > > *in that PMU's namespace* is used. Overriding that breaks long-established > > practice and provides users with no recourse to get the behaviour they expect > > (and previously had). > > On ARM but not Intel. As above, I don't think the CPU architecture matters here for the case that I'm saying is broken. I think that regardless of CPU architecture (or for any non-CPU PMU) it is semantically incorrect to convert a named-pmu event to a legacy event. > > I do think that (regardless of whether this was the semantic you intended) > > silently overriding events with legacy events is a bug, and one we should fix. > > As I mentioned in another reply, just because the events have the same name > > does not mean that they are semantically the same, so we're liable to give > > people the wrong numbers anyhow.
> > > > Can we fix this? > > So I'd like to fix this, some things from various conversations: > > 1) we lack testing. Our testing relies on the sysfs of the machine > being run on, which is better than nothing. I think ideally we'd have > a collection of zipped up sysfs directories and then we could have a > test that asserts on ARM you get the behavior you want. I agree we lack testing, and I'd be happy to help here going forwards, though I don't think this is a prerequisite for fixing this issue. > 2) for RISC-V they want to make the legacy event matching something in > user land to simplify the PMU driver. Ok; I see how this might be related, but it doesn't sound like a prerequisite for fixing this issue -- there are plenty of people in this thread who can test. > 3) I'd like to get rid of the PMU json interface. My idea is to > convert json events/metrics into sysfs style files, zip these up and > then link them into the perf binary. On Intel the json is 70% of the > binary (7MB out of 10MB) and we may get this down to 3MB with this > approach. The json lookup would need to incorporate the cpuid matching > that currently exists. When we look up an event I'd like the approach > to be like unionfs with a specified but configurable order. Users > could provide directories of their own events/metrics for various > PMUs, and then this approach could be used to help with (1). I can see how that might interact with whatever changes we make to fix this issue, but this seems like a future aspiration, and not a prerequisite for fixing the existing functional regression. > Those proposals are not something to add as a -rc fix, so what I think > you're asking for here is a "if ARM" fix somewhere in the event > parsing. That's of course possible but it will cause problems if you > did say: > > perf stat -e arm_pmu/LLC-load-misses/ ... As above, I do not think this is an arm-specific issue, we're just the canary in the coalmine. 
Please note that: perf stat -e arm_pmu/LLC-load-misses/ ... ... would never have worked previously. No arm_pmu instances have a "LLC-load-misses" event in their event namespaces, and we don't have any userspace file mapping that event. That said, If I really wanted that legacy event, I'd have asked for it bare, e.g. perf stat -e LLC-load-misses ... and we're in agreement that it's sensible to expand this to multiple PERF_TYPE_HARDWARE events targeting the individual CPU PMUs. So I see no need to do anything to have magic for 'arm_pmu/LLC-load-misses/'. > as I doubt the PMU driver is advertising this legacy event in sysfs > and the "if ARM" logic would presumably be trying to disable legacy > events in the term list for the ARM PMU. > > Given all of this, is anything actually broken and needing a fix for 6.7? There is absolutely a bug that needs to be fixed here (and needs to be backported to stable so that it gets picked up by distributions). Thanks, Mark. ^ permalink raw reply [flat|nested] 53+ messages in thread
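For reference, the expansion that both sides agree on (a bare legacy name opened once per core PMU) relies on the extended hardware type encoding, where the PMU's type id sits in the upper 32 bits of config. That is the config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES form visible in the strace output quoted earlier in the thread. A minimal sketch follows; the helper names and the PMU type ids in the usage note are made up for illustration, while the constants mirror the perf_event UAPI header.

```python
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_CPU_CYCLES = 0
PERF_PMU_TYPE_SHIFT = 32         # PMU type id lives in config bits 63..32
PERF_HW_EVENT_MASK = 0xFFFFFFFF  # legacy event id lives in config bits 31..0


def extended_hw_config(pmu_type, hw_id):
    """Build the config of a PERF_TYPE_HARDWARE event aimed at one PMU."""
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | (hw_id & PERF_HW_EVENT_MASK)


def split_extended_config(config):
    """Inverse of extended_hw_config: return (pmu_type, hw_id)."""
    return config >> PERF_PMU_TYPE_SHIFT, config & PERF_HW_EVENT_MASK


def expand_legacy(hw_id, core_pmus):
    """Expand one bare legacy event into one attr per core PMU.

    core_pmus maps a PMU name to its dynamic type id (hypothetical input;
    a real tool would read the ids from each PMU's sysfs 'type' file).
    Returns (pmu_name, attr_type, attr_config) tuples sorted by PMU name.
    """
    return [
        (name, PERF_TYPE_HARDWARE, extended_hw_config(pmu_type, hw_id))
        for name, pmu_type in sorted(core_pmus.items())
    ]
```

Calling expand_legacy(PERF_COUNT_HW_CPU_CYCLES, {"apple_icestorm_pmu": 7, "apple_firestorm_pmu": 8}) yields two PERF_TYPE_HARDWARE attrs, one per PMU. The point of contention in the thread is only whether an explicit "${pmu}/cycles/" should take this path as well.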
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 13:03 ` Mark Rutland @ 2023-11-22 15:29 ` Ian Rogers 2023-11-22 16:08 ` Mark Rutland 0 siblings, 1 reply; 53+ messages in thread From: Ian Rogers @ 2023-11-22 15:29 UTC (permalink / raw) To: Mark Rutland Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Wed, Nov 22, 2023 at 5:04 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Tue, Nov 21, 2023 at 08:38:45AM -0800, Ian Rogers wrote: > > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > > > > On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > > > On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > > > > > > On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > > > > > > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > > > > > > > On Tue, 21 Nov 2023 13:40:31 +0000, > > > > > > > > Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > > > > > > > > > > > [Adding key people on Cc] > > > > > > > > > > > > > > > > > > On Tue, 21 Nov 2023 12:08:48 +0000, > > > > > > > > > Hector Martin <marcan@marcan.st> wrote: > > > > > > > > > > > > > > > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > > > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > > > > > > > > > > > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > > > > > > > asymmetric ARM platform. It isn't clear what criteria is used to pick > > > > > > > > > the PMU, but nothing works anymore. > > > > > > > > > > > > > > > > > > The saving grace in my case is that Debian still ships a 6.1 perftool > > > > > > > > > package, but that's obviously not going to last. 
> > > > > > > > > > > > > > > > > > I'm happy to test potential fixes. > > > > > > > > > > > > > > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > > > > > > > -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > > > > > > > CPU): > > > > > > > > > > > > > > IIUC the tool is doing the wrong thing here and overriding explicit > > > > > > > ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > > > > > > > that ${pmu}'s type and event namespace. > > > > > > > > > > > > > > Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > > > > > > > targetted to a specific PMU, it's semantically wrong to rewrite events like > > > > > > > this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > > > > > > > PERF_COUNT_HW_${EVENT}. > > > > > > > > > > > > If you name a PMU and an event then the event should only be opened on > > > > > > that PMU, 100% agree. There's a bunch of output, but when the legacy > > > > > > cycles event is opened it appears to be because it was explicitly > > > > > > requested. > > > > > > > > > > I think you've missed that the named PMU events are being erreously transformed > > > > > into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. > > > > > > > > > > Opening: apple_firestorm_pmu/cycles/ > > > > > ------------------------------------------------------------ > > > > > perf_event_attr: > > > > > type 0 (PERF_TYPE_HARDWARE) > > > > > size 136 > > > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > > > sample_type IDENTIFIER > > > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > > > disabled 1 > > > > > inherit 1 > > > > > enable_on_exec 1 > > > > > exclude_guest 1 > > > > > ------------------------------------------------------------ > > > > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > > > > > > > > > ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. 
> > > > > > > > > > Marc said that he bisected the issue down to commit: > > > > > > > > > > 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") > > > > > > > > > > ... so it looks like something is going wrong when the events are being parsed, > > > > > e.g. losing the HW PMU information? > > > > > > > > Ok, I think I'm getting confused by other things. This looks like the issue. > > > > > > > > I think it may be working as intended, but not how you intended :-) If > > > > a core PMU is listed and then a legacy event, the legacy event should > > > > be opened on the core PMU as a legacy event with the extended type > > > > set. This is to allow things like legacy cache events to be opened on > > > > a specified PMU. Legacy event names match with a higher priority than > > > > those in sysfs or json as they are hard coded. > > > > > > That has never been the case previously, so this is user-visible breakage, and > > > it prevents users from being able to do the right thing, so I think that's a > > > broken design. > > > > So the problem was caused by ARM and Intel doing two different things. > > Intel did at least contribute to the perf tool in support for their > > BIG.little/hybrid, so that's why the semantics match their approach. > > I appreciate that, and I agree that from the Arm side we haven't been as > engaged with userspace on this front (please understand I'm the messenger here, > this is something I've repeatedly asked for within Arm). > > Regardless, I don't think that changes the substance of the bug, which is that > we're converting named-pmu events into entirely different PERF_TYPE_HARDWARE > events. > > I agree that expanding plain legacy event names to a set of PMU-tagetted legacy > events makes sense (and even for Arm, that's the right thing to do, IMO). If > I ask for 'cycles' and that gets expanded to multiple legacy cycles events that > target specific CPU PMUs, that's good. 
> > The thing that doesn't make sense here is converting named-pmu events into > egacy events. If I ask for 'apple_firestorm_pmu/cycles/', that should be the > 'cycles' event in the apple_firestorm_pmu's event namespace, and *shouldn't* be > converted to a (potentially semantically different) PERF_TYPE_HARDWARE event, > even if that's targetted towards the apple_firestorm_pmu. I think that should > be true for *any* PMU, whether thats an arm/x86/whatever CPU PMU or a system > PMU. This is saying that legacy events are lower than system events. We don't do this historically and as it requires extra PMU set up. On an Intel Tigerlake: ``` $ ls /sys/devices/cpu/events branch-instructions cache-misses instructions ref-cycles topdown-be-bound branch-misses cache-references mem-loads slots topdown-fe-bound bus-cycles cpu-cycles mem-stores topdown-bad-spec topdown-retiring ``` here (at least) branch-misses, bus-cycles, cache-references, cpu-cycles and instructions overlap with legacy event names ``` $ perf --version perf version 6.5.6 $ perf stat -vv -e branch-misses,bus-cycles,cache-references,cp u-cycles,instructions true Using CPUID GenuineIntel-6-8D-1 intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0x5 (PERF_COUNT_HW_BRANCH_MISSES) ... ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0x6 (PERF_COUNT_HW_BUS_CYCLES) ... ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0x2 (PERF_COUNT_HW_CACHE_REFERENCES) ... ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) ... 
------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0x1 (PERF_COUNT_HW_INSTRUCTIONS) ... branch-misses: -1: 6571 826226 826226 bus-cycles: -1: 31411 826226 826226 cache-references: -1: 19507 826226 826226 cpu-cycles: -1: 1127215 826226 826226 instructions: -1: 1301583 826226 826226 branch-misses: 6571 826226 826226 bus-cycles: 31411 826226 826226 cache-references: 19507 826226 826226 cpu-cycles: 1127215 826226 826226 instructions: 1301583 826226 826226 Performance counter stats for 'true': ... ``` i.e. with perf 6.5, even though sysfs advertises these events, all of them are opened with PERF_TYPE_HARDWARE. > > > > Presumably the expectation was that by advertising a cycles event, presumably > > > > in sysfs, then this is what would be matched. > > Yes. That's how this has always worked prior to the changes Marc referenced. > Note that this can *also* be expanded to events from json databases, but was > *never* previously silently converted to a PERF_TYPE_HARDWARE event. > > Please note that the events in sysfs are *namespaced* to the PMU (specifically, > when using that PMU's dynamic type); they are not necessarily the same as > legacy events (though they may have similar or matching > names in some cases), they may be semantically distinct from the legacy events > even if the names match, and it is incorrect to conflate the two. This was a behavior added by Intel so that, say, cpu_atom/legacy-event/ would only open as a hardware event on that PMU. The point of the blamed change is to make that behavior consistent for all core PMUs. > > > I expect that if I ask for ${pmu}/${event}/, that PMU is used, and the event > > > *in that PMU's namespace* is used. Overriding that breaks long-established > > > practice and provides users with no recourse to get the behaviour they expect > > > (and previously had). > > > > On ARM but not Intel.
> > As above, I don't think the CPU architecture matters here for the case that I'm > saying is broken. I think that regardless of CPU architecture (or for any > non-CPU PMU) it is semantically incorrect to convert a named-pmu event to a > legacy event. So perf's behavior has always been that legacy event priority is greater-than sysfs and json. The distinction here is that a core PMU is explicitly listed and it doesn't seem unreasonable to use core PMU names with legacy events, the behavior Intel added. > > > I do think that (regardless of whther this was the sematnic you intended) > > > silently overriding events with legacy events is a bug, and one we should fix. > > > As I mentioned in another reply, just because the events have the same name > > > does not mean that they are semantically the same, so we're liable to give > > > people the wrong numbers anyhow. > > > > > > Can we fix this? > > > > So I'd like to fix this, some things from various conversations: > > > > 1) we lack testing. Our testing relies on the sysfs of the machine > > being run on, which is better than nothing. I think ideally we'd have > > a collection of zipped up sysfs directories and then we could have a > > test that asserts on ARM you get the behavior you want. > > I agree we lack testing, and I'd be happy to help here going forwards, though I > don't think this is a prerequisite for fixing this issue. > > > 2) for RISC-V they want to make the legacy event matching something in > > user land to simplify the PMU driver. > > Ok; I see how this might be related, but it doesn't sound like a prerequisite > for fixing this issue -- there are plenty of people in this thread who can > test. > > > 3) I'd like to get rid of the PMU json interface. My idea is to > > convert json events/metrics into sysfs style files, zip these up and > > then link them into the perf binary. On Intel the json is 70% of the > > binary (7MB out of 10MB) and we may get this down to 3MB with this > > approach. 
The json lookup would need to incorporate the cpuid matching > > that currently exists. When we look up an event I'd like the approach > > to be like unionfs with a specified but configurable order. Users > > could provide directories of their own events/metrics for various > > PMUs, and then this approach could be used to help with (1). > > I can see how that might interact with whatever changes we make to fix this > issue, but this seems like a future aspiration, and not a prerequisite for > fixing the existing functional regression. > > > Those proposals are not something to add as a -rc fix, so what I think > > you're asking for here is a "if ARM" fix somewhere in the event > > parsing. That's of course possible but it will cause problems if you > > did say: > > > > perf stat -e arm_pmu/LLC-load-misses/ ... > > As above, I do not think this is an arm-specific issue, we're just the canary > in the coalmine. Disagree, see comments above. A behavior change here would impact Intel. > Please note that: > > perf stat -e arm_pmu/LLC-load-misses/ ... > > ... would never have worked previously. No arm_pmu instances have a > "LLC-load-misses" event in their event namespaces, and we don't have any > userspace file mapping that event. This event was for the purpose of giving an example, perf list will show you events that work. The point is that a legacy event may not be available on both BIG.little PMU types so being able to designate the PMU there is helpful. > That said, If I really wanted that legacy event, I'd have asked for it bare, > e.g. > > perf stat -e LLC-load-misses > > ... and we're in agreement that it's sensible to expand this to multiple > PERF_TYPE_HARDWARE events targeting the individual CPU PMUs. > > So I see no need to do anything to have magic for 'arm_pmu/LLC-load-misses/'. 
> > > as I doubt the PMU driver is advertising this legacy event in sysfs > > and the "if ARM" logic would presumably be trying to disable legacy > > events in the term list for the ARM PMU. > > > > Given all of this, is anything actually broken and needing a fix for 6.7? > > There is absolutely a bug that needs to be fixed here (and needs to be > backported to stable so that it gets picked up by distributions). I'm not seeing this. The behavior is consistent with Intel, this has gone 2 releases without being spotted, it was triggered by a PMU event name aliasing a legacy event name and the behavior has always been legacy event names have higher priority than sysfs and json events. Whilst I'm seeing a lot of complaining, I've not seen a proposal of what behavior you want. Isn't it a PMU bug if the legacy event specifying the PMU doesn't get opened by the core PMU? Fixing the PMU driver appears to be the right fix and means there is consistency on core events across architectures. Thanks, Ian > Thanks, > Mark. ^ permalink raw reply [flat|nested] 53+ messages in thread
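The disagreement in the two messages above comes down to which source wins when a name such as "cycles" exists both as a legacy event and in a PMU's sysfs namespace. The toy lookup below makes the order an explicit parameter so the two positions can be compared side by side. It is a sketch only: lookup(), the order tuples and the event tables are hypothetical and do not reflect perf's actual data structures; the sysfs event names are the ones listed in this thread.

```python
# Legacy names known to the tool (illustrative subset).
LEGACY_NAMES = {"cycles", "instructions", "LLC-load-misses"}

# Events advertised per PMU in sysfs, as reported in this thread.
SYSFS_EVENTS = {
    "apple_icestorm_pmu": {"cycles"},
    "armv8_cortex_a53": {"cpu_cycles", "bus_cycles"},
}

LEGACY_FIRST = ("legacy", "sysfs")   # legacy names take priority
PMU_NAMESPACE_ONLY = ("sysfs",)      # named PMU: its own namespace, no fallback


def lookup(pmu, name, order, sysfs_events=SYSFS_EVENTS, legacy_names=LEGACY_NAMES):
    """Return (source, name) for the first source in `order` that knows
    `name` on `pmu`, or None if no source in `order` matches."""
    for source in order:
        if source == "legacy" and name in legacy_names:
            return ("legacy", name)
        if source == "sysfs" and name in sysfs_events.get(pmu, set()):
            return ("sysfs", name)
    return None
```

With LEGACY_FIRST, apple_icestorm_pmu/cycles/ resolves to the legacy event, which matches the behaviour reported since v6.5. With PMU_NAMESPACE_ONLY it resolves to the sysfs event, and armv8_cortex_a53/cycles/ resolves to nothing.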
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 15:29 ` Ian Rogers @ 2023-11-22 16:08 ` Mark Rutland 2023-11-22 16:29 ` Ian Rogers 0 siblings, 1 reply; 53+ messages in thread From: Mark Rutland @ 2023-11-22 16:08 UTC (permalink / raw) To: Ian Rogers Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Wed, Nov 22, 2023 at 07:29:34AM -0800, Ian Rogers wrote: > On Wed, Nov 22, 2023 at 5:04 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Tue, Nov 21, 2023 at 08:38:45AM -0800, Ian Rogers wrote: > > > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > > > > > On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > > > > > > > On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > > > > > > > > On Tue, 21 Nov 2023 13:40:31 +0000, > > > > > > > > > Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > [Adding key people on Cc] > > > > > > > > > > > > > > > > > > > > On Tue, 21 Nov 2023 12:08:48 +0000, > > > > > > > > > > Hector Martin <marcan@marcan.st> wrote: > > > > > > > > > > > > > > > > > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > > > > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > > > > > > > > > > > > > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > > > > > > > > asymmetric ARM platform. It isn't clear what criteria is used to pick > > > > > > > > > > the PMU, but nothing works anymore. 
> > > > > > > > > > > > > > > > > > > > The saving grace in my case is that Debian still ships a 6.1 perftool > > > > > > > > > > package, but that's obviously not going to last. > > > > > > > > > > > > > > > > > > > > I'm happy to test potential fixes. > > > > > > > > > > > > > > > > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > > > > > > > > -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > > > > > > > > CPU): > > > > > > > > > > > > > > > > IIUC the tool is doing the wrong thing here and overriding explicit > > > > > > > > ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > > > > > > > > that ${pmu}'s type and event namespace. > > > > > > > > > > > > > > > > Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > > > > > > > > targetted to a specific PMU, it's semantically wrong to rewrite events like > > > > > > > > this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > > > > > > > > PERF_COUNT_HW_${EVENT}. > > > > > > > > > > > > > > If you name a PMU and an event then the event should only be opened on > > > > > > > that PMU, 100% agree. There's a bunch of output, but when the legacy > > > > > > > cycles event is opened it appears to be because it was explicitly > > > > > > > requested. > > > > > > > > > > > > I think you've missed that the named PMU events are being erreously transformed > > > > > > into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. 
> > > > > > > > > > > > Opening: apple_firestorm_pmu/cycles/ > > > > > > ------------------------------------------------------------ > > > > > > perf_event_attr: > > > > > > type 0 (PERF_TYPE_HARDWARE) > > > > > > size 136 > > > > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > > > > sample_type IDENTIFIER > > > > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > > > > disabled 1 > > > > > > inherit 1 > > > > > > enable_on_exec 1 > > > > > > exclude_guest 1 > > > > > > ------------------------------------------------------------ > > > > > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > > > > > > > > > > > ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. > > > > > > > > > > > > Marc said that he bisected the issue down to commit: > > > > > > > > > > > > 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") > > > > > > > > > > > > ... so it looks like something is going wrong when the events are being parsed, > > > > > > e.g. losing the HW PMU information? > > > > > > > > > > Ok, I think I'm getting confused by other things. This looks like the issue. > > > > > > > > > > I think it may be working as intended, but not how you intended :-) If > > > > > a core PMU is listed and then a legacy event, the legacy event should > > > > > be opened on the core PMU as a legacy event with the extended type > > > > > set. This is to allow things like legacy cache events to be opened on > > > > > a specified PMU. Legacy event names match with a higher priority than > > > > > those in sysfs or json as they are hard coded. > > > > > > > > That has never been the case previously, so this is user-visible breakage, and > > > > it prevents users from being able to do the right thing, so I think that's a > > > > broken design. > > > > > > So the problem was caused by ARM and Intel doing two different things. 
> > > Intel did at least contribute to the perf tool in support for their > > > BIG.little/hybrid, so that's why the semantics match their approach. > > > > I appreciate that, and I agree that from the Arm side we haven't been as > > engaged with userspace on this front (please understand I'm the messenger here, > > this is something I've repeatedly asked for within Arm). > > > > Regardless, I don't think that changes the substance of the bug, which is that > > we're converting named-pmu events into entirely different PERF_TYPE_HARDWARE > > events. > > > > I agree that expanding plain legacy event names to a set of PMU-tagetted legacy > > events makes sense (and even for Arm, that's the right thing to do, IMO). If > > I ask for 'cycles' and that gets expanded to multiple legacy cycles events that > > target specific CPU PMUs, that's good. > > > > The thing that doesn't make sense here is converting named-pmu events into > > egacy events. If I ask for 'apple_firestorm_pmu/cycles/', that should be the > > 'cycles' event in the apple_firestorm_pmu's event namespace, and *shouldn't* be > > converted to a (potentially semantically different) PERF_TYPE_HARDWARE event, > > even if that's targetted towards the apple_firestorm_pmu. I think that should > > be true for *any* PMU, whether thats an arm/x86/whatever CPU PMU or a system > > PMU. > > This is saying that legacy events are lower than system events. We > don't do this historically and as it requires extra PMU set up. 
On an
> Intel Tigerlake:
>
> ```
> $ ls /sys/devices/cpu/events
> branch-instructions  cache-misses      instructions  ref-cycles        topdown-be-bound
> branch-misses        cache-references  mem-loads     slots             topdown-fe-bound
> bus-cycles           cpu-cycles        mem-stores    topdown-bad-spec  topdown-retiring
> ```
> here (at least) branch-misses, bus-cycles, cache-references,
> cpu-cycles and instructions overlap with legacy event names
> ```
> $ perf --version
> perf version 6.5.6
> $ perf stat -vv -e branch-misses,bus-cycles,cache-references,cpu-cycles,instructions true

Here you *aren't* using a named PMU. As I said before, using the
PERF_TYPE_HARDWARE events in this case is entirely fine; it's just the
${pmu}/${eventname}/ case that I'm saying should use the PMU's namespace,
which was historically the case, and is what users are depending upon.

i.e.

	perf stat -e cycles ./workload

... can/should use PERF_TYPE_HARDWARE events, as it used to.

However:

	perf stat -e ${pmu}/cycles/ ./workload

... should use the PMU's namespaced events, as it used to.

> Using CPUID GenuineIntel-6-8D-1
> intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
>   type                             0 (PERF_TYPE_HARDWARE)
>   size                             136
>   config                           0x5 (PERF_COUNT_HW_BRANCH_MISSES)
> ...
> ------------------------------------------------------------
> perf_event_attr:
>   type                             0 (PERF_TYPE_HARDWARE)
>   size                             136
>   config                           0x6 (PERF_COUNT_HW_BUS_CYCLES)
> ...
> ------------------------------------------------------------
> perf_event_attr:
>   type                             0 (PERF_TYPE_HARDWARE)
>   size                             136
>   config                           0x2 (PERF_COUNT_HW_CACHE_REFERENCES)
> ...
> ------------------------------------------------------------
> perf_event_attr:
>   type                             0 (PERF_TYPE_HARDWARE)
>   size                             136
>   config                           0 (PERF_COUNT_HW_CPU_CYCLES)
> ...
> ------------------------------------------------------------
> perf_event_attr:
>   type                             0 (PERF_TYPE_HARDWARE)
>   size                             136
>   config                           0x1 (PERF_COUNT_HW_INSTRUCTIONS)
> ...
> branch-misses: -1: 6571 826226 826226
> bus-cycles: -1: 31411 826226 826226
> cache-references: -1: 19507 826226 826226
> cpu-cycles: -1: 1127215 826226 826226
> instructions: -1: 1301583 826226 826226
> branch-misses: 6571 826226 826226
> bus-cycles: 31411 826226 826226
> cache-references: 19507 826226 826226
> cpu-cycles: 1127215 826226 826226
> instructions: 1301583 826226 826226
>
>  Performance counter stats for 'true':
> ...
> ```
> ie perf 6.5 and all events even though sysfs has events we're opening
> them with PERF_TYPE_HARDWARE.

As above, this is a different case.

>
> > > > > Presumably the expectation was that by advertising a cycles event, presumably
> > > > > in sysfs, then this is what would be matched.
> >
> > Yes. That's how this has always worked prior to the changes Marc referenced.
> > Note that this can *also* be expanded to events from json databases, but was
> > *never* previously silently converted to a PERF_TYPE_HARDWARE event.
> >
> > Please note that the events in sysfs are *namespaced* to the PMU (specifically,
> > when using that PMU's dynamic type); they are not necessarily the same as
> > legacy events (though they may have similar or matching
> > names in some cases), they may be semantically distinct from the legacy events
> > even if the names match, and it is incorrect to conflate the two.
>
> This was a behavior added by Intel so that say cpu_atom/legacy-event/
> would only open as a hardware event on that PMU. The point of the
> blamed change is to make that behavior consistent for all core PMUs.

Ok, so Intel has an Intel-specific behaviour change, which was ok for them.

That was made generic, but caused a functional regression on arm (and possibly
other architectures if anyone else cares about the namespaced events).
Why can't this be returned to being x86 specific?

> > > > I expect that if I ask for ${pmu}/${event}/, that PMU is used, and the event
> > > > *in that PMU's namespace* is used. Overriding that breaks long-established
> > > > practice and provides users with no recourse to get the behaviour they expect
> > > > (and previously had).
> > >
> > > On ARM but not Intel.
> >
> > As above, I don't think the CPU architecture matters here for the case that I'm
> > saying is broken. I think that regardless of CPU architecture (or for any
> > non-CPU PMU) it is semantically incorrect to convert a named-pmu event to a
> > legacy event.
>
> So perf's behavior has always been that legacy event priority is
> greater-than sysfs and json. The distinction here is that a core PMU
> is explicitly listed and it doesn't seem unreasonable to use core PMU
> names with legacy events, the behavior Intel added.

That may be ok for Intel, but given it *is* causing functional problems for
others, why must it remain generic?

> > > > I do think that (regardless of whether this was the semantic you intended)
> > > > silently overriding events with legacy events is a bug, and one we should fix.
> > > > As I mentioned in another reply, just because the events have the same name
> > > > does not mean that they are semantically the same, so we're liable to give
> > > > people the wrong numbers anyhow.
> > > >
> > > > Can we fix this?
> > >
> > > So I'd like to fix this, some things from various conversations:
> > >
> > > 1) we lack testing. Our testing relies on the sysfs of the machine
> > > being run on, which is better than nothing. I think ideally we'd have
> > > a collection of zipped up sysfs directories and then we could have a
> > > test that asserts on ARM you get the behavior you want.
> >
> > I agree we lack testing, and I'd be happy to help here going forwards, though I
> > don't think this is a prerequisite for fixing this issue.
> > > > > 2) for RISC-V they want to make the legacy event matching something in > > > user land to simplify the PMU driver. > > > > Ok; I see how this might be related, but it doesn't sound like a prerequisite > > for fixing this issue -- there are plenty of people in this thread who can > > test. > > > > > 3) I'd like to get rid of the PMU json interface. My idea is to > > > convert json events/metrics into sysfs style files, zip these up and > > > then link them into the perf binary. On Intel the json is 70% of the > > > binary (7MB out of 10MB) and we may get this down to 3MB with this > > > approach. The json lookup would need to incorporate the cpuid matching > > > that currently exists. When we look up an event I'd like the approach > > > to be like unionfs with a specified but configurable order. Users > > > could provide directories of their own events/metrics for various > > > PMUs, and then this approach could be used to help with (1). > > > > I can see how that might interact with whatever changes we make to fix this > > issue, but this seems like a future aspiration, and not a prerequisite for > > fixing the existing functional regression. > > > > > Those proposals are not something to add as a -rc fix, so what I think > > > you're asking for here is a "if ARM" fix somewhere in the event > > > parsing. That's of course possible but it will cause problems if you > > > did say: > > > > > > perf stat -e arm_pmu/LLC-load-misses/ ... > > > > As above, I do not think this is an arm-specific issue, we're just the canary > > in the coalmine. > > Disagree, see comments above. A behavior change here would impact Intel. Ok, so have Intel keep the Intel behaviour? > > Please note that: > > > > perf stat -e arm_pmu/LLC-load-misses/ ... > > > > ... would never have worked previously. No arm_pmu instances have a > > "LLC-load-misses" event in their event namespaces, and we don't have any > > userspace file mapping that event. 
>
> This event was for the purpose of giving an example, perf list will
> show you events that work. The point is that a legacy event may not be
> available on both BIG.little PMU types so being able to designate the
> PMU there is helpful.

Sure, but (as per my reply to Arnaldo), it's possible to add an unambiguous way
to specify that, e.g. a 'hw:' prefix like:

	some_arm_pmu/hw:LLC-load-misses/

... which wouldn't clash and cause the regression that users are seeing.

> > That said, if I really wanted that legacy event, I'd have asked for it bare,
> > e.g.
> >
> >   perf stat -e LLC-load-misses
> >
> > ... and we're in agreement that it's sensible to expand this to multiple
> > PERF_TYPE_HARDWARE events targeting the individual CPU PMUs.
> >
> > So I see no need to do anything to have magic for 'arm_pmu/LLC-load-misses/'.
> >
> > > as I doubt the PMU driver is advertising this legacy event in sysfs
> > > and the "if ARM" logic would presumably be trying to disable legacy
> > > events in the term list for the ARM PMU.
> > >
> > > Given all of this, is anything actually broken and needing a fix for 6.7?
> >
> > There is absolutely a bug that needs to be fixed here (and needs to be
> > backported to stable so that it gets picked up by distributions).
>
> I'm not seeing this. The behavior is consistent with Intel, this has
> gone 2 releases without being spotted,

This has gone two releases because people have just updated their tools. The
prior behaviour for Arm has been there for most of a decade.

> it was triggered by a PMU event
> name aliasing a legacy event name and the behavior has always been
> legacy event names have higher priority than sysfs and json events.

That has been the case for plain events without a PMU name. That was never the
case for events with a PMU name, or there would not have been any difference in
behaviour.

> Whilst I'm seeing a lot of complaining, I've not seen a proposal of
> what behavior you want.
As per my initial reply, the behaviour we want is that:

	pmu/eventname/

... opens 'eventname' in that PMU's event namespace, rather than converting the
event into a PERF_TYPE_HARDWARE event. That was the prior behaviour, which
people have been using for most of a decade.

I understand that there was some Intel-specific behaviour, and that may need to
be kept for Intel. Making that behaviour generic broke other existing users.

If we need a mechanism to target a legacy event to a specific PMU, we can add
an unambiguous way of describing that (e.g. the 'hw:' prefix I've suggested a
few times).

> Isn't it a PMU bug if the legacy event specifying the PMU doesn't get opened
> by the core PMU?

No?

Prior to that mechanism being added to the kernel, there was no way to do that.

When the mechanism was added to x86 specifically, it wasn't a generic feature.

> Fixing the PMU driver appears to be the right fix and means there is
> consistency on core events across architectures.

I think that's orthogonal.

Adding support to the PMU drivers (which has already been done, per the commit
you quoted before) is good so that userspace can do the right thing for:

	perf stat -e some_generic_event ./workload

... but that should not be necessary to retain the existing behaviour for:

	perf stat -e pmu/some_similarly_named_event/ ./workload

Thanks,
Mark.
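The 'hw:' prefix mentioned above is only a proposal made in this thread; nothing here indicates that perf implements it. As a minimal sketch of how a term inside pmu/<term>/ could be classified under that proposal, assuming hypothetical names (`classify_term`, `struct parsed_term`, `enum term_kind`) that are not part of perf:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of one term inside pmu/<term>/. */
enum term_kind {
	TERM_PMU_NAMESPACE,	/* look the name up in the PMU's own events */
	TERM_LEGACY_HW,		/* explicit "hw:" prefix: legacy event on that PMU */
};

struct parsed_term {
	enum term_kind kind;
	const char *name;	/* points into the caller's string, past any prefix */
};

/*
 * Only an explicit, non-empty "hw:<name>" selects a legacy event, so a bare
 * name such as "cycles" always stays in the PMU's namespace.
 */
static struct parsed_term classify_term(const char *term)
{
	static const char prefix[] = "hw:";
	const size_t len = sizeof(prefix) - 1;
	struct parsed_term out = { TERM_PMU_NAMESPACE, term };

	if (strncmp(term, prefix, len) == 0 && term[len] != '\0') {
		out.kind = TERM_LEGACY_HW;
		out.name = term + len;
	}
	return out;
}
```

Under this rule, some_arm_pmu/cycles/ would stay in the PMU's namespace, while some_arm_pmu/hw:LLC-load-misses/ would explicitly request the legacy event on that PMU.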
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 16:08 ` Mark Rutland @ 2023-11-22 16:29 ` Ian Rogers 2023-11-22 16:55 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 53+ messages in thread From: Ian Rogers @ 2023-11-22 16:29 UTC (permalink / raw) To: Mark Rutland Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Wed, Nov 22, 2023 at 8:08 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Wed, Nov 22, 2023 at 07:29:34AM -0800, Ian Rogers wrote: > > On Wed, Nov 22, 2023 at 5:04 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > On Tue, Nov 21, 2023 at 08:38:45AM -0800, Ian Rogers wrote: > > > > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote: > > > > > > On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote: > > > > > > > > On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > > > > > > > > > On Tue, 21 Nov 2023 13:40:31 +0000, > > > > > > > > > > Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > [Adding key people on Cc] > > > > > > > > > > > > > > > > > > > > > > On Tue, 21 Nov 2023 12:08:48 +0000, > > > > > > > > > > > Hector Martin <marcan@marcan.st> wrote: > > > > > > > > > > > > > > > > > > > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > > > > > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > > > > > > > > > > > > > > > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > > > > > > > > > asymmetric ARM platform. It isn't clear what criteria is used to pick > > > > > > > > > > > the PMU, but nothing works anymore. 
> > > > > > > > > > > > > > > > > > > > > > The saving grace in my case is that Debian still ships a 6.1 perftool > > > > > > > > > > > package, but that's obviously not going to last. > > > > > > > > > > > > > > > > > > > > > > I'm happy to test potential fixes. > > > > > > > > > > > > > > > > > > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > > > > > > > > > -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > > > > > > > > > CPU): > > > > > > > > > > > > > > > > > > IIUC the tool is doing the wrong thing here and overriding explicit > > > > > > > > > ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using > > > > > > > > > that ${pmu}'s type and event namespace. > > > > > > > > > > > > > > > > > > Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be > > > > > > > > > targetted to a specific PMU, it's semantically wrong to rewrite events like > > > > > > > > > this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named > > > > > > > > > PERF_COUNT_HW_${EVENT}. > > > > > > > > > > > > > > > > If you name a PMU and an event then the event should only be opened on > > > > > > > > that PMU, 100% agree. There's a bunch of output, but when the legacy > > > > > > > > cycles event is opened it appears to be because it was explicitly > > > > > > > > requested. > > > > > > > > > > > > > > I think you've missed that the named PMU events are being erreously transformed > > > > > > > into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g. 
> > > > > > > > > > > > > > Opening: apple_firestorm_pmu/cycles/ > > > > > > > ------------------------------------------------------------ > > > > > > > perf_event_attr: > > > > > > > type 0 (PERF_TYPE_HARDWARE) > > > > > > > size 136 > > > > > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > > > > > sample_type IDENTIFIER > > > > > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > > > > > disabled 1 > > > > > > > inherit 1 > > > > > > > enable_on_exec 1 > > > > > > > exclude_guest 1 > > > > > > > ------------------------------------------------------------ > > > > > > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > > > > > > > > > > > > > ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES. > > > > > > > > > > > > > > Marc said that he bisected the issue down to commit: > > > > > > > > > > > > > > 5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms") > > > > > > > > > > > > > > ... so it looks like something is going wrong when the events are being parsed, > > > > > > > e.g. losing the HW PMU information? > > > > > > > > > > > > Ok, I think I'm getting confused by other things. This looks like the issue. > > > > > > > > > > > > I think it may be working as intended, but not how you intended :-) If > > > > > > a core PMU is listed and then a legacy event, the legacy event should > > > > > > be opened on the core PMU as a legacy event with the extended type > > > > > > set. This is to allow things like legacy cache events to be opened on > > > > > > a specified PMU. Legacy event names match with a higher priority than > > > > > > those in sysfs or json as they are hard coded. > > > > > > > > > > That has never been the case previously, so this is user-visible breakage, and > > > > > it prevents users from being able to do the right thing, so I think that's a > > > > > broken design. > > > > > > > > So the problem was caused by ARM and Intel doing two different things. 
> > > > Intel did at least contribute to the perf tool in support for their > > > > BIG.little/hybrid, so that's why the semantics match their approach. > > > > > > I appreciate that, and I agree that from the Arm side we haven't been as > > > engaged with userspace on this front (please understand I'm the messenger here, > > > this is something I've repeatedly asked for within Arm). > > > > > > Regardless, I don't think that changes the substance of the bug, which is that > > > we're converting named-pmu events into entirely different PERF_TYPE_HARDWARE > > > events. > > > > > > I agree that expanding plain legacy event names to a set of PMU-tagetted legacy > > > events makes sense (and even for Arm, that's the right thing to do, IMO). If > > > I ask for 'cycles' and that gets expanded to multiple legacy cycles events that > > > target specific CPU PMUs, that's good. > > > > > > The thing that doesn't make sense here is converting named-pmu events into > > > egacy events. If I ask for 'apple_firestorm_pmu/cycles/', that should be the > > > 'cycles' event in the apple_firestorm_pmu's event namespace, and *shouldn't* be > > > converted to a (potentially semantically different) PERF_TYPE_HARDWARE event, > > > even if that's targetted towards the apple_firestorm_pmu. I think that should > > > be true for *any* PMU, whether thats an arm/x86/whatever CPU PMU or a system > > > PMU. > > > > This is saying that legacy events are lower than system events. We > > don't do this historically and as it requires extra PMU set up. 
On an > > Intel Tigerlake: > > > > ``` > > $ ls /sys/devices/cpu/events > > branch-instructions cache-misses instructions ref-cycles > > topdown-be-bound > > branch-misses cache-references mem-loads slots > > topdown-fe-bound > > bus-cycles cpu-cycles mem-stores topdown-bad-spec > > topdown-retiring > > ``` > > here (at least) branch-misses, bus-cycles, cache-references, > > cpu-cycles and instructions overlap with legacy event names > > ``` > > $ perf --version > > perf version 6.5.6 > > $ perf stat -vv -e branch-misses,bus-cycles,cache-references,cp > > u-cycles,instructions true > > Here you *aren't using a named PMU. As I said before, using the > PERF_TYPE_HARDWARE events in this case is entriely fine, it's just the > ${pmu}/${eventname}/ case that I'm saying should use the PMU's namespace, > which was historically the case, and is what users are depending upon. > > i.e. > > perf stat -e cycles ./workload > > ... can/should use PERF_TYPE_HARDWARE events, as it used to > > However: > > perf srtat -e ${pmu}/cycles/ ./workload > > ... should use the PMU's namespaced events, as it used to > > > Using CPUID GenuineIntel-6-8D-1 > > intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch > > Control descriptor is not initialized > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0x5 (PERF_COUNT_HW_BRANCH_MISSES) > > ... > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0x6 (PERF_COUNT_HW_BUS_CYCLES) > > ... > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0x2 (PERF_COUNT_HW_CACHE_REFERENCES) > > ... > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > ... 
> > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0x1 (PERF_COUNT_HW_INSTRUCTIONS) > > ... > > branch-misses: -1: 6571 826226 826226 > > bus-cycles: -1: 31411 826226 826226 > > cache-references: -1: 19507 826226 826226 > > cpu-cycles: -1: 1127215 826226 826226 > > instructions: -1: 1301583 826226 826226 > > branch-misses: 6571 826226 826226 > > bus-cycles: 31411 826226 826226 > > cache-references: 19507 826226 826226 > > cpu-cycles: 1127215 826226 826226 > > instructions: 1301583 826226 826226 > > > > Performance counter stats for 'true': > > ... > > ``` > > ie perf 6.5 and all events even though sysfs has events we're opening > > them with PERF_TYPE_HARDWARE. > > As above, this is a different case. > > > > > > > > > Presumably the expectation was that by advertising a cycles event, presumably > > > > > > in sysfs, then this is what would be matched. > > > > > > Yes. That's how this has always worked prior to the changes Marc referenced. > > > Note that this can *also* be expaned to events from json databases, but was > > > *never* previously silently converted to a PERF_TYPE_HARDWARE event. > > > > > > Please note that the events in sysfs are *namespaced* to the PMU (specifically, > > > when using that PMU's dynamic type); they are not necessarily the same as > > > legacy events (though they may have similar or matching > > > names in some cases), they may be semantically distinct from the legacy events > > > even if the names match, and it is incorrect to conflate the two. > > > > This was a behavior added by Intel so that say cpu_atom/legacy-event/ > > would only open as a hardware event on that PMU. The point of the > > blamed change is to make that behavior consistent for all core PMUs. > > Ok, so Intel has an intel-specific behaviour change, which was ok for them. 
> > That was made generic, but cause d a functional regression on arm (and possibly > other architectures if anyone else cares about the namespaced events). > > Why can't this be rteturned to being x86 specific? > > > > > > I expect that if I ask for ${pmu}/${event}/, that PMU is used, and the event > > > > > *in that PMU's namespace* is used. Overriding that breaks long-established > > > > > practice and provides users with no recourse to get the behavioru they expect > > > > > (and previosuly had). > > > > > > > > On ARM but not Intel. > > > > > > As above, I don't think the CPU architecture matters here for the case that I'm > > > saying is broken. I think that regardless of CPU architecture (or for any > > > non-CPU PMU) it is semantically incorrect to convert a named-pmu event to a > > > legacy event. > > > > So perf's behavior has always been that legacy event priority is > > greater-than sysfs and json. The distinction here is that a core PMU > > is explicitly listed and it doesn't seem unreasonable to use core PMU > > names with legacy events, the behavior Intel added. > > That may be ok for Intel, but given it *is* causing functional probelsm for > others, why must it remain generic? > > > > > > I do think that (regardless of whther this was the sematnic you intended) > > > > > silently overriding events with legacy events is a bug, and one we should fix. > > > > > As I mentioned in another reply, just because the events have the same name > > > > > does not mean that they are semantically the same, so we're liable to give > > > > > people the wrong numbers anyhow. > > > > > > > > > > Can we fix this? > > > > > > > > So I'd like to fix this, some things from various conversations: > > > > > > > > 1) we lack testing. Our testing relies on the sysfs of the machine > > > > being run on, which is better than nothing. 
I think ideally we'd have > > > > a collection of zipped up sysfs directories and then we could have a > > > > test that asserts on ARM you get the behavior you want. > > > > > > I agree we lack testing, and I'd be happy to help here going forwards, though I > > > don't think this is a prerequisite for fixing this issue. > > > > > > > 2) for RISC-V they want to make the legacy event matching something in > > > > user land to simplify the PMU driver. > > > > > > Ok; I see how this might be related, but it doesn't sound like a prerequisite > > > for fixing this issue -- there are plenty of people in this thread who can > > > test. > > > > > > > 3) I'd like to get rid of the PMU json interface. My idea is to > > > > convert json events/metrics into sysfs style files, zip these up and > > > > then link them into the perf binary. On Intel the json is 70% of the > > > > binary (7MB out of 10MB) and we may get this down to 3MB with this > > > > approach. The json lookup would need to incorporate the cpuid matching > > > > that currently exists. When we look up an event I'd like the approach > > > > to be like unionfs with a specified but configurable order. Users > > > > could provide directories of their own events/metrics for various > > > > PMUs, and then this approach could be used to help with (1). > > > > > > I can see how that might interact with whatever changes we make to fix this > > > issue, but this seems like a future aspiration, and not a prerequisite for > > > fixing the existing functional regression. > > > > > > > Those proposals are not something to add as a -rc fix, so what I think > > > > you're asking for here is a "if ARM" fix somewhere in the event > > > > parsing. That's of course possible but it will cause problems if you > > > > did say: > > > > > > > > perf stat -e arm_pmu/LLC-load-misses/ ... > > > > > > As above, I do not think this is an arm-specific issue, we're just the canary > > > in the coalmine. > > > > Disagree, see comments above. 
A behavior change here would impact Intel. > > Ok, so have Intel keep the Intel behaviour? > > > > Please note that: > > > > > > perf stat -e arm_pmu/LLC-load-misses/ ... > > > > > > ... would never have worked previously. No arm_pmu instances have a > > > "LLC-load-misses" event in their event namespaces, and we don't have any > > > userspace file mapping that event. > > > > This event was for the purpose of giving an example, perf list will > > show you events that work. The point is that a legacy event may not be > > available on both BIG.little PMU types so being able to designate the > > PMU there is helpful. > > Sure, but (as per my reply to Arnaldo), it's possible to add an unambiguous way > to specify that, e.g a 'hw:' prefix like: > > some_arm_pmu/hw:LLC-load-misses/ > > ... which wouldn't clash and cause hte regression that users are seing. > > > > That said, If I really wanted that legacy event, I'd have asked for it bare, > > > e.g. > > > > > > perf stat -e LLC-load-misses > > > > > > ... and we're in agreement that it's sensible to expand this to multiple > > > PERF_TYPE_HARDWARE events targeting the individual CPU PMUs. > > > > > > So I see no need to do anything to have magic for 'arm_pmu/LLC-load-misses/'. > > > > > > > as I doubt the PMU driver is advertising this legacy event in sysfs > > > > and the "if ARM" logic would presumably be trying to disable legacy > > > > events in the term list for the ARM PMU. > > > > > > > > Given all of this, is anything actually broken and needing a fix for 6.7? > > > > > > There is absolutely a bug that needs to be fixed here (and needs to be > > > backported to stable so that it gets picked up by distributions). > > > > I'm not seeing this. The behavior is consistent with Intel, this has > > gone 2 releases without being spotted, > > This has gone two releases because people has just updated their tools. The > prior behaviour for Arm has been there for most of a decade. 
> > > it was triggered by a PMU event > > name aliasing a legacy event name and the behavior has always been > > legacy event names have higher priority than sysfs and json events. > > That has been the case for plain events without a PMU name. That was never the > case for events with a PMU name, or there would not have been any difference in > behaviour. > > > Whilst I'm seeing a lot of complaining, I've not seen a proposal of > > what behavior you want. > > As per my initial reply the bevaiour we want is that: > > pmu/eventname/ > > ... opens 'eventname' in that PMU's event namespace, rather than converting the > event into a PERF_TYPE_HARDWARE event. That was the prior behaviour, which > people have been using for most of a decade. > > I understand that there was some Intel-specific behaviour, and that may need to > be kept for Intel. Making that behaviour generic broke other existing users. > > If we need a mechanism to target a legacy event to a specific PMU, we can add > an unambiguous way of descirbing that (e.g. the 'hw:' prefix I've suggested a > few times). > > > > Isn't it a PMU bug if the legacy event specifying the PMU doesn't get opened > > by the core PMU? > > No? > > Prior to that mechanism being added to the kernel, there was no way to do that. > > When the mechanism was added to x86 specifically, it wasn't a generic feature. > > > Fixing the PMU driver appears to be the right fix and means there is > > consistency on core events across architectures. > > I think that's orthogonal. > > Adding support to the PMU drivers (which has already been done, per the commit > you quoted before) is good so that userspace can do the right thing for: > > perf stat -e some_generic_event ./workload > > ... but that should not be necessary to retain the existing behaviour for: > > perf stat -e pmu/some_similarly_named_event/ ./workload > > Thanks, > Mark. Given the PMU mapping exists, what is the difficulty in the case of this PMU? 
I could explain what I see on ARMv8 devices and the broken PMU landscape from
the last 10 years but that hardly feels constructive here. I don't understand
the difficulty of translating:

struct perf_event_attr {
	...
	.type = PERF_TYPE_HARDWARE,
	.config = <pmu's type> << 32 | PERF_COUNT_HW_CPU_CYCLES,
	...
}

to the event called "cycles" that the PMU is advertising, given that the
mapping already has to exist for every core PMU driver.

I can look at doing an event parser change like:

```
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index aa2f5c6fc7fc..9a18fda525d2 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -986,7 +986,8 @@ static int config_term_pmu(struct perf_event_attr *attr,
 						   err_str, /*help=*/NULL);
 			return -EINVAL;
 		}
-		if (perf_pmu__supports_legacy_cache(pmu)) {
+		if (perf_pmu__supports_legacy_cache(pmu) &&
+		    !perf_pmu__have_event(pmu, term->val.str)) {
 			attr->type = PERF_TYPE_HW_CACHE;
 			return parse_events__decode_legacy_cache(term->config, pmu->type,
 								 &attr->config);
@@ -1004,10 +1005,15 @@ static int config_term_pmu(struct perf_event_attr *attr,
 						   err_str, /*help=*/NULL);
 			return -EINVAL;
 		}
-		attr->type = PERF_TYPE_HARDWARE;
-		attr->config = term->val.num;
-		if (perf_pmus__supports_extended_type())
-			attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT;
+		if (perf_pmu__have_event(pmu, term->val.str)) {
+			/* If the PMU has a sysfs or json event prefer it over legacy. ARM requires this. */
+			term->term_type = PARSE_EVENTS__TERM_TYPE_USER;
+		} else {
+			attr->type = PERF_TYPE_HARDWARE;
+			attr->config = term->val.num;
+			if (perf_pmus__supports_extended_type())
+				attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT;
+		}
 		return 0;
 	}
 	if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER ||
```
(note: this is incomplete as term->val.str isn't populated for
PARSE_EVENTS__TERM_TYPE_HARDWARE) but this is a behavioral change on
Intel and therefore shouldn't come in as an rc fix.
Thanks,
Ian
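The extended-type encoding written above as `<pmu's type> << 32 | PERF_COUNT_HW_CPU_CYCLES` can be sketched in C as follows. The constants mirror values from `<linux/perf_event.h>` but are redefined locally with an `EX_` prefix so the sketch builds without kernel headers, and the helper names are assumptions made purely for illustration:

```c
#include <stdint.h>

/* Local copies of UAPI values from <linux/perf_event.h> (EX_ prefix is ours). */
#define EX_PERF_TYPE_HARDWARE		0
#define EX_PERF_COUNT_HW_CPU_CYCLES	0
#define EX_PERF_PMU_TYPE_SHIFT		32

/* Pack a PMU's dynamic type into the upper 32 bits of a legacy hw config. */
static uint64_t ex_extended_hw_config(uint32_t pmu_type, uint32_t hw_id)
{
	return ((uint64_t)pmu_type << EX_PERF_PMU_TYPE_SHIFT) | hw_id;
}

/* Recover the PMU type that an extended config targets. */
static uint32_t ex_config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> EX_PERF_PMU_TYPE_SHIFT);
}

/* Recover the legacy hardware event id from the lower 32 bits. */
static uint32_t ex_config_hw_id(uint64_t config)
{
	return (uint32_t)(config & 0xffffffffu);
}
```

For a PMU whose dynamic type happens to be 7, the cycles event would then be opened with `type = PERF_TYPE_HARDWARE` and `config = 0x700000000`, which still names a legacy event rather than the PMU's own "cycles" event.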
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 16:29 ` Ian Rogers @ 2023-11-22 16:55 ` Arnaldo Carvalho de Melo 2023-11-22 16:59 ` Ian Rogers 0 siblings, 1 reply; 53+ messages in thread From: Arnaldo Carvalho de Melo @ 2023-11-22 16:55 UTC (permalink / raw) To: Ian Rogers Cc: Mark Rutland, Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux Em Wed, Nov 22, 2023 at 08:29:58AM -0800, Ian Rogers escreveu: > I can look at doing an event parser change like: > > ``` > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c > index aa2f5c6fc7fc..9a18fda525d2 100644 > --- a/tools/perf/util/parse-events.c > +++ b/tools/perf/util/parse-events.c > @@ -986,7 +986,8 @@ static int config_term_pmu(struct perf_event_attr *attr, > err_str, > /*help=*/NULL); > return -EINVAL; > } > - if (perf_pmu__supports_legacy_cache(pmu)) { > + if (perf_pmu__supports_legacy_cache(pmu) && > + !perf_pmu__have_event(pmu, term->val.str)) { > attr->type = PERF_TYPE_HW_CACHE; > return > parse_events__decode_legacy_cache(term->config, pmu->type, > &attr->config); > @@ -1004,10 +1005,15 @@ static int config_term_pmu(struct perf_event_attr *attr, > err_str, > /*help=*/NULL); > return -EINVAL; > } > - attr->type = PERF_TYPE_HARDWARE; > - attr->config = term->val.num; > - if (perf_pmus__supports_extended_type()) > - attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT; > + if (perf_pmu__have_event(pmu, term->val.str)) { > + /* If the PMU has a sysfs or json event prefer > it over legacy. ARM requires this. 
*/ > + term->term_type = PARSE_EVENTS__TERM_TYPE_USER; > + } else { > + attr->type = PERF_TYPE_HARDWARE; > + attr->config = term->val.num; > + if (perf_pmus__supports_extended_type()) > + attr->config |= (__u64)pmu->type << > PERF_PMU_TYPE_SHIFT; > + } > return 0; > } > if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER || > ``` > (note: this is incomplete as term->val.str isn't populated for > PARSE_EVENTS__TERM_TYPE_HARDWARE) Yeah, I had to apply manually as your MUA mangled it, then it didn't build, had to remove some consts, then there was a struct member mistake, after all fixed I get to the patch below, but it now segfaults, probably what you mention... root@roc-rk3399-pc:~# strace -e perf_event_open taskset -c 4,5 perf stat -v -e cycles,armv8_cortex_a53/cycles/,armv8_cortex_a72/cycles/ echo Using CPUID 0x00000000410fd082 perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES, sample_period=0, sample_type=0, read_format=0, disabled=1, precise_ip=0 /* arbitrary skid */, ...}, 0, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory) --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} --- +++ killed by SIGSEGV +++ Segmentation fault root@roc-rk3399-pc:~# - Arnaldo diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index aa2f5c6fc7fc..1e648454cc49 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -976,7 +976,7 @@ static int config_term_pmu(struct perf_event_attr *attr, struct parse_events_error *err) { if (term->type_term == PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE) { - const struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); + struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); if (!pmu) { char *err_str; @@ -986,7 +986,8 @@ static int config_term_pmu(struct perf_event_attr *attr, err_str, /*help=*/NULL); return -EINVAL; } - if (perf_pmu__supports_legacy_cache(pmu)) { + if 
(perf_pmu__supports_legacy_cache(pmu) && + !perf_pmu__have_event(pmu, term->val.str)) { attr->type = PERF_TYPE_HW_CACHE; return parse_events__decode_legacy_cache(term->config, pmu->type, &attr->config); @@ -994,7 +995,7 @@ static int config_term_pmu(struct perf_event_attr *attr, term->type_term = PARSE_EVENTS__TERM_TYPE_USER; } if (term->type_term == PARSE_EVENTS__TERM_TYPE_HARDWARE) { - const struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); + struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); if (!pmu) { char *err_str; @@ -1004,10 +1005,15 @@ static int config_term_pmu(struct perf_event_attr *attr, err_str, /*help=*/NULL); return -EINVAL; } - attr->type = PERF_TYPE_HARDWARE; - attr->config = term->val.num; - if (perf_pmus__supports_extended_type()) - attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT; + if (perf_pmu__have_event(pmu, term->val.str)) { + /* If the PMU has a sysfs or JSON event prefer it over legacy. ARM requires this. */ + term->type_term = PARSE_EVENTS__TERM_TYPE_USER; + } else { + attr->type = PERF_TYPE_HARDWARE; + attr->config = term->val.num; + if (perf_pmus__supports_extended_type()) + attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT; + } return 0; } if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER || ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 16:55 ` Arnaldo Carvalho de Melo @ 2023-11-22 16:59 ` Ian Rogers 2023-11-23 4:33 ` Ian Rogers 0 siblings, 1 reply; 53+ messages in thread From: Ian Rogers @ 2023-11-22 16:59 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Mark Rutland, Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Wed, Nov 22, 2023 at 8:55 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Wed, Nov 22, 2023 at 08:29:58AM -0800, Ian Rogers escreveu: > > I can look at doing an event parser change like: > > > > ``` > > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c > > index aa2f5c6fc7fc..9a18fda525d2 100644 > > --- a/tools/perf/util/parse-events.c > > +++ b/tools/perf/util/parse-events.c > > @@ -986,7 +986,8 @@ static int config_term_pmu(struct perf_event_attr *attr, > > err_str, > > /*help=*/NULL); > > return -EINVAL; > > } > > - if (perf_pmu__supports_legacy_cache(pmu)) { > > + if (perf_pmu__supports_legacy_cache(pmu) && > > + !perf_pmu__have_event(pmu, term->val.str)) { > > attr->type = PERF_TYPE_HW_CACHE; > > return > > parse_events__decode_legacy_cache(term->config, pmu->type, > > &attr->config); > > @@ -1004,10 +1005,15 @@ static int config_term_pmu(struct perf_event_attr *attr, > > err_str, > > /*help=*/NULL); > > return -EINVAL; > > } > > - attr->type = PERF_TYPE_HARDWARE; > > - attr->config = term->val.num; > > - if (perf_pmus__supports_extended_type()) > > - attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT; > > + if (perf_pmu__have_event(pmu, term->val.str)) { > > + /* If the PMU has a sysfs or json event prefer > > it over legacy. ARM requires this. 
*/ > > + term->term_type = PARSE_EVENTS__TERM_TYPE_USER; > > + } else { > > + attr->type = PERF_TYPE_HARDWARE; > > + attr->config = term->val.num; > > + if (perf_pmus__supports_extended_type()) > > + attr->config |= (__u64)pmu->type << > > PERF_PMU_TYPE_SHIFT; > > + } > > return 0; > > } > > if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER || > > ``` > > (note: this is incomplete as term->val.str isn't populated for > > PARSE_EVENTS__TERM_TYPE_HARDWARE) > > Yeah, I had to apply manually as your MUA mangled it, then it didn't > build, had to remove some consts, then there was a struct member > mistake, after all fixed I get to the patch below, but it now segfaults, > probably what you mention... > > root@roc-rk3399-pc:~# strace -e perf_event_open taskset -c 4,5 perf stat -v -e cycles,armv8_cortex_a53/cycles/,armv8_cortex_a72/cycles/ echo > Using CPUID 0x00000000410fd082 > perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES, sample_period=0, sample_type=0, read_format=0, disabled=1, precise_ip=0 /* arbitrary skid */, ...}, 0, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory) > --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} --- > +++ killed by SIGSEGV +++ > Segmentation fault > root@roc-rk3399-pc:~# Right, I have something further along that fails tests. I'll try to send out an RFC today, but given the Intel behavior change ¯\_(ツ)_/¯ But Intel don't appear to have an issue having two things called, for example, cycles and them both being a cycles event so they may not care. It is only ARM's PMUs that appear broken in this way. 
Thanks, Ian > - Arnaldo > > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c > index aa2f5c6fc7fc..1e648454cc49 100644 > --- a/tools/perf/util/parse-events.c > +++ b/tools/perf/util/parse-events.c > @@ -976,7 +976,7 @@ static int config_term_pmu(struct perf_event_attr *attr, > struct parse_events_error *err) > { > if (term->type_term == PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE) { > - const struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); > + struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); > > if (!pmu) { > char *err_str; > @@ -986,7 +986,8 @@ static int config_term_pmu(struct perf_event_attr *attr, > err_str, /*help=*/NULL); > return -EINVAL; > } > - if (perf_pmu__supports_legacy_cache(pmu)) { > + if (perf_pmu__supports_legacy_cache(pmu) && > + !perf_pmu__have_event(pmu, term->val.str)) { > attr->type = PERF_TYPE_HW_CACHE; > return parse_events__decode_legacy_cache(term->config, pmu->type, > &attr->config); > @@ -994,7 +995,7 @@ static int config_term_pmu(struct perf_event_attr *attr, > term->type_term = PARSE_EVENTS__TERM_TYPE_USER; > } > if (term->type_term == PARSE_EVENTS__TERM_TYPE_HARDWARE) { > - const struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); > + struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); > > if (!pmu) { > char *err_str; > @@ -1004,10 +1005,15 @@ static int config_term_pmu(struct perf_event_attr *attr, > err_str, /*help=*/NULL); > return -EINVAL; > } > - attr->type = PERF_TYPE_HARDWARE; > - attr->config = term->val.num; > - if (perf_pmus__supports_extended_type()) > - attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT; > + if (perf_pmu__have_event(pmu, term->val.str)) { > + /* If the PMU has a sysfs or JSON event prefer it over legacy. ARM requires this. 
*/ > + term->type_term = PARSE_EVENTS__TERM_TYPE_USER; > + } else { > + attr->type = PERF_TYPE_HARDWARE; > + attr->config = term->val.num; > + if (perf_pmus__supports_extended_type()) > + attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT; > + } > return 0; > } > if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER || ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-22 16:59 ` Ian Rogers @ 2023-11-23 4:33 ` Ian Rogers 0 siblings, 0 replies; 53+ messages in thread From: Ian Rogers @ 2023-11-23 4:33 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Mark Rutland, Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Wed, Nov 22, 2023 at 8:59 AM Ian Rogers <irogers@google.com> wrote: > > On Wed, Nov 22, 2023 at 8:55 AM Arnaldo Carvalho de Melo > <acme@kernel.org> wrote: > > > > Em Wed, Nov 22, 2023 at 08:29:58AM -0800, Ian Rogers escreveu: > > > I can look at doing an event parser change like: > > > > > > ``` > > > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c > > > index aa2f5c6fc7fc..9a18fda525d2 100644 > > > --- a/tools/perf/util/parse-events.c > > > +++ b/tools/perf/util/parse-events.c > > > @@ -986,7 +986,8 @@ static int config_term_pmu(struct perf_event_attr *attr, > > > err_str, > > > /*help=*/NULL); > > > return -EINVAL; > > > } > > > - if (perf_pmu__supports_legacy_cache(pmu)) { > > > + if (perf_pmu__supports_legacy_cache(pmu) && > > > + !perf_pmu__have_event(pmu, term->val.str)) { > > > attr->type = PERF_TYPE_HW_CACHE; > > > return > > > parse_events__decode_legacy_cache(term->config, pmu->type, > > > &attr->config); > > > @@ -1004,10 +1005,15 @@ static int config_term_pmu(struct perf_event_attr *attr, > > > err_str, > > > /*help=*/NULL); > > > return -EINVAL; > > > } > > > - attr->type = PERF_TYPE_HARDWARE; > > > - attr->config = term->val.num; > > > - if (perf_pmus__supports_extended_type()) > > > - attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT; > > > + if (perf_pmu__have_event(pmu, term->val.str)) { > > > + /* If the PMU has a sysfs or json event prefer > > > it over legacy. ARM requires this. 
*/ > > > + term->term_type = PARSE_EVENTS__TERM_TYPE_USER; > > > + } else { > > > + attr->type = PERF_TYPE_HARDWARE; > > > + attr->config = term->val.num; > > > + if (perf_pmus__supports_extended_type()) > > > + attr->config |= (__u64)pmu->type << > > > PERF_PMU_TYPE_SHIFT; > > > + } > > > return 0; > > > } > > > if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER || > > > ``` > > > (note: this is incomplete as term->val.str isn't populated for > > > PARSE_EVENTS__TERM_TYPE_HARDWARE) > > > > Yeah, I had to apply manually as your MUA mangled it, then it didn't > > build, had to remove some consts, then there was a struct member > > mistake, after all fixed I get to the patch below, but it now segfaults, > > probably what you mention... > > > > root@roc-rk3399-pc:~# strace -e perf_event_open taskset -c 4,5 perf stat -v -e cycles,armv8_cortex_a53/cycles/,armv8_cortex_a72/cycles/ echo > > Using CPUID 0x00000000410fd082 > > perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES, sample_period=0, sample_type=0, read_format=0, disabled=1, precise_ip=0 /* arbitrary skid */, ...}, 0, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory) > > --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} --- > > +++ killed by SIGSEGV +++ > > Segmentation fault > > root@roc-rk3399-pc:~# > > Right, I have something further along that fails tests. I'll try to > send out an RFC today, but given the Intel behavior change ¯\_(ツ)_/¯ > But Intel don't appear to have an issue having two things called, for > example, cycles and them both being a cycles event so they may not > care. It is only ARM's PMUs that appear broken in this way. 
To work around the PMU bug, I posted:
https://lore.kernel.org/lkml/20231123042922.834425-1-irogers@google.com/

Thanks,
Ian

> Thanks,
> Ian
>
> > - Arnaldo
> >
> > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> > index aa2f5c6fc7fc..1e648454cc49 100644
> > --- a/tools/perf/util/parse-events.c
> > +++ b/tools/perf/util/parse-events.c
> > @@ -976,7 +976,7 @@ static int config_term_pmu(struct perf_event_attr *attr,
> >  			   struct parse_events_error *err)
> >  {
> >  	if (term->type_term == PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE) {
> > -		const struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type);
> > +		struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type);
> >
> >  		if (!pmu) {
> >  			char *err_str;
> > @@ -986,7 +986,8 @@ static int config_term_pmu(struct perf_event_attr *attr,
> >  							   err_str, /*help=*/NULL);
> >  			return -EINVAL;
> >  		}
> > -		if (perf_pmu__supports_legacy_cache(pmu)) {
> > +		if (perf_pmu__supports_legacy_cache(pmu) &&
> > +		    !perf_pmu__have_event(pmu, term->val.str)) {
> >  			attr->type = PERF_TYPE_HW_CACHE;
> >  			return parse_events__decode_legacy_cache(term->config, pmu->type,
> >  								 &attr->config);
> > @@ -994,7 +995,7 @@ static int config_term_pmu(struct perf_event_attr *attr,
> >  		term->type_term = PARSE_EVENTS__TERM_TYPE_USER;
> >  	}
> >  	if (term->type_term == PARSE_EVENTS__TERM_TYPE_HARDWARE) {
> > -		const struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type);
> > +		struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type);
> >
> >  		if (!pmu) {
> >  			char *err_str;
> > @@ -1004,10 +1005,15 @@ static int config_term_pmu(struct perf_event_attr *attr,
> >  							   err_str, /*help=*/NULL);
> >  			return -EINVAL;
> >  		}
> > -		attr->type = PERF_TYPE_HARDWARE;
> > -		attr->config = term->val.num;
> > -		if (perf_pmus__supports_extended_type())
> > -			attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT;
> > +		if (perf_pmu__have_event(pmu, term->val.str)) {
> > +			/* If the PMU has a sysfs or JSON event prefer it over legacy. ARM requires this. */
> > +			term->type_term = PARSE_EVENTS__TERM_TYPE_USER;
> > +		} else {
> > +			attr->type = PERF_TYPE_HARDWARE;
> > +			attr->config = term->val.num;
> > +			if (perf_pmus__supports_extended_type())
> > +				attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT;
> > +		}
> >  		return 0;
> >  	}
> >  	if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER ||
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-21 15:24 ` Marc Zyngier 2023-11-21 15:40 ` Mark Rutland @ 2023-11-21 15:41 ` Ian Rogers 2023-11-21 15:56 ` Mark Rutland 2023-11-23 14:23 ` Mark Rutland 2 siblings, 1 reply; 53+ messages in thread From: Ian Rogers @ 2023-11-21 15:41 UTC (permalink / raw) To: Marc Zyngier Cc: Mark Rutland, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Tue, Nov 21, 2023 at 7:24 AM Marc Zyngier <maz@kernel.org> wrote: > > On Tue, 21 Nov 2023 13:40:31 +0000, > Marc Zyngier <maz@kernel.org> wrote: > > > > [Adding key people on Cc] > > > > On Tue, 21 Nov 2023 12:08:48 +0000, > > Hector Martin <marcan@marcan.st> wrote: > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > asymmetric ARM platform. It isn't clear what criteria is used to pick > > the PMU, but nothing works anymore. > > > > The saving grace in my case is that Debian still ships a 6.1 perftool > > package, but that's obviously not going to last. > > > > I'm happy to test potential fixes. > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > -vvv. 
And it is quite entertaining (this is taskset to an 'icestorm' > CPU): > > <quote> > maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 0 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e > apple_firestorm_pmu/cycles/ -e cycles ls > Using CPUID 0x00000000612f0280 > Attempt to add: apple_icestorm_pmu/cycles=0/ > ..after resolving event: apple_icestorm_pmu/cycles=0/ > Opening: unknown-hardware:HG > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > config 0xb00000000 > disabled 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 > sys_perf_event_open failed, error -95 > Attempt to add: apple_firestorm_pmu/cycles=0/ > ..after resolving event: apple_firestorm_pmu/cycles=0/ > Control descriptor is not initialized > Opening: apple_icestorm_pmu/cycles/ > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 3 > Opening: apple_firestorm_pmu/cycles/ > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > Opening: cycles > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type 
IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 5 > arch builtin-diff.o builtin-mem.o common-cmds.h perf-completion.sh > bench builtin-evlist.c builtin-probe.c CREDITS perf.h > Build builtin-evlist.o builtin-probe.o design.txt perf-in.o > builtin-annotate.c builtin-ftrace.c builtin-record.c dlfilters perf-iostat > builtin-annotate.o builtin-ftrace.o builtin-record.o Documentation perf-iostat.sh > builtin-bench.c builtin.h builtin-report.c FEATURE-DUMP perf.o > builtin-bench.o builtin-help.c builtin-report.o include perf-read-vdso.c > builtin-buildid-cache.c builtin-help.o builtin-sched.c jvmti perf-sys.h > builtin-buildid-cache.o builtin-inject.c builtin-script.c libapi PERF-VERSION-FILE > builtin-buildid-list.c builtin-inject.o builtin-script.o libperf perf-with-kcore > builtin-buildid-list.o builtin-kallsyms.c builtin-stat.c libsubcmd pmu-events > builtin-c2c.c builtin-kallsyms.o builtin-stat.o libsymbol python > builtin-c2c.o builtin-kmem.c builtin-timechart.c Makefile python_ext_build > builtin-config.c builtin-kvm.c builtin-top.c Makefile.config scripts > builtin-config.o builtin-kvm.o builtin-top.o Makefile.perf tests > builtin-daemon.c builtin-kwork.c builtin-trace.c MANIFEST trace > builtin-daemon.o builtin-list.c builtin-version.c perf ui > builtin-data.c builtin-list.o builtin-version.o perf-archive util > builtin-data.o builtin-lock.c check-headers.sh perf-archive.sh > builtin-diff.c builtin-mem.c command-list.txt perf.c > apple_icestorm_pmu/cycles/: -1: 0 873709 0 > apple_firestorm_pmu/cycles/: -1: 0 873709 0 > cycles: -1: 0 873709 0 > apple_icestorm_pmu/cycles/: 0 873709 0 > apple_firestorm_pmu/cycles/: 0 873709 0 > cycles: 0 873709 0 > > Performance counter stats for 'ls': > > <not counted> apple_icestorm_pmu/cycles/ (0.00%) > <not counted> 
apple_firestorm_pmu/cycles/ (0.00%) > <not counted> cycles (0.00%) > > 0.000002250 seconds time elapsed > > 0.000000000 seconds user > 0.000000000 seconds sys > </quote> > > If I run the same thing on another CPU cluster (firestorm), I get > this: > > <quote> > maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 2 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e > apple_firestorm_pmu/cycles/ -e cycles ls > Using CPUID 0x00000000612f0280 > Attempt to add: apple_icestorm_pmu/cycles=0/ > ..after resolving event: apple_icestorm_pmu/cycles=0/ > Opening: unknown-hardware:HG > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > config 0xb00000000 > disabled 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 > sys_perf_event_open failed, error -95 > Attempt to add: apple_firestorm_pmu/cycles=0/ > ..after resolving event: apple_firestorm_pmu/cycles=0/ > Control descriptor is not initialized > Opening: apple_icestorm_pmu/cycles/ > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 3 > Opening: apple_firestorm_pmu/cycles/ > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 4 > Opening: 
cycles > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 5 > arch builtin-diff.o builtin-mem.o common-cmds.h perf-completion.sh > bench builtin-evlist.c builtin-probe.c CREDITS perf.h > Build builtin-evlist.o builtin-probe.o design.txt perf-in.o > builtin-annotate.c builtin-ftrace.c builtin-record.c dlfilters perf-iostat > builtin-annotate.o builtin-ftrace.o builtin-record.o Documentation perf-iostat.sh > builtin-bench.c builtin.h builtin-report.c FEATURE-DUMP perf.o > builtin-bench.o builtin-help.c builtin-report.o include perf-read-vdso.c > builtin-buildid-cache.c builtin-help.o builtin-sched.c jvmti perf-sys.h > builtin-buildid-cache.o builtin-inject.c builtin-script.c libapi PERF-VERSION-FILE > builtin-buildid-list.c builtin-inject.o builtin-script.o libperf perf-with-kcore > builtin-buildid-list.o builtin-kallsyms.c builtin-stat.c libsubcmd pmu-events > builtin-c2c.c builtin-kallsyms.o builtin-stat.o libsymbol python > builtin-c2c.o builtin-kmem.c builtin-timechart.c Makefile python_ext_build > builtin-config.c builtin-kvm.c builtin-top.c Makefile.config scripts > builtin-config.o builtin-kvm.o builtin-top.o Makefile.perf tests > builtin-daemon.c builtin-kwork.c builtin-trace.c MANIFEST trace > builtin-daemon.o builtin-list.c builtin-version.c perf ui > builtin-data.c builtin-list.o builtin-version.o perf-archive util > builtin-data.o builtin-lock.c check-headers.sh perf-archive.sh > builtin-diff.c builtin-mem.c command-list.txt perf.c > apple_icestorm_pmu/cycles/: -1: 1035101 469125 469125 > apple_firestorm_pmu/cycles/: -1: 1035035 469125 469125 > cycles: -1: 1034653 469125 469125 > 
apple_icestorm_pmu/cycles/: 1035101 469125 469125
> apple_firestorm_pmu/cycles/: 1035035 469125 469125
> cycles: 1034653 469125 469125
>
>  Performance counter stats for 'ls':
>
>          1,035,101      apple_icestorm_pmu/cycles/
>          1,035,035      apple_firestorm_pmu/cycles/
>          1,034,653      cycles
>
>        0.000001333 seconds time elapsed
>
>        0.000000000 seconds user
>        0.000000000 seconds sys
> </quote>
>
> which doesn't make any sense either. I really don't understand what
> this PERF_TYPE_HARDWARE does here (the *real* types are 10 and 11),
> nor what this 'cycle=0' stuff is.

Hi Marc,

I'm unclear if you are running a newer perf tool on an older kernel or
not. In any case I'll assume the kernel and perf tool versions match.
In Linux 6.6 this patch was added to the ARM PMU:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/perf/arm_pmu.c?id=5c816728651ae425954542fed64d21d40cb75a9f

My guess is that the apple_icestorm_pmu requires a similar patch.

The perf tool is supposed to not use extended types when they aren't
supported:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n532

So I share your confusion as to why something broke.

PERF_TYPE_HARDWARE is a legacy type where there are hardcoded type and
config values that correspond to an event. The PMU driver turns legacy
events into the real types. On big.LITTLE systems, if the legacy events
are monitoring a task, a different event is needed for each PMU (i.e.
more than one event). In your example you are monitoring 'ls', a task,
and so different cycles events are necessary. The PMU is identified in
the high 32 bits of the config (the extended type).

Thanks for reporting the issue,
Ian

> /puzzled
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.
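As a reading aid for the `-vvv` dumps quoted in this message, the sketch below splits a PERF_TYPE_HARDWARE config value into its extended-type half and its legacy event half. It is an illustration only: `ex_split_hw_config()` and the `EX_` constants are hypothetical names, with the shift and mask values copied from the uapi header.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from include/uapi/linux/perf_event.h; names are hypothetical. */
#define EX_PERF_PMU_TYPE_SHIFT  32
#define EX_PERF_HW_EVENT_MASK   0xffffffffULL

/*
 * Split a PERF_TYPE_HARDWARE config into the PMU type carried in the
 * upper 32 bits and the legacy event id carried in the lower 32 bits.
 */
static inline void ex_split_hw_config(uint64_t config, uint32_t *pmu_type,
				      uint32_t *hw_id)
{
	assert(pmu_type && hw_id);
	*pmu_type = (uint32_t)(config >> EX_PERF_PMU_TYPE_SHIFT);
	*hw_id = (uint32_t)(config & EX_PERF_HW_EVENT_MASK);
}
```

Applied to the probe event in the dump, config 0xb00000000 splits into PMU type 11 (one of the "real" types Marc mentions) and event 0, which is PERF_COUNT_HW_CPU_CYCLES. The three events that are actually opened have config 0, so no PMU is named in the upper bits.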
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-21 15:41 ` Ian Rogers @ 2023-11-21 15:56 ` Mark Rutland 2023-11-21 16:03 ` Ian Rogers 0 siblings, 1 reply; 53+ messages in thread From: Mark Rutland @ 2023-11-21 15:56 UTC (permalink / raw) To: Ian Rogers Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Tue, Nov 21, 2023 at 07:41:17AM -0800, Ian Rogers wrote: > Hi Marc, Hi Ian, > I'm unclear if you are running a newer perf tool on an older kernel or > not. In any case I'll assume the kernel and perf tool versions match. > In Linux 6.6 this patch was added to the ARM PMU: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/perf/arm_pmu.c?id=5c816728651ae425954542fed64d21d40cb75a9f > > My guess is that the apple_icestorm_pmu requires a similar patch. The apple_icestorm_pmu PMU driver uses the arm_pmu framework, so it's using that code (since v6.6). > The perf tool is supposed to not use extended types when they aren't > supported: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n532 How does that is_event_supported() check actually work? I suspect that's giving the wrong answer. Regardless, I think the tool is doing something semantically wrong, see below. > So I share your confusion as to why something broke. > > PERF_TYPE_HARDWARE is a legacy type where there are hardcoded type and > config values that correspond to an event. The PMU driver turns legacy > events into the real types. On BIG.little systems if the legacy events > are monitoring a task a different event is needed for each PMU (ie >1 > event). In your example you are monitoring 'ls', a task, and so > different cycles events are necessary. In the high 32-bits (the > extended type) the PMU is identified. 
I think the interesting thing here is that the tool is mapping events with an
explicit PMU into legacy PERF_TYPE_HARDWARE events, which is the opposite of
the intended direction. Regardless of whether PERF_TYPE_HARDWARE events can be
targeted to a specific PMU, if the user has requested to use a specific PMU we
should be using that PMU and related event namespace.

Marc's command line was:

	sudo taskset -c 0 ./perf stat -vvv \
		-e apple_icestorm_pmu/cycles/ \
		-e apple_firestorm_pmu/cycles/ \
		-e cycles \
		ls

... and so the apple_*_pmu events should target their respective PMUs, and the
plain 'cycles' event could legitimately be opened as a single
PERF_TYPE_HARDWARE event, or split into two directed PERF_TYPE_HARDWARE events
targeting the two PMUs.

However, the tool opens three (undirected?) PERF_TYPE_HARDWARE events:

	Opening: apple_icestorm_pmu/cycles/
	------------------------------------------------------------
	perf_event_attr:
	  type                             0 (PERF_TYPE_HARDWARE)
	  size                             136
	  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
	  sample_type                      IDENTIFIER
	  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
	  disabled                         1
	  inherit                          1
	  enable_on_exec                   1
	  exclude_guest                    1
	------------------------------------------------------------
	sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 3
	Opening: apple_firestorm_pmu/cycles/
	------------------------------------------------------------
	perf_event_attr:
	  type                             0 (PERF_TYPE_HARDWARE)
	  size                             136
	  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
	  sample_type                      IDENTIFIER
	  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
	  disabled                         1
	  inherit                          1
	  enable_on_exec                   1
	  exclude_guest                    1
	------------------------------------------------------------
	sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 4
	Opening: cycles
	------------------------------------------------------------
	perf_event_attr:
	  type                             0 (PERF_TYPE_HARDWARE)
	  size                             136
	  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
	  sample_type                      IDENTIFIER
	  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
	  disabled                         1
	  inherit                          1
	  enable_on_exec                   1
	  exclude_guest                    1
	------------------------------------------------------------

Mark.
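The "use that PMU and related event namespace" point can be made concrete with a small sketch. Each PMU publishes its dynamic type number in `/sys/bus/event_source/devices/<pmu>/type` and its named events under `.../events/`. The helpers below are hypothetical (they are not the perf tool's parser) and handle only the simplest `event=<number>` alias form; the alias string used in the usage note is made up for illustration rather than read from a real machine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build a path below a PMU's sysfs directory, e.g. leaf "type" or
 * "events/cycles". Returns 0 on success, -1 if the buffer is too small.
 */
static inline int ex_pmu_sysfs_path(const char *pmu, const char *leaf,
				    char *buf, size_t len)
{
	int n;

	assert(pmu && leaf && buf);
	n = snprintf(buf, len, "/sys/bus/event_source/devices/%s/%s", pmu, leaf);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Parse the simplest sysfs alias form, "event=<number>", optionally
 * followed by a newline. Real aliases can carry more terms; those are
 * rejected here. Returns 0 on success, -1 otherwise.
 */
static inline int ex_parse_event_alias(const char *alias, uint64_t *config)
{
	static const char prefix[] = "event=";
	const char *num;
	char *end;

	if (strncmp(alias, prefix, sizeof(prefix) - 1) != 0)
		return -1;
	num = alias + sizeof(prefix) - 1;
	*config = strtoull(num, &end, 0);
	if (end == num || (*end != '\0' && *end != '\n'))
		return -1;
	return 0;
}
```

With these, `apple_icestorm_pmu/cycles/` would resolve to `attr.type` read from the PMU's `type` file and `attr.config` parsed from its `events/cycles` alias (for instance a hypothetical `"event=0x02"`), rather than to type 0 (PERF_TYPE_HARDWARE).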
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-21 15:56 ` Mark Rutland @ 2023-11-21 16:03 ` Ian Rogers 2023-11-21 16:08 ` Mark Rutland 0 siblings, 1 reply; 53+ messages in thread From: Ian Rogers @ 2023-11-21 16:03 UTC (permalink / raw) To: Mark Rutland Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Tue, Nov 21, 2023 at 7:56 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Tue, Nov 21, 2023 at 07:41:17AM -0800, Ian Rogers wrote: > > Hi Marc, > > Hi Ian, > > > I'm unclear if you are running a newer perf tool on an older kernel or > > not. In any case I'll assume the kernel and perf tool versions match. > > In Linux 6.6 this patch was added to the ARM PMU: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/perf/arm_pmu.c?id=5c816728651ae425954542fed64d21d40cb75a9f > > > > My guess is that the apple_icestorm_pmu requires a similar patch. > > The apple_icestorm_pmu PMU driver uses the arm_pmu framework, so it's using > that code (since v6.6). > > > The perf tool is supposed to not use extended types when they aren't > > supported: > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n532 > > How does that is_event_supported() check actually work? I suspect that's giving > the wrong answer. Maybe, the implementation is to check using perf_event_open: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232 This is recycling logic from perf list where many legacy cache events are elided due to a lack of support. > Regardless, I think the tool is doing something semantically wrong, see below. > > > So I share your confusion as to why something broke. > > > > PERF_TYPE_HARDWARE is a legacy type where there are hardcoded type and > > config values that correspond to an event. 
The PMU driver turns legacy > > events into the real types. On BIG.little systems if the legacy events > > are monitoring a task a different event is needed for each PMU (ie >1 > > event). In your example you are monitoring 'ls', a task, and so > > different cycles events are necessary. In the high 32-bits (the > > extended type) the PMU is identified. > > I think the interesting thing here is that the tool is mapping events with an > explicit PMU into legacy PERF_TYPE_HARDWARE events, which is the opposite > direction than intended. Regardless of whether PERF_TYPE_HARDWARE events can be > targetted to a specific PMU, if the user has requested to use a specific PMU we > should be using that PMU and related event namespace. > > Marc's command line was: > > sudo taskset -c 0 ./perf stat -vvv \ > -e apple_icestorm_pmu/cycles/ \ > -e apple_firestorm_pmu/cycles/ \ > -e cycles \ -e cycles here is a direct request for the legacy cycles event. It will match in the parser here: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.l?h=perf-tools-next#n301 which goes to: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.y?h=perf-tools-next#n397 and as this is a hardware event there is wildcard expansion on each core PMU. Thanks, Ian > ls > > ... and so the apple_*_pmu events should target their respective PMUs, and the > plain 'cycles' event could legitimately be opened as a single > PERF_TYPE_HARDWARE event, or split into two directed PERF_TYPE_HARDWARE events > targetting the two PMUs. > > However, thwe tool opens three (undirected?) 
PERF_TYPE_HARDWARE events: > > Opening: apple_icestorm_pmu/cycles/ > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 3 > Opening: apple_firestorm_pmu/cycles/ > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > Opening: cycles > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > > Mark. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-21 16:03 ` Ian Rogers @ 2023-11-21 16:08 ` Mark Rutland 0 siblings, 0 replies; 53+ messages in thread From: Mark Rutland @ 2023-11-21 16:08 UTC (permalink / raw) To: Ian Rogers Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Tue, Nov 21, 2023 at 08:03:11AM -0800, Ian Rogers wrote: > On Tue, Nov 21, 2023 at 7:56 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Tue, Nov 21, 2023 at 07:41:17AM -0800, Ian Rogers wrote: > > > Hi Marc, > > > > Hi Ian, > > > > > I'm unclear if you are running a newer perf tool on an older kernel or > > > not. In any case I'll assume the kernel and perf tool versions match. > > > In Linux 6.6 this patch was added to the ARM PMU: > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/perf/arm_pmu.c?id=5c816728651ae425954542fed64d21d40cb75a9f > > > > > > My guess is that the apple_icestorm_pmu requires a similar patch. > > > > The apple_icestorm_pmu PMU driver uses the arm_pmu framework, so it's using > > that code (since v6.6). > > > > > The perf tool is supposed to not use extended types when they aren't > > > supported: > > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n532 > > > > How does that is_event_supported() check actually work? I suspect that's giving > > the wrong answer. > > Maybe, the implementation is to check using perf_event_open: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232 > > This is recycling logic from perf list where many legacy cache events > are elided due to a lack of support. > > > Regardless, I think the tool is doing something semantically wrong, see below. > > > > > So I share your confusion as to why something broke. 
> > > > > > PERF_TYPE_HARDWARE is a legacy type where there are hardcoded type and > > > config values that correspond to an event. The PMU driver turns legacy > > > events into the real types. On BIG.little systems if the legacy events > > > are monitoring a task a different event is needed for each PMU (ie >1 > > > event). In your example you are monitoring 'ls', a task, and so > > > different cycles events are necessary. In the high 32-bits (the > > > extended type) the PMU is identified. > > > > I think the interesting thing here is that the tool is mapping events with an > > explicit PMU into legacy PERF_TYPE_HARDWARE events, which is the opposite > > direction than intended. Regardless of whether PERF_TYPE_HARDWARE events can be > > targetted to a specific PMU, if the user has requested to use a specific PMU we > > should be using that PMU and related event namespace. > > > > Marc's command line was: > > > > sudo taskset -c 0 ./perf stat -vvv \ > > -e apple_icestorm_pmu/cycles/ \ > > -e apple_firestorm_pmu/cycles/ \ > > -e cycles \ > > -e cycles here is a direct request for the legacy cycles event. It > will match in the parser here: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.l?h=perf-tools-next#n301 > > which goes to: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.y?h=perf-tools-next#n397 > > and as this is a hardware event there is wildcard expansion on each core PMU. Please read the rest of my message, which was talking about the other two events. Mark. > > Thanks, > Ian > > > ls > > > > ... and so the apple_*_pmu events should target their respective PMUs, and the > > plain 'cycles' event could legitimately be opened as a single > > PERF_TYPE_HARDWARE event, or split into two directed PERF_TYPE_HARDWARE events > > targetting the two PMUs. > > > > However, thwe tool opens three (undirected?) 
PERF_TYPE_HARDWARE events: > > > > Opening: apple_icestorm_pmu/cycles/ > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > sample_type IDENTIFIER > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > disabled 1 > > inherit 1 > > enable_on_exec 1 > > exclude_guest 1 > > ------------------------------------------------------------ > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 3 > > Opening: apple_firestorm_pmu/cycles/ > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > sample_type IDENTIFIER > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > disabled 1 > > inherit 1 > > enable_on_exec 1 > > exclude_guest 1 > > ------------------------------------------------------------ > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > Opening: cycles > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > sample_type IDENTIFIER > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > disabled 1 > > inherit 1 > > enable_on_exec 1 > > exclude_guest 1 > > ------------------------------------------------------------ > > > > Mark. ^ permalink raw reply [flat|nested] 53+ messages in thread
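For readers following the "extended type" discussion above, here is a minimal sketch of the bit layout the thread describes: the legacy event ID sits in the low 32 bits of `config` and the PMU type ID sits in the high 32 bits. The helper names (`directed_hw_config`, `split_hw_config`) are hypothetical and are not part of the perf tool or the kernel; the numeric values come from the attr dumps quoted in this thread.

```python
# Sketch only: helper names are made up for illustration.
PERF_TYPE_HARDWARE = 0        # "type 0 (PERF_TYPE_HARDWARE)" in the dumps
PERF_COUNT_HW_CPU_CYCLES = 0  # "config 0 (PERF_COUNT_HW_CPU_CYCLES)" in the dumps
EXT_TYPE_SHIFT = 32           # the PMU ID lives in the high 32 bits of config
HW_EVENT_MASK = 0xFFFFFFFF    # the legacy event ID lives in the low 32 bits


def directed_hw_config(pmu_type: int, hw_event: int) -> int:
    """Build a config value that directs a legacy hardware event at one PMU."""
    if not 0 <= pmu_type <= HW_EVENT_MASK:
        raise ValueError("pmu_type must fit in 32 bits")
    if not 0 <= hw_event <= HW_EVENT_MASK:
        raise ValueError("hw_event must fit in 32 bits")
    return (pmu_type << EXT_TYPE_SHIFT) | hw_event


def split_hw_config(config: int) -> tuple[int, int]:
    """Return (pmu_type, hw_event) for a PERF_TYPE_HARDWARE config value."""
    return config >> EXT_TYPE_SHIFT, config & HW_EVENT_MASK


if __name__ == "__main__":
    # A PMU with type ID 0xb and the cycles event gives 0xb00000000.
    print(hex(directed_hw_config(0xB, PERF_COUNT_HW_CPU_CYCLES)))
    # A plain config of 0 has PMU ID 0, i.e. no PMU was named.
    print(split_hw_config(0))
```

Under this reading, an attr with `type 0` and `config 0` carries no PMU ID at all, which is what makes it "undirected".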
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-21 15:24 ` Marc Zyngier 2023-11-21 15:40 ` Mark Rutland 2023-11-21 15:41 ` Ian Rogers @ 2023-11-23 14:23 ` Mark Rutland 2023-11-23 14:45 ` Marc Zyngier 2023-11-23 15:14 ` Ian Rogers 2 siblings, 2 replies; 53+ messages in thread From: Mark Rutland @ 2023-11-23 14:23 UTC (permalink / raw) To: Marc Zyngier Cc: Hector Martin, Arnaldo Carvalho de Melo, Ian Rogers, James Clark, linux-perf-users, LKML, Asahi Linux On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > On Tue, 21 Nov 2023 13:40:31 +0000, > Marc Zyngier <maz@kernel.org> wrote: > > > > [Adding key people on Cc] > > > > On Tue, 21 Nov 2023 12:08:48 +0000, > > Hector Martin <marcan@marcan.st> wrote: > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > asymmetric ARM platform. It isn't clear what criteria is used to pick > > the PMU, but nothing works anymore. > > > > The saving grace in my case is that Debian still ships a 6.1 perftool > > package, but that's obviously not going to last. > > > > I'm happy to test potential fixes. > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > CPU): Looking at this with fresh(er) eyes, I think there's a userspace bug here, regardless of whether one believes it's correct to convert a named-pmu event to a PERF_TYPE_HARDWARE event directed at that PMU. It looks like the userspace tool is dropping the extended type ID after an initial probe, and requests events with plain PERF_TYPE_HARDWARE (without an extended type ID), which explains why we seem to get events from one PMU only. More detail below... 
Marc, if you have time, could you run the same commands (on the same kernel)
with a perf tool build from v6.4?

> <quote>
> maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 0 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e
> apple_firestorm_pmu/cycles/ -e cycles ls
> Using CPUID 0x00000000612f0280
> Attempt to add: apple_icestorm_pmu/cycles=0/
> ..after resolving event: apple_icestorm_pmu/cycles=0/
> Opening: unknown-hardware:HG
> ------------------------------------------------------------
> perf_event_attr:
>   type                             0 (PERF_TYPE_HARDWARE)
>   config                           0xb00000000
>   disabled                         1
> ------------------------------------------------------------

Here config[31:0] is 0 (PERF_COUNT_HW_CPU_CYCLES), and config[63:32] is 0xb,
which is presumably the PMU ID for the apple_icestorm_pmu.

The attr doesn't contain exclude_guest=1, so this will be rejected by the PMU
driver due to its mode exclusion requirements.

> sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8
> sys_perf_event_open failed, error -95

... which is what we see here (this is EOPNOTSUPP, which __hw_perf_event_init()
in drivers/perf/arm_pmu.c returns when the requested mode exclusion options
aren't supported).

So far, so good...

> Attempt to add: apple_firestorm_pmu/cycles=0/
> ..after resolving event: apple_firestorm_pmu/cycles=0/
> Control descriptor is not initialized
> Opening: apple_icestorm_pmu/cycles/
> ------------------------------------------------------------
> perf_event_attr:
>   type                             0 (PERF_TYPE_HARDWARE)
>   size                             136
>   config                           0 (PERF_COUNT_HW_CPU_CYCLES)
>   sample_type                      IDENTIFIER
>   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>   disabled                         1
>   inherit                          1
>   enable_on_exec                   1
>   exclude_guest                    1
> ------------------------------------------------------------

... but here, the extended type ID has been dropped, and this event is no
longer directed towards the apple_firestorm_pmu PMU, so the kernel can direct
this to *any* CPU PMU...

> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 3 ... and *some* PMU accepts it. > Opening: apple_firestorm_pmu/cycles/ > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ Likewise here, no extended type ID... > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > Opening: cycles > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ Likewise here, no extended type ID... 
> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 5 > arch builtin-diff.o builtin-mem.o common-cmds.h perf-completion.sh > bench builtin-evlist.c builtin-probe.c CREDITS perf.h > Build builtin-evlist.o builtin-probe.o design.txt perf-in.o > builtin-annotate.c builtin-ftrace.c builtin-record.c dlfilters perf-iostat > builtin-annotate.o builtin-ftrace.o builtin-record.o Documentation perf-iostat.sh > builtin-bench.c builtin.h builtin-report.c FEATURE-DUMP perf.o > builtin-bench.o builtin-help.c builtin-report.o include perf-read-vdso.c > builtin-buildid-cache.c builtin-help.o builtin-sched.c jvmti perf-sys.h > builtin-buildid-cache.o builtin-inject.c builtin-script.c libapi PERF-VERSION-FILE > builtin-buildid-list.c builtin-inject.o builtin-script.o libperf perf-with-kcore > builtin-buildid-list.o builtin-kallsyms.c builtin-stat.c libsubcmd pmu-events > builtin-c2c.c builtin-kallsyms.o builtin-stat.o libsymbol python > builtin-c2c.o builtin-kmem.c builtin-timechart.c Makefile python_ext_build > builtin-config.c builtin-kvm.c builtin-top.c Makefile.config scripts > builtin-config.o builtin-kvm.o builtin-top.o Makefile.perf tests > builtin-daemon.c builtin-kwork.c builtin-trace.c MANIFEST trace > builtin-daemon.o builtin-list.c builtin-version.c perf ui > builtin-data.c builtin-list.o builtin-version.o perf-archive util > builtin-data.o builtin-lock.c check-headers.sh perf-archive.sh > builtin-diff.c builtin-mem.c command-list.txt perf.c > apple_icestorm_pmu/cycles/: -1: 0 873709 0 > apple_firestorm_pmu/cycles/: -1: 0 873709 0 > cycles: -1: 0 873709 0 > apple_icestorm_pmu/cycles/: 0 873709 0 > apple_firestorm_pmu/cycles/: 0 873709 0 > cycles: 0 873709 0 > > Performance counter stats for 'ls': > > <not counted> apple_icestorm_pmu/cycles/ (0.00%) > <not counted> apple_firestorm_pmu/cycles/ (0.00%) > <not counted> cycles (0.00%) > > 0.000002250 seconds time elapsed > > 0.000000000 seconds user > 0.000000000 seconds sys So it looks like the tool has 
expanded the requested 'apple_icestorm_pmu/cycles/' event into three cycles events, each opened without an extended type ID. AFAICT, the kernel has done exactly what it has always done for PERF_TYPE_HARDWARE/PERF_COUNT_HW_CPU_CYCLES events: pick the first PMU which said it can handle them. > If I run the same thing on another CPU cluster (firestorm), I get > this: > > <quote> > maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 2 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e > apple_firestorm_pmu/cycles/ -e cycles ls > Using CPUID 0x00000000612f0280 > Attempt to add: apple_icestorm_pmu/cycles=0/ > ..after resolving event: apple_icestorm_pmu/cycles=0/ > Opening: unknown-hardware:HG > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > config 0xb00000000 > disabled 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 > sys_perf_event_open failed, error -95 Again, we see one request with an extended type ID, which fails due to mode exclusion requirements... 
> Attempt to add: apple_firestorm_pmu/cycles=0/ > ..after resolving event: apple_firestorm_pmu/cycles=0/ > Control descriptor is not initialized > Opening: apple_icestorm_pmu/cycles/ > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 3 > Opening: apple_firestorm_pmu/cycles/ > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 4 > Opening: cycles > ------------------------------------------------------------ > perf_event_attr: > type 0 (PERF_TYPE_HARDWARE) > size 136 > config 0 (PERF_COUNT_HW_CPU_CYCLES) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > enable_on_exec 1 > exclude_guest 1 > ------------------------------------------------------------ ... but all subsequent requests do not have an extended type ID, and the kernel directs these to whichever PMU accepts the event first... 
> sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 5 > arch builtin-diff.o builtin-mem.o common-cmds.h perf-completion.sh > bench builtin-evlist.c builtin-probe.c CREDITS perf.h > Build builtin-evlist.o builtin-probe.o design.txt perf-in.o > builtin-annotate.c builtin-ftrace.c builtin-record.c dlfilters perf-iostat > builtin-annotate.o builtin-ftrace.o builtin-record.o Documentation perf-iostat.sh > builtin-bench.c builtin.h builtin-report.c FEATURE-DUMP perf.o > builtin-bench.o builtin-help.c builtin-report.o include perf-read-vdso.c > builtin-buildid-cache.c builtin-help.o builtin-sched.c jvmti perf-sys.h > builtin-buildid-cache.o builtin-inject.c builtin-script.c libapi PERF-VERSION-FILE > builtin-buildid-list.c builtin-inject.o builtin-script.o libperf perf-with-kcore > builtin-buildid-list.o builtin-kallsyms.c builtin-stat.c libsubcmd pmu-events > builtin-c2c.c builtin-kallsyms.o builtin-stat.o libsymbol python > builtin-c2c.o builtin-kmem.c builtin-timechart.c Makefile python_ext_build > builtin-config.c builtin-kvm.c builtin-top.c Makefile.config scripts > builtin-config.o builtin-kvm.o builtin-top.o Makefile.perf tests > builtin-daemon.c builtin-kwork.c builtin-trace.c MANIFEST trace > builtin-daemon.o builtin-list.c builtin-version.c perf ui > builtin-data.c builtin-list.o builtin-version.o perf-archive util > builtin-data.o builtin-lock.c check-headers.sh perf-archive.sh > builtin-diff.c builtin-mem.c command-list.txt perf.c > apple_icestorm_pmu/cycles/: -1: 1035101 469125 469125 > apple_firestorm_pmu/cycles/: -1: 1035035 469125 469125 > cycles: -1: 1034653 469125 469125 > apple_icestorm_pmu/cycles/: 1035101 469125 469125 > apple_firestorm_pmu/cycles/: 1035035 469125 469125 > cycles: 1034653 469125 469125 > > Performance counter stats for 'ls': > > 1,035,101 apple_icestorm_pmu/cycles/ > 1,035,035 apple_firestorm_pmu/cycles/ > 1,034,653 cycles > > 0.000001333 seconds time elapsed > > 0.000000000 seconds user > 0.000000000 seconds sys > 
</quote>

... and in this case the workload was run on a CPU affine to that arbitrary PMU,
hence we managed to count.

So AFAICT, this is a userspace bug, maybe related to the way we probe for
supported PMU features?

Thanks,
Mark.

^ permalink raw reply [flat|nested] 53+ messages in thread
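The raw counter lines in the -vvv output above can be read mechanically. The sketch below assumes the three numbers on each aggregate line are the raw count, the time enabled and the time running, which fits the `read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING` shown in the attr dumps. It also assumes that an event which never ran is the one reported as `<not counted>`. The names `Count`, `parse_count_line` and `describe` are hypothetical, not perf internals.

```python
import re
from typing import NamedTuple, Optional


class Count(NamedTuple):
    event: str
    value: int    # raw counter value
    enabled: int  # time the event was enabled (assumed meaning)
    running: int  # time the event was actually scheduled on a PMU (assumed meaning)


# Matches aggregate lines such as "cycles: 0 873709 0".
# Per-CPU lines such as "cycles: -1: 0 873709 0" are deliberately not matched.
LINE_RE = re.compile(r"^(?P<event>\S+): (?P<val>\d+) (?P<ena>\d+) (?P<run>\d+)$")


def parse_count_line(line: str) -> Optional[Count]:
    """Parse one aggregate counter line; return None for anything else."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return Count(m["event"], int(m["val"]), int(m["ena"]), int(m["run"]))


def describe(count: Count) -> str:
    """Render a count roughly the way the summary table does."""
    if count.running == 0:
        return "<not counted>"
    return f"{count.value:,}"


if __name__ == "__main__":
    for raw in (
        "apple_icestorm_pmu/cycles/: 0 873709 0",
        "apple_icestorm_pmu/cycles/: 1035101 469125 469125",
    ):
        parsed = parse_count_line(raw)
        if parsed is not None:
            print(parsed.event, describe(parsed))
```

Read this way, the icestorm run shows all three events enabled but never running, while the firestorm run shows all three running for the full enabled time, consistent with every event having landed on the same PMU.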
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-23 14:23 ` Mark Rutland @ 2023-11-23 14:45 ` Marc Zyngier 2023-11-23 15:14 ` Ian Rogers 1 sibling, 0 replies; 53+ messages in thread From: Marc Zyngier @ 2023-11-23 14:45 UTC (permalink / raw) To: Mark Rutland Cc: Hector Martin, Arnaldo Carvalho de Melo, Ian Rogers, James Clark, linux-perf-users, LKML, Asahi Linux On Thu, 23 Nov 2023 14:23:10 +0000, Mark Rutland <mark.rutland@arm.com> wrote: > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > On Tue, 21 Nov 2023 13:40:31 +0000, > > Marc Zyngier <maz@kernel.org> wrote: > > > > > > [Adding key people on Cc] > > > > > > On Tue, 21 Nov 2023 12:08:48 +0000, > > > Hector Martin <marcan@marcan.st> wrote: > > > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > asymmetric ARM platform. It isn't clear what criteria is used to pick > > > the PMU, but nothing works anymore. > > > > > > The saving grace in my case is that Debian still ships a 6.1 perftool > > > package, but that's obviously not going to last. > > > > > > I'm happy to test potential fixes. > > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > CPU): > > Looking at this with fresh(er) eyes, I think there's a userspace bug here, > regardless of whether one believes it's correct to convert a named-pmu event to > a PERF_TYPE_HARDWARE event directed at that PMU. > > It looks like the userspace tool is dropping the extended type ID after an > initial probe, and requests events with plain PERF_TYPE_HARDWARE (without an > extended type ID), which explains why we seem to get events from one PMU only. > > More detail below... 
> > Marc, if you have time, could you run the same commands (on the same kernel) > with a perf tool build from v6.4? Here you go: <quote> $ sudo taskset -c 0 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e apple_firestorm_pmu/cycles/ -e cycles ls >/dev/null Using CPUID 0x00000000610f0280 Attempting to add event pmu 'apple_icestorm_pmu' with 'cycles,' that may result in non-fatal errors After aliases, add event pmu 'apple_icestorm_pmu' with 'event,' that may result in non-fatal errors Attempting to add event pmu 'apple_firestorm_pmu' with 'cycles,' that may result in non-fatal errors After aliases, add event pmu 'apple_firestorm_pmu' with 'event,' that may result in non-fatal errors Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 10 size 136 config 0x2 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 1624462 cpu -1 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ perf_event_attr: type 11 size 136 config 0x2 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 1624462 cpu -1 group_fd -1 flags 0x8 = 4 ------------------------------------------------------------ perf_event_attr: size 136 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 1624462 cpu -1 group_fd -1 flags 0x8 = 5 apple_icestorm_pmu/cycles/: -1: 1492180 724333 724333 apple_firestorm_pmu/cycles/: -1: 0 724333 0 cycles: -1: 0 724333 0 apple_icestorm_pmu/cycles/: 1492180 724333 724333 apple_firestorm_pmu/cycles/: 0 
724333 0 cycles: 0 724333 0 Performance counter stats for 'ls': 1,492,180 apple_icestorm_pmu/cycles/ <not counted> apple_firestorm_pmu/cycles/ (0.00%) <not counted> cycles (0.00%) 0.000001917 seconds time elapsed 0.000000000 seconds user 0.000000000 seconds sys </quote> and on the other cluster: <quote> $ sudo taskset -c 2 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e apple_firestorm_pmu/cycles/ -e cycles ls >/dev/null Using CPUID 0x00000000610f0280 Attempting to add event pmu 'apple_icestorm_pmu' with 'cycles,' that may result in non-fatal errors After aliases, add event pmu 'apple_icestorm_pmu' with 'event,' that may result in non-fatal errors Attempting to add event pmu 'apple_firestorm_pmu' with 'cycles,' that may result in non-fatal errors After aliases, add event pmu 'apple_firestorm_pmu' with 'event,' that may result in non-fatal errors Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 10 size 136 config 0x2 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 1624466 cpu -1 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ perf_event_attr: type 11 size 136 config 0x2 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 1624466 cpu -1 group_fd -1 flags 0x8 = 4 ------------------------------------------------------------ perf_event_attr: size 136 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 1624466 cpu -1 group_fd -1 flags 0x8 = 5 apple_icestorm_pmu/cycles/: -1: 0 593209 
0 apple_firestorm_pmu/cycles/: -1: 1038247 593209 593209 cycles: -1: 1037870 593209 593209 apple_icestorm_pmu/cycles/: 0 593209 0 apple_firestorm_pmu/cycles/: 1038247 593209 593209 cycles: 1037870 593209 593209 Performance counter stats for 'ls': <not counted> apple_icestorm_pmu/cycles/ (0.00%) 1,038,247 apple_firestorm_pmu/cycles/ 1,037,870 cycles 0.000001500 seconds time elapsed 0.000000000 seconds user 0.000000000 seconds sys </quote> For the record, this is on a 6.6-rc6 kernel, userspace perf as of v6.4.0. M. -- Without deviation from the norm, progress is not possible. ^ permalink raw reply [flat|nested] 53+ messages in thread
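To make the difference between the two tool versions easier to see, the following sketch labels an attr by how it selects a PMU, using only the `type` and `config` fields printed in the dumps. The function name `describe_target` and its output strings are hypothetical. The type values 10 and 11 are simply the ones printed in the v6.4 dump above and may differ on other systems.

```python
PERF_TYPE_HARDWARE = 0  # legacy hardware event type, as printed in the dumps
PERF_TYPE_MAX = 6       # first value past the fixed perf_type_id entries in the uapi header


def describe_target(attr_type: int, config: int) -> str:
    """Say how an attr picks its PMU, based on type and config alone."""
    if attr_type == PERF_TYPE_HARDWARE:
        pmu_type = config >> 32  # extended type ID, if any
        if pmu_type:
            return f"legacy hardware event directed at PMU type {pmu_type}"
        return "legacy hardware event, undirected"
    if attr_type >= PERF_TYPE_MAX:
        return f"PMU-specific event on PMU type {attr_type}"
    return f"other fixed type {attr_type}"


if __name__ == "__main__":
    # v6.4 tool: the named-PMU events use the PMU's own type and event encoding.
    print(describe_target(10, 0x2))
    print(describe_target(11, 0x2))
    # v6.7-rc2 tool: the initial probe carries an extended type ID ...
    print(describe_target(0, 0xB00000000))
    # ... but the events that are actually opened do not.
    print(describe_target(0, 0))
```

With the v6.4 tool, the two named events therefore reach two different PMUs, whereas the newer tool's three opened events are indistinguishable from one another at the syscall level.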
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-23 14:23 ` Mark Rutland 2023-11-23 14:45 ` Marc Zyngier @ 2023-11-23 15:14 ` Ian Rogers 2023-11-23 16:48 ` Mark Rutland 1 sibling, 1 reply; 53+ messages in thread From: Ian Rogers @ 2023-11-23 15:14 UTC (permalink / raw) To: Mark Rutland Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Thu, Nov 23, 2023 at 6:23 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > On Tue, 21 Nov 2023 13:40:31 +0000, > > Marc Zyngier <maz@kernel.org> wrote: > > > > > > [Adding key people on Cc] > > > > > > On Tue, 21 Nov 2023 12:08:48 +0000, > > > Hector Martin <marcan@marcan.st> wrote: > > > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > asymmetric ARM platform. It isn't clear what criteria is used to pick > > > the PMU, but nothing works anymore. > > > > > > The saving grace in my case is that Debian still ships a 6.1 perftool > > > package, but that's obviously not going to last. > > > > > > I'm happy to test potential fixes. > > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > CPU): > > Looking at this with fresh(er) eyes, I think there's a userspace bug here, > regardless of whether one believes it's correct to convert a named-pmu event to > a PERF_TYPE_HARDWARE event directed at that PMU. > > It looks like the userspace tool is dropping the extended type ID after an > initial probe, and requests events with plain PERF_TYPE_HARDWARE (without an > extended type ID), which explains why we seem to get events from one PMU only. > > More detail below... 
> > Marc, if you have time, could you run the same commands (on the same kernel) > with a perf tool build from v6.4? > > > <quote> > > maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 0 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e > > apple_firestorm_pmu/cycles/ -e cycles ls > > Using CPUID 0x00000000612f0280 > > Attempt to add: apple_icestorm_pmu/cycles=0/ > > ..after resolving event: apple_icestorm_pmu/cycles=0/ > > Opening: unknown-hardware:HG > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > config 0xb00000000 > > disabled 1 > > ------------------------------------------------------------ > > Here config[31:0] is 0 (PERF_COUNT_HW_CPU_CYCLES), and config[63:32] is 0xb, > which is presumably the PMU ID for the apple_icestorm_pmu. > > The attr doesn't contain exclude_guest=1, so this will be rejected by the PMU > driver due to its mode exclusion requirements. > > > sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 > > sys_perf_event_open failed, error -95 > > ... which is what we see here (this is EOPNOTSUPP, which __hw_perf_event_init() > in drivers/perf/arm_pmu.c returns when the mode requested mode exclusion > options aren't supported). > > So far, so good... > > > Attempt to add: apple_firestorm_pmu/cycles=0/ > > ..after resolving event: apple_firestorm_pmu/cycles=0/ > > Control descriptor is not initialized > > Opening: apple_icestorm_pmu/cycles/ > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > sample_type IDENTIFIER > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > disabled 1 > > inherit 1 > > enable_on_exec 1 > > exclude_guest 1 > > ------------------------------------------------------------ > > ... 
but here, the extended type ID has been dropped, and this event is no > longer directed towards the apple_firestorm_pmu PMU, so the kernel can direct > this to *any* CPU PMU... > > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 3 > > ... and *some* PMU accepts it. > > > Opening: apple_firestorm_pmu/cycles/ > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > sample_type IDENTIFIER > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > disabled 1 > > inherit 1 > > enable_on_exec 1 > > exclude_guest 1 > > ------------------------------------------------------------ > > Likewise here, no extended type ID... > > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > Opening: cycles > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > sample_type IDENTIFIER > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > disabled 1 > > inherit 1 > > enable_on_exec 1 > > exclude_guest 1 > > ------------------------------------------------------------ > > Likewise here, no extended type ID... 
> > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 5 > > arch builtin-diff.o builtin-mem.o common-cmds.h perf-completion.sh > > bench builtin-evlist.c builtin-probe.c CREDITS perf.h > > Build builtin-evlist.o builtin-probe.o design.txt perf-in.o > > builtin-annotate.c builtin-ftrace.c builtin-record.c dlfilters perf-iostat > > builtin-annotate.o builtin-ftrace.o builtin-record.o Documentation perf-iostat.sh > > builtin-bench.c builtin.h builtin-report.c FEATURE-DUMP perf.o > > builtin-bench.o builtin-help.c builtin-report.o include perf-read-vdso.c > > builtin-buildid-cache.c builtin-help.o builtin-sched.c jvmti perf-sys.h > > builtin-buildid-cache.o builtin-inject.c builtin-script.c libapi PERF-VERSION-FILE > > builtin-buildid-list.c builtin-inject.o builtin-script.o libperf perf-with-kcore > > builtin-buildid-list.o builtin-kallsyms.c builtin-stat.c libsubcmd pmu-events > > builtin-c2c.c builtin-kallsyms.o builtin-stat.o libsymbol python > > builtin-c2c.o builtin-kmem.c builtin-timechart.c Makefile python_ext_build > > builtin-config.c builtin-kvm.c builtin-top.c Makefile.config scripts > > builtin-config.o builtin-kvm.o builtin-top.o Makefile.perf tests > > builtin-daemon.c builtin-kwork.c builtin-trace.c MANIFEST trace > > builtin-daemon.o builtin-list.c builtin-version.c perf ui > > builtin-data.c builtin-list.o builtin-version.o perf-archive util > > builtin-data.o builtin-lock.c check-headers.sh perf-archive.sh > > builtin-diff.c builtin-mem.c command-list.txt perf.c > > apple_icestorm_pmu/cycles/: -1: 0 873709 0 > > apple_firestorm_pmu/cycles/: -1: 0 873709 0 > > cycles: -1: 0 873709 0 > > apple_icestorm_pmu/cycles/: 0 873709 0 > > apple_firestorm_pmu/cycles/: 0 873709 0 > > cycles: 0 873709 0 > > > > Performance counter stats for 'ls': > > > > <not counted> apple_icestorm_pmu/cycles/ (0.00%) > > <not counted> apple_firestorm_pmu/cycles/ (0.00%) > > <not counted> cycles (0.00%) > > > > 0.000002250 seconds time elapsed > > > > 0.000000000 
seconds user > > 0.000000000 seconds sys > > So it looks like the tool has expanded the requested > 'apple_icestorm_pmu/cycles/' event into three cycles events, each opened > without an extended type ID. > > AFAICT, the kernel has done exactly what it has always done for > PERF_TYPE_HARDWARE/PERF_COUNT_HW_CPU_CYCLES events: pick the first PMU which > said it can handle them. > > > If I run the same thing on another CPU cluster (firestorm), I get > > this: > > > > <quote> > > maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 2 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e > > apple_firestorm_pmu/cycles/ -e cycles ls > > Using CPUID 0x00000000612f0280 > > Attempt to add: apple_icestorm_pmu/cycles=0/ > > ..after resolving event: apple_icestorm_pmu/cycles=0/ > > Opening: unknown-hardware:HG > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > config 0xb00000000 > > disabled 1 > > ------------------------------------------------------------ > > sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 > > sys_perf_event_open failed, error -95 > > Again, we see one request with an extended type ID, which fails due to mode exclusion requirements... 
> > > Attempt to add: apple_firestorm_pmu/cycles=0/ > > ..after resolving event: apple_firestorm_pmu/cycles=0/ > > Control descriptor is not initialized > > Opening: apple_icestorm_pmu/cycles/ > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > sample_type IDENTIFIER > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > disabled 1 > > inherit 1 > > enable_on_exec 1 > > exclude_guest 1 > > ------------------------------------------------------------ > > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 3 > > Opening: apple_firestorm_pmu/cycles/ > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > sample_type IDENTIFIER > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > disabled 1 > > inherit 1 > > enable_on_exec 1 > > exclude_guest 1 > > ------------------------------------------------------------ > > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 4 > > Opening: cycles > > ------------------------------------------------------------ > > perf_event_attr: > > type 0 (PERF_TYPE_HARDWARE) > > size 136 > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > sample_type IDENTIFIER > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > disabled 1 > > inherit 1 > > enable_on_exec 1 > > exclude_guest 1 > > ------------------------------------------------------------ > > ... but all subsequent requests do not have an extended type ID, and the kernel > directs these to whichever PMU accepts the event first... 
> > > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 5 > > arch builtin-diff.o builtin-mem.o common-cmds.h perf-completion.sh > > bench builtin-evlist.c builtin-probe.c CREDITS perf.h > > Build builtin-evlist.o builtin-probe.o design.txt perf-in.o > > builtin-annotate.c builtin-ftrace.c builtin-record.c dlfilters perf-iostat > > builtin-annotate.o builtin-ftrace.o builtin-record.o Documentation perf-iostat.sh > > builtin-bench.c builtin.h builtin-report.c FEATURE-DUMP perf.o > > builtin-bench.o builtin-help.c builtin-report.o include perf-read-vdso.c > > builtin-buildid-cache.c builtin-help.o builtin-sched.c jvmti perf-sys.h > > builtin-buildid-cache.o builtin-inject.c builtin-script.c libapi PERF-VERSION-FILE > > builtin-buildid-list.c builtin-inject.o builtin-script.o libperf perf-with-kcore > > builtin-buildid-list.o builtin-kallsyms.c builtin-stat.c libsubcmd pmu-events > > builtin-c2c.c builtin-kallsyms.o builtin-stat.o libsymbol python > > builtin-c2c.o builtin-kmem.c builtin-timechart.c Makefile python_ext_build > > builtin-config.c builtin-kvm.c builtin-top.c Makefile.config scripts > > builtin-config.o builtin-kvm.o builtin-top.o Makefile.perf tests > > builtin-daemon.c builtin-kwork.c builtin-trace.c MANIFEST trace > > builtin-daemon.o builtin-list.c builtin-version.c perf ui > > builtin-data.c builtin-list.o builtin-version.o perf-archive util > > builtin-data.o builtin-lock.c check-headers.sh perf-archive.sh > > builtin-diff.c builtin-mem.c command-list.txt perf.c > > apple_icestorm_pmu/cycles/: -1: 1035101 469125 469125 > > apple_firestorm_pmu/cycles/: -1: 1035035 469125 469125 > > cycles: -1: 1034653 469125 469125 > > apple_icestorm_pmu/cycles/: 1035101 469125 469125 > > apple_firestorm_pmu/cycles/: 1035035 469125 469125 > > cycles: 1034653 469125 469125 > > > > Performance counter stats for 'ls': > > > > 1,035,101 apple_icestorm_pmu/cycles/ > > 1,035,035 apple_firestorm_pmu/cycles/ > > 1,034,653 cycles > > > > 0.000001333 seconds 
time elapsed
> >
> > 0.000000000 seconds user
> > 0.000000000 seconds sys
> > </quote>
>
> ... and in this case the workload was run on a CPU affine to that arbitrary
> PMU, hence we managed to count.
>
> So AFAICT, this is a userspace bug, maybe related to the way we probe for
> supported PMU features?

Probing PMU features is done by trying to perf_event_open events. For
extended types it is a cycles event on each core PMU:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n532

The is_event_supported logic is here:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232

There is the following comment:

    if (open_return == -EACCES) {
            /*
             * This happens if the paranoid value
             * /proc/sys/kernel/perf_event_paranoid is set to 2
             * Re-run with exclude_kernel set; we don't do that
             * by default as some ARM machines do not support it.
             *
             */

Thanks,
Ian

> Thanks,
> Mark.

^ permalink raw reply [flat|nested] 53+ messages in thread
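For readers following the quoted trace: the probe attr shows type 0 (PERF_TYPE_HARDWARE) with config 0xb00000000. The following is a minimal sketch of how that value splits, assuming the layout described in the thread (generic hardware event ID in config[31:0], PMU type ID in config[63:32]). The constant names mirror the kernel's uapi header; the helper function names are hypothetical and exist only for illustration.

```python
# Values mirroring include/uapi/linux/perf_event.h.
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_CPU_CYCLES = 0
PERF_PMU_TYPE_SHIFT = 32
PERF_HW_EVENT_MASK = 0xFFFFFFFF


def encode_extended_config(pmu_type, hw_event):
    """Hypothetical helper: pack a PMU type ID and a generic HW event ID."""
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | (hw_event & PERF_HW_EVENT_MASK)


def decode_extended_config(config):
    """Hypothetical helper: split config into (pmu_type, hw_event)."""
    return config >> PERF_PMU_TYPE_SHIFT, config & PERF_HW_EVENT_MASK


if __name__ == "__main__":
    # The probe event from the trace: cycles directed at PMU type 0xb.
    pmu_type, hw_event = decode_extended_config(0xB00000000)
    print(hex(pmu_type), hw_event)
```

A config of plain 0, as seen in the later "Opening:" dumps, decodes to PMU type 0, i.e. no specific PMU was requested.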
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-23 15:14 ` Ian Rogers @ 2023-11-23 16:48 ` Mark Rutland 2023-11-23 17:08 ` James Clark 0 siblings, 1 reply; 53+ messages in thread From: Mark Rutland @ 2023-11-23 16:48 UTC (permalink / raw) To: Ian Rogers Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, James Clark, linux-perf-users, LKML, Asahi Linux On Thu, Nov 23, 2023 at 07:14:21AM -0800, Ian Rogers wrote: > On Thu, Nov 23, 2023 at 6:23 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: > > > On Tue, 21 Nov 2023 13:40:31 +0000, > > > Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > [Adding key people on Cc] > > > > > > > > On Tue, 21 Nov 2023 12:08:48 +0000, > > > > Hector Martin <marcan@marcan.st> wrote: > > > > > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any > > > > asymmetric ARM platform. It isn't clear what criteria is used to pick > > > > the PMU, but nothing works anymore. > > > > > > > > The saving grace in my case is that Debian still ships a 6.1 perftool > > > > package, but that's obviously not going to last. > > > > > > > > I'm happy to test potential fixes. > > > > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with > > > -vvv. And it is quite entertaining (this is taskset to an 'icestorm' > > > CPU): > > > > Looking at this with fresh(er) eyes, I think there's a userspace bug here, > > regardless of whether one believes it's correct to convert a named-pmu event to > > a PERF_TYPE_HARDWARE event directed at that PMU. 
> > > > It looks like the userspace tool is dropping the extended type ID after an > > initial probe, and requests events with plain PERF_TYPE_HARDWARE (without an > > extended type ID), which explains why we seem to get events from one PMU only. > > > > More detail below... > > > > Marc, if you have time, could you run the same commands (on the same kernel) > > with a perf tool build from v6.4? > > > > > <quote> > > > maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 0 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e > > > apple_firestorm_pmu/cycles/ -e cycles ls > > > Using CPUID 0x00000000612f0280 > > > Attempt to add: apple_icestorm_pmu/cycles=0/ > > > ..after resolving event: apple_icestorm_pmu/cycles=0/ > > > Opening: unknown-hardware:HG > > > ------------------------------------------------------------ > > > perf_event_attr: > > > type 0 (PERF_TYPE_HARDWARE) > > > config 0xb00000000 > > > disabled 1 > > > ------------------------------------------------------------ > > > > Here config[31:0] is 0 (PERF_COUNT_HW_CPU_CYCLES), and config[63:32] is 0xb, > > which is presumably the PMU ID for the apple_icestorm_pmu. > > > > The attr doesn't contain exclude_guest=1, so this will be rejected by the PMU > > driver due to its mode exclusion requirements. > > > > > sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 > > > sys_perf_event_open failed, error -95 > > > > ... which is what we see here (this is EOPNOTSUPP, which __hw_perf_event_init() > > in drivers/perf/arm_pmu.c returns when the mode requested mode exclusion > > options aren't supported). > > > > So far, so good... 
> > > > > Attempt to add: apple_firestorm_pmu/cycles=0/ > > > ..after resolving event: apple_firestorm_pmu/cycles=0/ > > > Control descriptor is not initialized > > > Opening: apple_icestorm_pmu/cycles/ > > > ------------------------------------------------------------ > > > perf_event_attr: > > > type 0 (PERF_TYPE_HARDWARE) > > > size 136 > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > sample_type IDENTIFIER > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > disabled 1 > > > inherit 1 > > > enable_on_exec 1 > > > exclude_guest 1 > > > ------------------------------------------------------------ > > > > ... but here, the extended type ID has been dropped, and this event is no > > longer directed towards the apple_firestorm_pmu PMU, so the kernel can direct > > this to *any* CPU PMU... > > > > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 3 > > > > ... and *some* PMU accepts it. > > > > > Opening: apple_firestorm_pmu/cycles/ > > > ------------------------------------------------------------ > > > perf_event_attr: > > > type 0 (PERF_TYPE_HARDWARE) > > > size 136 > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > sample_type IDENTIFIER > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > disabled 1 > > > inherit 1 > > > enable_on_exec 1 > > > exclude_guest 1 > > > ------------------------------------------------------------ > > > > Likewise here, no extended type ID... > > > > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 > > > Opening: cycles > > > ------------------------------------------------------------ > > > perf_event_attr: > > > type 0 (PERF_TYPE_HARDWARE) > > > size 136 > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > sample_type IDENTIFIER > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > disabled 1 > > > inherit 1 > > > enable_on_exec 1 > > > exclude_guest 1 > > > ------------------------------------------------------------ > > > > Likewise here, no extended type ID... 
> > > > > sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 5 > > > arch builtin-diff.o builtin-mem.o common-cmds.h perf-completion.sh > > > bench builtin-evlist.c builtin-probe.c CREDITS perf.h > > > Build builtin-evlist.o builtin-probe.o design.txt perf-in.o > > > builtin-annotate.c builtin-ftrace.c builtin-record.c dlfilters perf-iostat > > > builtin-annotate.o builtin-ftrace.o builtin-record.o Documentation perf-iostat.sh > > > builtin-bench.c builtin.h builtin-report.c FEATURE-DUMP perf.o > > > builtin-bench.o builtin-help.c builtin-report.o include perf-read-vdso.c > > > builtin-buildid-cache.c builtin-help.o builtin-sched.c jvmti perf-sys.h > > > builtin-buildid-cache.o builtin-inject.c builtin-script.c libapi PERF-VERSION-FILE > > > builtin-buildid-list.c builtin-inject.o builtin-script.o libperf perf-with-kcore > > > builtin-buildid-list.o builtin-kallsyms.c builtin-stat.c libsubcmd pmu-events > > > builtin-c2c.c builtin-kallsyms.o builtin-stat.o libsymbol python > > > builtin-c2c.o builtin-kmem.c builtin-timechart.c Makefile python_ext_build > > > builtin-config.c builtin-kvm.c builtin-top.c Makefile.config scripts > > > builtin-config.o builtin-kvm.o builtin-top.o Makefile.perf tests > > > builtin-daemon.c builtin-kwork.c builtin-trace.c MANIFEST trace > > > builtin-daemon.o builtin-list.c builtin-version.c perf ui > > > builtin-data.c builtin-list.o builtin-version.o perf-archive util > > > builtin-data.o builtin-lock.c check-headers.sh perf-archive.sh > > > builtin-diff.c builtin-mem.c command-list.txt perf.c > > > apple_icestorm_pmu/cycles/: -1: 0 873709 0 > > > apple_firestorm_pmu/cycles/: -1: 0 873709 0 > > > cycles: -1: 0 873709 0 > > > apple_icestorm_pmu/cycles/: 0 873709 0 > > > apple_firestorm_pmu/cycles/: 0 873709 0 > > > cycles: 0 873709 0 > > > > > > Performance counter stats for 'ls': > > > > > > <not counted> apple_icestorm_pmu/cycles/ (0.00%) > > > <not counted> apple_firestorm_pmu/cycles/ (0.00%) > > > <not counted> cycles 
(0.00%) > > > > > > 0.000002250 seconds time elapsed > > > > > > 0.000000000 seconds user > > > 0.000000000 seconds sys > > > > So it looks like the tool has expanded the requested > > 'apple_icestorm_pmu/cycles/' event into three cycles events, each opened > > without an extended type ID. > > > > AFAICT, the kernel has done exactly what it has always done for > > PERF_TYPE_HARDWARE/PERF_COUNT_HW_CPU_CYCLES events: pick the first PMU which > > said it can handle them. > > > > > If I run the same thing on another CPU cluster (firestorm), I get > > > this: > > > > > > <quote> > > > maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 2 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e > > > apple_firestorm_pmu/cycles/ -e cycles ls > > > Using CPUID 0x00000000612f0280 > > > Attempt to add: apple_icestorm_pmu/cycles=0/ > > > ..after resolving event: apple_icestorm_pmu/cycles=0/ > > > Opening: unknown-hardware:HG > > > ------------------------------------------------------------ > > > perf_event_attr: > > > type 0 (PERF_TYPE_HARDWARE) > > > config 0xb00000000 > > > disabled 1 > > > ------------------------------------------------------------ > > > sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 > > > sys_perf_event_open failed, error -95 > > > > Again, we see one request with an extended type ID, which fails due to mode exclusion requirements... 
> > > > > Attempt to add: apple_firestorm_pmu/cycles=0/ > > > ..after resolving event: apple_firestorm_pmu/cycles=0/ > > > Control descriptor is not initialized > > > Opening: apple_icestorm_pmu/cycles/ > > > ------------------------------------------------------------ > > > perf_event_attr: > > > type 0 (PERF_TYPE_HARDWARE) > > > size 136 > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > sample_type IDENTIFIER > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > disabled 1 > > > inherit 1 > > > enable_on_exec 1 > > > exclude_guest 1 > > > ------------------------------------------------------------ > > > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 3 > > > Opening: apple_firestorm_pmu/cycles/ > > > ------------------------------------------------------------ > > > perf_event_attr: > > > type 0 (PERF_TYPE_HARDWARE) > > > size 136 > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > sample_type IDENTIFIER > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > disabled 1 > > > inherit 1 > > > enable_on_exec 1 > > > exclude_guest 1 > > > ------------------------------------------------------------ > > > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 4 > > > Opening: cycles > > > ------------------------------------------------------------ > > > perf_event_attr: > > > type 0 (PERF_TYPE_HARDWARE) > > > size 136 > > > config 0 (PERF_COUNT_HW_CPU_CYCLES) > > > sample_type IDENTIFIER > > > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > > > disabled 1 > > > inherit 1 > > > enable_on_exec 1 > > > exclude_guest 1 > > > ------------------------------------------------------------ > > > > ... but all subsequent requests do not have an extended type ID, and the kernel > > directs these to whichever PMU accepts the event first... 
> > > > > sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 5 > > > arch builtin-diff.o builtin-mem.o common-cmds.h perf-completion.sh > > > bench builtin-evlist.c builtin-probe.c CREDITS perf.h > > > Build builtin-evlist.o builtin-probe.o design.txt perf-in.o > > > builtin-annotate.c builtin-ftrace.c builtin-record.c dlfilters perf-iostat > > > builtin-annotate.o builtin-ftrace.o builtin-record.o Documentation perf-iostat.sh > > > builtin-bench.c builtin.h builtin-report.c FEATURE-DUMP perf.o > > > builtin-bench.o builtin-help.c builtin-report.o include perf-read-vdso.c > > > builtin-buildid-cache.c builtin-help.o builtin-sched.c jvmti perf-sys.h > > > builtin-buildid-cache.o builtin-inject.c builtin-script.c libapi PERF-VERSION-FILE > > > builtin-buildid-list.c builtin-inject.o builtin-script.o libperf perf-with-kcore > > > builtin-buildid-list.o builtin-kallsyms.c builtin-stat.c libsubcmd pmu-events > > > builtin-c2c.c builtin-kallsyms.o builtin-stat.o libsymbol python > > > builtin-c2c.o builtin-kmem.c builtin-timechart.c Makefile python_ext_build > > > builtin-config.c builtin-kvm.c builtin-top.c Makefile.config scripts > > > builtin-config.o builtin-kvm.o builtin-top.o Makefile.perf tests > > > builtin-daemon.c builtin-kwork.c builtin-trace.c MANIFEST trace > > > builtin-daemon.o builtin-list.c builtin-version.c perf ui > > > builtin-data.c builtin-list.o builtin-version.o perf-archive util > > > builtin-data.o builtin-lock.c check-headers.sh perf-archive.sh > > > builtin-diff.c builtin-mem.c command-list.txt perf.c > > > apple_icestorm_pmu/cycles/: -1: 1035101 469125 469125 > > > apple_firestorm_pmu/cycles/: -1: 1035035 469125 469125 > > > cycles: -1: 1034653 469125 469125 > > > apple_icestorm_pmu/cycles/: 1035101 469125 469125 > > > apple_firestorm_pmu/cycles/: 1035035 469125 469125 > > > cycles: 1034653 469125 469125 > > > > > > Performance counter stats for 'ls': > > > > > > 1,035,101 apple_icestorm_pmu/cycles/ > > > 1,035,035 
apple_firestorm_pmu/cycles/
> > > 1,034,653 cycles
> > >
> > > 0.000001333 seconds time elapsed
> > >
> > > 0.000000000 seconds user
> > > 0.000000000 seconds sys
> > > </quote>
> >
> > ... and in this case the workload was run on a CPU affine to that arbitrary
> > PMU, hence we managed to count.
> >
> > So AFAICT, this is a userspace bug, maybe related to the way we probe for
> > supported PMU features?
>
> Probing PMU features is done by trying to perf_event_open events. For
> extended types it is a cycles event on each core PMU:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n532
>
> The is_event_supported logic is here:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232

Ah, so IIUC what's happening is:

1) Userspace tries to detect extended type support, with a cycles event
   directed to one of the CPU PMUs. The attr for this does not have
   exclude_guest set.

2) In the kernel, the core perf code sees the extended hw type id, and directs
   this towards the correct PMU (apple_icestorm_pmu).

3) The PMU driver looks at the attr, sees exclude_guest is not set, and returns
   -EOPNOTSUPP, exactly as it would regardless of whether the extended hw type
   is used.

   Note: this happens to be a difference between x86 PMUs and the apple_* PMUs,
   but this is a legitimate part of the perf ABI, not an arm-specific quirk or
   bug.

4) Userspace receives -EOPNOTSUPP, and so decides the extended hw type is not
   supported (even though the kernel does support the extended hw type id, and
   the event was rejected for orthogonal reasons).

5) Userspace avoids the extended hw type, but still uses PERF_TYPE_HARDWARE
   events for named-pmu events.

Does that sound plausible to you, or have I misunderstood?

From Marc's reply at:

  https://lore.kernel.org/lkml/86edggzfxx.wl-maz@kernel.org/

...
with perf built from v6.4, the perf tool can open named pmu events without
issue, and sets exclude_guest in the attr. So it seems like there's a mismatch
between regular opening of events and probing for extended hw type that causes
that to differ.

AFAICT, the kernel is doing the right thing here, but the userspace detection
of extended type id support happens to differ from regular event opening, and
misinterprets -EOPNOTSUPP as "the kernel doesn't support extended type IDs"
rather than "the kernel was able to consume the extended type ID, but the
specific PMU targeted said it doesn't support this attr".

IIUC that means this'll be broken on older kernels (those before the extended
hw type id support was introduced), too?

It sounds like we need to make (4) more robust? I'm not immediately sure how,
given the rat's nest of returns in perf_event_open(), but I'm happy to try to
help with that.

It also seems like (5) is a problem regardless. If the user asks for a named
PMU event on an older kernel (before the extended hw type id was a thing), and
the tool converts that to a plain PERF_TYPE_HARDWARE event, it's liable to be
handled by a different PMU than the one the user asked for.

Thanks,
Mark.

^ permalink raw reply [flat|nested] 53+ messages in thread
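The probe-versus-open mismatch described in this message can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_pmu_event_init` and `toy_probe_extended_type` are hypothetical names, the "driver" is reduced to the single check discussed in the thread (reject when exclude_guest is not set), and none of this is the actual perf tool or kernel code.

```python
import errno

PERF_TYPE_HARDWARE = 0
PERF_PMU_TYPE_SHIFT = 32


def toy_pmu_event_init(attr):
    """Toy stand-in for a PMU driver with mode exclusion requirements.

    Assumption taken from the thread: the event is rejected with
    -EOPNOTSUPP unless exclude_guest is set, whether or not an
    extended type ID is present in config.
    """
    if not attr.get("exclude_guest", False):
        return -errno.EOPNOTSUPP
    return 0


def toy_probe_extended_type(pmu_type, exclude_guest):
    """Toy probe: open a cycles event directed at one PMU.

    Returns True when the open succeeds. A tool that treats any
    failure as "extended types unsupported" conflates the two
    meanings of -EOPNOTSUPP described above.
    """
    attr = {
        "type": PERF_TYPE_HARDWARE,
        "config": pmu_type << PERF_PMU_TYPE_SHIFT,  # cycles on that PMU
        "exclude_guest": exclude_guest,
    }
    return toy_pmu_event_init(attr) == 0


if __name__ == "__main__":
    # Probe as in the trace (no exclude_guest): wrongly looks unsupported.
    print(toy_probe_extended_type(0xB, exclude_guest=False))
    # Probe with the same attr bits as a regular open: succeeds.
    print(toy_probe_extended_type(0xB, exclude_guest=True))
```

In this model the only difference between the failing probe and a succeeding open is the exclude_guest bit, which matches the observation that the v6.4 tool sets exclude_guest when opening named-pmu events.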
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-23 16:48 ` Mark Rutland @ 2023-11-23 17:08 ` James Clark 2023-11-23 17:15 ` Mark Rutland 0 siblings, 1 reply; 53+ messages in thread From: James Clark @ 2023-11-23 17:08 UTC (permalink / raw) To: Mark Rutland, Ian Rogers Cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, linux-perf-users, LKML, Asahi Linux On 23/11/2023 16:48, Mark Rutland wrote: > On Thu, Nov 23, 2023 at 07:14:21AM -0800, Ian Rogers wrote: >> On Thu, Nov 23, 2023 at 6:23 AM Mark Rutland <mark.rutland@arm.com> wrote: >>> >>> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote: >>>> On Tue, 21 Nov 2023 13:40:31 +0000, >>>> Marc Zyngier <maz@kernel.org> wrote: >>>>> >>>>> [Adding key people on Cc] >>>>> >>>>> On Tue, 21 Nov 2023 12:08:48 +0000, >>>>> Hector Martin <marcan@marcan.st> wrote: >>>>>> >>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and >>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. >>>>> >>>>> I can confirm that at least on 6.7-rc2, perf is pretty busted on any >>>>> asymmetric ARM platform. It isn't clear what criteria is used to pick >>>>> the PMU, but nothing works anymore. >>>>> >>>>> The saving grace in my case is that Debian still ships a 6.1 perftool >>>>> package, but that's obviously not going to last. >>>>> >>>>> I'm happy to test potential fixes. >>>> >>>> At Mark's request, I've dumped a couple of perf (as of -rc2) runs with >>>> -vvv. And it is quite entertaining (this is taskset to an 'icestorm' >>>> CPU): >>> >>> Looking at this with fresh(er) eyes, I think there's a userspace bug here, >>> regardless of whether one believes it's correct to convert a named-pmu event to >>> a PERF_TYPE_HARDWARE event directed at that PMU. 
>>> >>> It looks like the userspace tool is dropping the extended type ID after an >>> initial probe, and requests events with plain PERF_TYPE_HARDWARE (without an >>> extended type ID), which explains why we seem to get events from one PMU only. >>> >>> More detail below... >>> >>> Marc, if you have time, could you run the same commands (on the same kernel) >>> with a perf tool build from v6.4? >>> >>>> <quote> >>>> maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 0 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e >>>> apple_firestorm_pmu/cycles/ -e cycles ls >>>> Using CPUID 0x00000000612f0280 >>>> Attempt to add: apple_icestorm_pmu/cycles=0/ >>>> ..after resolving event: apple_icestorm_pmu/cycles=0/ >>>> Opening: unknown-hardware:HG >>>> ------------------------------------------------------------ >>>> perf_event_attr: >>>> type 0 (PERF_TYPE_HARDWARE) >>>> config 0xb00000000 >>>> disabled 1 >>>> ------------------------------------------------------------ >>> >>> Here config[31:0] is 0 (PERF_COUNT_HW_CPU_CYCLES), and config[63:32] is 0xb, >>> which is presumably the PMU ID for the apple_icestorm_pmu. >>> >>> The attr doesn't contain exclude_guest=1, so this will be rejected by the PMU >>> driver due to its mode exclusion requirements. >>> >>>> sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 >>>> sys_perf_event_open failed, error -95 >>> >>> ... which is what we see here (this is EOPNOTSUPP, which __hw_perf_event_init() >>> in drivers/perf/arm_pmu.c returns when the mode requested mode exclusion >>> options aren't supported). >>> >>> So far, so good... 
>>> >>>> Attempt to add: apple_firestorm_pmu/cycles=0/ >>>> ..after resolving event: apple_firestorm_pmu/cycles=0/ >>>> Control descriptor is not initialized >>>> Opening: apple_icestorm_pmu/cycles/ >>>> ------------------------------------------------------------ >>>> perf_event_attr: >>>> type 0 (PERF_TYPE_HARDWARE) >>>> size 136 >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) >>>> sample_type IDENTIFIER >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING >>>> disabled 1 >>>> inherit 1 >>>> enable_on_exec 1 >>>> exclude_guest 1 >>>> ------------------------------------------------------------ >>> >>> ... but here, the extended type ID has been dropped, and this event is no >>> longer directed towards the apple_firestorm_pmu PMU, so the kernel can direct >>> this to *any* CPU PMU... >>> >>>> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 3 >>> >>> ... and *some* PMU accepts it. >>> >>>> Opening: apple_firestorm_pmu/cycles/ >>>> ------------------------------------------------------------ >>>> perf_event_attr: >>>> type 0 (PERF_TYPE_HARDWARE) >>>> size 136 >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) >>>> sample_type IDENTIFIER >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING >>>> disabled 1 >>>> inherit 1 >>>> enable_on_exec 1 >>>> exclude_guest 1 >>>> ------------------------------------------------------------ >>> >>> Likewise here, no extended type ID... >>> >>>> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 4 >>>> Opening: cycles >>>> ------------------------------------------------------------ >>>> perf_event_attr: >>>> type 0 (PERF_TYPE_HARDWARE) >>>> size 136 >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) >>>> sample_type IDENTIFIER >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING >>>> disabled 1 >>>> inherit 1 >>>> enable_on_exec 1 >>>> exclude_guest 1 >>>> ------------------------------------------------------------ >>> >>> Likewise here, no extended type ID... 
>>> >>>> sys_perf_event_open: pid 1045843 cpu -1 group_fd -1 flags 0x8 = 5 >>>> arch builtin-diff.o builtin-mem.o common-cmds.h perf-completion.sh >>>> bench builtin-evlist.c builtin-probe.c CREDITS perf.h >>>> Build builtin-evlist.o builtin-probe.o design.txt perf-in.o >>>> builtin-annotate.c builtin-ftrace.c builtin-record.c dlfilters perf-iostat >>>> builtin-annotate.o builtin-ftrace.o builtin-record.o Documentation perf-iostat.sh >>>> builtin-bench.c builtin.h builtin-report.c FEATURE-DUMP perf.o >>>> builtin-bench.o builtin-help.c builtin-report.o include perf-read-vdso.c >>>> builtin-buildid-cache.c builtin-help.o builtin-sched.c jvmti perf-sys.h >>>> builtin-buildid-cache.o builtin-inject.c builtin-script.c libapi PERF-VERSION-FILE >>>> builtin-buildid-list.c builtin-inject.o builtin-script.o libperf perf-with-kcore >>>> builtin-buildid-list.o builtin-kallsyms.c builtin-stat.c libsubcmd pmu-events >>>> builtin-c2c.c builtin-kallsyms.o builtin-stat.o libsymbol python >>>> builtin-c2c.o builtin-kmem.c builtin-timechart.c Makefile python_ext_build >>>> builtin-config.c builtin-kvm.c builtin-top.c Makefile.config scripts >>>> builtin-config.o builtin-kvm.o builtin-top.o Makefile.perf tests >>>> builtin-daemon.c builtin-kwork.c builtin-trace.c MANIFEST trace >>>> builtin-daemon.o builtin-list.c builtin-version.c perf ui >>>> builtin-data.c builtin-list.o builtin-version.o perf-archive util >>>> builtin-data.o builtin-lock.c check-headers.sh perf-archive.sh >>>> builtin-diff.c builtin-mem.c command-list.txt perf.c >>>> apple_icestorm_pmu/cycles/: -1: 0 873709 0 >>>> apple_firestorm_pmu/cycles/: -1: 0 873709 0 >>>> cycles: -1: 0 873709 0 >>>> apple_icestorm_pmu/cycles/: 0 873709 0 >>>> apple_firestorm_pmu/cycles/: 0 873709 0 >>>> cycles: 0 873709 0 >>>> >>>> Performance counter stats for 'ls': >>>> >>>> <not counted> apple_icestorm_pmu/cycles/ (0.00%) >>>> <not counted> apple_firestorm_pmu/cycles/ (0.00%) >>>> <not counted> cycles (0.00%) >>>> >>>> 0.000002250 
seconds time elapsed >>>> >>>> 0.000000000 seconds user >>>> 0.000000000 seconds sys >>> >>> So it looks like the tool has expanded the requested >>> 'apple_icestorm_pmu/cycles/' event into three cycles events, each opened >>> without an extended type ID. >>> >>> AFAICT, the kernel has done exactly what it has always done for >>> PERF_TYPE_HARDWARE/PERF_COUNT_HW_CPU_CYCLES events: pick the first PMU which >>> said it can handle them. >>> >>>> If I run the same thing on another CPU cluster (firestorm), I get >>>> this: >>>> >>>> <quote> >>>> maz@valley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 2 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e >>>> apple_firestorm_pmu/cycles/ -e cycles ls >>>> Using CPUID 0x00000000612f0280 >>>> Attempt to add: apple_icestorm_pmu/cycles=0/ >>>> ..after resolving event: apple_icestorm_pmu/cycles=0/ >>>> Opening: unknown-hardware:HG >>>> ------------------------------------------------------------ >>>> perf_event_attr: >>>> type 0 (PERF_TYPE_HARDWARE) >>>> config 0xb00000000 >>>> disabled 1 >>>> ------------------------------------------------------------ >>>> sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 >>>> sys_perf_event_open failed, error -95 >>> >>> Again, we see one request with an extended type ID, which fails due to mode exclusion requirements... 
>>> >>>> Attempt to add: apple_firestorm_pmu/cycles=0/ >>>> ..after resolving event: apple_firestorm_pmu/cycles=0/ >>>> Control descriptor is not initialized >>>> Opening: apple_icestorm_pmu/cycles/ >>>> ------------------------------------------------------------ >>>> perf_event_attr: >>>> type 0 (PERF_TYPE_HARDWARE) >>>> size 136 >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) >>>> sample_type IDENTIFIER >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING >>>> disabled 1 >>>> inherit 1 >>>> enable_on_exec 1 >>>> exclude_guest 1 >>>> ------------------------------------------------------------ >>>> sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 3 >>>> Opening: apple_firestorm_pmu/cycles/ >>>> ------------------------------------------------------------ >>>> perf_event_attr: >>>> type 0 (PERF_TYPE_HARDWARE) >>>> size 136 >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) >>>> sample_type IDENTIFIER >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING >>>> disabled 1 >>>> inherit 1 >>>> enable_on_exec 1 >>>> exclude_guest 1 >>>> ------------------------------------------------------------ >>>> sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 4 >>>> Opening: cycles >>>> ------------------------------------------------------------ >>>> perf_event_attr: >>>> type 0 (PERF_TYPE_HARDWARE) >>>> size 136 >>>> config 0 (PERF_COUNT_HW_CPU_CYCLES) >>>> sample_type IDENTIFIER >>>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING >>>> disabled 1 >>>> inherit 1 >>>> enable_on_exec 1 >>>> exclude_guest 1 >>>> ------------------------------------------------------------ >>> >>> ... but all subsequent requests do not have an extended type ID, and the kernel >>> directs these to whichever PMU accepts the event first... 
>>> >>>> sys_perf_event_open: pid 1045925 cpu -1 group_fd -1 flags 0x8 = 5 >>>> arch builtin-diff.o builtin-mem.o common-cmds.h perf-completion.sh >>>> bench builtin-evlist.c builtin-probe.c CREDITS perf.h >>>> Build builtin-evlist.o builtin-probe.o design.txt perf-in.o >>>> builtin-annotate.c builtin-ftrace.c builtin-record.c dlfilters perf-iostat >>>> builtin-annotate.o builtin-ftrace.o builtin-record.o Documentation perf-iostat.sh >>>> builtin-bench.c builtin.h builtin-report.c FEATURE-DUMP perf.o >>>> builtin-bench.o builtin-help.c builtin-report.o include perf-read-vdso.c >>>> builtin-buildid-cache.c builtin-help.o builtin-sched.c jvmti perf-sys.h >>>> builtin-buildid-cache.o builtin-inject.c builtin-script.c libapi PERF-VERSION-FILE >>>> builtin-buildid-list.c builtin-inject.o builtin-script.o libperf perf-with-kcore >>>> builtin-buildid-list.o builtin-kallsyms.c builtin-stat.c libsubcmd pmu-events >>>> builtin-c2c.c builtin-kallsyms.o builtin-stat.o libsymbol python >>>> builtin-c2c.o builtin-kmem.c builtin-timechart.c Makefile python_ext_build >>>> builtin-config.c builtin-kvm.c builtin-top.c Makefile.config scripts >>>> builtin-config.o builtin-kvm.o builtin-top.o Makefile.perf tests >>>> builtin-daemon.c builtin-kwork.c builtin-trace.c MANIFEST trace >>>> builtin-daemon.o builtin-list.c builtin-version.c perf ui >>>> builtin-data.c builtin-list.o builtin-version.o perf-archive util >>>> builtin-data.o builtin-lock.c check-headers.sh perf-archive.sh >>>> builtin-diff.c builtin-mem.c command-list.txt perf.c >>>> apple_icestorm_pmu/cycles/: -1: 1035101 469125 469125 >>>> apple_firestorm_pmu/cycles/: -1: 1035035 469125 469125 >>>> cycles: -1: 1034653 469125 469125 >>>> apple_icestorm_pmu/cycles/: 1035101 469125 469125 >>>> apple_firestorm_pmu/cycles/: 1035035 469125 469125 >>>> cycles: 1034653 469125 469125 >>>> >>>> Performance counter stats for 'ls': >>>> >>>> 1,035,101 apple_icestorm_pmu/cycles/ >>>> 1,035,035 apple_firestorm_pmu/cycles/ >>>> 1,034,653 
cycles >>>> >>>> 0.000001333 seconds time elapsed >>>> >>>> 0.000000000 seconds user >>>> 0.000000000 seconds sys >>>> </quote> >>> >>> ... and in this case the workload was run on a CPU affine to that arbitrary >>> PMU, hence we managed to count. >>> >>> So AFAICT, this is a userspace bug, maybe related to the way we probe for >>> supported PMU features? >> >> Probing PMU features is done by trying to perf_event_open events. For >> extended types it is a cycles event on each core PMU: >> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n532 >> >> The is_event_supported logic is here: >> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232 > > Ah, so IIUC what's happening is: > > 1) Userspace tries to detect extended type support, with a cycles event > directed to one of the CPU PMUs. The attr for this does not have > exclude_guest set. > > 2) In the kernel, the core perf code sees the extended hw type id, and directs > this towards the correct PMU (apple_icestorm_pmu). > > 3) The PMU driver looks at the attr, sees exclude_guest is not set, and returns > -EOPNOTSUPP, exactly as it would regardless of whether the extended hw type > is used. > > Note: this happens to be a difference between x86 PMUs and the apple_* PMUs, > but this is a legitimate part of the perf ABI, not an arm-specific quirk or > bug. > > 4) Userspace receives -EOPNOTSUPP, and so decides the extended hw type is not > supported (even though the kernel does support the extended hw type id, and > the event was rejected for orthogonal reasons). > > 5) Userspace avoids the extended hw type, but still uses > PERF_TYPE_HARDWARE events for named-pmu events. > > Does that sound plausible to you, or have I misunderstood? > > From Marc's reply at: > > https://lore.kernel.org/lkml/86edggzfxx.wl-maz@kernel.org/ > > ...
with perf built from v6.4, the perf tool can open named pmu events without > issue, and sets exclude_guest in the attr. So it seems like there's a mismatch > between regular opening of events and probing for extended hw type that causes > that to differ. > > AFAICT, the kernel is doing the right thing here, but the userspace detection > of extended type id support happens to differ from regular event opening, and > misinterprets -EOPNOTSUPP as "the kernel doesn't support extended type IDs" > rather than "The kernel was able to consume the extended type ID, but the > specific PMU targeted said it doesn't support this attr". > > IIUC that means this'll be broken on older kernels (those before the extended > hw type id support was introduced), too? > > It sounds like we need to make (4) more robust? I'm not immediately sure how, > given the rat's nest of returns in perf_event_open(), but I'm happy to try to > help with that. It might be worth reporting extended HW ID support in the caps folder of the PMU so that Perf can look there instead of trying to open the event. It's something that we know will always be on or always be off so it doesn't make sense to try to discover it by opening an event. > > It also seems like (5) is a problem regardless. If the user asks for a named > PMU event on an older kernel (before the extended hw type id was a thing), and > the tool converts that to a plain PERF_TYPE_HARDWARE event, it's liable > to be handled by a different PMU than the one the user asked for. > > Thanks, > Mark. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-23 17:08 ` James Clark @ 2023-11-23 17:15 ` Mark Rutland 0 siblings, 0 replies; 53+ messages in thread From: Mark Rutland @ 2023-11-23 17:15 UTC (permalink / raw) To: James Clark Cc: Ian Rogers, Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, linux-perf-users, LKML, Asahi Linux On Thu, Nov 23, 2023 at 05:08:43PM +0000, James Clark wrote: > On 23/11/2023 16:48, Mark Rutland wrote: > > Ah, so IIUC what's happening is: > > > > 1) Userspace tries to detect extended type support, with a cycles event > > directed to one of the CPU PMUs. The attr for this does not have > > exclude_guest set. > > > > 2) In the kernel, the core perf code sees the extended hw type id, and directs > > this towards the correct PMU (apple_icestorm_pmu). > > > > 3) The PMU driver looks at the attr, sees exclude_guest is not set, and returns > > -EOPNOTSUPP, exactly as it would regardless of whether the extended hw type > > is used. > > > > Note: this happens to be a difference between x86 PMUs and the apple_* PMUs, > > but this is a legitimate part of the perf ABI, not an arm-specific quirk or > > bug. > > > > 4) Userspace receives -EOPNOTSUPP, and so decides the extended hw type is not > > supported (even though the kernel does support the extended hw type id, and > > the event was rejected for orthogonal reasons). > > It sounds like we need to make (4) more robust? I'm not immediately sure how, > > given the rat's nest of returns in perf_event_open(), but I'm happy to try to > > help with that. > > It might be worth reporting extended HW ID support in the caps folder of > the PMU so that Perf can look there instead of trying to open the event. > It's something that we know will always be on or always be off so it > doesn't make sense to try to discover it by opening an event. Yep, I'm open to that idea.
I'm more than happy to expose something that indicates "this PMU supports the extended HW ID" and/or "this kernel supports the extended HW ID". Given that the actual PMU drivers don't see the extended cap, and that's handled by the core, I'd like to make the core logic unconditional and remove the kernel-internal PERF_PMU_CAP_EXTENDED_HW_TYPE cap. So I'd lean towards the "this kernel supports the extended HW ID" option. Thanks, Mark. ^ permalink raw reply [flat|nested] 53+ messages in thread
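[Editorial aside for readers following the extended HW type discussion above. This is a minimal sketch, not code from the perf tool or the kernel. It assumes the uapi layout in include/uapi/linux/perf_event.h, where a PERF_TYPE_HARDWARE config carries the target PMU's type in the upper 32 bits (PERF_PMU_TYPE_SHIFT) and the generic hardware event ID in the lower 32 bits (PERF_HW_EVENT_MASK). The helper names are made up for illustration. Under that assumption, the "config 0xb00000000" seen in the verbose output earlier in the thread decodes to PMU type 11 with event 0 (PERF_COUNT_HW_CPU_CYCLES), while the later "config 0" requests carry no PMU type at all, which matches the observation that the kernel then hands them to whichever PMU accepts them first.]

```python
# Constants mirror include/uapi/linux/perf_event.h.
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_CPU_CYCLES = 0
PERF_PMU_TYPE_SHIFT = 32
PERF_HW_EVENT_MASK = 0xFFFFFFFF


def encode_extended_hw_config(pmu_type, hw_event):
    """Hypothetical helper: build a config with an extended type ID.

    pmu_type is the integer a PMU exposes in its sysfs 'type' file;
    hw_event is a PERF_COUNT_HW_* value.
    """
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | (hw_event & PERF_HW_EVENT_MASK)


def decode_extended_hw_config(config):
    """Hypothetical helper: split a config into (pmu_type, hw_event).

    A pmu_type of 0 means no extended type ID was requested.
    """
    return config >> PERF_PMU_TYPE_SHIFT, config & PERF_HW_EVENT_MASK


if __name__ == "__main__":
    # The probe request from the trace: extended type ID present.
    print(decode_extended_hw_config(0xB00000000))
    # The subsequent requests from the trace: plain legacy cycles event.
    print(decode_extended_hw_config(0))
```

[Which named PMU corresponds to type 11 on that machine is not stated in the thread, so the sketch deliberately does not map the number back to a PMU name.]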
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-21 12:08 [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 Hector Martin 2023-11-21 13:40 ` Marc Zyngier @ 2023-11-21 23:43 ` Bagas Sanjaya 2023-12-06 12:09 ` Linux regression tracking #update (Thorsten Leemhuis) 1 sibling, 1 reply; 53+ messages in thread From: Bagas Sanjaya @ 2023-11-21 23:43 UTC (permalink / raw) To: Hector Martin, Linux perf Profiling, Linux Kernel Mailing List Cc: Marc Zyngier, Asahi Linux Mailing List, Ian Rogers, Kan Liang, Arnaldo Carvalho de Melo [-- Attachment #1: Type: text/plain, Size: 2389 bytes --] On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: > Perf broke on all Apple ARM64 systems (tested almost everything), and > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > Test command: > > sudo taskset -c 0 ./perf stat -e apple_icestorm_pmu/cycles/ -e > apple_firestorm_pmu/cycles/ -e cycles ls > > Since this is taskset to CPU #0 (LITTLE core, icestorm), only events for > icestorm are expected. > > I bisected the breakage to two distinct points: > > 5ea8f2ccffb is the first bad commit. With its parent, the output is as > expected (same as v6.4): > > 3,297,462 apple_icestorm_pmu/cycles/ > > <not counted> apple_firestorm_pmu/cycles/ > (0.00%) > <not counted> cycles > (0.00%) > > With 5ea8f2ccffb everything breaks: > > <not supported> apple_icestorm_pmu/cycles/ > > <not supported> apple_firestorm_pmu/cycles/ > > <not counted> cycles > (0.00%) > > Somewhere along the way to 82fe2e45cdb00 things get even worse (didn't > bother bisecting this range). 
With its parent: > > <not supported> apple_icestorm_pmu/cycles/ > > <not supported> apple_firestorm_pmu/cycles/ > > <not supported> apple_icestorm_pmu/cycles/ > > <not supported> apple_firestorm_pmu/cycles/ > > Then 82fe2e45cdb00 leads to the current v6.5 behavior: > > <not counted> apple_icestorm_pmu/cycles/ > (0.00%) > <not counted> apple_firestorm_pmu/cycles/ > (0.00%) > <not counted> cycles > (0.00%) > > If I taskset the task to CPU#2 (big core, firestorm), I get events: > > 1,454,858 apple_icestorm_pmu/cycles/ > > 1,454,760 apple_firestorm_pmu/cycles/ > > 1,454,384 cycles > > > So the current behavior is that all output seems to come from the > firestorm PMU event counter, regardless of requested event. > > This is all unchanged and still broken in v6.7-rc2. > Thanks for the regression report (and it has been handled well already). I'm adding it to regzbot for tracking: #regzbot ^introduced: 5ea8f2ccffb239 -- An old man doll... just what I always wanted! - Clara [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-11-21 23:43 ` Bagas Sanjaya @ 2023-12-06 12:09 ` Linux regression tracking #update (Thorsten Leemhuis) 2024-08-01 19:05 ` Ian Rogers 0 siblings, 1 reply; 53+ messages in thread From: Linux regression tracking #update (Thorsten Leemhuis) @ 2023-12-06 12:09 UTC (permalink / raw) To: Linux perf Profiling, Linux Kernel Mailing List [TLDR: This mail is primarily relevant for Linux kernel regression tracking. See link in footer if these mails annoy you.] On 22.11.23 00:43, Bagas Sanjaya wrote: > On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: >> Perf broke on all Apple ARM64 systems (tested almost everything), and >> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. #regzbot fix: perf parse-events: Make legacy events lower priority than sysfs/JSON #regzbot ignore-activity Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) -- Everything you wanna know about Linux kernel regression tracking: https://linux-regtracking.leemhuis.info/about/#tldr That page also explains what to do if mails like this annoy you. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2023-12-06 12:09 ` Linux regression tracking #update (Thorsten Leemhuis) @ 2024-08-01 19:05 ` Ian Rogers 2024-08-07 8:54 ` Thorsten Leemhuis 2025-03-09 21:19 ` Ian Rogers 0 siblings, 2 replies; 53+ messages in thread From: Ian Rogers @ 2024-08-01 19:05 UTC (permalink / raw) To: Linux regressions mailing list, to: Mark Rutland Cc: Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, Asahi Linux On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update (Thorsten Leemhuis) <regressions@leemhuis.info> wrote: > > [TLDR: This mail in primarily relevant for Linux kernel regression > tracking. See link in footer if these mails annoy you.] > > On 22.11.23 00:43, Bagas Sanjaya wrote: > > On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: > >> Perf broke on all Apple ARM64 systems (tested almost everything), and > >> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > #regzbot fix: perf parse-events: Make legacy events lower priority than > sysfs/JSON > #regzbot ignore-activity Note, this is still broken. The patch changed the priority in the case that you do something like: $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark but if you do: $ perf stat -e 'cycles' benchmark then the broken behavior will happen as legacy events have priority over sysfs/json events in that case. To fix this you need to revert: 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy" This causes some testing issues resolved in this unmerged patch series: https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/ There is a bug as the arm_dsu PMU advertises an event called "cycles" and this PMU is present on Ampere systems. 
Reverting the commit above will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") to fix ARM's BIG.little systems (opening a cycles event on all PMUs not just 1) will cause the arm_dsu event to be opened by perf record and fail as the event won't support sampling. The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/ fixes this by only opening the cycles event on core PMUs when choosing default events. Rather than take this patch the revert happened as Linus runs the command "perf record -e cycles:pp" (ie using a specified event and not defaults) and considers it a regression in the perf tool that on an Ampere system to need to do "perf record -e 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e will choose the cycles event correctly and with better precision the pp for systems that support it, but it was still considered a regression in the perf tool so the revert was made to happen. There is a lack of perf testing coverage for ARM, in particular as they choose to do everything in a different way to x86. The patch in question was in the linux-next tree for weeks without issues. ARM/Ampere could fix this by renaming the event from cycles to cpu_cycles, or by following Intel's convention that anything uncore uses the name clockticks rather than cycles. This could break people who rely on an event called arm_dsu/cycles/ but I imagine such people are rare. There has been no progress I'm aware of on renaming the event. Making perf not terminate on opening an event for perf record seems like the most likely workaround as that is at least something under the tool maintainers control. ARM have discussed doing this on the lists: https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/ but since the revert in v6.10 no patches have appeared for the v6.11 merge window. 
Feature work like coresight improvements and ARMv9 are being actively pursued by ARM, but feature work won't resolve this regression. I'm keen to see such patches as there are perf stat fixes reliant on the stacked parse event fixes that are consequently not merged affecting more than just ARM. There is a related discussion that events specified without PMUs should inherently only mean core PMUs. Unfortunately such a change would break uncore events specified without a PMU, for example `perf stat -e data_read -a sleep 1` gathers read memory bandwidth on uncore memory controllers on recent Intel devices. Not specifying a PMU for uncore events is also assumed by perf metrics, so a large number of metrics would need updating to make such a change work. Many existing JSON uncore events specify a PMU in their name like UNC_M2HBM_CMS_CLOCKTICKS and it feels somewhat redundant to have to make that h2hbm/UNC_M2HBM_CMS_CLOCKTICKS/. It is unclear who would pursue fixing all of this, and so it seems not specifying a PMU with an event for perf will keep meaning trying to open the event on all PMUs that advertise such an event. Thanks, Ian > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) > -- > Everything you wanna know about Linux kernel regression tracking: > https://linux-regtracking.leemhuis.info/about/#tldr > That page also explains what to do if mails like this annoy you. > ^ permalink raw reply [flat|nested] 53+ messages in thread
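[Editorial aside on the "event named without a PMU is tried on every PMU that advertises it" behaviour described in the message above. This is a rough sketch, not the perf tool's parser. It assumes only the standard sysfs layout, /sys/bus/event_source/devices/<pmu>/events/<event>, and the function name and the sysfs_root parameter are made up for illustration. It lists which PMUs advertise a given event name, which is a quick way to see whether an uncore PMU such as arm_dsu would also match a bare "cycles" on a given machine.]

```python
import os


def pmus_advertising(event, sysfs_root="/sys/bus/event_source/devices"):
    """Hypothetical helper: names of PMUs with a sysfs event called `event`.

    Only sysfs-advertised events are considered; legacy hardware events
    and JSON events known to the perf tool are out of scope here.
    """
    found = []
    if not os.path.isdir(sysfs_root):
        return found
    for pmu in sorted(os.listdir(sysfs_root)):
        # Each PMU directory may contain an 'events' directory with one
        # file per advertised event name.
        if os.path.isfile(os.path.join(sysfs_root, pmu, "events", event)):
            found.append(pmu)
    return found


if __name__ == "__main__":
    # Output depends entirely on the machine this runs on.
    for name in ("cycles", "cpu_cycles"):
        print(name, pmus_advertising(name))
```

[If a non-core PMU shows up for "cycles", an explicit PMU prefix in the event string, as in the 'armv8_pmuv3_0/cycles/' example above, avoids the ambiguity.]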
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2024-08-01 19:05 ` Ian Rogers @ 2024-08-07 8:54 ` Thorsten Leemhuis 2024-08-14 16:28 ` James Clark 2025-03-09 21:19 ` Ian Rogers 1 sibling, 1 reply; 53+ messages in thread From: Thorsten Leemhuis @ 2024-08-07 8:54 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Ian Rogers, Linux regressions mailing list, to: Mark Rutland On 01.08.24 21:05, Ian Rogers wrote: > On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update > (Thorsten Leemhuis) <regressions@leemhuis.info> wrote: >> >> [TLDR: This mail in primarily relevant for Linux kernel regression >> tracking. See link in footer if these mails annoy you.] >> >> On 22.11.23 00:43, Bagas Sanjaya wrote: >>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: >>>> Perf broke on all Apple ARM64 systems (tested almost everything), and >>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. >> >> #regzbot fix: perf parse-events: Make legacy events lower priority than >> sysfs/JSON >> #regzbot ignore-activity > > Note, this is still broken. Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of this? Or is this a "we are screwed one way or another and someone has to bite the bullet" situation? Ciao, Thorsten > The patch changed the priority in the case > that you do something like: > > $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark > > but if you do: > > $ perf stat -e 'cycles' benchmark > > then the broken behavior will happen as legacy events have priority > over sysfs/json events in that case. 
To fix this you need to revert: > 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware > events over legacy" > > This causes some testing issues resolved in this unmerged patch series: > https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/ > > There is a bug as the arm_dsu PMU advertises an event called "cycles" > and this PMU is present on Ampere systems. Reverting the commit above > will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove > __evlist__add_default") to fix ARM's BIG.little systems (opening a > cycles event on all PMUs not just 1) will cause the arm_dsu event to > be opened by perf record and fail as the event won't support sampling. > > The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/ > fixes this by only opening the cycles event on core PMUs when choosing > default events. > > Rather than take this patch the revert happened as Linus runs the > command "perf record -e cycles:pp" (ie using a specified event and not > defaults) and considers it a regression in the perf tool that on an > Ampere system to need to do "perf record -e > 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e > will choose the cycles event correctly and with better precision the > pp for systems that support it, but it was still considered a > regression in the perf tool so the revert was made to happen. There is > a lack of perf testing coverage for ARM, in particular as they choose > to do everything in a different way to x86. The patch in question was > in the linux-next tree for weeks without issues. > > ARM/Ampere could fix this by renaming the event from cycles to > cpu_cycles, or by following Intel's convention that anything uncore > uses the name clockticks rather than cycles. This could break people > who rely on an event called arm_dsu/cycles/ but I imagine such people > are rare. There has been no progress I'm aware of on renaming the > event. 
> > Making perf not terminate on opening an event for perf record seems > like the most likely workaround as that is at least something under > the tool maintainers control. ARM have discussed doing this on the > lists: > https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/ > but since the revert in v6.10 no patches have appeared for the v6.11 > merge window. Feature work like coresight improvements and ARMv9 are > being actively pursued by ARM, but feature work won't resolve this > regression. > > I'm keen to see such patches as there are perf stat fixes reliant on > the stacked parse event fixes that are consequently not merged > affecting more than just ARM. > > There is a related discussion that events specified without PMUs > should inherently only mean core PMUs. Unfortunately such a change > would break uncore events specified without a PMU, for example `perf > stat -e data_read -a sleep 1` gathers read memory bandwidth on uncore > memory controllers on recent Intel devices. Not specifying a PMU for > uncore events is also assumed by perf metrics, so a large number of > metrics would need updating to make such a change work. Many existing > JSON uncore events specify a PMU in their name like > UNC_M2HBM_CMS_CLOCKTICKS and it feels somewhat redundant to have to > make that h2hbm/UNC_M2HBM_CMS_CLOCKTICKS/. It is unclear who would > pursue fixing all of this, and so it seems not specifying a PMU with > an event for perf will keep meaning trying to open the event on all > PMUs that advertise such an event. > > Thanks, > Ian > >> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) >> -- >> Everything you wanna know about Linux kernel regression tracking: >> https://linux-regtracking.leemhuis.info/about/#tldr >> That page also explains what to do if mails like this annoy you. >> > > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2024-08-07 8:54 ` Thorsten Leemhuis @ 2024-08-14 16:28 ` James Clark 2024-08-14 16:41 ` Arnaldo Carvalho de Melo 2024-08-15 17:29 ` Ian Rogers 0 siblings, 2 replies; 53+ messages in thread From: James Clark @ 2024-08-14 16:28 UTC (permalink / raw) To: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Ian Rogers, Mark Rutland Cc: Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Ian Rogers, Linux regressions mailing list, to: Mark Rutland On 07/08/2024 9:54 am, Thorsten Leemhuis wrote: > On 01.08.24 21:05, Ian Rogers wrote: >> On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update >> (Thorsten Leemhuis) <regressions@leemhuis.info> wrote: >>> >>> [TLDR: This mail in primarily relevant for Linux kernel regression >>> tracking. See link in footer if these mails annoy you.] >>> >>> On 22.11.23 00:43, Bagas Sanjaya wrote: >>>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: >>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and >>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. >>> >>> #regzbot fix: perf parse-events: Make legacy events lower priority than >>> sysfs/JSON >>> #regzbot ignore-activity >> >> Note, this is still broken. > > Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of > this? Or is this a "we are screwed one way or another and someone has to > bite the bullet" situation? > > Ciao, Thorsten > >> The patch changed the priority in the case >> that you do something like: >> >> $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark >> >> but if you do: >> >> $ perf stat -e 'cycles' benchmark >> >> then the broken behavior will happen as legacy events have priority >> over sysfs/json events in that case. 
To fix this you need to revert: >> 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware >> events over legacy" >> >> This causes some testing issues resolved in this unmerged patch series: >> https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/ >> >> There is a bug as the arm_dsu PMU advertises an event called "cycles" >> and this PMU is present on Ampere systems. Reverting the commit above >> will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove >> __evlist__add_default") to fix ARM's BIG.little systems (opening a >> cycles event on all PMUs not just 1) will cause the arm_dsu event to >> be opened by perf record and fail as the event won't support sampling. >> >> The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/ >> fixes this by only opening the cycles event on core PMUs when choosing >> default events. >> >> Rather than take this patch the revert happened as Linus runs the >> command "perf record -e cycles:pp" (ie using a specified event and not >> defaults) and considers it a regression in the perf tool that on an >> Ampere system to need to do "perf record -e >> 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e >> will choose the cycles event correctly and with better precision the >> pp for systems that support it, but it was still considered a >> regression in the perf tool so the revert was made to happen. There is >> a lack of perf testing coverage for ARM, in particular as they choose >> to do everything in a different way to x86. The patch in question was >> in the linux-next tree for weeks without issues. >> >> ARM/Ampere could fix this by renaming the event from cycles to >> cpu_cycles, or by following Intel's convention that anything uncore >> uses the name clockticks rather than cycles. This could break people >> who rely on an event called arm_dsu/cycles/ but I imagine such people >> are rare. 
There has been no progress I'm aware of on renaming the >> event. >> >> Making perf not terminate on opening an event for perf record seems >> like the most likely workaround as that is at least something under >> the tool maintainers control. ARM have discussed doing this on the >> lists: >> https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/ >> but since the revert in v6.10 no patches have appeared for the v6.11 >> merge window. Feature work like coresight improvements and ARMv9 are >> being actively pursued by ARM, but feature work won't resolve this >> regression. >> I got some hardware with the DSU PMU so I'm going to have a go at trying to send some fixes for this. My initial idea was to try incorporate the "not terminate on opening" change as discussed in the link directly above. And then do the revert of the "revert of prefer sysfs/json". FWIW I don't think Juno currently is broken if the kernel supports extended type ID? I could have missed some output in this thread but it seems like it's mostly related to Apple M hardware. I'm also a bit confused why the "supports extended type" check fails there, but maybe the v6.9 commit 25412c036 from Mark is missing? I sent a small fix the other day to make perf stat default arguments work on Juno, and didn't notice anything out of the ordinary: https://lore.kernel.org/linux-perf-users/dac6ad1d-5aca-48b4-9dcb-ff7e54ca43f6@linaro.org/T/#t I agree that change is quite narrow but it does incrementally improve things for the time being. It's possible that it would become redundant if I can just include Ian's change to use strings for Perf stat. Of course I only think I have a handle on the issue right now, seems like it has a lot of moving parts and something else always comes up. If I hit a wall at some point I will come back here. Thanks James ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2024-08-14 16:28 ` James Clark @ 2024-08-14 16:41 ` Arnaldo Carvalho de Melo 2024-08-15 15:15 ` James Clark 2024-08-15 17:29 ` Ian Rogers 1 sibling, 1 reply; 53+ messages in thread From: Arnaldo Carvalho de Melo @ 2024-08-14 16:41 UTC (permalink / raw) To: James Clark Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Ian Rogers, Mark Rutland, Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Linux regressions mailing list On Wed, Aug 14, 2024 at 05:28:42PM +0100, James Clark wrote: > > > On 07/08/2024 9:54 am, Thorsten Leemhuis wrote: > > On 01.08.24 21:05, Ian Rogers wrote: > > > On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update > > > (Thorsten Leemhuis) <regressions@leemhuis.info> wrote: > > > > > > > > [TLDR: This mail in primarily relevant for Linux kernel regression > > > > tracking. See link in footer if these mails annoy you.] > > > > > > > > On 22.11.23 00:43, Bagas Sanjaya wrote: > > > > > On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > > > > #regzbot fix: perf parse-events: Make legacy events lower priority than > > > > sysfs/JSON > > > > #regzbot ignore-activity > > > > > > Note, this is still broken. > > > > Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of > > this? Or is this a "we are screwed one way or another and someone has to > > bite the bullet" situation? 
> > > > Ciao, Thorsten > > > > > The patch changed the priority in the case > > > that you do something like: > > > > > > $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark > > > > > > but if you do: > > > > > > $ perf stat -e 'cycles' benchmark > > > > > > then the broken behavior will happen as legacy events have priority > > > over sysfs/json events in that case. To fix this you need to revert: > > > 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware > > > events over legacy" > > > > > > This causes some testing issues resolved in this unmerged patch series: > > > https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/ > > > > > > There is a bug as the arm_dsu PMU advertises an event called "cycles" > > > and this PMU is present on Ampere systems. Reverting the commit above > > > will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove > > > __evlist__add_default") to fix ARM's BIG.little systems (opening a > > > cycles event on all PMUs not just 1) will cause the arm_dsu event to > > > be opened by perf record and fail as the event won't support sampling. > > > > > > The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/ > > > fixes this by only opening the cycles event on core PMUs when choosing > > > default events. > > > > > > Rather than take this patch the revert happened as Linus runs the > > > command "perf record -e cycles:pp" (ie using a specified event and not > > > defaults) and considers it a regression in the perf tool that on an > > > Ampere system to need to do "perf record -e > > > 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e > > > will choose the cycles event correctly and with better precision the > > > pp for systems that support it, but it was still considered a > > > regression in the perf tool so the revert was made to happen. 
There is > > > a lack of perf testing coverage for ARM, in particular as they choose > > > to do everything in a different way to x86. The patch in question was > > > in the linux-next tree for weeks without issues. > > > > > > ARM/Ampere could fix this by renaming the event from cycles to > > > cpu_cycles, or by following Intel's convention that anything uncore > > > uses the name clockticks rather than cycles. This could break people > > > who rely on an event called arm_dsu/cycles/ but I imagine such people > > > are rare. There has been no progress I'm aware of on renaming the > > > event. > > > > > > Making perf not terminate on opening an event for perf record seems > > > like the most likely workaround as that is at least something under > > > the tool maintainers control. ARM have discussed doing this on the > > > lists: > > > https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/ > > > but since the revert in v6.10 no patches have appeared for the v6.11 > > > merge window. Feature work like coresight improvements and ARMv9 are > > > being actively pursued by ARM, but feature work won't resolve this > > > regression. > > > > > I got some hardware with the DSU PMU so I'm going to have a go at trying to > send some fixes for this. My initial idea was to try incorporate the "not > terminate on opening" change as discussed in the link directly above. And > then do the revert of the "revert of prefer sysfs/json". > > FWIW I don't think Juno currently is broken if the kernel supports extended > type ID? I could have missed some output in this thread but it seems like > it's mostly related to Apple M hardware. I'm also a bit confused why the > "supports extended type" check fails there, but maybe the v6.9 commit > 25412c036 from Mark is missing? 
> > I sent a small fix the other day to make perf stat default arguments work on > Juno, and didn't notice anything out of the ordinary: https://lore.kernel.org/linux-perf-users/dac6ad1d-5aca-48b4-9dcb-ff7e54ca43f6@linaro.org/T/#t > I agree that change is quite narrow but it does incrementally improve things > for the time being. It's possible that it would become redundant if I can > just include Ian's change to use strings for Perf stat. > > Of course I only think I have a handle on the issue right now, seems like it > has a lot of moving parts and something else always comes up. If I hit a > wall at some point I will come back here. Thanks for working on this, hopefully we'll get to a solution that keeps all the expectations expressed in this thread about not breaking existing muscle memory and that allows us to progress on this matter. - Arnaldo ^ permalink raw reply [flat|nested] 53+ messages in thread
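The suggestion quoted above, to open the default cycles event only on core PMUs, depends on telling core PMUs apart from uncore ones such as arm_dsu. A rough sketch of one way to do that from sysfs follows. It assumes that a PMU counts as "core" when it is named "cpu" or when its sysfs directory contains a "cpus" file; I believe this is close to the heuristic the perf tool uses, but it is not taken from this thread. The function name classify_pmus is made up for illustration.

```python
import os


def classify_pmus(sysfs_root="/sys/bus/event_source/devices"):
    """Split the PMUs found under sysfs_root into (core, other) name lists.

    Assumption: a PMU is "core" if it is called "cpu" or exposes a "cpus"
    file. Uncore PMUs (arm_dsu, for example) typically expose "cpumask".
    """
    core, other = [], []
    if not os.path.isdir(sysfs_root):
        # No sysfs tree (non-Linux host, container, wrong path): nothing to report.
        return core, other
    for name in sorted(os.listdir(sysfs_root)):
        pmu_dir = os.path.join(sysfs_root, name)
        if not os.path.isdir(pmu_dir):
            continue
        if name == "cpu" or os.path.exists(os.path.join(pmu_dir, "cpus")):
            core.append(name)
        else:
            other.append(name)
    return core, other


if __name__ == "__main__":
    core_pmus, other_pmus = classify_pmus()
    print("core PMUs: ", core_pmus)
    print("other PMUs:", other_pmus)
```

On a big.LITTLE machine one would expect two core PMUs to be listed; the exact names depend on the platform and kernel.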
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2024-08-14 16:41 ` Arnaldo Carvalho de Melo @ 2024-08-15 15:15 ` James Clark 2024-08-15 15:20 ` James Clark 2024-08-15 15:27 ` Arnaldo Carvalho de Melo 0 siblings, 2 replies; 53+ messages in thread From: James Clark @ 2024-08-15 15:15 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Ian Rogers, Mark Rutland, Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Linux regressions mailing list On 14/08/2024 5:41 pm, Arnaldo Carvalho de Melo wrote: > On Wed, Aug 14, 2024 at 05:28:42PM +0100, James Clark wrote: >> >> >> On 07/08/2024 9:54 am, Thorsten Leemhuis wrote: >>> On 01.08.24 21:05, Ian Rogers wrote: >>>> On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update >>>> (Thorsten Leemhuis) <regressions@leemhuis.info> wrote: >>>>> >>>>> [TLDR: This mail in primarily relevant for Linux kernel regression >>>>> tracking. See link in footer if these mails annoy you.] >>>>> >>>>> On 22.11.23 00:43, Bagas Sanjaya wrote: >>>>>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: >>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and >>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. >>>>> >>>>> #regzbot fix: perf parse-events: Make legacy events lower priority than >>>>> sysfs/JSON >>>>> #regzbot ignore-activity >>>> >>>> Note, this is still broken. >>> >>> Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of >>> this? Or is this a "we are screwed one way or another and someone has to >>> bite the bullet" situation? 
>>> >>> Ciao, Thorsten >>> >>>> The patch changed the priority in the case >>>> that you do something like: >>>> >>>> $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark >>>> >>>> but if you do: >>>> >>>> $ perf stat -e 'cycles' benchmark >>>> >>>> then the broken behavior will happen as legacy events have priority >>>> over sysfs/json events in that case. To fix this you need to revert: >>>> 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware >>>> events over legacy" >>>> >>>> This causes some testing issues resolved in this unmerged patch series: >>>> https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/ >>>> >>>> There is a bug as the arm_dsu PMU advertises an event called "cycles" >>>> and this PMU is present on Ampere systems. Reverting the commit above >>>> will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove >>>> __evlist__add_default") to fix ARM's BIG.little systems (opening a >>>> cycles event on all PMUs not just 1) will cause the arm_dsu event to >>>> be opened by perf record and fail as the event won't support sampling. >>>> >>>> The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/ >>>> fixes this by only opening the cycles event on core PMUs when choosing >>>> default events. >>>> >>>> Rather than take this patch the revert happened as Linus runs the >>>> command "perf record -e cycles:pp" (ie using a specified event and not >>>> defaults) and considers it a regression in the perf tool that on an >>>> Ampere system to need to do "perf record -e >>>> 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e >>>> will choose the cycles event correctly and with better precision the >>>> pp for systems that support it, but it was still considered a >>>> regression in the perf tool so the revert was made to happen. There is >>>> a lack of perf testing coverage for ARM, in particular as they choose >>>> to do everything in a different way to x86. 
The patch in question was >>>> in the linux-next tree for weeks without issues. >>>> >>>> ARM/Ampere could fix this by renaming the event from cycles to >>>> cpu_cycles, or by following Intel's convention that anything uncore >>>> uses the name clockticks rather than cycles. This could break people >>>> who rely on an event called arm_dsu/cycles/ but I imagine such people >>>> are rare. There has been no progress I'm aware of on renaming the >>>> event. >>>> >>>> Making perf not terminate on opening an event for perf record seems >>>> like the most likely workaround as that is at least something under >>>> the tool maintainers control. ARM have discussed doing this on the >>>> lists: >>>> https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/ >>>> but since the revert in v6.10 no patches have appeared for the v6.11 >>>> merge window. Feature work like coresight improvements and ARMv9 are >>>> being actively pursued by ARM, but feature work won't resolve this >>>> regression. >>>> >> >> I got some hardware with the DSU PMU so I'm going to have a go at trying to >> send some fixes for this. My initial idea was to try incorporate the "not >> terminate on opening" change as discussed in the link directly above. And >> then do the revert of the "revert of prefer sysfs/json". >> >> FWIW I don't think Juno currently is broken if the kernel supports extended >> type ID? I could have missed some output in this thread but it seems like >> it's mostly related to Apple M hardware. I'm also a bit confused why the >> "supports extended type" check fails there, but maybe the v6.9 commit >> 25412c036 from Mark is missing? >> >> I sent a small fix the other day to make perf stat default arguments work on >> Juno, and didn't notice anything out of the ordinary: https://lore.kernel.org/linux-perf-users/dac6ad1d-5aca-48b4-9dcb-ff7e54ca43f6@linaro.org/T/#t >> I agree that change is quite narrow but it does incrementally improve things >> for the time being. 
It's possible that it would become redundant if I can >> just include Ian's change to use strings for Perf stat. >> >> Of course I only think I have a handle on the issue right now, seems like it >> has a lot of moving parts and something else always comes up. If I hit a >> wall at some point I will come back here. > > Thanks for working on this, hopefully we'll get to a solution that keeps > all the expectations expressed in this thread about not breaking > existing muscle memory and that allows us to progress on this matter. > > - Arnaldo Hi Arnaldo, In one of your investigations here https://lore.kernel.org/lkml/Zld3dlJHjFMFG02v@x1/ comparing "cycles", "cpu-cycles" and "cpu_cycles" events on Arm you say only some of them open events on both core types. I wasn't able to reproduce that on perf-tools-next (27ac597c0e) or v6.9 (a38297e3fb) for perf record or stat. I guessed the 6.9 tag because you only mentioned it was on tip and it was 29th May. For me they all open exactly the same two legacy events with the extended type ID set. It looks like the behavior you see would be caused by either missing this kernel change: 5c81672865 ("arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability") (v6.6 release) Or this userspace change, but unlikely as it was a fix for Apple M hardware: 25412c036 ("perf print-events: make is_event_supported() more robust") (v6.9 release) Do you remember if you were using a new kernel or only testing a new Perf? Or if you don't mind could you re-test? Hopefully not to derail the discussion but I just want to make sure I'm not missing some other third issue before I start hacking away. I believe we still need to revert the revert of the JSON/legacy change. Because as Mark mentions there is no guarantee that a PMU's named event is the same as a legacy event of the same name, so we do want to prefer sysfs/JSON. 
There are some other edge cases like new Perf on an old kernel before we added extended type support, but I don't think I'll list all of them. Having said that, I believe that currently all the sysfs and legacy events actually _are_ the same. So it's not a user facing issue _yet_, or at least on any hardware mentioned in these threads. Thanks James ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2024-08-15 15:15 ` James Clark @ 2024-08-15 15:20 ` James Clark 2024-08-15 15:27 ` Arnaldo Carvalho de Melo 1 sibling, 0 replies; 53+ messages in thread From: James Clark @ 2024-08-15 15:20 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Ian Rogers, Mark Rutland, Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Linux regressions mailing list On 15/08/2024 4:15 pm, James Clark wrote: > > > On 14/08/2024 5:41 pm, Arnaldo Carvalho de Melo wrote: >> On Wed, Aug 14, 2024 at 05:28:42PM +0100, James Clark wrote: >>> >>> >>> On 07/08/2024 9:54 am, Thorsten Leemhuis wrote: >>>> On 01.08.24 21:05, Ian Rogers wrote: >>>>> On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update >>>>> (Thorsten Leemhuis) <regressions@leemhuis.info> wrote: >>>>>> >>>>>> [TLDR: This mail in primarily relevant for Linux kernel regression >>>>>> tracking. See link in footer if these mails annoy you.] >>>>>> >>>>>> On 22.11.23 00:43, Bagas Sanjaya wrote: >>>>>>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: >>>>>>>> Perf broke on all Apple ARM64 systems (tested almost >>>>>>>> everything), and >>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) >>>>>>>> since v6.5. >>>>>> >>>>>> #regzbot fix: perf parse-events: Make legacy events lower priority >>>>>> than >>>>>> sysfs/JSON >>>>>> #regzbot ignore-activity >>>>> >>>>> Note, this is still broken. >>>> >>>> Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of >>>> this? Or is this a "we are screwed one way or another and someone >>>> has to >>>> bite the bullet" situation? 
>>>> >>>> Ciao, Thorsten >>>> >>>>> The patch changed the priority in the case >>>>> that you do something like: >>>>> >>>>> $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark >>>>> >>>>> but if you do: >>>>> >>>>> $ perf stat -e 'cycles' benchmark >>>>> >>>>> then the broken behavior will happen as legacy events have priority >>>>> over sysfs/json events in that case. To fix this you need to revert: >>>>> 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware >>>>> events over legacy" >>>>> >>>>> This causes some testing issues resolved in this unmerged patch >>>>> series: >>>>> https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/ >>>>> >>>>> There is a bug as the arm_dsu PMU advertises an event called "cycles" >>>>> and this PMU is present on Ampere systems. Reverting the commit above >>>>> will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove >>>>> __evlist__add_default") to fix ARM's BIG.little systems (opening a >>>>> cycles event on all PMUs not just 1) will cause the arm_dsu event to >>>>> be opened by perf record and fail as the event won't support sampling. >>>>> >>>>> The patch >>>>> https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/ >>>>> fixes this by only opening the cycles event on core PMUs when choosing >>>>> default events. >>>>> >>>>> Rather than take this patch the revert happened as Linus runs the >>>>> command "perf record -e cycles:pp" (ie using a specified event and not >>>>> defaults) and considers it a regression in the perf tool that on an >>>>> Ampere system to need to do "perf record -e >>>>> 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e >>>>> will choose the cycles event correctly and with better precision the >>>>> pp for systems that support it, but it was still considered a >>>>> regression in the perf tool so the revert was made to happen. 
There is >>>>> a lack of perf testing coverage for ARM, in particular as they choose >>>>> to do everything in a different way to x86. The patch in question was >>>>> in the linux-next tree for weeks without issues. >>>>> >>>>> ARM/Ampere could fix this by renaming the event from cycles to >>>>> cpu_cycles, or by following Intel's convention that anything uncore >>>>> uses the name clockticks rather than cycles. This could break people >>>>> who rely on an event called arm_dsu/cycles/ but I imagine such people >>>>> are rare. There has been no progress I'm aware of on renaming the >>>>> event. >>>>> >>>>> Making perf not terminate on opening an event for perf record seems >>>>> like the most likely workaround as that is at least something under >>>>> the tool maintainers control. ARM have discussed doing this on the >>>>> lists: >>>>> https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/ >>>>> but since the revert in v6.10 no patches have appeared for the v6.11 >>>>> merge window. Feature work like coresight improvements and ARMv9 are >>>>> being actively pursued by ARM, but feature work won't resolve this >>>>> regression. >>>>> >>> >>> I got some hardware with the DSU PMU so I'm going to have a go at >>> trying to >>> send some fixes for this. My initial idea was to try incorporate the >>> "not >>> terminate on opening" change as discussed in the link directly above. >>> And >>> then do the revert of the "revert of prefer sysfs/json". >>> >>> FWIW I don't think Juno currently is broken if the kernel supports >>> extended >>> type ID? I could have missed some output in this thread but it seems >>> like >>> it's mostly related to Apple M hardware. I'm also a bit confused why the >>> "supports extended type" check fails there, but maybe the v6.9 commit >>> 25412c036 from Mark is missing? 
>>>
>>> I sent a small fix the other day to make perf stat default arguments
>>> work on
>>> Juno, and didn't notice anything out of the ordinary:
>>> https://lore.kernel.org/linux-perf-users/dac6ad1d-5aca-48b4-9dcb-ff7e54ca43f6@linaro.org/T/#t
>>> I agree that change is quite narrow but it does incrementally improve
>>> things
>>> for the time being. It's possible that it would become redundant if I
>>> can
>>> just include Ian's change to use strings for Perf stat.
>>>
>>> Of course I only think I have a handle on the issue right now, seems
>>> like it
>>> has a lot of moving parts and something else always comes up. If I hit a
>>> wall at some point I will come back here.
>>
>> Thanks for working on this, hopefully we'll get to a solution that keeps
>> all the expectations expressed in this thread about not breaking
>> existing muscle memory and that allows us to progress on this matter.
>>
>> - Arnaldo
>
> Hi Arnaldo,
>
> In one of your investigations here
> https://lore.kernel.org/lkml/Zld3dlJHjFMFG02v@x1/ comparing "cycles",
> "cpu-cycles" and "cpu_cycles" events on Arm you say only some of them
> open events on both core types. I wasn't able to reproduce that on
> perf-tools-next (27ac597c0e) or v6.9 (a38297e3fb) for perf record or
> stat. I guessed the 6.9 tag because you only mentioned it was on tip and
> it was 29th May. For me they all open exactly the same two legacy events
> with the extended type ID set.

Minor correction, one opens using the PMU type rather than a legacy
event with extended type ID. But importantly they do all open on both
CPU types.
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2024-08-15 15:15 ` James Clark 2024-08-15 15:20 ` James Clark @ 2024-08-15 15:27 ` Arnaldo Carvalho de Melo 2024-08-15 15:53 ` Arnaldo Carvalho de Melo 1 sibling, 1 reply; 53+ messages in thread From: Arnaldo Carvalho de Melo @ 2024-08-15 15:27 UTC (permalink / raw) To: James Clark Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Ian Rogers, Mark Rutland, Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Linux regressions mailing list On Thu, Aug 15, 2024 at 04:15:41PM +0100, James Clark wrote: > > > On 14/08/2024 5:41 pm, Arnaldo Carvalho de Melo wrote: > > On Wed, Aug 14, 2024 at 05:28:42PM +0100, James Clark wrote: > > > > > > > > > On 07/08/2024 9:54 am, Thorsten Leemhuis wrote: > > > > On 01.08.24 21:05, Ian Rogers wrote: > > > > > On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update > > > > > (Thorsten Leemhuis) <regressions@leemhuis.info> wrote: > > > > > > > > > > > > [TLDR: This mail in primarily relevant for Linux kernel regression > > > > > > tracking. See link in footer if these mails annoy you.] > > > > > > > > > > > > On 22.11.23 00:43, Bagas Sanjaya wrote: > > > > > > > On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: > > > > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and > > > > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > > > > > > > > > #regzbot fix: perf parse-events: Make legacy events lower priority than > > > > > > sysfs/JSON > > > > > > #regzbot ignore-activity > > > > > > > > > > Note, this is still broken. > > > > > > > > Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of > > > > this? Or is this a "we are screwed one way or another and someone has to > > > > bite the bullet" situation? 
> > > > > > > > Ciao, Thorsten > > > > > > > > > The patch changed the priority in the case > > > > > that you do something like: > > > > > > > > > > $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark > > > > > > > > > > but if you do: > > > > > > > > > > $ perf stat -e 'cycles' benchmark > > > > > > > > > > then the broken behavior will happen as legacy events have priority > > > > > over sysfs/json events in that case. To fix this you need to revert: > > > > > 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware > > > > > events over legacy" > > > > > > > > > > This causes some testing issues resolved in this unmerged patch series: > > > > > https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/ > > > > > > > > > > There is a bug as the arm_dsu PMU advertises an event called "cycles" > > > > > and this PMU is present on Ampere systems. Reverting the commit above > > > > > will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove > > > > > __evlist__add_default") to fix ARM's BIG.little systems (opening a > > > > > cycles event on all PMUs not just 1) will cause the arm_dsu event to > > > > > be opened by perf record and fail as the event won't support sampling. > > > > > > > > > > The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/ > > > > > fixes this by only opening the cycles event on core PMUs when choosing > > > > > default events. > > > > > > > > > > Rather than take this patch the revert happened as Linus runs the > > > > > command "perf record -e cycles:pp" (ie using a specified event and not > > > > > defaults) and considers it a regression in the perf tool that on an > > > > > Ampere system to need to do "perf record -e > > > > > 'armv8_pmuv3_0/cycles/pp'". 
It was pointed out that not specifying -e > > > > > will choose the cycles event correctly and with better precision the > > > > > pp for systems that support it, but it was still considered a > > > > > regression in the perf tool so the revert was made to happen. There is > > > > > a lack of perf testing coverage for ARM, in particular as they choose > > > > > to do everything in a different way to x86. The patch in question was > > > > > in the linux-next tree for weeks without issues. > > > > > > > > > > ARM/Ampere could fix this by renaming the event from cycles to > > > > > cpu_cycles, or by following Intel's convention that anything uncore > > > > > uses the name clockticks rather than cycles. This could break people > > > > > who rely on an event called arm_dsu/cycles/ but I imagine such people > > > > > are rare. There has been no progress I'm aware of on renaming the > > > > > event. > > > > > > > > > > Making perf not terminate on opening an event for perf record seems > > > > > like the most likely workaround as that is at least something under > > > > > the tool maintainers control. ARM have discussed doing this on the > > > > > lists: > > > > > https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/ > > > > > but since the revert in v6.10 no patches have appeared for the v6.11 > > > > > merge window. Feature work like coresight improvements and ARMv9 are > > > > > being actively pursued by ARM, but feature work won't resolve this > > > > > regression. > > > > > > > > > > > I got some hardware with the DSU PMU so I'm going to have a go at trying to > > > send some fixes for this. My initial idea was to try incorporate the "not > > > terminate on opening" change as discussed in the link directly above. And > > > then do the revert of the "revert of prefer sysfs/json". > > > > > > FWIW I don't think Juno currently is broken if the kernel supports extended > > > type ID? 
I could have missed some output in this thread but it seems like > > > it's mostly related to Apple M hardware. I'm also a bit confused why the > > > "supports extended type" check fails there, but maybe the v6.9 commit > > > 25412c036 from Mark is missing? > > > > > > I sent a small fix the other day to make perf stat default arguments work on > > > Juno, and didn't notice anything out of the ordinary: https://lore.kernel.org/linux-perf-users/dac6ad1d-5aca-48b4-9dcb-ff7e54ca43f6@linaro.org/T/#t > > > I agree that change is quite narrow but it does incrementally improve things > > > for the time being. It's possible that it would become redundant if I can > > > just include Ian's change to use strings for Perf stat. > > > > > > Of course I only think I have a handle on the issue right now, seems like it > > > has a lot of moving parts and something else always comes up. If I hit a > > > wall at some point I will come back here. > > > > Thanks for working on this, hopefully we'll get to a solution that keeps > > all the expectations expressed in this thread about not breaking > > existing muscle memory and that allows us to progress on this matter. > > > > - Arnaldo > > Hi Arnaldo, > > In one of your investigations here > https://lore.kernel.org/lkml/Zld3dlJHjFMFG02v@x1/ comparing "cycles", > "cpu-cycles" and "cpu_cycles" events on Arm you say only some of them open > events on both core types. I wasn't able to reproduce that on > perf-tools-next (27ac597c0e) or v6.9 (a38297e3fb) for perf record or stat. I > guessed the 6.9 tag because you only mentioned it was on tip and it was 29th > May. For me they all open exactly the same two legacy events with the > extended type ID set. 
>
> It looks like the behavior you see would be caused by either missing this
> kernel change:
>
> 5c81672865 ("arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability")
> (v6.6 release)
>
> Or this userspace change, but unlikely as it was a fix for Apple M hardware:
>
> 25412c036 ("perf print-events: make is_event_supported() more robust")
> (v6.9 release)
>
> Do you remember if you were using a new kernel or only testing a new Perf?

I normally use the distro/SoC provided kernel, didn't I add the 'uname
-a' output in those investigations (/me slaps himself in the face
speculatively...)?

> Or if you don't mind could you re-test? Hopefully not to derail the

Sure

> discussion but I just want to make sure I'm not missing some other third
> issue before I start hacking away.

This is full of subtleties and has generated a lot of back and forth, so
making sure we don't miss anything is what we should do.

> I believe we still need to revert the revert of the JSON/legacy change.

Good to see progress on assessing that.

/me goes and turns on his trusty libre computer board...

- Arnaldo

> Because as Mark mentions there is no guarantee that a PMU's named event is
> the same as a legacy event of the same name, so we do want to prefer
> sysfs/JSON. There are some other edge cases like new Perf on an old kernel
> before we added extended type support, but I don't think I'll list all of
> them.
>
> Having said that, I believe that currently all the sysfs and legacy events
> actually _are_ the same. So it's not a user facing issue _yet_, or at least
> on any hardware mentioned in these threads.
>
> Thanks
> James
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
2024-08-15 15:27 ` Arnaldo Carvalho de Melo
@ 2024-08-15 15:53 ` Arnaldo Carvalho de Melo
2024-08-16 8:57 ` James Clark
0 siblings, 1 reply; 53+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-08-15 15:53 UTC (permalink / raw)
To: James Clark
Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Ian Rogers, Mark Rutland, Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Linux regressions mailing list

On Thu, Aug 15, 2024 at 12:27:21PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Aug 15, 2024 at 04:15:41PM +0100, James Clark wrote:
> > In one of your investigations here
> > https://lore.kernel.org/lkml/Zld3dlJHjFMFG02v@x1/ comparing "cycles",
> > "cpu-cycles" and "cpu_cycles" events on Arm you say only some of them open
> > events on both core types. I wasn't able to reproduce that on
> > perf-tools-next (27ac597c0e) or v6.9 (a38297e3fb) for perf record or stat. I
> > guessed the 6.9 tag because you only mentioned it was on tip and it was 29th
> > May. For me they all open exactly the same two legacy events with the
> > extended type ID set.
> >
> > It looks like the behavior you see would be caused by either missing this
> > kernel change:
> >
> > 5c81672865 ("arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability")
> > (v6.6 release)

What I have now is:

6.1.92-15907-gf36fd2695db3

It was a bit older, but 6.1 ish as well, I'll try to either get a new
kernel from Libre Computer or build one myself.

- Arnaldo

> > Or this userspace change, but unlikely as it was a fix for Apple M hardware:
> >
> > 25412c036 ("perf print-events: make is_event_supported() more robust")
> > (v6.9 release)
> >
> > Do you remember if you were using a new kernel or only testing a new Perf?
>
> I normally use the distro/SoC provided kernel, didn't I add the 'uname
> -a' output in those investigations (/me slaps himself in the face
> speculatively...)?
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
2024-08-15 15:53 ` Arnaldo Carvalho de Melo
@ 2024-08-16 8:57 ` James Clark
0 siblings, 0 replies; 53+ messages in thread
From: James Clark @ 2024-08-16 8:57 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Ian Rogers, Mark Rutland, Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Linux regressions mailing list

On 15/08/2024 4:53 pm, Arnaldo Carvalho de Melo wrote:
> On Thu, Aug 15, 2024 at 12:27:21PM -0300, Arnaldo Carvalho de Melo wrote:
>> On Thu, Aug 15, 2024 at 04:15:41PM +0100, James Clark wrote:
>>> In one of your investigations here
>>> https://lore.kernel.org/lkml/Zld3dlJHjFMFG02v@x1/ comparing "cycles",
>>> "cpu-cycles" and "cpu_cycles" events on Arm you say only some of them open
>>> events on both core types. I wasn't able to reproduce that on
>>> perf-tools-next (27ac597c0e) or v6.9 (a38297e3fb) for perf record or stat. I
>>> guessed the 6.9 tag because you only mentioned it was on tip and it was 29th
>>> May. For me they all open exactly the same two legacy events with the
>>> extended type ID set.
>>>
>>> It looks like the behavior you see would be caused by either missing this
>>> kernel change:
>>>
>>> 5c81672865 ("arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability")
>>> (v6.6 release)
>
> What I have now is:
>
> 6.1.92-15907-gf36fd2695db3
>
> It was a bit older, but 6.1 ish as well, I'll try to either get a new
> kernel from Libre Computer or build one myself.
>
> - Arnaldo
>

Thanks for the confirmation. In that case you may not even need to
retest. I was only wondering if it was broken from v6.6 onwards, but 6.1
not working is expected. And I'm certain that you'll find any later
versions working.
>>> Or this userspace change, but unlikely as it was a fix for Apple M hardware:
>>>
>>> 25412c036 ("perf print-events: make is_event_supported() more robust")
>>> (v6.9 release)
>>>
>>> Do you remember if you were using a new kernel or only testing a new Perf?
>>
>> I normally use the distro/SoC provided kernel, didn't I add the 'uname
>> -a' output in those investigations (/me slaps himself in the face
>> speculatively...)?
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
2024-08-14 16:28 ` James Clark
2024-08-14 16:41 ` Arnaldo Carvalho de Melo
@ 2024-08-15 17:29 ` Ian Rogers
2024-08-16 9:22 ` James Clark
1 sibling, 1 reply; 53+ messages in thread
From: Ian Rogers @ 2024-08-15 17:29 UTC (permalink / raw)
To: James Clark
Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Mark Rutland,
 Linux perf Profiling, Linux Kernel Mailing List, James Clark,
 cc: Marc Zyngier, Hector Martin, Asahi Linux,
 Linux regressions mailing list, Atish Patra

On Wed, Aug 14, 2024 at 9:28 AM James Clark <james.clark@linaro.org> wrote:
> On 07/08/2024 9:54 am, Thorsten Leemhuis wrote:
> > On 01.08.24 21:05, Ian Rogers wrote:
> >> On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update
> >> (Thorsten Leemhuis) <regressions@leemhuis.info> wrote:
> >>>
> >>> [TLDR: This mail in primarily relevant for Linux kernel regression
> >>> tracking. See link in footer if these mails annoy you.]
> >>>
> >>> On 22.11.23 00:43, Bagas Sanjaya wrote:
> >>>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote:
> >>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and
> >>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> >>>
> >>> #regzbot fix: perf parse-events: Make legacy events lower priority than
> >>> sysfs/JSON
> >>> #regzbot ignore-activity
> >>
> >> Note, this is still broken.
> >
> > Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of
> > this? Or is this a "we are screwed one way or another and someone has to
> > bite the bullet" situation?
> >
> > Ciao, Thorsten
> >
> >> The patch changed the priority in the case
> >> that you do something like:
> >>
> >> $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark
> >>
> >> but if you do:
> >>
> >> $ perf stat -e 'cycles' benchmark
> >>
> >> then the broken behavior will happen as legacy events have priority
> >> over sysfs/json events in that case. To fix this you need to revert:
> >> 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware
> >> events over legacy"
> >>
> >> This causes some testing issues resolved in this unmerged patch series:
> >> https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/
> >>
> >> There is a bug as the arm_dsu PMU advertises an event called "cycles"
> >> and this PMU is present on Ampere systems. Reverting the commit above
> >> will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove
> >> __evlist__add_default") to fix ARM's BIG.little systems (opening a
> >> cycles event on all PMUs not just 1) will cause the arm_dsu event to
> >> be opened by perf record and fail as the event won't support sampling.
> >>
> >> The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/
> >> fixes this by only opening the cycles event on core PMUs when choosing
> >> default events.
> >>
> >> Rather than take this patch the revert happened as Linus runs the
> >> command "perf record -e cycles:pp" (ie using a specified event and not
> >> defaults) and considers it a regression in the perf tool that on an
> >> Ampere system to need to do "perf record -e
> >> 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e
> >> will choose the cycles event correctly and with better precision the
> >> pp for systems that support it, but it was still considered a
> >> regression in the perf tool so the revert was made to happen. There is
> >> a lack of perf testing coverage for ARM, in particular as they choose
> >> to do everything in a different way to x86. The patch in question was
> >> in the linux-next tree for weeks without issues.
> >>
> >> ARM/Ampere could fix this by renaming the event from cycles to
> >> cpu_cycles, or by following Intel's convention that anything uncore
> >> uses the name clockticks rather than cycles. This could break people
> >> who rely on an event called arm_dsu/cycles/ but I imagine such people
> >> are rare. There has been no progress I'm aware of on renaming the
> >> event.
> >>
> >> Making perf not terminate on opening an event for perf record seems
> >> like the most likely workaround as that is at least something under
> >> the tool maintainers control. ARM have discussed doing this on the
> >> lists:
> >> https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/
> >> but since the revert in v6.10 no patches have appeared for the v6.11
> >> merge window. Feature work like coresight improvements and ARMv9 are
> >> being actively pursued by ARM, but feature work won't resolve this
> >> regression.
> >>

> I got some hardware with the DSU PMU so I'm going to have a go at trying
> to send some fixes for this. My initial idea was to try incorporate the
> "not terminate on opening" change as discussed in the link directly
> above. And then do the revert of the "revert of prefer sysfs/json".

Thanks, I think this would be good. The biggest issue is that none of
the record logic expects a file descriptor to be not opened; deleting
unopened evsels from the evlist breaks all the indexing into the
mmaps, etc. Tbh, you probably wouldn't do the code this way if it was
written afresh. Perhaps a hashmap would map from an evsel to ring
buffer mmaps, etc. Trying to avoid having global state and benefitting
from encapsulation. I'd focus on just doing the expedient thing in the
changes, which probably just means making the record code tolerant of
evsels that fail to open and not modifying the evlist due to the risk
it breaks the indices.

(To point out the obvious, this work wouldn't be necessary if arm_dsu
event were renamed from "cycles" to "cpu_cycles" which would also make
it more intention revealing alongside the arm_dsu's "bus_cycles" event
name).

> FWIW I don't think Juno currently is broken if the kernel supports
> extended type ID? I could have missed some output in this thread but it
> seems like it's mostly related to Apple M hardware. I'm also a bit
> confused why the "supports extended type" check fails there, but maybe
> the v6.9 commit 25412c036 from Mark is missing?

So I think your later emails clarify Arnaldo is probably missing:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/drivers/perf/arm_pmu.c?h=perf-tools-next&id=5c816728651ae425954542fed64d21d40cb75a9f

Fwiw, the Apple M hardware issue came to me by way of Mark Rutland
(iirc), this regression report, etc. My understanding is that Apple M
has something like a v2 ARM PMU and the legacy events are encoded
incorrectly in the driver for this. The regression in v6.5 happened
because ARM's core PMUs had previously been treated as uncore PMUs,
meaning we wouldn't try to program legacy events on them. Fixing the
handling of ARM's core PMUs broke Apple M due to the broken legacy
event mappings. Why not fix the Apple M PMU driver? Well there was
anyway a similar RISC-V issue reported by Atish Patra (iirc) where the
RISC-V PMU driver wants to delegate the mapping of legacy events to
the perf tool so the driver needn't be aware of all and future RISC-V
configurations. The fix discussed with Mark, Atish, etc. has been to
swap the priority of legacy and sysfs/json events so that the latter
has priority. We need the revert of the revert as currently we only do
this if a PMU is specified with an event, not for the general wildcard
PMUs case that most people use. There was huge fallout from flipping
the priority particularly on Intel as all test expectations needed
updating. I've sent out similar fixes that need incorporating when the
revert is reverted. Ideally tools/perf/tests/parse-events.c would be
updated to cover ARM's PMUs that don't follow the normal pattern that
the core PMU is called "cpu" (this would mean that we were testing
event parsing on ARM was WAI wrt encoding priorities, BIG.little,
etc).

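To make the priority question above concrete, here is a toy model in Python. It is only a sketch and not perf's parser: the function `resolve_event`, the `prefer_sysfs_json` flag and both tables are hypothetical names made up for illustration, and the legacy path is simplified to a single entry with no PMU attached. It mirrors the rule as described in this mail: with an explicit PMU, sysfs/JSON names win; for a bare event name, legacy wins until the revert is reverted.

```python
# Toy model of event-name resolution; NOT perf's real parser.
# All names here (LEGACY_EVENTS, PMU_EVENTS, resolve_event) are hypothetical.

LEGACY_EVENTS = {"cycles", "instructions"}  # stand-in for legacy hardware names

# Stand-in for the events each PMU advertises via sysfs/JSON.
PMU_EVENTS = {
    "apple_icestorm_pmu": {"cycles", "instructions"},
    "apple_firestorm_pmu": {"cycles", "instructions"},
}


def resolve_event(name, pmu=None, prefer_sysfs_json=False):
    """Return the (source, pmu, event) tuples that `name` would expand to."""
    if pmu is not None:
        # Explicit PMU: sysfs/JSON takes priority over legacy.
        if name in PMU_EVENTS.get(pmu, ()):
            return [("sysfs/json", pmu, name)]
        if name in LEGACY_EVENTS:
            return [("legacy", pmu, name)]
        return []

    # Bare event name: wildcard over every PMU that advertises it.
    sysfs_matches = [("sysfs/json", p, name)
                     for p, events in sorted(PMU_EVENTS.items())
                     if name in events]
    legacy_match = [("legacy", None, name)] if name in LEGACY_EVENTS else []

    # The flag models the state before/after the "revert of the revert".
    if prefer_sysfs_json:
        return sysfs_matches or legacy_match
    return legacy_match or sysfs_matches


if __name__ == "__main__":
    print(resolve_event("cycles", pmu="apple_icestorm_pmu"))
    print(resolve_event("cycles"))
    print(resolve_event("cycles", prefer_sysfs_json=True))
```

With the flag off, the bare name resolves through the legacy table while the PMU-qualified name does not, which is the inconsistency the revert of the revert is meant to remove.
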
> I sent a small fix the other day to make perf stat default arguments
> work on Juno, and didn't notice anything out of the ordinary:
> https://lore.kernel.org/linux-perf-users/dac6ad1d-5aca-48b4-9dcb-ff7e54ca43f6@linaro.org/T/#t
> I agree that change is quite narrow but it does incrementally improve
> things for the time being. It's possible that it would become redundant
> if I can just include Ian's change to use strings for Perf stat.

I'd prefer we didn't merge this as we'd need to rebase:
https://lore.kernel.org/lkml/20240510053705.2462258-4-irogers@google.com/
and those changes would then delete the code introduced. I'm fine with
adding the tests.

There are more exotic heterogeneous core things upcoming, probably
also from ARM, and the thought of duplicating the default attribute
logic and event parsing constraints is just something I'd prefer not
to have to do.

> Of course I only think I have a handle on the issue right now, seems
> like it has a lot of moving parts and something else always comes up. If
> I hit a wall at some point I will come back here.

Thanks,
Ian

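As a footnote to the "tolerant of evsels that fail to open" suggestion in this mail, the idea can be sketched in Python roughly as follows. This is an assumption-laden illustration, not perf code: `open_all` and `fake_open` are hypothetical names, and a plain list stands in for the evlist. The point it shows is that a failed open leaves a placeholder in the list rather than removing the entry, so positions used for indexing elsewhere stay valid.

```python
# Sketch only: a list stands in for the evlist, and fake_open() for the
# real event-open call. Both function names are hypothetical.

def fake_open(event):
    """Pretend opener: the uncore DSU event cannot be opened for sampling."""
    if event == "arm_dsu/cycles/":
        raise OSError("event does not support sampling")
    return len(event)  # stand-in for a file descriptor


def open_all(events, try_open):
    """Open every event, keeping list positions stable when one fails."""
    slots = []
    for event in events:
        try:
            slots.append(try_open(event))
        except OSError as err:
            print(f"warning: skipping {event}: {err}")
            slots.append(None)  # placeholder keeps later indices aligned
    if all(slot is None for slot in slots):
        # Nothing opened at all is still a hard error.
        raise RuntimeError("no events could be opened")
    return slots


if __name__ == "__main__":
    print(open_all(["cycles", "arm_dsu/cycles/", "instructions"], fake_open))
```

Consumers of the result then have to skip `None` entries, which is the "tolerant" part; the alternative of deleting the entry is what would shift the indices.
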
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
2024-08-15 17:29 ` Ian Rogers
@ 2024-08-16 9:22 ` James Clark
2024-08-16 15:30 ` Ian Rogers
0 siblings, 1 reply; 53+ messages in thread
From: James Clark @ 2024-08-16 9:22 UTC (permalink / raw)
To: Ian Rogers
Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Mark Rutland,
 Linux perf Profiling, Linux Kernel Mailing List, James Clark,
 cc: Marc Zyngier, Hector Martin, Asahi Linux,
 Linux regressions mailing list, Atish Patra

On 15/08/2024 6:29 pm, Ian Rogers wrote:
> On Wed, Aug 14, 2024 at 9:28 AM James Clark <james.clark@linaro.org> wrote:
>> On 07/08/2024 9:54 am, Thorsten Leemhuis wrote:
>>> On 01.08.24 21:05, Ian Rogers wrote:
>>>> On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update
>>>> (Thorsten Leemhuis) <regressions@leemhuis.info> wrote:
>>>>>
>>>>> [TLDR: This mail in primarily relevant for Linux kernel regression
>>>>> tracking. See link in footer if these mails annoy you.]
>>>>>
>>>>> On 22.11.23 00:43, Bagas Sanjaya wrote:
>>>>>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote:
>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and
>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
>>>>>
>>>>> #regzbot fix: perf parse-events: Make legacy events lower priority than
>>>>> sysfs/JSON
>>>>> #regzbot ignore-activity
>>>>
>>>> Note, this is still broken.
>>>
>>> Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of
>>> this? Or is this a "we are screwed one way or another and someone has to
>>> bite the bullet" situation?
>>>
>>> Ciao, Thorsten
>>>
>>>> The patch changed the priority in the case
>>>> that you do something like:
>>>>
>>>> $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark
>>>>
>>>> but if you do:
>>>>
>>>> $ perf stat -e 'cycles' benchmark
>>>>
>>>> then the broken behavior will happen as legacy events have priority
>>>> over sysfs/json events in that case. To fix this you need to revert:
>>>> 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware
>>>> events over legacy"
>>>>
>>>> This causes some testing issues resolved in this unmerged patch series:
>>>> https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/
>>>>
>>>> There is a bug as the arm_dsu PMU advertises an event called "cycles"
>>>> and this PMU is present on Ampere systems. Reverting the commit above
>>>> will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove
>>>> __evlist__add_default") to fix ARM's BIG.little systems (opening a
>>>> cycles event on all PMUs not just 1) will cause the arm_dsu event to
>>>> be opened by perf record and fail as the event won't support sampling.
>>>>
>>>> The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/
>>>> fixes this by only opening the cycles event on core PMUs when choosing
>>>> default events.
>>>>
>>>> Rather than take this patch the revert happened as Linus runs the
>>>> command "perf record -e cycles:pp" (ie using a specified event and not
>>>> defaults) and considers it a regression in the perf tool that on an
>>>> Ampere system to need to do "perf record -e
>>>> 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e
>>>> will choose the cycles event correctly and with better precision the
>>>> pp for systems that support it, but it was still considered a
>>>> regression in the perf tool so the revert was made to happen. There is
>>>> a lack of perf testing coverage for ARM, in particular as they choose
>>>> to do everything in a different way to x86. The patch in question was
>>>> in the linux-next tree for weeks without issues.
>>>>
>>>> ARM/Ampere could fix this by renaming the event from cycles to
>>>> cpu_cycles, or by following Intel's convention that anything uncore
>>>> uses the name clockticks rather than cycles. This could break people
>>>> who rely on an event called arm_dsu/cycles/ but I imagine such people
>>>> are rare. There has been no progress I'm aware of on renaming the
>>>> event.
>>>>
>>>> Making perf not terminate on opening an event for perf record seems
>>>> like the most likely workaround as that is at least something under
>>>> the tool maintainers control. ARM have discussed doing this on the
>>>> lists:
>>>> https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/
>>>> but since the revert in v6.10 no patches have appeared for the v6.11
>>>> merge window. Feature work like coresight improvements and ARMv9 are
>>>> being actively pursued by ARM, but feature work won't resolve this
>>>> regression.
>>>>
>>
>> I got some hardware with the DSU PMU so I'm going to have a go at trying
>> to send some fixes for this. My initial idea was to try incorporate the
>> "not terminate on opening" change as discussed in the link directly
>> above. And then do the revert of the "revert of prefer sysfs/json".
>
> Thanks, I think this would be good. The biggest issue is that none of
> the record logic expects a file descriptor to be not opened, deleting
> unopened evsels from the evlist breaks all the indexing into the
> mmaps, etc. Tbh, you probably wouldn't do the code this way if was
> written afresh. Perhaps a hashmap would map from an evsel to ring
> buffer mmaps, etc. Trying to avoid having global state and benefitting
> from encapsulation. I'd focus on just doing the expedient thing in the
> changes, which probably just means making the record code tolerant of
> evsels that fail to open and not modifying the evlist due to the risk
> it breaks the indices.
>

Thanks for the tips.

> (To point out the obvious, this work wouldn't be necessary if arm_dsu
> event were renamed from "cycles" to "cpu_cycles" which would also make
> it more intention revealing alongside the arm_dsu's "bus_cycles" event
> name).
>

I understand, but I can imagine the following conversation if we rename that:

  User: "I updated my kernel and now my (non Perf) tool fails to open
         the DSU cycles event because it doesn't exist anymore"

  Linus/maintainers: "Oh ok yes that was a userspace breaking change,
                      let's revert it"

Just because Perf can handle 3 different names for cycles doesn't mean
other tools can.

>> FWIW I don't think Juno currently is broken if the kernel supports
>> extended type ID? I could have missed some output in this thread but it
>> seems like it's mostly related to Apple M hardware. I'm also a bit
>> confused why the "supports extended type" check fails there, but maybe
>> the v6.9 commit 25412c036 from Mark is missing?
>
> So I think your later emails clarify Arnaldo is probably missing:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/drivers/perf/arm_pmu.c?h=perf-tools-next&id=5c816728651ae425954542fed64d21d40cb75a9f
>
> Fwiw, the Apple M hardware issue came to me by way of Mark Rutland
> (iirc), this regression report, etc. My understanding is that Apple M
> has something like a v2 ARM PMU and the legacy events are encoded
> incorrectly in the driver for this. The regression in v6.5 happened

I'm not sure about that. The M PMU events may be incomplete, but the two
that are there have a mapping that looks sane:

  static const unsigned m1_pmu_perf_map[PERF_COUNT_HW_MAX] = {
          PERF_MAP_ALL_UNSUPPORTED,
          [PERF_COUNT_HW_CPU_CYCLES]   = M1_PMU_PERFCTR_CPU_CYCLES,
          [PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INSTRUCTIONS,
          /* No idea about the rest yet */
  };

And they map to the same named events:

  static struct attribute *m1_pmu_event_attrs[] = {
          M1_PMU_EVENT_ATTR(cycles, M1_PMU_PERFCTR_CPU_CYCLES),
          M1_PMU_EVENT_ATTR(instructions, M1_PMU_PERFCTR_INSTRUCTIONS),
          NULL,
  };

So in this case I can't see using legacy vs sysfs events making a
difference. Maybe there is some other case that was mentioned in a
previous thread that I missed though.

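The comparison made above can be spelled out as a small Python check. This is a sketch under stated assumptions: the numeric codes are placeholders rather than the driver's real values, the pairing of sysfs names to legacy events is assumed, and `mismatched_events` is a hypothetical helper. It reports any sysfs event whose code differs from the legacy entry it is paired with; an empty result corresponds to "legacy vs sysfs makes no difference".

```python
# Placeholder codes for illustration only; the real values live in the
# Apple M1 PMU driver and are not reproduced here.
M1_PMU_PERFCTR_CPU_CYCLES = 0x01      # placeholder
M1_PMU_PERFCTR_INSTRUCTIONS = 0x02    # placeholder

# Mirrors m1_pmu_perf_map: legacy hardware event -> PMU event code.
legacy_map = {
    "PERF_COUNT_HW_CPU_CYCLES": M1_PMU_PERFCTR_CPU_CYCLES,
    "PERF_COUNT_HW_INSTRUCTIONS": M1_PMU_PERFCTR_INSTRUCTIONS,
}

# Mirrors m1_pmu_event_attrs: sysfs event name -> PMU event code.
sysfs_events = {
    "cycles": M1_PMU_PERFCTR_CPU_CYCLES,
    "instructions": M1_PMU_PERFCTR_INSTRUCTIONS,
}

# Assumed pairing of sysfs names with legacy hardware events.
NAME_TO_LEGACY = {
    "cycles": "PERF_COUNT_HW_CPU_CYCLES",
    "instructions": "PERF_COUNT_HW_INSTRUCTIONS",
}


def mismatched_events(legacy, sysfs, pairing):
    """Return sysfs names whose code differs from the paired legacy event."""
    return sorted(name for name, legacy_id in pairing.items()
                  if sysfs.get(name) != legacy.get(legacy_id))


if __name__ == "__main__":
    # Both tables reference the same constants, so nothing is reported.
    print(mismatched_events(legacy_map, sysfs_events, NAME_TO_LEGACY))
```

Because both tables are built from the same two constants, the check comes back empty whatever the real codes are, which is the observation made in the text.
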
> because ARM's core PMUs had previously been treated as uncore PMUs,
> meaning we wouldn't try to program legacy events on them. Fixing the
> handling of ARM's core PMUs broke Apple M due to the broken legacy
> event mappings. Why not fix the Apple M PMU driver? Well there was
> anyway a similar RISC-V issue reported by Atish Patra (iirc) where the
> RISC-V PMU driver wants to delegate the mapping of legacy events to
> the perf tool so the driver needn't be aware of all and future RISC-V
> configurations. The fix discussed with Mark, Atish, etc. has been to
> swap the priority of legacy and sysfs/json events so that the latter
> has priority. We need the revert of the revert as currently we only do
> this if a PMU is specified with an event, not for the general wildcard
> PMUs case that most people use. There was huge fallout from flipping

Yep, makes sense to do the revert if RISC-V isn't going to support any
legacy events. Although from what I understand that would technically
only require JSON to be the highest priority? Because putting named
events in sysfs still requires kernel involvement so doesn't get you any
further than supporting the legacy events?

Seems like there is another reason to do the revert though, as Mark
mentioned: now directly specifying the PMU, e.g. "-e
arm_cortex_a56/cycles/", opens a legacy event if the event matches one,
which is not the best thing to do. But the revert fixes this AFAIK, so
while having the priority JSON/legacy/sysfs might work for RISC-V it
wouldn't work for a platform that wants a slightly different sysfs event
than legacy but with the same name. And the priority should be
JSON/sysfs/legacy.

> the priority particularly on Intel as all test expectations needed
> updating. I've sent out similar fixes that need incorporating when the
> revert is reverted. Ideally tools/perf/tests/parse-events.c would be
> updated to cover ARM's PMUs that don't follow the normal pattern that
> the core PMU is called "cpu" (this would mean that we were testing
> event parsing on ARM was WAI wrt encoding priorities, BIG.little,
> etc).
>
>> I sent a small fix the other day to make perf stat default arguments
>> work on Juno, and didn't notice anything out of the ordinary:
>> https://lore.kernel.org/linux-perf-users/dac6ad1d-5aca-48b4-9dcb-ff7e54ca43f6@linaro.org/T/#t
>> I agree that change is quite narrow but it does incrementally improve
>> things for the time being. It's possible that it would become redundant
>> if I can just include Ian's change to use strings for Perf stat.
>
> I'd prefer we didn't merge this as we'd need to rebase:
> https://lore.kernel.org/lkml/20240510053705.2462258-4-irogers@google.com/
> and those changes would then delete the code introduced. I'm fine with
> adding the tests.
>
> There are more exotic heterogeneous core things upcoming, probably
> also from ARM, and the thought of duplicating the default attribute
> logic and event parsing constraints is just something I'd prefer not
> to have to do.
>

Yep I don't have any strong feelings about this. Even if we don't merge
it, it helped me understand the code and the issue a bit.

I think one thing I assumed about your change was that there was some
dependency on these other changes. But the more I look at it I think
it's actually fine on its own?

Using the cycles string actually works today, even on Apple M. The only
real remaining issue is softening the error for failure to open, but
that's _after_ doing the revert of the revert and is separate.

I will re-test that one today with fresh eyes.

>> Of course I only think I have a handle on the issue right now, seems
>> like it has a lot of moving parts and something else always comes up. If
>> I hit a wall at some point I will come back here.
>
> Thanks,
> Ian

* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
2024-08-16 9:22 ` James Clark
@ 2024-08-16 15:30 ` Ian Rogers
2024-08-17 1:38 ` Atish Kumar Patra
2024-08-19 14:56 ` James Clark
0 siblings, 2 replies; 53+ messages in thread
From: Ian Rogers @ 2024-08-16 15:30 UTC (permalink / raw)
To: James Clark
Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Mark Rutland,
 Linux perf Profiling, Linux Kernel Mailing List, James Clark,
 cc: Marc Zyngier, Hector Martin, Asahi Linux,
 Linux regressions mailing list, Atish Patra

On Fri, Aug 16, 2024 at 2:23 AM James Clark <james.clark@linaro.org> wrote:
>
>
>
> On 15/08/2024 6:29 pm, Ian Rogers wrote:
> > On Wed, Aug 14, 2024 at 9:28 AM James Clark <james.clark@linaro.org> wrote:
> >> On 07/08/2024 9:54 am, Thorsten Leemhuis wrote:
> >>> On 01.08.24 21:05, Ian Rogers wrote:
> >>>> On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update
> >>>> (Thorsten Leemhuis) <regressions@leemhuis.info> wrote:
> >>>>>
> >>>>> [TLDR: This mail in primarily relevant for Linux kernel regression
> >>>>> tracking. See link in footer if these mails annoy you.]
> >>>>>
> >>>>> On 22.11.23 00:43, Bagas Sanjaya wrote:
> >>>>>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote:
> >>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and
> >>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> >>>>>
> >>>>> #regzbot fix: perf parse-events: Make legacy events lower priority than
> >>>>> sysfs/JSON
> >>>>> #regzbot ignore-activity
> >>>>
> >>>> Note, this is still broken.
> >>>
> >>> Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of
> >>> this? Or is this a "we are screwed one way or another and someone has to
> >>> bite the bullet" situation?
> >>>
> >>> Ciao, Thorsten
> >>>
> >>>> The patch changed the priority in the case
> >>>> that you do something like:
> >>>>
> >>>> $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark
> >>>>
> >>>> but if you do:
> >>>>
> >>>> $ perf stat -e 'cycles' benchmark
> >>>>
> >>>> then the broken behavior will happen as legacy events have priority
> >>>> over sysfs/json events in that case. To fix this you need to revert:
> >>>> 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware
> >>>> events over legacy"
> >>>>
> >>>> This causes some testing issues resolved in this unmerged patch series:
> >>>> https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/
> >>>>
> >>>> There is a bug as the arm_dsu PMU advertises an event called "cycles"
> >>>> and this PMU is present on Ampere systems. Reverting the commit above
> >>>> will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove
> >>>> __evlist__add_default") to fix ARM's BIG.little systems (opening a
> >>>> cycles event on all PMUs not just 1) will cause the arm_dsu event to
> >>>> be opened by perf record and fail as the event won't support sampling.
> >>>>
> >>>> The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/
> >>>> fixes this by only opening the cycles event on core PMUs when choosing
> >>>> default events.
> >>>>
> >>>> Rather than take this patch the revert happened as Linus runs the
> >>>> command "perf record -e cycles:pp" (ie using a specified event and not
> >>>> defaults) and considers it a regression in the perf tool that on an
> >>>> Ampere system to need to do "perf record -e
> >>>> 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e
> >>>> will choose the cycles event correctly and with better precision the
> >>>> pp for systems that support it, but it was still considered a
> >>>> regression in the perf tool so the revert was made to happen. There is
> >>>> a lack of perf testing coverage for ARM, in particular as they choose
> >>>> to do everything in a different way to x86. The patch in question was
> >>>> in the linux-next tree for weeks without issues.
> >>>>
> >>>> ARM/Ampere could fix this by renaming the event from cycles to
> >>>> cpu_cycles, or by following Intel's convention that anything uncore
> >>>> uses the name clockticks rather than cycles. This could break people
> >>>> who rely on an event called arm_dsu/cycles/ but I imagine such people
> >>>> are rare. There has been no progress I'm aware of on renaming the
> >>>> event.
> >>>>
> >>>> Making perf not terminate on opening an event for perf record seems
> >>>> like the most likely workaround as that is at least something under
> >>>> the tool maintainers control. ARM have discussed doing this on the
> >>>> lists:
> >>>> https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/
> >>>> but since the revert in v6.10 no patches have appeared for the v6.11
> >>>> merge window. Feature work like coresight improvements and ARMv9 are
> >>>> being actively pursued by ARM, but feature work won't resolve this
> >>>> regression.
> >>>>
> >>
> >> I got some hardware with the DSU PMU so I'm going to have a go at trying
> >> to send some fixes for this. My initial idea was to try incorporate the
> >> "not terminate on opening" change as discussed in the link directly
> >> above. And then do the revert of the "revert of prefer sysfs/json".
> >
> > Thanks, I think this would be good. The biggest issue is that none of
> > the record logic expects a file descriptor to be not opened, deleting
> > unopened evsels from the evlist breaks all the indexing into the
> > mmaps, etc. Tbh, you probably wouldn't do the code this way if was
> > written afresh. Perhaps a hashmap would map from an evsel to ring
> > buffer mmaps, etc. Trying to avoid having global state and benefitting
> > from encapsulation. I'd focus on just doing the expedient thing in the
> > changes, which probably just means making the record code tolerant of
> > evsels that fail to open and not modifying the evlist due to the risk
> > it breaks the indices.
> >
>
> Thanks for the tips.
>
> > (To point out the obvious, this work wouldn't be necessary if arm_dsu
> > event were renamed from "cycles" to "cpu_cycles" which would also make
> > it more intention revealing alongside the arm_dsu's "bus_cycles" event
> > name).
> >
>
> I understand but I can imagine the following conversation if we rename that:
>
> User: "I updated my kernel and now my (non Perf) tool fails to open
> the DSU cycles event because it doesn't exist anymore"
>
> Linus/maintainers: "Oh ok yes that was a userspace breaking change,
> lets revert it"
>
> Just because Perf can handle 3 different names for cycles doesn't mean
> other tools can.

cycles was a bad event name, and dsu is a terrible name for what is
mainly the L3 cache. I'm fine with the risk that users combining the
two get broken, as Neoverse users with uncore permissions are, say,
much rarer than Apple M users. Having a cycles and a bus_cycles event
is already ambiguous; they sound the same. Renaming cycles to
cpu_cycles would be best.

> >> FWIW I don't think Juno currently is broken if the kernel supports
> >> extended type ID? I could have missed some output in this thread but it
> >> seems like it's mostly related to Apple M hardware. I'm also a bit
> >> confused why the "supports extended type" check fails there, but maybe
> >> the v6.9 commit 25412c036 from Mark is missing?
> >
> > So I think your later emails clarify Arnaldo is probably missing:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/drivers/perf/arm_pmu.c?h=perf-tools-next&id=5c816728651ae425954542fed64d21d40cb75a9f
> >
> > Fwiw, the Apple M hardware issue came to me by way of Mark Rutland
> > (iirc), this regression report, etc. My understanding is that Apple M
> > has something like a v2 ARM PMU and the legacy events are encoded
> > incorrectly in the driver for this. The regression in v6.5 happened
>
> I'm not sure about that. The M PMU events may be incomplete, but the two
> that are there have a mapping that looks sane:
>
> static const unsigned m1_pmu_perf_map[PERF_COUNT_HW_MAX] = {
>         PERF_MAP_ALL_UNSUPPORTED,
>         [PERF_COUNT_HW_CPU_CYCLES]   = M1_PMU_PERFCTR_CPU_CYCLES,
>         [PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INSTRUCTIONS,
>         /* No idea about the rest yet */
> };
>
> And they map to the same named events:
>
> static struct attribute *m1_pmu_event_attrs[] = {
>         M1_PMU_EVENT_ATTR(cycles, M1_PMU_PERFCTR_CPU_CYCLES),
>         M1_PMU_EVENT_ATTR(instructions, M1_PMU_PERFCTR_INSTRUCTIONS),
>         NULL,
> };
>
> So in this case I can't see using legacy vs sysfs events making a
> difference. Maybe there is some other case that was mentioned in a
> previous thread that I missed though.

No idea, iirc Mark Rutland requested not to use legacy events for Apple M.

> > because ARM's core PMUs had previously been treated as uncore PMUs,
> > meaning we wouldn't try to program legacy events on them. Fixing the
> > handling of ARM's core PMUs broke Apple M due to the broken legacy
> > event mappings. Why not fix the Apple M PMU driver? Well there was
> > anyway a similar RISC-V issue reported by Atish Patra (iirc) where the
> > RISC-V PMU driver wants to delegate the mapping of legacy events to
> > the perf tool so the driver needn't be aware of all and future RISC-V
> > configurations. The fix discussed with Mark, Atish, etc. has been to
> > swap the priority of legacy and sysfs/json events so that the latter
> > has priority. We need the revert of the revert as currently we only do
> > this if a PMU is specified with an event, not for the general wildcard
> > PMUs case that most people use. There was huge fallout from flipping
>
> Yep makes sense to do the revert if RISC-V isn't going to support any
> legacy events. Although from what I understand that would technically
> only require JSON to be the highest priority? Because putting named
> events in sysfs still requires kernel involvement so doesn't get you any
> further than supporting the legacy events?

The sysfs and json event handling is interwoven; for example, you can
add to a sysfs event with json information. There are basically two
approaches in the event parser: hardcoded legacy things and event
names (optionally with PMU names). I'm trying to get rid of the
hardcoded legacy things as they were fine when you had a single core
type, but I want to have events everywhere - say instructions and
cycles on a GPU so we can compute IPC on a GPU. For RISC-V, as long as
the legacy events are covered as names in json and json/sysfs has
priority over legacy then things will be fine.

> Seems like there is another reason to do the revert though as Mark
> mentioned: That now directly specifying the PMU eg "-e
> arm_cortex_a56/cycles/" opens a legacy event if the event matches one,
> which is not the best thing to do. But the revert fixes this AFAIK, so
> while having the priority JSON/legacy/sysfs might work for RISC-V it
> wouldn't work for a platform that wants a slightly different sysfs event
> than legacy but with the same name. And the priority should be
> JSON/sysfs/legacy.

The priority for events with a PMU is that sysfs/json has priority
over legacy names, so I don't understand what you're saying here. Your
example shouldn't be broken. The revert is for the case where no PMU
is specified, where the priority is the opposite, which is at best
inconsistent.

> > the priority particularly on Intel as all test expectations needed
> > updating. I've sent out similar fixes that need incorporating when the
> > revert is reverted. Ideally tools/perf/tests/parse-events.c would be
> > updated to cover ARM's PMUs that don't follow the normal pattern that
> > the core PMU is called "cpu" (this would mean that we were testing
> > event parsing on ARM was WAI wrt encoding priorities, BIG.little,
> > etc).
> >
> >> I sent a small fix the other day to make perf stat default arguments
> >> work on Juno, and didn't notice anything out of the ordinary:
> >> https://lore.kernel.org/linux-perf-users/dac6ad1d-5aca-48b4-9dcb-ff7e54ca43f6@linaro.org/T/#t
> >> I agree that change is quite narrow but it does incrementally improve
> >> things for the time being. It's possible that it would become redundant
> >> if I can just include Ian's change to use strings for Perf stat.
> >
> > I'd prefer we didn't merge this as we'd need to rebase:
> > https://lore.kernel.org/lkml/20240510053705.2462258-4-irogers@google.com/
> > and those changes would then delete the code introduced. I'm fine with
> > adding the tests.
> >
> > There are more exotic heterogeneous core things upcoming, probably
> > also from ARM, and the thought of duplicating the default attribute
> > logic and event parsing constraints is just something I'd prefer not
> > to have to do.
> >
>
> Yep I don't have any strong feelings about this. Even if we don't merge
> it it helped me understand the code and the issue a bit.
>
> I think one thing I assumed about your change was that there was some
> dependency on these other changes. But the more I look at it I think
> it's actually fine on it's own?

Which change? If the change is trying to use "cycles" to open on all
PMUs because it will be wildcarded then it will run into the priority
issue.

> Using the cycles string actually works today, even on Apple M. The only
> real remaining issue is softening the error for failure to open, but
> that's _after_ doing the revert of the revert and is separate.
>
> I will re-test that one today with fresh eyes.

Perhaps it is other legacy events, not cycles and instructions. There must have been a reason for this regression report but I don't have an Apple M CPU to test on. Thanks, Ian ^ permalink raw reply [flat|nested] 53+ messages in thread
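The two priority rules stated in the message above (sysfs/JSON wins when a PMU is named, legacy wins when no PMU is named) can be modelled in a few lines. This is only a sketch of the rule as described in the mail, not code from tools/perf: `enum event_source`, `struct event_query`, `resolve_event_source()` and the `prefer_sysfs_json_without_pmu` switch are hypothetical names made up for illustration. The switch stands in for "the revert of the revert" being applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the lookup order; not taken from tools/perf. */
enum event_source {
	SRC_NONE,       /* the name matched nothing */
	SRC_SYSFS_JSON, /* event found by name in sysfs or JSON */
	SRC_LEGACY,     /* hardcoded legacy event, e.g. PERF_COUNT_HW_* */
};

struct event_query {
	bool pmu_given;      /* "-e some_pmu/cycles/" rather than "-e cycles" */
	bool in_sysfs_json;  /* a PMU advertises this name in sysfs or JSON */
	bool is_legacy_name; /* the name is also a hardcoded legacy event */
};

/*
 * Pick the source that wins for a query. With a PMU given, sysfs/JSON is
 * always tried first. Without a PMU, legacy is tried first unless the
 * (assumed) flag for the proposed behaviour is set.
 */
static enum event_source
resolve_event_source(const struct event_query *q,
		     bool prefer_sysfs_json_without_pmu)
{
	bool sysfs_json_first;

	assert(q != NULL);
	sysfs_json_first = q->pmu_given || prefer_sysfs_json_without_pmu;

	if (sysfs_json_first) {
		if (q->in_sysfs_json)
			return SRC_SYSFS_JSON;
		if (q->is_legacy_name)
			return SRC_LEGACY;
	} else {
		if (q->is_legacy_name)
			return SRC_LEGACY;
		if (q->in_sysfs_json)
			return SRC_SYSFS_JSON;
	}
	return SRC_NONE;
}
```

For a name such as "cycles" that exists both as a legacy event and as a sysfs event, the model returns SRC_LEGACY for a bare `-e cycles` with the flag off and SRC_SYSFS_JSON once either a PMU is named or the flag is on, which is the inconsistency the mail points at.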
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2024-08-16 15:30 ` Ian Rogers @ 2024-08-17 1:38 ` Atish Kumar Patra 2024-08-20 8:58 ` James Clark 2024-08-19 14:56 ` James Clark 1 sibling, 1 reply; 53+ messages in thread From: Atish Kumar Patra @ 2024-08-17 1:38 UTC (permalink / raw) To: Ian Rogers Cc: James Clark, Thorsten Leemhuis, Arnaldo Carvalho de Melo, Mark Rutland, Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Linux regressions mailing list On Fri, Aug 16, 2024 at 8:30 AM Ian Rogers <irogers@google.com> wrote: > > On Fri, Aug 16, 2024 at 2:23 AM James Clark <james.clark@linaro.org> wrote: > > > > > > > > On 15/08/2024 6:29 pm, Ian Rogers wrote: > > > On Wed, Aug 14, 2024 at 9:28 AM James Clark <james.clark@linaro.org> wrote: > > >> On 07/08/2024 9:54 am, Thorsten Leemhuis wrote: > > >>> On 01.08.24 21:05, Ian Rogers wrote: > > >>>> On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update > > >>>> (Thorsten Leemhuis) <regressions@leemhuis.info> wrote: > > >>>>> > > >>>>> [TLDR: This mail in primarily relevant for Linux kernel regression > > >>>>> tracking. See link in footer if these mails annoy you.] > > >>>>> > > >>>>> On 22.11.23 00:43, Bagas Sanjaya wrote: > > >>>>>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: > > >>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and > > >>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > >>>>> > > >>>>> #regzbot fix: perf parse-events: Make legacy events lower priority than > > >>>>> sysfs/JSON > > >>>>> #regzbot ignore-activity > > >>>> > > >>>> Note, this is still broken. > > >>> > > >>> Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of > > >>> this? Or is this a "we are screwed one way or another and someone has to > > >>> bite the bullet" situation? 
> > >>> > > >>> Ciao, Thorsten > > >>> > > >>>> The patch changed the priority in the case > > >>>> that you do something like: > > >>>> > > >>>> $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark > > >>>> > > >>>> but if you do: > > >>>> > > >>>> $ perf stat -e 'cycles' benchmark > > >>>> > > >>>> then the broken behavior will happen as legacy events have priority > > >>>> over sysfs/json events in that case. To fix this you need to revert: > > >>>> 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware > > >>>> events over legacy" > > >>>> > > >>>> This causes some testing issues resolved in this unmerged patch series: > > >>>> https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/ > > >>>> > > >>>> There is a bug as the arm_dsu PMU advertises an event called "cycles" > > >>>> and this PMU is present on Ampere systems. Reverting the commit above > > >>>> will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove > > >>>> __evlist__add_default") to fix ARM's BIG.little systems (opening a > > >>>> cycles event on all PMUs not just 1) will cause the arm_dsu event to > > >>>> be opened by perf record and fail as the event won't support sampling. > > >>>> > > >>>> The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/ > > >>>> fixes this by only opening the cycles event on core PMUs when choosing > > >>>> default events. > > >>>> > > >>>> Rather than take this patch the revert happened as Linus runs the > > >>>> command "perf record -e cycles:pp" (ie using a specified event and not > > >>>> defaults) and considers it a regression in the perf tool that on an > > >>>> Ampere system to need to do "perf record -e > > >>>> 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e > > >>>> will choose the cycles event correctly and with better precision the > > >>>> pp for systems that support it, but it was still considered a > > >>>> regression in the perf tool so the revert was made to happen. 
There is > > >>>> a lack of perf testing coverage for ARM, in particular as they choose > > >>>> to do everything in a different way to x86. The patch in question was > > >>>> in the linux-next tree for weeks without issues. > > >>>> > > >>>> ARM/Ampere could fix this by renaming the event from cycles to > > >>>> cpu_cycles, or by following Intel's convention that anything uncore > > >>>> uses the name clockticks rather than cycles. This could break people > > >>>> who rely on an event called arm_dsu/cycles/ but I imagine such people > > >>>> are rare. There has been no progress I'm aware of on renaming the > > >>>> event. > > >>>> > > >>>> Making perf not terminate on opening an event for perf record seems > > >>>> like the most likely workaround as that is at least something under > > >>>> the tool maintainers control. ARM have discussed doing this on the > > >>>> lists: > > >>>> https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/ > > >>>> but since the revert in v6.10 no patches have appeared for the v6.11 > > >>>> merge window. Feature work like coresight improvements and ARMv9 are > > >>>> being actively pursued by ARM, but feature work won't resolve this > > >>>> regression. > > >>>> > > >> > > >> I got some hardware with the DSU PMU so I'm going to have a go at trying > > >> to send some fixes for this. My initial idea was to try incorporate the > > >> "not terminate on opening" change as discussed in the link directly > > >> above. And then do the revert of the "revert of prefer sysfs/json". > > > > > > Thanks, I think this would be good. The biggest issue is that none of > > > the record logic expects a file descriptor to be not opened, deleting > > > unopened evsels from the evlist breaks all the indexing into the > > > mmaps, etc. Tbh, you probably wouldn't do the code this way if was > > > written afresh. Perhaps a hashmap would map from an evsel to ring > > > buffer mmaps, etc. 
Trying to avoid having global state and benefitting > > > from encapsulation. I'd focus on just doing the expedient thing in the > > > changes, which probably just means making the record code tolerant of > > > evsels that fail to open and not modifying the evlist due to the risk > > > it breaks the indices. > > > > > > > Thanks for the tips. > > > > > (To point out the obvious, this work wouldn't be necessary if arm_dsu > > > event were renamed from "cycles" to "cpu_cycles" which would also make > > > it more intention revealing alongside the arm_dsu's "bus_cycles" event > > > name). > > > > > > > I understand but I can imagine the following conversation if we rename that: > > > > User: "I updated my kernel and now my (non Perf) tool fails to open > > the DSU cycles event because it doesn't exist anymore" > > > > Linus/maintainers: "Oh ok yes that was a userspace breaking change, > > lets revert it" > > > > Just because Perf can handle 3 different names for cycles doesn't mean > > other tools can. > > cycles was a bad event name, dsu is a terrible name for what is mainly > the l3 cache, the risk that the two are combined get broken I'm fine > with as neoverse users with uncore permissions are say much rarer than > Apple M users. Having a cycles and a bus_cycles event is already > ambiguous, they sound the same. Renaming cycles to cpu_cycles would be > best. > > > >> FWIW I don't think Juno currently is broken if the kernel supports > > >> extended type ID? I could have missed some output in this thread but it > > >> seems like it's mostly related to Apple M hardware. I'm also a bit > > >> confused why the "supports extended type" check fails there, but maybe > > >> the v6.9 commit 25412c036 from Mark is missing? 
> > > > > > So I think your later emails clarify Arnaldo is probably missing: > > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/drivers/perf/arm_pmu.c?h=perf-tools-next&id=5c816728651ae425954542fed64d21d40cb75a9f > > > > > > Fwiw, the Apple M hardware issue came to me by way of Mark Rutland > > > (iirc), this regression report, etc. My understanding is that Apple M > > > has something like a v2 ARM PMU and the legacy events are encoded > > > incorrectly in the driver for this. The regression in v6.5 happened > > > > I'm not sure about that. The M PMU events may be incomplete, but the two > > that are there have a mapping that looks sane: > > > > static const unsigned m1_pmu_perf_map[PERF_COUNT_HW_MAX] = { > > PERF_MAP_ALL_UNSUPPORTED, > > [PERF_COUNT_HW_CPU_CYCLES] = M1_PMU_PERFCTR_CPU_CYCLES, > > [PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INSTRUCTIONS, > > /* No idea about the rest yet */ > > }; > > > > And they map to the same named events: > > > > static struct attribute *m1_pmu_event_attrs[] = { > > M1_PMU_EVENT_ATTR(cycles, M1_PMU_PERFCTR_CPU_CYCLES), > > M1_PMU_EVENT_ATTR(instructions, M1_PMU_PERFCTR_INSTRUCTIONS), > > NULL, > > }; > > > > So in this case I can't see using legacy vs sysfs events making a > > difference. Maybe there is some other case that was mentioned in a > > previous thread that I missed though. > > No idea, iirc Mark Rutland requested not to use legacy events for Apple M. > > > > because ARM's core PMUs had previously been treated as uncore PMUs, > > > meaning we wouldn't try to program legacy events on them. Fixing the > > > handling of ARM's core PMUs broke Apple M due to the broken legacy > > > event mappings. Why not fix the Apple M PMU driver? 
Well there was
> > > anyway a similar RISC-V issue reported by Atish Patra (iirc) where the
> > > RISC-V PMU driver wants to delegate the mapping of legacy events to
> > > the perf tool so the driver needn't be aware of all and future RISC-V
> > > configurations. The fix discussed with Mark, Atish, etc. has been to
> > > swap the priority of legacy and sysfs/json events so that the latter
> > > has priority. We need the revert of the revert as currently we only do
> > > this if a PMU is specified with an event, not for the general wildcard
> > > PMUs case that most people use. There was huge fallout from flipping
> >
> > Yep makes sense to do the revert if RISC-V isn't going to support any
> > legacy events. Although from what I understand that would technically
> > only require JSON to be the highest priority? Because putting named
> > events in sysfs still requires kernel involvement so doesn't get you any
> > further than supporting the legacy events?
>
> The sysfs and json event handling is interwoven, for example you can
> add to a sysfs event with json information. There are basically two
> approaches in the event parser, hardcoded legacy things and event
> names (optionally with PMU names). I'm trying to get rid of the
> hardcoded legacy things as they were fine when you had a single core
> type, but I want to have events everywhere - say instructions and
> cycles on a GPU so we can IPC on a GPU. For RISC-V as long as the
> legacy events are covered as names in json and json/sysfs has priority
> over legacy then things will be fine.
>

RISC-V does want to support legacy events as that's how users on other
architectures are used to running perf. It would be weird if we don't
support it.

Our initial reasoning behind relying on json for legacy events was to
avoid vendor specific encodings for these events in the driver. Unlike
other ISAs, RISC-V ISA doesn't define an event encoding for these
legacy events. As a result every platform vendor will have custom
encoding. Managing them in the driver is cumbersome. Many thanks to Ian
for posting the patches to reverse the priority which works fine for
RISC-V.

However, I understand that it is easier said than done and some use
cases are broken. We also discovered there are a few other use cases
which still have the same problem even if we solve the bigger problem
via json parsing for legacy events.

1. Any other user profiling application that invokes perf system calls
   directly may also try to use just legacy event attributes in
   perf_event_attr. Android simpleperf application also falls in this
   category. We need to describe the platform specific encoding
   somewhere for these applications.

2. Perf running inside guests may run on any hardware and can't be tied
   to a platform specific json file. If we bind the legacy events via
   json file, those users won't be able to use perf cycle or
   instruction without the json file available.

I don't have any good solutions for the above said problems without
specifying the encoding in the driver itself. Given all the problems
around json parsing for legacy events, we are thinking of biting the
bullet and allowing platform vendors to encode the legacy events in the
driver itself similar to other ISAs. We will try to keep the interface
as scalable as possible.

Any suggestions?

> > Seems like there is another reason to do the revert though as Mark
> > mentioned: That now directly specifying the PMU eg "-e
> > arm_cortex_a56/cycles/" opens a legacy event if the event matches one,
> > which is not the best thing to do. But the revert fixes this AFAIK, so
> > while having the priority JSON/legacy/sysfs might work for RISC-V it
> > wouldn't work for a platform that wants a slightly different sysfs event
> > than legacy but with the same name. And the priority should be
> > JSON/sysfs/legacy.
>
> The priority for events with a PMU is the sysfs/json has a priority
> over legacy names, so I don't understand what you're saying here. Your
> example shouldn't be broken.
The revert is for the case where no PMU > is specified, where the priority is the opposite which is at best > inconsistent. > > > > the priority particularly on Intel as all test expectations needed > > > updating. I've sent out similar fixes that need incorporating when the > > > revert is reverted. Ideally tools/perf/tests/parse-events.c would be > > > updated to cover ARM's PMUs that don't follow the normal pattern that > > > the core PMU is called "cpu" (this would mean that we were testing > > > event parsing on ARM was WAI wrt encoding priorities, BIG.little, > > > etc). > > > > > >> I sent a small fix the other day to make perf stat default arguments > > >> work on Juno, and didn't notice anything out of the ordinary: > > >> https://lore.kernel.org/linux-perf-users/dac6ad1d-5aca-48b4-9dcb-ff7e54ca43f6@linaro.org/T/#t > > >> I agree that change is quite narrow but it does incrementally improve > > >> things for the time being. It's possible that it would become redundant > > >> if I can just include Ian's change to use strings for Perf stat. > > > > > > I'd prefer we didn't merge this as we'd need to rebase: > > > https://lore.kernel.org/lkml/20240510053705.2462258-4-irogers@google.com/ > > > and those changes would then delete the code introduced. I'm fine with > > > adding the tests. > > > > > > There are more exotic heterogeneous core things upcoming, probably > > > also from ARM, and the thought of duplicating the default attribute > > > logic and event parsing constraints is just something I'd prefer not > > > to have to do. > > > > > > > Yep I don't have any strong feelings about this. Even if we don't merge > > it it helped me understand the code and the issue a bit. > > > > I think one thing I assumed about your change was that there was some > > dependency on these other changes. But the more I look at it I think > > it's actually fine on it's own? > > Which change? 
If the change is trying to use "cycles" to open on all > PMUs because it will be wild carded then it will run into the priority > issue. > > > Using the cycles string actually works today, even on Apple M. The only > > real remaining issue is softening the error for failure to open, but > > that's _after_ doing the revert of the revert and is separate. > > > > I will re-test that one today with fresh eyes. > > Perhaps it is other legacy events, not cycles and instructions. There > must have been a reason for this regression report but I don't have an > Apple M CPU to test on. > > Thanks, > Ian ^ permalink raw reply [flat|nested] 53+ messages in thread
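The option raised in this message, letting each vendor encode the legacy events in the driver "similar to other ISAs", follows the same table pattern as the `m1_pmu_perf_map` quoted above. The following is a userspace-compilable sketch of that pattern under stated assumptions: the `VENDOR_X_*` event codes, `struct legacy_map_entry` and `vendor_x_map_legacy()` are hypothetical and do not come from any real RISC-V driver. Only the `PERF_COUNT_HW_*` constants from `<linux/perf_event.h>` are real.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <linux/perf_event.h>

/* Hypothetical raw event codes for an imaginary vendor PMU. */
#define VENDOR_X_EVT_CYCLES       0x01
#define VENDOR_X_EVT_INSTRUCTIONS 0x02

struct legacy_map_entry {
	bool supported;    /* false for every entry not listed below */
	uint64_t raw_code; /* vendor-specific encoding programmed into the PMU */
};

/* Indexed by PERF_COUNT_HW_*; unlisted entries are zero-initialised. */
static const struct legacy_map_entry vendor_x_legacy_map[PERF_COUNT_HW_MAX] = {
	[PERF_COUNT_HW_CPU_CYCLES]   = { true, VENDOR_X_EVT_CYCLES },
	[PERF_COUNT_HW_INSTRUCTIONS] = { true, VENDOR_X_EVT_INSTRUCTIONS },
};

static_assert(PERF_COUNT_HW_INSTRUCTIONS < PERF_COUNT_HW_MAX,
	      "legacy event ids must fit the map");

/*
 * Translate a PERF_TYPE_HARDWARE config value into the vendor encoding.
 * Returns the raw code, or -ENOENT when the event has no mapping.
 */
static int64_t vendor_x_map_legacy(uint64_t config)
{
	if (config >= PERF_COUNT_HW_MAX || !vendor_x_legacy_map[config].supported)
		return -ENOENT;
	return (int64_t)vendor_x_legacy_map[config].raw_code;
}
```

With a table like this in the kernel, an application or a guest that only fills in `PERF_TYPE_HARDWARE` and `PERF_COUNT_HW_CPU_CYCLES` needs no JSON file, which is the property the two numbered use cases in the message depend on.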
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2024-08-17 1:38 ` Atish Kumar Patra @ 2024-08-20 8:58 ` James Clark 0 siblings, 0 replies; 53+ messages in thread From: James Clark @ 2024-08-20 8:58 UTC (permalink / raw) To: Atish Kumar Patra, Ian Rogers Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Mark Rutland, Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Linux regressions mailing list On 17/08/2024 2:38 am, Atish Kumar Patra wrote: > On Fri, Aug 16, 2024 at 8:30 AM Ian Rogers <irogers@google.com> wrote: >> >> On Fri, Aug 16, 2024 at 2:23 AM James Clark <james.clark@linaro.org> wrote: >>> >>> >>> >>> On 15/08/2024 6:29 pm, Ian Rogers wrote: >>>> On Wed, Aug 14, 2024 at 9:28 AM James Clark <james.clark@linaro.org> wrote: >>>>> On 07/08/2024 9:54 am, Thorsten Leemhuis wrote: >>>>>> On 01.08.24 21:05, Ian Rogers wrote: >>>>>>> On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update >>>>>>> (Thorsten Leemhuis) <regressions@leemhuis.info> wrote: >>>>>>>> >>>>>>>> [TLDR: This mail in primarily relevant for Linux kernel regression >>>>>>>> tracking. See link in footer if these mails annoy you.] >>>>>>>> >>>>>>>> On 22.11.23 00:43, Bagas Sanjaya wrote: >>>>>>>>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: >>>>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and >>>>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. >>>>>>>> >>>>>>>> #regzbot fix: perf parse-events: Make legacy events lower priority than >>>>>>>> sysfs/JSON >>>>>>>> #regzbot ignore-activity >>>>>>> >>>>>>> Note, this is still broken. >>>>>> >>>>>> Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of >>>>>> this? Or is this a "we are screwed one way or another and someone has to >>>>>> bite the bullet" situation? 
>>>>>> >>>>>> Ciao, Thorsten >>>>>> >>>>>>> The patch changed the priority in the case >>>>>>> that you do something like: >>>>>>> >>>>>>> $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark >>>>>>> >>>>>>> but if you do: >>>>>>> >>>>>>> $ perf stat -e 'cycles' benchmark >>>>>>> >>>>>>> then the broken behavior will happen as legacy events have priority >>>>>>> over sysfs/json events in that case. To fix this you need to revert: >>>>>>> 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware >>>>>>> events over legacy" >>>>>>> >>>>>>> This causes some testing issues resolved in this unmerged patch series: >>>>>>> https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/ >>>>>>> >>>>>>> There is a bug as the arm_dsu PMU advertises an event called "cycles" >>>>>>> and this PMU is present on Ampere systems. Reverting the commit above >>>>>>> will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove >>>>>>> __evlist__add_default") to fix ARM's BIG.little systems (opening a >>>>>>> cycles event on all PMUs not just 1) will cause the arm_dsu event to >>>>>>> be opened by perf record and fail as the event won't support sampling. >>>>>>> >>>>>>> The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/ >>>>>>> fixes this by only opening the cycles event on core PMUs when choosing >>>>>>> default events. >>>>>>> >>>>>>> Rather than take this patch the revert happened as Linus runs the >>>>>>> command "perf record -e cycles:pp" (ie using a specified event and not >>>>>>> defaults) and considers it a regression in the perf tool that on an >>>>>>> Ampere system to need to do "perf record -e >>>>>>> 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e >>>>>>> will choose the cycles event correctly and with better precision the >>>>>>> pp for systems that support it, but it was still considered a >>>>>>> regression in the perf tool so the revert was made to happen. 
There is >>>>>>> a lack of perf testing coverage for ARM, in particular as they choose >>>>>>> to do everything in a different way to x86. The patch in question was >>>>>>> in the linux-next tree for weeks without issues. >>>>>>> >>>>>>> ARM/Ampere could fix this by renaming the event from cycles to >>>>>>> cpu_cycles, or by following Intel's convention that anything uncore >>>>>>> uses the name clockticks rather than cycles. This could break people >>>>>>> who rely on an event called arm_dsu/cycles/ but I imagine such people >>>>>>> are rare. There has been no progress I'm aware of on renaming the >>>>>>> event. >>>>>>> >>>>>>> Making perf not terminate on opening an event for perf record seems >>>>>>> like the most likely workaround as that is at least something under >>>>>>> the tool maintainers control. ARM have discussed doing this on the >>>>>>> lists: >>>>>>> https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/ >>>>>>> but since the revert in v6.10 no patches have appeared for the v6.11 >>>>>>> merge window. Feature work like coresight improvements and ARMv9 are >>>>>>> being actively pursued by ARM, but feature work won't resolve this >>>>>>> regression. >>>>>>> >>>>> >>>>> I got some hardware with the DSU PMU so I'm going to have a go at trying >>>>> to send some fixes for this. My initial idea was to try incorporate the >>>>> "not terminate on opening" change as discussed in the link directly >>>>> above. And then do the revert of the "revert of prefer sysfs/json". >>>> >>>> Thanks, I think this would be good. The biggest issue is that none of >>>> the record logic expects a file descriptor to be not opened, deleting >>>> unopened evsels from the evlist breaks all the indexing into the >>>> mmaps, etc. Tbh, you probably wouldn't do the code this way if was >>>> written afresh. Perhaps a hashmap would map from an evsel to ring >>>> buffer mmaps, etc. Trying to avoid having global state and benefitting >>>> from encapsulation. 
I'd focus on just doing the expedient thing in the >>>> changes, which probably just means making the record code tolerant of >>>> evsels that fail to open and not modifying the evlist due to the risk >>>> it breaks the indices. >>>> >>> >>> Thanks for the tips. >>> >>>> (To point out the obvious, this work wouldn't be necessary if arm_dsu >>>> event were renamed from "cycles" to "cpu_cycles" which would also make >>>> it more intention revealing alongside the arm_dsu's "bus_cycles" event >>>> name). >>>> >>> >>> I understand but I can imagine the following conversation if we rename that: >>> >>> User: "I updated my kernel and now my (non Perf) tool fails to open >>> the DSU cycles event because it doesn't exist anymore" >>> >>> Linus/maintainers: "Oh ok yes that was a userspace breaking change, >>> lets revert it" >>> >>> Just because Perf can handle 3 different names for cycles doesn't mean >>> other tools can. >> >> cycles was a bad event name, dsu is a terrible name for what is mainly >> the l3 cache, the risk that the two are combined get broken I'm fine >> with as neoverse users with uncore permissions are say much rarer than >> Apple M users. Having a cycles and a bus_cycles event is already >> ambiguous, they sound the same. Renaming cycles to cpu_cycles would be >> best. >> >>>>> FWIW I don't think Juno currently is broken if the kernel supports >>>>> extended type ID? I could have missed some output in this thread but it >>>>> seems like it's mostly related to Apple M hardware. I'm also a bit >>>>> confused why the "supports extended type" check fails there, but maybe >>>>> the v6.9 commit 25412c036 from Mark is missing? 
>>>> >>>> So I think your later emails clarify Arnaldo is probably missing: >>>> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/drivers/perf/arm_pmu.c?h=perf-tools-next&id=5c816728651ae425954542fed64d21d40cb75a9f >>>> >>>> Fwiw, the Apple M hardware issue came to me by way of Mark Rutland >>>> (iirc), this regression report, etc. My understanding is that Apple M >>>> has something like a v2 ARM PMU and the legacy events are encoded >>>> incorrectly in the driver for this. The regression in v6.5 happened >>> >>> I'm not sure about that. The M PMU events may be incomplete, but the two >>> that are there have a mapping that looks sane: >>> >>> static const unsigned m1_pmu_perf_map[PERF_COUNT_HW_MAX] = { >>> PERF_MAP_ALL_UNSUPPORTED, >>> [PERF_COUNT_HW_CPU_CYCLES] = M1_PMU_PERFCTR_CPU_CYCLES, >>> [PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INSTRUCTIONS, >>> /* No idea about the rest yet */ >>> }; >>> >>> And they map to the same named events: >>> >>> static struct attribute *m1_pmu_event_attrs[] = { >>> M1_PMU_EVENT_ATTR(cycles, M1_PMU_PERFCTR_CPU_CYCLES), >>> M1_PMU_EVENT_ATTR(instructions, M1_PMU_PERFCTR_INSTRUCTIONS), >>> NULL, >>> }; >>> >>> So in this case I can't see using legacy vs sysfs events making a >>> difference. Maybe there is some other case that was mentioned in a >>> previous thread that I missed though. >> >> No idea, iirc Mark Rutland requested not to use legacy events for Apple M. >> >>>> because ARM's core PMUs had previously been treated as uncore PMUs, >>>> meaning we wouldn't try to program legacy events on them. Fixing the >>>> handling of ARM's core PMUs broke Apple M due to the broken legacy >>>> event mappings. Why not fix the Apple M PMU driver? Well there was >>>> anyway a similar RISC-V issue reported by Atish Patra (iirc) where the >>>> RISC-V PMU driver wants to delegate the mapping of legacy events to >>>> the perf tool so the driver needn't be aware of all and future RISC-V >>>> configurations. 
The fix discussed with Mark, Atish, etc. has been to >>>> swap the priority of legacy and sysfs/json events so that the latter >>>> has priority. We need the revert of the revert as currently we only do >>>> this if a PMU is specified with an event, not for the general wildcard >>>> PMUs case that most people use. There was huge fallout from flipping >>> >>> Yep makes sense to do the revert if RISC-V isn't going to support any >>> legacy events. Although from what I understand that would technically >>> only require JSON to be the highest priority? Because putting named >>> events in sysfs still requires kernel involvement so doesn't get you any >>> further than supporting the legacy events? >> >> The sysfs and json event handling is interwoven, for example you can >> add to a sysfs event with json information. There are basically two >> approaches in the event parser, hardcoded legacy things and event >> names (optionally with PMU names). I'm trying to get rid of the >> hardcoded legacy things as they were fine when you had a single core >> type, but I want to have events everywhere - say instructions and >> cycles on a GPU so we can IPC on a GPU. For RISC-V as long as the >> legacy events are covered as names in json and json/sysfs has priority >> over legacy then things will be fine. >> > > RISC-V does want to support legacy events as that's how users on other > architectures are used to > run perf. It would be weird if we don't support it. > > Our initial reasoning behind relying on json for legacy events to > avoid vendor specific encodings for these > events in the driver. Unlike other ISAs, RISC-V ISA doesn't define an > event encoding for these legacy > events. As a result every platform vendor will have custom encoding. > Managing them in the driver is > cumbersome. Many thanks to Ian for posting the patches to reverse the > priority which works fine for RISC-V. > > However, I understand that it is easier said than done and some use > cases are broken. 
We also discovered
> there are a few other use cases which still have the same problem even
> if we solve the bigger problem via json parsing
> for legacy events.
>
> 1. Any other user profiling application that invokes perf system calls
> directly may also try to use just legacy event attributes in
> perf_event_attr.
> Android simpleperf application also falls in this category. We need to
> describe the platform specific encoding somewhere for these
> applications.
>

I think this use case is important. Not just for profiling applications
but even something that wants to monitor itself. I imagine opening
PERF_COUNT_HW_CPU_CYCLES or INSTRUCTIONS is actually somewhat common,
and I don't think every application that wants to do perf system calls
should have to maintain JSON mappings for all platforms. That doesn't
sound feasible to me, unless there is a smart way to do it?

Maybe the mappings could be in libperf or something? But then that
still requires everyone to add that as a dependency and keep it up to
date. By that point you might as well just add them in the kernel and
keep the existing interface.

James

^ permalink raw reply [flat|nested] 53+ messages in thread
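For reference, the self-monitoring case described in the message above looks roughly like this from the application side. It is a minimal sketch, not taken from any tool in the thread: `legacy_cycles_attr()` and `open_self_cycles_counter()` are illustrative helper names, and the choice to exclude kernel and hypervisor cycles is an assumption made so that the call has a better chance of succeeding without extra privileges. The application names only the generic event; which PMU and which hardware encoding service it is left to the kernel.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* Build an attr for the generic (legacy) hardware cycles event. */
static struct perf_event_attr legacy_cycles_attr(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;          /* legacy event namespace */
	attr.config = PERF_COUNT_HW_CPU_CYCLES;  /* no vendor encoding here */
	attr.disabled = 1;                       /* start stopped */
	attr.exclude_kernel = 1;                 /* assumption: user space only */
	attr.exclude_hv = 1;
	return attr;
}

/*
 * Open a cycles counter for the calling thread on any CPU.
 * Returns a file descriptor, or -1 with errno set. glibc provides no
 * wrapper for perf_event_open, so the raw syscall is used.
 */
static int open_self_cycles_counter(void)
{
	struct perf_event_attr attr = legacy_cycles_attr();

	/* pid = 0 (this thread), cpu = -1 (any), group_fd = -1, flags = 0 */
	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}
```

A caller would then enable the counter with `ioctl(fd, PERF_EVENT_IOC_ENABLE, 0)` and `read()` a 64-bit count from the descriptor. Nothing in this path consults a JSON file, so it only works where the kernel driver knows how to map `PERF_COUNT_HW_CPU_CYCLES`.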
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
2024-08-16 15:30 ` Ian Rogers
2024-08-17 1:38 ` Atish Kumar Patra
@ 2024-08-19 14:56 ` James Clark
2024-08-19 15:44 ` Ian Rogers
1 sibling, 1 reply; 53+ messages in thread
From: James Clark @ 2024-08-19 14:56 UTC (permalink / raw)
To: Ian Rogers
Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Mark Rutland, Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Linux regressions mailing list, Atish Patra

On 16/08/2024 4:30 pm, Ian Rogers wrote:
> On Fri, Aug 16, 2024 at 2:23 AM James Clark <james.clark@linaro.org> wrote:
>>
>> On 15/08/2024 6:29 pm, Ian Rogers wrote:
>>> On Wed, Aug 14, 2024 at 9:28 AM James Clark <james.clark@linaro.org> wrote:
>>>> On 07/08/2024 9:54 am, Thorsten Leemhuis wrote:
>>>>> On 01.08.24 21:05, Ian Rogers wrote:
>>>>>> On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update
>>>>>> (Thorsten Leemhuis) <regressions@leemhuis.info> wrote:
>>>>>>>
>>>>>>> [TLDR: This mail in primarily relevant for Linux kernel regression
>>>>>>> tracking. See link in footer if these mails annoy you.]
>>>>>>>
>>>>>>> On 22.11.23 00:43, Bagas Sanjaya wrote:
>>>>>>>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote:
>>>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and
>>>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
>>>>>>>
>>>>>>> #regzbot fix: perf parse-events: Make legacy events lower priority than
>>>>>>> sysfs/JSON
>>>>>>> #regzbot ignore-activity
>>>>>>
>>>>>> Note, this is still broken.
>>>>>
>>>>> Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of
>>>>> this? Or is this a "we are screwed one way or another and someone has to
>>>>> bite the bullet" situation?
>>>>>
>>>>> Ciao, Thorsten
>>>>>
>>>>>> The patch changed the priority in the case
>>>>>> that you do something like:
>>>>>>
>>>>>>   $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark
>>>>>>
>>>>>> but if you do:
>>>>>>
>>>>>>   $ perf stat -e 'cycles' benchmark
>>>>>>
>>>>>> then the broken behavior will happen as legacy events have priority
>>>>>> over sysfs/json events in that case. To fix this you need to revert:
>>>>>> 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware
>>>>>> events over legacy"
>>>>>>
>>>>>> This causes some testing issues resolved in this unmerged patch series:
>>>>>> https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/
>>>>>>
>>>>>> There is a bug as the arm_dsu PMU advertises an event called "cycles"
>>>>>> and this PMU is present on Ampere systems. Reverting the commit above
>>>>>> will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove
>>>>>> __evlist__add_default") to fix ARM's BIG.little systems (opening a
>>>>>> cycles event on all PMUs not just 1) will cause the arm_dsu event to
>>>>>> be opened by perf record and fail as the event won't support sampling.
>>>>>>
>>>>>> The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/
>>>>>> fixes this by only opening the cycles event on core PMUs when choosing
>>>>>> default events.
>>>>>>
>>>>>> Rather than take this patch the revert happened as Linus runs the
>>>>>> command "perf record -e cycles:pp" (ie using a specified event and not
>>>>>> defaults) and considers it a regression in the perf tool that on an
>>>>>> Ampere system to need to do "perf record -e
>>>>>> 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e
>>>>>> will choose the cycles event correctly and with better precision the
>>>>>> pp for systems that support it, but it was still considered a
>>>>>> regression in the perf tool so the revert was made to happen. There is
>>>>>> a lack of perf testing coverage for ARM, in particular as they choose
>>>>>> to do everything in a different way to x86. The patch in question was
>>>>>> in the linux-next tree for weeks without issues.
>>>>>>
>>>>>> ARM/Ampere could fix this by renaming the event from cycles to
>>>>>> cpu_cycles, or by following Intel's convention that anything uncore
>>>>>> uses the name clockticks rather than cycles. This could break people
>>>>>> who rely on an event called arm_dsu/cycles/ but I imagine such people
>>>>>> are rare. There has been no progress I'm aware of on renaming the
>>>>>> event.
>>>>>>
>>>>>> Making perf not terminate on opening an event for perf record seems
>>>>>> like the most likely workaround as that is at least something under
>>>>>> the tool maintainers control. ARM have discussed doing this on the
>>>>>> lists:
>>>>>> https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/
>>>>>> but since the revert in v6.10 no patches have appeared for the v6.11
>>>>>> merge window. Feature work like coresight improvements and ARMv9 are
>>>>>> being actively pursued by ARM, but feature work won't resolve this
>>>>>> regression.
>>>>>>
>>>>
>>>> I got some hardware with the DSU PMU so I'm going to have a go at trying
>>>> to send some fixes for this. My initial idea was to try incorporate the
>>>> "not terminate on opening" change as discussed in the link directly
>>>> above. And then do the revert of the "revert of prefer sysfs/json".
>>>
>>> Thanks, I think this would be good. The biggest issue is that none of
>>> the record logic expects a file descriptor to be not opened, deleting
>>> unopened evsels from the evlist breaks all the indexing into the
>>> mmaps, etc. Tbh, you probably wouldn't do the code this way if was
>>> written afresh. Perhaps a hashmap would map from an evsel to ring
>>> buffer mmaps, etc. Trying to avoid having global state and benefitting
>>> from encapsulation. I'd focus on just doing the expedient thing in the
>>> changes, which probably just means making the record code tolerant of
>>> evsels that fail to open and not modifying the evlist due to the risk
>>> it breaks the indices.
>>>
>>
>> Thanks for the tips.
>>
>>> (To point out the obvious, this work wouldn't be necessary if arm_dsu
>>> event were renamed from "cycles" to "cpu_cycles" which would also make
>>> it more intention revealing alongside the arm_dsu's "bus_cycles" event
>>> name).
>>>
>>
>> I understand but I can imagine the following conversation if we rename that:
>>
>> User: "I updated my kernel and now my (non Perf) tool fails to open
>> the DSU cycles event because it doesn't exist anymore"
>>
>> Linus/maintainers: "Oh ok yes that was a userspace breaking change,
>> lets revert it"
>>
>> Just because Perf can handle 3 different names for cycles doesn't mean
>> other tools can.
>
> cycles was a bad event name, dsu is a terrible name for what is mainly
> the l3 cache, the risk that the two are combined get broken I'm fine
> with as neoverse users with uncore permissions are say much rarer than
> Apple M users. Having a cycles and a bus_cycles event is already
> ambiguous, they sound the same. Renaming cycles to cpu_cycles would be
> best.
>
>>>> FWIW I don't think Juno currently is broken if the kernel supports
>>>> extended type ID? I could have missed some output in this thread but it
>>>> seems like it's mostly related to Apple M hardware. I'm also a bit
>>>> confused why the "supports extended type" check fails there, but maybe
>>>> the v6.9 commit 25412c036 from Mark is missing?
>>>
>>> So I think your later emails clarify Arnaldo is probably missing:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/drivers/perf/arm_pmu.c?h=perf-tools-next&id=5c816728651ae425954542fed64d21d40cb75a9f
>>>
>>> Fwiw, the Apple M hardware issue came to me by way of Mark Rutland
>>> (iirc), this regression report, etc. My understanding is that Apple M
>>> has something like a v2 ARM PMU and the legacy events are encoded
>>> incorrectly in the driver for this. The regression in v6.5 happened
>>
>> I'm not sure about that. The M PMU events may be incomplete, but the two
>> that are there have a mapping that looks sane:
>>
>>   static const unsigned m1_pmu_perf_map[PERF_COUNT_HW_MAX] = {
>>           PERF_MAP_ALL_UNSUPPORTED,
>>           [PERF_COUNT_HW_CPU_CYCLES]   = M1_PMU_PERFCTR_CPU_CYCLES,
>>           [PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INSTRUCTIONS,
>>           /* No idea about the rest yet */
>>   };
>>
>> And they map to the same named events:
>>
>>   static struct attribute *m1_pmu_event_attrs[] = {
>>           M1_PMU_EVENT_ATTR(cycles, M1_PMU_PERFCTR_CPU_CYCLES),
>>           M1_PMU_EVENT_ATTR(instructions, M1_PMU_PERFCTR_INSTRUCTIONS),
>>           NULL,
>>   };
>>
>> So in this case I can't see using legacy vs sysfs events making a
>> difference. Maybe there is some other case that was mentioned in a
>> previous thread that I missed though.
>
> No idea, iirc Mark Rutland requested not to use legacy events for Apple M.
>

The point I was trying to make here was that there isn't _technically_
any user facing bug on Apple M with both a new kernel and new perf,
despite the issues Mark mentioned.

I think there's a bit more subtlety in Mark's request. Using sysfs is
only required for old kernels that don't support extended type ID, and
it's not specific to apple M, that's for everywhere. The other case he
mentioned was when the events are slightly different but with the same
name as legacy, which isn't the case here specifically but is already
fixed by ("perf parse-events: Make legacy events lower priority than
sysfs/JSON") (v6.8).

>>> because ARM's core PMUs had previously been treated as uncore PMUs,
>>> meaning we wouldn't try to program legacy events on them. Fixing the
>>> handling of ARM's core PMUs broke Apple M due to the broken legacy
>>> event mappings. Why not fix the Apple M PMU driver? Well there was
>>> anyway a similar RISC-V issue reported by Atish Patra (iirc) where the
>>> RISC-V PMU driver wants to delegate the mapping of legacy events to
>>> the perf tool so the driver needn't be aware of all and future RISC-V
>>> configurations. The fix discussed with Mark, Atish, etc. has been to
>>> swap the priority of legacy and sysfs/json events so that the latter
>>> has priority. We need the revert of the revert as currently we only do
>>> this if a PMU is specified with an event, not for the general wildcard
>>> PMUs case that most people use. There was huge fallout from flipping
>>
>> Yep makes sense to do the revert if RISC-V isn't going to support any
>> legacy events. Although from what I understand that would technically
>> only require JSON to be the highest priority? Because putting named
>> events in sysfs still requires kernel involvement so doesn't get you any
>> further than supporting the legacy events?
>
> The sysfs and json event handling is interwoven, for example you can
> add to a sysfs event with json information. There are basically two
> approaches in the event parser, hardcoded legacy things and event
> names (optionally with PMU names). I'm trying to get rid of the
> hardcoded legacy things as they were fine when you had a single core
> type, but I want to have events everywhere - say instructions and
> cycles on a GPU so we can IPC on a GPU. For RISC-V as long as the
> legacy events are covered as names in json and json/sysfs has priority
> over legacy then things will be fine.
>
>> Seems like there is another reason to do the revert though as Mark
>> mentioned: That now directly specifying the PMU eg "-e
>> arm_cortex_a56/cycles/" opens a legacy event if the event matches one,
>> which is not the best thing to do. But the revert fixes this AFAIK, so
>> while having the priority JSON/legacy/sysfs might work for RISC-V it
>> wouldn't work for a platform that wants a slightly different sysfs event
>> than legacy but with the same name. And the priority should be
>> JSON/sysfs/legacy.
>
> The priority for events with a PMU is the sysfs/json has a priority
> over legacy names, so I don't understand what you're saying here. Your
> example shouldn't be broken. The revert is for the case where no PMU
> is specified, where the priority is the opposite which is at best
> inconsistent.
>

Yep you're right, I got confused with the original bug report which is
now old. With commit a24d9d9dc ("perf parse-events: Make legacy events
lower priority than sysfs/JSON") (v6.8) named PMUs do prioritize sysfs.

>>> the priority particularly on Intel as all test expectations needed
>>> updating. I've sent out similar fixes that need incorporating when the
>>> revert is reverted. Ideally tools/perf/tests/parse-events.c would be
>>> updated to cover ARM's PMUs that don't follow the normal pattern that
>>> the core PMU is called "cpu" (this would mean that we were testing
>>> event parsing on ARM was WAI wrt encoding priorities, BIG.little,
>>> etc).
>>>
>>>> I sent a small fix the other day to make perf stat default arguments
>>>> work on Juno, and didn't notice anything out of the ordinary:
>>>> https://lore.kernel.org/linux-perf-users/dac6ad1d-5aca-48b4-9dcb-ff7e54ca43f6@linaro.org/T/#t
>>>> I agree that change is quite narrow but it does incrementally improve
>>>> things for the time being. It's possible that it would become redundant
>>>> if I can just include Ian's change to use strings for Perf stat.
>>>
>>> I'd prefer we didn't merge this as we'd need to rebase:
>>> https://lore.kernel.org/lkml/20240510053705.2462258-4-irogers@google.com/
>>> and those changes would then delete the code introduced. I'm fine with
>>> adding the tests.
>>>
>>> There are more exotic heterogeneous core things upcoming, probably
>>> also from ARM, and the thought of duplicating the default attribute
>>> logic and event parsing constraints is just something I'd prefer not
>>> to have to do.
>>>
>>
>> Yep I don't have any strong feelings about this. Even if we don't merge
>> it it helped me understand the code and the issue a bit.
>>
>> I think one thing I assumed about your change was that there was some
>> dependency on these other changes. But the more I look at it I think
>> it's actually fine on it's own?
>
> Which change? If the change is trying to use "cycles" to open on all
> PMUs because it will be wild carded then it will run into the priority
> issue.
>

Just patch 3 here:
https://lore.kernel.org/lkml/20240510053705.2462258-4-irogers@google.com/

I assume it works because we don't open on uncore right now. But I'm
still rebasing and testing it. So we could merge that, and then when we
do the priority revert along with the fix to ignore the DSU error it
will continue to work.

>> Using the cycles string actually works today, even on Apple M. The only
>> real remaining issue is softening the error for failure to open, but
>> that's _after_ doing the revert of the revert and is separate.
>>
>> I will re-test that one today with fresh eyes.
>
> Perhaps it is other legacy events, not cycles and instructions. There
> must have been a reason for this regression report but I don't have an
> Apple M CPU to test on.
>

This regression report is for various (admittedly extremely confusing)
combinations of kernels and perfs without the following patches:

5c81672865 ("arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability")
(v6.6 kernel release)

25412c036 ("perf print-events: make is_event_supported() more robust")
(v6.9 Perf release for Apple M)

a24d9d9dc ("perf parse-events: Make legacy events lower priority than
sysfs/JSON")
(v6.8 Perf)

With all of those applied everything is fixed even on Apple M. I don't
think anything needs to be fixed for the bare "-e cycles" that you
mentioned at the beginning of the chain because that never regressed, it
actually never worked on big.LITTLE until 5c81672865, and after that
using legacy was fine. I don't think Mark actually wants bare "cycles"
to _not_ use legacy either because it never did. He only mentioned what
happens when you really do want to target a PMU with a name (already
fixed in a24d9d9dc).

> Thanks,
> Ian
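
For readers following the "extended type ID" point in the message above, here is a minimal sketch of what that encoding looks like. It assumes the layout documented in include/uapi/linux/perf_event.h, where a PERF_TYPE_HARDWARE event may carry a PMU type ID in the upper 32 bits of config. The SKETCH_ names, the struct and the helper are hypothetical and exist only so the snippet builds without kernel headers; the PMU type ID 10 used in the note below is likewise made up (on a real system it would be read from /sys/bus/event_source/devices/<pmu>/type).

```c
#include <stdint.h>

/* Values mirror include/uapi/linux/perf_event.h; they are redefined under
 * hypothetical SKETCH_ names so this compiles without kernel headers. */
#define SKETCH_PERF_TYPE_HARDWARE        0u
#define SKETCH_PERF_COUNT_HW_CPU_CYCLES  0u
#define SKETCH_PERF_PMU_TYPE_SHIFT       32
#define SKETCH_PERF_HW_EVENT_MASK        0xffffffffULL

/* Cut-down stand-in for the two perf_event_attr fields that matter here. */
struct sketch_attr {
	uint32_t type;
	uint64_t config;
};

/* Build a legacy hardware event aimed at one specific core PMU.
 * Upper 32 bits of config: PMU type ID; lower 32 bits: legacy event ID. */
static inline struct sketch_attr
sketch_legacy_on_pmu(uint32_t pmu_type, uint64_t hw_event)
{
	struct sketch_attr attr;

	attr.type = SKETCH_PERF_TYPE_HARDWARE;
	attr.config = ((uint64_t)pmu_type << SKETCH_PERF_PMU_TYPE_SHIFT) |
		      (hw_event & SKETCH_PERF_HW_EVENT_MASK);
	return attr;
}
```

With a made-up PMU type ID of 10, sketch_legacy_on_pmu(10, SKETCH_PERF_COUNT_HW_CPU_CYCLES) yields type 0 and config 0xA00000000. A kernel without PERF_PMU_CAP_EXTENDED_HW_TYPE on the PMU has no way to honour the upper bits, which is presumably why the sysfs event is the only way to pick a specific PMU there.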
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
2024-08-19 14:56 ` James Clark
@ 2024-08-19 15:44 ` Ian Rogers
0 siblings, 0 replies; 53+ messages in thread
From: Ian Rogers @ 2024-08-19 15:44 UTC (permalink / raw)
To: James Clark
Cc: Thorsten Leemhuis, Arnaldo Carvalho de Melo, Mark Rutland, Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Asahi Linux, Linux regressions mailing list, Atish Patra

On Mon, Aug 19, 2024 at 7:56 AM James Clark <james.clark@linaro.org> wrote:
>
> [...]
>
> This regression report is for various (admittedly extremely confusing)
> combinations of kernels and perfs without the following patches:
>
> 5c81672865 ("arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability")
> (v6.6 kernel release)
>
> 25412c036 ("perf print-events: make is_event_supported() more robust")
> (v6.9 Perf release for Apple M)
>
> a24d9d9dc ("perf parse-events: Make legacy events lower priority than
> sysfs/JSON")
> (v6.8 Perf)
>
> With all of those applied everything is fixed even on Apple M. I don't
> think anything needs to be fixed for the bare "-e cycles" that you
> mentioned at the beginning of the chain because that never regressed, it
> actually never worked on big.LITTLE until 5c81672865, and after that
> using legacy was fine. I don't think Mark actually wants bare "cycles"
> to _not_ use legacy either because it never did. He only mentioned what
> happens when you really do want to target a PMU with a name (already
> fixed in a24d9d9dc).

I'm not clear, is your point that when we get regression reports on
the tool like this and Mark says things to me face-to-face at LPC we
should ignore the issue and wait for the driver fix? The PMU driver
for Apple M has fixed the legacy defaults for instructions and cycles,
great - this was the obvious fix for a driver issue from the get go.
Has it fixed all legacy values? Are you saying we should flip from
sysfs/json preferred over legacy to legacy preferred over sysfs json?

I still would like to get rid of legacy events having different wild
card behavior, cpu-cycles (legacy - matches only core PMUs) vs
cpu_cycles (sysfs - matches on all PMUs) but if we need to carry this
awkwardness for the sake of arm_dsu then *sigh* ok, it'll be forever a
potential trap when writing metrics - beware magic legacy names that
won't work on anything other than core PMUs. We carry lots of other
discrepancies around for things like arbitrary hex cut off values to
work around PMU suffix naming (5fabcdef vs a53 - both hex suffixes
with different interpretations), hotplug handling, etc.

One concern that's been raised is other tools being able to work
correctly, given the minefield set up in this regard I can imagine
legacy events working but little else. At least we can work to have
the reference implementation that comes with the kernel working.

Thanks,
Ian
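
The two lookup orders argued over in the exchange above can be put side by side as a small model. This is only a sketch of the behaviour as described in the thread (sysfs/json first when a PMU is named, legacy first for a bare event name), not the parser in tools/perf; the enum and the function name are hypothetical.

```c
#include <stdbool.h>

/* Which encoding the (modelled) parser ends up using for an event name. */
enum sketch_source {
	SKETCH_NONE,        /* name not resolvable at all */
	SKETCH_LEGACY,      /* hardcoded legacy (PERF_TYPE_HARDWARE) encoding */
	SKETCH_SYSFS_JSON   /* encoding taken from sysfs or json event tables */
};

/* Model of the priority described in the thread:
 *  - "pmu/event/" form: sysfs/json wins, legacy is only a fallback;
 *  - bare "event" form: legacy wins, sysfs/json is only a fallback. */
static inline enum sketch_source
sketch_resolve(bool pmu_specified, bool is_legacy_name, bool in_sysfs_json)
{
	if (pmu_specified) {
		if (in_sysfs_json)
			return SKETCH_SYSFS_JSON;
		return is_legacy_name ? SKETCH_LEGACY : SKETCH_NONE;
	}
	if (is_legacy_name)
		return SKETCH_LEGACY;
	return in_sysfs_json ? SKETCH_SYSFS_JSON : SKETCH_NONE;
}
```

In this model "armv8_pmuv3_0/cycles/" resolves as sketch_resolve(true, true, true), giving SKETCH_SYSFS_JSON, while a bare "cycles" resolves as sketch_resolve(false, true, true), giving SKETCH_LEGACY. That asymmetry is the inconsistency the revert of the revert was meant to remove.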
* Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5 2024-08-01 19:05 ` Ian Rogers 2024-08-07 8:54 ` Thorsten Leemhuis @ 2025-03-09 21:19 ` Ian Rogers 1 sibling, 0 replies; 53+ messages in thread From: Ian Rogers @ 2025-03-09 21:19 UTC (permalink / raw) To: Linux regressions mailing list, to: Mark Rutland Cc: Linux perf Profiling, Linux Kernel Mailing List, James Clark, cc: Marc Zyngier, Hector Martin, Arnaldo Carvalho de Melo, Asahi Linux On Thu, Aug 1, 2024 at 12:05 PM Ian Rogers <irogers@google.com> wrote: > > On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update > (Thorsten Leemhuis) <regressions@leemhuis.info> wrote: > > > > [TLDR: This mail in primarily relevant for Linux kernel regression > > tracking. See link in footer if these mails annoy you.] > > > > On 22.11.23 00:43, Bagas Sanjaya wrote: > > > On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote: > > >> Perf broke on all Apple ARM64 systems (tested almost everything), and > > >> according to maz also on Juno (so, probably all big.LITTLE) since v6.5. > > > > #regzbot fix: perf parse-events: Make legacy events lower priority than > > sysfs/JSON > > #regzbot ignore-activity > > Note, this is still broken. The patch changed the priority in the case > that you do something like: > > $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark > > but if you do: > > $ perf stat -e 'cycles' benchmark > > then the broken behavior will happen as legacy events have priority > over sysfs/json events in that case. To fix this you need to revert: > 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware > events over legacy" This still hasn't been fixed and I'm at the point of saying I no longer care except I want consistency. Let's revert the prioritization of sysfs/json events for PMUs. 
I don't want to carry around patches like: https://lore.kernel.org/r/20240926144851.245903-2-james.clark@linaro.org If this re-opens this bug then I'm fine with that, and I'm happy to point to James and Arnaldo's comments [1] saying that somehow legacy events are better, because drill down or something (what a bit pattern has to do with that, no idea, we already default on Intel to non-legacy events and drill down just dandily for topdown). Whatever, I'm fed up with dealing with mine and others' comments being taken out of context. I'm fed up with the ambiguity of two encoding systems, one with and one without PMUs specified. I'm fed up with working on PMU and event encoding, ordering, matching, metrics, etc. where it is unclear what the behavior should be. I'm fed up with ARM choosing bad uncore event names, refusing to correct them and creating a massive mess they barely help clean up other than by largely reposting my patches. I'm fed up that all of this was done for ARM and then they don't seem to care about its resolution or testing the original regression. Yes, this sucks as user land won't be able to be a source for event configuration fixes. Yes, this sucks as such functionality would slim down PMU drivers and was a behavior requested by RISC-V face-to-face with a maintainer. I don't see why I should have to fight for this other than I unexpectedly broke things in the first place (this regression report) and I was trying to help RISC-V. To be specific, I don't want the event 'instructions' be encoded as type 'hardware' and config 'instructions', be reported as 'cpu_core/instructions/' but then that event to be encoded as type 4 (RAW) and config 0xc0. The fact with this cpu-cycles will only wild card on core PMUs, but cpu_cycles will wildcard on all of them. Again, why do I have to try to fight for sanity, let's just back everything this regression report created out. We check legacy events and do their behaviors, otherwise we fall back on sysfs/json. 

Thanks,
Ian

[1] https://lore.kernel.org/all/Z8sMcta0zTWeOso4@x1/