From: Marc Zyngier <maz@kernel.org>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Hector Martin <marcan@marcan.st>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Ian Rogers <irogers@google.com>,
	James Clark <james.clark@arm.com>,
	linux-perf-users@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Asahi Linux <asahi@lists.linux.dev>
Subject: Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5
Date: Thu, 23 Nov 2023 14:45:30 +0000
Message-ID: <86edggzfxx.wl-maz@kernel.org>
In-Reply-To: <ZV9gThJ52slPHqlV@FVFF77S0Q05N.cambridge.arm.com>

On Thu, 23 Nov 2023 14:23:10 +0000,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote:
> > On Tue, 21 Nov 2023 13:40:31 +0000,
> > Marc Zyngier <maz@kernel.org> wrote:
> > > 
> > > [Adding key people on Cc]
> > > 
> > > On Tue, 21 Nov 2023 12:08:48 +0000,
> > > Hector Martin <marcan@marcan.st> wrote:
> > > > 
> > > > Perf broke on all Apple ARM64 systems (tested almost everything), and
> > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> > > 
> > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any
> > > asymmetric ARM platform. It isn't clear what criteria is used to pick
> > > the PMU, but nothing works anymore.
> > > 
> > > The saving grace in my case is that Debian still ships a 6.1 perftool
> > > package, but that's obviously not going to last.
> > > 
> > > I'm happy to test potential fixes.
> > 
> > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with
> > -vvv.  And it is quite entertaining (this is taskset to an 'icestorm'
> > CPU):
> 
> Looking at this with fresh(er) eyes, I think there's a userspace bug here,
> regardless of whether one believes it's correct to convert a named-pmu event to
> a PERF_TYPE_HARDWARE event directed at that PMU.
> 
> It looks like the userspace tool is dropping the extended type ID after an
> initial probe, and requests events with plain PERF_TYPE_HARDWARE (without an
> extended type ID), which explains why we seem to get events from one PMU only.
> 
> More detail below...
> 
> Marc, if you have time, could you run the same commands (on the same kernel)
> with a perf tool built from v6.4?

Here you go:

<quote>
$ sudo taskset -c 0 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e  apple_firestorm_pmu/cycles/ -e cycles ls >/dev/null
Using CPUID 0x00000000610f0280
Attempting to add event pmu 'apple_icestorm_pmu' with 'cycles,' that may result in non-fatal errors
After aliases, add event pmu 'apple_icestorm_pmu' with 'event,' that may result in non-fatal errors
Attempting to add event pmu 'apple_firestorm_pmu' with 'cycles,' that may result in non-fatal errors
After aliases, add event pmu 'apple_firestorm_pmu' with 'event,' that may result in non-fatal errors
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
  type                             10
  size                             136
  config                           0x2
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1624462  cpu -1  group_fd -1  flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
  type                             11
  size                             136
  config                           0x2
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1624462  cpu -1  group_fd -1  flags 0x8 = 4
------------------------------------------------------------
perf_event_attr:
  size                             136
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1624462  cpu -1  group_fd -1  flags 0x8 = 5
apple_icestorm_pmu/cycles/: -1: 1492180 724333 724333
apple_firestorm_pmu/cycles/: -1: 0 724333 0
cycles: -1: 0 724333 0
apple_icestorm_pmu/cycles/: 1492180 724333 724333
apple_firestorm_pmu/cycles/: 0 724333 0
cycles: 0 724333 0

 Performance counter stats for 'ls':

         1,492,180      apple_icestorm_pmu/cycles/                                            
     <not counted>      apple_firestorm_pmu/cycles/                                             (0.00%)
     <not counted>      cycles                                                                  (0.00%)

       0.000001917 seconds time elapsed

       0.000000000 seconds user
       0.000000000 seconds sys
</quote>

and on the other cluster:

<quote>
$ sudo taskset -c 2 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e  apple_firestorm_pmu/cycles/ -e cycles ls >/dev/null
Using CPUID 0x00000000610f0280
Attempting to add event pmu 'apple_icestorm_pmu' with 'cycles,' that may result in non-fatal errors
After aliases, add event pmu 'apple_icestorm_pmu' with 'event,' that may result in non-fatal errors
Attempting to add event pmu 'apple_firestorm_pmu' with 'cycles,' that may result in non-fatal errors
After aliases, add event pmu 'apple_firestorm_pmu' with 'event,' that may result in non-fatal errors
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
  type                             10
  size                             136
  config                           0x2
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1624466  cpu -1  group_fd -1  flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
  type                             11
  size                             136
  config                           0x2
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1624466  cpu -1  group_fd -1  flags 0x8 = 4
------------------------------------------------------------
perf_event_attr:
  size                             136
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 1624466  cpu -1  group_fd -1  flags 0x8 = 5
apple_icestorm_pmu/cycles/: -1: 0 593209 0
apple_firestorm_pmu/cycles/: -1: 1038247 593209 593209
cycles: -1: 1037870 593209 593209
apple_icestorm_pmu/cycles/: 0 593209 0
apple_firestorm_pmu/cycles/: 1038247 593209 593209
cycles: 1037870 593209 593209

 Performance counter stats for 'ls':

     <not counted>      apple_icestorm_pmu/cycles/                                              (0.00%)
         1,038,247      apple_firestorm_pmu/cycles/                                           
         1,037,870      cycles                                                                

       0.000001500 seconds time elapsed

       0.000000000 seconds user
       0.000000000 seconds sys
</quote>
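
As a reading aid for the two dumps above, here is a minimal sketch of how the
aggregated counter lines can be interpreted. It assumes the three numbers on
each line are the raw count, the time the event was enabled, and the time it
was actually running on a PMU, in that order; that ordering is an assumption
about perf's verbose output and is not stated in this thread. The helper
names (Count, parse_count_line, scaled_value) are hypothetical.

```python
from typing import NamedTuple, Optional


class Count(NamedTuple):
    event: str
    value: int    # raw counter value
    enabled: int  # time the event was enabled (assumed meaning)
    running: int  # time the event was scheduled on a PMU (assumed meaning)


def parse_count_line(line: str) -> Count:
    """Parse an aggregated line such as 'cycles: 1037870 593209 593209'.

    Only the aggregated form is handled, not the per-CPU form that
    carries an extra '-1:' field before the numbers.
    """
    event, _, numbers = line.rpartition(": ")
    value, enabled, running = (int(n) for n in numbers.split())
    return Count(event, value, enabled, running)


def scaled_value(count: Count) -> Optional[float]:
    """Scale the raw value by enabled/running; None if it never ran."""
    if count.running == 0:
        return None  # corresponds to '<not counted>' in the summary
    return count.value * count.enabled / count.running


if __name__ == "__main__":
    for text in (
        "apple_icestorm_pmu/cycles/: 0 593209 0",
        "apple_firestorm_pmu/cycles/: 1038247 593209 593209",
        "cycles: 1037870 593209 593209",
    ):
        count = parse_count_line(text)
        print(count.event, scaled_value(count))
```

Under that reading, an event whose running time is zero was never scheduled on
the CPU the task was pinned to, which matches the <not counted> rows.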

For the record, this is on a 6.6-rc6 kernel, userspace perf as of v6.4.0.
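
For anyone comparing these attrs against what a newer tool requests, the
"extended type ID" Mark refers to can be sketched as follows. This assumes the
encoding from the kernel's uapi perf_event.h, where a PERF_TYPE_HARDWARE
config carries the hardware event ID in its low 32 bits and a PMU type ID in
its high 32 bits. The PMU type values 10 and 11 are taken from the dumps
above; which PMU is chosen when the high bits are zero is not shown here. The
function names are hypothetical.

```python
# Constants as defined in include/uapi/linux/perf_event.h.
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_CPU_CYCLES = 0
PERF_PMU_TYPE_SHIFT = 32
PERF_HW_EVENT_MASK = 0xFFFFFFFF


def extended_hw_config(pmu_type: int, hw_id: int) -> int:
    """Build a PERF_TYPE_HARDWARE config directed at one specific PMU."""
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | (hw_id & PERF_HW_EVENT_MASK)


def split_hw_config(config: int) -> tuple[int, int]:
    """Return (pmu_type, hw_id) from a PERF_TYPE_HARDWARE config."""
    return config >> PERF_PMU_TYPE_SHIFT, config & PERF_HW_EVENT_MASK


if __name__ == "__main__":
    # Type IDs 10 and 11 are the values printed in the attr dumps above.
    for pmu_type in (10, 11):
        config = extended_hw_config(pmu_type, PERF_COUNT_HW_CPU_CYCLES)
        print(f"type={PERF_TYPE_HARDWARE} config={config:#x}")

    # A plain 'cycles' request has no PMU type in the high bits.
    print(split_hw_config(PERF_COUNT_HW_CPU_CYCLES))
```

In the v6.4 dumps above, the named-PMU events are opened with the PMU's own
type (10 or 11) and config 0x2, while the plain cycles event shows neither
type nor config, presumably because the dump omits zero-valued fields.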

	M.

-- 
Without deviation from the norm, progress is not possible.
