From: Ian Rogers <irogers@google.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>,
Peter Zijlstra <peterz@infradead.org>,
mingo@redhat.com, will@kernel.org, acme@kernel.org,
namhyung@kernel.org, alexander.shishkin@linux.intel.com,
jolsa@kernel.org, adrian.hunter@intel.com,
kan.liang@linux.intel.com, linux-perf-users@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org,
linux-snps-arc@lists.infradead.org,
linux-arm-kernel@lists.infradead.org, imx@lists.linux.dev,
linux-csky@vger.kernel.org, loongarch@lists.linux.dev,
linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
sparclinux@vger.kernel.org, linux-pm@vger.kernel.org,
linux-rockchip@lists.infradead.org, dmaengine@vger.kernel.org,
linux-fpga@vger.kernel.org, amd-gfx@lists.freedesktop.org,
dri-devel@lists.freedesktop.org,
intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
coresight@lists.linaro.org, iommu@lists.linux.dev,
linux-amlogic@lists.infradead.org, linux-cxl@vger.kernel.org,
linux-arm-msm@vger.kernel.org, linux-riscv@lists.infradead.org
Subject: Re: [PATCH 12/19] perf: Ignore event state for group validation
Date: Wed, 27 Aug 2025 08:15:29 -0700
Message-ID: <CAP-5=fU0-QDMP-VG3O1qBvJ8uzHHYCQ8j1Vrzy9a0YUk=UMvHw@mail.gmail.com>
In-Reply-To: <aK6_XrA_OaLnoFkr@J2N7QTR9R3>

On Wed, Aug 27, 2025 at 1:18 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Tue, Aug 26, 2025 at 11:48:48AM -0700, Ian Rogers wrote:
> > On Tue, Aug 26, 2025 at 8:32 AM Robin Murphy <robin.murphy@arm.com> wrote:
> > >
> > > On 2025-08-26 2:03 pm, Peter Zijlstra wrote:
> > > > On Wed, Aug 13, 2025 at 06:01:04PM +0100, Robin Murphy wrote:
> > > >> It may have been different long ago, but today it seems wrong for these
> > > >> drivers to skip counting disabled sibling events in group validation,
> > > >> given that perf_event_enable() could make them schedulable again, and
> > > >> thus increase the effective size of the group later. Conversely, if a
> > > >> sibling event is truly dead then it stands to reason that the whole
> > > >> group is dead, so it's not worth going to any special effort to try to
> > > >> squeeze in a new event that's never going to run anyway. Thus, we can
> > > >> simply remove all these checks.
> > > >
> > > > So currently you can do sort of a manual event rotation inside an
> > > > over-sized group and have it work.
> > > >
> > > > I'm not sure if anybody actually does this, but it's possible.
> > > >
> > > > Eg. on a PMU that supports only 4 counters, create a group of 5 and
> > > > periodically cycle which of the 5 events is off.
> >
> > I'm not sure this is true, I thought this would fail in the
> > perf_event_open when adding the 5th event and there being insufficient
> > counters for the group.
>
> We're talking specifically about cases where the logic in a pmu's
> pmu::event_init() callback doesn't count events in specific states, and
> hence the 5th event doesn't get rejected when it is initialised.
>
> For example, in arch/x86/events/core.c, validate_group() uses
> collect_events(), which has:
>
> 	for_each_sibling_event(event, leader) {
> 		if (!is_x86_event(event) || event->state <= PERF_EVENT_STATE_OFF)
> 			continue;
>
> 		if (collect_event(cpuc, event, max_count, n))
> 			return -EINVAL;
>
> 		n++;
> 	}
>
> ... and so where an event's state is <= PERF_EVENT_STATE_OFF at init
> time, that event is not counted to see if it fits into HW counters.

Hmm... thinking out loud. At first glance it looked as though perf's
weak groups could be broken by this:
```
$ sudo perf stat -vv -e '{instructions,cycles}:W' true
...
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0x400000001 (cpu_core/PERF_COUNT_HW_INSTRUCTIONS/)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
  disabled                         1
  inherit                          1
  enable_on_exec                   1
------------------------------------------------------------
sys_perf_event_open: pid 3337764 cpu -1 group_fd -1 flags 0x8 = 5
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
  inherit                          1
------------------------------------------------------------
sys_perf_event_open: pid 3337764 cpu -1 group_fd 5 flags 0x8 = 7
...
```
Note, the group leader (instructions) is disabled because of:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/stat.c?h=perf-tools-next#n761
```
	/*
	 * Disabling all counters initially, they will be enabled
	 * either manually by us or by kernel via enable_on_exec
	 * set later.
	 */
	if (evsel__is_group_leader(evsel)) {
		attr->disabled = 1;
```
but the disabled check (state <= PERF_EVENT_STATE_OFF) is only applied
to siblings in the code you show above. So yes, you can disable sibling
events to let perf_event_open() succeed, but that does not work for the
leader, which is always counted (no PERF_EVENT_STATE_OFF check):
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/arch/x86/events/core.c?h=perf-tools-next#n1204
```
	if (is_x86_event(leader)) {
		if (collect_event(cpuc, leader, max_count, n))
			return -EINVAL;
```
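To make the asymmetry concrete, here is a minimal user-space model of
that counting logic. It is only a sketch under stated assumptions, not
the kernel code: model_event, model_state, model_collect_events() and
the skip_off_siblings flag are all names invented for illustration, and
the real collect_event() does more than compare against a counter limit.
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel event states; only the ordering matters. */
enum model_state { MODEL_STATE_OFF = -1, MODEL_STATE_INACTIVE = 0 };

struct model_event {
	enum model_state state;
};

/*
 * Simplified model of group validation (NOT the kernel implementation).
 * Returns the number of events charged against the counters, or -1 if the
 * group does not fit into max_count counters (standing in for -EINVAL).
 * skip_off_siblings selects whether siblings in the OFF state are ignored,
 * which is the behaviour being discussed in this thread.
 */
static int model_collect_events(const struct model_event *leader,
				const struct model_event *siblings,
				size_t nr_siblings, int max_count,
				bool skip_off_siblings)
{
	int n = 0;
	size_t i;

	assert(leader != NULL);

	/* The leader is charged unconditionally; its state is never examined. */
	if (n >= max_count)
		return -1;
	n++;

	for (i = 0; i < nr_siblings; i++) {
		/* Only siblings get the state check. */
		if (skip_off_siblings && siblings[i].state <= MODEL_STATE_OFF)
			continue;

		if (n >= max_count)
			return -1;
		n++;
	}

	return n;
}
```
With four counters, a leader plus four siblings fits in this model only
while one sibling is off and off siblings are skipped. Disabling the
leader instead does not help, because the leader is charged regardless
of its state.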
Thanks,
Ian