Re: [PATCH 02/19] perf/hisilicon: Fix group validation

linux-fpga.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Mark Rutland <mark.rutland@arm.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: peterz@infradead.org, mingo@redhat.com, will@kernel.org,
	acme@kernel.org, namhyung@kernel.org,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	irogers@google.com, adrian.hunter@intel.com,
	kan.liang@linux.intel.com, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org,
	linux-snps-arc@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org, imx@lists.linux.dev,
	linux-csky@vger.kernel.org, loongarch@lists.linux.dev,
	linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-rockchip@lists.infradead.org, dmaengine@vger.kernel.org,
	linux-fpga@vger.kernel.org, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, coresight@lists.linaro.org,
	iommu@lists.linux.dev, linux-amlogic@lists.infradead.org,
	linux-cxl@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-riscv@lists.infradead.org
Subject: Re: [PATCH 02/19] perf/hisilicon: Fix group validation
Date: Tue, 26 Aug 2025 16:31:23 +0100	[thread overview]
Message-ID: <aK3TS3s5_Pczx1nu@J2N7QTR9R3> (raw)
In-Reply-To: <ab80cb84-42b2-4ce8-aa6c-4ce6be7a12b7@arm.com>

On Tue, Aug 26, 2025 at 03:35:48PM +0100, Robin Murphy wrote:
> On 2025-08-26 12:15 pm, Mark Rutland wrote:
> > On Wed, Aug 13, 2025 at 06:00:54PM +0100, Robin Murphy wrote:
> > > The group validation logic shared by the HiSilicon HNS3/PCIe drivers is
> > > a bit off, in that given a software group leader, it will consider that
> > > event *in place of* the actual new event being opened. At worst this
> > > could theoretically allow an unschedulable group if the software event
> > > config happens to look like one of the hardware siblings.
> > > 
> > > The uncore framework avoids that particular issue,
> > 
> > What is "the uncore framework"? I'm not sure exactly what you're
> > referring to, nor how that composes with the problem described above.
> 
> Literally that hisi_uncore_pmu.c is actually a framework for half a dozen
> individual sub-drivers rather than a "driver" itself per se, but I suppose
> that detail doesn't strictly matter at this level.

I see. My concern was just that I couldn't figure out what "the uncore
framework" was, since it sounded more generic. If you say something like
"the shared code in hisi_uncore_pmu.c", I think that would be clearer.

> > > but all 3 also share the common issue of not preventing racy access to
> > > the sibling list,
> > 
> > Can you please elaborate on this racy access to the silbing list? I'm
> > not sure exactly what you're referring to.
> 
> Hmm, yes, I guess an actual race is probably impossible since if we're still
> in the middle of opening the group leader event then we haven't yet
> allocated the fd that userspace would need to start adding siblings, even if
> it tried to guess. I leaned on "racy" as a concise way to infer "when it
> isn't locked (even though the reasons for that are more subtle)" repeatedly
> over several patches - after all, the overall theme of this series is that I
> dislike repetitive boilerplate :)
> 
> I'll dedicate some time for polishing commit messages for v2, especially the
> common context for these "part 1" patches per your feedback on patch #1.

Thanks!

> > > and some redundant checks which can be cleaned up.
> > > 
> > > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > > ---
> > >   drivers/perf/hisilicon/hisi_pcie_pmu.c   | 17 ++++++-----------
> > >   drivers/perf/hisilicon/hisi_uncore_pmu.c | 23 +++++++----------------
> > >   drivers/perf/hisilicon/hns3_pmu.c        | 17 ++++++-----------
> > >   3 files changed, 19 insertions(+), 38 deletions(-)
> > > 
> > > diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > > index c5394d007b61..3b0b2f7197d0 100644
> > > --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > > +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > > @@ -338,21 +338,16 @@ static bool hisi_pcie_pmu_validate_event_group(struct perf_event *event)
> > >   	int counters = 1;
> > >   	int num;
> > > -	event_group[0] = leader;
> > > -	if (!is_software_event(leader)) {
> > > -		if (leader->pmu != event->pmu)
> > > -			return false;
> > > +	if (leader == event)
> > > +		return true;
> > > -		if (leader != event && !hisi_pcie_pmu_cmp_event(leader, event))
> > > -			event_group[counters++] = event;
> > > -	}
> > > +	event_group[0] = event;
> > > +	if (leader->pmu == event->pmu && !hisi_pcie_pmu_cmp_event(leader, event))
> > > +		event_group[counters++] = leader;
> > 
> > Looking at this, the existing logic to share counters (which
> > hisi_pcie_pmu_cmp_event() is trying to permit) looks to be bogus, given
> > that the start/stop callbacks will reprogram the HW counters (and hence
> > can fight with one another).
> 
> Yeah, this had a dodgy smell when I first came across it, but after doing
> all the digging I think it does actually work out - the trick seems to be
> the group_leader check in hisi_pcie_pmu_get_event_idx(), with the
> implication the PMU is going to be stopped while scheduling in/out the whole
> group, so assuming hisi_pcie_pmu_del() doesn't clear the counter value in
> hardware (even though the first call nukes the rest of the event
> configuration), then the events should stay in sync.

I don't think that's sufficient. If nothing else, overflow is handled
per-event, and for a group of two identical events, upon overflow
hisi_pcie_pmu_irq() will reprogram the shared HW counter when handling
the first event, and the second event will see an arbitrary
discontinuity. Maybe no-one has spotted that due to the 2^63 counter
period that we program, but this is clearly bogus.

In addition, AFAICT the IRQ handler doesn't stop the PMU, so in general
groups aren't handled atomically, and snapshots of the counters won't be
atomic.

> It does seem somewhat nonsensical to have multiple copies of the same event
> in the same group, but I imagine it could happen with some sort of scripted
> combination of metrics, and supporting it at this level saves needing
> explicit deduplication further up. So even though my initial instinct was to
> rip it out too, in the end I concluded that that doesn't seem justified.

As above, I think it's clearly bogus. I don't think we should have
merged it as-is and it's not something I'd like to see others copy.
Other PMUs don't do this sort of event deduplication, and in general it
should be up to the user or userspace software to do that rather than
doing that badly in the kernel.

Given it was implemented with no rationale I think we should rip it out.
If that breaks someone's scripting, then we can consider implementing
something that actually works.

Mark.

next prev parent reply	other threads:[~2025-08-26 15:31 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-13 17:00 [PATCH 00/19] perf: Rework event_init checks Robin Murphy
2025-08-13 17:00 ` [PATCH 01/19] perf/arm-cmn: Fix event validation Robin Murphy
2025-08-26 10:46   ` Mark Rutland
2025-08-13 17:00 ` [PATCH 02/19] perf/hisilicon: Fix group validation Robin Murphy
2025-08-26 11:15   ` Mark Rutland
2025-08-26 13:18     ` Mark Rutland
2025-08-26 14:35     ` Robin Murphy
2025-08-26 15:31       ` Mark Rutland [this message]
2025-08-26 15:55         ` Mark Rutland
2025-08-27 14:03         ` Mark Rutland
2025-08-13 17:00 ` [PATCH 03/19] perf/imx8_ddr: " Robin Murphy
2025-08-13 17:00 ` [PATCH 04/19] perf/starfive: " Robin Murphy
2025-08-13 17:00 ` [PATCH 05/19] iommu/vt-d: Fix perfmon " Robin Murphy
2025-08-13 17:00 ` [PATCH 06/19] ARM: l2x0: Fix " Robin Murphy
2025-08-13 17:00 ` [PATCH 07/19] ARM: imx: Fix MMDC PMU " Robin Murphy
2025-08-13 17:01 ` [PATCH 08/19] perf/arm_smmu_v3: Improve " Robin Murphy
2025-08-13 17:01 ` [PATCH 09/19] perf/qcom: " Robin Murphy
2025-08-13 17:01 ` [PATCH 10/19] perf/arm-ni: Improve event validation Robin Murphy
2025-08-13 17:01 ` [PATCH 11/19] perf/arm-cci: Tidy up " Robin Murphy
2025-08-13 17:01 ` [PATCH 12/19] perf: Ignore event state for group validation Robin Murphy
2025-08-26 13:03   ` Peter Zijlstra
2025-08-26 15:32     ` Robin Murphy
2025-08-26 18:48       ` Ian Rogers
2025-08-27  8:18         ` Mark Rutland
2025-08-27 15:15           ` Ian Rogers
2025-08-13 17:01 ` [PATCH 13/19] perf: Add helper for checking grouped events Robin Murphy
2025-08-14  5:43   ` kernel test robot
2025-08-13 17:01 ` [PATCH 14/19] perf: Clean up redundant group validation Robin Murphy
2025-08-13 17:01 ` [PATCH 15/19] perf: Simplify " Robin Murphy
2025-08-13 17:01 ` [PATCH 16/19] perf: Introduce positive capability for sampling Robin Murphy
2025-08-26 13:08   ` Peter Zijlstra
2025-08-26 13:28     ` Mark Rutland
2025-08-26 16:35       ` Robin Murphy
2025-08-26 13:11   ` Leo Yan
2025-08-26 15:53     ` Robin Murphy
2025-08-27  8:06       ` Leo Yan
2025-08-13 17:01 ` [PATCH 17/19] perf: Retire PERF_PMU_CAP_NO_INTERRUPT Robin Murphy
2025-08-26 13:08   ` Peter Zijlstra
2025-08-13 17:01 ` [PATCH 18/19] perf: Introduce positive capability for raw events Robin Murphy
2025-08-19 13:15   ` Robin Murphy
2025-08-20  8:09     ` Thomas Richter
2025-08-20 11:39       ` Robin Murphy
2025-08-21  2:53   ` kernel test robot
2025-08-26 13:43   ` Mark Rutland
2025-08-26 22:46     ` Robin Murphy
2025-08-27  8:04       ` Mark Rutland
2025-08-27  5:27     ` Thomas Richter
2025-08-13 17:01 ` [PATCH 19/19] perf: Garbage-collect event_init checks Robin Murphy
2025-08-14  8:04   ` kernel test robot
2025-08-19  2:44   ` kernel test robot
2025-08-19 17:49     ` Robin Murphy
2025-08-19 13:25   ` Robin Murphy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aK3TS3s5_Pczx1nu@J2N7QTR9R3 \
    --to=mark.rutland@arm.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=coresight@lists.linaro.org \
    --cc=dmaengine@vger.kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=imx@lists.linux.dev \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=iommu@lists.linux.dev \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-amlogic@lists.infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-csky@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-fpga@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-rockchip@lists.infradead.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linux-snps-arc@lists.infradead.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=loongarch@lists.linux.dev \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=robin.murphy@arm.com \
    --cc=sparclinux@vger.kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).