From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 26 Aug 2025 16:55:49 +0100
From: Mark Rutland
To: Robin Murphy
Cc: peterz@infradead.org, mingo@redhat.com, will@kernel.org,
	acme@kernel.org, namhyung@kernel.org,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	irogers@google.com, adrian.hunter@intel.com,
	kan.liang@linux.intel.com, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org,
	linux-snps-arc@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org, imx@lists.linux.dev,
	linux-csky@vger.kernel.org, loongarch@lists.linux.dev,
	linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-rockchip@lists.infradead.org, dmaengine@vger.kernel.org,
	linux-fpga@vger.kernel.org, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, coresight@lists.linaro.org,
	iommu@lists.linux.dev, linux-amlogic@lists.infradead.org,
	linux-cxl@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-riscv@lists.infradead.org
Subject: Re: [PATCH 02/19] perf/hisilicon: Fix group validation
Precedence: bulk
X-Mailing-List: sparclinux@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Aug 26, 2025 at 04:31:23PM +0100, Mark Rutland wrote:
> On Tue, Aug 26, 2025 at 03:35:48PM +0100, Robin Murphy wrote:
> > On 2025-08-26 12:15 pm, Mark Rutland wrote:
> > > On Wed, Aug 13, 2025 at 06:00:54PM +0100, Robin Murphy wrote:
> > > > diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > > > index c5394d007b61..3b0b2f7197d0 100644
> > > > --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > > > +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > > > @@ -338,21 +338,16 @@ static bool hisi_pcie_pmu_validate_event_group(struct perf_event *event)
> > > >  	int counters = 1;
> > > >  	int num;
> > > > -	event_group[0] = leader;
> > > > -	if (!is_software_event(leader)) {
> > > > -		if (leader->pmu != event->pmu)
> > > > -			return false;
> > > > +	if (leader == event)
> > > > +		return true;
> > > > -		if (leader != event && !hisi_pcie_pmu_cmp_event(leader, event))
> > > > -			event_group[counters++] = event;
> > > > -	}
> > > > +	event_group[0] = event;
> > > > +	if (leader->pmu == event->pmu && !hisi_pcie_pmu_cmp_event(leader, event))
> > > > +		event_group[counters++] = leader;
> > >
> > > Looking at this, the existing logic to share counters (which
> > > hisi_pcie_pmu_cmp_event() is trying to permit) looks to be bogus, given
> > > that the start/stop callbacks will reprogram the HW counters (and hence
> > > can fight with one another).
> >
> > Yeah, this had a dodgy smell when I first came across it, but after doing
> > all the digging I think it does actually work out - the trick seems to be
> > the group_leader check in hisi_pcie_pmu_get_event_idx(), with the
> > implication the PMU is going to be stopped while scheduling in/out the whole
> > group, so assuming hisi_pcie_pmu_del() doesn't clear the counter value in
> > hardware (even though the first call nukes the rest of the event
> > configuration), then the events should stay in sync.
>
> I don't think that's sufficient. If nothing else, overflow is handled
> per-event, and for a group of two identical events, upon overflow
> hisi_pcie_pmu_irq() will reprogram the shared HW counter when handling
> the first event, and the second event will see an arbitrary
> discontinuity. Maybe no-one has spotted that due to the 2^63 counter
> period that we program, but this is clearly bogus.
>
> In addition, AFAICT the IRQ handler doesn't stop the PMU, so in general
> groups aren't handled atomically, and snapshots of the counters won't be
> atomic.
>
> > It does seem somewhat nonsensical to have multiple copies of the same event
> > in the same group, but I imagine it could happen with some sort of scripted
> > combination of metrics, and supporting it at this level saves needing
> > explicit deduplication further up. So even though my initial instinct was to
> > rip it out too, in the end I concluded that that doesn't seem justified.
> [...]
> As above, I think it's clearly bogus. I don't think we should have
> merged it as-is and it's not something I'd like to see others copy.
> Other PMUs don't do this sort of event deduplication, and in general it
> should be up to the user or userspace software to do that rather than
> doing that badly in the kernel.
>
> Given it was implemented with no rationale I think we should rip it out.
> If that breaks someone's scripting, then we can consider implementing
> something that actually works.

FWIW, I'm happy to go do that as a follow-up, so if that's a pain, feel
free to leave that as-is for now.

Mark.
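
To make the overflow concern above concrete, here is a minimal userspace
sketch of the failure mode. It is a toy model only, not code from
hisi_pcie_pmu.c: the names toy_event, toy_hw_counter, TOY_START,
toy_event_update(), toy_set_period() and toy_run() are all hypothetical,
and the counter values are chosen arbitrarily. It assumes two identical
events share one hardware counter, and that overflow handling updates and
reprograms the counter on behalf of the first event only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical start value programmed into the shared counter. */
#define TOY_START 100u

/* Toy stand-in for per-event software state (not struct perf_event). */
struct toy_event {
	uint64_t prev;   /* last hardware value this event observed */
	uint64_t count;  /* accumulated count reported for this event */
};

/* One hardware counter shared by both events. */
static uint64_t toy_hw_counter;

/* Fold the delta since the last observation into the event's count. */
static void toy_event_update(struct toy_event *ev)
{
	uint64_t now = toy_hw_counter;

	ev->count += now - ev->prev;
	ev->prev = now;
}

/* Reprogram the shared counter on behalf of a single event. */
static void toy_set_period(struct toy_event *ev)
{
	toy_hw_counter = TOY_START;
	ev->prev = TOY_START;
}

/* Run the scenario: 50 increments, overflow handling for 'a', 10 more. */
void toy_run(struct toy_event *a, struct toy_event *b)
{
	assert(a && b);

	toy_hw_counter = TOY_START;
	a->prev = b->prev = TOY_START;
	a->count = b->count = 0;

	toy_hw_counter += 50;   /* 50 hardware events occur */

	/* Overflow handling touches only the first event. */
	toy_event_update(a);    /* a sees the 50 events */
	toy_set_period(a);      /* counter is rewound under b's feet */

	toy_hw_counter += 10;   /* 10 more hardware events occur */

	toy_event_update(a);    /* a: 50 + 10 */
	toy_event_update(b);    /* b: only the 10 after the rewind */
}
```

In this toy run 60 hardware events occur; event a ends with a count of 60,
while event b ends with 10 because the counter was rewound while b's
previous-value snapshot still referred to the old programming.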