sparclinux.vger.kernel.org archive mirror
From: Robin Murphy <robin.murphy@arm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: peterz@infradead.org, mingo@redhat.com, will@kernel.org,
	acme@kernel.org, namhyung@kernel.org,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	irogers@google.com, adrian.hunter@intel.com,
	kan.liang@linux.intel.com, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org,
	linux-snps-arc@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org, imx@lists.linux.dev,
	linux-csky@vger.kernel.org, loongarch@lists.linux.dev,
	linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-rockchip@lists.infradead.org, dmaengine@vger.kernel.org,
	linux-fpga@vger.kernel.org, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, coresight@lists.linaro.org,
	iommu@lists.linux.dev, linux-amlogic@lists.infradead.org,
	linux-cxl@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-riscv@lists.infradead.org
Subject: Re: [PATCH 18/19] perf: Introduce positive capability for raw events
Date: Tue, 26 Aug 2025 23:46:02 +0100
Message-ID: <015974a4-f129-4ae5-adf9-c94b29f0576a@arm.com>
In-Reply-To: <aK259PrpyxguQzdN@J2N7QTR9R3>

On 2025-08-26 2:43 pm, Mark Rutland wrote:
> On Wed, Aug 13, 2025 at 06:01:10PM +0100, Robin Murphy wrote:
>> Only a handful of CPU PMUs accept PERF_TYPE_{RAW,HARDWARE,HW_CACHE}
>> events without registering themselves as PERF_TYPE_RAW in the first
>> place. Add an explicit opt-in for these special cases, so that we can
>> make life easier for every other driver (and probably also speed up the
>> slow-path search) by having perf_try_init_event() do the basic type
>> checking to cover the majority of cases.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> 
> 
> To bikeshed a little here, I'm not keen on the PERF_PMU_CAP_RAW_EVENTS
> name, because it's not clear what "RAW" really means, and people will
> definitely read that to mean something else.
> 
> Could we go with something like PERF_PMU_CAP_COMMON_CPU_EVENTS, to make
> it clear that this is about opting into CPU-PMU specific event types (of
> which PERF_TYPE_RAW is one)?

Indeed I started with that very intention after our previous discussion, 
but soon realised that in fact nowhere in the code is there any 
definition or even established notion of what "common" means in this 
context, so it's hardly immune to misinterpretation either. Furthermore 
the semantics of the cap as it ended up are specifically that the PMU 
wants the same behaviour as if it had registered as PERF_TYPE_RAW, so 
having "raw" in the name started to look like the more intuitive option 
after all (plus being nice and short helps.)

If anything, it's "events" that carries the implication that's proving 
hard to capture precisely and concisely here, so maybe the way to avoid 
ambiguity is to lean further away from "what it represents" naming and 
towards "what it actually does" - PERF_PMU_CAP_TYPE_RAW, anyone?

> Likewise, s/is_raw_pmu()/pmu_supports_common_cpu_events()/.

Case in point: is it any more logical and expected that supporting 
common CPU events implies a PMU should be offered software or breakpoint 
events as well? Because that's what such a mere rename would currently 
mean :/

>> ---
>>
>> A further possibility is to automatically add the cap to PERF_TYPE_RAW
>> PMUs in perf_pmu_register() to have a single point-of-use condition; I'm
>> undecided...
> 
> I reckon we don't need to automagically do that, but I reckon that
> is_raw_pmu()/pmu_supports_common_cpu_events() should only check the cap,
> and we don't read anything special into any of
> PERF_TYPE_{RAW,HARDWARE,HW_CACHE}.

OK, but that would then mean explicitly adding the cap to all 15-odd 
other drivers which register as PERF_TYPE_RAW as well, at 
which point it starts to look like a more general "I am a CPU PMU in 
terms of most typical assumptions you might want to make about that" flag...
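
As a rough illustration of that difference, here is a userspace sketch 
(not kernel code) comparing the helper as posted with the cap-only 
variant suggested in review. struct demo_pmu and demo_legacy_cpu_pmu() 
are invented for the example, and the two constants are redefined 
locally so it builds without kernel headers:

```c
#include <assert.h>
#include <stdbool.h>

/* Values as in the uapi header and the quoted hunk, redefined locally */
#define PERF_TYPE_RAW			4
#define PERF_PMU_CAP_RAW_EVENTS		0x0800

/* Hypothetical cut-down stand-in for struct pmu */
struct demo_pmu {
	int type;
	int capabilities;
};

/* As posted: registering as PERF_TYPE_RAW implies the behaviour */
static bool is_raw_pmu(const struct demo_pmu *pmu)
{
	return pmu->type == PERF_TYPE_RAW ||
	       (pmu->capabilities & PERF_PMU_CAP_RAW_EVENTS);
}

/* As suggested in review: only the cap counts */
static bool pmu_supports_common_cpu_events(const struct demo_pmu *pmu)
{
	return pmu->capabilities & PERF_PMU_CAP_RAW_EVENTS;
}

/* An existing driver which registers as PERF_TYPE_RAW and sets no caps */
static void demo_legacy_cpu_pmu(void)
{
	struct demo_pmu legacy = { .type = PERF_TYPE_RAW, .capabilities = 0 };

	/* Recognised by the posted helper without touching the driver... */
	assert(is_raw_pmu(&legacy));
	/* ...but not by the cap-only one, until the cap is added to it */
	assert(!pmu_supports_common_cpu_events(&legacy));
}
```

The two helpers only disagree for PMUs which register as PERF_TYPE_RAW 
without setting the cap, which is exactly the set of drivers that would 
need touching.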

To clarify (and perhaps something for a v2 commit message), we currently 
have 3 categories of PMU driver:

1: (Older/simpler CPUs) Registers as PERF_TYPE_RAW, wants
   PERF_TYPE_RAW/HARDWARE/HW_CACHE events
2: (Heterogeneous CPUs) Registers as dynamic type, wants
   PERF_TYPE_RAW/HARDWARE/HW_CACHE events plus events of its own type
3: (Mostly uncore) Registers as dynamic type, only wants events of its
   own type

My vested interest is in making category 3 the default behaviour, given 
that the growing majority of new drivers are uncore (and I keep having 
to write them...) However unclear the type overlaps in category 1 might 
be, it's been like that for 15 years, so I didn't feel compelled to 
churn fossils like Alpha more than reasonably necessary. Category 2 is 
only these 5 drivers, so a relatively small tweak to distinguish them 
from category 3 and let them retain the effective category 1 behaviour 
(which remains the current one of potentially still being offered 
software etc. events too) seemed like the neatest way to make progress.
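
The three categories can be modelled in userspace roughly as below. 
This is a sketch under assumptions rather than the kernel code: 
struct demo_pmu, demo_offered() and demo_categories() are made up for 
illustration, the dynamic type IDs 10 and 11 are arbitrary, and the 
early filter is written in the sense its comment in the patch describes 
(skip common events when the PMU does *not* support them).

```c
#include <assert.h>
#include <stdbool.h>

/* Static type IDs as in the uapi header, redefined locally for userspace */
enum {
	PERF_TYPE_HARDWARE,
	PERF_TYPE_SOFTWARE,
	PERF_TYPE_TRACEPOINT,
	PERF_TYPE_HW_CACHE,
	PERF_TYPE_RAW,
	PERF_TYPE_BREAKPOINT,
	PERF_TYPE_MAX,		/* dynamic types are allocated from here up */
};

#define PERF_PMU_CAP_RAW_EVENTS	0x0800

/* Hypothetical cut-down stand-in for struct pmu */
struct demo_pmu {
	int type;
	int capabilities;
};

static bool is_raw_pmu(const struct demo_pmu *pmu)
{
	return pmu->type == PERF_TYPE_RAW ||
	       (pmu->capabilities & PERF_PMU_CAP_RAW_EVENTS);
}

/* Would the early filter let this PMU's event_init() see @type at all? */
static bool demo_offered(const struct demo_pmu *pmu, int type)
{
	if (type != pmu->type &&
	    (type >= PERF_TYPE_MAX || !is_raw_pmu(pmu)))
		return false;
	return true;
}

static void demo_categories(void)
{
	struct demo_pmu cat1 = { .type = PERF_TYPE_RAW };
	struct demo_pmu cat2 = { .type = 10,
				 .capabilities = PERF_PMU_CAP_RAW_EVENTS };
	struct demo_pmu cat3 = { .type = 11 };

	/* 1: static types, but never another PMU's dynamic type */
	assert(demo_offered(&cat1, PERF_TYPE_HARDWARE));
	assert(!demo_offered(&cat1, 10));

	/* 2: static types plus its own, software events included */
	assert(demo_offered(&cat2, PERF_TYPE_HW_CACHE));
	assert(demo_offered(&cat2, PERF_TYPE_SOFTWARE));
	assert(demo_offered(&cat2, 10));
	assert(!demo_offered(&cat2, 11));

	/* 3: own type only */
	assert(demo_offered(&cat3, 11));
	assert(!demo_offered(&cat3, PERF_TYPE_RAW));
}
```

Note that category 3 needs neither a cap nor any type check of its own 
in event_init to get the "own type only" behaviour, which is the point 
of making it the default.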

I'm not saying I'm necessarily against a general overhaul of CPU PMUs 
being attempted too, just that it seems more like a whole other 
side-quest, and I'd really like to slay the uncore-boilerplate dragon first.

>> ---
>>   arch/s390/kernel/perf_cpum_cf.c    |  1 +
>>   arch/s390/kernel/perf_pai_crypto.c |  2 +-
>>   arch/s390/kernel/perf_pai_ext.c    |  2 +-
>>   arch/x86/events/core.c             |  2 +-
>>   drivers/perf/arm_pmu.c             |  1 +
>>   include/linux/perf_event.h         |  1 +
>>   kernel/events/core.c               | 15 +++++++++++++++
>>   7 files changed, 21 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
>> index 1a94e0944bc5..782ab755ddd4 100644
>> --- a/arch/s390/kernel/perf_cpum_cf.c
>> +++ b/arch/s390/kernel/perf_cpum_cf.c
>> @@ -1054,6 +1054,7 @@ static void cpumf_pmu_del(struct perf_event *event, int flags)
>>   /* Performance monitoring unit for s390x */
>>   static struct pmu cpumf_pmu = {
>>   	.task_ctx_nr  = perf_sw_context,
>> +	.capabilities = PERF_PMU_CAP_RAW_EVENTS,
>>   	.pmu_enable   = cpumf_pmu_enable,
>>   	.pmu_disable  = cpumf_pmu_disable,
>>   	.event_init   = cpumf_pmu_event_init,
> 
> Tangential, but use of perf_sw_context here looks bogus.

Indeed, according to the history it was intentional, but perhaps that no 
longer applies since the big context redesign? FWIW there seem to be a 
fair few instances of this, including Arm SPE.

Thanks,
Robin.

>> diff --git a/arch/s390/kernel/perf_pai_crypto.c b/arch/s390/kernel/perf_pai_crypto.c
>> index a64b6b056a21..b5b6d8b5d943 100644
>> --- a/arch/s390/kernel/perf_pai_crypto.c
>> +++ b/arch/s390/kernel/perf_pai_crypto.c
>> @@ -569,7 +569,7 @@ static const struct attribute_group *paicrypt_attr_groups[] = {
>>   /* Performance monitoring unit for mapped counters */
>>   static struct pmu paicrypt = {
>>   	.task_ctx_nr  = perf_hw_context,
>> -	.capabilities = PERF_PMU_CAP_SAMPLING,
>> +	.capabilities = PERF_PMU_CAP_SAMPLING | PERF_PMU_CAP_RAW_EVENTS,
>>   	.event_init   = paicrypt_event_init,
>>   	.add	      = paicrypt_add,
>>   	.del	      = paicrypt_del,
>> diff --git a/arch/s390/kernel/perf_pai_ext.c b/arch/s390/kernel/perf_pai_ext.c
>> index 1261f80c6d52..bcd28c38da70 100644
>> --- a/arch/s390/kernel/perf_pai_ext.c
>> +++ b/arch/s390/kernel/perf_pai_ext.c
>> @@ -595,7 +595,7 @@ static const struct attribute_group *paiext_attr_groups[] = {
>>   /* Performance monitoring unit for mapped counters */
>>   static struct pmu paiext = {
>>   	.task_ctx_nr  = perf_hw_context,
>> -	.capabilities = PERF_PMU_CAP_SAMPLING,
>> +	.capabilities = PERF_PMU_CAP_SAMPLING | PERF_PMU_CAP_RAW_EVENTS,
>>   	.event_init   = paiext_event_init,
>>   	.add	      = paiext_add,
>>   	.del	      = paiext_del,
>> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
>> index 789dfca2fa67..764728bb80ae 100644
>> --- a/arch/x86/events/core.c
>> +++ b/arch/x86/events/core.c
>> @@ -2697,7 +2697,7 @@ static bool x86_pmu_filter(struct pmu *pmu, int cpu)
>>   }
>>   
>>   static struct pmu pmu = {
>> -	.capabilities		= PERF_PMU_CAP_SAMPLING,
>> +	.capabilities		= PERF_PMU_CAP_SAMPLING | PERF_PMU_CAP_RAW_EVENTS,
>>   
>>   	.pmu_enable		= x86_pmu_enable,
>>   	.pmu_disable		= x86_pmu_disable,
>> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
>> index 72d8f38d0aa5..bc772a3bf411 100644
>> --- a/drivers/perf/arm_pmu.c
>> +++ b/drivers/perf/arm_pmu.c
>> @@ -877,6 +877,7 @@ struct arm_pmu *armpmu_alloc(void)
>>   		 * specific PMU.
>>   		 */
>>   		.capabilities	= PERF_PMU_CAP_SAMPLING |
>> +				  PERF_PMU_CAP_RAW_EVENTS |
>>   				  PERF_PMU_CAP_EXTENDED_REGS |
>>   				  PERF_PMU_CAP_EXTENDED_HW_TYPE,
>>   	};
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index 183b7c48b329..c6ad036c0037 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -305,6 +305,7 @@ struct perf_event_pmu_context;
>>   #define PERF_PMU_CAP_EXTENDED_HW_TYPE	0x0100
>>   #define PERF_PMU_CAP_AUX_PAUSE		0x0200
>>   #define PERF_PMU_CAP_AUX_PREFER_LARGE	0x0400
>> +#define PERF_PMU_CAP_RAW_EVENTS		0x0800
>>   
>>   /**
>>    * pmu::scope
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 71b2a6730705..2ecee76d2ae2 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -12556,11 +12556,26 @@ static inline bool has_extended_regs(struct perf_event *event)
>>   	       (event->attr.sample_regs_intr & PERF_REG_EXTENDED_MASK);
>>   }
>>   
>> +static bool is_raw_pmu(const struct pmu *pmu)
>> +{
>> +	return pmu->type == PERF_TYPE_RAW ||
>> +	       pmu->capabilities & PERF_PMU_CAP_RAW_EVENTS;
>> +}
> 
> As above, I reckon we should make this:
> 
> static bool pmu_supports_common_cpu_events(const struct pmu *pmu)
> {
> 	return pmu->capabilities & PERF_PMU_CAP_RAW_EVENTS;
> }
> 
> Other than the above, this looks good to me.
> 
> Mark.
> 
>> +
>>   static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
>>   {
>>   	struct perf_event_context *ctx = NULL;
>>   	int ret;
>>   
>> +	/*
>> +	 * Before touching anything, we can safely skip:
>> +	 * - any event for a specific PMU which is not this one
>> +	 * - any common event if this PMU doesn't support them
>> +	 */
>> +	if (event->attr.type != pmu->type &&
>> +	    (event->attr.type >= PERF_TYPE_MAX || !is_raw_pmu(pmu)))
>> +		return -ENOENT;
>> +
>>   	if (!try_module_get(pmu->module))
>>   		return -ENODEV;
>>   
>> -- 
>> 2.39.2.101.g768bb238c484.dirty
>>
