linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] perf/arm_cspmu: Add PMEVFILT2R support
@ 2025-03-05 16:10 Robin Murphy
  2025-03-05 16:10 ` [PATCH 1/3] perf/arm_cspmu: Move register definitons to header Robin Murphy
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Robin Murphy @ 2025-03-05 16:10 UTC (permalink / raw)
  To: will; +Cc: mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users

Hi all,

This is part of something I'm working towards, but I figure it stands
up on its own enough to be worth getting out early. Compile-tested
only, but it's straightforward enough that I'm confident.

Cheers,
Robin.


Robin Murphy (3):
  perf/arm_cspmu: Move register definitons to header
  perf/arm_cspmu: Generalise event filtering
  perf/arm_cspmu: Add PMEVFILT2R support

 drivers/perf/arm_cspmu/ampere_cspmu.c | 32 ++++-------
 drivers/perf/arm_cspmu/arm_cspmu.c    | 81 ++++++---------------------
 drivers/perf/arm_cspmu/arm_cspmu.h    | 57 +++++++++++++++++--
 drivers/perf/arm_cspmu/nvidia_cspmu.c | 21 ++++++-
 4 files changed, 100 insertions(+), 91 deletions(-)

-- 
2.39.2.101.g768bb238c484.dirty


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] perf/arm_cspmu: Move register definitons to header
  2025-03-05 16:10 [PATCH 0/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
@ 2025-03-05 16:10 ` Robin Murphy
  2025-03-06  9:45   ` James Clark
  2025-03-06 23:28   ` Ilkka Koskinen
  2025-03-05 16:10 ` [PATCH 2/3] perf/arm_cspmu: Generalise event filtering Robin Murphy
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 12+ messages in thread
From: Robin Murphy @ 2025-03-05 16:10 UTC (permalink / raw)
  To: will; +Cc: mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users

Implementations may occasionally want to refer to register offsets, so
for the sake of consistency move all of the register definitions to join
the PMIIDR fields in the private header where they can be shared. As an
example nicety, we can then define Ampere's imp-def filters in terms of
the architectural PMIMPDEF range rather than open-coded offsets.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/perf/arm_cspmu/ampere_cspmu.c |  8 ++---
 drivers/perf/arm_cspmu/arm_cspmu.c    | 45 --------------------------
 drivers/perf/arm_cspmu/arm_cspmu.h    | 46 +++++++++++++++++++++++++++
 3 files changed, 50 insertions(+), 49 deletions(-)

diff --git a/drivers/perf/arm_cspmu/ampere_cspmu.c b/drivers/perf/arm_cspmu/ampere_cspmu.c
index f72f5689923c..31cc1a4ac9df 100644
--- a/drivers/perf/arm_cspmu/ampere_cspmu.c
+++ b/drivers/perf/arm_cspmu/ampere_cspmu.c
@@ -10,10 +10,10 @@
 
 #include "arm_cspmu.h"
 
-#define PMAUXR0		0xD80
-#define PMAUXR1		0xD84
-#define PMAUXR2		0xD88
-#define PMAUXR3		0xD8C
+#define PMAUXR0		PMIMPDEF
+#define PMAUXR1		(PMIMPDEF + 0x4)
+#define PMAUXR2		(PMIMPDEF + 0x8)
+#define PMAUXR3		(PMIMPDEF + 0xC)
 
 #define to_ampere_cspmu_ctx(cspmu)	((struct ampere_cspmu_ctx *)(cspmu->impl.ctx))
 
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index 81e8b97e9353..769466d55bea 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -40,51 +40,6 @@
 	ARM_CSPMU_EXT_ATTR(_name, arm_cspmu_cpumask_show,	\
 				(unsigned long)_config)
 
-/*
- * CoreSight PMU Arch register offsets.
- */
-#define PMEVCNTR_LO					0x0
-#define PMEVCNTR_HI					0x4
-#define PMEVTYPER					0x400
-#define PMCCFILTR					0x47C
-#define PMEVFILTR					0xA00
-#define PMCNTENSET					0xC00
-#define PMCNTENCLR					0xC20
-#define PMINTENSET					0xC40
-#define PMINTENCLR					0xC60
-#define PMOVSCLR					0xC80
-#define PMOVSSET					0xCC0
-#define PMCFGR						0xE00
-#define PMCR						0xE04
-#define PMIIDR						0xE08
-
-/* PMCFGR register field */
-#define PMCFGR_NCG					GENMASK(31, 28)
-#define PMCFGR_HDBG					BIT(24)
-#define PMCFGR_TRO					BIT(23)
-#define PMCFGR_SS					BIT(22)
-#define PMCFGR_FZO					BIT(21)
-#define PMCFGR_MSI					BIT(20)
-#define PMCFGR_UEN					BIT(19)
-#define PMCFGR_NA					BIT(17)
-#define PMCFGR_EX					BIT(16)
-#define PMCFGR_CCD					BIT(15)
-#define PMCFGR_CC					BIT(14)
-#define PMCFGR_SIZE					GENMASK(13, 8)
-#define PMCFGR_N					GENMASK(7, 0)
-
-/* PMCR register field */
-#define PMCR_TRO					BIT(11)
-#define PMCR_HDBG					BIT(10)
-#define PMCR_FZO					BIT(9)
-#define PMCR_NA						BIT(8)
-#define PMCR_DP						BIT(5)
-#define PMCR_X						BIT(4)
-#define PMCR_D						BIT(3)
-#define PMCR_C						BIT(2)
-#define PMCR_P						BIT(1)
-#define PMCR_E						BIT(0)
-
 /* Each SET/CLR register supports up to 32 counters. */
 #define ARM_CSPMU_SET_CLR_COUNTER_SHIFT		5
 #define ARM_CSPMU_SET_CLR_COUNTER_NUM		\
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
index 2621f3111148..576249e0deea 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.h
+++ b/drivers/perf/arm_cspmu/arm_cspmu.h
@@ -65,6 +65,52 @@
 /* The cycle counter, if implemented, is located at counter[31]. */
 #define ARM_CSPMU_CYCLE_CNTR_IDX	31
 
+/*
+ * CoreSight PMU Arch register offsets.
+ */
+#define PMEVCNTR_LO			0x0
+#define PMEVCNTR_HI			0x4
+#define PMEVTYPER			0x400
+#define PMCCFILTR			0x47C
+#define PMEVFILTR			0xA00
+#define PMCNTENSET			0xC00
+#define PMCNTENCLR			0xC20
+#define PMINTENSET			0xC40
+#define PMINTENCLR			0xC60
+#define PMOVSCLR			0xC80
+#define PMOVSSET			0xCC0
+#define PMIMPDEF			0xD80
+#define PMCFGR				0xE00
+#define PMCR				0xE04
+#define PMIIDR				0xE08
+
+/* PMCFGR register field */
+#define PMCFGR_NCG			GENMASK(31, 28)
+#define PMCFGR_HDBG			BIT(24)
+#define PMCFGR_TRO			BIT(23)
+#define PMCFGR_SS			BIT(22)
+#define PMCFGR_FZO			BIT(21)
+#define PMCFGR_MSI			BIT(20)
+#define PMCFGR_UEN			BIT(19)
+#define PMCFGR_NA			BIT(17)
+#define PMCFGR_EX			BIT(16)
+#define PMCFGR_CCD			BIT(15)
+#define PMCFGR_CC			BIT(14)
+#define PMCFGR_SIZE			GENMASK(13, 8)
+#define PMCFGR_N			GENMASK(7, 0)
+
+/* PMCR register field */
+#define PMCR_TRO			BIT(11)
+#define PMCR_HDBG			BIT(10)
+#define PMCR_FZO			BIT(9)
+#define PMCR_NA				BIT(8)
+#define PMCR_DP				BIT(5)
+#define PMCR_X				BIT(4)
+#define PMCR_D				BIT(3)
+#define PMCR_C				BIT(2)
+#define PMCR_P				BIT(1)
+#define PMCR_E				BIT(0)
+
 /* PMIIDR register field */
 #define ARM_CSPMU_PMIIDR_IMPLEMENTER	GENMASK(11, 0)
 #define ARM_CSPMU_PMIIDR_PRODUCTID	GENMASK(31, 20)
-- 
2.39.2.101.g768bb238c484.dirty


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] perf/arm_cspmu: Generalise event filtering
  2025-03-05 16:10 [PATCH 0/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
  2025-03-05 16:10 ` [PATCH 1/3] perf/arm_cspmu: Move register definitons to header Robin Murphy
@ 2025-03-05 16:10 ` Robin Murphy
  2025-03-06  9:45   ` James Clark
  2025-03-06 23:29   ` Ilkka Koskinen
  2025-03-05 16:10 ` [PATCH 3/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 12+ messages in thread
From: Robin Murphy @ 2025-03-05 16:10 UTC (permalink / raw)
  To: will; +Cc: mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users

The notion of a single u32 filter value for any event doesn't scale well
when the potential architectural scope is already two 64-bit values, and
implementations may add custom stuff on the side too. Rather than try to
thread arbitrary filter data through the common path, let's just make
the set_ev_filter op self-contained in terms of parsing and configuring
any and all filtering for the given event - splitting out a distinct op
for cycles events which inherently differ - and let implementations
override the whole thing if they want to do something different. This
already allows the Ampere code to stop looking a bit hacky.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/perf/arm_cspmu/ampere_cspmu.c | 24 ++++++----------------
 drivers/perf/arm_cspmu/arm_cspmu.c    | 29 +++++++++++----------------
 drivers/perf/arm_cspmu/arm_cspmu.h    |  8 ++++----
 drivers/perf/arm_cspmu/nvidia_cspmu.c | 21 ++++++++++++++++++-
 4 files changed, 42 insertions(+), 40 deletions(-)

diff --git a/drivers/perf/arm_cspmu/ampere_cspmu.c b/drivers/perf/arm_cspmu/ampere_cspmu.c
index 31cc1a4ac9df..b8ca69fd9d1d 100644
--- a/drivers/perf/arm_cspmu/ampere_cspmu.c
+++ b/drivers/perf/arm_cspmu/ampere_cspmu.c
@@ -132,32 +132,20 @@ ampere_cspmu_get_name(const struct arm_cspmu *cspmu)
 	return ctx->name;
 }
 
-static u32 ampere_cspmu_event_filter(const struct perf_event *event)
+static void ampere_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
+				       const struct perf_event *event)
 {
 	/*
-	 * PMEVFILTR or PMCCFILTR aren't used in Ampere SoC PMU but are marked
-	 * as RES0. Make sure, PMCCFILTR is written zero.
+	 * PMCCFILTR is RES0, so this is just a dummy callback to override
+	 * the default implementation and avoid writing to it.
 	 */
-	return 0;
 }
 
 static void ampere_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
-				       struct hw_perf_event *hwc,
-				       u32 filter)
+				       const struct perf_event *event)
 {
-	struct perf_event *event;
-	unsigned int idx;
 	u32 threshold, rank, bank;
 
-	/*
-	 * At this point, all the events have the same filter settings.
-	 * Therefore, take the first event and use its configuration.
-	 */
-	idx = find_first_bit(cspmu->hw_events.used_ctrs,
-			     cspmu->cycle_counter_logical_idx);
-
-	event = cspmu->hw_events.events[idx];
-
 	threshold	= get_threshold(event);
 	rank		= get_rank(event);
 	bank		= get_bank(event);
@@ -233,7 +221,7 @@ static int ampere_cspmu_init_ops(struct arm_cspmu *cspmu)
 
 	cspmu->impl.ctx = ctx;
 
-	impl_ops->event_filter		= ampere_cspmu_event_filter;
+	impl_ops->set_cc_filter		= ampere_cspmu_set_cc_filter;
 	impl_ops->set_ev_filter		= ampere_cspmu_set_ev_filter;
 	impl_ops->validate_event	= ampere_cspmu_validate_event;
 	impl_ops->get_name		= ampere_cspmu_get_name;
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index 769466d55bea..053bb7920df6 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -66,7 +66,9 @@ static unsigned long arm_cspmu_cpuhp_state;
 static DEFINE_MUTEX(arm_cspmu_lock);
 
 static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
-				    struct hw_perf_event *hwc, u32 filter);
+				    const struct perf_event *event);
+static void arm_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
+				    const struct perf_event *event);
 
 static struct acpi_apmt_node *arm_cspmu_apmt_node(struct device *dev)
 {
@@ -205,11 +207,6 @@ static bool arm_cspmu_is_cycle_counter_event(const struct perf_event *event)
 	return (event->attr.config == ARM_CSPMU_EVT_CYCLES_DEFAULT);
 }
 
-static u32 arm_cspmu_event_filter(const struct perf_event *event)
-{
-	return event->attr.config1 & ARM_CSPMU_FILTER_MASK;
-}
-
 static ssize_t arm_cspmu_identifier_show(struct device *dev,
 					 struct device_attribute *attr,
 					 char *page)
@@ -371,7 +368,7 @@ static int arm_cspmu_init_impl_ops(struct arm_cspmu *cspmu)
 		DEFAULT_IMPL_OP(get_name),
 		DEFAULT_IMPL_OP(is_cycle_counter_event),
 		DEFAULT_IMPL_OP(event_type),
-		DEFAULT_IMPL_OP(event_filter),
+		DEFAULT_IMPL_OP(set_cc_filter),
 		DEFAULT_IMPL_OP(set_ev_filter),
 		DEFAULT_IMPL_OP(event_attr_is_visible),
 	};
@@ -767,26 +764,26 @@ static inline void arm_cspmu_set_event(struct arm_cspmu *cspmu,
 }
 
 static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
-					struct hw_perf_event *hwc,
-					u32 filter)
+				    const struct perf_event *event)
 {
+	u32 filter = event->attr.config1 & ARM_CSPMU_FILTER_MASK;
 	u32 offset = PMEVFILTR + (4 * hwc->idx);
 
 	writel(filter, cspmu->base0 + offset);
 }
 
-static inline void arm_cspmu_set_cc_filter(struct arm_cspmu *cspmu, u32 filter)
+static void arm_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
+				    const struct perf_event *event)
 {
-	u32 offset = PMCCFILTR;
+	u32 filter = event->attr.config1 & ARM_CSPMU_FILTER_MASK;
 
-	writel(filter, cspmu->base0 + offset);
+	writel(filter, cspmu->base0 + PMCCFILTR);
 }
 
 static void arm_cspmu_start(struct perf_event *event, int pmu_flags)
 {
 	struct arm_cspmu *cspmu = to_arm_cspmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
-	u32 filter;
 
 	/* We always reprogram the counter */
 	if (pmu_flags & PERF_EF_RELOAD)
@@ -794,13 +791,11 @@ static void arm_cspmu_start(struct perf_event *event, int pmu_flags)
 
 	arm_cspmu_set_event_period(event);
 
-	filter = cspmu->impl.ops.event_filter(event);
-
 	if (event->hw.extra_reg.idx == cspmu->cycle_counter_logical_idx) {
-		arm_cspmu_set_cc_filter(cspmu, filter);
+		cspmu->impl.ops.set_cc_filter(cspmu, event);
 	} else {
 		arm_cspmu_set_event(cspmu, hwc);
-		cspmu->impl.ops.set_ev_filter(cspmu, hwc, filter);
+		cspmu->impl.ops.set_ev_filter(cspmu, event);
 	}
 
 	hwc->state = 0;
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
index 576249e0deea..d59040d6a7e3 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.h
+++ b/drivers/perf/arm_cspmu/arm_cspmu.h
@@ -149,11 +149,11 @@ struct arm_cspmu_impl_ops {
 	bool (*is_cycle_counter_event)(const struct perf_event *event);
 	/* Decode event type/id from configs */
 	u32 (*event_type)(const struct perf_event *event);
-	/* Decode filter value from configs */
-	u32 (*event_filter)(const struct perf_event *event);
-	/* Set event filter */
+	/* Set event filters */
+	void (*set_cc_filter)(struct arm_cspmu *cspmu,
+			      const struct perf_event *event);
 	void (*set_ev_filter)(struct arm_cspmu *cspmu,
-			      struct hw_perf_event *hwc, u32 filter);
+			      const struct perf_event *event);
 	/* Implementation specific event validation */
 	int (*validate_event)(struct arm_cspmu *cspmu,
 			      struct perf_event *event);
diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c b/drivers/perf/arm_cspmu/nvidia_cspmu.c
index 8116c7846a46..9e817f120828 100644
--- a/drivers/perf/arm_cspmu/nvidia_cspmu.c
+++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c
@@ -183,6 +183,24 @@ static u32 nv_cspmu_event_filter(const struct perf_event *event)
 	return filter_val;
 }
 
+static void nv_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
+				   const struct perf_event *event)
+{
+	u32 filter = nv_cspmu_event_filter(event);
+	u32 offset = PMEVFILTR + (4 * event->hw.idx);
+
+	writel(filter, cspmu->base0 + offset);
+}
+
+static void nv_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
+				   const struct perf_event *event)
+{
+	u32 filter = nv_cspmu_event_filter(event);
+
+	writel(filter, cspmu->base0 + PMCCFILTR);
+}
+
+
 enum nv_cspmu_name_fmt {
 	NAME_FMT_GENERIC,
 	NAME_FMT_SOCKET
@@ -322,7 +340,8 @@ static int nv_cspmu_init_ops(struct arm_cspmu *cspmu)
 	cspmu->impl.ctx = ctx;
 
 	/* NVIDIA specific callbacks. */
-	impl_ops->event_filter			= nv_cspmu_event_filter;
+	impl_ops->set_cc_filter			= nv_cspmu_set_cc_filter;
+	impl_ops->set_ev_filter			= nv_cspmu_set_ev_filter;
 	impl_ops->get_event_attrs		= nv_cspmu_get_event_attrs;
 	impl_ops->get_format_attrs		= nv_cspmu_get_format_attrs;
 	impl_ops->get_name			= nv_cspmu_get_name;
-- 
2.39.2.101.g768bb238c484.dirty


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] perf/arm_cspmu: Add PMEVFILT2R support
  2025-03-05 16:10 [PATCH 0/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
  2025-03-05 16:10 ` [PATCH 1/3] perf/arm_cspmu: Move register definitons to header Robin Murphy
  2025-03-05 16:10 ` [PATCH 2/3] perf/arm_cspmu: Generalise event filtering Robin Murphy
@ 2025-03-05 16:10 ` Robin Murphy
  2025-03-06  9:32   ` James Clark
  2025-03-06 23:30   ` Ilkka Koskinen
  2025-03-06 23:26 ` [PATCH 0/3] " Ilkka Koskinen
  2025-03-13 22:22 ` Will Deacon
  4 siblings, 2 replies; 12+ messages in thread
From: Robin Murphy @ 2025-03-05 16:10 UTC (permalink / raw)
  To: will; +Cc: mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users

Architecturally we have two filters for each regular event counter,
so add generic support for the second one too.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 7 +++++--
 drivers/perf/arm_cspmu/arm_cspmu.h | 3 +++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index 053bb7920df6..efa9b229e701 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -183,6 +183,7 @@ arm_cspmu_event_attr_is_visible(struct kobject *kobj,
 static struct attribute *arm_cspmu_format_attrs[] = {
 	ARM_CSPMU_FORMAT_EVENT_ATTR,
 	ARM_CSPMU_FORMAT_FILTER_ATTR,
+	ARM_CSPMU_FORMAT_FILTER2_ATTR,
 	NULL,
 };
 
@@ -767,9 +768,11 @@ static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
 				    const struct perf_event *event)
 {
 	u32 filter = event->attr.config1 & ARM_CSPMU_FILTER_MASK;
-	u32 offset = PMEVFILTR + (4 * hwc->idx);
+	u32 filter2 = event->attr.config2 & ARM_CSPMU_FILTER_MASK;
+	u32 offset = 4 * event->hw.idx;
 
-	writel(filter, cspmu->base0 + offset);
+	writel(filter, cspmu->base0 + PMEVFILTR + offset);
+	writel(filter2, cspmu->base0 + PMEVFILT2R + offset);
 }
 
 static void arm_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
index d59040d6a7e3..19684b76bd96 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.h
+++ b/drivers/perf/arm_cspmu/arm_cspmu.h
@@ -47,6 +47,8 @@
 /* Default filter format */
 #define ARM_CSPMU_FORMAT_FILTER_ATTR	\
 	ARM_CSPMU_FORMAT_ATTR(filter, "config1:0-31")
+#define ARM_CSPMU_FORMAT_FILTER2_ATTR	\
+	ARM_CSPMU_FORMAT_ATTR(filter2, "config2:0-31")
 
 /*
  * This is the default event number for cycle count, if supported, since the
@@ -72,6 +74,7 @@
 #define PMEVCNTR_HI			0x4
 #define PMEVTYPER			0x400
 #define PMCCFILTR			0x47C
+#define PMEVFILT2R			0x800
 #define PMEVFILTR			0xA00
 #define PMCNTENSET			0xC00
 #define PMCNTENCLR			0xC20
-- 
2.39.2.101.g768bb238c484.dirty


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] perf/arm_cspmu: Add PMEVFILT2R support
  2025-03-05 16:10 ` [PATCH 3/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
@ 2025-03-06  9:32   ` James Clark
  2025-03-06 23:30   ` Ilkka Koskinen
  1 sibling, 0 replies; 12+ messages in thread
From: James Clark @ 2025-03-06  9:32 UTC (permalink / raw)
  To: Robin Murphy, will
  Cc: mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users



On 05/03/2025 4:10 pm, Robin Murphy wrote:
> Architecturally we have two filters for each regular event counter,
> so add generic support for the second one too.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>   drivers/perf/arm_cspmu/arm_cspmu.c | 7 +++++--
>   drivers/perf/arm_cspmu/arm_cspmu.h | 3 +++
>   2 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
> index 053bb7920df6..efa9b229e701 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> @@ -183,6 +183,7 @@ arm_cspmu_event_attr_is_visible(struct kobject *kobj,
>   static struct attribute *arm_cspmu_format_attrs[] = {
>   	ARM_CSPMU_FORMAT_EVENT_ATTR,
>   	ARM_CSPMU_FORMAT_FILTER_ATTR,
> +	ARM_CSPMU_FORMAT_FILTER2_ATTR,
>   	NULL,
>   };
>   
> @@ -767,9 +768,11 @@ static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
>   				    const struct perf_event *event)
>   {
>   	u32 filter = event->attr.config1 & ARM_CSPMU_FILTER_MASK;
> -	u32 offset = PMEVFILTR + (4 * hwc->idx);
> +	u32 filter2 = event->attr.config2 & ARM_CSPMU_FILTER_MASK;
> +	u32 offset = 4 * event->hw.idx;
>   
> -	writel(filter, cspmu->base0 + offset);
> +	writel(filter, cspmu->base0 + PMEVFILTR + offset);
> +	writel(filter2, cspmu->base0 + PMEVFILT2R + offset);
>   }
>   
>   static void arm_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
> index d59040d6a7e3..19684b76bd96 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.h
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.h
> @@ -47,6 +47,8 @@
>   /* Default filter format */
>   #define ARM_CSPMU_FORMAT_FILTER_ATTR	\
>   	ARM_CSPMU_FORMAT_ATTR(filter, "config1:0-31")
> +#define ARM_CSPMU_FORMAT_FILTER2_ATTR	\
> +	ARM_CSPMU_FORMAT_ATTR(filter2, "config2:0-31")
>   
>   /*
>    * This is the default event number for cycle count, if supported, since the
> @@ -72,6 +74,7 @@
>   #define PMEVCNTR_HI			0x4
>   #define PMEVTYPER			0x400
>   #define PMCCFILTR			0x47C
> +#define PMEVFILT2R			0x800
>   #define PMEVFILTR			0xA00
>   #define PMCNTENSET			0xC00
>   #define PMCNTENCLR			0xC20

Reviewed-by: James Clark <james.clark@linaro.org>

Minor comment that the doc linked at the beginning of the file "ARM IHI 
0091 A.a-00bet0" doesn't have PMEVFILT2R, you need the latest version 
for that.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] perf/arm_cspmu: Generalise event filtering
  2025-03-05 16:10 ` [PATCH 2/3] perf/arm_cspmu: Generalise event filtering Robin Murphy
@ 2025-03-06  9:45   ` James Clark
  2025-03-06 23:29   ` Ilkka Koskinen
  1 sibling, 0 replies; 12+ messages in thread
From: James Clark @ 2025-03-06  9:45 UTC (permalink / raw)
  To: Robin Murphy, will
  Cc: mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users



On 05/03/2025 4:10 pm, Robin Murphy wrote:
> The notion of a single u32 filter value for any event doesn't scale well
> when the potential architectural scope is already two 64-bit values, and
> implementations may add custom stuff on the side too. Rather than try to
> thread arbitrary filter data through the common path, let's just make
> the set_ev_filter op self-contained in terms of parsing and configuring
> any and all filtering for the given event - splitting out a distinct op
> for cycles events which inherently differ - and let implementations
> override the whole thing if they want to do something different. This
> already allows the Ampere code to stop looking a bit hacky.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Reviewed-by: James Clark <james.clark@linaro.org>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] perf/arm_cspmu: Move register definitons to header
  2025-03-05 16:10 ` [PATCH 1/3] perf/arm_cspmu: Move register definitons to header Robin Murphy
@ 2025-03-06  9:45   ` James Clark
  2025-03-06 23:28   ` Ilkka Koskinen
  1 sibling, 0 replies; 12+ messages in thread
From: James Clark @ 2025-03-06  9:45 UTC (permalink / raw)
  To: Robin Murphy, will
  Cc: mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users



On 05/03/2025 4:10 pm, Robin Murphy wrote:
> Implementations may occasionally want to refer to register offsets, so
> for the sake of consistency move all of the register definitions to join
> the PMIIDR fields in the private header where they can be shared. As an
> example nicety, we can then define Ampere's imp-def filters in terms of
> the architectural PMIMPDEF range rather than open-coded offsets.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---

Reviewed-by: James Clark <james.clark@linaro.org>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] perf/arm_cspmu: Add PMEVFILT2R support
  2025-03-05 16:10 [PATCH 0/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
                   ` (2 preceding siblings ...)
  2025-03-05 16:10 ` [PATCH 3/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
@ 2025-03-06 23:26 ` Ilkka Koskinen
  2025-03-13 22:22 ` Will Deacon
  4 siblings, 0 replies; 12+ messages in thread
From: Ilkka Koskinen @ 2025-03-06 23:26 UTC (permalink / raw)
  To: Robin Murphy
  Cc: will, mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users



On Wed, 5 Mar 2025, Robin Murphy wrote:

> Hi all,
>
> This is part of something I'm working towards, but I figure it stands
> up on its own enough to be worth getting out early. Compile-tested
> only, but it's straightforward enough that I'm confident.
>
> Cheers,
> Robin.

I ran a very quick test on Ampere AmpereOne and everything seemed to work
as expected.

Cheers, Ilkka

> Robin Murphy (3):
>  perf/arm_cspmu: Move register definitons to header
>  perf/arm_cspmu: Generalise event filtering
>  perf/arm_cspmu: Add PMEVFILT2R support
>
> drivers/perf/arm_cspmu/ampere_cspmu.c | 32 ++++-------
> drivers/perf/arm_cspmu/arm_cspmu.c    | 81 ++++++---------------------
> drivers/perf/arm_cspmu/arm_cspmu.h    | 57 +++++++++++++++++--
> drivers/perf/arm_cspmu/nvidia_cspmu.c | 21 ++++++-
> 4 files changed, 100 insertions(+), 91 deletions(-)
>
> -- 
> 2.39.2.101.g768bb238c484.dirty
>
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] perf/arm_cspmu: Move register definitons to header
  2025-03-05 16:10 ` [PATCH 1/3] perf/arm_cspmu: Move register definitons to header Robin Murphy
  2025-03-06  9:45   ` James Clark
@ 2025-03-06 23:28   ` Ilkka Koskinen
  1 sibling, 0 replies; 12+ messages in thread
From: Ilkka Koskinen @ 2025-03-06 23:28 UTC (permalink / raw)
  To: Robin Murphy
  Cc: will, mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users



On Wed, 5 Mar 2025, Robin Murphy wrote:

> Implementations may occasionally want to refer to register offsets, so
> for the sake of consistency move all of the register definitions to join
> the PMIIDR fields in the private header where they can be shared. As an
> example nicety, we can then define Ampere's imp-def filters in terms of
> the architectural PMIMPDEF range rather than open-coded offsets.
>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>


> ---
> drivers/perf/arm_cspmu/ampere_cspmu.c |  8 ++---
> drivers/perf/arm_cspmu/arm_cspmu.c    | 45 --------------------------
> drivers/perf/arm_cspmu/arm_cspmu.h    | 46 +++++++++++++++++++++++++++
> 3 files changed, 50 insertions(+), 49 deletions(-)
>
> diff --git a/drivers/perf/arm_cspmu/ampere_cspmu.c b/drivers/perf/arm_cspmu/ampere_cspmu.c
> index f72f5689923c..31cc1a4ac9df 100644
> --- a/drivers/perf/arm_cspmu/ampere_cspmu.c
> +++ b/drivers/perf/arm_cspmu/ampere_cspmu.c
> @@ -10,10 +10,10 @@
>
> #include "arm_cspmu.h"
>
> -#define PMAUXR0		0xD80
> -#define PMAUXR1		0xD84
> -#define PMAUXR2		0xD88
> -#define PMAUXR3		0xD8C
> +#define PMAUXR0		PMIMPDEF
> +#define PMAUXR1		(PMIMPDEF + 0x4)
> +#define PMAUXR2		(PMIMPDEF + 0x8)
> +#define PMAUXR3		(PMIMPDEF + 0xC)
>
> #define to_ampere_cspmu_ctx(cspmu)	((struct ampere_cspmu_ctx *)(cspmu->impl.ctx))
>
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
> index 81e8b97e9353..769466d55bea 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> @@ -40,51 +40,6 @@
> 	ARM_CSPMU_EXT_ATTR(_name, arm_cspmu_cpumask_show,	\
> 				(unsigned long)_config)
>
> -/*
> - * CoreSight PMU Arch register offsets.
> - */
> -#define PMEVCNTR_LO					0x0
> -#define PMEVCNTR_HI					0x4
> -#define PMEVTYPER					0x400
> -#define PMCCFILTR					0x47C
> -#define PMEVFILTR					0xA00
> -#define PMCNTENSET					0xC00
> -#define PMCNTENCLR					0xC20
> -#define PMINTENSET					0xC40
> -#define PMINTENCLR					0xC60
> -#define PMOVSCLR					0xC80
> -#define PMOVSSET					0xCC0
> -#define PMCFGR						0xE00
> -#define PMCR						0xE04
> -#define PMIIDR						0xE08
> -
> -/* PMCFGR register field */
> -#define PMCFGR_NCG					GENMASK(31, 28)
> -#define PMCFGR_HDBG					BIT(24)
> -#define PMCFGR_TRO					BIT(23)
> -#define PMCFGR_SS					BIT(22)
> -#define PMCFGR_FZO					BIT(21)
> -#define PMCFGR_MSI					BIT(20)
> -#define PMCFGR_UEN					BIT(19)
> -#define PMCFGR_NA					BIT(17)
> -#define PMCFGR_EX					BIT(16)
> -#define PMCFGR_CCD					BIT(15)
> -#define PMCFGR_CC					BIT(14)
> -#define PMCFGR_SIZE					GENMASK(13, 8)
> -#define PMCFGR_N					GENMASK(7, 0)
> -
> -/* PMCR register field */
> -#define PMCR_TRO					BIT(11)
> -#define PMCR_HDBG					BIT(10)
> -#define PMCR_FZO					BIT(9)
> -#define PMCR_NA						BIT(8)
> -#define PMCR_DP						BIT(5)
> -#define PMCR_X						BIT(4)
> -#define PMCR_D						BIT(3)
> -#define PMCR_C						BIT(2)
> -#define PMCR_P						BIT(1)
> -#define PMCR_E						BIT(0)
> -
> /* Each SET/CLR register supports up to 32 counters. */
> #define ARM_CSPMU_SET_CLR_COUNTER_SHIFT		5
> #define ARM_CSPMU_SET_CLR_COUNTER_NUM		\
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
> index 2621f3111148..576249e0deea 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.h
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.h
> @@ -65,6 +65,52 @@
> /* The cycle counter, if implemented, is located at counter[31]. */
> #define ARM_CSPMU_CYCLE_CNTR_IDX	31
>
> +/*
> + * CoreSight PMU Arch register offsets.
> + */
> +#define PMEVCNTR_LO			0x0
> +#define PMEVCNTR_HI			0x4
> +#define PMEVTYPER			0x400
> +#define PMCCFILTR			0x47C
> +#define PMEVFILTR			0xA00
> +#define PMCNTENSET			0xC00
> +#define PMCNTENCLR			0xC20
> +#define PMINTENSET			0xC40
> +#define PMINTENCLR			0xC60
> +#define PMOVSCLR			0xC80
> +#define PMOVSSET			0xCC0
> +#define PMIMPDEF			0xD80
> +#define PMCFGR				0xE00
> +#define PMCR				0xE04
> +#define PMIIDR				0xE08
> +
> +/* PMCFGR register field */
> +#define PMCFGR_NCG			GENMASK(31, 28)
> +#define PMCFGR_HDBG			BIT(24)
> +#define PMCFGR_TRO			BIT(23)
> +#define PMCFGR_SS			BIT(22)
> +#define PMCFGR_FZO			BIT(21)
> +#define PMCFGR_MSI			BIT(20)
> +#define PMCFGR_UEN			BIT(19)
> +#define PMCFGR_NA			BIT(17)
> +#define PMCFGR_EX			BIT(16)
> +#define PMCFGR_CCD			BIT(15)
> +#define PMCFGR_CC			BIT(14)
> +#define PMCFGR_SIZE			GENMASK(13, 8)
> +#define PMCFGR_N			GENMASK(7, 0)
> +
> +/* PMCR register field */
> +#define PMCR_TRO			BIT(11)
> +#define PMCR_HDBG			BIT(10)
> +#define PMCR_FZO			BIT(9)
> +#define PMCR_NA				BIT(8)
> +#define PMCR_DP				BIT(5)
> +#define PMCR_X				BIT(4)
> +#define PMCR_D				BIT(3)
> +#define PMCR_C				BIT(2)
> +#define PMCR_P				BIT(1)
> +#define PMCR_E				BIT(0)
> +
> /* PMIIDR register field */
> #define ARM_CSPMU_PMIIDR_IMPLEMENTER	GENMASK(11, 0)
> #define ARM_CSPMU_PMIIDR_PRODUCTID	GENMASK(31, 20)
> -- 
> 2.39.2.101.g768bb238c484.dirty
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] perf/arm_cspmu: Generalise event filtering
  2025-03-05 16:10 ` [PATCH 2/3] perf/arm_cspmu: Generalise event filtering Robin Murphy
  2025-03-06  9:45   ` James Clark
@ 2025-03-06 23:29   ` Ilkka Koskinen
  1 sibling, 0 replies; 12+ messages in thread
From: Ilkka Koskinen @ 2025-03-06 23:29 UTC (permalink / raw)
  To: Robin Murphy
  Cc: will, mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users



On Wed, 5 Mar 2025, Robin Murphy wrote:

> The notion of a single u32 filter value for any event doesn't scale well
> when the potential architectural scope is already two 64-bit values, and
> implementations may add custom stuff on the side too. Rather than try to
> thread arbitrary filter data through the common path, let's just make
> the set_ev_filter op self-contained in terms of parsing and configuring
> any and all filtering for the given event - splitting out a distinct op
> for cycles events which inherently differ - and let implementations
> override the whole thing if they want to do something different. This
> already allows the Ampere code to stop looking a bit hacky.
>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>

> ---
> drivers/perf/arm_cspmu/ampere_cspmu.c | 24 ++++++----------------
> drivers/perf/arm_cspmu/arm_cspmu.c    | 29 +++++++++++----------------
> drivers/perf/arm_cspmu/arm_cspmu.h    |  8 ++++----
> drivers/perf/arm_cspmu/nvidia_cspmu.c | 21 ++++++++++++++++++-
> 4 files changed, 42 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/perf/arm_cspmu/ampere_cspmu.c b/drivers/perf/arm_cspmu/ampere_cspmu.c
> index 31cc1a4ac9df..b8ca69fd9d1d 100644
> --- a/drivers/perf/arm_cspmu/ampere_cspmu.c
> +++ b/drivers/perf/arm_cspmu/ampere_cspmu.c
> @@ -132,32 +132,20 @@ ampere_cspmu_get_name(const struct arm_cspmu *cspmu)
> 	return ctx->name;
> }
>
> -static u32 ampere_cspmu_event_filter(const struct perf_event *event)
> +static void ampere_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
> +				       const struct perf_event *event)
> {
> 	/*
> -	 * PMEVFILTR or PMCCFILTR aren't used in Ampere SoC PMU but are marked
> -	 * as RES0. Make sure, PMCCFILTR is written zero.
> +	 * PMCCFILTR is RES0, so this is just a dummy callback to override
> +	 * the default implementation and avoid writing to it.
> 	 */
> -	return 0;
> }
>
> static void ampere_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
> -				       struct hw_perf_event *hwc,
> -				       u32 filter)
> +				       const struct perf_event *event)
> {
> -	struct perf_event *event;
> -	unsigned int idx;
> 	u32 threshold, rank, bank;
>
> -	/*
> -	 * At this point, all the events have the same filter settings.
> -	 * Therefore, take the first event and use its configuration.
> -	 */
> -	idx = find_first_bit(cspmu->hw_events.used_ctrs,
> -			     cspmu->cycle_counter_logical_idx);
> -
> -	event = cspmu->hw_events.events[idx];
> -
> 	threshold	= get_threshold(event);
> 	rank		= get_rank(event);
> 	bank		= get_bank(event);
> @@ -233,7 +221,7 @@ static int ampere_cspmu_init_ops(struct arm_cspmu *cspmu)
>
> 	cspmu->impl.ctx = ctx;
>
> -	impl_ops->event_filter		= ampere_cspmu_event_filter;
> +	impl_ops->set_cc_filter		= ampere_cspmu_set_cc_filter;
> 	impl_ops->set_ev_filter		= ampere_cspmu_set_ev_filter;
> 	impl_ops->validate_event	= ampere_cspmu_validate_event;
> 	impl_ops->get_name		= ampere_cspmu_get_name;
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
> index 769466d55bea..053bb7920df6 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> @@ -66,7 +66,9 @@ static unsigned long arm_cspmu_cpuhp_state;
> static DEFINE_MUTEX(arm_cspmu_lock);
>
> static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
> -				    struct hw_perf_event *hwc, u32 filter);
> +				    const struct perf_event *event);
> +static void arm_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
> +				    const struct perf_event *event);
>
> static struct acpi_apmt_node *arm_cspmu_apmt_node(struct device *dev)
> {
> @@ -205,11 +207,6 @@ static bool arm_cspmu_is_cycle_counter_event(const struct perf_event *event)
> 	return (event->attr.config == ARM_CSPMU_EVT_CYCLES_DEFAULT);
> }
>
> -static u32 arm_cspmu_event_filter(const struct perf_event *event)
> -{
> -	return event->attr.config1 & ARM_CSPMU_FILTER_MASK;
> -}
> -
> static ssize_t arm_cspmu_identifier_show(struct device *dev,
> 					 struct device_attribute *attr,
> 					 char *page)
> @@ -371,7 +368,7 @@ static int arm_cspmu_init_impl_ops(struct arm_cspmu *cspmu)
> 		DEFAULT_IMPL_OP(get_name),
> 		DEFAULT_IMPL_OP(is_cycle_counter_event),
> 		DEFAULT_IMPL_OP(event_type),
> -		DEFAULT_IMPL_OP(event_filter),
> +		DEFAULT_IMPL_OP(set_cc_filter),
> 		DEFAULT_IMPL_OP(set_ev_filter),
> 		DEFAULT_IMPL_OP(event_attr_is_visible),
> 	};
> @@ -767,26 +764,26 @@ static inline void arm_cspmu_set_event(struct arm_cspmu *cspmu,
> }
>
> static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
> -					struct hw_perf_event *hwc,
> -					u32 filter)
> +				    const struct perf_event *event)
> {
> +	u32 filter = event->attr.config1 & ARM_CSPMU_FILTER_MASK;
> 	u32 offset = PMEVFILTR + (4 * hwc->idx);
>
> 	writel(filter, cspmu->base0 + offset);
> }
>
> -static inline void arm_cspmu_set_cc_filter(struct arm_cspmu *cspmu, u32 filter)
> +static void arm_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
> +				    const struct perf_event *event)
> {
> -	u32 offset = PMCCFILTR;
> +	u32 filter = event->attr.config1 & ARM_CSPMU_FILTER_MASK;
>
> -	writel(filter, cspmu->base0 + offset);
> +	writel(filter, cspmu->base0 + PMCCFILTR);
> }
>
> static void arm_cspmu_start(struct perf_event *event, int pmu_flags)
> {
> 	struct arm_cspmu *cspmu = to_arm_cspmu(event->pmu);
> 	struct hw_perf_event *hwc = &event->hw;
> -	u32 filter;
>
> 	/* We always reprogram the counter */
> 	if (pmu_flags & PERF_EF_RELOAD)
> @@ -794,13 +791,11 @@ static void arm_cspmu_start(struct perf_event *event, int pmu_flags)
>
> 	arm_cspmu_set_event_period(event);
>
> -	filter = cspmu->impl.ops.event_filter(event);
> -
> 	if (event->hw.extra_reg.idx == cspmu->cycle_counter_logical_idx) {
> -		arm_cspmu_set_cc_filter(cspmu, filter);
> +		cspmu->impl.ops.set_cc_filter(cspmu, event);
> 	} else {
> 		arm_cspmu_set_event(cspmu, hwc);
> -		cspmu->impl.ops.set_ev_filter(cspmu, hwc, filter);
> +		cspmu->impl.ops.set_ev_filter(cspmu, event);
> 	}
>
> 	hwc->state = 0;
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
> index 576249e0deea..d59040d6a7e3 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.h
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.h
> @@ -149,11 +149,11 @@ struct arm_cspmu_impl_ops {
> 	bool (*is_cycle_counter_event)(const struct perf_event *event);
> 	/* Decode event type/id from configs */
> 	u32 (*event_type)(const struct perf_event *event);
> -	/* Decode filter value from configs */
> -	u32 (*event_filter)(const struct perf_event *event);
> -	/* Set event filter */
> +	/* Set event filters */
> +	void (*set_cc_filter)(struct arm_cspmu *cspmu,
> +			      const struct perf_event *event);
> 	void (*set_ev_filter)(struct arm_cspmu *cspmu,
> -			      struct hw_perf_event *hwc, u32 filter);
> +			      const struct perf_event *event);
> 	/* Implementation specific event validation */
> 	int (*validate_event)(struct arm_cspmu *cspmu,
> 			      struct perf_event *event);
> diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c b/drivers/perf/arm_cspmu/nvidia_cspmu.c
> index 8116c7846a46..9e817f120828 100644
> --- a/drivers/perf/arm_cspmu/nvidia_cspmu.c
> +++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c
> @@ -183,6 +183,24 @@ static u32 nv_cspmu_event_filter(const struct perf_event *event)
> 	return filter_val;
> }
>
> +static void nv_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
> +				   const struct perf_event *event)
> +{
> +	u32 filter = nv_cspmu_event_filter(event);
> +	u32 offset = PMEVFILTR + (4 * event->hw.idx);
> +
> +	writel(filter, cspmu->base0 + offset);
> +}
> +
> +static void nv_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
> +				   const struct perf_event *event)
> +{
> +	u32 filter = nv_cspmu_event_filter(event);
> +
> +	writel(filter, cspmu->base0 + PMCCFILTR);
> +}
> +
> +
> enum nv_cspmu_name_fmt {
> 	NAME_FMT_GENERIC,
> 	NAME_FMT_SOCKET
> @@ -322,7 +340,8 @@ static int nv_cspmu_init_ops(struct arm_cspmu *cspmu)
> 	cspmu->impl.ctx = ctx;
>
> 	/* NVIDIA specific callbacks. */
> -	impl_ops->event_filter			= nv_cspmu_event_filter;
> +	impl_ops->set_cc_filter			= nv_cspmu_set_cc_filter;
> +	impl_ops->set_ev_filter			= nv_cspmu_set_ev_filter;
> 	impl_ops->get_event_attrs		= nv_cspmu_get_event_attrs;
> 	impl_ops->get_format_attrs		= nv_cspmu_get_format_attrs;
> 	impl_ops->get_name			= nv_cspmu_get_name;
> -- 
> 2.39.2.101.g768bb238c484.dirty
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] perf/arm_cspmu: Add PMEVFILT2R support
  2025-03-05 16:10 ` [PATCH 3/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
  2025-03-06  9:32   ` James Clark
@ 2025-03-06 23:30   ` Ilkka Koskinen
  1 sibling, 0 replies; 12+ messages in thread
From: Ilkka Koskinen @ 2025-03-06 23:30 UTC (permalink / raw)
  To: Robin Murphy
  Cc: will, mark.rutland, bwicaksono, ilkka, linux-arm-kernel,
	linux-perf-users



On Wed, 5 Mar 2025, Robin Murphy wrote:

> Architecturally we have two filters for each regular event counter,
> so add generic support for the second one too.
>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>

> ---
> drivers/perf/arm_cspmu/arm_cspmu.c | 7 +++++--
> drivers/perf/arm_cspmu/arm_cspmu.h | 3 +++
> 2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
> index 053bb7920df6..efa9b229e701 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> @@ -183,6 +183,7 @@ arm_cspmu_event_attr_is_visible(struct kobject *kobj,
> static struct attribute *arm_cspmu_format_attrs[] = {
> 	ARM_CSPMU_FORMAT_EVENT_ATTR,
> 	ARM_CSPMU_FORMAT_FILTER_ATTR,
> +	ARM_CSPMU_FORMAT_FILTER2_ATTR,
> 	NULL,
> };
>
> @@ -767,9 +768,11 @@ static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
> 				    const struct perf_event *event)
> {
> 	u32 filter = event->attr.config1 & ARM_CSPMU_FILTER_MASK;
> -	u32 offset = PMEVFILTR + (4 * hwc->idx);
> +	u32 filter2 = event->attr.config2 & ARM_CSPMU_FILTER_MASK;
> +	u32 offset = 4 * event->hw.idx;
>
> -	writel(filter, cspmu->base0 + offset);
> +	writel(filter, cspmu->base0 + PMEVFILTR + offset);
> +	writel(filter2, cspmu->base0 + PMEVFILT2R + offset);
> }
>
> static void arm_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
> index d59040d6a7e3..19684b76bd96 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.h
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.h
> @@ -47,6 +47,8 @@
> /* Default filter format */
> #define ARM_CSPMU_FORMAT_FILTER_ATTR	\
> 	ARM_CSPMU_FORMAT_ATTR(filter, "config1:0-31")
> +#define ARM_CSPMU_FORMAT_FILTER2_ATTR	\
> +	ARM_CSPMU_FORMAT_ATTR(filter2, "config2:0-31")
>
> /*
>  * This is the default event number for cycle count, if supported, since the
> @@ -72,6 +74,7 @@
> #define PMEVCNTR_HI			0x4
> #define PMEVTYPER			0x400
> #define PMCCFILTR			0x47C
> +#define PMEVFILT2R			0x800
> #define PMEVFILTR			0xA00
> #define PMCNTENSET			0xC00
> #define PMCNTENCLR			0xC20
> -- 
> 2.39.2.101.g768bb238c484.dirty
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] perf/arm_cspmu: Add PMEVFILT2R support
  2025-03-05 16:10 [PATCH 0/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
                   ` (3 preceding siblings ...)
  2025-03-06 23:26 ` [PATCH 0/3] " Ilkka Koskinen
@ 2025-03-13 22:22 ` Will Deacon
  4 siblings, 0 replies; 12+ messages in thread
From: Will Deacon @ 2025-03-13 22:22 UTC (permalink / raw)
  To: Robin Murphy
  Cc: catalin.marinas, kernel-team, Will Deacon, mark.rutland,
	bwicaksono, ilkka, linux-arm-kernel, linux-perf-users

On Wed, 05 Mar 2025 16:10:05 +0000, Robin Murphy wrote:
> This is part of something I'm working towards, but I figure it stands
> up on its own enough to be worth getting out early. Compile-tested
> only, but it's straightforward enough that I'm confident.
> 
> Cheers,
> Robin.
> 
> [...]

Applied to will (for-next/perf), thanks!

[1/3] perf/arm_cspmu: Move register definitons to header
      https://git.kernel.org/will/c/862f7ad4d7fd
[2/3] perf/arm_cspmu: Generalise event filtering
      https://git.kernel.org/will/c/6de0298a3925
[3/3] perf/arm_cspmu: Add PMEVFILT2R support
      https://git.kernel.org/will/c/a28f3cbfd11f

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-03-13 22:23 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-05 16:10 [PATCH 0/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
2025-03-05 16:10 ` [PATCH 1/3] perf/arm_cspmu: Move register definitons to header Robin Murphy
2025-03-06  9:45   ` James Clark
2025-03-06 23:28   ` Ilkka Koskinen
2025-03-05 16:10 ` [PATCH 2/3] perf/arm_cspmu: Generalise event filtering Robin Murphy
2025-03-06  9:45   ` James Clark
2025-03-06 23:29   ` Ilkka Koskinen
2025-03-05 16:10 ` [PATCH 3/3] perf/arm_cspmu: Add PMEVFILT2R support Robin Murphy
2025-03-06  9:32   ` James Clark
2025-03-06 23:30   ` Ilkka Koskinen
2025-03-06 23:26 ` [PATCH 0/3] " Ilkka Koskinen
2025-03-13 22:22 ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).