* [PATCH 0/3] perf/arm_dsu: Support newer DSUs
@ 2025-12-15 13:04 Robin Murphy
  2025-12-15 13:04 ` [PATCH 1/3] perf/arm_dsu: Support DSU-110 Robin Murphy
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Robin Murphy @ 2025-12-15 13:04 UTC
  To: will; +Cc: mark.rutland, suzuki.poulose, linux-arm-kernel, linux-perf-users

Hi all,

DSU-110 and 120 have been out for a while, and although they continue
to share the same basic system register interface, they have subtle
functional changes that require a little attention to support properly.
And while DSUs still aren't widely exposed in current systems, at least
some folks do apparently care.

Sanity-checked on DSU-120 in FPGA, and original DSU on Rockchip RK3566
with TF-A hacked to set ACTLR_EL3.CLUSTERPMUEN.

Cheers,
Robin.


Robin Murphy (3):
  perf/arm_dsu: Support DSU-110
  perf/arm_dsu: Support DSU-120
  perf/arm_dsu: Allow standard cycles events

 drivers/perf/arm_dsu_pmu.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

-- 
2.39.2.101.g768bb238c484.dirty




* [PATCH 1/3] perf/arm_dsu: Support DSU-110
  2025-12-15 13:04 [PATCH 0/3] perf/arm_dsu: Support newer DSUs Robin Murphy
@ 2025-12-15 13:04 ` Robin Murphy
  2025-12-15 13:04 ` [PATCH 2/3] perf/arm_dsu: Support DSU-120 Robin Murphy
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Robin Murphy @ 2025-12-15 13:04 UTC
  To: will; +Cc: mark.rutland, suzuki.poulose, linux-arm-kernel, linux-perf-users

DSU-110 sneakily made all the event counters 64-bit, perhaps related
to no longer having AArch32 EL1 to worry about. While the DSU version
itself is not easily discoverable, the size of a counter certainly is.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
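Illustration only, not part of the patch: a minimal user-space sketch of
the width probe and the masked delta described above. The simulated
register (struct sim_counter) and all helper names here are hypothetical
stand-ins, not driver code, and the model assumes a 32-bit counter simply
drops the upper half of any write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a hardware event counter of a given width. */
struct sim_counter {
	unsigned int width;	/* 32 or 64 */
	uint64_t value;
};

/* A 32-bit counter keeps only the low half of whatever is written. */
static void sim_write(struct sim_counter *c, uint64_t v)
{
	assert(c->width == 32 || c->width == 64);
	c->value = (c->width == 64) ? v : (v & UINT32_MAX);
}

static uint64_t sim_read(const struct sim_counter *c)
{
	return c->value;
}

/* Write all-ones and check whether the upper half sticks. */
static bool probe_is_32bit(struct sim_counter *c)
{
	sim_write(c, UINT64_MAX);
	return sim_read(c) != UINT64_MAX;
}

static uint64_t counter_mask(bool is_32bit)
{
	return is_32bit ? UINT32_MAX : UINT64_MAX;
}

/* Delta between two raw reads, tolerating a single wraparound. */
static uint64_t counter_delta(uint64_t prev, uint64_t now, bool is_32bit)
{
	return (now - prev) & counter_mask(is_32bit);
}
```

With a 32-bit counter, counter_delta(0xFFFFFFF0, 0x10, true) evaluates to
0x20, since the mask discards the borrow out of the low word.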
 drivers/perf/arm_dsu_pmu.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index cb4fb59fe04b..8663721ee018 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -66,13 +66,6 @@
  */
 #define DSU_PMU_IDX_CYCLE_COUNTER	31
 
-/* All event counters are 32bit, with a 64bit Cycle counter */
-#define DSU_PMU_COUNTER_WIDTH(idx)	\
-	(((idx) == DSU_PMU_IDX_CYCLE_COUNTER) ? 64 : 32)
-
-#define DSU_PMU_COUNTER_MASK(idx)	\
-	GENMASK_ULL((DSU_PMU_COUNTER_WIDTH((idx)) - 1), 0)
-
 #define DSU_EXT_ATTR(_name, _func, _config)		\
 	(&((struct dev_ext_attribute[]) {				\
 		{							\
@@ -107,6 +100,7 @@ struct dsu_hw_events {
  * @num_counters	: Number of event counters implemented by the PMU,
  *			  excluding the cycle counter.
  * @irq			: Interrupt line for counter overflow.
+ * @has_32b_pmevcntr	: Are the non-cycle counters only 32-bit?
  * @cpmceid_bitmap	: Bitmap for the availability of architected common
  *			  events (event_code < 0x40).
  */
@@ -120,6 +114,7 @@ struct dsu_pmu {
 	struct hlist_node		cpuhp_node;
 	s8				num_counters;
 	int				irq;
+	bool				has_32b_pmevcntr;
 	DECLARE_BITMAP(cpmceid_bitmap, DSU_PMU_MAX_COMMON_EVENTS);
 };
 
@@ -328,6 +323,11 @@ static inline void dsu_pmu_set_event(struct dsu_pmu *dsu_pmu,
 	raw_spin_unlock_irqrestore(&dsu_pmu->pmu_lock, flags);
 }
 
+static u64 dsu_pmu_counter_mask(struct hw_perf_event *hw)
+{
+	return (hw->flags && hw->idx != DSU_PMU_IDX_CYCLE_COUNTER) ? U32_MAX : U64_MAX;
+}
+
 static void dsu_pmu_event_update(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
@@ -339,7 +339,7 @@ static void dsu_pmu_event_update(struct perf_event *event)
 		new_count = dsu_pmu_read_counter(event);
 	} while (local64_cmpxchg(&hwc->prev_count, prev_count, new_count) !=
 			prev_count);
-	delta = (new_count - prev_count) & DSU_PMU_COUNTER_MASK(hwc->idx);
+	delta = (new_count - prev_count) & dsu_pmu_counter_mask(hwc);
 	local64_add(delta, &event->count);
 }
 
@@ -362,8 +362,7 @@ static inline u32 dsu_pmu_get_reset_overflow(void)
  */
 static void dsu_pmu_set_event_period(struct perf_event *event)
 {
-	int idx = event->hw.idx;
-	u64 val = DSU_PMU_COUNTER_MASK(idx) >> 1;
+	u64 val = dsu_pmu_counter_mask(&event->hw) >> 1;
 
 	local64_set(&event->hw.prev_count, val);
 	dsu_pmu_write_counter(event, val);
@@ -564,6 +563,7 @@ static int dsu_pmu_event_init(struct perf_event *event)
 		return -EINVAL;
 
 	event->hw.config_base = event->attr.config;
+	event->hw.flags = dsu_pmu->has_32b_pmevcntr;
 	return 0;
 }
 
@@ -664,6 +664,10 @@ static void dsu_pmu_probe_pmu(struct dsu_pmu *dsu_pmu)
 	cpmceid[1] = __dsu_pmu_read_pmceid(1);
 	bitmap_from_arr32(dsu_pmu->cpmceid_bitmap, cpmceid,
 			  DSU_PMU_MAX_COMMON_EVENTS);
+	/* Newer DSUs have 64-bit counters */
+	__dsu_pmu_write_counter(0, U64_MAX);
+	if (__dsu_pmu_read_counter(0) != U64_MAX)
+		dsu_pmu->has_32b_pmevcntr = true;
 }
 
 static void dsu_pmu_set_active_cpu(int cpu, struct dsu_pmu *dsu_pmu)
-- 
2.39.2.101.g768bb238c484.dirty




* [PATCH 2/3] perf/arm_dsu: Support DSU-120
  2025-12-15 13:04 [PATCH 0/3] perf/arm_dsu: Support newer DSUs Robin Murphy
  2025-12-15 13:04 ` [PATCH 1/3] perf/arm_dsu: Support DSU-110 Robin Murphy
@ 2025-12-15 13:04 ` Robin Murphy
  2025-12-15 13:05 ` [PATCH 3/3] perf/arm_dsu: Allow standard cycles events Robin Murphy
  2026-01-06 23:13 ` [PATCH 0/3] perf/arm_dsu: Support newer DSUs Will Deacon
  3 siblings, 0 replies; 5+ messages in thread
From: Robin Murphy @ 2025-12-15 13:04 UTC
  To: will; +Cc: mark.rutland, suzuki.poulose, linux-arm-kernel, linux-perf-users

DSU-120 has the same system register interface as previous DSUs, but
no longer offers a dedicated cycle counter. While this is not directly
discoverable via PMCR, the PMCCNTR register is still defined to exist
with RAZ/WI behaviour, allowing for a straightforward heuristic.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
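Illustration only, not part of the patch: a small user-space model of the
RAZ/WI heuristic described above. struct sim_pmccntr and the helper names
are hypothetical; the sketch also assumes the counter does not advance
between the write and the read-back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of PMCCNTR: a real 64-bit register, or RAZ/WI. */
struct sim_pmccntr {
	bool implemented;
	uint64_t value;
};

static void sim_pmccntr_write(struct sim_pmccntr *r, uint64_t v)
{
	if (r->implemented)	/* writes are ignored otherwise */
		r->value = v;
}

static uint64_t sim_pmccntr_read(const struct sim_pmccntr *r)
{
	return r->implemented ? r->value : 0;	/* reads as zero otherwise */
}

/* If the written value reads back, a dedicated cycle counter exists. */
static bool probe_has_pmccntr(struct sim_pmccntr *r)
{
	assert(r != NULL);
	sim_pmccntr_write(r, UINT64_MAX);
	return sim_pmccntr_read(r) == UINT64_MAX;
}
```

A register modelled with implemented = false never holds the written
value, so the probe reports no dedicated cycle counter.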
 drivers/perf/arm_dsu_pmu.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index 8663721ee018..56c592f0dae3 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -101,6 +101,7 @@ struct dsu_hw_events {
  *			  excluding the cycle counter.
  * @irq			: Interrupt line for counter overflow.
  * @has_32b_pmevcntr	: Are the non-cycle counters only 32-bit?
+ * @has_pmccntr		: Do we even have a dedicated cycle counter?
  * @cpmceid_bitmap	: Bitmap for the availability of architected common
  *			  events (event_code < 0x40).
  */
@@ -115,6 +116,7 @@ struct dsu_pmu {
 	s8				num_counters;
 	int				irq;
 	bool				has_32b_pmevcntr;
+	bool				has_pmccntr;
 	DECLARE_BITMAP(cpmceid_bitmap, DSU_PMU_MAX_COMMON_EVENTS);
 };
 
@@ -281,7 +283,7 @@ static int dsu_pmu_get_event_idx(struct dsu_hw_events *hw_events,
 	struct dsu_pmu *dsu_pmu = to_dsu_pmu(event->pmu);
 	unsigned long *used_mask = hw_events->used_mask;
 
-	if (evtype == DSU_PMU_EVT_CYCLES) {
+	if (evtype == DSU_PMU_EVT_CYCLES && dsu_pmu->has_pmccntr) {
 		if (test_and_set_bit(DSU_PMU_IDX_CYCLE_COUNTER, used_mask))
 			return -EAGAIN;
 		return DSU_PMU_IDX_CYCLE_COUNTER;
@@ -668,6 +670,10 @@ static void dsu_pmu_probe_pmu(struct dsu_pmu *dsu_pmu)
 	__dsu_pmu_write_counter(0, U64_MAX);
 	if (__dsu_pmu_read_counter(0) != U64_MAX)
 		dsu_pmu->has_32b_pmevcntr = true;
+	/* On even newer DSUs, PMCCNTR is RAZ/WI */
+	__dsu_pmu_write_pmccntr(U64_MAX);
+	if (__dsu_pmu_read_pmccntr() == U64_MAX)
+		dsu_pmu->has_pmccntr = true;
 }
 
 static void dsu_pmu_set_active_cpu(int cpu, struct dsu_pmu *dsu_pmu)
-- 
2.39.2.101.g768bb238c484.dirty




* [PATCH 3/3] perf/arm_dsu: Allow standard cycles events
  2025-12-15 13:04 [PATCH 0/3] perf/arm_dsu: Support newer DSUs Robin Murphy
  2025-12-15 13:04 ` [PATCH 1/3] perf/arm_dsu: Support DSU-110 Robin Murphy
  2025-12-15 13:04 ` [PATCH 2/3] perf/arm_dsu: Support DSU-120 Robin Murphy
@ 2025-12-15 13:05 ` Robin Murphy
  2026-01-06 23:13 ` [PATCH 0/3] perf/arm_dsu: Support newer DSUs Will Deacon
  3 siblings, 0 replies; 5+ messages in thread
From: Robin Murphy @ 2025-12-15 13:05 UTC
  To: will; +Cc: mark.rutland, suzuki.poulose, linux-arm-kernel, linux-perf-users

Since we do not use the divide-by-64 option, there should be no
significant difference between the dedicated cycle counter and the
standard cycles event. Since using the latter on DSU-120 now has
the side-effect of allowing multiple cycles events to be scheduled
simultaneously (beneficial for multiple cycle-based metrics), there
seems little reason not to allow the same on older DSUs as well.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
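Illustration only, not part of the patch: a user-space sketch of the
allocation order this change aims for, where a cycles event prefers the
dedicated counter and otherwise falls back to a generic event counter.
alloc_counter(), its parameters and the plain uint32_t bitmask are
hypothetical simplifications of the driver's used_mask handling.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define CYCLE_IDX	31	/* index reserved for the dedicated cycle counter */

/*
 * Pick a counter index for an event, or return -EAGAIN if none is free.
 * A cycles event takes the dedicated counter when it exists and is free;
 * otherwise it competes for the generic counters like any other event.
 */
static int alloc_counter(uint32_t *used, int num_counters,
			 bool is_cycles, bool has_pmccntr)
{
	assert(num_counters >= 0 && num_counters <= CYCLE_IDX);

	if (is_cycles && has_pmccntr &&
	    !(*used & (UINT32_C(1) << CYCLE_IDX))) {
		*used |= UINT32_C(1) << CYCLE_IDX;
		return CYCLE_IDX;
	}

	for (int idx = 0; idx < num_counters; idx++) {
		if (!(*used & (UINT32_C(1) << idx))) {
			*used |= UINT32_C(1) << idx;
			return idx;
		}
	}
	return -EAGAIN;
}
```

With two generic counters and a dedicated cycle counter, a first cycles
event lands on index 31 and a second one on index 0, rather than the
second failing outright.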
 drivers/perf/arm_dsu_pmu.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index 56c592f0dae3..32b0dd7c693b 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -284,9 +284,8 @@ static int dsu_pmu_get_event_idx(struct dsu_hw_events *hw_events,
 	unsigned long *used_mask = hw_events->used_mask;
 
 	if (evtype == DSU_PMU_EVT_CYCLES && dsu_pmu->has_pmccntr) {
-		if (test_and_set_bit(DSU_PMU_IDX_CYCLE_COUNTER, used_mask))
-			return -EAGAIN;
-		return DSU_PMU_IDX_CYCLE_COUNTER;
+		if (!test_and_set_bit(DSU_PMU_IDX_CYCLE_COUNTER, used_mask))
+			return DSU_PMU_IDX_CYCLE_COUNTER;
 	}
 
 	idx = find_first_zero_bit(used_mask, dsu_pmu->num_counters);
-- 
2.39.2.101.g768bb238c484.dirty




* Re: [PATCH 0/3] perf/arm_dsu: Support newer DSUs
  2025-12-15 13:04 [PATCH 0/3] perf/arm_dsu: Support newer DSUs Robin Murphy
                   ` (2 preceding siblings ...)
  2025-12-15 13:05 ` [PATCH 3/3] perf/arm_dsu: Allow standard cycles events Robin Murphy
@ 2026-01-06 23:13 ` Will Deacon
  3 siblings, 0 replies; 5+ messages in thread
From: Will Deacon @ 2026-01-06 23:13 UTC
  To: Robin Murphy
  Cc: catalin.marinas, kernel-team, Will Deacon, mark.rutland,
	suzuki.poulose, linux-arm-kernel, linux-perf-users

On Mon, 15 Dec 2025 13:04:57 +0000, Robin Murphy wrote:
> DSU-110 and 120 have been out for a while, and although they continue
> to share the same basic system register interface, they have subtle
> functional changes that require a little attention to support properly.
> And while DSUs still aren't widely exposed in current systems, at least
> some folks do apparently care.
> 
> Sanity-checked on DSU-120 in FPGA, and original DSU on Rockchip RK3566
> with TF-A hacked to set ACTLR_EL3.CLUSTERPMUEN.
> 
> [...]

Applied to will (for-next/perf), thanks!

[1/3] perf/arm_dsu: Support DSU-110
      https://git.kernel.org/will/c/0113affc9101
[2/3] perf/arm_dsu: Support DSU-120
      https://git.kernel.org/will/c/85c0dbd8b6e2
[3/3] perf/arm_dsu: Allow standard cycles events
      https://git.kernel.org/will/c/79448fa1f495

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

