linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] perf: ampere: Add support for Ampere SoC PMUs
@ 2023-06-01  3:01 Ilkka Koskinen
  2023-06-01  3:01 ` [PATCH v2 1/5] perf: arm_cspmu: Support 32-bit accesses to 64-bit registers Ilkka Koskinen
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Ilkka Koskinen @ 2023-06-01  3:01 UTC (permalink / raw)
  To: Jonathan Corbet, Will Deacon, Mark Rutland, Besar Wicaksono,
	Suzuki K Poulose, Robin Murphy
  Cc: linux-doc, linux-kernel, linux-arm-kernel, Ilkka Koskinen

Changes since v1:
    * Rather than creating a completely new driver, implemented as a submodule
      of Arm CoreSight PMU driver
    * Fixed shared filter handling


Ilkka Koskinen (5):
  perf: arm_cspmu: Support 32-bit accesses to 64-bit registers
  perf: arm_cspmu: Support shared interrupts
  perf: arm_cspmu: Support implementation specific filters
  perf: arm_cspmu: Support implementation specific event validation
  perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU

 .../admin-guide/perf/ampere_cspmu.rst         |  30 +++
 drivers/perf/arm_cspmu/Makefile               |   2 +-
 drivers/perf/arm_cspmu/ampere_cspmu.c         | 232 ++++++++++++++++++
 drivers/perf/arm_cspmu/ampere_cspmu.h         |  17 ++
 drivers/perf/arm_cspmu/arm_cspmu.c            |  31 ++-
 drivers/perf/arm_cspmu/arm_cspmu.h            |   7 +
 6 files changed, 312 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/admin-guide/perf/ampere_cspmu.rst
 create mode 100644 drivers/perf/arm_cspmu/ampere_cspmu.c
 create mode 100644 drivers/perf/arm_cspmu/ampere_cspmu.h

-- 
2.40.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/5] perf: arm_cspmu: Support 32-bit accesses to 64-bit registers
  2023-06-01  3:01 [PATCH v2 0/5] perf: ampere: Add support for Ampere SoC PMUs Ilkka Koskinen
@ 2023-06-01  3:01 ` Ilkka Koskinen
  2023-06-01 14:49   ` Robin Murphy
  2023-06-01  3:01 ` [PATCH v2 2/5] perf: arm_cspmu: Support shared interrupts Ilkka Koskinen
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Ilkka Koskinen @ 2023-06-01  3:01 UTC (permalink / raw)
  To: Jonathan Corbet, Will Deacon, Mark Rutland, Besar Wicaksono,
	Suzuki K Poulose, Robin Murphy
  Cc: linux-doc, linux-kernel, linux-arm-kernel, Ilkka Koskinen

Split the 64-bit register accesses if 64-bit access is not supported
by the PMU.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 8 ++++++--
 drivers/perf/arm_cspmu/arm_cspmu.h | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index a3f1c410b417..88547a2b73e6 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -701,8 +701,12 @@ static void arm_cspmu_write_counter(struct perf_event *event, u64 val)
 
 	if (use_64b_counter_reg(cspmu)) {
 		offset = counter_offset(sizeof(u64), event->hw.idx);
-
-		writeq(val, cspmu->base1 + offset);
+		if (!cspmu->impl.split_64bit_access) {
+			writeq(val, cspmu->base1 + offset);
+		} else {
+			writel(lower_32_bits(val), cspmu->base1 + offset);
+			writel(upper_32_bits(val), cspmu->base1 + offset + 4);
+		}
 	} else {
 		offset = counter_offset(sizeof(u32), event->hw.idx);
 
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
index 51323b175a4a..c0412cf2bd97 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.h
+++ b/drivers/perf/arm_cspmu/arm_cspmu.h
@@ -110,6 +110,7 @@ struct arm_cspmu_impl_ops {
 /* Vendor/implementer descriptor. */
 struct arm_cspmu_impl {
 	u32 pmiidr;
+	bool split_64bit_access;
 	struct arm_cspmu_impl_ops ops;
 	void *ctx;
 };
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/5] perf: arm_cspmu: Support shared interrupts
  2023-06-01  3:01 [PATCH v2 0/5] perf: ampere: Add support for Ampere SoC PMUs Ilkka Koskinen
  2023-06-01  3:01 ` [PATCH v2 1/5] perf: arm_cspmu: Support 32-bit accesses to 64-bit registers Ilkka Koskinen
@ 2023-06-01  3:01 ` Ilkka Koskinen
  2023-06-01 14:54   ` Robin Murphy
  2023-06-01  3:01 ` [PATCH v2 3/5] perf: arm_cspmu: Support implementation specific filters Ilkka Koskinen
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Ilkka Koskinen @ 2023-06-01  3:01 UTC (permalink / raw)
  To: Jonathan Corbet, Will Deacon, Mark Rutland, Besar Wicaksono,
	Suzuki K Poulose, Robin Murphy
  Cc: linux-doc, linux-kernel, linux-arm-kernel, Ilkka Koskinen

Some of the PMUs may share the interrupt. Support them by
setting IRQF_SHARED

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index 88547a2b73e6..cc5204d1b5fb 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -1067,8 +1067,8 @@ static int arm_cspmu_request_irq(struct arm_cspmu *cspmu)
 		return irq;
 
 	ret = devm_request_irq(dev, irq, arm_cspmu_handle_irq,
-			       IRQF_NOBALANCING | IRQF_NO_THREAD, dev_name(dev),
-			       cspmu);
+			       IRQF_NOBALANCING | IRQF_NO_THREAD | IRQF_SHARED,
+			       dev_name(dev), cspmu);
 	if (ret) {
 		dev_err(dev, "Could not request IRQ %d\n", irq);
 		return ret;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/5] perf: arm_cspmu: Support implementation specific filters
  2023-06-01  3:01 [PATCH v2 0/5] perf: ampere: Add support for Ampere SoC PMUs Ilkka Koskinen
  2023-06-01  3:01 ` [PATCH v2 1/5] perf: arm_cspmu: Support 32-bit accesses to 64-bit registers Ilkka Koskinen
  2023-06-01  3:01 ` [PATCH v2 2/5] perf: arm_cspmu: Support shared interrupts Ilkka Koskinen
@ 2023-06-01  3:01 ` Ilkka Koskinen
  2023-06-01  3:01 ` [PATCH v2 4/5] perf: arm_cspmu: Support implementation specific event validation Ilkka Koskinen
  2023-06-01  3:01 ` [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU Ilkka Koskinen
  4 siblings, 0 replies; 17+ messages in thread
From: Ilkka Koskinen @ 2023-06-01  3:01 UTC (permalink / raw)
  To: Jonathan Corbet, Will Deacon, Mark Rutland, Besar Wicaksono,
	Suzuki K Poulose, Robin Murphy
  Cc: linux-doc, linux-kernel, linux-arm-kernel, Ilkka Koskinen

Generic filters aren't used in all the platforms. Instead, the platforms
may use different means to filter events. Add support for implementation
specific filters.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 8 ++++++--
 drivers/perf/arm_cspmu/arm_cspmu.h | 4 ++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index cc5204d1b5fb..b4c4ef81c719 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -122,6 +122,9 @@
 
 static unsigned long arm_cspmu_cpuhp_state;
 
+static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
+				    struct hw_perf_event *hwc, u32 filter);
+
 /*
  * In CoreSight PMU architecture, all of the MMIO registers are 32-bit except
  * counter register. The counter register can be implemented as 32-bit or 64-bit
@@ -432,6 +435,7 @@ static int arm_cspmu_init_impl_ops(struct arm_cspmu *cspmu)
 	CHECK_DEFAULT_IMPL_OPS(impl_ops, event_type);
 	CHECK_DEFAULT_IMPL_OPS(impl_ops, event_filter);
 	CHECK_DEFAULT_IMPL_OPS(impl_ops, event_attr_is_visible);
+	CHECK_DEFAULT_IMPL_OPS(impl_ops, set_ev_filter);
 
 	return 0;
 }
@@ -799,7 +803,7 @@ static inline void arm_cspmu_set_event(struct arm_cspmu *cspmu,
 	writel(hwc->config, cspmu->base0 + offset);
 }
 
-static inline void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
+static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
 					   struct hw_perf_event *hwc,
 					   u32 filter)
 {
@@ -833,7 +837,7 @@ static void arm_cspmu_start(struct perf_event *event, int pmu_flags)
 		arm_cspmu_set_cc_filter(cspmu, filter);
 	} else {
 		arm_cspmu_set_event(cspmu, hwc);
-		arm_cspmu_set_ev_filter(cspmu, hwc, filter);
+		cspmu->impl.ops.set_ev_filter(cspmu, hwc, filter);
 	}
 
 	hwc->state = 0;
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
index c0412cf2bd97..4a29b921f7e8 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.h
+++ b/drivers/perf/arm_cspmu/arm_cspmu.h
@@ -102,6 +102,10 @@ struct arm_cspmu_impl_ops {
 	u32 (*event_type)(const struct perf_event *event);
 	/* Decode filter value from configs */
 	u32 (*event_filter)(const struct perf_event *event);
+	/* Set event filter */
+	void (*set_ev_filter)(struct arm_cspmu *cspmu,
+			      struct hw_perf_event *hwc,
+			      u32 filter);
 	/* Hide/show unsupported events */
 	umode_t (*event_attr_is_visible)(struct kobject *kobj,
 					 struct attribute *attr, int unused);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 4/5] perf: arm_cspmu: Support implementation specific event validation
  2023-06-01  3:01 [PATCH v2 0/5] perf: ampere: Add support for Ampere SoC PMUs Ilkka Koskinen
                   ` (2 preceding siblings ...)
  2023-06-01  3:01 ` [PATCH v2 3/5] perf: arm_cspmu: Support implementation specific filters Ilkka Koskinen
@ 2023-06-01  3:01 ` Ilkka Koskinen
  2023-06-01 15:09   ` Robin Murphy
  2023-06-01  3:01 ` [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU Ilkka Koskinen
  4 siblings, 1 reply; 17+ messages in thread
From: Ilkka Koskinen @ 2023-06-01  3:01 UTC (permalink / raw)
  To: Jonathan Corbet, Will Deacon, Mark Rutland, Besar Wicaksono,
	Suzuki K Poulose, Robin Murphy
  Cc: linux-doc, linux-kernel, linux-arm-kernel, Ilkka Koskinen

Some platforms may use e.g. different filtering mechanism and, thus,
may need different way to validate the events.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 4 ++++
 drivers/perf/arm_cspmu/arm_cspmu.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index b4c4ef81c719..a26f484e06b1 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -593,6 +593,10 @@ static int arm_cspmu_get_event_idx(struct arm_cspmu_hw_events *hw_events,
 	if (idx >= cspmu->num_logical_ctrs)
 		return -EAGAIN;
 
+	if (cspmu->impl.ops.validate_event &&
+	    !cspmu->impl.ops.validate_event(cspmu, event))
+		return -EAGAIN;
+
 	set_bit(idx, hw_events->used_ctrs);
 
 	return idx;
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
index 4a29b921f7e8..0e5c316c96f9 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.h
+++ b/drivers/perf/arm_cspmu/arm_cspmu.h
@@ -106,6 +106,8 @@ struct arm_cspmu_impl_ops {
 	void (*set_ev_filter)(struct arm_cspmu *cspmu,
 			      struct hw_perf_event *hwc,
 			      u32 filter);
+	/* Implementation specific event validation */
+	bool (*validate_event)(struct arm_cspmu *cspmu, struct perf_event *new);
 	/* Hide/show unsupported events */
 	umode_t (*event_attr_is_visible)(struct kobject *kobj,
 					 struct attribute *attr, int unused);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU
  2023-06-01  3:01 [PATCH v2 0/5] perf: ampere: Add support for Ampere SoC PMUs Ilkka Koskinen
                   ` (3 preceding siblings ...)
  2023-06-01  3:01 ` [PATCH v2 4/5] perf: arm_cspmu: Support implementation specific event validation Ilkka Koskinen
@ 2023-06-01  3:01 ` Ilkka Koskinen
  2023-06-01 15:23   ` Robin Murphy
  2023-06-03  1:16   ` kernel test robot
  4 siblings, 2 replies; 17+ messages in thread
From: Ilkka Koskinen @ 2023-06-01  3:01 UTC (permalink / raw)
  To: Jonathan Corbet, Will Deacon, Mark Rutland, Besar Wicaksono,
	Suzuki K Poulose, Robin Murphy
  Cc: linux-doc, linux-kernel, linux-arm-kernel, Ilkka Koskinen

Ampere SoC PMU follows CoreSight PMU architecture. It uses implementation
specific registers to filter events rather than PMEVFILTnR registers.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
---
 .../admin-guide/perf/ampere_cspmu.rst         |  30 +++
 drivers/perf/arm_cspmu/Makefile               |   2 +-
 drivers/perf/arm_cspmu/ampere_cspmu.c         | 232 ++++++++++++++++++
 drivers/perf/arm_cspmu/ampere_cspmu.h         |  17 ++
 drivers/perf/arm_cspmu/arm_cspmu.c            |   7 +
 5 files changed, 287 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/admin-guide/perf/ampere_cspmu.rst
 create mode 100644 drivers/perf/arm_cspmu/ampere_cspmu.c
 create mode 100644 drivers/perf/arm_cspmu/ampere_cspmu.h

diff --git a/Documentation/admin-guide/perf/ampere_cspmu.rst b/Documentation/admin-guide/perf/ampere_cspmu.rst
new file mode 100644
index 000000000000..8da877f2a8c3
--- /dev/null
+++ b/Documentation/admin-guide/perf/ampere_cspmu.rst
@@ -0,0 +1,30 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================================
+Ampere SoC Performance Monitoring Unit (PMU)
+============================================
+
+Ampere SoC PMU is a generic PMU IP that follows Arm CoreSight PMU architecture.
+Therefore, the driver is implemented as a submodule of arm_cspmu driver. At the
+first phase it's used for counting MCU events on AmpereOne.
+
+
+MCU PMU events
+--------------
+
+The PMU driver supports setting filters for "rank", "bank", and "threshold".
+Note, that the filters are per PMU instance rather than per event. To enable
+filters, one needs to set "filter_enable=1".
+
+
+Example for perf tool use::
+
+  / # perf list ampere
+
+    ampere_mcu_pmu_0/act_sent/                         [Kernel PMU event]
+    <...>
+    ampere_mcu_pmu_1/rd_sent/                          [Kernel PMU event]
+    <...>
+
+  / # perf stat -a -e ampere_mcu_pmu_0/act_sent,filter_enable=3,bank=5,rank=3,threshold=2/,ampere_mcu_pmu_1/rd_sent/ \
+        sleep 1
diff --git a/drivers/perf/arm_cspmu/Makefile b/drivers/perf/arm_cspmu/Makefile
index fedb17df982d..b80a8bd8da54 100644
--- a/drivers/perf/arm_cspmu/Makefile
+++ b/drivers/perf/arm_cspmu/Makefile
@@ -3,4 +3,4 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu_module.o
-arm_cspmu_module-y := arm_cspmu.o nvidia_cspmu.o
+arm_cspmu_module-y := arm_cspmu.o nvidia_cspmu.o ampere_cspmu.o
diff --git a/drivers/perf/arm_cspmu/ampere_cspmu.c b/drivers/perf/arm_cspmu/ampere_cspmu.c
new file mode 100644
index 000000000000..9101e0446a68
--- /dev/null
+++ b/drivers/perf/arm_cspmu/ampere_cspmu.c
@@ -0,0 +1,232 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Ampere SoC PMU (Performance Monitor Unit)
+ *
+ * Copyright (c) 2023, Ampere Computing LLC
+ */
+
+#include "ampere_cspmu.h"
+
+#define PMAUXR0		0xD80
+#define PMAUXR1		0xD84
+#define PMAUXR2		0xD88
+#define PMAUXR3		0xD8C
+
+#define to_ampere_cspmu_ctx(cspmu)	((struct ampere_cspmu_ctx *)(cspmu->impl.ctx))
+
+struct ampere_cspmu_ctx {
+	const char *name;
+	struct attribute **event_attr;
+	struct attribute **format_attr;
+};
+
+#define SOC_PMU_EVENT_ATTR_EXTRACTOR(_name, _config, _start, _end)        \
+	static inline u32 get_##_name(const struct perf_event *event)     \
+	{                                                                 \
+		return FIELD_GET(GENMASK_ULL(_end, _start),               \
+				 event->attr._config);                    \
+	}                                                                 \
+
+SOC_PMU_EVENT_ATTR_EXTRACTOR(event, config, 0, 8);
+SOC_PMU_EVENT_ATTR_EXTRACTOR(threshold, config1, 0, 7);
+SOC_PMU_EVENT_ATTR_EXTRACTOR(rank, config1, 8, 23);
+SOC_PMU_EVENT_ATTR_EXTRACTOR(bank, config2, 0, 31);
+SOC_PMU_EVENT_ATTR_EXTRACTOR(filter_enable, config2, 32, 32);
+
+static struct attribute *ampereone_mcu_pmu_event_attrs[] = {
+	ARM_CSPMU_EVENT_ATTR(cycle_count,		0x00),
+	ARM_CSPMU_EVENT_ATTR(act_sent,			0x01),
+	ARM_CSPMU_EVENT_ATTR(pre_sent,			0x02),
+	ARM_CSPMU_EVENT_ATTR(rd_sent,			0x03),
+	ARM_CSPMU_EVENT_ATTR(rda_sent,			0x04),
+	ARM_CSPMU_EVENT_ATTR(wr_sent,			0x05),
+	ARM_CSPMU_EVENT_ATTR(wra_sent,			0x06),
+	ARM_CSPMU_EVENT_ATTR(pd_entry_vld,		0x07),
+	ARM_CSPMU_EVENT_ATTR(sref_entry_vld,		0x08),
+	ARM_CSPMU_EVENT_ATTR(prea_sent,			0x09),
+	ARM_CSPMU_EVENT_ATTR(pre_sb_sent,		0x0a),
+	ARM_CSPMU_EVENT_ATTR(ref_sent,			0x0b),
+	ARM_CSPMU_EVENT_ATTR(rfm_sent,			0x0c),
+	ARM_CSPMU_EVENT_ATTR(ref_sb_sent,		0x0d),
+	ARM_CSPMU_EVENT_ATTR(rfm_sb_sent,		0x0e),
+	ARM_CSPMU_EVENT_ATTR(rd_rda_sent,		0x0f),
+	ARM_CSPMU_EVENT_ATTR(wr_wra_sent,		0x10),
+	ARM_CSPMU_EVENT_ATTR(raw_hazard,		0x11),
+	ARM_CSPMU_EVENT_ATTR(war_hazard,		0x12),
+	ARM_CSPMU_EVENT_ATTR(waw_hazard,		0x13),
+	ARM_CSPMU_EVENT_ATTR(rar_hazard,		0x14),
+	ARM_CSPMU_EVENT_ATTR(raw_war_waw_hazard,	0x15),
+	ARM_CSPMU_EVENT_ATTR(hprd_lprd_wr_req_vld,	0x16),
+	ARM_CSPMU_EVENT_ATTR(lprd_req_vld,		0x17),
+	ARM_CSPMU_EVENT_ATTR(hprd_req_vld,		0x18),
+	ARM_CSPMU_EVENT_ATTR(hprd_lprd_req_vld,		0x19),
+	ARM_CSPMU_EVENT_ATTR(prefetch_tgt,		0x1a),
+	ARM_CSPMU_EVENT_ATTR(wr_req_vld,		0x1b),
+	ARM_CSPMU_EVENT_ATTR(partial_wr_req_vld,	0x1c),
+	ARM_CSPMU_EVENT_ATTR(rd_retry,			0x1d),
+	ARM_CSPMU_EVENT_ATTR(wr_retry,			0x1e),
+	ARM_CSPMU_EVENT_ATTR(retry_gnt,			0x1f),
+	ARM_CSPMU_EVENT_ATTR(rank_change,		0x20),
+	ARM_CSPMU_EVENT_ATTR(dir_change,		0x21),
+	ARM_CSPMU_EVENT_ATTR(rank_dir_change,		0x22),
+	ARM_CSPMU_EVENT_ATTR(rank_active,		0x23),
+	ARM_CSPMU_EVENT_ATTR(rank_idle,			0x24),
+	ARM_CSPMU_EVENT_ATTR(rank_pd,			0x25),
+	ARM_CSPMU_EVENT_ATTR(rank_sref,			0x26),
+	ARM_CSPMU_EVENT_ATTR(queue_fill_gt_thresh,	0x27),
+	ARM_CSPMU_EVENT_ATTR(queue_rds_gt_thresh,	0x28),
+	ARM_CSPMU_EVENT_ATTR(queue_wrs_gt_thresh,	0x29),
+	ARM_CSPMU_EVENT_ATTR(phy_updt_complt,		0x2a),
+	ARM_CSPMU_EVENT_ATTR(tz_fail,			0x2b),
+	ARM_CSPMU_EVENT_ATTR(dram_errc,			0x2c),
+	ARM_CSPMU_EVENT_ATTR(dram_errd,			0x2d),
+	ARM_CSPMU_EVENT_ATTR(read_data_return,		0x32),
+	ARM_CSPMU_EVENT_ATTR(chi_wr_data_delta,		0x33),
+	ARM_CSPMU_EVENT_ATTR(zq_start,			0x34),
+	ARM_CSPMU_EVENT_ATTR(zq_latch,			0x35),
+	ARM_CSPMU_EVENT_ATTR(wr_fifo_full,		0x36),
+	ARM_CSPMU_EVENT_ATTR(info_fifo_full,		0x37),
+	ARM_CSPMU_EVENT_ATTR(cmd_fifo_full,		0x38),
+	ARM_CSPMU_EVENT_ATTR(dfi_nop,			0x39),
+	ARM_CSPMU_EVENT_ATTR(dfi_cmd,			0x3a),
+	ARM_CSPMU_EVENT_ATTR(rd_run_len,		0x3b),
+	ARM_CSPMU_EVENT_ATTR(wr_run_len,		0x3c),
+
+	ARM_CSPMU_EVENT_ATTR(cycles, ARM_CSPMU_EVT_CYCLES_DEFAULT),
+	NULL,
+};
+
+static struct attribute *ampereone_mcu_format_attrs[] = {
+	ARM_CSPMU_FORMAT_EVENT_ATTR,
+	ARM_CSPMU_FORMAT_ATTR(threshold, "config1:0-7"),
+	ARM_CSPMU_FORMAT_ATTR(rank, "config1:8-23"),
+	ARM_CSPMU_FORMAT_ATTR(bank, "config2:0-31"),
+	ARM_CSPMU_FORMAT_ATTR(filter_enable, "config2:32"),
+	NULL,
+};
+
+static struct attribute **
+ampere_cspmu_get_event_attrs(const struct arm_cspmu *cspmu)
+{
+	const struct ampere_cspmu_ctx *ctx = to_ampere_cspmu_ctx(cspmu);
+
+	return ctx->event_attr;
+}
+
+static struct attribute **
+ampere_cspmu_get_format_attrs(const struct arm_cspmu *cspmu)
+{
+	const struct ampere_cspmu_ctx *ctx = to_ampere_cspmu_ctx(cspmu);
+
+	return ctx->format_attr;
+}
+
+static const char *
+ampere_cspmu_get_name(const struct arm_cspmu *cspmu)
+{
+	const struct ampere_cspmu_ctx *ctx = to_ampere_cspmu_ctx(cspmu);
+
+	return ctx->name;
+}
+
+static u32 ampere_cspmu_event_filter(const struct perf_event *event)
+{
+	return 0;
+}
+
+static void ampere_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
+				       struct hw_perf_event *hwc,
+				       u32 filter)
+{
+	struct perf_event *event;
+	unsigned int idx;
+	u32 threshold = 0, rank = 0, bank = 0;
+
+	/*
+	 * At this point, all the events have the same filter settings.
+	 * Therefore, take the first event and use its configuration.
+	 */
+	idx = find_first_bit(cspmu->hw_events.used_ctrs,
+			     cspmu->cycle_counter_logical_idx);
+
+	event = cspmu->hw_events.events[idx];
+	if (get_filter_enable(event)) {
+		threshold	= get_threshold(event);
+		rank		= get_rank(event);
+		bank		= get_bank(event);
+	}
+
+	writel(threshold, cspmu->base0 + PMAUXR0);
+	writel(rank, cspmu->base0 + PMAUXR1);
+	writel(bank, cspmu->base0 + PMAUXR2);
+}
+
+static bool ampere_cspmu_validate_event(struct arm_cspmu *cspmu,
+					struct perf_event *new)
+{
+	struct perf_event *curr;
+	unsigned int idx;
+	u32 threshold = 0, rank = 0, bank = 0;
+
+	/* We compare the global filter settings to existing events */
+	idx = find_first_bit(cspmu->hw_events.used_ctrs,
+			     cspmu->cycle_counter_logical_idx);
+
+	/* This is the first event */
+	if (idx == cspmu->cycle_counter_logical_idx)
+		return true;
+
+	curr = cspmu->hw_events.events[idx];
+
+	if (get_filter_enable(new)) {
+		threshold	= get_threshold(new);
+		rank		= get_rank(new);
+		bank		= get_bank(new);
+	}
+
+	if (get_filter_enable(new) != get_filter_enable(curr) ||
+	    get_threshold(curr) != threshold ||
+	    get_rank(curr) != rank ||
+	    get_bank(curr) != bank)
+		return false;
+
+	return true;
+}
+
+static char *ampere_cspmu_format_name(const struct arm_cspmu *cspmu,
+				      const char *name_pattern)
+{
+	struct device *dev = cspmu->dev;
+	static atomic_t pmu_generic_idx = {0};
+
+	return devm_kasprintf(dev, GFP_KERNEL, name_pattern,
+			      atomic_fetch_inc(&pmu_generic_idx));
+}
+
+int ampere_cspmu_init_ops(struct arm_cspmu *cspmu)
+{
+	struct device *dev = cspmu->dev;
+	struct ampere_cspmu_ctx *ctx;
+	struct arm_cspmu_impl_ops *impl_ops = &cspmu->impl.ops;
+
+	ctx = devm_kzalloc(dev, sizeof(struct ampere_cspmu_ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+
+	ctx->event_attr		= ampereone_mcu_pmu_event_attrs;
+	ctx->format_attr	= ampereone_mcu_format_attrs;
+	ctx->name		= ampere_cspmu_format_name(cspmu,
+							   "ampere_mcu_pmu_%u");
+	cspmu->impl.ctx = ctx;
+	cspmu->impl.split_64bit_access	= true;
+
+	impl_ops->event_filter		= ampere_cspmu_event_filter;
+	impl_ops->set_ev_filter		= ampere_cspmu_set_ev_filter;
+	impl_ops->validate_event	= ampere_cspmu_validate_event;
+	impl_ops->get_name		= ampere_cspmu_get_name;
+	impl_ops->get_event_attrs	= ampere_cspmu_get_event_attrs;
+	impl_ops->get_format_attrs	= ampere_cspmu_get_format_attrs;
+
+	return 0;
+}
diff --git a/drivers/perf/arm_cspmu/ampere_cspmu.h b/drivers/perf/arm_cspmu/ampere_cspmu.h
new file mode 100644
index 000000000000..9b3e1628d1d6
--- /dev/null
+++ b/drivers/perf/arm_cspmu/ampere_cspmu.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Ampere SoC PMU (Performance Monitor Unit)
+ *
+ * Copyright (c) 2023, Ampere Computing LLC
+ */
+
+#ifndef __AMPERE_CSPMU_H__
+#define __AMPERE_CSPMU_H__
+
+#include "arm_cspmu.h"
+
+/* Allocate AMPERE descriptor. */
+int ampere_cspmu_init_ops(struct arm_cspmu *cspmu);
+
+#endif /* __AMPERE_CSPMU_H__ */
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index a26f484e06b1..42b57f353777 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -30,6 +30,7 @@
 #include <linux/platform_device.h>
 #include <acpi/processor.h>
 
+#include "ampere_cspmu.h"
 #include "arm_cspmu.h"
 #include "nvidia_cspmu.h"
 
@@ -119,6 +120,7 @@
 
 /* JEDEC-assigned JEP106 identification code */
 #define ARM_CSPMU_IMPL_ID_NVIDIA		0x36B
+#define ARM_CSPMU_IMPL_ID_AMPERE		0xA16
 
 static unsigned long arm_cspmu_cpuhp_state;
 
@@ -394,6 +396,11 @@ static const struct impl_match impl_match[] = {
 	  .mask = ARM_CSPMU_PMIIDR_IMPLEMENTER,
 	  .impl_init_ops = nv_cspmu_init_ops
 	},
+	{
+	  .pmiidr = ARM_CSPMU_IMPL_ID_AMPERE,
+	  .mask = ARM_CSPMU_PMIIDR_IMPLEMENTER,
+	  .impl_init_ops = ampere_cspmu_init_ops
+	},
 	{}
 };
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/5] perf: arm_cspmu: Support 32-bit accesses to 64-bit registers
  2023-06-01  3:01 ` [PATCH v2 1/5] perf: arm_cspmu: Support 32-bit accesses to 64-bit registers Ilkka Koskinen
@ 2023-06-01 14:49   ` Robin Murphy
  2023-06-02  6:47     ` Ilkka Koskinen
  0 siblings, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2023-06-01 14:49 UTC (permalink / raw)
  To: Ilkka Koskinen, Jonathan Corbet, Will Deacon, Mark Rutland,
	Besar Wicaksono, Suzuki K Poulose
  Cc: linux-doc, linux-kernel, linux-arm-kernel

On 2023-06-01 04:01, Ilkka Koskinen wrote:
> Split the 64-bit register accesses if 64-bit access is not supported
> by the PMU.
> 
> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
> ---
>   drivers/perf/arm_cspmu/arm_cspmu.c | 8 ++++++--
>   drivers/perf/arm_cspmu/arm_cspmu.h | 1 +
>   2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
> index a3f1c410b417..88547a2b73e6 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> @@ -701,8 +701,12 @@ static void arm_cspmu_write_counter(struct perf_event *event, u64 val)
>   
>   	if (use_64b_counter_reg(cspmu)) {
>   		offset = counter_offset(sizeof(u64), event->hw.idx);
> -
> -		writeq(val, cspmu->base1 + offset);
> +		if (!cspmu->impl.split_64bit_access) {

Could we not just hang this off the 64-bit atomicity property to match 
the read path? It doesn't seem like there's much benefit in 
micro-optimising for whether the interconnect splits 64-bit accesses 
into 32-bit bursts vs. just not accepting them at all.

> +			writeq(val, cspmu->base1 + offset);
> +		} else {
> +			writel(lower_32_bits(val), cspmu->base1 + offset);
> +			writel(upper_32_bits(val), cspmu->base1 + offset + 4);

lo_hi_writeq() - the header's already included for 32-bit build 
coverage, so we may as well put it to use :)

Thanks,
Robin.

> +		}
>   	} else {
>   		offset = counter_offset(sizeof(u32), event->hw.idx);
>   
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
> index 51323b175a4a..c0412cf2bd97 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.h
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.h
> @@ -110,6 +110,7 @@ struct arm_cspmu_impl_ops {
>   /* Vendor/implementer descriptor. */
>   struct arm_cspmu_impl {
>   	u32 pmiidr;
> +	bool split_64bit_access;
>   	struct arm_cspmu_impl_ops ops;
>   	void *ctx;
>   };

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/5] perf: arm_cspmu: Support shared interrupts
  2023-06-01  3:01 ` [PATCH v2 2/5] perf: arm_cspmu: Support shared interrupts Ilkka Koskinen
@ 2023-06-01 14:54   ` Robin Murphy
  2023-06-02  7:04     ` Ilkka Koskinen
  0 siblings, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2023-06-01 14:54 UTC (permalink / raw)
  To: Ilkka Koskinen, Jonathan Corbet, Will Deacon, Mark Rutland,
	Besar Wicaksono, Suzuki K Poulose
  Cc: linux-doc, linux-kernel, linux-arm-kernel

On 2023-06-01 04:01, Ilkka Koskinen wrote:
> Some of the PMUs may share the interrupt. Support them by
> setting IRQF_SHARED

This has the usual problem of allowing any PMU instance to move the IRQ 
affinity to a different CPU without also migrating all the other PMU 
contexts, and thus breaking perf core's assumptions of mutual exclusion.

Thanks,
Robin.

> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
> ---
>   drivers/perf/arm_cspmu/arm_cspmu.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
> index 88547a2b73e6..cc5204d1b5fb 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> @@ -1067,8 +1067,8 @@ static int arm_cspmu_request_irq(struct arm_cspmu *cspmu)
>   		return irq;
>   
>   	ret = devm_request_irq(dev, irq, arm_cspmu_handle_irq,
> -			       IRQF_NOBALANCING | IRQF_NO_THREAD, dev_name(dev),
> -			       cspmu);
> +			       IRQF_NOBALANCING | IRQF_NO_THREAD | IRQF_SHARED,
> +			       dev_name(dev), cspmu);
>   	if (ret) {
>   		dev_err(dev, "Could not request IRQ %d\n", irq);
>   		return ret;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/5] perf: arm_cspmu: Support implementation specific event validation
  2023-06-01  3:01 ` [PATCH v2 4/5] perf: arm_cspmu: Support implementation specific event validation Ilkka Koskinen
@ 2023-06-01 15:09   ` Robin Murphy
  2023-06-02  7:09     ` Ilkka Koskinen
  0 siblings, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2023-06-01 15:09 UTC (permalink / raw)
  To: Ilkka Koskinen, Jonathan Corbet, Will Deacon, Mark Rutland,
	Besar Wicaksono, Suzuki K Poulose
  Cc: linux-doc, linux-kernel, linux-arm-kernel

On 2023-06-01 04:01, Ilkka Koskinen wrote:
> Some platforms may use e.g. different filtering mechanism and, thus,
> may need different way to validate the events.
> 
> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
> ---
>   drivers/perf/arm_cspmu/arm_cspmu.c | 4 ++++
>   drivers/perf/arm_cspmu/arm_cspmu.h | 2 ++
>   2 files changed, 6 insertions(+)
> 
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
> index b4c4ef81c719..a26f484e06b1 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> @@ -593,6 +593,10 @@ static int arm_cspmu_get_event_idx(struct arm_cspmu_hw_events *hw_events,
>   	if (idx >= cspmu->num_logical_ctrs)
>   		return -EAGAIN;
>   
> +	if (cspmu->impl.ops.validate_event &&
> +	    !cspmu->impl.ops.validate_event(cspmu, event))
> +		return -EAGAIN;

Seems like this should be -EINVAL, or maybe the callback should return 
int so it can make its own distinction (yes, I know the outer logic 
doesn't actually propagate it, but there's no reason that couldn't 
improve at some point as well).

Another thought is that once we get into imp-def conditions for whether 
an event is valid in itself, we presumably also need to consider imp-def 
conditions for whether a given pair of events are compatible to be grouped?

Thanks,
Robin.

> +
>   	set_bit(idx, hw_events->used_ctrs);
>   
>   	return idx;
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
> index 4a29b921f7e8..0e5c316c96f9 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.h
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.h
> @@ -106,6 +106,8 @@ struct arm_cspmu_impl_ops {
>   	void (*set_ev_filter)(struct arm_cspmu *cspmu,
>   			      struct hw_perf_event *hwc,
>   			      u32 filter);
> +	/* Implementation specific event validation */
> +	bool (*validate_event)(struct arm_cspmu *cspmu, struct perf_event *new);
>   	/* Hide/show unsupported events */
>   	umode_t (*event_attr_is_visible)(struct kobject *kobj,
>   					 struct attribute *attr, int unused);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU
  2023-06-01  3:01 ` [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU Ilkka Koskinen
@ 2023-06-01 15:23   ` Robin Murphy
  2023-06-02  7:13     ` Ilkka Koskinen
  2023-06-03  1:16   ` kernel test robot
  1 sibling, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2023-06-01 15:23 UTC (permalink / raw)
  To: Ilkka Koskinen, Jonathan Corbet, Will Deacon, Mark Rutland,
	Besar Wicaksono, Suzuki K Poulose
  Cc: linux-doc, linux-kernel, linux-arm-kernel

On 2023-06-01 04:01, Ilkka Koskinen wrote:
[...]
> +static bool ampere_cspmu_validate_event(struct arm_cspmu *cspmu,
> +					struct perf_event *new)
> +{
> +	struct perf_event *curr;
> +	unsigned int idx;
> +	u32 threshold = 0, rank = 0, bank = 0;
> +
> +	/* We compare the global filter settings to existing events */
> +	idx = find_first_bit(cspmu->hw_events.used_ctrs,
> +			     cspmu->cycle_counter_logical_idx);
> +
> +	/* This is the first event */
> +	if (idx == cspmu->cycle_counter_logical_idx)
> +		return true;
> +
> +	curr = cspmu->hw_events.events[idx];
> +
> +	if (get_filter_enable(new)) {
> +		threshold	= get_threshold(new);
> +		rank		= get_rank(new);
> +		bank		= get_bank(new);
> +	}
> +
> +	if (get_filter_enable(new) != get_filter_enable(curr) ||

Is there any useful purpose in allowing the user to specify nonzero 
rank, bank or threshold values with filter_enable=0? Assuming not, then 
between this and ampere_cspmu_set_ev_filter() it appears that you don't 
need filter_enable at all.

Thanks,
Robin.

> +	    get_threshold(curr) != threshold ||
> +	    get_rank(curr) != rank ||
> +	    get_bank(curr) != bank)
> +		return false;
> +
> +	return true;
> +}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/5] perf: arm_cspmu: Support 32-bit accesses to 64-bit registers
  2023-06-01 14:49   ` Robin Murphy
@ 2023-06-02  6:47     ` Ilkka Koskinen
  0 siblings, 0 replies; 17+ messages in thread
From: Ilkka Koskinen @ 2023-06-02  6:47 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Ilkka Koskinen, Jonathan Corbet, Will Deacon, Mark Rutland,
	Besar Wicaksono, Suzuki K Poulose, linux-doc, linux-kernel,
	linux-arm-kernel


Hi Robin,

On Thu, 1 Jun 2023, Robin Murphy wrote:
> On 2023-06-01 04:01, Ilkka Koskinen wrote:
>> Split the 64-bit register accesses if 64-bit access is not supported
>> by the PMU.
>> 
>> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
>> ---
>>   drivers/perf/arm_cspmu/arm_cspmu.c | 8 ++++++--
>>   drivers/perf/arm_cspmu/arm_cspmu.h | 1 +
>>   2 files changed, 7 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c 
>> b/drivers/perf/arm_cspmu/arm_cspmu.c
>> index a3f1c410b417..88547a2b73e6 100644
>> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
>> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
>> @@ -701,8 +701,12 @@ static void arm_cspmu_write_counter(struct perf_event 
>> *event, u64 val)
>>     	if (use_64b_counter_reg(cspmu)) {
>>   		offset = counter_offset(sizeof(u64), event->hw.idx);
>> -
>> -		writeq(val, cspmu->base1 + offset);
>> +		if (!cspmu->impl.split_64bit_access) {
>
> Could we not just hang this off the 64-bit atomicity property to match the 
> read path? It doesn't seem like there's much benefit in micro-optimising for 
> whether the interconnect splits 64-bit accesses into 32-bit bursts vs. just 
> not accepting them at all.

Sure, I was actually wondering if I could use it or if there could be 
a really weird hw implementation. I'll change it.

>
>> +			writeq(val, cspmu->base1 + offset);
>> +		} else {
>> +			writel(lower_32_bits(val), cspmu->base1 + offset);
>> +			writel(upper_32_bits(val), cspmu->base1 + offset + 
>> 4);
>
> lo_hi_writeq() - the header's already included for 32-bit build coverage, so 
> we may as well put it to use :)

Oh, I have missed that function completely. I fix this too.

Cheers, Ilkka


> Thanks,
> Robin.
>
>> +		}
>>   	} else {
>>   		offset = counter_offset(sizeof(u32), event->hw.idx);
>>   diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h 
>> b/drivers/perf/arm_cspmu/arm_cspmu.h
>> index 51323b175a4a..c0412cf2bd97 100644
>> --- a/drivers/perf/arm_cspmu/arm_cspmu.h
>> +++ b/drivers/perf/arm_cspmu/arm_cspmu.h
>> @@ -110,6 +110,7 @@ struct arm_cspmu_impl_ops {
>>   /* Vendor/implementer descriptor. */
>>   struct arm_cspmu_impl {
>>   	u32 pmiidr;
>> +	bool split_64bit_access;
>>   	struct arm_cspmu_impl_ops ops;
>>   	void *ctx;
>>   };
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/5] perf: arm_cspmu: Support shared interrupts
  2023-06-01 14:54   ` Robin Murphy
@ 2023-06-02  7:04     ` Ilkka Koskinen
  2023-06-02 11:25       ` Robin Murphy
  0 siblings, 1 reply; 17+ messages in thread
From: Ilkka Koskinen @ 2023-06-02  7:04 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Ilkka Koskinen, Jonathan Corbet, Will Deacon, Mark Rutland,
	Besar Wicaksono, Suzuki K Poulose, linux-doc, linux-kernel,
	linux-arm-kernel


Hi Robin,

On Thu, 1 Jun 2023, Robin Murphy wrote:
> On 2023-06-01 04:01, Ilkka Koskinen wrote:
>> Some of the PMUs may share the interrupt. Support them by
>> setting IRQF_SHARED
>
> This has the usual problem of allowing any PMU instance to move the IRQ 
> affinity to a different CPU without also migrating all the other PMU 
> contexts, and thus breaking perf core's assumptions of mutual exclusion.

I see, I wasn't aware of such an assumption. Sounds like there isn't 
necessarily an easy and clean solution for the shared interrupt case. I 
drop the patch and get back on the issue if we come up with something 
reasonable later.

Cheers, Ilkka

>
> Thanks,
> Robin.
>
>> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
>> ---
>>   drivers/perf/arm_cspmu/arm_cspmu.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c 
>> b/drivers/perf/arm_cspmu/arm_cspmu.c
>> index 88547a2b73e6..cc5204d1b5fb 100644
>> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
>> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
>> @@ -1067,8 +1067,8 @@ static int arm_cspmu_request_irq(struct arm_cspmu 
>> *cspmu)
>>   		return irq;
>>     	ret = devm_request_irq(dev, irq, arm_cspmu_handle_irq,
>> -			       IRQF_NOBALANCING | IRQF_NO_THREAD, 
>> dev_name(dev),
>> -			       cspmu);
>> +			       IRQF_NOBALANCING | IRQF_NO_THREAD | 
>> IRQF_SHARED,
>> +			       dev_name(dev), cspmu);
>>   	if (ret) {
>>   		dev_err(dev, "Could not request IRQ %d\n", irq);
>>   		return ret;
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/5] perf: arm_cspmu: Support implementation specific event validation
  2023-06-01 15:09   ` Robin Murphy
@ 2023-06-02  7:09     ` Ilkka Koskinen
  0 siblings, 0 replies; 17+ messages in thread
From: Ilkka Koskinen @ 2023-06-02  7:09 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Ilkka Koskinen, Jonathan Corbet, Will Deacon, Mark Rutland,
	Besar Wicaksono, Suzuki K Poulose, linux-doc, linux-kernel,
	linux-arm-kernel


Hi Robin,

On Thu, 1 Jun 2023, Robin Murphy wrote:
> On 2023-06-01 04:01, Ilkka Koskinen wrote:
>> Some platforms may use e.g. different filtering mechanism and, thus,
>> may need different way to validate the events.
>> 
>> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
>> ---
>>   drivers/perf/arm_cspmu/arm_cspmu.c | 4 ++++
>>   drivers/perf/arm_cspmu/arm_cspmu.h | 2 ++
>>   2 files changed, 6 insertions(+)
>> 
>> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c 
>> b/drivers/perf/arm_cspmu/arm_cspmu.c
>> index b4c4ef81c719..a26f484e06b1 100644
>> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
>> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
>> @@ -593,6 +593,10 @@ static int arm_cspmu_get_event_idx(struct 
>> arm_cspmu_hw_events *hw_events,
>>   	if (idx >= cspmu->num_logical_ctrs)
>>   		return -EAGAIN;
>>   +	if (cspmu->impl.ops.validate_event &&
>> +	    !cspmu->impl.ops.validate_event(cspmu, event))
>> +		return -EAGAIN;
>
> Seems like this should be -EINVAL, or maybe the callback should return int so 
> it can make its own distinction (yes, I know the outer logic doesn't actually 
> propagate it, but there's no reason that couldn't improve at some point as 
> well).

Makes sense to me.

> Another thought is that once we get into imp-def conditions for whether an 
> event is valid in itself, we presumably also need to consider imp-def 
> conditions for whether a given pair of events are compatible to be grouped?

That's a good point. I'll take a look at it.

Cheers, Ilkka

>
> Thanks,
> Robin.
>
>> +
>>   	set_bit(idx, hw_events->used_ctrs);
>>     	return idx;
>> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h 
>> b/drivers/perf/arm_cspmu/arm_cspmu.h
>> index 4a29b921f7e8..0e5c316c96f9 100644
>> --- a/drivers/perf/arm_cspmu/arm_cspmu.h
>> +++ b/drivers/perf/arm_cspmu/arm_cspmu.h
>> @@ -106,6 +106,8 @@ struct arm_cspmu_impl_ops {
>>   	void (*set_ev_filter)(struct arm_cspmu *cspmu,
>>   			      struct hw_perf_event *hwc,
>>   			      u32 filter);
>> +	/* Implementation specific event validation */
>> +	bool (*validate_event)(struct arm_cspmu *cspmu, struct perf_event 
>> *new);
>>   	/* Hide/show unsupported events */
>>   	umode_t (*event_attr_is_visible)(struct kobject *kobj,
>>   					 struct attribute *attr, int unused);
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU
  2023-06-01 15:23   ` Robin Murphy
@ 2023-06-02  7:13     ` Ilkka Koskinen
  2023-06-02 11:51       ` Robin Murphy
  0 siblings, 1 reply; 17+ messages in thread
From: Ilkka Koskinen @ 2023-06-02  7:13 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Ilkka Koskinen, Jonathan Corbet, Will Deacon, Mark Rutland,
	Besar Wicaksono, Suzuki K Poulose, linux-doc, linux-kernel,
	linux-arm-kernel


Hi Robin,

On Thu, 1 Jun 2023, Robin Murphy wrote:
> On 2023-06-01 04:01, Ilkka Koskinen wrote:
> [...]
>> +static bool ampere_cspmu_validate_event(struct arm_cspmu *cspmu,
>> +					struct perf_event *new)
>> +{
>> +	struct perf_event *curr;
>> +	unsigned int idx;
>> +	u32 threshold = 0, rank = 0, bank = 0;
>> +
>> +	/* We compare the global filter settings to existing events */
>> +	idx = find_first_bit(cspmu->hw_events.used_ctrs,
>> +			     cspmu->cycle_counter_logical_idx);
>> +
>> +	/* This is the first event */
>> +	if (idx == cspmu->cycle_counter_logical_idx)
>> +		return true;
>> +
>> +	curr = cspmu->hw_events.events[idx];
>> +
>> +	if (get_filter_enable(new)) {
>> +		threshold	= get_threshold(new);
>> +		rank		= get_rank(new);
>> +		bank		= get_bank(new);
>> +	}
>> +
>> +	if (get_filter_enable(new) != get_filter_enable(curr) ||
>
> Is there any useful purpose in allowing the user to specify nonzero rank, 
> bank or threshold values with filter_enable=0? Assuming not, then between 
> this and ampere_cspmu_set_ev_filter() it appears that you don't need 
> filter_enable at all.

Not really. I dropped filter_enable at one point but restored it later to 
match the smmuv3 pmu driver. I totally agree, it's unnecessary and it's 
better to remove it completely.

Cheers, Ilkka

>
> Thanks,
> Robin.
>
>> +	    get_threshold(curr) != threshold ||
>> +	    get_rank(curr) != rank ||
>> +	    get_bank(curr) != bank)
>> +		return false;
>> +
>> +	return true;
>> +}
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/5] perf: arm_cspmu: Support shared interrupts
  2023-06-02  7:04     ` Ilkka Koskinen
@ 2023-06-02 11:25       ` Robin Murphy
  0 siblings, 0 replies; 17+ messages in thread
From: Robin Murphy @ 2023-06-02 11:25 UTC (permalink / raw)
  To: Ilkka Koskinen
  Cc: Jonathan Corbet, Will Deacon, Mark Rutland, Besar Wicaksono,
	Suzuki K Poulose, linux-doc, linux-kernel, linux-arm-kernel

On 2023-06-02 08:04, Ilkka Koskinen wrote:
> 
> Hi Robin,
> 
> On Thu, 1 Jun 2023, Robin Murphy wrote:
>> On 2023-06-01 04:01, Ilkka Koskinen wrote:
>>> Some of the PMUs may share the interrupt. Support them by
>>> setting IRQF_SHARED
>>
>> This has the usual problem of allowing any PMU instance to move the 
>> IRQ affinity to a different CPU without also migrating all the other 
>> PMU contexts, and thus breaking perf core's assumptions of mutual 
>> exclusion.
> 
> I see, I wasn't aware of such an assumption. Sounds like there isn't 
> necessarily an easy and clean solution for the shared interrupt case. I 
> drop the patch and get back on the issue if we come up with something 
> reasonable later.

What comes to mind is factoring out the explicit interrupt-sharing 
machinery that we wrote to solve this problem in arm_dmc620_pmu, or 
possibly trying to do something with IRQ affinity notifiers (however, I 
recall looking into that a while ago and it didn't seem like they 
actually interact with CPU hotplug in the way we'd want).

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU
  2023-06-02  7:13     ` Ilkka Koskinen
@ 2023-06-02 11:51       ` Robin Murphy
  0 siblings, 0 replies; 17+ messages in thread
From: Robin Murphy @ 2023-06-02 11:51 UTC (permalink / raw)
  To: Ilkka Koskinen
  Cc: Jonathan Corbet, Will Deacon, Mark Rutland, Besar Wicaksono,
	Suzuki K Poulose, linux-doc, linux-kernel, linux-arm-kernel

On 2023-06-02 08:13, Ilkka Koskinen wrote:
> 
> Hi Robin,
> 
> On Thu, 1 Jun 2023, Robin Murphy wrote:
>> On 2023-06-01 04:01, Ilkka Koskinen wrote:
>> [...]
>>> +static bool ampere_cspmu_validate_event(struct arm_cspmu *cspmu,
>>> +                    struct perf_event *new)
>>> +{
>>> +    struct perf_event *curr;
>>> +    unsigned int idx;
>>> +    u32 threshold = 0, rank = 0, bank = 0;
>>> +
>>> +    /* We compare the global filter settings to existing events */
>>> +    idx = find_first_bit(cspmu->hw_events.used_ctrs,
>>> +                 cspmu->cycle_counter_logical_idx);
>>> +
>>> +    /* This is the first event */
>>> +    if (idx == cspmu->cycle_counter_logical_idx)
>>> +        return true;
>>> +
>>> +    curr = cspmu->hw_events.events[idx];
>>> +
>>> +    if (get_filter_enable(new)) {
>>> +        threshold    = get_threshold(new);
>>> +        rank        = get_rank(new);
>>> +        bank        = get_bank(new);
>>> +    }
>>> +
>>> +    if (get_filter_enable(new) != get_filter_enable(curr) ||
>>
>> Is there any useful purpose in allowing the user to specify nonzero 
>> rank, bank or threshold values with filter_enable=0? Assuming not, 
>> then between this and ampere_cspmu_set_ev_filter() it appears that you 
>> don't need filter_enable at all.
> 
> Not really. I dropped filter_enable at one point but restored it later 
> to match the smmuv3 pmu driver. I totally agree, it's unnecessary and 
> it's better to remove it completely.

Ah, I see - the SMMU PMCG driver needs that to differentiate between 
"filter values defaulted to 0 because user didn't ask for filtering" and 
"user asked to filter an exact match on StreamID 0", since it's 
impractical to expect userspace tools to understand and manually set the 
all-ones mask value to indicate that filtering wasn't requested. In your 
case, though, since values of 0 appear to mean "no filter", it should 
just work as expected without needing any additional complexity. Ideally 
your interface should reflect the functionality and expected usage model 
of your PMU hardware in the way that's most intuitive and helpful for 
the user - it doesn't need to be influenced by other PMUs that work 
differently.

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU
  2023-06-01  3:01 ` [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU Ilkka Koskinen
  2023-06-01 15:23   ` Robin Murphy
@ 2023-06-03  1:16   ` kernel test robot
  1 sibling, 0 replies; 17+ messages in thread
From: kernel test robot @ 2023-06-03  1:16 UTC (permalink / raw)
  To: Ilkka Koskinen, Jonathan Corbet, Will Deacon, Mark Rutland,
	Besar Wicaksono, Suzuki K Poulose, Robin Murphy
  Cc: oe-kbuild-all, linux-doc, linux-kernel, linux-arm-kernel,
	Ilkka Koskinen

Hi Ilkka,

kernel test robot noticed the following build warnings:

[auto build test WARNING on arm-perf/for-next/perf]
[also build test WARNING on soc/for-next linus/master v6.4-rc4 next-20230602]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Ilkka-Koskinen/perf-arm_cspmu-Support-32-bit-accesses-to-64-bit-registers/20230601-110440
base:   https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-next/perf
patch link:    https://lore.kernel.org/r/20230601030144.3458136-6-ilkka%40os.amperecomputing.com
patch subject: [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU
reproduce:
        # https://github.com/intel-lab-lkp/linux/commit/6757d231639bb7a9f0b47f52d7d4f101ad3a0e29
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Ilkka-Koskinen/perf-arm_cspmu-Support-32-bit-accesses-to-64-bit-registers/20230601-110440
        git checkout 6757d231639bb7a9f0b47f52d7d4f101ad3a0e29
        make menuconfig
        # enable CONFIG_COMPILE_TEST, CONFIG_WARN_MISSING_DOCUMENTS, CONFIG_WARN_ABI_ERRORS
        make htmldocs

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306030958.1IskqGEt-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> Documentation/admin-guide/perf/ampere_cspmu.rst: WARNING: document isn't included in any toctree

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-06-03  1:17 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-01  3:01 [PATCH v2 0/5] perf: ampere: Add support for Ampere SoC PMUs Ilkka Koskinen
2023-06-01  3:01 ` [PATCH v2 1/5] perf: arm_cspmu: Support 32-bit accesses to 64-bit registers Ilkka Koskinen
2023-06-01 14:49   ` Robin Murphy
2023-06-02  6:47     ` Ilkka Koskinen
2023-06-01  3:01 ` [PATCH v2 2/5] perf: arm_cspmu: Support shared interrupts Ilkka Koskinen
2023-06-01 14:54   ` Robin Murphy
2023-06-02  7:04     ` Ilkka Koskinen
2023-06-02 11:25       ` Robin Murphy
2023-06-01  3:01 ` [PATCH v2 3/5] perf: arm_cspmu: Support implementation specific filters Ilkka Koskinen
2023-06-01  3:01 ` [PATCH v2 4/5] perf: arm_cspmu: Support implementation specific event validation Ilkka Koskinen
2023-06-01 15:09   ` Robin Murphy
2023-06-02  7:09     ` Ilkka Koskinen
2023-06-01  3:01 ` [PATCH v2 5/5] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU Ilkka Koskinen
2023-06-01 15:23   ` Robin Murphy
2023-06-02  7:13     ` Ilkka Koskinen
2023-06-02 11:51       ` Robin Murphy
2023-06-03  1:16   ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).