public inbox for linux-perf-users@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/2] perf: marvell: Add CN20K DDR PMU support
@ 2026-04-13 16:56 Geetha sowjanya
  2026-04-13 16:56 ` [PATCH v5 1/2] dt-bindings: perf: marvell: Add CN20K DDR PMU binding Geetha sowjanya
  2026-04-13 16:56 ` [PATCH v5 2/2] perf: marvell: Add CN20K DDR PMU support Geetha sowjanya
  0 siblings, 2 replies; 6+ messages in thread
From: Geetha sowjanya @ 2026-04-13 16:56 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
  Cc: mark.rutland, will, krzk+dt

This series adds support for the DDR Performance Monitoring Unit (PMU)
present in Marvell CN20K SoCs.

The DDR PMU is part of the DRAM Subsystem (DSS) and provides hardware
counters to monitor DDR traffic and performance events. The block
implements eight programmable counters and two fixed-function counters
tracking DDR read and write activity, and is accessed via a dedicated
MMIO region.

CN20K is the successor to CN10K, and the DDR PMU hardware is functionally
equivalent to the CN10K implementation, with only minor differences in
register offsets and event mappings. To allow software to distinguish
between the two silicon variants, this series introduces a specific
"marvell,cn20k-ddr-pmu" compatible and extends the existing
marvell_cn10k_ddr_pmu driver to handle CN20K via variant-specific data.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>

Chnages in v4:
 - Fixed document file name.

Chnages in v3:
- Expanded cover letter and commit message to better describe the DDR PMU
  hardware and its relationship to CN10K
- Fixed the file name.

Changes in v2:
 - Fixed YAML syntax error triggered by a tab character in the examples
  section, which caused dt_binding_check to fail.

Changes in v1:
- Added a description field to the binding.
- Simplified the compatible property using 'const' instead of 'items/enum'.
- Updated the example node name to include a unit-address matching the reg base.

Geetha sowjanya (2):
  dt-bindings: perf: marvell: Document CN20K DDR PMU
  perf: marvell: Add CN20K DDR PMU support

 .../bindings/perf/marvell,cn20k-ddr-pmu.yaml  |  39 ++++
 drivers/perf/marvell_cn10k_ddr_pmu.c          | 187 ++++++++++++++++--
 2 files changed, 210 insertions(+), 16 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml

-- 
2.25.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v5 1/2] dt-bindings: perf: marvell: Add CN20K DDR PMU binding
  2026-04-13 16:56 [PATCH v5 0/2] perf: marvell: Add CN20K DDR PMU support Geetha sowjanya
@ 2026-04-13 16:56 ` Geetha sowjanya
  2026-04-13 17:12   ` sashiko-bot
  2026-04-13 16:56 ` [PATCH v5 2/2] perf: marvell: Add CN20K DDR PMU support Geetha sowjanya
  1 sibling, 1 reply; 6+ messages in thread
From: Geetha sowjanya @ 2026-04-13 16:56 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
  Cc: mark.rutland, will, krzk+dt

Marvell CN20K SoCs integrate a DDR Performance Monitoring Unit (PMU)
associated with the DDR controller. The block provides hardware counters
to monitor DDR traffic and performance events and is accessed via a
dedicated MMIO region.

The CN20K DDR PMU is functionally equivalent to the CN10K DDR PMU, with
minor register offset differences.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
---
 .../bindings/perf/marvell,cn20k-ddr-pmu.yaml  | 39 +++++++++++++++++++
 1 file changed, 39 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml

diff --git a/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml b/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml
new file mode 100644
index 000000000000..cc6aa760de49
--- /dev/null
+++ b/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml
@@ -0,0 +1,39 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/perf/marvell,cn20k-ddr-pmu.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Marvell CN20K DDR performance monitor
+
+description:
+  Performance Monitoring Unit (PMU) for the DDR controller
+  in Marvell CN20K SoCs.
+
+maintainers:
+  - Geetha sowjanya <gakula@marvell.com>
+
+properties:
+  compatible:
+    const: marvell,cn20k-ddr-pmu
+
+  reg:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+
+additionalProperties: false
+
+examples:
+  - |
+    bus {
+        #address-cells = <2>;
+        #size-cells = <2>;
+
+        ddr-pmu@c200000000 {
+            compatible = "marvell,cn20k-ddr-pmu";
+            reg = <0xc200 0x00000000 0x0 0x100000>;
+        };
+    };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v5 2/2] perf: marvell: Add CN20K DDR PMU support
  2026-04-13 16:56 [PATCH v5 0/2] perf: marvell: Add CN20K DDR PMU support Geetha sowjanya
  2026-04-13 16:56 ` [PATCH v5 1/2] dt-bindings: perf: marvell: Add CN20K DDR PMU binding Geetha sowjanya
@ 2026-04-13 16:56 ` Geetha sowjanya
  2026-04-13 17:50   ` sashiko-bot
  1 sibling, 1 reply; 6+ messages in thread
From: Geetha sowjanya @ 2026-04-13 16:56 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
  Cc: mark.rutland, will, krzk+dt

The CN20K DRAM Subsystem exposes eight programmable
performance counters and two fixed counters for DDR
read and write traffic.  Software selects events for
the programmable counters from traffic at the DDR PHY
interface, the CHI interconnect, or inside the DDR controller.

Add CN20K register offsets, event maps, and sysfs attributes;
match the device via OF (marvell,cn20k-ddr-pmu) and ACPI (MRVL000B).
Represent the SoC variant in platform data with bit flags so
CN20K can reuse the CN10K PMU code path where appropriate.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
---
 drivers/perf/marvell_cn10k_ddr_pmu.c | 187 ++++++++++++++++++++++++---
 1 file changed, 171 insertions(+), 16 deletions(-)

diff --git a/drivers/perf/marvell_cn10k_ddr_pmu.c b/drivers/perf/marvell_cn10k_ddr_pmu.c
index 72ac17efd846..7e2e1823b009 100644
--- a/drivers/perf/marvell_cn10k_ddr_pmu.c
+++ b/drivers/perf/marvell_cn10k_ddr_pmu.c
@@ -13,31 +13,43 @@
 #include <linux/hrtimer.h>
 #include <linux/acpi.h>
 #include <linux/platform_device.h>
+#include <linux/bits.h>
+
+/* SoC variant flags for struct ddr_pmu_platform_data (mutually exclusive in pdata) */
+#define IS_CN10K	BIT(0)
+#define IS_ODY		BIT(1)
+#define IS_CN20K	BIT(2)
 
 /* Performance Counters Operating Mode Control Registers */
 #define CN10K_DDRC_PERF_CNT_OP_MODE_CTRL	0x8020
 #define ODY_DDRC_PERF_CNT_OP_MODE_CTRL		0x20020
+#define CN20K_DDRC_PERF_CNT_OP_MODE_CTRL	0x20000
 #define OP_MODE_CTRL_VAL_MANUAL	0x1
 
 /* Performance Counters Start Operation Control Registers */
 #define CN10K_DDRC_PERF_CNT_START_OP_CTRL	0x8028
 #define ODY_DDRC_PERF_CNT_START_OP_CTRL		0x200A0
+#define CN20K_DDRC_PERF_CNT_START_OP_CTRL	0x20080
 #define START_OP_CTRL_VAL_START		0x1ULL
 #define START_OP_CTRL_VAL_ACTIVE	0x2
 
 /* Performance Counters End Operation Control Registers */
 #define CN10K_DDRC_PERF_CNT_END_OP_CTRL	0x8030
 #define ODY_DDRC_PERF_CNT_END_OP_CTRL	0x200E0
+#define CN20K_DDRC_PERF_CNT_END_OP_CTRL	0x200C0
 #define END_OP_CTRL_VAL_END		0x1ULL
 
 /* Performance Counters End Status Registers */
 #define CN10K_DDRC_PERF_CNT_END_STATUS		0x8038
 #define ODY_DDRC_PERF_CNT_END_STATUS		0x20120
+#define CN20K_DDRC_PERF_CNT_END_STATUS		0x20100
 #define END_STATUS_VAL_END_TIMER_MODE_END	0x1
 
 /* Performance Counters Configuration Registers */
 #define CN10K_DDRC_PERF_CFG_BASE		0x8040
 #define ODY_DDRC_PERF_CFG_BASE			0x20160
+#define CN20K_DDRC_PERF_CFG_BASE		0x20140
+#define CN20K_DDRC_PERF_CFG1_BASE		0x20180
 
 /* 8 Generic event counter + 2 fixed event counters */
 #define DDRC_PERF_NUM_GEN_COUNTERS	8
@@ -61,6 +73,23 @@
  * DO NOT change these event-id numbers, they are used to
  * program event bitmap in h/w.
  */
+
+/* CN20K specific events */
+#define EVENT_PERF_OP_IS_RD16			61
+#define EVENT_PERF_OP_IS_RD32			60
+#define EVENT_PERF_OP_IS_WR16			59
+#define EVENT_PERF_OP_IS_WR32			58
+#define EVENT_OP_IS_ENTER_DSM			44
+#define EVENT_OP_IS_RFM				43
+
+#define EVENT_CN20K_OP_IS_TCR_MRR			50
+#define EVENT_CN20K_OP_IS_DQSOSC_MRR			49
+#define EVENT_CN20K_OP_IS_DQSOSC_MPC			48
+#define EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_WR	47
+#define EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_RD	46
+#define EVENT_CN20K_OP_IS_ZQLATCH			21
+#define EVENT_CN20K_OP_IS_ZQSTART			22
+
 #define EVENT_DFI_CMD_IS_RETRY			61
 #define EVENT_RD_UC_ECC_ERROR			60
 #define EVENT_RD_CRC_ERROR			59
@@ -87,6 +116,9 @@
 #define EVENT_OP_IS_SPEC_REF			41
 #define EVENT_OP_IS_CRIT_REF			40
 #define EVENT_OP_IS_REFRESH			39
+#define EVENT_OP_IS_CAS_WCK_SUS			38
+#define EVENT_OP_IS_CAS_WS_OFF			37
+#define EVENT_OP_IS_CAS_WS			36
 #define EVENT_OP_IS_ENTER_MPSM			35
 #define EVENT_OP_IS_ENTER_POWERDOWN		31
 #define EVENT_OP_IS_ENTER_SELFREF		27
@@ -183,8 +215,8 @@ struct ddr_pmu_platform_data {
 	u64 cnt_freerun_clr;
 	u64 cnt_value_wr_op;
 	u64 cnt_value_rd_op;
-	bool is_cn10k;
-	bool is_ody;
+	u64 cfg1_base;
+	unsigned int silicon_flags; /* IS_CN10K, IS_ODY, or IS_CN20K */
 };
 
 static ssize_t cn10k_ddr_pmu_event_show(struct device *dev,
@@ -336,6 +368,80 @@ static struct attribute *odyssey_ddr_perf_events_attrs[] = {
 	NULL
 };
 
+static struct attribute *cn20k_ddr_perf_events_attrs[] = {
+	/* Programmable */
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_rd_or_wr_access, EVENT_HIF_RD_OR_WR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_wr_access, EVENT_HIF_WR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_rd_access, EVENT_HIF_RD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_rmw_access, EVENT_HIF_RMW),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_pri_rdaccess, EVENT_HIF_HI_PRI_RD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_rd_bypass_access, EVENT_READ_BYPASS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_act_bypass_access, EVENT_ACT_BYPASS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_dfi_wr_data_access,
+				 EVENT_DFI_WR_DATA_CYCLES),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_dfi_rd_data_access,
+				 EVENT_DFI_RD_DATA_CYCLES),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hpri_sched_rd_crit_access,
+				 EVENT_HPR_XACT_WHEN_CRITICAL),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_lpri_sched_rd_crit_access,
+				 EVENT_LPR_XACT_WHEN_CRITICAL),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_wr_trxn_crit_access,
+				 EVENT_WR_XACT_WHEN_CRITICAL),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_active_access, EVENT_OP_IS_ACTIVATE),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_rd_or_wr_access,
+				 EVENT_OP_IS_RD_OR_WR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_rd_active_access,
+				 EVENT_OP_IS_RD_ACTIVATE),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_read, EVENT_OP_IS_RD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_write, EVENT_OP_IS_WR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_mwr, EVENT_OP_IS_MWR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_precharge, EVENT_OP_IS_PRECHARGE),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_precharge_for_rdwr,
+				 EVENT_PRECHARGE_FOR_RDWR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_precharge_for_other,
+				 EVENT_PRECHARGE_FOR_OTHER),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_rdwr_transitions, EVENT_RDWR_TRANSITIONS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_write_combine, EVENT_WRITE_COMBINE),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_war_hazard, EVENT_WAR_HAZARD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_raw_hazard, EVENT_RAW_HAZARD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_waw_hazard, EVENT_WAW_HAZARD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_enter_selfref, EVENT_OP_IS_ENTER_SELFREF),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_enter_powerdown,
+				 EVENT_OP_IS_ENTER_POWERDOWN),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cas_ws, EVENT_OP_IS_CAS_WS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cas_ws_off, EVENT_OP_IS_CAS_WS_OFF),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cas_wck_sus, EVENT_OP_IS_CAS_WCK_SUS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_refresh, EVENT_OP_IS_REFRESH),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_crit_ref, EVENT_OP_IS_CRIT_REF),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_spec_ref, EVENT_OP_IS_SPEC_REF),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_load_mode, EVENT_OP_IS_LOAD_MODE),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_rfm, EVENT_OP_IS_RFM),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_enter_dsm, EVENT_OP_IS_ENTER_DSM),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_dfi_cycles, EVENT_DFI_CYCLES),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_win_limit_reached_rd,
+				 EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_RD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_win_limit_reached_wr,
+				 EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_WR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_dqsosc_mpc, EVENT_CN20K_OP_IS_DQSOSC_MPC),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_dqsosc_mrr, EVENT_CN20K_OP_IS_DQSOSC_MRR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_tcr_mrr, EVENT_CN20K_OP_IS_TCR_MRR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_zqstart, EVENT_CN20K_OP_IS_ZQSTART),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_zqlatch, EVENT_CN20K_OP_IS_ZQLATCH),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_read16, EVENT_PERF_OP_IS_RD16),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_read32, EVENT_PERF_OP_IS_RD32),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_write16, EVENT_PERF_OP_IS_WR16),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_write32, EVENT_PERF_OP_IS_WR32),
+	/* Free run event counters */
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_ddr_reads, EVENT_DDR_READS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_ddr_writes, EVENT_DDR_WRITES),
+	NULL
+};
+
+static struct attribute_group cn20k_ddr_perf_events_attr_group = {
+	.name = "events",
+	.attrs = cn20k_ddr_perf_events_attrs,
+};
+
 static struct attribute_group odyssey_ddr_perf_events_attr_group = {
 	.name = "events",
 	.attrs = odyssey_ddr_perf_events_attrs,
@@ -393,6 +499,13 @@ static const struct attribute_group *odyssey_attr_groups[] = {
 	NULL
 };
 
+static const struct attribute_group *cn20k_attr_groups[] = {
+	&cn20k_ddr_perf_events_attr_group,
+	&cn10k_ddr_perf_format_attr_group,
+	&cn10k_ddr_perf_cpumask_attr_group,
+	NULL
+};
+
 /* Default poll timeout is 100 sec, which is very sufficient for
  * 48 bit counter incremented max at 5.6 GT/s, which may take many
  * hours to overflow.
@@ -412,7 +525,7 @@ static int ddr_perf_get_event_bitmap(int eventid, u64 *event_bitmap,
 
 	switch (eventid) {
 	case EVENT_DFI_PARITY_POISON ...EVENT_DFI_CMD_IS_RETRY:
-		if (!ddr_pmu->p_data->is_ody) {
+		if (!(ddr_pmu->p_data->silicon_flags & IS_ODY)) {
 			err = -EINVAL;
 			break;
 		}
@@ -524,9 +637,9 @@ static void cn10k_ddr_perf_counter_enable(struct cn10k_ddr_pmu *pmu,
 					  int counter, bool enable)
 {
 	const struct ddr_pmu_platform_data *p_data = pmu->p_data;
+	unsigned int silicon_flags = pmu->p_data->silicon_flags;
 	u64 ctrl_reg = pmu->p_data->cnt_op_mode_ctrl;
 	const struct ddr_pmu_ops *ops = pmu->ops;
-	bool is_ody = pmu->p_data->is_ody;
 	u32 reg;
 	u64 val;
 
@@ -546,7 +659,7 @@ static void cn10k_ddr_perf_counter_enable(struct cn10k_ddr_pmu *pmu,
 
 		writeq_relaxed(val, pmu->base + reg);
 
-		if (is_ody) {
+		if (silicon_flags & IS_ODY) {
 			if (enable) {
 				/*
 				 * Setup the PMU counter to work in
@@ -621,6 +734,7 @@ static int cn10k_ddr_perf_event_add(struct perf_event *event, int flags)
 {
 	struct cn10k_ddr_pmu *pmu = to_cn10k_ddr_pmu(event->pmu);
 	const struct ddr_pmu_platform_data *p_data = pmu->p_data;
+	unsigned int silicon_flags = pmu->p_data->silicon_flags;
 	const struct ddr_pmu_ops *ops = pmu->ops;
 	struct hw_perf_event *hwc = &event->hw;
 	u8 config = event->attr.config;
@@ -642,10 +756,17 @@ static int cn10k_ddr_perf_event_add(struct perf_event *event, int flags)
 	if (counter < DDRC_PERF_NUM_GEN_COUNTERS) {
 		/* Generic counters, configure event id */
 		reg_offset = DDRC_PERF_CFG(p_data->cfg_base, counter);
-		ret = ddr_perf_get_event_bitmap(config, &val, pmu);
-		if (ret)
-			return ret;
 
+		if (silicon_flags & IS_CN20K) {
+			val =  (1ULL << (config - 1));
+			if (config == EVENT_CN20K_OP_IS_ZQSTART ||
+			    config == EVENT_CN20K_OP_IS_ZQLATCH)
+				reg_offset = DDRC_PERF_CFG(p_data->cfg1_base, counter);
+		} else {
+			ret = ddr_perf_get_event_bitmap(config, &val, pmu);
+			if (ret)
+				return ret;
+		}
 		writeq_relaxed(val, pmu->base + reg_offset);
 	} else {
 		/* fixed event counter, clear counter value */
@@ -952,7 +1073,25 @@ static const struct ddr_pmu_platform_data cn10k_ddr_pmu_pdata = {
 	.cnt_freerun_clr = 0,
 	.cnt_value_wr_op = CN10K_DDRC_PERF_CNT_VALUE_WR_OP,
 	.cnt_value_rd_op = CN10K_DDRC_PERF_CNT_VALUE_RD_OP,
-	.is_cn10k = TRUE,
+	.silicon_flags = IS_CN10K,
+};
+
+static const struct ddr_pmu_platform_data cn20k_ddr_pmu_pdata = {
+	.counter_overflow_val = 0,
+	.counter_max_val = GENMASK_ULL(63, 0),
+	.cnt_base = ODY_DDRC_PERF_CNT_VALUE_BASE,
+	.cfg_base = CN20K_DDRC_PERF_CFG_BASE,
+	.cfg1_base = CN20K_DDRC_PERF_CFG1_BASE,
+	.cnt_op_mode_ctrl = CN20K_DDRC_PERF_CNT_OP_MODE_CTRL,
+	.cnt_start_op_ctrl = CN20K_DDRC_PERF_CNT_START_OP_CTRL,
+	.cnt_end_op_ctrl = CN20K_DDRC_PERF_CNT_END_OP_CTRL,
+	.cnt_end_status = CN20K_DDRC_PERF_CNT_END_STATUS,
+	.cnt_freerun_en = 0,
+	.cnt_freerun_ctrl = ODY_DDRC_PERF_CNT_FREERUN_CTRL,
+	.cnt_freerun_clr = ODY_DDRC_PERF_CNT_FREERUN_CLR,
+	.cnt_value_wr_op = ODY_DDRC_PERF_CNT_VALUE_WR_OP,
+	.cnt_value_rd_op = ODY_DDRC_PERF_CNT_VALUE_RD_OP,
+	.silicon_flags = IS_CN20K,
 };
 #endif
 
@@ -979,7 +1118,7 @@ static const struct ddr_pmu_platform_data odyssey_ddr_pmu_pdata = {
 	.cnt_freerun_clr = ODY_DDRC_PERF_CNT_FREERUN_CLR,
 	.cnt_value_wr_op = ODY_DDRC_PERF_CNT_VALUE_WR_OP,
 	.cnt_value_rd_op = ODY_DDRC_PERF_CNT_VALUE_RD_OP,
-	.is_ody = TRUE,
+	.silicon_flags = IS_ODY,
 };
 #endif
 
@@ -989,8 +1128,7 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
 	struct cn10k_ddr_pmu *ddr_pmu;
 	struct resource *res;
 	void __iomem *base;
-	bool is_cn10k;
-	bool is_ody;
+	unsigned int silicon_flags;
 	char *name;
 	int ret;
 
@@ -1014,10 +1152,9 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
 	ddr_pmu->base = base;
 
 	ddr_pmu->p_data = dev_data;
-	is_cn10k = ddr_pmu->p_data->is_cn10k;
-	is_ody = ddr_pmu->p_data->is_ody;
+	silicon_flags = ddr_pmu->p_data->silicon_flags;
 
-	if (is_cn10k) {
+	if (silicon_flags & IS_CN10K) {
 		ddr_pmu->ops = &ddr_pmu_ops;
 		/* Setup the PMU counter to work in manual mode */
 		writeq_relaxed(OP_MODE_CTRL_VAL_MANUAL, ddr_pmu->base +
@@ -1039,7 +1176,7 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
 		};
 	}
 
-	if (is_ody) {
+	if (silicon_flags & IS_ODY) {
 		ddr_pmu->ops = &ddr_pmu_ody_ops;
 
 		ddr_pmu->pmu = (struct pmu) {
@@ -1056,6 +1193,22 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
 		};
 	}
 
+	if (silicon_flags & IS_CN20K) {
+		ddr_pmu->ops = &ddr_pmu_ody_ops;
+
+		ddr_pmu->pmu = (struct pmu) {
+			.module       = THIS_MODULE,
+			.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
+			.task_ctx_nr = perf_invalid_context,
+			.attr_groups = cn20k_attr_groups,
+			.event_init  = cn10k_ddr_perf_event_init,
+			.add         = cn10k_ddr_perf_event_add,
+			.del         = cn10k_ddr_perf_event_del,
+			.start       = cn10k_ddr_perf_event_start,
+			.stop        = cn10k_ddr_perf_event_stop,
+			.read        = cn10k_ddr_perf_event_update,
+		};
+	}
 	/* Choose this cpu to collect perf data */
 	ddr_pmu->cpu = raw_smp_processor_id();
 
@@ -1098,6 +1251,7 @@ static void cn10k_ddr_perf_remove(struct platform_device *pdev)
 #ifdef CONFIG_OF
 static const struct of_device_id cn10k_ddr_pmu_of_match[] = {
 	{ .compatible = "marvell,cn10k-ddr-pmu", .data = &cn10k_ddr_pmu_pdata },
+	{ .compatible = "marvell,cn20k-ddr-pmu", .data = &cn20k_ddr_pmu_pdata },
 	{ },
 };
 MODULE_DEVICE_TABLE(of, cn10k_ddr_pmu_of_match);
@@ -1107,6 +1261,7 @@ MODULE_DEVICE_TABLE(of, cn10k_ddr_pmu_of_match);
 static const struct acpi_device_id cn10k_ddr_pmu_acpi_match[] = {
 	{"MRVL000A", (kernel_ulong_t)&cn10k_ddr_pmu_pdata },
 	{"MRVL000C", (kernel_ulong_t)&odyssey_ddr_pmu_pdata},
+	{"MRVL000B", (kernel_ulong_t)&cn20k_ddr_pmu_pdata},
 	{},
 };
 MODULE_DEVICE_TABLE(acpi, cn10k_ddr_pmu_acpi_match);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 1/2] dt-bindings: perf: marvell: Add CN20K DDR PMU binding
  2026-04-13 16:56 ` [PATCH v5 1/2] dt-bindings: perf: marvell: Add CN20K DDR PMU binding Geetha sowjanya
@ 2026-04-13 17:12   ` sashiko-bot
  2026-04-14  6:37     ` Krzysztof Kozlowski
  0 siblings, 1 reply; 6+ messages in thread
From: sashiko-bot @ 2026-04-13 17:12 UTC (permalink / raw)
  To: Geetha sowjanya; +Cc: linux-perf-users

Sashiko AI review found 2 potential issue(s):
- [Low] DT binding example unit address and generic node name mismatch.
- [Low] Unnecessary duplication of binding files instead of updating the existing schema.
--

commit e2d439a05897c67e7bc90170f7cf6e218505769a
Author: Geetha sowjanya <gakula@marvell.com>

dt-bindings: perf: marvell: Add CN20K DDR PMU binding

This commit adds a new device tree binding for the DDR Performance Monitoring
Unit found in Marvell CN20K SoCs. The hardware provides counters to monitor
DDR traffic and is accessed via a dedicated MMIO region.

> diff --git a/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml b/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml
> new file mode 100644
> index 0000000000000..cc6aa760de49e
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml

The commit message notes that the CN20K DDR PMU is functionally equivalent
to the CN10K DDR PMU. Does it make sense to consolidate this by adding the new
compatible string to the existing marvell-cn10k-ddr.yaml schema instead of
introducing a new duplicate file?

Splitting schemas is typically recommended only when conditionals become
too unwieldy.

[ ... ]

> +examples:
> +  - |
> +    bus {
> +        #address-cells = <2>;
> +        #size-cells = <2>;
> +
> +        ddr-pmu@c200000000 {
> +            compatible = "marvell,cn20k-ddr-pmu";
> +            reg = <0xc200 0x00000000 0x0 0x100000>;
> +        };
> +    };

Is there a unit address mismatch here? The 64-bit base address formed by the
reg property is 0xc20000000000, but the node name uses @c200000000, missing two
trailing zeros. This can trigger a unit_address_vs_reg regression during
dt_binding_check.

Also, does this example need to use a generic class-based node name like pmu
instead of ddr-pmu to match devicetree recommendations?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260413165621.10921-1-gakula@marvell.com?part=1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 2/2] perf: marvell: Add CN20K DDR PMU support
  2026-04-13 16:56 ` [PATCH v5 2/2] perf: marvell: Add CN20K DDR PMU support Geetha sowjanya
@ 2026-04-13 17:50   ` sashiko-bot
  0 siblings, 0 replies; 6+ messages in thread
From: sashiko-bot @ 2026-04-13 17:50 UTC (permalink / raw)
  To: Geetha sowjanya; +Cc: linux-perf-users

Sashiko AI review found 4 potential issue(s):
- [High] Missing Hardware Initialization for CN20K Programmable Counters.
- [High] Undefined Behavior via Unvalidated Event `config` on CN20K.
- [Medium] Event ID Conflicts and Incorrect Hardware Routing.
- [Medium] Broken Enable Logic and Stale Hardware State for `cfg1_base` Events.
--

commit 5466234dfcbce6dad8475395fbdf56b2dfd56308
Author: Geetha sowjanya <gakula@marvell.com>

    perf: marvell: Add CN20K DDR PMU support
    
    The CN20K DRAM Subsystem exposes eight programmable
    performance counters and two fixed counters for DDR
    read and write traffic.  Software selects events for
    the programmable counters from traffic at the DDR PHY
    interface, the CHI interconnect, or inside the DDR controller.

> diff --git a/drivers/perf/marvell_cn10k_ddr_pmu.c b/drivers/perf/marvell_cn10k_ddr_pmu.c
> index 72ac17efd846a..7e2e1823b009c 100644
> --- a/drivers/perf/marvell_cn10k_ddr_pmu.c
> +++ b/drivers/perf/marvell_cn10k_ddr_pmu.c
[ ... ]
> @@ -61,6 +73,23 @@
>   * DO NOT change these event-id numbers, they are used to
>   * program event bitmap in h/w.
>   */
> +
> +/* CN20K specific events */
> +#define EVENT_PERF_OP_IS_RD16			61
> +#define EVENT_PERF_OP_IS_RD32			60
> +#define EVENT_PERF_OP_IS_WR16			59
> +#define EVENT_PERF_OP_IS_WR32			58
> +#define EVENT_OP_IS_ENTER_DSM			44
> +#define EVENT_OP_IS_RFM				43
> +
> +#define EVENT_CN20K_OP_IS_TCR_MRR			50
> +#define EVENT_CN20K_OP_IS_DQSOSC_MRR			49
> +#define EVENT_CN20K_OP_IS_DQSOSC_MPC			48
> +#define EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_WR	47
> +#define EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_RD	46
> +#define EVENT_CN20K_OP_IS_ZQLATCH			21
> +#define EVENT_CN20K_OP_IS_ZQSTART			22
> +

Could these new EVENT_CN20K_OP_IS_ZQ* IDs conflict with existing events?

For example, EVENT_PRECHARGE_FOR_OTHER is also 21, and EVENT_RDWR_TRANSITIONS
is 22. Since they share the same ID value, wouldn't standard events 21 and 22
incorrectly trigger the cfg1_base routing logic added in
cn10k_ddr_perf_event_add() for CN20K?

[ ... ]
> @@ -524,9 +637,9 @@ static void cn10k_ddr_perf_counter_enable(struct cn10k_ddr_pmu *pmu,
>  					  int counter, bool enable)
>  {
>  	const struct ddr_pmu_platform_data *p_data = pmu->p_data;
> +	unsigned int silicon_flags = pmu->p_data->silicon_flags;
>  	u64 ctrl_reg = pmu->p_data->cnt_op_mode_ctrl;
>  	const struct ddr_pmu_ops *ops = pmu->ops;
> -	bool is_ody = pmu->p_data->is_ody;
>  	u32 reg;
>  	u64 val;
>  
> @@ -546,7 +659,7 @@ static void cn10k_ddr_perf_counter_enable(struct cn10k_ddr_pmu *pmu,
>  
>  		writeq_relaxed(val, pmu->base + reg);
>  
> -		if (is_ody) {
> +		if (silicon_flags & IS_ODY) {
>  			if (enable) {
>  				/*
>  				 * Setup the PMU counter to work in

Are CN20K programmable counters ever actually enabled?

Unlike CN10K, CN20K does not register global .pmu_enable or .pmu_disable
callbacks during probe. And since this check is only for IS_ODY, the hardware
counters are never configured to work in manual mode or started via
cn10k_ddr_perf_counter_start(), which might leave them non-functional.

Additionally, just before this block, the function applies EVENT_ENABLE
exclusively to cfg_base:

    reg = DDRC_PERF_CFG(p_data->cfg_base, counter);
    val = readq_relaxed(pmu->base + reg);

    if (enable)
        val |= EVENT_ENABLE;

How are the ZQSTART and ZQLATCH events handled, since they are supposed to be
written to cfg1_base? It looks like EVENT_ENABLE would inadvertently be
applied to cfg_base instead.

[ ... ]
> @@ -642,10 +756,17 @@ static int cn10k_ddr_perf_event_add(struct perf_event *event, int flags)
>  	if (counter < DDRC_PERF_NUM_GEN_COUNTERS) {
>  		/* Generic counters, configure event id */
>  		reg_offset = DDRC_PERF_CFG(p_data->cfg_base, counter);
> -		ret = ddr_perf_get_event_bitmap(config, &val, pmu);
> -		if (ret)
> -			return ret;
>  
> +		if (silicon_flags & IS_CN20K) {
> +			val =  (1ULL << (config - 1));
> +			if (config == EVENT_CN20K_OP_IS_ZQSTART ||
> +			    config == EVENT_CN20K_OP_IS_ZQLATCH)
> +				reg_offset = DDRC_PERF_CFG(p_data->cfg1_base, counter);
> +		} else {
> +			ret = ddr_perf_get_event_bitmap(config, &val, pmu);
> +			if (ret)
> +				return ret;
> +		}
>  		writeq_relaxed(val, pmu->base + reg_offset);
>  	} else {

Is it possible for the config variable (which comes from event->attr.config)
to cause undefined behavior here?

Since the CN20K path bypasses ddr_perf_get_event_bitmap(), it lacks
bounds checking on config. If a user provides config = 0, config - 1
could result in a negative shift. If config >= 65, the shift amount
would exceed the width of the 64-bit operand.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260413165621.10921-1-gakula@marvell.com?part=2

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 1/2] dt-bindings: perf: marvell: Add CN20K DDR PMU binding
  2026-04-13 17:12   ` sashiko-bot
@ 2026-04-14  6:37     ` Krzysztof Kozlowski
  0 siblings, 0 replies; 6+ messages in thread
From: Krzysztof Kozlowski @ 2026-04-14  6:37 UTC (permalink / raw)
  To: sashiko; +Cc: Geetha sowjanya, linux-perf-users

On Mon, Apr 13, 2026 at 05:12:06PM +0000, sashiko-bot@kernel.org wrote:
> Sashiko AI review found 2 potential issue(s):
> - [Low] DT binding example unit address and generic node name mismatch.
> - [Low] Unnecessary duplication of binding files instead of updating the existing schema.
> --
> 
> commit e2d439a05897c67e7bc90170f7cf6e218505769a
> Author: Geetha sowjanya <gakula@marvell.com>
> 
> dt-bindings: perf: marvell: Add CN20K DDR PMU binding
> 
> This commit adds a new device tree binding for the DDR Performance Monitoring
> Unit found in Marvell CN20K SoCs. The hardware provides counters to monitor
> DDR traffic and is accessed via a dedicated MMIO region.
> 
> > diff --git a/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml b/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml
> > new file mode 100644
> > index 0000000000000..cc6aa760de49e
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml
> 
> The commit message notes that the CN20K DDR PMU is functionally equivalent
> to the CN10K DDR PMU. Does it make sense to consolidate this by adding the new
> compatible string to the existing marvell-cn10k-ddr.yaml schema instead of
> introducing a new duplicate file?
> 

That's valid comment, I did not notice there is a file which is almost
identical. Why isn't this part of that file?

> Splitting schemas is typically recommended only when conditionals become
> too unwieldy.
> 
> [ ... ]
> 
> > +examples:
> > +  - |
> > +    bus {
> > +        #address-cells = <2>;
> > +        #size-cells = <2>;
> > +
> > +        ddr-pmu@c200000000 {
> > +            compatible = "marvell,cn20k-ddr-pmu";
> > +            reg = <0xc200 0x00000000 0x0 0x100000>;
> > +        };
> > +    };
> 
> Is there a unit address mismatch here? The 64-bit base address formed by the
> reg property is 0xc20000000000, but the node name uses @c200000000, missing two
> trailing zeros. This can trigger a unit_address_vs_reg regression during
> dt_binding_check.
> 

Also valid.

Best regards,
Krzysztof


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-04-14  6:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-13 16:56 [PATCH v5 0/2] perf: marvell: Add CN20K DDR PMU support Geetha sowjanya
2026-04-13 16:56 ` [PATCH v5 1/2] dt-bindings: perf: marvell: Add CN20K DDR PMU binding Geetha sowjanya
2026-04-13 17:12   ` sashiko-bot
2026-04-14  6:37     ` Krzysztof Kozlowski
2026-04-13 16:56 ` [PATCH v5 2/2] perf: marvell: Add CN20K DDR PMU support Geetha sowjanya
2026-04-13 17:50   ` sashiko-bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox