* [PATCH v2 0/3] perf: marvell: LLC-TAD PMU MPAM filtering support
@ 2026-06-12 9:57 Geetha sowjanya
2026-06-12 9:57 ` [PATCH v2 1/3] perf: marvell: Add MPAM partid filtering to CN10K TAD PMU Geetha sowjanya
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Geetha sowjanya @ 2026-06-12 9:57 UTC (permalink / raw)
To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
Cc: mark.rutland, will, krzk+dt, gakula
This series extends the Marvell LLC-TAD PMU driver for CN10K and CN20K
platforms by adding MPAM-based filtering support and introducing CN20K
hardware support.
Patch 1 adds optional MPAM partition-id (partid) filtering for the subset
of events that support it. The partid and partid_en fields are exposed via
the PMU format attribute, while platforms that do not support filtering
continue to expose a reduced event set without these fields.
This patch also includes several fixes and cleanups:
- Avoid modifying platform_get_resource() bounds in-place
- Validate the MMIO window size against tad-cnt
- Correct ordering of perf registration and CPU hotplug with proper unwind
- Align the filter-enable bit in config1 with the sysfs format (bit 9)
Patch 2 adds support for the CN20K LLC-TAD PMU. Compared to CN10K, CN20K
uses different PFC/PRF register offsets and introduces additional events.
This patch:
- Adds a CN20K (V3) profile with platform-specific register offsets
- Extends the event map and hides CN20K-only events on CN10K
- Implements CN20K-specific MPAM encoding for filtering
- Ensures correct counter initialization using local64_set(prev_count)
- Adds device discovery via OF and ACPI (MRVL000F)
Patch 3 updates the Devicetree binding documentation to add support for
"marvell,cn20k-tad-pmu"
Changes since v1
----------------
- config1: use bit 9 for MPAM filter enable consistently with partid_en in
the PMU format; allow only bits 0..9 in event_init on CN10K/CN20K paths.
- Hide V3-only sysfs events on V1.
- Reset prev_count when starting counters after clearing hardware.
- DT binding: explain non-fallback compatibles for CN10K vs CN20K.
Tanmay Jagdale (1):
perf: marvell: Add MPAM partid filtering to CN10K TAD PMU
Geetha sowjanya (2):
perf: marvell: Add CN20K LLC-TAD PMU support
dt-bindings: perf: marvell: Extend CN10K TAD PMU binding for CN20K
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
--
2.25.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2 1/3] perf: marvell: Add MPAM partid filtering to CN10K TAD PMU
2026-06-12 9:57 [PATCH v2 0/3] perf: marvell: LLC-TAD PMU MPAM filtering support Geetha sowjanya
@ 2026-06-12 9:57 ` Geetha sowjanya
2026-06-12 10:14 ` sashiko-bot
2026-06-12 9:57 ` [PATCH v2 2/3] perf: marvell: Add CN20K LLC-TAD PMU support Geetha sowjanya
2026-06-12 9:57 ` [PATCH v2 3/3] dt-bindings: perf: marvell: add CN20K TAD " Geetha sowjanya
2 siblings, 1 reply; 6+ messages in thread
From: Geetha sowjanya @ 2026-06-12 9:57 UTC (permalink / raw)
To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
Cc: mark.rutland, will, krzk+dt, gakula
From: Tanmay Jagdale <tanmay@marvell.com>
The TAD PMU exposes counters that can be filtered by MPAM partition id
for a subset of allocation and hit events.
Add a 9-bit partid format attribute (config1) and route counter programming
through variant-specific ops so CN10K keeps MPAM-capable programming while
Odyssey keeps the reduced event set without advertising partid in sysfs.
Probe no longer mutates the platform_device MMIO resource (walk a local
map_start), rejects tad-cnt / page sizes of zero, validates the memory
window against tad-cnt, and registers the perf PMU before hotplug with
correct unwind.
Example:
perf stat -e tad/tad_alloc_any,partid=0x12,partid_en=1/ -- <program>
Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
---
Changelog (since v1)
--------------------
- Fix config1 filter enable to use bit 9 consistently with the PMU format
string (partid_en) and reject reserved bits with GENMASK(9, 0).
- Register perf_pmu_register before cpuhp_state_add_instance_nocalls and
unregister on hotplug failure.
drivers/perf/marvell_cn10k_tad_pmu.c | 212 ++++++++++++++++++++-------
1 file changed, 160 insertions(+), 52 deletions(-)
diff --git a/drivers/perf/marvell_cn10k_tad_pmu.c b/drivers/perf/marvell_cn10k_tad_pmu.c
index 51ccb0befa05..af706b890bf1 100644
--- a/drivers/perf/marvell_cn10k_tad_pmu.c
+++ b/drivers/perf/marvell_cn10k_tad_pmu.c
@@ -7,6 +7,7 @@
#define pr_fmt(fmt) "tad_pmu: " fmt
#include <linux/io.h>
+#include <linux/bits.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/cpuhotplug.h>
@@ -14,11 +15,18 @@
#include <linux/platform_device.h>
#include <linux/acpi.h>
-#define TAD_PFC_OFFSET 0x800
-#define TAD_PFC(counter) (TAD_PFC_OFFSET | (counter << 3))
#define TAD_PRF_OFFSET 0x900
-#define TAD_PRF(counter) (TAD_PRF_OFFSET | (counter << 3))
+#define TAD_PFC_OFFSET 0x800
+#define TAD_PFC(base, counter) ((base) | ((u64)(counter) << 3))
+#define TAD_PRF(base, counter) ((base) | ((u64)(counter) << 3))
#define TAD_PRF_CNTSEL_MASK 0xFF
+#define TAD_PRF_MATCH_PARTID BIT(8)
+#define TAD_PRF_PARTID_NS BIT(10)
+/*
+ * config1: bits 0..8 MPAM partition id (including 0); bit 9 requests
+ * filtering for MPAM-capable events. All-zero config1 means no filter.
+ */
+#define TAD_PARTID_FILTER_EN BIT(9)
#define TAD_MAX_COUNTERS 8
#define to_tad_pmu(p) (container_of(p, struct tad_pmu, pmu))
@@ -27,30 +35,92 @@ struct tad_region {
void __iomem *base;
};
+enum mrvl_tad_pmu_version {
+ TAD_PMU_V1 = 1,
+ TAD_PMU_V2,
+};
+
+struct tad_pmu_data {
+ int id;
+ u64 tad_prf_offset;
+ u64 tad_pfc_offset;
+};
+
struct tad_pmu {
struct pmu pmu;
struct tad_region *regions;
u32 region_cnt;
unsigned int cpu;
+ const struct tad_pmu_ops *ops;
+ const struct tad_pmu_data *pdata;
struct hlist_node node;
struct perf_event *events[TAD_MAX_COUNTERS];
DECLARE_BITMAP(counters_map, TAD_MAX_COUNTERS);
};
-enum mrvl_tad_pmu_version {
- TAD_PMU_V1 = 1,
- TAD_PMU_V2,
-};
-
-struct tad_pmu_data {
- int id;
+struct tad_pmu_ops {
+ void (*start_counter)(struct tad_pmu *pmu, struct perf_event *event);
};
static int tad_pmu_cpuhp_state;
+static void tad_pmu_start_counter(struct tad_pmu *pmu,
+ struct perf_event *event)
+{
+ const struct tad_pmu_data *pdata = pmu->pdata;
+ struct hw_perf_event *hwc = &event->hw;
+ u32 event_idx = event->attr.config;
+ u32 counter_idx = hwc->idx;
+ u64 partid_filter = 0;
+ u64 reg_val;
+ u64 cfg1 = event->attr.config1;
+ bool use_mpam = cfg1 & TAD_PARTID_FILTER_EN;
+ u32 partid = (u32)(cfg1 & GENMASK(8, 0));
+ int i;
+
+ for (i = 0; i < pmu->region_cnt; i++)
+ writeq_relaxed(0, pmu->regions[i].base +
+ TAD_PFC(pdata->tad_pfc_offset, counter_idx));
+
+ if (use_mpam && event_idx > 0x19 && event_idx < 0x21) {
+ partid_filter = TAD_PRF_MATCH_PARTID | TAD_PRF_PARTID_NS |
+ ((u64)partid << 11);
+ }
+
+
+ for (i = 0; i < pmu->region_cnt; i++) {
+ reg_val = event_idx & 0xFF;
+ reg_val |= partid_filter;
+ writeq_relaxed(reg_val, pmu->regions[i].base +
+ TAD_PRF(pdata->tad_prf_offset, counter_idx));
+ }
+}
+
+static void tad_pmu_v2_start_counter(struct tad_pmu *pmu,
+ struct perf_event *event)
+{
+ const struct tad_pmu_data *pdata = pmu->pdata;
+ struct hw_perf_event *hwc = &event->hw;
+ u32 event_idx = event->attr.config;
+ u32 counter_idx = hwc->idx;
+ u64 reg_val;
+ int i;
+
+ for (i = 0; i < pmu->region_cnt; i++)
+ writeq_relaxed(0, pmu->regions[i].base +
+ TAD_PFC(pdata->tad_pfc_offset, counter_idx));
+
+ for (i = 0; i < pmu->region_cnt; i++) {
+ reg_val = event_idx & 0xFF;
+ writeq_relaxed(reg_val, pmu->regions[i].base +
+ TAD_PRF(pdata->tad_prf_offset, counter_idx));
+ }
+}
+
static void tad_pmu_event_counter_read(struct perf_event *event)
{
struct tad_pmu *tad_pmu = to_tad_pmu(event->pmu);
+ const struct tad_pmu_data *pdata = tad_pmu->pdata;
struct hw_perf_event *hwc = &event->hw;
u32 counter_idx = hwc->idx;
u64 prev, new;
@@ -60,7 +130,7 @@ static void tad_pmu_event_counter_read(struct perf_event *event)
prev = local64_read(&hwc->prev_count);
for (i = 0, new = 0; i < tad_pmu->region_cnt; i++)
new += readq(tad_pmu->regions[i].base +
- TAD_PFC(counter_idx));
+ TAD_PFC(pdata->tad_pfc_offset, counter_idx));
} while (local64_cmpxchg(&hwc->prev_count, prev, new) != prev);
local64_add(new - prev, &event->count);
@@ -69,16 +139,14 @@ static void tad_pmu_event_counter_read(struct perf_event *event)
static void tad_pmu_event_counter_stop(struct perf_event *event, int flags)
{
struct tad_pmu *tad_pmu = to_tad_pmu(event->pmu);
+ const struct tad_pmu_data *pdata = tad_pmu->pdata;
struct hw_perf_event *hwc = &event->hw;
u32 counter_idx = hwc->idx;
int i;
- /* TAD()_PFC() stop counting on the write
- * which sets TAD()_PRF()[CNTSEL] == 0
- */
for (i = 0; i < tad_pmu->region_cnt; i++) {
writeq_relaxed(0, tad_pmu->regions[i].base +
- TAD_PRF(counter_idx));
+ TAD_PRF(pdata->tad_prf_offset, counter_idx));
}
tad_pmu_event_counter_read(event);
@@ -89,26 +157,10 @@ static void tad_pmu_event_counter_start(struct perf_event *event, int flags)
{
struct tad_pmu *tad_pmu = to_tad_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw;
- u32 event_idx = event->attr.config;
- u32 counter_idx = hwc->idx;
- u64 reg_val;
- int i;
hwc->state = 0;
- /* Typically TAD_PFC() are zeroed to start counting */
- for (i = 0; i < tad_pmu->region_cnt; i++)
- writeq_relaxed(0, tad_pmu->regions[i].base +
- TAD_PFC(counter_idx));
-
- /* TAD()_PFC() start counting on the write
- * which sets TAD()_PRF()[CNTSEL] != 0
- */
- for (i = 0; i < tad_pmu->region_cnt; i++) {
- reg_val = event_idx & 0xFF;
- writeq_relaxed(reg_val, tad_pmu->regions[i].base +
- TAD_PRF(counter_idx));
- }
+ tad_pmu->ops->start_counter(tad_pmu, event);
}
static void tad_pmu_event_counter_del(struct perf_event *event, int flags)
@@ -128,7 +180,6 @@ static int tad_pmu_event_counter_add(struct perf_event *event, int flags)
struct hw_perf_event *hwc = &event->hw;
int idx;
- /* Get a free counter for this event */
idx = find_first_zero_bit(tad_pmu->counters_map, TAD_MAX_COUNTERS);
if (idx == TAD_MAX_COUNTERS)
return -EAGAIN;
@@ -148,6 +199,9 @@ static int tad_pmu_event_counter_add(struct perf_event *event, int flags)
static int tad_pmu_event_init(struct perf_event *event)
{
struct tad_pmu *tad_pmu = to_tad_pmu(event->pmu);
+ const struct tad_pmu_data *pdata = tad_pmu->pdata;
+ u32 event_idx = (u32)(event->attr.config & GENMASK(7, 0));
+ u64 cfg1 = event->attr.config1;
if (event->attr.type != event->pmu->type)
return -ENOENT;
@@ -158,6 +212,20 @@ static int tad_pmu_event_init(struct perf_event *event)
if (event->state != PERF_EVENT_STATE_OFF)
return -EINVAL;
+ if (pdata->id == TAD_PMU_V2) {
+ if (cfg1)
+ return -EINVAL;
+ } else {
+ if ((cfg1 & GENMASK(8, 0)) && !(cfg1 & TAD_PARTID_FILTER_EN))
+ return -EINVAL;
+ if (cfg1 & TAD_PARTID_FILTER_EN) {
+ if (event_idx <= 0x19 || event_idx >= 0x21)
+ return -EINVAL;
+ }
+ if (cfg1 & ~GENMASK(9, 0))
+ return -EINVAL;
+ }
+
event->cpu = tad_pmu->cpu;
event->hw.idx = -1;
event->hw.config_base = event->attr.config;
@@ -232,7 +300,7 @@ static struct attribute *ody_tad_pmu_event_attrs[] = {
TAD_PMU_EVENT_ATTR(tad_hit_ltg, 0x1e),
TAD_PMU_EVENT_ATTR(tad_hit_any, 0x1f),
TAD_PMU_EVENT_ATTR(tad_tag_rd, 0x20),
- TAD_PMU_EVENT_ATTR(tad_tot_cycle, 0xFF),
+ TAD_PMU_EVENT_ATTR(tad_tot_cycle, 0xff),
NULL
};
@@ -242,9 +310,13 @@ static const struct attribute_group ody_tad_pmu_events_attr_group = {
};
PMU_FORMAT_ATTR(event, "config:0-7");
+PMU_FORMAT_ATTR(partid, "config1:0-8");
+PMU_FORMAT_ATTR(partid_en, "config1:9-9");
static struct attribute *tad_pmu_format_attrs[] = {
&format_attr_event.attr,
+ &format_attr_partid.attr,
+ &format_attr_partid_en.attr,
NULL
};
@@ -253,6 +325,16 @@ static struct attribute_group tad_pmu_format_attr_group = {
.attrs = tad_pmu_format_attrs,
};
+static struct attribute *ody_tad_pmu_format_attrs[] = {
+ &format_attr_event.attr,
+ NULL
+};
+
+static struct attribute_group ody_tad_pmu_format_attr_group = {
+ .name = "format",
+ .attrs = ody_tad_pmu_format_attrs,
+};
+
static ssize_t tad_pmu_cpumask_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
@@ -281,16 +363,25 @@ static const struct attribute_group *tad_pmu_attr_groups[] = {
static const struct attribute_group *ody_tad_pmu_attr_groups[] = {
&ody_tad_pmu_events_attr_group,
- &tad_pmu_format_attr_group,
+ &ody_tad_pmu_format_attr_group,
&tad_pmu_cpumask_attr_group,
NULL
};
+static const struct tad_pmu_ops tad_pmu_ops = {
+ .start_counter = tad_pmu_start_counter,
+};
+
+static const struct tad_pmu_ops tad_pmu_v2_ops = {
+ .start_counter = tad_pmu_v2_start_counter,
+};
+
static int tad_pmu_probe(struct platform_device *pdev)
{
const struct tad_pmu_data *dev_data;
struct device *dev = &pdev->dev;
struct tad_region *regions;
+ resource_size_t map_start;
struct tad_pmu *tad_pmu;
struct resource *res;
u32 tad_pmu_page_size;
@@ -298,7 +389,6 @@ static int tad_pmu_probe(struct platform_device *pdev)
u32 tad_cnt;
int version;
int i, ret;
- char *name;
tad_pmu = devm_kzalloc(&pdev->dev, sizeof(*tad_pmu), GFP_KERNEL);
if (!tad_pmu)
@@ -312,6 +402,7 @@ static int tad_pmu_probe(struct platform_device *pdev)
return -ENODEV;
}
version = dev_data->id;
+ tad_pmu->pdata = dev_data;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res) {
@@ -338,22 +429,31 @@ static int tad_pmu_probe(struct platform_device *pdev)
dev_err(&pdev->dev, "Can't find tad-cnt property\n");
return ret;
}
+ if (!tad_cnt || !tad_page_size || !tad_pmu_page_size) {
+ dev_err(&pdev->dev, "Invalid tad-cnt or page size\n");
+ return -EINVAL;
+ }
regions = devm_kcalloc(&pdev->dev, tad_cnt,
sizeof(*regions), GFP_KERNEL);
if (!regions)
return -ENOMEM;
- /* ioremap the distributed TAD pmu regions */
- for (i = 0; i < tad_cnt && res->start < res->end; i++) {
- regions[i].base = devm_ioremap(&pdev->dev,
- res->start,
+ map_start = res->start;
+ for (i = 0; i < tad_cnt; i++) {
+ if (map_start > res->end ||
+ tad_pmu_page_size > (resource_size_t)(res->end - map_start + 1)) {
+ dev_err(&pdev->dev, "TAD PMU mem window too small for tad-cnt=%u\n",
+ tad_cnt);
+ return -EINVAL;
+ }
+ regions[i].base = devm_ioremap(&pdev->dev, map_start,
tad_pmu_page_size);
if (!regions[i].base) {
dev_err(&pdev->dev, "TAD%d ioremap fail\n", i);
return -ENOMEM;
}
- res->start += tad_page_size;
+ map_start += tad_page_size;
}
tad_pmu->regions = regions;
@@ -374,28 +474,31 @@ static int tad_pmu_probe(struct platform_device *pdev)
.read = tad_pmu_event_counter_read,
};
- if (version == TAD_PMU_V1)
+ if (version == TAD_PMU_V1) {
tad_pmu->pmu.attr_groups = tad_pmu_attr_groups;
- else
+ tad_pmu->ops = &tad_pmu_ops;
+ } else {
tad_pmu->pmu.attr_groups = ody_tad_pmu_attr_groups;
+ tad_pmu->ops = &tad_pmu_v2_ops;
+ }
tad_pmu->cpu = raw_smp_processor_id();
- /* Register pmu instance for cpu hotplug */
+ ret = perf_pmu_register(&tad_pmu->pmu, "tad", -1);
+ if (ret) {
+ dev_err(&pdev->dev, "Error %d registering perf PMU\n", ret);
+ return ret;
+ }
+
ret = cpuhp_state_add_instance_nocalls(tad_pmu_cpuhp_state,
&tad_pmu->node);
if (ret) {
dev_err(&pdev->dev, "Error %d registering hotplug\n", ret);
+ perf_pmu_unregister(&tad_pmu->pmu);
return ret;
}
- name = "tad";
- ret = perf_pmu_register(&tad_pmu->pmu, name, -1);
- if (ret)
- cpuhp_state_remove_instance_nocalls(tad_pmu_cpuhp_state,
- &tad_pmu->node);
-
- return ret;
+ return 0;
}
static void tad_pmu_remove(struct platform_device *pdev)
@@ -410,12 +513,17 @@ static void tad_pmu_remove(struct platform_device *pdev)
#if defined(CONFIG_OF) || defined(CONFIG_ACPI)
static const struct tad_pmu_data tad_pmu_data = {
.id = TAD_PMU_V1,
+ .tad_prf_offset = TAD_PRF_OFFSET,
+ .tad_pfc_offset = TAD_PFC_OFFSET,
};
+
#endif
#ifdef CONFIG_ACPI
static const struct tad_pmu_data tad_pmu_v2_data = {
.id = TAD_PMU_V2,
+ .tad_prf_offset = TAD_PRF_OFFSET,
+ .tad_pfc_offset = TAD_PFC_OFFSET,
};
#endif
@@ -491,6 +599,6 @@ static void __exit tad_pmu_exit(void)
module_init(tad_pmu_init);
module_exit(tad_pmu_exit);
-MODULE_DESCRIPTION("Marvell CN10K LLC-TAD Perf driver");
+MODULE_DESCRIPTION("Marvell CN10K LLC-TAD perf driver");
MODULE_AUTHOR("Bhaskara Budiredla <bbudiredla@marvell.com>");
MODULE_LICENSE("GPL v2");
--
2.25.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 2/3] perf: marvell: Add CN20K LLC-TAD PMU support
2026-06-12 9:57 [PATCH v2 0/3] perf: marvell: LLC-TAD PMU MPAM filtering support Geetha sowjanya
2026-06-12 9:57 ` [PATCH v2 1/3] perf: marvell: Add MPAM partid filtering to CN10K TAD PMU Geetha sowjanya
@ 2026-06-12 9:57 ` Geetha sowjanya
2026-06-12 10:26 ` sashiko-bot
2026-06-12 9:57 ` [PATCH v2 3/3] dt-bindings: perf: marvell: add CN20K TAD " Geetha sowjanya
2 siblings, 1 reply; 6+ messages in thread
From: Geetha sowjanya @ 2026-06-12 9:57 UTC (permalink / raw)
To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
Cc: mark.rutland, will, krzk+dt, gakula
Add support for the LLC Tag-and-Data (TAD) PMU present in
Marvell CN20K SoCs.
The CN20K TAD PMU is based on the CN10K design but differs in the
layout of PFC/PRF register offsets relative to each TAD base, and
introduces additional events. These offsets are selected by the driver
based on the compatible string and are not described via DT properties.
Because of this, "marvell,cn10k-tad-pmu" cannot be used as a fallback
for CN20K, as it would result in incorrect register programming.
Add support for "marvell,cn20k-tad-pmu" by:
- Introducing a TAD_PMU_V3 profile with CN20K-specific register bases
- Extending the event map for new CN20K events
- Matching the PMU via OF and ACPI (MRVL000F)
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
---
Changelog (since v1)
--------------------
- Hide V3-only events on CN10K via sysfs is_visible and reject them in
event_init.
- Use CN20K-specific MPAM PRF bits (MATCH_MPAMNS, partid << 10) for V3;
software partid is limited to nine bits so this does not collide with
the fixed bit at 25.
- Reset hwc->prev_count when starting counters so reads match cleared HW.
drivers/perf/marvell_cn10k_tad_pmu.c | 54 ++++++++++++++++++++++++++--
1 file changed, 52 insertions(+), 2 deletions(-)
diff --git a/drivers/perf/marvell_cn10k_tad_pmu.c b/drivers/perf/marvell_cn10k_tad_pmu.c
index af706b890bf1..e43598a52859 100644
--- a/drivers/perf/marvell_cn10k_tad_pmu.c
+++ b/drivers/perf/marvell_cn10k_tad_pmu.c
@@ -17,11 +17,14 @@
#define TAD_PRF_OFFSET 0x900
#define TAD_PFC_OFFSET 0x800
+#define TAD_PRF_NS_OFFSET 0x30900
+#define TAD_PFC_NS_OFFSET 0x30800
#define TAD_PFC(base, counter) ((base) | ((u64)(counter) << 3))
#define TAD_PRF(base, counter) ((base) | ((u64)(counter) << 3))
#define TAD_PRF_CNTSEL_MASK 0xFF
#define TAD_PRF_MATCH_PARTID BIT(8)
#define TAD_PRF_PARTID_NS BIT(10)
+#define TAD_PRF_MATCH_MPAMNS BIT(25)
/*
* config1: bits 0..8 MPAM partition id (including 0); bit 9 requests
* filtering for MPAM-capable events. All-zero config1 means no filter.
@@ -38,6 +41,7 @@ struct tad_region {
enum mrvl_tad_pmu_version {
TAD_PMU_V1 = 1,
TAD_PMU_V2,
+ TAD_PMU_V3,
};
struct tad_pmu_data {
@@ -85,8 +89,15 @@ static void tad_pmu_start_counter(struct tad_pmu *pmu,
if (use_mpam && event_idx > 0x19 && event_idx < 0x21) {
partid_filter = TAD_PRF_MATCH_PARTID | TAD_PRF_PARTID_NS |
((u64)partid << 11);
+
+ if (pdata->id == TAD_PMU_V3)
+ partid_filter = TAD_PRF_MATCH_PARTID | TAD_PRF_MATCH_MPAMNS |
+ ((u64)partid << 10);
}
+ /* CN10K support events 0:24*/
+ if (pdata->id == TAD_PMU_V1 && event_idx >= 0x25)
+ return;
for (i = 0; i < pmu->region_cnt; i++) {
reg_val = event_idx & 0xFF;
@@ -159,6 +170,7 @@ static void tad_pmu_event_counter_start(struct perf_event *event, int flags)
struct hw_perf_event *hwc = &event->hw;
hwc->state = 0;
+ local64_set(&hwc->prev_count, 0);
tad_pmu->ops->start_counter(tad_pmu, event);
}
@@ -216,6 +228,8 @@ static int tad_pmu_event_init(struct perf_event *event)
if (cfg1)
return -EINVAL;
} else {
+ if (pdata->id == TAD_PMU_V1 && event_idx >= 0x25)
+ return -EINVAL;
if ((cfg1 & GENMASK(8, 0)) && !(cfg1 & TAD_PARTID_FILTER_EN))
return -EINVAL;
if (cfg1 & TAD_PARTID_FILTER_EN) {
@@ -242,6 +256,22 @@ static ssize_t tad_pmu_event_show(struct device *dev,
return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id);
}
+static umode_t tad_pmu_event_attr_is_visible(struct kobject *kobj,
+ struct attribute *attr, int unused)
+{
+ struct pmu *pmu = dev_get_drvdata(kobj_to_dev(kobj));
+ struct tad_pmu *t = to_tad_pmu(pmu);
+ struct device_attribute *da = container_of(attr, struct device_attribute,
+ attr);
+ struct perf_pmu_events_attr *e = container_of(da, struct perf_pmu_events_attr,
+ attr);
+ u64 id = e->id;
+
+ if (t->pdata->id != TAD_PMU_V3 && id >= 0x25)
+ return 0;
+ return attr->mode;
+}
+
#define TAD_PMU_EVENT_ATTR(name, config) \
PMU_EVENT_ATTR_ID(name, tad_pmu_event_show, config)
@@ -283,12 +313,25 @@ static struct attribute *tad_pmu_event_attrs[] = {
TAD_PMU_EVENT_ATTR(tad_dat_rd_byp, 0x22),
TAD_PMU_EVENT_ATTR(tad_ifb_occ, 0x23),
TAD_PMU_EVENT_ATTR(tad_req_occ, 0x24),
+ TAD_PMU_EVENT_ATTR(tad_req_msh_out_dtg_evict, 0x25),
+ TAD_PMU_EVENT_ATTR(tad_req_msh_out_ltg_evict, 0x26),
+ TAD_PMU_EVENT_ATTR(tad_rsp_msh_out_mpam, 0x28),
+ TAD_PMU_EVENT_ATTR(tad_replays, 0x29),
+ TAD_PMU_EVENT_ATTR(tad_req_byp0, 0x2a),
+ TAD_PMU_EVENT_ATTR(tad_req_byp1, 0x2b),
+ TAD_PMU_EVENT_ATTR(tad_txreq_byp, 0x2c),
+ TAD_PMU_EVENT_ATTR(tad_time_in_dslp, 0x2d),
+ TAD_PMU_EVENT_ATTR(tad_time_elapsed, 0x2e),
+ TAD_PMU_EVENT_ATTR(tad_req_msh_out_dss_rd_128mrg, 0x2f),
+ TAD_PMU_EVENT_ATTR(tad_req_msh_out_dss_wr_128mrg, 0x30),
+ TAD_PMU_EVENT_ATTR(tad_tot_cycle, 0xff),
NULL
};
static const struct attribute_group tad_pmu_events_attr_group = {
.name = "events",
.attrs = tad_pmu_event_attrs,
+ .is_visible = tad_pmu_event_attr_is_visible,
};
static struct attribute *ody_tad_pmu_event_attrs[] = {
@@ -474,7 +517,7 @@ static int tad_pmu_probe(struct platform_device *pdev)
.read = tad_pmu_event_counter_read,
};
- if (version == TAD_PMU_V1) {
+ if (version == TAD_PMU_V1 || version == TAD_PMU_V3) {
tad_pmu->pmu.attr_groups = tad_pmu_attr_groups;
tad_pmu->ops = &tad_pmu_ops;
} else {
@@ -517,6 +560,11 @@ static const struct tad_pmu_data tad_pmu_data = {
.tad_pfc_offset = TAD_PFC_OFFSET,
};
+static const struct tad_pmu_data tad_pmu_cn20k_data = {
+ .id = TAD_PMU_V3,
+ .tad_prf_offset = TAD_PRF_NS_OFFSET,
+ .tad_pfc_offset = TAD_PFC_NS_OFFSET,
+};
#endif
#ifdef CONFIG_ACPI
@@ -530,6 +578,7 @@ static const struct tad_pmu_data tad_pmu_v2_data = {
#ifdef CONFIG_OF
static const struct of_device_id tad_pmu_of_match[] = {
{ .compatible = "marvell,cn10k-tad-pmu", .data = &tad_pmu_data },
+ { .compatible = "marvell,cn20k-tad-pmu", .data = &tad_pmu_cn20k_data },
{},
};
#endif
@@ -538,6 +587,7 @@ static const struct of_device_id tad_pmu_of_match[] = {
static const struct acpi_device_id tad_pmu_acpi_match[] = {
{"MRVL000B", (kernel_ulong_t)&tad_pmu_data},
{"MRVL000D", (kernel_ulong_t)&tad_pmu_v2_data},
+ {"MRVL000F", (kernel_ulong_t)&tad_pmu_cn20k_data},
{},
};
MODULE_DEVICE_TABLE(acpi, tad_pmu_acpi_match);
@@ -599,6 +649,6 @@ static void __exit tad_pmu_exit(void)
module_init(tad_pmu_init);
module_exit(tad_pmu_exit);
-MODULE_DESCRIPTION("Marvell CN10K LLC-TAD perf driver");
+MODULE_DESCRIPTION("Marvell CN10K/CN20K LLC-TAD perf driver");
MODULE_AUTHOR("Bhaskara Budiredla <bbudiredla@marvell.com>");
MODULE_LICENSE("GPL v2");
--
2.25.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 3/3] dt-bindings: perf: marvell: add CN20K TAD PMU support
2026-06-12 9:57 [PATCH v2 0/3] perf: marvell: LLC-TAD PMU MPAM filtering support Geetha sowjanya
2026-06-12 9:57 ` [PATCH v2 1/3] perf: marvell: Add MPAM partid filtering to CN10K TAD PMU Geetha sowjanya
2026-06-12 9:57 ` [PATCH v2 2/3] perf: marvell: Add CN20K LLC-TAD PMU support Geetha sowjanya
@ 2026-06-12 9:57 ` Geetha sowjanya
2 siblings, 0 replies; 6+ messages in thread
From: Geetha sowjanya @ 2026-06-12 9:57 UTC (permalink / raw)
To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
Cc: mark.rutland, will, krzk+dt, gakula
Marvell CN20K SoCs integrate a Performance Monitoring Unit (PMU)
associated with the LLC Tag-and-Data (TAD) blocks. The PMU provides
hardware counters to monitor cache traffic and performance events
via a dedicated MMIO region.
The CN20K LLC-TAD PMU is largely similar to CN10K, but differs in the
layout of PFC/PRF register offsets relative to each TAD base. These
offsets are derived from the compatible string in the driver and are
not described through Devicetree properties.
Because of this, using "marvell,cn10k-tad-pmu" as a fallback for CN20K
would result in incorrect register programming. Therefore, add a
separate compatible string:
"marvell,cn20k-tad-pmu"
Update the binding to document CN20K alongside CN10K.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
---
Documentation/devicetree/bindings/perf/marvell-cn10k-tad.yaml | 17 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/Documentation/devicetree/bindings/perf/marvell-cn10k-tad.yaml b/Documentation/devicetree/bindings/perf/marvell-cn10k-tad.yaml
index 362142252667..d11121a1e2c9 100644
--- a/Documentation/devicetree/bindings/perf/marvell-cn10k-tad.yaml
+++ b/Documentation/devicetree/bindings/perf/marvell-cn10k-tad.yaml
@@ -4,23 +4,32 @@
$id: http://devicetree.org/schemas/perf/marvell-cn10k-tad.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
-title: Marvell CN10K LLC-TAD performance monitor
+title: Marvell CN10K / CN20K LLC-TAD performance monitor
maintainers:
- Bhaskara Budiredla <bbudiredla@marvell.com>
+ - Geetha sowjanya <gakula@marvell.com>
description: |
- The Tag-and-Data units (TADs) maintain coherence and contain CN10K
- shared on-chip last level cache (LLC). The tad pmu measures the
- performance of last-level cache. Each tad pmu supports up to eight
- counters.
+ The Tag-and-Data units (TADs) maintain coherence and contain the
+ shared on-chip last level cache (LLC) on Marvell CN10K and CN20K SoCs.
+ The TAD PMU measures last-level cache performance. Each TAD PMU
+ supports up to eight counters.
- The DT setup comprises of number of tad blocks, the sizes of pmu
- regions, tad blocks and overall base address of the HW.
+ The DT setup describes the number of TAD blocks, the sizes of PMU
+ regions and TAD pages, and the overall MMIO base of the hardware.
+
+ marvell,cn20k-tad-pmu is not a compatible fallback for
+ marvell,cn10k-tad-pmu (and vice versa): the driver selects different
+ PFC/PRF MMIO offsets from the compatible string, and those offsets are
+ not described by separate DT properties today.
properties:
compatible:
- const: marvell,cn10k-tad-pmu
+ items:
+ - enum:
+ - marvell,cn10k-tad-pmu
+ - marvell,cn20k-tad-pmu
reg:
maxItems: 1
--
2.25.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/3] perf: marvell: Add MPAM partid filtering to CN10K TAD PMU
2026-06-12 9:57 ` [PATCH v2 1/3] perf: marvell: Add MPAM partid filtering to CN10K TAD PMU Geetha sowjanya
@ 2026-06-12 10:14 ` sashiko-bot
0 siblings, 0 replies; 6+ messages in thread
From: sashiko-bot @ 2026-06-12 10:14 UTC (permalink / raw)
To: Geetha sowjanya; +Cc: robh, conor+dt, devicetree, linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Inconsistent masking of `event->attr.config` causes valid MPAM partition ID filters to be silently bypassed (or hardware programming to be skipped entirely) when upper bits are set.
--
commit f4f2ecc176ec6e4641448761b6600d2344554e81
Author: Tanmay Jagdale <tanmay@marvell.com>
perf: marvell: Add MPAM partid filtering to CN10K TAD PMU
The TAD PMU exposes counters that can be filtered by MPAM partition id
for a subset of allocation and hit events.
> diff --git a/drivers/perf/marvell_cn10k_tad_pmu.c b/drivers/perf/marvell_cn10k_tad_pmu.c
> index 51ccb0befa05d..af706b890bf17 100644
> --- a/drivers/perf/marvell_cn10k_tad_pmu.c
> +++ b/drivers/perf/marvell_cn10k_tad_pmu.c
[ ... ]
> @@ -27,30 +35,92 @@ struct tad_region {
[ ... ]
> +static void tad_pmu_start_counter(struct tad_pmu *pmu,
> + struct perf_event *event)
> +{
> + const struct tad_pmu_data *pdata = pmu->pdata;
> + struct hw_perf_event *hwc = &event->hw;
> + u32 event_idx = event->attr.config;
[Severity: Medium]
Does this cause valid MPAM partition ID filters to be silently bypassed if
upper bits are set in event->attr.config?
In tad_pmu_event_init(), the event_idx is masked to 8 bits:
u32 event_idx = (u32)(event->attr.config & GENMASK(7, 0));
Since it is masked there, an event with upper bits set successfully passes
the validation checks. However, because it is unmasked here in
tad_pmu_start_counter(), it will fail the bounds checks below while the
lower 8 bits are still eventually programmed into the hardware counter:
> + u32 counter_idx = hwc->idx;
> + u64 partid_filter = 0;
> + u64 reg_val;
> + u64 cfg1 = event->attr.config1;
> + bool use_mpam = cfg1 & TAD_PARTID_FILTER_EN;
> + u32 partid = (u32)(cfg1 & GENMASK(8, 0));
> + int i;
> +
> + for (i = 0; i < pmu->region_cnt; i++)
> + writeq_relaxed(0, pmu->regions[i].base +
> + TAD_PFC(pdata->tad_pfc_offset, counter_idx));
> +
> + if (use_mpam && event_idx > 0x19 && event_idx < 0x21) {
> + partid_filter = TAD_PRF_MATCH_PARTID | TAD_PRF_PARTID_NS |
> + ((u64)partid << 11);
> + }
> +
> +
> + for (i = 0; i < pmu->region_cnt; i++) {
> + reg_val = event_idx & 0xFF;
> + reg_val |= partid_filter;
> + writeq_relaxed(reg_val, pmu->regions[i].base +
> + TAD_PRF(pdata->tad_prf_offset, counter_idx));
> + }
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260612095746.19679-1-gakula@marvell.com?part=1
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 2/3] perf: marvell: Add CN20K LLC-TAD PMU support
2026-06-12 9:57 ` [PATCH v2 2/3] perf: marvell: Add CN20K LLC-TAD PMU support Geetha sowjanya
@ 2026-06-12 10:26 ` sashiko-bot
0 siblings, 0 replies; 6+ messages in thread
From: sashiko-bot @ 2026-06-12 10:26 UTC (permalink / raw)
To: Geetha sowjanya; +Cc: robh, devicetree, linux-perf-users, conor+dt
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Inconsistent masking of `event_idx` between validation and execution allows userspace to induce an aborted hardware programming state or bypass MPAM filters.
--
commit 13a14a3d137a1e1e8ddf65c82eb4ff187ff64ca5
Author: Geetha sowjanya <gakula@marvell.com>
perf: marvell: Add CN20K LLC-TAD PMU support
This commit adds support for the LLC Tag-and-Data (TAD) PMU on Marvell
CN20K SoCs. It introduces a new PMU profile, extended event maps, and
specific register base offsets.
> diff --git a/drivers/perf/marvell_cn10k_tad_pmu.c b/drivers/perf/marvell_cn10k_tad_pmu.c
> index af706b890bf17..e43598a52859e 100644
> --- a/drivers/perf/marvell_cn10k_tad_pmu.c
> +++ b/drivers/perf/marvell_cn10k_tad_pmu.c
[ ... ]
> @@ -85,8 +89,15 @@ static void tad_pmu_start_counter(struct tad_pmu *pmu,
> if (use_mpam && event_idx > 0x19 && event_idx < 0x21) {
> partid_filter = TAD_PRF_MATCH_PARTID | TAD_PRF_PARTID_NS |
> ((u64)partid << 11);
> +
> + if (pdata->id == TAD_PMU_V3)
> + partid_filter = TAD_PRF_MATCH_PARTID | TAD_PRF_MATCH_MPAMNS |
> + ((u64)partid << 10);
> }
>
> + /* CN10K support events 0:24*/
> + if (pdata->id == TAD_PMU_V1 && event_idx >= 0x25)
> + return;
[Severity: Medium]
In tad_pmu_event_init(), the event configuration is validated using a masked
value:
tad_pmu_event_init() {
...
u32 event_idx = (u32)(event->attr.config & GENMASK(7, 0));
...
}
However, in tad_pmu_start_counter(), event_idx is assigned directly from the
unmasked event->attr.config.
If a user provides an event configuration with upper bits set (e.g., 0x124),
it will pass the masked initialization checks, but won't it fail the unmasked
bounds checks here?
For V1 hardware, an unmasked config like 0x124 would evaluate as >= 0x25 and
trigger this new early return after clearing the counter but before
programming it, leaving the hardware in an aborted state.
Similarly, for V3 hardware, wouldn't the unmasked event_idx fail the
MPAM filter condition (event_idx > 0x19 && event_idx < 0x21) earlier in this
function, causing the counter to incorrectly bypass the filter and record all
partition IDs?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260612095746.19679-1-gakula@marvell.com?part=2
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-06-12 10:26 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-12 9:57 [PATCH v2 0/3] perf: marvell: LLC-TAD PMU MPAM filtering support Geetha sowjanya
2026-06-12 9:57 ` [PATCH v2 1/3] perf: marvell: Add MPAM partid filtering to CN10K TAD PMU Geetha sowjanya
2026-06-12 10:14 ` sashiko-bot
2026-06-12 9:57 ` [PATCH v2 2/3] perf: marvell: Add CN20K LLC-TAD PMU support Geetha sowjanya
2026-06-12 10:26 ` sashiko-bot
2026-06-12 9:57 ` [PATCH v2 3/3] dt-bindings: perf: marvell: add CN20K TAD " Geetha sowjanya
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox