linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU
@ 2025-08-21 13:50 Yushan Wang
  2025-08-21 13:50 ` [PATCH v2 1/9] drivers/perf: hisi: Relax the event ID check in the framework Yushan Wang
                   ` (8 more replies)
  0 siblings, 9 replies; 25+ messages in thread
From: Yushan Wang @ 2025-08-21 13:50 UTC (permalink / raw)
  To: will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: robin.murphy, yangyicong, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, wangyushan12, hejunhao3

Add support for a new version of the L3C PMU.  It supports an extended
event space that can be controlled through up to 2 extra address spaces,
each with a separate overflow interrupt.  The layout of the control/event
registers is kept the same.  Together, the extended events and the
original ones cover the monitoring of all transactions on the L3C.

That said, the driver supports finer-grained statistics of the L3 cache
with separate, dedicated PMUs, and a new operand `ext` that hints which
part the perf counting command should be delivered to.

The extended events are specified with the `ext=[1|2]` option so that
the driver can distinguish them:

perf stat -e hisi_sccl0_l3c0_0/event=<event_id>,ext=<ext>/

Currently the event option only uses config bits [7, 0], so there is
still plenty of unused space. Make ext use config bits [17, 16] and
reserve bits [15, 8] for future extension of the event option.
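
A minimal user-space sketch of that layout may help when building raw
config values by hand. The helper and macro names below are hypothetical
and not taken from the driver; only the bit positions (event in
config[7:0], ext in config[17:16]) come from the description above, and
ext=0 is assumed to select the original event space.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the bit positions follow the cover letter. */
#define L3C_CFG_EVENT_MASK	0xffULL	/* event: config[7:0] */
#define L3C_CFG_EXT_SHIFT	16	/* ext:   config[17:16] */
#define L3C_CFG_EXT_MASK	0x3ULL

/* Pack an event ID and an ext selector into a raw config value. */
static uint64_t l3c_make_config(uint8_t event, unsigned int ext)
{
	/* Assumption: 0 means the original events, 1 and 2 the extended ones. */
	assert(ext <= 2);
	return (uint64_t)event | ((uint64_t)ext << L3C_CFG_EXT_SHIFT);
}

/* Recover the event ID from a raw config value. */
static unsigned int l3c_config_event(uint64_t config)
{
	return (unsigned int)(config & L3C_CFG_EVENT_MASK);
}

/* Recover the ext selector from a raw config value. */
static unsigned int l3c_config_ext(uint64_t config)
{
	return (unsigned int)((config >> L3C_CFG_EXT_SHIFT) & L3C_CFG_EXT_MASK);
}
```

With this layout, bits [15, 8] stay zero, so a later widening of the
event field would not move ext.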

With the extra counters, the number of counters of a HiSilicon uncore
PMU can reach up to 24, so the usedmap is extended accordingly.
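
As a rough illustration of counter allocation from such a used map, the
user-space sketch below scans for the first free bit, in the spirit of
the find_first_zero_bit()/set_bit() sequence used by the driver. The
single 32-bit word and the function name are simplifications assumed
here; the kernel uses its own bitmap type.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Claim the lowest free counter index in *used_mask, or return -EAGAIN
 * when all num_counters counters are busy. Hypothetical helper; a plain
 * 32-bit word stands in for the kernel bitmap.
 */
static int get_event_idx(uint32_t *used_mask, unsigned int num_counters)
{
	assert(num_counters <= 32);

	for (unsigned int idx = 0; idx < num_counters; idx++) {
		if (!(*used_mask & (UINT32_C(1) << idx))) {
			*used_mask |= UINT32_C(1) << idx;
			return (int)idx;
		}
	}

	return -EAGAIN;
}
```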

The hw_perf_event::event_base is initialized to the base MMIO address
of the event and will be used for later control, overflow handling and
counts readout.

We still make use of the Uncore PMU framework for handling the events
and interrupt migration on CPU hotplug. The framework's cpuhp callback
handles the event migration and interrupt migration of the original
events. If the PMU supports extended events, the interrupts of the
extended events are migrated to the same CPU chosen by the framework.

A new HID of HISI0215 is used for this version of L3C PMU.

Some necessary refactoring is included, allowing the framework to cope
with the new version of the driver.

Depends-on: drivers/perf: hisi: Add support for HiSilicon NOC and MN PMU driver
Depends-on: Message-ID: <20250717121727.61057-1-yangyicong@huawei.com>

---

Changes:

v1 -> v2:
  - Don't call disable_irq() and simply return success when there is no
    CPU available for irq migration.
  - Split out the documentation patch.
  - Fix a few other issues per Jonathan.

Yicong Yang (7):
  drivers/perf: hisi: Relax the event ID check in the framework
  drivers/perf: hisi: Export hisi_uncore_pmu_isr()
  drivers/perf: hisi: Simplify the probe process of each L3C PMU version
  drivers/perf: hisi: Extract the event filter check of L3C PMU
  drivers/perf: hisi: Extend the field of tt_core
  drivers/perf: hisi: Refactor the event configuration of L3C PMU
  drivers/perf: hisi: Add support for L3C PMU v3

Yushan Wang (2):
  Documentation: hisi-pmu: Fix of minor format error
  Documentation: hisi-pmu: Add introduction to HiSilicon

 Documentation/admin-guide/perf/hisi-pmu.rst  |  43 +-
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 520 +++++++++++++++----
 drivers/perf/hisilicon/hisi_uncore_pmu.c     |   5 +-
 drivers/perf/hisilicon/hisi_uncore_pmu.h     |   6 +-
 4 files changed, 477 insertions(+), 97 deletions(-)

-- 
2.33.0




* [PATCH v2 1/9] drivers/perf: hisi: Relax the event ID check in the framework
  2025-08-21 13:50 [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU Yushan Wang
@ 2025-08-21 13:50 ` Yushan Wang
  2025-08-26 13:03   ` Jonathan Cameron
  2025-08-21 13:50 ` [PATCH v2 2/9] drivers/perf: hisi: Export hisi_uncore_pmu_isr() Yushan Wang
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Yushan Wang @ 2025-08-21 13:50 UTC (permalink / raw)
  To: will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: robin.murphy, yangyicong, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, wangyushan12, hejunhao3

From: Yicong Yang <yangyicong@hisilicon.com>

The event ID only uses attr::config bits [7, 0], but we check the
event range using the whole 64-bit field. This blocks the use of the
remaining bits of attr::config. Relax the check by only comparing
bits [7, 0].
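
The effect of the relaxed check can be sketched in user space as below.
The two predicate names are hypothetical; HISI_EVENTID_MASK and the
comparison against check_event mirror the diff in this patch. A config
value that carries extra option bits above bit 7 fails the old
comparison but passes the masked one.

```c
#include <stdbool.h>
#include <stdint.h>

#define HISI_EVENTID_MASK	0xff

/* Old behaviour: the whole 64-bit config is compared. */
static bool event_ok_unmasked(uint64_t config, uint64_t check_event)
{
	return !(config > check_event);
}

/* New behaviour: only the event ID in bits [7, 0] is compared. */
static bool event_ok_masked(uint64_t config, uint64_t check_event)
{
	return !((config & HISI_EVENTID_MASK) > check_event);
}
```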

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +-
 drivers/perf/hisilicon/hisi_uncore_pmu.h | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
index a449651f79c9..6594d64b03a9 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -234,7 +234,7 @@ int hisi_uncore_pmu_event_init(struct perf_event *event)
 		return -EINVAL;
 
 	hisi_pmu = to_hisi_pmu(event->pmu);
-	if (event->attr.config > hisi_pmu->check_event)
+	if ((event->attr.config & HISI_EVENTID_MASK) > hisi_pmu->check_event)
 		return -EINVAL;
 
 	if (hisi_pmu->on_cpu == -1)
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.h b/drivers/perf/hisilicon/hisi_uncore_pmu.h
index 777675838b80..6186b232f454 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.h
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.h
@@ -43,7 +43,8 @@
 		return FIELD_GET(GENMASK_ULL(hi, lo), event->attr.config);  \
 	}
 
-#define HISI_GET_EVENTID(ev) (ev->hw.config_base & 0xff)
+#define HISI_EVENTID_MASK	0xff
+#define HISI_GET_EVENTID(ev) ((ev)->hw.config_base & HISI_EVENTID_MASK)
 
 #define HISI_PMU_EVTYPE_BITS		8
 #define HISI_PMU_EVTYPE_SHIFT(idx)	((idx) % 4 * HISI_PMU_EVTYPE_BITS)
-- 
2.33.0




* [PATCH v2 2/9] drivers/perf: hisi: Export hisi_uncore_pmu_isr()
  2025-08-21 13:50 [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU Yushan Wang
  2025-08-21 13:50 ` [PATCH v2 1/9] drivers/perf: hisi: Relax the event ID check in the framework Yushan Wang
@ 2025-08-21 13:50 ` Yushan Wang
  2025-08-26 13:03   ` Jonathan Cameron
  2025-08-21 13:50 ` [PATCH v2 3/9] drivers/perf: hisi: Simplify the probe process of each L3C PMU version Yushan Wang
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Yushan Wang @ 2025-08-21 13:50 UTC (permalink / raw)
  To: will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: robin.murphy, yangyicong, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, wangyushan12, hejunhao3

From: Yicong Yang <yangyicong@hisilicon.com>

Currently the Uncore PMU framework assumes that a PMU device has only
one interrupt and registers the interrupt handler on its behalf. It
cannot support a PMU with multiple interrupt resources.  An uncore PMU
may have multiple interrupts that can share the same handler.  Export
hisi_uncore_pmu_isr() to allow drivers to register the irq handler in
their own routine.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 3 ++-
 drivers/perf/hisilicon/hisi_uncore_pmu.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
index 6594d64b03a9..de71dcf11653 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -149,7 +149,7 @@ static void hisi_uncore_pmu_clear_event_idx(struct hisi_pmu *hisi_pmu, int idx)
 	clear_bit(idx, hisi_pmu->pmu_events.used_mask);
 }
 
-static irqreturn_t hisi_uncore_pmu_isr(int irq, void *data)
+irqreturn_t hisi_uncore_pmu_isr(int irq, void *data)
 {
 	struct hisi_pmu *hisi_pmu = data;
 	struct perf_event *event;
@@ -178,6 +178,7 @@ static irqreturn_t hisi_uncore_pmu_isr(int irq, void *data)
 
 	return IRQ_HANDLED;
 }
+EXPORT_SYMBOL_NS_GPL(hisi_uncore_pmu_isr, "HISI_PMU");
 
 int hisi_uncore_pmu_init_irq(struct hisi_pmu *hisi_pmu,
 			     struct platform_device *pdev)
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.h b/drivers/perf/hisilicon/hisi_uncore_pmu.h
index 6186b232f454..02fa022925d4 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.h
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.h
@@ -165,6 +165,7 @@ int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node);
 ssize_t hisi_uncore_pmu_identifier_attr_show(struct device *dev,
 					     struct device_attribute *attr,
 					     char *page);
+irqreturn_t hisi_uncore_pmu_isr(int irq, void *data);
 int hisi_uncore_pmu_init_irq(struct hisi_pmu *hisi_pmu,
 			     struct platform_device *pdev);
 void hisi_uncore_pmu_init_topology(struct hisi_pmu *hisi_pmu, struct device *dev);
-- 
2.33.0




* [PATCH v2 3/9] drivers/perf: hisi: Simplify the probe process of each L3C PMU version
  2025-08-21 13:50 [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU Yushan Wang
  2025-08-21 13:50 ` [PATCH v2 1/9] drivers/perf: hisi: Relax the event ID check in the framework Yushan Wang
  2025-08-21 13:50 ` [PATCH v2 2/9] drivers/perf: hisi: Export hisi_uncore_pmu_isr() Yushan Wang
@ 2025-08-21 13:50 ` Yushan Wang
  2025-08-26 13:06   ` Jonathan Cameron
  2025-08-21 13:50 ` [PATCH v2 4/9] drivers/perf: hisi: Extract the event filter check of L3C PMU Yushan Wang
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Yushan Wang @ 2025-08-21 13:50 UTC (permalink / raw)
  To: will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: robin.murphy, yangyicong, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, wangyushan12, hejunhao3

From: Yicong Yang <yangyicong@hisilicon.com>

Versions 1 and 2 of the L3C PMU also use different HIDs. Make use of
struct acpi_device_id::driver_data for version-specific information
rather than checking the version register. This simplifies the probe
process and makes future extension a bit easier.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 43 ++++++++++++--------
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
index 412fc3a97963..db683dd7375c 100644
--- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
@@ -345,13 +345,6 @@ static void hisi_l3c_pmu_clear_int_status(struct hisi_pmu *l3c_pmu, int idx)
 	writel(1 << idx, l3c_pmu->base + L3C_INT_CLEAR);
 }
 
-static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = {
-	{ "HISI0213", },
-	{ "HISI0214", },
-	{}
-};
-MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match);
-
 static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
 				  struct hisi_pmu *l3c_pmu)
 {
@@ -371,6 +364,10 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
 		return -EINVAL;
 	}
 
+	l3c_pmu->dev_info = device_get_match_data(&pdev->dev);
+	if (!l3c_pmu->dev_info)
+		return -ENODEV;
+
 	l3c_pmu->base = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(l3c_pmu->base)) {
 		dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n");
@@ -457,6 +454,18 @@ static const struct attribute_group *hisi_l3c_pmu_v2_attr_groups[] = {
 	NULL
 };
 
+static const struct hisi_pmu_dev_info hisi_l3c_pmu_v1 = {
+	.attr_groups = hisi_l3c_pmu_v1_attr_groups,
+	.counter_bits = 48,
+	.check_event = L3C_V1_NR_EVENTS,
+};
+
+static const struct hisi_pmu_dev_info hisi_l3c_pmu_v2 = {
+	.attr_groups = hisi_l3c_pmu_v2_attr_groups,
+	.counter_bits = 64,
+	.check_event = L3C_V2_NR_EVENTS,
+};
+
 static const struct hisi_uncore_ops hisi_uncore_l3c_ops = {
 	.write_evtype		= hisi_l3c_pmu_write_evtype,
 	.get_event_idx		= hisi_uncore_pmu_get_event_idx,
@@ -487,16 +496,9 @@ static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev,
 	if (ret)
 		return ret;
 
-	if (l3c_pmu->identifier >= HISI_PMU_V2) {
-		l3c_pmu->counter_bits = 64;
-		l3c_pmu->check_event = L3C_V2_NR_EVENTS;
-		l3c_pmu->pmu_events.attr_groups = hisi_l3c_pmu_v2_attr_groups;
-	} else {
-		l3c_pmu->counter_bits = 48;
-		l3c_pmu->check_event = L3C_V1_NR_EVENTS;
-		l3c_pmu->pmu_events.attr_groups = hisi_l3c_pmu_v1_attr_groups;
-	}
-
+	l3c_pmu->pmu_events.attr_groups = l3c_pmu->dev_info->attr_groups;
+	l3c_pmu->counter_bits = l3c_pmu->dev_info->counter_bits;
+	l3c_pmu->check_event = l3c_pmu->dev_info->check_event;
 	l3c_pmu->num_counters = L3C_NR_COUNTERS;
 	l3c_pmu->ops = &hisi_uncore_l3c_ops;
 	l3c_pmu->dev = &pdev->dev;
@@ -554,6 +556,13 @@ static void hisi_l3c_pmu_remove(struct platform_device *pdev)
 					    &l3c_pmu->node);
 }
 
+static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = {
+	{ "HISI0213", (kernel_ulong_t)&hisi_l3c_pmu_v1 },
+	{ "HISI0214", (kernel_ulong_t)&hisi_l3c_pmu_v2 },
+	{}
+};
+MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match);
+
 static struct platform_driver hisi_l3c_pmu_driver = {
 	.driver = {
 		.name = "hisi_l3c_pmu",
-- 
2.33.0




* [PATCH v2 4/9] drivers/perf: hisi: Extract the event filter check of L3C PMU
  2025-08-21 13:50 [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU Yushan Wang
                   ` (2 preceding siblings ...)
  2025-08-21 13:50 ` [PATCH v2 3/9] drivers/perf: hisi: Simplify the probe process of each L3C PMU version Yushan Wang
@ 2025-08-21 13:50 ` Yushan Wang
  2025-08-26 13:06   ` Jonathan Cameron
  2025-08-21 13:50 ` [PATCH v2 5/9] drivers/perf: hisi: Extend the field of tt_core Yushan Wang
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Yushan Wang @ 2025-08-21 13:50 UTC (permalink / raw)
  To: will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: robin.murphy, yangyicong, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, wangyushan12, hejunhao3

From: Yicong Yang <yangyicong@hisilicon.com>

The L3C PMU has 4 filter options which share perf_event_attr::config1.
The driver checks config1 to see whether a certain event has a filter
setting. This becomes incorrect if we make use of other bits in config1
for non-filter options. So check whether each filter option is set
directly in a separate function instead.
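
A user-space sketch of the idea, assuming the config1 field positions
in use at this point of the series (tt_core [7, 0], tt_req [10, 8],
datasrc_cfg [15, 11], datasrc_skt [16]). The helper names are
hypothetical, and bit 20 in the usage note is an invented example of a
non-filter option, not something defined by the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract bits [hi, lo] of val; this sketch only handles hi - lo < 63. */
static uint64_t field_get(uint64_t val, unsigned int hi, unsigned int lo)
{
	return (val >> lo) & ((1ULL << (hi - lo + 1)) - 1);
}

/* True only if one of the four filter fields in config1 is non-zero. */
static bool have_filter(uint64_t config1)
{
	return field_get(config1, 10, 8) ||	/* tt_req */
	       field_get(config1, 7, 0) ||	/* tt_core */
	       field_get(config1, 15, 11) ||	/* datasrc_cfg */
	       field_get(config1, 16, 16);	/* datasrc_skt */
}
```

With config1 = 1 << 20, a plain `config1 != 0` test would report a
filter, while have_filter() would not.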

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
index db683dd7375c..a372dd2c07b5 100644
--- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
@@ -204,9 +204,15 @@ static void hisi_l3c_pmu_clear_core_tracetag(struct perf_event *event)
 	}
 }
 
+static bool hisi_l3c_pmu_have_filter(struct perf_event *event)
+{
+	return hisi_get_tt_req(event) || hisi_get_tt_core(event) ||
+	       hisi_get_datasrc_cfg(event) || hisi_get_datasrc_skt(event);
+}
+
 static void hisi_l3c_pmu_enable_filter(struct perf_event *event)
 {
-	if (event->attr.config1 != 0x0) {
+	if (hisi_l3c_pmu_have_filter(event)) {
 		hisi_l3c_pmu_config_req_tracetag(event);
 		hisi_l3c_pmu_config_core_tracetag(event);
 		hisi_l3c_pmu_config_ds(event);
@@ -215,7 +221,7 @@ static void hisi_l3c_pmu_enable_filter(struct perf_event *event)
 
 static void hisi_l3c_pmu_disable_filter(struct perf_event *event)
 {
-	if (event->attr.config1 != 0x0) {
+	if (hisi_l3c_pmu_have_filter(event)) {
 		hisi_l3c_pmu_clear_ds(event);
 		hisi_l3c_pmu_clear_core_tracetag(event);
 		hisi_l3c_pmu_clear_req_tracetag(event);
-- 
2.33.0




* [PATCH v2 5/9] drivers/perf: hisi: Extend the field of tt_core
  2025-08-21 13:50 [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU Yushan Wang
                   ` (3 preceding siblings ...)
  2025-08-21 13:50 ` [PATCH v2 4/9] drivers/perf: hisi: Extract the event filter check of L3C PMU Yushan Wang
@ 2025-08-21 13:50 ` Yushan Wang
  2025-08-26 13:07   ` Jonathan Cameron
  2025-08-21 13:50 ` [PATCH v2 6/9] drivers/perf: hisi: Refactor the event configuration of L3C PMU Yushan Wang
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Yushan Wang @ 2025-08-21 13:50 UTC (permalink / raw)
  To: will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: robin.murphy, yangyicong, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, wangyushan12, hejunhao3

From: Yicong Yang <yangyicong@hisilicon.com>

Currently tt_core uses config1 bits [7, 0] and cannot be extended.
On some platforms there are more than 8 CPUs sharing the L3 cache.
So make tt_core use config2 bits [15, 0]; the remaining bits in
config2 are reserved for extension.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
index a372dd2c07b5..39444f11cbad 100644
--- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
@@ -55,10 +55,10 @@
 #define L3C_V1_NR_EVENTS	0x59
 #define L3C_V2_NR_EVENTS	0xFF
 
-HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config1, 7, 0);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_req, config1, 10, 8);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_cfg, config1, 15, 11);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_skt, config1, 16, 16);
+HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config2, 15, 0);
 
 static void hisi_l3c_pmu_config_req_tracetag(struct perf_event *event)
 {
@@ -397,7 +397,7 @@ static const struct attribute_group hisi_l3c_pmu_v1_format_group = {
 
 static struct attribute *hisi_l3c_pmu_v2_format_attr[] = {
 	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
-	HISI_PMU_FORMAT_ATTR(tt_core, "config1:0-7"),
+	HISI_PMU_FORMAT_ATTR(tt_core, "config2:0-15"),
 	HISI_PMU_FORMAT_ATTR(tt_req, "config1:8-10"),
 	HISI_PMU_FORMAT_ATTR(datasrc_cfg, "config1:11-15"),
 	HISI_PMU_FORMAT_ATTR(datasrc_skt, "config1:16"),
-- 
2.33.0




* [PATCH v2 6/9] drivers/perf: hisi: Refactor the event configuration of L3C PMU
  2025-08-21 13:50 [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU Yushan Wang
                   ` (4 preceding siblings ...)
  2025-08-21 13:50 ` [PATCH v2 5/9] drivers/perf: hisi: Extend the field of tt_core Yushan Wang
@ 2025-08-21 13:50 ` Yushan Wang
  2025-08-26 13:08   ` Jonathan Cameron
  2025-08-21 13:50 ` [PATCH v2 7/9] drivers/perf: hisi: Add support for L3C PMU v3 Yushan Wang
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Yushan Wang @ 2025-08-21 13:50 UTC (permalink / raw)
  To: will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: robin.murphy, yangyicong, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, wangyushan12, hejunhao3

From: Yicong Yang <yangyicong@hisilicon.com>

The event register is configured using hisi_pmu::base directly since
only one address space is supported for the L3C PMU. This needs to be
extended if the event configuration is located in a different address
space. To prepare for such hardware, extract the event register
configuration into separate functions using
hw_perf_event::event_base as each event's base address.  Implement a
private hisi_uncore_ops::get_event_idx() callback to initialize
event_base in addition to getting the hardware index.

No functional changes intended.
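
The accessor pattern can be sketched in user space as follows, with
plain arrays standing in for MMIO regions. The struct, the function
names and the EVENT_CTRL offset are hypothetical stand-ins for
hw_perf_event, the hisi_l3c_pmu_event_readl()/writel() helpers and
L3C_EVENT_CTRL; this is an illustration of the idea, not the driver
code.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the relevant part of struct hw_perf_event. */
struct hw_event {
	uintptr_t event_base;	/* base of the region this event lives in */
	int idx;		/* counter index within that region */
};

#define EVENT_CTRL	0x1c	/* hypothetical register offset */

/* Read a 32-bit register relative to the event's own base. */
static uint32_t event_readl(const struct hw_event *hwc, uint32_t reg)
{
	assert(reg % 4 == 0);
	return *(volatile uint32_t *)(hwc->event_base + reg);
}

/* Write a 32-bit register relative to the event's own base. */
static void event_writel(const struct hw_event *hwc, uint32_t reg, uint32_t val)
{
	assert(reg % 4 == 0);
	*(volatile uint32_t *)(hwc->event_base + reg) = val;
}

/* Read-modify-write that only touches the region event_base points at. */
static void enable_counter(const struct hw_event *hwc)
{
	uint32_t val = event_readl(hwc, EVENT_CTRL);

	val |= UINT32_C(1) << hwc->idx;
	event_writel(hwc, EVENT_CTRL, val);
}
```

Because every access goes through event_base, pointing an event at a
different region later needs no change to the register helpers.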

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 129 ++++++++++++-------
 1 file changed, 84 insertions(+), 45 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
index 39444f11cbad..7928b9bb3e7e 100644
--- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
@@ -60,51 +60,87 @@ HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_cfg, config1, 15, 11);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_skt, config1, 16, 16);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config2, 15, 0);
 
-static void hisi_l3c_pmu_config_req_tracetag(struct perf_event *event)
+static int hisi_l3c_pmu_get_event_idx(struct perf_event *event)
 {
 	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
+	u32 num_counters = l3c_pmu->num_counters;
+	int idx;
+
+	idx = find_first_zero_bit(used_mask, num_counters);
+	if (idx == num_counters)
+		return -EAGAIN;
+
+	set_bit(idx, used_mask);
+	event->hw.event_base = (unsigned long)l3c_pmu->base;
+
+	return idx;
+}
+
+static u32 hisi_l3c_pmu_event_readl(struct hw_perf_event *hwc, u32 reg)
+{
+	return readl((void __iomem *)hwc->event_base + reg);
+}
+
+static void hisi_l3c_pmu_event_writel(struct hw_perf_event *hwc, u32 reg, u32 val)
+{
+	writel(val, (void __iomem *)hwc->event_base + reg);
+}
+
+static u64 hisi_l3c_pmu_event_readq(struct hw_perf_event *hwc, u32 reg)
+{
+	return readq((void __iomem *)hwc->event_base + reg);
+}
+
+static void hisi_l3c_pmu_event_writeq(struct hw_perf_event *hwc, u32 reg, u64 val)
+{
+	writeq(val, (void __iomem *)hwc->event_base + reg);
+}
+
+static void hisi_l3c_pmu_config_req_tracetag(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
 	u32 tt_req = hisi_get_tt_req(event);
 
 	if (tt_req) {
 		u32 val;
 
 		/* Set request-type for tracetag */
-		val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL);
 		val |= tt_req << L3C_TRACETAG_REQ_SHIFT;
 		val |= L3C_TRACETAG_REQ_EN;
-		writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val);
 
 		/* Enable request-tracetag statistics */
-		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL);
 		val |= L3C_TRACETAG_EN;
-		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val);
 	}
 }
 
 static void hisi_l3c_pmu_clear_req_tracetag(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 tt_req = hisi_get_tt_req(event);
 
 	if (tt_req) {
 		u32 val;
 
 		/* Clear request-type */
-		val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL);
 		val &= ~(tt_req << L3C_TRACETAG_REQ_SHIFT);
 		val &= ~L3C_TRACETAG_REQ_EN;
-		writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val);
 
 		/* Disable request-tracetag statistics */
-		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL);
 		val &= ~L3C_TRACETAG_EN;
-		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val);
 	}
 }
 
 static void hisi_l3c_pmu_write_ds(struct perf_event *event, u32 ds_cfg)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
 	u32 reg, reg_idx, shift, val;
 	int idx = hwc->idx;
@@ -120,15 +156,15 @@ static void hisi_l3c_pmu_write_ds(struct perf_event *event, u32 ds_cfg)
 	reg_idx = idx % 4;
 	shift = 8 * reg_idx;
 
-	val = readl(l3c_pmu->base + reg);
+	val = hisi_l3c_pmu_event_readl(hwc, reg);
 	val &= ~(L3C_DATSRC_MASK << shift);
 	val |= ds_cfg << shift;
-	writel(val, l3c_pmu->base + reg);
+	hisi_l3c_pmu_event_writel(hwc, reg, val);
 }
 
 static void hisi_l3c_pmu_config_ds(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 ds_cfg = hisi_get_datasrc_cfg(event);
 	u32 ds_skt = hisi_get_datasrc_skt(event);
 
@@ -138,15 +174,15 @@ static void hisi_l3c_pmu_config_ds(struct perf_event *event)
 	if (ds_skt) {
 		u32 val;
 
-		val = readl(l3c_pmu->base + L3C_DATSRC_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_DATSRC_CTRL);
 		val |= L3C_DATSRC_SKT_EN;
-		writel(val, l3c_pmu->base + L3C_DATSRC_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_DATSRC_CTRL, val);
 	}
 }
 
 static void hisi_l3c_pmu_clear_ds(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 ds_cfg = hisi_get_datasrc_cfg(event);
 	u32 ds_skt = hisi_get_datasrc_skt(event);
 
@@ -156,51 +192,51 @@ static void hisi_l3c_pmu_clear_ds(struct perf_event *event)
 	if (ds_skt) {
 		u32 val;
 
-		val = readl(l3c_pmu->base + L3C_DATSRC_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_DATSRC_CTRL);
 		val &= ~L3C_DATSRC_SKT_EN;
-		writel(val, l3c_pmu->base + L3C_DATSRC_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_DATSRC_CTRL, val);
 	}
 }
 
 static void hisi_l3c_pmu_config_core_tracetag(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 core = hisi_get_tt_core(event);
 
 	if (core) {
 		u32 val;
 
 		/* Config and enable core information */
-		writel(core, l3c_pmu->base + L3C_CORE_CTRL);
-		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_CORE_CTRL, core);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL);
 		val |= L3C_CORE_EN;
-		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val);
 
 		/* Enable core-tracetag statistics */
-		val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL);
 		val |= L3C_TRACETAG_CORE_EN;
-		writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val);
 	}
 }
 
 static void hisi_l3c_pmu_clear_core_tracetag(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 core = hisi_get_tt_core(event);
 
 	if (core) {
 		u32 val;
 
 		/* Clear core information */
-		writel(L3C_COER_NONE, l3c_pmu->base + L3C_CORE_CTRL);
-		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_CORE_CTRL, L3C_COER_NONE);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL);
 		val &= ~L3C_CORE_EN;
-		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val);
 
 		/* Disable core-tracetag statistics */
-		val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL);
 		val &= ~L3C_TRACETAG_CORE_EN;
-		writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val);
 	}
 }
 
@@ -239,18 +275,19 @@ static u32 hisi_l3c_pmu_get_counter_offset(int cntr_idx)
 static u64 hisi_l3c_pmu_read_counter(struct hisi_pmu *l3c_pmu,
 				     struct hw_perf_event *hwc)
 {
-	return readq(l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(hwc->idx));
+	return hisi_l3c_pmu_event_readq(hwc, hisi_l3c_pmu_get_counter_offset(hwc->idx));
 }
 
 static void hisi_l3c_pmu_write_counter(struct hisi_pmu *l3c_pmu,
 				       struct hw_perf_event *hwc, u64 val)
 {
-	writeq(val, l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(hwc->idx));
+	hisi_l3c_pmu_event_writeq(hwc, hisi_l3c_pmu_get_counter_offset(hwc->idx), val);
 }
 
 static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
 				      u32 type)
 {
+	struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw;
 	u32 reg, reg_idx, shift, val;
 
 	/*
@@ -265,10 +302,10 @@ static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
 	shift = 8 * reg_idx;
 
 	/* Write event code to L3C_EVENT_TYPEx Register */
-	val = readl(l3c_pmu->base + reg);
+	val = hisi_l3c_pmu_event_readl(hwc, reg);
 	val &= ~(L3C_EVTYPE_NONE << shift);
 	val |= (type << shift);
-	writel(val, l3c_pmu->base + reg);
+	hisi_l3c_pmu_event_writel(hwc, reg, val);
 }
 
 static void hisi_l3c_pmu_start_counters(struct hisi_pmu *l3c_pmu)
@@ -303,9 +340,9 @@ static void hisi_l3c_pmu_enable_counter(struct hisi_pmu *l3c_pmu,
 	u32 val;
 
 	/* Enable counter index in L3C_EVENT_CTRL register */
-	val = readl(l3c_pmu->base + L3C_EVENT_CTRL);
+	val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL);
 	val |= (1 << hwc->idx);
-	writel(val, l3c_pmu->base + L3C_EVENT_CTRL);
+	hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val);
 }
 
 static void hisi_l3c_pmu_disable_counter(struct hisi_pmu *l3c_pmu,
@@ -314,9 +351,9 @@ static void hisi_l3c_pmu_disable_counter(struct hisi_pmu *l3c_pmu,
 	u32 val;
 
 	/* Clear counter index in L3C_EVENT_CTRL register */
-	val = readl(l3c_pmu->base + L3C_EVENT_CTRL);
+	val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL);
 	val &= ~(1 << hwc->idx);
-	writel(val, l3c_pmu->base + L3C_EVENT_CTRL);
+	hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val);
 }
 
 static void hisi_l3c_pmu_enable_counter_int(struct hisi_pmu *l3c_pmu,
@@ -324,10 +361,10 @@ static void hisi_l3c_pmu_enable_counter_int(struct hisi_pmu *l3c_pmu,
 {
 	u32 val;
 
-	val = readl(l3c_pmu->base + L3C_INT_MASK);
+	val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK);
 	/* Write 0 to enable interrupt */
 	val &= ~(1 << hwc->idx);
-	writel(val, l3c_pmu->base + L3C_INT_MASK);
+	hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val);
 }
 
 static void hisi_l3c_pmu_disable_counter_int(struct hisi_pmu *l3c_pmu,
@@ -335,10 +372,10 @@ static void hisi_l3c_pmu_disable_counter_int(struct hisi_pmu *l3c_pmu,
 {
 	u32 val;
 
-	val = readl(l3c_pmu->base + L3C_INT_MASK);
+	val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK);
 	/* Write 1 to mask interrupt */
 	val |= (1 << hwc->idx);
-	writel(val, l3c_pmu->base + L3C_INT_MASK);
+	hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val);
 }
 
 static u32 hisi_l3c_pmu_get_int_status(struct hisi_pmu *l3c_pmu)
@@ -348,7 +385,9 @@ static u32 hisi_l3c_pmu_get_int_status(struct hisi_pmu *l3c_pmu)
 
 static void hisi_l3c_pmu_clear_int_status(struct hisi_pmu *l3c_pmu, int idx)
 {
-	writel(1 << idx, l3c_pmu->base + L3C_INT_CLEAR);
+	struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw;
+
+	hisi_l3c_pmu_event_writel(hwc, L3C_INT_CLEAR, 1 << idx);
 }
 
 static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
@@ -474,7 +513,7 @@ static const struct hisi_pmu_dev_info hisi_l3c_pmu_v2 = {
 
 static const struct hisi_uncore_ops hisi_uncore_l3c_ops = {
 	.write_evtype		= hisi_l3c_pmu_write_evtype,
-	.get_event_idx		= hisi_uncore_pmu_get_event_idx,
+	.get_event_idx		= hisi_l3c_pmu_get_event_idx,
 	.start_counters		= hisi_l3c_pmu_start_counters,
 	.stop_counters		= hisi_l3c_pmu_stop_counters,
 	.enable_counter		= hisi_l3c_pmu_enable_counter,
-- 
2.33.0




* [PATCH v2 7/9] drivers/perf: hisi: Add support for L3C PMU v3
  2025-08-21 13:50 [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU Yushan Wang
                   ` (5 preceding siblings ...)
  2025-08-21 13:50 ` [PATCH v2 6/9] drivers/perf: hisi: Refactor the event configuration of L3C PMU Yushan Wang
@ 2025-08-21 13:50 ` Yushan Wang
  2025-08-26 13:12   ` Jonathan Cameron
  2025-08-27  3:43   ` Yicong Yang
  2025-08-21 13:50 ` [PATCH v2 8/9] Documentation: hisi-pmu: Fix of minor format error Yushan Wang
  2025-08-21 13:50 ` [PATCH v2 9/9] Documentation: hisi-pmu: Add introduction to HiSilicon Yushan Wang
  8 siblings, 2 replies; 25+ messages in thread
From: Yushan Wang @ 2025-08-21 13:50 UTC (permalink / raw)
  To: will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: robin.murphy, yangyicong, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, wangyushan12, hejunhao3

From: Yicong Yang <yangyicong@hisilicon.com>

Add support for L3C PMU v3. The v3 L3C PMU supports an extended event
space that can be controlled through up to 2 extra address spaces,
each with a separate overflow interrupt. The layout of the
control/event registers is kept the same. Together, the extended
events and the original ones cover the monitoring of all transactions
on the L3C.

The extended events are specified with the `ext=[1|2]` option so that
the driver can distinguish them, like below:

perf stat -e hisi_sccl0_l3c0_0/event=<event_id>,ext=1/

Currently only the event option uses config bits [7, 0], which
leaves plenty of unused space. Make ext use config bits [17, 16] and
reserve bits [15, 8] for future extension of the event option.
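
As a rough illustration of the bit layout described above (a userspace
sketch, not the driver code; every name with an ex_ or EX_ prefix is
hypothetical), the event ID and the ext selector could be packed into
and extracted from config like this:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout, taken from the description above:
 *   config[7:0]   event ID
 *   config[15:8]  reserved for future extension of the event option
 *   config[17:16] ext selector (0 = normal events, 1 or 2 = extended)
 */
#define EX_EVENT_MASK	0xffULL
#define EX_EXT_SHIFT	16
#define EX_EXT_MASK	0x3ULL
#define EX_MAX_EXT	2

/* Build a config value from an event ID and an ext selector. */
static inline uint64_t ex_make_config(uint8_t event, unsigned int ext)
{
	assert(ext <= EX_MAX_EXT);
	return (uint64_t)event | ((uint64_t)ext << EX_EXT_SHIFT);
}

/* Extract the event ID from bits [7:0]. */
static inline unsigned int ex_get_event(uint64_t config)
{
	return (unsigned int)(config & EX_EVENT_MASK);
}

/* Extract the ext selector from bits [17:16]. */
static inline unsigned int ex_get_ext(uint64_t config)
{
	return (unsigned int)((config >> EX_EXT_SHIFT) & EX_EXT_MASK);
}
```

Under this assumed layout, event=0x18,ext=1 corresponds to
config = 0x10018.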

With the extra counters, the number of counters for a HiSilicon
uncore PMU can reach 24, so the usedmap is extended accordingly.
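
A small sketch of how a flat usedmap index could map to an ext region
and a per-region hardware counter, assuming 8 counters per region and
at most 2 ext regions (3 * 8 = 24). The ex_/EX_ names are hypothetical
and only mirror the idea of the L3C_HW_IDX()/L3C_CNTR_EXT() helpers in
the diff below:

```c
#include <assert.h>

#define EX_NR_COUNTERS	8	/* assumed counters per region */
#define EX_MAX_EXT	2	/* assumed maximum number of ext regions */

struct ex_counter_loc {
	unsigned int ext;	/* 0 = normal region, 1..2 = ext regions */
	unsigned int hw_idx;	/* counter index inside that region */
};

/* Split a flat index in [0, 24) into (region, hardware index). */
static inline struct ex_counter_loc ex_locate(unsigned int idx)
{
	struct ex_counter_loc loc;

	assert(idx < EX_NR_COUNTERS * (EX_MAX_EXT + 1));
	loc.ext = idx / EX_NR_COUNTERS;
	loc.hw_idx = idx % EX_NR_COUNTERS;
	return loc;
}

/* First usedmap bit that belongs to the given region. */
static inline unsigned int ex_range_start(unsigned int ext)
{
	return ext * EX_NR_COUNTERS;
}

/* One past the last usedmap bit that belongs to the given region. */
static inline unsigned int ex_range_end(unsigned int ext)
{
	return (ext + 1) * EX_NR_COUNTERS;
}
```

In this sketch, flat index 13 lands in ext region 1 as hardware
counter 5, and ext region 2 owns usedmap bits 16 to 23.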

The hw_perf_event::event_base is initialized to the base MMIO
address of the event and will be used for later control,
overflow handling and counts readout.

We still make use of the Uncore PMU framework for handling the
events and interrupt migration on CPU hotplug. The framework's
cpuhp callback handles the event migration and the interrupt
migration of the original events. If the PMU supports extended
events, the interrupts of the extended events are migrated to the
same CPU chosen by the framework.

A new HID of HISI0215 is used for this version of L3C PMU.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Co-developed-by: Yushan Wang <wangyushan12@huawei.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 352 +++++++++++++++++--
 drivers/perf/hisilicon/hisi_uncore_pmu.h     |   2 +-
 2 files changed, 321 insertions(+), 33 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
index 7928b9bb3e7e..95136c01f17b 100644
--- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
@@ -55,24 +55,85 @@
 #define L3C_V1_NR_EVENTS	0x59
 #define L3C_V2_NR_EVENTS	0xFF
 
+#define L3C_MAX_EXT		2
+
+HISI_PMU_EVENT_ATTR_EXTRACTOR(ext, config, 17, 16);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_req, config1, 10, 8);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_cfg, config1, 15, 11);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_skt, config1, 16, 16);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config2, 15, 0);
 
+struct hisi_l3c_pmu {
+	struct hisi_pmu l3c_pmu;
+
+	/* MMIO and IRQ resources for extension events */
+	void __iomem *ext_base[L3C_MAX_EXT];
+	int ext_irq[L3C_MAX_EXT];
+	int ext_num;
+};
+
+#define to_hisi_l3c_pmu(_l3c_pmu) \
+	container_of(_l3c_pmu, struct hisi_l3c_pmu, l3c_pmu)
+
+/*
+ * The hardware counter idx used in counter enable/disable,
+ * interrupt enable/disable and status check, etc.
+ */
+#define L3C_HW_IDX(_idx)		((_idx) % L3C_NR_COUNTERS)
+
+/* The ext resource number to which a hardware counter belongs. */
+#define L3C_CNTR_EXT(_idx)		((_idx) / L3C_NR_COUNTERS)
+
+struct hisi_l3c_pmu_ext {
+	bool support_ext;
+};
+
+static bool support_ext(struct hisi_l3c_pmu *pmu)
+{
+	struct hisi_l3c_pmu_ext *l3c_pmu_ext = pmu->l3c_pmu.dev_info->private;
+
+	return l3c_pmu_ext->support_ext;
+}
+
 static int hisi_l3c_pmu_get_event_idx(struct perf_event *event)
 {
 	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
 	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
-	u32 num_counters = l3c_pmu->num_counters;
+	int ext = hisi_get_ext(event);
 	int idx;
 
-	idx = find_first_zero_bit(used_mask, num_counters);
-	if (idx == num_counters)
+	/*
+	 * For an L3C PMU that supports extension events, we can monitor
+	 * at most 2 * num_counters to 3 * num_counters events, depending on
+	 * the number of ext regions supported by hardware. Thus use bits
+	 * [0, num_counters - 1] for normal events and bits
+	 * [ext * num_counters, (ext + 1) * num_counters - 1] for extension
+	 * events. The idx allocation is unchanged for normal events, and
+	 * the idx also tells whether an event is an extension event or
+	 * not.
+	 *
+	 * Since normal events and extension events are located in different
+	 * address spaces, save the base address in event->hw.event_base.
+	 */
+	if (ext) {
+		if (!support_ext(hisi_l3c_pmu))
+			return -EOPNOTSUPP;
+
+		event->hw.event_base = (unsigned long)hisi_l3c_pmu->ext_base[ext - 1];
+		idx = find_next_zero_bit(used_mask, L3C_NR_COUNTERS * (ext + 1),
+					 L3C_NR_COUNTERS * ext);
+	} else {
+		event->hw.event_base = (unsigned long)l3c_pmu->base;
+		idx = find_next_zero_bit(used_mask, L3C_NR_COUNTERS, 0);
+	}
+
+	if (idx >= L3C_NR_COUNTERS * (ext + 1))
 		return -EAGAIN;
 
 	set_bit(idx, used_mask);
-	event->hw.event_base = (unsigned long)l3c_pmu->base;
+
+	WARN_ON(idx < L3C_NR_COUNTERS * ext || idx >= L3C_NR_COUNTERS * (ext + 1));
 
 	return idx;
 }
@@ -143,7 +204,7 @@ static void hisi_l3c_pmu_write_ds(struct perf_event *event, u32 ds_cfg)
 {
 	struct hw_perf_event *hwc = &event->hw;
 	u32 reg, reg_idx, shift, val;
-	int idx = hwc->idx;
+	int idx = L3C_HW_IDX(hwc->idx);
 
 	/*
 	 * Select the appropriate datasource register(L3C_DATSRC_TYPE0/1).
@@ -264,12 +325,23 @@ static void hisi_l3c_pmu_disable_filter(struct perf_event *event)
 	}
 }
 
+static int hisi_l3c_pmu_check_filter(struct perf_event *event)
+{
+	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	int ext = hisi_get_ext(event);
+
+	if (ext < 0 || ext > hisi_l3c_pmu->ext_num)
+		return -EINVAL;
+	return 0;
+}
+
 /*
  * Select the counter register offset using the counter index
  */
 static u32 hisi_l3c_pmu_get_counter_offset(int cntr_idx)
 {
-	return (L3C_CNTR0_LOWER + (cntr_idx * 8));
+	return (L3C_CNTR0_LOWER + (L3C_HW_IDX(cntr_idx) * 8));
 }
 
 static u64 hisi_l3c_pmu_read_counter(struct hisi_pmu *l3c_pmu,
@@ -290,6 +362,8 @@ static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
 	struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw;
 	u32 reg, reg_idx, shift, val;
 
+	idx = L3C_HW_IDX(idx);
+
 	/*
 	 * Select the appropriate event select register(L3C_EVENT_TYPE0/1).
 	 * There are 2 event select registers for the 8 hardware counters.
@@ -310,28 +384,63 @@ static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
 
 static void hisi_l3c_pmu_start_counters(struct hisi_pmu *l3c_pmu)
 {
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
+	unsigned long bit = find_first_bit(used_mask, l3c_pmu->num_counters);
 	u32 val;
+	int i;
 
 	/*
-	 * Set perf_enable bit in L3C_PERF_CTRL register to start counting
-	 * for all enabled counters.
+	 * Check if any counter belongs to the normal range (instead of ext
+	 * range). If so, enable it.
 	 */
-	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
-	val |= L3C_PERF_CTRL_EN;
-	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+	if (bit < L3C_NR_COUNTERS) {
+		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		val |= L3C_PERF_CTRL_EN;
+		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+	}
+
+	/* Likewise, enable each ext range that has a counter in use. */
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		bit = find_next_bit(used_mask, L3C_NR_COUNTERS * (i + 2),
+				    L3C_NR_COUNTERS * (i + 1));
+		if (L3C_CNTR_EXT(bit) == i + 1) {
+			val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+			val |= L3C_PERF_CTRL_EN;
+			writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+		}
+	}
 }
 
 static void hisi_l3c_pmu_stop_counters(struct hisi_pmu *l3c_pmu)
 {
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
+	unsigned long bit = find_first_bit(used_mask, l3c_pmu->num_counters);
 	u32 val;
+	int i;
 
 	/*
-	 * Clear perf_enable bit in L3C_PERF_CTRL register to stop counting
-	 * for all enabled counters.
+	 * Check if any counter belongs to the normal range (instead of ext
+	 * range). If so, stop it.
 	 */
-	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
-	val &= ~(L3C_PERF_CTRL_EN);
-	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+	if (bit < L3C_NR_COUNTERS) {
+		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		val &= ~(L3C_PERF_CTRL_EN);
+		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+	}
+
+	/* Likewise, stop each ext range that has a counter in use. */
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		bit = find_next_bit(used_mask, L3C_NR_COUNTERS * (i + 2),
+				    L3C_NR_COUNTERS * (i + 1));
+		if (L3C_CNTR_EXT(bit) != i + 1)
+			continue;
+
+		val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+		val &= ~L3C_PERF_CTRL_EN;
+		writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+	}
 }
 
 static void hisi_l3c_pmu_enable_counter(struct hisi_pmu *l3c_pmu,
@@ -341,7 +450,7 @@ static void hisi_l3c_pmu_enable_counter(struct hisi_pmu *l3c_pmu,
 
 	/* Enable counter index in L3C_EVENT_CTRL register */
 	val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL);
-	val |= (1 << hwc->idx);
+	val |= (1 << L3C_HW_IDX(hwc->idx));
 	hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val);
 }
 
@@ -352,7 +461,7 @@ static void hisi_l3c_pmu_disable_counter(struct hisi_pmu *l3c_pmu,
 
 	/* Clear counter index in L3C_EVENT_CTRL register */
 	val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL);
-	val &= ~(1 << hwc->idx);
+	val &= ~(1 << L3C_HW_IDX(hwc->idx));
 	hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val);
 }
 
@@ -363,7 +472,7 @@ static void hisi_l3c_pmu_enable_counter_int(struct hisi_pmu *l3c_pmu,
 
 	val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK);
 	/* Write 0 to enable interrupt */
-	val &= ~(1 << hwc->idx);
+	val &= ~(1 << L3C_HW_IDX(hwc->idx));
 	hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val);
 }
 
@@ -374,20 +483,34 @@ static void hisi_l3c_pmu_disable_counter_int(struct hisi_pmu *l3c_pmu,
 
 	val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK);
 	/* Write 1 to mask interrupt */
-	val |= (1 << hwc->idx);
+	val |= (1 << L3C_HW_IDX(hwc->idx));
 	hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val);
 }
 
 static u32 hisi_l3c_pmu_get_int_status(struct hisi_pmu *l3c_pmu)
 {
-	return readl(l3c_pmu->base + L3C_INT_STATUS);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	u32 ext_int, status, status_ext = 0;
+	int i;
+
+	status = readl(l3c_pmu->base + L3C_INT_STATUS);
+
+	if (!support_ext(hisi_l3c_pmu))
+		return status;
+
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		ext_int = readl(hisi_l3c_pmu->ext_base[i] + L3C_INT_STATUS);
+		status_ext |= ext_int << (L3C_NR_COUNTERS * i);
+	}
+
+	return status | (status_ext << L3C_NR_COUNTERS);
 }
 
 static void hisi_l3c_pmu_clear_int_status(struct hisi_pmu *l3c_pmu, int idx)
 {
 	struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw;
 
-	hisi_l3c_pmu_event_writel(hwc, L3C_INT_CLEAR, 1 << idx);
+	hisi_l3c_pmu_event_writel(hwc, L3C_INT_CLEAR, 1 << L3C_HW_IDX(idx));
 }
 
 static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
@@ -409,10 +532,6 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
 		return -EINVAL;
 	}
 
-	l3c_pmu->dev_info = device_get_match_data(&pdev->dev);
-	if (!l3c_pmu->dev_info)
-		return -ENODEV;
-
 	l3c_pmu->base = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(l3c_pmu->base)) {
 		dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n");
@@ -424,6 +543,46 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
 	return 0;
 }
 
+static int hisi_l3c_pmu_init_ext(struct hisi_pmu *l3c_pmu, struct platform_device *pdev)
+{
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	int ret, irq, ext_num, i;
+	char *irqname;
+
+	/* An L3C PMU with ext support must have more than one IRQ resource. */
+	ext_num = platform_irq_count(pdev);
+	if (ext_num < 2)
+		return -ENODEV;
+
+	hisi_l3c_pmu->ext_num = ext_num - 1;
+
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		hisi_l3c_pmu->ext_base[i] = devm_platform_ioremap_resource(pdev, i + 1);
+		if (IS_ERR(hisi_l3c_pmu->ext_base[i]))
+			return PTR_ERR(hisi_l3c_pmu->ext_base[i]);
+
+		irq = platform_get_irq(pdev, i + 1);
+		if (irq < 0)
+			return irq;
+
+		irqname = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s ext%d",
+					 dev_name(&pdev->dev), i + 1);
+		if (!irqname)
+			return -ENOMEM;
+
+		ret = devm_request_irq(&pdev->dev, irq, hisi_uncore_pmu_isr,
+				       IRQF_NOBALANCING | IRQF_NO_THREAD,
+				       irqname, l3c_pmu);
+		if (ret < 0)
+			return dev_err_probe(&pdev->dev, ret,
+				"Failed to request EXT IRQ: %d.\n", irq);
+
+		hisi_l3c_pmu->ext_irq[i] = irq;
+	}
+
+	return 0;
+}
+
 static struct attribute *hisi_l3c_pmu_v1_format_attr[] = {
 	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
 	NULL,
@@ -448,6 +607,19 @@ static const struct attribute_group hisi_l3c_pmu_v2_format_group = {
 	.attrs = hisi_l3c_pmu_v2_format_attr,
 };
 
+static struct attribute *hisi_l3c_pmu_v3_format_attr[] = {
+	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
+	HISI_PMU_FORMAT_ATTR(ext, "config:16-17"),
+	HISI_PMU_FORMAT_ATTR(tt_req, "config1:8-10"),
+	HISI_PMU_FORMAT_ATTR(tt_core, "config2:0-15"),
+	NULL
+};
+
+static const struct attribute_group hisi_l3c_pmu_v3_format_group = {
+	.name = "format",
+	.attrs = hisi_l3c_pmu_v3_format_attr,
+};
+
 static struct attribute *hisi_l3c_pmu_v1_events_attr[] = {
 	HISI_PMU_EVENT_ATTR(rd_cpipe,		0x00),
 	HISI_PMU_EVENT_ATTR(wr_cpipe,		0x01),
@@ -483,6 +655,26 @@ static const struct attribute_group hisi_l3c_pmu_v2_events_group = {
 	.attrs = hisi_l3c_pmu_v2_events_attr,
 };
 
+static struct attribute *hisi_l3c_pmu_v3_events_attr[] = {
+	HISI_PMU_EVENT_ATTR(rd_spipe,		0x18),
+	HISI_PMU_EVENT_ATTR(rd_hit_spipe,	0x19),
+	HISI_PMU_EVENT_ATTR(wr_spipe,		0x1a),
+	HISI_PMU_EVENT_ATTR(wr_hit_spipe,	0x1b),
+	HISI_PMU_EVENT_ATTR(io_rd_spipe,	0x1c),
+	HISI_PMU_EVENT_ATTR(io_rd_hit_spipe,	0x1d),
+	HISI_PMU_EVENT_ATTR(io_wr_spipe,	0x1e),
+	HISI_PMU_EVENT_ATTR(io_wr_hit_spipe,	0x1f),
+	HISI_PMU_EVENT_ATTR(cycles,		0x7f),
+	HISI_PMU_EVENT_ATTR(l3c_ref,		0xbc),
+	HISI_PMU_EVENT_ATTR(l3c2ring,		0xbd),
+	NULL
+};
+
+static const struct attribute_group hisi_l3c_pmu_v3_events_group = {
+	.name = "events",
+	.attrs = hisi_l3c_pmu_v3_events_attr,
+};
+
 static const struct attribute_group *hisi_l3c_pmu_v1_attr_groups[] = {
 	&hisi_l3c_pmu_v1_format_group,
 	&hisi_l3c_pmu_v1_events_group,
@@ -499,16 +691,41 @@ static const struct attribute_group *hisi_l3c_pmu_v2_attr_groups[] = {
 	NULL
 };
 
+static const struct attribute_group *hisi_l3c_pmu_v3_attr_groups[] = {
+	&hisi_l3c_pmu_v3_format_group,
+	&hisi_l3c_pmu_v3_events_group,
+	&hisi_pmu_cpumask_attr_group,
+	&hisi_pmu_identifier_group,
+	NULL
+};
+
+static struct hisi_l3c_pmu_ext hisi_l3c_pmu_support_ext = {
+	.support_ext = true,
+};
+
+static struct hisi_l3c_pmu_ext hisi_l3c_pmu_not_support_ext = {
+	.support_ext = false,
+};
+
 static const struct hisi_pmu_dev_info hisi_l3c_pmu_v1 = {
 	.attr_groups = hisi_l3c_pmu_v1_attr_groups,
 	.counter_bits = 48,
 	.check_event = L3C_V1_NR_EVENTS,
+	.private = &hisi_l3c_pmu_not_support_ext,
 };
 
 static const struct hisi_pmu_dev_info hisi_l3c_pmu_v2 = {
 	.attr_groups = hisi_l3c_pmu_v2_attr_groups,
 	.counter_bits = 64,
 	.check_event = L3C_V2_NR_EVENTS,
+	.private = &hisi_l3c_pmu_not_support_ext,
+};
+
+static const struct hisi_pmu_dev_info hisi_l3c_pmu_v3 = {
+	.attr_groups = hisi_l3c_pmu_v3_attr_groups,
+	.counter_bits = 64,
+	.check_event = L3C_V2_NR_EVENTS,
+	.private = &hisi_l3c_pmu_support_ext,
 };
 
 static const struct hisi_uncore_ops hisi_uncore_l3c_ops = {
@@ -526,11 +743,14 @@ static const struct hisi_uncore_ops hisi_uncore_l3c_ops = {
 	.clear_int_status	= hisi_l3c_pmu_clear_int_status,
 	.enable_filter		= hisi_l3c_pmu_enable_filter,
 	.disable_filter		= hisi_l3c_pmu_disable_filter,
+	.check_filter		= hisi_l3c_pmu_check_filter,
 };
 
 static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev,
 				  struct hisi_pmu *l3c_pmu)
 {
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	struct hisi_l3c_pmu_ext *l3c_pmu_dev_ext = l3c_pmu->dev_info->private;
 	int ret;
 
 	ret = hisi_l3c_pmu_init_data(pdev, l3c_pmu);
@@ -549,27 +769,50 @@ static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev,
 	l3c_pmu->dev = &pdev->dev;
 	l3c_pmu->on_cpu = -1;
 
+	if (l3c_pmu_dev_ext->support_ext) {
+		ret = hisi_l3c_pmu_init_ext(l3c_pmu, pdev);
+		if (ret)
+			return ret;
+		/*
+		 * Each ext region has its own counters, as many as the
+		 * normal events have. So at most
+		 * L3C_NR_COUNTERS * (ext_num + 1) events can be monitored.
+		 */
+		l3c_pmu->num_counters += hisi_l3c_pmu->ext_num * L3C_NR_COUNTERS;
+	}
+
 	return 0;
 }
 
 static int hisi_l3c_pmu_probe(struct platform_device *pdev)
 {
+	struct hisi_l3c_pmu *hisi_l3c_pmu;
 	struct hisi_pmu *l3c_pmu;
 	char *name;
 	int ret;
 
-	l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*l3c_pmu), GFP_KERNEL);
-	if (!l3c_pmu)
+	hisi_l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*hisi_l3c_pmu), GFP_KERNEL);
+	if (!hisi_l3c_pmu)
 		return -ENOMEM;
 
+	l3c_pmu = &hisi_l3c_pmu->l3c_pmu;
 	platform_set_drvdata(pdev, l3c_pmu);
 
+	l3c_pmu->dev_info = device_get_match_data(&pdev->dev);
+	if (!l3c_pmu->dev_info)
+		return -ENODEV;
+
 	ret = hisi_l3c_pmu_dev_probe(pdev, l3c_pmu);
 	if (ret)
 		return ret;
 
-	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d",
-			      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id);
+	if (l3c_pmu->topo.sub_id >= 0)
+		name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d_%d",
+				      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id,
+				      l3c_pmu->topo.sub_id);
+	else
+		name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d",
+				      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id);
 	if (!name)
 		return -ENOMEM;
 
@@ -604,6 +847,7 @@ static void hisi_l3c_pmu_remove(struct platform_device *pdev)
 static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = {
 	{ "HISI0213", (kernel_ulong_t)&hisi_l3c_pmu_v1 },
 	{ "HISI0214", (kernel_ulong_t)&hisi_l3c_pmu_v2 },
+	{ "HISI0215", (kernel_ulong_t)&hisi_l3c_pmu_v3 },
 	{}
 };
 MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match);
@@ -618,14 +862,58 @@ static struct platform_driver hisi_l3c_pmu_driver = {
 	.remove = hisi_l3c_pmu_remove,
 };
 
+static int hisi_l3c_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	int ret;
+
+	ret = hisi_uncore_pmu_online_cpu(cpu, node);
+	if (ret)
+		return ret;
+
+	/* Skip ext IRQ migration for an L3C PMU without ext support. */
+	if (!support_ext(hisi_l3c_pmu))
+		return 0;
+
+	for (int i = 0; i < hisi_l3c_pmu->ext_num; i++)
+		WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i],
+					 cpumask_of(l3c_pmu->on_cpu)));
+	return 0;
+}
+
+static int hisi_l3c_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	int ret;
+
+	ret = hisi_uncore_pmu_offline_cpu(cpu, node);
+	if (ret)
+		return ret;
+
+	/* If no available CPU was found (on_cpu is -1), skip irq migration. */
+	if (l3c_pmu->on_cpu < 0)
+		return 0;
+
+	/* Skip ext IRQ migration for an L3C PMU without ext support. */
+	if (!support_ext(hisi_l3c_pmu))
+		return 0;
+
+	for (int i = 0; i < hisi_l3c_pmu->ext_num; i++)
+		WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i],
+					 cpumask_of(l3c_pmu->on_cpu)));
+	return 0;
+}
+
 static int __init hisi_l3c_pmu_module_init(void)
 {
 	int ret;
 
 	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE,
 				      "AP_PERF_ARM_HISI_L3_ONLINE",
-				      hisi_uncore_pmu_online_cpu,
-				      hisi_uncore_pmu_offline_cpu);
+				      hisi_l3c_pmu_online_cpu,
+				      hisi_l3c_pmu_offline_cpu);
 	if (ret) {
 		pr_err("L3C PMU: Error setup hotplug, ret = %d\n", ret);
 		return ret;
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.h b/drivers/perf/hisilicon/hisi_uncore_pmu.h
index 02fa022925d4..0334a797e499 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.h
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.h
@@ -24,7 +24,7 @@
 #define pr_fmt(fmt)     "hisi_pmu: " fmt
 
 #define HISI_PMU_V2		0x30
-#define HISI_MAX_COUNTERS 0x10
+#define HISI_MAX_COUNTERS 0x18
 #define to_hisi_pmu(p)	(container_of(p, struct hisi_pmu, pmu))
 
 #define HISI_PMU_ATTR(_name, _func, _config)				\
-- 
2.33.0




* [PATCH v2 8/9] Documentation: hisi-pmu: Fix of minor format error
  2025-08-21 13:50 [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU Yushan Wang
                   ` (6 preceding siblings ...)
  2025-08-21 13:50 ` [PATCH v2 7/9] drivers/perf: hisi: Add support for L3C PMU v3 Yushan Wang
@ 2025-08-21 13:50 ` Yushan Wang
  2025-08-26 13:21   ` Jonathan Cameron
  2025-08-27  2:15   ` Yicong Yang
  2025-08-21 13:50 ` [PATCH v2 9/9] Documentation: hisi-pmu: Add introduction to HiSilicon Yushan Wang
  8 siblings, 2 replies; 25+ messages in thread
From: Yushan Wang @ 2025-08-21 13:50 UTC (permalink / raw)
  To: will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: robin.murphy, yangyicong, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, wangyushan12, hejunhao3

The inline sysfs path should be placed in a literal block so that the
documentation renders correctly.

Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 Documentation/admin-guide/perf/hisi-pmu.rst | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst
index 48992a0b8e94..a307bce2f5c5 100644
--- a/Documentation/admin-guide/perf/hisi-pmu.rst
+++ b/Documentation/admin-guide/perf/hisi-pmu.rst
@@ -18,9 +18,10 @@ HiSilicon SoC uncore PMU driver
 Each device PMU has separate registers for event counting, control and
 interrupt, and the PMU driver shall register perf PMU drivers like L3C,
 HHA and DDRC etc. The available events and configuration options shall
-be described in the sysfs, see:
+be described in the sysfs, see::
+
+  /sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>
 
-/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>.
 The "perf list" command shall list the available events from sysfs.
 
 Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU
-- 
2.33.0




* [PATCH v2 9/9] Documentation: hisi-pmu: Add introduction to HiSilicon
  2025-08-21 13:50 [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU Yushan Wang
                   ` (7 preceding siblings ...)
  2025-08-21 13:50 ` [PATCH v2 8/9] Documentation: hisi-pmu: Fix of minor format error Yushan Wang
@ 2025-08-21 13:50 ` Yushan Wang
  2025-08-26 13:22   ` Jonathan Cameron
  2025-08-27  2:27   ` Yicong Yang
  8 siblings, 2 replies; 25+ messages in thread
From: Yushan Wang @ 2025-08-21 13:50 UTC (permalink / raw)
  To: will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: robin.murphy, yangyicong, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, wangyushan12, hejunhao3

Some HiSilicon v3 PMU hardware is divided into parts, each monitoring
a specific part of a device.  Add a description of that, as well as of
the newly added ext operand for the L3C PMU.

Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
---
 Documentation/admin-guide/perf/hisi-pmu.rst | 38 +++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst
index a307bce2f5c5..4c7584fe3c1a 100644
--- a/Documentation/admin-guide/perf/hisi-pmu.rst
+++ b/Documentation/admin-guide/perf/hisi-pmu.rst
@@ -12,8 +12,8 @@ The HiSilicon SoC encapsulates multiple CPU and IO dies. Each CPU cluster
 called Super CPU cluster (SCCL) and is made up of 6 CCLs. Each SCCL has
 two HHAs (0 - 1) and four DDRCs (0 - 3), respectively.
 
-HiSilicon SoC uncore PMU driver
--------------------------------
+HiSilicon SoC uncore PMU v1
+---------------------------
 
 Each device PMU has separate registers for event counting, control and
 interrupt, and the PMU driver shall register perf PMU drivers like L3C,
@@ -56,6 +56,9 @@ Example usage of perf::
   $# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5
   $# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5
 
+HiSilicon SoC uncore PMU v2
+---------------------------
+
 For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same
 as PMU v1, but some new functions are added to the hardware.
 
@@ -113,6 +116,37 @@ uring channel. It is 2 bits. Some important codes are as follows:
 - 2'b00: default value, count the events which sent to the both uring and
   uring_ext channel;
 
+HiSilicon SoC uncore PMU v3
+---------------------------
+
+For HiSilicon uncore PMU v3, whose identifier is 0x40, some uncore PMUs are
+further divided into parts for finer-grained tracing. Each part has its own
+dedicated PMU, and all such PMUs together cover the monitoring of events on
+a particular uncore device. Such PMUs are described in sysfs with a slightly
+changed name format::
+
+  /sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}_{Z}/ddrc{Y}_{Z}/noc{Y}_{Z}>
+
+Z is the sub-id, which identifies the PMU for one part of the hardware device.
+
+Usage of most PMUs with different sub-ids is identical. As a special case, the
+L3C PMU provides an ``ext`` operand for even finer-grained statistics. The L3C
+PMU driver uses it as a hint of the destination when delivering a perf
+command to hardware:
+
+- ext=0: Default, can be used with event names.
+- ext=1 and ext=2: Must be used with event codes, event names are not supported.
+
+An example perf command could be::
+
+  $# perf stat -a -e hisi_sccl0_l3c1_0/event=0x1,ext=1/ sleep 5
+
+or::
+
+  $# perf stat -a -e hisi_sccl0_l3c1_0/rd_spipe/ sleep 5
+
+As above, ``hisi_sccl0_l3c1_0`` selects the PMU of L3 cache 1 pipe 0 in SCCL 0.
+
 Users could configure IDs to count data come from specific CCL/ICL, by setting
 srcid_cmd & srcid_msk, and data desitined for specific CCL/ICL by setting
 tgtid_cmd & tgtid_msk. A set bit in srcid_msk/tgtid_msk means the PMU will not
-- 
2.33.0




* Re: [PATCH v2 1/9] drivers/perf: hisi: Relax the event ID check in the framework
  2025-08-21 13:50 ` [PATCH v2 1/9] drivers/perf: hisi: Relax the event ID check in the framework Yushan Wang
@ 2025-08-26 13:03   ` Jonathan Cameron
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2025-08-26 13:03 UTC (permalink / raw)
  To: Yushan Wang
  Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, robin.murphy,
	yangyicong, liuyonglong, wanghuiqiang, prime.zeng, hejunhao3

On Thu, 21 Aug 2025 21:50:41 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> The event ID only uses attr::config bits [7, 0], but we check the
> event range using the whole 64-bit field. This blocks the use of the
> remaining bits of attr::config. Relax the check by only using
> bits [7, 0].
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>

Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>

One comment inline but up to you whether you act on it.

> ---
>  drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +-
>  drivers/perf/hisilicon/hisi_uncore_pmu.h | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> index a449651f79c9..6594d64b03a9 100644
> --- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> @@ -234,7 +234,7 @@ int hisi_uncore_pmu_event_init(struct perf_event *event)
>  		return -EINVAL;
>  
>  	hisi_pmu = to_hisi_pmu(event->pmu);
> -	if (event->attr.config > hisi_pmu->check_event)
> +	if ((event->attr.config & HISI_EVENTID_MASK) > hisi_pmu->check_event)
>  		return -EINVAL;
>  
>  	if (hisi_pmu->on_cpu == -1)
> diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.h b/drivers/perf/hisilicon/hisi_uncore_pmu.h
> index 777675838b80..6186b232f454 100644
> --- a/drivers/perf/hisilicon/hisi_uncore_pmu.h
> +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.h
> @@ -43,7 +43,8 @@
>  		return FIELD_GET(GENMASK_ULL(hi, lo), event->attr.config);  \
>  	}
>  
> -#define HISI_GET_EVENTID(ev) (ev->hw.config_base & 0xff)
> +#define HISI_EVENTID_MASK	0xff

I'd use GENMASK(7, 0) here but this one is obvious enough that it's not important
and clearly you are just moving the definition.
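
For reference, a minimal userspace sketch of what a GENMASK-style
definition evaluates to. The macro below is a hypothetical stand-in
written for illustration, not the kernel's GENMASK() from
<linux/bits.h>, and it assumes 0 <= lo <= hi <= 31:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit mask with bits hi..lo (inclusive) set. */
#define EX_GENMASK_U32(hi, lo) \
	((uint32_t)((UINT32_MAX >> (31 - (hi))) & (UINT32_MAX << (lo))))

/* Bits [7:0] give the same value as the literal 0xff. */
static_assert(EX_GENMASK_U32(7, 0) == 0xffu, "event ID mask is 0xff");

/* Mask an event ID out of a 32-bit config word. */
static inline uint32_t ex_event_id(uint32_t config)
{
	return config & EX_GENMASK_U32(7, 0);
}
```

Both spellings yield the same mask; the GENMASK form only makes the
bit range explicit.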

> +#define HISI_GET_EVENTID(ev) ((ev)->hw.config_base & HISI_EVENTID_MASK)
>  
>  #define HISI_PMU_EVTYPE_BITS		8
>  #define HISI_PMU_EVTYPE_SHIFT(idx)	((idx) % 4 * HISI_PMU_EVTYPE_BITS)




* Re: [PATCH v2 2/9] drivers/perf: hisi: Export hisi_uncore_pmu_isr()
  2025-08-21 13:50 ` [PATCH v2 2/9] drivers/perf: hisi: Export hisi_uncore_pmu_isr() Yushan Wang
@ 2025-08-26 13:03   ` Jonathan Cameron
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2025-08-26 13:03 UTC (permalink / raw)
  To: Yushan Wang
  Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, robin.murphy,
	yangyicong, liuyonglong, wanghuiqiang, prime.zeng, hejunhao3

On Thu, 21 Aug 2025 21:50:42 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> Currently the Uncore PMU framework assumes that one PMU device has only
> one interrupt and registers the interrupt handler on its behalf. It cannot
> support a PMU with multiple interrupt resources.  An uncore PMU may
> have multiple interrupts that can share the same handler.  Export
> hisi_uncore_pmu_isr() to allow drivers to register the irq handler in
> their own routine.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>



* Re: [PATCH v2 3/9] drivers/perf: hisi: Simplify the probe process of each L3C PMU version
  2025-08-21 13:50 ` [PATCH v2 3/9] drivers/perf: hisi: Simplify the probe process of each L3C PMU version Yushan Wang
@ 2025-08-26 13:06   ` Jonathan Cameron
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2025-08-26 13:06 UTC (permalink / raw)
  To: Yushan Wang
  Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, robin.murphy,
	yangyicong, liuyonglong, wanghuiqiang, prime.zeng, hejunhao3

On Thu, 21 Aug 2025 21:50:43 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> Versions 1 and 2 of the L3C PMU also use different HIDs. Make use of
> struct acpi_device_id::driver_data for version-specific information
> rather than checking the version register. This helps to
> simplify the probe process and also makes extension a bit easier.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>





* Re: [PATCH v2 4/9] drivers/perf: hisi: Extract the event filter check of L3C PMU
  2025-08-21 13:50 ` [PATCH v2 4/9] drivers/perf: hisi: Extract the event filter check of L3C PMU Yushan Wang
@ 2025-08-26 13:06   ` Jonathan Cameron
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2025-08-26 13:06 UTC (permalink / raw)
  To: Yushan Wang
  Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, robin.murphy,
	yangyicong, liuyonglong, wanghuiqiang, prime.zeng, hejunhao3

On Thu, 21 Aug 2025 21:50:44 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> The L3C PMU has 4 filter options which share perf_event_attr::config1.
> The driver checks config1 to see whether a certain event has a filter
> setting. That would be incorrect if we made use of other bits in config1
> for non-filter options. So check directly whether each filter option is
> set, in a separate function instead.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>
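
For illustration, the per-field check described in the commit message could
look like the minimal user-space sketch below. The bit positions of tt_req,
datasrc_cfg and datasrc_skt are taken from the
HISI_PMU_EVENT_ATTR_EXTRACTOR() lines quoted in patch 7, and tt_core is
assumed to still sit in config1 bits [7, 0] at this point in the series.
field() and l3c_has_filter() are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract bits [hi, lo] of a 64-bit config word (hypothetical helper). */
static inline uint64_t field(uint64_t cfg, unsigned int hi, unsigned int lo)
{
	assert(hi >= lo && hi < 64);

	uint64_t width = hi - lo + 1;
	uint64_t mask = width == 64 ? ~0ULL : (1ULL << width) - 1;

	return (cfg >> lo) & mask;
}

/*
 * Report whether any of the four filter options is set in config1.
 * Bits outside these fields are ignored on purpose, so they stay
 * free for non-filter options.
 */
static inline bool l3c_has_filter(uint64_t config1)
{
	return field(config1, 7, 0)   ||	/* tt_core (assumed position) */
	       field(config1, 10, 8)  ||	/* tt_req */
	       field(config1, 15, 11) ||	/* datasrc_cfg */
	       field(config1, 16, 16);		/* datasrc_skt */
}
```

The point of checking field by field is that a bit such as config1[17] no
longer counts as "a filter is set".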



* Re: [PATCH v2 5/9] drivers/perf: hisi: Extend the field of tt_core
  2025-08-21 13:50 ` [PATCH v2 5/9] drivers/perf: hisi: Extend the field of tt_core Yushan Wang
@ 2025-08-26 13:07   ` Jonathan Cameron
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2025-08-26 13:07 UTC (permalink / raw)
  To: Yushan Wang
  Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, robin.murphy,
	yangyicong, liuyonglong, wanghuiqiang, prime.zeng, hejunhao3

On Thu, 21 Aug 2025 21:50:45 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> Currently tt_core uses config1 bits [7, 0] and cannot be extended.
> On some platforms more than 8 CPUs share the L3 cache. So make
> tt_core use config2 bits [15, 0]; the remaining bits in config2
> are reserved for extension.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>



* Re: [PATCH v2 6/9] drivers/perf: hisi: Refactor the event configuration of L3C PMU
  2025-08-21 13:50 ` [PATCH v2 6/9] drivers/perf: hisi: Refactor the event configuration of L3C PMU Yushan Wang
@ 2025-08-26 13:08   ` Jonathan Cameron
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2025-08-26 13:08 UTC (permalink / raw)
  To: Yushan Wang
  Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, robin.murphy,
	yangyicong, liuyonglong, wanghuiqiang, prime.zeng, hejunhao3

On Thu, 21 Aug 2025 21:50:46 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> The event register is configured using hisi_pmu::base directly since
> only one address space is supported for the L3C PMU. This needs to be
> extended if the event configuration is located in a different address
> space. To prepare for such hardware, extract the event register
> configuration into separate functions that use
> hw_perf_event::event_base as each event's base address. Implement a
> private hisi_uncore_ops::get_event_idx() callback to initialize
> event_base in addition to getting the hardware index.
> 
> No functional changes intended.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>



* Re: [PATCH v2 7/9] drivers/perf: hisi: Add support for L3C PMU v3
  2025-08-21 13:50 ` [PATCH v2 7/9] drivers/perf: hisi: Add support for L3C PMU v3 Yushan Wang
@ 2025-08-26 13:12   ` Jonathan Cameron
  2025-08-27  6:21     ` wangyushan
  2025-08-27  3:43   ` Yicong Yang
  1 sibling, 1 reply; 25+ messages in thread
From: Jonathan Cameron @ 2025-08-26 13:12 UTC (permalink / raw)
  To: Yushan Wang
  Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, robin.murphy,
	yangyicong, liuyonglong, wanghuiqiang, prime.zeng, hejunhao3

On Thu, 21 Aug 2025 21:50:47 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> This patch adds support for L3C PMU v3. The v3 L3C PMU supports
> an extended events space which can be controlled in up to 2 extra
> address spaces with separate overflow interrupts. The layout
> of the control/event registers is kept the same. The extended events
> together with the original ones cover the monitoring of all
> transactions on the L3C.
> 
> The extended events are specified with the `ext=[1|2]` option for
> the driver to distinguish, like below:
> 
> perf stat -e hisi_sccl0_l3c0_0/event=<event_id>,ext=1/
> 
> Currently only the event option uses config bits [7, 0]. There's
> still plenty of unused space. Make ext use config bits [17, 16] and
> reserve bits [15, 8] for the event option for future extension.
> 
> With the extra counters, the number of counters for a HiSilicon
> uncore PMU can reach up to 24; the usedmap is extended accordingly.
> 
> The hw_perf_event::event_base is initialized to the base MMIO
> address of the event and will be used for later control,
> overflow handling and counts readout.
> 
> We still make use of the Uncore PMU framework for handling the
> events and interrupt migration on CPU hotplug. The framework's
> cpuhp callback will handle the event migration and interrupt
> migration of the original events. If the PMU supports extended
> events, their interrupts are migrated to the same CPU chosen by
> the framework.
> 
> A new HID of HISI0215 is used for this version of L3C PMU.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Co-developed-by: Yushan Wang <wangyushan12@huawei.com>
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
One minor formatting thing I missed in internal reviews. With that
tidied up (check other patches for this as well)

Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>
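
As a rough illustration of the config layout described in the commit
message (event in bits [7, 0], ext in bits [17, 16]), here is a small
user-space sketch. The macro and function names are made up for this
example and are not the driver's; only the bit positions come from the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions as described in the commit message. */
#define L3C_EVENT_SHIFT	0	/* event: config bits [7, 0] */
#define L3C_EVENT_MASK	0xffULL
#define L3C_EXT_SHIFT	16	/* ext: config bits [17, 16] */
#define L3C_EXT_MASK	0x3ULL

/* Pack an event ID and ext selector into a raw config value (hypothetical). */
static inline uint64_t l3c_pack_config(unsigned int event, unsigned int ext)
{
	assert(event <= L3C_EVENT_MASK && ext <= 2);

	return ((uint64_t)event << L3C_EVENT_SHIFT) |
	       ((uint64_t)ext << L3C_EXT_SHIFT);
}

/* Recover the ext selector from a raw config value. */
static inline unsigned int l3c_config_ext(uint64_t config)
{
	return (config >> L3C_EXT_SHIFT) & L3C_EXT_MASK;
}
```

With this layout, `event=0x1,ext=1` corresponds to a raw config value of
0x10001.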

>  
>  static void hisi_l3c_pmu_stop_counters(struct hisi_pmu *l3c_pmu)
>  {
> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
> +	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
> +	unsigned long bit = find_first_bit(used_mask, l3c_pmu->num_counters);
>  	u32 val;
> +	int i;
>  
>  	/*
> -	 * Clear perf_enable bit in L3C_PERF_CTRL register to stop counting
> -	 * for all enabled counters.
> +	 * Check if any counter belongs to the normal range (instead of ext
> +	 * range). If so, stop it.
>  	 */
> -	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
> -	val &= ~(L3C_PERF_CTRL_EN);
> -	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
> +	if (bit < L3C_NR_COUNTERS) {
> +		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
> +		val &= ~(L3C_PERF_CTRL_EN);

Brackets not adding anything here and inconsistently applied.
Please clean these up.

> +		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
> +	}
> +
> +	/* If not, do stop it on ext ranges. */
> +	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
> +		bit = find_next_bit(used_mask, L3C_NR_COUNTERS * (i + 2),
> +				    L3C_NR_COUNTERS * (i + 1));
> +		if (L3C_CNTR_EXT(bit) != i + 1)
> +			continue;
> +
> +		val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
> +		val &= ~L3C_PERF_CTRL_EN;
> +		writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
> +	}
>  }
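
To spell out the index arithmetic the function above relies on, here is a
minimal sketch. It assumes 8 counters per region (inferred from "up to 24"
counters with at most 2 ext regions) and uses a plain 32-bit mask with
shift-and-mask in place of the kernel's find_next_bit(); HW_IDX, CNTR_EXT
and region_in_use() are illustrative stand-ins, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_COUNTERS	8	/* assumed counters per region */

/* Hardware counter index inside its region (mirrors L3C_HW_IDX). */
#define HW_IDX(idx)	((idx) % NR_COUNTERS)
/* Region of a counter: 0 = normal, 1..2 = ext (mirrors L3C_CNTR_EXT). */
#define CNTR_EXT(idx)	((idx) / NR_COUNTERS)

/* True if any counter of the given region is set in the used mask. */
static inline bool region_in_use(uint32_t used_mask, unsigned int region)
{
	assert(region <= 2);

	return (used_mask >> (region * NR_COUNTERS)) & ((1u << NR_COUNTERS) - 1);
}
```

Under these assumptions, global index 17 is hardware counter 1 of ext
region 2, so only that region's L3C_PERF_CTRL would need to be touched.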




* Re: [PATCH v2 8/9] Documentation: hisi-pmu: Fix of minor format error
  2025-08-21 13:50 ` [PATCH v2 8/9] Documentation: hisi-pmu: Fix of minor format error Yushan Wang
@ 2025-08-26 13:21   ` Jonathan Cameron
  2025-08-27  2:15   ` Yicong Yang
  1 sibling, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2025-08-26 13:21 UTC (permalink / raw)
  To: Yushan Wang
  Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, robin.murphy,
	yangyicong, liuyonglong, wanghuiqiang, prime.zeng, hejunhao3

On Thu, 21 Aug 2025 21:50:48 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> The inline sysfs path should be placed in a literal block so that
> the documentation renders better.
> 
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>



* Re: [PATCH v2 9/9] Documentation: hisi-pmu: Add introduction to HiSilicon
  2025-08-21 13:50 ` [PATCH v2 9/9] Documentation: hisi-pmu: Add introduction to HiSilicon Yushan Wang
@ 2025-08-26 13:22   ` Jonathan Cameron
  2025-08-27  2:27   ` Yicong Yang
  1 sibling, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2025-08-26 13:22 UTC (permalink / raw)
  To: Yushan Wang
  Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, robin.murphy,
	yangyicong, liuyonglong, wanghuiqiang, prime.zeng, hejunhao3

On Thu, 21 Aug 2025 21:50:49 +0800
Yushan Wang <wangyushan12@huawei.com> wrote:

> Some HiSilicon v3 PMU hardware is divided into parts, each monitoring
> a specific part of a device.  Add a description of that, as well as
> of the newly added ext operand for the L3C PMU.
> 
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>



* Re: [PATCH v2 8/9] Documentation: hisi-pmu: Fix of minor format error
  2025-08-21 13:50 ` [PATCH v2 8/9] Documentation: hisi-pmu: Fix of minor format error Yushan Wang
  2025-08-26 13:21   ` Jonathan Cameron
@ 2025-08-27  2:15   ` Yicong Yang
  1 sibling, 0 replies; 25+ messages in thread
From: Yicong Yang @ 2025-08-27  2:15 UTC (permalink / raw)
  To: Yushan Wang, will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: yangyicong, robin.murphy, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, hejunhao3

On 2025/8/21 21:50, Yushan Wang wrote:
> The inline sysfs path should be placed in a literal block so that
> the documentation renders better.
> 
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>

Acked-by: Yicong Yang <yangyicong@hisilicon.com>

> ---
>  Documentation/admin-guide/perf/hisi-pmu.rst | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst
> index 48992a0b8e94..a307bce2f5c5 100644
> --- a/Documentation/admin-guide/perf/hisi-pmu.rst
> +++ b/Documentation/admin-guide/perf/hisi-pmu.rst
> @@ -18,9 +18,10 @@ HiSilicon SoC uncore PMU driver
>  Each device PMU has separate registers for event counting, control and
>  interrupt, and the PMU driver shall register perf PMU drivers like L3C,
>  HHA and DDRC etc. The available events and configuration options shall
> -be described in the sysfs, see:
> +be described in the sysfs, see::
> +
> +/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>
>  
> -/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>.
>  The "perf list" command shall list the available events from sysfs.
>  
>  Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU
> 



* Re: [PATCH v2 9/9] Documentation: hisi-pmu: Add introduction to HiSilicon
  2025-08-21 13:50 ` [PATCH v2 9/9] Documentation: hisi-pmu: Add introduction to HiSilicon Yushan Wang
  2025-08-26 13:22   ` Jonathan Cameron
@ 2025-08-27  2:27   ` Yicong Yang
  2025-08-27  7:22     ` wangyushan
  1 sibling, 1 reply; 25+ messages in thread
From: Yicong Yang @ 2025-08-27  2:27 UTC (permalink / raw)
  To: Yushan Wang, will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: yangyicong, robin.murphy, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, hejunhao3

Hi Yushan,

The subject seems to be truncated. Should it read as below?

Documentation: hisi-pmu: Add introduction to HiSilicon v3 PMU

Other comments inline. Sorry for the late reply.

On 2025/8/21 21:50, Yushan Wang wrote:
> Some HiSilicon v3 PMU hardware is divided into parts, each monitoring
> a specific part of a device.  Add a description of that, as well as
> of the newly added ext operand for the L3C PMU.
> 
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
> ---
>  Documentation/admin-guide/perf/hisi-pmu.rst | 38 +++++++++++++++++++--
>  1 file changed, 36 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst
> index a307bce2f5c5..4c7584fe3c1a 100644
> --- a/Documentation/admin-guide/perf/hisi-pmu.rst
> +++ b/Documentation/admin-guide/perf/hisi-pmu.rst
> @@ -12,8 +12,8 @@ The HiSilicon SoC encapsulates multiple CPU and IO dies. Each CPU cluster
>  called Super CPU cluster (SCCL) and is made up of 6 CCLs. Each SCCL has
>  two HHAs (0 - 1) and four DDRCs (0 - 3), respectively.
>  
> -HiSilicon SoC uncore PMU driver
> --------------------------------
> +HiSilicon SoC uncore PMU v1

These new sections (here and below) will break the ordered list of the options. It should not
be necessary to split the document by version; just add the new options in the current style
and mention the version that introduced them.

> +---------------------------
>  
>  Each device PMU has separate registers for event counting, control and
>  interrupt, and the PMU driver shall register perf PMU drivers like L3C,
> @@ -56,6 +56,9 @@ Example usage of perf::
>    $# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5
>    $# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5
>  
> +HiSilicon SoC uncore PMU v2
> +----------------------------------
> +
>  For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same
>  as PMU v1, but some new functions are added to the hardware.
>  
> @@ -113,6 +116,37 @@ uring channel. It is 2 bits. Some important codes are as follows:
>  - 2'b00: default value, count the events which sent to the both uring and
>    uring_ext channel;
>  
> +HiSilicon SoC uncore PMU v3
> +----------------------------------
> +
> +For HiSilicon uncore PMU v3 whose identifier is 0x40, some uncore PMUs are
> +further divided into parts for finer granularity of tracing, each part has its
> +own dedicated PMU, and all such PMUs together cover the monitoring job of events
> +on particular uncore device. Such PMUs are described in sysfs with name format
> +slightly changed::
> +
> +/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}_{Z}/ddrc{Y}_{Z}/noc{Y}_{Z}>
> +
> +Z is the sub-id, indicating different PMUs for part of hardware device.
> +
> +Usage of most PMUs with different sub-ids is identical. In particular, the L3C
> +PMU provides an ``ext`` operand to allow exploration of even finer-grained
> +statistics of the L3C PMU. The L3C PMU driver uses it as a hint of the
> +destination when delivering a perf command to hardware:
> +
> +- ext=0: Default, could be used with event names.
> +- ext=1 and ext=2: Must be used with event codes, event names are not supported.
> +
> +An example of perf command could be::
> +
> +  $# perf stat -a -e hisi_sccl0_l3c1_0/event=0x1,ext=1/ sleep 5
> +
> +or::
> +
> +  $# perf stat -a -e hisi_sccl0_l3c1_0/rd_spipe/ sleep 5
> +
> +As above, ``hisi_sccl0_l3c1_0`` locates PMU on CPU cluster 0, L3 cache 1 pipe0.

This isn't correct. sccl0 indicates Super CPU Cluster 0, which is already
described in the document.
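
To make the naming concrete, here is a small sketch of how a name such as
hisi_sccl0_l3c1_0 decomposes, following the sysfs pattern
hisi_sccl{X}_l3c{Y}_{Z} from the patch: X is the SCCL ID, Y the L3C index
and Z the sub-id. The function name and argument order are assumptions for
illustration only. The fallback to the two-field name for a negative sub-id
follows the `sub_id >= 0` check in patch 7.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build a PMU name from topology IDs (hypothetical helper). A negative
 * sub_id means the PMU is not split into parts, so the two-field name
 * is used. Returns the snprintf() result.
 */
static inline int l3c_pmu_name(char *buf, size_t len,
			       int sccl_id, int l3c_id, int sub_id)
{
	assert(buf && len > 0);

	if (sub_id >= 0)
		return snprintf(buf, len, "hisi_sccl%d_l3c%d_%d",
				sccl_id, l3c_id, sub_id);

	return snprintf(buf, len, "hisi_sccl%d_l3c%d", sccl_id, l3c_id);
}
```

So hisi_sccl0_l3c1_0 would read as SCCL 0, L3C 1, sub-id 0.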

thanks.




* Re: [PATCH v2 7/9] drivers/perf: hisi: Add support for L3C PMU v3
  2025-08-21 13:50 ` [PATCH v2 7/9] drivers/perf: hisi: Add support for L3C PMU v3 Yushan Wang
  2025-08-26 13:12   ` Jonathan Cameron
@ 2025-08-27  3:43   ` Yicong Yang
  2025-08-27  7:07     ` wangyushan
  1 sibling, 1 reply; 25+ messages in thread
From: Yicong Yang @ 2025-08-27  3:43 UTC (permalink / raw)
  To: Yushan Wang, will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: yangyicong, robin.murphy, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, hejunhao3, Linuxarm

On 2025/8/21 21:50, Yushan Wang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> This patch adds support for L3C PMU v3. The v3 L3C PMU supports
> an extended events space which can be controlled in up to 2 extra
> address spaces with separate overflow interrupts. The layout
> of the control/event registers is kept the same. The extended events
> together with the original ones cover the monitoring of all
> transactions on the L3C.
> 
> The extended events are specified with the `ext=[1|2]` option for
> the driver to distinguish, like below:
> 
> perf stat -e hisi_sccl0_l3c0_0/event=<event_id>,ext=1/
> 
> Currently only the event option uses config bits [7, 0]. There's
> still plenty of unused space. Make ext use config bits [17, 16] and
> reserve bits [15, 8] for the event option for future extension.
> 
> With the extra counters, the number of counters for a HiSilicon
> uncore PMU can reach up to 24; the usedmap is extended accordingly.
> 
> The hw_perf_event::event_base is initialized to the base MMIO
> address of the event and will be used for later control,
> overflow handling and counts readout.
> 
> We still make use of the Uncore PMU framework for handling the
> events and interrupt migration on CPU hotplug. The framework's
> cpuhp callback will handle the event migration and interrupt
> migration of the original events. If the PMU supports extended
> events, their interrupts are migrated to the same CPU chosen by
> the framework.
> 
> A new HID of HISI0215 is used for this version of L3C PMU.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Co-developed-by: Yushan Wang <wangyushan12@huawei.com>
> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>

Some minor comments inline. Sorry again for not raising them on the previous version.
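
For readers following along, the extended usedmap described in the commit
message can be pictured with the sketch below. It assumes 8 counters per
region and uses a plain 32-bit mask with a linear scan where the driver
uses find_next_zero_bit(); alloc_counter() is a hypothetical name, and it
returns -1 where the driver returns -EAGAIN.

```c
#include <assert.h>
#include <stdint.h>

#define NR_COUNTERS	8	/* assumed counters per region */
#define MAX_EXT		2	/* up to 2 extra regions, per the commit message */

/*
 * Claim the first free counter in the region selected by ext
 * (0 = normal, 1..MAX_EXT = extended). Returns the global index,
 * or -1 if the region is full.
 */
static inline int alloc_counter(uint32_t *used_mask, unsigned int ext)
{
	unsigned int base = ext * NR_COUNTERS;
	unsigned int i;

	assert(ext <= MAX_EXT);

	for (i = base; i < base + NR_COUNTERS; i++) {
		if (!(*used_mask & (1u << i))) {
			*used_mask |= 1u << i;
			return (int)i;
		}
	}

	return -1;
}
```

Because each region owns a fixed slice of the mask, the returned index
alone is enough to tell a normal event from an extended one.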

> ---
>  drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 352 +++++++++++++++++--
>  drivers/perf/hisilicon/hisi_uncore_pmu.h     |   2 +-
>  2 files changed, 321 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
> index 7928b9bb3e7e..95136c01f17b 100644
> --- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
> @@ -55,24 +55,85 @@
>  #define L3C_V1_NR_EVENTS	0x59
>  #define L3C_V2_NR_EVENTS	0xFF
>  
> +#define L3C_MAX_EXT		2
> +
> +HISI_PMU_EVENT_ATTR_EXTRACTOR(ext, config, 17, 16);
>  HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_req, config1, 10, 8);
>  HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_cfg, config1, 15, 11);
>  HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_skt, config1, 16, 16);
>  HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config2, 15, 0);
>  
> +struct hisi_l3c_pmu {
> +	struct hisi_pmu l3c_pmu;
> +
> +	/* MMIO and IRQ resources for extension events */
> +	void __iomem *ext_base[L3C_MAX_EXT];
> +	int ext_irq[L3C_MAX_EXT];
> +	int ext_num;
> +};
> +
> +#define to_hisi_l3c_pmu(_l3c_pmu) \
> +	container_of(_l3c_pmu, struct hisi_l3c_pmu, l3c_pmu)
> +
> +/*
> + * The hardware counter idx used in counter enable/disable,
> + * interrupt enable/disable and status check, etc.
> + */
> +#define L3C_HW_IDX(_idx)		((_idx) % L3C_NR_COUNTERS)
> +
> +/* The ext resource number to which a hardware counter belongs. */
> +#define L3C_CNTR_EXT(_idx)		((_idx) / L3C_NR_COUNTERS)
> +
> +struct hisi_l3c_pmu_ext {
> +	bool support_ext;
> +};
> +
> +static bool support_ext(struct hisi_l3c_pmu *pmu)
> +{
> +	struct hisi_l3c_pmu_ext *l3c_pmu_ext = pmu->l3c_pmu.dev_info->private;
> +
> +	return l3c_pmu_ext->support_ext;
> +}
> +
>  static int hisi_l3c_pmu_get_event_idx(struct perf_event *event)
>  {
>  	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>  	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
> -	u32 num_counters = l3c_pmu->num_counters;
> +	int ext = hisi_get_ext(event);
>  	int idx;
>  
> -	idx = find_first_zero_bit(used_mask, num_counters);
> -	if (idx == num_counters)
> +	/*
> +	 * For an L3C PMU that supports extension events, we can monitor
> +	 * maximum 2 * num_counters to 3 * num_counters events, depending on
> +	 * the number of ext regions supported by hardware. Thus use bit
> +	 * [0, num_counters - 1] for normal events and bit
> +	 * [ext * num_counters, (ext + 1) * num_counters - 1] for extension
> +	 * events. The idx allocation will keep unchanged for normal events and
> +	 * we can also use the idx to distinguish whether it's an extension
> +	 * event or not.
> +	 *
> +	 * Since normal events and extension events locates on the different
> +	 * address space, save the base address to the event->hw.event_base.
> +	 */
> +	if (ext) {
> +		if (!support_ext(hisi_l3c_pmu))
> +			return -EOPNOTSUPP;
> +
> +		event->hw.event_base = (unsigned long)hisi_l3c_pmu->ext_base[ext - 1];
> +		idx = find_next_zero_bit(used_mask, L3C_NR_COUNTERS * (ext + 1),
> +					 L3C_NR_COUNTERS * ext);
> +	} else {
> +		event->hw.event_base = (unsigned long)l3c_pmu->base;
> +		idx = find_next_zero_bit(used_mask, L3C_NR_COUNTERS, 0);
> +	}
> +
> +	if (idx >= L3C_NR_COUNTERS * (ext + 1))
>  		return -EAGAIN;
>  
>  	set_bit(idx, used_mask);
> -	event->hw.event_base = (unsigned long)l3c_pmu->base;
> +
> +	WARN_ON(idx < L3C_NR_COUNTERS * ext || idx >= L3C_NR_COUNTERS * (ext + 1));
>  
>  	return idx;
>  }
> @@ -143,7 +204,7 @@ static void hisi_l3c_pmu_write_ds(struct perf_event *event, u32 ds_cfg)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
>  	u32 reg, reg_idx, shift, val;
> -	int idx = hwc->idx;
> +	int idx = L3C_HW_IDX(hwc->idx);
>  
>  	/*
>  	 * Select the appropriate datasource register(L3C_DATSRC_TYPE0/1).
> @@ -264,12 +325,23 @@ static void hisi_l3c_pmu_disable_filter(struct perf_event *event)
>  	}
>  }
>  
> +static int hisi_l3c_pmu_check_filter(struct perf_event *event)
> +{
> +	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
> +	int ext = hisi_get_ext(event);
> +
> +	if (ext < 0 || ext > hisi_l3c_pmu->ext_num)
> +		return -EINVAL;

Add a blank line here; the same applies in other places.

> +	return 0;
> +}
> +
>  /*
>   * Select the counter register offset using the counter index
>   */
>  static u32 hisi_l3c_pmu_get_counter_offset(int cntr_idx)
>  {
> -	return (L3C_CNTR0_LOWER + (cntr_idx * 8));
> +	return (L3C_CNTR0_LOWER + (L3C_HW_IDX(cntr_idx) * 8));
>  }
>  
>  static u64 hisi_l3c_pmu_read_counter(struct hisi_pmu *l3c_pmu,
> @@ -290,6 +362,8 @@ static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
>  	struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw;
>  	u32 reg, reg_idx, shift, val;
>  
> +	idx = L3C_HW_IDX(idx);
> +
>  	/*
>  	 * Select the appropriate event select register(L3C_EVENT_TYPE0/1).
>  	 * There are 2 event select registers for the 8 hardware counters.
> @@ -310,28 +384,63 @@ static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
>  
>  static void hisi_l3c_pmu_start_counters(struct hisi_pmu *l3c_pmu)
>  {
> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
> +	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
> +	unsigned long bit = find_first_bit(used_mask, l3c_pmu->num_counters);
>  	u32 val;
> +	int i;
>  
>  	/*
> -	 * Set perf_enable bit in L3C_PERF_CTRL register to start counting
> -	 * for all enabled counters.
> +	 * Check if any counter belongs to the normal range (instead of ext
> +	 * range). If so, enable it.
>  	 */
> -	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
> -	val |= L3C_PERF_CTRL_EN;
> -	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
> +	if (bit < L3C_NR_COUNTERS) {
> +		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
> +		val |= L3C_PERF_CTRL_EN;
> +		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
> +	}
> +
> +	/* If not, do enable it on ext ranges. */
> +	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
> +		bit = find_next_bit(used_mask, L3C_NR_COUNTERS * (i + 2),
> +				    L3C_NR_COUNTERS * (i + 1));
> +		if (L3C_CNTR_EXT(bit) == i + 1) {
> +			val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
> +			val |= L3C_PERF_CTRL_EN;
> +			writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
> +		}
> +	}
>  }
>  
>  static void hisi_l3c_pmu_stop_counters(struct hisi_pmu *l3c_pmu)
>  {
> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
> +	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
> +	unsigned long bit = find_first_bit(used_mask, l3c_pmu->num_counters);
>  	u32 val;
> +	int i;
>  
>  	/*
> -	 * Clear perf_enable bit in L3C_PERF_CTRL register to stop counting
> -	 * for all enabled counters.
> +	 * Check if any counter belongs to the normal range (instead of ext
> +	 * range). If so, stop it.
>  	 */
> -	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
> -	val &= ~(L3C_PERF_CTRL_EN);
> -	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
> +	if (bit < L3C_NR_COUNTERS) {
> +		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
> +		val &= ~(L3C_PERF_CTRL_EN);
> +		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
> +	}
> +
> +	/* If not, do stop it on ext ranges. */
> +	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
> +		bit = find_next_bit(used_mask, L3C_NR_COUNTERS * (i + 2),
> +				    L3C_NR_COUNTERS * (i + 1));
> +		if (L3C_CNTR_EXT(bit) != i + 1)
> +			continue;
> +
> +		val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
> +		val &= ~L3C_PERF_CTRL_EN;
> +		writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
> +	}
>  }
>  
>  static void hisi_l3c_pmu_enable_counter(struct hisi_pmu *l3c_pmu,
> @@ -341,7 +450,7 @@ static void hisi_l3c_pmu_enable_counter(struct hisi_pmu *l3c_pmu,
>  
>  	/* Enable counter index in L3C_EVENT_CTRL register */
>  	val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL);
> -	val |= (1 << hwc->idx);
> +	val |= (1 << L3C_HW_IDX(hwc->idx));
>  	hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val);
>  }
>  
> @@ -352,7 +461,7 @@ static void hisi_l3c_pmu_disable_counter(struct hisi_pmu *l3c_pmu,
>  
>  	/* Clear counter index in L3C_EVENT_CTRL register */
>  	val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL);
> -	val &= ~(1 << hwc->idx);
> +	val &= ~(1 << L3C_HW_IDX(hwc->idx));
>  	hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val);
>  }
>  
> @@ -363,7 +472,7 @@ static void hisi_l3c_pmu_enable_counter_int(struct hisi_pmu *l3c_pmu,
>  
>  	val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK);
>  	/* Write 0 to enable interrupt */
> -	val &= ~(1 << hwc->idx);
> +	val &= ~(1 << L3C_HW_IDX(hwc->idx));
>  	hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val);
>  }
>  
> @@ -374,20 +483,34 @@ static void hisi_l3c_pmu_disable_counter_int(struct hisi_pmu *l3c_pmu,
>  
>  	val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK);
>  	/* Write 1 to mask interrupt */
> -	val |= (1 << hwc->idx);
> +	val |= (1 << L3C_HW_IDX(hwc->idx));
>  	hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val);
>  }
>  
>  static u32 hisi_l3c_pmu_get_int_status(struct hisi_pmu *l3c_pmu)
>  {
> -	return readl(l3c_pmu->base + L3C_INT_STATUS);
> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
> +	u32 ext_int, status, status_ext = 0;
> +	int i;
> +
> +	status = readl(l3c_pmu->base + L3C_INT_STATUS);
> +
> +	if (!support_ext(hisi_l3c_pmu))
> +		return status;
> +
> +	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
> +		ext_int = readl(hisi_l3c_pmu->ext_base[i] + L3C_INT_STATUS);
> +		status_ext |= ext_int << (L3C_NR_COUNTERS * i);
> +	}
> +
> +	return status | (status_ext << L3C_NR_COUNTERS);
>  }
>  
>  static void hisi_l3c_pmu_clear_int_status(struct hisi_pmu *l3c_pmu, int idx)
>  {
>  	struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw;
>  
> -	hisi_l3c_pmu_event_writel(hwc, L3C_INT_CLEAR, 1 << idx);
> +	hisi_l3c_pmu_event_writel(hwc, L3C_INT_CLEAR, 1 << L3C_HW_IDX(idx));
>  }
>  
>  static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
> @@ -409,10 +532,6 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
>  		return -EINVAL;
>  	}
>  
> -	l3c_pmu->dev_info = device_get_match_data(&pdev->dev);
> -	if (!l3c_pmu->dev_info)
> -		return -ENODEV;
> -
>  	l3c_pmu->base = devm_platform_ioremap_resource(pdev, 0);
>  	if (IS_ERR(l3c_pmu->base)) {
>  		dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n");
> @@ -424,6 +543,46 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
>  	return 0;
>  }
>  
> +static int hisi_l3c_pmu_init_ext(struct hisi_pmu *l3c_pmu, struct platform_device *pdev)
> +{
> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
> +	int ret, irq, ext_num, i;
> +	char *irqname;
> +
> +	/* HiSilicon L3C PMU ext should have more than 1 irq resources. */
> +	ext_num = platform_irq_count(pdev);
> +	if (ext_num < 2)

This would be more readable using L3C_MAX_EXT:

if (ext_num < L3C_MAX_EXT)
	return -ENODEV;

> +		return -ENODEV;
> +
> +	hisi_l3c_pmu->ext_num = ext_num - 1;
> +
> +	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
> +		hisi_l3c_pmu->ext_base[i] = devm_platform_ioremap_resource(pdev, i + 1);
> +		if (IS_ERR(hisi_l3c_pmu->ext_base[i]))
> +			return PTR_ERR(hisi_l3c_pmu->ext_base[i]);
> +
> +		irq = platform_get_irq(pdev, i + 1);
> +		if (irq < 0)
> +			return irq;
> +
> +		irqname = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s ext%d",
> +					 dev_name(&pdev->dev), i + 1);
> +		if (!irqname)
> +			return -ENOMEM;
> +
> +		ret = devm_request_irq(&pdev->dev, irq, hisi_uncore_pmu_isr,
> +				       IRQF_NOBALANCING | IRQF_NO_THREAD,
> +				       irqname, l3c_pmu);
> +		if (ret < 0)
> +			return dev_err_probe(&pdev->dev, ret,
> +				"Fail to request EXT IRQ: %d.\n", irq);
> +
> +		hisi_l3c_pmu->ext_irq[i] = irq;
> +	}
> +
> +	return 0;
> +}
> +
>  static struct attribute *hisi_l3c_pmu_v1_format_attr[] = {
>  	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
>  	NULL,
> @@ -448,6 +607,19 @@ static const struct attribute_group hisi_l3c_pmu_v2_format_group = {
>  	.attrs = hisi_l3c_pmu_v2_format_attr,
>  };
>  
> +static struct attribute *hisi_l3c_pmu_v3_format_attr[] = {
> +	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
> +	HISI_PMU_FORMAT_ATTR(ext, "config:16-17"),
> +	HISI_PMU_FORMAT_ATTR(tt_req, "config1:8-10"),
> +	HISI_PMU_FORMAT_ATTR(tt_core, "config2:0-15"),
> +	NULL
> +};
> +
> +static const struct attribute_group hisi_l3c_pmu_v3_format_group = {
> +	.name = "format",
> +	.attrs = hisi_l3c_pmu_v3_format_attr,
> +};
> +
>  static struct attribute *hisi_l3c_pmu_v1_events_attr[] = {
>  	HISI_PMU_EVENT_ATTR(rd_cpipe,		0x00),
>  	HISI_PMU_EVENT_ATTR(wr_cpipe,		0x01),
> @@ -483,6 +655,26 @@ static const struct attribute_group hisi_l3c_pmu_v2_events_group = {
>  	.attrs = hisi_l3c_pmu_v2_events_attr,
>  };
>  
> +static struct attribute *hisi_l3c_pmu_v3_events_attr[] = {
> +	HISI_PMU_EVENT_ATTR(rd_spipe,		0x18),
> +	HISI_PMU_EVENT_ATTR(rd_hit_spipe,	0x19),
> +	HISI_PMU_EVENT_ATTR(wr_spipe,		0x1a),
> +	HISI_PMU_EVENT_ATTR(wr_hit_spipe,	0x1b),
> +	HISI_PMU_EVENT_ATTR(io_rd_spipe,	0x1c),
> +	HISI_PMU_EVENT_ATTR(io_rd_hit_spipe,	0x1d),
> +	HISI_PMU_EVENT_ATTR(io_wr_spipe,	0x1e),
> +	HISI_PMU_EVENT_ATTR(io_wr_hit_spipe,	0x1f),
> +	HISI_PMU_EVENT_ATTR(cycles,		0x7f),
> +	HISI_PMU_EVENT_ATTR(l3c_ref,		0xbc),
> +	HISI_PMU_EVENT_ATTR(l3c2ring,		0xbd),
> +	NULL
> +};
> +
> +static const struct attribute_group hisi_l3c_pmu_v3_events_group = {
> +	.name = "events",
> +	.attrs = hisi_l3c_pmu_v3_events_attr,
> +};
> +
>  static const struct attribute_group *hisi_l3c_pmu_v1_attr_groups[] = {
>  	&hisi_l3c_pmu_v1_format_group,
>  	&hisi_l3c_pmu_v1_events_group,
> @@ -499,16 +691,41 @@ static const struct attribute_group *hisi_l3c_pmu_v2_attr_groups[] = {
>  	NULL
>  };
>  
> +static const struct attribute_group *hisi_l3c_pmu_v3_attr_groups[] = {
> +	&hisi_l3c_pmu_v3_format_group,
> +	&hisi_l3c_pmu_v3_events_group,
> +	&hisi_pmu_cpumask_attr_group,
> +	&hisi_pmu_identifier_group,
> +	NULL
> +};
> +
> +static struct hisi_l3c_pmu_ext hisi_l3c_pmu_support_ext = {
> +	.support_ext = true,
> +};
> +
> +static struct hisi_l3c_pmu_ext hisi_l3c_pmu_not_support_ext = {
> +	.support_ext = false,
> +};
> +
>  static const struct hisi_pmu_dev_info hisi_l3c_pmu_v1 = {
>  	.attr_groups = hisi_l3c_pmu_v1_attr_groups,
>  	.counter_bits = 48,
>  	.check_event = L3C_V1_NR_EVENTS,
> +	.private = &hisi_l3c_pmu_not_support_ext,
>  };
>  
>  static const struct hisi_pmu_dev_info hisi_l3c_pmu_v2 = {
>  	.attr_groups = hisi_l3c_pmu_v2_attr_groups,
>  	.counter_bits = 64,
>  	.check_event = L3C_V2_NR_EVENTS,
> +	.private = &hisi_l3c_pmu_not_support_ext,
> +};
> +
> +static const struct hisi_pmu_dev_info hisi_l3c_pmu_v3 = {
> +	.attr_groups = hisi_l3c_pmu_v3_attr_groups,
> +	.counter_bits = 64,
> +	.check_event = L3C_V2_NR_EVENTS,
> +	.private = &hisi_l3c_pmu_support_ext,
>  };
>  
>  static const struct hisi_uncore_ops hisi_uncore_l3c_ops = {
> @@ -526,11 +743,14 @@ static const struct hisi_uncore_ops hisi_uncore_l3c_ops = {
>  	.clear_int_status	= hisi_l3c_pmu_clear_int_status,
>  	.enable_filter		= hisi_l3c_pmu_enable_filter,
>  	.disable_filter		= hisi_l3c_pmu_disable_filter,
> +	.check_filter		= hisi_l3c_pmu_check_filter,
>  };
>  
>  static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev,
>  				  struct hisi_pmu *l3c_pmu)
>  {
> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
> +	struct hisi_l3c_pmu_ext *l3c_pmu_dev_ext = l3c_pmu->dev_info->private;
>  	int ret;
>  
>  	ret = hisi_l3c_pmu_init_data(pdev, l3c_pmu);
> @@ -549,27 +769,50 @@ static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev,
>  	l3c_pmu->dev = &pdev->dev;
>  	l3c_pmu->on_cpu = -1;
>  
> +	if (l3c_pmu_dev_ext->support_ext) {
> +		ret = hisi_l3c_pmu_init_ext(l3c_pmu, pdev);
> +		if (ret)
> +			return ret;
> +		/*
> +		 * The extension events have their own counters with the
> +		 * same number of the normal events counters. So we can
> +		 * have at maximum num_counters * ext events monitored.
> +		 */
> +		l3c_pmu->num_counters += hisi_l3c_pmu->ext_num * L3C_NR_COUNTERS;
> +	}
> +
>  	return 0;
>  }
>  
>  static int hisi_l3c_pmu_probe(struct platform_device *pdev)
>  {
> +	struct hisi_l3c_pmu *hisi_l3c_pmu;
>  	struct hisi_pmu *l3c_pmu;
>  	char *name;
>  	int ret;
>  
> -	l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*l3c_pmu), GFP_KERNEL);
> -	if (!l3c_pmu)
> +	hisi_l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*hisi_l3c_pmu), GFP_KERNEL);
> +	if (!hisi_l3c_pmu)
>  		return -ENOMEM;
>  
> +	l3c_pmu = &hisi_l3c_pmu->l3c_pmu;
>  	platform_set_drvdata(pdev, l3c_pmu);
>  
> +	l3c_pmu->dev_info = device_get_match_data(&pdev->dev);
> +	if (!l3c_pmu->dev_info)
> +		return -ENODEV;
> +

any reason to move this out of hisi_l3c_pmu_init_data()? it looks unnecessary.

>  	ret = hisi_l3c_pmu_dev_probe(pdev, l3c_pmu);
>  	if (ret)
>  		return ret;
>  
> -	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d",
> -			      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id);
> +	if (l3c_pmu->topo.sub_id >= 0)
> +		name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d_%d",
> +				      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id,
> +				      l3c_pmu->topo.sub_id);
> +	else
> +		name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d",
> +				      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id);
>  	if (!name)
>  		return -ENOMEM;
>  
> @@ -604,6 +847,7 @@ static void hisi_l3c_pmu_remove(struct platform_device *pdev)
>  static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = {
>  	{ "HISI0213", (kernel_ulong_t)&hisi_l3c_pmu_v1 },
>  	{ "HISI0214", (kernel_ulong_t)&hisi_l3c_pmu_v2 },
> +	{ "HISI0215", (kernel_ulong_t)&hisi_l3c_pmu_v3 },
>  	{}
>  };
>  MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match);
> @@ -618,14 +862,58 @@ static struct platform_driver hisi_l3c_pmu_driver = {
>  	.remove = hisi_l3c_pmu_remove,
>  };
>  
> +static int hisi_l3c_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
> +	int ret;
> +
> +	ret = hisi_uncore_pmu_online_cpu(cpu, node);
> +	if (ret)
> +		return ret;
> +
> +	/* Avoid L3C pmu not supporting ext from ext irq migrating. */
> +	if (!support_ext(hisi_l3c_pmu))
> +		return 0;
> +
> +	for (int i = 0; i < hisi_l3c_pmu->ext_num; i++)
> +		WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i],
> +					 cpumask_of(l3c_pmu->on_cpu)));
> +	return 0;
> +}
> +
> +static int hisi_l3c_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
> +	int ret;
> +
> +	ret = hisi_uncore_pmu_offline_cpu(cpu, node);
> +	if (ret)
> +		return ret;
> +
> +	/* If failed to find any available CPU, skip irq migration. */
> +	if (l3c_pmu->on_cpu <= 0)

0 should be a valid cpu for migration, am I missing something here?

thanks.

> +		return 0;
> +
> +	/* Avoid L3C pmu not supporting ext from ext irq migrating. */
> +	if (!support_ext(hisi_l3c_pmu))
> +		return 0;
> +
> +	for (int i = 0; i < hisi_l3c_pmu->ext_num; i++)
> +		WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i],
> +					 cpumask_of(l3c_pmu->on_cpu)));
> +	return 0;
> +}
> +
>  static int __init hisi_l3c_pmu_module_init(void)
>  {
>  	int ret;
>  
>  	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE,
>  				      "AP_PERF_ARM_HISI_L3_ONLINE",
> -				      hisi_uncore_pmu_online_cpu,
> -				      hisi_uncore_pmu_offline_cpu);
> +				      hisi_l3c_pmu_online_cpu,
> +				      hisi_l3c_pmu_offline_cpu);
>  	if (ret) {
>  		pr_err("L3C PMU: Error setup hotplug, ret = %d\n", ret);
>  		return ret;
> diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.h b/drivers/perf/hisilicon/hisi_uncore_pmu.h
> index 02fa022925d4..0334a797e499 100644
> --- a/drivers/perf/hisilicon/hisi_uncore_pmu.h
> +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.h
> @@ -24,7 +24,7 @@
>  #define pr_fmt(fmt)     "hisi_pmu: " fmt
>  
>  #define HISI_PMU_V2		0x30
> -#define HISI_MAX_COUNTERS 0x10
> +#define HISI_MAX_COUNTERS 0x18
>  #define to_hisi_pmu(p)	(container_of(p, struct hisi_pmu, pmu))
>  
>  #define HISI_PMU_ATTR(_name, _func, _config)				\
> 



* Re: [PATCH v2 7/9] drivers/perf: hisi: Add support for L3C PMU v3
  2025-08-26 13:12   ` Jonathan Cameron
@ 2025-08-27  6:21     ` wangyushan
  0 siblings, 0 replies; 25+ messages in thread
From: wangyushan @ 2025-08-27  6:21 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, robin.murphy,
	yangyicong, liuyonglong, wanghuiqiang, prime.zeng, hejunhao3



On 8/26/2025 9:12 PM, Jonathan Cameron wrote:
> On Thu, 21 Aug 2025 21:50:47 +0800
> Yushan Wang <wangyushan12@huawei.com> wrote:
>
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> This patch adds support for L3C PMU v3. The v3 L3C PMU supports
>> an extended events space which can be controlled in up to 2 extra
>> address spaces with separate overflow interrupts. The layout
>> of the control/event registers are kept the same. The extended events
>> with original ones together cover the monitoring job of all transactions
>> on L3C.
>>
>> The extended events is specified with `ext=[1|2]` option for the
>> driver to distinguish, like below:
>>
>> perf stat -e hisi_sccl0_l3c0_0/event=<event_id>,ext=1/
>>
>> Currently only event option using config bit [7, 0]. There's
>> still plenty unused space. Make ext using config [16, 17] and
>> reserve bit [15, 8] for event option for future extension.
>>
>> With the capability of extra counters, number of counters for HiSilicon
>> uncore PMU could reach up to 24, the usedmap is extended accordingly.
>>
>> The hw_perf_event::event_base is initialized to the base MMIO
>> address of the event and will be used for later control,
>> overflow handling and counts readout.
>>
>> We still make use of the Uncore PMU framework for handling the
>> events and interrupt migration on CPU hotplug. The framework's
>> cpuhp callback will handle the event migration and interrupt
>> migration of orginial event, if PMU supports extended events
>> then the interrupt of extended events is migrated to the same
>> CPU choosed by the framework.
>>
>> A new HID of HISI0215 is used for this version of L3C PMU.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> Co-developed-by: Yushan Wang <wangyushan12@huawei.com>
>> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
> One minor formatting thing I missed in internal reviews. With that
> tidied up (check other patches for this as well)
>
> Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>
>>   
>>   static void hisi_l3c_pmu_stop_counters(struct hisi_pmu *l3c_pmu)
>>   {
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>> +	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
>> +	unsigned long bit = find_first_bit(used_mask, l3c_pmu->num_counters);
>>   	u32 val;
>> +	int i;
>>   
>>   	/*
>> -	 * Clear perf_enable bit in L3C_PERF_CTRL register to stop counting
>> -	 * for all enabled counters.
>> +	 * Check if any counter belongs to the normal range (instead of ext
>> +	 * range). If so, stop it.
>>   	 */
>> -	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
>> -	val &= ~(L3C_PERF_CTRL_EN);
>> -	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
>> +	if (bit < L3C_NR_COUNTERS) {
>> +		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
>> +		val &= ~(L3C_PERF_CTRL_EN);
> Brackets not adding anything here and inconsistently applied.
> Please clean these up.

Sure, sorry for the noise.

Will fix that and look for other similar issues in the patches.

Thanks!
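
For readers following the quoted hunk, the index scheme it relies on can be
sketched in plain userspace C. This is only an illustration, not part of the
patch: L3C_NR_COUNTERS is assumed to be 8 (the driver comments mention 8
hardware counters), and l3c_combine_status() is a hypothetical helper that
mirrors the shape of hisi_l3c_pmu_get_int_status().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: the quoted comments mention 8 hardware counters per region. */
#define L3C_NR_COUNTERS		8
#define L3C_MAX_EXT		2

/* Same arithmetic as the macros in the quoted patch. */
#define L3C_HW_IDX(_idx)	((_idx) % L3C_NR_COUNTERS)
#define L3C_CNTR_EXT(_idx)	((_idx) / L3C_NR_COUNTERS)

/*
 * Hypothetical helper: fold the per-region interrupt status words into one
 * bitmap indexed by the global counter idx. Bits [7:0] come from the normal
 * region, bits [15:8] from ext 1 and bits [23:16] from ext 2.
 */
static uint32_t l3c_combine_status(uint32_t normal, const uint32_t *ext,
				   int ext_num)
{
	uint32_t status = normal;
	int i;

	assert(ext_num >= 0 && ext_num <= L3C_MAX_EXT);

	for (i = 0; i < ext_num; i++)
		status |= ext[i] << (L3C_NR_COUNTERS * (i + 1));

	return status;
}
```

With this layout a global idx of 17 means hardware counter 1 in ext region 2,
which is why the start/stop paths only need to know whether any bit is set
inside a given region's range before touching that region's L3C_PERF_CTRL.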

>> +		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
>> +	}
>> +
>> +	/* If not, do stop it on ext ranges. */
>> +	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
>> +		bit = find_next_bit(used_mask, L3C_NR_COUNTERS * (i + 2),
>> +				    L3C_NR_COUNTERS * (i + 1));
>> +		if (L3C_CNTR_EXT(bit) != i + 1)
>> +			continue;
>> +
>> +		val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
>> +		val &= ~L3C_PERF_CTRL_EN;
>> +		writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
>> +	}
>>   }



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 7/9] drivers/perf: hisi: Add support for L3C PMU v3
  2025-08-27  3:43   ` Yicong Yang
@ 2025-08-27  7:07     ` wangyushan
  0 siblings, 0 replies; 25+ messages in thread
From: wangyushan @ 2025-08-27  7:07 UTC (permalink / raw)
  To: Yicong Yang
  Cc: robin.murphy, Jonathan.Cameron, liuyonglong, wanghuiqiang,
	linux-kernel, prime.zeng, hejunhao3, Linuxarm, will,
	linux-arm-kernel, mark.rutland



On 8/27/2025 11:43 AM, Yicong Yang wrote:
> On 2025/8/21 21:50, Yushan Wang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> This patch adds support for L3C PMU v3. The v3 L3C PMU supports
>> an extended events space which can be controlled in up to 2 extra
>> address spaces with separate overflow interrupts. The layout
>> of the control/event registers are kept the same. The extended events
>> with original ones together cover the monitoring job of all transactions
>> on L3C.
>>
>> The extended events is specified with `ext=[1|2]` option for the
>> driver to distinguish, like below:
>>
>> perf stat -e hisi_sccl0_l3c0_0/event=<event_id>,ext=1/
>>
>> Currently only event option using config bit [7, 0]. There's
>> still plenty unused space. Make ext using config [16, 17] and
>> reserve bit [15, 8] for event option for future extension.
>>
>> With the capability of extra counters, number of counters for HiSilicon
>> uncore PMU could reach up to 24, the usedmap is extended accordingly.
>>
>> The hw_perf_event::event_base is initialized to the base MMIO
>> address of the event and will be used for later control,
>> overflow handling and counts readout.
>>
>> We still make use of the Uncore PMU framework for handling the
>> events and interrupt migration on CPU hotplug. The framework's
>> cpuhp callback will handle the event migration and interrupt
>> migration of orginial event, if PMU supports extended events
>> then the interrupt of extended events is migrated to the same
>> CPU choosed by the framework.
>>
>> A new HID of HISI0215 is used for this version of L3C PMU.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> Co-developed-by: Yushan Wang <wangyushan12@huawei.com>
>> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
> some minor comments inline. sorry again for not raising them in previous version.
>
>> ---
>>   drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 352 +++++++++++++++++--
>>   drivers/perf/hisilicon/hisi_uncore_pmu.h     |   2 +-
>>   2 files changed, 321 insertions(+), 33 deletions(-)
>>
>> diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
>> index 7928b9bb3e7e..95136c01f17b 100644
>> --- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
>> +++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
>> @@ -55,24 +55,85 @@
>>   #define L3C_V1_NR_EVENTS	0x59
>>   #define L3C_V2_NR_EVENTS	0xFF
>>   
>> +#define L3C_MAX_EXT		2
>> +
>> +HISI_PMU_EVENT_ATTR_EXTRACTOR(ext, config, 17, 16);
>>   HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_req, config1, 10, 8);
>>   HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_cfg, config1, 15, 11);
>>   HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_skt, config1, 16, 16);
>>   HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config2, 15, 0);
>>   
>> +struct hisi_l3c_pmu {
>> +	struct hisi_pmu l3c_pmu;
>> +
>> +	/* MMIO and IRQ resources for extension events */
>> +	void __iomem *ext_base[L3C_MAX_EXT];
>> +	int ext_irq[L3C_MAX_EXT];
>> +	int ext_num;
>> +};
>> +
>> +#define to_hisi_l3c_pmu(_l3c_pmu) \
>> +	container_of(_l3c_pmu, struct hisi_l3c_pmu, l3c_pmu)
>> +
>> +/*
>> + * The hardware counter idx used in counter enable/disable,
>> + * interrupt enable/disable and status check, etc.
>> + */
>> +#define L3C_HW_IDX(_idx)		((_idx) % L3C_NR_COUNTERS)
>> +
>> +/* The ext resource number to which a hardware counter belongs. */
>> +#define L3C_CNTR_EXT(_idx)		((_idx) / L3C_NR_COUNTERS)
>> +
>> +struct hisi_l3c_pmu_ext {
>> +	bool support_ext;
>> +};
>> +
>> +static bool support_ext(struct hisi_l3c_pmu *pmu)
>> +{
>> +	struct hisi_l3c_pmu_ext *l3c_pmu_ext = pmu->l3c_pmu.dev_info->private;
>> +
>> +	return l3c_pmu_ext->support_ext;
>> +}
>> +
>>   static int hisi_l3c_pmu_get_event_idx(struct perf_event *event)
>>   {
>>   	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>>   	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
>> -	u32 num_counters = l3c_pmu->num_counters;
>> +	int ext = hisi_get_ext(event);
>>   	int idx;
>>   
>> -	idx = find_first_zero_bit(used_mask, num_counters);
>> -	if (idx == num_counters)
>> +	/*
>> +	 * For an L3C PMU that supports extension events, we can monitor
>> +	 * maximum 2 * num_counters to 3 * num_counters events, depending on
>> +	 * the number of ext regions supported by hardware. Thus use bit
>> +	 * [0, num_counters - 1] for normal events and bit
>> +	 * [ext * num_counters, (ext + 1) * num_counters - 1] for extension
>> +	 * events. The idx allocation will keep unchanged for normal events and
>> +	 * we can also use the idx to distinguish whether it's an extension
>> +	 * event or not.
>> +	 *
>> +	 * Since normal events and extension events locates on the different
>> +	 * address space, save the base address to the event->hw.event_base.
>> +	 */
>> +	if (ext) {
>> +		if (!support_ext(hisi_l3c_pmu))
>> +			return -EOPNOTSUPP;
>> +
>> +		event->hw.event_base = (unsigned long)hisi_l3c_pmu->ext_base[ext - 1];
>> +		idx = find_next_zero_bit(used_mask, L3C_NR_COUNTERS * (ext + 1),
>> +					 L3C_NR_COUNTERS * ext);
>> +	} else {
>> +		event->hw.event_base = (unsigned long)l3c_pmu->base;
>> +		idx = find_next_zero_bit(used_mask, L3C_NR_COUNTERS, 0);
>> +	}
>> +
>> +	if (idx >= L3C_NR_COUNTERS * (ext + 1))
>>   		return -EAGAIN;
>>   
>>   	set_bit(idx, used_mask);
>> -	event->hw.event_base = (unsigned long)l3c_pmu->base;
>> +
>> +	WARN_ON(idx < L3C_NR_COUNTERS * ext || idx >= L3C_NR_COUNTERS * (ext + 1));
>>   
>>   	return idx;
>>   }
>> @@ -143,7 +204,7 @@ static void hisi_l3c_pmu_write_ds(struct perf_event *event, u32 ds_cfg)
>>   {
>>   	struct hw_perf_event *hwc = &event->hw;
>>   	u32 reg, reg_idx, shift, val;
>> -	int idx = hwc->idx;
>> +	int idx = L3C_HW_IDX(hwc->idx);
>>   
>>   	/*
>>   	 * Select the appropriate datasource register(L3C_DATSRC_TYPE0/1).
>> @@ -264,12 +325,23 @@ static void hisi_l3c_pmu_disable_filter(struct perf_event *event)
>>   	}
>>   }
>>   
>> +static int hisi_l3c_pmu_check_filter(struct perf_event *event)
>> +{
>> +	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>> +	int ext = hisi_get_ext(event);
>> +
>> +	if (ext < 0 || ext > hisi_l3c_pmu->ext_num)
>> +		return -EINVAL;
> newline here. similar to other places.
>
>> +	return 0;
>> +}
>> +
>>   /*
>>    * Select the counter register offset using the counter index
>>    */
>>   static u32 hisi_l3c_pmu_get_counter_offset(int cntr_idx)
>>   {
>> -	return (L3C_CNTR0_LOWER + (cntr_idx * 8));
>> +	return (L3C_CNTR0_LOWER + (L3C_HW_IDX(cntr_idx) * 8));
>>   }
>>   
>>   static u64 hisi_l3c_pmu_read_counter(struct hisi_pmu *l3c_pmu,
>> @@ -290,6 +362,8 @@ static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
>>   	struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw;
>>   	u32 reg, reg_idx, shift, val;
>>   
>> +	idx = L3C_HW_IDX(idx);
>> +
>>   	/*
>>   	 * Select the appropriate event select register(L3C_EVENT_TYPE0/1).
>>   	 * There are 2 event select registers for the 8 hardware counters.
>> @@ -310,28 +384,63 @@ static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
>>   
>>   static void hisi_l3c_pmu_start_counters(struct hisi_pmu *l3c_pmu)
>>   {
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>> +	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
>> +	unsigned long bit = find_first_bit(used_mask, l3c_pmu->num_counters);
>>   	u32 val;
>> +	int i;
>>   
>>   	/*
>> -	 * Set perf_enable bit in L3C_PERF_CTRL register to start counting
>> -	 * for all enabled counters.
>> +	 * Check if any counter belongs to the normal range (instead of ext
>> +	 * range). If so, enable it.
>>   	 */
>> -	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
>> -	val |= L3C_PERF_CTRL_EN;
>> -	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
>> +	if (bit < L3C_NR_COUNTERS) {
>> +		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
>> +		val |= L3C_PERF_CTRL_EN;
>> +		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
>> +	}
>> +
>> +	/* If not, do enable it on ext ranges. */
>> +	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
>> +		bit = find_next_bit(used_mask, L3C_NR_COUNTERS * (i + 2),
>> +				    L3C_NR_COUNTERS * (i + 1));
>> +		if (L3C_CNTR_EXT(bit) == i + 1) {
>> +			val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
>> +			val |= L3C_PERF_CTRL_EN;
>> +			writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
>> +		}
>> +	}
>>   }
>>   
>>   static void hisi_l3c_pmu_stop_counters(struct hisi_pmu *l3c_pmu)
>>   {
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>> +	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
>> +	unsigned long bit = find_first_bit(used_mask, l3c_pmu->num_counters);
>>   	u32 val;
>> +	int i;
>>   
>>   	/*
>> -	 * Clear perf_enable bit in L3C_PERF_CTRL register to stop counting
>> -	 * for all enabled counters.
>> +	 * Check if any counter belongs to the normal range (instead of ext
>> +	 * range). If so, stop it.
>>   	 */
>> -	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
>> -	val &= ~(L3C_PERF_CTRL_EN);
>> -	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
>> +	if (bit < L3C_NR_COUNTERS) {
>> +		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
>> +		val &= ~(L3C_PERF_CTRL_EN);
>> +		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
>> +	}
>> +
>> +	/* If not, do stop it on ext ranges. */
>> +	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
>> +		bit = find_next_bit(used_mask, L3C_NR_COUNTERS * (i + 2),
>> +				    L3C_NR_COUNTERS * (i + 1));
>> +		if (L3C_CNTR_EXT(bit) != i + 1)
>> +			continue;
>> +
>> +		val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
>> +		val &= ~L3C_PERF_CTRL_EN;
>> +		writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
>> +	}
>>   }
>>   
>>   static void hisi_l3c_pmu_enable_counter(struct hisi_pmu *l3c_pmu,
>> @@ -341,7 +450,7 @@ static void hisi_l3c_pmu_enable_counter(struct hisi_pmu *l3c_pmu,
>>   
>>   	/* Enable counter index in L3C_EVENT_CTRL register */
>>   	val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL);
>> -	val |= (1 << hwc->idx);
>> +	val |= (1 << L3C_HW_IDX(hwc->idx));
>>   	hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val);
>>   }
>>   
>> @@ -352,7 +461,7 @@ static void hisi_l3c_pmu_disable_counter(struct hisi_pmu *l3c_pmu,
>>   
>>   	/* Clear counter index in L3C_EVENT_CTRL register */
>>   	val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL);
>> -	val &= ~(1 << hwc->idx);
>> +	val &= ~(1 << L3C_HW_IDX(hwc->idx));
>>   	hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val);
>>   }
>>   
>> @@ -363,7 +472,7 @@ static void hisi_l3c_pmu_enable_counter_int(struct hisi_pmu *l3c_pmu,
>>   
>>   	val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK);
>>   	/* Write 0 to enable interrupt */
>> -	val &= ~(1 << hwc->idx);
>> +	val &= ~(1 << L3C_HW_IDX(hwc->idx));
>>   	hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val);
>>   }
>>   
>> @@ -374,20 +483,34 @@ static void hisi_l3c_pmu_disable_counter_int(struct hisi_pmu *l3c_pmu,
>>   
>>   	val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK);
>>   	/* Write 1 to mask interrupt */
>> -	val |= (1 << hwc->idx);
>> +	val |= (1 << L3C_HW_IDX(hwc->idx));
>>   	hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val);
>>   }
>>   
>>   static u32 hisi_l3c_pmu_get_int_status(struct hisi_pmu *l3c_pmu)
>>   {
>> -	return readl(l3c_pmu->base + L3C_INT_STATUS);
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>> +	u32 ext_int, status, status_ext = 0;
>> +	int i;
>> +
>> +	status = readl(l3c_pmu->base + L3C_INT_STATUS);
>> +
>> +	if (!support_ext(hisi_l3c_pmu))
>> +		return status;
>> +
>> +	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
>> +		ext_int = readl(hisi_l3c_pmu->ext_base[i] + L3C_INT_STATUS);
>> +		status_ext |= ext_int << (L3C_NR_COUNTERS * i);
>> +	}
>> +
>> +	return status | (status_ext << L3C_NR_COUNTERS);
>>   }
>>   
>>   static void hisi_l3c_pmu_clear_int_status(struct hisi_pmu *l3c_pmu, int idx)
>>   {
>>   	struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw;
>>   
>> -	hisi_l3c_pmu_event_writel(hwc, L3C_INT_CLEAR, 1 << idx);
>> +	hisi_l3c_pmu_event_writel(hwc, L3C_INT_CLEAR, 1 << L3C_HW_IDX(idx));
>>   }
>>   
>>   static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
>> @@ -409,10 +532,6 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
>>   		return -EINVAL;
>>   	}
>>   
>> -	l3c_pmu->dev_info = device_get_match_data(&pdev->dev);
>> -	if (!l3c_pmu->dev_info)
>> -		return -ENODEV;
>> -
>>   	l3c_pmu->base = devm_platform_ioremap_resource(pdev, 0);
>>   	if (IS_ERR(l3c_pmu->base)) {
>>   		dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n");
>> @@ -424,6 +543,46 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
>>   	return 0;
>>   }
>>   
>> +static int hisi_l3c_pmu_init_ext(struct hisi_pmu *l3c_pmu, struct platform_device *pdev)
>> +{
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>> +	int ret, irq, ext_num, i;
>> +	char *irqname;
>> +
>> +	/* HiSilicon L3C PMU ext should have more than 1 irq resources. */
>> +	ext_num = platform_irq_count(pdev);
>> +	if (ext_num < 2)
> will be more readable using L3C_MAX_EXT:
>
> if (ext_num < L3C_MAX_EXT)
> 	return -ENODEV;
>
>> +		return -ENODEV;
>> +
>> +	hisi_l3c_pmu->ext_num = ext_num - 1;
>> +
>> +	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
>> +		hisi_l3c_pmu->ext_base[i] = devm_platform_ioremap_resource(pdev, i + 1);
>> +		if (IS_ERR(hisi_l3c_pmu->ext_base[i]))
>> +			return PTR_ERR(hisi_l3c_pmu->ext_base[i]);
>> +
>> +		irq = platform_get_irq(pdev, i + 1);
>> +		if (irq < 0)
>> +			return irq;
>> +
>> +		irqname = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s ext%d",
>> +					 dev_name(&pdev->dev), i + 1);
>> +		if (!irqname)
>> +			return -ENOMEM;
>> +
>> +		ret = devm_request_irq(&pdev->dev, irq, hisi_uncore_pmu_isr,
>> +				       IRQF_NOBALANCING | IRQF_NO_THREAD,
>> +				       irqname, l3c_pmu);
>> +		if (ret < 0)
>> +			return dev_err_probe(&pdev->dev, ret,
>> +				"Fail to request EXT IRQ: %d.\n", irq);
>> +
>> +		hisi_l3c_pmu->ext_irq[i] = irq;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>   static struct attribute *hisi_l3c_pmu_v1_format_attr[] = {
>>   	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
>>   	NULL,
>> @@ -448,6 +607,19 @@ static const struct attribute_group hisi_l3c_pmu_v2_format_group = {
>>   	.attrs = hisi_l3c_pmu_v2_format_attr,
>>   };
>>   
>> +static struct attribute *hisi_l3c_pmu_v3_format_attr[] = {
>> +	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
>> +	HISI_PMU_FORMAT_ATTR(ext, "config:16-17"),
>> +	HISI_PMU_FORMAT_ATTR(tt_req, "config1:8-10"),
>> +	HISI_PMU_FORMAT_ATTR(tt_core, "config2:0-15"),
>> +	NULL
>> +};
>> +
>> +static const struct attribute_group hisi_l3c_pmu_v3_format_group = {
>> +	.name = "format",
>> +	.attrs = hisi_l3c_pmu_v3_format_attr,
>> +};
>> +
>>   static struct attribute *hisi_l3c_pmu_v1_events_attr[] = {
>>   	HISI_PMU_EVENT_ATTR(rd_cpipe,		0x00),
>>   	HISI_PMU_EVENT_ATTR(wr_cpipe,		0x01),
>> @@ -483,6 +655,26 @@ static const struct attribute_group hisi_l3c_pmu_v2_events_group = {
>>   	.attrs = hisi_l3c_pmu_v2_events_attr,
>>   };
>>   
>> +static struct attribute *hisi_l3c_pmu_v3_events_attr[] = {
>> +	HISI_PMU_EVENT_ATTR(rd_spipe,		0x18),
>> +	HISI_PMU_EVENT_ATTR(rd_hit_spipe,	0x19),
>> +	HISI_PMU_EVENT_ATTR(wr_spipe,		0x1a),
>> +	HISI_PMU_EVENT_ATTR(wr_hit_spipe,	0x1b),
>> +	HISI_PMU_EVENT_ATTR(io_rd_spipe,	0x1c),
>> +	HISI_PMU_EVENT_ATTR(io_rd_hit_spipe,	0x1d),
>> +	HISI_PMU_EVENT_ATTR(io_wr_spipe,	0x1e),
>> +	HISI_PMU_EVENT_ATTR(io_wr_hit_spipe,	0x1f),
>> +	HISI_PMU_EVENT_ATTR(cycles,		0x7f),
>> +	HISI_PMU_EVENT_ATTR(l3c_ref,		0xbc),
>> +	HISI_PMU_EVENT_ATTR(l3c2ring,		0xbd),
>> +	NULL
>> +};
>> +
>> +static const struct attribute_group hisi_l3c_pmu_v3_events_group = {
>> +	.name = "events",
>> +	.attrs = hisi_l3c_pmu_v3_events_attr,
>> +};
>> +
>>   static const struct attribute_group *hisi_l3c_pmu_v1_attr_groups[] = {
>>   	&hisi_l3c_pmu_v1_format_group,
>>   	&hisi_l3c_pmu_v1_events_group,
>> @@ -499,16 +691,41 @@ static const struct attribute_group *hisi_l3c_pmu_v2_attr_groups[] = {
>>   	NULL
>>   };
>>   
>> +static const struct attribute_group *hisi_l3c_pmu_v3_attr_groups[] = {
>> +	&hisi_l3c_pmu_v3_format_group,
>> +	&hisi_l3c_pmu_v3_events_group,
>> +	&hisi_pmu_cpumask_attr_group,
>> +	&hisi_pmu_identifier_group,
>> +	NULL
>> +};
>> +
>> +static struct hisi_l3c_pmu_ext hisi_l3c_pmu_support_ext = {
>> +	.support_ext = true,
>> +};
>> +
>> +static struct hisi_l3c_pmu_ext hisi_l3c_pmu_not_support_ext = {
>> +	.support_ext = false,
>> +};
>> +
>>   static const struct hisi_pmu_dev_info hisi_l3c_pmu_v1 = {
>>   	.attr_groups = hisi_l3c_pmu_v1_attr_groups,
>>   	.counter_bits = 48,
>>   	.check_event = L3C_V1_NR_EVENTS,
>> +	.private = &hisi_l3c_pmu_not_support_ext,
>>   };
>>   
>>   static const struct hisi_pmu_dev_info hisi_l3c_pmu_v2 = {
>>   	.attr_groups = hisi_l3c_pmu_v2_attr_groups,
>>   	.counter_bits = 64,
>>   	.check_event = L3C_V2_NR_EVENTS,
>> +	.private = &hisi_l3c_pmu_not_support_ext,
>> +};
>> +
>> +static const struct hisi_pmu_dev_info hisi_l3c_pmu_v3 = {
>> +	.attr_groups = hisi_l3c_pmu_v3_attr_groups,
>> +	.counter_bits = 64,
>> +	.check_event = L3C_V2_NR_EVENTS,
>> +	.private = &hisi_l3c_pmu_support_ext,
>>   };
>>   
>>   static const struct hisi_uncore_ops hisi_uncore_l3c_ops = {
>> @@ -526,11 +743,14 @@ static const struct hisi_uncore_ops hisi_uncore_l3c_ops = {
>>   	.clear_int_status	= hisi_l3c_pmu_clear_int_status,
>>   	.enable_filter		= hisi_l3c_pmu_enable_filter,
>>   	.disable_filter		= hisi_l3c_pmu_disable_filter,
>> +	.check_filter		= hisi_l3c_pmu_check_filter,
>>   };
>>   
>>   static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev,
>>   				  struct hisi_pmu *l3c_pmu)
>>   {
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>> +	struct hisi_l3c_pmu_ext *l3c_pmu_dev_ext = l3c_pmu->dev_info->private;
>>   	int ret;
>>   
>>   	ret = hisi_l3c_pmu_init_data(pdev, l3c_pmu);
>> @@ -549,27 +769,50 @@ static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev,
>>   	l3c_pmu->dev = &pdev->dev;
>>   	l3c_pmu->on_cpu = -1;
>>   
>> +	if (l3c_pmu_dev_ext->support_ext) {
>> +		ret = hisi_l3c_pmu_init_ext(l3c_pmu, pdev);
>> +		if (ret)
>> +			return ret;
>> +		/*
>> +		 * The extension events have their own counters with the
>> +		 * same number of the normal events counters. So we can
>> +		 * have at maximum num_counters * ext events monitored.
>> +		 */
>> +		l3c_pmu->num_counters += hisi_l3c_pmu->ext_num * L3C_NR_COUNTERS;
>> +	}
>> +
>>   	return 0;
>>   }
>>   
>>   static int hisi_l3c_pmu_probe(struct platform_device *pdev)
>>   {
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu;
>>   	struct hisi_pmu *l3c_pmu;
>>   	char *name;
>>   	int ret;
>>   
>> -	l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*l3c_pmu), GFP_KERNEL);
>> -	if (!l3c_pmu)
>> +	hisi_l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*hisi_l3c_pmu), GFP_KERNEL);
>> +	if (!hisi_l3c_pmu)
>>   		return -ENOMEM;
>>   
>> +	l3c_pmu = &hisi_l3c_pmu->l3c_pmu;
>>   	platform_set_drvdata(pdev, l3c_pmu);
>>   
>> +	l3c_pmu->dev_info = device_get_match_data(&pdev->dev);
>> +	if (!l3c_pmu->dev_info)
>> +		return -ENODEV;
>> +
> any reason to move this out of hisi_l3c_pmu_init_data()? it looks unnecessary.

The private data stored in l3c_pmu->dev_info is used in hisi_l3c_pmu_dev_probe(),
so I moved this lookup before the call right below.

But since l3c_pmu->dev_info is only actually used after the
hisi_l3c_pmu_init_data() call anyway, leaving it unmoved should be neater.

Will fix that in the next version.
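
A rough sketch of that ordering, as a userspace mock rather than the real
driver code. The struct and function names below are simplified stand-ins
(assumptions, not the driver's definitions), and L3C_NR_COUNTERS is assumed
to be 8:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define L3C_NR_COUNTERS	8	/* assumed: 8 hardware counters per region */

/* Simplified stand-ins for the driver structures. */
struct l3c_ext_info { bool support_ext; };
struct dev_info { const struct l3c_ext_info *private; };
struct pmu {
	const struct dev_info *dev_info;
	int num_counters;
	int ext_num;
};

/* Stand-in for hisi_l3c_pmu_init_data(): the match data lookup stays here. */
static int init_data(struct pmu *pmu, const struct dev_info *match)
{
	pmu->dev_info = match;	/* device_get_match_data() in the driver */
	if (!pmu->dev_info)
		return -ENODEV;

	return 0;
}

/* Stand-in for hisi_l3c_pmu_dev_probe(). */
static int dev_probe(struct pmu *pmu, const struct dev_info *match, int ext_num)
{
	const struct l3c_ext_info *ext;
	int ret;

	ret = init_data(pmu, match);
	if (ret)
		return ret;

	pmu->num_counters = L3C_NR_COUNTERS;

	/* dev_info is only dereferenced once init_data() has filled it in. */
	ext = pmu->dev_info->private;
	if (ext->support_ext) {
		pmu->ext_num = ext_num;
		pmu->num_counters += ext_num * L3C_NR_COUNTERS;
	}

	return 0;
}
```

The point is only that the ->private dereference has to come after the
lookup; it cannot stay as an initializer at the top of the function.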

>
>>   	ret = hisi_l3c_pmu_dev_probe(pdev, l3c_pmu);
>>   	if (ret)
>>   		return ret;
>>   
>> -	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d",
>> -			      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id);
>> +	if (l3c_pmu->topo.sub_id >= 0)
>> +		name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d_%d",
>> +				      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id,
>> +				      l3c_pmu->topo.sub_id);
>> +	else
>> +		name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d",
>> +				      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id);
>>   	if (!name)
>>   		return -ENOMEM;
>>   
>> @@ -604,6 +847,7 @@ static void hisi_l3c_pmu_remove(struct platform_device *pdev)
>>   static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = {
>>   	{ "HISI0213", (kernel_ulong_t)&hisi_l3c_pmu_v1 },
>>   	{ "HISI0214", (kernel_ulong_t)&hisi_l3c_pmu_v2 },
>> +	{ "HISI0215", (kernel_ulong_t)&hisi_l3c_pmu_v3 },
>>   	{}
>>   };
>>   MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match);
>> @@ -618,14 +862,58 @@ static struct platform_driver hisi_l3c_pmu_driver = {
>>   	.remove = hisi_l3c_pmu_remove,
>>   };
>>   
>> +static int hisi_l3c_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
>> +{
>> +	struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>> +	int ret;
>> +
>> +	ret = hisi_uncore_pmu_online_cpu(cpu, node);
>> +	if (ret)
>> +		return ret;
>> +
>> +	/* Avoid L3C pmu not supporting ext from ext irq migrating. */
>> +	if (!support_ext(hisi_l3c_pmu))
>> +		return 0;
>> +
>> +	for (int i = 0; i < hisi_l3c_pmu->ext_num; i++)
>> +		WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i],
>> +					 cpumask_of(l3c_pmu->on_cpu)));
>> +	return 0;
>> +}
>> +
>> +static int hisi_l3c_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>> +{
>> +	struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
>> +	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
>> +	int ret;
>> +
>> +	ret = hisi_uncore_pmu_offline_cpu(cpu, node);
>> +	if (ret)
>> +		return ret;
>> +
>> +	/* If failed to find any available CPU, skip irq migration. */
>> +	if (l3c_pmu->on_cpu <= 0)
> 0 should be a valid cpu for migration, am I missing something here?
>
> thanks.

This could be problematic. I will fix this and the other issues above.

Thanks!
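
As a minimal sketch of the check in question: the helper below is
hypothetical (not from the patch) and assumes on_cpu is -1 when no target CPU
is available, which matches the probe path initialising it to -1, and that
any value >= 0 names a real CPU.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether the ext IRQs should follow on_cpu.
 * CPU 0 is a valid migration target, so only negative values are skipped.
 */
static bool l3c_should_migrate_ext_irq(int on_cpu)
{
	return on_cpu >= 0;
}
```

Under that assumption, the quoted `on_cpu <= 0` test would skip migration
whenever CPU 0 is picked, while `on_cpu < 0` only skips the "no CPU found"
case.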

>
>> +		return 0;
>> +
>> +	/* Avoid L3C pmu not supporting ext from ext irq migrating. */
>> +	if (!support_ext(hisi_l3c_pmu))
>> +		return 0;
>> +
>> +	for (int i = 0; i < hisi_l3c_pmu->ext_num; i++)
>> +		WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i],
>> +					 cpumask_of(l3c_pmu->on_cpu)));
>> +	return 0;
>> +}
>> +
>>   static int __init hisi_l3c_pmu_module_init(void)
>>   {
>>   	int ret;
>>   
>>   	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE,
>>   				      "AP_PERF_ARM_HISI_L3_ONLINE",
>> -				      hisi_uncore_pmu_online_cpu,
>> -				      hisi_uncore_pmu_offline_cpu);
>> +				      hisi_l3c_pmu_online_cpu,
>> +				      hisi_l3c_pmu_offline_cpu);
>>   	if (ret) {
>>   		pr_err("L3C PMU: Error setup hotplug, ret = %d\n", ret);
>>   		return ret;
>> diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.h b/drivers/perf/hisilicon/hisi_uncore_pmu.h
>> index 02fa022925d4..0334a797e499 100644
>> --- a/drivers/perf/hisilicon/hisi_uncore_pmu.h
>> +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.h
>> @@ -24,7 +24,7 @@
>>   #define pr_fmt(fmt)     "hisi_pmu: " fmt
>>   
>>   #define HISI_PMU_V2		0x30
>> -#define HISI_MAX_COUNTERS 0x10
>> +#define HISI_MAX_COUNTERS 0x18
>>   #define to_hisi_pmu(p)	(container_of(p, struct hisi_pmu, pmu))
>>   
>>   #define HISI_PMU_ATTR(_name, _func, _config)				\
>>




* Re: [PATCH v2 9/9] Documentation: hisi-pmu: Add introduction to HiSilicon
  2025-08-27  2:27   ` Yicong Yang
@ 2025-08-27  7:22     ` wangyushan
  0 siblings, 0 replies; 25+ messages in thread
From: wangyushan @ 2025-08-27  7:22 UTC (permalink / raw)
  To: Yicong Yang, will, mark.rutland, linux-arm-kernel, linux-kernel
  Cc: yangyicong, robin.murphy, Jonathan.Cameron, liuyonglong,
	wanghuiqiang, prime.zeng, hejunhao3



On 8/27/2025 10:27 AM, Yicong Yang wrote:
> Hi Yushan,
>
> The subject seems to be truncated. Should it read as below?
>
> Documentation: hisi-pmu: Add introduction to HiSilicon v3 PMU
>
> Other comments inline. Sorry for the late reply.

The subject was automatically wrapped, sorry.

"Add introduction to HiSilicon v3 PMU" it is.



>
> On 2025/8/21 21:50, Yushan Wang wrote:
>> Some of HiSilicon V3 PMU hardware is divided into parts to fulfill the
>> job of monitoring specific parts of a device.  Add description on that
>> as well as the newly added ext operand for L3C PMU.
>>
>> Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
>> ---
>>   Documentation/admin-guide/perf/hisi-pmu.rst | 38 +++++++++++++++++++--
>>   1 file changed, 36 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst
>> index a307bce2f5c5..4c7584fe3c1a 100644
>> --- a/Documentation/admin-guide/perf/hisi-pmu.rst
>> +++ b/Documentation/admin-guide/perf/hisi-pmu.rst
>> @@ -12,8 +12,8 @@ The HiSilicon SoC encapsulates multiple CPU and IO dies. Each CPU cluster
>>   called Super CPU cluster (SCCL) and is made up of 6 CCLs. Each SCCL has
>>   two HHAs (0 - 1) and four DDRCs (0 - 3), respectively.
>>   
>> -HiSilicon SoC uncore PMU driver
>> --------------------------------
>> +HiSilicon SoC uncore PMU v1
> These new sections (and those below) will break the ordered list of options. It should not be
> necessary to split the text by version; just add the new options in the current style and
> mention the version that introduced them.

Given that the ext operand is for the L3C PMU only, I could add it to the existing ordered list of
options, along with the relationship between ext and the PMU name formats.

>
>> +---------------------------
>>   
>>   Each device PMU has separate registers for event counting, control and
>>   interrupt, and the PMU driver shall register perf PMU drivers like L3C,
>> @@ -56,6 +56,9 @@ Example usage of perf::
>>     $# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5
>>     $# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5
>>   
>> +HiSilicon SoC uncore PMU v2
>> +----------------------------------
>> +
>>   For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same
>>   as PMU v1, but some new functions are added to the hardware.
>>   
>> @@ -113,6 +116,37 @@ uring channel. It is 2 bits. Some important codes are as follows:
>>   - 2'b00: default value, count the events which sent to the both uring and
>>     uring_ext channel;
>>   
>> +HiSilicon SoC uncore PMU v3
>> +----------------------------------
>> +
>> +For HiSilicon uncore PMU v3 whose identifier is 0x40, some uncore PMUs are
>> +further divided into parts for finer granularity of tracing, each part has its
>> +own dedicated PMU, and all such PMUs together cover the monitoring job of events
>> +on particular uncore device. Such PMUs are described in sysfs with name format
>> +slightly changed::
>> +
>> +/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}_{Z}/ddrc{Y}_{Z}/noc{Y}_{Z}>
>> +
>> +Z is the sub-id, indicating different PMUs for part of hardware device.
>> +
>> +Usage of most PMUs with different sub-ids is identical. Specifically, the L3C PMU
>> +provides the ``ext`` operand to allow exploration of even finer-grained statistics
>> +of the L3C PMU; the L3C PMU driver uses it as a hint of termination when delivering
>> +the perf command to hardware:
>> +
>> +- ext=0: Default, could be used with event names.
>> +- ext=1 and ext=2: Must be used with event codes, event names are not supported.
>> +
>> +An example of perf command could be::
>> +
>> +  $# perf stat -a -e hisi_sccl0_l3c1_0/event=0x1,ext=1/ sleep 5
>> +
>> +or::
>> +
>> +  $# perf stat -a -e hisi_sccl0_l3c1_0/rd_spipe/ sleep 5
>> +
>> +As above, ``hisi_sccl0_l3c1_0`` locates PMU on CPU cluster 0, L3 cache 1 pipe0.
> This isn't correct. sccl0 indicates Super CPU Cluster 0, which is already
> described in the document.
>
> thanks.

Yes, will fix that, thanks.
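
To make the naming concrete, a minimal user-space sketch that splits a name
of the form hisi_sccl{X}_l3c{Y}_{Z} into its three fields. The struct and
function names are hypothetical and are not part of the driver or of perf;
the field meanings follow the patch text and the comment above (X is the
Super CPU Cluster, Y the L3C index, Z the sub-id).

```c
#include <stdio.h>

/* Hypothetical helper type, for illustration only. */
struct hisi_l3c_name {
	int sccl;   /* Super CPU Cluster id (X) */
	int l3c;    /* L3C index within the SCCL (Y) */
	int sub_id; /* sub-id of the PMU part (Z) */
};

/*
 * Returns 0 if name matches hisi_sccl{X}_l3c{Y}_{Z} exactly, -1 otherwise.
 * Fields of *out may be partially written when -1 is returned.
 */
static int parse_l3c_pmu_name(const char *name, struct hisi_l3c_name *out)
{
	int consumed = 0;

	if (sscanf(name, "hisi_sccl%d_l3c%d_%d%n",
		   &out->sccl, &out->l3c, &out->sub_id, &consumed) != 3)
		return -1;

	/* Reject trailing characters after the sub-id. */
	return name[consumed] == '\0' ? 0 : -1;
}
```

Under these assumptions, "hisi_sccl0_l3c1_0" parses as SCCL 0, L3C 1,
sub-id 0, while a name without a sub-id such as "hisi_sccl3_l3c0" is
rejected.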




Thread overview: 25+ messages
2025-08-21 13:50 [PATCH v2 0/9] Updates of HiSilicon Uncore L3C PMU Yushan Wang
2025-08-21 13:50 ` [PATCH v2 1/9] drivers/perf: hisi: Relax the event ID check in the framework Yushan Wang
2025-08-26 13:03   ` Jonathan Cameron
2025-08-21 13:50 ` [PATCH v2 2/9] drivers/perf: hisi: Export hisi_uncore_pmu_isr() Yushan Wang
2025-08-26 13:03   ` Jonathan Cameron
2025-08-21 13:50 ` [PATCH v2 3/9] drivers/perf: hisi: Simplify the probe process of each L3C PMU version Yushan Wang
2025-08-26 13:06   ` Jonathan Cameron
2025-08-21 13:50 ` [PATCH v2 4/9] drivers/perf: hisi: Extract the event filter check of L3C PMU Yushan Wang
2025-08-26 13:06   ` Jonathan Cameron
2025-08-21 13:50 ` [PATCH v2 5/9] drivers/perf: hisi: Extend the field of tt_core Yushan Wang
2025-08-26 13:07   ` Jonathan Cameron
2025-08-21 13:50 ` [PATCH v2 6/9] drivers/perf: hisi: Refactor the event configuration of L3C PMU Yushan Wang
2025-08-26 13:08   ` Jonathan Cameron
2025-08-21 13:50 ` [PATCH v2 7/9] drivers/perf: hisi: Add support for L3C PMU v3 Yushan Wang
2025-08-26 13:12   ` Jonathan Cameron
2025-08-27  6:21     ` wangyushan
2025-08-27  3:43   ` Yicong Yang
2025-08-27  7:07     ` wangyushan
2025-08-21 13:50 ` [PATCH v2 8/9] Documentation: hisi-pmu: Fix of minor format error Yushan Wang
2025-08-26 13:21   ` Jonathan Cameron
2025-08-27  2:15   ` Yicong Yang
2025-08-21 13:50 ` [PATCH v2 9/9] Documentation: hisi-pmu: Add introduction to HiSilicon Yushan Wang
2025-08-26 13:22   ` Jonathan Cameron
2025-08-27  2:27   ` Yicong Yang
2025-08-27  7:22     ` wangyushan
