linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/2] perf: Fujitsu: Add Uncore MAC/PCI PMU driver
@ 2025-08-15  3:47 Koichi Okuno
  2025-08-15  3:47 ` [PATCH v7 1/2] perf: Fujitsu: Add the Uncore MAC " Koichi Okuno
  2025-08-15  3:47 ` [PATCH v7 2/2] perf: Fujitsu: Add the Uncore PCI " Koichi Okuno
  0 siblings, 2 replies; 10+ messages in thread
From: Koichi Okuno @ 2025-08-15  3:47 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas,
	Gowthami Thiagarajan, Linu Cherian, Robin Murphy,
	linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven,
	Krzysztof Kozlowski, Konrad Dybcio, Neil Armstrong, Arnd Bergmann,
	Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra,
	Jonathan Cameron, linux-doc, linux-kernel, Koichi Okuno

This adds two new dynamic PMUs to the Perf Events framework to program
and control the Uncore MAC/PCI PMUs in Fujitsu chips.

These drivers were created with reference to drivers/perf/qcom_l3_pmu.c.

These drivers export formatting and event information to sysfs so they can
be used by the perf user space tools with the syntaxes:

perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
perf stat -e mac_iod0_mac0_ch0/event=0x80/ ls

perf stat -e pci_iod0_pci0/ea-pci/ ls
perf stat -e pci_iod0_pci0/event=0x80/ ls

FUJITSU-MONAKA PMU Events Specification v1.1 URL:
https://github.com/fujitsu/FUJITSU-MONAKA

Changes in v7:
- Modify the code as suggested. (Jonathan Cameron)
  - Renamed the macros name to make it clear which register it applies to.
  - Deleted unused macro.
  - Changed some programming styles as suggested.
  - I tested using v6.17-rc1 and confirmed that I get the same results
    as before the macro name change.
- Link to v6:https://lore.kernel.org/all/20250711071404.2138816-1-fj2767dz@fujitsu.com/

Signed-off-by: Koichi Okuno <fj2767dz@fujitsu.com>

Koichi Okuno (2):
  perf: Fujitsu: Add the Uncore MAC PMU driver
  perf: Fujitsu: Add the Uncore PCI PMU driver

 .../admin-guide/perf/fujitsu_mac_pmu.rst      |  73 +++
 .../admin-guide/perf/fujitsu_pci_pmu.rst      |  50 ++
 Documentation/admin-guide/perf/index.rst      |   2 +
 drivers/perf/Kconfig                          |  18 +
 drivers/perf/Makefile                         |   2 +
 drivers/perf/fujitsu_mac_pmu.c                | 552 ++++++++++++++++++
 drivers/perf/fujitsu_pci_pmu.c                | 536 +++++++++++++++++
 7 files changed, 1233 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
 create mode 100644 Documentation/admin-guide/perf/fujitsu_pci_pmu.rst
 create mode 100644 drivers/perf/fujitsu_mac_pmu.c
 create mode 100644 drivers/perf/fujitsu_pci_pmu.c

-- 
2.43.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v7 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver
  2025-08-15  3:47 [PATCH v7 0/2] perf: Fujitsu: Add Uncore MAC/PCI PMU driver Koichi Okuno
@ 2025-08-15  3:47 ` Koichi Okuno
  2025-08-15 11:33   ` Jonathan Cameron
  2025-08-15 12:39   ` Robin Murphy
  2025-08-15  3:47 ` [PATCH v7 2/2] perf: Fujitsu: Add the Uncore PCI " Koichi Okuno
  1 sibling, 2 replies; 10+ messages in thread
From: Koichi Okuno @ 2025-08-15  3:47 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas,
	Gowthami Thiagarajan, Linu Cherian, Robin Murphy,
	linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven,
	Krzysztof Kozlowski, Konrad Dybcio, Neil Armstrong, Arnd Bergmann,
	Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra,
	Jonathan Cameron, linux-doc, linux-kernel, Koichi Okuno

This adds a new dynamic PMU to the Perf Events framework to program and
control the Uncore MAC PMUs in Fujitsu chips.

This driver was created with reference to drivers/perf/qcom_l3_pmu.c.

This driver exports formatting and event information to sysfs so it can
be used by the perf user space tools with the syntaxes:

perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
perf stat -e mac_iod0_mac0_ch0/event=0x80/ ls

FUJITSU-MONAKA PMU Events Specification v1.1 URL:
https://github.com/fujitsu/FUJITSU-MONAKA

Signed-off-by: Koichi Okuno <fj2767dz@fujitsu.com>
---
 .../admin-guide/perf/fujitsu_mac_pmu.rst      |  73 +++
 Documentation/admin-guide/perf/index.rst      |   1 +
 drivers/perf/Kconfig                          |   9 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/fujitsu_mac_pmu.c                | 552 ++++++++++++++++++
 5 files changed, 636 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
 create mode 100644 drivers/perf/fujitsu_mac_pmu.c

diff --git a/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
new file mode 100644
index 000000000000..383f3c1bbde7
--- /dev/null
+++ b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
@@ -0,0 +1,73 @@
+====================================================
+Fujitsu Uncore MAC Performance Monitoring Unit (PMU)
+====================================================
+
+This driver supports the Uncore MAC PMUs found in Fujitsu chips.
+Each MAC PMU on these chips is exposed as a uncore perf PMU with device name
+mac_iod<iod>_mac<mac>_ch<ch>.
+
+The driver provides a description of its available events and configuration
+options in sysfs, see /sys/bus/event_sources/devices/mac_iod<iod>_mac<mac>_ch<ch>/.
+This driver exports:
+- formats, used by perf user space and other tools to configure events
+- events, used by perf user space and other tools to create events
+  symbolically, e.g.:
+    perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls
+- cpumask, used by perf user space and other tools to know on which CPUs
+  to open the events
+
+This driver supports the following events:
+- cycles
+  This event counts MAC cycles at MAC frequency.
+- read-count
+  This event counts the number of read requests to MAC.
+- read-count-request
+  This event counts the number of read requests including retry to MAC.
+- read-count-return
+  This event counts the number of responses to read requests to MAC.
+- read-count-request-pftgt
+  This event counts the number of read requests including retry with PFTGT
+  flag.
+- read-count-request-normal
+  This event counts the number of read requests including retry without PFTGT
+  flag.
+- read-count-return-pftgt-hit
+  This event counts the number of responses to read requests which hit the
+  PFTGT buffer.
+- read-count-return-pftgt-miss
+  This event counts the number of responses to read requests which miss the
+  PFTGT buffer.
+- read-wait
+  This event counts outstanding read requests issued by DDR memory controller
+  per cycle.
+- write-count
+  This event counts the number of write requests to MAC (including zero write,
+  full write, partial write, write cancel).
+- write-count-write
+  This event counts the number of full write requests to MAC (not including
+  zero write).
+- write-count-pwrite
+  This event counts the number of partial write requests to MAC.
+- memory-read-count
+  This event counts the number of read requests from MAC to memory.
+- memory-write-count
+  This event counts the number of full write requests from MAC to memory.
+- memory-pwrite-count
+  This event counts the number of partial write requests from MAC to memory.
+- ea-mac
+  This event counts energy consumption of MAC.
+- ea-memory
+  This event counts energy consumption of memory.
+- ea-memory-mac-write
+  This event counts the number of write requests from MAC to memory.
+- ea-ha
+  This event counts energy consumption of HA.
+
+  'ea' is the abbreviation for 'Energy Analyzer'.
+
+Examples for use with perf::
+
+  perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
+
+Given that these are uncore PMUs the driver does not support sampling, therefore
+"perf record" will not work. Per-task perf sessions are not supported.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 072b510385c4..e0262060db17 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -29,3 +29,4 @@ Performance monitor support
    cxl
    ampere_cspmu
    mrvl-pem-pmu
+   fujitsu_mac_pmu
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index a9188dec36fe..ab4e44c99a53 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -178,6 +178,15 @@ config FSL_IMX9_DDR_PMU
 	 can give information about memory throughput and other related
 	 events.
 
+config FUJITSU_MAC_PMU
+	bool "Fujitsu Uncore MAC PMU"
+	depends on (ARM64 && (ACPI || COMPILE_TEST))
+	help
+	 Provides support for the Uncore MAC performance monitor unit (PMU)
+	 in Fujitsu processors.
+	 Adds the Uncore MAC PMU into the perf events subsystem for
+	 monitoring Uncore MAC events.
+
 config QCOM_L2_PMU
 	bool "Qualcomm Technologies L2-cache PMU"
 	depends on ARCH_QCOM && ARM64 && ACPI
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 192fc8b16204..be7f74c3b14c 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_ARM_SMMU_V3_PMU) += arm_smmuv3_pmu.o
 obj-$(CONFIG_FSL_IMX8_DDR_PMU) += fsl_imx8_ddr_perf.o
 obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o
 obj-$(CONFIG_HISI_PMU) += hisilicon/
+obj-$(CONFIG_FUJITSU_MAC_PMU) += fujitsu_mac_pmu.o
 obj-$(CONFIG_QCOM_L2_PMU)	+= qcom_l2_pmu.o
 obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
 obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o
diff --git a/drivers/perf/fujitsu_mac_pmu.c b/drivers/perf/fujitsu_mac_pmu.c
new file mode 100644
index 000000000000..1031e0221bb2
--- /dev/null
+++ b/drivers/perf/fujitsu_mac_pmu.c
@@ -0,0 +1,552 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Driver for the Uncore MAC PMUs in Fujitsu chips.
+ *
+ * See Documentation/admin-guide/perf/fujitsu_mac_pmu.rst for more details.
+ *
+ * This driver is based on drivers/perf/qcom_l3_pmu.c
+ * Copyright (c) 2015-2017, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2024 Fujitsu. All rights reserved.
+ */
+
+#include <linux/acpi.h>
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/list.h>
+#include <linux/mod_devicetable.h>
+#include <linux/module.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+
+/* Number of counters on each PMU */
+#define MAC_NUM_COUNTERS  8
+/* Mask for the event type field within perf_event_attr.config and EVTYPE reg */
+#define MAC_EVTYPE_MASK   0xFF
+
+/* Perfmon registers */
+#define MAC_PM_EVCNTR(__cntr)           (0x000 + (__cntr) * 8)
+#define MAC_PM_CNTCTL(__cntr)           (0x100 + (__cntr) * 8)
+#define MAC_PM_CNTCTL_RESET             0
+#define MAC_PM_EVTYPE(__cntr)           (0x200 + (__cntr) * 8)
+#define MAC_PM_EVTYPE_EVSEL(__val)      FIELD_GET(MAC_EVTYPE_MASK, __val)
+#define MAC_PM_CR                       0x400
+#define MAC_PM_CR_RESET                 BIT(1)
+#define MAC_PM_CR_ENABLE                BIT(0)
+#define MAC_PM_CNTENSET                 0x410
+#define MAC_PM_CNTENSET_IDX(__cntr)     BIT(__cntr)
+#define MAC_PM_CNTENCLR                 0x418
+#define MAC_PM_CNTENCLR_IDX(__cntr)     BIT(__cntr)
+#define MAC_PM_CNTENCLR_RESET           0xFF
+#define MAC_PM_INTENSET                 0x420
+#define MAC_PM_INTENSET_IDX(__cntr)     BIT(__cntr)
+#define MAC_PM_INTENCLR                 0x428
+#define MAC_PM_INTENCLR_IDX(__cntr)     BIT(__cntr)
+#define MAC_PM_INTENCLR_RESET           0xFF
+#define MAC_PM_OVSR                     0x440
+#define MAC_PM_OVSR_OVSRCLR_RESET       0xFF
+
+#define MAC_EVENT_CYCLES                        0x000
+#define MAC_EVENT_READ_COUNT                    0x010
+#define MAC_EVENT_READ_COUNT_REQUEST            0x011
+#define MAC_EVENT_READ_COUNT_RETURN             0x012
+#define MAC_EVENT_READ_COUNT_REQUEST_PFTGT      0x013
+#define MAC_EVENT_READ_COUNT_REQUEST_NORMAL     0x014
+#define MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT   0x015
+#define MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS  0x016
+#define MAC_EVENT_READ_WAIT                     0x017
+#define MAC_EVENT_WRITE_COUNT                   0x020
+#define MAC_EVENT_WRITE_COUNT_WRITE             0x021
+#define MAC_EVENT_WRITE_COUNT_PWRITE            0x022
+#define MAC_EVENT_MEMORY_READ_COUNT             0x040
+#define MAC_EVENT_MEMORY_WRITE_COUNT            0x050
+#define MAC_EVENT_MEMORY_PWRITE_COUNT           0x060
+#define MAC_EVENT_EA_MAC                        0x080
+#define MAC_EVENT_EA_MEMORY                     0x090
+#define MAC_EVENT_EA_MEMORY_MAC_WRITE           0x092
+#define MAC_EVENT_EA_HA                         0x0a0
+
+struct mac_pmu {
+	struct pmu		pmu;
+	struct hlist_node	node;
+	void __iomem		*regs;
+	struct perf_event	*events[MAC_NUM_COUNTERS];
+	unsigned long		used_mask[BITS_TO_LONGS(MAC_NUM_COUNTERS)];
+	cpumask_t		cpumask;
+};
+
+#define to_mac_pmu(p) (container_of(p, struct mac_pmu, pmu))
+
+static int mac_pmu_cpuhp_state;
+
+static void fujitsu_mac_counter_start(struct perf_event *event)
+{
+	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
+	int idx = event->hw.idx;
+
+	/* Initialize the hardware counter and reset prev_count*/
+	local64_set(&event->hw.prev_count, 0);
+	writeq_relaxed(0, macpmu->regs + MAC_PM_EVCNTR(idx));
+
+	/* Set the event type */
+	writeq_relaxed(MAC_PM_EVTYPE_EVSEL(event->attr.config), macpmu->regs + MAC_PM_EVTYPE(idx));
+
+	/* Enable interrupt generation by this counter */
+	writeq_relaxed(MAC_PM_INTENSET_IDX(idx), macpmu->regs + MAC_PM_INTENSET);
+
+	/* Finally, enable the counter */
+	writeq_relaxed(MAC_PM_CNTCTL_RESET, macpmu->regs + MAC_PM_CNTCTL(idx));
+	writeq_relaxed(MAC_PM_CNTENSET_IDX(idx), macpmu->regs + MAC_PM_CNTENSET);
+}
+
+static void fujitsu_mac_counter_stop(struct perf_event *event,
+				     int flags)
+{
+	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
+	int idx = event->hw.idx;
+
+	/* Disable the counter */
+	writeq_relaxed(MAC_PM_CNTENCLR_IDX(idx), macpmu->regs + MAC_PM_CNTENCLR);
+
+	/* Disable interrupt generation by this counter */
+	writeq_relaxed(MAC_PM_INTENCLR_IDX(idx), macpmu->regs + MAC_PM_INTENCLR);
+}
+
+static void fujitsu_mac_counter_update(struct perf_event *event)
+{
+	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
+	int idx = event->hw.idx;
+	u64 prev, new;
+
+	do {
+		prev = local64_read(&event->hw.prev_count);
+		new = readq_relaxed(macpmu->regs + MAC_PM_EVCNTR(idx));
+	} while (local64_cmpxchg(&event->hw.prev_count, prev, new) != prev);
+
+	local64_add(new - prev, &event->count);
+}
+
+static inline void fujitsu_mac__init(struct mac_pmu *macpmu)
+{
+	int i;
+
+	writeq_relaxed(MAC_PM_CR_RESET, macpmu->regs + MAC_PM_CR);
+
+	writeq_relaxed(MAC_PM_CNTENCLR_RESET, macpmu->regs + MAC_PM_CNTENCLR);
+	writeq_relaxed(MAC_PM_INTENCLR_RESET, macpmu->regs + MAC_PM_INTENCLR);
+	writeq_relaxed(MAC_PM_OVSR_OVSRCLR_RESET, macpmu->regs + MAC_PM_OVSR);
+
+	for (i = 0; i < MAC_NUM_COUNTERS; ++i) {
+		writeq_relaxed(MAC_PM_CNTCTL_RESET, macpmu->regs + MAC_PM_CNTCTL(i));
+		writeq_relaxed(MAC_PM_EVTYPE_EVSEL(0), macpmu->regs + MAC_PM_EVTYPE(i));
+	}
+
+	/*
+	 * Use writeq here to ensure all programming commands are done
+	 * before proceeding
+	 */
+	writeq(MAC_PM_CR_ENABLE, macpmu->regs + MAC_PM_CR);
+}
+
+static irqreturn_t fujitsu_mac__handle_irq(int irq_num, void *data)
+{
+	struct mac_pmu *macpmu = data;
+	/* Read the overflow status register */
+	long status = readq_relaxed(macpmu->regs + MAC_PM_OVSR);
+	int idx;
+
+	if (status == 0)
+		return IRQ_NONE;
+
+	/* Clear the bits we read on the overflow status register */
+	writeq_relaxed(status, macpmu->regs + MAC_PM_OVSR);
+
+	for_each_set_bit(idx, &status, MAC_NUM_COUNTERS) {
+		struct perf_event *event;
+
+		event = macpmu->events[idx];
+		if (!event)
+			continue;
+
+		fujitsu_mac_counter_update(event);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static void fujitsu_mac__pmu_enable(struct pmu *pmu)
+{
+	struct mac_pmu *macpmu = to_mac_pmu(pmu);
+
+	/* Ensure the other programming commands are observed before enabling */
+	wmb();
+
+	writeq_relaxed(MAC_PM_CR_ENABLE, macpmu->regs + MAC_PM_CR);
+}
+
+static void fujitsu_mac__pmu_disable(struct pmu *pmu)
+{
+	struct mac_pmu *macpmu = to_mac_pmu(pmu);
+
+	writeq_relaxed(0, macpmu->regs + MAC_PM_CR);
+
+	/* Ensure the basic counter unit is stopped before proceeding */
+	wmb();
+}
+
+/*
+ * We must NOT create groups containing events from multiple hardware PMUs,
+ * although mixing different software and hardware PMUs is allowed.
+ */
+static bool fujitsu_mac__validate_event_group(struct perf_event *event)
+{
+	struct perf_event *leader = event->group_leader;
+	struct perf_event *sibling;
+	int counters;
+
+	if (leader->pmu != event->pmu && !is_software_event(leader))
+		return false;
+
+	/* The sum of the counters used by the event and its leader event */
+	counters = 2;
+
+	for_each_sibling_event(sibling, leader) {
+		if (is_software_event(sibling))
+			continue;
+		if (sibling->pmu != event->pmu)
+			return false;
+		counters++;
+	}
+
+	/*
+	 * If the group requires more counters than the HW has, it
+	 * cannot ever be scheduled.
+	 */
+	return counters <= MAC_NUM_COUNTERS;
+}
+
+static int fujitsu_mac__event_init(struct perf_event *event)
+{
+	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	/* Is the event for this PMU? */
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/*
+	 * Sampling not supported since these events are not
+	 * core-attributable.
+	 */
+	if (hwc->sample_period)
+		return -EINVAL;
+
+	/*
+	 * Task mode not available, we run the counters as socket counters,
+	 * not attributable to any CPU and therefore cannot attribute per-task.
+	 */
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	/* Validate the group */
+	if (!fujitsu_mac__validate_event_group(event))
+		return -EINVAL;
+
+	hwc->idx = -1;
+
+	/*
+	 * Many perf core operations (eg. events rotation) operate on a
+	 * single CPU context. This is obvious for CPU PMUs, where one
+	 * expects the same sets of events being observed on all CPUs,
+	 * but can lead to issues for off-core PMUs, like this one, where
+	 * each event could be theoretically assigned to a different CPU.
+	 * To mitigate this, we enforce CPU assignment to one designated
+	 * processor (the one described in the "cpumask" attribute exported
+	 * by the PMU device). perf user space tools honor this and avoid
+	 * opening more than one copy of the events.
+	 */
+	event->cpu = cpumask_first(&macpmu->cpumask);
+
+	return 0;
+}
+
+static void fujitsu_mac__event_start(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	hwc->state = 0;
+	fujitsu_mac_counter_start(event);
+}
+
+static void fujitsu_mac__event_stop(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (hwc->state & PERF_HES_STOPPED)
+		return;
+
+	fujitsu_mac_counter_stop(event, flags);
+	if (flags & PERF_EF_UPDATE)
+		fujitsu_mac_counter_update(event);
+	hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+static int fujitsu_mac__event_add(struct perf_event *event, int flags)
+{
+	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx;
+
+	/* Try to allocate a counter. */
+	idx = bitmap_find_free_region(macpmu->used_mask, MAC_NUM_COUNTERS, 0);
+	if (idx < 0)
+		/* The counters are all in use. */
+		return -EAGAIN;
+
+	hwc->idx = idx;
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+	macpmu->events[idx] = event;
+
+	if (flags & PERF_EF_START)
+		fujitsu_mac__event_start(event, 0);
+
+	/* Propagate changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+
+	return 0;
+}
+
+static void fujitsu_mac__event_del(struct perf_event *event, int flags)
+{
+	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	/* Stop and clean up */
+	fujitsu_mac__event_stop(event, flags | PERF_EF_UPDATE);
+	macpmu->events[hwc->idx] = NULL;
+	bitmap_release_region(macpmu->used_mask, hwc->idx, 0);
+
+	/* Propagate changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+}
+
+static void fujitsu_mac__event_read(struct perf_event *event)
+{
+	fujitsu_mac_counter_update(event);
+}
+
+#define MAC_PMU_FORMAT_ATTR(_name, _config)				      \
+	(&((struct dev_ext_attribute[]) {				      \
+		{ .attr = __ATTR(_name, 0444, device_show_string, NULL),      \
+		  .var = (void *) _config, }				      \
+	})[0].attr.attr)
+
+static struct attribute *fujitsu_mac_pmu_formats[] = {
+	MAC_PMU_FORMAT_ATTR(event, "config:0-7"),
+	NULL
+};
+
+static const struct attribute_group fujitsu_mac_pmu_format_group = {
+	.name = "format",
+	.attrs = fujitsu_mac_pmu_formats,
+};
+
+static ssize_t mac_pmu_event_show(struct device *dev,
+				  struct device_attribute *attr, char *page)
+{
+	struct perf_pmu_events_attr *pmu_attr;
+
+	pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
+	return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id);
+}
+
+#define MAC_EVENT_ATTR(_name, _id)					     \
+	PMU_EVENT_ATTR_ID(_name, mac_pmu_event_show, _id)
+
+static struct attribute *fujitsu_mac_pmu_events[] = {
+	MAC_EVENT_ATTR(cycles, MAC_EVENT_CYCLES),
+	MAC_EVENT_ATTR(read-count, MAC_EVENT_READ_COUNT),
+	MAC_EVENT_ATTR(read-count-request, MAC_EVENT_READ_COUNT_REQUEST),
+	MAC_EVENT_ATTR(read-count-return, MAC_EVENT_READ_COUNT_RETURN),
+	MAC_EVENT_ATTR(read-count-request-pftgt, MAC_EVENT_READ_COUNT_REQUEST_PFTGT),
+	MAC_EVENT_ATTR(read-count-request-normal, MAC_EVENT_READ_COUNT_REQUEST_NORMAL),
+	MAC_EVENT_ATTR(read-count-return-pftgt-hit, MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT),
+	MAC_EVENT_ATTR(read-count-return-pftgt-miss, MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS),
+	MAC_EVENT_ATTR(read-wait, MAC_EVENT_READ_WAIT),
+	MAC_EVENT_ATTR(write-count, MAC_EVENT_WRITE_COUNT),
+	MAC_EVENT_ATTR(write-count-write, MAC_EVENT_WRITE_COUNT_WRITE),
+	MAC_EVENT_ATTR(write-count-pwrite, MAC_EVENT_WRITE_COUNT_PWRITE),
+	MAC_EVENT_ATTR(memory-read-count, MAC_EVENT_MEMORY_READ_COUNT),
+	MAC_EVENT_ATTR(memory-write-count, MAC_EVENT_MEMORY_WRITE_COUNT),
+	MAC_EVENT_ATTR(memory-pwrite-count, MAC_EVENT_MEMORY_PWRITE_COUNT),
+	MAC_EVENT_ATTR(ea-mac, MAC_EVENT_EA_MAC),
+	MAC_EVENT_ATTR(ea-memory, MAC_EVENT_EA_MEMORY),
+	MAC_EVENT_ATTR(ea-memory-mac-write, MAC_EVENT_EA_MEMORY_MAC_WRITE),
+	MAC_EVENT_ATTR(ea-ha, MAC_EVENT_EA_HA),
+	NULL
+};
+
+static const struct attribute_group fujitsu_mac_pmu_events_group = {
+	.name = "events",
+	.attrs = fujitsu_mac_pmu_events,
+};
+
+static ssize_t cpumask_show(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	struct mac_pmu *macpmu = to_mac_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, &macpmu->cpumask);
+}
+static DEVICE_ATTR_RO(cpumask);
+
+static struct attribute *fujitsu_mac_pmu_cpumask_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL
+};
+
+static const struct attribute_group fujitsu_mac_pmu_cpumask_attr_group = {
+	.attrs = fujitsu_mac_pmu_cpumask_attrs,
+};
+
+static const struct attribute_group *fujitsu_mac_pmu_attr_grps[] = {
+	&fujitsu_mac_pmu_format_group,
+	&fujitsu_mac_pmu_events_group,
+	&fujitsu_mac_pmu_cpumask_attr_group,
+	NULL
+};
+
+static int fujitsu_mac_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, node);
+
+	/* If there is not a CPU/PMU association pick this CPU */
+	if (cpumask_empty(&macpmu->cpumask))
+		cpumask_set_cpu(cpu, &macpmu->cpumask);
+
+	return 0;
+}
+
+static int fujitsu_mac_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, node);
+	unsigned int target;
+
+	if (!cpumask_test_and_clear_cpu(cpu, &macpmu->cpumask))
+		return 0;
+
+	target = cpumask_any_but(cpu_online_mask, cpu);
+	if (target >= nr_cpu_ids)
+		return 0;
+
+	perf_pmu_migrate_context(&macpmu->pmu, cpu, target);
+	cpumask_set_cpu(target, &macpmu->cpumask);
+
+	return 0;
+}
+
+static int fujitsu_mac_pmu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct acpi_device *acpi_dev;
+	struct mac_pmu *macpmu;
+	struct resource *memrc;
+	char *name;
+	int ret;
+	u64 uid;
+
+	acpi_dev = ACPI_COMPANION(dev);
+	if (!acpi_dev)
+		return -ENODEV;
+
+	ret = acpi_dev_uid_to_integer(acpi_dev, &uid);
+	if (ret)
+		return dev_err_probe(dev, ret, "unable to read ACPI uid\n");
+
+	macpmu = devm_kzalloc(dev, sizeof(*macpmu), GFP_KERNEL);
+	if (!macpmu)
+		return -ENOMEM;
+
+	name = devm_kasprintf(dev, GFP_KERNEL, "mac_iod%llu_mac%llu_ch%llu",
+			      (uid >> 8) & 0xF, (uid >> 4) & 0xF, uid & 0xF);
+	if (!name)
+		return -ENOMEM;
+
+	macpmu->pmu = (struct pmu) {
+		.parent		= dev,
+		.task_ctx_nr	= perf_invalid_context,
+
+		.pmu_enable	= fujitsu_mac__pmu_enable,
+		.pmu_disable	= fujitsu_mac__pmu_disable,
+		.event_init	= fujitsu_mac__event_init,
+		.add		= fujitsu_mac__event_add,
+		.del		= fujitsu_mac__event_del,
+		.start		= fujitsu_mac__event_start,
+		.stop		= fujitsu_mac__event_stop,
+		.read		= fujitsu_mac__event_read,
+
+		.attr_groups	= fujitsu_mac_pmu_attr_grps,
+		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
+	};
+
+	macpmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &memrc);
+	if (IS_ERR(macpmu->regs))
+		return PTR_ERR(macpmu->regs);
+
+	fujitsu_mac__init(macpmu);
+
+	ret = platform_get_irq(pdev, 0);
+	if (ret < 0)
+		return ret;
+
+	ret = devm_request_irq(dev, ret, fujitsu_mac__handle_irq, 0,
+			       name, macpmu);
+	if (ret)
+		return dev_err_probe(dev, ret, "Request for IRQ failed for slice @%pa\n",
+				     &memrc->start);
+
+	/* Add this instance to the list used by the offline callback */
+	ret = cpuhp_state_add_instance(mac_pmu_cpuhp_state, &macpmu->node);
+	if (ret)
+		return dev_err_probe(dev, ret, "Error registering hotplug");
+
+	ret = perf_pmu_register(&macpmu->pmu, name, -1);
+	if (ret < 0)
+		return dev_err_probe(dev, ret, "Failed to register MAC PMU\n");
+
+	dev_dbg(dev, "Registered %s, type: %d\n", name, macpmu->pmu.type);
+
+	return 0;
+}
+
+static const struct acpi_device_id fujitsu_mac_pmu_acpi_match[] = {
+	{ "FUJI200C", },
+	{ }
+};
+MODULE_DEVICE_TABLE(acpi, fujitsu_mac_pmu_acpi_match);
+
+static struct platform_driver fujitsu_mac_pmu_driver = {
+	.driver = {
+		.name = "fujitsu-mac-pmu",
+		.acpi_match_table = fujitsu_mac_pmu_acpi_match,
+		.suppress_bind_attrs = true,
+	},
+	.probe = fujitsu_mac_pmu_probe,
+};
+
+static int __init register_fujitsu_mac_pmu_driver(void)
+{
+	int ret;
+
+	/* Install a hook to update the reader CPU in case it goes offline */
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "perf/fujitsu/mac:online",
+				      fujitsu_mac_pmu_online_cpu,
+				      fujitsu_mac_pmu_offline_cpu);
+	if (ret < 0)
+		return ret;
+
+	mac_pmu_cpuhp_state = ret;
+	return platform_driver_register(&fujitsu_mac_pmu_driver);
+}
+device_initcall(register_fujitsu_mac_pmu_driver);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v7 2/2] perf: Fujitsu: Add the Uncore PCI PMU driver
  2025-08-15  3:47 [PATCH v7 0/2] perf: Fujitsu: Add Uncore MAC/PCI PMU driver Koichi Okuno
  2025-08-15  3:47 ` [PATCH v7 1/2] perf: Fujitsu: Add the Uncore MAC " Koichi Okuno
@ 2025-08-15  3:47 ` Koichi Okuno
  2025-08-15 11:36   ` Jonathan Cameron
  2025-08-15 12:59   ` Robin Murphy
  1 sibling, 2 replies; 10+ messages in thread
From: Koichi Okuno @ 2025-08-15  3:47 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas,
	Gowthami Thiagarajan, Linu Cherian, Robin Murphy,
	linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven,
	Krzysztof Kozlowski, Konrad Dybcio, Neil Armstrong, Arnd Bergmann,
	Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra,
	Jonathan Cameron, linux-doc, linux-kernel, Koichi Okuno

This adds a new dynamic PMU to the Perf Events framework to program and
control the Uncore PCI PMUs in Fujitsu chips.

This driver was created with reference to drivers/perf/qcom_l3_pmu.c.

This driver exports formatting and event information to sysfs so it can
be used by the perf user space tools with the syntaxes:

perf stat -e pci_iod0_pci0/ea-pci/ ls
perf stat -e pci_iod0_pci0/event=0x80/ ls

FUJITSU-MONAKA PMU Events Specification v1.1 URL:
https://github.com/fujitsu/FUJITSU-MONAKA

Signed-off-by: Koichi Okuno <fj2767dz@fujitsu.com>
---
 .../admin-guide/perf/fujitsu_pci_pmu.rst      |  50 ++
 Documentation/admin-guide/perf/index.rst      |   1 +
 drivers/perf/Kconfig                          |   9 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/fujitsu_pci_pmu.c                | 536 ++++++++++++++++++
 5 files changed, 597 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/fujitsu_pci_pmu.rst
 create mode 100644 drivers/perf/fujitsu_pci_pmu.c

diff --git a/Documentation/admin-guide/perf/fujitsu_pci_pmu.rst b/Documentation/admin-guide/perf/fujitsu_pci_pmu.rst
new file mode 100644
index 000000000000..117cefc4d185
--- /dev/null
+++ b/Documentation/admin-guide/perf/fujitsu_pci_pmu.rst
@@ -0,0 +1,50 @@
+====================================================
+Fujitsu Uncore PCI Performance Monitoring Unit (PMU)
+====================================================
+
+This driver supports the Uncore PCI PMUs found in Fujitsu chips.
+Each PCI PMU on these chips is exposed as a uncore perf PMU with device name
+pci_iod<iod>_pci<pci>.
+
+The driver provides a description of its available events and configuration
+options in sysfs, see /sys/bus/event_sources/devices/pci_iod<iod>_pci<pci>/.
+This driver exports:
+- formats, used by perf user space and other tools to configure events
+- events, used by perf user space and other tools to create events
+  symbolically, e.g.:
+    perf stat -a -e pci_iod0_pci0/event=0x24/ ls
+- cpumask, used by perf user space and other tools to know on which CPUs
+  to open the events
+
+This driver supports the following events:
+- pci-port0-cycles
+  This event counts PCI cycles at PCI frequency in port0.
+- pci-port0-read-count
+  This event counts read transactions for data transfer in port0.
+- pci-port0-read-count-bus
+  This event counts read transactions for bus usage in port0.
+- pci-port0-write-count
+  This event counts write transactions for data transfer in port0.
+- pci-port0-write-count-bus
+  This event counts write transactions for bus usage in port0.
+- pci-port1-cycles
+  This event counts PCI cycles at PCI frequency in port1.
+- pci-port1-read-count
+  This event counts read transactions for data transfer in port1.
+- pci-port1-read-count-bus
+  This event counts read transactions for bus usage in port1.
+- pci-port1-write-count
+  This event counts write transactions for data transfer in port1.
+- pci-port1-write-count-bus
+  This event counts write transactions for bus usage in port1.
+- ea-pci
+  This event counts energy consumption of PCI.
+
+  'ea' is the abbreviation for 'Energy Analyzer'.
+
+Examples for use with perf::
+
+  perf stat -e pci_iod0_pci0/ea-pci/ ls
+
+Given that these are uncore PMUs the driver does not support sampling, therefore
+"perf record" will not work. Per-task perf sessions are not supported.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index e0262060db17..50cc039ad702 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -30,3 +30,4 @@ Performance monitor support
    ampere_cspmu
    mrvl-pem-pmu
    fujitsu_mac_pmu
+   fujitsu_pci_pmu
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index ab4e44c99a53..3d86710d9c73 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -187,6 +187,15 @@ config FUJITSU_MAC_PMU
 	 Adds the Uncore MAC PMU into the perf events subsystem for
 	 monitoring Uncore MAC events.
 
+config FUJITSU_PCI_PMU
+	bool "Fujitsu Uncore PCI PMU"
+	depends on (ARM64 && (ACPI || COMPILE_TEST))
+	help
+	 Provides support for the Uncore PCI performance monitor unit (PMU)
+	 in Fujitsu processors.
+	 Adds the Uncore PCI PMU into the perf events subsystem for
+	 monitoring Uncore PCI events.
+
 config QCOM_L2_PMU
 	bool "Qualcomm Technologies L2-cache PMU"
 	depends on ARCH_QCOM && ARM64 && ACPI
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index be7f74c3b14c..6d4a33cd8211 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_FSL_IMX8_DDR_PMU) += fsl_imx8_ddr_perf.o
 obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o
 obj-$(CONFIG_HISI_PMU) += hisilicon/
 obj-$(CONFIG_FUJITSU_MAC_PMU) += fujitsu_mac_pmu.o
+obj-$(CONFIG_FUJITSU_PCI_PMU) += fujitsu_pci_pmu.o
 obj-$(CONFIG_QCOM_L2_PMU)	+= qcom_l2_pmu.o
 obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
 obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o
diff --git a/drivers/perf/fujitsu_pci_pmu.c b/drivers/perf/fujitsu_pci_pmu.c
new file mode 100644
index 000000000000..dbe050eec711
--- /dev/null
+++ b/drivers/perf/fujitsu_pci_pmu.c
@@ -0,0 +1,536 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Driver for the Uncore PCI PMUs in Fujitsu chips.
+ *
+ * See Documentation/admin-guide/perf/fujitsu_pci_pmu.rst for more details.
+ *
+ * This driver is based on drivers/perf/qcom_l3_pmu.c
+ * Copyright (c) 2015-2017, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2024 Fujitsu. All rights reserved.
+ */
+
+#include <linux/acpi.h>
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/list.h>
+#include <linux/mod_devicetable.h>
+#include <linux/module.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+
+/* Number of counters on each PMU */
+#define PCI_NUM_COUNTERS  8
+/* Mask for the event type field within perf_event_attr.config and EVTYPE reg */
+#define PCI_EVTYPE_MASK   0xFF
+
+/* Perfmon registers */
+#define PCI_PM_EVCNTR(__cntr)           (0x000 + (__cntr) * 8)
+#define PCI_PM_CNTCTL(__cntr)           (0x100 + (__cntr) * 8)
+#define PCI_PM_CNTCTL_RESET             0
+#define PCI_PM_EVTYPE(__cntr)           (0x200 + (__cntr) * 8)
+#define PCI_PM_EVTYPE_EVSEL(__val)      FIELD_GET(PCI_EVTYPE_MASK, __val)
+#define PCI_PM_CR                       0x400
+#define PCI_PM_CR_RESET                 BIT(1)
+#define PCI_PM_CR_ENABLE                BIT(0)
+#define PCI_PM_CNTENSET                 0x410
+#define PCI_PM_CNTENSET_IDX(__cntr)     BIT(__cntr)
+#define PCI_PM_CNTENCLR                 0x418
+#define PCI_PM_CNTENCLR_IDX(__cntr)     BIT(__cntr)
+#define PCI_PM_CNTENCLR_RESET           0xFF
+#define PCI_PM_INTENSET                 0x420
+#define PCI_PM_INTENSET_IDX(__cntr)     BIT(__cntr)
+#define PCI_PM_INTENCLR                 0x428
+#define PCI_PM_INTENCLR_IDX(__cntr)     BIT(__cntr)
+#define PCI_PM_INTENCLR_RESET           0xFF
+#define PCI_PM_OVSR                     0x440
+#define PCI_PM_OVSR_OVSRCLR_RESET       0xFF
+
+#define PCI_EVENT_PORT0_CYCLES          0x000
+#define PCI_EVENT_PORT0_READ_COUNT      0x010
+#define PCI_EVENT_PORT0_READ_COUNT_BUS  0x014
+#define PCI_EVENT_PORT0_WRITE_COUNT     0x020
+#define PCI_EVENT_PORT0_WRITE_COUNT_BUS 0x024
+#define PCI_EVENT_PORT1_CYCLES          0x040
+#define PCI_EVENT_PORT1_READ_COUNT      0x050
+#define PCI_EVENT_PORT1_READ_COUNT_BUS  0x054
+#define PCI_EVENT_PORT1_WRITE_COUNT     0x060
+#define PCI_EVENT_PORT1_WRITE_COUNT_BUS 0x064
+#define PCI_EVENT_EA_PCI                0x080
+
+struct pci_pmu {
+	struct pmu		pmu;
+	struct hlist_node	node;
+	void __iomem		*regs;
+	struct perf_event	*events[PCI_NUM_COUNTERS];
+	unsigned long		used_mask[BITS_TO_LONGS(PCI_NUM_COUNTERS)];
+	cpumask_t		cpumask;
+};
+
+static int pci_pmu_cpuhp_state;
+
+#define to_pci_pmu(p) (container_of(p, struct pci_pmu, pmu))
+
+static void fujitsu_pci_counter_start(struct perf_event *event)
+{
+	struct pci_pmu *pcipmu = to_pci_pmu(event->pmu);
+	int idx = event->hw.idx;
+
+	/* Initialize the hardware counter and reset prev_count*/
+	local64_set(&event->hw.prev_count, 0);
+	writeq_relaxed(0, pcipmu->regs + PCI_PM_EVCNTR(idx));
+
+	/* Set the event type */
+	writeq_relaxed(PCI_PM_EVTYPE_EVSEL(event->attr.config), pcipmu->regs + PCI_PM_EVTYPE(idx));
+
+	/* Enable interrupt generation by this counter */
+	writeq_relaxed(PCI_PM_INTENSET_IDX(idx), pcipmu->regs + PCI_PM_INTENSET);
+
+	/* Finally, enable the counter */
+	writeq_relaxed(PCI_PM_CNTCTL_RESET, pcipmu->regs + PCI_PM_CNTCTL(idx));
+	writeq_relaxed(PCI_PM_CNTENSET_IDX(idx), pcipmu->regs + PCI_PM_CNTENSET);
+}
+
+static void fujitsu_pci_counter_stop(struct perf_event *event,
+				     int flags)
+{
+	struct pci_pmu *pcipmu = to_pci_pmu(event->pmu);
+	int idx = event->hw.idx;
+
+	/* Disable the counter */
+	writeq_relaxed(PCI_PM_CNTENCLR_IDX(idx), pcipmu->regs + PCI_PM_CNTENCLR);
+
+	/* Disable interrupt generation by this counter */
+	writeq_relaxed(PCI_PM_INTENCLR_IDX(idx), pcipmu->regs + PCI_PM_INTENCLR);
+}
+
+static void fujitsu_pci_counter_update(struct perf_event *event)
+{
+	struct pci_pmu *pcipmu = to_pci_pmu(event->pmu);
+	int idx = event->hw.idx;
+	u64 prev, new;
+
+	do {
+		prev = local64_read(&event->hw.prev_count);
+		new = readq_relaxed(pcipmu->regs + PCI_PM_EVCNTR(idx));
+	} while (local64_cmpxchg(&event->hw.prev_count, prev, new) != prev);
+
+	local64_add(new - prev, &event->count);
+}
+
+static inline void fujitsu_pci__init(struct pci_pmu *pcipmu)
+{
+	int i;
+
+	writeq_relaxed(PCI_PM_CR_RESET, pcipmu->regs + PCI_PM_CR);
+
+	writeq_relaxed(PCI_PM_CNTENCLR_RESET, pcipmu->regs + PCI_PM_CNTENCLR);
+	writeq_relaxed(PCI_PM_INTENCLR_RESET, pcipmu->regs + PCI_PM_INTENCLR);
+	writeq_relaxed(PCI_PM_OVSR_OVSRCLR_RESET, pcipmu->regs + PCI_PM_OVSR);
+
+	for (i = 0; i < PCI_NUM_COUNTERS; ++i) {
+		writeq_relaxed(PCI_PM_CNTCTL_RESET, pcipmu->regs + PCI_PM_CNTCTL(i));
+		writeq_relaxed(PCI_PM_EVTYPE_EVSEL(0), pcipmu->regs + PCI_PM_EVTYPE(i));
+	}
+
+	/*
+	 * Use writeq here to ensure all programming commands are done
+	 * before proceeding
+	 */
+	writeq(PCI_PM_CR_ENABLE, pcipmu->regs + PCI_PM_CR);
+}
+
+static irqreturn_t fujitsu_pci__handle_irq(int irq_num, void *data)
+{
+	struct pci_pmu *pcipmu = data;
+	/* Read the overflow status register */
+	long status = readq_relaxed(pcipmu->regs + PCI_PM_OVSR);
+	int idx;
+
+	if (status == 0)
+		return IRQ_NONE;
+
+	/* Clear the bits we read on the overflow status register */
+	writeq_relaxed(status, pcipmu->regs + PCI_PM_OVSR);
+
+	for_each_set_bit(idx, &status, PCI_NUM_COUNTERS) {
+		struct perf_event *event;
+
+		event = pcipmu->events[idx];
+		if (!event)
+			continue;
+
+		fujitsu_pci_counter_update(event);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static void fujitsu_pci__pmu_enable(struct pmu *pmu)
+{
+	struct pci_pmu *pcipmu = to_pci_pmu(pmu);
+
+	/* Ensure the other programming commands are observed before enabling */
+	wmb();
+
+	writeq_relaxed(PCI_PM_CR_ENABLE, pcipmu->regs + PCI_PM_CR);
+}
+
+static void fujitsu_pci__pmu_disable(struct pmu *pmu)
+{
+	struct pci_pmu *pcipmu = to_pci_pmu(pmu);
+
+	writeq_relaxed(0, pcipmu->regs + PCI_PM_CR);
+
+	/* Ensure the basic counter unit is stopped before proceeding */
+	wmb();
+}
+
+/*
+ * We must NOT create groups containing events from multiple hardware PMUs,
+ * although mixing different software and hardware PMUs is allowed.
+ */
+static bool fujitsu_pci__validate_event_group(struct perf_event *event)
+{
+	struct perf_event *leader = event->group_leader;
+	struct perf_event *sibling;
+	int counters;
+
+	if (leader->pmu != event->pmu && !is_software_event(leader))
+		return false;
+
+	/* The sum of the counters used by the event and its leader event */
+	counters = 2;
+
+	for_each_sibling_event(sibling, leader) {
+		if (is_software_event(sibling))
+			continue;
+		if (sibling->pmu != event->pmu)
+			return false;
+		counters++;
+	}
+
+	/*
+	 * If the group requires more counters than the HW has, it
+	 * cannot ever be scheduled.
+	 */
+	return counters <= PCI_NUM_COUNTERS;
+}
+
+static int fujitsu_pci__event_init(struct perf_event *event)
+{
+	struct pci_pmu *pcipmu = to_pci_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	/* Is the event for this PMU? */
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/*
+	 * Sampling not supported since these events are not
+	 * core-attributable.
+	 */
+	if (hwc->sample_period)
+		return -EINVAL;
+
+	/*
+	 * Task mode not available, we run the counters as socket counters,
+	 * not attributable to any CPU and therefore cannot attribute per-task.
+	 */
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	/* Validate the group */
+	if (!fujitsu_pci__validate_event_group(event))
+		return -EINVAL;
+
+	hwc->idx = -1;
+
+	/*
+	 * Many perf core operations (eg. events rotation) operate on a
+	 * single CPU context. This is obvious for CPU PMUs, where one
+	 * expects the same sets of events being observed on all CPUs,
+	 * but can lead to issues for off-core PMUs, like this one, where
+	 * each event could be theoretically assigned to a different CPU.
+	 * To mitigate this, we enforce CPU assignment to one designated
+	 * processor (the one described in the "cpumask" attribute exported
+	 * by the PMU device). perf user space tools honor this and avoid
+	 * opening more than one copy of the events.
+	 */
+	event->cpu = cpumask_first(&pcipmu->cpumask);
+
+	return 0;
+}
+
+static void fujitsu_pci__event_start(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	hwc->state = 0;
+	fujitsu_pci_counter_start(event);
+}
+
+static void fujitsu_pci__event_stop(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (hwc->state & PERF_HES_STOPPED)
+		return;
+
+	fujitsu_pci_counter_stop(event, flags);
+	if (flags & PERF_EF_UPDATE)
+		fujitsu_pci_counter_update(event);
+	hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+static int fujitsu_pci__event_add(struct perf_event *event, int flags)
+{
+	struct pci_pmu *pcipmu = to_pci_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx;
+
+	/* Try to allocate a counter. */
+	idx = bitmap_find_free_region(pcipmu->used_mask, PCI_NUM_COUNTERS, 0);
+	if (idx < 0)
+		/* The counters are all in use. */
+		return -EAGAIN;
+
+	hwc->idx = idx;
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+	pcipmu->events[idx] = event;
+
+	if (flags & PERF_EF_START)
+		fujitsu_pci__event_start(event, 0);
+
+	/* Propagate changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+
+	return 0;
+}
+
+static void fujitsu_pci__event_del(struct perf_event *event, int flags)
+{
+	struct pci_pmu *pcipmu = to_pci_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	/* Stop and clean up */
+	fujitsu_pci__event_stop(event, flags | PERF_EF_UPDATE);
+	pcipmu->events[hwc->idx] = NULL;
+	bitmap_release_region(pcipmu->used_mask, hwc->idx, 0);
+
+	/* Propagate changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+}
+
+static void fujitsu_pci__event_read(struct perf_event *event)
+{
+	fujitsu_pci_counter_update(event);
+}
+
+#define PCI_PMU_FORMAT_ATTR(_name, _config)				      \
+	(&((struct dev_ext_attribute[]) {				      \
+		{ .attr = __ATTR(_name, 0444, device_show_string, NULL),      \
+		  .var = (void *) _config, }				      \
+	})[0].attr.attr)
+
+static struct attribute *fujitsu_pci_pmu_formats[] = {
+	PCI_PMU_FORMAT_ATTR(event, "config:0-7"),
+	NULL
+};
+
+static const struct attribute_group fujitsu_pci_pmu_format_group = {
+	.name = "format",
+	.attrs = fujitsu_pci_pmu_formats,
+};
+
+static ssize_t pci_pmu_event_show(struct device *dev,
+				  struct device_attribute *attr, char *page)
+{
+	struct perf_pmu_events_attr *pmu_attr;
+
+	pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
+	return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id);
+}
+
+#define PCI_EVENT_ATTR(_name, _id)					     \
+	PMU_EVENT_ATTR_ID(_name, pci_pmu_event_show, _id)
+
+static struct attribute *fujitsu_pci_pmu_events[] = {
+	PCI_EVENT_ATTR(pci-port0-cycles, PCI_EVENT_PORT0_CYCLES),
+	PCI_EVENT_ATTR(pci-port0-read-count, PCI_EVENT_PORT0_READ_COUNT),
+	PCI_EVENT_ATTR(pci-port0-read-count-bus, PCI_EVENT_PORT0_READ_COUNT_BUS),
+	PCI_EVENT_ATTR(pci-port0-write-count, PCI_EVENT_PORT0_WRITE_COUNT),
+	PCI_EVENT_ATTR(pci-port0-write-count-bus, PCI_EVENT_PORT0_WRITE_COUNT_BUS),
+	PCI_EVENT_ATTR(pci-port1-cycles, PCI_EVENT_PORT1_CYCLES),
+	PCI_EVENT_ATTR(pci-port1-read-count, PCI_EVENT_PORT1_READ_COUNT),
+	PCI_EVENT_ATTR(pci-port1-read-count-bus, PCI_EVENT_PORT1_READ_COUNT_BUS),
+	PCI_EVENT_ATTR(pci-port1-write-count, PCI_EVENT_PORT1_WRITE_COUNT),
+	PCI_EVENT_ATTR(pci-port1-write-count-bus, PCI_EVENT_PORT1_WRITE_COUNT_BUS),
+	PCI_EVENT_ATTR(ea-pci, PCI_EVENT_EA_PCI),
+	NULL
+};
+
+static const struct attribute_group fujitsu_pci_pmu_events_group = {
+	.name = "events",
+	.attrs = fujitsu_pci_pmu_events,
+};
+
+static ssize_t cpumask_show(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	struct pci_pmu *pcipmu = to_pci_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, &pcipmu->cpumask);
+}
+static DEVICE_ATTR_RO(cpumask);
+
+static struct attribute *fujitsu_pci_pmu_cpumask_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL
+};
+
+static const struct attribute_group fujitsu_pci_pmu_cpumask_attr_group = {
+	.attrs = fujitsu_pci_pmu_cpumask_attrs,
+};
+
+static const struct attribute_group *fujitsu_pci_pmu_attr_grps[] = {
+	&fujitsu_pci_pmu_format_group,
+	&fujitsu_pci_pmu_events_group,
+	&fujitsu_pci_pmu_cpumask_attr_group,
+	NULL
+};
+
+static int fujitsu_pci_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct pci_pmu *pcipmu = hlist_entry_safe(node, struct pci_pmu, node);
+
+	/* If there is not a CPU/PMU association pick this CPU */
+	if (cpumask_empty(&pcipmu->cpumask))
+		cpumask_set_cpu(cpu, &pcipmu->cpumask);
+
+	return 0;
+}
+
+static int fujitsu_pci_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct pci_pmu *pcipmu = hlist_entry_safe(node, struct pci_pmu, node);
+	unsigned int target;
+
+	if (!cpumask_test_and_clear_cpu(cpu, &pcipmu->cpumask))
+		return 0;
+
+	target = cpumask_any_but(cpu_online_mask, cpu);
+	if (target >= nr_cpu_ids)
+		return 0;
+
+	perf_pmu_migrate_context(&pcipmu->pmu, cpu, target);
+	cpumask_set_cpu(target, &pcipmu->cpumask);
+
+	return 0;
+}
+
+static int fujitsu_pci_pmu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct acpi_device *acpi_dev;
+	struct pci_pmu *pcipmu;
+	struct resource *memrc;
+	char *name;
+	int ret;
+	u64 uid;
+
+	acpi_dev = ACPI_COMPANION(dev);
+	if (!acpi_dev)
+		return -ENODEV;
+
+	ret = acpi_dev_uid_to_integer(acpi_dev, &uid);
+	if (ret)
+		return dev_err_probe(dev, ret, "unable to read ACPI uid\n");
+
+	pcipmu = devm_kzalloc(dev, sizeof(*pcipmu), GFP_KERNEL);
+	if (!pcipmu)
+		return -ENOMEM;
+
+	name = devm_kasprintf(dev, GFP_KERNEL, "pci_iod%llu_pci%llu",
+			      (uid >> 4) & 0xF, uid & 0xF);
+	if (!name)
+		return -ENOMEM;
+
+	pcipmu->pmu = (struct pmu) {
+		.parent		= dev,
+		.task_ctx_nr	= perf_invalid_context,
+
+		.pmu_enable	= fujitsu_pci__pmu_enable,
+		.pmu_disable	= fujitsu_pci__pmu_disable,
+		.event_init	= fujitsu_pci__event_init,
+		.add		= fujitsu_pci__event_add,
+		.del		= fujitsu_pci__event_del,
+		.start		= fujitsu_pci__event_start,
+		.stop		= fujitsu_pci__event_stop,
+		.read		= fujitsu_pci__event_read,
+
+		.attr_groups	= fujitsu_pci_pmu_attr_grps,
+		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
+	};
+
+	pcipmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &memrc);
+	if (IS_ERR(pcipmu->regs))
+		return PTR_ERR(pcipmu->regs);
+
+	fujitsu_pci__init(pcipmu);
+
+	ret = platform_get_irq(pdev, 0);
+	if (ret < 0)
+		return ret;
+
+	ret = devm_request_irq(dev, ret, fujitsu_pci__handle_irq, 0,
+			       name, pcipmu);
+	if (ret)
+		return dev_err_probe(dev, ret, "Request for IRQ failed for slice @%pa\n",
+				     &memrc->start);
+
+	/* Add this instance to the list used by the offline callback */
+	ret = cpuhp_state_add_instance(pci_pmu_cpuhp_state, &pcipmu->node);
+	if (ret)
+		return dev_err_probe(dev, ret, "Error registering hotplug");
+
+	ret = perf_pmu_register(&pcipmu->pmu, name, -1);
+	if (ret < 0)
+		return dev_err_probe(dev, ret, "Failed to register PCI PMU\n");
+
+	dev_dbg(dev, "Registered %s, type: %d\n", name, pcipmu->pmu.type);
+
+	return 0;
+}
+
+static const struct acpi_device_id fujitsu_pci_pmu_acpi_match[] = {
+	{ "FUJI200D", },
+	{ }
+};
+MODULE_DEVICE_TABLE(acpi, fujitsu_pci_pmu_acpi_match);
+
+static struct platform_driver fujitsu_pci_pmu_driver = {
+	.driver = {
+		.name = "fujitsu-pci-pmu",
+		.acpi_match_table = fujitsu_pci_pmu_acpi_match,
+		.suppress_bind_attrs = true,
+	},
+	.probe = fujitsu_pci_pmu_probe,
+};
+
+static int __init register_fujitsu_pci_pmu_driver(void)
+{
+	int ret;
+
+	/* Install a hook to update the reader CPU in case it goes offline */
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "perf/fujitsu/pci:online",
+				      fujitsu_pci_pmu_online_cpu,
+				      fujitsu_pci_pmu_offline_cpu);
+	if (ret < 0)
+		return ret;
+
+	pci_pmu_cpuhp_state = ret;
+	return platform_driver_register(&fujitsu_pci_pmu_driver);
+}
+device_initcall(register_fujitsu_pci_pmu_driver);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver
  2025-08-15  3:47 ` [PATCH v7 1/2] perf: Fujitsu: Add the Uncore MAC " Koichi Okuno
@ 2025-08-15 11:33   ` Jonathan Cameron
  2025-08-15 12:39   ` Robin Murphy
  1 sibling, 0 replies; 10+ messages in thread
From: Jonathan Cameron @ 2025-08-15 11:33 UTC (permalink / raw)
  To: Koichi Okuno
  Cc: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas,
	Gowthami Thiagarajan, Linu Cherian, Robin Murphy,
	linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven,
	Krzysztof Kozlowski, Konrad Dybcio, Neil Armstrong, Arnd Bergmann,
	Nícolas "F. R. A. Prado", Thomas Gleixner,
	Peter Zijlstra, linux-doc, linux-kernel

On Fri, 15 Aug 2025 12:47:28 +0900
Koichi Okuno <fj2767dz@fujitsu.com> wrote:

> This adds a new dynamic PMU to the Perf Events framework to program and
> control the Uncore MAC PMUs in Fujitsu chips.
> 
> This driver was created with reference to drivers/perf/qcom_l3_pmu.c.
> 
> This driver exports formatting and event information to sysfs so it can
> be used by the perf user space tools with the syntaxes:
> 
> perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
> perf stat -e mac_iod0_mac0_ch0/event=0x80/ ls
> 
> FUJITSU-MONAKA PMU Events Specification v1.1 URL:
> https://github.com/fujitsu/FUJITSU-MONAKA
> 
> Signed-off-by: Koichi Okuno <fj2767dz@fujitsu.com>

Hi, just one slight doubt inline around interrupt affinities.
I can't quite remember what breaks when those don't move with the
perf state and so can't immediately see if relevant here.

Anyhow, if that is resolved (either by you saying it's definitely
not an issue) or adding the irq_set_affinity() calls, then

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>

Jonathan

> diff --git a/drivers/perf/fujitsu_mac_pmu.c b/drivers/perf/fujitsu_mac_pmu.c
> new file mode 100644
> index 000000000000..1031e0221bb2
> --- /dev/null
> +++ b/drivers/perf/fujitsu_mac_pmu.c
> @@ -0,0 +1,552 @@

> +#define MAC_EVENT_CYCLES                        0x000
Maybe drop the lading 0 on all these values? 0x00
etc as we don't go beyond 0xa0

> +#define MAC_EVENT_READ_COUNT                    0x010
> +#define MAC_EVENT_READ_COUNT_REQUEST            0x011
> +#define MAC_EVENT_READ_COUNT_RETURN             0x012
> +#define MAC_EVENT_READ_COUNT_REQUEST_PFTGT      0x013
> +#define MAC_EVENT_READ_COUNT_REQUEST_NORMAL     0x014
> +#define MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT   0x015
> +#define MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS  0x016
> +#define MAC_EVENT_READ_WAIT                     0x017
> +#define MAC_EVENT_WRITE_COUNT                   0x020
> +#define MAC_EVENT_WRITE_COUNT_WRITE             0x021
> +#define MAC_EVENT_WRITE_COUNT_PWRITE            0x022
> +#define MAC_EVENT_MEMORY_READ_COUNT             0x040
> +#define MAC_EVENT_MEMORY_WRITE_COUNT            0x050
> +#define MAC_EVENT_MEMORY_PWRITE_COUNT           0x060
> +#define MAC_EVENT_EA_MAC                        0x080
> +#define MAC_EVENT_EA_MEMORY                     0x090
> +#define MAC_EVENT_EA_MEMORY_MAC_WRITE           0x092
> +#define MAC_EVENT_EA_HA                         0x0a0
> +
> +struct mac_pmu {
> +	struct pmu		pmu;
> +	struct hlist_node	node;
> +	void __iomem		*regs;
> +	struct perf_event	*events[MAC_NUM_COUNTERS];
> +	unsigned long		used_mask[BITS_TO_LONGS(MAC_NUM_COUNTERS)];
> +	cpumask_t		cpumask;
> +};

> +static void fujitsu_mac_counter_update(struct perf_event *event)
> +{
> +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> +	int idx = event->hw.idx;
> +	u64 prev, new;
> +
> +	do {
> +		prev = local64_read(&event->hw.prev_count);
> +		new = readq_relaxed(macpmu->regs + MAC_PM_EVCNTR(idx));
> +	} while (local64_cmpxchg(&event->hw.prev_count, prev, new) != prev);
> +
> +	local64_add(new - prev, &event->count);
> +}



> +static int fujitsu_mac_pmu_probe(struct platform_device *pdev)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct acpi_device *acpi_dev;
> +	struct mac_pmu *macpmu;
> +	struct resource *memrc;
> +	char *name;
> +	int ret;
> +	u64 uid;
> +
> +	acpi_dev = ACPI_COMPANION(dev);
> +	if (!acpi_dev)
> +		return -ENODEV;
> +
> +	ret = acpi_dev_uid_to_integer(acpi_dev, &uid);
> +	if (ret)
> +		return dev_err_probe(dev, ret, "unable to read ACPI uid\n");
> +
> +	macpmu = devm_kzalloc(dev, sizeof(*macpmu), GFP_KERNEL);
> +	if (!macpmu)
> +		return -ENOMEM;
> +
> +	name = devm_kasprintf(dev, GFP_KERNEL, "mac_iod%llu_mac%llu_ch%llu",
> +			      (uid >> 8) & 0xF, (uid >> 4) & 0xF, uid & 0xF);
> +	if (!name)
> +		return -ENOMEM;
> +
> +	macpmu->pmu = (struct pmu) {
> +		.parent		= dev,
> +		.task_ctx_nr	= perf_invalid_context,
> +
> +		.pmu_enable	= fujitsu_mac__pmu_enable,
> +		.pmu_disable	= fujitsu_mac__pmu_disable,
> +		.event_init	= fujitsu_mac__event_init,
> +		.add		= fujitsu_mac__event_add,
> +		.del		= fujitsu_mac__event_del,
> +		.start		= fujitsu_mac__event_start,
> +		.stop		= fujitsu_mac__event_stop,
> +		.read		= fujitsu_mac__event_read,
> +
> +		.attr_groups	= fujitsu_mac_pmu_attr_grps,
> +		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
> +	};
> +
> +	macpmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &memrc);
> +	if (IS_ERR(macpmu->regs))
> +		return PTR_ERR(macpmu->regs);
> +
> +	fujitsu_mac__init(macpmu);
> +
> +	ret = platform_get_irq(pdev, 0);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = devm_request_irq(dev, ret, fujitsu_mac__handle_irq, 0,
> +			       name, macpmu);

It's been a while since I wrote a perf driver, but I'm a bit surprised to not
see irq affinity management in here.  I remember running into issues with
per cpu variables deep in perf when we didn't ensure the perf state and
the irq remain on the same cpu core.  

There might be something different here though.

> +	if (ret)
> +		return dev_err_probe(dev, ret, "Request for IRQ failed for slice @%pa\n",
> +				     &memrc->start);
> +
> +	/* Add this instance to the list used by the offline callback */
> +	ret = cpuhp_state_add_instance(mac_pmu_cpuhp_state, &macpmu->node);
> +	if (ret)
> +		return dev_err_probe(dev, ret, "Error registering hotplug");
> +
> +	ret = perf_pmu_register(&macpmu->pmu, name, -1);
> +	if (ret < 0)
> +		return dev_err_probe(dev, ret, "Failed to register MAC PMU\n");
> +
> +	dev_dbg(dev, "Registered %s, type: %d\n", name, macpmu->pmu.type);
> +
> +	return 0;
> +}


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 2/2] perf: Fujitsu: Add the Uncore PCI PMU driver
  2025-08-15  3:47 ` [PATCH v7 2/2] perf: Fujitsu: Add the Uncore PCI " Koichi Okuno
@ 2025-08-15 11:36   ` Jonathan Cameron
  2025-08-15 12:59   ` Robin Murphy
  1 sibling, 0 replies; 10+ messages in thread
From: Jonathan Cameron @ 2025-08-15 11:36 UTC (permalink / raw)
  To: Koichi Okuno
  Cc: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas,
	Gowthami Thiagarajan, Linu Cherian, Robin Murphy,
	linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven,
	Krzysztof Kozlowski, Konrad Dybcio, Neil Armstrong, Arnd Bergmann,
	Nícolas "F. R. A. Prado", Thomas Gleixner,
	Peter Zijlstra, linux-doc, linux-kernel

On Fri, 15 Aug 2025 12:47:29 +0900
Koichi Okuno <fj2767dz@fujitsu.com> wrote:

> This adds a new dynamic PMU to the Perf Events framework to program and
> control the Uncore PCI PMUs in Fujitsu chips.
> 
> This driver was created with reference to drivers/perf/qcom_l3_pmu.c.
> 
> This driver exports formatting and event information to sysfs so it can
> be used by the perf user space tools with the syntaxes:
> 
> perf stat -e pci_iod0_pci0/ea-pci/ ls
> perf stat -e pci_iod0_pci0/event=0x80/ ls
> 
> FUJITSU-MONAKA PMU Events Specification v1.1 URL:
> https://github.com/fujitsu/FUJITSU-MONAKA
> 
> Signed-off-by: Koichi Okuno <fj2767dz@fujitsu.com>
Hi,

Just the same question on irq affinity as in previous patch.
So likewise if resolved

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver
  2025-08-15  3:47 ` [PATCH v7 1/2] perf: Fujitsu: Add the Uncore MAC " Koichi Okuno
  2025-08-15 11:33   ` Jonathan Cameron
@ 2025-08-15 12:39   ` Robin Murphy
  2025-08-15 15:51     ` Jonathan Cameron
  2025-09-04  6:37     ` Koichi Okuno (Fujitsu)
  1 sibling, 2 replies; 10+ messages in thread
From: Robin Murphy @ 2025-08-15 12:39 UTC (permalink / raw)
  To: Koichi Okuno, Will Deacon, Mark Rutland, Jonathan Corbet,
	Catalin Marinas, Gowthami Thiagarajan, Linu Cherian,
	linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven,
	Krzysztof Kozlowski, Konrad Dybcio, Neil Armstrong, Arnd Bergmann,
	Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra,
	Jonathan Cameron, linux-doc, linux-kernel

On 2025-08-15 4:47 am, Koichi Okuno wrote:
> This adds a new dynamic PMU to the Perf Events framework to program and
> control the Uncore MAC PMUs in Fujitsu chips.
> 
> This driver was created with reference to drivers/perf/qcom_l3_pmu.c.
> 
> This driver exports formatting and event information to sysfs so it can
> be used by the perf user space tools with the syntaxes:
> 
> perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
> perf stat -e mac_iod0_mac0_ch0/event=0x80/ ls
> 
> FUJITSU-MONAKA PMU Events Specification v1.1 URL:
> https://github.com/fujitsu/FUJITSU-MONAKA
> 
> Signed-off-by: Koichi Okuno <fj2767dz@fujitsu.com>
> ---
>   .../admin-guide/perf/fujitsu_mac_pmu.rst      |  73 +++
>   Documentation/admin-guide/perf/index.rst      |   1 +
>   drivers/perf/Kconfig                          |   9 +
>   drivers/perf/Makefile                         |   1 +
>   drivers/perf/fujitsu_mac_pmu.c                | 552 ++++++++++++++++++
>   5 files changed, 636 insertions(+)
>   create mode 100644 Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
>   create mode 100644 drivers/perf/fujitsu_mac_pmu.c
> 
> diff --git a/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
> new file mode 100644
> index 000000000000..383f3c1bbde7
> --- /dev/null
> +++ b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
> @@ -0,0 +1,73 @@
> +====================================================
> +Fujitsu Uncore MAC Performance Monitoring Unit (PMU)
> +====================================================
> +
> +This driver supports the Uncore MAC PMUs found in Fujitsu chips.
> +Each MAC PMU on these chips is exposed as a uncore perf PMU with device name
> +mac_iod<iod>_mac<mac>_ch<ch>.
> +
> +The driver provides a description of its available events and configuration
> +options in sysfs, see /sys/bus/event_sources/devices/mac_iod<iod>_mac<mac>_ch<ch>/.
> +This driver exports:
> +- formats, used by perf user space and other tools to configure events
> +- events, used by perf user space and other tools to create events
> +  symbolically, e.g.:
> +    perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls
> +- cpumask, used by perf user space and other tools to know on which CPUs
> +  to open the events
> +
> +This driver supports the following events:
> +- cycles
> +  This event counts MAC cycles at MAC frequency.
> +- read-count
> +  This event counts the number of read requests to MAC.
> +- read-count-request
> +  This event counts the number of read requests including retry to MAC.
> +- read-count-return
> +  This event counts the number of responses to read requests to MAC.
> +- read-count-request-pftgt
> +  This event counts the number of read requests including retry with PFTGT
> +  flag.
> +- read-count-request-normal
> +  This event counts the number of read requests including retry without PFTGT
> +  flag.
> +- read-count-return-pftgt-hit
> +  This event counts the number of responses to read requests which hit the
> +  PFTGT buffer.
> +- read-count-return-pftgt-miss
> +  This event counts the number of responses to read requests which miss the
> +  PFTGT buffer.
> +- read-wait
> +  This event counts outstanding read requests issued by DDR memory controller
> +  per cycle.
> +- write-count
> +  This event counts the number of write requests to MAC (including zero write,
> +  full write, partial write, write cancel).
> +- write-count-write
> +  This event counts the number of full write requests to MAC (not including
> +  zero write).
> +- write-count-pwrite
> +  This event counts the number of partial write requests to MAC.
> +- memory-read-count
> +  This event counts the number of read requests from MAC to memory.
> +- memory-write-count
> +  This event counts the number of full write requests from MAC to memory.
> +- memory-pwrite-count
> +  This event counts the number of partial write requests from MAC to memory.
> +- ea-mac
> +  This event counts energy consumption of MAC.
> +- ea-memory
> +  This event counts energy consumption of memory.
> +- ea-memory-mac-write
> +  This event counts the number of write requests from MAC to memory.
> +- ea-ha
> +  This event counts energy consumption of HA.
> +
> +  'ea' is the abbreviation for 'Energy Analyzer'.
> +
> +Examples for use with perf::
> +
> +  perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
> +
> +Given that these are uncore PMUs the driver does not support sampling, therefore
> +"perf record" will not work. Per-task perf sessions are not supported.
> diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
> index 072b510385c4..e0262060db17 100644
> --- a/Documentation/admin-guide/perf/index.rst
> +++ b/Documentation/admin-guide/perf/index.rst
> @@ -29,3 +29,4 @@ Performance monitor support
>      cxl
>      ampere_cspmu
>      mrvl-pem-pmu
> +   fujitsu_mac_pmu
> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> index a9188dec36fe..ab4e44c99a53 100644
> --- a/drivers/perf/Kconfig
> +++ b/drivers/perf/Kconfig
> @@ -178,6 +178,15 @@ config FSL_IMX9_DDR_PMU
>   	 can give information about memory throughput and other related
>   	 events.
>   
> +config FUJITSU_MAC_PMU
> +	bool "Fujitsu Uncore MAC PMU"

Why not a module? Note that copying some other driver that itself has no 
good reason not to be modular really doesn't count. TBH I'm not sure why 
you've focused on qcom_l3 in the first place - AFAICS this has nothing 
in common with that beyond what is common to teh majority of other 
MMIO-based system PMU drivers as well.

(And frankly I find modules so much more convenient just for development 
and debugging, regardless of distro/end user concerns!)

> +	depends on (ARM64 && (ACPI || COMPILE_TEST))

Not "(ARM64 && ACPI) || COMPILE_TEST"? I don't see any obvious 
compile-time dependency on a particular CPU architecture.

(you can #include <linux/io-64-nonatomic-lo-hi.h> to get 
readq()/writeq() fallbacks for 32-bit build-tests)
> +	help
> +	 Provides support for the Uncore MAC performance monitor unit (PMU)
> +	 in Fujitsu processors.
> +	 Adds the Uncore MAC PMU into the perf events subsystem for
> +	 monitoring Uncore MAC events.
> +
>   config QCOM_L2_PMU
>   	bool "Qualcomm Technologies L2-cache PMU"
>   	depends on ARCH_QCOM && ARM64 && ACPI
> diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
> index 192fc8b16204..be7f74c3b14c 100644
> --- a/drivers/perf/Makefile
> +++ b/drivers/perf/Makefile
> @@ -14,6 +14,7 @@ obj-$(CONFIG_ARM_SMMU_V3_PMU) += arm_smmuv3_pmu.o
>   obj-$(CONFIG_FSL_IMX8_DDR_PMU) += fsl_imx8_ddr_perf.o
>   obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o
>   obj-$(CONFIG_HISI_PMU) += hisilicon/
> +obj-$(CONFIG_FUJITSU_MAC_PMU) += fujitsu_mac_pmu.o
>   obj-$(CONFIG_QCOM_L2_PMU)	+= qcom_l2_pmu.o
>   obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
>   obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o
> diff --git a/drivers/perf/fujitsu_mac_pmu.c b/drivers/perf/fujitsu_mac_pmu.c
> new file mode 100644
> index 000000000000..1031e0221bb2
> --- /dev/null
> +++ b/drivers/perf/fujitsu_mac_pmu.c
> @@ -0,0 +1,552 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Driver for the Uncore MAC PMUs in Fujitsu chips.
> + *
> + * See Documentation/admin-guide/perf/fujitsu_mac_pmu.rst for more details.
> + *
> + * This driver is based on drivers/perf/qcom_l3_pmu.c
> + * Copyright (c) 2015-2017, The Linux Foundation. All rights reserved.
> + * Copyright (c) 2024 Fujitsu. All rights reserved.
> + */
> +
> +#include <linux/acpi.h>
> +#include <linux/bitops.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/list.h>
> +#include <linux/mod_devicetable.h>
> +#include <linux/module.h>
> +#include <linux/perf_event.h>
> +#include <linux/platform_device.h>
> +
> +/* Number of counters on each PMU */
> +#define MAC_NUM_COUNTERS  8
> +/* Mask for the event type field within perf_event_attr.config and EVTYPE reg */
> +#define MAC_EVTYPE_MASK   0xFF
> +
> +/* Perfmon registers */
> +#define MAC_PM_EVCNTR(__cntr)           (0x000 + (__cntr) * 8)
> +#define MAC_PM_CNTCTL(__cntr)           (0x100 + (__cntr) * 8)
> +#define MAC_PM_CNTCTL_RESET             0
> +#define MAC_PM_EVTYPE(__cntr)           (0x200 + (__cntr) * 8)
> +#define MAC_PM_EVTYPE_EVSEL(__val)      FIELD_GET(MAC_EVTYPE_MASK, __val)
> +#define MAC_PM_CR                       0x400
> +#define MAC_PM_CR_RESET                 BIT(1)
> +#define MAC_PM_CR_ENABLE                BIT(0)
> +#define MAC_PM_CNTENSET                 0x410
> +#define MAC_PM_CNTENSET_IDX(__cntr)     BIT(__cntr)
> +#define MAC_PM_CNTENCLR                 0x418
> +#define MAC_PM_CNTENCLR_IDX(__cntr)     BIT(__cntr)
> +#define MAC_PM_CNTENCLR_RESET           0xFF
> +#define MAC_PM_INTENSET                 0x420
> +#define MAC_PM_INTENSET_IDX(__cntr)     BIT(__cntr)
> +#define MAC_PM_INTENCLR                 0x428
> +#define MAC_PM_INTENCLR_IDX(__cntr)     BIT(__cntr)
> +#define MAC_PM_INTENCLR_RESET           0xFF
> +#define MAC_PM_OVSR                     0x440
> +#define MAC_PM_OVSR_OVSRCLR_RESET       0xFF
> +
> +#define MAC_EVENT_CYCLES                        0x000
> +#define MAC_EVENT_READ_COUNT                    0x010
> +#define MAC_EVENT_READ_COUNT_REQUEST            0x011
> +#define MAC_EVENT_READ_COUNT_RETURN             0x012
> +#define MAC_EVENT_READ_COUNT_REQUEST_PFTGT      0x013
> +#define MAC_EVENT_READ_COUNT_REQUEST_NORMAL     0x014
> +#define MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT   0x015
> +#define MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS  0x016
> +#define MAC_EVENT_READ_WAIT                     0x017
> +#define MAC_EVENT_WRITE_COUNT                   0x020
> +#define MAC_EVENT_WRITE_COUNT_WRITE             0x021
> +#define MAC_EVENT_WRITE_COUNT_PWRITE            0x022
> +#define MAC_EVENT_MEMORY_READ_COUNT             0x040
> +#define MAC_EVENT_MEMORY_WRITE_COUNT            0x050
> +#define MAC_EVENT_MEMORY_PWRITE_COUNT           0x060
> +#define MAC_EVENT_EA_MAC                        0x080
> +#define MAC_EVENT_EA_MEMORY                     0x090
> +#define MAC_EVENT_EA_MEMORY_MAC_WRITE           0x092
> +#define MAC_EVENT_EA_HA                         0x0a0

If these are strictly 8-bit values (per MAC_EVTYPE_MASK) then please 
drop the confusingly-significant-looking leading 0s.

> +
> +struct mac_pmu {
> +	struct pmu		pmu;
> +	struct hlist_node	node;
> +	void __iomem		*regs;
> +	struct perf_event	*events[MAC_NUM_COUNTERS];
> +	unsigned long		used_mask[BITS_TO_LONGS(MAC_NUM_COUNTERS)];
> +	cpumask_t		cpumask;
FWIW if you only associate with a single CPU then you can just store an 
int rather than a whole cpumask (plenty of examples in other drivers)

> +};
> +
> +#define to_mac_pmu(p) (container_of(p, struct mac_pmu, pmu))
> +
> +static int mac_pmu_cpuhp_state;
> +
> +static void fujitsu_mac_counter_start(struct perf_event *event)
> +{
> +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> +	int idx = event->hw.idx;
> +
> +	/* Initialize the hardware counter and reset prev_count*/
> +	local64_set(&event->hw.prev_count, 0);
> +	writeq_relaxed(0, macpmu->regs + MAC_PM_EVCNTR(idx));
> +
> +	/* Set the event type */
> +	writeq_relaxed(MAC_PM_EVTYPE_EVSEL(event->attr.config), macpmu->regs + MAC_PM_EVTYPE(idx));
> +
> +	/* Enable interrupt generation by this counter */
> +	writeq_relaxed(MAC_PM_INTENSET_IDX(idx), macpmu->regs + MAC_PM_INTENSET);
> +
> +	/* Finally, enable the counter */
> +	writeq_relaxed(MAC_PM_CNTCTL_RESET, macpmu->regs + MAC_PM_CNTCTL(idx));
> +	writeq_relaxed(MAC_PM_CNTENSET_IDX(idx), macpmu->regs + MAC_PM_CNTENSET);
> +}
> +
> +static void fujitsu_mac_counter_stop(struct perf_event *event,
> +				     int flags)
> +{
> +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> +	int idx = event->hw.idx;
> +
> +	/* Disable the counter */
> +	writeq_relaxed(MAC_PM_CNTENCLR_IDX(idx), macpmu->regs + MAC_PM_CNTENCLR);
> +
> +	/* Disable interrupt generation by this counter */
> +	writeq_relaxed(MAC_PM_INTENCLR_IDX(idx), macpmu->regs + MAC_PM_INTENCLR);
> +}
> +
> +static void fujitsu_mac_counter_update(struct perf_event *event)
> +{
> +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> +	int idx = event->hw.idx;
> +	u64 prev, new;
> +
> +	do {
> +		prev = local64_read(&event->hw.prev_count);
> +		new = readq_relaxed(macpmu->regs + MAC_PM_EVCNTR(idx));
> +	} while (local64_cmpxchg(&event->hw.prev_count, prev, new) != prev);
> +
> +	local64_add(new - prev, &event->count);
> +}
> +
> +static inline void fujitsu_mac__init(struct mac_pmu *macpmu)
> +{
> +	int i;
> +
> +	writeq_relaxed(MAC_PM_CR_RESET, macpmu->regs + MAC_PM_CR);
> +
> +	writeq_relaxed(MAC_PM_CNTENCLR_RESET, macpmu->regs + MAC_PM_CNTENCLR);
> +	writeq_relaxed(MAC_PM_INTENCLR_RESET, macpmu->regs + MAC_PM_INTENCLR);
> +	writeq_relaxed(MAC_PM_OVSR_OVSRCLR_RESET, macpmu->regs + MAC_PM_OVSR);
> +
> +	for (i = 0; i < MAC_NUM_COUNTERS; ++i) {
> +		writeq_relaxed(MAC_PM_CNTCTL_RESET, macpmu->regs + MAC_PM_CNTCTL(i));
> +		writeq_relaxed(MAC_PM_EVTYPE_EVSEL(0), macpmu->regs + MAC_PM_EVTYPE(i));
> +	}
> +
> +	/*
> +	 * Use writeq here to ensure all programming commands are done
> +	 * before proceeding
> +	 */

Are you certain an explicit barrier is necessary here? Since your 
Kconfig implies this should only ever run on ARM64, the ioremap() 
results in a Device-nGnRx memory type, for which the CPU architecture 
already requires that accesses to the same peripheral occur in program 
order.

> +	writeq(MAC_PM_CR_ENABLE, macpmu->regs + MAC_PM_CR);
> +}
> +
> +static irqreturn_t fujitsu_mac__handle_irq(int irq_num, void *data)
> +{
> +	struct mac_pmu *macpmu = data;
> +	/* Read the overflow status register */
> +	long status = readq_relaxed(macpmu->regs + MAC_PM_OVSR);
> +	int idx;
> +
> +	if (status == 0)
> +		return IRQ_NONE;
> +
> +	/* Clear the bits we read on the overflow status register */
> +	writeq_relaxed(status, macpmu->regs + MAC_PM_OVSR);
> +
> +	for_each_set_bit(idx, &status, MAC_NUM_COUNTERS) {
> +		struct perf_event *event;
> +
> +		event = macpmu->events[idx];
> +		if (!event)
> +			continue;
> +
> +		fujitsu_mac_counter_update(event);
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static void fujitsu_mac__pmu_enable(struct pmu *pmu)
> +{
> +	struct mac_pmu *macpmu = to_mac_pmu(pmu);
> +
> +	/* Ensure the other programming commands are observed before enabling */
> +	wmb();

Same here. At best it might be somewhat logical to have a DSB *after* 
the enable, as below, to try to ensure the PMU itself has seen the write 
if it was early-acknowledged, but most drivers don't bother since we're 
hardly dealing with cycle-accurate timings here anyway.

> +	writeq_relaxed(MAC_PM_CR_ENABLE, macpmu->regs + MAC_PM_CR);
> +}
> +
> +static void fujitsu_mac__pmu_disable(struct pmu *pmu)
> +{
> +	struct mac_pmu *macpmu = to_mac_pmu(pmu);
> +
> +	writeq_relaxed(0, macpmu->regs + MAC_PM_CR);
> +
> +	/* Ensure the basic counter unit is stopped before proceeding */
> +	wmb();
> +}
> +
> +/*
> + * We must NOT create groups containing events from multiple hardware PMUs,
> + * although mixing different software and hardware PMUs is allowed.
> + */

Let's stop copying that comment around, it really isn't relevant any longer.

> +static bool fujitsu_mac__validate_event_group(struct perf_event *event)
> +{
> +	struct perf_event *leader = event->group_leader;
> +	struct perf_event *sibling;
> +	int counters;
> +
> +	if (leader->pmu != event->pmu && !is_software_event(leader))
> +		return false;
> +
> +	/* The sum of the counters used by the event and its leader event */
> +	counters = 2;

This is mis-counting a software leader as this PMU's event, but also 
mishandling the case when leader == event and it's not valid to touch 
the sibling list. What you want here is to simply count your own events:

	int counters = 1;

	if (leader == event)
		return true;

	if (leader->pmu == event->pmu)
		counters++;

> +
> +	for_each_sibling_event(sibling, leader) {
> +		if (is_software_event(sibling))
> +			continue;
> +		if (sibling->pmu != event->pmu)
> +			return false;
> +		counters++;

And here just:

		if (sibling->pmu == event->pmu)
			counters++;

The core code will take care of rejecting inappropriate cross-PMU groups.

> +	}
> +
> +	/*
> +	 * If the group requires more counters than the HW has, it
> +	 * cannot ever be scheduled.
> +	 */
> +	return counters <= MAC_NUM_COUNTERS;
> +}
> +
> +static int fujitsu_mac__event_init(struct perf_event *event)
> +{
> +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +
> +	/* Is the event for this PMU? */
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	/*
> +	 * Sampling not supported since these events are not
> +	 * core-attributable.
> +	 */
> +	if (hwc->sample_period)
> +		return -EINVAL;

Strictly that's redundant now that you added PERF_PMU_CAP_NO_INTERRUPT, 
however I'm now trying to remove NO_INTERRUPT, so perhaps it might be 
better to keep this for the moment depending on how things end up landing.
  > +
> +	/*
> +	 * Task mode not available, we run the counters as socket counters,
> +	 * not attributable to any CPU and therefore cannot attribute per-task.
> +	 */
> +	if (event->cpu < 0)
> +		return -EINVAL;

You can drop this, perf_event_alloc() already takes care of it (see the 
CPU/task check on entry and the "Disallow uncore-task events" comment 
later - there's no harm if the driver itself nominally accepts the event 
in between)

> +
> +	/* Validate the group */
> +	if (!fujitsu_mac__validate_event_group(event))
> +		return -EINVAL;
> +
> +	hwc->idx = -1;
> +
> +	/*
> +	 * Many perf core operations (eg. events rotation) operate on a
> +	 * single CPU context. This is obvious for CPU PMUs, where one
> +	 * expects the same sets of events being observed on all CPUs,
> +	 * but can lead to issues for off-core PMUs, like this one, where
> +	 * each event could be theoretically assigned to a different CPU.
> +	 * To mitigate this, we enforce CPU assignment to one designated
> +	 * processor (the one described in the "cpumask" attribute exported
> +	 * by the PMU device). perf user space tools honor this and avoid
> +	 * opening more than one copy of the events.
> +	 */
> +	event->cpu = cpumask_first(&macpmu->cpumask);
> +
> +	return 0;
> +}
> +
> +static void fujitsu_mac__event_start(struct perf_event *event, int flags)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +
> +	hwc->state = 0;
> +	fujitsu_mac_counter_start(event);
> +}
> +
> +static void fujitsu_mac__event_stop(struct perf_event *event, int flags)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +
> +	if (hwc->state & PERF_HES_STOPPED)
> +		return;
> +
> +	fujitsu_mac_counter_stop(event, flags);
> +	if (flags & PERF_EF_UPDATE)
> +		fujitsu_mac_counter_update(event);
> +	hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
> +}
> +
> +static int fujitsu_mac__event_add(struct perf_event *event, int flags)
> +{
> +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	int idx;
> +
> +	/* Try to allocate a counter. */
> +	idx = bitmap_find_free_region(macpmu->used_mask, MAC_NUM_COUNTERS, 0);
> +	if (idx < 0)
> +		/* The counters are all in use. */
> +		return -EAGAIN;
> +
> +	hwc->idx = idx;
> +	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
> +	macpmu->events[idx] = event;
> +
> +	if (flags & PERF_EF_START)
> +		fujitsu_mac__event_start(event, 0);
> +
> +	/* Propagate changes to the userspace mapping. */
> +	perf_event_update_userpage(event);
> +
> +	return 0;
> +}
> +
> +static void fujitsu_mac__event_del(struct perf_event *event, int flags)
> +{
> +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +
> +	/* Stop and clean up */
> +	fujitsu_mac__event_stop(event, flags | PERF_EF_UPDATE);
> +	macpmu->events[hwc->idx] = NULL;
> +	bitmap_release_region(macpmu->used_mask, hwc->idx, 0);
> +
> +	/* Propagate changes to the userspace mapping. */
> +	perf_event_update_userpage(event);
> +}
> +
> +static void fujitsu_mac__event_read(struct perf_event *event)
> +{
> +	fujitsu_mac_counter_update(event);
> +}
> +
> +#define MAC_PMU_FORMAT_ATTR(_name, _config)				      \
> +	(&((struct dev_ext_attribute[]) {				      \
> +		{ .attr = __ATTR(_name, 0444, device_show_string, NULL),      \
> +		  .var = (void *) _config, }				      \
> +	})[0].attr.attr)
> +
> +static struct attribute *fujitsu_mac_pmu_formats[] = {
> +	MAC_PMU_FORMAT_ATTR(event, "config:0-7"),
> +	NULL
> +};
> +
> +static const struct attribute_group fujitsu_mac_pmu_format_group = {
> +	.name = "format",
> +	.attrs = fujitsu_mac_pmu_formats,
> +};
> +
> +static ssize_t mac_pmu_event_show(struct device *dev,
> +				  struct device_attribute *attr, char *page)
> +{
> +	struct perf_pmu_events_attr *pmu_attr;
> +
> +	pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
> +	return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id);
> +}
> +
> +#define MAC_EVENT_ATTR(_name, _id)					     \
> +	PMU_EVENT_ATTR_ID(_name, mac_pmu_event_show, _id)
> +
> +static struct attribute *fujitsu_mac_pmu_events[] = {
> +	MAC_EVENT_ATTR(cycles, MAC_EVENT_CYCLES),
> +	MAC_EVENT_ATTR(read-count, MAC_EVENT_READ_COUNT),
> +	MAC_EVENT_ATTR(read-count-request, MAC_EVENT_READ_COUNT_REQUEST),
> +	MAC_EVENT_ATTR(read-count-return, MAC_EVENT_READ_COUNT_RETURN),
> +	MAC_EVENT_ATTR(read-count-request-pftgt, MAC_EVENT_READ_COUNT_REQUEST_PFTGT),
> +	MAC_EVENT_ATTR(read-count-request-normal, MAC_EVENT_READ_COUNT_REQUEST_NORMAL),
> +	MAC_EVENT_ATTR(read-count-return-pftgt-hit, MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT),
> +	MAC_EVENT_ATTR(read-count-return-pftgt-miss, MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS),
> +	MAC_EVENT_ATTR(read-wait, MAC_EVENT_READ_WAIT),
> +	MAC_EVENT_ATTR(write-count, MAC_EVENT_WRITE_COUNT),
> +	MAC_EVENT_ATTR(write-count-write, MAC_EVENT_WRITE_COUNT_WRITE),
> +	MAC_EVENT_ATTR(write-count-pwrite, MAC_EVENT_WRITE_COUNT_PWRITE),
> +	MAC_EVENT_ATTR(memory-read-count, MAC_EVENT_MEMORY_READ_COUNT),
> +	MAC_EVENT_ATTR(memory-write-count, MAC_EVENT_MEMORY_WRITE_COUNT),
> +	MAC_EVENT_ATTR(memory-pwrite-count, MAC_EVENT_MEMORY_PWRITE_COUNT),
> +	MAC_EVENT_ATTR(ea-mac, MAC_EVENT_EA_MAC),
> +	MAC_EVENT_ATTR(ea-memory, MAC_EVENT_EA_MEMORY),
> +	MAC_EVENT_ATTR(ea-memory-mac-write, MAC_EVENT_EA_MEMORY_MAC_WRITE),
> +	MAC_EVENT_ATTR(ea-ha, MAC_EVENT_EA_HA),
I firmly maintain my opinion that if this is the only place the event 
numbers are referenced then the extra layer of macros actually makes it 
*harder* to read and follow, compared to simply:

	MAC_EVENT_ATTR(ea-ha, 0xa0),

but that is very much just one reviewer's personal opinion :)

> +	NULL
> +};
> +
> +static const struct attribute_group fujitsu_mac_pmu_events_group = {
> +	.name = "events",
> +	.attrs = fujitsu_mac_pmu_events,
> +};
> +
> +static ssize_t cpumask_show(struct device *dev,
> +			    struct device_attribute *attr, char *buf)
> +{
> +	struct mac_pmu *macpmu = to_mac_pmu(dev_get_drvdata(dev));
> +
> +	return cpumap_print_to_pagebuf(true, buf, &macpmu->cpumask);
> +}
> +static DEVICE_ATTR_RO(cpumask);
> +
> +static struct attribute *fujitsu_mac_pmu_cpumask_attrs[] = {
> +	&dev_attr_cpumask.attr,
> +	NULL
> +};
> +
> +static const struct attribute_group fujitsu_mac_pmu_cpumask_attr_group = {
> +	.attrs = fujitsu_mac_pmu_cpumask_attrs,
> +};
> +
> +static const struct attribute_group *fujitsu_mac_pmu_attr_grps[] = {
> +	&fujitsu_mac_pmu_format_group,
> +	&fujitsu_mac_pmu_events_group,
> +	&fujitsu_mac_pmu_cpumask_attr_group,
> +	NULL
> +};
> +
> +static int fujitsu_mac_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, node);
> +
> +	/* If there is not a CPU/PMU association pick this CPU */
> +	if (cpumask_empty(&macpmu->cpumask))
> +		cpumask_set_cpu(cpu, &macpmu->cpumask);

You should set the IRQ affinity to the PMU CPU as well (not that you'll 
ever realistically get an overflow with 64-bit counters, but still...)

> +
> +	return 0;
> +}
> +
> +static int fujitsu_mac_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, node);
> +	unsigned int target;
> +
> +	if (!cpumask_test_and_clear_cpu(cpu, &macpmu->cpumask))
> +		return 0;
> +
> +	target = cpumask_any_but(cpu_online_mask, cpu);
> +	if (target >= nr_cpu_ids)
> +		return 0;
> +
> +	perf_pmu_migrate_context(&macpmu->pmu, cpu, target);
> +	cpumask_set_cpu(target, &macpmu->cpumask);

Ditto.

> +
> +	return 0;
> +}
> +
> +static int fujitsu_mac_pmu_probe(struct platform_device *pdev)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct acpi_device *acpi_dev;
> +	struct mac_pmu *macpmu;
> +	struct resource *memrc;
> +	char *name;
> +	int ret;
> +	u64 uid;
> +
> +	acpi_dev = ACPI_COMPANION(dev);

acpi_dev_uid_to_integer() already handles NULL, so you may as well just 
inline this like the majority of other callers do.

> +	if (!acpi_dev)
> +		return -ENODEV;
> +
> +	ret = acpi_dev_uid_to_integer(acpi_dev, &uid);
> +	if (ret)
> +		return dev_err_probe(dev, ret, "unable to read ACPI uid\n");
> +
> +	macpmu = devm_kzalloc(dev, sizeof(*macpmu), GFP_KERNEL);
> +	if (!macpmu)
> +		return -ENOMEM;
> +
> +	name = devm_kasprintf(dev, GFP_KERNEL, "mac_iod%llu_mac%llu_ch%llu",
> +			      (uid >> 8) & 0xF, (uid >> 4) & 0xF, uid & 0xF);
> +	if (!name)
> +		return -ENOMEM;
> +
> +	macpmu->pmu = (struct pmu) {
> +		.parent		= dev,
> +		.task_ctx_nr	= perf_invalid_context,
> +
> +		.pmu_enable	= fujitsu_mac__pmu_enable,
> +		.pmu_disable	= fujitsu_mac__pmu_disable,
> +		.event_init	= fujitsu_mac__event_init,
> +		.add		= fujitsu_mac__event_add,
> +		.del		= fujitsu_mac__event_del,
> +		.start		= fujitsu_mac__event_start,
> +		.stop		= fujitsu_mac__event_stop,
> +		.read		= fujitsu_mac__event_read,
> +
> +		.attr_groups	= fujitsu_mac_pmu_attr_grps,
> +		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
> +	};
> +
> +	macpmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &memrc);
> +	if (IS_ERR(macpmu->regs))
> +		return PTR_ERR(macpmu->regs);
> +
> +	fujitsu_mac__init(macpmu);
> +
> +	ret = platform_get_irq(pdev, 0);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = devm_request_irq(dev, ret, fujitsu_mac__handle_irq, 0,
> +			       name, macpmu);
> +	if (ret)
> +		return dev_err_probe(dev, ret, "Request for IRQ failed for slice @%pa\n",

What's a slice?

> +				     &memrc->start);
> +
> +	/* Add this instance to the list used by the offline callback */
> +	ret = cpuhp_state_add_instance(mac_pmu_cpuhp_state, &macpmu->node);
> +	if (ret)
> +		return dev_err_probe(dev, ret, "Error registering hotplug");
> +
> +	ret = perf_pmu_register(&macpmu->pmu, name, -1);
> +	if (ret < 0)
> +		return dev_err_probe(dev, ret, "Failed to register MAC PMU\n");
You should remove the cpuhp state in the error case (but remember to use 
the _nocalls variant since you don't want to attempt to migrate anything 
in that case.)

Thanks,
Robin.

> +
> +	dev_dbg(dev, "Registered %s, type: %d\n", name, macpmu->pmu.type);
> +
> +	return 0;
> +}
> +
> +static const struct acpi_device_id fujitsu_mac_pmu_acpi_match[] = {
> +	{ "FUJI200C", },
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(acpi, fujitsu_mac_pmu_acpi_match);
> +
> +static struct platform_driver fujitsu_mac_pmu_driver = {
> +	.driver = {
> +		.name = "fujitsu-mac-pmu",
> +		.acpi_match_table = fujitsu_mac_pmu_acpi_match,
> +		.suppress_bind_attrs = true,
> +	},
> +	.probe = fujitsu_mac_pmu_probe,
> +};
> +
> +static int __init register_fujitsu_mac_pmu_driver(void)
> +{
> +	int ret;
> +
> +	/* Install a hook to update the reader CPU in case it goes offline */
> +	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
> +				      "perf/fujitsu/mac:online",
> +				      fujitsu_mac_pmu_online_cpu,
> +				      fujitsu_mac_pmu_offline_cpu);
> +	if (ret < 0)
> +		return ret;
> +
> +	mac_pmu_cpuhp_state = ret;
> +	return platform_driver_register(&fujitsu_mac_pmu_driver);
> +}
> +device_initcall(register_fujitsu_mac_pmu_driver);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 2/2] perf: Fujitsu: Add the Uncore PCI PMU driver
  2025-08-15  3:47 ` [PATCH v7 2/2] perf: Fujitsu: Add the Uncore PCI " Koichi Okuno
  2025-08-15 11:36   ` Jonathan Cameron
@ 2025-08-15 12:59   ` Robin Murphy
  2025-09-04  6:41     ` Koichi Okuno (Fujitsu)
  1 sibling, 1 reply; 10+ messages in thread
From: Robin Murphy @ 2025-08-15 12:59 UTC (permalink / raw)
  To: Koichi Okuno, Will Deacon, Mark Rutland, Jonathan Corbet,
	Catalin Marinas, Gowthami Thiagarajan, Linu Cherian,
	linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven,
	Krzysztof Kozlowski, Konrad Dybcio, Neil Armstrong, Arnd Bergmann,
	Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra,
	Jonathan Cameron, linux-doc, linux-kernel

On 2025-08-15 4:47 am, Koichi Okuno wrote:
> This adds a new dynamic PMU to the Perf Events framework to program and
> control the Uncore PCI PMUs in Fujitsu chips.
> 
> This driver was created with reference to drivers/perf/qcom_l3_pmu.c.
> 
> This driver exports formatting and event information to sysfs so it can
> be used by the perf user space tools with the syntaxes:
> 
> perf stat -e pci_iod0_pci0/ea-pci/ ls
> perf stat -e pci_iod0_pci0/event=0x80/ ls
> 
> FUJITSU-MONAKA PMU Events Specification v1.1 URL:
> https://github.com/fujitsu/FUJITSU-MONAKA
> 
> Signed-off-by: Koichi Okuno <fj2767dz@fujitsu.com>
> ---
>   .../admin-guide/perf/fujitsu_pci_pmu.rst      |  50 ++
>   Documentation/admin-guide/perf/index.rst      |   1 +
>   drivers/perf/Kconfig                          |   9 +
>   drivers/perf/Makefile                         |   1 +
>   drivers/perf/fujitsu_pci_pmu.c                | 536 ++++++++++++++++++

 From a quick side-by-side skim, this is a copy-paste of the exact same 
driver from patch #1 with s/mac/pci/g applied. Please don't do that. If 
the hardware is functionally the same, then it should just be a single 
driver that can then pick which PMU name and set of event alias 
attributes to use for a given instance based on the ACPI HID match 
(and/or any other ID register info you may have.)

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver
  2025-08-15 12:39   ` Robin Murphy
@ 2025-08-15 15:51     ` Jonathan Cameron
  2025-09-04  6:37     ` Koichi Okuno (Fujitsu)
  1 sibling, 0 replies; 10+ messages in thread
From: Jonathan Cameron @ 2025-08-15 15:51 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Koichi Okuno, Will Deacon, Mark Rutland, Jonathan Corbet,
	Catalin Marinas, Gowthami Thiagarajan, Linu Cherian,
	linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven,
	Krzysztof Kozlowski, Konrad Dybcio, Neil Armstrong, Arnd Bergmann,
	Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra,
	linux-doc, linux-kernel


> > +static struct attribute *fujitsu_mac_pmu_events[] = {
> > +	MAC_EVENT_ATTR(cycles, MAC_EVENT_CYCLES),
> > +	MAC_EVENT_ATTR(read-count, MAC_EVENT_READ_COUNT),
> > +	MAC_EVENT_ATTR(read-count-request, MAC_EVENT_READ_COUNT_REQUEST),
> > +	MAC_EVENT_ATTR(read-count-return, MAC_EVENT_READ_COUNT_RETURN),
> > +	MAC_EVENT_ATTR(read-count-request-pftgt, MAC_EVENT_READ_COUNT_REQUEST_PFTGT),
> > +	MAC_EVENT_ATTR(read-count-request-normal, MAC_EVENT_READ_COUNT_REQUEST_NORMAL),
> > +	MAC_EVENT_ATTR(read-count-return-pftgt-hit, MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT),
> > +	MAC_EVENT_ATTR(read-count-return-pftgt-miss, MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS),
> > +	MAC_EVENT_ATTR(read-wait, MAC_EVENT_READ_WAIT),
> > +	MAC_EVENT_ATTR(write-count, MAC_EVENT_WRITE_COUNT),
> > +	MAC_EVENT_ATTR(write-count-write, MAC_EVENT_WRITE_COUNT_WRITE),
> > +	MAC_EVENT_ATTR(write-count-pwrite, MAC_EVENT_WRITE_COUNT_PWRITE),
> > +	MAC_EVENT_ATTR(memory-read-count, MAC_EVENT_MEMORY_READ_COUNT),
> > +	MAC_EVENT_ATTR(memory-write-count, MAC_EVENT_MEMORY_WRITE_COUNT),
> > +	MAC_EVENT_ATTR(memory-pwrite-count, MAC_EVENT_MEMORY_PWRITE_COUNT),
> > +	MAC_EVENT_ATTR(ea-mac, MAC_EVENT_EA_MAC),
> > +	MAC_EVENT_ATTR(ea-memory, MAC_EVENT_EA_MEMORY),
> > +	MAC_EVENT_ATTR(ea-memory-mac-write, MAC_EVENT_EA_MEMORY_MAC_WRITE),
> > +	MAC_EVENT_ATTR(ea-ha, MAC_EVENT_EA_HA),  
> I firmly maintain my opinion that if this is the only place the event 
> numbers are referenced then the extra layer of macros actually makes it 
> *harder* to read and follow, compared to simply:
> 
> 	MAC_EVENT_ATTR(ea-ha, 0xa0),
> 
> but that is very much just one reviewer's personal opinion :)

I'll second this suggestion!  I failed to notice they were only used here.

Jonathan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH v7 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver
  2025-08-15 12:39   ` Robin Murphy
  2025-08-15 15:51     ` Jonathan Cameron
@ 2025-09-04  6:37     ` Koichi Okuno (Fujitsu)
  1 sibling, 0 replies; 10+ messages in thread
From: Koichi Okuno (Fujitsu) @ 2025-09-04  6:37 UTC (permalink / raw)
  To: 'Robin Murphy'
  Cc: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas,
	Gowthami Thiagarajan, Linu Cherian,
	linux-arm-kernel@lists.infradead.org, Bjorn Andersson,
	Geert Uytterhoeven, Krzysztof Kozlowski, Konrad Dybcio,
	Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado,
	Thomas Gleixner, Peter Zijlstra, Jonathan Cameron,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	Koichi Okuno (Fujitsu)

Hi, Robin

> > This adds a new dynamic PMU to the Perf Events framework to program and
> > control the Uncore MAC PMUs in Fujitsu chips.
> > 
> > This driver was created with reference to drivers/perf/qcom_l3_pmu.c.
> > 
> > This driver exports formatting and event information to sysfs so it can
> > be used by the perf user space tools with the syntaxes:
> > 
> > perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
> > perf stat -e mac_iod0_mac0_ch0/event=0x80/ ls
> > 
> > FUJITSU-MONAKA PMU Events Specification v1.1 URL:
> > https://github.com/fujitsu/FUJITSU-MONAKA
> > 
> > Signed-off-by: Koichi Okuno <fj2767dz@fujitsu.com>
> > ---
> >   .../admin-guide/perf/fujitsu_mac_pmu.rst      |  73 +++
> >   Documentation/admin-guide/perf/index.rst      |   1 +
> >   drivers/perf/Kconfig                          |   9 +
> >   drivers/perf/Makefile                         |   1 +
> >   drivers/perf/fujitsu_mac_pmu.c                | 552 ++++++++++++++++++
> >   5 files changed, 636 insertions(+)
> >   create mode 100644 Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
> >   create mode 100644 drivers/perf/fujitsu_mac_pmu.c
> > 
> > diff --git a/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
> > new file mode 100644
> > index 000000000000..383f3c1bbde7
> > --- /dev/null
> > +++ b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
> > @@ -0,0 +1,73 @@
> > +====================================================
> > +Fujitsu Uncore MAC Performance Monitoring Unit (PMU)
> > +====================================================
> > +
> > +This driver supports the Uncore MAC PMUs found in Fujitsu chips.
> > +Each MAC PMU on these chips is exposed as a uncore perf PMU with device name
> > +mac_iod<iod>_mac<mac>_ch<ch>.
> > +
> > +The driver provides a description of its available events and configuration
> > +options in sysfs, see /sys/bus/event_sources/devices/mac_iod<iod>_mac<mac>_ch<ch>/.
> > +This driver exports:
> > +- formats, used by perf user space and other tools to configure events
> > +- events, used by perf user space and other tools to create events
> > +  symbolically, e.g.:
> > +    perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls
> > +- cpumask, used by perf user space and other tools to know on which CPUs
> > +  to open the events
> > +
> > +This driver supports the following events:
> > +- cycles
> > +  This event counts MAC cycles at MAC frequency.
> > +- read-count
> > +  This event counts the number of read requests to MAC.
> > +- read-count-request
> > +  This event counts the number of read requests including retry to MAC.
> > +- read-count-return
> > +  This event counts the number of responses to read requests to MAC.
> > +- read-count-request-pftgt
> > +  This event counts the number of read requests including retry with PFTGT
> > +  flag.
> > +- read-count-request-normal
> > +  This event counts the number of read requests including retry without PFTGT
> > +  flag.
> > +- read-count-return-pftgt-hit
> > +  This event counts the number of responses to read requests which hit the
> > +  PFTGT buffer.
> > +- read-count-return-pftgt-miss
> > +  This event counts the number of responses to read requests which miss the
> > +  PFTGT buffer.
> > +- read-wait
> > +  This event counts outstanding read requests issued by DDR memory controller
> > +  per cycle.
> > +- write-count
> > +  This event counts the number of write requests to MAC (including zero write,
> > +  full write, partial write, write cancel).
> > +- write-count-write
> > +  This event counts the number of full write requests to MAC (not including
> > +  zero write).
> > +- write-count-pwrite
> > +  This event counts the number of partial write requests to MAC.
> > +- memory-read-count
> > +  This event counts the number of read requests from MAC to memory.
> > +- memory-write-count
> > +  This event counts the number of full write requests from MAC to memory.
> > +- memory-pwrite-count
> > +  This event counts the number of partial write requests from MAC to memory.
> > +- ea-mac
> > +  This event counts energy consumption of MAC.
> > +- ea-memory
> > +  This event counts energy consumption of memory.
> > +- ea-memory-mac-write
> > +  This event counts the number of write requests from MAC to memory.
> > +- ea-ha
> > +  This event counts energy consumption of HA.
> > +
> > +  'ea' is the abbreviation for 'Energy Analyzer'.
> > +
> > +Examples for use with perf::
> > +
> > +  perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
> > +
> > +Given that these are uncore PMUs the driver does not support sampling, therefore
> > +"perf record" will not work. Per-task perf sessions are not supported.
> > diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
> > index 072b510385c4..e0262060db17 100644
> > --- a/Documentation/admin-guide/perf/index.rst
> > +++ b/Documentation/admin-guide/perf/index.rst
> > @@ -29,3 +29,4 @@ Performance monitor support
> >      cxl
> >      ampere_cspmu
> >      mrvl-pem-pmu
> > +   fujitsu_mac_pmu
> > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> > index a9188dec36fe..ab4e44c99a53 100644
> > --- a/drivers/perf/Kconfig
> > +++ b/drivers/perf/Kconfig
> > @@ -178,6 +178,15 @@ config FSL_IMX9_DDR_PMU
> >   	 can give information about memory throughput and other related
> >   	 events.
> >   
> > +config FUJITSU_MAC_PMU
> > +	bool "Fujitsu Uncore MAC PMU"
> 
> Why not a module? Note that copying some other driver that itself has no 
> good reason not to be modular really doesn't count. TBH I'm not sure why 
> you've focused on qcom_l3 in the first place - AFAICS this has nothing 
> in common with that beyond what is common to teh majority of other 
> MMIO-based system PMU drivers as well.
> 

This driver has been modularized in v8.

> (And frankly I find modules so much more convenient just for development 
> and debugging, regardless of distro/end user concerns!)
> 
> > +	depends on (ARM64 && (ACPI || COMPILE_TEST))
> 
> Not "(ARM64 && ACPI) || COMPILE_TEST"? I don't see any obvious 
> compile-time dependency on a particular CPU architecture.
> 
> (you can #include <linux/io-64-nonatomic-lo-hi.h> to get 
> readq()/writeq() fallbacks for 32-bit build-tests)

In v3, I was advised to remove 64BIT from COMPILE_TEST due to a SPARC64
COMPILE_TEST issue, so I restricted it to ARM64 only.
In v8, I've modified it to pass COMPILE_TEST on SPARC64, allowing me to
reintroduce 64BIT for COMPILE_TEST.
Consequently, linux/io-64-nonatomic-lo-hi.h is no longer needed.

> > +	help
> > +	 Provides support for the Uncore MAC performance monitor unit (PMU)
> > +	 in Fujitsu processors.
> > +	 Adds the Uncore MAC PMU into the perf events subsystem for
> > +	 monitoring Uncore MAC events.
> > +
> >   config QCOM_L2_PMU
> >   	bool "Qualcomm Technologies L2-cache PMU"
> >   	depends on ARCH_QCOM && ARM64 && ACPI
> > diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
> > index 192fc8b16204..be7f74c3b14c 100644
> > --- a/drivers/perf/Makefile
> > +++ b/drivers/perf/Makefile
> > @@ -14,6 +14,7 @@ obj-$(CONFIG_ARM_SMMU_V3_PMU) += arm_smmuv3_pmu.o
> >   obj-$(CONFIG_FSL_IMX8_DDR_PMU) += fsl_imx8_ddr_perf.o
> >   obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o
> >   obj-$(CONFIG_HISI_PMU) += hisilicon/
> > +obj-$(CONFIG_FUJITSU_MAC_PMU) += fujitsu_mac_pmu.o
> >   obj-$(CONFIG_QCOM_L2_PMU)	+= qcom_l2_pmu.o
> >   obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
> >   obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o
> > diff --git a/drivers/perf/fujitsu_mac_pmu.c b/drivers/perf/fujitsu_mac_pmu.c
> > new file mode 100644
> > index 000000000000..1031e0221bb2
> > --- /dev/null
> > +++ b/drivers/perf/fujitsu_mac_pmu.c
> > @@ -0,0 +1,552 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Driver for the Uncore MAC PMUs in Fujitsu chips.
> > + *
> > + * See Documentation/admin-guide/perf/fujitsu_mac_pmu.rst for more details.
> > + *
> > + * This driver is based on drivers/perf/qcom_l3_pmu.c
> > + * Copyright (c) 2015-2017, The Linux Foundation. All rights reserved.
> > + * Copyright (c) 2024 Fujitsu. All rights reserved.
> > + */
> > +
> > +#include <linux/acpi.h>
> > +#include <linux/bitops.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/list.h>
> > +#include <linux/mod_devicetable.h>
> > +#include <linux/module.h>
> > +#include <linux/perf_event.h>
> > +#include <linux/platform_device.h>
> > +
> > +/* Number of counters on each PMU */
> > +#define MAC_NUM_COUNTERS  8
> > +/* Mask for the event type field within perf_event_attr.config and EVTYPE reg */
> > +#define MAC_EVTYPE_MASK   0xFF
> > +
> > +/* Perfmon registers */
> > +#define MAC_PM_EVCNTR(__cntr)           (0x000 + (__cntr) * 8)
> > +#define MAC_PM_CNTCTL(__cntr)           (0x100 + (__cntr) * 8)
> > +#define MAC_PM_CNTCTL_RESET             0
> > +#define MAC_PM_EVTYPE(__cntr)           (0x200 + (__cntr) * 8)
> > +#define MAC_PM_EVTYPE_EVSEL(__val)      FIELD_GET(MAC_EVTYPE_MASK, __val)
> > +#define MAC_PM_CR                       0x400
> > +#define MAC_PM_CR_RESET                 BIT(1)
> > +#define MAC_PM_CR_ENABLE                BIT(0)
> > +#define MAC_PM_CNTENSET                 0x410
> > +#define MAC_PM_CNTENSET_IDX(__cntr)     BIT(__cntr)
> > +#define MAC_PM_CNTENCLR                 0x418
> > +#define MAC_PM_CNTENCLR_IDX(__cntr)     BIT(__cntr)
> > +#define MAC_PM_CNTENCLR_RESET           0xFF
> > +#define MAC_PM_INTENSET                 0x420
> > +#define MAC_PM_INTENSET_IDX(__cntr)     BIT(__cntr)
> > +#define MAC_PM_INTENCLR                 0x428
> > +#define MAC_PM_INTENCLR_IDX(__cntr)     BIT(__cntr)
> > +#define MAC_PM_INTENCLR_RESET           0xFF
> > +#define MAC_PM_OVSR                     0x440
> > +#define MAC_PM_OVSR_OVSRCLR_RESET       0xFF
> > +
> > +#define MAC_EVENT_CYCLES                        0x000
> > +#define MAC_EVENT_READ_COUNT                    0x010
> > +#define MAC_EVENT_READ_COUNT_REQUEST            0x011
> > +#define MAC_EVENT_READ_COUNT_RETURN             0x012
> > +#define MAC_EVENT_READ_COUNT_REQUEST_PFTGT      0x013
> > +#define MAC_EVENT_READ_COUNT_REQUEST_NORMAL     0x014
> > +#define MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT   0x015
> > +#define MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS  0x016
> > +#define MAC_EVENT_READ_WAIT                     0x017
> > +#define MAC_EVENT_WRITE_COUNT                   0x020
> > +#define MAC_EVENT_WRITE_COUNT_WRITE             0x021
> > +#define MAC_EVENT_WRITE_COUNT_PWRITE            0x022
> > +#define MAC_EVENT_MEMORY_READ_COUNT             0x040
> > +#define MAC_EVENT_MEMORY_WRITE_COUNT            0x050
> > +#define MAC_EVENT_MEMORY_PWRITE_COUNT           0x060
> > +#define MAC_EVENT_EA_MAC                        0x080
> > +#define MAC_EVENT_EA_MEMORY                     0x090
> > +#define MAC_EVENT_EA_MEMORY_MAC_WRITE           0x092
> > +#define MAC_EVENT_EA_HA                         0x0a0
> 
> If these are strictly 8-bit values (per MAC_EVTYPE_MASK) then please 
> drop the confusingly-significant-looking leading 0s.
> 
> > +
> > +struct mac_pmu {
> > +	struct pmu		pmu;
> > +	struct hlist_node	node;
> > +	void __iomem		*regs;
> > +	struct perf_event	*events[MAC_NUM_COUNTERS];
> > +	unsigned long		used_mask[BITS_TO_LONGS(MAC_NUM_COUNTERS)];
> > +	cpumask_t		cpumask;
> FWIW if you only associate with a single CPU then you can just store an 
> int rather than a whole cpumask (plenty of examples in other drivers)
> 
> > +};
> > +
> > +#define to_mac_pmu(p) (container_of(p, struct mac_pmu, pmu))
> > +
> > +static int mac_pmu_cpuhp_state;
> > +
> > +static void fujitsu_mac_counter_start(struct perf_event *event)
> > +{
> > +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> > +	int idx = event->hw.idx;
> > +
> > +	/* Initialize the hardware counter and reset prev_count*/
> > +	local64_set(&event->hw.prev_count, 0);
> > +	writeq_relaxed(0, macpmu->regs + MAC_PM_EVCNTR(idx));
> > +
> > +	/* Set the event type */
> > +	writeq_relaxed(MAC_PM_EVTYPE_EVSEL(event->attr.config), macpmu->regs + MAC_PM_EVTYPE(idx));
> > +
> > +	/* Enable interrupt generation by this counter */
> > +	writeq_relaxed(MAC_PM_INTENSET_IDX(idx), macpmu->regs + MAC_PM_INTENSET);
> > +
> > +	/* Finally, enable the counter */
> > +	writeq_relaxed(MAC_PM_CNTCTL_RESET, macpmu->regs + MAC_PM_CNTCTL(idx));
> > +	writeq_relaxed(MAC_PM_CNTENSET_IDX(idx), macpmu->regs + MAC_PM_CNTENSET);
> > +}
> > +
> > +static void fujitsu_mac_counter_stop(struct perf_event *event,
> > +				     int flags)
> > +{
> > +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> > +	int idx = event->hw.idx;
> > +
> > +	/* Disable the counter */
> > +	writeq_relaxed(MAC_PM_CNTENCLR_IDX(idx), macpmu->regs + MAC_PM_CNTENCLR);
> > +
> > +	/* Disable interrupt generation by this counter */
> > +	writeq_relaxed(MAC_PM_INTENCLR_IDX(idx), macpmu->regs + MAC_PM_INTENCLR);
> > +}
> > +
> > +static void fujitsu_mac_counter_update(struct perf_event *event)
> > +{
> > +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> > +	int idx = event->hw.idx;
> > +	u64 prev, new;
> > +
> > +	do {
> > +		prev = local64_read(&event->hw.prev_count);
> > +		new = readq_relaxed(macpmu->regs + MAC_PM_EVCNTR(idx));
> > +	} while (local64_cmpxchg(&event->hw.prev_count, prev, new) != prev);
> > +
> > +	local64_add(new - prev, &event->count);
> > +}
> > +
> > +static inline void fujitsu_mac__init(struct mac_pmu *macpmu)
> > +{
> > +	int i;
> > +
> > +	writeq_relaxed(MAC_PM_CR_RESET, macpmu->regs + MAC_PM_CR);
> > +
> > +	writeq_relaxed(MAC_PM_CNTENCLR_RESET, macpmu->regs + MAC_PM_CNTENCLR);
> > +	writeq_relaxed(MAC_PM_INTENCLR_RESET, macpmu->regs + MAC_PM_INTENCLR);
> > +	writeq_relaxed(MAC_PM_OVSR_OVSRCLR_RESET, macpmu->regs + MAC_PM_OVSR);
> > +
> > +	for (i = 0; i < MAC_NUM_COUNTERS; ++i) {
> > +		writeq_relaxed(MAC_PM_CNTCTL_RESET, macpmu->regs + MAC_PM_CNTCTL(i));
> > +		writeq_relaxed(MAC_PM_EVTYPE_EVSEL(0), macpmu->regs + MAC_PM_EVTYPE(i));
> > +	}
> > +
> > +	/*
> > +	 * Use writeq here to ensure all programming commands are done
> > +	 * before proceeding
> > +	 */
> 
> Are you certain an explicit barrier is necessary here? Since your 
> Kconfig implies this should only ever run on ARM64, the ioremap() 
> results in a Device-nGnRx memory type, for which the CPU architecture 
> already requires that accesses to the same peripheral occur in program 
> order.

We confirmed that the barrier was not necessary due to the hardware
specifications, and removed it in v8.

> > +	writeq(MAC_PM_CR_ENABLE, macpmu->regs + MAC_PM_CR);
> > +}
> > +
> > +static irqreturn_t fujitsu_mac__handle_irq(int irq_num, void *data)
> > +{
> > +	struct mac_pmu *macpmu = data;
> > +	/* Read the overflow status register */
> > +	long status = readq_relaxed(macpmu->regs + MAC_PM_OVSR);
> > +	int idx;
> > +
> > +	if (status == 0)
> > +		return IRQ_NONE;
> > +
> > +	/* Clear the bits we read on the overflow status register */
> > +	writeq_relaxed(status, macpmu->regs + MAC_PM_OVSR);
> > +
> > +	for_each_set_bit(idx, &status, MAC_NUM_COUNTERS) {
> > +		struct perf_event *event;
> > +
> > +		event = macpmu->events[idx];
> > +		if (!event)
> > +			continue;
> > +
> > +		fujitsu_mac_counter_update(event);
> > +	}
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +static void fujitsu_mac__pmu_enable(struct pmu *pmu)
> > +{
> > +	struct mac_pmu *macpmu = to_mac_pmu(pmu);
> > +
> > +	/* Ensure the other programming commands are observed before enabling */
> > +	wmb();
> 
> Same here. At best it might be somewhat logical to have a DSB *after* 
> the enable, as below, to try to ensure the PMU itself has seen the write 
> if it was early-acknowledged, but most drivers don't bother since we're 
> hardly dealing with cycle-accurate timings here anyway.

Similarly, the barrier was removed here in v8.

> > +	writeq_relaxed(MAC_PM_CR_ENABLE, macpmu->regs + MAC_PM_CR);
> > +}
> > +
> > +static void fujitsu_mac__pmu_disable(struct pmu *pmu)
> > +{
> > +	struct mac_pmu *macpmu = to_mac_pmu(pmu);
> > +
> > +	writeq_relaxed(0, macpmu->regs + MAC_PM_CR);
> > +
> > +	/* Ensure the basic counter unit is stopped before proceeding */
> > +	wmb();
> > +}
> > +
> > +/*
> > + * We must NOT create groups containing events from multiple hardware PMUs,
> > + * although mixing different software and hardware PMUs is allowed.
> > + */
> 
> Let's stop copying that comment around, it really isn't relevant any longer.

Similarly, the barrier was also removed here in v8.

> > +static bool fujitsu_mac__validate_event_group(struct perf_event *event)
> > +{
> > +	struct perf_event *leader = event->group_leader;
> > +	struct perf_event *sibling;
> > +	int counters;
> > +
> > +	if (leader->pmu != event->pmu && !is_software_event(leader))
> > +		return false;
> > +
> > +	/* The sum of the counters used by the event and its leader event */
> > +	counters = 2;
> 
> This is mis-counting a software leader as this PMU's event, but also 
> mishandling the case when leader == event and it's not valid to touch 
> the sibling list. What you want here is to simply count your own events:
>
> 	int counters = 1;
> 
> 	if (leader == event)
> 		return true;
> 
> 	if (leader->pmu == event->pmu)
> 		counters++;
> 
> > +
> > +	for_each_sibling_event(sibling, leader) {
> > +		if (is_software_event(sibling))
> > +			continue;
> > +		if (sibling->pmu != event->pmu)
> > +			return false;
> > +		counters++;
> 
> And here just:
> 
> 		if (sibling->pmu == event->pmu)
> 			counters++;
> 
> The core code will take care of rejecting inappropriate cross-PMU groups.
> 

As you pointed out, it has been modified in v8 to count correctly.

> > +	}
> > +
> > +	/*
> > +	 * If the group requires more counters than the HW has, it
> > +	 * cannot ever be scheduled.
> > +	 */
> > +	return counters <= MAC_NUM_COUNTERS;
> > +}
> > +
> > +static int fujitsu_mac__event_init(struct perf_event *event)
> > +{
> > +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +
> > +	/* Is the event for this PMU? */
> > +	if (event->attr.type != event->pmu->type)
> > +		return -ENOENT;
> > +
> > +	/*
> > +	 * Sampling not supported since these events are not
> > +	 * core-attributable.
> > +	 */
> > +	if (hwc->sample_period)
> > +		return -EINVAL;
> 
> Strictly that's redundant now that you added PERF_PMU_CAP_NO_INTERRUPT, 
> however I'm now trying to remove NO_INTERRUPT, so perhaps it might be 
> better to keep this for the moment depending on how things end up landing.
>   > +
> > +	/*
> > +	 * Task mode not available, we run the counters as socket counters,
> > +	 * not attributable to any CPU and therefore cannot attribute per-task.
> > +	 */
> > +	if (event->cpu < 0)
> > +		return -EINVAL;
> 
> You can drop this, perf_event_alloc() already takes care of it (see the 
> CPU/task check on entry and the "Disallow uncore-task events" comment 
> later - there's no harm if the driver itself nominally accepts the event 
> in between)

Understood. I've dropped this.

> > +
> > +	/* Validate the group */
> > +	if (!fujitsu_mac__validate_event_group(event))
> > +		return -EINVAL;
> > +
> > +	hwc->idx = -1;
> > +
> > +	/*
> > +	 * Many perf core operations (eg. events rotation) operate on a
> > +	 * single CPU context. This is obvious for CPU PMUs, where one
> > +	 * expects the same sets of events being observed on all CPUs,
> > +	 * but can lead to issues for off-core PMUs, like this one, where
> > +	 * each event could be theoretically assigned to a different CPU.
> > +	 * To mitigate this, we enforce CPU assignment to one designated
> > +	 * processor (the one described in the "cpumask" attribute exported
> > +	 * by the PMU device). perf user space tools honor this and avoid
> > +	 * opening more than one copy of the events.
> > +	 */
> > +	event->cpu = cpumask_first(&macpmu->cpumask);
> > +
> > +	return 0;
> > +}
> > +
> > +static void fujitsu_mac__event_start(struct perf_event *event, int flags)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +
> > +	hwc->state = 0;
> > +	fujitsu_mac_counter_start(event);
> > +}
> > +
> > +static void fujitsu_mac__event_stop(struct perf_event *event, int flags)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +
> > +	if (hwc->state & PERF_HES_STOPPED)
> > +		return;
> > +
> > +	fujitsu_mac_counter_stop(event, flags);
> > +	if (flags & PERF_EF_UPDATE)
> > +		fujitsu_mac_counter_update(event);
> > +	hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
> > +}
> > +
> > +static int fujitsu_mac__event_add(struct perf_event *event, int flags)
> > +{
> > +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	int idx;
> > +
> > +	/* Try to allocate a counter. */
> > +	idx = bitmap_find_free_region(macpmu->used_mask, MAC_NUM_COUNTERS, 0);
> > +	if (idx < 0)
> > +		/* The counters are all in use. */
> > +		return -EAGAIN;
> > +
> > +	hwc->idx = idx;
> > +	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
> > +	macpmu->events[idx] = event;
> > +
> > +	if (flags & PERF_EF_START)
> > +		fujitsu_mac__event_start(event, 0);
> > +
> > +	/* Propagate changes to the userspace mapping. */
> > +	perf_event_update_userpage(event);
> > +
> > +	return 0;
> > +}
> > +
> > +static void fujitsu_mac__event_del(struct perf_event *event, int flags)
> > +{
> > +	struct mac_pmu *macpmu = to_mac_pmu(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +
> > +	/* Stop and clean up */
> > +	fujitsu_mac__event_stop(event, flags | PERF_EF_UPDATE);
> > +	macpmu->events[hwc->idx] = NULL;
> > +	bitmap_release_region(macpmu->used_mask, hwc->idx, 0);
> > +
> > +	/* Propagate changes to the userspace mapping. */
> > +	perf_event_update_userpage(event);
> > +}
> > +
> > +static void fujitsu_mac__event_read(struct perf_event *event)
> > +{
> > +	fujitsu_mac_counter_update(event);
> > +}
> > +
> > +#define MAC_PMU_FORMAT_ATTR(_name, _config)				      \
> > +	(&((struct dev_ext_attribute[]) {				      \
> > +		{ .attr = __ATTR(_name, 0444, device_show_string, NULL),      \
> > +		  .var = (void *) _config, }				      \
> > +	})[0].attr.attr)
> > +
> > +static struct attribute *fujitsu_mac_pmu_formats[] = {
> > +	MAC_PMU_FORMAT_ATTR(event, "config:0-7"),
> > +	NULL
> > +};
> > +
> > +static const struct attribute_group fujitsu_mac_pmu_format_group = {
> > +	.name = "format",
> > +	.attrs = fujitsu_mac_pmu_formats,
> > +};
> > +
> > +static ssize_t mac_pmu_event_show(struct device *dev,
> > +				  struct device_attribute *attr, char *page)
> > +{
> > +	struct perf_pmu_events_attr *pmu_attr;
> > +
> > +	pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
> > +	return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id);
> > +}
> > +
> > +#define MAC_EVENT_ATTR(_name, _id)					     \
> > +	PMU_EVENT_ATTR_ID(_name, mac_pmu_event_show, _id)
> > +
> > +static struct attribute *fujitsu_mac_pmu_events[] = {
> > +	MAC_EVENT_ATTR(cycles, MAC_EVENT_CYCLES),
> > +	MAC_EVENT_ATTR(read-count, MAC_EVENT_READ_COUNT),
> > +	MAC_EVENT_ATTR(read-count-request, MAC_EVENT_READ_COUNT_REQUEST),
> > +	MAC_EVENT_ATTR(read-count-return, MAC_EVENT_READ_COUNT_RETURN),
> > +	MAC_EVENT_ATTR(read-count-request-pftgt, MAC_EVENT_READ_COUNT_REQUEST_PFTGT),
> > +	MAC_EVENT_ATTR(read-count-request-normal, MAC_EVENT_READ_COUNT_REQUEST_NORMAL),
> > +	MAC_EVENT_ATTR(read-count-return-pftgt-hit, MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT),
> > +	MAC_EVENT_ATTR(read-count-return-pftgt-miss, MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS),
> > +	MAC_EVENT_ATTR(read-wait, MAC_EVENT_READ_WAIT),
> > +	MAC_EVENT_ATTR(write-count, MAC_EVENT_WRITE_COUNT),
> > +	MAC_EVENT_ATTR(write-count-write, MAC_EVENT_WRITE_COUNT_WRITE),
> > +	MAC_EVENT_ATTR(write-count-pwrite, MAC_EVENT_WRITE_COUNT_PWRITE),
> > +	MAC_EVENT_ATTR(memory-read-count, MAC_EVENT_MEMORY_READ_COUNT),
> > +	MAC_EVENT_ATTR(memory-write-count, MAC_EVENT_MEMORY_WRITE_COUNT),
> > +	MAC_EVENT_ATTR(memory-pwrite-count, MAC_EVENT_MEMORY_PWRITE_COUNT),
> > +	MAC_EVENT_ATTR(ea-mac, MAC_EVENT_EA_MAC),
> > +	MAC_EVENT_ATTR(ea-memory, MAC_EVENT_EA_MEMORY),
> > +	MAC_EVENT_ATTR(ea-memory-mac-write, MAC_EVENT_EA_MEMORY_MAC_WRITE),
> > +	MAC_EVENT_ATTR(ea-ha, MAC_EVENT_EA_HA),
> I firmly maintain my opinion that if this is the only place the event 
> numbers are referenced then the extra layer of macros actually makes it 
> *harder* to read and follow, compared to simply:
> 
> 	MAC_EVENT_ATTR(ea-ha, 0xa0),
> 
> but that is very much just one reviewer's personal opinion :)

In v8, I've modified it to directly specify the event numbers.

> > +	NULL
> > +};
> > +
> > +static const struct attribute_group fujitsu_mac_pmu_events_group = {
> > +	.name = "events",
> > +	.attrs = fujitsu_mac_pmu_events,
> > +};
> > +
> > +static ssize_t cpumask_show(struct device *dev,
> > +			    struct device_attribute *attr, char *buf)
> > +{
> > +	struct mac_pmu *macpmu = to_mac_pmu(dev_get_drvdata(dev));
> > +
> > +	return cpumap_print_to_pagebuf(true, buf, &macpmu->cpumask);
> > +}
> > +static DEVICE_ATTR_RO(cpumask);
> > +
> > +static struct attribute *fujitsu_mac_pmu_cpumask_attrs[] = {
> > +	&dev_attr_cpumask.attr,
> > +	NULL
> > +};
> > +
> > +static const struct attribute_group fujitsu_mac_pmu_cpumask_attr_group = {
> > +	.attrs = fujitsu_mac_pmu_cpumask_attrs,
> > +};
> > +
> > +static const struct attribute_group *fujitsu_mac_pmu_attr_grps[] = {
> > +	&fujitsu_mac_pmu_format_group,
> > +	&fujitsu_mac_pmu_events_group,
> > +	&fujitsu_mac_pmu_cpumask_attr_group,
> > +	NULL
> > +};
> > +
> > +static int fujitsu_mac_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
> > +{
> > +	struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, node);
> > +
> > +	/* If there is not a CPU/PMU association pick this CPU */
> > +	if (cpumask_empty(&macpmu->cpumask))
> > +		cpumask_set_cpu(cpu, &macpmu->cpumask);
> 
> You should set the IRQ affinity to the PMU CPU as well (not that you'll 
> ever realistically get an overflow with 64-bit counters, but still...)

I've added irq_set_affinity() calls in v8.

> 
> > +
> > +	return 0;
> > +}
> > +
> > +static int fujitsu_mac_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
> > +{
> > +	struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, node);
> > +	unsigned int target;
> > +
> > +	if (!cpumask_test_and_clear_cpu(cpu, &macpmu->cpumask))
> > +		return 0;
> > +
> > +	target = cpumask_any_but(cpu_online_mask, cpu);
> > +	if (target >= nr_cpu_ids)
> > +		return 0;
> > +
> > +	perf_pmu_migrate_context(&macpmu->pmu, cpu, target);
> > +	cpumask_set_cpu(target, &macpmu->cpumask);
> 
> Ditto.

Similarly, I've added irq_set_affinity() calls.

> > +
> > +	return 0;
> > +}
> > +
> > +static int fujitsu_mac_pmu_probe(struct platform_device *pdev)
> > +{
> > +	struct device *dev = &pdev->dev;
> > +	struct acpi_device *acpi_dev;
> > +	struct mac_pmu *macpmu;
> > +	struct resource *memrc;
> > +	char *name;
> > +	int ret;
> > +	u64 uid;
> > +
> > +	acpi_dev = ACPI_COMPANION(dev);
> 
> acpi_dev_uid_to_integer() already handles NULL, so you may as well just 
> inline this like the majority of other callers do.

As you pointed out, I've inlined it in v8.

> > +	if (!acpi_dev)
> > +		return -ENODEV;
> > +
> > +	ret = acpi_dev_uid_to_integer(acpi_dev, &uid);
> > +	if (ret)
> > +		return dev_err_probe(dev, ret, "unable to read ACPI uid\n");
> > +
> > +	macpmu = devm_kzalloc(dev, sizeof(*macpmu), GFP_KERNEL);
> > +	if (!macpmu)
> > +		return -ENOMEM;
> > +
> > +	name = devm_kasprintf(dev, GFP_KERNEL, "mac_iod%llu_mac%llu_ch%llu",
> > +			      (uid >> 8) & 0xF, (uid >> 4) & 0xF, uid & 0xF);
> > +	if (!name)
> > +		return -ENOMEM;
> > +
> > +	macpmu->pmu = (struct pmu) {
> > +		.parent		= dev,
> > +		.task_ctx_nr	= perf_invalid_context,
> > +
> > +		.pmu_enable	= fujitsu_mac__pmu_enable,
> > +		.pmu_disable	= fujitsu_mac__pmu_disable,
> > +		.event_init	= fujitsu_mac__event_init,
> > +		.add		= fujitsu_mac__event_add,
> > +		.del		= fujitsu_mac__event_del,
> > +		.start		= fujitsu_mac__event_start,
> > +		.stop		= fujitsu_mac__event_stop,
> > +		.read		= fujitsu_mac__event_read,
> > +
> > +		.attr_groups	= fujitsu_mac_pmu_attr_grps,
> > +		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
> > +	};
> > +
> > +	macpmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &memrc);
> > +	if (IS_ERR(macpmu->regs))
> > +		return PTR_ERR(macpmu->regs);
> > +
> > +	fujitsu_mac__init(macpmu);
> > +
> > +	ret = platform_get_irq(pdev, 0);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	ret = devm_request_irq(dev, ret, fujitsu_mac__handle_irq, 0,
> > +			       name, macpmu);
> > +	if (ret)
> > +		return dev_err_probe(dev, ret, "Request for IRQ failed for slice @%pa\n",
> 
> What's a slice?

It has been changed to irq in v8.

> > +				     &memrc->start);
> > +
> > +	/* Add this instance to the list used by the offline callback */
> > +	ret = cpuhp_state_add_instance(mac_pmu_cpuhp_state, &macpmu->node);
> > +	if (ret)
> > +		return dev_err_probe(dev, ret, "Error registering hotplug");
> > +
> > +	ret = perf_pmu_register(&macpmu->pmu, name, -1);
> > +	if (ret < 0)
> > +		return dev_err_probe(dev, ret, "Failed to register MAC PMU\n");
> You should remove the cpuhp state in the error case (but remember to use 
> the _nocalls variant since you don't want to attempt to migrate anything 
> in that case.)

I've added cpuhp_state_remove_instance_nocalls() calls in v8.

> Thanks,
> Robin.

Best Regards,
Koichi Okuno

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH v7 2/2] perf: Fujitsu: Add the Uncore PCI PMU driver
  2025-08-15 12:59   ` Robin Murphy
@ 2025-09-04  6:41     ` Koichi Okuno (Fujitsu)
  0 siblings, 0 replies; 10+ messages in thread
From: Koichi Okuno (Fujitsu) @ 2025-09-04  6:41 UTC (permalink / raw)
  To: 'Robin Murphy'
  Cc: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas,
	Gowthami Thiagarajan, Linu Cherian,
	linux-arm-kernel@lists.infradead.org, Bjorn Andersson,
	Geert Uytterhoeven, Krzysztof Kozlowski, Konrad Dybcio,
	Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado,
	Thomas Gleixner, Peter Zijlstra, Jonathan Cameron,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	Koichi Okuno (Fujitsu)

Hi, Robin

> > This adds a new dynamic PMU to the Perf Events framework to program and
> > control the Uncore PCI PMUs in Fujitsu chips.
> > 
> > This driver was created with reference to drivers/perf/qcom_l3_pmu.c.
> > 
> > This driver exports formatting and event information to sysfs so it can
> > be used by the perf user space tools with the syntaxes:
> > 
> > perf stat -e pci_iod0_pci0/ea-pci/ ls
> > perf stat -e pci_iod0_pci0/event=0x80/ ls
> > 
> > FUJITSU-MONAKA PMU Events Specification v1.1 URL:
> > https://github.com/fujitsu/FUJITSU-MONAKA
> > 
> > Signed-off-by: Koichi Okuno <fj2767dz@fujitsu.com>
> > ---
> >   .../admin-guide/perf/fujitsu_pci_pmu.rst      |  50 ++
> >   Documentation/admin-guide/perf/index.rst      |   1 +
> >   drivers/perf/Kconfig                          |   9 +
> >   drivers/perf/Makefile                         |   1 +
> >   drivers/perf/fujitsu_pci_pmu.c                | 536 ++++++++++++++++++
> 
>  From a quick side-by-side skim, this is a copy-paste of the exact same 
> driver from patch #1 with s/mac/pci/g applied. Please don't do that. If 
> the hardware is functionally the same, then it should just be a single 
> driver that can then pick which PMU name and set of event alias 
> attributes to use for a given instance based on the ACPI HID match 
> (and/or any other ID register info you may have.)

I've integrated the MAC PMU driver and the PCI PMU driver into a single
driver in v8.

Best Regards,
Koichi Okuno

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-09-04  6:42 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-15  3:47 [PATCH v7 0/2] perf: Fujitsu: Add Uncore MAC/PCI PMU driver Koichi Okuno
2025-08-15  3:47 ` [PATCH v7 1/2] perf: Fujitsu: Add the Uncore MAC " Koichi Okuno
2025-08-15 11:33   ` Jonathan Cameron
2025-08-15 12:39   ` Robin Murphy
2025-08-15 15:51     ` Jonathan Cameron
2025-09-04  6:37     ` Koichi Okuno (Fujitsu)
2025-08-15  3:47 ` [PATCH v7 2/2] perf: Fujitsu: Add the Uncore PCI " Koichi Okuno
2025-08-15 11:36   ` Jonathan Cameron
2025-08-15 12:59   ` Robin Murphy
2025-09-04  6:41     ` Koichi Okuno (Fujitsu)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).