* [PATCH 0/2] perf: Fujitsu: Add Uncore MAC/PCI PMU driver
@ 2024-11-08 5:40 Yoshihiro Furudera
2024-11-08 5:40 ` [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC " Yoshihiro Furudera
2024-11-08 5:40 ` [PATCH 2/2] perf: Fujitsu: Add the Uncore PCI " Yoshihiro Furudera
0 siblings, 2 replies; 13+ messages in thread
From: Yoshihiro Furudera @ 2024-11-08 5:40 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas,
linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven,
Krzysztof Kozlowski, Dmitry Baryshkov, Konrad Dybcio,
Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado,
Thomas Gleixner, Peter Zijlstra, linux-doc, linux-kernel,
Yoshihiro Furudera
This adds two new dynamic PMUs to the Perf Events framework to program
and control the Uncore MAC/PCI PMUs in Fujitsu chips.
These drivers were created with reference to drivers/perf/qcom_l3_pmu.c.
These drivers export formatting and event information to sysfs so they can
be used by the perf user space tools with the syntaxes:
perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
perf stat -e mac_iod0_mac0_ch0/event=0x80/ ls
perf stat -e pci_iod0_pci0/ea-pci/ ls
perf stat -e pci_iod0_pci0/event=0x80/ ls
FUJITSU-MONAKA Specification URL:
https://github.com/fujitsu/FUJITSU-MONAKA
I have a question.
When I run scripts/checkpatch.pl, the following WARNING appears.
> WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
> #56: FILE: Documentation/admin-guide/perf/fujitsu_mac_pmu.rst:1:
> +===========================================================================
Some of the files under Documentation/admin-guide/perf have an
SPDX-License-Identifier tag and some don't. Is it necessary to
include it in fujitsu_mac_pmu.rst and fujitsu_pci_pmu.rst and?
Best regards.
Yoshihiro Furudera
Yoshihiro Furudera (2):
perf: Fujitsu: Add the Uncore MAC PMU driver
perf: Fujitsu: Add the Uncore PCI PMU driver
.../admin-guide/perf/fujitsu_mac_pmu.rst | 20 +
.../admin-guide/perf/fujitsu_pci_pmu.rst | 20 +
arch/arm64/configs/defconfig | 2 +
drivers/perf/Kconfig | 18 +
drivers/perf/Makefile | 2 +
drivers/perf/fujitsu_mac_pmu.c | 633 ++++++++++++++++++
drivers/perf/fujitsu_pci_pmu.c | 613 +++++++++++++++++
include/linux/cpuhotplug.h | 2 +
8 files changed, 1310 insertions(+)
create mode 100644 Documentation/admin-guide/perf/fujitsu_mac_pmu.rst
create mode 100644 Documentation/admin-guide/perf/fujitsu_pci_pmu.rst
create mode 100644 drivers/perf/fujitsu_mac_pmu.c
create mode 100644 drivers/perf/fujitsu_pci_pmu.c
--
2.34.1
^ permalink raw reply [flat|nested] 13+ messages in thread* [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver 2024-11-08 5:40 [PATCH 0/2] perf: Fujitsu: Add Uncore MAC/PCI PMU driver Yoshihiro Furudera @ 2024-11-08 5:40 ` Yoshihiro Furudera 2024-11-08 11:03 ` Krzysztof Kozlowski ` (3 more replies) 2024-11-08 5:40 ` [PATCH 2/2] perf: Fujitsu: Add the Uncore PCI " Yoshihiro Furudera 1 sibling, 4 replies; 13+ messages in thread From: Yoshihiro Furudera @ 2024-11-08 5:40 UTC (permalink / raw) To: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven, Krzysztof Kozlowski, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra, linux-doc, linux-kernel, Yoshihiro Furudera This adds a new dynamic PMU to the Perf Events framework to program and control the Uncore MAC PMUs in Fujitsu chips. This driver was created with reference to drivers/perf/qcom_l3_pmu.c. This driver exports formatting and event information to sysfs so it can be used by the perf user space tools with the syntaxes: perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls perf stat -e mac_iod0_mac0_ch0/event=0x80/ ls FUJITSU-MONAKA Specification URL: https://github.com/fujitsu/FUJITSU-MONAKA Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com> --- .../admin-guide/perf/fujitsu_mac_pmu.rst | 20 + arch/arm64/configs/defconfig | 1 + drivers/perf/Kconfig | 9 + drivers/perf/Makefile | 1 + drivers/perf/fujitsu_mac_pmu.c | 633 ++++++++++++++++++ include/linux/cpuhotplug.h | 1 + 6 files changed, 665 insertions(+) create mode 100644 Documentation/admin-guide/perf/fujitsu_mac_pmu.rst create mode 100644 drivers/perf/fujitsu_mac_pmu.c diff --git a/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst new file mode 100644 index 000000000000..ddb3dcff3c61 --- /dev/null +++ b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst @@ -0,0 +1,20 @@ +=========================================================================== +Fujitsu Uncore MAC Performance Monitoring Unit (PMU) +=========================================================================== + +This driver supports the Uncore MAC PMUs found in Fujitsu chips. +Each MAC PMU on these chips is exposed as a uncore perf PMU with device name +mac_iod<iod>_mac<mac>_ch<ch>. + +The driver provides a description of its available events and configuration +options in sysfs, see /sys/bus/event_sources/devices/mac_iod<iod>_mac<mac>_ch<ch>/. +Given that these are uncore PMUs the driver also exposes a "cpumask" sysfs +attribute which contains a mask consisting of one CPU which will be used to +handle all the PMU events. + +Examples for use with perf:: + + perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls + +Given that these are uncore PMUs the driver does not support sampling, therefore +"perf record" will not work. Per-task perf sessions are not supported. diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig index 5fdbfea7a5b2..2ef412937228 100644 --- a/arch/arm64/configs/defconfig +++ b/arch/arm64/configs/defconfig @@ -1575,6 +1575,7 @@ CONFIG_ARM_CMN=m CONFIG_ARM_SMMU_V3_PMU=m CONFIG_ARM_DSU_PMU=m CONFIG_FSL_IMX8_DDR_PMU=m +CONFIG_FUJITSU_MAC_PMU=y CONFIG_QCOM_L2_PMU=y CONFIG_QCOM_L3_PMU=y CONFIG_ARM_SPE_PMU=m diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index bab8ba64162f..4705c605e286 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -178,6 +178,15 @@ config FSL_IMX9_DDR_PMU can give information about memory throughput and other related events. +config FUJITSU_MAC_PMU + bool "Fujitsu Uncore MAC PMU" + depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT) + help + Provides support for the Uncore MAC performance monitor unit (PMU) + in Fujitsu processors. + Adds the Uncore MAC PMU into the perf events subsystem for + monitoring Uncore MAC events. + config QCOM_L2_PMU bool "Qualcomm Technologies L2-cache PMU" depends on ARCH_QCOM && ARM64 && ACPI diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index 8268f38e42c5..7285f94125ce 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -14,6 +14,7 @@ obj-$(CONFIG_ARM_SMMU_V3_PMU) += arm_smmuv3_pmu.o obj-$(CONFIG_FSL_IMX8_DDR_PMU) += fsl_imx8_ddr_perf.o obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o obj-$(CONFIG_HISI_PMU) += hisilicon/ +obj-$(CONFIG_FUJITSU_MAC_PMU) += fujitsu_mac_pmu.o obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o diff --git a/drivers/perf/fujitsu_mac_pmu.c b/drivers/perf/fujitsu_mac_pmu.c new file mode 100644 index 000000000000..ee92ef5691dd --- /dev/null +++ b/drivers/perf/fujitsu_mac_pmu.c @@ -0,0 +1,633 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Driver for the Uncore MAC PMUs in Fujitsu chips. + * + * See Documentation/admin-guide/perf/fujitsu_mac_pmu.rst for more details. + * + * This driver is based on drivers/perf/qcom_l3_pmu.c + * Copyright (c) 2015-2017, The Linux Foundation. All rights reserved. + * Copyright (c) 2024 Fujitsu. All rights reserved. + */ + +#include <linux/acpi.h> +#include <linux/bitops.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/list.h> +#include <linux/module.h> +#include <linux/perf_event.h> +#include <linux/platform_device.h> + +/* + * General constants + */ + +/* Number of counters on each PMU */ +#define MAC_NUM_COUNTERS 8 +/* Mask for the event type field within perf_event_attr.config and EVTYPE reg */ +#define MAC_EVTYPE_MASK 0xFF + +/* + * Register offsets + */ + +/* Perfmon registers */ +#define MAC_PM_EVCNTR(__cntr) (0x000 + ((__cntr) & 0x7) * 8) +#define MAC_PM_CNTCTL(__cntr) (0x100 + ((__cntr) & 0x7) * 8) +#define MAC_PM_EVTYPE(__cntr) (0x200 + ((__cntr) & 0x7) * 8) +#define MAC_PM_CR 0x400 +#define MAC_PM_CNTENSET 0x410 +#define MAC_PM_CNTENCLR 0x418 +#define MAC_PM_INTENSET 0x420 +#define MAC_PM_INTENCLR 0x428 +#define MAC_PM_OVSR 0x440 + +/* + * Bit field definitions + */ + +/* MAC_PM_CNTCTLx */ +#define PMCNT_RESET (0) + +/* MAC_PM_EVTYPEx */ +#define EVSEL(__val) ((__val) & MAC_EVTYPE_MASK) + +/* MAC_PM_CR */ +#define PM_RESET (1UL << 1) +#define PM_ENABLE (1UL << 0) + +/* MAC_PM_CNTENSET */ +#define PMCNTENSET(__cntr) (1UL << ((__cntr) & 0x7)) + +/* MAC_PM_CNTENCLR */ +#define PMCNTENCLR(__cntr) (1UL << ((__cntr) & 0x7)) +#define PM_CNTENCLR_RESET (0xFF) + +/* MAC_PM_INTENSET */ +#define PMINTENSET(__cntr) (1UL << ((__cntr) & 0x7)) + +/* MAC_PM_INTENCLR */ +#define PMINTENCLR(__cntr) (1UL << ((__cntr) & 0x7)) +#define PM_INTENCLR_RESET (0xFF) + +/* MAC_PM_OVSR */ +#define PMOVSRCLR(__cntr) (1UL << ((__cntr) & 0x7)) +#define PMOVSRCLR_RESET (0xFF) + +/* + * Events + */ + +#define MAC_EVENT_CYCLES 0x000 +#define MAC_EVENT_READ_COUNT 0x010 +#define MAC_EVENT_READ_COUNT_REQUEST 0x011 +#define MAC_EVENT_READ_COUNT_RETURN 0x012 +#define MAC_EVENT_READ_COUNT_REQUEST_PFTGT 0x013 +#define MAC_EVENT_READ_COUNT_REQUEST_NORMAL 0x014 +#define MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT 0x015 +#define MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS 0x016 +#define MAC_EVENT_READ_WAIT 0x017 +#define MAC_EVENT_WRITE_COUNT 0x020 +#define MAC_EVENT_WRITE_COUNT_WRITE 0x021 +#define MAC_EVENT_WRITE_COUNT_PWRITE 0x022 +#define MAC_EVENT_MEMORY_READ_COUNT 0x040 +#define MAC_EVENT_MEMORY_WRITE_COUNT 0x050 +#define MAC_EVENT_MEMORY_PWRITE_COUNT 0x060 +#define MAC_EVENT_EA_MAC 0x080 +#define MAC_EVENT_EA_MEMORY 0x090 +#define MAC_EVENT_EA_MEMORY_MAC_READ 0x091 +#define MAC_EVENT_EA_MEMORY_MAC_WRITE 0x092 +#define MAC_EVENT_EA_MEMORY_MAC_PWRITE 0x093 +#define MAC_EVENT_EA_HA 0x0a0 + +/* + * Main PMU, inherits from the core perf PMU type + */ +struct mac_pmu { + struct pmu pmu; + struct hlist_node node; + void __iomem *regs; + struct perf_event *events[MAC_NUM_COUNTERS]; + unsigned long used_mask[BITS_TO_LONGS(MAC_NUM_COUNTERS)]; + cpumask_t cpumask; +}; + +#define to_mac_pmu(p) (container_of(p, struct mac_pmu, pmu)) + +/* + * Implementation of standard counter operations + */ + +static void fujitsu_mac_counter_start(struct perf_event *event) +{ + struct mac_pmu *macpmu = to_mac_pmu(event->pmu); + int idx = event->hw.idx; + + /* Initialize the hardware counter and reset prev_count*/ + local64_set(&event->hw.prev_count, 0); + writeq_relaxed(0, macpmu->regs + MAC_PM_EVCNTR(idx)); + + /* Set the event type */ + writeq_relaxed(EVSEL(event->attr.config), macpmu->regs + MAC_PM_EVTYPE(idx)); + + /* Enable interrupt generation by this counter */ + writeq_relaxed(PMINTENSET(idx), macpmu->regs + MAC_PM_INTENSET); + + /* Finally, enable the counter */ + writeq_relaxed(PMCNT_RESET, macpmu->regs + MAC_PM_CNTCTL(idx)); + writeq_relaxed(PMCNTENSET(idx), macpmu->regs + MAC_PM_CNTENSET); +} + +static void fujitsu_mac_counter_stop(struct perf_event *event, + int flags) +{ + struct mac_pmu *macpmu = to_mac_pmu(event->pmu); + int idx = event->hw.idx; + + /* Disable the counter */ + writeq_relaxed(PMCNTENCLR(idx), macpmu->regs + MAC_PM_CNTENCLR); + + /* Disable interrupt generation by this counter */ + writeq_relaxed(PMINTENCLR(idx), macpmu->regs + MAC_PM_INTENCLR); +} + +static void fujitsu_mac_counter_update(struct perf_event *event) +{ + struct mac_pmu *macpmu = to_mac_pmu(event->pmu); + int idx = event->hw.idx; + u64 prev, new; + + do { + prev = local64_read(&event->hw.prev_count); + new = readq_relaxed(macpmu->regs + MAC_PM_EVCNTR(idx)); + } while (local64_cmpxchg(&event->hw.prev_count, prev, new) != prev); + + local64_add(new - prev, &event->count); +} + +/* + * Top level PMU functions. + */ + +static inline void fujitsu_mac__init(struct mac_pmu *macpmu) +{ + int i; + + writeq_relaxed(PM_RESET, macpmu->regs + MAC_PM_CR); + + writeq_relaxed(PM_CNTENCLR_RESET, macpmu->regs + MAC_PM_CNTENCLR); + writeq_relaxed(PM_INTENCLR_RESET, macpmu->regs + MAC_PM_INTENCLR); + writeq_relaxed(PMOVSRCLR_RESET, macpmu->regs + MAC_PM_OVSR); + + for (i = 0; i < MAC_NUM_COUNTERS; ++i) { + writeq_relaxed(PMCNT_RESET, macpmu->regs + MAC_PM_CNTCTL(i)); + writeq_relaxed(EVSEL(0), macpmu->regs + MAC_PM_EVTYPE(i)); + } + + /* + * Use writeq here to ensure all programming commands are done + * before proceeding + */ + writeq(PM_ENABLE, macpmu->regs + MAC_PM_CR); +} + +static irqreturn_t fujitsu_mac__handle_irq(int irq_num, void *data) +{ + struct mac_pmu *macpmu = data; + /* Read the overflow status register */ + long status = readq_relaxed(macpmu->regs + MAC_PM_OVSR); + int idx; + + if (status == 0) + return IRQ_NONE; + + /* Clear the bits we read on the overflow status register */ + writeq_relaxed(status, macpmu->regs + MAC_PM_OVSR); + + for_each_set_bit(idx, &status, MAC_NUM_COUNTERS) { + struct perf_event *event; + + event = macpmu->events[idx]; + if (!event) + continue; + + fujitsu_mac_counter_update(event); + } + + return IRQ_HANDLED; +} + +/* + * Implementation of abstract pmu functionality required by + * the core perf events code. + */ + +static void fujitsu_mac__pmu_enable(struct pmu *pmu) +{ + struct mac_pmu *macpmu = to_mac_pmu(pmu); + + /* Ensure the other programming commands are observed before enabling */ + wmb(); + + writeq_relaxed(PM_ENABLE, macpmu->regs + MAC_PM_CR); +} + +static void fujitsu_mac__pmu_disable(struct pmu *pmu) +{ + struct mac_pmu *macpmu = to_mac_pmu(pmu); + + writeq_relaxed(0, macpmu->regs + MAC_PM_CR); + + /* Ensure the basic counter unit is stopped before proceeding */ + wmb(); +} + +/* + * We must NOT create groups containing events from multiple hardware PMUs, + * although mixing different software and hardware PMUs is allowed. + */ +static bool fujitsu_mac__validate_event_group(struct perf_event *event) +{ + struct perf_event *leader = event->group_leader; + struct perf_event *sibling; + int counters = 0; + + if (leader->pmu != event->pmu && !is_software_event(leader)) + return false; + + /* The sum of the counters used by the event and its leader event */ + counters = 2; + + for_each_sibling_event(sibling, leader) { + if (is_software_event(sibling)) + continue; + if (sibling->pmu != event->pmu) + return false; + counters += 1; + } + + /* + * If the group requires more counters than the HW has, it + * cannot ever be scheduled. + */ + return counters <= MAC_NUM_COUNTERS; +} + +static int fujitsu_mac__event_init(struct perf_event *event) +{ + struct mac_pmu *macpmu = to_mac_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + + /* + * Is the event for this PMU? + */ + if (event->attr.type != event->pmu->type) + return -ENOENT; + + /* + * Sampling not supported since these events are not core-attributable. + */ + if (hwc->sample_period) + return -EINVAL; + + /* + * Task mode not available, we run the counters as socket counters, + * not attributable to any CPU and therefore cannot attribute per-task. + */ + if (event->cpu < 0) + return -EINVAL; + + /* Validate the group */ + if (!fujitsu_mac__validate_event_group(event)) + return -EINVAL; + + hwc->idx = -1; + + /* + * Many perf core operations (eg. events rotation) operate on a + * single CPU context. This is obvious for CPU PMUs, where one + * expects the same sets of events being observed on all CPUs, + * but can lead to issues for off-core PMUs, like this one, where + * each event could be theoretically assigned to a different CPU. + * To mitigate this, we enforce CPU assignment to one designated + * processor (the one described in the "cpumask" attribute exported + * by the PMU device). perf user space tools honor this and avoid + * opening more than one copy of the events. + */ + event->cpu = cpumask_first(&macpmu->cpumask); + + return 0; +} + +static void fujitsu_mac__event_start(struct perf_event *event, int flags) +{ + struct hw_perf_event *hwc = &event->hw; + + hwc->state = 0; + fujitsu_mac_counter_start(event); +} + +static void fujitsu_mac__event_stop(struct perf_event *event, int flags) +{ + struct hw_perf_event *hwc = &event->hw; + + if (hwc->state & PERF_HES_STOPPED) + return; + + fujitsu_mac_counter_stop(event, flags); + if (flags & PERF_EF_UPDATE) + fujitsu_mac_counter_update(event); + hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE; +} + +static int fujitsu_mac__event_add(struct perf_event *event, int flags) +{ + struct mac_pmu *macpmu = to_mac_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + int idx; + + /* + * Try to allocate a counter. + */ + idx = bitmap_find_free_region(macpmu->used_mask, MAC_NUM_COUNTERS, 0); + if (idx < 0) + /* The counters are all in use. */ + return -EAGAIN; + + hwc->idx = idx; + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; + macpmu->events[idx] = event; + + if (flags & PERF_EF_START) + fujitsu_mac__event_start(event, 0); + + /* Propagate changes to the userspace mapping. */ + perf_event_update_userpage(event); + + return 0; +} + +static void fujitsu_mac__event_del(struct perf_event *event, int flags) +{ + struct mac_pmu *macpmu = to_mac_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + + /* Stop and clean up */ + fujitsu_mac__event_stop(event, flags | PERF_EF_UPDATE); + macpmu->events[hwc->idx] = NULL; + bitmap_release_region(macpmu->used_mask, hwc->idx, 0); + + /* Propagate changes to the userspace mapping. */ + perf_event_update_userpage(event); +} + +static void fujitsu_mac__event_read(struct perf_event *event) +{ + fujitsu_mac_counter_update(event); +} + +/* + * Add sysfs attributes + * + * We export: + * - formats, used by perf user space and other tools to configure events + * - events, used by perf user space and other tools to create events + * symbolically, e.g.: + * perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls + * - cpumask, used by perf user space and other tools to know on which CPUs + * to open the events + */ + +/* formats */ + +#define MAC_PMU_FORMAT_ATTR(_name, _config) \ + (&((struct dev_ext_attribute[]) { \ + { .attr = __ATTR(_name, 0444, device_show_string, NULL), \ + .var = (void *) _config, } \ + })[0].attr.attr) + +static struct attribute *fujitsu_mac_pmu_formats[] = { + MAC_PMU_FORMAT_ATTR(event, "config:0-7"), + NULL, +}; + +static const struct attribute_group fujitsu_mac_pmu_format_group = { + .name = "format", + .attrs = fujitsu_mac_pmu_formats, +}; + +/* events */ + +static ssize_t mac_pmu_event_show(struct device *dev, + struct device_attribute *attr, char *page) +{ + struct perf_pmu_events_attr *pmu_attr; + + pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr); + return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id); +} + +#define MAC_EVENT_ATTR(_name, _id) \ + PMU_EVENT_ATTR_ID(_name, mac_pmu_event_show, _id) + +static struct attribute *fujitsu_mac_pmu_events[] = { + MAC_EVENT_ATTR(cycles, MAC_EVENT_CYCLES), + MAC_EVENT_ATTR(read-count, MAC_EVENT_READ_COUNT), + MAC_EVENT_ATTR(read-count-request, MAC_EVENT_READ_COUNT_REQUEST), + MAC_EVENT_ATTR(read-count-return, MAC_EVENT_READ_COUNT_RETURN), + MAC_EVENT_ATTR(read-count-request-pftgt, MAC_EVENT_READ_COUNT_REQUEST_PFTGT), + MAC_EVENT_ATTR(read-count-request-normal, MAC_EVENT_READ_COUNT_REQUEST_NORMAL), + MAC_EVENT_ATTR(read-count-return-pftgt-hit, MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT), + MAC_EVENT_ATTR(read-count-return-pftgt-miss, MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS), + MAC_EVENT_ATTR(read-wait, MAC_EVENT_READ_WAIT), + MAC_EVENT_ATTR(write-count, MAC_EVENT_WRITE_COUNT), + MAC_EVENT_ATTR(write-count-write, MAC_EVENT_WRITE_COUNT_WRITE), + MAC_EVENT_ATTR(write-count-pwrite, MAC_EVENT_WRITE_COUNT_PWRITE), + MAC_EVENT_ATTR(memory-read-count, MAC_EVENT_MEMORY_READ_COUNT), + MAC_EVENT_ATTR(memory-write-count, MAC_EVENT_MEMORY_WRITE_COUNT), + MAC_EVENT_ATTR(memory-pwrite-count, MAC_EVENT_MEMORY_PWRITE_COUNT), + MAC_EVENT_ATTR(ea-mac, MAC_EVENT_EA_MAC), + MAC_EVENT_ATTR(ea-memory, MAC_EVENT_EA_MEMORY), + MAC_EVENT_ATTR(ea-memory-mac-read, MAC_EVENT_EA_MEMORY_MAC_READ), + MAC_EVENT_ATTR(ea-memory-mac-write, MAC_EVENT_EA_MEMORY_MAC_WRITE), + MAC_EVENT_ATTR(ea-memory-mac-pwrite, MAC_EVENT_EA_MEMORY_MAC_PWRITE), + MAC_EVENT_ATTR(ea-ha, MAC_EVENT_EA_HA), + NULL +}; + +static const struct attribute_group fujitsu_mac_pmu_events_group = { + .name = "events", + .attrs = fujitsu_mac_pmu_events, +}; + +/* cpumask */ + +static ssize_t cpumask_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct mac_pmu *macpmu = to_mac_pmu(dev_get_drvdata(dev)); + + return cpumap_print_to_pagebuf(true, buf, &macpmu->cpumask); +} + +static DEVICE_ATTR_RO(cpumask); + +static struct attribute *fujitsu_mac_pmu_cpumask_attrs[] = { + &dev_attr_cpumask.attr, + NULL, +}; + +static const struct attribute_group fujitsu_mac_pmu_cpumask_attr_group = { + .attrs = fujitsu_mac_pmu_cpumask_attrs, +}; + +/* + * Per PMU device attribute groups + */ +static const struct attribute_group *fujitsu_mac_pmu_attr_grps[] = { + &fujitsu_mac_pmu_format_group, + &fujitsu_mac_pmu_events_group, + &fujitsu_mac_pmu_cpumask_attr_group, + NULL, +}; + +/* + * Probing functions and data. + */ + +static int fujitsu_mac_pmu_online_cpu(unsigned int cpu, struct hlist_node *node) +{ + struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, node); + + /* If there is not a CPU/PMU association pick this CPU */ + if (cpumask_empty(&macpmu->cpumask)) + cpumask_set_cpu(cpu, &macpmu->cpumask); + + return 0; +} + +static int fujitsu_mac_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node) +{ + struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, node); + unsigned int target; + + if (!cpumask_test_and_clear_cpu(cpu, &macpmu->cpumask)) + return 0; + target = cpumask_any_but(cpu_online_mask, cpu); + if (target >= nr_cpu_ids) + return 0; + perf_pmu_migrate_context(&macpmu->pmu, cpu, target); + cpumask_set_cpu(target, &macpmu->cpumask); + return 0; +} + +static int fujitsu_mac_pmu_probe(struct platform_device *pdev) +{ + struct mac_pmu *macpmu; + struct acpi_device *acpi_dev; + struct resource *memrc; + int ret; + char *name; + u64 uid; + + /* Initialize the PMU data structures */ + + acpi_dev = ACPI_COMPANION(&pdev->dev); + if (!acpi_dev) + return -ENODEV; + + ret = acpi_dev_uid_to_integer(acpi_dev, &uid); + if (ret) { + dev_err(&pdev->dev, "unable to read ACPI uid\n"); + return ret; + } + + macpmu = devm_kzalloc(&pdev->dev, sizeof(*macpmu), GFP_KERNEL); + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "mac_iod%llu_mac%llu_ch%llu", + (uid >> 8) & 0xF, (uid >> 4) & 0xF, uid & 0xF); + if (!macpmu || !name) + return -ENOMEM; + + macpmu->pmu = (struct pmu) { + .parent = &pdev->dev, + .task_ctx_nr = perf_invalid_context, + + .pmu_enable = fujitsu_mac__pmu_enable, + .pmu_disable = fujitsu_mac__pmu_disable, + .event_init = fujitsu_mac__event_init, + .add = fujitsu_mac__event_add, + .del = fujitsu_mac__event_del, + .start = fujitsu_mac__event_start, + .stop = fujitsu_mac__event_stop, + .read = fujitsu_mac__event_read, + + .attr_groups = fujitsu_mac_pmu_attr_grps, + .capabilities = PERF_PMU_CAP_NO_EXCLUDE, + }; + + macpmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &memrc); + if (IS_ERR(macpmu->regs)) + return PTR_ERR(macpmu->regs); + + fujitsu_mac__init(macpmu); + + ret = platform_get_irq(pdev, 0); + if (ret <= 0) + return ret; + + ret = devm_request_irq(&pdev->dev, ret, fujitsu_mac__handle_irq, 0, + name, macpmu); + if (ret) { + dev_err(&pdev->dev, "Request for IRQ failed for slice @%pa\n", + &memrc->start); + return ret; + } + + /* Add this instance to the list used by the offline callback */ + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, &macpmu->node); + if (ret) { + dev_err(&pdev->dev, "Error %d registering hotplug", ret); + return ret; + } + + ret = perf_pmu_register(&macpmu->pmu, name, -1); + if (ret < 0) { + dev_err(&pdev->dev, "Failed to register MAC PMU (%d)\n", ret); + return ret; + } + + dev_info(&pdev->dev, "Registered %s, type: %d\n", name, macpmu->pmu.type); + + return 0; +} + +static const struct acpi_device_id fujitsu_mac_pmu_acpi_match[] = { + { "FUJI200C", }, + { } +}; +MODULE_DEVICE_TABLE(acpi, fujitsu_mac_pmu_acpi_match); + +static struct platform_driver fujitsu_mac_pmu_driver = { + .driver = { + .name = "fujitsu-mac-pmu", + .acpi_match_table = ACPI_PTR(fujitsu_mac_pmu_acpi_match), + .suppress_bind_attrs = true, + }, + .probe = fujitsu_mac_pmu_probe, +}; + +static int __init register_fujitsu_mac_pmu_driver(void) +{ + int ret; + + /* Install a hook to update the reader CPU in case it goes offline */ + ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, + "perf/fujitsu/mac:online", + fujitsu_mac_pmu_online_cpu, + fujitsu_mac_pmu_offline_cpu); + if (ret) + return ret; + + return platform_driver_register(&fujitsu_mac_pmu_driver); +} +device_initcall(register_fujitsu_mac_pmu_driver); diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 2361ed4d2b15..e6e49e09488a 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -227,6 +227,7 @@ enum cpuhp_state { CPUHP_AP_PERF_ARM_APM_XGENE_ONLINE, CPUHP_AP_PERF_ARM_CAVIUM_TX2_UNCORE_ONLINE, CPUHP_AP_PERF_ARM_MARVELL_CN10K_DDR_ONLINE, + CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, CPUHP_AP_PERF_POWERPC_NEST_IMC_ONLINE, CPUHP_AP_PERF_POWERPC_CORE_IMC_ONLINE, CPUHP_AP_PERF_POWERPC_THREAD_IMC_ONLINE, -- 2.34.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver 2024-11-08 5:40 ` [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC " Yoshihiro Furudera @ 2024-11-08 11:03 ` Krzysztof Kozlowski 2024-11-08 11:08 ` Krzysztof Kozlowski 2024-11-08 14:19 ` Jonathan Cameron ` (2 subsequent siblings) 3 siblings, 1 reply; 13+ messages in thread From: Krzysztof Kozlowski @ 2024-11-08 11:03 UTC (permalink / raw) To: Yoshihiro Furudera, Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra, linux-doc, linux-kernel On 08/11/2024 06:40, Yoshihiro Furudera wrote: > This adds a new dynamic PMU to the Perf Events framework to program and > control the Uncore MAC PMUs in Fujitsu chips. > > This driver was created with reference to drivers/perf/qcom_l3_pmu.c. > > This driver exports formatting and event information to sysfs so it can > be used by the perf user space tools with the syntaxes: > > perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls > perf stat -e mac_iod0_mac0_ch0/event=0x80/ ls > > FUJITSU-MONAKA Specification URL: > https://github.com/fujitsu/FUJITSU-MONAKA > > Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com> > --- > .../admin-guide/perf/fujitsu_mac_pmu.rst | 20 + > arch/arm64/configs/defconfig | 1 + defconfig goes via your SoC maintainer. Split the patch and Cc the SoC folks. Which ARCH is it, BTW? > drivers/perf/Kconfig | 9 + > drivers/perf/Makefile | 1 + > drivers/perf/fujitsu_mac_pmu.c | 633 ++++++++++++++++++ > include/linux/cpuhotplug.h | 1 + > 6 files changed, 665 insertions(+) > create mode 100644 Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > create mode 100644 drivers/perf/fujitsu_mac_pmu.c > > diff --git a/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > new file mode 100644 > index 000000000000..ddb3dcff3c61 > --- /dev/null > +++ b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > @@ -0,0 +1,20 @@ > +=========================================================================== > +Fujitsu Uncore MAC Performance Monitoring Unit (PMU) > +=========================================================================== > + > +This driver supports the Uncore MAC PMUs found in Fujitsu chips. > +Each MAC PMU on these chips is exposed as a uncore perf PMU with device name > +mac_iod<iod>_mac<mac>_ch<ch>. > + > +The driver provides a description of its available events and configuration > +options in sysfs, see /sys/bus/event_sources/devices/mac_iod<iod>_mac<mac>_ch<ch>/. > +Given that these are uncore PMUs the driver also exposes a "cpumask" sysfs > +attribute which contains a mask consisting of one CPU which will be used to > +handle all the PMU events. > + > +Examples for use with perf:: > + > + perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls > + > +Given that these are uncore PMUs the driver does not support sampling, therefore > +"perf record" will not work. Per-task perf sessions are not supported. > diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig > index 5fdbfea7a5b2..2ef412937228 100644 > --- a/arch/arm64/configs/defconfig > +++ b/arch/arm64/configs/defconfig > @@ -1575,6 +1575,7 @@ CONFIG_ARM_CMN=m > CONFIG_ARM_SMMU_V3_PMU=m > CONFIG_ARM_DSU_PMU=m > CONFIG_FSL_IMX8_DDR_PMU=m > +CONFIG_FUJITSU_MAC_PMU=y > CONFIG_QCOM_L2_PMU=y > CONFIG_QCOM_L3_PMU=y > CONFIG_ARM_SPE_PMU=m > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig > index bab8ba64162f..4705c605e286 100644 > --- a/drivers/perf/Kconfig > +++ b/drivers/perf/Kconfig > @@ -178,6 +178,15 @@ config FSL_IMX9_DDR_PMU > can give information about memory throughput and other related > events. > > +config FUJITSU_MAC_PMU > + bool "Fujitsu Uncore MAC PMU" > + depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT) Missing depends on specific ARCH. Sorry, this looks like work for some out of tree arch support. I don't think we have any interest in taking it... unless it is part of bigger patchset/work? If so, then provide *lore* link to relevant patchset. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver 2024-11-08 11:03 ` Krzysztof Kozlowski @ 2024-11-08 11:08 ` Krzysztof Kozlowski 2024-11-12 7:49 ` Yoshihiro Furudera (Fujitsu) 0 siblings, 1 reply; 13+ messages in thread From: Krzysztof Kozlowski @ 2024-11-08 11:08 UTC (permalink / raw) To: Yoshihiro Furudera, Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra, linux-doc, linux-kernel On 08/11/2024 12:03, Krzysztof Kozlowski wrote: > On 08/11/2024 06:40, Yoshihiro Furudera wrote: >> This adds a new dynamic PMU to the Perf Events framework to program and >> control the Uncore MAC PMUs in Fujitsu chips. >> >> This driver was created with reference to drivers/perf/qcom_l3_pmu.c. This confused me... >> CONFIG_ARM_SPE_PMU=m >> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig >> index bab8ba64162f..4705c605e286 100644 >> --- a/drivers/perf/Kconfig >> +++ b/drivers/perf/Kconfig >> @@ -178,6 +178,15 @@ config FSL_IMX9_DDR_PMU >> can give information about memory throughput and other related >> events. >> >> +config FUJITSU_MAC_PMU >> + bool "Fujitsu Uncore MAC PMU" >> + depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT) > > Missing depends on specific ARCH. > > Sorry, this looks like work for some out of tree arch support. I don't > think we have any interest in taking it... unless it is part of bigger > patchset/work? If so, then provide *lore* link to relevant patchset. > -ENOTENOUGHCOFFEE, I see now ACPI dependency so there will be no SoC folks for this, right? Then anyway split work per subsystem and send defconfig to Soc maintainers. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver 2024-11-08 11:08 ` Krzysztof Kozlowski @ 2024-11-12 7:49 ` Yoshihiro Furudera (Fujitsu) 0 siblings, 0 replies; 13+ messages in thread From: Yoshihiro Furudera (Fujitsu) @ 2024-11-12 7:49 UTC (permalink / raw) To: 'Krzysztof Kozlowski', Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel@lists.infradead.org, Bjorn Andersson, Geert Uytterhoeven, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org Hi, Krzysztof Kozlowski Thanks for you review/comments. > > On 08/11/2024 06:40, Yoshihiro Furudera wrote: > > This adds a new dynamic PMU to the Perf Events framework to program > > and control the Uncore MAC PMUs in Fujitsu chips. > > > > This driver was created with reference to drivers/perf/qcom_l3_pmu.c. > > > > This driver exports formatting and event information to sysfs so it > > can be used by the perf user space tools with the syntaxes: > > > > perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls perf stat -e > > mac_iod0_mac0_ch0/event=0x80/ ls > > > > FUJITSU-MONAKA Specification URL: > > https://github.com/fujitsu/FUJITSU-MONAKA > > > > Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com> > > --- > > .../admin-guide/perf/fujitsu_mac_pmu.rst | 20 + > > arch/arm64/configs/defconfig | 1 + > > > defconfig goes via your SoC maintainer. Split the patch and Cc the SoC folks. Understood.I'll do that and resubmit the patch. > > Which ARCH is it, BTW? This is supported by an ARM64-based processor called FUJITSU-MONAKA, which is currently being developed by FUJITSU. > > > > drivers/perf/Kconfig | 9 + > > drivers/perf/Makefile | 1 + > > drivers/perf/fujitsu_mac_pmu.c | 633 > ++++++++++++++++++ > > include/linux/cpuhotplug.h | 1 + > > 6 files changed, 665 insertions(+) > > create mode 100644 Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > > create mode 100644 drivers/perf/fujitsu_mac_pmu.c > > > > diff --git a/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > > b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > > new file mode 100644 > > index 000000000000..ddb3dcff3c61 > > --- /dev/null > > +++ b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > > @@ -0,0 +1,20 @@ > > > +================================================== > =================== > > +====== Fujitsu Uncore MAC Performance Monitoring Unit (PMU) > > > +================================================== > =================== > > +====== > > + > > +This driver supports the Uncore MAC PMUs found in Fujitsu chips. > > +Each MAC PMU on these chips is exposed as a uncore perf PMU with > > +device name mac_iod<iod>_mac<mac>_ch<ch>. > > + > > +The driver provides a description of its available events and > > +configuration options in sysfs, see > /sys/bus/event_sources/devices/mac_iod<iod>_mac<mac>_ch<ch>/. > > +Given that these are uncore PMUs the driver also exposes a "cpumask" > > +sysfs attribute which contains a mask consisting of one CPU which > > +will be used to handle all the PMU events. > > + > > +Examples for use with perf:: > > + > > + perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls > > + > > +Given that these are uncore PMUs the driver does not support > > +sampling, therefore "perf record" will not work. Per-task perf sessions are not > supported. > > diff --git a/arch/arm64/configs/defconfig > > b/arch/arm64/configs/defconfig index 5fdbfea7a5b2..2ef412937228 100644 > > --- a/arch/arm64/configs/defconfig > > +++ b/arch/arm64/configs/defconfig > > @@ -1575,6 +1575,7 @@ CONFIG_ARM_CMN=m > CONFIG_ARM_SMMU_V3_PMU=m > > CONFIG_ARM_DSU_PMU=m CONFIG_FSL_IMX8_DDR_PMU=m > > +CONFIG_FUJITSU_MAC_PMU=y > > CONFIG_QCOM_L2_PMU=y > > CONFIG_QCOM_L3_PMU=y > > CONFIG_ARM_SPE_PMU=m > > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index > > bab8ba64162f..4705c605e286 100644 > > --- a/drivers/perf/Kconfig > > +++ b/drivers/perf/Kconfig > > @@ -178,6 +178,15 @@ config FSL_IMX9_DDR_PMU > > can give information about memory throughput and other related > > events. > > > > +config FUJITSU_MAC_PMU > > + bool "Fujitsu Uncore MAC PMU" > > + depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT) > > Missing depends on specific ARCH. This is because this driver supports the FUJITSU-MONAKA, which is an ARM64 and ACPI-based processor. > > Sorry, this looks like work for some out of tree arch support. I don't think we have > any interest in taking it... unless it is part of bigger patchset/work? If so, then > provide *lore* link to relevant patchset. It is determined by the ACPI ID and does not depend on other patches. (This ACPI ID is used by FUJITSU-MONAKA.) The URLs of other patches related to FUJITSU-MONAKA are as follows: https://lore.kernel.org/all/20241018015640.2924794-1-fj5100bi@fujitsu.com/ https://lore.kernel.org/all/20241024071553.3073864-1-fj5100bi@fujitsu.com/ https://lore.kernel.org/all/20241111064843.3003093-1-fj5100bi@fujitsu.com/ > > Best regards, > Krzysztof > > On 08/11/2024 12:03, Krzysztof Kozlowski wrote: > > On 08/11/2024 06:40, Yoshihiro Furudera wrote: > >> This adds a new dynamic PMU to the Perf Events framework to program > >> and control the Uncore MAC PMUs in Fujitsu chips. > >> > >> This driver was created with reference to drivers/perf/qcom_l3_pmu.c. > > This confused me... This driver was created based on drivers/perf/qcom_l3_pmu.c. The reason is that the processing done to Qualcomm's L3 cache PMU registers in qcom_l3_pmu.c is similar to the processing done to the Uncore MAC/PCI PMU registers. Specifically, the basic processing is the same as drivers/perf/qcom_l3_pmu.c, but the variable names, function names, ACPI device ID, and some processing have been modified to match FUJITSU-MONAKA's Uncore MAC PMU. > > >> CONFIG_ARM_SPE_PMU=m > >> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index > >> bab8ba64162f..4705c605e286 100644 > >> --- a/drivers/perf/Kconfig > >> +++ b/drivers/perf/Kconfig > >> @@ -178,6 +178,15 @@ config FSL_IMX9_DDR_PMU > >> can give information about memory throughput and other related > >> events. > >> > >> +config FUJITSU_MAC_PMU > >> + bool "Fujitsu Uncore MAC PMU" > >> + depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT) > > > > Missing depends on specific ARCH. > > > > Sorry, this looks like work for some out of tree arch support. I don't > > think we have any interest in taking it... unless it is part of bigger > > patchset/work? If so, then provide *lore* link to relevant patchset. > > > > -ENOTENOUGHCOFFEE, I see now ACPI dependency so there will be no SoC > folks for this, right? Then anyway split work per subsystem and send > defconfig to Soc maintainers. Yes, we only use ACPI ID. I'll do that and resubmit the patch. > Best regards, > Krzysztof Best Regards, Yoshihiro Furudera ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver 2024-11-08 5:40 ` [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC " Yoshihiro Furudera 2024-11-08 11:03 ` Krzysztof Kozlowski @ 2024-11-08 14:19 ` Jonathan Cameron 2024-11-15 1:18 ` Yoshihiro Furudera (Fujitsu) 2024-11-08 20:19 ` kernel test robot 2024-11-09 0:36 ` kernel test robot 3 siblings, 1 reply; 13+ messages in thread From: Jonathan Cameron @ 2024-11-08 14:19 UTC (permalink / raw) To: Yoshihiro Furudera Cc: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven, Krzysztof Kozlowski, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra, linux-doc, linux-kernel On Fri, 8 Nov 2024 05:40:04 +0000 Yoshihiro Furudera <fj5100bi@fujitsu.com> wrote: > This adds a new dynamic PMU to the Perf Events framework to program and > control the Uncore MAC PMUs in Fujitsu chips. > > This driver was created with reference to drivers/perf/qcom_l3_pmu.c. > > This driver exports formatting and event information to sysfs so it can > be used by the perf user space tools with the syntaxes: > > perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls > perf stat -e mac_iod0_mac0_ch0/event=0x80/ ls > > FUJITSU-MONAKA Specification URL: > https://github.com/fujitsu/FUJITSU-MONAKA > > Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com> Hi. A quick drive by review. I haven't looked at the details so this is a little superficial at this stage. Jonathan > --- > .../admin-guide/perf/fujitsu_mac_pmu.rst | 20 + > arch/arm64/configs/defconfig | 1 + > drivers/perf/Kconfig | 9 + > drivers/perf/Makefile | 1 + > drivers/perf/fujitsu_mac_pmu.c | 633 ++++++++++++++++++ > include/linux/cpuhotplug.h | 1 + > 6 files changed, 665 insertions(+) > create mode 100644 Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > create mode 100644 drivers/perf/fujitsu_mac_pmu.c > > diff --git a/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > new file mode 100644 > index 000000000000..ddb3dcff3c61 > --- /dev/null > +++ b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst Add this to the index file. > @@ -0,0 +1,20 @@ > +=========================================================================== > +Fujitsu Uncore MAC Performance Monitoring Unit (PMU) > +=========================================================================== > + > +This driver supports the Uncore MAC PMUs found in Fujitsu chips. > +Each MAC PMU on these chips is exposed as a uncore perf PMU with device name > +mac_iod<iod>_mac<mac>_ch<ch>. > + > +The driver provides a description of its available events and configuration > +options in sysfs, see /sys/bus/event_sources/devices/mac_iod<iod>_mac<mac>_ch<ch>/. > +Given that these are uncore PMUs the driver also exposes a "cpumask" sysfs > +attribute which contains a mask consisting of one CPU which will be used to > +handle all the PMU events. > + > +Examples for use with perf:: > + > + perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls > + > +Given that these are uncore PMUs the driver does not support sampling, therefore > +"perf record" will not work. Per-task perf sessions are not supported. Nice to give an idea of what the events are. EA in particularly isn't an immediately obvious acronym. ... > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig > index bab8ba64162f..4705c605e286 100644 > --- a/drivers/perf/Kconfig > +++ b/drivers/perf/Kconfig > @@ -178,6 +178,15 @@ config FSL_IMX9_DDR_PMU > can give information about memory throughput and other related > events. > > +config FUJITSU_MAC_PMU > + bool "Fujitsu Uncore MAC PMU" > + depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT) > + help > + Provides support for the Uncore MAC performance monitor unit (PMU) > + in Fujitsu processors. > + Adds the Uncore MAC PMU into the perf events subsystem for > + monitoring Uncore MAC events. Unusual indent. Match the rest of the file. > + > diff --git a/drivers/perf/fujitsu_mac_pmu.c b/drivers/perf/fujitsu_mac_pmu.c > new file mode 100644 > index 000000000000..ee92ef5691dd > --- /dev/null > +++ b/drivers/perf/fujitsu_mac_pmu.c > @@ -0,0 +1,633 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Driver for the Uncore MAC PMUs in Fujitsu chips. > + * > + * See Documentation/admin-guide/perf/fujitsu_mac_pmu.rst for more details. > + * > + * This driver is based on drivers/perf/qcom_l3_pmu.c > + * Copyright (c) 2015-2017, The Linux Foundation. All rights reserved. > + * Copyright (c) 2024 Fujitsu. All rights reserved. > + */ > + > +#include <linux/acpi.h> > +#include <linux/bitops.h> > +#include <linux/interrupt.h> > +#include <linux/io.h> > +#include <linux/list.h> > +#include <linux/module.h> > +#include <linux/perf_event.h> > +#include <linux/platform_device.h> > + > +/* > + * General constants > + */ > + > +/* Number of counters on each PMU */ > +#define MAC_NUM_COUNTERS 8 > +/* Mask for the event type field within perf_event_attr.config and EVTYPE reg */ > +#define MAC_EVTYPE_MASK 0xFF > + > +/* > + * Register offsets > + */ > + > +/* Perfmon registers */ > +#define MAC_PM_EVCNTR(__cntr) (0x000 + ((__cntr) & 0x7) * 8) Perhaps a macro to extra the offset part from __cntr? However there only seem to be 8 counters, so why is the masking needed? > +#define MAC_PM_CNTCTL(__cntr) (0x100 + ((__cntr) & 0x7) * 8) > +#define MAC_PM_EVTYPE(__cntr) (0x200 + ((__cntr) & 0x7) * 8) > +#define MAC_PM_CR 0x400 > +#define MAC_PM_CNTENSET 0x410 > +#define MAC_PM_CNTENCLR 0x418 > +#define MAC_PM_INTENSET 0x420 > +#define MAC_PM_INTENCLR 0x428 > +#define MAC_PM_OVSR 0x440 > + > +/* > + * Bit field definitions > + */ > + > +/* MAC_PM_CNTCTLx */ > +#define PMCNT_RESET (0) > + > +/* MAC_PM_EVTYPEx */ > +#define EVSEL(__val) ((__val) & MAC_EVTYPE_MASK) > + > +/* MAC_PM_CR */ > +#define PM_RESET (1UL << 1) > +#define PM_ENABLE (1UL << 0) > + > +/* MAC_PM_CNTENSET */ > +#define PMCNTENSET(__cntr) (1UL << ((__cntr) & 0x7)) > + > +/* MAC_PM_CNTENCLR */ > +#define PMCNTENCLR(__cntr) (1UL << ((__cntr) & 0x7)) > +#define PM_CNTENCLR_RESET (0xFF) > + > +/* MAC_PM_INTENSET */ > +#define PMINTENSET(__cntr) (1UL << ((__cntr) & 0x7)) > + > +/* MAC_PM_INTENCLR */ > +#define PMINTENCLR(__cntr) (1UL << ((__cntr) & 0x7)) > +#define PM_INTENCLR_RESET (0xFF) > + > +/* MAC_PM_OVSR */ > +#define PMOVSRCLR(__cntr) (1UL << ((__cntr) & 0x7)) > +#define PMOVSRCLR_RESET (0xFF) No brackets for single value. Can you rename these so that the register is obvious. MAC_PM_OVSR_CLR(_) etc. Though you'd also want to add _REG or similar to the register definitions so it is obvious those are the addresses. > + > +static inline void fujitsu_mac__init(struct mac_pmu *macpmu) > +{ > + int i; > + > + writeq_relaxed(PM_RESET, macpmu->regs + MAC_PM_CR); > + > + writeq_relaxed(PM_CNTENCLR_RESET, macpmu->regs + MAC_PM_CNTENCLR); > + writeq_relaxed(PM_INTENCLR_RESET, macpmu->regs + MAC_PM_INTENCLR); > + writeq_relaxed(PMOVSRCLR_RESET, macpmu->regs + MAC_PM_OVSR); > + > + for (i = 0; i < MAC_NUM_COUNTERS; ++i) { > + writeq_relaxed(PMCNT_RESET, macpmu->regs + MAC_PM_CNTCTL(i)); > + writeq_relaxed(EVSEL(0), macpmu->regs + MAC_PM_EVTYPE(i)); > + } > + > + /* > + * Use writeq here to ensure all programming commands are done > + * before proceeding Odd indenting. Looks like an extra space before before. > + */ > + writeq(PM_ENABLE, macpmu->regs + MAC_PM_CR); > +} > + > +static void fujitsu_mac__pmu_enable(struct pmu *pmu) > +{ > + struct mac_pmu *macpmu = to_mac_pmu(pmu); > + > + /* Ensure the other programming commands are observed before enabling */ > + wmb(); Unusual to do it this way rather than after the things you want to have finished. I guess it's not wrong, but it does prevent use of writeq() > + > + writeq_relaxed(PM_ENABLE, macpmu->regs + MAC_PM_CR); > +} > + > +static void fujitsu_mac__pmu_disable(struct pmu *pmu) > +{ > + struct mac_pmu *macpmu = to_mac_pmu(pmu); > + > + writeq_relaxed(0, macpmu->regs + MAC_PM_CR); > + > + /* Ensure the basic counter unit is stopped before proceeding */ > + wmb(); Maybe just use writeq()? > +} > + > +/* > + * We must NOT create groups containing events from multiple hardware PMUs, > + * although mixing different software and hardware PMUs is allowed. > + */ > +static bool fujitsu_mac__validate_event_group(struct perf_event *event) > +{ > + struct perf_event *leader = event->group_leader; > + struct perf_event *sibling; > + int counters = 0; > + > + if (leader->pmu != event->pmu && !is_software_event(leader)) > + return false; > + > + /* The sum of the counters used by the event and its leader event */ > + counters = 2; > + > + for_each_sibling_event(sibling, leader) { > + if (is_software_event(sibling)) > + continue; > + if (sibling->pmu != event->pmu) > + return false; > + counters += 1; > + } > + > + /* > + * If the group requires more counters than the HW has, it > + * cannot ever be scheduled. > + */ > + return counters <= MAC_NUM_COUNTERS; > +} > + > +static int fujitsu_mac__event_init(struct perf_event *event) > +{ > + struct mac_pmu *macpmu = to_mac_pmu(event->pmu); > + struct hw_perf_event *hwc = &event->hw; > + > + /* > + * Is the event for this PMU? Single line comment syntax. > + */ > + if (event->attr.type != event->pmu->type) > + return -ENOENT; > + > + /* > + * Sampling not supported since these events are not core-attributable. Probably also single line syntax (it's a bit long, so maybe multiline is appropriate here). > + */ > + if (hwc->sample_period) > + return -EINVAL; > + > +static int fujitsu_mac__event_add(struct perf_event *event, int flags) > +{ > + struct mac_pmu *macpmu = to_mac_pmu(event->pmu); > + struct hw_perf_event *hwc = &event->hw; > + int idx; > + > + /* > + * Try to allocate a counter. Single line comment syntax. > + */ > + idx = bitmap_find_free_region(macpmu->used_mask, MAC_NUM_COUNTERS, 0); > + if (idx < 0) > + /* The counters are all in use. */ > + return -EAGAIN; > + > + hwc->idx = idx; > + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; > + macpmu->events[idx] = event; > + > + if (flags & PERF_EF_START) > + fujitsu_mac__event_start(event, 0); > + > + /* Propagate changes to the userspace mapping. */ > + perf_event_update_userpage(event); > + > + return 0; > +} > + > + > +/* > + * Add sysfs attributes Code makes this obvious and it is standard PMU driver stuff. I'd drop this comment. The details belong in the documentation, not here. > + * > + * We export: > + * - formats, used by perf user space and other tools to configure events > + * - events, used by perf user space and other tools to create events > + * symbolically, e.g.: > + * perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls > + * - cpumask, used by perf user space and other tools to know on which CPUs > + * to open the events > + */ > + > +/* formats */ > + > +#define MAC_PMU_FORMAT_ATTR(_name, _config) \ > + (&((struct dev_ext_attribute[]) { \ > + { .attr = __ATTR(_name, 0444, device_show_string, NULL), \ > + .var = (void *) _config, } \ > + })[0].attr.attr) > + > +static struct attribute *fujitsu_mac_pmu_formats[] = { > + MAC_PMU_FORMAT_ATTR(event, "config:0-7"), > + NULL, > +}; > + > +static const struct attribute_group fujitsu_mac_pmu_format_group = { > + .name = "format", > + .attrs = fujitsu_mac_pmu_formats, > +}; > + > +/* events */ Drop comment as adds little to my eyes. Same for all similar comments. Your code is easy to read, so they are unnecessary noise. > + > +static ssize_t mac_pmu_event_show(struct device *dev, > + struct device_attribute *attr, char *page) > +{ > + struct perf_pmu_events_attr *pmu_attr; > + > + pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr); > + return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id); > +} > + > +#define MAC_EVENT_ATTR(_name, _id) \ > + PMU_EVENT_ATTR_ID(_name, mac_pmu_event_show, _id) > + > +static struct attribute *fujitsu_mac_pmu_events[] = { > + MAC_EVENT_ATTR(cycles, MAC_EVENT_CYCLES), > + MAC_EVENT_ATTR(read-count, MAC_EVENT_READ_COUNT), > + MAC_EVENT_ATTR(read-count-request, MAC_EVENT_READ_COUNT_REQUEST), > + MAC_EVENT_ATTR(read-count-return, MAC_EVENT_READ_COUNT_RETURN), > + MAC_EVENT_ATTR(read-count-request-pftgt, MAC_EVENT_READ_COUNT_REQUEST_PFTGT), > + MAC_EVENT_ATTR(read-count-request-normal, MAC_EVENT_READ_COUNT_REQUEST_NORMAL), > + MAC_EVENT_ATTR(read-count-return-pftgt-hit, MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT), > + MAC_EVENT_ATTR(read-count-return-pftgt-miss, MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS), > + MAC_EVENT_ATTR(read-wait, MAC_EVENT_READ_WAIT), > + MAC_EVENT_ATTR(write-count, MAC_EVENT_WRITE_COUNT), > + MAC_EVENT_ATTR(write-count-write, MAC_EVENT_WRITE_COUNT_WRITE), > + MAC_EVENT_ATTR(write-count-pwrite, MAC_EVENT_WRITE_COUNT_PWRITE), > + MAC_EVENT_ATTR(memory-read-count, MAC_EVENT_MEMORY_READ_COUNT), > + MAC_EVENT_ATTR(memory-write-count, MAC_EVENT_MEMORY_WRITE_COUNT), > + MAC_EVENT_ATTR(memory-pwrite-count, MAC_EVENT_MEMORY_PWRITE_COUNT), > + MAC_EVENT_ATTR(ea-mac, MAC_EVENT_EA_MAC), > + MAC_EVENT_ATTR(ea-memory, MAC_EVENT_EA_MEMORY), > + MAC_EVENT_ATTR(ea-memory-mac-read, MAC_EVENT_EA_MEMORY_MAC_READ), > + MAC_EVENT_ATTR(ea-memory-mac-write, MAC_EVENT_EA_MEMORY_MAC_WRITE), > + MAC_EVENT_ATTR(ea-memory-mac-pwrite, MAC_EVENT_EA_MEMORY_MAC_PWRITE), > + MAC_EVENT_ATTR(ea-ha, MAC_EVENT_EA_HA), > + NULL > +}; > + > +static const struct attribute_group fujitsu_mac_pmu_events_group = { > + .name = "events", > + .attrs = fujitsu_mac_pmu_events, > +}; > + > +/* cpumask */ Comment is obvious, so drop it in favor of brevity. > + > +static ssize_t cpumask_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + struct mac_pmu *macpmu = to_mac_pmu(dev_get_drvdata(dev)); > + > + return cpumap_print_to_pagebuf(true, buf, &macpmu->cpumask); > +} > + > +static DEVICE_ATTR_RO(cpumask); > + > +static struct attribute *fujitsu_mac_pmu_cpumask_attrs[] = { > + &dev_attr_cpumask.attr, > + NULL, No comma needed on null terminators as we will never add anything after them. > +}; > + > +static const struct attribute_group fujitsu_mac_pmu_cpumask_attr_group = { > + .attrs = fujitsu_mac_pmu_cpumask_attrs, > +}; > + > +/* > + * Per PMU device attribute groups > + */ > +static const struct attribute_group *fujitsu_mac_pmu_attr_grps[] = { > + &fujitsu_mac_pmu_format_group, > + &fujitsu_mac_pmu_events_group, > + &fujitsu_mac_pmu_cpumask_attr_group, > + NULL, No comma needed on null terminators. > +}; > + > +/* > + * Probing functions and data. > + */ Structural comments like this rarely bring real value and tend to end up wrong as code evolves. I'd drop them all now the code is written. > + > +static int fujitsu_mac_pmu_online_cpu(unsigned int cpu, struct hlist_node *node) > +{ > + struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, node); > + > + /* If there is not a CPU/PMU association pick this CPU */ > + if (cpumask_empty(&macpmu->cpumask)) > + cpumask_set_cpu(cpu, &macpmu->cpumask); > + > + return 0; > +} > + > +static int fujitsu_mac_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node) > +{ > + struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, node); > + unsigned int target; > + > + if (!cpumask_test_and_clear_cpu(cpu, &macpmu->cpumask)) > + return 0; blank line - to help readability a little. > + target = cpumask_any_but(cpu_online_mask, cpu); > + if (target >= nr_cpu_ids) > + return 0; blank line > + perf_pmu_migrate_context(&macpmu->pmu, cpu, target); > + cpumask_set_cpu(target, &macpmu->cpumask); blank line. > + return 0; > +} > + > +static int fujitsu_mac_pmu_probe(struct platform_device *pdev) > +{ > + struct mac_pmu *macpmu; > + struct acpi_device *acpi_dev; > + struct resource *memrc; > + int ret; > + char *name; > + u64 uid; I'd pick an ordering and use it consistently. Perhaps reverse xmas tree. > + > + /* Initialize the PMU data structures */ This comment is a bit vague and not clearly associated with the code I would drop it as adding insufficient value. > + > + acpi_dev = ACPI_COMPANION(&pdev->dev); > + if (!acpi_dev) > + return -ENODEV; > + > + ret = acpi_dev_uid_to_integer(acpi_dev, &uid); > + if (ret) { > + dev_err(&pdev->dev, "unable to read ACPI uid\n"); Probably nicer to use return dev_err_probe(&pdev->dev, ret, "....) Consider a local struct device *dev = &pdev->dev; as reasonable number of users in here. > + return ret; > + } > + > + macpmu = devm_kzalloc(&pdev->dev, sizeof(*macpmu), GFP_KERNEL); > + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "mac_iod%llu_mac%llu_ch%llu", > + (uid >> 8) & 0xF, (uid >> 4) & 0xF, uid & 0xF); > + if (!macpmu || !name) > + return -ENOMEM; Whilst valid, it is neater to just do two separate checks. Also makes the code more resilient to future reorganizations introducing bugs. > + > + macpmu->pmu = (struct pmu) { > + .parent = &pdev->dev, > + .task_ctx_nr = perf_invalid_context, > + > + .pmu_enable = fujitsu_mac__pmu_enable, > + .pmu_disable = fujitsu_mac__pmu_disable, > + .event_init = fujitsu_mac__event_init, > + .add = fujitsu_mac__event_add, > + .del = fujitsu_mac__event_del, > + .start = fujitsu_mac__event_start, > + .stop = fujitsu_mac__event_stop, > + .read = fujitsu_mac__event_read, > + > + .attr_groups = fujitsu_mac_pmu_attr_grps, > + .capabilities = PERF_PMU_CAP_NO_EXCLUDE, > + }; > + > + macpmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &memrc); > + if (IS_ERR(macpmu->regs)) > + return PTR_ERR(macpmu->regs); > + > + fujitsu_mac__init(macpmu); Why the double underscore? That is fairly unusual syntax. > + > + ret = platform_get_irq(pdev, 0); > + if (ret <= 0) > + return ret; > + > + ret = devm_request_irq(&pdev->dev, ret, fujitsu_mac__handle_irq, 0, > + name, macpmu); > + if (ret) { > + dev_err(&pdev->dev, "Request for IRQ failed for slice @%pa\n", > + &memrc->start); > + return ret; reutrn dev_err_probe() > + } > + > + /* Add this instance to the list used by the offline callback */ > + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, &macpmu->node); > + if (ret) { > + dev_err(&pdev->dev, "Error %d registering hotplug", ret); > + return ret; return dev_err_probe() > + } > + > + ret = perf_pmu_register(&macpmu->pmu, name, -1); > + if (ret < 0) { > + dev_err(&pdev->dev, "Failed to register MAC PMU (%d)\n", ret); return dev_err_probe() > + return ret; > + } > + > + dev_info(&pdev->dev, "Registered %s, type: %d\n", name, macpmu->pmu.type); Too noisy for the kernel log when this can be easily established anyway. dev_dbg() at most. > + > + return 0; > +} > + > +static const struct acpi_device_id fujitsu_mac_pmu_acpi_match[] = { > + { "FUJI200C", }, > + { } > +}; > +MODULE_DEVICE_TABLE(acpi, fujitsu_mac_pmu_acpi_match); > + > +static struct platform_driver fujitsu_mac_pmu_driver = { > + .driver = { > + .name = "fujitsu-mac-pmu", > + .acpi_match_table = ACPI_PTR(fujitsu_mac_pmu_acpi_match), Drop the ACPI_PTR() It saves very little space and if you do use it you should guard the relevant data with ifdefs > + .suppress_bind_attrs = true, > + }, > + .probe = fujitsu_mac_pmu_probe, > +}; > + > +static int __init register_fujitsu_mac_pmu_driver(void) > +{ > + int ret; > + > + /* Install a hook to update the reader CPU in case it goes offline */ > + ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, > + "perf/fujitsu/mac:online", > + fujitsu_mac_pmu_online_cpu, > + fujitsu_mac_pmu_offline_cpu); > + if (ret) > + return ret; > + > + return platform_driver_register(&fujitsu_mac_pmu_driver); > +} > +device_initcall(register_fujitsu_mac_pmu_driver); > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > index 2361ed4d2b15..e6e49e09488a 100644 > --- a/include/linux/cpuhotplug.h > +++ b/include/linux/cpuhotplug.h > @@ -227,6 +227,7 @@ enum cpuhp_state { > CPUHP_AP_PERF_ARM_APM_XGENE_ONLINE, > CPUHP_AP_PERF_ARM_CAVIUM_TX2_UNCORE_ONLINE, > CPUHP_AP_PERF_ARM_MARVELL_CN10K_DDR_ONLINE, > + CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, Can this use CPU_AP_ONLINE_DYN or are there some ordering contraints? > CPUHP_AP_PERF_POWERPC_NEST_IMC_ONLINE, > CPUHP_AP_PERF_POWERPC_CORE_IMC_ONLINE, > CPUHP_AP_PERF_POWERPC_THREAD_IMC_ONLINE, ^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver 2024-11-08 14:19 ` Jonathan Cameron @ 2024-11-15 1:18 ` Yoshihiro Furudera (Fujitsu) 2024-12-11 11:12 ` Jonathan Cameron 0 siblings, 1 reply; 13+ messages in thread From: Yoshihiro Furudera (Fujitsu) @ 2024-11-15 1:18 UTC (permalink / raw) To: 'Jonathan Cameron' Cc: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel@lists.infradead.org, Bjorn Andersson, Geert Uytterhoeven, Krzysztof Kozlowski, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org Hi, Jonathan Cameron Thanks for you review/comments. > > On Fri, 8 Nov 2024 05:40:04 +0000 > Yoshihiro Furudera <fj5100bi@fujitsu.com> wrote: > > > This adds a new dynamic PMU to the Perf Events framework to program > > and control the Uncore MAC PMUs in Fujitsu chips. > > > > This driver was created with reference to drivers/perf/qcom_l3_pmu.c. > > > > This driver exports formatting and event information to sysfs so it > > can be used by the perf user space tools with the syntaxes: > > > > perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls perf stat -e > > mac_iod0_mac0_ch0/event=0x80/ ls > > > > FUJITSU-MONAKA Specification URL: > > https://github.com/fujitsu/FUJITSU-MONAKA > > > > Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com> > Hi. > > A quick drive by review. I haven't looked at the details so this is a little superficial > at this stage. > > Jonathan > > > --- > > .../admin-guide/perf/fujitsu_mac_pmu.rst | 20 + > > arch/arm64/configs/defconfig | 1 + > > drivers/perf/Kconfig | 9 + > > drivers/perf/Makefile | 1 + > > drivers/perf/fujitsu_mac_pmu.c | 633 > ++++++++++++++++++ > > include/linux/cpuhotplug.h | 1 + > > 6 files changed, 665 insertions(+) > > create mode 100644 Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > > create mode 100644 drivers/perf/fujitsu_mac_pmu.c > > > > diff --git a/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > > b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > > new file mode 100644 > > index 000000000000..ddb3dcff3c61 > > --- /dev/null > > +++ b/Documentation/admin-guide/perf/fujitsu_mac_pmu.rst > > Add this to the index file. Understood. I'll add this to Documentation/admin-guide/perf/index.rst. > > > @@ -0,0 +1,20 @@ > > > +================================================== > =================== > > +====== Fujitsu Uncore MAC Performance Monitoring Unit (PMU) > > > +================================================== > =================== > > +====== > > + > > +This driver supports the Uncore MAC PMUs found in Fujitsu chips. > > +Each MAC PMU on these chips is exposed as a uncore perf PMU with > > +device name mac_iod<iod>_mac<mac>_ch<ch>. > > + > > +The driver provides a description of its available events and > > +configuration options in sysfs, see > /sys/bus/event_sources/devices/mac_iod<iod>_mac<mac>_ch<ch>/. > > +Given that these are uncore PMUs the driver also exposes a "cpumask" > > +sysfs attribute which contains a mask consisting of one CPU which > > +will be used to handle all the PMU events. > > + > > +Examples for use with perf:: > > + > > + perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls > > + > > +Given that these are uncore PMUs the driver does not support > > +sampling, therefore "perf record" will not work. Per-task perf sessions are not > supported. > > > Nice to give an idea of what the events are. EA in particularly isn't an immediately > obvious acronym. > Understood. I'll add a description of these events in next version. > ... > > > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index > > bab8ba64162f..4705c605e286 100644 > > --- a/drivers/perf/Kconfig > > +++ b/drivers/perf/Kconfig > > @@ -178,6 +178,15 @@ config FSL_IMX9_DDR_PMU > > can give information about memory throughput and other related > > events. > > > > +config FUJITSU_MAC_PMU > > + bool "Fujitsu Uncore MAC PMU" > > + depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT) > > + help > > + Provides support for the Uncore MAC performance monitor unit > (PMU) > > + in Fujitsu processors. > > + Adds the Uncore MAC PMU into the perf events subsystem for > > + monitoring Uncore MAC events. > > Unusual indent. Match the rest of the file. Understood. I'll fix the indent. > > > + > > > diff --git a/drivers/perf/fujitsu_mac_pmu.c > > b/drivers/perf/fujitsu_mac_pmu.c new file mode 100644 index > > 000000000000..ee92ef5691dd > > --- /dev/null > > +++ b/drivers/perf/fujitsu_mac_pmu.c > > @@ -0,0 +1,633 @@ > > +// SPDX-License-Identifier: GPL-2.0-only > > +/* > > + * Driver for the Uncore MAC PMUs in Fujitsu chips. > > + * > > + * See Documentation/admin-guide/perf/fujitsu_mac_pmu.rst for more > details. > > + * > > + * This driver is based on drivers/perf/qcom_l3_pmu.c > > + * Copyright (c) 2015-2017, The Linux Foundation. All rights reserved. > > + * Copyright (c) 2024 Fujitsu. All rights reserved. > > + */ > > + > > +#include <linux/acpi.h> > > +#include <linux/bitops.h> > > +#include <linux/interrupt.h> > > +#include <linux/io.h> > > +#include <linux/list.h> > > +#include <linux/module.h> > > +#include <linux/perf_event.h> > > +#include <linux/platform_device.h> > > + > > +/* > > + * General constants > > + */ > > + > > +/* Number of counters on each PMU */ > > +#define MAC_NUM_COUNTERS 8 > > +/* Mask for the event type field within perf_event_attr.config and EVTYPE reg > */ > > +#define MAC_EVTYPE_MASK 0xFF > > + > > +/* > > + * Register offsets > > + */ > > + > > +/* Perfmon registers */ > > +#define MAC_PM_EVCNTR(__cntr) (0x000 + ((__cntr) & 0x7) * 8) > Perhaps a macro to extra the offset part from __cntr? Yes. > However there only seem to be 8 counters, so why is the masking needed? Yes, there are only 8 counters. I folow the convention of drivers/perf/qcom_l3_pmu.c. > > > > +#define MAC_PM_CNTCTL(__cntr) (0x100 + ((__cntr) & 0x7) * 8) #define > > +MAC_PM_EVTYPE(__cntr) (0x200 + ((__cntr) & 0x7) * 8) > > +#define MAC_PM_CR 0x400 > > +#define MAC_PM_CNTENSET 0x410 > > +#define MAC_PM_CNTENCLR 0x418 > > +#define MAC_PM_INTENSET 0x420 > > +#define MAC_PM_INTENCLR 0x428 > > +#define MAC_PM_OVSR 0x440 > > + > > +/* > > + * Bit field definitions > > + */ > > + > > +/* MAC_PM_CNTCTLx */ > > +#define PMCNT_RESET (0) > > + > > +/* MAC_PM_EVTYPEx */ > > +#define EVSEL(__val) ((__val) & MAC_EVTYPE_MASK) > > + > > +/* MAC_PM_CR */ > > +#define PM_RESET (1UL << 1) > > +#define PM_ENABLE (1UL << 0) > > + > > +/* MAC_PM_CNTENSET */ > > +#define PMCNTENSET(__cntr) (1UL << ((__cntr) & 0x7)) > > + > > +/* MAC_PM_CNTENCLR */ > > +#define PMCNTENCLR(__cntr) (1UL << ((__cntr) & 0x7)) > > +#define PM_CNTENCLR_RESET (0xFF) > > + > > +/* MAC_PM_INTENSET */ > > +#define PMINTENSET(__cntr) (1UL << ((__cntr) & 0x7)) > > + > > +/* MAC_PM_INTENCLR */ > > +#define PMINTENCLR(__cntr) (1UL << ((__cntr) & 0x7)) > > +#define PM_INTENCLR_RESET (0xFF) > > + > > +/* MAC_PM_OVSR */ > > +#define PMOVSRCLR(__cntr) (1UL << ((__cntr) & 0x7)) > > +#define PMOVSRCLR_RESET (0xFF) > No brackets for single value. Understood. For a single value, I'll remove the brackets. > > Can you rename these so that the register is obvious. > MAC_PM_OVSR_CLR(_) etc. Though you'd also want to add _REG or similar to > the register definitions so it is obvious those are the addresses. The symbol naming convention is based on qcom_l3_pmu.c, so I would like to keep it that way. > > > > + > > +static inline void fujitsu_mac__init(struct mac_pmu *macpmu) { > > + int i; > > + > > + writeq_relaxed(PM_RESET, macpmu->regs + MAC_PM_CR); > > + > > + writeq_relaxed(PM_CNTENCLR_RESET, macpmu->regs + > MAC_PM_CNTENCLR); > > + writeq_relaxed(PM_INTENCLR_RESET, macpmu->regs + > MAC_PM_INTENCLR); > > + writeq_relaxed(PMOVSRCLR_RESET, macpmu->regs + > MAC_PM_OVSR); > > + > > + for (i = 0; i < MAC_NUM_COUNTERS; ++i) { > > + writeq_relaxed(PMCNT_RESET, macpmu->regs + > MAC_PM_CNTCTL(i)); > > + writeq_relaxed(EVSEL(0), macpmu->regs + > MAC_PM_EVTYPE(i)); > > + } > > + > > + /* > > + * Use writeq here to ensure all programming commands are done > > + * before proceeding > Odd indenting. Looks like an extra space before before. Understood. I'll remove the extra space. > > + */ > > + writeq(PM_ENABLE, macpmu->regs + MAC_PM_CR); } > > > + > > +static void fujitsu_mac__pmu_enable(struct pmu *pmu) { > > + struct mac_pmu *macpmu = to_mac_pmu(pmu); > > + > > + /* Ensure the other programming commands are observed before > enabling */ > > + wmb(); > Unusual to do it this way rather than after the things you want to have finished. > I guess it's not wrong, but it does prevent use of writeq() > > + > > + writeq_relaxed(PM_ENABLE, macpmu->regs + MAC_PM_CR); } > > + > > +static void fujitsu_mac__pmu_disable(struct pmu *pmu) { > > + struct mac_pmu *macpmu = to_mac_pmu(pmu); > > + > > + writeq_relaxed(0, macpmu->regs + MAC_PM_CR); > > + > > + /* Ensure the basic counter unit is stopped before proceeding */ > > + wmb(); > > Maybe just use writeq()? I understand that functions such as pmu_enable, pmu_disable, start, and stop in the struct pmu structure are called at the time of context switching, so wmb() is necessary for synchronization. This is based on qcom_l3_pmu.c, so I would like to keep it that way. > > > +} > > + > > +/* > > + * We must NOT create groups containing events from multiple hardware > > +PMUs, > > + * although mixing different software and hardware PMUs is allowed. > > + */ > > +static bool fujitsu_mac__validate_event_group(struct perf_event > > +*event) { > > + struct perf_event *leader = event->group_leader; > > + struct perf_event *sibling; > > + int counters = 0; > > + > > + if (leader->pmu != event->pmu && !is_software_event(leader)) > > + return false; > > + > > + /* The sum of the counters used by the event and its leader event */ > > + counters = 2; > > + > > + for_each_sibling_event(sibling, leader) { > > + if (is_software_event(sibling)) > > + continue; > > + if (sibling->pmu != event->pmu) > > + return false; > > + counters += 1; > > + } > > + > > + /* > > + * If the group requires more counters than the HW has, it > > + * cannot ever be scheduled. > > + */ > > + return counters <= MAC_NUM_COUNTERS; } > > + > > +static int fujitsu_mac__event_init(struct perf_event *event) { > > + struct mac_pmu *macpmu = to_mac_pmu(event->pmu); > > + struct hw_perf_event *hwc = &event->hw; > > + > > + /* > > + * Is the event for this PMU? > Single line comment syntax. Understood. I'll use single line comment syntax here and any others like it. > > + */ > > + if (event->attr.type != event->pmu->type) > > + return -ENOENT; > > + > > + /* > > + * Sampling not supported since these events are not core-attributable. > Probably also single line syntax (it's a bit long, so maybe multiline is appropriate > here). Understood. I'll use multiline comment syntax here. > > + */ > > + if (hwc->sample_period) > > + return -EINVAL; > > > + > > +static int fujitsu_mac__event_add(struct perf_event *event, int > > +flags) { > > + struct mac_pmu *macpmu = to_mac_pmu(event->pmu); > > + struct hw_perf_event *hwc = &event->hw; > > + int idx; > > + > > + /* > > + * Try to allocate a counter. > Single line comment syntax. Understood. I'll use single line comment syntax here. > > > + */ > > + idx = bitmap_find_free_region(macpmu->used_mask, > MAC_NUM_COUNTERS, 0); > > + if (idx < 0) > > + /* The counters are all in use. */ > > + return -EAGAIN; > > + > > + hwc->idx = idx; > > + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; > > + macpmu->events[idx] = event; > > + > > + if (flags & PERF_EF_START) > > + fujitsu_mac__event_start(event, 0); > > + > > + /* Propagate changes to the userspace mapping. */ > > + perf_event_update_userpage(event); > > + > > + return 0; > > +} > > + > > > + > > +/* > > + * Add sysfs attributes > Code makes this obvious and it is standard PMU driver stuff. I'd drop this > comment. > The details belong in the documentation, not here. Understood. I'll delete this comment and include it in the documentation file (.rst). > > > + * > > + * We export: > > + * - formats, used by perf user space and other tools to configure > > + events > > + * - events, used by perf user space and other tools to create events > > + * symbolically, e.g.: > > + * perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls > > + * - cpumask, used by perf user space and other tools to know on which CPUs > > + * to open the events > > + */ > > + > > +/* formats */ > > + > > +#define MAC_PMU_FORMAT_ATTR(_name, _config) > \ > > + (&((struct dev_ext_attribute[]) { \ > > + { .attr = __ATTR(_name, 0444, device_show_string, NULL), > \ > > + .var = (void *) _config, } > \ > > + })[0].attr.attr) > > + > > +static struct attribute *fujitsu_mac_pmu_formats[] = { > > + MAC_PMU_FORMAT_ATTR(event, "config:0-7"), > > + NULL, > > +}; > > + > > +static const struct attribute_group fujitsu_mac_pmu_format_group = { > > + .name = "format", > > + .attrs = fujitsu_mac_pmu_formats, > > +}; > > + > > +/* events */ > Drop comment as adds little to my eyes. Same for all similar comments. > Your code is easy to read, so they are unnecessary noise. Understood. I'll remove this comment and all similar comments. > > > + > > +static ssize_t mac_pmu_event_show(struct device *dev, > > + struct device_attribute *attr, char *page) > { > > + struct perf_pmu_events_attr *pmu_attr; > > + > > + pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr); > > + return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id); } > > + > > +#define MAC_EVENT_ATTR(_name, _id) > \ > > + PMU_EVENT_ATTR_ID(_name, mac_pmu_event_show, _id) > > + > > +static struct attribute *fujitsu_mac_pmu_events[] = { > > + MAC_EVENT_ATTR(cycles, MAC_EVENT_CYCLES), > > + MAC_EVENT_ATTR(read-count, MAC_EVENT_READ_COUNT), > > + MAC_EVENT_ATTR(read-count-request, > MAC_EVENT_READ_COUNT_REQUEST), > > + MAC_EVENT_ATTR(read-count-return, > MAC_EVENT_READ_COUNT_RETURN), > > + MAC_EVENT_ATTR(read-count-request-pftgt, > MAC_EVENT_READ_COUNT_REQUEST_PFTGT), > > + MAC_EVENT_ATTR(read-count-request-normal, > MAC_EVENT_READ_COUNT_REQUEST_NORMAL), > > + MAC_EVENT_ATTR(read-count-return-pftgt-hit, > MAC_EVENT_READ_COUNT_RETURN_PFTGT_HIT), > > + MAC_EVENT_ATTR(read-count-return-pftgt-miss, > MAC_EVENT_READ_COUNT_RETURN_PFTGT_MISS), > > + MAC_EVENT_ATTR(read-wait, MAC_EVENT_READ_WAIT), > > + MAC_EVENT_ATTR(write-count, MAC_EVENT_WRITE_COUNT), > > + MAC_EVENT_ATTR(write-count-write, > MAC_EVENT_WRITE_COUNT_WRITE), > > + MAC_EVENT_ATTR(write-count-pwrite, > MAC_EVENT_WRITE_COUNT_PWRITE), > > + MAC_EVENT_ATTR(memory-read-count, > MAC_EVENT_MEMORY_READ_COUNT), > > + MAC_EVENT_ATTR(memory-write-count, > MAC_EVENT_MEMORY_WRITE_COUNT), > > + MAC_EVENT_ATTR(memory-pwrite-count, > MAC_EVENT_MEMORY_PWRITE_COUNT), > > + MAC_EVENT_ATTR(ea-mac, MAC_EVENT_EA_MAC), > > + MAC_EVENT_ATTR(ea-memory, MAC_EVENT_EA_MEMORY), > > + MAC_EVENT_ATTR(ea-memory-mac-read, > MAC_EVENT_EA_MEMORY_MAC_READ), > > + MAC_EVENT_ATTR(ea-memory-mac-write, > MAC_EVENT_EA_MEMORY_MAC_WRITE), > > + MAC_EVENT_ATTR(ea-memory-mac-pwrite, > MAC_EVENT_EA_MEMORY_MAC_PWRITE), > > + MAC_EVENT_ATTR(ea-ha, MAC_EVENT_EA_HA), > > + NULL > > +}; > > + > > +static const struct attribute_group fujitsu_mac_pmu_events_group = { > > + .name = "events", > > + .attrs = fujitsu_mac_pmu_events, > > +}; > > + > > +/* cpumask */ > Comment is obvious, so drop it in favor of brevity. > > + > > +static ssize_t cpumask_show(struct device *dev, > > + struct device_attribute *attr, char *buf) { > > + struct mac_pmu *macpmu = to_mac_pmu(dev_get_drvdata(dev)); > > + > > + return cpumap_print_to_pagebuf(true, buf, &macpmu->cpumask); } > > + > > +static DEVICE_ATTR_RO(cpumask); > > + > > +static struct attribute *fujitsu_mac_pmu_cpumask_attrs[] = { > > + &dev_attr_cpumask.attr, > > + NULL, > No comma needed on null terminators as we will never add anything after them. Understood. I'll delete this comma and any others like it. > > +}; > > + > > +static const struct attribute_group fujitsu_mac_pmu_cpumask_attr_group = { > > + .attrs = fujitsu_mac_pmu_cpumask_attrs, }; > > + > > +/* > > + * Per PMU device attribute groups > > + */ > > +static const struct attribute_group *fujitsu_mac_pmu_attr_grps[] = { > > + &fujitsu_mac_pmu_format_group, > > + &fujitsu_mac_pmu_events_group, > > + &fujitsu_mac_pmu_cpumask_attr_group, > > + NULL, > No comma needed on null terminators. > > +}; > > + > > +/* > > + * Probing functions and data. > > + */ > Structural comments like this rarely bring real value and tend to end up wrong as > code evolves. I'd drop them all now the code is written. > > > + > > +static int fujitsu_mac_pmu_online_cpu(unsigned int cpu, struct > > +hlist_node *node) { > > + struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, > > +node); > > + > > + /* If there is not a CPU/PMU association pick this CPU */ > > + if (cpumask_empty(&macpmu->cpumask)) > > + cpumask_set_cpu(cpu, &macpmu->cpumask); > > + > > + return 0; > > +} > > + > > +static int fujitsu_mac_pmu_offline_cpu(unsigned int cpu, struct > > +hlist_node *node) { > > + struct mac_pmu *macpmu = hlist_entry_safe(node, struct mac_pmu, > node); > > + unsigned int target; > > + > > + if (!cpumask_test_and_clear_cpu(cpu, &macpmu->cpumask)) > > + return 0; > blank line - to help readability a little. Understood. I'll add the blank here and any others like it. > > + target = cpumask_any_but(cpu_online_mask, cpu); > > + if (target >= nr_cpu_ids) > > + return 0; > blank line > > + perf_pmu_migrate_context(&macpmu->pmu, cpu, target); > > + cpumask_set_cpu(target, &macpmu->cpumask); > blank line. > > > + return 0; > > +} > > + > > +static int fujitsu_mac_pmu_probe(struct platform_device *pdev) { > > + struct mac_pmu *macpmu; > > + struct acpi_device *acpi_dev; > > + struct resource *memrc; > > + int ret; > > + char *name; > > + u64 uid; > I'd pick an ordering and use it consistently. > Perhaps reverse xmas tree. Understood. I'll sort the variables by the length of their names, following your suggested "reverse xmas tree" order. > > > + > > + /* Initialize the PMU data structures */ > > This comment is a bit vague and not clearly associated with the code I would drop > it as adding insufficient value. Understood. I'll remove this comment. > > > + > > + acpi_dev = ACPI_COMPANION(&pdev->dev); > > + if (!acpi_dev) > > + return -ENODEV; > > + > > + ret = acpi_dev_uid_to_integer(acpi_dev, &uid); > > + if (ret) { > > + dev_err(&pdev->dev, "unable to read ACPI uid\n"); > > Probably nicer to use > return dev_err_probe(&pdev->dev, ret, "....) Consider a local > struct device *dev = &pdev->dev; > as reasonable number of users in here. Understood. I'll change dev_err to dev_err_probe and define *dev to replace pdev->dev. I'll change any others like it. > > > > + return ret; > > + } > > + > > + macpmu = devm_kzalloc(&pdev->dev, sizeof(*macpmu), GFP_KERNEL); > > + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, > "mac_iod%llu_mac%llu_ch%llu", > > + (uid >> 8) & 0xF, (uid >> 4) & 0xF, uid & 0xF); > > + if (!macpmu || !name) > > + return -ENOMEM; > > Whilst valid, it is neater to just do two separate checks. Also makes the code more > resilient to future reorganizations introducing bugs. Understood. I'll change the if statement to two separate checks. > > > > + > > + macpmu->pmu = (struct pmu) { > > + .parent = &pdev->dev, > > + .task_ctx_nr = perf_invalid_context, > > + > > + .pmu_enable = fujitsu_mac__pmu_enable, > > + .pmu_disable = fujitsu_mac__pmu_disable, > > + .event_init = fujitsu_mac__event_init, > > + .add = fujitsu_mac__event_add, > > + .del = fujitsu_mac__event_del, > > + .start = fujitsu_mac__event_start, > > + .stop = fujitsu_mac__event_stop, > > + .read = fujitsu_mac__event_read, > > + > > + .attr_groups = fujitsu_mac_pmu_attr_grps, > > + .capabilities = PERF_PMU_CAP_NO_EXCLUDE, > > + }; > > + > > + macpmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, > &memrc); > > + if (IS_ERR(macpmu->regs)) > > + return PTR_ERR(macpmu->regs); > > + > > + fujitsu_mac__init(macpmu); > > Why the double underscore? That is fairly unusual syntax. I folow the convention of drivers/perf/qcom_l3_pmu.c, where the corresponding qcom_l3_cache__init function uses a double underscore. > > > > + > > + ret = platform_get_irq(pdev, 0); > > + if (ret <= 0) > > + return ret; > > + > > + ret = devm_request_irq(&pdev->dev, ret, fujitsu_mac__handle_irq, 0, > > + name, macpmu); > > + if (ret) { > > + dev_err(&pdev->dev, "Request for IRQ failed for slice @%pa\n", > > + &memrc->start); > > + return ret; > reutrn dev_err_probe() > > + } > > + > > + /* Add this instance to the list used by the offline callback */ > > + ret = > cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, > &macpmu->node); > > + if (ret) { > > + dev_err(&pdev->dev, "Error %d registering hotplug", ret); > > + return ret; > return dev_err_probe() > > > + } > > + > > + ret = perf_pmu_register(&macpmu->pmu, name, -1); > > + if (ret < 0) { > > + dev_err(&pdev->dev, "Failed to register MAC PMU (%d)\n", > ret); > return dev_err_probe() > > + return ret; > > + } > > + > > + dev_info(&pdev->dev, "Registered %s, type: %d\n", name, > > +macpmu->pmu.type); > Too noisy for the kernel log when this can be easily established anyway. > dev_dbg() at most. Understood. I'll change dev_info to dev_dbg. > > > + > > + return 0; > > +} > > + > > +static const struct acpi_device_id fujitsu_mac_pmu_acpi_match[] = { > > + { "FUJI200C", }, > > + { } > > +}; > > +MODULE_DEVICE_TABLE(acpi, fujitsu_mac_pmu_acpi_match); > > + > > +static struct platform_driver fujitsu_mac_pmu_driver = { > > + .driver = { > > + .name = "fujitsu-mac-pmu", > > + .acpi_match_table = ACPI_PTR(fujitsu_mac_pmu_acpi_match), > > Drop the ACPI_PTR() It saves very little space and if you do use it you should > guard the relevant data with ifdefs Understood. I'll drop the ACPI_PTR(). > > > + .suppress_bind_attrs = true, > > + }, > > + .probe = fujitsu_mac_pmu_probe, > > +}; > > + > > +static int __init register_fujitsu_mac_pmu_driver(void) > > +{ > > + int ret; > > + > > + /* Install a hook to update the reader CPU in case it goes offline */ > > + ret = > cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, > > + "perf/fujitsu/mac:online", > > + fujitsu_mac_pmu_online_cpu, > > + fujitsu_mac_pmu_offline_cpu); > > + if (ret) > > + return ret; > > + > > + return platform_driver_register(&fujitsu_mac_pmu_driver); > > +} > > +device_initcall(register_fujitsu_mac_pmu_driver); > > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > > index 2361ed4d2b15..e6e49e09488a 100644 > > --- a/include/linux/cpuhotplug.h > > +++ b/include/linux/cpuhotplug.h > > @@ -227,6 +227,7 @@ enum cpuhp_state { > > CPUHP_AP_PERF_ARM_APM_XGENE_ONLINE, > > CPUHP_AP_PERF_ARM_CAVIUM_TX2_UNCORE_ONLINE, > > CPUHP_AP_PERF_ARM_MARVELL_CN10K_DDR_ONLINE, > > + CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, > > Can this use CPU_AP_ONLINE_DYN > or are there some ordering contraints? It follows the same principles as other PMU drivers. > > > CPUHP_AP_PERF_POWERPC_NEST_IMC_ONLINE, > > CPUHP_AP_PERF_POWERPC_CORE_IMC_ONLINE, > > CPUHP_AP_PERF_POWERPC_THREAD_IMC_ONLINE, Best Regards, Yoshihiro Furudera ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver 2024-11-15 1:18 ` Yoshihiro Furudera (Fujitsu) @ 2024-12-11 11:12 ` Jonathan Cameron 0 siblings, 0 replies; 13+ messages in thread From: Jonathan Cameron @ 2024-12-11 11:12 UTC (permalink / raw) To: Yoshihiro Furudera (Fujitsu) Cc: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel@lists.infradead.org, Bjorn Andersson, Geert Uytterhoeven, Krzysztof Kozlowski, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas "F. R. A. Prado", Thomas Gleixner, Peter Zijlstra, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org > > > +device_initcall(register_fujitsu_mac_pmu_driver); > > > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > > > index 2361ed4d2b15..e6e49e09488a 100644 > > > --- a/include/linux/cpuhotplug.h > > > +++ b/include/linux/cpuhotplug.h > > > @@ -227,6 +227,7 @@ enum cpuhp_state { > > > CPUHP_AP_PERF_ARM_APM_XGENE_ONLINE, > > > CPUHP_AP_PERF_ARM_CAVIUM_TX2_UNCORE_ONLINE, > > > CPUHP_AP_PERF_ARM_MARVELL_CN10K_DDR_ONLINE, > > > + CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, > > > > Can this use CPU_AP_ONLINE_DYN > > or are there some ordering contraints? > > It follows the same principles as other PMU drivers. Some of them and I suspect the is cut and paste rather than justified. + Not sure how long the dynamic part has been available. > > > > > > CPUHP_AP_PERF_POWERPC_NEST_IMC_ONLINE, > > > CPUHP_AP_PERF_POWERPC_CORE_IMC_ONLINE, > > > CPUHP_AP_PERF_POWERPC_THREAD_IMC_ONLINE, > > Best Regards, > Yoshihiro Furudera ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver 2024-11-08 5:40 ` [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC " Yoshihiro Furudera 2024-11-08 11:03 ` Krzysztof Kozlowski 2024-11-08 14:19 ` Jonathan Cameron @ 2024-11-08 20:19 ` kernel test robot 2024-11-09 0:36 ` kernel test robot 3 siblings, 0 replies; 13+ messages in thread From: kernel test robot @ 2024-11-08 20:19 UTC (permalink / raw) To: Yoshihiro Furudera, Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven, Krzysztof Kozlowski, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, linux-doc, linux-kernel Cc: oe-kbuild-all Hi Yoshihiro, kernel test robot noticed the following build warnings: [auto build test WARNING on tip/smp/core] [also build test WARNING on linus/master v6.12-rc6] [cannot apply to arm64/for-next/core arm-perf/for-next/perf next-20241108] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Yoshihiro-Furudera/perf-Fujitsu-Add-the-Uncore-MAC-PMU-driver/20241108-134245 base: tip/smp/core patch link: https://lore.kernel.org/r/20241108054006.2550856-2-fj5100bi%40fujitsu.com patch subject: [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver config: alpha-allyesconfig (https://download.01.org/0day-ci/archive/20241109/202411090451.quuiocP9-lkp@intel.com/config) compiler: alpha-linux-gcc (GCC) 14.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241109/202411090451.quuiocP9-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202411090451.quuiocP9-lkp@intel.com/ All warnings (new ones prefixed by >>): >> drivers/perf/fujitsu_mac_pmu.c:604:36: warning: 'fujitsu_mac_pmu_acpi_match' defined but not used [-Wunused-const-variable=] 604 | static const struct acpi_device_id fujitsu_mac_pmu_acpi_match[] = { | ^~~~~~~~~~~~~~~~~~~~~~~~~~ vim +/fujitsu_mac_pmu_acpi_match +604 drivers/perf/fujitsu_mac_pmu.c 603 > 604 static const struct acpi_device_id fujitsu_mac_pmu_acpi_match[] = { 605 { "FUJI200C", }, 606 { } 607 }; 608 MODULE_DEVICE_TABLE(acpi, fujitsu_mac_pmu_acpi_match); 609 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver 2024-11-08 5:40 ` [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC " Yoshihiro Furudera ` (2 preceding siblings ...) 2024-11-08 20:19 ` kernel test robot @ 2024-11-09 0:36 ` kernel test robot 3 siblings, 0 replies; 13+ messages in thread From: kernel test robot @ 2024-11-09 0:36 UTC (permalink / raw) To: Yoshihiro Furudera, Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven, Krzysztof Kozlowski, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, linux-doc, linux-kernel Cc: llvm, oe-kbuild-all Hi Yoshihiro, kernel test robot noticed the following build warnings: [auto build test WARNING on tip/smp/core] [also build test WARNING on linus/master v6.12-rc6] [cannot apply to arm64/for-next/core arm-perf/for-next/perf next-20241108] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Yoshihiro-Furudera/perf-Fujitsu-Add-the-Uncore-MAC-PMU-driver/20241108-134245 base: tip/smp/core patch link: https://lore.kernel.org/r/20241108054006.2550856-2-fj5100bi%40fujitsu.com patch subject: [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC PMU driver config: s390-allmodconfig (https://download.01.org/0day-ci/archive/20241109/202411090817.15n7hhOv-lkp@intel.com/config) compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 592c0fe55f6d9a811028b5f3507be91458ab2713) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241109/202411090817.15n7hhOv-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202411090817.15n7hhOv-lkp@intel.com/ All warnings (new ones prefixed by >>): In file included from drivers/perf/fujitsu_mac_pmu.c:12: In file included from include/linux/acpi.h:14: In file included from include/linux/device.h:32: In file included from include/linux/device/driver.h:21: In file included from include/linux/module.h:19: In file included from include/linux/elf.h:6: In file included from arch/s390/include/asm/elf.h:181: In file included from arch/s390/include/asm/mmu_context.h:11: In file included from arch/s390/include/asm/pgalloc.h:18: In file included from include/linux/mm.h:2232: include/linux/vmstat.h:503:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion] 503 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~ ^ 504 | item]; | ~~~~ include/linux/vmstat.h:510:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion] 510 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~ ^ 511 | NR_VM_NUMA_EVENT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~~ include/linux/vmstat.h:517:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion] 517 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_" | ~~~~~~~~~~~ ^ ~~~ include/linux/vmstat.h:523:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion] 523 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~ ^ 524 | NR_VM_NUMA_EVENT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~~ In file included from drivers/perf/fujitsu_mac_pmu.c:15: In file included from include/linux/io.h:14: In file included from arch/s390/include/asm/io.h:93: include/asm-generic/io.h:548:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 548 | val = __raw_readb(PCI_IOBASE + addr); | ~~~~~~~~~~ ^ include/asm-generic/io.h:561:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 561 | val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr)); | ~~~~~~~~~~ ^ include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu' 37 | #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) | ^ include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16' 102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x)) | ^ In file included from drivers/perf/fujitsu_mac_pmu.c:15: In file included from include/linux/io.h:14: In file included from arch/s390/include/asm/io.h:93: include/asm-generic/io.h:574:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 574 | val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr)); | ~~~~~~~~~~ ^ include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu' 35 | #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x)) | ^ include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32' 115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x)) | ^ In file included from drivers/perf/fujitsu_mac_pmu.c:15: In file included from include/linux/io.h:14: In file included from arch/s390/include/asm/io.h:93: include/asm-generic/io.h:585:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 585 | __raw_writeb(value, PCI_IOBASE + addr); | ~~~~~~~~~~ ^ include/asm-generic/io.h:595:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 595 | __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr); | ~~~~~~~~~~ ^ include/asm-generic/io.h:605:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 605 | __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr); | ~~~~~~~~~~ ^ include/asm-generic/io.h:693:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 693 | readsb(PCI_IOBASE + addr, buffer, count); | ~~~~~~~~~~ ^ include/asm-generic/io.h:701:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 701 | readsw(PCI_IOBASE + addr, buffer, count); | ~~~~~~~~~~ ^ include/asm-generic/io.h:709:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 709 | readsl(PCI_IOBASE + addr, buffer, count); | ~~~~~~~~~~ ^ include/asm-generic/io.h:718:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 718 | writesb(PCI_IOBASE + addr, buffer, count); | ~~~~~~~~~~ ^ include/asm-generic/io.h:727:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 727 | writesw(PCI_IOBASE + addr, buffer, count); | ~~~~~~~~~~ ^ include/asm-generic/io.h:736:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 736 | writesl(PCI_IOBASE + addr, buffer, count); | ~~~~~~~~~~ ^ >> drivers/perf/fujitsu_mac_pmu.c:604:36: warning: unused variable 'fujitsu_mac_pmu_acpi_match' [-Wunused-const-variable] 604 | static const struct acpi_device_id fujitsu_mac_pmu_acpi_match[] = { | ^~~~~~~~~~~~~~~~~~~~~~~~~~ 17 warnings generated. vim +/fujitsu_mac_pmu_acpi_match +604 drivers/perf/fujitsu_mac_pmu.c 603 > 604 static const struct acpi_device_id fujitsu_mac_pmu_acpi_match[] = { 605 { "FUJI200C", }, 606 { } 607 }; 608 MODULE_DEVICE_TABLE(acpi, fujitsu_mac_pmu_acpi_match); 609 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 2/2] perf: Fujitsu: Add the Uncore PCI PMU driver 2024-11-08 5:40 [PATCH 0/2] perf: Fujitsu: Add Uncore MAC/PCI PMU driver Yoshihiro Furudera 2024-11-08 5:40 ` [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC " Yoshihiro Furudera @ 2024-11-08 5:40 ` Yoshihiro Furudera 2024-11-08 11:03 ` Krzysztof Kozlowski 2024-11-08 22:33 ` kernel test robot 1 sibling, 2 replies; 13+ messages in thread From: Yoshihiro Furudera @ 2024-11-08 5:40 UTC (permalink / raw) To: Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven, Krzysztof Kozlowski, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra, linux-doc, linux-kernel, Yoshihiro Furudera This adds a new dynamic PMU to the Perf Events framework to program and control the Uncore PCI PMUs in Fujitsu chips. This driver was created with reference to drivers/perf/qcom_l3_pmu.c. This driver exports formatting and event information to sysfs so it can be used by the perf user space tools with the syntaxes: perf stat -e pci_iod0_pci0/ea-pci/ ls perf stat -e pci_iod0_pci0/event=0x80/ ls FUJITSU-MONAKA Specification URL: https://github.com/fujitsu/FUJITSU-MONAKA Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com> --- .../admin-guide/perf/fujitsu_pci_pmu.rst | 20 + arch/arm64/configs/defconfig | 1 + drivers/perf/Kconfig | 9 + drivers/perf/Makefile | 1 + drivers/perf/fujitsu_pci_pmu.c | 613 ++++++++++++++++++ include/linux/cpuhotplug.h | 1 + 6 files changed, 645 insertions(+) create mode 100644 Documentation/admin-guide/perf/fujitsu_pci_pmu.rst create mode 100644 drivers/perf/fujitsu_pci_pmu.c diff --git a/Documentation/admin-guide/perf/fujitsu_pci_pmu.rst b/Documentation/admin-guide/perf/fujitsu_pci_pmu.rst new file mode 100644 index 000000000000..5fee3a3ccc86 --- /dev/null +++ b/Documentation/admin-guide/perf/fujitsu_pci_pmu.rst @@ -0,0 +1,20 @@ +=========================================================================== +Fujitsu Uncore PCI Performance Monitoring Unit (PMU) +=========================================================================== + +This driver supports the Uncore PCI PMUs found in Fujitsu chips. +Each PCI PMU on these chips is exposed as a uncore perf PMU with device name +pci_iod<iod>_pci<pci>. + +The driver provides a description of its available events and configuration +options in sysfs, see /sys/bus/event_sources/devices/pci_iod<iod>_pci<pci>/. +Given that these are uncore PMUs the driver also exposes a "cpumask" sysfs +attribute which contains a mask consisting of one CPU which will be used to +handle all the PMU events. + +Examples for use with perf:: + + perf stat -e pci_iod0_pci0/ea-pci/ ls + +Given that these are uncore PMUs the driver does not support sampling, therefore +"perf record" will not work. Per-task perf sessions are not supported. diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig index 2ef412937228..d7df90205be6 100644 --- a/arch/arm64/configs/defconfig +++ b/arch/arm64/configs/defconfig @@ -1576,6 +1576,7 @@ CONFIG_ARM_SMMU_V3_PMU=m CONFIG_ARM_DSU_PMU=m CONFIG_FSL_IMX8_DDR_PMU=m CONFIG_FUJITSU_MAC_PMU=y +CONFIG_FUJITSU_PCI_PMU=y CONFIG_QCOM_L2_PMU=y CONFIG_QCOM_L3_PMU=y CONFIG_ARM_SPE_PMU=m diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 4705c605e286..d33b6e47cda2 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -187,6 +187,15 @@ config FUJITSU_MAC_PMU Adds the Uncore MAC PMU into the perf events subsystem for monitoring Uncore MAC events. +config FUJITSU_PCI_PMU + bool "Fujitsu Uncore PCI PMU" + depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT) + help + Provides support for the Uncore PCI performance monitor unit (PMU) + in Fujitsu processors. + Adds the Uncore PCI PMU into the perf events subsystem for + monitoring Uncore PCI events. + config QCOM_L2_PMU bool "Qualcomm Technologies L2-cache PMU" depends on ARCH_QCOM && ARM64 && ACPI diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index 7285f94125ce..1220fca45575 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -15,6 +15,7 @@ obj-$(CONFIG_FSL_IMX8_DDR_PMU) += fsl_imx8_ddr_perf.o obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o obj-$(CONFIG_HISI_PMU) += hisilicon/ obj-$(CONFIG_FUJITSU_MAC_PMU) += fujitsu_mac_pmu.o +obj-$(CONFIG_FUJITSU_PCI_PMU) += fujitsu_pci_pmu.o obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o diff --git a/drivers/perf/fujitsu_pci_pmu.c b/drivers/perf/fujitsu_pci_pmu.c new file mode 100644 index 000000000000..7a3f8a0ad52e --- /dev/null +++ b/drivers/perf/fujitsu_pci_pmu.c @@ -0,0 +1,613 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Driver for the Uncore PCI PMUs in Fujitsu chips. + * + * See Documentation/admin-guide/perf/fujitsu_pci_pmu.rst for more details. + * + * This driver is based on drivers/perf/qcom_l3_pmu.c + * Copyright (c) 2015-2017, The Linux Foundation. All rights reserved. + * Copyright (c) 2024 Fujitsu. All rights reserved. + */ + +#include <linux/acpi.h> +#include <linux/bitops.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/list.h> +#include <linux/module.h> +#include <linux/perf_event.h> +#include <linux/platform_device.h> + +/* + * General constants + */ + +/* Number of counters on each PMU */ +#define PCI_NUM_COUNTERS 8 +/* Mask for the event type field within perf_event_attr.config and EVTYPE reg */ +#define PCI_EVTYPE_MASK 0xFF + +/* + * Register offsets + */ + +/* Perfmon registers */ +#define PCI_PM_EVCNTR(__cntr) (0x000 + ((__cntr) & 0x7) * 8) +#define PCI_PM_CNTCTL(__cntr) (0x100 + ((__cntr) & 0x7) * 8) +#define PCI_PM_EVTYPE(__cntr) (0x200 + ((__cntr) & 0x7) * 8) +#define PCI_PM_CR 0x400 +#define PCI_PM_CNTENSET 0x410 +#define PCI_PM_CNTENCLR 0x418 +#define PCI_PM_INTENSET 0x420 +#define PCI_PM_INTENCLR 0x428 +#define PCI_PM_OVSR 0x440 + +/* + * Bit field definitions + */ + +/* PCI_PM_CNTCTLx */ +#define PMCNT_RESET (0) + +/* PCI_PM_EVTYPEx */ +#define EVSEL(__val) ((__val) & PCI_EVTYPE_MASK) + +/* PCI_PM_CR */ +#define PM_RESET (1UL << 1) +#define PM_ENABLE (1UL << 0) + +/* PCI_PM_CNTENSET */ +#define PMCNTENSET(__cntr) (1UL << ((__cntr) & 0x7)) + +/* PCI_PM_CNTENCLR */ +#define PMCNTENCLR(__cntr) (1UL << ((__cntr) & 0x7)) +#define PM_CNTENCLR_RESET (0xFF) + +/* PCI_PM_INTENSET */ +#define PMINTENSET(__cntr) (1UL << ((__cntr) & 0x7)) + +/* PCI_PM_INTENCLR */ +#define PMINTENCLR(__cntr) (1UL << ((__cntr) & 0x7)) +#define PM_INTENCLR_RESET (0xFF) + +/* PCI_PM_OVSR */ +#define PMOVSRCLR(__cntr) (1UL << ((__cntr) & 0x7)) +#define PMOVSRCLR_RESET (0xFF) + +/* + * Events + */ + +#define PCI_EVENT_PORT0_CYCLES 0x000 +#define PCI_EVENT_PORT0_READ_COUNT 0x010 +#define PCI_EVENT_PORT0_READ_COUNT_BUS 0x014 +#define PCI_EVENT_PORT0_WRITE_COUNT 0x020 +#define PCI_EVENT_PORT0_WRITE_COUNT_BUS 0x024 +#define PCI_EVENT_PORT1_CYCLES 0x040 +#define PCI_EVENT_PORT1_READ_COUNT 0x050 +#define PCI_EVENT_PORT1_READ_COUNT_BUS 0x054 +#define PCI_EVENT_PORT1_WRITE_COUNT 0x060 +#define PCI_EVENT_PORT1_WRITE_COUNT_BUS 0x064 +#define PCI_EVENT_EA_PCI 0x080 + +/* + * Main PMU, inherits from the core perf PMU type + */ +struct pci_pmu { + struct pmu pmu; + struct hlist_node node; + void __iomem *regs; + struct perf_event *events[PCI_NUM_COUNTERS]; + unsigned long used_mask[BITS_TO_LONGS(PCI_NUM_COUNTERS)]; + cpumask_t cpumask; +}; + +#define to_pci_pmu(p) (container_of(p, struct pci_pmu, pmu)) + +/* + * Implementation of standard counter operations + */ + +static void fujitsu_pci_counter_start(struct perf_event *event) +{ + struct pci_pmu *pcipmu = to_pci_pmu(event->pmu); + int idx = event->hw.idx; + + /* Initialize the hardware counter and reset prev_count*/ + local64_set(&event->hw.prev_count, 0); + writeq_relaxed(0, pcipmu->regs + PCI_PM_EVCNTR(idx)); + + /* Set the event type */ + writeq_relaxed(EVSEL(event->attr.config), pcipmu->regs + PCI_PM_EVTYPE(idx)); + + /* Enable interrupt generation by this counter */ + writeq_relaxed(PMINTENSET(idx), pcipmu->regs + PCI_PM_INTENSET); + + /* Finally, enable the counter */ + writeq_relaxed(PMCNT_RESET, pcipmu->regs + PCI_PM_CNTCTL(idx)); + writeq_relaxed(PMCNTENSET(idx), pcipmu->regs + PCI_PM_CNTENSET); +} + +static void fujitsu_pci_counter_stop(struct perf_event *event, + int flags) +{ + struct pci_pmu *pcipmu = to_pci_pmu(event->pmu); + int idx = event->hw.idx; + + /* Disable the counter */ + writeq_relaxed(PMCNTENCLR(idx), pcipmu->regs + PCI_PM_CNTENCLR); + + /* Disable interrupt generation by this counter */ + writeq_relaxed(PMINTENCLR(idx), pcipmu->regs + PCI_PM_INTENCLR); +} + +static void fujitsu_pci_counter_update(struct perf_event *event) +{ + struct pci_pmu *pcipmu = to_pci_pmu(event->pmu); + int idx = event->hw.idx; + u64 prev, new; + + do { + prev = local64_read(&event->hw.prev_count); + new = readq_relaxed(pcipmu->regs + PCI_PM_EVCNTR(idx)); + } while (local64_cmpxchg(&event->hw.prev_count, prev, new) != prev); + + local64_add(new - prev, &event->count); +} + +/* + * Top level PMU functions. + */ + +static inline void fujitsu_pci__init(struct pci_pmu *pcipmu) +{ + int i; + + writeq_relaxed(PM_RESET, pcipmu->regs + PCI_PM_CR); + + writeq_relaxed(PM_CNTENCLR_RESET, pcipmu->regs + PCI_PM_CNTENCLR); + writeq_relaxed(PM_INTENCLR_RESET, pcipmu->regs + PCI_PM_INTENCLR); + writeq_relaxed(PMOVSRCLR_RESET, pcipmu->regs + PCI_PM_OVSR); + + for (i = 0; i < PCI_NUM_COUNTERS; ++i) { + writeq_relaxed(PMCNT_RESET, pcipmu->regs + PCI_PM_CNTCTL(i)); + writeq_relaxed(EVSEL(0), pcipmu->regs + PCI_PM_EVTYPE(i)); + } + + /* + * Use writeq here to ensure all programming commands are done + * before proceeding + */ + writeq(PM_ENABLE, pcipmu->regs + PCI_PM_CR); +} + +static irqreturn_t fujitsu_pci__handle_irq(int irq_num, void *data) +{ + struct pci_pmu *pcipmu = data; + /* Read the overflow status register */ + long status = readq_relaxed(pcipmu->regs + PCI_PM_OVSR); + int idx; + + if (status == 0) + return IRQ_NONE; + + /* Clear the bits we read on the overflow status register */ + writeq_relaxed(status, pcipmu->regs + PCI_PM_OVSR); + + for_each_set_bit(idx, &status, PCI_NUM_COUNTERS) { + struct perf_event *event; + + event = pcipmu->events[idx]; + if (!event) + continue; + + fujitsu_pci_counter_update(event); + } + + return IRQ_HANDLED; +} + +/* + * Implementation of abstract pmu functionality required by + * the core perf events code. + */ + +static void fujitsu_pci__pmu_enable(struct pmu *pmu) +{ + struct pci_pmu *pcipmu = to_pci_pmu(pmu); + + /* Ensure the other programming commands are observed before enabling */ + wmb(); + + writeq_relaxed(PM_ENABLE, pcipmu->regs + PCI_PM_CR); +} + +static void fujitsu_pci__pmu_disable(struct pmu *pmu) +{ + struct pci_pmu *pcipmu = to_pci_pmu(pmu); + + writeq_relaxed(0, pcipmu->regs + PCI_PM_CR); + + /* Ensure the basic counter unit is stopped before proceeding */ + wmb(); +} + +/* + * We must NOT create groups containing events from multiple hardware PMUs, + * although mixing different software and hardware PMUs is allowed. + */ +static bool fujitsu_pci__validate_event_group(struct perf_event *event) +{ + struct perf_event *leader = event->group_leader; + struct perf_event *sibling; + int counters = 0; + + if (leader->pmu != event->pmu && !is_software_event(leader)) + return false; + + /* The sum of the counters used by the event and its leader event */ + counters = 2; + + for_each_sibling_event(sibling, leader) { + if (is_software_event(sibling)) + continue; + if (sibling->pmu != event->pmu) + return false; + counters += 1; + } + + /* + * If the group requires more counters than the HW has, it + * cannot ever be scheduled. + */ + return counters <= PCI_NUM_COUNTERS; +} + +static int fujitsu_pci__event_init(struct perf_event *event) +{ + struct pci_pmu *pcipmu = to_pci_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + + /* + * Is the event for this PMU? + */ + if (event->attr.type != event->pmu->type) + return -ENOENT; + + /* + * Sampling not supported since these events are not core-attributable. + */ + if (hwc->sample_period) + return -EINVAL; + + /* + * Task mode not available, we run the counters as socket counters, + * not attributable to any CPU and therefore cannot attribute per-task. + */ + if (event->cpu < 0) + return -EINVAL; + + /* Validate the group */ + if (!fujitsu_pci__validate_event_group(event)) + return -EINVAL; + + hwc->idx = -1; + + /* + * Many perf core operations (eg. events rotation) operate on a + * single CPU context. This is obvious for CPU PMUs, where one + * expects the same sets of events being observed on all CPUs, + * but can lead to issues for off-core PMUs, like this one, where + * each event could be theoretically assigned to a different CPU. + * To mitigate this, we enforce CPU assignment to one designated + * processor (the one described in the "cpumask" attribute exported + * by the PMU device). perf user space tools honor this and avoid + * opening more than one copy of the events. + */ + event->cpu = cpumask_first(&pcipmu->cpumask); + + return 0; +} + +static void fujitsu_pci__event_start(struct perf_event *event, int flags) +{ + struct hw_perf_event *hwc = &event->hw; + + hwc->state = 0; + fujitsu_pci_counter_start(event); +} + +static void fujitsu_pci__event_stop(struct perf_event *event, int flags) +{ + struct hw_perf_event *hwc = &event->hw; + + if (hwc->state & PERF_HES_STOPPED) + return; + + fujitsu_pci_counter_stop(event, flags); + if (flags & PERF_EF_UPDATE) + fujitsu_pci_counter_update(event); + hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE; +} + +static int fujitsu_pci__event_add(struct perf_event *event, int flags) +{ + struct pci_pmu *pcipmu = to_pci_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + int idx; + + /* + * Try to allocate a counter. + */ + idx = bitmap_find_free_region(pcipmu->used_mask, PCI_NUM_COUNTERS, 0); + if (idx < 0) + /* The counters are all in use. */ + return -EAGAIN; + + hwc->idx = idx; + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; + pcipmu->events[idx] = event; + + if (flags & PERF_EF_START) + fujitsu_pci__event_start(event, 0); + + /* Propagate changes to the userspace mapping. */ + perf_event_update_userpage(event); + + return 0; +} + +static void fujitsu_pci__event_del(struct perf_event *event, int flags) +{ + struct pci_pmu *pcipmu = to_pci_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + + /* Stop and clean up */ + fujitsu_pci__event_stop(event, flags | PERF_EF_UPDATE); + pcipmu->events[hwc->idx] = NULL; + bitmap_release_region(pcipmu->used_mask, hwc->idx, 0); + + /* Propagate changes to the userspace mapping. */ + perf_event_update_userpage(event); +} + +static void fujitsu_pci__event_read(struct perf_event *event) +{ + fujitsu_pci_counter_update(event); +} + +/* + * Add sysfs attributes + * + * We export: + * - formats, used by perf user space and other tools to configure events + * - events, used by perf user space and other tools to create events + * symbolically, e.g.: + * perf stat -a -e pci_iod0_pci0/event=0x24/ ls + * - cpumask, used by perf user space and other tools to know on which CPUs + * to open the events + */ + +/* formats */ + +#define PCI_PMU_FORMAT_ATTR(_name, _config) \ + (&((struct dev_ext_attribute[]) { \ + { .attr = __ATTR(_name, 0444, device_show_string, NULL), \ + .var = (void *) _config, } \ + })[0].attr.attr) + +static struct attribute *fujitsu_pci_pmu_formats[] = { + PCI_PMU_FORMAT_ATTR(event, "config:0-7"), + NULL, +}; + +static const struct attribute_group fujitsu_pci_pmu_format_group = { + .name = "format", + .attrs = fujitsu_pci_pmu_formats, +}; + +/* events */ + +static ssize_t pci_pmu_event_show(struct device *dev, + struct device_attribute *attr, char *page) +{ + struct perf_pmu_events_attr *pmu_attr; + + pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr); + return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id); +} + +#define PCI_EVENT_ATTR(_name, _id) \ + PMU_EVENT_ATTR_ID(_name, pci_pmu_event_show, _id) + +static struct attribute *fujitsu_pci_pmu_events[] = { + PCI_EVENT_ATTR(pci-port0-cycles, PCI_EVENT_PORT0_CYCLES), + PCI_EVENT_ATTR(pci-port0-read-count, PCI_EVENT_PORT0_READ_COUNT), + PCI_EVENT_ATTR(pci-port0-read-count-bus, PCI_EVENT_PORT0_READ_COUNT_BUS), + PCI_EVENT_ATTR(pci-port0-write-count, PCI_EVENT_PORT0_WRITE_COUNT), + PCI_EVENT_ATTR(pci-port0-write-count-bus, PCI_EVENT_PORT0_WRITE_COUNT_BUS), + PCI_EVENT_ATTR(pci-port1-cycles, PCI_EVENT_PORT1_CYCLES), + PCI_EVENT_ATTR(pci-port1-read-count, PCI_EVENT_PORT1_READ_COUNT), + PCI_EVENT_ATTR(pci-port1-read-count-bus, PCI_EVENT_PORT1_READ_COUNT_BUS), + PCI_EVENT_ATTR(pci-port1-write-count, PCI_EVENT_PORT1_WRITE_COUNT), + PCI_EVENT_ATTR(pci-port1-write-count_bus, PCI_EVENT_PORT1_WRITE_COUNT_BUS), + PCI_EVENT_ATTR(ea-pci, PCI_EVENT_EA_PCI), + NULL +}; + +static const struct attribute_group fujitsu_pci_pmu_events_group = { + .name = "events", + .attrs = fujitsu_pci_pmu_events, +}; + +/* cpumask */ + +static ssize_t cpumask_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct pci_pmu *pcipmu = to_pci_pmu(dev_get_drvdata(dev)); + + return cpumap_print_to_pagebuf(true, buf, &pcipmu->cpumask); +} + +static DEVICE_ATTR_RO(cpumask); + +static struct attribute *fujitsu_pci_pmu_cpumask_attrs[] = { + &dev_attr_cpumask.attr, + NULL, +}; + +static const struct attribute_group fujitsu_pci_pmu_cpumask_attr_group = { + .attrs = fujitsu_pci_pmu_cpumask_attrs, +}; + +/* + * Per PMU device attribute groups + */ +static const struct attribute_group *fujitsu_pci_pmu_attr_grps[] = { + &fujitsu_pci_pmu_format_group, + &fujitsu_pci_pmu_events_group, + &fujitsu_pci_pmu_cpumask_attr_group, + NULL, +}; + +/* + * Probing functions and data. + */ + +static int fujitsu_pci_pmu_online_cpu(unsigned int cpu, struct hlist_node *node) +{ + struct pci_pmu *pcipmu = hlist_entry_safe(node, struct pci_pmu, node); + + /* If there is not a CPU/PMU association pick this CPU */ + if (cpumask_empty(&pcipmu->cpumask)) + cpumask_set_cpu(cpu, &pcipmu->cpumask); + + return 0; +} + +static int fujitsu_pci_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node) +{ + struct pci_pmu *pcipmu = hlist_entry_safe(node, struct pci_pmu, node); + unsigned int target; + + if (!cpumask_test_and_clear_cpu(cpu, &pcipmu->cpumask)) + return 0; + target = cpumask_any_but(cpu_online_mask, cpu); + if (target >= nr_cpu_ids) + return 0; + perf_pmu_migrate_context(&pcipmu->pmu, cpu, target); + cpumask_set_cpu(target, &pcipmu->cpumask); + return 0; +} + +static int fujitsu_pci_pmu_probe(struct platform_device *pdev) +{ + struct pci_pmu *pcipmu; + struct acpi_device *acpi_dev; + struct resource *memrc; + int ret; + char *name; + u64 uid; + + /* Initialize the PMU data structures */ + + acpi_dev = ACPI_COMPANION(&pdev->dev); + if (!acpi_dev) + return -ENODEV; + + ret = acpi_dev_uid_to_integer(acpi_dev, &uid); + if (ret) { + dev_err(&pdev->dev, "unable to read ACPI uid\n"); + return ret; + } + + pcipmu = devm_kzalloc(&pdev->dev, sizeof(*pcipmu), GFP_KERNEL); + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "pci_iod%llu_pci%llu", + (uid >> 4) & 0xF, uid & 0xF); + if (!pcipmu || !name) + return -ENOMEM; + + pcipmu->pmu = (struct pmu) { + .parent = &pdev->dev, + .task_ctx_nr = perf_invalid_context, + + .pmu_enable = fujitsu_pci__pmu_enable, + .pmu_disable = fujitsu_pci__pmu_disable, + .event_init = fujitsu_pci__event_init, + .add = fujitsu_pci__event_add, + .del = fujitsu_pci__event_del, + .start = fujitsu_pci__event_start, + .stop = fujitsu_pci__event_stop, + .read = fujitsu_pci__event_read, + + .attr_groups = fujitsu_pci_pmu_attr_grps, + .capabilities = PERF_PMU_CAP_NO_EXCLUDE, + }; + + pcipmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &memrc); + if (IS_ERR(pcipmu->regs)) + return PTR_ERR(pcipmu->regs); + + fujitsu_pci__init(pcipmu); + + ret = platform_get_irq(pdev, 0); + if (ret <= 0) + return ret; + + ret = devm_request_irq(&pdev->dev, ret, fujitsu_pci__handle_irq, 0, + name, pcipmu); + if (ret) { + dev_err(&pdev->dev, "Request for IRQ failed for slice @%pa\n", + &memrc->start); + return ret; + } + + /* Add this instance to the list used by the offline callback */ + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_FUJITSU_PCI_ONLINE, &pcipmu->node); + if (ret) { + dev_err(&pdev->dev, "Error %d registering hotplug", ret); + return ret; + } + + ret = perf_pmu_register(&pcipmu->pmu, name, -1); + if (ret < 0) { + dev_err(&pdev->dev, "Failed to register PCI PMU (%d)\n", ret); + return ret; + } + + dev_info(&pdev->dev, "Registered %s, type: %d\n", name, pcipmu->pmu.type); + + return 0; +} + +static const struct acpi_device_id fujitsu_pci_pmu_acpi_match[] = { + { "FUJI200D", }, + { } +}; +MODULE_DEVICE_TABLE(acpi, fujitsu_pci_pmu_acpi_match); + +static struct platform_driver fujitsu_pci_pmu_driver = { + .driver = { + .name = "fujitsu-pci-pmu", + .acpi_match_table = ACPI_PTR(fujitsu_pci_pmu_acpi_match), + .suppress_bind_attrs = true, + }, + .probe = fujitsu_pci_pmu_probe, +}; + +static int __init register_fujitsu_pci_pmu_driver(void) +{ + int ret; + + /* Install a hook to update the reader CPU in case it goes offline */ + ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_FUJITSU_PCI_ONLINE, + "perf/fujitsu/pci:online", + fujitsu_pci_pmu_online_cpu, + fujitsu_pci_pmu_offline_cpu); + if (ret) + return ret; + + return platform_driver_register(&fujitsu_pci_pmu_driver); +} +device_initcall(register_fujitsu_pci_pmu_driver); diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index e6e49e09488a..b2538a7bdff8 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -228,6 +228,7 @@ enum cpuhp_state { CPUHP_AP_PERF_ARM_CAVIUM_TX2_UNCORE_ONLINE, CPUHP_AP_PERF_ARM_MARVELL_CN10K_DDR_ONLINE, CPUHP_AP_PERF_ARM_FUJITSU_MAC_ONLINE, + CPUHP_AP_PERF_ARM_FUJITSU_PCI_ONLINE, CPUHP_AP_PERF_POWERPC_NEST_IMC_ONLINE, CPUHP_AP_PERF_POWERPC_CORE_IMC_ONLINE, CPUHP_AP_PERF_POWERPC_THREAD_IMC_ONLINE, -- 2.34.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] perf: Fujitsu: Add the Uncore PCI PMU driver 2024-11-08 5:40 ` [PATCH 2/2] perf: Fujitsu: Add the Uncore PCI " Yoshihiro Furudera @ 2024-11-08 11:03 ` Krzysztof Kozlowski 2024-11-08 22:33 ` kernel test robot 1 sibling, 0 replies; 13+ messages in thread From: Krzysztof Kozlowski @ 2024-11-08 11:03 UTC (permalink / raw) To: Yoshihiro Furudera, Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, Peter Zijlstra, linux-doc, linux-kernel On 08/11/2024 06:40, Yoshihiro Furudera wrote: > This adds a new dynamic PMU to the Perf Events framework to program and > control the Uncore PCI PMUs in Fujitsu chips. > > This driver was created with reference to drivers/perf/qcom_l3_pmu.c. > > This driver exports formatting and event information to sysfs so it can > be used by the perf user space tools with the syntaxes: > > perf stat -e pci_iod0_pci0/ea-pci/ ls > perf stat -e pci_iod0_pci0/event=0x80/ ls > > FUJITSU-MONAKA Specification URL: > https://github.com/fujitsu/FUJITSU-MONAKA > > Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com> > --- > .../admin-guide/perf/fujitsu_pci_pmu.rst | 20 + > arch/arm64/configs/defconfig | 1 + > drivers/perf/Kconfig | 9 + > drivers/perf/Makefile | 1 + > drivers/perf/fujitsu_pci_pmu.c | 613 ++++++++++++++++++ > include/linux/cpuhotplug.h | 1 + > 6 files changed, 645 insertions(+) > create mode 100644 Documentation/admin-guide/perf/fujitsu_pci_pmu.rst > create mode 100644 drivers/perf/fujitsu_pci_pmu.c > > diff --git a/Documentation/admin-guide/perf/fujitsu_pci_pmu.rst b/Documentation/admin-guide/perf/fujitsu_pci_pmu.rst > new file mode 100644 > index 000000000000..5fee3a3ccc86 > --- /dev/null > +++ b/Documentation/admin-guide/perf/fujitsu_pci_pmu.rst > @@ -0,0 +1,20 @@ > +=========================================================================== > +Fujitsu Uncore PCI Performance Monitoring Unit (PMU) > +=========================================================================== > + > +This driver supports the Uncore PCI PMUs found in Fujitsu chips. > +Each PCI PMU on these chips is exposed as a uncore perf PMU with device name > +pci_iod<iod>_pci<pci>. > + > +The driver provides a description of its available events and configuration > +options in sysfs, see /sys/bus/event_sources/devices/pci_iod<iod>_pci<pci>/. > +Given that these are uncore PMUs the driver also exposes a "cpumask" sysfs > +attribute which contains a mask consisting of one CPU which will be used to > +handle all the PMU events. > + > +Examples for use with perf:: > + > + perf stat -e pci_iod0_pci0/ea-pci/ ls > + > +Given that these are uncore PMUs the driver does not support sampling, therefore > +"perf record" will not work. Per-task perf sessions are not supported. > diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig > index 2ef412937228..d7df90205be6 100644 > --- a/arch/arm64/configs/defconfig > +++ b/arch/arm64/configs/defconfig > @@ -1576,6 +1576,7 @@ CONFIG_ARM_SMMU_V3_PMU=m > CONFIG_ARM_DSU_PMU=m > CONFIG_FSL_IMX8_DDR_PMU=m > CONFIG_FUJITSU_MAC_PMU=y > +CONFIG_FUJITSU_PCI_PMU=y Same concerns. There is no such ARCH and this must be sent via your SoC folks. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] perf: Fujitsu: Add the Uncore PCI PMU driver 2024-11-08 5:40 ` [PATCH 2/2] perf: Fujitsu: Add the Uncore PCI " Yoshihiro Furudera 2024-11-08 11:03 ` Krzysztof Kozlowski @ 2024-11-08 22:33 ` kernel test robot 1 sibling, 0 replies; 13+ messages in thread From: kernel test robot @ 2024-11-08 22:33 UTC (permalink / raw) To: Yoshihiro Furudera, Will Deacon, Mark Rutland, Jonathan Corbet, Catalin Marinas, linux-arm-kernel, Bjorn Andersson, Geert Uytterhoeven, Krzysztof Kozlowski, Dmitry Baryshkov, Konrad Dybcio, Neil Armstrong, Arnd Bergmann, Nícolas F. R. A. Prado, Thomas Gleixner, linux-doc, linux-kernel Cc: oe-kbuild-all Hi Yoshihiro, kernel test robot noticed the following build warnings: [auto build test WARNING on tip/smp/core] [also build test WARNING on linus/master v6.12-rc6] [cannot apply to arm64/for-next/core arm-perf/for-next/perf next-20241108] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Yoshihiro-Furudera/perf-Fujitsu-Add-the-Uncore-MAC-PMU-driver/20241108-134245 base: tip/smp/core patch link: https://lore.kernel.org/r/20241108054006.2550856-3-fj5100bi%40fujitsu.com patch subject: [PATCH 2/2] perf: Fujitsu: Add the Uncore PCI PMU driver config: alpha-allyesconfig (https://download.01.org/0day-ci/archive/20241109/202411090610.oPYJJG00-lkp@intel.com/config) compiler: alpha-linux-gcc (GCC) 14.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241109/202411090610.oPYJJG00-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202411090610.oPYJJG00-lkp@intel.com/ All warnings (new ones prefixed by >>): >> drivers/perf/fujitsu_pci_pmu.c:584:36: warning: 'fujitsu_pci_pmu_acpi_match' defined but not used [-Wunused-const-variable=] 584 | static const struct acpi_device_id fujitsu_pci_pmu_acpi_match[] = { | ^~~~~~~~~~~~~~~~~~~~~~~~~~ vim +/fujitsu_pci_pmu_acpi_match +584 drivers/perf/fujitsu_pci_pmu.c 583 > 584 static const struct acpi_device_id fujitsu_pci_pmu_acpi_match[] = { 585 { "FUJI200D", }, 586 { } 587 }; 588 MODULE_DEVICE_TABLE(acpi, fujitsu_pci_pmu_acpi_match); 589 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-12-11 11:14 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-11-08 5:40 [PATCH 0/2] perf: Fujitsu: Add Uncore MAC/PCI PMU driver Yoshihiro Furudera 2024-11-08 5:40 ` [PATCH 1/2] perf: Fujitsu: Add the Uncore MAC " Yoshihiro Furudera 2024-11-08 11:03 ` Krzysztof Kozlowski 2024-11-08 11:08 ` Krzysztof Kozlowski 2024-11-12 7:49 ` Yoshihiro Furudera (Fujitsu) 2024-11-08 14:19 ` Jonathan Cameron 2024-11-15 1:18 ` Yoshihiro Furudera (Fujitsu) 2024-12-11 11:12 ` Jonathan Cameron 2024-11-08 20:19 ` kernel test robot 2024-11-09 0:36 ` kernel test robot 2024-11-08 5:40 ` [PATCH 2/2] perf: Fujitsu: Add the Uncore PCI " Yoshihiro Furudera 2024-11-08 11:03 ` Krzysztof Kozlowski 2024-11-08 22:33 ` kernel test robot
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).