* [PATCH v6 0/4] perf: arm_cspmu: ampere: Add support for Ampere SoC PMUs
From: Ilkka Koskinen @ 2023-08-15 6:35 UTC
To: Will Deacon, Robin Murphy, Besar Wicaksono, Jonathan Cameron, Suzuki K Poulose, Mark Rutland, Jonathan Corbet
Cc: Ilkka Koskinen, linux-arm-kernel, linux-kernel, linux-doc

Changes since v5:
    * Implemented the needed parts for vendor registration API
    * Rebased on top of Besar's patch
      [PATCH v5] perf: arm_cspmu: Separate Arm and vendor module
      https://lore.kernel.org/all/20230705104745.52255-1-bwicaksono@nvidia.com/
    * v5: https://lore.kernel.org/all/20230714010141.824226-1-ilkka@os.amperecomputing.com/

Changes since v4:
    * "Support implementation specific filters" patch:
        - Added comment about filter and impdef registers and reference to
          the Coresight PMU specification to the commit message
    * "Add support for Ampere SoC PMU" patch:
        - Fixed the documentation and added more comments
        - Changed the incrementing PMU index number to idr_alloc()
          (Needs an impdef release hook patch to release unused index)
        - Fixed style in init_ops() to be more reasonable
        - Moved bank parameter to config1

Changes since v3:
    * use_64b_counter_reg => has_atomic_dword (patch 1/4)
    * Removed the unnecessary hook for group validation (patch 3/4)
    * Added group config validation to ampere_cspmu_validate_event() (patch 4/4)
    * Rebased the patchset

Changes since v2:
    * Changed to use supports_64bits_atomics() and replaced the split writes
      with lo_hi_writeq()
    * Added implementation specific group validation to patch 3
    * Dropped shared interrupt patch
    * Removed unnecessary filter_enable parameter from ampere module
    * Added group validation to ampere module

Changes since v1:
    * Rather than creating a completely new driver, implemented as a submodule
      of Arm CoreSight PMU driver
    * Fixed shared filter handling

Ilkka Koskinen (4):
  perf: arm_cspmu: Split 64-bit write to 32-bit writes
  perf: arm_cspmu: Support implementation specific filters
  perf: arm_cspmu: Support implementation specific validation
  perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU

 .../admin-guide/perf/ampere_cspmu.rst         |  29 ++
 drivers/perf/arm_cspmu/Kconfig                |  10 +
 drivers/perf/arm_cspmu/Makefile               |   2 +
 drivers/perf/arm_cspmu/ampere_cspmu.c         | 271 ++++++++++++++++++
 drivers/perf/arm_cspmu/arm_cspmu.c            |  33 ++-
 drivers/perf/arm_cspmu/arm_cspmu.h            |   7 +
 6 files changed, 346 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/admin-guide/perf/ampere_cspmu.rst
 create mode 100644 drivers/perf/arm_cspmu/ampere_cspmu.c

--
2.41.0
* [PATCH v6 1/4] perf: arm_cspmu: Split 64-bit write to 32-bit writes
From: Ilkka Koskinen @ 2023-08-15 6:35 UTC
To: Will Deacon, Robin Murphy, Besar Wicaksono, Jonathan Cameron, Suzuki K Poulose, Mark Rutland, Jonathan Corbet
Cc: Ilkka Koskinen, linux-arm-kernel, linux-kernel, linux-doc

Split the 64-bit register accesses if 64-bit access is not supported
by the PMU.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com>
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index 04be94b4aa48..6387cbad7a7d 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -715,7 +715,10 @@ static void arm_cspmu_write_counter(struct perf_event *event, u64 val)
 	if (use_64b_counter_reg(cspmu)) {
 		offset = counter_offset(sizeof(u64), event->hw.idx);
 
-		writeq(val, cspmu->base1 + offset);
+		if (cspmu->has_atomic_dword)
+			writeq(val, cspmu->base1 + offset);
+		else
+			lo_hi_writeq(val, cspmu->base1 + offset);
 	} else {
 		offset = counter_offset(sizeof(u32), event->hw.idx);
 
--
2.41.0
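For readers unfamiliar with the split write in patch 1/4, the following is a minimal userspace sketch of the idea, not the kernel implementation. It assumes a 64-bit counter exposed as two adjacent 32-bit words; `struct fake_counter`, `split_write64()` and `split_read64()` are hypothetical names used only for illustration. The low word is written before the high word, which is the ordering lo_hi_writeq() uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit counter seen as two 32-bit words. */
struct fake_counter {
	volatile uint32_t lo;	/* offset 0: bits 31..0  */
	volatile uint32_t hi;	/* offset 4: bits 63..32 */
};

/* Two 32-bit stores instead of one 64-bit store: low word first, then high. */
static void split_write64(struct fake_counter *reg, uint64_t val)
{
	reg->lo = (uint32_t)val;
	reg->hi = (uint32_t)(val >> 32);
}

/* Reassemble the 64-bit value from the two halves. */
static uint64_t split_read64(const struct fake_counter *reg)
{
	return ((uint64_t)reg->hi << 32) | reg->lo;
}
```

Because the two stores are not atomic as a pair, a sketch like this is only appropriate when nothing else updates the register between the two writes.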
* Re: [PATCH v6 1/4] perf: arm_cspmu: Split 64-bit write to 32-bit writes 2023-08-15 6:35 ` [PATCH v6 1/4] perf: arm_cspmu: Split 64-bit write to 32-bit writes Ilkka Koskinen @ 2023-08-15 10:24 ` Suzuki K Poulose 2023-08-15 20:46 ` Ilkka Koskinen 0 siblings, 1 reply; 10+ messages in thread From: Suzuki K Poulose @ 2023-08-15 10:24 UTC (permalink / raw) To: Ilkka Koskinen, Will Deacon, Robin Murphy, Besar Wicaksono, Jonathan Cameron, Mark Rutland, Jonathan Corbet Cc: linux-arm-kernel, linux-kernel, linux-doc On 15/08/2023 07:35, Ilkka Koskinen wrote: > Split the 64-bit register accesses if 64-bit access is not supported > by the PMU. > > Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> > Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com> Do we need a Fixes tag ? With that: Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Suzuki > --- > drivers/perf/arm_cspmu/arm_cspmu.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c > index 04be94b4aa48..6387cbad7a7d 100644 > --- a/drivers/perf/arm_cspmu/arm_cspmu.c > +++ b/drivers/perf/arm_cspmu/arm_cspmu.c > @@ -715,7 +715,10 @@ static void arm_cspmu_write_counter(struct perf_event *event, u64 val) > if (use_64b_counter_reg(cspmu)) { > offset = counter_offset(sizeof(u64), event->hw.idx); > > - writeq(val, cspmu->base1 + offset); > + if (cspmu->has_atomic_dword) > + writeq(val, cspmu->base1 + offset); > + else > + lo_hi_writeq(val, cspmu->base1 + offset); > } else { > offset = counter_offset(sizeof(u32), event->hw.idx); > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v6 1/4] perf: arm_cspmu: Split 64-bit write to 32-bit writes 2023-08-15 10:24 ` Suzuki K Poulose @ 2023-08-15 20:46 ` Ilkka Koskinen 2023-08-16 14:00 ` Suzuki K Poulose 0 siblings, 1 reply; 10+ messages in thread From: Ilkka Koskinen @ 2023-08-15 20:46 UTC (permalink / raw) To: Suzuki K Poulose Cc: Ilkka Koskinen, Will Deacon, Robin Murphy, Besar Wicaksono, Jonathan Cameron, Mark Rutland, Jonathan Corbet, linux-arm-kernel, linux-kernel, linux-doc Hi Suzuki, On Tue, 15 Aug 2023, Suzuki K Poulose wrote: > On 15/08/2023 07:35, Ilkka Koskinen wrote: >> Split the 64-bit register accesses if 64-bit access is not supported >> by the PMU. >> >> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> >> Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com> > > Do we need a Fixes tag ? I believe, NVIDIA's PMU supports 64-bit access while Ampere's one doesn't and since this patchset enables support for the latter one, it doesn't seem like we need a Fixes tag here. Cheers, Ilkka > > With that: > > Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> > > Suzuki > >> --- >> drivers/perf/arm_cspmu/arm_cspmu.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c >> b/drivers/perf/arm_cspmu/arm_cspmu.c >> index 04be94b4aa48..6387cbad7a7d 100644 >> --- a/drivers/perf/arm_cspmu/arm_cspmu.c >> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c >> @@ -715,7 +715,10 @@ static void arm_cspmu_write_counter(struct perf_event >> *event, u64 val) >> if (use_64b_counter_reg(cspmu)) { >> offset = counter_offset(sizeof(u64), event->hw.idx); >> - writeq(val, cspmu->base1 + offset); >> + if (cspmu->has_atomic_dword) >> + writeq(val, cspmu->base1 + offset); >> + else >> + lo_hi_writeq(val, cspmu->base1 + offset); > > >> } else { >> offset = counter_offset(sizeof(u32), event->hw.idx); >> > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v6 1/4] perf: arm_cspmu: Split 64-bit write to 32-bit writes 2023-08-15 20:46 ` Ilkka Koskinen @ 2023-08-16 14:00 ` Suzuki K Poulose 0 siblings, 0 replies; 10+ messages in thread From: Suzuki K Poulose @ 2023-08-16 14:00 UTC (permalink / raw) To: Ilkka Koskinen Cc: Will Deacon, Robin Murphy, Besar Wicaksono, Jonathan Cameron, Mark Rutland, Jonathan Corbet, linux-arm-kernel, linux-kernel, linux-doc On 15/08/2023 21:46, Ilkka Koskinen wrote: > > Hi Suzuki, > > On Tue, 15 Aug 2023, Suzuki K Poulose wrote: >> On 15/08/2023 07:35, Ilkka Koskinen wrote: >>> Split the 64-bit register accesses if 64-bit access is not supported >>> by the PMU. >>> >>> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> >>> Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com> >> >> Do we need a Fixes tag ? > > I believe, NVIDIA's PMU supports 64-bit access while Ampere's one > doesn't and since this patchset enables support for the latter one, it > doesn't seem like we need a Fixes tag here. Ok, makes sense. Suzuki > > Cheers, Ilkka > >> >> With that: >> >> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> >> >> Suzuki >> >>> --- >>> drivers/perf/arm_cspmu/arm_cspmu.c | 5 ++++- >>> 1 file changed, 4 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c >>> b/drivers/perf/arm_cspmu/arm_cspmu.c >>> index 04be94b4aa48..6387cbad7a7d 100644 >>> --- a/drivers/perf/arm_cspmu/arm_cspmu.c >>> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c >>> @@ -715,7 +715,10 @@ static void arm_cspmu_write_counter(struct >>> perf_event *event, u64 val) >>> if (use_64b_counter_reg(cspmu)) { >>> offset = counter_offset(sizeof(u64), event->hw.idx); >>> - writeq(val, cspmu->base1 + offset); >>> + if (cspmu->has_atomic_dword) >>> + writeq(val, cspmu->base1 + offset); >>> + else >>> + lo_hi_writeq(val, cspmu->base1 + offset); >> >> >>> } else { >>> offset = counter_offset(sizeof(u32), event->hw.idx); >>> >> >> ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v6 2/4] perf: arm_cspmu: Support implementation specific filters
From: Ilkka Koskinen @ 2023-08-15 6:35 UTC
To: Will Deacon, Robin Murphy, Besar Wicaksono, Jonathan Cameron, Suzuki K Poulose, Mark Rutland, Jonathan Corbet
Cc: Ilkka Koskinen, linux-arm-kernel, linux-kernel, linux-doc

ARM Coresight PMU architecture specification [1] defines PMEVTYPER and
PMEVFILT* registers as optional in Chapter 2.1. Moreover, implementers may
choose to use PMIMPDEF* registers (offset: 0xD80-> 0xDFF) to filter the
events. Add support for those by adding implementation specific filter
callback function.

[1] https://developer.arm.com/documentation/ihi0091/latest

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com>
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 12 ++++++++----
 drivers/perf/arm_cspmu/arm_cspmu.h |  3 +++
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index 6387cbad7a7d..94f6856ec786 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -116,6 +116,9 @@ static unsigned long arm_cspmu_cpuhp_state;
 
 static DEFINE_MUTEX(arm_cspmu_lock);
 
+static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
+				    struct hw_perf_event *hwc, u32 filter);
+
 static struct acpi_apmt_node *arm_cspmu_apmt_node(struct device *dev)
 {
 	return *(struct acpi_apmt_node **)dev_get_platdata(dev);
@@ -450,6 +453,7 @@ static int arm_cspmu_init_impl_ops(struct arm_cspmu *cspmu)
 	CHECK_DEFAULT_IMPL_OPS(impl_ops, event_type);
 	CHECK_DEFAULT_IMPL_OPS(impl_ops, event_filter);
 	CHECK_DEFAULT_IMPL_OPS(impl_ops, event_attr_is_visible);
+	CHECK_DEFAULT_IMPL_OPS(impl_ops, set_ev_filter);
 
 	return 0;
 }
@@ -811,9 +815,9 @@ static inline void arm_cspmu_set_event(struct arm_cspmu *cspmu,
 	writel(hwc->config, cspmu->base0 + offset);
 }
 
-static inline void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
-					   struct hw_perf_event *hwc,
-					   u32 filter)
+static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
+				    struct hw_perf_event *hwc,
+				    u32 filter)
 {
 	u32 offset = PMEVFILTR + (4 * hwc->idx);
 
@@ -845,7 +849,7 @@ static void arm_cspmu_start(struct perf_event *event, int pmu_flags)
 		arm_cspmu_set_cc_filter(cspmu, filter);
 	} else {
 		arm_cspmu_set_event(cspmu, hwc);
-		arm_cspmu_set_ev_filter(cspmu, hwc, filter);
+		cspmu->impl.ops.set_ev_filter(cspmu, hwc, filter);
 	}
 
 	hwc->state = 0;
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
index e5c6dff2ce7f..274ca3d10578 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.h
+++ b/drivers/perf/arm_cspmu/arm_cspmu.h
@@ -104,6 +104,9 @@ struct arm_cspmu_impl_ops {
 	u32 (*event_type)(const struct perf_event *event);
 	/* Decode filter value from configs */
 	u32 (*event_filter)(const struct perf_event *event);
+	/* Set event filter */
+	void (*set_ev_filter)(struct arm_cspmu *cspmu,
+			      struct hw_perf_event *hwc, u32 filter);
 	/* Hide/show unsupported events */
 	umode_t (*event_attr_is_visible)(struct kobject *kobj,
 					 struct attribute *attr, int unused);
--
2.41.0
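The callback-with-default pattern that patch 2/4 extends can be sketched in plain userspace C as below. This is an illustration under stated assumptions rather than the driver code: `struct pmu`, `struct pmu_ops`, `init_ops()` and both filter functions are hypothetical, and the two arrays merely stand in for the PMEVFILT* and PMIMPDEF* register ranges. The point is that a vendor may install its own filter writer, and any callback left unset falls back to the generic one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pmu;

/* Hypothetical ops table with a single overridable callback. */
struct pmu_ops {
	void (*set_ev_filter)(struct pmu *pmu, int idx, uint32_t filter);
};

struct pmu {
	struct pmu_ops ops;
	uint32_t evfilt[4];	/* stands in for the per-counter filter registers */
	uint32_t impdef[4];	/* stands in for implementation defined registers */
};

/* Generic behaviour: one filter register per counter index. */
static void default_set_ev_filter(struct pmu *pmu, int idx, uint32_t filter)
{
	pmu->evfilt[idx] = filter;
}

/* Vendor behaviour: a single PMU-wide filter in an impdef register. */
static void vendor_set_ev_filter(struct pmu *pmu, int idx, uint32_t filter)
{
	(void)idx;
	pmu->impdef[0] = filter;
}

/* Fill in the default for any callback the vendor did not provide. */
static void init_ops(struct pmu *pmu)
{
	if (!pmu->ops.set_ev_filter)
		pmu->ops.set_ev_filter = default_set_ev_filter;
}
```

With the default filled in at init time, the caller can invoke `ops.set_ev_filter` unconditionally, which is what the arm_cspmu_start() hunk above relies on.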
* Re: [PATCH v6 2/4] perf: arm_cspmu: Support implementation specific filters 2023-08-15 6:35 ` [PATCH v6 2/4] perf: arm_cspmu: Support implementation specific filters Ilkka Koskinen @ 2023-08-16 14:03 ` Suzuki K Poulose 0 siblings, 0 replies; 10+ messages in thread From: Suzuki K Poulose @ 2023-08-16 14:03 UTC (permalink / raw) To: Ilkka Koskinen, Will Deacon, Robin Murphy, Besar Wicaksono, Jonathan Cameron, Mark Rutland, Jonathan Corbet Cc: linux-arm-kernel, linux-kernel, linux-doc On 15/08/2023 07:35, Ilkka Koskinen wrote: > ARM Coresight PMU architecture specification [1] defines PMEVTYPER and > PMEVFILT* registers as optional in Chapter 2.1. Moreover, implementers may > choose to use PMIMPDEF* registers (offset: 0xD80-> 0xDFF) to filter the > events. Add support for those by adding implementation specific filter > callback function. > > [1] https://developer.arm.com/documentation/ihi0091/latest > > Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> > Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> > --- > drivers/perf/arm_cspmu/arm_cspmu.c | 12 ++++++++---- > drivers/perf/arm_cspmu/arm_cspmu.h | 3 +++ > 2 files changed, 11 insertions(+), 4 deletions(-) > > diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c > index 6387cbad7a7d..94f6856ec786 100644 > --- a/drivers/perf/arm_cspmu/arm_cspmu.c > +++ b/drivers/perf/arm_cspmu/arm_cspmu.c > @@ -116,6 +116,9 @@ static unsigned long arm_cspmu_cpuhp_state; > > static DEFINE_MUTEX(arm_cspmu_lock); > > +static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu, > + struct hw_perf_event *hwc, u32 filter); > + > static struct acpi_apmt_node *arm_cspmu_apmt_node(struct device *dev) > { > return *(struct acpi_apmt_node **)dev_get_platdata(dev); > @@ -450,6 +453,7 @@ static int arm_cspmu_init_impl_ops(struct arm_cspmu *cspmu) > CHECK_DEFAULT_IMPL_OPS(impl_ops, event_type); > CHECK_DEFAULT_IMPL_OPS(impl_ops, event_filter); 
> CHECK_DEFAULT_IMPL_OPS(impl_ops, event_attr_is_visible); > + CHECK_DEFAULT_IMPL_OPS(impl_ops, set_ev_filter); > > return 0; > } > @@ -811,9 +815,9 @@ static inline void arm_cspmu_set_event(struct arm_cspmu *cspmu, > writel(hwc->config, cspmu->base0 + offset); > } > > -static inline void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu, > - struct hw_perf_event *hwc, > - u32 filter) > +static void arm_cspmu_set_ev_filter(struct arm_cspmu *cspmu, > + struct hw_perf_event *hwc, > + u32 filter) > { > u32 offset = PMEVFILTR + (4 * hwc->idx); > > @@ -845,7 +849,7 @@ static void arm_cspmu_start(struct perf_event *event, int pmu_flags) > arm_cspmu_set_cc_filter(cspmu, filter); > } else { > arm_cspmu_set_event(cspmu, hwc); > - arm_cspmu_set_ev_filter(cspmu, hwc, filter); > + cspmu->impl.ops.set_ev_filter(cspmu, hwc, filter); > } > > hwc->state = 0; > diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h > index e5c6dff2ce7f..274ca3d10578 100644 > --- a/drivers/perf/arm_cspmu/arm_cspmu.h > +++ b/drivers/perf/arm_cspmu/arm_cspmu.h > @@ -104,6 +104,9 @@ struct arm_cspmu_impl_ops { > u32 (*event_type)(const struct perf_event *event); > /* Decode filter value from configs */ > u32 (*event_filter)(const struct perf_event *event); > + /* Set event filter */ > + void (*set_ev_filter)(struct arm_cspmu *cspmu, > + struct hw_perf_event *hwc, u32 filter); > /* Hide/show unsupported events */ > umode_t (*event_attr_is_visible)(struct kobject *kobj, > struct attribute *attr, int unused); ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v6 3/4] perf: arm_cspmu: Support implementation specific validation
From: Ilkka Koskinen @ 2023-08-15 6:35 UTC
To: Will Deacon, Robin Murphy, Besar Wicaksono, Jonathan Cameron, Suzuki K Poulose, Mark Rutland, Jonathan Corbet
Cc: Ilkka Koskinen, linux-arm-kernel, linux-kernel, linux-doc

Some platforms may use, e.g., a different filtering mechanism and thus
may need a different way to validate the events and the group.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 8 +++++++-
 drivers/perf/arm_cspmu/arm_cspmu.h | 3 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index 94f6856ec786..585ce96ac03f 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -572,7 +572,7 @@ static void arm_cspmu_disable(struct pmu *pmu)
 static int arm_cspmu_get_event_idx(struct arm_cspmu_hw_events *hw_events,
 				struct perf_event *event)
 {
-	int idx;
+	int idx, ret;
 	struct arm_cspmu *cspmu = to_arm_cspmu(event->pmu);
 
 	if (supports_cycle_counter(cspmu)) {
@@ -606,6 +606,12 @@ static int arm_cspmu_get_event_idx(struct arm_cspmu_hw_events *hw_events,
 	if (idx >= cspmu->num_logical_ctrs)
 		return -EAGAIN;
 
+	if (cspmu->impl.ops.validate_event) {
+		ret = cspmu->impl.ops.validate_event(cspmu, event);
+		if (ret)
+			return ret;
+	}
+
 	set_bit(idx, hw_events->used_ctrs);
 
 	return idx;
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
index 274ca3d10578..05577f74b8a0 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.h
+++ b/drivers/perf/arm_cspmu/arm_cspmu.h
@@ -107,6 +107,9 @@ struct arm_cspmu_impl_ops {
 	/* Set event filter */
 	void (*set_ev_filter)(struct arm_cspmu *cspmu,
 			      struct hw_perf_event *hwc, u32 filter);
+	/* Implementation specific event validation */
+	int (*validate_event)(struct arm_cspmu *cspmu,
+			      struct perf_event *event);
 	/* Hide/show unsupported events */
 	umode_t (*event_attr_is_visible)(struct kobject *kobj,
 					 struct attribute *attr, int unused);
--
2.41.0
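The ordering in the arm_cspmu_get_event_idx() hunk of patch 3/4 (find a free counter, run the optional vendor check, and only then mark the counter as used) can be illustrated with a small userspace sketch. Everything here is an assumption made for the example: `struct pmu`, `get_event_idx()`, `reject_odd()` and the plain `unsigned int` bitmap are hypothetical stand-ins, not the driver's types.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NUM_CTRS 4

/* Hypothetical PMU with a counter bitmap and an optional validation hook. */
struct pmu {
	unsigned int used;			/* bit n set => counter n is taken */
	int (*validate_event)(int event_cfg);	/* may be NULL */
};

static int get_event_idx(struct pmu *pmu, int event_cfg)
{
	int idx, ret;

	/* Find the first free counter. */
	for (idx = 0; idx < NUM_CTRS; idx++)
		if (!(pmu->used & (1u << idx)))
			break;

	if (idx >= NUM_CTRS)
		return -EAGAIN;

	/* The hook is optional; a failure leaves the bitmap untouched. */
	if (pmu->validate_event) {
		ret = pmu->validate_event(event_cfg);
		if (ret)
			return ret;
	}

	pmu->used |= 1u << idx;
	return idx;
}

/* Example hook for the sketch: reject odd configuration values. */
static int reject_odd(int event_cfg)
{
	return (event_cfg & 1) ? -EINVAL : 0;
}
```

Running the check before the counter is claimed means a rejected event needs no cleanup, which matches the placement of the hook ahead of set_bit() in the hunk.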
* Re: [PATCH v6 3/4] perf: arm_cspmu: Support implementation specific validation 2023-08-15 6:35 ` [PATCH v6 3/4] perf: arm_cspmu: Support implementation specific validation Ilkka Koskinen @ 2023-08-16 14:05 ` Suzuki K Poulose 0 siblings, 0 replies; 10+ messages in thread From: Suzuki K Poulose @ 2023-08-16 14:05 UTC (permalink / raw) To: Ilkka Koskinen, Will Deacon, Robin Murphy, Besar Wicaksono, Jonathan Cameron, Mark Rutland, Jonathan Corbet Cc: linux-arm-kernel, linux-kernel, linux-doc On 15/08/2023 07:35, Ilkka Koskinen wrote: > Some platforms may use e.g. different filtering mechanism and, thus, > may need different way to validate the events and group. > > Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> > --- > drivers/perf/arm_cspmu/arm_cspmu.c | 8 +++++++- > drivers/perf/arm_cspmu/arm_cspmu.h | 3 +++ > 2 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c > index 94f6856ec786..585ce96ac03f 100644 > --- a/drivers/perf/arm_cspmu/arm_cspmu.c > +++ b/drivers/perf/arm_cspmu/arm_cspmu.c > @@ -572,7 +572,7 @@ static void arm_cspmu_disable(struct pmu *pmu) > static int arm_cspmu_get_event_idx(struct arm_cspmu_hw_events *hw_events, > struct perf_event *event) > { > - int idx; > + int idx, ret; > struct arm_cspmu *cspmu = to_arm_cspmu(event->pmu); > > if (supports_cycle_counter(cspmu)) { > @@ -606,6 +606,12 @@ static int arm_cspmu_get_event_idx(struct arm_cspmu_hw_events *hw_events, > if (idx >= cspmu->num_logical_ctrs) > return -EAGAIN; > > + if (cspmu->impl.ops.validate_event) { > + ret = cspmu->impl.ops.validate_event(cspmu, event); > + if (ret) > + return ret; > + } > + > set_bit(idx, hw_events->used_ctrs); > > return idx; > diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h > index 274ca3d10578..05577f74b8a0 100644 > 
--- a/drivers/perf/arm_cspmu/arm_cspmu.h > +++ b/drivers/perf/arm_cspmu/arm_cspmu.h > @@ -107,6 +107,9 @@ struct arm_cspmu_impl_ops { > /* Set event filter */ > void (*set_ev_filter)(struct arm_cspmu *cspmu, > struct hw_perf_event *hwc, u32 filter); > + /* Implementation specific event validation */ > + int (*validate_event)(struct arm_cspmu *cspmu, > + struct perf_event *event); > /* Hide/show unsupported events */ > umode_t (*event_attr_is_visible)(struct kobject *kobj, > struct attribute *attr, int unused); ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v6 4/4] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU 2023-08-15 6:35 [PATCH v6 0/4] perf: arm_cspmu: ampere: Add support for Ampere SoC PMUs Ilkka Koskinen ` (2 preceding siblings ...) 2023-08-15 6:35 ` [PATCH v6 3/4] perf: arm_cspmu: Support implementation specific validation Ilkka Koskinen @ 2023-08-15 6:35 ` Ilkka Koskinen 3 siblings, 0 replies; 10+ messages in thread From: Ilkka Koskinen @ 2023-08-15 6:35 UTC (permalink / raw) To: Will Deacon, Robin Murphy, Besar Wicaksono, Jonathan Cameron, Suzuki K Poulose, Mark Rutland, Jonathan Corbet Cc: Ilkka Koskinen, linux-arm-kernel, linux-kernel, linux-doc Ampere SoC PMU follows CoreSight PMU architecture. It uses implementation specific registers to filter events rather than PMEVFILTnR registers. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> --- .../admin-guide/perf/ampere_cspmu.rst | 29 ++ drivers/perf/arm_cspmu/Kconfig | 10 + drivers/perf/arm_cspmu/Makefile | 2 + drivers/perf/arm_cspmu/ampere_cspmu.c | 271 ++++++++++++++++++ drivers/perf/arm_cspmu/arm_cspmu.c | 8 + drivers/perf/arm_cspmu/arm_cspmu.h | 1 + 6 files changed, 321 insertions(+) create mode 100644 Documentation/admin-guide/perf/ampere_cspmu.rst create mode 100644 drivers/perf/arm_cspmu/ampere_cspmu.c diff --git a/Documentation/admin-guide/perf/ampere_cspmu.rst b/Documentation/admin-guide/perf/ampere_cspmu.rst new file mode 100644 index 000000000000..94f93f5aee6c --- /dev/null +++ b/Documentation/admin-guide/perf/ampere_cspmu.rst @@ -0,0 +1,29 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================================ +Ampere SoC Performance Monitoring Unit (PMU) +============================================ + +Ampere SoC PMU is a generic PMU IP that follows Arm CoreSight PMU architecture. +Therefore, the driver is implemented as a submodule of arm_cspmu driver. At the +first phase it's used for counting MCU events on AmpereOne. 
+ + +MCU PMU events +-------------- + +The PMU driver supports setting filters for "rank", "bank", and "threshold". +Note, that the filters are per PMU instance rather than per event. + + +Example for perf tool use:: + + / # perf list ampere + + ampere_mcu_pmu_0/act_sent/ [Kernel PMU event] + <...> + ampere_mcu_pmu_1/rd_sent/ [Kernel PMU event] + <...> + + / # perf stat -a -e ampere_mcu_pmu_0/act_sent,bank=5,rank=3,threshold=2/,ampere_mcu_pmu_1/rd_sent/ \ + sleep 1 diff --git a/drivers/perf/arm_cspmu/Kconfig b/drivers/perf/arm_cspmu/Kconfig index d5f787d22234..6f4e28fc84a2 100644 --- a/drivers/perf/arm_cspmu/Kconfig +++ b/drivers/perf/arm_cspmu/Kconfig @@ -17,3 +17,13 @@ config NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU help Provides NVIDIA specific attributes for performance monitoring unit (PMU) devices based on ARM CoreSight PMU architecture. + +config AMPERE_CORESIGHT_PMU_ARCH_SYSTEM_PMU + tristate "Ampere Coresight Architecture PMU" + depends on ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU + help + Provides Ampere specific attributes for performance monitoring unit + (PMU) devices based on ARM CoreSight PMU architecture. + + In the first phase, the driver enables support on MCU PMU used in + AmpereOne SoC family. 
diff --git a/drivers/perf/arm_cspmu/Makefile b/drivers/perf/arm_cspmu/Makefile index 0309d2ff264a..220a734efd54 100644 --- a/drivers/perf/arm_cspmu/Makefile +++ b/drivers/perf/arm_cspmu/Makefile @@ -3,6 +3,8 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu_module.o + arm_cspmu_module-y := arm_cspmu.o obj-$(CONFIG_NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += nvidia_cspmu.o +obj-$(CONFIG_AMPERE_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += ampere_cspmu.o diff --git a/drivers/perf/arm_cspmu/ampere_cspmu.c b/drivers/perf/arm_cspmu/ampere_cspmu.c new file mode 100644 index 000000000000..a365f59fbfe7 --- /dev/null +++ b/drivers/perf/arm_cspmu/ampere_cspmu.c @@ -0,0 +1,271 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Ampere SoC PMU (Performance Monitor Unit) + * + * Copyright (c) 2023, Ampere Computing LLC + */ +#include <linux/module.h> +#include <linux/topology.h> + +#include "arm_cspmu.h" + +#define PMAUXR0 0xD80 +#define PMAUXR1 0xD84 +#define PMAUXR2 0xD88 +#define PMAUXR3 0xD8C + +#define to_ampere_cspmu_ctx(cspmu) ((struct ampere_cspmu_ctx *)(cspmu->impl.ctx)) + +struct ampere_cspmu_ctx { + const char *name; + struct attribute **event_attr; + struct attribute **format_attr; +}; + +static DEFINE_IDR(mcu_pmu_idr); + +#define SOC_PMU_EVENT_ATTR_EXTRACTOR(_name, _config, _start, _end) \ + static inline u32 get_##_name(const struct perf_event *event) \ + { \ + return FIELD_GET(GENMASK_ULL(_end, _start), \ + event->attr._config); \ + } \ + +SOC_PMU_EVENT_ATTR_EXTRACTOR(event, config, 0, 8); +SOC_PMU_EVENT_ATTR_EXTRACTOR(threshold, config1, 0, 7); +SOC_PMU_EVENT_ATTR_EXTRACTOR(rank, config1, 8, 23); +SOC_PMU_EVENT_ATTR_EXTRACTOR(bank, config1, 24, 55); + +static struct attribute *ampereone_mcu_pmu_event_attrs[] = { + ARM_CSPMU_EVENT_ATTR(cycle_count, 0x00), + ARM_CSPMU_EVENT_ATTR(act_sent, 0x01), + ARM_CSPMU_EVENT_ATTR(pre_sent, 0x02), + ARM_CSPMU_EVENT_ATTR(rd_sent, 0x03), + ARM_CSPMU_EVENT_ATTR(rda_sent, 0x04), + 
ARM_CSPMU_EVENT_ATTR(wr_sent, 0x05), + ARM_CSPMU_EVENT_ATTR(wra_sent, 0x06), + ARM_CSPMU_EVENT_ATTR(pd_entry_vld, 0x07), + ARM_CSPMU_EVENT_ATTR(sref_entry_vld, 0x08), + ARM_CSPMU_EVENT_ATTR(prea_sent, 0x09), + ARM_CSPMU_EVENT_ATTR(pre_sb_sent, 0x0a), + ARM_CSPMU_EVENT_ATTR(ref_sent, 0x0b), + ARM_CSPMU_EVENT_ATTR(rfm_sent, 0x0c), + ARM_CSPMU_EVENT_ATTR(ref_sb_sent, 0x0d), + ARM_CSPMU_EVENT_ATTR(rfm_sb_sent, 0x0e), + ARM_CSPMU_EVENT_ATTR(rd_rda_sent, 0x0f), + ARM_CSPMU_EVENT_ATTR(wr_wra_sent, 0x10), + ARM_CSPMU_EVENT_ATTR(raw_hazard, 0x11), + ARM_CSPMU_EVENT_ATTR(war_hazard, 0x12), + ARM_CSPMU_EVENT_ATTR(waw_hazard, 0x13), + ARM_CSPMU_EVENT_ATTR(rar_hazard, 0x14), + ARM_CSPMU_EVENT_ATTR(raw_war_waw_hazard, 0x15), + ARM_CSPMU_EVENT_ATTR(hprd_lprd_wr_req_vld, 0x16), + ARM_CSPMU_EVENT_ATTR(lprd_req_vld, 0x17), + ARM_CSPMU_EVENT_ATTR(hprd_req_vld, 0x18), + ARM_CSPMU_EVENT_ATTR(hprd_lprd_req_vld, 0x19), + ARM_CSPMU_EVENT_ATTR(prefetch_tgt, 0x1a), + ARM_CSPMU_EVENT_ATTR(wr_req_vld, 0x1b), + ARM_CSPMU_EVENT_ATTR(partial_wr_req_vld, 0x1c), + ARM_CSPMU_EVENT_ATTR(rd_retry, 0x1d), + ARM_CSPMU_EVENT_ATTR(wr_retry, 0x1e), + ARM_CSPMU_EVENT_ATTR(retry_gnt, 0x1f), + ARM_CSPMU_EVENT_ATTR(rank_change, 0x20), + ARM_CSPMU_EVENT_ATTR(dir_change, 0x21), + ARM_CSPMU_EVENT_ATTR(rank_dir_change, 0x22), + ARM_CSPMU_EVENT_ATTR(rank_active, 0x23), + ARM_CSPMU_EVENT_ATTR(rank_idle, 0x24), + ARM_CSPMU_EVENT_ATTR(rank_pd, 0x25), + ARM_CSPMU_EVENT_ATTR(rank_sref, 0x26), + ARM_CSPMU_EVENT_ATTR(queue_fill_gt_thresh, 0x27), + ARM_CSPMU_EVENT_ATTR(queue_rds_gt_thresh, 0x28), + ARM_CSPMU_EVENT_ATTR(queue_wrs_gt_thresh, 0x29), + ARM_CSPMU_EVENT_ATTR(phy_updt_complt, 0x2a), + ARM_CSPMU_EVENT_ATTR(tz_fail, 0x2b), + ARM_CSPMU_EVENT_ATTR(dram_errc, 0x2c), + ARM_CSPMU_EVENT_ATTR(dram_errd, 0x2d), + ARM_CSPMU_EVENT_ATTR(read_data_return, 0x32), + ARM_CSPMU_EVENT_ATTR(chi_wr_data_delta, 0x33), + ARM_CSPMU_EVENT_ATTR(zq_start, 0x34), + ARM_CSPMU_EVENT_ATTR(zq_latch, 0x35), + ARM_CSPMU_EVENT_ATTR(wr_fifo_full, 
0x36),
+	ARM_CSPMU_EVENT_ATTR(info_fifo_full, 0x37),
+	ARM_CSPMU_EVENT_ATTR(cmd_fifo_full, 0x38),
+	ARM_CSPMU_EVENT_ATTR(dfi_nop, 0x39),
+	ARM_CSPMU_EVENT_ATTR(dfi_cmd, 0x3a),
+	ARM_CSPMU_EVENT_ATTR(rd_run_len, 0x3b),
+	ARM_CSPMU_EVENT_ATTR(wr_run_len, 0x3c),
+
+	ARM_CSPMU_EVENT_ATTR(cycles, ARM_CSPMU_EVT_CYCLES_DEFAULT),
+	NULL,
+};
+
+static struct attribute *ampereone_mcu_format_attrs[] = {
+	ARM_CSPMU_FORMAT_EVENT_ATTR,
+	ARM_CSPMU_FORMAT_ATTR(threshold, "config1:0-7"),
+	ARM_CSPMU_FORMAT_ATTR(rank, "config1:8-23"),
+	ARM_CSPMU_FORMAT_ATTR(bank, "config1:24-55"),
+	NULL,
+};
+
+static struct attribute **
+ampere_cspmu_get_event_attrs(const struct arm_cspmu *cspmu)
+{
+	const struct ampere_cspmu_ctx *ctx = to_ampere_cspmu_ctx(cspmu);
+
+	return ctx->event_attr;
+}
+
+static struct attribute **
+ampere_cspmu_get_format_attrs(const struct arm_cspmu *cspmu)
+{
+	const struct ampere_cspmu_ctx *ctx = to_ampere_cspmu_ctx(cspmu);
+
+	return ctx->format_attr;
+}
+
+static const char *
+ampere_cspmu_get_name(const struct arm_cspmu *cspmu)
+{
+	const struct ampere_cspmu_ctx *ctx = to_ampere_cspmu_ctx(cspmu);
+
+	return ctx->name;
+}
+
+static u32 ampere_cspmu_event_filter(const struct perf_event *event)
+{
+	/*
+	 * PMEVFILTR or PMCCFILTR aren't used in Ampere SoC PMU but are marked
+	 * as RES0. Make sure, PMCCFILTR is written zero.
+	 */
+	return 0;
+}
+
+static void ampere_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
+				       struct hw_perf_event *hwc,
+				       u32 filter)
+{
+	struct perf_event *event;
+	unsigned int idx;
+	u32 threshold, rank, bank;
+
+	/*
+	 * At this point, all the events have the same filter settings.
+	 * Therefore, take the first event and use its configuration.
+	 */
+	idx = find_first_bit(cspmu->hw_events.used_ctrs,
+			     cspmu->cycle_counter_logical_idx);
+
+	event = cspmu->hw_events.events[idx];
+
+	threshold = get_threshold(event);
+	rank = get_rank(event);
+	bank = get_bank(event);
+
+	writel(threshold, cspmu->base0 + PMAUXR0);
+	writel(rank, cspmu->base0 + PMAUXR1);
+	writel(bank, cspmu->base0 + PMAUXR2);
+}
+
+static int ampere_cspmu_validate_configs(struct perf_event *event,
+					 struct perf_event *event2)
+{
+	if (get_threshold(event) != get_threshold(event2) ||
+	    get_rank(event) != get_rank(event2) ||
+	    get_bank(event) != get_bank(event2))
+		return -EINVAL;
+
+	return 0;
+}
+
+static int ampere_cspmu_validate_event(struct arm_cspmu *cspmu,
+				       struct perf_event *new)
+{
+	struct perf_event *curr, *leader = new->group_leader;
+	unsigned int idx;
+	int ret;
+
+	ret = ampere_cspmu_validate_configs(new, leader);
+	if (ret)
+		return ret;
+
+	/* We compare the global filter settings to the existing events */
+	idx = find_first_bit(cspmu->hw_events.used_ctrs,
+			     cspmu->cycle_counter_logical_idx);
+
+	/* This is the first event, thus any configuration is fine */
+	if (idx == cspmu->cycle_counter_logical_idx)
+		return 0;
+
+	curr = cspmu->hw_events.events[idx];
+
+	return ampere_cspmu_validate_configs(curr, new);
+}
+
+static char *ampere_cspmu_format_name(const struct arm_cspmu *cspmu,
+				      const char *name_pattern)
+{
+	struct device *dev = cspmu->dev;
+	int id;
+
+	id = idr_alloc(&mcu_pmu_idr, NULL, 0, 0, GFP_KERNEL);
+	if (id < 0)
+		return ERR_PTR(id);
+
+	return devm_kasprintf(dev, GFP_KERNEL, name_pattern, id);
+}
+
+static int ampere_cspmu_init_ops(struct arm_cspmu *cspmu)
+{
+	struct device *dev = cspmu->dev;
+	struct ampere_cspmu_ctx *ctx;
+	struct arm_cspmu_impl_ops *impl_ops = &cspmu->impl.ops;
+
+	ctx = devm_kzalloc(dev, sizeof(struct ampere_cspmu_ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->event_attr = ampereone_mcu_pmu_event_attrs;
+	ctx->format_attr = ampereone_mcu_format_attrs;
+	ctx->name = ampere_cspmu_format_name(cspmu, "ampere_mcu_pmu_%d");
+	if (IS_ERR_OR_NULL(ctx->name))
+		return ctx->name ? PTR_ERR(ctx->name) : -ENOMEM;
+
+	cspmu->impl.ctx = ctx;
+
+	impl_ops->event_filter = ampere_cspmu_event_filter;
+	impl_ops->set_ev_filter = ampere_cspmu_set_ev_filter;
+	impl_ops->validate_event = ampere_cspmu_validate_event;
+	impl_ops->get_name = ampere_cspmu_get_name;
+	impl_ops->get_event_attrs = ampere_cspmu_get_event_attrs;
+	impl_ops->get_format_attrs = ampere_cspmu_get_format_attrs;
+
+	return 0;
+}
+
+/* Match all Ampere Coresight PMU devices */
+static const struct arm_cspmu_impl_match ampere_cspmu_param = {
+	.pmiidr_val = ARM_CSPMU_IMPL_ID_AMPERE,
+	.module = THIS_MODULE,
+	.impl_init_ops = ampere_cspmu_init_ops
+};
+
+static int __init ampere_cspmu_init(void)
+{
+	int ret;
+
+	ret = arm_cspmu_impl_register(&ampere_cspmu_param);
+	if (ret)
+		pr_err("ampere_cspmu backend registration error: %d\n", ret);
+
+	return ret;
+}
+
+static void __exit ampere_cspmu_exit(void)
+{
+	arm_cspmu_impl_unregister(&ampere_cspmu_param);
+}
+
+module_init(ampere_cspmu_init);
+module_exit(ampere_cspmu_exit);
+
+MODULE_LICENSE("GPL");
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index 585ce96ac03f..d6c50e06c5bc 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -383,6 +383,14 @@ static struct arm_cspmu_impl_match impl_match[] = {
 		.module = NULL,
 		.impl_init_ops = NULL,
 	},
+	{
+		.module_name = "ampere_cspmu",
+		.pmiidr_val = ARM_CSPMU_IMPL_ID_AMPERE,
+		.pmiidr_mask = ARM_CSPMU_PMIIDR_IMPLEMENTER,
+		.module = NULL,
+		.impl_init_ops = NULL,
+	},
+
 	{0}
 };
 
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
index 05577f74b8a0..586441d4dba1 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.h
+++ b/drivers/perf/arm_cspmu/arm_cspmu.h
@@ -71,6 +71,7 @@
 
 /* JEDEC-assigned JEP106 identification code */
 #define ARM_CSPMU_IMPL_ID_NVIDIA 0x36B
+#define ARM_CSPMU_IMPL_ID_AMPERE 0xA16
 
 struct arm_cspmu;
 
-- 
2.41.0
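
For readers following the filter handling in patch 4/4, the rule it enforces (every active event must carry the same threshold, rank and bank, because the hardware has one shared set of PMAUXR registers) can be modelled in user space roughly as below. This is only a sketch under assumptions: everything prefixed demo_ is hypothetical and not part of the patch, and the bit ranges are read off the format attributes above (config1:0-7, config1:8-23, config1:24-55). The driver's own get_threshold(), get_rank() and get_bank() helpers are defined earlier in the patch and may be implemented differently.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical user-space model, not kernel code. Bit ranges follow the
 * format attributes in the patch: threshold = config1[7:0],
 * rank = config1[23:8], bank = config1[55:24].
 */
struct demo_event {
	uint64_t config1;
};

/* Extract bits [hi:lo] of v; every caller here uses a width below 64. */
static inline uint64_t demo_field(uint64_t v, unsigned int hi, unsigned int lo)
{
	return (v >> lo) & ((UINT64_C(1) << (hi - lo + 1)) - 1);
}

static inline uint32_t demo_threshold(const struct demo_event *e)
{
	return (uint32_t)demo_field(e->config1, 7, 0);
}

static inline uint32_t demo_rank(const struct demo_event *e)
{
	return (uint32_t)demo_field(e->config1, 23, 8);
}

static inline uint32_t demo_bank(const struct demo_event *e)
{
	return (uint32_t)demo_field(e->config1, 55, 24);
}

/* Pack the three fields into a config1 value (inverse of the getters). */
static inline uint64_t demo_pack(uint32_t threshold, uint32_t rank,
				 uint32_t bank)
{
	return (uint64_t)(threshold & 0xff) |
	       ((uint64_t)(rank & 0xffff) << 8) |
	       ((uint64_t)bank << 24);
}

/* Return 0 when both events request identical filters, else -EINVAL. */
static inline int demo_validate_configs(const struct demo_event *a,
					const struct demo_event *b)
{
	if (demo_threshold(a) != demo_threshold(b) ||
	    demo_rank(a) != demo_rank(b) ||
	    demo_bank(a) != demo_bank(b))
		return -EINVAL;

	return 0;
}
```

In the patch the same comparison is applied twice in ampere_cspmu_validate_event(): once between the new event and its group leader, and once between the new event and the first event already occupying a counter. That is what lets ampere_cspmu_set_ev_filter() program the registers from whichever event it finds first.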
end of thread, other threads:[~2023-08-16 14:06 UTC | newest]

Thread overview: 10+ messages
2023-08-15  6:35 [PATCH v6 0/4] perf: arm_cspmu: ampere: Add support for Ampere SoC PMUs Ilkka Koskinen
2023-08-15  6:35 ` [PATCH v6 1/4] perf: arm_cspmu: Split 64-bit write to 32-bit writes Ilkka Koskinen
2023-08-15 10:24 ` Suzuki K Poulose
2023-08-15 20:46 ` Ilkka Koskinen
2023-08-16 14:00 ` Suzuki K Poulose
2023-08-15  6:35 ` [PATCH v6 2/4] perf: arm_cspmu: Support implementation specific filters Ilkka Koskinen
2023-08-16 14:03 ` Suzuki K Poulose
2023-08-15  6:35 ` [PATCH v6 3/4] perf: arm_cspmu: Support implementation specific validation Ilkka Koskinen
2023-08-16 14:05 ` Suzuki K Poulose
2023-08-15  6:35 ` [PATCH v6 4/4] perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU Ilkka Koskinen