linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/3] arm64: perf: Add support for event counting threshold
@ 2023-11-13 11:25 James Clark
  2023-11-13 11:25 ` [PATCH v5 1/3] arm64: perf: Include threshold control fields in PMEVTYPER mask James Clark
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: James Clark @ 2023-11-13 11:25 UTC (permalink / raw)
  To: linux-arm-kernel, linux-perf-users, suzuki.poulose, will,
	mark.rutland
  Cc: James Clark, Catalin Marinas, Jonathan Corbet, linux-doc,
	linux-kernel

Changes since v4:

  * Rebase onto v6.7-rc1, it no longer depends on kvmarm/next
  * Remove change that moved ARMV8_PMU_EVTYPE_MASK to the asm files.
    This actually depended on those files being included in a certain
    order with arm_pmuv3.h to avoid circular includes. Now the
    definition is done programmatically in arm_pmuv3.c instead.

Changes since v3:

  * Drop #include changes to KVM source files because since
    commit bc512d6a9b92 ("KVM: arm64: Make PMEVTYPER<n>_EL0.NSH RES0 if
    EL2 isn't advertised"), KVM doesn't use ARMV8_PMU_EVTYPE_MASK
    anymore

Changes since v2:

  * Split threshold_control attribute into two, threshold_compare and
    threshold_count so that it's easier to use
  * Add some notes to the first commit message and the cover letter
    about the behavior in KVM
  * Update the docs commit with regards to the split attribute
 
Changes since v1:

  * Fix build on aarch32 by disabling FEAT_PMUv3_TH and splitting event
    type mask between the platforms
  * Change armv8pmu_write_evtype() to take unsigned long instead of u64
    so it isn't unnecessarily wide on aarch32
  * Add UL suffix to aarch64 event type mask definition

----

FEAT_PMUv3_TH (Armv8.8) is a new feature that allows conditional
counting of PMU events depending on how much the event increments on
a single cycle. Two new config fields for perf_event_open have been
added, and a PMU cap file for reading the max_threshold. See the second
commit message and the docs in the last commit for more details.

The feature is not currently supported on KVM guests, and PMMIR is set
to read as zero, so it's not advertised as available. But it can be
added at a later time. Writes to PMEVTYPER.TC and TH from guests are
already RES0.

The change has been validated on the Arm FVP model:

  # Zero values, works as expected (as before).
  $ perf stat -e dtlb_walk/threshold=0,threshold_compare=0/ -- true

    5962      dtlb_walk/threshold=0,threshold_compare=0/

  # Threshold >= 255 causes count to be 0 because dtlb_walk doesn't
  # increase by more than 1 per cycle.
  $ perf stat -e dtlb_walk/threshold=255,threshold_compare=2/ -- true

    0      dtlb_walk/threshold=255,threshold_compare=2/
  
  # Keeping comparison as >= but lowering the threshold to 1 makes the
  # count return.
  $ perf stat -e dtlb_walk/threshold=1,threshold_compare=2/ -- true

    6329      dtlb_walk/threshold=1,threshold_compare=2/

James Clark (3):
  arm64: perf: Include threshold control fields in PMEVTYPER mask
  arm64: perf: Add support for event counting threshold
  Documentation: arm64: Document the PMU event counting threshold
    feature

 Documentation/arch/arm64/perf.rst | 56 ++++++++++++++++++++
 drivers/perf/arm_pmuv3.c          | 88 ++++++++++++++++++++++++++++++-
 include/linux/perf/arm_pmuv3.h    |  4 +-
 3 files changed, 145 insertions(+), 3 deletions(-)


base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
-- 
2.34.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v5 1/3] arm64: perf: Include threshold control fields in PMEVTYPER mask
  2023-11-13 11:25 [PATCH v5 0/3] arm64: perf: Add support for event counting threshold James Clark
@ 2023-11-13 11:25 ` James Clark
  2023-11-21 10:15   ` Suzuki K Poulose
  2023-11-22  8:44   ` Anshuman Khandual
  2023-11-13 11:25 ` [PATCH v5 2/3] arm64: perf: Add support for event counting threshold James Clark
  2023-11-13 11:25 ` [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature James Clark
  2 siblings, 2 replies; 15+ messages in thread
From: James Clark @ 2023-11-13 11:25 UTC (permalink / raw)
  To: linux-arm-kernel, linux-perf-users, suzuki.poulose, will,
	mark.rutland
  Cc: James Clark, Catalin Marinas, Jonathan Corbet, linux-doc,
	linux-kernel

FEAT_PMUv3_TH (Armv8.8) adds two new fields to PMEVTYPER, so include
them in the mask. These aren't writable on 32 bit kernels as they are in
the high part of the register, so only include them for arm64.

It would be difficult to do this statically in the asm header files for
each platform without resulting in circular includes or #ifdefs inline
in the code. For that reason the ARMV8_PMU_EVTYPE_MASK definition has
been removed and the mask is constructed programmatically.

Signed-off-by: James Clark <james.clark@arm.com>
---
 drivers/perf/arm_pmuv3.c       | 9 ++++++++-
 include/linux/perf/arm_pmuv3.h | 3 ++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index 6ca7be05229c..1d40d794f5e4 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -555,8 +555,15 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
 static inline void armv8pmu_write_evtype(int idx, u32 val)
 {
 	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
+	unsigned long mask = ARMV8_PMU_EVTYPE_EVENT |
+			     ARMV8_PMU_INCLUDE_EL2 |
+			     ARMV8_PMU_EXCLUDE_EL0 |
+			     ARMV8_PMU_EXCLUDE_EL1;
 
-	val &= ARMV8_PMU_EVTYPE_MASK;
+	if (IS_ENABLED(CONFIG_ARM64))
+		mask |= ARMV8_PMU_EVTYPE_TC | ARMV8_PMU_EVTYPE_TH;
+
+	val &= mask;
 	write_pmevtypern(counter, val);
 }
 
diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
index 9c226adf938a..ddd1fec86739 100644
--- a/include/linux/perf/arm_pmuv3.h
+++ b/include/linux/perf/arm_pmuv3.h
@@ -228,7 +228,8 @@
 /*
  * PMXEVTYPER: Event selection reg
  */
-#define ARMV8_PMU_EVTYPE_MASK	0xc800ffff	/* Mask for writable bits */
+#define ARMV8_PMU_EVTYPE_TH	GENMASK(43, 32)
+#define ARMV8_PMU_EVTYPE_TC	GENMASK(63, 61)
 #define ARMV8_PMU_EVTYPE_EVENT	0xffff		/* Mask for EVENT bits */
 
 /*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 2/3] arm64: perf: Add support for event counting threshold
  2023-11-13 11:25 [PATCH v5 0/3] arm64: perf: Add support for event counting threshold James Clark
  2023-11-13 11:25 ` [PATCH v5 1/3] arm64: perf: Include threshold control fields in PMEVTYPER mask James Clark
@ 2023-11-13 11:25 ` James Clark
  2023-11-21 10:20   ` Suzuki K Poulose
  2023-11-23  3:42   ` Anshuman Khandual
  2023-11-13 11:25 ` [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature James Clark
  2 siblings, 2 replies; 15+ messages in thread
From: James Clark @ 2023-11-13 11:25 UTC (permalink / raw)
  To: linux-arm-kernel, linux-perf-users, suzuki.poulose, will,
	mark.rutland
  Cc: James Clark, Catalin Marinas, Jonathan Corbet, linux-doc,
	linux-kernel

FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
events whose count meets a specified threshold condition. For example if
PMEVTYPERn.TC (Threshold Control) is set to 0b101 (Greater than or
equal, count), and the threshold is set to 2, then the PMU counter will
now only increment by 1 when an event would have previously incremented
the PMU counter by 2 or more on a single processor cycle.

Three new Perf event config fields, 'threshold', 'threshold_compare' and
'threshold_count' have been added to control the feature.
threshold_compare maps to the upper two bits of PMEVTYPERn.TC and
threshold_count maps to the first bit of TC. These separate attributes
have been picked rather than enumerating all the possible combinations
of the TC field as in the Arm ARM. The attributes would be used on a
Perf command line like this:

  $ perf stat -e stall_slot/threshold=2,threshold_compare=2/

A new capability for reading out the maximum supported threshold value
has also been added:

  $ cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max

  0x000000ff

If a threshold higher than threshold_max is provided, then no error is
generated but the threshold is clamped to the max value. If
FEAT_PMUv3_TH isn't implemented or a 32 bit kernel is running, then
threshold_max reads zero, and neither the 'threshold' nor
'threshold_control' parameters will be used.

The threshold is per PMU counter, and there are potentially different
threshold_max values per PMU type on heterogeneous systems.

Bits higher than 32 now need to be written into PMEVTYPER, so
armv8pmu_write_evtype() has to be updated to take an unsigned long value
rather than u32 which gives the correct behavior on both aarch32 and 64.

Signed-off-by: James Clark <james.clark@arm.com>
---
 drivers/perf/arm_pmuv3.c       | 79 +++++++++++++++++++++++++++++++++-
 include/linux/perf/arm_pmuv3.h |  1 +
 2 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index 1d40d794f5e4..694d914ffc08 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -15,6 +15,7 @@
 #include <clocksource/arm_arch_timer.h>
 
 #include <linux/acpi.h>
+#include <linux/bitfield.h>
 #include <linux/clocksource.h>
 #include <linux/of.h>
 #include <linux/perf/arm_pmu.h>
@@ -294,9 +295,18 @@ static const struct attribute_group armv8_pmuv3_events_attr_group = {
 	.is_visible = armv8pmu_event_attr_is_visible,
 };
 
+#define TH_LO	2
+#define TH_HI	13
+#define TH_CNT	14
+#define TH_CMP_LO	15
+#define TH_CMP_HI	16
+
 PMU_FORMAT_ATTR(event, "config:0-15");
 PMU_FORMAT_ATTR(long, "config1:0");
 PMU_FORMAT_ATTR(rdpmc, "config1:1");
+PMU_FORMAT_ATTR(threshold, "config1:" __stringify(TH_LO) "-" __stringify(TH_HI));
+PMU_FORMAT_ATTR(threshold_compare, "config1:" __stringify(TH_CMP_LO) "-" __stringify(TH_CMP_HI));
+PMU_FORMAT_ATTR(threshold_count, "config1:" __stringify(TH_CNT));
 
 static int sysctl_perf_user_access __read_mostly;
 
@@ -310,10 +320,32 @@ static inline bool armv8pmu_event_want_user_access(struct perf_event *event)
 	return event->attr.config1 & 0x2;
 }
 
+static inline u32 armv8pmu_event_threshold(struct perf_event_attr *attr)
+{
+	return FIELD_GET(GENMASK(TH_HI, TH_LO), attr->config1);
+}
+
+static inline u8 armv8pmu_event_threshold_control(struct perf_event_attr *attr)
+{
+	u8 th_compare = FIELD_GET(GENMASK(TH_CMP_HI, TH_CMP_LO), attr->config1);
+	u8 th_count = FIELD_GET(BIT(TH_CNT), attr->config1);
+
+	/*
+	 * The count bit is always the bottom bit of the full control field, and
+	 * the comparison is the upper two bits, but it's not explicitly
+	 * labelled in the Arm ARM. For the Perf interface we split it into two
+	 * fields, so reconstruct it here.
+	 */
+	return (th_compare << 1) | th_count;
+}
+
 static struct attribute *armv8_pmuv3_format_attrs[] = {
 	&format_attr_event.attr,
 	&format_attr_long.attr,
 	&format_attr_rdpmc.attr,
+	&format_attr_threshold.attr,
+	&format_attr_threshold_compare.attr,
+	&format_attr_threshold_count.attr,
 	NULL,
 };
 
@@ -365,10 +397,38 @@ static ssize_t bus_width_show(struct device *dev, struct device_attribute *attr,
 
 static DEVICE_ATTR_RO(bus_width);
 
+static u32 threshold_max(struct arm_pmu *cpu_pmu)
+{
+	/*
+	 * PMMIR.WIDTH is readable and non-zero on aarch32, but it would be
+	 * impossible to write the threshold in the upper 32 bits of PMEVTYPER.
+	 */
+	if (IS_ENABLED(CONFIG_ARM))
+		return 0;
+
+	/*
+	 * The largest value that can be written to PMEVTYPER<n>_EL0.TH is
+	 * (2 ^ PMMIR.THWIDTH) - 1.
+	 */
+	return (1 << FIELD_GET(ARMV8_PMU_THWIDTH, cpu_pmu->reg_pmmir)) - 1;
+}
+
+static ssize_t threshold_max_show(struct device *dev,
+				  struct device_attribute *attr, char *page)
+{
+	struct pmu *pmu = dev_get_drvdata(dev);
+	struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
+
+	return sysfs_emit(page, "0x%08x\n", threshold_max(cpu_pmu));
+}
+
+static DEVICE_ATTR_RO(threshold_max);
+
 static struct attribute *armv8_pmuv3_caps_attrs[] = {
 	&dev_attr_slots.attr,
 	&dev_attr_bus_slots.attr,
 	&dev_attr_bus_width.attr,
+	&dev_attr_threshold_max.attr,
 	NULL,
 };
 
@@ -552,7 +612,7 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
 		armv8pmu_write_hw_counter(event, value);
 }
 
-static inline void armv8pmu_write_evtype(int idx, u32 val)
+static inline void armv8pmu_write_evtype(int idx, unsigned long val)
 {
 	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
 	unsigned long mask = ARMV8_PMU_EVTYPE_EVENT |
@@ -921,6 +981,10 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
 				     struct perf_event_attr *attr)
 {
 	unsigned long config_base = 0;
+	struct perf_event *perf_event = container_of(attr, struct perf_event,
+						     attr);
+	struct arm_pmu *cpu_pmu = to_arm_pmu(perf_event->pmu);
+	u32 th, th_max;
 
 	if (attr->exclude_idle)
 		return -EPERM;
@@ -952,6 +1016,19 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
 	if (attr->exclude_user)
 		config_base |= ARMV8_PMU_EXCLUDE_EL0;
 
+	/*
+	 * Insert event counting threshold (FEAT_PMUv3_TH) values. If
+	 * FEAT_PMUv3_TH isn't implemented, then THWIDTH (threshold_max) will be
+	 * 0 and no values will be written.
+	 */
+	th_max = threshold_max(cpu_pmu);
+	if (IS_ENABLED(CONFIG_ARM64) && th_max) {
+		th = min(armv8pmu_event_threshold(attr), th_max);
+		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TH, th);
+		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TC,
+					  armv8pmu_event_threshold_control(attr));
+	}
+
 	/*
 	 * Install the filter into config_base as this is used to
 	 * construct the event type.
diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
index ddd1fec86739..ccbc0f9a74d8 100644
--- a/include/linux/perf/arm_pmuv3.h
+++ b/include/linux/perf/arm_pmuv3.h
@@ -258,6 +258,7 @@
 #define ARMV8_PMU_BUS_SLOTS_MASK 0xff
 #define ARMV8_PMU_BUS_WIDTH_SHIFT 16
 #define ARMV8_PMU_BUS_WIDTH_MASK 0xf
+#define ARMV8_PMU_THWIDTH GENMASK(23, 20)
 
 /*
  * This code is really good
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature
  2023-11-13 11:25 [PATCH v5 0/3] arm64: perf: Add support for event counting threshold James Clark
  2023-11-13 11:25 ` [PATCH v5 1/3] arm64: perf: Include threshold control fields in PMEVTYPER mask James Clark
  2023-11-13 11:25 ` [PATCH v5 2/3] arm64: perf: Add support for event counting threshold James Clark
@ 2023-11-13 11:25 ` James Clark
  2023-11-20 21:31   ` Namhyung Kim
  2 siblings, 1 reply; 15+ messages in thread
From: James Clark @ 2023-11-13 11:25 UTC (permalink / raw)
  To: linux-arm-kernel, linux-perf-users, suzuki.poulose, will,
	mark.rutland
  Cc: James Clark, Catalin Marinas, Jonathan Corbet, linux-doc,
	linux-kernel

Add documentation for the new Perf event open parameters and
the threshold_max capability file.

Signed-off-by: James Clark <james.clark@arm.com>
---
 Documentation/arch/arm64/perf.rst | 56 +++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/Documentation/arch/arm64/perf.rst b/Documentation/arch/arm64/perf.rst
index 1f87b57c2332..36b8111a710d 100644
--- a/Documentation/arch/arm64/perf.rst
+++ b/Documentation/arch/arm64/perf.rst
@@ -164,3 +164,59 @@ and should be used to mask the upper bits as needed.
    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
 .. _tools/lib/perf/tests/test-evsel.c:
    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
+
+Event Counting Threshold
+==========================================
+
+Overview
+--------
+
+FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
+events whose count meets a specified threshold condition. For example if
+threshold_compare is set to 2 ('Greater than or equal'), and the
+threshold is set to 2, then the PMU counter will now only increment by
+when an event would have previously incremented the PMU counter by 2 or
+more on a single processor cycle.
+
+To increment by 1 after passing the threshold condition instead of the
+number of events on that cycle, add the 'threshold_count' option to the
+commandline.
+
+How-to
+------
+
+The threshold, threshold_compare and threshold_count values can be
+provided per event:
+
+.. code-block:: sh
+
+  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
+            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
+
+And the following comparison values are supported:
+
+.. code-block::
+
+  0: Not-equal
+  1: Equals
+  2: Greater-than-or-equal
+  3: Less-than
+
+The maximum supported threshold value can be read from the caps of each
+PMU, for example:
+
+.. code-block:: sh
+
+  cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
+
+  0x000000ff
+
+If a value higher than this is given, then it will be silently clamped
+to the maximum. The highest possible maximum is 4095, as the config
+field for threshold is limited to 12 bits, and the Perf tool will refuse
+to parse higher values.
+
+If the PMU doesn't support FEAT_PMUv3_TH, then threshold_max will read
+0, and both threshold and threshold_compare will be silently ignored.
+threshold_max will also read as 0 on aarch32 guests, even if the host
+is running on hardware with the feature.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature
  2023-11-13 11:25 ` [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature James Clark
@ 2023-11-20 21:31   ` Namhyung Kim
  2023-11-21 10:33     ` Suzuki K Poulose
  2023-11-23  5:50     ` Anshuman Khandual
  0 siblings, 2 replies; 15+ messages in thread
From: Namhyung Kim @ 2023-11-20 21:31 UTC (permalink / raw)
  To: James Clark
  Cc: linux-arm-kernel, linux-perf-users, suzuki.poulose, will,
	mark.rutland, Catalin Marinas, Jonathan Corbet, linux-doc,
	linux-kernel

On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>
> Add documentation for the new Perf event open parameters and
> the threshold_max capability file.
>
> Signed-off-by: James Clark <james.clark@arm.com>
> ---
>  Documentation/arch/arm64/perf.rst | 56 +++++++++++++++++++++++++++++++
>  1 file changed, 56 insertions(+)
>
> diff --git a/Documentation/arch/arm64/perf.rst b/Documentation/arch/arm64/perf.rst
> index 1f87b57c2332..36b8111a710d 100644
> --- a/Documentation/arch/arm64/perf.rst
> +++ b/Documentation/arch/arm64/perf.rst
> @@ -164,3 +164,59 @@ and should be used to mask the upper bits as needed.
>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
>  .. _tools/lib/perf/tests/test-evsel.c:
>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
> +
> +Event Counting Threshold
> +==========================================
> +
> +Overview
> +--------
> +
> +FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
> +events whose count meets a specified threshold condition. For example if
> +threshold_compare is set to 2 ('Greater than or equal'), and the
> +threshold is set to 2, then the PMU counter will now only increment by
> +when an event would have previously incremented the PMU counter by 2 or
> +more on a single processor cycle.
> +
> +To increment by 1 after passing the threshold condition instead of the
> +number of events on that cycle, add the 'threshold_count' option to the
> +commandline.
> +
> +How-to
> +------
> +
> +The threshold, threshold_compare and threshold_count values can be
> +provided per event:
> +
> +.. code-block:: sh
> +
> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/

Can you please explain this a bit more?

I guess the first event counts stall_slot PMU if the event if it's
greater than or equal to 2.  And as threshold_count is not set,
it'd count the stall_slot as is.  E.g. it counts 3 when it sees 3.

OTOH, dtlb_walk will count 1 if it sees an event less than 10.
Is my understanding correct?

> +
> +And the following comparison values are supported:
> +
> +.. code-block::
> +
> +  0: Not-equal
> +  1: Equals
> +  2: Greater-than-or-equal
> +  3: Less-than

So the above values are for threashold_compare, right?
It'd be nice if it's more explicit.

Similarly, it'd be helpful to have a description for the
threshold and threshold_count fields.

Thanks,
Namhyung

> +
> +The maximum supported threshold value can be read from the caps of each
> +PMU, for example:
> +
> +.. code-block:: sh
> +
> +  cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
> +
> +  0x000000ff
> +
> +If a value higher than this is given, then it will be silently clamped
> +to the maximum. The highest possible maximum is 4095, as the config
> +field for threshold is limited to 12 bits, and the Perf tool will refuse
> +to parse higher values.
> +
> +If the PMU doesn't support FEAT_PMUv3_TH, then threshold_max will read
> +0, and both threshold and threshold_compare will be silently ignored.
> +threshold_max will also read as 0 on aarch32 guests, even if the host
> +is running on hardware with the feature.
> --
> 2.34.1
>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 1/3] arm64: perf: Include threshold control fields in PMEVTYPER mask
  2023-11-13 11:25 ` [PATCH v5 1/3] arm64: perf: Include threshold control fields in PMEVTYPER mask James Clark
@ 2023-11-21 10:15   ` Suzuki K Poulose
  2023-11-22  8:44   ` Anshuman Khandual
  1 sibling, 0 replies; 15+ messages in thread
From: Suzuki K Poulose @ 2023-11-21 10:15 UTC (permalink / raw)
  To: James Clark, linux-arm-kernel, linux-perf-users, will,
	mark.rutland
  Cc: Catalin Marinas, Jonathan Corbet, linux-doc, linux-kernel

On 13/11/2023 11:25, James Clark wrote:
> FEAT_PMUv3_TH (Armv8.8) adds two new fields to PMEVTYPER, so include
> them in the mask. These aren't writable on 32 bit kernels as they are in
> the high part of the register, so only include them for arm64.
> 
> It would be difficult to do this statically in the asm header files for
> each platform without resulting in circular includes or #ifdefs inline
> in the code. For that reason the ARMV8_PMU_EVTYPE_MASK definition has
> been removed and the mask is constructed programmatically.
> 
> Signed-off-by: James Clark <james.clark@arm.com>

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>


> ---
>   drivers/perf/arm_pmuv3.c       | 9 ++++++++-
>   include/linux/perf/arm_pmuv3.h | 3 ++-
>   2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> index 6ca7be05229c..1d40d794f5e4 100644
> --- a/drivers/perf/arm_pmuv3.c
> +++ b/drivers/perf/arm_pmuv3.c
> @@ -555,8 +555,15 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
>   static inline void armv8pmu_write_evtype(int idx, u32 val)
>   {
>   	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
> +	unsigned long mask = ARMV8_PMU_EVTYPE_EVENT |
> +			     ARMV8_PMU_INCLUDE_EL2 |
> +			     ARMV8_PMU_EXCLUDE_EL0 |
> +			     ARMV8_PMU_EXCLUDE_EL1;
>   
> -	val &= ARMV8_PMU_EVTYPE_MASK;
> +	if (IS_ENABLED(CONFIG_ARM64))
> +		mask |= ARMV8_PMU_EVTYPE_TC | ARMV8_PMU_EVTYPE_TH;
> +
> +	val &= mask;
>   	write_pmevtypern(counter, val);
>   }
>   
> diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
> index 9c226adf938a..ddd1fec86739 100644
> --- a/include/linux/perf/arm_pmuv3.h
> +++ b/include/linux/perf/arm_pmuv3.h
> @@ -228,7 +228,8 @@
>   /*
>    * PMXEVTYPER: Event selection reg
>    */
> -#define ARMV8_PMU_EVTYPE_MASK	0xc800ffff	/* Mask for writable bits */
> +#define ARMV8_PMU_EVTYPE_TH	GENMASK(43, 32)
> +#define ARMV8_PMU_EVTYPE_TC	GENMASK(63, 61)
>   #define ARMV8_PMU_EVTYPE_EVENT	0xffff		/* Mask for EVENT bits */
>   
>   /*


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 2/3] arm64: perf: Add support for event counting threshold
  2023-11-13 11:25 ` [PATCH v5 2/3] arm64: perf: Add support for event counting threshold James Clark
@ 2023-11-21 10:20   ` Suzuki K Poulose
  2023-11-23  3:42   ` Anshuman Khandual
  1 sibling, 0 replies; 15+ messages in thread
From: Suzuki K Poulose @ 2023-11-21 10:20 UTC (permalink / raw)
  To: James Clark, linux-arm-kernel, linux-perf-users, will,
	mark.rutland
  Cc: Catalin Marinas, Jonathan Corbet, linux-doc, linux-kernel

On 13/11/2023 11:25, James Clark wrote:
> FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
> events whose count meets a specified threshold condition. For example if
> PMEVTYPERn.TC (Threshold Control) is set to 0b101 (Greater than or
> equal, count), and the threshold is set to 2, then the PMU counter will
> now only increment by 1 when an event would have previously incremented
> the PMU counter by 2 or more on a single processor cycle.
> 
> Three new Perf event config fields, 'threshold', 'threshold_compare' and
> 'threshold_count' have been added to control the feature.
> threshold_compare maps to the upper two bits of PMEVTYPERn.TC and
> threshold_count maps to the first bit of TC. These separate attributes
> have been picked rather than enumerating all the possible combinations
> of the TC field as in the Arm ARM. The attributes would be used on a
> Perf command line like this:
> 
>    $ perf stat -e stall_slot/threshold=2,threshold_compare=2/
> 
> A new capability for reading out the maximum supported threshold value
> has also been added:
> 
>    $ cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
> 
>    0x000000ff
> 
> If a threshold higher than threshold_max is provided, then no error is
> generated but the threshold is clamped to the max value. If
> FEAT_PMUv3_TH isn't implemented or a 32 bit kernel is running, then
> threshold_max reads zero, and neither the 'threshold' nor
> 'threshold_control' parameters will be used.
> 
> The threshold is per PMU counter, and there are potentially different
> threshold_max values per PMU type on heterogeneous systems.
> 
> Bits higher than 32 now need to be written into PMEVTYPER, so
> armv8pmu_write_evtype() has to be updated to take an unsigned long value
> rather than u32 which gives the correct behavior on both aarch32 and 64.
> 
> Signed-off-by: James Clark <james.clark@arm.com>

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>


> ---
>   drivers/perf/arm_pmuv3.c       | 79 +++++++++++++++++++++++++++++++++-
>   include/linux/perf/arm_pmuv3.h |  1 +
>   2 files changed, 79 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> index 1d40d794f5e4..694d914ffc08 100644
> --- a/drivers/perf/arm_pmuv3.c
> +++ b/drivers/perf/arm_pmuv3.c
> @@ -15,6 +15,7 @@
>   #include <clocksource/arm_arch_timer.h>
>   
>   #include <linux/acpi.h>
> +#include <linux/bitfield.h>
>   #include <linux/clocksource.h>
>   #include <linux/of.h>
>   #include <linux/perf/arm_pmu.h>
> @@ -294,9 +295,18 @@ static const struct attribute_group armv8_pmuv3_events_attr_group = {
>   	.is_visible = armv8pmu_event_attr_is_visible,
>   };
>   
> +#define TH_LO	2
> +#define TH_HI	13
> +#define TH_CNT	14
> +#define TH_CMP_LO	15
> +#define TH_CMP_HI	16
> +
>   PMU_FORMAT_ATTR(event, "config:0-15");
>   PMU_FORMAT_ATTR(long, "config1:0");
>   PMU_FORMAT_ATTR(rdpmc, "config1:1");
> +PMU_FORMAT_ATTR(threshold, "config1:" __stringify(TH_LO) "-" __stringify(TH_HI));
> +PMU_FORMAT_ATTR(threshold_compare, "config1:" __stringify(TH_CMP_LO) "-" __stringify(TH_CMP_HI));
> +PMU_FORMAT_ATTR(threshold_count, "config1:" __stringify(TH_CNT));
>   
>   static int sysctl_perf_user_access __read_mostly;
>   
> @@ -310,10 +320,32 @@ static inline bool armv8pmu_event_want_user_access(struct perf_event *event)
>   	return event->attr.config1 & 0x2;
>   }
>   
> +static inline u32 armv8pmu_event_threshold(struct perf_event_attr *attr)
> +{
> +	return FIELD_GET(GENMASK(TH_HI, TH_LO), attr->config1);
> +}
> +
> +static inline u8 armv8pmu_event_threshold_control(struct perf_event_attr *attr)
> +{
> +	u8 th_compare = FIELD_GET(GENMASK(TH_CMP_HI, TH_CMP_LO), attr->config1);
> +	u8 th_count = FIELD_GET(BIT(TH_CNT), attr->config1);
> +
> +	/*
> +	 * The count bit is always the bottom bit of the full control field, and
> +	 * the comparison is the upper two bits, but it's not explicitly
> +	 * labelled in the Arm ARM. For the Perf interface we split it into two
> +	 * fields, so reconstruct it here.
> +	 */
> +	return (th_compare << 1) | th_count;
> +}
> +
>   static struct attribute *armv8_pmuv3_format_attrs[] = {
>   	&format_attr_event.attr,
>   	&format_attr_long.attr,
>   	&format_attr_rdpmc.attr,
> +	&format_attr_threshold.attr,
> +	&format_attr_threshold_compare.attr,
> +	&format_attr_threshold_count.attr,
>   	NULL,
>   };
>   
> @@ -365,10 +397,38 @@ static ssize_t bus_width_show(struct device *dev, struct device_attribute *attr,
>   
>   static DEVICE_ATTR_RO(bus_width);
>   
> +static u32 threshold_max(struct arm_pmu *cpu_pmu)
> +{
> +	/*
> +	 * PMMIR.WIDTH is readable and non-zero on aarch32, but it would be
> +	 * impossible to write the threshold in the upper 32 bits of PMEVTYPER.
> +	 */
> +	if (IS_ENABLED(CONFIG_ARM))
> +		return 0;
> +
> +	/*
> +	 * The largest value that can be written to PMEVTYPER<n>_EL0.TH is
> +	 * (2 ^ PMMIR.THWIDTH) - 1.
> +	 */
> +	return (1 << FIELD_GET(ARMV8_PMU_THWIDTH, cpu_pmu->reg_pmmir)) - 1;
> +}
> +
> +static ssize_t threshold_max_show(struct device *dev,
> +				  struct device_attribute *attr, char *page)
> +{
> +	struct pmu *pmu = dev_get_drvdata(dev);
> +	struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
> +
> +	return sysfs_emit(page, "0x%08x\n", threshold_max(cpu_pmu));
> +}
> +
> +static DEVICE_ATTR_RO(threshold_max);
> +
>   static struct attribute *armv8_pmuv3_caps_attrs[] = {
>   	&dev_attr_slots.attr,
>   	&dev_attr_bus_slots.attr,
>   	&dev_attr_bus_width.attr,
> +	&dev_attr_threshold_max.attr,
>   	NULL,
>   };
>   
> @@ -552,7 +612,7 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
>   		armv8pmu_write_hw_counter(event, value);
>   }
>   
> -static inline void armv8pmu_write_evtype(int idx, u32 val)
> +static inline void armv8pmu_write_evtype(int idx, unsigned long val)
>   {
>   	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
>   	unsigned long mask = ARMV8_PMU_EVTYPE_EVENT |
> @@ -921,6 +981,10 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>   				     struct perf_event_attr *attr)
>   {
>   	unsigned long config_base = 0;
> +	struct perf_event *perf_event = container_of(attr, struct perf_event,
> +						     attr);
> +	struct arm_pmu *cpu_pmu = to_arm_pmu(perf_event->pmu);
> +	u32 th, th_max;
>   
>   	if (attr->exclude_idle)
>   		return -EPERM;
> @@ -952,6 +1016,19 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>   	if (attr->exclude_user)
>   		config_base |= ARMV8_PMU_EXCLUDE_EL0;
>   
> +	/*
> +	 * Insert event counting threshold (FEAT_PMUv3_TH) values. If
> +	 * FEAT_PMUv3_TH isn't implemented, then THWIDTH (threshold_max) will be
> +	 * 0 and no values will be written.
> +	 */
> +	th_max = threshold_max(cpu_pmu);
> +	if (IS_ENABLED(CONFIG_ARM64) && th_max) {
> +		th = min(armv8pmu_event_threshold(attr), th_max);
> +		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TH, th);
> +		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TC,
> +					  armv8pmu_event_threshold_control(attr));
> +	}
> +
>   	/*
>   	 * Install the filter into config_base as this is used to
>   	 * construct the event type.
> diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
> index ddd1fec86739..ccbc0f9a74d8 100644
> --- a/include/linux/perf/arm_pmuv3.h
> +++ b/include/linux/perf/arm_pmuv3.h
> @@ -258,6 +258,7 @@
>   #define ARMV8_PMU_BUS_SLOTS_MASK 0xff
>   #define ARMV8_PMU_BUS_WIDTH_SHIFT 16
>   #define ARMV8_PMU_BUS_WIDTH_MASK 0xf
> +#define ARMV8_PMU_THWIDTH GENMASK(23, 20)
>   
>   /*
>    * This code is really good


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature
  2023-11-20 21:31   ` Namhyung Kim
@ 2023-11-21 10:33     ` Suzuki K Poulose
  2023-11-23 15:33       ` James Clark
  2023-11-23  5:50     ` Anshuman Khandual
  1 sibling, 1 reply; 15+ messages in thread
From: Suzuki K Poulose @ 2023-11-21 10:33 UTC (permalink / raw)
  To: Namhyung Kim, James Clark
  Cc: linux-arm-kernel, linux-perf-users, will, mark.rutland,
	Catalin Marinas, Jonathan Corbet, linux-doc, linux-kernel

On 20/11/2023 21:31, Namhyung Kim wrote:
> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>>
>> Add documentation for the new Perf event open parameters and
>> the threshold_max capability file.
>>
>> Signed-off-by: James Clark <james.clark@arm.com>
>> ---
>>   Documentation/arch/arm64/perf.rst | 56 +++++++++++++++++++++++++++++++
>>   1 file changed, 56 insertions(+)
>>
>> diff --git a/Documentation/arch/arm64/perf.rst b/Documentation/arch/arm64/perf.rst
>> index 1f87b57c2332..36b8111a710d 100644
>> --- a/Documentation/arch/arm64/perf.rst
>> +++ b/Documentation/arch/arm64/perf.rst
>> @@ -164,3 +164,59 @@ and should be used to mask the upper bits as needed.
>>      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
>>   .. _tools/lib/perf/tests/test-evsel.c:
>>      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
>> +
>> +Event Counting Threshold
>> +==========================================
>> +
>> +Overview
>> +--------
>> +
>> +FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
>> +events whose count meets a specified threshold condition. For example if
>> +threshold_compare is set to 2 ('Greater than or equal'), and the
>> +threshold is set to 2, then the PMU counter will now only increment by
>> +when an event would have previously incremented the PMU counter by 2 or
>> +more on a single processor cycle.
>> +
>> +To increment by 1 after passing the threshold condition instead of the
>> +number of events on that cycle, add the 'threshold_count' option to the
>> +commandline.
>> +
>> +How-to
>> +------
>> +
>> +The threshold, threshold_compare and threshold_count values can be
>> +provided per event:
>> +
>> +.. code-block:: sh
>> +
>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
> 
> Can you please explain this a bit more?
> 
> I guess the first event counts stall_slot PMU if the event if it's
> greater than or equal to 2.  And as threshold_count is not set,
> it'd count the stall_slot as is.  E.g. it counts 3 when it sees 3.
> 
> OTOH, dtlb_walk will count 1 if it sees an event less than 10.
> Is my understanding correct?

That is correct. The behavior is described in the paragraph above.
But I agree that it would be really helpful if we explained with the
example above.

> 
>> +
>> +And the following comparison values are supported:
>> +
>> +.. code-block::
>> +
>> +  0: Not-equal
>> +  1: Equals
>> +  2: Greater-than-or-equal
>> +  3: Less-than
> 
> So the above values are for threashold_compare, right?
> It'd be nice if it's more explicit.
> 
> Similarly, it'd be helpful to have a description for the
> threshold and threshold_count fields.

Agreed.

Suzuki



> 
> Thanks,
> Namhyung
> 
>> +
>> +The maximum supported threshold value can be read from the caps of each
>> +PMU, for example:
>> +
>> +.. code-block:: sh
>> +
>> +  cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
>> +
>> +  0x000000ff
>> +
>> +If a value higher than this is given, then it will be silently clamped
>> +to the maximum. The highest possible maximum is 4095, as the config
>> +field for threshold is limited to 12 bits, and the Perf tool will refuse
>> +to parse higher values.
>> +
>> +If the PMU doesn't support FEAT_PMUv3_TH, then threshold_max will read
>> +0, and both threshold and threshold_compare will be silently ignored.
>> +threshold_max will also read as 0 on aarch32 guests, even if the host
>> +is running on hardware with the feature.
>> --
>> 2.34.1
>>
>>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 1/3] arm64: perf: Include threshold control fields in PMEVTYPER mask
  2023-11-13 11:25 ` [PATCH v5 1/3] arm64: perf: Include threshold control fields in PMEVTYPER mask James Clark
  2023-11-21 10:15   ` Suzuki K Poulose
@ 2023-11-22  8:44   ` Anshuman Khandual
  1 sibling, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2023-11-22  8:44 UTC (permalink / raw)
  To: James Clark, linux-arm-kernel, linux-perf-users, suzuki.poulose,
	will, mark.rutland
  Cc: Catalin Marinas, Jonathan Corbet, linux-doc, linux-kernel



On 11/13/23 16:55, James Clark wrote:
> FEAT_PMUv3_TH (Armv8.8) adds two new fields to PMEVTYPER, so include
> them in the mask. These aren't writable on 32 bit kernels as they are in
> the high part of the register, so only include them for arm64.
> 
> It would be difficult to do this statically in the asm header files for
> each platform without resulting in circular includes or #ifdefs inline
> in the code. For that reason the ARMV8_PMU_EVTYPE_MASK definition has
> been removed and the mask is constructed programmatically.

Agreed, and this also makes sense because there is just a single instance
for ARMV8_PMU_EVTYPE_MASK in armv8pmu_write_evtype().

> 
> Signed-off-by: James Clark <james.clark@arm.com>
> ---
>  drivers/perf/arm_pmuv3.c       | 9 ++++++++-
>  include/linux/perf/arm_pmuv3.h | 3 ++-
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> index 6ca7be05229c..1d40d794f5e4 100644
> --- a/drivers/perf/arm_pmuv3.c
> +++ b/drivers/perf/arm_pmuv3.c
> @@ -555,8 +555,15 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
>  static inline void armv8pmu_write_evtype(int idx, u32 val)
>  {
>  	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
> +	unsigned long mask = ARMV8_PMU_EVTYPE_EVENT |
> +			     ARMV8_PMU_INCLUDE_EL2 |
> +			     ARMV8_PMU_EXCLUDE_EL0 |
> +			     ARMV8_PMU_EXCLUDE_EL1;

At first this looks bit odd sequence - EL2, EL0, EL1 but such as these
bit positions.

#define ARMV8_PMU_EXCLUDE_EL1           (1U << 31)
#define ARMV8_PMU_EXCLUDE_EL0           (1U << 30)
#define ARMV8_PMU_INCLUDE_EL2           (1U << 27)

>  
> -	val &= ARMV8_PMU_EVTYPE_MASK;
> +	if (IS_ENABLED(CONFIG_ARM64))
> +		mask |= ARMV8_PMU_EVTYPE_TC | ARMV8_PMU_EVTYPE_TH;

This makes sense.

> +
> +	val &= mask;
>  	write_pmevtypern(counter, val);
>  }
>  
> diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
> index 9c226adf938a..ddd1fec86739 100644
> --- a/include/linux/perf/arm_pmuv3.h
> +++ b/include/linux/perf/arm_pmuv3.h
> @@ -228,7 +228,8 @@
>  /*
>   * PMXEVTYPER: Event selection reg
>   */
> -#define ARMV8_PMU_EVTYPE_MASK	0xc800ffff	/* Mask for writable bits */
> +#define ARMV8_PMU_EVTYPE_TH	GENMASK(43, 32)
> +#define ARMV8_PMU_EVTYPE_TC	GENMASK(63, 61)

Looks correct.

>  #define ARMV8_PMU_EVTYPE_EVENT	0xffff		/* Mask for EVENT bits */
>  
>  /*

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 2/3] arm64: perf: Add support for event counting threshold
  2023-11-13 11:25 ` [PATCH v5 2/3] arm64: perf: Add support for event counting threshold James Clark
  2023-11-21 10:20   ` Suzuki K Poulose
@ 2023-11-23  3:42   ` Anshuman Khandual
  2023-11-23 17:53     ` James Clark
  1 sibling, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2023-11-23  3:42 UTC (permalink / raw)
  To: James Clark, linux-arm-kernel, linux-perf-users, suzuki.poulose,
	will, mark.rutland
  Cc: Catalin Marinas, Jonathan Corbet, linux-doc, linux-kernel


On 11/13/23 16:55, James Clark wrote:
> FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
> events whose count meets a specified threshold condition. For example if
> PMEVTYPERn.TC (Threshold Control) is set to 0b101 (Greater than or
> equal, count), and the threshold is set to 2, then the PMU counter will
> now only increment by 1 when an event would have previously incremented
> the PMU counter by 2 or more on a single processor cycle.
> 
> Three new Perf event config fields, 'threshold', 'threshold_compare' and
> 'threshold_count' have been added to control the feature.
> threshold_compare maps to the upper two bits of PMEVTYPERn.TC and
> threshold_count maps to the first bit of TC. These separate attributes
> have been picked rather than enumerating all the possible combinations
> of the TC field as in the Arm ARM. The attributes would be used on a
> Perf command line like this:
> 
>   $ perf stat -e stall_slot/threshold=2,threshold_compare=2/

If threshold_count = 0, then threshold and threshold_compare should just
be ignored ?

> 
> A new capability for reading out the maximum supported threshold value
> has also been added:
> 
>   $ cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
> 
>   0x000000ff

Makes sense.

> 
> If a threshold higher than threshold_max is provided, then no error is
> generated but the threshold is clamped to the max value. If
> FEAT_PMUv3_TH isn't implemented or a 32 bit kernel is running, then
> threshold_max reads zero, and neither the 'threshold' nor
> 'threshold_control' parameters will be used.
> 
> The threshold is per PMU counter, and there are potentially different
> threshold_max values per PMU type on heterogeneous systems.
> 
> Bits higher than 32 now need to be written into PMEVTYPER, so
> armv8pmu_write_evtype() has to be updated to take an unsigned long value
> rather than u32 which gives the correct behavior on both aarch32 and 64.

Makes sense.

> 
> Signed-off-by: James Clark <james.clark@arm.com>
> ---
>  drivers/perf/arm_pmuv3.c       | 79 +++++++++++++++++++++++++++++++++-
>  include/linux/perf/arm_pmuv3.h |  1 +
>  2 files changed, 79 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> index 1d40d794f5e4..694d914ffc08 100644
> --- a/drivers/perf/arm_pmuv3.c
> +++ b/drivers/perf/arm_pmuv3.c
> @@ -15,6 +15,7 @@
>  #include <clocksource/arm_arch_timer.h>
>  
>  #include <linux/acpi.h>
> +#include <linux/bitfield.h>
>  #include <linux/clocksource.h>
>  #include <linux/of.h>
>  #include <linux/perf/arm_pmu.h>
> @@ -294,9 +295,18 @@ static const struct attribute_group armv8_pmuv3_events_attr_group = {
>  	.is_visible = armv8pmu_event_attr_is_visible,
>  };
>  
> +#define TH_LO	2
> +#define TH_HI	13
> +#define TH_CNT	14
> +#define TH_CMP_LO	15
> +#define TH_CMP_HI	16

TH_ prefix sounds too cryptic. I had suggested some clean up earlier

https://lore.kernel.org/all/20231009051753.179355-1-anshuman.khandual@arm.com/

But for now, something like the following might be bit better instead ?

THRESHOLD_LOW
THRESHOLD_HIGH
THRESHOLD_CNT
THRESHOLD_CMP_LOW
THRESHOLD_CMP_HIGH

> +
>  PMU_FORMAT_ATTR(event, "config:0-15");
>  PMU_FORMAT_ATTR(long, "config1:0");
>  PMU_FORMAT_ATTR(rdpmc, "config1:1");
> +PMU_FORMAT_ATTR(threshold, "config1:" __stringify(TH_LO) "-" __stringify(TH_HI));
> +PMU_FORMAT_ATTR(threshold_compare, "config1:" __stringify(TH_CMP_LO) "-" __stringify(TH_CMP_HI));
> +PMU_FORMAT_ATTR(threshold_count, "config1:" __stringify(TH_CNT));
>  
>  static int sysctl_perf_user_access __read_mostly;
>  
> @@ -310,10 +320,32 @@ static inline bool armv8pmu_event_want_user_access(struct perf_event *event)
>  	return event->attr.config1 & 0x2;
>  }
>  
> +static inline u32 armv8pmu_event_threshold(struct perf_event_attr *attr)
> +{
> +	return FIELD_GET(GENMASK(TH_HI, TH_LO), attr->config1);
> +}
> +
> +static inline u8 armv8pmu_event_threshold_control(struct perf_event_attr *attr)
> +{
> +	u8 th_compare = FIELD_GET(GENMASK(TH_CMP_HI, TH_CMP_LO), attr->config1);
> +	u8 th_count = FIELD_GET(BIT(TH_CNT), attr->config1);
> +
> +	/*
> +	 * The count bit is always the bottom bit of the full control field, and
> +	 * the comparison is the upper two bits, but it's not explicitly
> +	 * labelled in the Arm ARM. For the Perf interface we split it into two
> +	 * fields, so reconstruct it here.
> +	 */
> +	return (th_compare << 1) | th_count;

If user provides 'th_count = 0', then all these threshold control code can be
skipped as if FEAT_PMUv3_TH was never implemented ? Also what happens when
threshold = 0 ?

> +}
> +
>  static struct attribute *armv8_pmuv3_format_attrs[] = {
>  	&format_attr_event.attr,
>  	&format_attr_long.attr,
>  	&format_attr_rdpmc.attr,
> +	&format_attr_threshold.attr,
> +	&format_attr_threshold_compare.attr,
> +	&format_attr_threshold_count.attr,
>  	NULL,
>  };
>  
> @@ -365,10 +397,38 @@ static ssize_t bus_width_show(struct device *dev, struct device_attribute *attr,
>  
>  static DEVICE_ATTR_RO(bus_width);
>  
> +static u32 threshold_max(struct arm_pmu *cpu_pmu)
> +{
> +	/*
> +	 * PMMIR.WIDTH is readable and non-zero on aarch32, but it would be

PMMIR.THWIDTH ^^^^^^^^ ?

> +	 * impossible to write the threshold in the upper 32 bits of PMEVTYPER.
> +	 */

Could aarch32 support FEAT_PMUv3_TH ? if not, how can the PMMIR.THWIDTH value
here be non-zero ? Also wondering if just a non-zero PMMIR.THWIDTH indicates
the presence for FEAT_PMUv3_TH on a given ARM PMU.

> +	if (IS_ENABLED(CONFIG_ARM))
> +		return 0;

Small nit - would a negative check on CONFIG_ARM64 be better ?

> +
> +	/*
> +	 * The largest value that can be written to PMEVTYPER<n>_EL0.TH is
> +	 * (2 ^ PMMIR.THWIDTH) - 1.
> +	 */
> +	return (1 << FIELD_GET(ARMV8_PMU_THWIDTH, cpu_pmu->reg_pmmir)) - 1;

Makes sense to return the adjusted value.

> +}
> +
> +static ssize_t threshold_max_show(struct device *dev,
> +				  struct device_attribute *attr, char *page)
> +{
> +	struct pmu *pmu = dev_get_drvdata(dev);
> +	struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
> +
> +	return sysfs_emit(page, "0x%08x\n", threshold_max(cpu_pmu));
> +}
> +
> +static DEVICE_ATTR_RO(threshold_max);
> +
>  static struct attribute *armv8_pmuv3_caps_attrs[] = {
>  	&dev_attr_slots.attr,
>  	&dev_attr_bus_slots.attr,
>  	&dev_attr_bus_width.attr,
> +	&dev_attr_threshold_max.attr,
>  	NULL,
>  };
>  
> @@ -552,7 +612,7 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
>  		armv8pmu_write_hw_counter(event, value);
>  }
>  
> -static inline void armv8pmu_write_evtype(int idx, u32 val)
> +static inline void armv8pmu_write_evtype(int idx, unsigned long val)
>  {
>  	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
>  	unsigned long mask = ARMV8_PMU_EVTYPE_EVENT |
> @@ -921,6 +981,10 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>  				     struct perf_event_attr *attr)
>  {
>  	unsigned long config_base = 0;
> +	struct perf_event *perf_event = container_of(attr, struct perf_event,
> +						     attr);
> +	struct arm_pmu *cpu_pmu = to_arm_pmu(perf_event->pmu);
> +	u32 th, th_max;
>  
>  	if (attr->exclude_idle)
>  		return -EPERM;
> @@ -952,6 +1016,19 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>  	if (attr->exclude_user)
>  		config_base |= ARMV8_PMU_EXCLUDE_EL0;
>  
> +	/*
> +	 * Insert event counting threshold (FEAT_PMUv3_TH) values. If
> +	 * FEAT_PMUv3_TH isn't implemented, then THWIDTH (threshold_max) will be
> +	 * 0 and no values will be written.
> +	 */
> +	th_max = threshold_max(cpu_pmu);
> +	if (IS_ENABLED(CONFIG_ARM64) && th_max) {

As mentioned above, using negative check on CONFIG_ARM64 in threshold_max()
will complement this condition here, making it clear that these threshold
configurations are applicable only on 64 bit platforms.

> +		th = min(armv8pmu_event_threshold(attr), th_max);
> +		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TH, th);
> +		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TC,
> +					  armv8pmu_event_threshold_control(attr));

Small nit - armv8pmu_event_threshold_control() might also be captured before
into a 'tc' local variable before adjusting the config_base similar to 'th'.
Also better to add a small comment before 'th = min(..., ..) ' regarding the
clamping user input to platform max_threshold.

		th = min(armv8pmu_event_threshold(attr), th_max);
		tc = armv8pmu_event_threshold_control(attr));
		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TH, th);
		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TC, tc);

> +	}
> +
>  	/*
>  	 * Install the filter into config_base as this is used to
>  	 * construct the event type.
> diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
> index ddd1fec86739..ccbc0f9a74d8 100644
> --- a/include/linux/perf/arm_pmuv3.h
> +++ b/include/linux/perf/arm_pmuv3.h
> @@ -258,6 +258,7 @@
>  #define ARMV8_PMU_BUS_SLOTS_MASK 0xff
>  #define ARMV8_PMU_BUS_WIDTH_SHIFT 16
>  #define ARMV8_PMU_BUS_WIDTH_MASK 0xf
> +#define ARMV8_PMU_THWIDTH GENMASK(23, 20)

Small nit - may be ARMV8_PMU_TH_WIDTH instead ?

>  
>  /*
>   * This code is really good

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature
  2023-11-20 21:31   ` Namhyung Kim
  2023-11-21 10:33     ` Suzuki K Poulose
@ 2023-11-23  5:50     ` Anshuman Khandual
  2023-11-23 15:45       ` James Clark
  1 sibling, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2023-11-23  5:50 UTC (permalink / raw)
  To: Namhyung Kim, James Clark
  Cc: linux-arm-kernel, linux-perf-users, suzuki.poulose, will,
	mark.rutland, Catalin Marinas, Jonathan Corbet, linux-doc,
	linux-kernel



On 11/21/23 03:01, Namhyung Kim wrote:
> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>> Add documentation for the new Perf event open parameters and
>> the threshold_max capability file.
>>
>> Signed-off-by: James Clark <james.clark@arm.com>
>> ---
>>  Documentation/arch/arm64/perf.rst | 56 +++++++++++++++++++++++++++++++
>>  1 file changed, 56 insertions(+)
>>
>> diff --git a/Documentation/arch/arm64/perf.rst b/Documentation/arch/arm64/perf.rst
>> index 1f87b57c2332..36b8111a710d 100644
>> --- a/Documentation/arch/arm64/perf.rst
>> +++ b/Documentation/arch/arm64/perf.rst
>> @@ -164,3 +164,59 @@ and should be used to mask the upper bits as needed.
>>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
>>  .. _tools/lib/perf/tests/test-evsel.c:
>>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
>> +
>> +Event Counting Threshold
>> +==========================================
>> +
>> +Overview
>> +--------
>> +
>> +FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
>> +events whose count meets a specified threshold condition. For example if
>> +threshold_compare is set to 2 ('Greater than or equal'), and the
>> +threshold is set to 2, then the PMU counter will now only increment by
>> +when an event would have previously incremented the PMU counter by 2 or
>> +more on a single processor cycle.
>> +
>> +To increment by 1 after passing the threshold condition instead of the
>> +number of events on that cycle, add the 'threshold_count' option to the
>> +commandline.
>> +
>> +How-to
>> +------
>> +
>> +The threshold, threshold_compare and threshold_count values can be
>> +provided per event:
>> +
>> +.. code-block:: sh
>> +
>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
> Can you please explain this a bit more?
> 
> I guess the first event counts stall_slot PMU if the event if it's
> greater than or equal to 2.  And as threshold_count is not set,
> it'd count the stall_slot as is.  E.g. it counts 3 when it sees 3.

Hence without 'threshold_count' being set, the other two config requests
will not have an effect, is that correct ?

> 
> OTOH, dtlb_walk will count 1 if it sees an event less than 10.
> Is my understanding correct?

'Equals' and 'Greater-than-or-equal' makes sense and are intuitive. Just
wondering what will happen for 'Not-equal' and 'Less-than' - when would
the counter count in such cases ?

  0: Not-equal
  1: Equals
  2: Greater-than-or-equal
  3: Less-than

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature
  2023-11-21 10:33     ` Suzuki K Poulose
@ 2023-11-23 15:33       ` James Clark
  0 siblings, 0 replies; 15+ messages in thread
From: James Clark @ 2023-11-23 15:33 UTC (permalink / raw)
  To: Suzuki K Poulose, Namhyung Kim
  Cc: linux-arm-kernel, linux-perf-users, will, mark.rutland,
	Catalin Marinas, Jonathan Corbet, linux-doc, linux-kernel



On 21/11/2023 10:33, Suzuki K Poulose wrote:
> On 20/11/2023 21:31, Namhyung Kim wrote:
>> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>>>
>>> Add documentation for the new Perf event open parameters and
>>> the threshold_max capability file.
>>>
>>> Signed-off-by: James Clark <james.clark@arm.com>
>>> ---
>>>   Documentation/arch/arm64/perf.rst | 56 +++++++++++++++++++++++++++++++
>>>   1 file changed, 56 insertions(+)
>>>
>>> diff --git a/Documentation/arch/arm64/perf.rst
>>> b/Documentation/arch/arm64/perf.rst
>>> index 1f87b57c2332..36b8111a710d 100644
>>> --- a/Documentation/arch/arm64/perf.rst
>>> +++ b/Documentation/arch/arm64/perf.rst
>>> @@ -164,3 +164,59 @@ and should be used to mask the upper bits as
>>> needed.
>>>     
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
>>>   .. _tools/lib/perf/tests/test-evsel.c:
>>>     
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
>>> +
>>> +Event Counting Threshold
>>> +==========================================
>>> +
>>> +Overview
>>> +--------
>>> +
>>> +FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
>>> +events whose count meets a specified threshold condition. For
>>> example if
>>> +threshold_compare is set to 2 ('Greater than or equal'), and the
>>> +threshold is set to 2, then the PMU counter will now only increment by
>>> +when an event would have previously incremented the PMU counter by 2 or
>>> +more on a single processor cycle.
>>> +
>>> +To increment by 1 after passing the threshold condition instead of the
>>> +number of events on that cycle, add the 'threshold_count' option to the
>>> +commandline.
>>> +
>>> +How-to
>>> +------
>>> +
>>> +The threshold, threshold_compare and threshold_count values can be
>>> +provided per event:
>>> +
>>> +.. code-block:: sh
>>> +
>>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>>> +            -e
>>> dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
>>
>> Can you please explain this a bit more?
>>
>> I guess the first event counts stall_slot PMU if the event if it's
>> greater than or equal to 2.  And as threshold_count is not set,
>> it'd count the stall_slot as is.  E.g. it counts 3 when it sees 3.
>>
>> OTOH, dtlb_walk will count 1 if it sees an event less than 10.
>> Is my understanding correct?
> 
> That is correct. The behavior is described in the paragraph above.
> But I agree that it would be really helpful if we explained with the
> example above.
> 

Yeah I can add a description of how the example behaves.

>>
>>> +
>>> +And the following comparison values are supported:
>>> +
>>> +.. code-block::
>>> +
>>> +  0: Not-equal
>>> +  1: Equals
>>> +  2: Greater-than-or-equal
>>> +  3: Less-than
>>
>> So the above values are for threashold_compare, right?
>> It'd be nice if it's more explicit.

Yep I agree, I can label this with threshold_compare.

>>
>> Similarly, it'd be helpful to have a description for the
>> threshold and threshold_count fields.
> 
> Agreed.
> 
> Suzuki
> 

Yeah I'll add explicit descriptions for each field.

Thanks for the review.

> 
> 
>>
>> Thanks,
>> Namhyung
>>
>>> +
>>> +The maximum supported threshold value can be read from the caps of each
>>> +PMU, for example:
>>> +
>>> +.. code-block:: sh
>>> +
>>> +  cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
>>> +
>>> +  0x000000ff
>>> +
>>> +If a value higher than this is given, then it will be silently clamped
>>> +to the maximum. The highest possible maximum is 4095, as the config
>>> +field for threshold is limited to 12 bits, and the Perf tool will
>>> refuse
>>> +to parse higher values.
>>> +
>>> +If the PMU doesn't support FEAT_PMUv3_TH, then threshold_max will read
>>> +0, and both threshold and threshold_compare will be silently ignored.
>>> +threshold_max will also read as 0 on aarch32 guests, even if the host
>>> +is running on hardware with the feature.
>>> -- 
>>> 2.34.1
>>>
>>>
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature
  2023-11-23  5:50     ` Anshuman Khandual
@ 2023-11-23 15:45       ` James Clark
  2023-11-24  9:52         ` James Clark
  0 siblings, 1 reply; 15+ messages in thread
From: James Clark @ 2023-11-23 15:45 UTC (permalink / raw)
  To: Anshuman Khandual, Namhyung Kim
  Cc: linux-arm-kernel, linux-perf-users, suzuki.poulose, will,
	mark.rutland, Catalin Marinas, Jonathan Corbet, linux-doc,
	linux-kernel



On 23/11/2023 05:50, Anshuman Khandual wrote:
> 
> 
> On 11/21/23 03:01, Namhyung Kim wrote:
>> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>>> Add documentation for the new Perf event open parameters and
>>> the threshold_max capability file.
>>>
>>> Signed-off-by: James Clark <james.clark@arm.com>
>>> ---
>>>  Documentation/arch/arm64/perf.rst | 56 +++++++++++++++++++++++++++++++
>>>  1 file changed, 56 insertions(+)
>>>
>>> diff --git a/Documentation/arch/arm64/perf.rst b/Documentation/arch/arm64/perf.rst
>>> index 1f87b57c2332..36b8111a710d 100644
>>> --- a/Documentation/arch/arm64/perf.rst
>>> +++ b/Documentation/arch/arm64/perf.rst
>>> @@ -164,3 +164,59 @@ and should be used to mask the upper bits as needed.
>>>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
>>>  .. _tools/lib/perf/tests/test-evsel.c:
>>>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
>>> +
>>> +Event Counting Threshold
>>> +==========================================
>>> +
>>> +Overview
>>> +--------
>>> +
>>> +FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
>>> +events whose count meets a specified threshold condition. For example if
>>> +threshold_compare is set to 2 ('Greater than or equal'), and the
>>> +threshold is set to 2, then the PMU counter will now only increment by
>>> +when an event would have previously incremented the PMU counter by 2 or
>>> +more on a single processor cycle.
>>> +
>>> +To increment by 1 after passing the threshold condition instead of the
>>> +number of events on that cycle, add the 'threshold_count' option to the
>>> +commandline.
>>> +
>>> +How-to
>>> +------
>>> +
>>> +The threshold, threshold_compare and threshold_count values can be
>>> +provided per event:
>>> +
>>> +.. code-block:: sh
>>> +
>>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>>> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
>> Can you please explain this a bit more?
>>
>> I guess the first event counts stall_slot PMU if the event if it's
>> greater than or equal to 2.  And as threshold_count is not set,
>> it'd count the stall_slot as is.  E.g. it counts 3 when it sees 3.
> 
> Hence without 'threshold_count' being set, the other two config requests
> will not have an effect, is that correct ?

Yeah I can mention this. It's implied because 0 is the default value of
config fields, and 0 is a valid value for compare and count field, so
threshold=0 has to be the way to disable it. But I can mention it
explicitly.

> 
>>
>> OTOH, dtlb_walk will count 1 if it sees an event less than 10.
>> Is my understanding correct?
> 
> 'Equals' and 'Greater-than-or-equal' makes sense and are intuitive. Just
> wondering what will happen for 'Not-equal' and 'Less-than' - when would
> the counter count in such cases ?
> 
>   0: Not-equal
>   1: Equals
>   2: Greater-than-or-equal
>   3: Less-than
> 

They would count when the event is not equal to or less than the
threshold value on any cycle. Probably going into more detail would
start to reproduce what's in the reference manual. All the pseudocode is
in there which describes how it works.

As for use cases, I'm not really sure. It probably wasn't any effort to
add into the hardware with a single not gate, and something could have
been missed if it wasn't added. You might be able to do things like
count the inverse of something without having to open another event to
subtract from to find what the inverse would be.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 2/3] arm64: perf: Add support for event counting threshold
  2023-11-23  3:42   ` Anshuman Khandual
@ 2023-11-23 17:53     ` James Clark
  0 siblings, 0 replies; 15+ messages in thread
From: James Clark @ 2023-11-23 17:53 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Catalin Marinas, Jonathan Corbet, linux-doc, linux-kernel,
	linux-arm-kernel, linux-perf-users, suzuki.poulose, will,
	mark.rutland



On 23/11/2023 03:42, Anshuman Khandual wrote:
> 
> On 11/13/23 16:55, James Clark wrote:
>> FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
>> events whose count meets a specified threshold condition. For example if
>> PMEVTYPERn.TC (Threshold Control) is set to 0b101 (Greater than or
>> equal, count), and the threshold is set to 2, then the PMU counter will
>> now only increment by 1 when an event would have previously incremented
>> the PMU counter by 2 or more on a single processor cycle.
>>
>> Three new Perf event config fields, 'threshold', 'threshold_compare' and
>> 'threshold_count' have been added to control the feature.
>> threshold_compare maps to the upper two bits of PMEVTYPERn.TC and
>> threshold_count maps to the first bit of TC. These separate attributes
>> have been picked rather than enumerating all the possible combinations
>> of the TC field as in the Arm ARM. The attributes would be used on a
>> Perf command line like this:
>>
>>   $ perf stat -e stall_slot/threshold=2,threshold_compare=2/
> 
> If threshold_count = 0, then threshold and threshold_compare should just
> be ignored ?
> 

No, threshold_count only effects the value which the PMU counts by after
the threshold condition is passed.

  Threshold_count == 1 : Always increment by 1
  Threshold_count == 0 : Increment by whatever the count was on that
                         cycle

Threshold_count never causes anything else to be ignored, only threshold
== 0 does.

>>
>> A new capability for reading out the maximum supported threshold value
>> has also been added:
>>
>>   $ cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
>>
>>   0x000000ff
> 
> Makes sense.
> 
>>
>> If a threshold higher than threshold_max is provided, then no error is
>> generated but the threshold is clamped to the max value. If
>> FEAT_PMUv3_TH isn't implemented or a 32 bit kernel is running, then
>> threshold_max reads zero, and neither the 'threshold' nor
>> 'threshold_control' parameters will be used.
>>
>> The threshold is per PMU counter, and there are potentially different
>> threshold_max values per PMU type on heterogeneous systems.
>>
>> Bits higher than 32 now need to be written into PMEVTYPER, so
>> armv8pmu_write_evtype() has to be updated to take an unsigned long value
>> rather than u32 which gives the correct behavior on both aarch32 and 64.
> 
> Makes sense.
> 
>>
>> Signed-off-by: James Clark <james.clark@arm.com>
>> ---
>>  drivers/perf/arm_pmuv3.c       | 79 +++++++++++++++++++++++++++++++++-
>>  include/linux/perf/arm_pmuv3.h |  1 +
>>  2 files changed, 79 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
>> index 1d40d794f5e4..694d914ffc08 100644
>> --- a/drivers/perf/arm_pmuv3.c
>> +++ b/drivers/perf/arm_pmuv3.c
>> @@ -15,6 +15,7 @@
>>  #include <clocksource/arm_arch_timer.h>
>>  
>>  #include <linux/acpi.h>
>> +#include <linux/bitfield.h>
>>  #include <linux/clocksource.h>
>>  #include <linux/of.h>
>>  #include <linux/perf/arm_pmu.h>
>> @@ -294,9 +295,18 @@ static const struct attribute_group armv8_pmuv3_events_attr_group = {
>>  	.is_visible = armv8pmu_event_attr_is_visible,
>>  };
>>  
>> +#define TH_LO	2
>> +#define TH_HI	13
>> +#define TH_CNT	14
>> +#define TH_CMP_LO	15
>> +#define TH_CMP_HI	16
> 
> TH_ prefix sounds too cryptic. I had suggested some clean up earlier
> 
> https://lore.kernel.org/all/20231009051753.179355-1-anshuman.khandual@arm.com/
> 
> But for now, something like the following might be bit better instead ?
> 
> THRESHOLD_LOW
> THRESHOLD_HIGH
> THRESHOLD_CNT
> THRESHOLD_CMP_LOW
> THRESHOLD_CMP_HIGH
> 

Yep I can expand them. I agree they're a bit cryptic but it was making
the lines too long.

>> +
>>  PMU_FORMAT_ATTR(event, "config:0-15");
>>  PMU_FORMAT_ATTR(long, "config1:0");
>>  PMU_FORMAT_ATTR(rdpmc, "config1:1");
>> +PMU_FORMAT_ATTR(threshold, "config1:" __stringify(TH_LO) "-" __stringify(TH_HI));
>> +PMU_FORMAT_ATTR(threshold_compare, "config1:" __stringify(TH_CMP_LO) "-" __stringify(TH_CMP_HI));
>> +PMU_FORMAT_ATTR(threshold_count, "config1:" __stringify(TH_CNT));
>>  
>>  static int sysctl_perf_user_access __read_mostly;
>>  
>> @@ -310,10 +320,32 @@ static inline bool armv8pmu_event_want_user_access(struct perf_event *event)
>>  	return event->attr.config1 & 0x2;
>>  }
>>  
>> +static inline u32 armv8pmu_event_threshold(struct perf_event_attr *attr)
>> +{
>> +	return FIELD_GET(GENMASK(TH_HI, TH_LO), attr->config1);
>> +}
>> +
>> +static inline u8 armv8pmu_event_threshold_control(struct perf_event_attr *attr)
>> +{
>> +	u8 th_compare = FIELD_GET(GENMASK(TH_CMP_HI, TH_CMP_LO), attr->config1);
>> +	u8 th_count = FIELD_GET(BIT(TH_CNT), attr->config1);
>> +
>> +	/*
>> +	 * The count bit is always the bottom bit of the full control field, and
>> +	 * the comparison is the upper two bits, but it's not explicitly
>> +	 * labelled in the Arm ARM. For the Perf interface we split it into two
>> +	 * fields, so reconstruct it here.
>> +	 */
>> +	return (th_compare << 1) | th_count;
> 
> If user provides 'th_count = 0', then all these threshold control code can be
> skipped as if FEAT_PMUv3_TH was never implemented ?

Same as above, no nothing can be skipped based on th_count.

> Also what happens when
> threshold = 0 ?
> 

When threshold = 0 then the feature is disabled. I will add that to the
docs and I can add an early exit condition for that when writing to
config_base. Although it wouldn't have any actual effect as the hardware
already ignores the control field when the threshold is 0.

>> +}
>> +
>>  static struct attribute *armv8_pmuv3_format_attrs[] = {
>>  	&format_attr_event.attr,
>>  	&format_attr_long.attr,
>>  	&format_attr_rdpmc.attr,
>> +	&format_attr_threshold.attr,
>> +	&format_attr_threshold_compare.attr,
>> +	&format_attr_threshold_count.attr,
>>  	NULL,
>>  };
>>  
>> @@ -365,10 +397,38 @@ static ssize_t bus_width_show(struct device *dev, struct device_attribute *attr,
>>  
>>  static DEVICE_ATTR_RO(bus_width);
>>  
>> +static u32 threshold_max(struct arm_pmu *cpu_pmu)
>> +{
>> +	/*
>> +	 * PMMIR.WIDTH is readable and non-zero on aarch32, but it would be
> 
> PMMIR.THWIDTH ^^^^^^^^ ?
> 
>> +	 * impossible to write the threshold in the upper 32 bits of PMEVTYPER.
>> +	 */
> 
> Could aarch32 support FEAT_PMUv3_TH ? if not,

Kind of, if a 64 bit host sets the register then I think thresholding
might work on a 32 bit guest. You'd have to check the manual if you
really want to know, but I don't think it's really relevant as this code
is only about not letting 32 bit guests write to it themselves, because
it wouldn't even compile.

> how can the PMMIR.THWIDTH value
> here be non-zero ?

It just is, it's written like that in the ARM Arm as far as I read it.

> Also wondering if just a non-zero PMMIR.THWIDTH indicates
> the presence for FEAT_PMUv3_TH on a given ARM PMU.
> 

Yes.

>> +	if (IS_ENABLED(CONFIG_ARM))
>> +		return 0;
> 
> Small nit - would a negative check on CONFIG_ARM64 be better ?
> 

Only if we plan to add a third Arm config type in the future that
doesn't support FEAT_PMUv3_TH but also uses this code? Seems so unlikely
as to not worth considering to me.

Couldn't you make the same point about every CONFIG_ARM in the code?

>> +
>> +	/*
>> +	 * The largest value that can be written to PMEVTYPER<n>_EL0.TH is
>> +	 * (2 ^ PMMIR.THWIDTH) - 1.
>> +	 */
>> +	return (1 << FIELD_GET(ARMV8_PMU_THWIDTH, cpu_pmu->reg_pmmir)) - 1;
> 
> Makes sense to return the adjusted value.
> 

Are you saying to make a change here? Or just that it makes sense?

>> +}
>> +
>> +static ssize_t threshold_max_show(struct device *dev,
>> +				  struct device_attribute *attr, char *page)
>> +{
>> +	struct pmu *pmu = dev_get_drvdata(dev);
>> +	struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
>> +
>> +	return sysfs_emit(page, "0x%08x\n", threshold_max(cpu_pmu));
>> +}
>> +
>> +static DEVICE_ATTR_RO(threshold_max);
>> +
>>  static struct attribute *armv8_pmuv3_caps_attrs[] = {
>>  	&dev_attr_slots.attr,
>>  	&dev_attr_bus_slots.attr,
>>  	&dev_attr_bus_width.attr,
>> +	&dev_attr_threshold_max.attr,
>>  	NULL,
>>  };
>>  
>> @@ -552,7 +612,7 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
>>  		armv8pmu_write_hw_counter(event, value);
>>  }
>>  
>> -static inline void armv8pmu_write_evtype(int idx, u32 val)
>> +static inline void armv8pmu_write_evtype(int idx, unsigned long val)
>>  {
>>  	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
>>  	unsigned long mask = ARMV8_PMU_EVTYPE_EVENT |
>> @@ -921,6 +981,10 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>>  				     struct perf_event_attr *attr)
>>  {
>>  	unsigned long config_base = 0;
>> +	struct perf_event *perf_event = container_of(attr, struct perf_event,
>> +						     attr);
>> +	struct arm_pmu *cpu_pmu = to_arm_pmu(perf_event->pmu);
>> +	u32 th, th_max;
>>  
>>  	if (attr->exclude_idle)
>>  		return -EPERM;
>> @@ -952,6 +1016,19 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>>  	if (attr->exclude_user)
>>  		config_base |= ARMV8_PMU_EXCLUDE_EL0;
>>  
>> +	/*
>> +	 * Insert event counting threshold (FEAT_PMUv3_TH) values. If
>> +	 * FEAT_PMUv3_TH isn't implemented, then THWIDTH (threshold_max) will be
>> +	 * 0 and no values will be written.
>> +	 */
>> +	th_max = threshold_max(cpu_pmu);
>> +	if (IS_ENABLED(CONFIG_ARM64) && th_max) {
> 
> As mentioned above, using negative check on CONFIG_ARM64 in threshold_max()
> will complement this condition here, making it clear that these threshold
> configurations are applicable only on 64 bit platforms.
> 
>> +		th = min(armv8pmu_event_threshold(attr), th_max);
>> +		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TH, th);
>> +		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TC,
>> +					  armv8pmu_event_threshold_control(attr));
> 
> Small nit - armv8pmu_event_threshold_control() might also be captured before
> into a 'tc' local variable before adjusting the config_base similar to 'th'.

I don't see the improvement, I don't get putting things in variables if
they are only used once and there is no line length issue. The only
reason I probably did it for th was to make the line shorter.

> Also better to add a small comment before 'th = min(..., ..) ' regarding the
> clamping user input to platform max_threshold.
> 
> 		th = min(armv8pmu_event_threshold(attr), th_max);
> 		tc = armv8pmu_event_threshold_control(attr));
> 		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TH, th);
> 		config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TC, tc);
> 

The comment would just say "clamp to the max" and the code says min(...,
th_max) so it would just be repeating what's already there. I also don't
see the benefit of this one either.

>> +	}
>> +
>>  	/*
>>  	 * Install the filter into config_base as this is used to
>>  	 * construct the event type.
>> diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
>> index ddd1fec86739..ccbc0f9a74d8 100644
>> --- a/include/linux/perf/arm_pmuv3.h
>> +++ b/include/linux/perf/arm_pmuv3.h
>> @@ -258,6 +258,7 @@
>>  #define ARMV8_PMU_BUS_SLOTS_MASK 0xff
>>  #define ARMV8_PMU_BUS_WIDTH_SHIFT 16
>>  #define ARMV8_PMU_BUS_WIDTH_MASK 0xf
>> +#define ARMV8_PMU_THWIDTH GENMASK(23, 20)
> 
> Small nit - may be ARMV8_PMU_TH_WIDTH instead ?
> 

I probably won't change this one. 'THWIDTH' isn't very nice I agree, but
it's from the reference manual, and this enum describes the field. So
you remove people's ability to grep for it, just to make it look a
little bit nicer. It's not worth it.

>>  
>>  /*
>>   * This code is really good

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature
  2023-11-23 15:45       ` James Clark
@ 2023-11-24  9:52         ` James Clark
  0 siblings, 0 replies; 15+ messages in thread
From: James Clark @ 2023-11-24  9:52 UTC (permalink / raw)
  To: Anshuman Khandual, Namhyung Kim
  Cc: linux-arm-kernel, linux-perf-users, suzuki.poulose, will,
	mark.rutland, Catalin Marinas, Jonathan Corbet, linux-doc,
	linux-kernel



On 23/11/2023 15:45, James Clark wrote:
> 
> 
> On 23/11/2023 05:50, Anshuman Khandual wrote:
>>
>>
>> On 11/21/23 03:01, Namhyung Kim wrote:
>>> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@arm.com> wrote:
>>>> Add documentation for the new Perf event open parameters and
>>>> the threshold_max capability file.
>>>>
>>>> Signed-off-by: James Clark <james.clark@arm.com>
>>>> ---
>>>>  Documentation/arch/arm64/perf.rst | 56 +++++++++++++++++++++++++++++++
>>>>  1 file changed, 56 insertions(+)
>>>>
>>>> diff --git a/Documentation/arch/arm64/perf.rst b/Documentation/arch/arm64/perf.rst
>>>> index 1f87b57c2332..36b8111a710d 100644
>>>> --- a/Documentation/arch/arm64/perf.rst
>>>> +++ b/Documentation/arch/arm64/perf.rst
>>>> @@ -164,3 +164,59 @@ and should be used to mask the upper bits as needed.
>>>>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
>>>>  .. _tools/lib/perf/tests/test-evsel.c:
>>>>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
>>>> +
>>>> +Event Counting Threshold
>>>> +==========================================
>>>> +
>>>> +Overview
>>>> +--------
>>>> +
>>>> +FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
>>>> +events whose count meets a specified threshold condition. For example if
>>>> +threshold_compare is set to 2 ('Greater than or equal'), and the
>>>> +threshold is set to 2, then the PMU counter will now only increment by
>>>> +when an event would have previously incremented the PMU counter by 2 or
>>>> +more on a single processor cycle.
>>>> +
>>>> +To increment by 1 after passing the threshold condition instead of the
>>>> +number of events on that cycle, add the 'threshold_count' option to the
>>>> +commandline.
>>>> +
>>>> +How-to
>>>> +------
>>>> +
>>>> +The threshold, threshold_compare and threshold_count values can be
>>>> +provided per event:
>>>> +
>>>> +.. code-block:: sh
>>>> +
>>>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>>>> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
>>> Can you please explain this a bit more?
>>>
>>> I guess the first event counts stall_slot PMU if the event if it's
>>> greater than or equal to 2.  And as threshold_count is not set,
>>> it'd count the stall_slot as is.  E.g. it counts 3 when it sees 3.
>>
>> Hence without 'threshold_count' being set, the other two config requests
>> will not have an effect, is that correct ?
> 
> Yeah I can mention this. It's implied because 0 is the default value of
> config fields, and 0 is a valid value for compare and count field, so
> threshold=0 has to be the way to disable it. But I can mention it
> explicitly.
> 

To avoid any confusion, I thought you meant threshold here instead of
threshold_count. But I replied in more detail about the same issue on
patch 2.


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-11-24  9:52 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-13 11:25 [PATCH v5 0/3] arm64: perf: Add support for event counting threshold James Clark
2023-11-13 11:25 ` [PATCH v5 1/3] arm64: perf: Include threshold control fields in PMEVTYPER mask James Clark
2023-11-21 10:15   ` Suzuki K Poulose
2023-11-22  8:44   ` Anshuman Khandual
2023-11-13 11:25 ` [PATCH v5 2/3] arm64: perf: Add support for event counting threshold James Clark
2023-11-21 10:20   ` Suzuki K Poulose
2023-11-23  3:42   ` Anshuman Khandual
2023-11-23 17:53     ` James Clark
2023-11-13 11:25 ` [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature James Clark
2023-11-20 21:31   ` Namhyung Kim
2023-11-21 10:33     ` Suzuki K Poulose
2023-11-23 15:33       ` James Clark
2023-11-23  5:50     ` Anshuman Khandual
2023-11-23 15:45       ` James Clark
2023-11-24  9:52         ` James Clark

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).