linux-perf-users.vger.kernel.org archive mirror
* [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features
@ 2025-11-11 11:37 James Clark
  2025-11-11 11:37 ` [PATCH v10 1/5] perf: Add perf_event_attr::config4 James Clark
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: James Clark @ 2025-11-11 11:37 UTC
  To: Catalin Marinas, Will Deacon, Mark Rutland, Jonathan Corbet,
	Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Leo Yan, Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, linux-perf-users, linux-doc,
	kvmarm, James Clark

Support FEAT_SPE_FDS data source filtering.

---
Changes in v10:
- Pick up Peter's ack
- Slightly clarify commit message regarding the difference between the
  data source filter and the data source
- Link to v9: https://lore.kernel.org/r/20251029-james-perf-feat_spe_eft-v9-0-d22536b9cf94@linaro.org

Changes in v9:
- Fix another typo in docs: s/data_src_filter/inv_data_src_filter/g
- Drop already applied patches for other features. Only the data source
  filtering patches remain.
- Rebase on latest perf-tools-next
- Link to v8: https://lore.kernel.org/r/20250901-james-perf-feat_spe_eft-v8-0-2e2738f24559@linaro.org

Changes in v8:
- Define __spe_vers_imp before it's used
- "disable traps to PMSDSFR" -> "disable traps of PMSDSFR to EL2"
- Link to v7: https://lore.kernel.org/r/20250814-james-perf-feat_spe_eft-v7-0-6a743f7fa259@linaro.org

Changes in v7:
- Fix typo in docs: s/data_src_filter/inv_data_src_filter/g
- Pickup trailers
- Link to v6: https://lore.kernel.org/r/20250808-james-perf-feat_spe_eft-v6-0-6daf498578c8@linaro.org

Changes in v6:
- Rebase to resolve conflict with BRBE changes in el2_setup.h
- Link to v5: https://lore.kernel.org/r/20250721-james-perf-feat_spe_eft-v5-0-a7bc533485a1@linaro.org

Changes in v5:
- Forgot to pickup tags from v4
- Forgot to drop test and review tags on v4 patches that were
  significantly modified
- Update commit message for data source filtering to mention inversion
- Link to v4: https://lore.kernel.org/r/20250721-james-perf-feat_spe_eft-v4-0-0a527410f8fd@linaro.org

Changes in v4:
- Rewrite "const u64 feat_spe_eft_bits" inline
- Invert data source filter so that it's possible to exclude all data
  sources without adding an additional 'enable filter' flag
- Add a macro in el2_setup.h to check for an SPE version
- Probe valid filter bits instead of hardcoding them
- Take in Leo's commit to expose the filter bits as it depends on the
  new filter probing
- Link to v3: https://lore.kernel.org/r/20250605-james-perf-feat_spe_eft-v3-0-71b0c9f98093@linaro.org

Changes in v3:
- Use PMSIDR_EL1_FDS instead of 1 << PMSIDR_EL1_FDS_SHIFT
- Add VNCR offsets
- Link to v2: https://lore.kernel.org/r/20250529-james-perf-feat_spe_eft-v2-0-a01a9baad06a@linaro.org

Changes in v2:
- Fix detection of FEAT_SPE_FDS in el2_setup.h
- Pickup Marc Z's sysreg change instead which matches the json
- Restructure and expand docs changes
- Link to v1: https://lore.kernel.org/r/20250506-james-perf-feat_spe_eft-v1-0-dd480e8e4851@linaro.org

---
James Clark (5):
      perf: Add perf_event_attr::config4
      perf: arm_spe: Add support for filtering on data source
      tools headers UAPI: Sync linux/perf_event.h with the kernel sources
      perf tools: Add support for perf_event_attr::config4
      perf docs: arm-spe: Document new SPE filtering features

 drivers/perf/arm_spe_pmu.c                |  37 +++++++++++
 include/uapi/linux/perf_event.h           |   2 +
 tools/include/uapi/linux/perf_event.h     |   2 +
 tools/perf/Documentation/perf-arm-spe.txt | 104 +++++++++++++++++++++++++++---
 tools/perf/tests/parse-events.c           |  13 +++-
 tools/perf/util/parse-events.c            |  11 ++++
 tools/perf/util/parse-events.h            |   1 +
 tools/perf/util/parse-events.l            |   1 +
 tools/perf/util/pmu.c                     |   8 +++
 tools/perf/util/pmu.h                     |   1 +
 10 files changed, 170 insertions(+), 10 deletions(-)
---
base-commit: 081006b7c8e19406dc6674c6b6d086764d415b5c
change-id: 20250312-james-perf-feat_spe_eft-66cdf4d8fe99

Best regards,
-- 
James Clark <james.clark@linaro.org>



* [PATCH v10 1/5] perf: Add perf_event_attr::config4
  2025-11-11 11:37 [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features James Clark
@ 2025-11-11 11:37 ` James Clark
  2025-11-11 11:37 ` [PATCH v10 2/5] perf: arm_spe: Add support for filtering on data source James Clark
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: James Clark @ 2025-11-11 11:37 UTC
  To: Catalin Marinas, Will Deacon, Mark Rutland, Jonathan Corbet,
	Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Leo Yan, Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, linux-perf-users, linux-doc,
	kvmarm, James Clark

Arm FEAT_SPE_FDS adds the ability to filter on the data source of a
packet using another 64 bits of event filtering control. As the existing
perf_event_attr::configN fields are all used up for the SPE PMU, an
additional field is needed. Add a new 'config4' field.
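
As an illustration only (not part of the patch), the size bump in the diff
below follows from appending one 64-bit field to the end of the struct. The
struct here is a hypothetical stand-in for the tail of perf_event_attr, not
the real UAPI definition; the 128-byte prefix is taken from
PERF_ATTR_SIZE_VER7 in the hunk context.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the tail of struct perf_event_attr. */
struct attr_tail_sketch {
	uint8_t  earlier_fields[128];	/* up to and including sig_data (VER7) */
	uint64_t config3;		/* ends at byte 136 (VER8) */
	uint64_t config4;		/* ends at byte 144 (VER9) */
};

/* Size of the sketch before config4 was appended. */
size_t sketch_size_ver8(void)
{
	return offsetof(struct attr_tail_sketch, config4);
}

/* Size of the sketch once config4 is included. */
size_t sketch_size_ver9(void)
{
	return sizeof(struct attr_tail_sketch);
}
```

With this layout sketch_size_ver8() gives 136 and sketch_size_ver9() gives
144, matching the PERF_ATTR_SIZE_VER8 and PERF_ATTR_SIZE_VER9 values.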

Reviewed-by: Leo Yan <leo.yan@arm.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 include/uapi/linux/perf_event.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 78a362b80027..0d0ed85ad8cb 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -382,6 +382,7 @@ enum perf_event_read_format {
 #define PERF_ATTR_SIZE_VER6			120	/* Add: aux_sample_size */
 #define PERF_ATTR_SIZE_VER7			128	/* Add: sig_data */
 #define PERF_ATTR_SIZE_VER8			136	/* Add: config3 */
+#define PERF_ATTR_SIZE_VER9			144	/* add: config4 */
 
 /*
  * 'struct perf_event_attr' contains various attributes that define
@@ -543,6 +544,7 @@ struct perf_event_attr {
 	__u64	sig_data;
 
 	__u64	config3; /* extension of config2 */
+	__u64	config4; /* extension of config3 */
 };
 
 /*

-- 
2.34.1



* [PATCH v10 2/5] perf: arm_spe: Add support for filtering on data source
  2025-11-11 11:37 [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features James Clark
  2025-11-11 11:37 ` [PATCH v10 1/5] perf: Add perf_event_attr::config4 James Clark
@ 2025-11-11 11:37 ` James Clark
  2025-11-11 11:37 ` [PATCH v10 3/5] tools headers UAPI: Sync linux/perf_event.h with the kernel sources James Clark
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: James Clark @ 2025-11-11 11:37 UTC
  To: Catalin Marinas, Will Deacon, Mark Rutland, Jonathan Corbet,
	Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Leo Yan, Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, linux-perf-users, linux-doc,
	kvmarm, James Clark

FEAT_SPE_FDS adds the ability to filter on the data source of packets.
Like the other existing filters, enable filtering with PMSFCR_EL1.FDS
when any of the filter bits are set.

Each bit position of the 64-bit filter maps to numerical data sources
0-63 described by bits[0:5] in the data source packet (although the data
source field is 16 bits wide, so higher-valued data sources can't be
filtered on). The filter is an OR of all the filter bits, so for example
clearing only bits 0 and 3 of the filter given by userspace includes only
packets from data sources 0 OR 3.

Invert the filter given by userspace so that the default value of 0 is
equivalent to including all values (no filtering). This allows us to
skip adding a new format bit to enable filtering and still support
excluding all data sources which would have been a filter value of 0 if
not for the inversion.
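
A minimal sketch of the inversion arithmetic, not part of the patch. The
helper names are hypothetical and only model the behaviour described above:
userspace passes the inverse of the value that ends up in PMSDSFR_EL1, and
filtering is enabled only when the userspace value is non-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: build an inv_data_src_filter value that keeps only
 * the listed data sources. Only sources 0-63 have a filter bit.
 */
uint64_t inv_filter_include_only(const unsigned int *sources, size_t n)
{
	uint64_t include = 0;

	for (size_t i = 0; i < n; i++) {
		assert(sources[i] < 64);
		include |= UINT64_C(1) << sources[i];
	}

	/* Set bits exclude a data source, cleared bits include it. */
	return ~include;
}

/* Value that would be written to PMSDSFR_EL1 for a userspace filter. */
uint64_t inv_filter_to_pmsdsfr(uint64_t inv_filter)
{
	return ~inv_filter;
}

/* Models "enable PMSFCR_EL1.FDS when any of the filter bits are set". */
int inv_filter_enables_fds(uint64_t inv_filter)
{
	return inv_filter != 0;
}
```

For data sources 0 and 3 this gives 0xFFFFFFFFFFFFFFF6, which inverts back to
0x9 for the register. An empty list gives 0xFFFFFFFFFFFFFFFF, which excludes
every data source, and the default of 0 leaves filtering disabled.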

Tested-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 drivers/perf/arm_spe_pmu.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index fa50645fedda..617f8a98dd63 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -87,6 +87,7 @@ struct arm_spe_pmu {
 #define SPE_PMU_FEAT_INV_FILT_EVT		(1UL << 6)
 #define SPE_PMU_FEAT_DISCARD			(1UL << 7)
 #define SPE_PMU_FEAT_EFT			(1UL << 8)
+#define SPE_PMU_FEAT_FDS			(1UL << 9)
 #define SPE_PMU_FEAT_DEV_PROBED			(1UL << 63)
 	u64					features;
 
@@ -252,6 +253,10 @@ static const struct attribute_group arm_spe_pmu_cap_group = {
 #define ATTR_CFG_FLD_inv_event_filter_LO	0
 #define ATTR_CFG_FLD_inv_event_filter_HI	63
 
+#define ATTR_CFG_FLD_inv_data_src_filter_CFG	config4	/* inverse of PMSDSFR_EL1 */
+#define ATTR_CFG_FLD_inv_data_src_filter_LO	0
+#define ATTR_CFG_FLD_inv_data_src_filter_HI	63
+
 GEN_PMU_FORMAT_ATTR(ts_enable);
 GEN_PMU_FORMAT_ATTR(pa_enable);
 GEN_PMU_FORMAT_ATTR(pct_enable);
@@ -268,6 +273,7 @@ GEN_PMU_FORMAT_ATTR(float_filter);
 GEN_PMU_FORMAT_ATTR(float_filter_mask);
 GEN_PMU_FORMAT_ATTR(event_filter);
 GEN_PMU_FORMAT_ATTR(inv_event_filter);
+GEN_PMU_FORMAT_ATTR(inv_data_src_filter);
 GEN_PMU_FORMAT_ATTR(min_latency);
 GEN_PMU_FORMAT_ATTR(discard);
 
@@ -288,6 +294,7 @@ static struct attribute *arm_spe_pmu_formats_attr[] = {
 	&format_attr_float_filter_mask.attr,
 	&format_attr_event_filter.attr,
 	&format_attr_inv_event_filter.attr,
+	&format_attr_inv_data_src_filter.attr,
 	&format_attr_min_latency.attr,
 	&format_attr_discard.attr,
 	NULL,
@@ -306,6 +313,10 @@ static umode_t arm_spe_pmu_format_attr_is_visible(struct kobject *kobj,
 	if (attr == &format_attr_inv_event_filter.attr && !(spe_pmu->features & SPE_PMU_FEAT_INV_FILT_EVT))
 		return 0;
 
+	if (attr == &format_attr_inv_data_src_filter.attr &&
+	    !(spe_pmu->features & SPE_PMU_FEAT_FDS))
+		return 0;
+
 	if ((attr == &format_attr_branch_filter_mask.attr ||
 	     attr == &format_attr_load_filter_mask.attr ||
 	     attr == &format_attr_store_filter_mask.attr ||
@@ -430,6 +441,9 @@ static u64 arm_spe_event_to_pmsfcr(struct perf_event *event)
 	if (ATTR_CFG_GET_FLD(attr, inv_event_filter))
 		reg |= PMSFCR_EL1_FnE;
 
+	if (ATTR_CFG_GET_FLD(attr, inv_data_src_filter))
+		reg |= PMSFCR_EL1_FDS;
+
 	if (ATTR_CFG_GET_FLD(attr, min_latency))
 		reg |= PMSFCR_EL1_FL;
 
@@ -454,6 +468,17 @@ static u64 arm_spe_event_to_pmslatfr(struct perf_event *event)
 	return FIELD_PREP(PMSLATFR_EL1_MINLAT, ATTR_CFG_GET_FLD(attr, min_latency));
 }
 
+static u64 arm_spe_event_to_pmsdsfr(struct perf_event *event)
+{
+	struct perf_event_attr *attr = &event->attr;
+
+	/*
+	 * Data src filter is inverted so that the default value of 0 is
+	 * equivalent to no filtering.
+	 */
+	return ~ATTR_CFG_GET_FLD(attr, inv_data_src_filter);
+}
+
 static void arm_spe_pmu_pad_buf(struct perf_output_handle *handle, int len)
 {
 	struct arm_spe_pmu_buf *buf = perf_get_aux(handle);
@@ -791,6 +816,10 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
 	if (arm_spe_event_to_pmsnevfr(event) & spe_pmu->pmsevfr_res0)
 		return -EOPNOTSUPP;
 
+	if (arm_spe_event_to_pmsdsfr(event) != U64_MAX &&
+	    !(spe_pmu->features & SPE_PMU_FEAT_FDS))
+		return -EOPNOTSUPP;
+
 	if (attr->exclude_idle)
 		return -EOPNOTSUPP;
 
@@ -866,6 +895,11 @@ static void arm_spe_pmu_start(struct perf_event *event, int flags)
 		write_sysreg_s(reg, SYS_PMSNEVFR_EL1);
 	}
 
+	if (spe_pmu->features & SPE_PMU_FEAT_FDS) {
+		reg = arm_spe_event_to_pmsdsfr(event);
+		write_sysreg_s(reg, SYS_PMSDSFR_EL1);
+	}
+
 	reg = arm_spe_event_to_pmslatfr(event);
 	write_sysreg_s(reg, SYS_PMSLATFR_EL1);
 
@@ -1125,6 +1159,9 @@ static void __arm_spe_pmu_dev_probe(void *info)
 	if (FIELD_GET(PMSIDR_EL1_EFT, reg))
 		spe_pmu->features |= SPE_PMU_FEAT_EFT;
 
+	if (FIELD_GET(PMSIDR_EL1_FDS, reg))
+		spe_pmu->features |= SPE_PMU_FEAT_FDS;
+
 	/* This field has a spaced out encoding, so just use a look-up */
 	fld = FIELD_GET(PMSIDR_EL1_INTERVAL, reg);
 	switch (fld) {

-- 
2.34.1



* [PATCH v10 3/5] tools headers UAPI: Sync linux/perf_event.h with the kernel sources
  2025-11-11 11:37 [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features James Clark
  2025-11-11 11:37 ` [PATCH v10 1/5] perf: Add perf_event_attr::config4 James Clark
  2025-11-11 11:37 ` [PATCH v10 2/5] perf: arm_spe: Add support for filtering on data source James Clark
@ 2025-11-11 11:37 ` James Clark
  2025-11-11 11:37 ` [PATCH v10 4/5] perf tools: Add support for perf_event_attr::config4 James Clark
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: James Clark @ 2025-11-11 11:37 UTC
  To: Catalin Marinas, Will Deacon, Mark Rutland, Jonathan Corbet,
	Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Leo Yan, Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, linux-perf-users, linux-doc,
	kvmarm, James Clark

To pick up the config4 changes.

Tested-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 tools/include/uapi/linux/perf_event.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 78a362b80027..0d0ed85ad8cb 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -382,6 +382,7 @@ enum perf_event_read_format {
 #define PERF_ATTR_SIZE_VER6			120	/* Add: aux_sample_size */
 #define PERF_ATTR_SIZE_VER7			128	/* Add: sig_data */
 #define PERF_ATTR_SIZE_VER8			136	/* Add: config3 */
+#define PERF_ATTR_SIZE_VER9			144	/* add: config4 */
 
 /*
  * 'struct perf_event_attr' contains various attributes that define
@@ -543,6 +544,7 @@ struct perf_event_attr {
 	__u64	sig_data;
 
 	__u64	config3; /* extension of config2 */
+	__u64	config4; /* extension of config3 */
 };
 
 /*

-- 
2.34.1



* [PATCH v10 4/5] perf tools: Add support for perf_event_attr::config4
  2025-11-11 11:37 [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features James Clark
                   ` (2 preceding siblings ...)
  2025-11-11 11:37 ` [PATCH v10 3/5] tools headers UAPI: Sync linux/perf_event.h with the kernel sources James Clark
@ 2025-11-11 11:37 ` James Clark
  2025-11-11 11:37 ` [PATCH v10 5/5] perf docs: arm-spe: Document new SPE filtering features James Clark
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: James Clark @ 2025-11-11 11:37 UTC
  To: Catalin Marinas, Will Deacon, Mark Rutland, Jonathan Corbet,
	Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Leo Yan, Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, linux-perf-users, linux-doc,
	kvmarm, James Clark

perf_event_attr has gained a new field, config4, so add support for it
by extending the existing configN support.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 tools/perf/tests/parse-events.c | 13 ++++++++++++-
 tools/perf/util/parse-events.c  | 11 +++++++++++
 tools/perf/util/parse-events.h  |  1 +
 tools/perf/util/parse-events.l  |  1 +
 tools/perf/util/pmu.c           |  8 ++++++++
 tools/perf/util/pmu.h           |  1 +
 6 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index e4cdb517c10e..128d21dc389f 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -647,6 +647,7 @@ static int test__checkevent_pmu(struct evlist *evlist)
 	TEST_ASSERT_EVSEL("wrong config1",    1 == evsel->core.attr.config1, evsel);
 	TEST_ASSERT_EVSEL("wrong config2",    3 == evsel->core.attr.config2, evsel);
 	TEST_ASSERT_EVSEL("wrong config3",    0 == evsel->core.attr.config3, evsel);
+	TEST_ASSERT_EVSEL("wrong config4",    0 == evsel->core.attr.config4, evsel);
 	/*
 	 * The period value gets configured within evlist__config,
 	 * while this test executes only parse events method.
@@ -669,6 +670,7 @@ static int test__checkevent_list(struct evlist *evlist)
 		TEST_ASSERT_EVSEL("wrong config1", 0 == evsel->core.attr.config1, evsel);
 		TEST_ASSERT_EVSEL("wrong config2", 0 == evsel->core.attr.config2, evsel);
 		TEST_ASSERT_EVSEL("wrong config3", 0 == evsel->core.attr.config3, evsel);
+		TEST_ASSERT_EVSEL("wrong config4", 0 == evsel->core.attr.config4, evsel);
 		TEST_ASSERT_EVSEL("wrong exclude_user", !evsel->core.attr.exclude_user, evsel);
 		TEST_ASSERT_EVSEL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel, evsel);
 		TEST_ASSERT_EVSEL("wrong exclude_hv", !evsel->core.attr.exclude_hv, evsel);
@@ -849,6 +851,15 @@ static int test__checkterms_simple(struct parse_events_terms *terms)
 	TEST_ASSERT_VAL("wrong val", term->val.num == 4);
 	TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "config3"));
 
+	/* config4=5 */
+	term = list_entry(term->list.next, struct parse_events_term, list);
+	TEST_ASSERT_VAL("wrong type term",
+			term->type_term == PARSE_EVENTS__TERM_TYPE_CONFIG4);
+	TEST_ASSERT_VAL("wrong type val",
+			term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+	TEST_ASSERT_VAL("wrong val", term->val.num == 5);
+	TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "config4"));
+
 	/* umask=1*/
 	term = list_entry(term->list.next, struct parse_events_term, list);
 	TEST_ASSERT_VAL("wrong type term",
@@ -2516,7 +2527,7 @@ struct terms_test {
 
 static const struct terms_test test__terms[] = {
 	[0] = {
-		.str   = "config=10,config1,config2=3,config3=4,umask=1,read,r0xead",
+		.str   = "config=10,config1,config2=3,config3=4,config4=5,umask=1,read,r0xead",
 		.check = test__checkterms_simple,
 	},
 };
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 0c0dc20b1c13..ee4f55cbd3cb 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -215,6 +215,8 @@ __add_event(struct list_head *list, int *idx,
 						PERF_PMU_FORMAT_VALUE_CONFIG2, "config2");
 			perf_pmu__warn_invalid_config(pmu, attr->config3, name,
 						PERF_PMU_FORMAT_VALUE_CONFIG3, "config3");
+			perf_pmu__warn_invalid_config(pmu, attr->config4, name,
+						PERF_PMU_FORMAT_VALUE_CONFIG4, "config4");
 		}
 	}
 	/*
@@ -700,6 +702,7 @@ const char *parse_events__term_type_str(enum parse_events__term_type term_type)
 		[PARSE_EVENTS__TERM_TYPE_CONFIG1]		= "config1",
 		[PARSE_EVENTS__TERM_TYPE_CONFIG2]		= "config2",
 		[PARSE_EVENTS__TERM_TYPE_CONFIG3]		= "config3",
+		[PARSE_EVENTS__TERM_TYPE_CONFIG4]		= "config4",
 		[PARSE_EVENTS__TERM_TYPE_NAME]			= "name",
 		[PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD]		= "period",
 		[PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ]		= "freq",
@@ -749,6 +752,7 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er
 	case PARSE_EVENTS__TERM_TYPE_CONFIG1:
 	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
 	case PARSE_EVENTS__TERM_TYPE_CONFIG3:
+	case PARSE_EVENTS__TERM_TYPE_CONFIG4:
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 	case PARSE_EVENTS__TERM_TYPE_METRIC_ID:
 	case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
@@ -819,6 +823,10 @@ do {											\
 		CHECK_TYPE_VAL(NUM);
 		attr->config3 = term->val.num;
 		break;
+	case PARSE_EVENTS__TERM_TYPE_CONFIG4:
+		CHECK_TYPE_VAL(NUM);
+		attr->config4 = term->val.num;
+		break;
 	case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
 		CHECK_TYPE_VAL(NUM);
 		break;
@@ -1064,6 +1072,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_CONFIG1:
 	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
 	case PARSE_EVENTS__TERM_TYPE_CONFIG3:
+	case PARSE_EVENTS__TERM_TYPE_CONFIG4:
 	case PARSE_EVENTS__TERM_TYPE_LEGACY_HARDWARE_CONFIG:
 	case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE_CONFIG:
 	case PARSE_EVENTS__TERM_TYPE_NAME:
@@ -1207,6 +1216,7 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_CONFIG1:
 		case PARSE_EVENTS__TERM_TYPE_CONFIG2:
 		case PARSE_EVENTS__TERM_TYPE_CONFIG3:
+		case PARSE_EVENTS__TERM_TYPE_CONFIG4:
 		case PARSE_EVENTS__TERM_TYPE_LEGACY_HARDWARE_CONFIG:
 		case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE_CONFIG:
 		case PARSE_EVENTS__TERM_TYPE_NAME:
@@ -1245,6 +1255,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head
 		case PARSE_EVENTS__TERM_TYPE_CONFIG1:
 		case PARSE_EVENTS__TERM_TYPE_CONFIG2:
 		case PARSE_EVENTS__TERM_TYPE_CONFIG3:
+		case PARSE_EVENTS__TERM_TYPE_CONFIG4:
 		case PARSE_EVENTS__TERM_TYPE_LEGACY_HARDWARE_CONFIG:
 		case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE_CONFIG:
 		case PARSE_EVENTS__TERM_TYPE_NAME:
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 1012b441e9cd..3577ab213730 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -59,6 +59,7 @@ enum parse_events__term_type {
 	PARSE_EVENTS__TERM_TYPE_CONFIG1,
 	PARSE_EVENTS__TERM_TYPE_CONFIG2,
 	PARSE_EVENTS__TERM_TYPE_CONFIG3,
+	PARSE_EVENTS__TERM_TYPE_CONFIG4,
 	PARSE_EVENTS__TERM_TYPE_NAME,
 	PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD,
 	PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ,
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 8e0ea441e57f..251ce4321878 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -287,6 +287,7 @@ config			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG); }
 config1			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG1); }
 config2			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG2); }
 config3			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG3); }
+config4			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG4); }
 name			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NAME); }
 period			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD); }
 freq			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ); }
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index f14f2a12d061..1b7c712d8f99 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1574,6 +1574,10 @@ static int pmu_config_term(const struct perf_pmu *pmu,
 			assert(term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
 			pmu_format_value(bits, term->val.num, &attr->config3, zero);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_CONFIG4:
+			assert(term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+			pmu_format_value(bits, term->val.num, &attr->config4, zero);
+			break;
 		case PARSE_EVENTS__TERM_TYPE_LEGACY_HARDWARE_CONFIG:
 			assert(term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
 			assert(term->val.num < PERF_COUNT_HW_MAX);
@@ -1649,6 +1653,9 @@ static int pmu_config_term(const struct perf_pmu *pmu,
 	case PERF_PMU_FORMAT_VALUE_CONFIG3:
 		vp = &attr->config3;
 		break;
+	case PERF_PMU_FORMAT_VALUE_CONFIG4:
+		vp = &attr->config4;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -2008,6 +2015,7 @@ int perf_pmu__for_each_format(struct perf_pmu *pmu, void *state, pmu_format_call
 		"config1=0..0xffffffffffffffff",
 		"config2=0..0xffffffffffffffff",
 		"config3=0..0xffffffffffffffff",
+		"config4=0..0xffffffffffffffff",
 		"legacy-hardware-config=0..9,",
 		"legacy-cache-config=0..0xffffff,",
 		"name=string",
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 1ebcf0242af8..67431f765266 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -23,6 +23,7 @@ enum {
 	PERF_PMU_FORMAT_VALUE_CONFIG1,
 	PERF_PMU_FORMAT_VALUE_CONFIG2,
 	PERF_PMU_FORMAT_VALUE_CONFIG3,
+	PERF_PMU_FORMAT_VALUE_CONFIG4,
 	PERF_PMU_FORMAT_VALUE_CONFIG_END,
 };
 

-- 
2.34.1



* [PATCH v10 5/5] perf docs: arm-spe: Document new SPE filtering features
  2025-11-11 11:37 [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features James Clark
                   ` (3 preceding siblings ...)
  2025-11-11 11:37 ` [PATCH v10 4/5] perf tools: Add support for perf_event_attr::config4 James Clark
@ 2025-11-11 11:37 ` James Clark
  2025-11-20  1:54 ` [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features Namhyung Kim
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: James Clark @ 2025-11-11 11:37 UTC
  To: Catalin Marinas, Will Deacon, Mark Rutland, Jonathan Corbet,
	Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Leo Yan, Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, linux-perf-users, linux-doc,
	kvmarm, James Clark

FEAT_SPE_EFT, FEAT_SPE_FDS and related features have new user-facing
format attributes, so document them. Also document existing 'event_filter'
bits that were missing from the doc and the fact that latency values are
stored in the weight field.
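
A rough sketch of the 'event_filter' (AND) and 'inv_event_filter' (OR)
semantics that the documentation in this patch describes. The function name
is hypothetical and this is a software model for illustration, not driver or
hardware code. It does not model the UNPREDICTABLE case where the same bit is
set in both filters.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: a sample is kept if it has ALL event_filter bits set
 * and NONE of the inv_event_filter bits set.
 */
bool spe_sample_passes_event_filters(uint64_t sample_events,
				     uint64_t event_filter,
				     uint64_t inv_event_filter)
{
	bool has_all_required = (sample_events & event_filter) == event_filter;
	bool has_any_excluded = (sample_events & inv_event_filter) != 0;

	return has_all_required && !has_any_excluded;
}
```

With bits 3 and 5 set (0x28) in event_filter, a sample needs both events to
pass. With the same bits in inv_event_filter, a sample carrying either event
is dropped.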

Reviewed-by: Leo Yan <leo.yan@arm.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 tools/perf/Documentation/perf-arm-spe.txt | 104 +++++++++++++++++++++++++++---
 1 file changed, 95 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-arm-spe.txt b/tools/perf/Documentation/perf-arm-spe.txt
index cda8dd47fc4d..8b02e5b983fa 100644
--- a/tools/perf/Documentation/perf-arm-spe.txt
+++ b/tools/perf/Documentation/perf-arm-spe.txt
@@ -141,27 +141,65 @@ Config parameters
 These are placed between the // in the event and comma separated. For example '-e
 arm_spe/load_filter=1,min_latency=10/'
 
-  branch_filter=1     - collect branches only (PMSFCR.B)
-  event_filter=<mask> - filter on specific events (PMSEVFR) - see bitfield description below
+  event_filter=<mask> - logical AND filter on specific events (PMSEVFR) - see bitfield description below
+  inv_event_filter=<mask> - logical OR to filter out specific events (PMSNEVFR, FEAT_SPEv1p2) - see bitfield description below
   jitter=1            - use jitter to avoid resonance when sampling (PMSIRR.RND)
-  load_filter=1       - collect loads only (PMSFCR.LD)
   min_latency=<n>     - collect only samples with this latency or higher* (PMSLATFR)
   pa_enable=1         - collect physical address (as well as VA) of loads/stores (PMSCR.PA) - requires privilege
   pct_enable=1        - collect physical timestamp instead of virtual timestamp (PMSCR.PCT) - requires privilege
-  store_filter=1      - collect stores only (PMSFCR.ST)
   ts_enable=1         - enable timestamping with value of generic timer (PMSCR.TS)
   discard=1           - enable SPE PMU events but don't collect sample data - see 'Discard mode' (PMBLIMITR.FM = DISCARD)
+  inv_data_src_filter=<mask> - mask to filter from 0-63 possible data sources (PMSDSFR, FEAT_SPE_FDS) - See 'Data source filtering'
 
 +++*+++ Latency is the total latency from the point at which sampling started on that instruction, rather
 than only the execution latency.
 
-Only some events can be filtered on; these include:
-
-  bit 1     - instruction retired (i.e. omit speculative instructions)
+Only some events can be filtered on using 'event_filter' bits. The overall
+filter is the logical AND of these bits, for example if bits 3 and 5 are set
+only samples that have both 'L1D cache refill' AND 'TLB walk' are recorded. When
+FEAT_SPEv1p2 is implemented 'inv_event_filter' can also be used to exclude
+events that have any (OR) of the filter's bits set. For example setting bits 3
+and 5 in 'inv_event_filter' will exclude any events that are either L1D cache
+refill OR TLB walk. If the same bit is set in both filters it's UNPREDICTABLE
+whether the sample is included or excluded. Filter bits for both event_filter
+and inv_event_filter are:
+
+  bit 1     - Instruction retired (i.e. omit speculative instructions)
+  bit 2     - L1D access (FEAT_SPEv1p4)
   bit 3     - L1D refill
+  bit 4     - TLB access (FEAT_SPEv1p4)
   bit 5     - TLB refill
-  bit 7     - mispredict
-  bit 11    - misaligned access
+  bit 6     - Not taken event (FEAT_SPEv1p2)
+  bit 7     - Mispredict
+  bit 8     - Last level cache access (FEAT_SPEv1p4)
+  bit 9     - Last level cache miss (FEAT_SPEv1p4)
+  bit 10    - Remote access (FEAT_SPEv1p4)
+  bit 11    - Misaligned access (FEAT_SPEv1p1)
+  bit 12-15 - IMPLEMENTATION DEFINED events (when implemented)
+  bit 16    - Transaction (FEAT_TME)
+  bit 17    - Partial or empty SME or SVE predicate (FEAT_SPEv1p1)
+  bit 18    - Empty SME or SVE predicate (FEAT_SPEv1p1)
+  bit 19    - L2D access (FEAT_SPEv1p4)
+  bit 20    - L2D miss (FEAT_SPEv1p4)
+  bit 21    - Cache data modified (FEAT_SPEv1p4)
+  bit 22    - Recently fetched (FEAT_SPEv1p4)
+  bit 23    - Data snooped (FEAT_SPEv1p4)
+  bit 24    - Streaming SVE mode event (when FEAT_SPE_SME is implemented), or
+              IMPLEMENTATION DEFINED event 24 (when implemented, only versions
+              less than FEAT_SPEv1p4)
+  bit 25    - SMCU or external coprocessor operation event when FEAT_SPE_SME is
+              implemented, or IMPLEMENTATION DEFINED event 25 (when implemented,
+              only versions less than FEAT_SPEv1p4)
+  bit 26-31 - IMPLEMENTATION DEFINED events (only versions less than FEAT_SPEv1p4)
+  bit 48-63 - IMPLEMENTATION DEFINED events (when implemented)
+
+For IMPLEMENTATION DEFINED bits, refer to the CPU TRM if these bits are
+implemented.
+
+The driver will reject events if requested filter bits require unimplemented SPE
+versions, but will not reject filter bits for unimplemented IMPDEF bits or when
+their related feature is not present (e.g. SME). For example, if FEAT_SPEv1p2 is
+not implemented, filtering on "Not taken event" (bit 6) will be rejected.
 
 So to sample just retired instructions:
 
@@ -171,6 +209,31 @@ or just mispredicted branches:
 
   perf record -e arm_spe/event_filter=0x80/ -- ./mybench
 
+When set, the following filters can be used to select samples that match any of
+the operation types (OR filtering). If only one is set then only samples of that
+type are collected:
+
+  branch_filter=1     - Collect branches (PMSFCR.B)
+  load_filter=1       - Collect loads (PMSFCR.LD)
+  store_filter=1      - Collect stores (PMSFCR.ST)
+
+When extended filtering is supported (FEAT_SPE_EFT), SIMD and floating
+point operations can also be selected:
+
+  simd_filter=1         - Collect SIMD loads, stores and operations (PMSFCR.SIMD)
+  float_filter=1        - Collect floating point loads, stores and operations (PMSFCR.FP)
+
+When extended filtering is supported (FEAT_SPE_EFT), operation type filters can
+be changed to AND using _mask fields. For example samples could be selected if
+they are store AND SIMD by setting 'store_filter=1,simd_filter=1,
+store_filter_mask=1,simd_filter_mask=1'. The new masks are as follows:
+
+  branch_filter_mask=1  - Change branch filter behavior from OR to AND (PMSFCR.Bm)
+  load_filter_mask=1    - Change load filter behavior from OR to AND (PMSFCR.LDm)
+  store_filter_mask=1   - Change store filter behavior from OR to AND (PMSFCR.STm)
+  simd_filter_mask=1    - Change SIMD filter behavior from OR to AND (PMSFCR.SIMDm)
+  float_filter_mask=1   - Change floating point filter behavior from OR to AND (PMSFCR.FPm)
+
 Viewing the data
 ~~~~~~~~~~~~~~~~~
 
@@ -210,6 +273,10 @@ Memory access details are also stored on the samples and this can be viewed with
 
   perf report --mem-mode
 
+The latency value from the SPE sample is stored in the 'weight' field of the
+Perf samples and can be displayed in Perf script and report outputs by enabling
+its display from the command line.
+
 Common errors
 ~~~~~~~~~~~~~
 
@@ -253,6 +320,25 @@ to minimize output. Then run perf stat:
   perf record -e arm_spe/discard/ -a -N -B --no-bpf-event -o - > /dev/null &
   perf stat -e SAMPLE_FEED_LD
 
+Data source filtering
+~~~~~~~~~~~~~~~~~~~~~
+
+When FEAT_SPE_FDS is present, 'inv_data_src_filter' can be used as a mask to
+filter on a subset (0 - 63) of possible data source IDs. The full range of data
+sources is 0 - 65535 although these are unlikely to be used in practice. Data
+sources are IMPDEF so refer to the TRM for the mappings. Each bit N of the
+filter maps to data source N. The filter is an OR of all the bits, and the value
+provided in inv_data_src_filter is inverted before writing to PMSDSFR_EL1 so that
+set bits exclude that data source and cleared bits include that data source.
+Therefore the default value of 0 is equivalent to no filtering (all data sources
+included).
+
+For example, to include only data sources 0 and 3, clear bits 0 and 3
+(0xFFFFFFFFFFFFFFF6).
+
+When 'inv_data_src_filter' is set to 0xFFFFFFFFFFFFFFFF, any samples with any
+data source set are excluded.
+
 SEE ALSO
 --------
 

-- 
2.34.1
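
As an illustration of the operation type filters documented in the patch
above, the following minimal sketch builds the event string passed to
'perf record -e'. The helper name spe_event() and its and_mode parameter are
hypothetical and not part of perf; only the *_filter and *_filter_mask term
names come from the documentation.

```python
# Hypothetical helper, not part of perf: assembles an arm_spe event string
# from the operation type filter names listed in the documentation.
OP_FILTERS = ("branch", "load", "store", "simd", "float")


def spe_event(filters, and_mode=False):
    """Return an arm_spe event string selecting the given operation types.

    filters  -- names taken from OP_FILTERS
    and_mode -- if True, also set <name>_filter_mask=1 for each selected
                filter, which the documentation describes as switching that
                filter from OR to AND (needs FEAT_SPE_EFT)
    """
    names = list(filters)
    for name in names:
        if name not in OP_FILTERS:
            raise ValueError(f"unknown operation filter: {name}")
    terms = [f"{name}_filter=1" for name in names]
    if and_mode:
        terms += [f"{name}_filter_mask=1" for name in names]
    return "arm_spe/" + ",".join(terms) + "/"


if __name__ == "__main__":
    # Store AND SIMD, matching the example in the documentation.
    print(spe_event(["store", "simd"], and_mode=True))
```

The resulting string could then be used as, for example,
'perf record -e <string> -- ./mybench'. Per the documentation, the driver
decides whether the requested bits are accepted on a given CPU.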

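The inversion described for 'inv_data_src_filter' in the patch above can be
worked through with a short sketch. The function names below are
hypothetical; the 64-bit width and the "inverted before writing to
PMSDSFR_EL1" step are taken from the documentation text, and the sketch only
models that arithmetic rather than the driver itself.

```python
# Hypothetical helpers modelling the documented inv_data_src_filter semantics.
MASK64 = (1 << 64) - 1


def inv_data_src_filter(include):
    """Return the filter value that keeps only the given data source IDs.

    Bit N maps to data source N (0 - 63). Cleared bits include a data
    source and set bits exclude it, so start from all ones and clear the
    bits of the sources to keep.
    """
    keep = 0
    for n in include:
        if not 0 <= n <= 63:
            raise ValueError(f"data source {n} is outside the filterable range 0 - 63")
        keep |= 1 << n
    return ~keep & MASK64


def pmsdsfr_value(inv_filter):
    """Model the documented inversion applied before the register write."""
    return ~inv_filter & MASK64


if __name__ == "__main__":
    value = inv_data_src_filter([0, 3])
    print(f"inv_data_src_filter=0x{value:X}")
    print(f"PMSDSFR_EL1 (modelled)=0x{pmsdsfr_value(value):X}")
```

Keeping sources 0 and 3 yields 0xFFFFFFFFFFFFFFF6, which agrees with the
example in the documentation. Passing every ID from 0 to 63 yields 0, the
default "no filtering" value.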


* Re: [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features
  2025-11-11 11:37 [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features James Clark
                   ` (4 preceding siblings ...)
  2025-11-11 11:37 ` [PATCH v10 5/5] perf docs: arm-spe: Document new SPE filtering features James Clark
@ 2025-11-20  1:54 ` Namhyung Kim
  2025-11-20  9:19   ` James Clark
  2025-11-24 19:18 ` Will Deacon
  2025-11-25 18:19 ` Namhyung Kim
  7 siblings, 1 reply; 10+ messages in thread
From: Namhyung Kim @ 2025-11-20  1:54 UTC (permalink / raw)
  To: James Clark
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Jonathan Corbet,
	Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Leo Yan,
	Anshuman Khandual, linux-arm-kernel, linux-kernel,
	linux-perf-users, linux-doc, kvmarm

Hello,

On Tue, Nov 11, 2025 at 11:37:54AM +0000, James Clark wrote:
> Support SPE_FEAT_FDS data source filtering.

What's the state of this series?  I can merge the tools part (3, 4, 5)
once the kernel part lands somewhere.

Thanks,
Namhyung

> 
> ---
> Changes in v10:
> - Pick up Peter's ack
> - Slightly clarify commit message regarding the difference between the
>   data source filter and the data source
> - Link to v9: https://lore.kernel.org/r/20251029-james-perf-feat_spe_eft-v9-0-d22536b9cf94@linaro.org
> 
> Changes in v9:
> - Fix another typo in docs: s/data_src_filter/inv_data_src_filter/g
> - Drop already applied patches for other features. Only the data source
>   filtering patches remain.
> - Rebase on latest perf-tools-next
> - Link to v8: https://lore.kernel.org/r/20250901-james-perf-feat_spe_eft-v8-0-2e2738f24559@linaro.org
> 
> Changes in v8:
> - Define __spe_vers_imp before it's used
> - "disable traps to PMSDSFR" -> "disable traps of PMSDSFR to EL2"
> - Link to v7: https://lore.kernel.org/r/20250814-james-perf-feat_spe_eft-v7-0-6a743f7fa259@linaro.org
> 
> Changes in v7:
> - Fix typo in docs: s/data_src_filter/inv_data_src_filter/g
> - Pickup trailers
> - Link to v6: https://lore.kernel.org/r/20250808-james-perf-feat_spe_eft-v6-0-6daf498578c8@linaro.org
> 
> Changes in v6:
> - Rebase to resolve conflict with BRBE changes in el2_setup.h
> - Link to v5: https://lore.kernel.org/r/20250721-james-perf-feat_spe_eft-v5-0-a7bc533485a1@linaro.org
> 
> Changes in v5:
> - Forgot to pickup tags from v4
> - Forgot to drop test and review tags on v4 patches that were
>   significantly modified
> - Update commit message for data source filtering to mention inversion
> - Link to v4: https://lore.kernel.org/r/20250721-james-perf-feat_spe_eft-v4-0-0a527410f8fd@linaro.org
> 
> Changes in v4:
> - Rewrite "const u64 feat_spe_eft_bits" inline
> - Invert data source filter so that it's possible to exclude all data
>   sources without adding an additional 'enable filter' flag
> - Add a macro in el2_setup.h to check for an SPE version
> - Probe valid filter bits instead of hardcoding them
> - Take in Leo's commit to expose the filter bits as it depends on the
>   new filter probing
> - Link to v3: https://lore.kernel.org/r/20250605-james-perf-feat_spe_eft-v3-0-71b0c9f98093@linaro.org
> 
> Changes in v3:
> - Use PMSIDR_EL1_FDS instead of 1 << PMSIDR_EL1_FDS_SHIFT
> - Add VNCR offsets
> - Link to v2: https://lore.kernel.org/r/20250529-james-perf-feat_spe_eft-v2-0-a01a9baad06a@linaro.org
> 
> Changes in v2:
> - Fix detection of FEAT_SPE_FDS in el2_setup.h
> - Pickup Marc Z's sysreg change instead which matches the json
> - Restructure and expand docs changes
> - Link to v1: https://lore.kernel.org/r/20250506-james-perf-feat_spe_eft-v1-0-dd480e8e4851@linaro.org
> 
> ---
> James Clark (5):
>       perf: Add perf_event_attr::config4
>       perf: arm_spe: Add support for filtering on data source
>       tools headers UAPI: Sync linux/perf_event.h with the kernel sources
>       perf tools: Add support for perf_event_attr::config4
>       perf docs: arm-spe: Document new SPE filtering features
> 
>  drivers/perf/arm_spe_pmu.c                |  37 +++++++++++
>  include/uapi/linux/perf_event.h           |   2 +
>  tools/include/uapi/linux/perf_event.h     |   2 +
>  tools/perf/Documentation/perf-arm-spe.txt | 104 +++++++++++++++++++++++++++---
>  tools/perf/tests/parse-events.c           |  13 +++-
>  tools/perf/util/parse-events.c            |  11 ++++
>  tools/perf/util/parse-events.h            |   1 +
>  tools/perf/util/parse-events.l            |   1 +
>  tools/perf/util/pmu.c                     |   8 +++
>  tools/perf/util/pmu.h                     |   1 +
>  10 files changed, 170 insertions(+), 10 deletions(-)
> ---
> base-commit: 081006b7c8e19406dc6674c6b6d086764d415b5c
> change-id: 20250312-james-perf-feat_spe_eft-66cdf4d8fe99
> 
> Best regards,
> -- 
> James Clark <james.clark@linaro.org>
> 


* Re: [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features
  2025-11-20  1:54 ` [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features Namhyung Kim
@ 2025-11-20  9:19   ` James Clark
  0 siblings, 0 replies; 10+ messages in thread
From: James Clark @ 2025-11-20  9:19 UTC (permalink / raw)
  To: Namhyung Kim, Will Deacon
  Cc: Catalin Marinas, Mark Rutland, Jonathan Corbet, Marc Zyngier,
	Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Leo Yan,
	Anshuman Khandual, linux-arm-kernel, linux-kernel,
	linux-perf-users, linux-doc, kvmarm



On 20/11/2025 1:54 am, Namhyung Kim wrote:
> Hello,
> 
> On Tue, Nov 11, 2025 at 11:37:54AM +0000, James Clark wrote:
>> Support SPE_FEAT_FDS data source filtering.
> 
> What's the state of this series?  I can merge the tools part (3, 4, 5)
> once the kernel part lands somewhere.
> 
> Thanks,
> Namhyung
> 

The SPE driver part was blocked on Peter's ack for the config4 change. 
He's given it now so Will should be able to take the driver.

Thanks
James




* Re: [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features
  2025-11-11 11:37 [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features James Clark
                   ` (5 preceding siblings ...)
  2025-11-20  1:54 ` [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features Namhyung Kim
@ 2025-11-24 19:18 ` Will Deacon
  2025-11-25 18:19 ` Namhyung Kim
  7 siblings, 0 replies; 10+ messages in thread
From: Will Deacon @ 2025-11-24 19:18 UTC (permalink / raw)
  To: Catalin Marinas, Mark Rutland, Jonathan Corbet, Marc Zyngier,
	Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Leo Yan, Anshuman Khandual, James Clark
  Cc: kernel-team, Will Deacon, linux-arm-kernel, linux-kernel,
	linux-perf-users, linux-doc, kvmarm

On Tue, 11 Nov 2025 11:37:54 +0000, James Clark wrote:
> Support SPE_FEAT_FDS data source filtering.
> 

Applied kernel changes to will (for-next/perf), thanks!

[1/5] perf: Add perf_event_attr::config4
      https://git.kernel.org/will/c/cbbfba4847b8
[2/5] perf: arm_spe: Add support for filtering on data source
      https://git.kernel.org/will/c/e6a27290d800

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


* Re: [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features
  2025-11-11 11:37 [PATCH v10 0/5] perf: arm_spe: Armv8.8 SPE features James Clark
                   ` (6 preceding siblings ...)
  2025-11-24 19:18 ` Will Deacon
@ 2025-11-25 18:19 ` Namhyung Kim
  7 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2025-11-25 18:19 UTC (permalink / raw)
  To: James Clark
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Jonathan Corbet,
	Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Leo Yan,
	Anshuman Khandual, linux-arm-kernel, linux-kernel,
	linux-perf-users, linux-doc, kvmarm

On Tue, Nov 11, 2025 at 11:37:54AM +0000, James Clark wrote:
> Support SPE_FEAT_FDS data source filtering.
> 
> ---
> James Clark (5):
>       perf: Add perf_event_attr::config4
>       perf: arm_spe: Add support for filtering on data source
>       tools headers UAPI: Sync linux/perf_event.h with the kernel sources
>       perf tools: Add support for perf_event_attr::config4
>       perf docs: arm-spe: Document new SPE filtering features

Applied the tools part to perf-tools-next, thanks!

Best regards,
Namhyung


