linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/8] perf auxtrace: Support multiple AUX events
@ 2024-08-23 11:32 Leo Yan
  2024-08-23 11:32 ` [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module Leo Yan
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Leo Yan @ 2024-08-23 11:32 UTC (permalink / raw)
  To: Peter Zijlstra, Adrian Hunter, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight
  Cc: Leo Yan

This series is to support multiple events with the *same* type in Perf
AUX trace. As the events are same type, the trace data shares the same
format and can be decoded by the same decoder.

Note, a more complex case - different types of AUX events, (e.g. Arm
CoreSight event and Arm SPE events are enabled simultaneously) - is
still not supported.

Patch 01 is to change the perf core layer in the kernel to allow
multiple AUX events to output to single FD, so that '--per-thread' mode
can be supported.

Patches 02 and 03 are refactoring with using evsel__is_aux_event().

Patch 04 validates the AUX events' CPU map. It presumes the AUX events
do not overlap for CPU maps, otherwise, it returns failure.

Patches 05, 06, 07 are to support multiple AUX events for buffer mapped
index.

Patch 08 is to support multiple Arm SPE events during report phase.

The patch series has been tested with multiple Arm SPE events (arm_spe_0
and arm_spe_1). And verified this patch series on Intel-PT test for no
regression.

Changes from v5:
- Changed to comparing string "arm_spe" in patch 08, as "arm_spe"
  is a general even name and passed Arm SPE test.

Changes from v4:
- Used 'evsel->name' to replace 'evsel->pmu_name' in patch 08.

Changes from v3:
- Changed to compare .setup_aux() callback in patch 01 (Peter.Z).
- Picked up patches 02 / 03 with evsel__is_aux_event() refactoring.
- Added new patch 08.

Changes from v2:
- Added patch 01 for support per-thread mode (Adrian).
- Added patch 02 for verifying CPU maps without overlapping (Adrian).
- Reworked patches to fix the regression on Intel-PT (Ian).

Changes from v1:
- Added comment in patch 01 for iterating AUX events (Adrian)
- Added patch 02 for removing unused field 'pmu' (Adrian)


Leo Yan (8):
  perf/core: Allow multiple AUX PMU events with the same module
  perf auxtrace: Use evsel__is_aux_event() for checking AUX event
  perf auxtrace: Remove unused 'pmu' pointer from struct auxtrace_record
  perf auxtrace: Introduce auxtrace_record__validate_events()
  perf auxtrace: Refactor evlist__enable_event_idx()
  perf auxtrace: Bails out after finding the event for the map index
  perf auxtrace: Iterate all AUX events when finish reading
  perf arm-spe: Support multiple events in arm_spe_evsel_is_auxtrace()

 kernel/events/core.c                  |   9 +-
 tools/perf/arch/arm/util/cs-etm.c     |   1 -
 tools/perf/arch/arm64/util/arm-spe.c  |   1 -
 tools/perf/arch/arm64/util/hisi-ptt.c |   1 -
 tools/perf/arch/x86/util/intel-bts.c  |   1 -
 tools/perf/arch/x86/util/intel-pt.c   |   1 -
 tools/perf/builtin-record.c           |   4 +
 tools/perf/util/arm-spe.c             |  11 +--
 tools/perf/util/auxtrace.c            | 114 +++++++++++++++++++++++---
 tools/perf/util/auxtrace.h            |   8 +-
 10 files changed, 125 insertions(+), 26 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module
  2024-08-23 11:32 [PATCH v6 0/8] perf auxtrace: Support multiple AUX events Leo Yan
@ 2024-08-23 11:32 ` Leo Yan
  2024-08-23 11:40   ` Leo Yan
  2024-09-03 10:06   ` Adrian Hunter
  2024-08-23 11:33 ` [PATCH v6 2/8] perf auxtrace: Use evsel__is_aux_event() for checking AUX event Leo Yan
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 18+ messages in thread
From: Leo Yan @ 2024-08-23 11:32 UTC (permalink / raw)
  To: Peter Zijlstra, Adrian Hunter, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight
  Cc: Leo Yan

This commit changes the condition from checking the same PMU instance to
checking the same .setup_aux() callback pointer. If PMU events have the
same callback pointer, it means they share the same PMU driver module.
This allows support for multiple PMU events with the same driver module.

As a result, more than one AUX event (e.g. arm_spe_0 and arm_spe_1)
can record trace into the AUX ring buffer.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 kernel/events/core.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index c973e3c11e03..883c457911a3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -12345,9 +12345,16 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 
 	/*
 	 * If both events generate aux data, they must be on the same PMU
+	 * module but can be with different PMU instances.
+	 *
+	 * For a built-in PMU module, the 'pmu->module' pointer is NULL,
+	 * thus it is not feasible to compare the module pointers when
+	 * AUX PMU drivers are built into the kernel image. Instead,
+	 * comparing the .setup_aux() callback pointer can determine if
+	 * the two PMU events come from the same PMU driver.
 	 */
 	if (has_aux(event) && has_aux(output_event) &&
-	    event->pmu != output_event->pmu)
+	    event->pmu->setup_aux != output_event->pmu->setup_aux)
 		goto out;
 
 	/*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 2/8] perf auxtrace: Use evsel__is_aux_event() for checking AUX event
  2024-08-23 11:32 [PATCH v6 0/8] perf auxtrace: Support multiple AUX events Leo Yan
  2024-08-23 11:32 ` [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module Leo Yan
@ 2024-08-23 11:33 ` Leo Yan
  2024-08-23 11:33 ` [PATCH v6 3/8] perf auxtrace: Remove unused 'pmu' pointer from struct auxtrace_record Leo Yan
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2024-08-23 11:33 UTC (permalink / raw)
  To: Peter Zijlstra, Adrian Hunter, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight
  Cc: Leo Yan

Use evsel__is_aux_event() to decide if an event is a AUX event, this is
a refactoring to replace comparing the PMU type.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/auxtrace.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index cbb773ed6f1a..ca8682966fae 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -671,11 +671,11 @@ int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
 {
 	struct evsel *evsel;
 
-	if (!itr->evlist || !itr->pmu)
+	if (!itr->evlist)
 		return -EINVAL;
 
 	evlist__for_each_entry(itr->evlist, evsel) {
-		if (evsel->core.attr.type == itr->pmu->type) {
+		if (evsel__is_aux_event(evsel)) {
 			if (evsel->disabled)
 				return 0;
 			return evlist__enable_event_idx(itr->evlist, evsel, idx);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 3/8] perf auxtrace: Remove unused 'pmu' pointer from struct auxtrace_record
  2024-08-23 11:32 [PATCH v6 0/8] perf auxtrace: Support multiple AUX events Leo Yan
  2024-08-23 11:32 ` [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module Leo Yan
  2024-08-23 11:33 ` [PATCH v6 2/8] perf auxtrace: Use evsel__is_aux_event() for checking AUX event Leo Yan
@ 2024-08-23 11:33 ` Leo Yan
  2024-08-23 11:33 ` [PATCH v6 4/8] perf auxtrace: Introduce auxtrace_record__validate_events() Leo Yan
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2024-08-23 11:33 UTC (permalink / raw)
  To: Peter Zijlstra, Adrian Hunter, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight
  Cc: Leo Yan

The 'pmu' pointer in the auxtrace_record structure is not used after
support multiple AUX events, remove it.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/arch/arm/util/cs-etm.c     | 1 -
 tools/perf/arch/arm64/util/arm-spe.c  | 1 -
 tools/perf/arch/arm64/util/hisi-ptt.c | 1 -
 tools/perf/arch/x86/util/intel-bts.c  | 1 -
 tools/perf/arch/x86/util/intel-pt.c   | 1 -
 tools/perf/util/auxtrace.h            | 1 -
 6 files changed, 6 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index da6231367993..96aeb7cdbee1 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -888,7 +888,6 @@ struct auxtrace_record *cs_etm_record_init(int *err)
 	}
 
 	ptr->cs_etm_pmu			= cs_etm_pmu;
-	ptr->itr.pmu			= cs_etm_pmu;
 	ptr->itr.parse_snapshot_options	= cs_etm_parse_snapshot_options;
 	ptr->itr.recording_options	= cs_etm_recording_options;
 	ptr->itr.info_priv_size		= cs_etm_info_priv_size;
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index d59f6ca499f2..2be99fdf997d 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -514,7 +514,6 @@ struct auxtrace_record *arm_spe_recording_init(int *err,
 	}
 
 	sper->arm_spe_pmu = arm_spe_pmu;
-	sper->itr.pmu = arm_spe_pmu;
 	sper->itr.snapshot_start = arm_spe_snapshot_start;
 	sper->itr.snapshot_finish = arm_spe_snapshot_finish;
 	sper->itr.find_snapshot = arm_spe_find_snapshot;
diff --git a/tools/perf/arch/arm64/util/hisi-ptt.c b/tools/perf/arch/arm64/util/hisi-ptt.c
index ba97c8a562a0..eac9739c87e6 100644
--- a/tools/perf/arch/arm64/util/hisi-ptt.c
+++ b/tools/perf/arch/arm64/util/hisi-ptt.c
@@ -174,7 +174,6 @@ struct auxtrace_record *hisi_ptt_recording_init(int *err,
 	}
 
 	pttr->hisi_ptt_pmu = hisi_ptt_pmu;
-	pttr->itr.pmu = hisi_ptt_pmu;
 	pttr->itr.recording_options = hisi_ptt_recording_options;
 	pttr->itr.info_priv_size = hisi_ptt_info_priv_size;
 	pttr->itr.info_fill = hisi_ptt_info_fill;
diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c
index 34696f3d3d5d..85c8186300c8 100644
--- a/tools/perf/arch/x86/util/intel-bts.c
+++ b/tools/perf/arch/x86/util/intel-bts.c
@@ -434,7 +434,6 @@ struct auxtrace_record *intel_bts_recording_init(int *err)
 	}
 
 	btsr->intel_bts_pmu = intel_bts_pmu;
-	btsr->itr.pmu = intel_bts_pmu;
 	btsr->itr.recording_options = intel_bts_recording_options;
 	btsr->itr.info_priv_size = intel_bts_info_priv_size;
 	btsr->itr.info_fill = intel_bts_info_fill;
diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 4b710e875953..ea510a7486b1 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -1197,7 +1197,6 @@ struct auxtrace_record *intel_pt_recording_init(int *err)
 	}
 
 	ptr->intel_pt_pmu = intel_pt_pmu;
-	ptr->itr.pmu = intel_pt_pmu;
 	ptr->itr.recording_options = intel_pt_recording_options;
 	ptr->itr.info_priv_size = intel_pt_info_priv_size;
 	ptr->itr.info_fill = intel_pt_info_fill;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index d405efcd8708..a1895a4f530b 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -411,7 +411,6 @@ struct auxtrace_record {
 	int (*read_finish)(struct auxtrace_record *itr, int idx);
 	unsigned int alignment;
 	unsigned int default_aux_sample_size;
-	struct perf_pmu *pmu;
 	struct evlist *evlist;
 };
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 4/8] perf auxtrace: Introduce auxtrace_record__validate_events()
  2024-08-23 11:32 [PATCH v6 0/8] perf auxtrace: Support multiple AUX events Leo Yan
                   ` (2 preceding siblings ...)
  2024-08-23 11:33 ` [PATCH v6 3/8] perf auxtrace: Remove unused 'pmu' pointer from struct auxtrace_record Leo Yan
@ 2024-08-23 11:33 ` Leo Yan
  2024-09-03 15:26   ` Adrian Hunter
  2024-08-23 11:33 ` [PATCH v6 5/8] perf auxtrace: Refactor evlist__enable_event_idx() Leo Yan
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Leo Yan @ 2024-08-23 11:33 UTC (permalink / raw)
  To: Peter Zijlstra, Adrian Hunter, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight
  Cc: Leo Yan

A prerequisite for multiple AUX events is that the AUX events cannot
overlap CPU maps. The reason is that every CPU has only one AUX trace
buffer and maps it to an unique buffer index for CPU and system tracing
mode.

To prevent the case of CPU maps overlapping occurring within multiple
AUX events, the auxtrace_record__validate_events() function is
introduced. It iterates through all AUX events and returns failure if
it detects CPU maps overlapping.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/builtin-record.c |  4 +++
 tools/perf/util/auxtrace.c  | 64 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/auxtrace.h  |  7 ++++
 3 files changed, 75 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index adbaf80b398c..2c618efba97d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -862,6 +862,10 @@ static int record__auxtrace_init(struct record *rec)
 
 	auxtrace_regroup_aux_output(rec->evlist);
 
+	err = auxtrace_validate_events(rec->evlist);
+	if (err)
+		return err;
+
 	return auxtrace_parse_filters(rec->evlist);
 }
 
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index ca8682966fae..87e4f21b6edf 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -2828,6 +2828,70 @@ int auxtrace_parse_filters(struct evlist *evlist)
 	return 0;
 }
 
+int auxtrace_validate_events(struct evlist *evlist)
+{
+	struct evsel *evsel;
+	struct perf_cpu_map *cpu_map = NULL;
+	struct perf_cpu_map *cpu_map_intersect = NULL;
+	struct perf_cpu_map *cpu_map_merged = NULL;
+	int ret = 0;
+
+	if (!evlist)
+		return 0;
+
+	/*
+	 * Currently the tool only supports multiple AUX events without
+	 * overlapping CPU maps and every CPU has its unique AUX buffer
+	 * for CPU or system mode tracing.
+	 *
+	 * Returns failure if detects CPU maps overlapping.
+	 */
+	evlist__for_each_entry(evlist, evsel) {
+		if (!evsel__is_aux_event(evsel))
+			continue;
+
+		if (perf_cpu_map__is_empty(evsel->pmu->cpus))
+			continue;
+
+		cpu_map_intersect = perf_cpu_map__intersect(cpu_map, evsel->pmu->cpus);
+		if (cpu_map_intersect) {
+			perf_cpu_map__put(cpu_map_intersect);
+			pr_err("Doesn't support AUX events with overlapping CPU masks\n");
+			ret = -EINVAL;
+			break;
+		}
+		perf_cpu_map__put(cpu_map_intersect);
+
+		cpu_map_merged = perf_cpu_map__merge(cpu_map, evsel->pmu->cpus);
+		if (!cpu_map_merged) {
+			ret = -ENOMEM;
+			break;
+		}
+
+		/* Update the CPU maps after merging */
+		perf_cpu_map__put(cpu_map);
+		cpu_map = cpu_map_merged;
+	}
+
+	if (!ret)
+		goto out;
+
+	/* If fails, dump CPU maps for debugging */
+	evlist__for_each_entry(evlist, evsel) {
+		char buf[200];
+
+		if (!evsel__is_aux_event(evsel))
+			continue;
+
+		cpu_map__snprint(evsel->pmu->cpus, buf, sizeof(buf));
+		pr_debug("AUX event [%s]'s cpu map is: %s\n", evsel->pmu->name, buf);
+	}
+
+out:
+	perf_cpu_map__put(cpu_map);
+	return ret;
+}
+
 int auxtrace__process_event(struct perf_session *session, union perf_event *event,
 			    struct perf_sample *sample, const struct perf_tool *tool)
 {
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index a1895a4f530b..67a74ad0c383 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -636,6 +636,7 @@ void addr_filters__exit(struct addr_filters *filts);
 int addr_filters__parse_bare_filter(struct addr_filters *filts,
 				    const char *filter);
 int auxtrace_parse_filters(struct evlist *evlist);
+int auxtrace_validate_events(struct evlist *evlist);
 
 int auxtrace__process_event(struct perf_session *session, union perf_event *event,
 			    struct perf_sample *sample, const struct perf_tool *tool);
@@ -875,6 +876,12 @@ int auxtrace_parse_filters(struct evlist *evlist __maybe_unused)
 	return 0;
 }
 
+static inline
+int auxtrace_validate_events(struct evlist *evlist __maybe_unused)
+{
+	return 0;
+}
+
 int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
 			struct auxtrace_mmap_params *mp,
 			void *userpg, int fd);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 5/8] perf auxtrace: Refactor evlist__enable_event_idx()
  2024-08-23 11:32 [PATCH v6 0/8] perf auxtrace: Support multiple AUX events Leo Yan
                   ` (3 preceding siblings ...)
  2024-08-23 11:33 ` [PATCH v6 4/8] perf auxtrace: Introduce auxtrace_record__validate_events() Leo Yan
@ 2024-08-23 11:33 ` Leo Yan
  2024-09-03 18:39   ` Adrian Hunter
  2024-08-23 11:33 ` [PATCH v6 6/8] perf auxtrace: Bails out after finding the event for the map index Leo Yan
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Leo Yan @ 2024-08-23 11:33 UTC (permalink / raw)
  To: Peter Zijlstra, Adrian Hunter, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight
  Cc: Leo Yan

This commit splits the evlist__enable_event_idx() function into two
steps. The first step uses a new function evlist__find_cpu_map_idx() to
find the CPU map index, based on the found CPU map index or a thread map
index, it continues to call evlist__enable_event_idx() for enabling the
corresponding event.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/auxtrace.c | 42 +++++++++++++++++++++++++++++---------
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 87e4f21b6edf..e7b582d92811 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -651,20 +651,30 @@ int auxtrace_parse_snapshot_options(struct auxtrace_record *itr,
 	return -EINVAL;
 }
 
-static int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel, int idx)
+static int evlist__find_cpu_map_idx(struct evlist *evlist, struct evsel *evsel,
+				    int idx)
 {
 	bool per_cpu_mmaps = !perf_cpu_map__has_any_cpu(evlist->core.user_requested_cpus);
+	struct perf_cpu evlist_cpu;
+	int cpu_map_idx;
 
-	if (per_cpu_mmaps) {
-		struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx);
-		int cpu_map_idx = perf_cpu_map__idx(evsel->core.cpus, evlist_cpu);
+	if (!per_cpu_mmaps)
+		return -EINVAL;
 
-		if (cpu_map_idx == -1)
-			return -EINVAL;
-		return perf_evsel__enable_cpu(&evsel->core, cpu_map_idx);
-	}
+	evlist_cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx);
+	cpu_map_idx = perf_cpu_map__idx(evsel->core.cpus, evlist_cpu);
+	if (cpu_map_idx == -1)
+		return -ENOENT;
+
+	return cpu_map_idx;
+}
 
-	return perf_evsel__enable_thread(&evsel->core, idx);
+static int evlist__enable_event_idx(struct evsel *evsel, int cpu_mode, int idx)
+{
+	if (cpu_mode)
+		return perf_evsel__enable_cpu(&evsel->core, idx);
+	else
+		return perf_evsel__enable_thread(&evsel->core, idx);
 }
 
 int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
@@ -676,9 +686,21 @@ int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
 
 	evlist__for_each_entry(itr->evlist, evsel) {
 		if (evsel__is_aux_event(evsel)) {
+			int cpu_map_idx;
+
 			if (evsel->disabled)
 				return 0;
-			return evlist__enable_event_idx(itr->evlist, evsel, idx);
+
+			cpu_map_idx = evlist__find_cpu_map_idx(itr->evlist,
+							       evsel, idx);
+			/* No map is found in per CPU mmap */
+			if (cpu_map_idx == -ENOENT)
+				return cpu_map_idx;
+
+			if (cpu_map_idx >= 0)
+				return evlist__enable_event_idx(evsel, 1, cpu_map_idx);
+			else
+				return evlist__enable_event_idx(evsel, 0, idx);
 		}
 	}
 	return -EINVAL;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 6/8] perf auxtrace: Bails out after finding the event for the map index
  2024-08-23 11:32 [PATCH v6 0/8] perf auxtrace: Support multiple AUX events Leo Yan
                   ` (4 preceding siblings ...)
  2024-08-23 11:33 ` [PATCH v6 5/8] perf auxtrace: Refactor evlist__enable_event_idx() Leo Yan
@ 2024-08-23 11:33 ` Leo Yan
  2024-09-03 18:41   ` Adrian Hunter
  2024-08-23 11:33 ` [PATCH v6 7/8] perf auxtrace: Iterate all AUX events when finish reading Leo Yan
  2024-08-23 11:33 ` [PATCH v6 8/8] perf arm-spe: Support multiple events in arm_spe_evsel_is_auxtrace() Leo Yan
  7 siblings, 1 reply; 18+ messages in thread
From: Leo Yan @ 2024-08-23 11:33 UTC (permalink / raw)
  To: Peter Zijlstra, Adrian Hunter, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight
  Cc: Leo Yan

After finding the corresponding event for the passed buffer index, it is
safe to say the found event has been used. Then, the tool can check the
event status and bails out if it has been disabled.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/auxtrace.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index e7b582d92811..2acf63efab1d 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -688,15 +688,15 @@ int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
 		if (evsel__is_aux_event(evsel)) {
 			int cpu_map_idx;
 
-			if (evsel->disabled)
-				return 0;
-
 			cpu_map_idx = evlist__find_cpu_map_idx(itr->evlist,
 							       evsel, idx);
 			/* No map is found in per CPU mmap */
 			if (cpu_map_idx == -ENOENT)
 				return cpu_map_idx;
 
+			if (evsel->disabled)
+				return 0;
+
 			if (cpu_map_idx >= 0)
 				return evlist__enable_event_idx(evsel, 1, cpu_map_idx);
 			else
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 7/8] perf auxtrace: Iterate all AUX events when finish reading
  2024-08-23 11:32 [PATCH v6 0/8] perf auxtrace: Support multiple AUX events Leo Yan
                   ` (5 preceding siblings ...)
  2024-08-23 11:33 ` [PATCH v6 6/8] perf auxtrace: Bails out after finding the event for the map index Leo Yan
@ 2024-08-23 11:33 ` Leo Yan
  2024-08-23 11:33 ` [PATCH v6 8/8] perf arm-spe: Support multiple events in arm_spe_evsel_is_auxtrace() Leo Yan
  7 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2024-08-23 11:33 UTC (permalink / raw)
  To: Peter Zijlstra, Adrian Hunter, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight
  Cc: Leo Yan

When finished to read AUX trace data from mmaped buffer, based on the
AUX buffer index the core layer needs to search the corresponding PMU
event and re-enable it to continue tracing.

However, current code only searches the first AUX event. It misses to
search other enabled AUX events, thus, it returns failure if the buffer
index does not belong to the first AUX event.

This patch changes to continue searching every enabled AUX events for
covering the mmaped buffer indexes.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/auxtrace.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 2acf63efab1d..864ed20794ab 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -690,9 +690,13 @@ int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
 
 			cpu_map_idx = evlist__find_cpu_map_idx(itr->evlist,
 							       evsel, idx);
-			/* No map is found in per CPU mmap */
+			/*
+			 * No map is found in per CPU mmap. Multiple AUX events
+			 * might be opened in a session, continue to check if
+			 * the next AUX event can cover the mmaped buffer 'idx'.
+			 */
 			if (cpu_map_idx == -ENOENT)
-				return cpu_map_idx;
+				continue;
 
 			if (evsel->disabled)
 				return 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 8/8] perf arm-spe: Support multiple events in arm_spe_evsel_is_auxtrace()
  2024-08-23 11:32 [PATCH v6 0/8] perf auxtrace: Support multiple AUX events Leo Yan
                   ` (6 preceding siblings ...)
  2024-08-23 11:33 ` [PATCH v6 7/8] perf auxtrace: Iterate all AUX events when finish reading Leo Yan
@ 2024-08-23 11:33 ` Leo Yan
  7 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2024-08-23 11:33 UTC (permalink / raw)
  To: Peter Zijlstra, Adrian Hunter, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight
  Cc: Leo Yan

The 'pmu_type' field is an unique value and cannot support multiple PMU
events.

The arm_spe_evsel_is_auxtrace() function changes to compare PMU name to
decide if it is a Arm SPE event. This leads to the 'pmu_type' field is
no longer used, remove it.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/arm-spe.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 138ffc71b32d..27724711e763 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -11,6 +11,7 @@
 #include <linux/bitops.h>
 #include <linux/kernel.h>
 #include <linux/log2.h>
+#include <linux/string.h>
 #include <linux/types.h>
 #include <linux/zalloc.h>
 #include <stdlib.h>
@@ -45,7 +46,6 @@ struct arm_spe {
 	u32				auxtrace_type;
 	struct perf_session		*session;
 	struct machine			*machine;
-	u32				pmu_type;
 	u64				midr;
 
 	struct perf_tsc_conversion	tc;
@@ -1053,12 +1053,10 @@ static void arm_spe_free(struct perf_session *session)
 	free(spe);
 }
 
-static bool arm_spe_evsel_is_auxtrace(struct perf_session *session,
+static bool arm_spe_evsel_is_auxtrace(struct perf_session *session __maybe_unused,
 				      struct evsel *evsel)
 {
-	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe, auxtrace);
-
-	return evsel->core.attr.type == spe->pmu_type;
+	return strstarts(evsel->name, "arm_spe");
 }
 
 static const char * const arm_spe_info_fmts[] = {
@@ -1099,7 +1097,7 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
 	int err;
 
 	evlist__for_each_entry(evlist, evsel) {
-		if (evsel->core.attr.type == spe->pmu_type) {
+		if (arm_spe_evsel_is_auxtrace(session, evsel)) {
 			found = true;
 			break;
 		}
@@ -1284,7 +1282,6 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	spe->session = session;
 	spe->machine = &session->machines.host; /* No kvm support */
 	spe->auxtrace_type = auxtrace_info->type;
-	spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
 	spe->midr = midr;
 
 	spe->timeless_decoding = arm_spe__is_timeless_decoding(spe);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module
  2024-08-23 11:32 ` [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module Leo Yan
@ 2024-08-23 11:40   ` Leo Yan
  2024-09-03 10:06   ` Adrian Hunter
  1 sibling, 0 replies; 18+ messages in thread
From: Leo Yan @ 2024-08-23 11:40 UTC (permalink / raw)
  To: Peter Zijlstra, Adrian Hunter, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight

Hi Peter, Adrian,

On 8/23/2024 12:32 PM, Leo Yan wrote:
> 
> This commit changes the condition from checking the same PMU instance to
> checking the same .setup_aux() callback pointer. If PMU events have the
> same callback pointer, it means they share the same PMU driver module.
> This allows support for multiple PMU events with the same driver module.
> 
> As a result, more than one AUX event (e.g. arm_spe_0 and arm_spe_1)
> can record trace into the AUX ring buffer.

This patch is the only change in the kernel, so it is crucial for this
series. Can I get your opinion? Thanks a lot!

Leo 
 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  kernel/events/core.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index c973e3c11e03..883c457911a3 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -12345,9 +12345,16 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
> 
>         /*
>          * If both events generate aux data, they must be on the same PMU
> +        * module but can be with different PMU instances.
> +        *
> +        * For a built-in PMU module, the 'pmu->module' pointer is NULL,
> +        * thus it is not feasible to compare the module pointers when
> +        * AUX PMU drivers are built into the kernel image. Instead,
> +        * comparing the .setup_aux() callback pointer can determine if
> +        * the two PMU events come from the same PMU driver.
>          */
>         if (has_aux(event) && has_aux(output_event) &&
> -           event->pmu != output_event->pmu)
> +           event->pmu->setup_aux != output_event->pmu->setup_aux)
>                 goto out;
> 
>         /*
> --
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module
  2024-08-23 11:32 ` [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module Leo Yan
  2024-08-23 11:40   ` Leo Yan
@ 2024-09-03 10:06   ` Adrian Hunter
  2024-09-04 19:35     ` Leo Yan
  1 sibling, 1 reply; 18+ messages in thread
From: Adrian Hunter @ 2024-09-03 10:06 UTC (permalink / raw)
  To: Leo Yan, Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
	Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan, Suzuki K Poulose,
	Mike Leach, James Clark, John Garry, Will Deacon, Yicong Yang,
	Jonathan Cameron, linux-perf-users, linux-kernel,
	linux-arm-kernel, coresight

On 23/08/24 14:32, Leo Yan wrote:
> This commit changes the condition from checking the same PMU instance to
> checking the same .setup_aux() callback pointer. If PMU events have the
> same callback pointer, it means they share the same PMU driver module.
> This allows support for multiple PMU events with the same driver module.
> 
> As a result, more than one AUX event (e.g. arm_spe_0 and arm_spe_1)
> can record trace into the AUX ring buffer.
> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  kernel/events/core.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index c973e3c11e03..883c457911a3 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -12345,9 +12345,16 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
>  
>  	/*
>  	 * If both events generate aux data, they must be on the same PMU
> +	 * module but can be with different PMU instances.
> +	 *
> +	 * For a built-in PMU module, the 'pmu->module' pointer is NULL,
> +	 * thus it is not feasible to compare the module pointers when
> +	 * AUX PMU drivers are built into the kernel image. Instead,
> +	 * comparing the .setup_aux() callback pointer can determine if
> +	 * the two PMU events come from the same PMU driver.
>  	 */
>  	if (has_aux(event) && has_aux(output_event) &&
> -	    event->pmu != output_event->pmu)
> +	    event->pmu->setup_aux != output_event->pmu->setup_aux)

It is not very flexible and risks someone adding aux PMUs that
do not want that rule but accidentally support it.  Another
option is to add a PMU callback, but really you need to Peter's
feedback.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 4/8] perf auxtrace: Introduce auxtrace_record__validate_events()
  2024-08-23 11:33 ` [PATCH v6 4/8] perf auxtrace: Introduce auxtrace_record__validate_events() Leo Yan
@ 2024-09-03 15:26   ` Adrian Hunter
  2024-09-04 21:13     ` Leo Yan
  0 siblings, 1 reply; 18+ messages in thread
From: Adrian Hunter @ 2024-09-03 15:26 UTC (permalink / raw)
  To: Leo Yan, Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
	Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan, Suzuki K Poulose,
	Mike Leach, James Clark, John Garry, Will Deacon, Yicong Yang,
	Jonathan Cameron, linux-perf-users, linux-kernel,
	linux-arm-kernel, coresight

On 23/08/24 14:33, Leo Yan wrote:
> A prerequisite for multiple AUX events is that the AUX events cannot
> overlap CPU maps. The reason is that every CPU has only one AUX trace
> buffer and maps it to an unique buffer index for CPU and system tracing
> mode.
> 
> To prevent the case of CPU maps overlapping occurring within multiple
> AUX events, the auxtrace_record__validate_events() function is
> introduced. It iterates through all AUX events and returns failure if
> it detects CPU maps overlapping.
> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  tools/perf/builtin-record.c |  4 +++
>  tools/perf/util/auxtrace.c  | 64 +++++++++++++++++++++++++++++++++++++
>  tools/perf/util/auxtrace.h  |  7 ++++
>  3 files changed, 75 insertions(+)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index adbaf80b398c..2c618efba97d 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -862,6 +862,10 @@ static int record__auxtrace_init(struct record *rec)
>  
>  	auxtrace_regroup_aux_output(rec->evlist);
>  
> +	err = auxtrace_validate_events(rec->evlist);
> +	if (err)
> +		return err;
> +
>  	return auxtrace_parse_filters(rec->evlist);
>  }
>  
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index ca8682966fae..87e4f21b6edf 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -2828,6 +2828,70 @@ int auxtrace_parse_filters(struct evlist *evlist)
>  	return 0;
>  }
>  
> +int auxtrace_validate_events(struct evlist *evlist)

'auxtrace_validate_aux_events' would better indicate that it is
looking only at AUX area events.

> +{
> +	struct evsel *evsel;
> +	struct perf_cpu_map *cpu_map = NULL;
> +	struct perf_cpu_map *cpu_map_intersect = NULL;
> +	struct perf_cpu_map *cpu_map_merged = NULL;
> +	int ret = 0;
> +
> +	if (!evlist)
> +		return 0;

Elsewhere we assume it is not NULL, might as well here too.

> +
> +	/*
> +	 * Currently the tool only supports multiple AUX events without
> +	 * overlapping CPU maps and every CPU has its unique AUX buffer
> +	 * for CPU or system mode tracing.
> +	 *
> +	 * Returns failure if detects CPU maps overlapping.
> +	 */
> +	evlist__for_each_entry(evlist, evsel) {
> +		if (!evsel__is_aux_event(evsel))
> +			continue;
> +
> +		if (perf_cpu_map__is_empty(evsel->pmu->cpus))
> +			continue;

Unless perf_cpu_map__intersect() is broken, the empty check
should not be needed.

Shouldn't we be looking at evsel->cpus ?

Possibly need to consider the perf_cpu_map__has_any_cpu() case?
e.g.
		if (cpu_map && (perf_cpu_map__has_any_cpu(evsel->cpus) || 
				perf_cpu_map__has_any_cpu(cpu_map)) {
			ret = -EINVAL;
			break;
		}

> +
> +		cpu_map_intersect = perf_cpu_map__intersect(cpu_map, evsel->pmu->cpus);
> +		if (cpu_map_intersect) {
> +			perf_cpu_map__put(cpu_map_intersect);
> +			pr_err("Doesn't support AUX events with overlapping CPU masks\n");
> +			ret = -EINVAL;
> +			break;
> +		}
> +		perf_cpu_map__put(cpu_map_intersect);

Maybe add a helper:

static bool perf_cpu_map__do_maps_intersect(struct perf_cpu_map *a, struct perf_cpu_map *b)
{
	struct perf_cpu_map *intersection = perf_cpu_map__intersect(a, b);
	bool ret = !perf_cpu_map__is_empty(intersection);

	perf_cpu_map__put(intersection);

	return ret;
}

> +
> +		cpu_map_merged = perf_cpu_map__merge(cpu_map, evsel->pmu->cpus);
> +		if (!cpu_map_merged) {
> +			ret = -ENOMEM;
> +			break;
> +		}
> +
> +		/* Update the CPU maps after merging */
> +		perf_cpu_map__put(cpu_map);
> +		cpu_map = cpu_map_merged;

perf_cpu_map__merge() is a bit tricky - see its comments.  This
should probably all just be:

		cpu_map = perf_cpu_map__merge(cpu_map, evsel->pmu->cpus);


> +	}
> +
> +	if (!ret)
> +		goto out;

Could we put the error path last i.e.

	perf_cpu_map__put(cpu_map);

	if (ret)
		goto out_err;

	return 0;

out_err:
> +
> +	/* If fails, dump CPU maps for debugging */
> +	evlist__for_each_entry(evlist, evsel) {
> +		char buf[200];
> +
> +		if (!evsel__is_aux_event(evsel))
> +			continue;
> +
> +		cpu_map__snprint(evsel->pmu->cpus, buf, sizeof(buf));
> +		pr_debug("AUX event [%s]'s cpu map is: %s\n", evsel->pmu->name, buf);

Could probably use cpu_map__fprintf(pmu->cpus, debug_file()) and
not need buf.

> +	}
> +
> +out:
> +	perf_cpu_map__put(cpu_map);
> +	return ret;
> +}
> +
>  int auxtrace__process_event(struct perf_session *session, union perf_event *event,
>  			    struct perf_sample *sample, const struct perf_tool *tool)
>  {
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index a1895a4f530b..67a74ad0c383 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -636,6 +636,7 @@ void addr_filters__exit(struct addr_filters *filts);
>  int addr_filters__parse_bare_filter(struct addr_filters *filts,
>  				    const char *filter);
>  int auxtrace_parse_filters(struct evlist *evlist);
> +int auxtrace_validate_events(struct evlist *evlist);
>  
>  int auxtrace__process_event(struct perf_session *session, union perf_event *event,
>  			    struct perf_sample *sample, const struct perf_tool *tool);
> @@ -875,6 +876,12 @@ int auxtrace_parse_filters(struct evlist *evlist __maybe_unused)
>  	return 0;
>  }
>  
> +static inline
> +int auxtrace_validate_events(struct evlist *evlist __maybe_unused)
> +{
> +	return 0;
> +}
> +
>  int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
>  			struct auxtrace_mmap_params *mp,
>  			void *userpg, int fd);


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 5/8] perf auxtrace: Refactor evlist__enable_event_idx()
  2024-08-23 11:33 ` [PATCH v6 5/8] perf auxtrace: Refactor evlist__enable_event_idx() Leo Yan
@ 2024-09-03 18:39   ` Adrian Hunter
  0 siblings, 0 replies; 18+ messages in thread
From: Adrian Hunter @ 2024-09-03 18:39 UTC (permalink / raw)
  To: Leo Yan, Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
	Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan, Suzuki K Poulose,
	Mike Leach, James Clark, John Garry, Will Deacon, Yicong Yang,
	Jonathan Cameron, linux-perf-users, linux-kernel,
	linux-arm-kernel, coresight

On 23/08/24 14:33, Leo Yan wrote:
> This commit splits the evlist__enable_event_idx() function into two
> steps. The first step uses a new function evlist__find_cpu_map_idx() to
> find the CPU map index, based on the found CPU map index or a thread map
> index, it continues to call evlist__enable_event_idx() for enabling the
> corresponding event.
> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  tools/perf/util/auxtrace.c | 42 +++++++++++++++++++++++++++++---------
>  1 file changed, 32 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 87e4f21b6edf..e7b582d92811 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -651,20 +651,30 @@ int auxtrace_parse_snapshot_options(struct auxtrace_record *itr,
>  	return -EINVAL;
>  }
>  
> -static int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel, int idx)
> +static int evlist__find_cpu_map_idx(struct evlist *evlist, struct evsel *evsel,
> +				    int idx)
>  {
>  	bool per_cpu_mmaps = !perf_cpu_map__has_any_cpu(evlist->core.user_requested_cpus);
> +	struct perf_cpu evlist_cpu;
> +	int cpu_map_idx;
>  
> -	if (per_cpu_mmaps) {
> -		struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx);
> -		int cpu_map_idx = perf_cpu_map__idx(evsel->core.cpus, evlist_cpu);
> +	if (!per_cpu_mmaps)
> +		return -EINVAL;
>  
> -		if (cpu_map_idx == -1)
> -			return -EINVAL;
> -		return perf_evsel__enable_cpu(&evsel->core, cpu_map_idx);
> -	}
> +	evlist_cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx);
> +	cpu_map_idx = perf_cpu_map__idx(evsel->core.cpus, evlist_cpu);
> +	if (cpu_map_idx == -1)
> +		return -ENOENT;
> +
> +	return cpu_map_idx;
> +}
>  
> -	return perf_evsel__enable_thread(&evsel->core, idx);
> +static int evlist__enable_event_idx(struct evsel *evsel, int cpu_mode, int idx)
> +{
> +	if (cpu_mode)
> +		return perf_evsel__enable_cpu(&evsel->core, idx);
> +	else
> +		return perf_evsel__enable_thread(&evsel->core, idx);
>  }
>  
>  int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
> @@ -676,9 +686,21 @@ int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
>  
>  	evlist__for_each_entry(itr->evlist, evsel) {
>  		if (evsel__is_aux_event(evsel)) {
> +			int cpu_map_idx;
> +
>  			if (evsel->disabled)
>  				return 0;
> -			return evlist__enable_event_idx(itr->evlist, evsel, idx);
> +
> +			cpu_map_idx = evlist__find_cpu_map_idx(itr->evlist,
> +							       evsel, idx);
> +			/* No map is found in per CPU mmap */
> +			if (cpu_map_idx == -ENOENT)
> +				return cpu_map_idx;
> +
> +			if (cpu_map_idx >= 0)
> +				return evlist__enable_event_idx(evsel, 1, cpu_map_idx);
> +			else
> +				return evlist__enable_event_idx(evsel, 0, idx);
>  		}
>  	}
>  	return -EINVAL;

What about keeping per_cpu_mmaps in auxtrace_record__read_finish()
e.g.

static int evlist__find_evsel_cpu_idx(struct evlist *evlist, struct evsel *evsel, int idx)
{
	struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx);

	return perf_cpu_map__idx(evsel->core.cpus, evlist_cpu);
}

int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
{
	bool per_cpu_mmaps = !perf_cpu_map__has_any_cpu(evlist->core.user_requested_cpus);
	struct evsel *evsel;
	int evsel_cpu_idx;

	if (!itr->evlist)
		return -EINVAL;

	evlist__for_each_entry(itr->evlist, evsel) {
		if (!evsel__is_aux_event(evsel))
			continue;

		if (per_cpu_mmaps) {
			evsel_cpu_idx = evlist__find_evsel_cpu_idx(itr->evlist, evsel, idx);
			/* No map is found in per CPU mmap */
			if (evsel_cpu_idx < 0)
				return -EINVAL;
		}

		if (evsel->disabled)
			return 0;

		if (per_cpu_mmaps)
			return perf_evsel__enable_cpu(&evsel->core, evsel_cpu_idx);

		return perf_evsel__enable_thread(&evsel->core, idx);
	}
	return -EINVAL;
}


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 6/8] perf auxtrace: Bails out after finding the event for the map index
  2024-08-23 11:33 ` [PATCH v6 6/8] perf auxtrace: Bails out after finding the event for the map index Leo Yan
@ 2024-09-03 18:41   ` Adrian Hunter
  0 siblings, 0 replies; 18+ messages in thread
From: Adrian Hunter @ 2024-09-03 18:41 UTC (permalink / raw)
  To: Leo Yan, Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
	Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan, Suzuki K Poulose,
	Mike Leach, James Clark, John Garry, Will Deacon, Yicong Yang,
	Jonathan Cameron, linux-perf-users, linux-kernel,
	linux-arm-kernel, coresight

On 23/08/24 14:33, Leo Yan wrote:
> After finding the corresponding event for the passed buffer index, it is
> safe to say the found event has been used. Then, the tool can check the
> event status and bails out if it has been disabled.

I don't really understand why this is a separate patch.  Maybe it
should be merged with the next one?

> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  tools/perf/util/auxtrace.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index e7b582d92811..2acf63efab1d 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -688,15 +688,15 @@ int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
>  		if (evsel__is_aux_event(evsel)) {
>  			int cpu_map_idx;
>  
> -			if (evsel->disabled)
> -				return 0;
> -
>  			cpu_map_idx = evlist__find_cpu_map_idx(itr->evlist,
>  							       evsel, idx);
>  			/* No map is found in per CPU mmap */
>  			if (cpu_map_idx == -ENOENT)
>  				return cpu_map_idx;
>  
> +			if (evsel->disabled)
> +				return 0;
> +
>  			if (cpu_map_idx >= 0)
>  				return evlist__enable_event_idx(evsel, 1, cpu_map_idx);
>  			else


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module
  2024-09-03 10:06   ` Adrian Hunter
@ 2024-09-04 19:35     ` Leo Yan
  2024-09-05  7:13       ` Adrian Hunter
  0 siblings, 1 reply; 18+ messages in thread
From: Leo Yan @ 2024-09-04 19:35 UTC (permalink / raw)
  To: Adrian Hunter, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight

On 9/3/2024 11:06 AM, Adrian Hunter wrote:
>> @@ -12345,9 +12345,16 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
>>
>>       /*
>>        * If both events generate aux data, they must be on the same PMU
>> +      * module but can be with different PMU instances.
>> +      *
>> +      * For a built-in PMU module, the 'pmu->module' pointer is NULL,
>> +      * thus it is not feasible to compare the module pointers when
>> +      * AUX PMU drivers are built into the kernel image. Instead,
>> +      * comparing the .setup_aux() callback pointer can determine if
>> +      * the two PMU events come from the same PMU driver.
>>        */
>>       if (has_aux(event) && has_aux(output_event) &&
>> -         event->pmu != output_event->pmu)
>> +         event->pmu->setup_aux != output_event->pmu->setup_aux)
> 
> It is not very flexible and risks someone adding aux PMUs that
> do not want that rule but accidentally support it.  Another
> option is to add a PMU callback, but really you need to Peter's
> feedback.

Thanks a lot for sharing opinion, Adrian!

How about below code? An alternative way is to compare the PMU's parent
device driver, e.g. for Arm SPE PMU events, this can compare if two PMU
events are using the Arm SPE driver.

/*
 * If both events generate aux data, they must be on the same PMU
 * module but can be with different PMU instances.
 */
if (has_aux(event) && has_aux(output_event)) {
        /* It isn't allowed if it fails to find driver pointer */
        if (!event->pmu->parent || !event->pmu->parent->driver)
                goto out;

        if (!output_event->pmu->parent || !output_event->pmu->parent->driver)
                goto out;

        /*
         * It isn't allowed if aux events are not same type of PMU
         * device. This is determined by comparing the associated
         * driver pointers.
         */
        if (event->pmu->parent->driver != output_event->pmu->parent->driver)
                goto out;
}

I verified the code above, it also works well at my side.

@Peter.Z, Please let me know if this is okay for you.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 4/8] perf auxtrace: Introduce auxtrace_record__validate_events()
  2024-09-03 15:26   ` Adrian Hunter
@ 2024-09-04 21:13     ` Leo Yan
  2024-09-05  6:44       ` Adrian Hunter
  0 siblings, 1 reply; 18+ messages in thread
From: Leo Yan @ 2024-09-04 21:13 UTC (permalink / raw)
  To: Adrian Hunter, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan,
	Suzuki K Poulose, Mike Leach, James Clark, John Garry,
	Will Deacon, Yicong Yang, Jonathan Cameron, linux-perf-users,
	linux-kernel, linux-arm-kernel, coresight

On 9/3/2024 4:26 PM, Adrian Hunter wrote:
> On 23/08/24 14:33, Leo Yan wrote:
>> A prerequisite for multiple AUX events is that the AUX events cannot
>> overlap CPU maps. The reason is that every CPU has only one AUX trace
>> buffer and maps it to an unique buffer index for CPU and system tracing
>> mode.
>>
>> To prevent the case of CPU maps overlapping occurring within multiple
>> AUX events, the auxtrace_record__validate_events() function is
>> introduced. It iterates through all AUX events and returns failure if
>> it detects CPU maps overlapping.
>>
>> Signed-off-by: Leo Yan <leo.yan@arm.com>
>> ---
>>  tools/perf/builtin-record.c |  4 +++
>>  tools/perf/util/auxtrace.c  | 64 +++++++++++++++++++++++++++++++++++++
>>  tools/perf/util/auxtrace.h  |  7 ++++
>>  3 files changed, 75 insertions(+)
>>
>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>> index adbaf80b398c..2c618efba97d 100644
>> --- a/tools/perf/builtin-record.c
>> +++ b/tools/perf/builtin-record.c
>> @@ -862,6 +862,10 @@ static int record__auxtrace_init(struct record *rec)
>>
>>       auxtrace_regroup_aux_output(rec->evlist);
>>
>> +     err = auxtrace_validate_events(rec->evlist);
>> +     if (err)
>> +             return err;
>> +
>>       return auxtrace_parse_filters(rec->evlist);
>>  }
>>
>> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
>> index ca8682966fae..87e4f21b6edf 100644
>> --- a/tools/perf/util/auxtrace.c
>> +++ b/tools/perf/util/auxtrace.c
>> @@ -2828,6 +2828,70 @@ int auxtrace_parse_filters(struct evlist *evlist)
>>       return 0;
>>  }
>>
>> +int auxtrace_validate_events(struct evlist *evlist)
> 
> 'auxtrace_validate_aux_events' would better indicate that it is
> looking only at AUX area events.

Will fix.

>> +{
>> +     struct evsel *evsel;
>> +     struct perf_cpu_map *cpu_map = NULL;
>> +     struct perf_cpu_map *cpu_map_intersect = NULL;
>> +     struct perf_cpu_map *cpu_map_merged = NULL;
>> +     int ret = 0;
>> +
>> +     if (!evlist)
>> +             return 0;
> 
> Elsewhere we assume it is not NULL, might as well here too.

Sure, will drop this checking.

>> +
>> +     /*
>> +      * Currently the tool only supports multiple AUX events without
>> +      * overlapping CPU maps and every CPU has its unique AUX buffer
>> +      * for CPU or system mode tracing.
>> +      *
>> +      * Returns failure if detects CPU maps overlapping.
>> +      */
>> +     evlist__for_each_entry(evlist, evsel) {
>> +             if (!evsel__is_aux_event(evsel))
>> +                     continue;
>> +
>> +             if (perf_cpu_map__is_empty(evsel->pmu->cpus))
>> +                     continue;
> 
> Unless perf_cpu_map__intersect() is broken, the empty check
> should not be needed.

Perf's CPU map implementation is tricky. IIRC, if without this checking, it
will break the tool.

In below code, we invokes perf_cpu_map__merge() for merging maps. It does
_not_ always allocate a new map for merging. Based on testing, it only
allocates a new map if two passed map pointers are not NULL. If a passed CPU
map pointer is NULL, then it will directly return the another map's pointer.

This leads the difficulty for releasing the merged map. If the returned merged
map is a new allocated one, it is safe us to release it. Otherwise, we might
release a CPU map unexpectedly - though it is returned by
perf_cpu_map__merge(), but the CPU map comes from a PMU and should not be
released.

Anyway, I will remove the empty check and see if fix the perf CPU map issue.

> Shouldn't we be looking at evsel->cpus ?

I don't find the field `evsel->cpus`, I assume you are referring to
evsel__cpus(evsel)?  If so, I will change to use the CPU map from evsel.

> Possibly need to consider the perf_cpu_map__has_any_cpu() case?
> e.g.
>                 if (cpu_map && (perf_cpu_map__has_any_cpu(evsel->cpus) ||
>                                 perf_cpu_map__has_any_cpu(cpu_map)) {
>                         ret = -EINVAL;
>                         break;
>                 }

Will add.

>> +
>> +             cpu_map_intersect = perf_cpu_map__intersect(cpu_map, evsel->pmu->cpus);
>> +             if (cpu_map_intersect) {
>> +                     perf_cpu_map__put(cpu_map_intersect);
>> +                     pr_err("Doesn't support AUX events with overlapping CPU masks\n");
>> +                     ret = -EINVAL;
>> +                     break;
>> +             }
>> +             perf_cpu_map__put(cpu_map_intersect);
> 
> Maybe add a helper:
> 
> static bool perf_cpu_map__do_maps_intersect(struct perf_cpu_map *a, struct perf_cpu_map *b)
> {
>         struct perf_cpu_map *intersection = perf_cpu_map__intersect(a, b);
>         bool ret = !perf_cpu_map__is_empty(intersection);
> 
>         perf_cpu_map__put(intersection);
> 
>         return ret;
> }

Will do.

>> +
>> +             cpu_map_merged = perf_cpu_map__merge(cpu_map, evsel->pmu->cpus);
>> +             if (!cpu_map_merged) {
>> +                     ret = -ENOMEM;
>> +                     break;
>> +             }
>> +
>> +             /* Update the CPU maps after merging */
>> +             perf_cpu_map__put(cpu_map);
>> +             cpu_map = cpu_map_merged;
> 
> perf_cpu_map__merge() is a bit tricky - see its comments.  This
> should probably all just be:
> 
>                 cpu_map = perf_cpu_map__merge(cpu_map, evsel->pmu->cpus);

This might lead to memory leak for the 'old' cpu_map after merging.

We cannot assume the `cpu_map` variable is extended from its old value, a new
CPU map is allocated during the merging. This is why the patch always release
the old cpu_map (perf_cpu_map__put(cpu_map)) and then assign the new merged
CPU map.

>> +     }
>> +
>> +     if (!ret)
>> +             goto out;
> 
> Could we put the error path last i.e.
> 
>         perf_cpu_map__put(cpu_map);
> 
>         if (ret)
>                 goto out_err;
> 
>         return 0;
> 
> out_err:

Makes sense. Will fix.

>> +
>> +     /* If fails, dump CPU maps for debugging */
>> +     evlist__for_each_entry(evlist, evsel) {
>> +             char buf[200];
>> +
>> +             if (!evsel__is_aux_event(evsel))
>> +                     continue;
>> +
>> +             cpu_map__snprint(evsel->pmu->cpus, buf, sizeof(buf));
>> +             pr_debug("AUX event [%s]'s cpu map is: %s\n", evsel->pmu->name, buf);
> 
> Could probably use cpu_map__fprintf(pmu->cpus, debug_file()) and
> not need buf.

Will do.

Thanks for suggestions!

Leo

>> +     }
>> +
>> +out:
>> +     perf_cpu_map__put(cpu_map);
>> +     return ret;
>> +}
>> +
>>  int auxtrace__process_event(struct perf_session *session, union perf_event *event,
>>                           struct perf_sample *sample, const struct perf_tool *tool)
>>  {
>> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
>> index a1895a4f530b..67a74ad0c383 100644
>> --- a/tools/perf/util/auxtrace.h
>> +++ b/tools/perf/util/auxtrace.h
>> @@ -636,6 +636,7 @@ void addr_filters__exit(struct addr_filters *filts);
>>  int addr_filters__parse_bare_filter(struct addr_filters *filts,
>>                                   const char *filter);
>>  int auxtrace_parse_filters(struct evlist *evlist);
>> +int auxtrace_validate_events(struct evlist *evlist);
>>
>>  int auxtrace__process_event(struct perf_session *session, union perf_event *event,
>>                           struct perf_sample *sample, const struct perf_tool *tool);
>> @@ -875,6 +876,12 @@ int auxtrace_parse_filters(struct evlist *evlist __maybe_unused)
>>       return 0;
>>  }
>>
>> +static inline
>> +int auxtrace_validate_events(struct evlist *evlist __maybe_unused)
>> +{
>> +     return 0;
>> +}
>> +
>>  int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
>>                       struct auxtrace_mmap_params *mp,
>>                       void *userpg, int fd);
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 4/8] perf auxtrace: Introduce auxtrace_record__validate_events()
  2024-09-04 21:13     ` Leo Yan
@ 2024-09-05  6:44       ` Adrian Hunter
  0 siblings, 0 replies; 18+ messages in thread
From: Adrian Hunter @ 2024-09-05  6:44 UTC (permalink / raw)
  To: Leo Yan, Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
	Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan, Suzuki K Poulose,
	Mike Leach, James Clark, John Garry, Will Deacon, Yicong Yang,
	Jonathan Cameron, linux-perf-users, linux-kernel,
	linux-arm-kernel, coresight

On 5/09/24 00:13, Leo Yan wrote:
> On 9/3/2024 4:26 PM, Adrian Hunter wrote:
>> On 23/08/24 14:33, Leo Yan wrote:
>>> A prerequisite for multiple AUX events is that the AUX events cannot
>>> overlap CPU maps. The reason is that every CPU has only one AUX trace
>>> buffer and maps it to an unique buffer index for CPU and system tracing
>>> mode.
>>>
>>> To prevent the case of CPU maps overlapping occurring within multiple
>>> AUX events, the auxtrace_record__validate_events() function is
>>> introduced. It iterates through all AUX events and returns failure if
>>> it detects CPU maps overlapping.
>>>
>>> Signed-off-by: Leo Yan <leo.yan@arm.com>
>>> ---
>>>  tools/perf/builtin-record.c |  4 +++
>>>  tools/perf/util/auxtrace.c  | 64 +++++++++++++++++++++++++++++++++++++
>>>  tools/perf/util/auxtrace.h  |  7 ++++
>>>  3 files changed, 75 insertions(+)
>>>
>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>>> index adbaf80b398c..2c618efba97d 100644
>>> --- a/tools/perf/builtin-record.c
>>> +++ b/tools/perf/builtin-record.c
>>> @@ -862,6 +862,10 @@ static int record__auxtrace_init(struct record *rec)
>>>
>>>       auxtrace_regroup_aux_output(rec->evlist);
>>>
>>> +     err = auxtrace_validate_events(rec->evlist);
>>> +     if (err)
>>> +             return err;
>>> +
>>>       return auxtrace_parse_filters(rec->evlist);
>>>  }
>>>
>>> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
>>> index ca8682966fae..87e4f21b6edf 100644
>>> --- a/tools/perf/util/auxtrace.c
>>> +++ b/tools/perf/util/auxtrace.c
>>> @@ -2828,6 +2828,70 @@ int auxtrace_parse_filters(struct evlist *evlist)
>>>       return 0;
>>>  }
>>>
>>> +int auxtrace_validate_events(struct evlist *evlist)
>>
>> 'auxtrace_validate_aux_events' would better indicate that it is
>> looking only at AUX area events.
> 
> Will fix.
> 
>>> +{
>>> +     struct evsel *evsel;
>>> +     struct perf_cpu_map *cpu_map = NULL;
>>> +     struct perf_cpu_map *cpu_map_intersect = NULL;
>>> +     struct perf_cpu_map *cpu_map_merged = NULL;
>>> +     int ret = 0;
>>> +
>>> +     if (!evlist)
>>> +             return 0;
>>
>> Elsewhere we assume it is not NULL, might as well here too.
> 
> Sure, will drop this checking.
> 
>>> +
>>> +     /*
>>> +      * Currently the tool only supports multiple AUX events without
>>> +      * overlapping CPU maps and every CPU has its unique AUX buffer
>>> +      * for CPU or system mode tracing.
>>> +      *
>>> +      * Returns failure if detects CPU maps overlapping.
>>> +      */
>>> +     evlist__for_each_entry(evlist, evsel) {
>>> +             if (!evsel__is_aux_event(evsel))
>>> +                     continue;
>>> +
>>> +             if (perf_cpu_map__is_empty(evsel->pmu->cpus))
>>> +                     continue;
>>
>> Unless perf_cpu_map__intersect() is broken, the empty check
>> should not be needed.
> 
> Perf's CPU map implementation is tricky. IIRC, if without this checking, it
> will break the tool.
> 
> In below code, we invokes perf_cpu_map__merge() for merging maps. It does
> _not_ always allocate a new map for merging. Based on testing, it only
> allocates a new map if two passed map pointers are not NULL. If a passed CPU
> map pointer is NULL, then it will directly return the another map's pointer.
> 
> This leads the difficulty for releasing the merged map. If the returned merged
> map is a new allocated one, it is safe us to release it. Otherwise, we might
> release a CPU map unexpectedly - though it is returned by
> perf_cpu_map__merge(), but the CPU map comes from a PMU and should not be
> released.

If it returns a different map, it adjusts the reference counts
accordingly.  perf_cpu_map__merge() is still a problem though.
See below.

> 
> Anyway, I will remove the empty check and see if fix the perf CPU map issue.
> 
>> Shouldn't we be looking at evsel->cpus ?
> 
> I don't find the field `evsel->cpus`, I assume you are referring to
> evsel__cpus(evsel)?  If so, I will change to use the CPU map from evsel.
> 
>> Possibly need to consider the perf_cpu_map__has_any_cpu() case?
>> e.g.
>>                 if (cpu_map && (perf_cpu_map__has_any_cpu(evsel->cpus) ||
>>                                 perf_cpu_map__has_any_cpu(cpu_map)) {
>>                         ret = -EINVAL;
>>                         break;
>>                 }
> 
> Will add.
> 
>>> +
>>> +             cpu_map_intersect = perf_cpu_map__intersect(cpu_map, evsel->pmu->cpus);
>>> +             if (cpu_map_intersect) {
>>> +                     perf_cpu_map__put(cpu_map_intersect);
>>> +                     pr_err("Doesn't support AUX events with overlapping CPU masks\n");
>>> +                     ret = -EINVAL;
>>> +                     break;
>>> +             }
>>> +             perf_cpu_map__put(cpu_map_intersect);
>>
>> Maybe add a helper:
>>
>> static bool perf_cpu_map__do_maps_intersect(struct perf_cpu_map *a, struct perf_cpu_map *b)
>> {
>>         struct perf_cpu_map *intersection = perf_cpu_map__intersect(a, b);
>>         bool ret = !perf_cpu_map__is_empty(intersection);
>>
>>         perf_cpu_map__put(intersection);
>>
>>         return ret;
>> }
> 
> Will do.
> 
>>> +
>>> +             cpu_map_merged = perf_cpu_map__merge(cpu_map, evsel->pmu->cpus);
>>> +             if (!cpu_map_merged) {
>>> +                     ret = -ENOMEM;
>>> +                     break;
>>> +             }
>>> +
>>> +             /* Update the CPU maps after merging */
>>> +             perf_cpu_map__put(cpu_map);
>>> +             cpu_map = cpu_map_merged;
>>
>> perf_cpu_map__merge() is a bit tricky - see its comments.  This
>> should probably all just be:
>>
>>                 cpu_map = perf_cpu_map__merge(cpu_map, evsel->pmu->cpus);
> 
> This might lead to memory leak for the 'old' cpu_map after merging.
> 
> We cannot assume the `cpu_map` variable is extended from its old value, a new
> CPU map is allocated during the merging. This is why the patch always release
> the old cpu_map (perf_cpu_map__put(cpu_map)) and then assign the new merged
> CPU map.

I agree, perf_cpu_map__merge() is a bit broken.  Maybe add
another helper like:

static int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other)
{
	struct perf_cpu_map *merged;

	/* Avoid being unable to tell if perf_cpu_map__merge() failed */
	if (perf_cpu_map__is_empty(*orig) && perf_cpu_map__is_empty(other))
		return 0;

	merged = perf_cpu_map__merge(*orig, other);
	if (!merged)
		return -ENOMEM;

	*orig = merged;

	return 0;
}

Then it can be:

	ret = perf_cpu_map__merge_in(&cpu_map, evsel__cpus(evsel));
	if (ret)
		break;

> 
>>> +     }
>>> +
>>> +     if (!ret)
>>> +             goto out;
>>
>> Could we put the error path last i.e.
>>
>>         perf_cpu_map__put(cpu_map);
>>
>>         if (ret)
>>                 goto out_err;
>>
>>         return 0;
>>
>> out_err:
> 
> Makes sense. Will fix.
> 
>>> +
>>> +     /* If fails, dump CPU maps for debugging */
>>> +     evlist__for_each_entry(evlist, evsel) {
>>> +             char buf[200];
>>> +
>>> +             if (!evsel__is_aux_event(evsel))
>>> +                     continue;
>>> +
>>> +             cpu_map__snprint(evsel->pmu->cpus, buf, sizeof(buf));
>>> +             pr_debug("AUX event [%s]'s cpu map is: %s\n", evsel->pmu->name, buf);
>>
>> Could probably use cpu_map__fprintf(pmu->cpus, debug_file()) and
>> not need buf.
> 
> Will do.
> 
> Thanks for suggestions!
> 
> Leo
> 
>>> +     }
>>> +
>>> +out:
>>> +     perf_cpu_map__put(cpu_map);
>>> +     return ret;
>>> +}
>>> +
>>>  int auxtrace__process_event(struct perf_session *session, union perf_event *event,
>>>                           struct perf_sample *sample, const struct perf_tool *tool)
>>>  {
>>> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
>>> index a1895a4f530b..67a74ad0c383 100644
>>> --- a/tools/perf/util/auxtrace.h
>>> +++ b/tools/perf/util/auxtrace.h
>>> @@ -636,6 +636,7 @@ void addr_filters__exit(struct addr_filters *filts);
>>>  int addr_filters__parse_bare_filter(struct addr_filters *filts,
>>>                                   const char *filter);
>>>  int auxtrace_parse_filters(struct evlist *evlist);
>>> +int auxtrace_validate_events(struct evlist *evlist);
>>>
>>>  int auxtrace__process_event(struct perf_session *session, union perf_event *event,
>>>                           struct perf_sample *sample, const struct perf_tool *tool);
>>> @@ -875,6 +876,12 @@ int auxtrace_parse_filters(struct evlist *evlist __maybe_unused)
>>>       return 0;
>>>  }
>>>
>>> +static inline
>>> +int auxtrace_validate_events(struct evlist *evlist __maybe_unused)
>>> +{
>>> +     return 0;
>>> +}
>>> +
>>>  int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
>>>                       struct auxtrace_mmap_params *mp,
>>>                       void *userpg, int fd);
>>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module
  2024-09-04 19:35     ` Leo Yan
@ 2024-09-05  7:13       ` Adrian Hunter
  0 siblings, 0 replies; 18+ messages in thread
From: Adrian Hunter @ 2024-09-05  7:13 UTC (permalink / raw)
  To: Leo Yan, Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
	Mark Rutland, Jiri Olsa, Ian Rogers, Liang, Kan, Suzuki K Poulose,
	Mike Leach, James Clark, John Garry, Will Deacon, Yicong Yang,
	Jonathan Cameron, linux-perf-users, linux-kernel,
	linux-arm-kernel, coresight

On 4/09/24 22:35, Leo Yan wrote:
> On 9/3/2024 11:06 AM, Adrian Hunter wrote:
>>> @@ -12345,9 +12345,16 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
>>>
>>>       /*
>>>        * If both events generate aux data, they must be on the same PMU
>>> +      * module but can be with different PMU instances.
>>> +      *
>>> +      * For a built-in PMU module, the 'pmu->module' pointer is NULL,
>>> +      * thus it is not feasible to compare the module pointers when
>>> +      * AUX PMU drivers are built into the kernel image. Instead,
>>> +      * comparing the .setup_aux() callback pointer can determine if
>>> +      * the two PMU events come from the same PMU driver.
>>>        */
>>>       if (has_aux(event) && has_aux(output_event) &&
>>> -         event->pmu != output_event->pmu)
>>> +         event->pmu->setup_aux != output_event->pmu->setup_aux)
>>
>> It is not very flexible and risks someone adding aux PMUs that
>> do not want that rule but accidentally support it.  Another
>> option is to add a PMU callback, but really you need to Peter's
>> feedback.
> 
> Thanks a lot for sharing opinion, Adrian!
> 
> How about below code? An alternative way is to compare the PMU's parent
> device driver, e.g. for Arm SPE PMU events, this can compare if two PMU
> events are using the Arm SPE driver.

IMHO, in the general case, whether 2 AUX area events can
output to the same buffer isn't really related to the device
hierarchy, driver or module.

> 
> /*
>  * If both events generate aux data, they must be on the same PMU
>  * module but can be with different PMU instances.
>  */
> if (has_aux(event) && has_aux(output_event)) {
>         /* It isn't allowed if it fails to find driver pointer */
>         if (!event->pmu->parent || !event->pmu->parent->driver)
>                 goto out;
> 
>         if (!output_event->pmu->parent || !output_event->pmu->parent->driver)
>                 goto out;
> 
>         /*
>          * It isn't allowed if aux events are not same type of PMU
>          * device. This is determined by comparing the associated
>          * driver pointers.
>          */
>         if (event->pmu->parent->driver != output_event->pmu->parent->driver)
>                 goto out;
> }
> 
> I verified the code above, it also works well at my side.
> 
> @Peter.Z, Please let me know if this is okay for you.
> 
> Thanks,
> Leo


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-09-05  7:13 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-23 11:32 [PATCH v6 0/8] perf auxtrace: Support multiple AUX events Leo Yan
2024-08-23 11:32 ` [PATCH v6 1/8] perf/core: Allow multiple AUX PMU events with the same module Leo Yan
2024-08-23 11:40   ` Leo Yan
2024-09-03 10:06   ` Adrian Hunter
2024-09-04 19:35     ` Leo Yan
2024-09-05  7:13       ` Adrian Hunter
2024-08-23 11:33 ` [PATCH v6 2/8] perf auxtrace: Use evsel__is_aux_event() for checking AUX event Leo Yan
2024-08-23 11:33 ` [PATCH v6 3/8] perf auxtrace: Remove unused 'pmu' pointer from struct auxtrace_record Leo Yan
2024-08-23 11:33 ` [PATCH v6 4/8] perf auxtrace: Introduce auxtrace_record__validate_events() Leo Yan
2024-09-03 15:26   ` Adrian Hunter
2024-09-04 21:13     ` Leo Yan
2024-09-05  6:44       ` Adrian Hunter
2024-08-23 11:33 ` [PATCH v6 5/8] perf auxtrace: Refactor evlist__enable_event_idx() Leo Yan
2024-09-03 18:39   ` Adrian Hunter
2024-08-23 11:33 ` [PATCH v6 6/8] perf auxtrace: Bails out after finding the event for the map index Leo Yan
2024-09-03 18:41   ` Adrian Hunter
2024-08-23 11:33 ` [PATCH v6 7/8] perf auxtrace: Iterate all AUX events when finish reading Leo Yan
2024-08-23 11:33 ` [PATCH v6 8/8] perf arm-spe: Support multiple events in arm_spe_evsel_is_auxtrace() Leo Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).