linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2
@ 2024-09-14 21:54 Leo Yan
  2024-09-14 21:54 ` [PATCH v2 1/5] perf arm-spe: Define metadata header " Leo Yan
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Leo Yan @ 2024-09-14 21:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Liang, Kan, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

This patch series enhances Arm SPE metadata in the Perf file to a
version 2 format and maintains backward compatibility for metadata v1.

The version 2 metadata contains a global structure with fields for
metadata header version number, header size, CPU numbers. It also
expands to store per-CPU metadata, which includes the CPU logical
number in the kernel, MIDR, and capacity values associated with Arm SPE.

This patch set has been tested the perf to decode the Arm SPE metadata
v1 and v2.

Changes from v1:
- Dropped LDS bit exposing from Arm SPE driver (Will Deacon).
- To simplify the change, this series did not include multiple AUX event
  support.


Leo Yan (5):
  perf arm-spe: Define metadata header version 2
  perf arm-spe: Calculate meta data size
  perf arm-spe: Save per CPU information in metadata
  perf arm-spe: Support metadata version 2
  perf arm-spe: Dump metadata with version 2

 tools/perf/arch/arm64/util/arm-spe.c | 106 ++++++++++++++++++-
 tools/perf/util/arm-spe.c            | 151 +++++++++++++++++++++++++--
 tools/perf/util/arm-spe.h            |  38 ++++++-
 3 files changed, 281 insertions(+), 14 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/5] perf arm-spe: Define metadata header version 2
  2024-09-14 21:54 [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2 Leo Yan
@ 2024-09-14 21:54 ` Leo Yan
  2024-09-27  6:32   ` Namhyung Kim
  2024-09-14 21:54 ` [PATCH v2 2/5] perf arm-spe: Calculate meta data size Leo Yan
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Leo Yan @ 2024-09-14 21:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Liang, Kan, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

The first version's metadata header structure doesn't include a field to
indicate a header version, which is not friendly for extension.

Define the metadata version 2 format with a new header structure and
extend per CPU's metadata. In the meantime, the old metadata header will
still be supported for backward compatibility.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/arch/arm64/util/arm-spe.c |  4 +--
 tools/perf/util/arm-spe.c            |  2 +-
 tools/perf/util/arm-spe.h            | 38 +++++++++++++++++++++++++++-
 3 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index 2be99fdf997d..c2d5c8ca4900 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -41,7 +41,7 @@ static size_t
 arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
 		       struct evlist *evlist __maybe_unused)
 {
-	return ARM_SPE_AUXTRACE_PRIV_SIZE;
+	return ARM_SPE_AUXTRACE_V1_PRIV_SIZE;
 }
 
 static int arm_spe_info_fill(struct auxtrace_record *itr,
@@ -53,7 +53,7 @@ static int arm_spe_info_fill(struct auxtrace_record *itr,
 			container_of(itr, struct arm_spe_recording, itr);
 	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
 
-	if (priv_size != ARM_SPE_AUXTRACE_PRIV_SIZE)
+	if (priv_size != ARM_SPE_AUXTRACE_V1_PRIV_SIZE)
 		return -EINVAL;
 
 	if (!session->evlist->core.nr_mmaps)
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 138ffc71b32d..70989b1bae47 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -1262,7 +1262,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 				  struct perf_session *session)
 {
 	struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info;
-	size_t min_sz = sizeof(u64) * ARM_SPE_AUXTRACE_PRIV_MAX;
+	size_t min_sz = ARM_SPE_AUXTRACE_V1_PRIV_SIZE;
 	struct perf_record_time_conv *tc = &session->time_conv;
 	const char *cpuid = perf_env__cpuid(session->evlist->env);
 	u64 midr = strtol(cpuid, NULL, 16);
diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h
index 4f4900c18f3e..5416d4e1d15f 100644
--- a/tools/perf/util/arm-spe.h
+++ b/tools/perf/util/arm-spe.h
@@ -12,10 +12,46 @@
 enum {
 	ARM_SPE_PMU_TYPE,
 	ARM_SPE_PER_CPU_MMAPS,
+	ARM_SPE_AUXTRACE_V1_PRIV_MAX,
+};
+
+#define ARM_SPE_AUXTRACE_V1_PRIV_SIZE	\
+	(ARM_SPE_AUXTRACE_V1_PRIV_MAX * sizeof(u64))
+
+enum {
+	/*
+	 * The old metadata format (defined above) does not include a
+	 * field for version number. Version 1 is reserved and starts
+	 * from version 2.
+	 */
+	ARM_SPE_HEADER_VERSION,
+	/* Number of sizeof(u64) */
+	ARM_SPE_HEADER_SIZE,
+	/* PMU type shared by CPUs */
+	ARM_SPE_SHARED_PMU_TYPE,
+	/* Number of CPUs */
+	ARM_SPE_CPUS_NUM,
 	ARM_SPE_AUXTRACE_PRIV_MAX,
 };
 
-#define ARM_SPE_AUXTRACE_PRIV_SIZE (ARM_SPE_AUXTRACE_PRIV_MAX * sizeof(u64))
+enum {
+	/* Magic number */
+	ARM_SPE_MAGIC,
+	/* CPU logical number in system */
+	ARM_SPE_CPU,
+	/* Number of parameters */
+	ARM_SPE_CPU_NR_PARAMS,
+	/* CPU MIDR */
+	ARM_SPE_CPU_MIDR,
+	/* Associated PMU type */
+	ARM_SPE_CPU_PMU_TYPE,
+	/* Minimal interval */
+	ARM_SPE_CAP_MIN_IVAL,
+	ARM_SPE_CPU_PRIV_MAX,
+};
+
+#define ARM_SPE_HEADER_CURRENT_VERSION	2
+
 
 union perf_event;
 struct perf_session;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/5] perf arm-spe: Calculate meta data size
  2024-09-14 21:54 [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2 Leo Yan
  2024-09-14 21:54 ` [PATCH v2 1/5] perf arm-spe: Define metadata header " Leo Yan
@ 2024-09-14 21:54 ` Leo Yan
  2024-09-27  6:14   ` Namhyung Kim
  2024-09-14 21:54 ` [PATCH v2 3/5] perf arm-spe: Save per CPU information in metadata Leo Yan
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Leo Yan @ 2024-09-14 21:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Liang, Kan, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

The metadata is designed to contain a header and per CPU information.

The arm_spe_find_cpus() function is introduced to identify how many CPUs
support ARM SPE. Based on the CPU number, calculates the metadata size.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/arch/arm64/util/arm-spe.c | 35 +++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index c2d5c8ca4900..15478989ef30 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -37,11 +37,40 @@ struct arm_spe_recording {
 	bool			*wrapped;
 };
 
+static struct perf_cpu_map *arm_spe_find_cpus(struct evlist *evlist)
+{
+	struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus;
+	struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus();
+	struct perf_cpu_map *intersect_cpus;
+
+	/* cpu map is not "any" CPU , we have specific CPUs to work with */
+	if (!perf_cpu_map__has_any_cpu(event_cpus)) {
+		intersect_cpus = perf_cpu_map__intersect(event_cpus, online_cpus);
+		perf_cpu_map__put(online_cpus);
+	/* Event can be "any" CPU so count all CPUs. */
+	} else {
+		intersect_cpus = online_cpus;
+	}
+
+	return intersect_cpus;
+}
+
 static size_t
 arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
-		       struct evlist *evlist __maybe_unused)
+		       struct evlist *evlist)
 {
-	return ARM_SPE_AUXTRACE_V1_PRIV_SIZE;
+	struct perf_cpu_map *cpu_map = arm_spe_find_cpus(evlist);
+	size_t size;
+
+	if (!cpu_map)
+		return 0;
+
+	size = ARM_SPE_AUXTRACE_PRIV_MAX +
+	       ARM_SPE_CPU_PRIV_MAX * perf_cpu_map__nr(cpu_map);
+	size *= sizeof(u64);
+
+	perf_cpu_map__put(cpu_map);
+	return size;
 }
 
 static int arm_spe_info_fill(struct auxtrace_record *itr,
@@ -53,7 +82,7 @@ static int arm_spe_info_fill(struct auxtrace_record *itr,
 			container_of(itr, struct arm_spe_recording, itr);
 	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
 
-	if (priv_size != ARM_SPE_AUXTRACE_V1_PRIV_SIZE)
+	if (priv_size != arm_spe_info_priv_size(itr, session->evlist))
 		return -EINVAL;
 
 	if (!session->evlist->core.nr_mmaps)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/5] perf arm-spe: Save per CPU information in metadata
  2024-09-14 21:54 [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2 Leo Yan
  2024-09-14 21:54 ` [PATCH v2 1/5] perf arm-spe: Define metadata header " Leo Yan
  2024-09-14 21:54 ` [PATCH v2 2/5] perf arm-spe: Calculate meta data size Leo Yan
@ 2024-09-14 21:54 ` Leo Yan
  2024-09-27  6:21   ` Namhyung Kim
  2024-09-14 21:54 ` [PATCH v2 4/5] perf arm-spe: Support metadata version 2 Leo Yan
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Leo Yan @ 2024-09-14 21:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Liang, Kan, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

Save the Arm SPE information on a per-CPU basis. This approach is easier
in the decoding phase for retrieving metadata based on the CPU number of
every Arm SPE record.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/arch/arm64/util/arm-spe.c | 71 +++++++++++++++++++++++++++-
 1 file changed, 70 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index 15478989ef30..2790a37709a5 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -26,6 +26,8 @@
 #include "../../../util/arm-spe.h"
 #include <tools/libc_compat.h> // reallocarray
 
+#define ARM_SPE_CPU_MAGIC		0x1010101010101010ULL
+
 #define KiB(x) ((x) * 1024)
 #define MiB(x) ((x) * 1024 * 1024)
 
@@ -73,14 +75,66 @@ arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
 	return size;
 }
 
+static int arm_spe_save_cpu_header(struct auxtrace_record *itr,
+				   struct perf_cpu cpu, __u64 data[])
+{
+	struct arm_spe_recording *sper =
+			container_of(itr, struct arm_spe_recording, itr);
+	struct perf_pmu *pmu = NULL;
+	struct perf_pmu tmp_pmu;
+	char cpu_id_str[16];
+	char *cpuid = NULL;
+	u64 val;
+
+	snprintf(cpu_id_str, sizeof(cpu_id_str), "%d", cpu.cpu);
+	tmp_pmu.cpus = perf_cpu_map__new(cpu_id_str);
+	if (!tmp_pmu.cpus)
+		return -ENOMEM;
+
+	/* Read CPU MIDR */
+	cpuid = perf_pmu__getcpuid(&tmp_pmu);
+	if (!cpuid)
+		return -ENOMEM;
+	val = strtol(cpuid, NULL, 16);
+	perf_cpu_map__put(tmp_pmu.cpus);
+
+	data[ARM_SPE_MAGIC] = ARM_SPE_CPU_MAGIC;
+	data[ARM_SPE_CPU] = cpu.cpu;
+	data[ARM_SPE_CPU_NR_PARAMS] = ARM_SPE_CPU_PRIV_MAX - ARM_SPE_CPU_MIDR;
+	data[ARM_SPE_CPU_MIDR] = val;
+
+	/* Find the associate Arm SPE PMU for the CPU */
+	if (perf_cpu_map__has(sper->arm_spe_pmu->cpus, cpu))
+		pmu = sper->arm_spe_pmu;
+
+	if (!pmu) {
+		/* No Arm SPE PMU is found */
+		data[ARM_SPE_CPU_PMU_TYPE] = ULLONG_MAX;
+		data[ARM_SPE_CAP_MIN_IVAL] = 0;
+	} else {
+		data[ARM_SPE_CPU_PMU_TYPE] = pmu->type;
+
+		if (perf_pmu__scan_file(pmu, "caps/min_interval", "%lu", &val) != 1)
+			val = 0;
+		data[ARM_SPE_CAP_MIN_IVAL] = val;
+	}
+
+	return ARM_SPE_CPU_PRIV_MAX;
+}
+
 static int arm_spe_info_fill(struct auxtrace_record *itr,
 			     struct perf_session *session,
 			     struct perf_record_auxtrace_info *auxtrace_info,
 			     size_t priv_size)
 {
+	int i, ret;
+	size_t offset;
 	struct arm_spe_recording *sper =
 			container_of(itr, struct arm_spe_recording, itr);
 	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
+	struct perf_cpu_map *cpu_map = arm_spe_find_cpus(session->evlist);
+	struct perf_cpu cpu;
+	__u64 *data;
 
 	if (priv_size != arm_spe_info_priv_size(itr, session->evlist))
 		return -EINVAL;
@@ -89,8 +143,23 @@ static int arm_spe_info_fill(struct auxtrace_record *itr,
 		return -EINVAL;
 
 	auxtrace_info->type = PERF_AUXTRACE_ARM_SPE;
-	auxtrace_info->priv[ARM_SPE_PMU_TYPE] = arm_spe_pmu->type;
+	auxtrace_info->priv[ARM_SPE_HEADER_VERSION] = ARM_SPE_HEADER_CURRENT_VERSION;
+	auxtrace_info->priv[ARM_SPE_HEADER_SIZE] =
+		ARM_SPE_AUXTRACE_PRIV_MAX - ARM_SPE_HEADER_VERSION;
+	auxtrace_info->priv[ARM_SPE_SHARED_PMU_TYPE] = arm_spe_pmu->type;
+	auxtrace_info->priv[ARM_SPE_CPUS_NUM] = perf_cpu_map__nr(cpu_map);
+
+	offset = ARM_SPE_AUXTRACE_PRIV_MAX;
+	perf_cpu_map__for_each_cpu(cpu, i, cpu_map) {
+		assert(offset < priv_size);
+		data = &auxtrace_info->priv[offset];
+		ret = arm_spe_save_cpu_header(itr, cpu, data);
+		if (ret < 0)
+			return ret;
+		offset += ret;
+	}
 
+	perf_cpu_map__put(cpu_map);
 	return 0;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 4/5] perf arm-spe: Support metadata version 2
  2024-09-14 21:54 [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2 Leo Yan
                   ` (2 preceding siblings ...)
  2024-09-14 21:54 ` [PATCH v2 3/5] perf arm-spe: Save per CPU information in metadata Leo Yan
@ 2024-09-14 21:54 ` Leo Yan
  2024-09-27  6:35   ` Namhyung Kim
  2024-09-14 21:54 ` [PATCH v2 5/5] perf arm-spe: Dump metadata with " Leo Yan
  2024-09-26 16:20 ` [PATCH v2 0/5] perf arm-spe: Introduce metadata " Leo Yan
  5 siblings, 1 reply; 17+ messages in thread
From: Leo Yan @ 2024-09-14 21:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Liang, Kan, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

This commit is to support metadata version 2 and at the meantime it is
backward compatible for version 1's format.

The metadata version 1 doesn't include the ARM_SPE_HEADER_VERSION field.
As version 1 is fixed with two u64 fields, by checking the metadata
size, it distinguishes the metadata is version 1 or version 2 (and any
new versions if later will have). For version 2, it reads out CPU number
and retrieves the metadata info for every CPU.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/arm-spe.c | 95 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 92 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 70989b1bae47..17782cb40fb5 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -78,6 +78,10 @@ struct arm_spe {
 
 	unsigned long			num_events;
 	u8				use_ctx_pkt_for_pid;
+
+	u64				**metadata;
+	u64				metadata_ver;
+	u64				metadata_nr_cpu;
 };
 
 struct arm_spe_queue {
@@ -1016,6 +1020,73 @@ static int arm_spe_flush(struct perf_session *session __maybe_unused,
 	return 0;
 }
 
+static u64 *arm_spe__alloc_per_cpu_metadata(u64 *buf, int cpu_size)
+{
+	u64 *metadata;
+
+	metadata = zalloc(sizeof(*metadata) * cpu_size);
+	if (!metadata)
+		return NULL;
+
+	memcpy(metadata, buf, cpu_size);
+	return metadata;
+}
+
+static void arm_spe__free_metadata(u64 **metadata, int nr_cpu)
+{
+	int i;
+
+	for (i = 0; i < nr_cpu; i++)
+		zfree(&metadata[i]);
+	free(metadata);
+}
+
+static u64 **arm_spe__alloc_metadata(struct perf_record_auxtrace_info *info,
+				     u64 *ver, int *nr_cpu)
+{
+	u64 *ptr = (u64 *)info->priv;
+	u64 metadata_size;
+	u64 **metadata = NULL;
+	int hdr_sz, cpu_sz, i;
+
+	metadata_size = info->header.size -
+		sizeof(struct perf_record_auxtrace_info);
+
+	/* Metadata version 1 */
+	if (metadata_size == ARM_SPE_AUXTRACE_V1_PRIV_SIZE) {
+		*ver = 1;
+		*nr_cpu = 0;
+		/* No per CPU metadata */
+		return NULL;
+	}
+
+	*ver = ptr[ARM_SPE_HEADER_VERSION];
+	hdr_sz = ptr[ARM_SPE_HEADER_SIZE];
+	*nr_cpu = ptr[ARM_SPE_CPUS_NUM];
+
+	metadata = zalloc(sizeof(*metadata) * (*nr_cpu));
+	if (!metadata)
+		return NULL;
+
+	/* Locate the start address of per CPU metadata */
+	ptr += hdr_sz;
+	cpu_sz = (metadata_size - (hdr_sz * sizeof(u64))) / (*nr_cpu);
+
+	for (i = 0; i < *nr_cpu; i++) {
+		metadata[i] = arm_spe__alloc_per_cpu_metadata(ptr, cpu_sz);
+		if (!metadata[i])
+			goto err_per_cpu_metadata;
+
+		ptr += cpu_sz / sizeof(u64);
+	}
+
+	return metadata;
+
+err_per_cpu_metadata:
+	arm_spe__free_metadata(metadata, *nr_cpu);
+	return NULL;
+}
+
 static void arm_spe_free_queue(void *priv)
 {
 	struct arm_spe_queue *speq = priv;
@@ -1050,6 +1121,7 @@ static void arm_spe_free(struct perf_session *session)
 	auxtrace_heap__free(&spe->heap);
 	arm_spe_free_events(session);
 	session->auxtrace = NULL;
+	arm_spe__free_metadata(spe->metadata, spe->metadata_nr_cpu);
 	free(spe);
 }
 
@@ -1267,15 +1339,24 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	const char *cpuid = perf_env__cpuid(session->evlist->env);
 	u64 midr = strtol(cpuid, NULL, 16);
 	struct arm_spe *spe;
-	int err;
+	u64 **metadata = NULL;
+	u64 metadata_ver;
+	int nr_cpu, err;
 
 	if (auxtrace_info->header.size < sizeof(struct perf_record_auxtrace_info) +
 					min_sz)
 		return -EINVAL;
 
+	metadata = arm_spe__alloc_metadata(auxtrace_info, &metadata_ver,
+					   &nr_cpu);
+	if (!metadata && metadata_ver != 1) {
+		pr_err("Failed to parse Arm SPE metadata.\n");
+		return -EINVAL;
+	}
+
 	spe = zalloc(sizeof(struct arm_spe));
 	if (!spe)
-		return -ENOMEM;
+		goto err_free_metadata;
 
 	err = auxtrace_queues__init(&spe->queues);
 	if (err)
@@ -1284,8 +1365,14 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	spe->session = session;
 	spe->machine = &session->machines.host; /* No kvm support */
 	spe->auxtrace_type = auxtrace_info->type;
-	spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
+	if (metadata_ver == 1)
+		spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
+	else
+		spe->pmu_type = auxtrace_info->priv[ARM_SPE_SHARED_PMU_TYPE];
 	spe->midr = midr;
+	spe->metadata = metadata;
+	spe->metadata_ver = metadata_ver;
+	spe->metadata_nr_cpu = nr_cpu;
 
 	spe->timeless_decoding = arm_spe__is_timeless_decoding(spe);
 
@@ -1346,5 +1433,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	session->auxtrace = NULL;
 err_free:
 	free(spe);
+err_free_metadata:
+	arm_spe__free_metadata(metadata, nr_cpu);
 	return err;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 5/5] perf arm-spe: Dump metadata with version 2
  2024-09-14 21:54 [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2 Leo Yan
                   ` (3 preceding siblings ...)
  2024-09-14 21:54 ` [PATCH v2 4/5] perf arm-spe: Support metadata version 2 Leo Yan
@ 2024-09-14 21:54 ` Leo Yan
  2024-09-26 16:20 ` [PATCH v2 0/5] perf arm-spe: Introduce metadata " Leo Yan
  5 siblings, 0 replies; 17+ messages in thread
From: Leo Yan @ 2024-09-14 21:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Liang, Kan, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

This commit dumps metadata with version 2. It dumps metadata for header
and per CPU data respectively in the arm_spe_print_info() function to
support metadata version 2 format.

After:

  0 0 0x3c0 [0x1b0]: PERF_RECORD_AUXTRACE_INFO type: 4
    Header version     :2
    Header size        :4
    Shared PMU type    :13
    CPU number         :8
      Magic            :0x1010101010101010
      CPU #            :0
      Num of params    :3
      MIDR             :0x410fd801
      PMU Type         :-1
      Min Interval     :0
      Magic            :0x1010101010101010
      CPU #            :1
      Num of params    :3
      MIDR             :0x410fd801
      PMU Type         :-1
      Min Interval     :0
      Magic            :0x1010101010101010
      CPU #            :2
      Num of params    :3
      MIDR             :0x410fd870
      PMU Type         :13
      Min Interval     :1024
      Magic            :0x1010101010101010
      CPU #            :3
      Num of params    :3
      MIDR             :0x410fd870
      PMU Type         :13
      Min Interval     :1024
      Magic            :0x1010101010101010
      CPU #            :4
      Num of params    :3
      MIDR             :0x410fd870
      PMU Type         :13
      Min Interval     :1024
      Magic            :0x1010101010101010
      CPU #            :5
      Num of params    :3
      MIDR             :0x410fd870
      PMU Type         :13
      Min Interval     :1024
      Magic            :0x1010101010101010
      CPU #            :6
      Num of params    :3
      MIDR             :0x410fd850
      PMU Type         :-1
      Min Interval     :0
      Magic            :0x1010101010101010
      CPU #            :7
      Num of params    :3
      MIDR             :0x410fd850
      PMU Type         :-1
      Min Interval     :0

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/arm-spe.c | 54 +++++++++++++++++++++++++++++++++++----
 1 file changed, 49 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 17782cb40fb5..4212a8aa0d2c 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -1133,16 +1133,60 @@ static bool arm_spe_evsel_is_auxtrace(struct perf_session *session,
 	return evsel->core.attr.type == spe->pmu_type;
 }
 
-static const char * const arm_spe_info_fmts[] = {
-	[ARM_SPE_PMU_TYPE]		= "  PMU Type           %"PRId64"\n",
+static const char * const metadata_hdr_v1_fmts[] = {
+	[ARM_SPE_PMU_TYPE]		= "  PMU Type           :%"PRId64"\n",
+	[ARM_SPE_PER_CPU_MMAPS]		= "  Per CPU mmaps      :%"PRId64"\n",
 };
 
-static void arm_spe_print_info(__u64 *arr)
+static const char * const metadata_hdr_fmts[] = {
+	[ARM_SPE_HEADER_VERSION]	= "  Header version     :%"PRId64"\n",
+	[ARM_SPE_HEADER_SIZE]		= "  Header size        :%"PRId64"\n",
+	[ARM_SPE_SHARED_PMU_TYPE]	= "  Shared PMU type    :%"PRId64"\n",
+	[ARM_SPE_CPUS_NUM]		= "  CPU number         :%"PRId64"\n",
+};
+
+static const char * const metadata_per_cpu_fmts[] = {
+	[ARM_SPE_MAGIC]			= "    Magic            :0x%"PRIx64"\n",
+	[ARM_SPE_CPU]			= "    CPU #            :%"PRId64"\n",
+	[ARM_SPE_CPU_NR_PARAMS]		= "    Num of params    :%"PRId64"\n",
+	[ARM_SPE_CPU_MIDR]		= "    MIDR             :0x%"PRIx64"\n",
+	[ARM_SPE_CPU_PMU_TYPE]		= "    PMU Type         :%"PRId64"\n",
+	[ARM_SPE_CAP_MIN_IVAL]		= "    Min Interval     :%"PRId64"\n",
+};
+
+static void arm_spe_print_info(struct arm_spe *spe, __u64 *arr)
 {
+	unsigned int i, cpu, hdr_size, cpu_num, cpu_size;
+	const char * const *hdr_fmts;
+
 	if (!dump_trace)
 		return;
 
-	fprintf(stdout, arm_spe_info_fmts[ARM_SPE_PMU_TYPE], arr[ARM_SPE_PMU_TYPE]);
+	if (spe->metadata_ver == 1) {
+		cpu_num = 0;
+		hdr_size = ARM_SPE_AUXTRACE_V1_PRIV_MAX;
+		hdr_fmts = metadata_hdr_v1_fmts;
+	} else {
+		cpu_num = arr[ARM_SPE_CPUS_NUM];
+		hdr_size = arr[ARM_SPE_HEADER_SIZE];
+		hdr_fmts = metadata_hdr_fmts;
+	}
+
+	for (i = 0; i < hdr_size; i++)
+		fprintf(stdout, hdr_fmts[i], arr[i]);
+
+	arr += hdr_size;
+	for (cpu = 0; cpu < cpu_num; cpu++) {
+		/*
+		 * The parameters from ARM_SPE_MAGIC to ARM_SPE_CPU_NR_PARAMS
+		 * are fixed. The sequential parameter size is decided by the
+		 * field 'ARM_SPE_CPU_NR_PARAMS'.
+		 */
+		cpu_size = (ARM_SPE_CPU_NR_PARAMS + 1) + arr[ARM_SPE_CPU_NR_PARAMS];
+		for (i = 0; i < cpu_size; i++)
+			fprintf(stdout, metadata_per_cpu_fmts[i], arr[i]);
+		arr += cpu_size;
+	}
 }
 
 static void arm_spe_set_event_name(struct evlist *evlist, u64 id,
@@ -1405,7 +1449,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	spe->auxtrace.evsel_is_auxtrace = arm_spe_evsel_is_auxtrace;
 	session->auxtrace = &spe->auxtrace;
 
-	arm_spe_print_info(&auxtrace_info->priv[0]);
+	arm_spe_print_info(spe, &auxtrace_info->priv[0]);
 
 	if (dump_trace)
 		return 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2
  2024-09-14 21:54 [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2 Leo Yan
                   ` (4 preceding siblings ...)
  2024-09-14 21:54 ` [PATCH v2 5/5] perf arm-spe: Dump metadata with " Leo Yan
@ 2024-09-26 16:20 ` Leo Yan
  2024-09-27  6:10   ` Namhyung Kim
  5 siblings, 1 reply; 17+ messages in thread
From: Leo Yan @ 2024-09-26 16:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Liang, Kan, linux-arm-kernel, linux-perf-users, linux-kernel

On 9/14/24 22:54, Leo Yan wrote:
> This patch series enhances Arm SPE metadata in the Perf file to a
> version 2 format and maintains backward compatibility for metadata v1.
> 
> The version 2 metadata contains a global structure with fields for
> metadata header version number, header size, CPU numbers. It also
> expands to store per-CPU metadata, which includes the CPU logical
> number in the kernel, MIDR, and capacity values associated with Arm SPE.
> 
> This patch set has been tested the perf to decode the Arm SPE metadata
> v1 and v2.
> 
> Changes from v1:
> - Dropped LDS bit exposing from Arm SPE driver (Will Deacon).
> - To simplify the change, this series did not include multiple AUX event
>    support.

Hi Arnaldo, Namhyung,

Gentle ping. There is a bit backlog for Arm SPE patches, so I would
like to bring up this series and the series [1] for the Arm SPE data
source refactoring (which is dependent on the current series).

Please kindly review and pick it up if it is fine for you.

Thanks,
Leo

[1] https://lore.kernel.org/linux-perf-users/20240914220901.756177-1-leo.yan@arm.com/


> Leo Yan (5):
>    perf arm-spe: Define metadata header version 2
>    perf arm-spe: Calculate meta data size
>    perf arm-spe: Save per CPU information in metadata
>    perf arm-spe: Support metadata version 2
>    perf arm-spe: Dump metadata with version 2
> 
>   tools/perf/arch/arm64/util/arm-spe.c | 106 ++++++++++++++++++-
>   tools/perf/util/arm-spe.c            | 151 +++++++++++++++++++++++++--
>   tools/perf/util/arm-spe.h            |  38 ++++++-
>   3 files changed, 281 insertions(+), 14 deletions(-)
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2
  2024-09-26 16:20 ` [PATCH v2 0/5] perf arm-spe: Introduce metadata " Leo Yan
@ 2024-09-27  6:10   ` Namhyung Kim
  2024-09-27  8:41     ` Leo Yan
  0 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2024-09-27  6:10 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-arm-kernel, linux-perf-users, linux-kernel

Hi Leo,

On Thu, Sep 26, 2024 at 05:20:49PM +0100, Leo Yan wrote:
> On 9/14/24 22:54, Leo Yan wrote:
> > This patch series enhances Arm SPE metadata in the Perf file to a
> > version 2 format and maintains backward compatibility for metadata v1.
> > 
> > The version 2 metadata contains a global structure with fields for
> > metadata header version number, header size, CPU numbers. It also
> > expands to store per-CPU metadata, which includes the CPU logical
> > number in the kernel, MIDR, and capacity values associated with Arm SPE.
> > 
> > This patch set has been tested the perf to decode the Arm SPE metadata
> > v1 and v2.
> > 
> > Changes from v1:
> > - Dropped LDS bit exposing from Arm SPE driver (Will Deacon).
> > - To simplify the change, this series did not include multiple AUX event
> >    support.
> 
> Hi Arnaldo, Namhyung,
> 
> Gentle ping. There is a bit backlog for Arm SPE patches, so I would
> like to bring up this series and the series [1] for the Arm SPE data
> source refactoring (which is dependent on the current series).
> 
> Please kindly review and pick it up if it is fine for you.

While it seems like general changes, I'd like to see some ARM folks
reviewing this series.

Thanks,
Namhyung

> 
> [1] https://lore.kernel.org/linux-perf-users/20240914220901.756177-1-leo.yan@arm.com/
> 
> 
> > Leo Yan (5):
> >    perf arm-spe: Define metadata header version 2
> >    perf arm-spe: Calculate meta data size
> >    perf arm-spe: Save per CPU information in metadata
> >    perf arm-spe: Support metadata version 2
> >    perf arm-spe: Dump metadata with version 2
> > 
> >   tools/perf/arch/arm64/util/arm-spe.c | 106 ++++++++++++++++++-
> >   tools/perf/util/arm-spe.c            | 151 +++++++++++++++++++++++++--
> >   tools/perf/util/arm-spe.h            |  38 ++++++-
> >   3 files changed, 281 insertions(+), 14 deletions(-)
> > 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/5] perf arm-spe: Calculate meta data size
  2024-09-14 21:54 ` [PATCH v2 2/5] perf arm-spe: Calculate meta data size Leo Yan
@ 2024-09-27  6:14   ` Namhyung Kim
  2024-09-27  8:07     ` Leo Yan
  0 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2024-09-27  6:14 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-arm-kernel, linux-perf-users, linux-kernel

On Sat, Sep 14, 2024 at 10:54:55PM +0100, Leo Yan wrote:
> The metadata is designed to contain a header and per CPU information.
> 
> The arm_spe_find_cpus() function is introduced to identify how many CPUs
> support ARM SPE. Based on the CPU number, calculates the metadata size.
> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  tools/perf/arch/arm64/util/arm-spe.c | 35 +++++++++++++++++++++++++---
>  1 file changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
> index c2d5c8ca4900..15478989ef30 100644
> --- a/tools/perf/arch/arm64/util/arm-spe.c
> +++ b/tools/perf/arch/arm64/util/arm-spe.c
> @@ -37,11 +37,40 @@ struct arm_spe_recording {
>  	bool			*wrapped;
>  };
>  
> +static struct perf_cpu_map *arm_spe_find_cpus(struct evlist *evlist)

Please add a comment that it returns a new cpu map, and caller should
release the reference after use.

> +{
> +	struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus;
> +	struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus();
> +	struct perf_cpu_map *intersect_cpus;
> +
> +	/* cpu map is not "any" CPU , we have specific CPUs to work with */
> +	if (!perf_cpu_map__has_any_cpu(event_cpus)) {
> +		intersect_cpus = perf_cpu_map__intersect(event_cpus, online_cpus);
> +		perf_cpu_map__put(online_cpus);
> +	/* Event can be "any" CPU so count all CPUs. */
> +	} else {
> +		intersect_cpus = online_cpus;
> +	}
> +
> +	return intersect_cpus;
> +}
> +
>  static size_t
>  arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
> -		       struct evlist *evlist __maybe_unused)
> +		       struct evlist *evlist)
>  {
> -	return ARM_SPE_AUXTRACE_V1_PRIV_SIZE;
> +	struct perf_cpu_map *cpu_map = arm_spe_find_cpus(evlist);
> +	size_t size;
> +
> +	if (!cpu_map)
> +		return 0;
> +
> +	size = ARM_SPE_AUXTRACE_PRIV_MAX +
> +	       ARM_SPE_CPU_PRIV_MAX * perf_cpu_map__nr(cpu_map);
> +	size *= sizeof(u64);
> +
> +	perf_cpu_map__put(cpu_map);
> +	return size;
>  }
>  
>  static int arm_spe_info_fill(struct auxtrace_record *itr,
> @@ -53,7 +82,7 @@ static int arm_spe_info_fill(struct auxtrace_record *itr,
>  			container_of(itr, struct arm_spe_recording, itr);
>  	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
>  
> -	if (priv_size != ARM_SPE_AUXTRACE_V1_PRIV_SIZE)
> +	if (priv_size != arm_spe_info_priv_size(itr, session->evlist))
>  		return -EINVAL;
>  
>  	if (!session->evlist->core.nr_mmaps)
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/5] perf arm-spe: Save per CPU information in metadata
  2024-09-14 21:54 ` [PATCH v2 3/5] perf arm-spe: Save per CPU information in metadata Leo Yan
@ 2024-09-27  6:21   ` Namhyung Kim
  2024-09-27  8:16     ` Leo Yan
  0 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2024-09-27  6:21 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-arm-kernel, linux-perf-users, linux-kernel

On Sat, Sep 14, 2024 at 10:54:56PM +0100, Leo Yan wrote:
> Save the Arm SPE information on a per-CPU basis. This approach is easier
> in the decoding phase for retrieving metadata based on the CPU number of
> every Arm SPE record.
> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  tools/perf/arch/arm64/util/arm-spe.c | 71 +++++++++++++++++++++++++++-
>  1 file changed, 70 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
> index 15478989ef30..2790a37709a5 100644
> --- a/tools/perf/arch/arm64/util/arm-spe.c
> +++ b/tools/perf/arch/arm64/util/arm-spe.c
> @@ -26,6 +26,8 @@
>  #include "../../../util/arm-spe.h"
>  #include <tools/libc_compat.h> // reallocarray
>  
> +#define ARM_SPE_CPU_MAGIC		0x1010101010101010ULL
> +
>  #define KiB(x) ((x) * 1024)
>  #define MiB(x) ((x) * 1024 * 1024)
>  
> @@ -73,14 +75,66 @@ arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
>  	return size;
>  }
>  
> +static int arm_spe_save_cpu_header(struct auxtrace_record *itr,
> +				   struct perf_cpu cpu, __u64 data[])
> +{
> +	struct arm_spe_recording *sper =
> +			container_of(itr, struct arm_spe_recording, itr);
> +	struct perf_pmu *pmu = NULL;
> +	struct perf_pmu tmp_pmu;
> +	char cpu_id_str[16];
> +	char *cpuid = NULL;
> +	u64 val;
> +
> +	snprintf(cpu_id_str, sizeof(cpu_id_str), "%d", cpu.cpu);
> +	tmp_pmu.cpus = perf_cpu_map__new(cpu_id_str);
> +	if (!tmp_pmu.cpus)
> +		return -ENOMEM;
> +
> +	/* Read CPU MIDR */
> +	cpuid = perf_pmu__getcpuid(&tmp_pmu);
> +	if (!cpuid)
> +		return -ENOMEM;

You'd better call perf_cpu_map__put() before return.


> +	val = strtol(cpuid, NULL, 16);
> +	perf_cpu_map__put(tmp_pmu.cpus);
> +
> +	data[ARM_SPE_MAGIC] = ARM_SPE_CPU_MAGIC;
> +	data[ARM_SPE_CPU] = cpu.cpu;
> +	data[ARM_SPE_CPU_NR_PARAMS] = ARM_SPE_CPU_PRIV_MAX - ARM_SPE_CPU_MIDR;
> +	data[ARM_SPE_CPU_MIDR] = val;
> +
> +	/* Find the associate Arm SPE PMU for the CPU */
> +	if (perf_cpu_map__has(sper->arm_spe_pmu->cpus, cpu))
> +		pmu = sper->arm_spe_pmu;
> +
> +	if (!pmu) {
> +		/* No Arm SPE PMU is found */
> +		data[ARM_SPE_CPU_PMU_TYPE] = ULLONG_MAX;
> +		data[ARM_SPE_CAP_MIN_IVAL] = 0;
> +	} else {
> +		data[ARM_SPE_CPU_PMU_TYPE] = pmu->type;
> +
> +		if (perf_pmu__scan_file(pmu, "caps/min_interval", "%lu", &val) != 1)
> +			val = 0;
> +		data[ARM_SPE_CAP_MIN_IVAL] = val;
> +	}
> +
> +	return ARM_SPE_CPU_PRIV_MAX;
> +}
> +
>  static int arm_spe_info_fill(struct auxtrace_record *itr,
>  			     struct perf_session *session,
>  			     struct perf_record_auxtrace_info *auxtrace_info,
>  			     size_t priv_size)
>  {
> +	int i, ret;
> +	size_t offset;
>  	struct arm_spe_recording *sper =
>  			container_of(itr, struct arm_spe_recording, itr);
>  	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
> +	struct perf_cpu_map *cpu_map = arm_spe_find_cpus(session->evlist);

Maybe you can move this to later in the function to make the error
handling easier.  Otherwise it should call perf_cpu_map__put().


> +	struct perf_cpu cpu;
> +	__u64 *data;
>  
>  	if (priv_size != arm_spe_info_priv_size(itr, session->evlist))
>  		return -EINVAL;
> @@ -89,8 +143,23 @@ static int arm_spe_info_fill(struct auxtrace_record *itr,
>  		return -EINVAL;
>  
>  	auxtrace_info->type = PERF_AUXTRACE_ARM_SPE;
> -	auxtrace_info->priv[ARM_SPE_PMU_TYPE] = arm_spe_pmu->type;
> +	auxtrace_info->priv[ARM_SPE_HEADER_VERSION] = ARM_SPE_HEADER_CURRENT_VERSION;
> +	auxtrace_info->priv[ARM_SPE_HEADER_SIZE] =
> +		ARM_SPE_AUXTRACE_PRIV_MAX - ARM_SPE_HEADER_VERSION;
> +	auxtrace_info->priv[ARM_SPE_SHARED_PMU_TYPE] = arm_spe_pmu->type;
> +	auxtrace_info->priv[ARM_SPE_CPUS_NUM] = perf_cpu_map__nr(cpu_map);
> +
> +	offset = ARM_SPE_AUXTRACE_PRIV_MAX;
> +	perf_cpu_map__for_each_cpu(cpu, i, cpu_map) {
> +		assert(offset < priv_size);
> +		data = &auxtrace_info->priv[offset];
> +		ret = arm_spe_save_cpu_header(itr, cpu, data);
> +		if (ret < 0)
> +			return ret;

Please break the loop and release the cpu map.

Thanks,
Namhyung


> +		offset += ret;
> +	}
>  
> +	perf_cpu_map__put(cpu_map);
>  	return 0;
>  }
>  
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/5] perf arm-spe: Define metadata header version 2
  2024-09-14 21:54 ` [PATCH v2 1/5] perf arm-spe: Define metadata header " Leo Yan
@ 2024-09-27  6:32   ` Namhyung Kim
  2024-09-27  8:05     ` Leo Yan
  0 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2024-09-27  6:32 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-arm-kernel, linux-perf-users, linux-kernel

On Sat, Sep 14, 2024 at 10:54:54PM +0100, Leo Yan wrote:
> The first version's metadata header structure doesn't include a field to
> indicate a header version, which is not friendly for extension.
> 
> Define the metadata version 2 format with a new header structure and
> extend per CPU's metadata. In the meantime, the old metadata header will
> still be supported for backward compatibility.
> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  tools/perf/arch/arm64/util/arm-spe.c |  4 +--
>  tools/perf/util/arm-spe.c            |  2 +-
>  tools/perf/util/arm-spe.h            | 38 +++++++++++++++++++++++++++-
>  3 files changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
> index 2be99fdf997d..c2d5c8ca4900 100644
> --- a/tools/perf/arch/arm64/util/arm-spe.c
> +++ b/tools/perf/arch/arm64/util/arm-spe.c
> @@ -41,7 +41,7 @@ static size_t
>  arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
>  		       struct evlist *evlist __maybe_unused)
>  {
> -	return ARM_SPE_AUXTRACE_PRIV_SIZE;
> +	return ARM_SPE_AUXTRACE_V1_PRIV_SIZE;
>  }
>  
>  static int arm_spe_info_fill(struct auxtrace_record *itr,
> @@ -53,7 +53,7 @@ static int arm_spe_info_fill(struct auxtrace_record *itr,
>  			container_of(itr, struct arm_spe_recording, itr);
>  	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
>  
> -	if (priv_size != ARM_SPE_AUXTRACE_PRIV_SIZE)
> +	if (priv_size != ARM_SPE_AUXTRACE_V1_PRIV_SIZE)
>  		return -EINVAL;
>  
>  	if (!session->evlist->core.nr_mmaps)
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 138ffc71b32d..70989b1bae47 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -1262,7 +1262,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
>  				  struct perf_session *session)
>  {
>  	struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info;
> -	size_t min_sz = sizeof(u64) * ARM_SPE_AUXTRACE_PRIV_MAX;
> +	size_t min_sz = ARM_SPE_AUXTRACE_V1_PRIV_SIZE;
>  	struct perf_record_time_conv *tc = &session->time_conv;
>  	const char *cpuid = perf_env__cpuid(session->evlist->env);
>  	u64 midr = strtol(cpuid, NULL, 16);
> diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h
> index 4f4900c18f3e..5416d4e1d15f 100644
> --- a/tools/perf/util/arm-spe.h
> +++ b/tools/perf/util/arm-spe.h
> @@ -12,10 +12,46 @@
>  enum {
>  	ARM_SPE_PMU_TYPE,
>  	ARM_SPE_PER_CPU_MMAPS,
> +	ARM_SPE_AUXTRACE_V1_PRIV_MAX,
> +};
> +
> +#define ARM_SPE_AUXTRACE_V1_PRIV_SIZE	\
> +	(ARM_SPE_AUXTRACE_V1_PRIV_MAX * sizeof(u64))
> +
> +enum {
> +	/*
> +	 * The old metadata format (defined above) does not include a
> +	 * field for version number. Version 1 is reserved and starts
> +	 * from version 2.
> +	 */
> +	ARM_SPE_HEADER_VERSION,
> +	/* Number of sizeof(u64) */
> +	ARM_SPE_HEADER_SIZE,
> +	/* PMU type shared by CPUs */
> +	ARM_SPE_SHARED_PMU_TYPE,
> +	/* Number of CPUs */
> +	ARM_SPE_CPUS_NUM,
>  	ARM_SPE_AUXTRACE_PRIV_MAX,
>  };

Why don't you define something like struct arm_spe_header_v2 ?

Thanks,
Namhyung

>  
> -#define ARM_SPE_AUXTRACE_PRIV_SIZE (ARM_SPE_AUXTRACE_PRIV_MAX * sizeof(u64))
> +enum {
> +	/* Magic number */
> +	ARM_SPE_MAGIC,
> +	/* CPU logical number in system */
> +	ARM_SPE_CPU,
> +	/* Number of parameters */
> +	ARM_SPE_CPU_NR_PARAMS,
> +	/* CPU MIDR */
> +	ARM_SPE_CPU_MIDR,
> +	/* Associated PMU type */
> +	ARM_SPE_CPU_PMU_TYPE,
> +	/* Minimal interval */
> +	ARM_SPE_CAP_MIN_IVAL,
> +	ARM_SPE_CPU_PRIV_MAX,
> +};
> +
> +#define ARM_SPE_HEADER_CURRENT_VERSION	2
> +
>  
>  union perf_event;
>  struct perf_session;
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/5] perf arm-spe: Support metadata version 2
  2024-09-14 21:54 ` [PATCH v2 4/5] perf arm-spe: Support metadata version 2 Leo Yan
@ 2024-09-27  6:35   ` Namhyung Kim
  2024-09-27  8:32     ` Leo Yan
  0 siblings, 1 reply; 17+ messages in thread
From: Namhyung Kim @ 2024-09-27  6:35 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-arm-kernel, linux-perf-users, linux-kernel

On Sat, Sep 14, 2024 at 10:54:57PM +0100, Leo Yan wrote:
> This commit is to support metadata version 2 and at the meantime it is
> backward compatible for version 1's format.
> 
> The metadata version 1 doesn't include the ARM_SPE_HEADER_VERSION field.
> As version 1 is fixed with two u64 fields, by checking the metadata
> size, it distinguishes the metadata is version 1 or version 2 (and any
> new versions if later will have). For version 2, it reads out CPU number
> and retrieves the metadata info for every CPU.
> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  tools/perf/util/arm-spe.c | 95 +++++++++++++++++++++++++++++++++++++--
>  1 file changed, 92 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 70989b1bae47..17782cb40fb5 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -78,6 +78,10 @@ struct arm_spe {
>  
>  	unsigned long			num_events;
>  	u8				use_ctx_pkt_for_pid;
> +
> +	u64				**metadata;
> +	u64				metadata_ver;
> +	u64				metadata_nr_cpu;
>  };
>  
>  struct arm_spe_queue {
> @@ -1016,6 +1020,73 @@ static int arm_spe_flush(struct perf_session *session __maybe_unused,
>  	return 0;
>  }
>  
> +static u64 *arm_spe__alloc_per_cpu_metadata(u64 *buf, int cpu_size)
> +{
> +	u64 *metadata;
> +
> +	metadata = zalloc(sizeof(*metadata) * cpu_size);

Maybe calloc() is slightly better here, but not a strong opinion.


> +	if (!metadata)
> +		return NULL;
> +
> +	memcpy(metadata, buf, cpu_size);

I'm not sure if it's correct since you allocated cpu_size * 8 and copies
only cpu_size.


> +	return metadata;
> +}
> +
> +static void arm_spe__free_metadata(u64 **metadata, int nr_cpu)
> +{
> +	int i;
> +
> +	for (i = 0; i < nr_cpu; i++)
> +		zfree(&metadata[i]);
> +	free(metadata);
> +}
> +
> +static u64 **arm_spe__alloc_metadata(struct perf_record_auxtrace_info *info,
> +				     u64 *ver, int *nr_cpu)
> +{
> +	u64 *ptr = (u64 *)info->priv;
> +	u64 metadata_size;
> +	u64 **metadata = NULL;
> +	int hdr_sz, cpu_sz, i;
> +
> +	metadata_size = info->header.size -
> +		sizeof(struct perf_record_auxtrace_info);
> +
> +	/* Metadata version 1 */
> +	if (metadata_size == ARM_SPE_AUXTRACE_V1_PRIV_SIZE) {
> +		*ver = 1;
> +		*nr_cpu = 0;
> +		/* No per CPU metadata */
> +		return NULL;
> +	}
> +
> +	*ver = ptr[ARM_SPE_HEADER_VERSION];
> +	hdr_sz = ptr[ARM_SPE_HEADER_SIZE];
> +	*nr_cpu = ptr[ARM_SPE_CPUS_NUM];
> +
> +	metadata = zalloc(sizeof(*metadata) * (*nr_cpu));

calloc() instead?  But probably better defining a struct for metadata.

Thanks,
Namhyung


> +	if (!metadata)
> +		return NULL;
> +
> +	/* Locate the start address of per CPU metadata */
> +	ptr += hdr_sz;
> +	cpu_sz = (metadata_size - (hdr_sz * sizeof(u64))) / (*nr_cpu);
> +
> +	for (i = 0; i < *nr_cpu; i++) {
> +		metadata[i] = arm_spe__alloc_per_cpu_metadata(ptr, cpu_sz);
> +		if (!metadata[i])
> +			goto err_per_cpu_metadata;
> +
> +		ptr += cpu_sz / sizeof(u64);
> +	}
> +
> +	return metadata;
> +
> +err_per_cpu_metadata:
> +	arm_spe__free_metadata(metadata, *nr_cpu);
> +	return NULL;
> +}
> +
>  static void arm_spe_free_queue(void *priv)
>  {
>  	struct arm_spe_queue *speq = priv;
> @@ -1050,6 +1121,7 @@ static void arm_spe_free(struct perf_session *session)
>  	auxtrace_heap__free(&spe->heap);
>  	arm_spe_free_events(session);
>  	session->auxtrace = NULL;
> +	arm_spe__free_metadata(spe->metadata, spe->metadata_nr_cpu);
>  	free(spe);
>  }
>  
> @@ -1267,15 +1339,24 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
>  	const char *cpuid = perf_env__cpuid(session->evlist->env);
>  	u64 midr = strtol(cpuid, NULL, 16);
>  	struct arm_spe *spe;
> -	int err;
> +	u64 **metadata = NULL;
> +	u64 metadata_ver;
> +	int nr_cpu, err;
>  
>  	if (auxtrace_info->header.size < sizeof(struct perf_record_auxtrace_info) +
>  					min_sz)
>  		return -EINVAL;
>  
> +	metadata = arm_spe__alloc_metadata(auxtrace_info, &metadata_ver,
> +					   &nr_cpu);
> +	if (!metadata && metadata_ver != 1) {
> +		pr_err("Failed to parse Arm SPE metadata.\n");
> +		return -EINVAL;
> +	}
> +
>  	spe = zalloc(sizeof(struct arm_spe));
>  	if (!spe)
> -		return -ENOMEM;
> +		goto err_free_metadata;
>  
>  	err = auxtrace_queues__init(&spe->queues);
>  	if (err)
> @@ -1284,8 +1365,14 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
>  	spe->session = session;
>  	spe->machine = &session->machines.host; /* No kvm support */
>  	spe->auxtrace_type = auxtrace_info->type;
> -	spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
> +	if (metadata_ver == 1)
> +		spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
> +	else
> +		spe->pmu_type = auxtrace_info->priv[ARM_SPE_SHARED_PMU_TYPE];
>  	spe->midr = midr;
> +	spe->metadata = metadata;
> +	spe->metadata_ver = metadata_ver;
> +	spe->metadata_nr_cpu = nr_cpu;
>  
>  	spe->timeless_decoding = arm_spe__is_timeless_decoding(spe);
>  
> @@ -1346,5 +1433,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
>  	session->auxtrace = NULL;
>  err_free:
>  	free(spe);
> +err_free_metadata:
> +	arm_spe__free_metadata(metadata, nr_cpu);
>  	return err;
>  }
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/5] perf arm-spe: Define metadata header version 2
  2024-09-27  6:32   ` Namhyung Kim
@ 2024-09-27  8:05     ` Leo Yan
  0 siblings, 0 replies; 17+ messages in thread
From: Leo Yan @ 2024-09-27  8:05 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-arm-kernel, linux-perf-users, linux-kernel

On 9/27/24 07:32, Namhyung Kim wrote:

[...]

>> diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h
>> index 4f4900c18f3e..5416d4e1d15f 100644
>> --- a/tools/perf/util/arm-spe.h
>> +++ b/tools/perf/util/arm-spe.h
>> @@ -12,10 +12,46 @@
>>   enum {
>>        ARM_SPE_PMU_TYPE,
>>        ARM_SPE_PER_CPU_MMAPS,
>> +     ARM_SPE_AUXTRACE_V1_PRIV_MAX,
>> +};
>> +
>> +#define ARM_SPE_AUXTRACE_V1_PRIV_SIZE        \
>> +     (ARM_SPE_AUXTRACE_V1_PRIV_MAX * sizeof(u64))
>> +
>> +enum {
>> +     /*
>> +      * The old metadata format (defined above) does not include a
>> +      * field for version number. Version 1 is reserved and starts
>> +      * from version 2.
>> +      */
>> +     ARM_SPE_HEADER_VERSION,
>> +     /* Number of sizeof(u64) */
>> +     ARM_SPE_HEADER_SIZE,
>> +     /* PMU type shared by CPUs */
>> +     ARM_SPE_SHARED_PMU_TYPE,
>> +     /* Number of CPUs */
>> +     ARM_SPE_CPUS_NUM,
>>        ARM_SPE_AUXTRACE_PRIV_MAX,
>>   };
> 
> Why don't you define something like struct arm_spe_header_v2 ?

Here the target is to make metadata to be self-described and can be
easily extended for new version (but I would not expect updating header
will happen frequently).

The fields ARM_SPE_HEADER_VERSION and ARM_SPE_HEADER_SIZE give a
header version and a header size. This allows us to easily extend new
enum items for new header. A benefit is we can use general code for
decoding because we don't need to process structure for a specific
version.

Thanks for review!

Leo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/5] perf arm-spe: Calculate meta data size
  2024-09-27  6:14   ` Namhyung Kim
@ 2024-09-27  8:07     ` Leo Yan
  0 siblings, 0 replies; 17+ messages in thread
From: Leo Yan @ 2024-09-27  8:07 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-arm-kernel, linux-perf-users, linux-kernel



On 9/27/24 07:14, Namhyung Kim wrote:
> Warning: EXTERNAL SENDER, use caution when opening links or attachments.
> 
> 
> On Sat, Sep 14, 2024 at 10:54:55PM +0100, Leo Yan wrote:
>> The metadata is designed to contain a header and per CPU information.
>>
>> The arm_spe_find_cpus() function is introduced to identify how many CPUs
>> support ARM SPE. Based on the CPU number, calculates the metadata size.
>>
>> Signed-off-by: Leo Yan <leo.yan@arm.com>
>> ---
>>   tools/perf/arch/arm64/util/arm-spe.c | 35 +++++++++++++++++++++++++---
>>   1 file changed, 32 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
>> index c2d5c8ca4900..15478989ef30 100644
>> --- a/tools/perf/arch/arm64/util/arm-spe.c
>> +++ b/tools/perf/arch/arm64/util/arm-spe.c
>> @@ -37,11 +37,40 @@ struct arm_spe_recording {
>>        bool                    *wrapped;
>>   };
>>
>> +static struct perf_cpu_map *arm_spe_find_cpus(struct evlist *evlist)
> 
> Please add a comment that it returns a new cpu map, and caller should
> release the reference after use.

Will do.

Thanks,
Leo

>> +{
>> +     struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus;
>> +     struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus();
>> +     struct perf_cpu_map *intersect_cpus;
>> +
>> +     /* cpu map is not "any" CPU , we have specific CPUs to work with */
>> +     if (!perf_cpu_map__has_any_cpu(event_cpus)) {
>> +             intersect_cpus = perf_cpu_map__intersect(event_cpus, online_cpus);
>> +             perf_cpu_map__put(online_cpus);
>> +     /* Event can be "any" CPU so count all CPUs. */
>> +     } else {
>> +             intersect_cpus = online_cpus;
>> +     }
>> +
>> +     return intersect_cpus;
>> +}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/5] perf arm-spe: Save per CPU information in metadata
  2024-09-27  6:21   ` Namhyung Kim
@ 2024-09-27  8:16     ` Leo Yan
  0 siblings, 0 replies; 17+ messages in thread
From: Leo Yan @ 2024-09-27  8:16 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-arm-kernel, linux-perf-users, linux-kernel

On 9/27/24 07:21, Namhyung Kim wrote:>
> On Sat, Sep 14, 2024 at 10:54:56PM +0100, Leo Yan wrote:
>> Save the Arm SPE information on a per-CPU basis. This approach is easier
>> in the decoding phase for retrieving metadata based on the CPU number of
>> every Arm SPE record.
>>
>> Signed-off-by: Leo Yan <leo.yan@arm.com>
>> ---
>>   tools/perf/arch/arm64/util/arm-spe.c | 71 +++++++++++++++++++++++++++-
>>   1 file changed, 70 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
>> index 15478989ef30..2790a37709a5 100644
>> --- a/tools/perf/arch/arm64/util/arm-spe.c
>> +++ b/tools/perf/arch/arm64/util/arm-spe.c
>> @@ -26,6 +26,8 @@
>>   #include "../../../util/arm-spe.h"
>>   #include <tools/libc_compat.h> // reallocarray
>>
>> +#define ARM_SPE_CPU_MAGIC            0x1010101010101010ULL
>> +
>>   #define KiB(x) ((x) * 1024)
>>   #define MiB(x) ((x) * 1024 * 1024)
>>
>> @@ -73,14 +75,66 @@ arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
>>        return size;
>>   }
>>
>> +static int arm_spe_save_cpu_header(struct auxtrace_record *itr,
>> +                                struct perf_cpu cpu, __u64 data[])
>> +{
>> +     struct arm_spe_recording *sper =
>> +                     container_of(itr, struct arm_spe_recording, itr);
>> +     struct perf_pmu *pmu = NULL;
>> +     struct perf_pmu tmp_pmu;
>> +     char cpu_id_str[16];
>> +     char *cpuid = NULL;
>> +     u64 val;
>> +
>> +     snprintf(cpu_id_str, sizeof(cpu_id_str), "%d", cpu.cpu);
>> +     tmp_pmu.cpus = perf_cpu_map__new(cpu_id_str);
>> +     if (!tmp_pmu.cpus)
>> +             return -ENOMEM;
>> +
>> +     /* Read CPU MIDR */
>> +     cpuid = perf_pmu__getcpuid(&tmp_pmu);
>> +     if (!cpuid)
>> +             return -ENOMEM;
> 
> You'd better call perf_cpu_map__put() before return.

Will do.

Just for recording, 'cpuid' should be released at the end of function.

>> +     val = strtol(cpuid, NULL, 16);
>> +     perf_cpu_map__put(tmp_pmu.cpus);
>> +
>> +     data[ARM_SPE_MAGIC] = ARM_SPE_CPU_MAGIC;
>> +     data[ARM_SPE_CPU] = cpu.cpu;
>> +     data[ARM_SPE_CPU_NR_PARAMS] = ARM_SPE_CPU_PRIV_MAX - ARM_SPE_CPU_MIDR;
>> +     data[ARM_SPE_CPU_MIDR] = val;
>> +
>> +     /* Find the associate Arm SPE PMU for the CPU */
>> +     if (perf_cpu_map__has(sper->arm_spe_pmu->cpus, cpu))
>> +             pmu = sper->arm_spe_pmu;
>> +
>> +     if (!pmu) {
>> +             /* No Arm SPE PMU is found */
>> +             data[ARM_SPE_CPU_PMU_TYPE] = ULLONG_MAX;
>> +             data[ARM_SPE_CAP_MIN_IVAL] = 0;
>> +     } else {
>> +             data[ARM_SPE_CPU_PMU_TYPE] = pmu->type;
>> +
>> +             if (perf_pmu__scan_file(pmu, "caps/min_interval", "%lu", &val) != 1)
>> +                     val = 0;
>> +             data[ARM_SPE_CAP_MIN_IVAL] = val;
>> +     }
>> +
>> +     return ARM_SPE_CPU_PRIV_MAX;
>> +}
>> +
>>   static int arm_spe_info_fill(struct auxtrace_record *itr,
>>                             struct perf_session *session,
>>                             struct perf_record_auxtrace_info *auxtrace_info,
>>                             size_t priv_size)
>>   {
>> +     int i, ret;
>> +     size_t offset;
>>        struct arm_spe_recording *sper =
>>                        container_of(itr, struct arm_spe_recording, itr);
>>        struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
>> +     struct perf_cpu_map *cpu_map = arm_spe_find_cpus(session->evlist);
> 
> Maybe you can move this to later in the function to make the error
> handling easier.  Otherwise it should call perf_cpu_map__put().

Good point. Will do.

>> +     struct perf_cpu cpu;
>> +     __u64 *data;
>>
>>        if (priv_size != arm_spe_info_priv_size(itr, session->evlist))
>>                return -EINVAL;
>> @@ -89,8 +143,23 @@ static int arm_spe_info_fill(struct auxtrace_record *itr,
>>                return -EINVAL;
>>
>>        auxtrace_info->type = PERF_AUXTRACE_ARM_SPE;
>> -     auxtrace_info->priv[ARM_SPE_PMU_TYPE] = arm_spe_pmu->type;
>> +     auxtrace_info->priv[ARM_SPE_HEADER_VERSION] = ARM_SPE_HEADER_CURRENT_VERSION;
>> +     auxtrace_info->priv[ARM_SPE_HEADER_SIZE] =
>> +             ARM_SPE_AUXTRACE_PRIV_MAX - ARM_SPE_HEADER_VERSION;
>> +     auxtrace_info->priv[ARM_SPE_SHARED_PMU_TYPE] = arm_spe_pmu->type;
>> +     auxtrace_info->priv[ARM_SPE_CPUS_NUM] = perf_cpu_map__nr(cpu_map);
>> +
>> +     offset = ARM_SPE_AUXTRACE_PRIV_MAX;
>> +     perf_cpu_map__for_each_cpu(cpu, i, cpu_map) {
>> +             assert(offset < priv_size);
>> +             data = &auxtrace_info->priv[offset];
>> +             ret = arm_spe_save_cpu_header(itr, cpu, data);
>> +             if (ret < 0)
>> +                     return ret;
> 
> Please break the loop and release the cpu map.

Will do.

Thanks for good catchings!

Leo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/5] perf arm-spe: Support metadata version 2
  2024-09-27  6:35   ` Namhyung Kim
@ 2024-09-27  8:32     ` Leo Yan
  0 siblings, 0 replies; 17+ messages in thread
From: Leo Yan @ 2024-09-27  8:32 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-arm-kernel, linux-perf-users, linux-kernel



On 9/27/24 07:35, Namhyung Kim wrote:
> On Sat, Sep 14, 2024 at 10:54:57PM +0100, Leo Yan wrote:
>> This commit is to support metadata version 2 and at the meantime it is
>> backward compatible for version 1's format.
>>
>> The metadata version 1 doesn't include the ARM_SPE_HEADER_VERSION field.
>> As version 1 is fixed with two u64 fields, by checking the metadata
>> size, it distinguishes the metadata is version 1 or version 2 (and any
>> new versions if later will have). For version 2, it reads out CPU number
>> and retrieves the metadata info for every CPU.
>>
>> Signed-off-by: Leo Yan <leo.yan@arm.com>
>> ---
>>   tools/perf/util/arm-spe.c | 95 +++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 92 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
>> index 70989b1bae47..17782cb40fb5 100644
>> --- a/tools/perf/util/arm-spe.c
>> +++ b/tools/perf/util/arm-spe.c
>> @@ -78,6 +78,10 @@ struct arm_spe {
>>
>>        unsigned long                   num_events;
>>        u8                              use_ctx_pkt_for_pid;
>> +
>> +     u64                             **metadata;
>> +     u64                             metadata_ver;
>> +     u64                             metadata_nr_cpu;
>>   };
>>
>>   struct arm_spe_queue {
>> @@ -1016,6 +1020,73 @@ static int arm_spe_flush(struct perf_session *session __maybe_unused,
>>        return 0;
>>   }
>>
>> +static u64 *arm_spe__alloc_per_cpu_metadata(u64 *buf, int cpu_size)
>> +{
>> +     u64 *metadata;
>> +
>> +     metadata = zalloc(sizeof(*metadata) * cpu_size);
> 
> Maybe calloc() is slightly better here, but not a strong opinion.

Ah, here 'cpu_size' is the size of per CPU's metadata in bytes. It
should not multiply sizeof(*metadata), though this will not lead buffer
overwrite issue as it allocates a bigger buffer than used.

I will rename the 'cpu_size' argument to 'per_cpu_size'.

>> +     if (!metadata)
>> +             return NULL;
>> +
>> +     memcpy(metadata, buf, cpu_size);
> 
> I'm not sure if it's correct since you allocated cpu_size * 8 and copies
> only cpu_size.

As described above, here it is correct as the unit of 'cpu_size' is in
bytes.

>> +     return metadata;
>> +}
>> +
>> +static void arm_spe__free_metadata(u64 **metadata, int nr_cpu)
>> +{
>> +     int i;
>> +
>> +     for (i = 0; i < nr_cpu; i++)
>> +             zfree(&metadata[i]);
>> +     free(metadata);
>> +}
>> +
>> +static u64 **arm_spe__alloc_metadata(struct perf_record_auxtrace_info *info,
>> +                                  u64 *ver, int *nr_cpu)
>> +{
>> +     u64 *ptr = (u64 *)info->priv;
>> +     u64 metadata_size;
>> +     u64 **metadata = NULL;
>> +     int hdr_sz, cpu_sz, i;
>> +
>> +     metadata_size = info->header.size -
>> +             sizeof(struct perf_record_auxtrace_info);
>> +
>> +     /* Metadata version 1 */
>> +     if (metadata_size == ARM_SPE_AUXTRACE_V1_PRIV_SIZE) {
>> +             *ver = 1;
>> +             *nr_cpu = 0;
>> +             /* No per CPU metadata */
>> +             return NULL;
>> +     }
>> +
>> +     *ver = ptr[ARM_SPE_HEADER_VERSION];
>> +     hdr_sz = ptr[ARM_SPE_HEADER_SIZE];
>> +     *nr_cpu = ptr[ARM_SPE_CPUS_NUM];
>> +
>> +     metadata = zalloc(sizeof(*metadata) * (*nr_cpu));
> 
> calloc() instead?  But probably better defining a struct for metadata.

I will change to calloc(). It allocates a pointer array, so I will
try to change to 'u64 *metadata[] = NULL;' for clear definition.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2
  2024-09-27  6:10   ` Namhyung Kim
@ 2024-09-27  8:41     ` Leo Yan
  0 siblings, 0 replies; 17+ messages in thread
From: Leo Yan @ 2024-09-27  8:41 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, James Clark, Mike Leach, Will Deacon,
	Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-arm-kernel, linux-perf-users, linux-kernel

Hi Namhyung,

On 9/27/24 07:10, Namhyung Kim wrote:

>> Hi Arnaldo, Namhyung,
>>
>> Gentle ping. There is a bit backlog for Arm SPE patches, so I would
>> like to bring up this series and the series [1] for the Arm SPE data
>> source refactoring (which is dependent on the current series).
>>
>> Please kindly review and pick it up if it is fine for you.
> 
> While it seems like general changes, I'd like to see some ARM folks
> reviewing this series.

Fair point. I will chase review internally after sending a new spin (I think I 
turned my eyes to James ;)).

Thanks a lot for review!

Leo

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-09-27  8:41 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-14 21:54 [PATCH v2 0/5] perf arm-spe: Introduce metadata version 2 Leo Yan
2024-09-14 21:54 ` [PATCH v2 1/5] perf arm-spe: Define metadata header " Leo Yan
2024-09-27  6:32   ` Namhyung Kim
2024-09-27  8:05     ` Leo Yan
2024-09-14 21:54 ` [PATCH v2 2/5] perf arm-spe: Calculate meta data size Leo Yan
2024-09-27  6:14   ` Namhyung Kim
2024-09-27  8:07     ` Leo Yan
2024-09-14 21:54 ` [PATCH v2 3/5] perf arm-spe: Save per CPU information in metadata Leo Yan
2024-09-27  6:21   ` Namhyung Kim
2024-09-27  8:16     ` Leo Yan
2024-09-14 21:54 ` [PATCH v2 4/5] perf arm-spe: Support metadata version 2 Leo Yan
2024-09-27  6:35   ` Namhyung Kim
2024-09-27  8:32     ` Leo Yan
2024-09-14 21:54 ` [PATCH v2 5/5] perf arm-spe: Dump metadata with " Leo Yan
2024-09-26 16:20 ` [PATCH v2 0/5] perf arm-spe: Introduce metadata " Leo Yan
2024-09-27  6:10   ` Namhyung Kim
2024-09-27  8:41     ` Leo Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).