linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Athira Rajeev <atrajeev@linux.ibm.com>
To: acme@kernel.org, jolsa@kernel.org, adrian.hunter@intel.com,
	maddy@linux.ibm.com, irogers@google.com, namhyung@kernel.org
Cc: linux-perf-users@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	aboorvad@linux.ibm.com, sshegde@linux.ibm.com,
	atrajeev@linux.ibm.com, kjain@linux.ibm.com,
	hbathini@linux.vnet.ibm.com, Aditya.Bodkhe1@ibm.com,
	venkat88@linux.ibm.com
Subject: [PATCH 05/14] powerpc/perf/vpa-dtl: Add support to capture DTL data in aux buffer
Date: Fri, 15 Aug 2025 14:03:58 +0530	[thread overview]
Message-ID: <20250815083407.27953-6-atrajeev@linux.ibm.com> (raw)
In-Reply-To: <20250815083407.27953-1-atrajeev@linux.ibm.com>

vpa dtl pmu has one hrtimer added per vpa-dtl pmu thread. When the
hrtimer expires, in the timer handler, code is added to save the DTL
data to perf event record via vpa_dtl_capture_aux() function.
The DTL (Dispatch Trace Log) contains information
about dispatch/preempt, enqueue time etc. We directly copy the DTL
buffer data as part of auxiliary buffer. Data will be written to
disk only when the allocated buffer is full.

By this approach, all the DTL data will be present as-is in the
perf.data. The data will be post-processed in perf tools side when doing
perf report/perf script and this will avoid time taken to create samples
in the kernel space.

To corelate each DTL entry with other events across CPU's, we need to
map timebase from "struct dtl_entry" which phyp provides with boot
timebase. This also needs timebase frequency. Define "struct boottb_freq"
to save these details.

Added changes to capture the Dispatch Trace Log details to AUX buffer
in vpa_dtl_dump_sample_data(). Boot timebase and frequency needs to be
saved only at once, added field to indicate this as part of
"vpa_pmu_buf" structure.

perf_aux_output_begin: This function is called before writing to AUX
area. This returns the pointer to aux area private structure, ie
"struct vpa_pmu_buf". The function obtains the output handle
(used in perf_aux_output_end). when capture completes in
vpa_dtl_capture_aux(), call perf_aux_output_end() to commit the recorded
data. perf_aux_output_end() is called to move the aux->head of
"struct perf_buffer" to indicate size of data in aux buffer.
aux_tail will be moved in perf tools side when writing the data from
aux buffer to perf.data file in disk.

It is responsiblity of PMU driver to make sure data is copied between
perf_aux_output_begin and perf_aux_output_end.

Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
 arch/powerpc/perf/vpa-dtl.c | 131 +++++++++++++++++++++++++++++++++++-
 1 file changed, 130 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/vpa-dtl.c b/arch/powerpc/perf/vpa-dtl.c
index 364242cbfa8a..ce17beddd4b4 100644
--- a/arch/powerpc/perf/vpa-dtl.c
+++ b/arch/powerpc/perf/vpa-dtl.c
@@ -86,6 +86,24 @@ struct vpa_pmu_buf {
 	u64     *base;
 	u64     size;
 	u64     head;
+	/* boot timebase and frequency needs to be saved only at once */
+	int	boottb_freq_saved;
+};
+
+/*
+ * To corelate each DTL entry with other events across CPU's,
+ * we need to map timebase from "struct dtl_entry" which phyp
+ * provides with boot timebase. This also needs timebase frequency.
+ * Formula is: ((timbase from DTL entry - boot time) / frequency)
+ *
+ * To match with size of "struct dtl_entry" to ease post processing,
+ * padded 24 bytes to the structure.
+ */
+struct boottb_freq {
+	u64	boot_tb;
+	u64	tb_freq;
+	u64	timebase;
+	u64	padded[3];
 };
 
 static DEFINE_PER_CPU(struct vpa_pmu_ctx, vpa_pmu_ctx);
@@ -95,13 +113,123 @@ static DEFINE_PER_CPU(struct vpa_dtl, vpa_dtl_cpu);
 static int dtl_global_refc;
 static spinlock_t dtl_global_lock = __SPIN_LOCK_UNLOCKED(dtl_global_lock);
 
+/*
+ * Capture DTL data in AUX buffer
+ */
+static void vpa_dtl_capture_aux(long *n_entries, struct vpa_pmu_buf *buf,
+		struct vpa_dtl *dtl, int index)
+{
+	struct dtl_entry *aux_copy_buf = (struct dtl_entry *)buf->base;
+
+	/*
+	 * Copy to AUX buffer from per-thread address
+	 */
+	memcpy(aux_copy_buf + buf->head, &dtl->buf[index], *n_entries * sizeof(struct dtl_entry));
+
+	buf->head += *n_entries;
+
+	return;
+}
+
 /*
  * Function to dump the dispatch trace log buffer data to the
  * perf data.
+ *
+ * perf_aux_output_begin: This function is called before writing
+ * to AUX area. This returns the pointer to aux area private structure,
+ * ie "struct vpa_pmu_buf" here which is set in setup_aux() function.
+ * The function obtains the output handle (used in perf_aux_output_end).
+ * when capture completes in vpa_dtl_capture_aux(), call perf_aux_output_end()
+ * to commit the recorded data.
+ *
+ * perf_aux_output_end: This function commits data by adjusting the
+ * aux_head of "struct perf_buffer". aux_tail will be moved in perf tools
+ * side when writing the data from aux buffer to perf.data file in disk.
+ *
+ * Here in the private aux structure, we maintain head to know where
+ * to copy data next time in the PMU driver. vpa_pmu_buf->head is moved to
+ * maintain the aux head for PMU driver. It is responsiblity of PMU
+ * driver to make sure data is copied between perf_aux_output_begin and
+ * perf_aux_output_end.
+ *
+ * After data is copied in vpa_dtl_capture_aux() function, perf_aux_output_end()
+ * is called to move the aux->head of "struct perf_buffer" to indicate size of
+ * data in aux buffer. This will post a PERF_RECORD_AUX into the perf buffer.
+ * Data will be written to disk only when the allocated buffer is full.
+ *
+ * By this approach, all the DTL data will be present as-is in the
+ * perf.data. The data will be pre-processed in perf tools side when doing
+ * perf report/perf script and this will avoid time taken to create samples
+ * in the kernel space.
  */
 static void vpa_dtl_dump_sample_data(struct perf_event *event)
 {
-	return;
+	u64 cur_idx, last_idx, i;
+	u64 boot_tb;
+	struct boottb_freq boottb_freq;
+
+	/* actual number of entries read */
+	long n_read = 0, read_size = 0;
+
+	/* number of entries added to dtl buffer */
+	long n_req;
+
+	struct vpa_pmu_ctx *vpa_ctx = this_cpu_ptr(&vpa_pmu_ctx);
+
+	struct vpa_pmu_buf *aux_buf;
+
+	struct vpa_dtl *dtl = &per_cpu(vpa_dtl_cpu, event->cpu);
+
+	cur_idx = be64_to_cpu(lppaca_of(event->cpu).dtl_idx);
+	last_idx = dtl->last_idx;
+
+	if (last_idx + N_DISPATCH_LOG <= cur_idx)
+		last_idx = cur_idx - N_DISPATCH_LOG + 1;
+
+	n_req = cur_idx - last_idx;
+
+	/* no new entry added to the buffer, return */
+	if (n_req <= 0)
+		return;
+
+	dtl->last_idx = last_idx + n_req;
+	boot_tb = get_boot_tb();
+
+	i = last_idx % N_DISPATCH_LOG;
+
+	aux_buf = perf_aux_output_begin(&vpa_ctx->handle, event);
+	if (!aux_buf) {
+		pr_debug("returning. no aux\n");
+		return;
+	}
+
+	if (!aux_buf->boottb_freq_saved) {
+		pr_debug("Copying boot tb to aux buffer: %lld\n", boot_tb);
+		/* Save boot_tb to convert raw timebase to it's relative system boot time */
+		boottb_freq.boot_tb = boot_tb;
+		/* Save tb_ticks_per_sec to convert timebase to sec */
+		boottb_freq.tb_freq = tb_ticks_per_sec;
+		boottb_freq.timebase = 0;
+		memcpy(aux_buf->base, &boottb_freq, sizeof(boottb_freq));
+		aux_buf->head += 1;
+		aux_buf->boottb_freq_saved = 1;
+		n_read += 1;
+	}
+
+	/* read the tail of the buffer if we've wrapped */
+	if (i + n_req > N_DISPATCH_LOG) {
+		read_size = N_DISPATCH_LOG - i;
+		vpa_dtl_capture_aux(&read_size, aux_buf, dtl, i);
+		n_req -= read_size;
+		n_read += read_size;
+		i = 0;
+	}
+
+	/* .. and now the head */
+	vpa_dtl_capture_aux(&n_req, aux_buf, dtl, i);
+
+	/* Move the aux->head to indicate size of data in aux buffer */
+	perf_aux_output_end(&vpa_ctx->handle, (n_req + n_read) * sizeof(struct dtl_entry));
 }
 
 /*
@@ -372,6 +500,7 @@ static void *vpa_dtl_setup_aux(struct perf_event *event, void **pages,
 
 	buf->size = nr_pages << PAGE_SHIFT;
 	buf->head = 0;
+	buf->boottb_freq_saved = 0;
 	return no_free_ptr(buf);
 }
 
-- 
2.47.1


  parent reply	other threads:[~2025-08-15  8:35 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-15  8:33 [PATCH 00/14] Add interface to expose vpa dtl counters via perf Athira Rajeev
2025-08-15  8:33 ` [PATCH 01/14] powerpc/time: Expose boot_tb via accessor Athira Rajeev
2025-08-15  8:33 ` [PATCH 02/14] powerpc/vpa_dtl: Add interface to expose vpa dtl counters via perf Athira Rajeev
2025-08-20 11:53   ` Shrikanth Hegde
2025-09-04  7:25     ` Athira Rajeev
2025-08-15  8:33 ` [PATCH 03/14] docs: ABI: sysfs-bus-event_source-devices-vpa-dtl: Document sysfs event format entries for vpa_dtl pmu Athira Rajeev
2025-08-15  8:33 ` [PATCH 04/14] powerpc/perf/vpa-dtl: Add support to setup and free aux buffer for capturing DTL data Athira Rajeev
2025-08-15  8:33 ` Athira Rajeev [this message]
2025-08-15  8:33 ` [PATCH 06/14] powerpc/perf/vpa-dtl: Handle the writing of perf record when aux wake up is needed Athira Rajeev
2025-08-15  8:34 ` [PATCH 07/14] tools/perf: Add basic CONFIG_AUXTRACE support for VPA pmu on powerpc Athira Rajeev
2025-08-27 17:27   ` Adrian Hunter
2025-08-29  8:29     ` Athira Rajeev
2025-08-15  8:34 ` [PATCH 08/14] tools/perf: process auxtrace events and display in perf report -D Athira Rajeev
2025-08-27 17:28   ` Adrian Hunter
2025-08-29  8:31     ` Athira Rajeev
2025-08-15  8:34 ` [PATCH 09/14] tools/perf: Add event name as vpa-dtl of PERF_TYPE_SYNTH type to present DTL samples Athira Rajeev
2025-08-15  8:34 ` [PATCH 10/14] tools/perf: Allocate and setup aux buffer queue to help co-relate with other events across CPU's Athira Rajeev
2025-08-27 17:29   ` Adrian Hunter
2025-08-15  8:34 ` [PATCH 11/14] tools/perf: Process the DTL entries in queue and deliver samples Athira Rajeev
2025-08-27 17:29   ` Adrian Hunter
2025-08-29  8:33     ` Athira Rajeev
2025-08-15  8:34 ` [PATCH 12/14] tools/perf: Add support for printing synth event details via default callback Athira Rajeev
2025-08-27 17:29   ` Adrian Hunter
2025-08-29  8:35     ` Athira Rajeev
2025-08-15  8:34 ` [PATCH 13/14] tools/perf: Enable perf script to present the DTL entries Athira Rajeev
2025-08-27 17:30   ` Adrian Hunter
2025-08-15  8:34 ` [PATCH 14/14] powerpc/perf/vpa-dtl: Add documentation for VPA dispatch trace log PMU Athira Rajeev
2025-08-15 12:17 ` [PATCH 00/14] Add interface to expose vpa dtl counters via perf Venkat Rao Bagalkote
2025-08-15 12:51   ` Athira Rajeev
2025-08-18 14:41 ` tejas05

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250815083407.27953-6-atrajeev@linux.ibm.com \
    --to=atrajeev@linux.ibm.com \
    --cc=Aditya.Bodkhe1@ibm.com \
    --cc=aboorvad@linux.ibm.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=hbathini@linux.vnet.ibm.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kjain@linux.ibm.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maddy@linux.ibm.com \
    --cc=namhyung@kernel.org \
    --cc=sshegde@linux.ibm.com \
    --cc=venkat88@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).