linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Athira Rajeev <atrajeev@linux.ibm.com>
To: Venkat Rao Bagalkote <venkat88@linux.ibm.com>,
	Namhyung Kim <namhyung@kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Ian Rogers <irogers@google.com>,
	"open list:PERFORMANCE EVENTS SUBSYSTEM"
	<linux-perf-users@vger.kernel.org>,
	PowerPC <linuxppc-dev@lists.ozlabs.org>,
	Aboorva Devarajan <aboorvad@linux.ibm.com>,
	Shrikanth Hegde <sshegde@linux.ibm.com>,
	Kajol Jain <kjain@linux.ibm.com>,
	hbathini@linux.vnet.ibm.com,
	Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Subject: Re: [PATCH 00/14] Add interface to expose vpa dtl counters via perf
Date: Fri, 15 Aug 2025 18:21:50 +0530	[thread overview]
Message-ID: <CD5A1BA3-B2A6-414A-A6EB-77C50A95C3F0@linux.ibm.com> (raw)
In-Reply-To: <df16509f-36b2-4f0b-8de8-685fa8703ea1@linux.ibm.com>



> On 15 Aug 2025, at 5:47 PM, Venkat Rao Bagalkote <venkat88@linux.ibm.com> wrote:
> 
> 
> On 15/08/25 2:03 pm, Athira Rajeev wrote:
>> The pseries Shared Processor Logical Partition(SPLPAR) machines can
>> retrieve a log of dispatch and preempt events from the hypervisor
>> using data from Disptach Trace Log(DTL) buffer. With this information,
>> user can retrieve when and why each dispatch & preempt has occurred.
>> The vpa-dtl PMU exposes the Virtual Processor Area(VPA) DTL counters
>> via perf.
>> 
>> - Patches 1 to 6 has powerpc PMU driver code changes to capture DTL
>>   trace in perf.data. And patch 14 has documentation update.
>> - Patch 7 to 13 is perf tools side code changes to enable perf
>>   report/script on perf.data file
>> 
>> Infrastructure used
>> ===================
>> 
>> The VPA DTL PMU counters do not interrupt on overflow or generate any
>> PMI interrupts. Therefore, hrtimer is used to poll the DTL data. The timer
>> nterval can be provided by user via sample_period field in nano seconds.
>> vpa dtl pmu has one hrtimer added per vpa-dtl pmu thread. DTL (Dispatch
>> Trace Log) contains information about dispatch/preempt, enqueue time etc.
>> We directly copy the DTL buffer data as part of auxiliary buffer and it
>> will be processed later. This will avoid time taken to create samples
>> in the kernel space. The PMU driver collecting Dispatch Trace Log (DTL)
>> entries makes use of AUX support in perf infrastructure. On the tools side,
>> this data is made available as PERF_RECORD_AUXTRACE records.
>> 
>> To corelate each DTL entry with other events across CPU's, an auxtrace_queue
>> is created for each CPU. Each auxtrace queue has a array/list of auxtrace buffers.
>> All auxtrace queues is maintained in auxtrace heap. The queues are sorted
>> based on timestamp. When the different PERF_RECORD_XX records are processed,
>> compare the timestamp of perf record with timestamp of top element in the
>> auxtrace heap so that DTL events can be co-related with other events
>> Process the auxtrace queue if the timestamp of element from heap is
>> lower than timestamp from entry in perf record. Sometimes it could happen that
>> one buffer is only partially processed. if the timestamp of occurrence of
>> another event is more than currently processed element in the queue, it will
>> move on to next perf record. So keep track of position of buffer to continue
>> processing next time. Update the timestamp of the auxtrace heap with the timestamp
>> of last processed entry from the auxtrace buffer.
>> 
>> This infrastructure ensures dispatch trace log entries can be corelated
>> and presented along with other events like sched.
>> 
>> vpa-dtl PMU example usage
>> 
>>   # ls /sys/devices/vpa_dtl/
>>   events  format  perf_event_mux_interval_ms  power  subsystem  type  uevent
>> 
>> 
>> To capture the DTL data using perf record:
>> 
>>   # ./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1
>> 
>> The result can be interpreted using perf report. Snippet of perf report -D:
>> 
>>   # ./perf report -D
>> 
>> There are different PERF_RECORD_XX records. In that records corresponding to
>> auxtrace buffers includes:
>> 
>> 1. PERF_RECORD_AUX
>>    Conveys that new data is available in AUX area
>> 
>> 2. PERF_RECORD_AUXTRACE_INFO
>>    Describes offset and size of auxtrace data in the buffers
>> 
>> 3. PERF_RECORD_AUXTRACE
>>    This is the record that defines the auxtrace data which here in case of
>>    vpa-dtl pmu is dispatch trace log data.
>> 
>> Snippet from perf report -D showing the PERF_RECORD_AUXTRACE dump
>> 
>> 0 0 0x39b10 [0x30]: PERF_RECORD_AUXTRACE size: 0x690  offset: 0  ref: 0  idx: 0  tid: -1  cpu: 0
>> .
>> . ... VPA DTL PMU data: size 1680 bytes, entries is 35
>> .  00000000: boot_tb: 21349649546353231, tb_freq: 512000000
>> .  00000030: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:7064, ready_to_enqueue_time:187, waiting_to_ready_time:6611773
>> .  00000060: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:146, ready_to_enqueue_time:0, waiting_to_ready_time:15359437
>> .  00000090: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:4868, ready_to_enqueue_time:232, waiting_to_ready_time:5100709
>> .  000000c0: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:179, ready_to_enqueue_time:0, waiting_to_ready_time:30714243
>> .  000000f0: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:197, ready_to_enqueue_time:0, waiting_to_ready_time:15350648
>> .  00000120: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:213, ready_to_enqueue_time:0, waiting_to_ready_time:15353446
>> .  00000150: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:212, ready_to_enqueue_time:0, waiting_to_ready_time:15355126
>> .  00000180: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:6368, ready_to_enqueue_time:164, waiting_to_ready_time:5104665
>> 
>> Above is representation of dtl entry of below format:
>> 
>> struct dtl_entry {
>>         u8      dispatch_reason;
>>         u8      preempt_reason;
>>         u16     processor_id;
>>         u32     enqueue_to_dispatch_time;
>>         u32     ready_to_enqueue_time;
>>         u32     waiting_to_ready_time;
>>         u64     timebase;
>>         u64     fault_addr;
>>         u64     srr0;
>>         u64     srr1;
>>  };
>> 
>> First two fields represent the dispatch reason and preempt reason. The post
>> procecssing of PERF_RECORD_AUXTRACE records will translate to meaninful data
>> for user to consume.
>> 
>> Visualize the dispatch trace log entries with perf report:
>> 
>>   # ./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1
>>   [ perf record: Woken up 1 times to write data ]
>>   [ perf record: Captured and wrote 0.300 MB perf.data ]
>> 
>>   # ./perf report
>>   # Samples: 321  of event 'vpa-dtl'
>>   # Event count (approx.): 321
>>   #
>>   # Children      Self  Command  Shared Object      Symbol
>>   # ........  ........  .......  .................  ..............................
>>   #
>>      100.00%   100.00%  swapper  [kernel.kallsyms]  [k] plpar_hcall_norets_notrace
>> 
>> Visualize the dispatch trace log entries with perf script:
>> 
>>    # ./perf script
>>             perf   13322 [002]   233.835807:                     sched:sched_switch: perf:13322 [120] R ==> migration/2:27 [0]
>>      migration/2      27 [002]   233.835811:               sched:sched_migrate_task: comm=perf pid=13322 prio=120 orig_cpu=2 dest_cpu=3
>>      migration/2      27 [002]   233.835818:               sched:sched_stat_runtime: comm=migration/2 pid=27 runtime=9214 [ns]
>>      migration/2      27 [002]   233.835819:                     sched:sched_switch: migration/2:27 [0] S ==> swapper/2:0 [120]
>>          swapper       0 [002]   233.835822:                                vpa-dtl: timebase: 338954486062657 dispatch_reason:decrementer_interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:435,                      ready_to_enqueue_time:0, waiting_to_ready_time:34775058, processor_id: 202 c0000000000f8094 plpar_hcall_norets_notrace+0x18 ([kernel.kallsyms])
>>          swapper       0 [001]   233.835886:                                vpa-dtl: timebase: 338954486095398 dispatch_reason:priv_doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:542,            ready_to_enqueue_time:0, waiting_to_ready_time:1245360, processor_id: 201 c0000000000f8094 plpar_hcall_norets_notrace+0x18 ([kernel.kallsyms])
>> 
>> Aboorva Devarajan (1):
>>   powerpc/time: Expose boot_tb via accessor
>> 
>> Athira Rajeev (11):
>>   powerpc/perf/vpa-dtl: Add support to setup and free aux buffer for
>>     capturing DTL data
>>   powerpc/perf/vpa-dtl: Add support to capture DTL data in aux buffer
>>   powerpc/perf/vpa-dtl: Handle the writing of perf record when aux wake
>>     up is needed
>>   tools/perf: Add basic CONFIG_AUXTRACE support for VPA pmu on powerpc
>>   tools/perf: process auxtrace events and display in perf report -D
>>   tools/perf: Add event name as vpa-dtl of PERF_TYPE_SYNTH type to
>>     present DTL samples
>>   tools/perf: Allocate and setup aux buffer queue to help co-relate with
>>     other events across CPU's
>>   tools/perf: Process the DTL entries in queue and deliver samples
>>   tools/perf: Add support for printing synth event details via default
>>     callback
>>   tools/perf: Enable perf script to present the DTL entries
>>   powerpc/perf/vpa-dtl: Add documentation for VPA dispatch trace log PMU
>> 
>> Kajol Jain (2):
>>   powerpc/vpa_dtl: Add interface to expose vpa dtl counters via perf
>>   docs: ABI: sysfs-bus-event_source-devices-vpa-dtl: Document sysfs
>>     event format entries for vpa_dtl pmu
>> 
>>  .../sysfs-bus-event_source-devices-vpa-dtl    |  25 +
>>  Documentation/arch/powerpc/index.rst          |   1 +
>>  Documentation/arch/powerpc/vpa-dtl.rst        | 155 ++++
>>  arch/powerpc/include/asm/time.h               |   2 +
>>  arch/powerpc/kernel/time.c                    |   7 +-
>>  arch/powerpc/perf/Makefile                    |   2 +-
>>  arch/powerpc/perf/vpa-dtl.c                   | 605 ++++++++++++++
>>  tools/perf/arch/powerpc/util/Build            |   1 +
>>  tools/perf/arch/powerpc/util/auxtrace.c       | 122 +++
>>  tools/perf/builtin-script.c                   |  26 +
>>  tools/perf/util/Build                         |   1 +
>>  tools/perf/util/auxtrace.c                    |   4 +
>>  tools/perf/util/auxtrace.h                    |   1 +
>>  tools/perf/util/event.h                       |   1 +
>>  tools/perf/util/powerpc-vpadtl.c              | 756 ++++++++++++++++++
>>  tools/perf/util/powerpc-vpadtl.h              |  45 ++
>>  16 files changed, 1752 insertions(+), 2 deletions(-)
>>  create mode 100644 Documentation/ABI/testing/sysfs-bus-event_source-devices-vpa-dtl
>>  create mode 100644 Documentation/arch/powerpc/vpa-dtl.rst
>>  create mode 100644 arch/powerpc/perf/vpa-dtl.c
>>  create mode 100644 tools/perf/arch/powerpc/util/auxtrace.c
>>  create mode 100644 tools/perf/util/powerpc-vpadtl.c
>>  create mode 100644 tools/perf/util/powerpc-vpadtl.h
>> 
> 
> Tested this patchset by applying on top of mainline kernel, and its working as expected. Hence for the entire series, please add below tag.
> 
> 
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> 
> 
> Regards,
> 
> Venkat.
Thanks Venkat for checking.

Hi All,

I have CC-ed linuxppc-dev for powerpc specific patches and linux-perf-users for perf tools side related patches.

As mentioned in cover letter:
- Patches 1 to 6 has powerpc PMU driver code changes to capture trace data in perf.data, And patch 14 has documentation update. 
- Patch 7 to 13 is perf tools side code changes to enable perf report/script on perf.data

Thanks
Athira


  reply	other threads:[~2025-08-15 12:52 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-15  8:33 [PATCH 00/14] Add interface to expose vpa dtl counters via perf Athira Rajeev
2025-08-15  8:33 ` [PATCH 01/14] powerpc/time: Expose boot_tb via accessor Athira Rajeev
2025-08-15  8:33 ` [PATCH 02/14] powerpc/vpa_dtl: Add interface to expose vpa dtl counters via perf Athira Rajeev
2025-08-20 11:53   ` Shrikanth Hegde
2025-09-04  7:25     ` Athira Rajeev
2025-08-15  8:33 ` [PATCH 03/14] docs: ABI: sysfs-bus-event_source-devices-vpa-dtl: Document sysfs event format entries for vpa_dtl pmu Athira Rajeev
2025-08-15  8:33 ` [PATCH 04/14] powerpc/perf/vpa-dtl: Add support to setup and free aux buffer for capturing DTL data Athira Rajeev
2025-08-15  8:33 ` [PATCH 05/14] powerpc/perf/vpa-dtl: Add support to capture DTL data in aux buffer Athira Rajeev
2025-08-15  8:33 ` [PATCH 06/14] powerpc/perf/vpa-dtl: Handle the writing of perf record when aux wake up is needed Athira Rajeev
2025-08-15  8:34 ` [PATCH 07/14] tools/perf: Add basic CONFIG_AUXTRACE support for VPA pmu on powerpc Athira Rajeev
2025-08-27 17:27   ` Adrian Hunter
2025-08-29  8:29     ` Athira Rajeev
2025-08-15  8:34 ` [PATCH 08/14] tools/perf: process auxtrace events and display in perf report -D Athira Rajeev
2025-08-27 17:28   ` Adrian Hunter
2025-08-29  8:31     ` Athira Rajeev
2025-08-15  8:34 ` [PATCH 09/14] tools/perf: Add event name as vpa-dtl of PERF_TYPE_SYNTH type to present DTL samples Athira Rajeev
2025-08-15  8:34 ` [PATCH 10/14] tools/perf: Allocate and setup aux buffer queue to help co-relate with other events across CPU's Athira Rajeev
2025-08-27 17:29   ` Adrian Hunter
2025-08-15  8:34 ` [PATCH 11/14] tools/perf: Process the DTL entries in queue and deliver samples Athira Rajeev
2025-08-27 17:29   ` Adrian Hunter
2025-08-29  8:33     ` Athira Rajeev
2025-08-15  8:34 ` [PATCH 12/14] tools/perf: Add support for printing synth event details via default callback Athira Rajeev
2025-08-27 17:29   ` Adrian Hunter
2025-08-29  8:35     ` Athira Rajeev
2025-08-15  8:34 ` [PATCH 13/14] tools/perf: Enable perf script to present the DTL entries Athira Rajeev
2025-08-27 17:30   ` Adrian Hunter
2025-08-15  8:34 ` [PATCH 14/14] powerpc/perf/vpa-dtl: Add documentation for VPA dispatch trace log PMU Athira Rajeev
2025-08-15 12:17 ` [PATCH 00/14] Add interface to expose vpa dtl counters via perf Venkat Rao Bagalkote
2025-08-15 12:51   ` Athira Rajeev [this message]
2025-08-18 14:41 ` tejas05

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CD5A1BA3-B2A6-414A-A6EB-77C50A95C3F0@linux.ibm.com \
    --to=atrajeev@linux.ibm.com \
    --cc=Aditya.Bodkhe1@ibm.com \
    --cc=aboorvad@linux.ibm.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=hbathini@linux.vnet.ibm.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kjain@linux.ibm.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maddy@linux.ibm.com \
    --cc=namhyung@kernel.org \
    --cc=sshegde@linux.ibm.com \
    --cc=venkat88@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).