linux-perf-users.vger.kernel.org archive mirror
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Joe Mario <jmario@redhat.com>, Ian Rogers <irogers@google.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-perf-users@vger.kernel.org,
	Ravi Bangoria <ravi.bangoria@amd.com>, Leo Yan <leo.yan@arm.com>
Subject: Re: [RFC/PATCHSET 00/11] perf mem: Add new output fields for data source (v1)
Date: Fri, 2 May 2025 13:00:50 -0300
Message-ID: <aBTsMkY1nAVpIUQ4@x1>
In-Reply-To: <20250430205548.789750-1-namhyung@kernel.org>

On Wed, Apr 30, 2025 at 01:55:37PM -0700, Namhyung Kim wrote:
> Hello,
 
> The perf mem uses PERF_SAMPLE_DATA_SRC which has a lot of information
> for memory access.  It has various sort keys to group related samples
> together but it's still cumbersome to see the result.  While perf c2c
> command provides a way to investigate the data in a specific way, I'd
> like to add more generic ways using new output fields.
 
> For example, the following is the 'cache' output field which breaks
> down the sample weights into different level of caches.

Super cool!
 
>   $ perf mem record -a sleep 1
>   
>   $ perf mem report -F cache,dso,sym --stdio
>   ...
>   #
>   # -------------- Cache --------------
>   #      L1     L2     L3 L1-buf  Other  Shared Object                                  Symbol
>   # ...................................  .....................................  .........................................
>   #
>        0.0%   0.0%   0.0%   0.0% 100.0%  [kernel.kallsyms]                      [k] ioread8
>      100.0%   0.0%   0.0%   0.0%   0.0%  [kernel.kallsyms]                      [k] _raw_spin_lock_irq
>        0.0%   0.0%   0.0%   0.0% 100.0%  [xhci_hcd]                             [k] xhci_update_erst_dequeue
>        0.0%   0.0%   0.0%  95.8%   4.2%  [kernel.kallsyms]                      [k] smaps_account
>        0.6%   1.8%  22.7%  45.5%  29.5%  [kernel.kallsyms]                      [k] sched_balance_update_blocked_averages
>       29.4%   0.0%   1.6%  58.8%  10.2%  [kernel.kallsyms]                      [k] __update_load_avg_cfs_rq
>        0.0%   8.5%   4.3%   0.0%  87.2%  [kernel.kallsyms]                      [k] copy_mc_enhanced_fast_string
>       63.9%   0.0%   8.0%  23.8%   4.3%  [kernel.kallsyms]                      [k] psi_group_change
>        3.9%   0.0%   9.3%  35.7%  51.1%  [kernel.kallsyms]                      [k] timerqueue_add
>       35.9%  10.9%   0.0%  39.0%  14.2%  [kernel.kallsyms]                      [k] memcpy
>       94.1%   0.0%   0.0%   5.9%   0.0%  [kernel.kallsyms]                      [k] unmap_page_range
>       25.7%   0.0%   4.9%  51.0%  18.4%  [kernel.kallsyms]                      [k] __update_load_avg_se
>        0.0%  24.9%  19.4%   9.6%  46.1%  [kernel.kallsyms]                      [k] _copy_to_iter
>       12.9%   0.0%   0.0%  87.1%   0.0%  [kernel.kallsyms]                      [k] next_uptodate_folio
>       36.8%   0.0%   9.5%  16.6%  37.1%  [kernel.kallsyms]                      [k] update_curr
>      100.0%   0.0%   0.0%   0.0%   0.0%  bpf_prog_b9611ccbbb3d1833_dfs_iter     [k] bpf_prog_b9611ccbbb3d1833_dfs_iter
>       45.4%   1.8%  20.4%  23.6%   8.8%  [kernel.kallsyms]                      [k] audit_filter_rules.isra.0
>       92.8%   0.0%   0.0%   7.2%   0.0%  [kernel.kallsyms]                      [k] filemap_map_pages
>       10.6%   0.0%   0.0%  89.4%   0.0%  [kernel.kallsyms]                      [k] smaps_page_accumulate
>       38.3%   0.0%  29.6%  27.1%   5.0%  [kernel.kallsyms]                      [k] __schedule
 
> Please see the description of each commit for other fields.
 
> New mem_stat field was added to the hist_entry to save this
> information.  It's a generic data structure (array) to handle
> different type of information like cache-level, memory location,
> snoop-result, etc.
 
> The first patch is a fix for the hierarchy mode and it was sent
> separately.  I just add it here not to break the hierarchy mode.  The
> second patch is to enable SAMPLE_DATA_SRC without SAMPLE_ADDR and
> perf_event_attr.mmap_data which generate a lot more data.

I merged it and added a test for the hierarchy mode as mentioned in my
reply to that patch.
 
> The name of some new fields are the same as the corresponding sort
> keys (mem, op, snoop) so I had to change the order whether it's
> applied as an output field or a sort key.  Maybe it's better to name
> them differently but I couldn't come up with better ideas.

Looks ok at first sight.
 
> That means, you need to use -F/--fields option to specify those fields
> and the sort keys you want.  Maybe we can change the default output
> and sort keys for perf mem report with this.

Maybe we can come up with aliases to help using these new features
without having to create a long command line, maybe:

perf cache

Or some other more suitable name.

That would just be translated into the long command line for 'perf
report', kinda like 'perf kvm', but maybe we can do it like with 'perf
archive', i.e. just a shell wrapper?
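
A rough sketch of what such a wrapper could look like, assuming the
'cache' output field from this series is available in the installed perf.
The function name perf_cache and the PERF_CACHE_DRY_RUN variable are
made up for illustration; the field list is the one from the example
quoted above:

```shell
#!/bin/sh
# Hypothetical "perf cache" wrapper: expands to the long 'perf mem report'
# command line. Needs a perf built with the 'cache' output field from
# this patchset; the name and the dry-run switch are assumptions.

perf_cache() {
	if [ -n "${PERF_CACHE_DRY_RUN:-}" ]; then
		# Dry run: only print the command line that would be executed.
		printf '%s\n' "perf mem report -F cache,dso,sym $*"
	else
		# Pass any extra options (e.g. --stdio) through to perf.
		perf mem report -F cache,dso,sym "$@"
	fi
}
```

Then 'perf_cache --stdio' would stand in for
'perf mem report -F cache,dso,sym --stdio', after a 'perf mem record'
session like the one above.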
 
> The code is available at 'perf/mem-field-v1' branch in

I'll test it, and I'm CCing Joe Mario, who I think will be very much
interested in trying this!

- Arnaldo
 
>  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
 
> Thanks,
> Namhyung
 
> Namhyung Kim (11):
>   perf hist: Remove output field from sort-list properly
>   perf record: Add --sample-mem-info option
>   perf hist: Support multi-line header
>   perf hist: Add struct he_mem_stat
>   perf hist: Basic support for mem_stat accounting
>   perf hist: Implement output fields for mem stats
>   perf mem: Add 'op' output field
>   perf hist: Hide unused mem stat columns
>   perf mem: Add 'cache' and 'memory' output fields
>   perf mem: Add 'snoop' output field
>   perf mem: Add 'dtlb' output field
> 
>  tools/perf/Documentation/perf-record.txt |   7 +-
>  tools/perf/builtin-record.c              |   6 +
>  tools/perf/ui/browsers/hists.c           |  50 ++++-
>  tools/perf/ui/hist.c                     | 272 ++++++++++++++++++++++-
>  tools/perf/ui/stdio/hist.c               |  57 +++--
>  tools/perf/util/evsel.c                  |   2 +-
>  tools/perf/util/hist.c                   |  78 +++++++
>  tools/perf/util/hist.h                   |  22 ++
>  tools/perf/util/mem-events.c             | 183 ++++++++++++++-
>  tools/perf/util/mem-events.h             |  57 +++++
>  tools/perf/util/record.h                 |   1 +
>  tools/perf/util/sort.c                   |  42 +++-
>  12 files changed, 718 insertions(+), 59 deletions(-)
> 
> -- 
> 2.49.0.906.g1f30a19c02-goog

