linux-perf-users.vger.kernel.org archive mirror
* [RFC/PATCHSET 00/11] perf mem: Add new output fields for data source (v1)
@ 2025-04-30 20:55 Namhyung Kim
From: Namhyung Kim @ 2025-04-30 20:55 UTC
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Ravi Bangoria, Leo Yan

Hello,

perf mem uses PERF_SAMPLE_DATA_SRC, which carries a lot of information
about each memory access.  There are various sort keys to group related
samples together, but it's still cumbersome to read the result.  While
the perf c2c command provides a way to investigate the data in a specific
way, I'd like to add more generic ways using new output fields.

For example, the following is the 'cache' output field, which breaks
down the sample weights into different levels of cache.

  $ perf mem record -a sleep 1
  
  $ perf mem report -F cache,dso,sym --stdio
  ...
  #
  # -------------- Cache --------------
  #      L1     L2     L3 L1-buf  Other  Shared Object                                  Symbol
  # ...................................  .....................................  .........................................
  #
       0.0%   0.0%   0.0%   0.0% 100.0%  [kernel.kallsyms]                      [k] ioread8
     100.0%   0.0%   0.0%   0.0%   0.0%  [kernel.kallsyms]                      [k] _raw_spin_lock_irq
       0.0%   0.0%   0.0%   0.0% 100.0%  [xhci_hcd]                             [k] xhci_update_erst_dequeue
       0.0%   0.0%   0.0%  95.8%   4.2%  [kernel.kallsyms]                      [k] smaps_account
       0.6%   1.8%  22.7%  45.5%  29.5%  [kernel.kallsyms]                      [k] sched_balance_update_blocked_averages
      29.4%   0.0%   1.6%  58.8%  10.2%  [kernel.kallsyms]                      [k] __update_load_avg_cfs_rq
       0.0%   8.5%   4.3%   0.0%  87.2%  [kernel.kallsyms]                      [k] copy_mc_enhanced_fast_string
      63.9%   0.0%   8.0%  23.8%   4.3%  [kernel.kallsyms]                      [k] psi_group_change
       3.9%   0.0%   9.3%  35.7%  51.1%  [kernel.kallsyms]                      [k] timerqueue_add
      35.9%  10.9%   0.0%  39.0%  14.2%  [kernel.kallsyms]                      [k] memcpy
      94.1%   0.0%   0.0%   5.9%   0.0%  [kernel.kallsyms]                      [k] unmap_page_range
      25.7%   0.0%   4.9%  51.0%  18.4%  [kernel.kallsyms]                      [k] __update_load_avg_se
       0.0%  24.9%  19.4%   9.6%  46.1%  [kernel.kallsyms]                      [k] _copy_to_iter
      12.9%   0.0%   0.0%  87.1%   0.0%  [kernel.kallsyms]                      [k] next_uptodate_folio
      36.8%   0.0%   9.5%  16.6%  37.1%  [kernel.kallsyms]                      [k] update_curr
     100.0%   0.0%   0.0%   0.0%   0.0%  bpf_prog_b9611ccbbb3d1833_dfs_iter     [k] bpf_prog_b9611ccbbb3d1833_dfs_iter
      45.4%   1.8%  20.4%  23.6%   8.8%  [kernel.kallsyms]                      [k] audit_filter_rules.isra.0
      92.8%   0.0%   0.0%   7.2%   0.0%  [kernel.kallsyms]                      [k] filemap_map_pages
      10.6%   0.0%   0.0%  89.4%   0.0%  [kernel.kallsyms]                      [k] smaps_page_accumulate
      38.3%   0.0%  29.6%  27.1%   5.0%  [kernel.kallsyms]                      [k] __schedule
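
As a rough illustration of how a sample could be assigned to one of the
columns above, here is a minimal sketch.  It is not the code from this
series: enum cache_col, cache_col_of() and cache_col_name() are
hypothetical names, and treating the line fill buffer (LFB) level as
the "L1-buf" column is an assumption.  Only union perf_mem_data_src and
the PERF_MEM_LVLNUM_* constants come from the UAPI header.

```c
#include <assert.h>
#include <linux/perf_event.h>

/* Hypothetical column indices; the order follows the header shown above. */
enum cache_col { COL_L1, COL_L2, COL_L3, COL_L1_BUF, COL_OTHER, COL_NR };

/*
 * Map the cache-level number of a sample's data source to a column.
 * Assumption: the line fill buffer (LFB) is what "L1-buf" reports.
 * Every level without its own column falls into "Other".
 */
static enum cache_col cache_col_of(union perf_mem_data_src src)
{
	switch (src.mem_lvl_num) {
	case PERF_MEM_LVLNUM_L1:
		return COL_L1;
	case PERF_MEM_LVLNUM_L2:
		return COL_L2;
	case PERF_MEM_LVLNUM_L3:
		return COL_L3;
	case PERF_MEM_LVLNUM_LFB:
		return COL_L1_BUF;
	default:
		return COL_OTHER;
	}
}

/* Column title as printed in the second header line. */
static const char *cache_col_name(enum cache_col col)
{
	static const char *const names[COL_NR] = {
		"L1", "L2", "L3", "L1-buf", "Other",
	};

	assert(col < COL_NR);
	return names[col];
}
```

Each row in the report adds up to roughly 100%, so the per-column
value is the share of that entry's own weight, not of the whole
profile.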

Please see the description of each commit for other fields.

A new mem_stat field was added to the hist_entry to save this
information.  It's a generic data structure (array) that handles
different types of information like cache-level, memory location,
snoop-result, etc.
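
To make the idea of a generic per-entry array concrete, here is a
minimal sketch.  It does not reproduce struct he_mem_stat from the
patches: struct mem_stat_sketch, MEM_STAT_LEN, mem_stat_add() and
mem_stat_percent() are hypothetical, and the fixed length of 8 is an
arbitrary assumption.  The same array can back any field because the
caller decides what each index means (a cache level, a memory
location, a snoop result, and so on).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical upper bound on the number of columns of one field. */
#define MEM_STAT_LEN 8

/* Hypothetical per-entry statistics: one weight counter per column. */
struct mem_stat_sketch {
	uint64_t entries[MEM_STAT_LEN];
};

/* Account one sample: add its weight to the column it belongs to. */
static void mem_stat_add(struct mem_stat_sketch *st, unsigned int idx,
			 uint64_t weight)
{
	assert(idx < MEM_STAT_LEN);
	st->entries[idx] += weight;
}

/* Share of this entry's total weight that landed in column idx. */
static double mem_stat_percent(const struct mem_stat_sketch *st,
			       unsigned int idx)
{
	uint64_t total = 0;

	assert(idx < MEM_STAT_LEN);
	for (unsigned int i = 0; i < MEM_STAT_LEN; i++)
		total += st->entries[i];

	/* An entry without samples shows 0.0% in every column. */
	return total ? 100.0 * (double)st->entries[idx] / (double)total : 0.0;
}
```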

The first patch is a fix for the hierarchy mode.  It was also sent
separately; I include it here so that the series does not break the
hierarchy mode.  The second patch makes it possible to enable
SAMPLE_DATA_SRC without SAMPLE_ADDR and perf_event_attr.mmap_data, both
of which generate a lot more data.
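
At the perf_event_attr level, that combination could look roughly like
the sketch below.  This is not taken from the patch:
init_data_src_only_attr() is a hypothetical helper, the choice of
PERF_SAMPLE_IP, PERF_SAMPLE_TID and PERF_SAMPLE_WEIGHT alongside the
data source is an assumption, and the event type and config are left
out because memory events are PMU specific.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper: request the data source of each sample while
 * leaving out the data address and data mmap events.
 */
static void init_data_src_only_attr(struct perf_event_attr *attr)
{
	assert(attr != NULL);

	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);

	/* Assumed companions of the data source: IP, TID and weight. */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
			    PERF_SAMPLE_WEIGHT | PERF_SAMPLE_DATA_SRC;

	/* PERF_SAMPLE_ADDR is deliberately not set. */
	attr->mmap_data = 0;	/* no mmap events for data mappings */
}
```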

The names of some new fields are the same as the corresponding sort
keys (mem, op, snoop), so I had to change the order in which a name is
matched as an output field or as a sort key.  Maybe it's better to name
them differently, but I couldn't come up with better ideas.

That means you need to use the -F/--fields option to specify both those
fields and the sort keys you want.  Maybe we can change the default
output and sort keys for perf mem report with this.

The code is available at 'perf/mem-field-v1' branch in

 git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (11):
  perf hist: Remove output field from sort-list properly
  perf record: Add --sample-mem-info option
  perf hist: Support multi-line header
  perf hist: Add struct he_mem_stat
  perf hist: Basic support for mem_stat accounting
  perf hist: Implement output fields for mem stats
  perf mem: Add 'op' output field
  perf hist: Hide unused mem stat columns
  perf mem: Add 'cache' and 'memory' output fields
  perf mem: Add 'snoop' output field
  perf mem: Add 'dtlb' output field

 tools/perf/Documentation/perf-record.txt |   7 +-
 tools/perf/builtin-record.c              |   6 +
 tools/perf/ui/browsers/hists.c           |  50 ++++-
 tools/perf/ui/hist.c                     | 272 ++++++++++++++++++++++-
 tools/perf/ui/stdio/hist.c               |  57 +++--
 tools/perf/util/evsel.c                  |   2 +-
 tools/perf/util/hist.c                   |  78 +++++++
 tools/perf/util/hist.h                   |  22 ++
 tools/perf/util/mem-events.c             | 183 ++++++++++++++-
 tools/perf/util/mem-events.h             |  57 +++++
 tools/perf/util/record.h                 |   1 +
 tools/perf/util/sort.c                   |  42 +++-
 12 files changed, 718 insertions(+), 59 deletions(-)

-- 
2.49.0.906.g1f30a19c02-goog



Thread overview: 23+ messages
2025-04-30 20:55 [RFC/PATCHSET 00/11] perf mem: Add new output fields for data source (v1) Namhyung Kim
2025-04-30 20:55 ` [PATCH 01/11] perf hist: Remove output field from sort-list properly Namhyung Kim
2025-04-30 20:55 ` [PATCH 02/11] perf record: Add --sample-mem-info option Namhyung Kim
2025-04-30 20:55 ` [PATCH 03/11] perf hist: Support multi-line header Namhyung Kim
2025-04-30 20:55 ` [PATCH 04/11] perf hist: Add struct he_mem_stat Namhyung Kim
2025-04-30 20:55 ` [PATCH 05/11] perf hist: Basic support for mem_stat accounting Namhyung Kim
2025-04-30 20:55 ` [PATCH 06/11] perf hist: Implement output fields for mem stats Namhyung Kim
2025-04-30 20:55 ` [PATCH 07/11] perf mem: Add 'op' output field Namhyung Kim
2025-04-30 20:55 ` [PATCH 08/11] perf hist: Hide unused mem stat columns Namhyung Kim
2025-05-02 16:18   ` Arnaldo Carvalho de Melo
2025-05-02 16:27   ` Arnaldo Carvalho de Melo
2025-05-02 18:21     ` Namhyung Kim
2025-04-30 20:55 ` [PATCH 09/11] perf mem: Add 'cache' and 'memory' output fields Namhyung Kim
2025-04-30 20:55 ` [PATCH 10/11] perf mem: Add 'snoop' output field Namhyung Kim
2025-04-30 20:55 ` [PATCH 11/11] perf mem: Add 'dtlb' " Namhyung Kim
2025-05-02 16:30   ` Arnaldo Carvalho de Melo
2025-05-02 18:38     ` Namhyung Kim
2025-05-02 19:21       ` Arnaldo Carvalho de Melo
2025-05-02 20:01         ` Namhyung Kim
2025-05-02 16:00 ` [RFC/PATCHSET 00/11] perf mem: Add new output fields for data source (v1) Arnaldo Carvalho de Melo
2025-05-08  4:12 ` Ravi Bangoria
2025-05-09 16:17   ` Namhyung Kim
2025-05-12 10:01     ` Ravi Bangoria
