linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Guilherme Amadio <amadio@cern.ch>,
	Artem Savkov <asavkov@redhat.com>,
	Ian Rogers <irogers@google.com>,
	linux-perf-users@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Namhyung Kim <namhyung@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] perf record: add a shortcut for metrics
Date: Tue, 28 May 2024 15:20:05 -0300	[thread overview]
Message-ID: <ZlYgVYI6ABqmwb-d@x1> (raw)
In-Reply-To: <97818e63-051f-4fcf-8c20-75730c08e98a@linux.intel.com>

On Tue, May 28, 2024 at 11:55:00AM -0400, Liang, Kan wrote:
> On 2024-05-28 7:57 a.m., Artem Savkov wrote:
> > On Mon, May 27, 2024 at 10:01:37PM -0700, Ian Rogers wrote:
> >> On Mon, May 27, 2024 at 10:46 AM Arnaldo Carvalho de Melo
> >> <acme@kernel.org> wrote:
> >>>
> >>> On Mon, May 27, 2024 at 02:28:32PM -0300, Arnaldo Carvalho de Melo wrote:
> >>>> On Mon, May 27, 2024 at 02:04:54PM -0300, Arnaldo Carvalho de Melo wrote:
> >>>>> On Mon, May 27, 2024 at 02:02:33PM -0300, Arnaldo Carvalho de Melo wrote:
> >>>>>> On Mon, May 27, 2024 at 12:15:19PM +0200, Artem Savkov wrote:
> >>>>>>> Add -M/--metrics option to perf-record providing a shortcut to record
> >>>>>>> metrics and metricgroups. This option mirrors the one in perf-stat.
> >>>>
> >>>>>>> Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
> >>>>>>> Signed-off-by: Artem Savkov <asavkov@redhat.com>
> >>>
> >>>> How did you test this?
> >>>
> >>>> The idea, from my notes, was to be able to have extra columns in 'perf
> >>>> report' with things like IPC and other metrics, probably not all metrics
> >>>> will apply. We need to find a way to find out which ones are OK for that
> >>>> purpose, for instance:
> >>>
> >>> One that may make sense:
> >>>
> >>> root@number:~# perf record -M tma_fb_full
> >>> ^C[ perf record: Woken up 1 times to write data ]
> >>> [ perf record: Captured and wrote 3.846 MB perf.data (21745 samples) ]
> >>>
> >>> root@number:~# perf evlist
> >>> cpu_core/CPU_CLK_UNHALTED.THREAD/
> >>> cpu_core/L1D_PEND_MISS.FB_FULL/
> >>> dummy:u
> >>> root@number:~#
> >>>
> >>> But then we need to read both to do the math, maybe something like:
> >>>
> >>> root@number:~# perf record -e '{cpu_core/CPU_CLK_UNHALTED.THREAD/,cpu_core/L1D_PEND_MISS.FB_FULL/}:S'
> >>> ^C[ perf record: Woken up 40 times to write data ]
> >>> [ perf record: Captured and wrote 14.640 MB perf.data (219990 samples) ]
> >>>
> >>> root@number:~# perf script | head
> >>>     cc1plus 1339704 [000] 36028.995981:  2011389 cpu_core/CPU_CLK_UNHALTED.THREAD/:           1097303 [unknown] (/usr/libexec/gcc/x86_64-pc-linux-gnu/13/cc1plus)
> >>>     cc1plus 1339704 [000] 36028.995981:    26231   cpu_core/L1D_PEND_MISS.FB_FULL/:           1097303 [unknown] (/usr/libexec/gcc/x86_64-pc-linux-gnu/13/cc1plus)
> >>>     cc1plus 1340011 [001] 36028.996008:  2004568 cpu_core/CPU_CLK_UNHALTED.THREAD/:            8c23b4 [unknown] (/usr/libexec/gcc/x86_64-pc-linux-gnu/13/cc1plus)
> >>>     cc1plus 1340011 [001] 36028.996008:    20113   cpu_core/L1D_PEND_MISS.FB_FULL/:            8c23b4 [unknown] (/usr/libexec/gcc/x86_64-pc-linux-gnu/13/cc1plus)
> >>>       clang 1340462 [002] 36028.996043:  2007356 cpu_core/CPU_CLK_UNHALTED.THREAD/:  ffffffffb43b045d release_pages+0x3dd ([kernel.kallsyms])
> >>>       clang 1340462 [002] 36028.996043:    23481   cpu_core/L1D_PEND_MISS.FB_FULL/:  ffffffffb43b045d release_pages+0x3dd ([kernel.kallsyms])
> >>>     cc1plus 1339622 [003] 36028.996066:  2004148 cpu_core/CPU_CLK_UNHALTED.THREAD/:            760874 [unknown] (/usr/libexec/gcc/x86_64-pc-linux-gnu/13/cc1plus)
> >>>     cc1plus 1339622 [003] 36028.996066:    31935   cpu_core/L1D_PEND_MISS.FB_FULL/:            760874 [unknown] (/usr/libexec/gcc/x86_64-pc-linux-gnu/13/cc1plus)
> >>>          as 1340513 [004] 36028.996097:  2005052 cpu_core/CPU_CLK_UNHALTED.THREAD/:  ffffffffb4491d65 __count_memcg_events+0x55 ([kernel.kallsyms])
> >>>          as 1340513 [004] 36028.996097:    45084   cpu_core/L1D_PEND_MISS.FB_FULL/:  ffffffffb4491d65 __count_memcg_events+0x55 ([kernel.kallsyms])
> >>> root@number:~#
> >>>
> >>> root@number:~# perf report --stdio -F +period | head -20
> >>> # To display the perf.data header info, please use --header/--header-only options.
> >>> #
> >>> #
> >>> # Total Lost Samples: 0
> >>> #
> >>> # Samples: 219K of events 'anon group { cpu_core/CPU_CLK_UNHALTED.THREAD/, cpu_core/L1D_PEND_MISS.FB_FULL/ }'
> >>> # Event count (approx.): 216528524863
> >>> #
> >>> #         Overhead                Period  Command    Shared Object      Symbol
> >>> # ................  ....................  .........  .................  ....................................
> >>> #
> >>>      4.01%   1.09%  8538169256  39826572  podman     [kernel.kallsyms]  [k] native_queued_spin_lock_slowpath
> >>>      1.35%   1.17%  2863376078  42829266  cc1plus    cc1plus            [.] 0x00000000003f6bcc
> >>>      0.94%   0.78%  1990639149  28408591  cc1plus    cc1plus            [.] 0x00000000003f6be4
> >>>      0.65%   0.17%  1375916283   6109515  podman     [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
> >>>      0.61%   0.99%  1304418325  36198834  cc1plus    [kernel.kallsyms]  [k] get_mem_cgroup_from_mm
> >>>      0.52%   0.42%  1103054030  15427418  cc1plus    cc1plus            [.] 0x0000000000ca6c69
> >>>      0.51%   0.17%  1094200572   6299289  podman     [kernel.kallsyms]  [k] psi_group_change
> >>>      0.42%   0.41%   893633315  14778675  cc1plus    cc1plus            [.] 0x00000000018afafe
> >>>      0.42%   1.29%   887664793  47046952  cc1plus    [kernel.kallsyms]  [k] asm_exc_page_fault
> >>> root@number:~#
> >>>
> >>> That 'tma_fb_full' metric then would be another column, calculated from
> >>> the sampled components of its metric equation:
> >>>
> >>> root@number:~# perf list tma_fb_full | head
> >>>
> >>> Metric Groups:
> >>>
> >>> MemoryBW: [Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet]
> >>>   tma_fb_full
> >>>        [This metric does a *rough estimation* of how often L1D Fill Buffer
> >>>         unavailability limited additional L1D miss memory access requests to
> >>>         proceed]
> >>>
> >>> TopdownL4: [Metrics for top-down breakdown at level 4]
> >>> root@number:~#
> >>>
> >>> This is roughly what we brainstormed, to support metrics in other tools
> >>> than 'perf stat' but we need to check the possibilities and limitations
> >>> of such an idea, hopefully this discussion will help with that,
> >>
> >> Putting metrics next to code in perf report/annotate sounds good to
> >> me, opening all events from a metric as if we want to sample on them
> >> less so.
> > 
> > The idea was to record whatever data was asked on record step and
> > provide the list of all metrics that can be calculated out of that data
> > in perf report, e.g. you could record tma_info_thread_ipc but report
> > will suggest both it and tma_info_thread_cpi.
> >
> 
> Do you mean that sample all the events in a metrics, and report both
> samples and its metrics calculation result in the report?
> That doesn't work for all the metrics.

IIRC Guilherme was mentioning having extra metrics on report was
something he missed that is available on tools such as VTune, Guilherme?

- Arnaldo
 
> - For the topdown related metrics, especially on ICL and later
> platforms, the perf metrics feature is used by default. It doesn't
> support sampling.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/topdown.txt?#n293
> - Some PMUs which doesn't support sampling as well, e.g., uncore, Power,
> MSR.
> - There are some SW events, e.g.,duration_time, you may don't want to do
> sampling
> 
> You probable need to introduce a flag to ignore those metrics in perf
> record.
> 
> >> We don't have metrics working with `perf stat record`, I
> >> think Kan may have volunteered for that, but it seems like something
> >> more urgent than expanding `perf record`. Presumably the way the
> >> metric would be recorded for that could also benefit this effort.
> >>
> >> If you look at the tma metrics a number of them have a "Sample with".
> >> For example:
> >> ```
> >> $ perf list -v
> >> ...
> >>   tma_branch_mispredicts
> >>        [This metric represents fraction of slots the CPU has wasted
> >> due to Branch Misprediction.
> >>         These slots are either wasted by uops fetched from an
> >> incorrectly speculated program path;
> >>         or stalls when the out-of-order part of the machine needs to
> >> recover its state from a
> >>         speculative path. Sample with: BR_MISP_RETIRED.ALL_BRANCHES.
> >> Related metrics:
> >>         tma_info_bad_spec_branch_misprediction_cost,tma_info_bottleneck_mispredictions,
> >>         tma_mispredicts_resteers]
> >> ...
> >> ```
> >> It could be logical for `perf record -M tma_branch_mispredicts ...` to
> >> be translated to `perf record -e BR_MISP_RETIRED.ALL_BRANCHES ...`
> >> rather than to do any form of counting.
> > 
> > Thanks for the pointer, I'll see how this could be done.
> 
> It sounds more reasonable to me that we can sample some typical events,
> and read the other members in the metrics. So we can put metrics next to
> the code in perf report/annotate as Ian mentioned. It could also address
> limits of some metrics, especially for the topdown related metrics.
> (But I'm not sure if the "Sample with" can give you the right hints. I
> will ask around internally.)
> 
> But there is also some limits for the sampling read. Everything has to
> be in a group. That could be a problem for some big metrics.
> Thanks,
> Kan

  reply	other threads:[~2024-05-28 18:20 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-27 10:15 [PATCH] perf record: add a shortcut for metrics Artem Savkov
2024-05-27 17:02 ` Arnaldo Carvalho de Melo
2024-05-27 17:04   ` Arnaldo Carvalho de Melo
2024-05-27 17:28     ` Arnaldo Carvalho de Melo
2024-05-27 17:46       ` Arnaldo Carvalho de Melo
2024-05-28  5:01         ` Ian Rogers
2024-05-28 11:57           ` Artem Savkov
2024-05-28 15:55             ` Liang, Kan
2024-05-28 18:20               ` Arnaldo Carvalho de Melo [this message]
2024-05-29 15:15                 ` Guilherme Amadio
2024-05-29 19:41                   ` Liang, Kan
2024-05-28 11:45       ` Artem Savkov
2024-05-28 14:47         ` Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZlYgVYI6ABqmwb-d@x1 \
    --to=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=amadio@cern.ch \
    --cc=asavkov@redhat.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).