linux-perf-users.vger.kernel.org archive mirror
From: Dmitry Vyukov <dvyukov@google.com>
To: Andi Kleen <ak@linux.intel.com>
Cc: namhyung@kernel.org, irogers@google.com,
	linux-perf-users@vger.kernel.org,  linux-kernel@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: [PATCH v5 0/8] perf report: Add latency and parallelism profiling
Date: Thu, 6 Feb 2025 19:41:00 +0100	[thread overview]
Message-ID: <CACT4Y+aVWH6d7kuxzvcJSarXMQB-PPDTcE5OvN2tjkOLnzmMLg@mail.gmail.com> (raw)
In-Reply-To: <87ldujkjsi.fsf@linux.intel.com>

On Thu, 6 Feb 2025 at 19:30, Andi Kleen <ak@linux.intel.com> wrote:
>
> Dmitry Vyukov <dvyukov@google.com> writes:
>
> > There are two notions of time: wall-clock time and CPU time.
> > For a single-threaded program, or a program running on a single-core
> > machine, these notions are the same. However, for a multi-threaded/
> > multi-process program running on a multi-core machine, these notions are
> > significantly different. For each second of wall-clock time, there are
> > number-of-cores seconds of CPU time.
>
> I'm curious how this interacts with the time / --time-quantum sort key?
>
> I assume it just works, but might be worth checking.

I will check later. But if you have some concrete commands to try, that
would help. I have never used --time-quantum before.
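
Looking at the perf-report man page, I assume the starting point would be
something along these lines (untested; the quantum value is arbitrary):

  perf record -a -- <workload>
  perf report --sort time,overhead,symbol --time-quantum 100ms

and then presumably the same with --latency and/or the parallelism sort
key added, to see how the two interact.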


> It was intended to address some of these issues too.
>
> > Optimizing CPU overhead is useful to improve 'throughput', while
> > optimizing wall-clock overhead is useful to improve 'latency'.
> > These profiles are complementary and are not interchangeable.
> > Examples of where latency profile is needed:
> >  - optimizing build latency
> >  - optimizing server request latency
> >  - optimizing ML training/inference latency
> >  - optimizing running time of any command line program
> >
> > A CPU profile is useless for these use cases at best (if a user understands
> > the difference), or misleading at worst (if a user tries to use the wrong
> > profile for the job).
>
> I would agree in the general case, but not if the time sort key
> is chosen with a suitable quantum. You can then see how the parallelism
> changes over time, which is often a good enough proxy.

Never used it. I will look at what capabilities it provides.

> > We still default to the CPU profile, so it's up to users to learn
> > about the second profiling mode and use it when appropriate.
>
> You should add it to tips.txt, then.

It is done in the docs patch.
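(The new entries are one-line hints in tools/perf/Documentation/tips.txt,
roughly of the form "Use 'perf report --latency' to profile wall-clock
(latency) overhead instead of CPU overhead"; the exact wording is in the
patch.)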

> >  .../callchain-overhead-calculation.txt        |   5 +-
> >  .../cpu-and-latency-overheads.txt             |  85 ++++++++++++++
> >  tools/perf/Documentation/perf-record.txt      |   4 +
> >  tools/perf/Documentation/perf-report.txt      |  54 ++++++---
> >  tools/perf/Documentation/tips.txt             |   3 +
> >  tools/perf/builtin-record.c                   |  20 ++++
> >  tools/perf/builtin-report.c                   |  39 +++++++
> >  tools/perf/ui/browsers/hists.c                |  27 +++--
> >  tools/perf/ui/hist.c                          | 104 ++++++++++++------
> >  tools/perf/util/addr_location.c               |   1 +
> >  tools/perf/util/addr_location.h               |   7 +-
> >  tools/perf/util/event.c                       |  11 ++
> >  tools/perf/util/events_stats.h                |   2 +
> >  tools/perf/util/hist.c                        |  90 ++++++++++++---
> >  tools/perf/util/hist.h                        |  32 +++++-
> >  tools/perf/util/machine.c                     |   7 ++
> >  tools/perf/util/machine.h                     |   6 +
> >  tools/perf/util/sample.h                      |   2 +-
> >  tools/perf/util/session.c                     |  12 ++
> >  tools/perf/util/session.h                     |   1 +
> >  tools/perf/util/sort.c                        |  69 +++++++++++-
> >  tools/perf/util/sort.h                        |   3 +-
> >  tools/perf/util/symbol.c                      |  34 ++++++
> >  tools/perf/util/symbol_conf.h                 |   8 +-
>
> We traditionally didn't do this, but in general test coverage
> of perf report is too low, so I would recommend adding a simple
> test case to the perf test scripts.

Which parts of this are testable within the current testing framework?
Also, how do I run the tests? I could not figure it out.
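
Is it roughly the following, or is there more to it? (just guessing from
the tree layout):

  make -C tools/perf
  cd tools/perf
  ./perf test               # run the whole built-in suite
  ./perf test -v report     # run only tests matching 'report', verbosely

And do new shell tests go under tools/perf/tests/shell/?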


Thread overview: 32+ messages
2025-02-05 16:27 [PATCH v5 0/8] perf report: Add latency and parallelism profiling Dmitry Vyukov
2025-02-05 16:27 ` [PATCH v5 1/8] perf report: Add machine parallelism Dmitry Vyukov
2025-02-05 16:27 ` [PATCH v5 2/8] perf report: Add parallelism sort key Dmitry Vyukov
2025-02-05 16:27 ` [PATCH v5 3/8] perf report: Switch filtered from u8 to u16 Dmitry Vyukov
2025-02-05 16:27 ` [PATCH v5 4/8] perf report: Add parallelism filter Dmitry Vyukov
2025-02-05 16:27 ` [PATCH v5 5/8] perf report: Add latency output field Dmitry Vyukov
2025-02-05 16:27 ` [PATCH v5 6/8] perf report: Add --latency flag Dmitry Vyukov
2025-02-07  3:44   ` Namhyung Kim
2025-02-07  7:23     ` Dmitry Vyukov
2025-02-11  1:02       ` Namhyung Kim
2025-02-11  8:30         ` Dmitry Vyukov
2025-02-11  8:42         ` Dmitry Vyukov
2025-02-11 17:42           ` Namhyung Kim
2025-02-11 20:23             ` Dmitry Vyukov
2025-02-12 19:47               ` Namhyung Kim
2025-02-13  9:09                 ` Dmitry Vyukov
2025-02-07  3:53   ` Namhyung Kim
2025-02-07 11:42     ` Dmitry Vyukov
2025-02-05 16:27 ` [PATCH v5 7/8] perf report: Add latency and parallelism profiling documentation Dmitry Vyukov
2025-02-05 16:27 ` [PATCH v5 8/8] perf hist: Shrink struct hist_entry size Dmitry Vyukov
2025-02-06 18:30 ` [PATCH v5 0/8] perf report: Add latency and parallelism profiling Andi Kleen
2025-02-06 18:41   ` Dmitry Vyukov [this message]
2025-02-06 18:51     ` Ian Rogers
2025-02-07  3:57       ` Namhyung Kim
2025-02-07 11:44         ` Dmitry Vyukov
2025-02-06 18:57     ` Andi Kleen
2025-02-06 19:07       ` Andi Kleen
2025-02-07  8:16   ` Dmitry Vyukov
2025-02-07 18:30     ` Andi Kleen
2025-02-10  7:17       ` Dmitry Vyukov
2025-02-10 17:11         ` Andi Kleen
2025-02-13  9:09         ` Dmitry Vyukov
