From: Namhyung Kim <namhyung@kernel.org>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: irogers@google.com, linux-perf-users@vger.kernel.org,
linux-kernel@vger.kernel.org,
Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: [PATCH v3 0/7] perf report: Add latency and parallelism profiling
Date: Tue, 28 Jan 2025 21:05:14 -0800
Message-ID: <Z5m3Co6TEnUiIQkG@google.com>
In-Reply-To: <cover.1737971364.git.dvyukov@google.com>
On Mon, Jan 27, 2025 at 10:58:47AM +0100, Dmitry Vyukov wrote:
> There are two notions of time: wall-clock time and CPU time.
> For a single-threaded program, or a program running on a single-core
> machine, these notions are the same. However, for a multi-threaded/
> multi-process program running on a multi-core machine, these notions are
> significantly different. For each second of wall-clock time we have
> number-of-cores seconds of CPU time.
>
> Currently perf can only profile CPU time. Perf (and, to the best of my
> knowledge, all other existing profilers) cannot profile wall-clock
> time.
>
> Optimizing CPU overhead is useful to improve 'throughput', while
> optimizing wall-clock overhead is useful to improve 'latency'.
> These profiles are complementary and are not interchangeable.
> Examples of where a latency profile is needed:
> - optimizing build latency
> - optimizing server request latency
> - optimizing ML training/inference latency
> - optimizing running time of any command line program
>
> A CPU profile is useless for these use cases at best (if the user
> understands the difference), or misleading at worst (if the user tries
> to use the wrong profile for the job).
>
> This series adds latency and parallelism profiling.
> See the added documentation and flags descriptions for details.
>
> Brief outline of the implementation:
> - add context switch collection during record
> - calculate number of threads running on CPUs (parallelism level)
>   during report
> - divide each sample weight by the parallelism level
>   This effectively models that we were taking 1 sample per unit of
>   wall-clock time.
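
For illustration only (this is not code from the series): a minimal Python
sketch of the outline above. The event tuples, the function names and the
symbols "link" and "compile" are all made up for the example; the real
implementation works on perf's context-switch and sample records in C.

```python
from collections import defaultdict


def latency_profile(events):
    """Accumulate CPU and latency weights per symbol.

    `events` is a time-ordered list of hypothetical tuples:
      ("switch_in", tid), ("switch_out", tid), ("sample", tid, symbol)
    """
    running = set()               # threads currently on a CPU
    cpu = defaultdict(float)      # one unit per sample (CPU time)
    latency = defaultdict(float)  # sample weight divided by parallelism

    for ev in events:
        kind = ev[0]
        if kind == "switch_in":
            running.add(ev[1])
        elif kind == "switch_out":
            running.discard(ev[1])
        elif kind == "sample":
            _, tid, symbol = ev
            running.add(tid)      # a sampled thread is running by definition
            parallelism = len(running)
            cpu[symbol] += 1.0
            latency[symbol] += 1.0 / parallelism
    return dict(cpu), dict(latency)


def as_percent(weights):
    """Convert absolute weights into percentages of the total."""
    total = sum(weights.values())
    return {name: 100.0 * w / total for name, w in weights.items()}


# A serial phase (1 thread, 2 samples) followed by a parallel phase
# (4 threads, 2 samples each).
events = [("switch_in", 1), ("sample", 1, "link"), ("sample", 1, "link")]
events += [("switch_in", tid) for tid in (2, 3, 4)]
events += [("sample", tid, "compile") for tid in (1, 2, 3, 4)] * 2

cpu, latency = latency_profile(events)
cpu_pct = as_percent(cpu)
latency_pct = as_percent(latency)
```

In this toy trace "compile" accounts for most of the CPU samples, but once
each sample is divided by the parallelism level the serial "link" phase
weighs as much as the parallel one, which is the distinction between the two
profiles.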
>
> We still default to the CPU profile, so it's up to users to learn
> about the second profiling mode and use it when appropriate.
>
> Changes in v3:
> - rebase and split into patches
> - rename 'wallclock' to 'latency' everywhere
> - don't enable latency profiling by default,
>   instead add record/report --latency flag
Thanks for doing this, much better now. I've added some comments in
the thread.
Thanks,
Namhyung
>
> Dmitry Vyukov (7):
>   perf report: Add machine parallelism
>   perf report: Add parallelism sort key
>   perf report: Switch filtered from u8 to u16
>   perf report: Add parallelism filter
>   perf report: Add latency output field
>   perf report: Add --latency flag
>   perf report: Add latency and parallelism profiling documentation
>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Ian Rogers <irogers@google.com>
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
>
> .../callchain-overhead-calculation.txt | 5 +-
> .../cpu-and-latency-overheads.txt | 85 ++++++++++++++++++
> tools/perf/Documentation/perf-report.txt | 49 ++++++----
> tools/perf/Documentation/tips.txt | 3 +
> tools/perf/builtin-record.c | 20 +++++
> tools/perf/builtin-report.c | 39 ++++++++
> tools/perf/ui/browsers/hists.c | 27 +++---
> tools/perf/ui/hist.c | 64 +++++++++----
> tools/perf/util/addr_location.c | 1 +
> tools/perf/util/addr_location.h | 7 +-
> tools/perf/util/event.c | 11 +++
> tools/perf/util/events_stats.h | 2 +
> tools/perf/util/hist.c | 90 +++++++++++++++----
> tools/perf/util/hist.h | 26 +++++-
> tools/perf/util/machine.c | 7 ++
> tools/perf/util/machine.h | 6 ++
> tools/perf/util/session.c | 12 +++
> tools/perf/util/session.h | 1 +
> tools/perf/util/sort.c | 69 ++++++++++++--
> tools/perf/util/sort.h | 3 +-
> tools/perf/util/symbol.c | 34 +++++++
> tools/perf/util/symbol_conf.h | 8 +-
> 22 files changed, 498 insertions(+), 71 deletions(-)
> create mode 100644 tools/perf/Documentation/cpu-and-latency-overheads.txt
>
>
> base-commit: 91b7747dc70d64b5ec56ffe493310f207e7ffc99
> --
> 2.48.1.262.g85cc9f2d1e-goog
>