From: Namhyung Kim <namhyung@kernel.org>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: irogers@google.com, linux-perf-users@vger.kernel.org,
linux-kernel@vger.kernel.org,
Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: [PATCH v3 6/7] perf report: Add --latency flag
Date: Wed, 29 Jan 2025 22:30:08 -0800 [thread overview]
Message-ID: <Z5sccIPaYBeB_yab@google.com> (raw)
In-Reply-To: <CACT4Y+Z6KoYxY0AiC1eK=Ch9CQFuk7rWud_4GAMyjGdv+yemtQ@mail.gmail.com>
On Wed, Jan 29, 2025 at 08:12:51AM +0100, Dmitry Vyukov wrote:
> On Wed, 29 Jan 2025 at 06:03, Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Mon, Jan 27, 2025 at 10:58:53AM +0100, Dmitry Vyukov wrote:
> > > Add record/report --latency flag that allows to capture and show
> > > latency-centric profiles rather than the default CPU-consumption-centric
> > > profiles. For latency profiles record captures context switch events,
> > > and report shows Latency as the first column.
> >
> > It'd be nice if you could add a small example output in the commit
> > message.
> >
> > >
> > > Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> > > Cc: Namhyung Kim <namhyung@kernel.org>
> > > Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> > > Cc: Ian Rogers <irogers@google.com>
> > > Cc: linux-perf-users@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > ---
> > [SNIP]
> > > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> > > index bc4c3acfe7552..2b6023de7a53a 100644
> > > --- a/tools/perf/util/sort.c
> > > +++ b/tools/perf/util/sort.c
> > > @@ -2622,6 +2622,7 @@ struct hpp_dimension {
> > > const char *name;
> > > struct perf_hpp_fmt *fmt;
> > > int taken;
> > > + int was_taken;
> > > };
> > >
> > > #define DIM(d, n) { .name = n, .fmt = &perf_hpp__format[d], }
> > > @@ -3513,6 +3514,7 @@ static int __hpp_dimension__add(struct hpp_dimension *hd,
> > > return -1;
> > >
> > > hd->taken = 1;
> > > + hd->was_taken = 1;
> > > perf_hpp_list__register_sort_field(list, fmt);
> > > return 0;
> > > }
> > > @@ -3547,10 +3549,15 @@ static int __hpp_dimension__add_output(struct perf_hpp_list *list,
> > > return 0;
> > > }
> > >
> > > -int hpp_dimension__add_output(unsigned col)
> > > +int hpp_dimension__add_output(unsigned col, bool implicit)
> > > {
> > > + struct hpp_dimension *hd;
> > > +
> > > BUG_ON(col >= PERF_HPP__MAX_INDEX);
> > > - return __hpp_dimension__add_output(&perf_hpp_list, &hpp_sort_dimensions[col]);
> > > + hd = &hpp_sort_dimensions[col];
> > > + if (implicit && !hd->was_taken)
> > > + return 0;
> >
> > I don't understand why you need 'was_taken' field. Can it use the
> > 'taken' field?
>
> I've started getting failed asserts in
> perf_hpp__cancel_cumulate/latency when removing columns still linked
> into sort order list.
Ok.
>
> I've figured out that columns we implicitly add in setup_overhead and
> perf_hpp__init must match exactly. If we add one in setup_overhead,
> but not in perf_hpp__init, then it starts failing.
Right, I don't remember why we don't have unregister_sort_field or let
the perf_hpp__column_unregister() remove it from the sort_list too.
>
> Taken does not work b/c there is reset_dimensions call between these
> functions, so they actually add the same columns twice to field/sort
> lists (and that prevents failures in
> perf_hpp__cancel_cumulate/latency).
I see.
>
> Initially I've just tried to match logic in setup_overhead and in
> perf_hpp__init. But it turned out quite messy, duplicate logic, and
> e.g. in setup_overhead we look at sort_order, but in perf_hpp__init we
> already can't look at it b/c we already altered it in setup_overhead.
> That logic would also be quite fragile. Adding was_taken looks like
> the simplest, most reliable, and less fragile with respect to future
> changes approach.
Maybe just remove it from the sort_list. There should be no reason to
keep it in the list.
>
>
> > > + return __hpp_dimension__add_output(&perf_hpp_list, hd);
> > > }
> > >
> > > int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
> > > @@ -3809,10 +3816,24 @@ static char *setup_overhead(char *keys)
> > > if (sort__mode == SORT_MODE__DIFF)
> > > return keys;
> > >
> > > - keys = prefix_if_not_in("overhead", keys);
> > > -
> > > - if (symbol_conf.cumulate_callchain)
> > > - keys = prefix_if_not_in("overhead_children", keys);
> > > + if (symbol_conf.prefer_latency) {
> > > + keys = prefix_if_not_in("overhead", keys);
> > > + keys = prefix_if_not_in("latency", keys);
> > > + if (symbol_conf.cumulate_callchain) {
> > > + keys = prefix_if_not_in("overhead_children", keys);
> > > + keys = prefix_if_not_in("latency_children", keys);
> > > + }
> > > + } else if (!keys || (!strstr(keys, "overhead") &&
> > > + !strstr(keys, "latency"))) {
> > > + if (symbol_conf.enable_latency)
> > > + keys = prefix_if_not_in("latency", keys);
> > > + keys = prefix_if_not_in("overhead", keys);
> > > + if (symbol_conf.cumulate_callchain) {
> > > + if (symbol_conf.enable_latency)
> > > + keys = prefix_if_not_in("latency_children", keys);
> > > + keys = prefix_if_not_in("overhead_children", keys);
> > > + }
> > > + }
> >
> > Do you really need this complexity? I think it's better to fix the order
> > of columns in both case.
>
> This is sort order which affects ordering of row, and order of rows is
> basically the most important thing for a profiler. If we fix the
> order, it will be showing completely different things.
>
> "latency" and "overhead" are equivalent with respect to their
> fundamentality for a profile. So we shouldn't be adding any of them,
> if any of them are already explicitly specified by the user.
>
> For example, the command from tips.txt:
>
> perf report --hierarchy --sort latency,parallelism,comm,symbol
>
> if we prepend "overhead", it all falls apart.
>
> Or for 2 default modes (normals and latency) we want "overhead" or
> "latency" to come first. Prepending "latency" for the current CPU mode
> would lead to completely different ordering.
I see. You want to sort by overhead in the default mode even if it has
implicitly-added latency field, and to sort by latency if --latency is
given explicitly.
I think it can be done with the following command line.
$ perf report -F overhead,latency,parallelism,comm,sym -s overhead
or
$ perf report -F overhead,latency,parallelism,comm,sym -s latency
IOW, you can keep the same output column ordering and sort the result
differently. But I'm not sure if it'd make the code simpler. :)
Thanks,
Namhyung
next prev parent reply other threads:[~2025-01-30 6:30 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-27 9:58 [PATCH v3 0/7] perf report: Add latency and parallelism profiling Dmitry Vyukov
2025-01-27 9:58 ` [PATCH v3 1/7] perf report: Add machine parallelism Dmitry Vyukov
2025-01-27 9:58 ` [PATCH v3 2/7] perf report: Add parallelism sort key Dmitry Vyukov
2025-01-29 4:42 ` Namhyung Kim
2025-01-29 7:18 ` Dmitry Vyukov
2025-01-30 5:28 ` Namhyung Kim
2025-02-03 14:40 ` Dmitry Vyukov
2025-01-27 9:58 ` [PATCH v3 3/7] perf report: Switch filtered from u8 to u16 Dmitry Vyukov
2025-01-27 9:58 ` [PATCH v3 4/7] perf report: Add parallelism filter Dmitry Vyukov
2025-01-27 9:58 ` [PATCH v3 5/7] perf report: Add latency output field Dmitry Vyukov
2025-01-29 4:56 ` Namhyung Kim
2025-01-29 6:55 ` Dmitry Vyukov
2025-01-30 5:33 ` Namhyung Kim
2025-01-27 9:58 ` [PATCH v3 6/7] perf report: Add --latency flag Dmitry Vyukov
2025-01-29 5:03 ` Namhyung Kim
2025-01-29 7:12 ` Dmitry Vyukov
2025-01-30 6:30 ` Namhyung Kim [this message]
2025-02-03 14:45 ` Dmitry Vyukov
2025-01-27 9:58 ` [PATCH v3 7/7] perf report: Add latency and parallelism profiling documentation Dmitry Vyukov
2025-01-29 5:05 ` [PATCH v3 0/7] perf report: Add latency and parallelism profiling Namhyung Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z5sccIPaYBeB_yab@google.com \
--to=namhyung@kernel.org \
--cc=acme@kernel.org \
--cc=dvyukov@google.com \
--cc=irogers@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.