public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: irogers@google.com, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: [PATCH v3 6/7] perf report: Add --latency flag
Date: Wed, 29 Jan 2025 22:30:08 -0800	[thread overview]
Message-ID: <Z5sccIPaYBeB_yab@google.com> (raw)
In-Reply-To: <CACT4Y+Z6KoYxY0AiC1eK=Ch9CQFuk7rWud_4GAMyjGdv+yemtQ@mail.gmail.com>

On Wed, Jan 29, 2025 at 08:12:51AM +0100, Dmitry Vyukov wrote:
> On Wed, 29 Jan 2025 at 06:03, Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Mon, Jan 27, 2025 at 10:58:53AM +0100, Dmitry Vyukov wrote:
> > > Add record/report --latency flag that allows to capture and show
> > > latency-centric profiles rather than the default CPU-consumption-centric
> > > profiles. For latency profiles record captures context switch events,
> > > and report shows Latency as the first column.
> >
> > It'd be nice if you could add a small example output in the commit
> > message.
> >
> > >
> > > Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> > > Cc: Namhyung Kim <namhyung@kernel.org>
> > > Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> > > Cc: Ian Rogers <irogers@google.com>
> > > Cc: linux-perf-users@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > ---
> > [SNIP]
> > > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> > > index bc4c3acfe7552..2b6023de7a53a 100644
> > > --- a/tools/perf/util/sort.c
> > > +++ b/tools/perf/util/sort.c
> > > @@ -2622,6 +2622,7 @@ struct hpp_dimension {
> > >       const char              *name;
> > >       struct perf_hpp_fmt     *fmt;
> > >       int                     taken;
> > > +     int                     was_taken;
> > >  };
> > >
> > >  #define DIM(d, n) { .name = n, .fmt = &perf_hpp__format[d], }
> > > @@ -3513,6 +3514,7 @@ static int __hpp_dimension__add(struct hpp_dimension *hd,
> > >               return -1;
> > >
> > >       hd->taken = 1;
> > > +     hd->was_taken = 1;
> > >       perf_hpp_list__register_sort_field(list, fmt);
> > >       return 0;
> > >  }
> > > @@ -3547,10 +3549,15 @@ static int __hpp_dimension__add_output(struct perf_hpp_list *list,
> > >       return 0;
> > >  }
> > >
> > > -int hpp_dimension__add_output(unsigned col)
> > > +int hpp_dimension__add_output(unsigned col, bool implicit)
> > >  {
> > > +     struct hpp_dimension *hd;
> > > +
> > >       BUG_ON(col >= PERF_HPP__MAX_INDEX);
> > > -     return __hpp_dimension__add_output(&perf_hpp_list, &hpp_sort_dimensions[col]);
> > > +     hd = &hpp_sort_dimensions[col];
> > > +     if (implicit && !hd->was_taken)
> > > +             return 0;
> >
> > I don't understand why you need 'was_taken' field.  Can it use the
> > 'taken' field?
> 
> I've started getting failed asserts in
> perf_hpp__cancel_cumulate/latency when removing columns still linked
> into sort order list.

Ok.

> 
> I've figured out that columns we implicitly add in setup_overhead and
> perf_hpp__init must match exactly. If we add one in setup_overhead,
> but not in perf_hpp__init, then it starts failing.

Right, I don't remember why we don't have unregister_sort_field or let
the perf_hpp__column_unregister() remove it from the sort_list too.

> 
> Taken does not work b/c there is reset_dimensions call between these
> functions, so they actually add the same columns twice to field/sort
> lists (and that prevents failures in
> perf_hpp__cancel_cumulate/latency).

I see.

> 
> Initially I've just tried to match logic in setup_overhead and in
> perf_hpp__init. But it turned out quite messy, duplicate logic, and
> e.g. in setup_overhead we look at sort_order, but in perf_hpp__init we
> already can't look at it b/c we already altered it in setup_overhead.
> That logic would also be quite fragile. Adding was_taken looks like
> the simplest, most reliable, and less fragile with respect to future
> changes approach.

Maybe just remove it from the sort_list.  There should be no reason to
keep it in the list.

> 
> 
> > > +     return __hpp_dimension__add_output(&perf_hpp_list, hd);
> > >  }
> > >
> > >  int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
> > > @@ -3809,10 +3816,24 @@ static char *setup_overhead(char *keys)
> > >       if (sort__mode == SORT_MODE__DIFF)
> > >               return keys;
> > >
> > > -     keys = prefix_if_not_in("overhead", keys);
> > > -
> > > -     if (symbol_conf.cumulate_callchain)
> > > -             keys = prefix_if_not_in("overhead_children", keys);
> > > +     if (symbol_conf.prefer_latency) {
> > > +             keys = prefix_if_not_in("overhead", keys);
> > > +             keys = prefix_if_not_in("latency", keys);
> > > +             if (symbol_conf.cumulate_callchain) {
> > > +                     keys = prefix_if_not_in("overhead_children", keys);
> > > +                     keys = prefix_if_not_in("latency_children", keys);
> > > +             }
> > > +     } else if (!keys || (!strstr(keys, "overhead") &&
> > > +                     !strstr(keys, "latency"))) {
> > > +             if (symbol_conf.enable_latency)
> > > +                     keys = prefix_if_not_in("latency", keys);
> > > +             keys = prefix_if_not_in("overhead", keys);
> > > +             if (symbol_conf.cumulate_callchain) {
> > > +                     if (symbol_conf.enable_latency)
> > > +                             keys = prefix_if_not_in("latency_children", keys);
> > > +                     keys = prefix_if_not_in("overhead_children", keys);
> > > +             }
> > > +     }
> >
> > Do you really need this complexity?  I think it's better to fix the order
> > of columns in both case.
> 
> This is sort order which affects ordering of row, and order of rows is
> basically the most important thing for a profiler. If we fix the
> order, it will be showing completely different things.
> 
> "latency" and "overhead" are equivalent with respect to their
> fundamentality for a profile. So we shouldn't be adding any of them,
> if any of them are already explicitly specified by the user.
> 
> For example, the command from tips.txt:
> 
> perf report --hierarchy --sort latency,parallelism,comm,symbol
> 
> if we prepend "overhead", it all falls apart.
> 
> Or for 2 default modes (normals and latency) we want "overhead" or
> "latency" to come first. Prepending "latency" for the current CPU mode
> would lead to completely different ordering.

I see.  You want to sort by overhead in the default mode even if it has
implicitly-added latency field, and to sort by latency if --latency is
given explicitly.

I think it can be done with the following command line.

  $ perf report -F overhead,latency,parallelism,comm,sym -s overhead

or

  $ perf report -F overhead,latency,parallelism,comm,sym -s latency

IOW, you can keep the same output column ordering and sort the result
differently.  But I'm not sure if it'd make the code simpler. :)

Thanks,
Namhyung


  reply	other threads:[~2025-01-30  6:30 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-27  9:58 [PATCH v3 0/7] perf report: Add latency and parallelism profiling Dmitry Vyukov
2025-01-27  9:58 ` [PATCH v3 1/7] perf report: Add machine parallelism Dmitry Vyukov
2025-01-27  9:58 ` [PATCH v3 2/7] perf report: Add parallelism sort key Dmitry Vyukov
2025-01-29  4:42   ` Namhyung Kim
2025-01-29  7:18     ` Dmitry Vyukov
2025-01-30  5:28       ` Namhyung Kim
2025-02-03 14:40         ` Dmitry Vyukov
2025-01-27  9:58 ` [PATCH v3 3/7] perf report: Switch filtered from u8 to u16 Dmitry Vyukov
2025-01-27  9:58 ` [PATCH v3 4/7] perf report: Add parallelism filter Dmitry Vyukov
2025-01-27  9:58 ` [PATCH v3 5/7] perf report: Add latency output field Dmitry Vyukov
2025-01-29  4:56   ` Namhyung Kim
2025-01-29  6:55     ` Dmitry Vyukov
2025-01-30  5:33       ` Namhyung Kim
2025-01-27  9:58 ` [PATCH v3 6/7] perf report: Add --latency flag Dmitry Vyukov
2025-01-29  5:03   ` Namhyung Kim
2025-01-29  7:12     ` Dmitry Vyukov
2025-01-30  6:30       ` Namhyung Kim [this message]
2025-02-03 14:45         ` Dmitry Vyukov
2025-01-27  9:58 ` [PATCH v3 7/7] perf report: Add latency and parallelism profiling documentation Dmitry Vyukov
2025-01-29  5:05 ` [PATCH v3 0/7] perf report: Add latency and parallelism profiling Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z5sccIPaYBeB_yab@google.com \
    --to=namhyung@kernel.org \
    --cc=acme@kernel.org \
    --cc=dvyukov@google.com \
    --cc=irogers@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox