All of lore.kernel.org
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Howard Chu <howardchu95@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 4/4] perf trace: Add --system-summary option
Date: Thu, 23 Jan 2025 15:41:05 -0800	[thread overview]
Message-ID: <Z5LTkVsprlPbG6gF@google.com> (raw)
In-Reply-To: <CAH0uvojXeXiKWvZ6uZMyauWgHiA7Brf9L0=oxh0D2dO5gX70+w@mail.gmail.com>

On Fri, Jan 17, 2025 at 07:07:32PM -0800, Howard Chu wrote:
> Hello Namhyung,
> 
> On Tue, Jan 14, 2025 at 1:23 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> >   $ sudo ./perf trace -as --system-summary sleep 1
> >
> >    Summary of events:
> >
> >    total, 21580 events
> >
> >      syscall            calls  errors  total       min       avg       max       stddev
> >                                        (msec)    (msec)    (msec)    (msec)        (%)
> >      --------------- --------  ------ -------- --------- --------- ---------     ------
> >      epoll_wait          1305      0 14716.712     0.000    11.277   551.529      8.87%
> >      futex               1256     89 13331.197     0.000    10.614   733.722     15.49%
> >      poll                 669      0  6806.618     0.000    10.174   459.316     11.77%
> >      ppoll                220      0  3968.797     0.000    18.040   516.775     25.35%
> >      clock_nanosleep        1      0  1000.027  1000.027  1000.027  1000.027      0.00%
> >      epoll_pwait           21      0   592.783     0.000    28.228   522.293     88.29%
> >      nanosleep             16      0    60.515     0.000     3.782    10.123     33.33%
> >      ioctl                510      0     4.284     0.001     0.008     0.182      8.84%
> >      recvmsg             1434    775     3.497     0.001     0.002     0.174      6.37%
> >      write               1393      0     2.854     0.001     0.002     0.017      1.79%
> >      read                1063    100     2.236     0.000     0.002     0.083      5.11%
> >      ...
> >
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/Documentation/perf-trace.txt |   3 +
> >  tools/perf/builtin-trace.c              | 100 +++++++++++++++++++-----
> >  2 files changed, 85 insertions(+), 18 deletions(-)
> >
> > diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
> > index 2620c4f4601c8701..72f29f82649e320a 100644
> > --- a/tools/perf/Documentation/perf-trace.txt
> > +++ b/tools/perf/Documentation/perf-trace.txt
> > @@ -150,6 +150,9 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
> >         To be used with -s or -S, to show stats for the errnos experienced by
> >         syscalls, using only this option will trigger --summary.
> >
> > +--system-summary::
> > +       To be used with -s or -S, to show system-wide summary instead of per-thread.
> 
> Is system-summary a good option name and is it really system-wide?

I'm afraid not. :)


> For example, if I do:
> 
> perf $ sudo ./perf trace -s -p 2552
> ^C
>  Summary of events:
> 
>  gmain (2581), 12 events, 2.7%
> 
>    syscall            calls  errors  total       min       avg
> max       stddev
>                                      (msec)    (msec)    (msec)
> (msec)        (%)
>    --------------- --------  ------ -------- --------- ---------
> ---------     ------
>    inotify_add_watch        5      5     0.071     0.009     0.014
> 0.022     16.74%
>    ppoll                  1      0     0.000     0.000     0.000
> 0.000      0.00%
> 
> 
>  gnome-shell (2647), 20 events, 4.4%
> 
>    syscall            calls  errors  total       min       avg
> max       stddev
>                                      (msec)    (msec)    (msec)
> (msec)        (%)
>    --------------- --------  ------ -------- --------- ---------
> ---------     ------
>    ppoll                  2      0   136.188     0.000    68.094
> 136.188    100.00%
>    read                   4      2     0.080     0.008     0.020
> 0.038     35.33%
>    epoll_wait             2      0     0.044     0.011     0.022
> 0.032     47.61%
>    write                  2      0     0.040     0.014     0.020
> 0.026     29.65%
> 
> 
>  gdbus (2583), 102 events, 22.6%
> 
>    syscall            calls  errors  total       min       avg
> max       stddev
>                                      (msec)    (msec)    (msec)
> (msec)        (%)
>    --------------- --------  ------ -------- --------- ---------
> ---------     ------
>    ppoll                 13      0  1535.789     0.000   118.138
> 1396.542     90.61%
>    write                 16      0     0.198     0.003     0.012
> 0.043     24.76%
>    recvmsg                6      0     0.117     0.005     0.020
> 0.036     27.08%
>    read                   8      0     0.094     0.003     0.012
> 0.050     48.51%
>    sendmsg                2      0     0.071     0.010     0.035
> 0.061     71.55%
>    poll                   6      0     0.058     0.003     0.010
> 0.020     27.31%
> 
> 
>  gnome-shell (2552), 299 events, 66.2%
> 
>    syscall            calls  errors  total       min       avg
> max       stddev
>                                      (msec)    (msec)    (msec)
> (msec)        (%)
>    --------------- --------  ------ -------- --------- ---------
> ---------     ------
>    ppoll                 17      0  1531.485     0.000    90.087
> 1387.446     90.43%
>    recvmsg               36     35     0.504     0.003     0.014
> 0.082     18.35%
>    mprotect              37      0     0.401     0.005     0.011
> 0.028      8.54%
>    write                 29      0     0.333     0.003     0.011
> 0.051     19.53%
>    read                  16      0     0.151     0.003     0.009
> 0.031     22.48%
>    sendmsg                4      0     0.130     0.014     0.033
> 0.059     34.63%
>    ioctl                  1      0     0.050     0.050     0.050
> 0.050      0.00%
>    poll                   2      0     0.045     0.007     0.023
> 0.039     71.16%
>    epoll_wait             1      0     0.028     0.028     0.028
> 0.028      0.00%
>    close                  2      0     0.017     0.006     0.009
> 0.012     34.67%
>    getpid                 3      0     0.012     0.003     0.004
> 0.007     32.84%
>    fcntl                  1      0     0.007     0.007     0.007
> 0.007      0.00%
> 
> The process 2552 has multiple threads, each with its own separate
> summary, as the old behavior yields.
> 
> perf $ sudo ./perf trace -s -p 2552 --system-summary
> ^C
>  Summary of events:
> 
>  total, 432 events
> 
>    syscall            calls  errors  total       min       avg
> max       stddev
>                                      (msec)    (msec)    (msec)
> (msec)        (%)
>    --------------- --------  ------ -------- --------- ---------
> ---------     ------
>    ppoll                 33      0   524.678     0.000    15.899
> 160.177     50.77%
>    recvmsg               56     51     0.602     0.003     0.011
> 0.033     10.06%
>    ioctl                 21      0     0.542     0.003     0.026
> 0.256     46.90%
>    write                 36      0     0.432     0.002     0.012
> 0.033     12.46%
>    read                  26      2     0.328     0.003     0.013
> 0.037     15.34%
>    sendmsg                6      0     0.209     0.012     0.035
> 0.081     30.35%
>    futex                  9      1     0.111     0.000     0.012
> 0.036     32.24%
>    inotify_add_watch        5      5     0.073     0.009     0.015
> 0.024     18.49%
>    epoll_wait             3      0     0.049     0.008     0.016
> 0.033     50.05%
>    poll                   4      0     0.039     0.004     0.010
> 0.017     29.67%
>    timerfd_settime        2      0     0.032     0.008     0.016
> 0.024     48.19%
>    getpid                 8      0     0.032     0.003     0.004
> 0.007     13.33%
>    close                  3      0     0.023     0.006     0.008
> 0.010     12.89%
>    fcntl                  2      0     0.013     0.005     0.006
> 0.008     25.18%
>    kcmp                   2      0     0.010     0.003     0.005
> 0.007     40.29%
> 
> And this is the summary that brings everything together. However, I
> wouldn't consider it a 'system-wide' summary. Maybe call it
> --summary-total,  summary-collapsed... or just make it the new default
> behavior?

Maybe --summary-total.  I'm not sure if it's ok to change the default.
Probably I need to add a config option at least.

> 
> > +
> >  --tool_stats::
> >         Show tool stats such as number of times fd->pathname was discovered thru
> >         hooking the open syscall return + vfs_getname or via reading /proc/pid/fd, etc.
> > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > index ad7f7fcd0d80b9df..00a20052f676847c 100644
> > --- a/tools/perf/builtin-trace.c
> > +++ b/tools/perf/builtin-trace.c
> > @@ -177,8 +177,10 @@ struct trace {
> >                 pid_t           *entries;
> >                 struct bpf_map  *map;
> >         }                       filter_pids;
> > +       struct hashmap          *syscall_stats;
> >         double                  duration_filter;
> >         double                  runtime_ms;
> > +       unsigned long           pfmaj, pfmin;
> >         struct {
> >                 u64             vfs_getname,
> >                                 proc_getname;
> > @@ -198,6 +200,7 @@ struct trace {
> >         bool                    summary;
> >         bool                    summary_only;
> >         bool                    errno_summary;
> > +       bool                    system_summary;
> >         bool                    failure_only;
> >         bool                    show_comm;
> >         bool                    print_sample;
> > @@ -2500,18 +2503,23 @@ struct syscall_stats {
> >  };
> >
> >  static void thread__update_stats(struct thread *thread, struct thread_trace *ttrace,
> > -                                int id, struct perf_sample *sample, long err, bool errno_summary)
> > +                                int id, struct perf_sample *sample, long err,
> > +                                struct trace *trace)
> >  {
> > +       struct hashmap *syscall_stats = ttrace->syscall_stats;
> >         struct syscall_stats *stats = NULL;
> >         u64 duration = 0;
> >
> > -       if (!hashmap__find(ttrace->syscall_stats, id, &stats)) {
> > +       if (trace->system_summary)
> > +               syscall_stats = trace->syscall_stats;
> > +
> > +       if (!hashmap__find(syscall_stats, id, &stats)) {
> >                 stats = zalloc(sizeof(*stats));
> >                 if (stats == NULL)
> >                         return;
> >
> >                 init_stats(&stats->stats);
> > -               if (hashmap__add(ttrace->syscall_stats, id, stats) < 0) {
> > +               if (hashmap__add(syscall_stats, id, stats) < 0) {
> >                         free(stats);
> >                         return;
> >                 }
> > @@ -2525,7 +2533,7 @@ static void thread__update_stats(struct thread *thread, struct thread_trace *ttr
> >         if (err < 0) {
> >                 ++stats->nr_failures;
> >
> > -               if (!errno_summary)
> > +               if (!trace->errno_summary)
> >                         return;
> >
> >                 err = -err;
> > @@ -2817,7 +2825,7 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
> >         ret = perf_evsel__sc_tp_uint(evsel, ret, sample);
> >
> >         if (trace->summary)
> > -               thread__update_stats(thread, ttrace, id, sample, ret, trace->errno_summary);
> > +               thread__update_stats(thread, ttrace, id, sample, ret, trace);
> >
> >         if (!trace->fd_path_disabled && sc->is_open && ret >= 0 && ttrace->filename.pending_open) {
> >                 trace__set_fd_pathname(thread, ret, ttrace->filename.name);
> > @@ -3255,10 +3263,13 @@ static int trace__pgfault(struct trace *trace,
> >         if (ttrace == NULL)
> >                 goto out_put;
> >
> > -       if (evsel->core.attr.config == PERF_COUNT_SW_PAGE_FAULTS_MAJ)
> > +       if (evsel->core.attr.config == PERF_COUNT_SW_PAGE_FAULTS_MAJ) {
> >                 ttrace->pfmaj++;
> > -       else
> > +               trace->pfmaj++;
> > +       } else {
> >                 ttrace->pfmin++;
> > +               trace->pfmin++;
> > +       }
> >
> >         if (trace->summary_only)
> >                 goto out;
> > @@ -3417,6 +3428,7 @@ static int trace__record(struct trace *trace, int argc, const char **argv)
> >  }
> >
> >  static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp);
> > +static size_t trace__fprintf_system_summary(struct trace *trace, FILE *fp);
> >
> >  static bool evlist__add_vfs_getname(struct evlist *evlist)
> >  {
> > @@ -4329,6 +4341,12 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
> >                 goto out_delete_evlist;
> >         }
> >
> > +       if (trace->system_summary) {
> > +               trace->syscall_stats = alloc_syscall_stats();
> > +               if (trace->syscall_stats == NULL)
> > +                       goto out_delete_evlist;
> > +       }
> > +
> >         evlist__config(evlist, &trace->opts, &callchain_param);
> >
> >         if (forks) {
> > @@ -4489,8 +4507,12 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
> >                 ordered_events__flush(&trace->oe.data, OE_FLUSH__FINAL);
> >
> >         if (!err) {
> > -               if (trace->summary)
> > -                       trace__fprintf_thread_summary(trace, trace->output);
> > +               if (trace->summary) {
> > +                       if (trace->system_summary)
> > +                               trace__fprintf_system_summary(trace, trace->output);
> > +                       else
> > +                               trace__fprintf_thread_summary(trace, trace->output);
> > +               }
> >
> >                 if (trace->show_tool_stats) {
> >                         fprintf(trace->output, "Stats:\n "
> > @@ -4502,6 +4524,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
> >         }
> >
> >  out_delete_evlist:
> > +       delete_syscall_stats(trace->syscall_stats);
> >         trace__symbols__exit(trace);
> >         evlist__free_syscall_tp_fields(evlist);
> >         evlist__delete(evlist);
> > @@ -4629,6 +4652,12 @@ static int trace__replay(struct trace *trace)
> >                         evsel->handler = trace__pgfault;
> >         }
> >
> > +       if (trace->system_summary) {
> > +               trace->syscall_stats = alloc_syscall_stats();
> > +               if (trace->syscall_stats == NULL)
> > +                       goto out;
> > +       }
> > +
> >         setup_pager();
> >
> >         err = perf_session__process_events(session);
> > @@ -4639,12 +4668,13 @@ static int trace__replay(struct trace *trace)
> >                 trace__fprintf_thread_summary(trace, trace->output);
> >
> >  out:
> > +       delete_syscall_stats(trace->syscall_stats);
> >         perf_session__delete(session);
> >
> >         return err;
> >  }
> >
> > -static size_t trace__fprintf_threads_header(FILE *fp)
> > +static size_t trace__fprintf_summary_header(FILE *fp)
> >  {
> >         size_t printed;
> >
> > @@ -4667,19 +4697,19 @@ static int entry_cmp(const void *e1, const void *e2)
> >         return entry1->msecs > entry2->msecs ? -1 : 1;
> >  }
> >
> > -static struct syscall_entry *thread__sort_stats(struct thread_trace *ttrace)
> > +static struct syscall_entry *syscall__sort_stats(struct hashmap *syscall_stats)
> >  {
> >         struct syscall_entry *entry;
> >         struct hashmap_entry *pos;
> >         unsigned bkt, i, nr;
> >
> > -       nr = ttrace->syscall_stats->sz;
> > +       nr = syscall_stats->sz;
> >         entry = malloc(nr * sizeof(*entry));
> >         if (entry == NULL)
> >                 return NULL;
> >
> >         i = 0;
> > -       hashmap__for_each_entry(ttrace->syscall_stats, pos, bkt) {
> > +       hashmap__for_each_entry(syscall_stats, pos, bkt) {
> >                 struct syscall_stats *ss = pos->pvalue;
> >                 struct stats *st = &ss->stats;
> >
> > @@ -4694,14 +4724,14 @@ static struct syscall_entry *thread__sort_stats(struct thread_trace *ttrace)
> >         return entry;
> >  }
> >
> > -static size_t thread__dump_stats(struct thread_trace *ttrace,
> > -                                struct trace *trace, FILE *fp)
> > +static size_t syscall__dump_stats(struct trace *trace, FILE *fp,
> > +                                 struct hashmap *syscall_stats)
> >  {
> >         size_t printed = 0;
> >         struct syscall *sc;
> >         struct syscall_entry *entries;
> >
> > -       entries = thread__sort_stats(ttrace);
> > +       entries = syscall__sort_stats(syscall_stats);
> >         if (entries == NULL)
> >                 return 0;
> >
> > @@ -4711,7 +4741,7 @@ static size_t thread__dump_stats(struct thread_trace *ttrace,
> >         printed += fprintf(fp, "                                     (msec)    (msec)    (msec)    (msec)        (%%)\n");
> >         printed += fprintf(fp, "   --------------- --------  ------ -------- --------- --------- ---------     ------\n");
> >
> > -       for (size_t i = 0; i < ttrace->syscall_stats->sz; i++) {
> > +       for (size_t i = 0; i < syscall_stats->sz; i++) {
> >                 struct syscall_entry *entry = &entries[i];
> >                 struct syscall_stats *stats = entry->stats;
> >                 if (stats) {
> > @@ -4747,6 +4777,17 @@ static size_t thread__dump_stats(struct thread_trace *ttrace,
> >         return printed;
> >  }
> >
> > +static size_t thread__dump_stats(struct thread_trace *ttrace,
> > +                                struct trace *trace, FILE *fp)
> > +{
> > +       return syscall__dump_stats(trace, fp, ttrace->syscall_stats);
> > +}
> > +
> > +static size_t system__dump_stats(struct trace *trace, FILE *fp)
> > +{
> > +       return syscall__dump_stats(trace, fp, trace->syscall_stats);
> > +}
> 
> Perhaps consider inlining these two functions?

I think we can leave it to compiler to make decisions on inlining.

Thanks,
Namhyung

> 
> > +
> >  static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trace *trace)
> >  {
> >         size_t printed = 0;
> > @@ -4800,7 +4841,7 @@ static int trace_nr_events_cmp(void *priv __maybe_unused,
> >
> >  static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp)
> >  {
> > -       size_t printed = trace__fprintf_threads_header(fp);
> > +       size_t printed = trace__fprintf_summary_header(fp);
> >         LIST_HEAD(threads);
> >
> >         if (machine__thread_list(trace->host, &threads) == 0) {
> > @@ -4815,6 +4856,27 @@ static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp)
> >         return printed;
> >  }
> >
> > +static size_t trace__fprintf_system_summary(struct trace *trace, FILE *fp)
> > +{
> > +       size_t printed = trace__fprintf_summary_header(fp);
> > +
> > +       printed += fprintf(fp, " total, ");
> > +       printed += fprintf(fp, "%lu events", trace->nr_events);
> > +
> > +       if (trace->pfmaj)
> > +               printed += fprintf(fp, ", %lu majfaults", trace->pfmaj);
> > +       if (trace->pfmin)
> > +               printed += fprintf(fp, ", %lu minfaults", trace->pfmin);
> > +       if (trace->sched)
> > +               printed += fprintf(fp, ", %.3f msec\n", trace->runtime_ms);
> > +       else if (fputc('\n', fp) != EOF)
> > +               ++printed;
> > +
> > +       printed += system__dump_stats(trace, fp);
> > +
> > +       return printed;
> > +}
> > +
> >  static int trace__set_duration(const struct option *opt, const char *str,
> >                                int unset __maybe_unused)
> >  {
> > @@ -5233,6 +5295,8 @@ int cmd_trace(int argc, const char **argv)
> >                     "Show all syscalls and summary with statistics"),
> >         OPT_BOOLEAN(0, "errno-summary", &trace.errno_summary,
> >                     "Show errno stats per syscall, use with -s or -S"),
> > +       OPT_BOOLEAN(0, "system-summary", &trace.system_summary,
> > +                   "Show system-wide summary instead of per-thread, use with -s or -S"),
> >         OPT_CALLBACK_DEFAULT('F', "pf", &trace.trace_pgfaults, "all|maj|min",
> >                      "Trace pagefaults", parse_pagefaults, "maj"),
> >         OPT_BOOLEAN(0, "syscalls", &trace.trace_syscalls, "Trace syscalls"),
> > --
> > 2.48.0.rc2.279.g1de40edade-goog
> >
> 
> Thanks,
> Howard

      reply	other threads:[~2025-01-23 23:41 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-14 21:22 [PATCH 0/4] perf trace: Add --system-summary option Namhyung Kim
2025-01-14 21:22 ` [PATCH 1/4] perf trace: Allocate syscall stats only if summary is on Namhyung Kim
2025-01-14 21:22 ` [PATCH 2/4] perf trace: Convert syscall_stats to hashmap Namhyung Kim
2025-01-18  3:03   ` Howard Chu
2025-01-23 23:37     ` Namhyung Kim
2025-01-14 21:22 ` [PATCH 3/4] perf tools: Get rid of now-unused rb_resort.h Namhyung Kim
2025-01-14 21:22 ` [PATCH 4/4] perf trace: Add --system-summary option Namhyung Kim
2025-01-18  3:07   ` Howard Chu
2025-01-23 23:41     ` Namhyung Kim [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z5LTkVsprlPbG6gF@google.com \
    --to=namhyung@kernel.org \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=howardchu95@gmail.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.