linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Milian Wolff <milian.wolff@kdab.com>
To: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: linux-perf-users@vger.kernel.org, Jiri Olsa <jolsa@kernel.org>,
	David Ahern <dsahern@gmail.com>,
	Brendan Gregg <brendan.d.gregg@gmail.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Wang Nan <wangnan0@huawei.com>
Subject: Re: [PATCH] perf trace: Add support for printing call chains on sys_exit events.
Date: Sat, 09 Apr 2016 13:44:51 +0200	[thread overview]
Message-ID: <1714692.A55s9sBUmI@agathebauer> (raw)
In-Reply-To: <20160408181853.GC25165@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 4966 bytes --]

On Freitag, 8. April 2016 15:18:53 CEST Arnaldo Carvalho de Melo wrote:
> Em Fri, Apr 08, 2016 at 02:57:54PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Fri, Apr 08, 2016 at 01:34:15PM +0200, Milian Wolff escreveu:
> > > Now, one can print the call chain for every encountered sys_exit
> > > event, e.g.:
> > > 
> > > Note that it is advised to increase the number of mmap pages to
> > > prevent event losses when using this new feature. Often, adding
> > > `-m 10M` to the `perf trace` invocation is enough.
> > > 
> > > This feature is also available in strace when built with libunwind
> > > 
> > > via `strace -k`. Performance wise, this solution is much better:
> > >     $ time find path/to/linux &> /dev/null
> > >     
> > >     real    0m0.051s
> > >     user    0m0.013s
> > >     sys     0m0.037s
> > >     
> > >     $ time perf trace -m 800M --call-graph dwarf find path/to/linux &>
> > >     /dev/null
> > >     
> > >     real    0m2.624s
> > >     user    0m1.203s
> > >     sys     0m1.333s
> > >     
> > >     $ time strace -k find path/to/linux  &> /dev/null
> > >     
> > >     real    0m35.398s
> > >     user    0m10.403s
> > >     sys     0m23.173s
> > > 
> > > Note that it is currently not possible to configure the print output.
> > > Adding such a feature, similar to what is available in `perf script`
> > > via its `--fields` knob can be added later on.
> > 
> > You mixed up multiple changes in one single patch, I'll break it down
> > while testing, and before pushing upstream.
> 
> Expanding a bit the audience:
> 
> First test, it works, great! But do we really need that address? I guess
> not, right, perhaps via some callchain parameter, to tell what we want to
> see? But by default knowing the function name + DSO seems enough, no?

...

> Yeah, you agree with that, now that I read the patch 8-):
> 
> +               /* TODO: user-configurable print_opts */
> +               unsigned int print_opts = PRINT_IP_OPT_IP

;-)

I even tried to make the code of `perf script` reusable for `perf trace`, but 
stopped once I realised that it currently relies on the existance of a 
`perf_session`, which does not exist when we do live tracing. It only exists 
for replaying in `builtin-trace.c`. So it involves some more refactoring which 
I did not have the time for.

<snip>

> Better, but perhaps we should try aligning, up to a limit, the function
> names/DSOs?
> 
> [root@jouet bpf]# trace -e nanosleep --call-graph dwarf usleep 1
>      0.063 ( 0.063 ms): usleep/6132 nanosleep(rqtp: 0x7ffd1b7a8e70          
>                              ) = 0 syscall_slow_exit_work
> ([kernel.kallsyms]) do_syscall_64          ([kernel.kallsyms])
> return_from_SYSCALL_64 ([kernel.kallsyms]) __nanosleep           
> (/usr/lib64/libc-2.22.so) usleep                 (/usr/lib64/libc-2.22.so)
> [unknown]              (/usr/bin/usleep) __libc_start_main     
> (/usr/lib64/libc-2.22.so) [unknown]              (/usr/bin/usleep)
> [root@jouet bpf]#
> 
> wdyt?

Yes, sounds good. Many profilers I've worked with always dump the IP, so I 
thought we should do it here as well. `perf script` e.g. does it. Could we 
maybe print the IP if the symbol is [unknown]?

> Also, after this initial support is in, I think the next step is to
> allow per syscall configs, like we have for per tracepoints, i.e. this
> should be possible:
> 
>    # trace -e nanosleep(call-graph=dwarf),socket -a
> 
> And then we would get callchains just for nanosleep calls, not for
> socket ones. We then need to think how to ask that efficiently to the
> kernel, in this case it should be instead of using
> raw_syscalls:sys_enter + tracepoint filters set via ioctl, to use
> syscalls:sys_{enter,exit}_nanosleep, with callgraphs +
> syscalls:sys_{enter,exit}_socket, without.
> 
> Doing it this way allows us to avoid asking callchains for a lot of
> events when we want just for a few ones, to reduce overhead.
 
Yep, sounds useful for some more specific use-cases. For me, this patch is 
sufficient as I'd just do:

    $ trace -e nanosleep --call-graph=dwarf ...

What I think is more important though is to make sure we only ask for 
callchains on the sys_exit events. Afaik, my patch will do it also for the 
sys_enter which is just additional cost with no benefit? So fixing that first 
is I think even more important, but I don't know how.

> Anyway, I think I'll just break this down into multiple patches and then
> we can work on these other aspects.

Yes, but note that I'll be busy and then on vacation for the next two weeks. 
I'll get back to this after wards.

> David, ah, his patch floated on the linux-perf-users mailing list, easy
> one once the thread->priv one got out of the way (it was being used by
> builtin-trace.c and the unwind code, ugh).
> 
> Thanks,

Same to you, cheers!

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5903 bytes --]

  reply	other threads:[~2016-04-09 11:44 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-08 11:34 [PATCH] perf trace: Add support for printing call chains on sys_exit events Milian Wolff
2016-04-08 17:57 ` Arnaldo Carvalho de Melo
2016-04-08 18:18   ` Arnaldo Carvalho de Melo
2016-04-09 11:44     ` Milian Wolff [this message]
2016-04-09 11:38   ` Milian Wolff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1714692.A55s9sBUmI@agathebauer \
    --to=milian.wolff@kdab.com \
    --cc=acme@kernel.org \
    --cc=brendan.d.gregg@gmail.com \
    --cc=dsahern@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=namhyung@kernel.org \
    --cc=wangnan0@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).