linux-perf-users.vger.kernel.org archive mirror
From: Milian Wolff <mail@milianw.de>
To: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: linux-perf-users <linux-perf-users@vger.kernel.org>,
	Namhyung Kim <namhyung@gmail.com>, Ingo Molnar <mingo@kernel.org>
Subject: Re: Perf event for Wall-time based sampling?
Date: Thu, 18 Sep 2014 18:37:47 +0200
Message-ID: <45528931.El8SOGvs6Z@milian-kdab2>
In-Reply-To: <20140918155745.GH2770@kernel.org>

On Thursday 18 September 2014 12:57:45 Arnaldo Carvalho de Melo wrote:
> Em Thu, Sep 18, 2014 at 05:26:33PM +0200, Milian Wolff escreveu:
> > On Thursday 18 September 2014 11:51:24 Arnaldo Carvalho de Melo wrote:

<snip>

> > b) The callgraphs are really strange, imo. Different traces are printed
> > with the same cost, which sounds wrong, no? See e.g. the multiple 44.44%
> > traces in sched:sched_wakeup.
> 
> Try using --no-children in the 'report' command line.

Nice, this is very useful. Many thanks!

> > c) Most of the traces point into the kernel, how can I hide these traces
> > and only concentrate on the user-space? Do I have to grep manually for
> > [.] ? I

> Oh well, for userspace you need to be aware of how callchains are
> collected, i.e. if your binaries and libraries use
> -fomit-frame-pointer, because if they do you will not get callchains
> going into userspace, so you will need to specifically ask for 'DWARF'
> callchains, from 'perf record' documentation:

I'm actually aware of that and I did add that option to my initial record 
call, sorry for not being clear here.

<snip>

> This has to be made automated, i.e. the tooling needs to figure out that
> the binaries used do use %bp for optimization and automagically collect
> DWARF, but till then, one needs to know about such issues and deal with
> it.

That would indeed be very welcome. There are multiple "defaults" in perf which 
I find highly confusing. The --no-children option above, for example, should 
probably be the default, no? Similarly, I find it extremely irritating that 
`perf report -g` defaults to `-g fractal` and not `-g graph`.

100% foo
  70% bar
    70% asdf
    30% lalala
  30% baz

is much harder to interpret than

100% foo
  70% bar
    49% asdf
    21% lalala
  30% baz

especially for more involved call chains. It took me quite some time to become 
aware of the ability to pass `-g graph` to get the desired output. KCacheGrind, 
for example, also defaults to something similar to `-g graph` and only 
optionally allows the user to get the "relative to parent" cost of `-g fractal`.
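
The relationship between the two listings above can be made explicit: the
`-g graph` style shows each entry's share of the total, which is the product of
the relative ("fractal") percentages along the path from the root. The
following is a minimal Python sketch that reproduces the numbers from the
example. The nested-tuple tree layout and the helper names `to_absolute` and
`render` are invented for illustration and are not part of perf.

```python
# Each node is (name, percent_relative_to_parent, children).
# This layout is an assumption made for the example; perf does not expose it.
FRACTAL = ("foo", 100.0, [
    ("bar", 70.0, [
        ("asdf", 70.0, []),
        ("lalala", 30.0, []),
    ]),
    ("baz", 30.0, []),
])


def to_absolute(node, parent_abs=100.0, depth=0):
    """Convert parent-relative costs into costs relative to the total."""
    name, rel, children = node
    abs_cost = parent_abs * rel / 100.0
    rows = [(depth, name, abs_cost)]
    for child in children:
        rows.extend(to_absolute(child, abs_cost, depth + 1))
    return rows


def render(rows):
    """Format rows as an indented listing, two spaces per call depth."""
    return "\n".join(
        f"{'  ' * depth}{cost:.0f}% {name}" for depth, name, cost in rows
    )


if __name__ == "__main__":
    print(render(to_absolute(FRACTAL)))
```

Here `asdf` comes out as 70% of 70%, i.e. 49% of the total, and `lalala` as
30% of 70%, i.e. 21%.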

> User space support is something that as you see, is still rough, we need
> people like you trying it, but while it is rough, people tend to avoid
> it... :-\

Yes. But perf is already extremely useful and I use it a lot. I'm also 
actively educating people about using it more. I've talked about it at last 
year's Akademy and Qt Developer Days, and again this year at a profiling 
workshop at Akademy. Please keep up the good work!

> > tried something like `perf report --parent "main"` but that makes no
> > difference.
> > 
> > > I would recommend that you take a look at Brendan Gregg's _excellent_
> > > tutorials at:
> > > 
> > > http://www.brendangregg.com/perf.html
> > > 
> > > He will explain all this in way more detail than I briefly skimmed
> > > above. :-)
> > 
> > I did that already, but Brendan and the other available Perf documentation
> > mostly concentrates on performance issues in the Kernel. I'm interested
> > purely in the user space. Perf record with one of the hardware PMU events
> > works nicely in that case, but one cannot use it to find locks&waits
> > similar to what VTune offers.
> 
> Humm, yeah, you need to figure out how to solve your issue, what I tried
> was to show what kinds of building blocks you could use to build what
> you need, but no, there is no ready to use tool for this, that I am
> aware of.
> 
> For instance, you need to collect scheduler events, then do some
> scripting, perhaps using perl or python, perhaps using the scripting
> support that is built into perf already, but yeah, not documented.

And also lacking the ability to get callgraphs, if I'm not mistaken. This is 
crucial for my undertaking. Or has this been added in the meantime?
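
To illustrate the kind of post-processing suggested in the quoted text, here is
a small Python sketch that sums the time each thread spends off-CPU. It
assumes the scheduler events have already been parsed into
(timestamp, thread id, kind) tuples, where kind is "out" when the thread is
switched out and "in" when it runs again. That record format, the sample data
and the function name `off_cpu_time` are all hypothetical; this does not use
perf's built-in scripting support and carries no callgraphs.

```python
from collections import defaultdict


def off_cpu_time(events):
    """Sum, per thread id, the time between being switched out and back in."""
    switched_out_at = {}
    waited = defaultdict(float)
    for timestamp, tid, kind in sorted(events):
        if kind == "out":
            switched_out_at[tid] = timestamp
        elif kind == "in" and tid in switched_out_at:
            waited[tid] += timestamp - switched_out_at.pop(tid)
    return dict(waited)


# Made-up sample data, in seconds; not real perf output.
EVENTS = [
    (0.0, 1, "out"), (0.25, 1, "in"),
    (0.125, 2, "out"), (0.25, 2, "in"),
    (0.5, 1, "out"), (0.75, 1, "in"),
]

if __name__ == "__main__":
    for tid, seconds in sorted(off_cpu_time(EVENTS).items()):
        print(f"thread {tid}: {seconds:.3f}s off-CPU")
```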

<snip>

This was also why I asked my initial question, which I want to repeat once 
more: Is there a technical reason to not offer a "timer" software event to 
perf? I'm a complete layman when it comes to Kernel internals, but from a user 
point of view this would be awesome:

perf record --call-graph dwarf -e sw-timer -F 100 someapplication

This command would then create a timer in the kernel with a 100Hz frequency. 
Whenever it fires, the callgraphs of all threads in $someapplication are 
sampled and written to perf.data. Is this technically not feasible? Or is it 
simply not implemented?

I'm experimenting with a libunwind based profiler, and with some ugly signal 
hackery I can now grab backtraces by sending my application SIGUSR1. Based on 
that, I can probably create a profiling tool that fits my needs. I just wonder 
why one cannot do the same with perf.
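
For readers unfamiliar with the idea, the following Python sketch shows the
same signal-based principle in miniature: a wall-clock interval timer fires at
a fixed frequency and the handler records the interrupted call chain, so time
spent sleeping or waiting shows up in the samples. It is only an analogy, not
the libunwind-based profiler mentioned above: it uses SIGALRM via
`signal.setitimer` rather than SIGUSR1, works on Unix only, and samples just
the main thread. The names `profile`, `workload` and `waits` are made up for
the example.

```python
import collections
import signal
import time
import traceback

# Maps a call chain (tuple of function names, outermost first) to hit count.
samples = collections.Counter()


def _on_tick(signum, frame):
    # Record the call chain that was interrupted by the timer signal.
    stack = tuple(f.name for f in traceback.extract_stack(frame))
    samples[stack] += 1


def profile(func, hz=100):
    """Run func() while sampling its stack `hz` times per wall-clock second."""
    interval = 1.0 / hz
    previous = signal.signal(signal.SIGALRM, _on_tick)
    signal.setitimer(signal.ITIMER_REAL, interval, interval)
    try:
        return func()
    finally:
        # Stop the timer and restore whatever handler was installed before.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


def waits():
    time.sleep(0.2)  # off-CPU time; a CPU-cycle based sampler would miss it


def workload():
    waits()


if __name__ == "__main__":
    profile(workload)
    for stack, count in samples.most_common(3):
        print(count, " -> ".join(stack))
```

Because ITIMER_REAL counts wall-clock time, most samples land inside `waits`
even though the process burns almost no CPU there.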

Thanks for your time

-- 
Milian Wolff
mail@milianw.de
http://milianw.de
