From: Arnaldo Carvalho de Melo
Subject: Re: Perf event for Wall-time based sampling?
Date: Thu, 18 Sep 2014 16:17:13 -0300
Message-ID: <20140918191713.GK2770@kernel.org>
In-Reply-To: <45528931.El8SOGvs6Z@milian-kdab2>
To: Milian Wolff
Cc: linux-perf-users, Namhyung Kim, Ingo Molnar, Joseph Schuchart

On Thu, Sep 18, 2014 at 06:37:47PM +0200, Milian Wolff wrote:
> On Thursday 18 September 2014 12:57:45 Arnaldo Carvalho de Melo wrote:
> > On Thu, Sep 18, 2014 at 05:26:33PM +0200, Milian Wolff wrote:
> > > On Thursday 18 September 2014 11:51:24 Arnaldo Carvalho de Melo wrote:

> > > b) The callgraphs are really strange, imo. Different traces are printed
> > > with the same cost, which sounds wrong, no? See e.g. the multiple 44.44%
> > > traces in sched:sched_wakeup.

> > Try using --no-children in the 'report' command line.

> Nice, this is very useful. Many thanks!

npo

> > > c) Most of the traces point into the kernel, how can I hide these traces
> > > and only concentrate on the user-space? Do I have to grep manually for
> > > [.] ? I

> > Oh well, for userspace you need to be aware of how callchains are
> > collected, i.e.
> > if your binaries and libraries use
> > -fomit-frame-pointer, because if they do you will not get callchains
> > going into userspace, so you will need to specifically ask for 'DWARF'
> > callchains, from 'perf record' documentation:

> I'm actually aware of that and I did add that option to my initial record
> call, sorry for not being clear here.

> > This has to be made automated, i.e. the tooling needs to figure out that
> > the binaries used do use %bp for optimization and automagically collect
> > DWARF, but till then, one needs to know about such issues and deal with
> > it.

> That would indeed be very welcome. There are multiple "defaults" in perf which
> I find highly confusing. The --no-children above e.g. could/should probably be
> the default, no? Similarly, I find it extremely irritating that `perf report -g`

It was, this is something we've actually been discussing recently: the
change that made --children be the default mode. That is why I added
Namhyung and Ingo to the CC list, so that they become aware of more
reaction to this change.

> defaults to `-g fractal` and not `-g graph`.
>
> 100% foo
>   70% bar
>     70% asdf
>     30% lalala
>   30% baz
>
> is much harder to interpret than
>
> 100% foo
>   70% bar
>     49% asdf
>     21% lalala
>   30% baz

But the question then is whether this is configurable; if not, that would
be a first step, i.e. making this possible via some ~/.perfconfig change.
Later we could advocate changing the default. Or perhaps provide some
"skins", i.e. config files that could be sourced into ~/.perfconfig so
that perf mimics the decisions of other profilers that people are used
to.

Kinda like making mutt behave like pine (as I did a long time ago), even
if just for a while, till one gets used to the "superior" default way of
doing things of the new tool :-)

> especially for more involved call chains. It took me quite some time to become
> aware of the ability to pass `-g graph` to get the desired output. KCacheGrind
> e.g.
> also defaults to something similar to `-g graph` and only optionally
> allows the user to get the "relative to parent" cost of `-g fractal`.

> > User space support is something that, as you see, is still rough, we need
> > people like you trying it, but while it is rough, people tend to avoid
> > it... :-\

> Yes. But already perf is extremely useful and I use it a lot. I'm also
> actively educating people about using it more. I've talked about it at last
> year's Akademy and Qt Developer Days, and again this year at a profiling
> workshop at Akademy. Please keep up the good work!

Thanks a lot for doing that!

> > > tried something like `perf report --parent "main"` but that makes no
> > > difference.

> > > > I would recommend that you take a look at Brendan Gregg's _excellent_
> > > > tutorials at:
> > > > http://www.brendangregg.com/perf.html
> > > > He will explain all this in way more detail than I briefly skimmed
> > > > above. :-)

> > > I did that already, but Brendan and the other available Perf documentation
> > > mostly concentrate on performance issues in the Kernel. I'm interested
> > > purely in the user space. Perf record with one of the hardware PMU events
> > > works nicely in that case, but one cannot use it to find locks&waits
> > > similar to what VTune offers.

> > Humm, yeah, you need to figure out how to solve your issue, what I tried
> > was to show what kinds of building blocks you could use to build what
> > you need, but no, there is no ready to use tool for this, that I am
> > aware of.

> > For instance, you need to collect scheduler events, then do some
> > scripting, perhaps using perl or python, perhaps using the scripting
> > support that is built into perf already, but yeah, not documented.

> And also lacking the ability to get callgraphs, if I'm not mistaken. This is
> crucial for my undertaking. Or has this been added in the meantime?
I guess it was:

commit 57608cfd8827a74237d264a197722e2c99f72da4
Author: Joseph Schuchart
Date:   Thu Jul 10 13:50:56 2014 +0200

    perf script: Provide additional sample information on generic events

    To python scripts, including pid, tid, and cpu for which the event
    was recorded.

    At the moment, the pointer to the sample struct is passed to
    scripts, which seems to be of little use. The patch puts this
    information in dictionaries for easy access by Python scripts.

commit 0f5f5bcd112292f14b75750dde7461463bb1c7bb
Author: Joseph Schuchart
Date:   Thu Jul 10 13:50:51 2014 +0200

    perf script: Add callchain to generic and tracepoint events

    This provides valuable information for tracing performance problems.

    Since this change alters the interface for the python scripts, also
    adjust the script generation and the provided scripts.

>
> This was also why I asked my initial question, which I want to repeat once
> more: Is there a technical reason to not offer a "timer" software event to
> perf? I'm a complete layman when it comes to Kernel internals, but from a user
> point of view this would be awesome:
>
> perf record --call-graph dwarf -e sw-timer -F 100 someapplication
>
> This command would then create a timer in the kernel with a 100Hz frequency.
> Whenever it fires, the callgraphs of all threads in $someapplication are
> sampled and written to perf.data. Is this technically not feasible? Or is it
> simply not implemented?
>
> I'm experimenting with a libunwind based profiler, and with some ugly signal
> hackery I can now grab backtraces by sending my application SIGUSR1. Based on

Humm, can't you do the same thing with perf? I.e. you send SIGUSR1 to
your app with the frequency you want, and then hook a 'perf probe' into
your signal... /me tries some stuff, will get back with results...

> that, I can probably create a profiling tool that fits my needs. I just wonder
> why one cannot do the same with perf.

- Arnaldo
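
For reference, the two `perf report -g` listings quoted earlier in this
mail carry the same information: `fractal` shows each entry's share of its
parent, while `graph` shows its share of the total. Below is a minimal
Python sketch of that conversion, using the foo/bar/asdf/lalala/baz
numbers from the example. The `Node` type and the `to_absolute` helper are
hypothetical names made up for illustration and are not part of perf; the
nesting of the tree is inferred from the percentages (49% = 70% of 70%).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One callchain entry; `relative` is its share of the parent, in percent."""
    name: str
    relative: float
    children: List["Node"] = field(default_factory=list)


def to_absolute(node: Node, parent_abs: float = 100.0,
                depth: int = 0) -> List[Tuple[int, str, float]]:
    """Flatten a 'relative to parent' tree into (depth, name, absolute %) rows."""
    # The absolute cost is the parent's absolute cost scaled by the relative share.
    absolute = parent_abs * node.relative / 100.0
    rows = [(depth, node.name, absolute)]
    for child in node.children:
        rows.extend(to_absolute(child, absolute, depth + 1))
    return rows


# The fractal-style tree from the mail (hypothetical nesting, see above).
tree = Node("foo", 100.0, [
    Node("bar", 70.0, [Node("asdf", 70.0), Node("lalala", 30.0)]),
    Node("baz", 30.0),
])

for depth, name, pct in to_absolute(tree):
    print(f"{'  ' * depth}{pct:.0f}% {name}")
```

Run on the fractal numbers, this prints the graph-style listing, which is
why deep call chains are easier to compare in `-g graph` mode: every
number is already on the same scale.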