Re: Trying to understand some strange samples in a perf profile

From: Hadrien Grasland <grasland@lal.in2p3.fr>
To: Milian Wolff <milian.wolff@kdab.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: Trying to understand some strange samples in a perf profile
Date: Thu, 21 Dec 2017 17:02:09 +0100
Message-ID: <9c47802d-a568-1a39-bfa4-6a12d0622a4e@lal.in2p3.fr>
In-Reply-To: <2337765.oK3dqqyzHo@milian-kdab2>

Hey,

First of all, let me be even more aggressive with e-mail quote removal :)

> <snip>
>
>>>> Can you help me check if my understanding of the other samples is
>>>> correct, and give me some clues about what could happen for these last
>>>> few weird ones?
>>>>
>>>> I can provide perf.data files and archives on demand, but since the
>>>> original is a rather big upload/download ( 2.7 GB ), I thought I would
>>>> wait until you can give me some specific requests or a way to produce
>>>> skimmed datasets with only the samples that you are looking for.
>>> The perf.data file wouldn't be enough, we'd need access to all of the
>>> so/dbg files too. perf archive could be used for that purpose, but I've
>>> seen many situations where not all DSOs encountered had a build-id, and
>>> thus were not put into the archive...
>> Yes, back when I discussed the ffffffffffffffff samples with Jiri on
>> IRC, we had this problem that perf archive failed to include all
>> necessary DSOs for analysis. So I'm not sure if it would work here.
>>
>> If there is anything else which I can do to help you here, just tell me!
> I never took the time to debug this, but I personally consider this a big
> problem with perf. These broken backtraces often make analysis much harder
> compared to other tools like VTune which apparently do not suffer from this
> issue. At least there, I have never seen it - maybe it's just less of a
> problem there for some reason.
>
> So to debug this, I guess we need to sit down and actually find a way to
> reproduce the issue with a smaller test case. I actually have something that
> is much smaller than your code base and also shows this issue sometimes:
>
> https://github.com/KDAB/hotspot/blob/master/tests/test-clients/cpp-inlining/main.cpp
>
> perf record --call-graph dwarf -c 1000000 ./a.out
> ..
> cpp-inlining 25448 258803.904661:     100000 cycles:uppp:
>          ffffffff815fd197 [unknown] ([unknown])
> ..
> cpp-inlining 25448 258803.907014:     100000 cycles:uppp:
>                     23783 __hypot_finite (/usr/lib/libm-2.26.so)
>          40e3491ebf75199b [unknown] ([unknown])
> ..
> cpp-inlining 25448 258804.312048:     100000 cycles:uppp:
>                     23783 __hypot_finite (/usr/lib/libm-2.26.so)
>          40f204e9aa06ab79 [unknown] ([unknown])
>
> Interestingly, the last broken state repeats, and the address in libm is
> always the same (0x23783). So that's why I believe that the problem might be
> in libunwind/libdw or in the DWARF emitter in GCC/Clang... We'd have to debug
> this. I guess for that we'd have to do something along the following:
>
> - build a minimal "unwinder" application, e.g. for libunwind
> - notify libunwind about all mmap events
> - install the register and stack state from the perf.data file
> - try to unwind, see what happens
>
> This should give us a MWE that we could also show to Dave Watson and others
> from libunwind. Similarly, we can easily check what libdwfl does there and
> could show it to Mark Wielaard and others from elfutils. I'm pretty sure that
> together we can then figure out what to blame:
>
> - are the register values correct?
> - is the stack state correct?
> - is the DWARF info correct?
> - ...?
>
> It sounds like a lot of work, but it's certainly doable. Fixing this issue
> would make me very happy, and I bet a lot of other users would appreciate this
> too. Right now, I often have to tell people who use perf "deal with broken
> backtraces, try to guesstimate from the surroundings what actually happened".
> But this is of course far from optimal!

This discussion is getting a bit far from my technological comfort zone, 
and I am very likely to say something stupid or repeat what you just 
said in a different way. But if I look at the libunwind documentation, I 
see that there seems to be a way to attach to a process and generate a 
backtrace of it like a debugger would.

If so, couldn't we bisect the issue and determine whether this is
perf's fault by using this capability to build a minimal poor man's
profiler, which uses libunwind to print a stack trace of an application
at regular intervals? To the best of my understanding, if we get broken
stack traces there, the blame would lie with libunwind or the
compiler-generated DWARF info, whereas if we get correct stack traces,
the blame would lie with perf's facilities for copying register and
stack state.

Am I overlooking something here?

Bye,
Hadrien

