From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Wangnan (F)" <wangnan0@huawei.com>, Jiri Olsa <jolsa@kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
David Ahern <dsahern@gmail.com>,
Milian Wolff <milian.wolff@kdab.com>,
linux-kernel@vger.kernel.org, pi3orama <pi3orama@163.com>,
lizefan 00213767 <lizefan@huawei.com>
Subject: Re: [BUG REPORT] perf tools: x86_64: Broken calllchain when sampling taken at 'callq' instruction
Date: Tue, 1 Dec 2015 08:28:26 +0100
Message-ID: <20151201072826.GB28270@gmail.com>
In-Reply-To: <20151130092843.GF17308@twins.programming.kicks-ass.net>
* Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Nov 27, 2015 at 09:38:11AM +0100, Ingo Molnar wrote:
> >
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > > On Thu, Nov 19, 2015 at 11:23:00AM +0100, Ingo Molnar wrote:
> > > > PEBS is an asynchronous hardware tracing mechanism, when batched PEBS is used it
> > > > might not even result in any interruption of execution. The 'pt_regs' does not
> > > > necessarily correspond to an interrupted, restartable context - we take the RIP
> > > > from the PEBS machinery and also use LBR and disassembly to determine the previous
> > > > instruction, before reporting it to user-space.
> > >
> > > Note that modern PEBS hardware (hsw+) does the rollback in hardware.
> > > Prior to that we indeed do it manually using the LBR.
> > >
> > > As to pt_regs, we construct a franken pt_regs based on the actual PEBS
> > > buffer overflow PMI and bits from the PEBS record (which also includes
> > > some register state). See
> > > arch/x86/kernel/cpu/perf_event_intel_ds.c:setup_pebs_sample_data().
> > >
> > > We always copy the flags, ip, bp and sp from the PEBS record into the
> > > interrupt pt_regs.
> > >
> > > And note that the PEBS record is constructed at instruction retirement,
> > > so it shows the state _after_ the instruction, with exception of the
> > > (hsw+) real_ip field.
> > >
> > > So the unwinder will have to be taught that if the IP points at a stack
> > > altering instruction (call, push, etc.) it will have to 'undo' the
> > > effects on the actual stack (I appreciate this might be 'interesting'
> > > for things like: pop, ret, etc.).
> >
> > So do we dump both the 'real' and the actual RIP, to not force tooling into having
> > to decode instructions and such?
>
> Nope, we only expose the corrected one.
>
> > (Which is pretty hard and fragile and not always
> > possible with instructions that destroy the original RIP, like JMP, etc.)
>
> Not sure what you're getting at here. We don't need the uncorrected
> instruction.
Well, we need it for stack unwinding, as you point out:
> But the problem here is that we rewind the instruction stream, but not
> the stack. And the stack unwinder is (obviously) interested in the stack
> state.
Unwinding the stack state would fix it as well - but an equivalent solution would
be to pass along the original RIP: we'd then have a self-consistent pair of
RIP/RSP.
Especially since unwinding the RSP is probably hard:
> I'm not sure we want (or need) to go undo the specific instruction's
> stack effect in-kernel. If the !DWARF unwinders are similarly confused
> we might need to put it in kernel (expensive *groan*). If it's only the
> DWARF muck then it's something that can be done in userspace just
> fine, although we might need to copy slightly more of the stack than SP
> is pointing at, such that we can undo RET/POP etc. which would have data
> beyond the head of stack.
>
> The easiest solution might be to figure out the biggest stack offset for
> any instruction and always capture that much over the head of stack.
So I think the problem here is that the RSP does not match the RIP. We can
either pass along the original RIP+RSP, or the fixed-up pair - but what we
currently do is pass along only half of it, which corrupts the DWARF unwinder's
state, and DWARF unwinding doesn't tolerate such errors.
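To make the mismatch concrete, here is a minimal sketch in C. It is a toy
simulation, not kernel or perf code: the struct, the helper names, the
addresses and the one-entry "CFI table" are all made up for illustration. The
only real facts it relies on are that 'callq rel32' is 5 bytes long and that
it pushes an 8-byte return address on x86_64.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: every name and address below is made up for illustration. */
struct regs {
	uint64_t ip;
	uint64_t sp;
};

#define STACK_BASE	0x7000ULL	/* address of stack_mem[0] */
#define STACK_WORDS	16
#define CALL_IP		0x1010ULL	/* 'callq bar' inside foo() */
#define CALL_LEN	5ULL		/* callq rel32 is 5 bytes */
#define CALL_SP		0x7040ULL	/* RSP just before the call retires */
#define BAR_ENTRY	0x3000ULL	/* first instruction of bar() */
#define FOO_RA_OFF	16ULL		/* in foo(), return address is at RSP+16 */
#define FOO_RET		0x2005ULL	/* where foo() returns to */

static uint64_t stack_mem[STACK_WORDS];

static uint64_t load64(uint64_t addr)
{
	assert(addr >= STACK_BASE && addr < STACK_BASE + 8 * STACK_WORDS);
	return stack_mem[(addr - STACK_BASE) / 8];
}

static void store64(uint64_t addr, uint64_t val)
{
	assert(addr >= STACK_BASE && addr < STACK_BASE + 8 * STACK_WORDS);
	stack_mem[(addr - STACK_BASE) / 8] = val;
}

/* foo()'s frame as seen at the call site: two locals, then its return address. */
static void setup_stack(void)
{
	store64(CALL_SP, 0x1111);
	store64(CALL_SP + 8, 0xdeadbeef);
	store64(CALL_SP + FOO_RA_OFF, FOO_RET);
}

/* State after 'callq target' retires: return address pushed, RIP at target. */
static struct regs retire_call(struct regs pre, uint64_t target)
{
	struct regs post = { target, pre.sp - 8 };

	store64(post.sp, pre.ip + CALL_LEN);
	return post;
}

/* Stand-in for a CFI lookup: where is the return address, relative to RSP? */
static uint64_t ra_offset(uint64_t ip)
{
	return ip == BAR_ENTRY ? 0 : FOO_RA_OFF;
}

/* One unwind step: fetch the return address the rule for 'ip' points at. */
static uint64_t unwind_one(struct regs r)
{
	return load64(r.sp + ra_offset(r.ip));
}

/* Userspace fix-up for a rewound RIP sitting on a CALL: drop the pushed word. */
static struct regs undo_call(struct regs r)
{
	r.sp += 8;
	return r;
}
```

With pre = { CALL_IP, CALL_SP } and post = retire_call(pre, BAR_ENTRY), both
self-consistent pairs unwind correctly: pre yields FOO_RET, post yields the
return address inside foo(). The mixed pair { pre.ip, post.sp }, which is the
analogue of what we report today, applies foo()'s rule to the post-call RSP and
reads one of foo()'s locals instead. Running undo_call() on the mixed pair
gives the right answer again, which is the userspace variant of the fix.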
Thanks,
Ingo