From: Masami Hiramatsu <mhiramat@kernel.org>
To: Daniel Xu <dxu@dxuuu.xyz>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Ingo Molnar <mingo@kernel.org>, X86 ML <x86@kernel.org>,
linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
kuba@kernel.org, mingo@redhat.com, ast@kernel.org,
tglx@linutronix.de, kernel-team@fb.com, yhs@fb.com,
Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
Date: Wed, 10 Mar 2021 19:50:36 +0900 [thread overview]
Message-ID: <20210310195036.9aefe44bda0418484886c3a9@kernel.org> (raw)
In-Reply-To: <20210309213442.fyhxozdcyxfjljih@dlxu-fedora-R90QNFJV>
On Tue, 9 Mar 2021 13:34:42 -0800
Daniel Xu <dxu@dxuuu.xyz> wrote:
> Hi Masami,
>
> Just want to clarify a few points:
>
> On Mon, Mar 08, 2021 at 11:52:10AM +0900, Masami Hiramatsu wrote:
> > On Sun, 7 Mar 2021 13:23:33 -0800
> > Daniel Xu <dxu@dxuuu.xyz> wrote:
> > To help your understanding, let me explain.
> >
> > If we have a code here
> >
> > caller_func:
> > 0x00 add sp, 0x20 /* 0x20 bytes stack frame allocated */
> > ...
> > 0x10 call target_func
> > 0x15 ... /* return address */
> >
> > On the stack in the entry of target_func, we have
> >
> > [stack]
> > 0x0e0 caller_func+0x15
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> >
> > And when we put a kretprobe on the target_func, the stack will be
> >
> > [stack]
> > 0x0e0 kretprobe_trampoline
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> >
> > * "caller_func+0x15" is saved in current->kretprobe_instances.first.
> >
> > When returning from the target_func, call consumed the 0x0e0 and
> > jump to kretprobe_trampoline. Let's see the assembler code.
> >
> > ".text\n"
> > ".global kretprobe_trampoline\n"
> > ".type kretprobe_trampoline, @function\n"
> > "kretprobe_trampoline:\n"
> > /* We don't bother saving the ss register */
> > " pushq %rsp\n"
> > " pushfq\n"
> > SAVE_REGS_STRING
> > " movq %rsp, %rdi\n"
> > " call trampoline_handler\n"
> > /* Replace saved sp with true return address. */
> > " movq %rax, 19*8(%rsp)\n"
> > RESTORE_REGS_STRING
> > " popfq\n"
> > " ret\n"
> >
> > When the entry of trampoline_handler, stack is like this;
> >
> > [stack]
> > 0x040 kretprobe_trampoline+0x25
> > 0x048 r15
> > ... /* pt_regs */
> > 0x0d8 flags
> > 0x0e0 rsp (=0x0e0)
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> >
> > And after returned from trampoline_handler, "movq" changes the
> > stack like this.
> >
> > [stack]
> > 0x040 kretprobe_trampoline+0x25
> > 0x048 r15
> > ... /* pt_regs */
> > 0x0d8 flags
> > 0x0e0 caller_func+0x15
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
>
> Thanks for the detailed explanation. I think I understand kretprobe
> mechanics from a somewhat high level (kprobe saves real return address
> on entry, overwrites return address to trampoline, then trampoline
> runs handler and finally resets return address to real return address).
>
> I don't usually write much assembly so the details escape me somewhat.
>
> > So at the kretprobe handler, we have 2 issues.
> > 1) the return address (caller_func+0x15) is not on the stack.
> > this can be solved by searching from current->kretprobe_instances.
>
> Yes, agreed.
>
> > 2) the stack frame size of kretprobe_trampoline is unknown
> > Since the stackframe is fixed, the fixed number (0x98) can be used.
>
> I'm confused why this is relevant. Is it so ORC knows where to find
> saved return address in the frame?
No, because the kretprobe_trampoline is somewhat special. Usually, at the
function entry, there is a return address on the top of stack, but
kretprobe_trampoline doesn't have it.
So we have to put a hint at the function entry to mark there should be
a next return address. (and ORC unwinder must find it)
> > However, those solutions are only for the kretprobe handler. The stacktrace
> > from interrupt handler hit in the kretprobe_trampoline still doesn't work.
> >
> > So, here is my idea;
> >
> > 1) Change the trampline code to prepare stack frame at first and save
> > registers on it, instead of "push". This will makes ORC easy to setup
> > stackframe information for this code.
>
> I'm confused on the details here. But this is what Josh solves in his
> patch, right?
I'm not so sure how objtool makes the ORC information. If it can trace the
push/pop correctly, yes, it is solved.
> > 2) change the return addres fixup timing. Instead of using return value
> > of trampoline handler, before removing the real return address from
> > current->kretprobe_instances.
>
> Is the idea to have `kretprobe_trampoline` place the real return address
> on the stack (while telling ORC where to find it) _before_ running `call
> trampoline_handler` ? So that an unwind from inside the user defined
> kretprobe handler simply unwinds correctly?
No, unless calling the trampoline_handler, we can not get the real return
address. Thus, the __kretprobe_trampoline_handler() will call return address
fixup function right before unlink the current->kretprobe_instances.
> And to be extra clear, this would only work for stack_trace_save() and
> not stack_trace_save_regs()?
Yes, for the stack_trace_save_regs() and the stack-tracing inside the
kretprobe'd target function, we still need a hack as same as orc_ftrace_find().
>
> > 3) Then, if orc_find() finds the ip is in the kretprobe_trampoline, it
> > checks the contents of the end of stackframe (at the place of regs->sp)
> > is same as the address of it. If it is, it can find the correct address
> > from current->kretprobe_instances. If not, there is a correct address.
>
> What do you mean by "it" w.r.t. "is the same address of it"? I'm
> confused on this point.
Oh I meant,
3) Then, if orc_find() finds the ip is in the kretprobe_trampoline, orc_find()
checks the contents of the end of stackframe (at the place of regs->sp)
is same as the address of the stackframe (Note that kretprobe_trampoline
does "push %sp" at first). If so, orc_find() can find the correct address
from current->kretprobe_instances. If not, there is a correct address.
I need to see the orc unwinder carefully, orc_find() only gets the ip but
to find stackframe, I think this should be fixed in the caller of orc_find().
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
prev parent reply other threads:[~2021-03-10 10:51 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-05 15:38 [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 1/5] ia64: kprobes: Fix to pass correct trampoline address to the handler Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 2/5] kprobes: treewide: Replace arch_deref_entry_point() with dereference_function_descriptor() Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 3/5] kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler() Masami Hiramatsu
2021-03-10 14:21 ` Miroslav Benes
2021-03-10 15:42 ` Masami Hiramatsu
2021-03-11 13:37 ` Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 4/5] kprobes: stacktrace: Recover the address changed by kretprobe Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 5/5] tracing: Remove kretprobe unknown indicator from stacktrace Masami Hiramatsu
2021-03-05 15:44 ` Steven Rostedt
2021-03-05 19:16 ` [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes Daniel Xu
2021-03-06 1:13 ` Masami Hiramatsu
2021-03-07 21:23 ` Daniel Xu
2021-03-08 2:52 ` Masami Hiramatsu
2021-03-08 13:05 ` Masami Hiramatsu
2021-03-09 1:19 ` Josh Poimboeuf
2021-03-10 9:57 ` Masami Hiramatsu
2021-03-10 15:08 ` Josh Poimboeuf
2021-03-10 15:55 ` Masami Hiramatsu
2021-03-10 18:31 ` Josh Poimboeuf
2021-03-11 0:20 ` Masami Hiramatsu
2021-03-11 1:06 ` Josh Poimboeuf
2021-03-11 1:54 ` Masami Hiramatsu
2021-03-11 16:51 ` Josh Poimboeuf
2021-03-12 3:22 ` Masami Hiramatsu
2021-03-10 22:46 ` Daniel Xu
2021-03-09 21:34 ` Daniel Xu
2021-03-10 10:50 ` Masami Hiramatsu [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210310195036.9aefe44bda0418484886c3a9@kernel.org \
--to=mhiramat@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=dxu@dxuuu.xyz \
--cc=jpoimboe@redhat.com \
--cc=kernel-team@fb.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=mingo@redhat.com \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.