From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
Andrew Lutomirski <luto@mit.edu>, Borislav Petkov <bp@amd64.org>,
Ingo Molnar <mingo@kernel.org>,
"user-mode-linux-devel@lists.sourceforge.net"
<user-mode-linux-devel@lists.sourceforge.net>,
Richard Weinberger <richard@nod.at>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"mingo@redhat.com" <mingo@redhat.com>
Subject: Re: [uml-devel] SYSCALL, ptrace and syscall restart breakages (Re: [RFC] weird crap with vdso on uml/i386)
Date: Tue, 23 Aug 2011 03:59:44 +0100
Message-ID: <20110823025944.GB2203@ZenIV.linux.org.uk>
In-Reply-To: <CA+55aFygGyi8HRr8wNVDw5xk46D5QDOhis18u2qV-9+h0-4ReA@mail.gmail.com>

On Mon, Aug 22, 2011 at 06:59:48PM -0700, Linus Torvalds wrote:
> And the system call restart should actually work fine too, because at
> syscall entry we save %ebp *both* in the slot for ebp and ecx when we
> enter the first time. So the second time, we'll re-load the third
> argument from ebp again, but that's fine - it's still going to be the
> right value. Yes? No?
>
> However, I note that the cstar entry point has a comment about not saving ebp:
>
> * %ebp Arg2 [note: not saved in the stack frame, should not be touched]
>
> which sounds odd. Why don't we save it? If we take a signal handler
> there, don't we want %ebp on the kernel stack in pt_regs, in order to
> do everything right?

That's exactly because it's callee-saved. amd64 doesn't build the full
pt_regs on stack; there's a part that is always built (5 words needed for iret
to work + syscall number + rdi + rsi + rdx + rcx + rax + r8--r11) and
the rest of the registers are not saved in regular cases. Reason: as long
as what we are calling follows the amd64 ABI, we are guaranteed that
values of rsp/rbp/rbx/r12--r15 will not change. So we don't waste
cycles and stack space unless we need to. The cases where we do need to:
* in fork/clone/vfork - there we want full pt_regs to copy it into
child's pt_regs.
* in {rt_,}sigreturn - we don't care about the current contents of
those registers, but we want to set them. Thus the full pt_regs on stack,
filled by sys_{rt_,}sigreturn() and these extra registers filled with
values from pt_regs.
* execve() - we want all registers reset to a known state after
sys_execve(), so it fills the full pt_regs and we get the extra regs filled
out of it.
* sigaltstack() - there full pt_regs is overkill, but we do want
userland sp.
* signal delivery - we want these registers preserved across the
duration of handler and we can't depend on handler following ABI. So we
fill the entire pt_regs, and copy it into sigcontext, to be eventually
picked up by sigreturn to reconstruct the entire state.
* ptrace - we want to be able to read/modify *all* these guys.
So we fill the entire pt_regs, let ptrace play with it and read extra regs
back. NOTE: ia32_cstar_tracesys() takes pains to prevent buggering ebp
there - we read the arg6 into r9, then swap it with ebp for duration of
that stuff. So ptrace will see arg6 in regs.bp, but when it's time
to go into syscall the (possibly modified) value will end in r9. Which
is how it's passed to C functions, so we are fine, but that value will be
lost for good before we reach the userland. However, on the way *OUT* we are not
that nice, and SETREGS/POKEUSER hitting us there will end up modifying
ebp. Which will play hell with __kernel_vsyscall()...
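
To make the word accounting above concrete, here is a rough sketch of the
split written as plain C structs. This is not kernel code: the struct and
function names (partial_frame, callee_saved_regs, full_frame and the
*_words() helpers) are made up for illustration, and the field order is only
meant to resemble x86-64 pt_regs. All it shows is that the always-built part
comes to 15 words and the on-demand part to 6, for 21 in total.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Part built on every syscall entry (illustrative names, not the kernel's). */
struct partial_frame {
    uint64_t r11, r10, r9, r8;          /* r8--r11 */
    uint64_t rax, rcx, rdx, rsi, rdi;
    uint64_t orig_rax;                  /* syscall number */
    uint64_t rip, cs, rflags, rsp, ss;  /* the 5 words iret needs */
};

/*
 * Registers the amd64 ABI makes the callee preserve; saved only on demand.
 * rsp is preserved by the callee too, but it already lives in the iret words.
 */
struct callee_saved_regs {
    uint64_t r15, r14, r13, r12, rbp, rbx;
};

/* Full frame; placing the on-demand part first is an assumption of this sketch. */
struct full_frame {
    struct callee_saved_regs extra;
    struct partial_frame always;
};

size_t partial_frame_words(void)
{
    return sizeof(struct partial_frame) / sizeof(uint64_t);
}

size_t extra_words(void)
{
    return sizeof(struct callee_saved_regs) / sizeof(uint64_t);
}

size_t full_frame_words(void)
{
    return sizeof(struct full_frame) / sizeof(uint64_t);
}

/* Usage: aborts if the counts do not match the ones given in the text. */
void frame_layout_check(void)
{
    assert(partial_frame_words() == 15);
    assert(extra_words() == 6);
    assert(full_frame_words() == 21);
}
```

Every case in the list is then a path that needs the 6 extra words filled in
(or loaded back) on top of the 15 that are always there.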
Hell, you have done something very similar on alpha yourself... As for
ebp, it doesn't make any sense to save it on stack - ia32_cstar_entry()
itself takes care of not stomping on it just fine and IRET path
(int_ret_from_sys_call) modifies rbp only if explicitly asked to do so...
Which is most likely where it hits the fan for uml. Normally it wouldn't
hurt to ask PTRACE_SETREGS to put into ebp the value we just got from
PTRACE_GETREGS. However, it *does* hurt when it happens on the second
stop per syscall - i.e. when we are on the way out. I'm not 100% sure
that this is what's going on (it's using PTRACE_SYSEMU, which is supposed
to avoid the second stop completely), but it looks like what I'm seeing...
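
For what it's worth, the suspected failure can be sketched as a toy model in
plain C. It is not kernel code and makes no ptrace calls: toy_cpu, toy_frame
and the toy_*() functions are hypothetical names, and the model assumes that
the entry stop swaps r9 and ebp around the tracer (as described for
ia32_cstar_tracesys() above), that the exit stop exposes the live ebp
directly, and that the tracer writes back at the exit stop a register set it
captured at the entry stop. Whether uml really does that last part is exactly
what is not confirmed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Live registers of the traced task (toy model). */
struct toy_cpu {
    uint32_t ebp;  /* userland's %ebp */
    uint32_t r9;   /* carries arg6 to the C syscall function */
};

/* The bp slot of the saved frame, which is what the tracer reads and writes. */
struct toy_frame {
    uint32_t bp;
};

typedef void (*tracer_fn)(struct toy_frame *frame, void *ctx);

static void toy_swap(struct toy_cpu *cpu)
{
    uint32_t tmp = cpu->r9;
    cpu->r9 = cpu->ebp;
    cpu->ebp = tmp;
}

/* Entry stop: tracer sees arg6 in bp; userland's ebp is parked in r9. */
void toy_entry_stop(struct toy_cpu *cpu, tracer_fn tracer, void *ctx)
{
    struct toy_frame frame;

    toy_swap(cpu);
    frame.bp = cpu->ebp;          /* save: bp slot now holds arg6 */
    if (tracer)
        tracer(&frame, ctx);
    cpu->ebp = frame.bp;          /* restore, possibly modified */
    toy_swap(cpu);                /* arg6 back in r9, ebp intact */
}

/* Exit stop (assumed): no swap, so a write to bp lands in userland's ebp. */
void toy_exit_stop(struct toy_cpu *cpu, tracer_fn tracer, void *ctx)
{
    struct toy_frame frame = { .bp = cpu->ebp };

    if (tracer)
        tracer(&frame, ctx);
    cpu->ebp = frame.bp;
}

/* Tracer actions: copy the frame out, or copy a cached frame in. */
void toy_getregs(struct toy_frame *frame, void *ctx)
{
    *(struct toy_frame *)ctx = *frame;
}

void toy_setregs(struct toy_frame *frame, void *ctx)
{
    *frame = *(const struct toy_frame *)ctx;
}

/* Returns the ebp userland ends up with after one traced syscall. */
uint32_t toy_user_ebp_after_syscall(uint32_t user_ebp, uint32_t arg6,
                                    int write_back_on_exit)
{
    struct toy_cpu cpu = { .ebp = user_ebp, .r9 = arg6 };
    struct toy_frame cached = { 0 };

    toy_entry_stop(&cpu, toy_getregs, &cached);
    /* the syscall body would run here, with arg6 in r9 */
    toy_exit_stop(&cpu, write_back_on_exit ? toy_setregs : NULL, &cached);
    return cpu.ebp;
}

/* Usage: ebp survives unless the cached entry-stop regs are written on exit. */
void toy_demo(void)
{
    assert(toy_user_ebp_after_syscall(0x1000, 6, 0) == 0x1000);
    assert(toy_user_ebp_after_syscall(0x1000, 6, 1) == 6);
}
```

In this model a write at the entry stop only ever changes arg6, while the
same write at the exit stop replaces userland's ebp with arg6, which is the
kind of damage __kernel_vsyscall() would not survive.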