From: Denys Vlasenko <dvlasenk@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Brian Gerst <brgerst@gmail.com>, Ingo Molnar <mingo@kernel.org>,
	Denys Vlasenko <vda.linux@googlemail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Alexei Starovoitov <ast@plumgrid.com>,
	Will Drewry <wad@chromium.org>, Kees Cook <keescook@chromium.org>,
	X86 ML <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86: vdso32/syscall.S: do not load __USER32_DS to %ss
Date: Wed, 25 Mar 2015 15:55:22 +0100
Message-ID: <5512CC5A.8060506@redhat.com>
In-Reply-To: <CALCETrU=fWvyOf-yWG=UQL4jfhbp1vwzPpBd+eeTLjk94xX+8A@mail.gmail.com>

On 03/24/2015 10:40 PM, Andy Lutomirski wrote:
> The syscall and sysenter stuff is IMO really nasty.  Here's how I'd
> like it to work:
> 
> When you do "call __kernel_vsyscall", I want the net effect to be that
> your eax, ebx, ecx, edx, esi, edi, and ebp at the time of the call end
> up *verbatim* in pt_regs.  Your eip and rsp should be such that, if we
> iret normally using pt_regs, we end up returning correctly to
> userspace.  I want this to be true *regardless* of whether we're doing
> a fast-path or slow-path system call.
> 
> This means that we have, literally (see below for why ret $4):
> 
> int $0x80
> ret $4  <-- regs->eip points here
> 
> Then we add an opportunistic return trampoline: if a special ti flag
> is set (which we set on entry here) and the return eip and regs are
> appropriate, then we change the return at the last minute to vdso code
> that looks like:
> 
> popl $ecx
> popl $edx
> ret

I don't fully understand your intent.

> The vdso code would be something like (so untested it's not even funny):
> 
> __kernel_vsyscall:
>   ALTERNATIVE_2(something or other)
> 
> __kernel_vsyscall_for_intel:
>   pushl $edx
>   pushl $ecx
>   sysenter
>   hlt  <-- just for clarity
> 
> __kernel_vsyscall_for_amd:
>   pushl $ecx
>   syscall
> __vsyscall_after_syscall_insn:
>  ret $4 <-- for binary tracers only

Wouldn't this ret use the former ecx value (just pushed) as the return address?


> __kernel_vsyscall_for_int80:
>   int $0x80  <-- regs->eip points here during *all* vsyscalls
> 
> __kernel_vsyscall_slow_ret:
>   ret $4

Besides returning, this will pop an extra word from the stack of the
__kernel_vsyscall() caller. Callers don't expect that.


> __kernel_vsyscall_sysretl_target:
>   popl $ecx
>   ret
> 
> There is no sysexit.  Take that, Intel.
> 
> On sysenter, we copy regs->cx and regs->dx from user memory and then
> we increment regs->sp by 4 and point regs->eip to
> __kernel_vsyscall_for_int80.  On syscall, we copy regs->cx from user
> memory and point regs->eip to __kernel_vsyscall_for_int80.
> 
> On opportunistic sysretl, we do:
> 
> *regs->sp = regs->cx;  /* put_user or whatever */
> regs->eip = __kernel_vsyscall_sysretl_target
> ...
> sysretl
> 
> We never do sysexit or sysretl in any other code path.  That is, there
> is no really fast path anymore.

I still don't understand the purpose of those "ret $4" insns.
They don't look right.
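
To make the two concerns concrete, here is a toy model of the stack, taking
the quoted snippets literally. It is only a sketch under stated assumptions:
the Stack class and the amd_path/int80_path helpers are hypothetical names
made up for illustration, "syscall" and "int $0x80" are assumed to come back
with esp unchanged, and none of the kernel-side fixups of regs->sp or
regs->eip described in the quoted text are modeled.

```python
WORD = 4  # bytes per 32-bit stack slot


class Stack:
    """Toy 32-bit stack: esp grows downward, memory is a dict keyed by address."""

    def __init__(self, esp=0x1000):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= WORD
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += WORD
        return value

    def ret(self, imm=0):
        """ret / ret $imm: pop the return address, then release imm more bytes."""
        eip = self.pop()
        self.esp += imm
        return eip


def amd_path(ecx, return_addr):
    """call __kernel_vsyscall; pushl ecx; syscall; ret $4 (taken literally)."""
    s = Stack()
    s.push("caller_word")      # something the caller owns
    caller_esp = s.esp         # esp just before the call
    s.push(return_addr)        # call pushes the return address
    s.push(ecx)                # pushl ecx
    # syscall: assumed to leave esp unchanged
    eip = s.ret(4)             # pops the saved ecx as the return address
    return eip, s.esp - caller_esp


def int80_path(return_addr):
    """call __kernel_vsyscall; int $0x80; ret $4 (taken literally)."""
    s = Stack()
    s.push("caller_word")
    caller_esp = s.esp
    s.push(return_addr)
    # int $0x80: assumed to leave esp unchanged
    eip = s.ret(4)             # right address, but 4 extra bytes released
    return eip, s.esp - caller_esp
```

In this model amd_path() ends up with eip equal to the old ecx value, while
int80_path() returns to the right address but leaves esp 4 bytes above where
the caller had it, so one word of the caller's stack has been consumed.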
