From: Ingo Molnar <mingo@kernel.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: X86 ML <x86@kernel.org>, Denys Vlasenko <dvlasenk@redhat.com>,
Brian Gerst <brgerst@gmail.com>, Borislav Petkov <bp@alien8.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Jan Beulich <jbeulich@suse.com>
Subject: Re: Proposal for finishing the 64-bit x86 syscall cleanup
Date: Tue, 25 Aug 2015 10:42:01 +0200 [thread overview]
Message-ID: <20150825084201.GA21589@gmail.com> (raw)
In-Reply-To: <20150825081841.GA19412@gmail.com>
* Ingo Molnar <mingo@kernel.org> wrote:
>
> * Andy Lutomirski <luto@amacapital.net> wrote:
>
> > Hi all-
> >
> > I want to (try to) mostly or fully get rid of the messy bits (as
> > opposed to the hardware-bs-forced bits) of the 64-bit syscall asm.
> > There are two major conceptual things that are in the way.
> >
> > Thing 1: partial pt_regs
> >
> > 64-bit fast path syscalls don't fully initialize pt_regs: bx, bp, and
> > r12-r15 are uninitialized. Some syscalls require them to be
> > initialized, and they have special awful stubs to do it. The entry
> > and exit tracing code (except for phase1 tracing) also need them
> > initialized, and they have their own messy initialization. Compat
> > syscalls are their own private little mess here.
> >
> > This gets in the way of all kinds of cleanups, because C code can't
> > switch between the full and partial pt_regs states.
> >
> > I can see two ways out. We could remove the optimization entirely,
> > which consists of pushing and popping six more registers and adds
> > about ten cycles to fast path syscalls on Sandy Bridge. It also
> > simplifies and presumably speeds up the slow paths.
>
> So out of hundreds of regular system calls there's only a handful of such system
> calls:
>
> triton:~/tip> git grep stub arch/x86/entry/syscalls/
> arch/x86/entry/syscalls/syscall_32.tbl:2 i386 fork sys_fork stub32_fork
> arch/x86/entry/syscalls/syscall_32.tbl:11 i386 execve sys_execve stub32_execve
> arch/x86/entry/syscalls/syscall_32.tbl:119 i386 sigreturn sys_sigreturn stub32_sigreturn
> arch/x86/entry/syscalls/syscall_32.tbl:120 i386 clone sys_clone stub32_clone
> arch/x86/entry/syscalls/syscall_32.tbl:173 i386 rt_sigreturn sys_rt_sigreturn stub32_rt_sigreturn
> arch/x86/entry/syscalls/syscall_32.tbl:190 i386 vfork sys_vfork stub32_vfork
> arch/x86/entry/syscalls/syscall_32.tbl:358 i386 execveat sys_execveat stub32_execveat
> arch/x86/entry/syscalls/syscall_64.tbl:15 64 rt_sigreturn stub_rt_sigreturn
> arch/x86/entry/syscalls/syscall_64.tbl:56 common clone stub_clone
> arch/x86/entry/syscalls/syscall_64.tbl:57 common fork stub_fork
> arch/x86/entry/syscalls/syscall_64.tbl:58 common vfork stub_vfork
> arch/x86/entry/syscalls/syscall_64.tbl:59 64 execve stub_execve
> arch/x86/entry/syscalls/syscall_64.tbl:322 64 execveat stub_execveat
> arch/x86/entry/syscalls/syscall_64.tbl:513 x32 rt_sigreturn stub_x32_rt_sigreturn
> arch/x86/entry/syscalls/syscall_64.tbl:520 x32 execve stub_x32_execve
> arch/x86/entry/syscalls/syscall_64.tbl:545 x32 execveat stub_x32_execveat
>
> and none of them are super performance critical system calls, so no way would I
> go for unconditionally saving/restoring all of ptregs, just to make it a bit
> simpler for these syscalls.
Let me qualify that: no way in the long run.
In the short run we can drop the optimization and reintroduce it later, to lower
all the risks that the C conversion brings with itself.
( That would also make it easier to re-analyze the cost/benefit ratio of the
optimization. )
So feel free to introduce a simple ptregs save/restore pattern for now.
Thanks,
Ingo
next prev parent reply other threads:[~2015-08-25 8:42 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-24 21:13 Proposal for finishing the 64-bit x86 syscall cleanup Andy Lutomirski
2015-08-25 7:29 ` Jan Beulich
2015-08-25 8:18 ` Ingo Molnar
2015-08-25 8:42 ` Ingo Molnar [this message]
2015-08-25 10:59 ` Brian Gerst
2015-08-25 16:28 ` Andy Lutomirski
2015-08-25 16:59 ` Linus Torvalds
2015-08-26 5:20 ` Brian Gerst
2015-08-26 17:10 ` Andy Lutomirski
2015-08-27 3:13 ` Brian Gerst
2015-08-27 3:38 ` Andy Lutomirski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150825084201.GA21589@gmail.com \
--to=mingo@kernel.org \
--cc=bp@alien8.de \
--cc=brgerst@gmail.com \
--cc=dvlasenk@redhat.com \
--cc=jbeulich@suse.com \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@amacapital.net \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.