From: Ingo Molnar <mingo@kernel.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: X86 ML <x86@kernel.org>, Denys Vlasenko <dvlasenk@redhat.com>,
Brian Gerst <brgerst@gmail.com>, Borislav Petkov <bp@alien8.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Jan Beulich <jbeulich@suse.com>
Subject: Re: Proposal for finishing the 64-bit x86 syscall cleanup
Date: Tue, 25 Aug 2015 10:42:01 +0200
Message-ID: <20150825084201.GA21589@gmail.com>
In-Reply-To: <20150825081841.GA19412@gmail.com>
* Ingo Molnar <mingo@kernel.org> wrote:
>
> * Andy Lutomirski <luto@amacapital.net> wrote:
>
> > Hi all-
> >
> > I want to (try to) mostly or fully get rid of the messy bits (as
> > opposed to the hardware-bs-forced bits) of the 64-bit syscall asm.
> > There are two major conceptual things that are in the way.
> >
> > Thing 1: partial pt_regs
> >
> > 64-bit fast path syscalls don't fully initialize pt_regs: bx, bp, and
> > r12-r15 are uninitialized. Some syscalls require them to be
> > initialized, and they have special awful stubs to do it. The entry
> > and exit tracing code (except for phase1 tracing) also need them
> > initialized, and they have their own messy initialization. Compat
> > syscalls are their own private little mess here.
> >
> > This gets in the way of all kinds of cleanups, because C code can't
> > switch between the full and partial pt_regs states.
> >
> > I can see two ways out. We could remove the optimization entirely,
> > which consists of pushing and popping six more registers and adds
> > about ten cycles to fast path syscalls on Sandy Bridge. It also
> > simplifies and presumably speeds up the slow paths.
>
> So out of hundreds of regular system calls there's only a handful of such system
> calls:
>
> triton:~/tip> git grep stub arch/x86/entry/syscalls/
> arch/x86/entry/syscalls/syscall_32.tbl:2    i386    fork          sys_fork          stub32_fork
> arch/x86/entry/syscalls/syscall_32.tbl:11   i386    execve        sys_execve        stub32_execve
> arch/x86/entry/syscalls/syscall_32.tbl:119  i386    sigreturn     sys_sigreturn     stub32_sigreturn
> arch/x86/entry/syscalls/syscall_32.tbl:120  i386    clone         sys_clone         stub32_clone
> arch/x86/entry/syscalls/syscall_32.tbl:173  i386    rt_sigreturn  sys_rt_sigreturn  stub32_rt_sigreturn
> arch/x86/entry/syscalls/syscall_32.tbl:190  i386    vfork         sys_vfork         stub32_vfork
> arch/x86/entry/syscalls/syscall_32.tbl:358  i386    execveat      sys_execveat      stub32_execveat
> arch/x86/entry/syscalls/syscall_64.tbl:15   64      rt_sigreturn  stub_rt_sigreturn
> arch/x86/entry/syscalls/syscall_64.tbl:56   common  clone         stub_clone
> arch/x86/entry/syscalls/syscall_64.tbl:57   common  fork          stub_fork
> arch/x86/entry/syscalls/syscall_64.tbl:58   common  vfork         stub_vfork
> arch/x86/entry/syscalls/syscall_64.tbl:59   64      execve        stub_execve
> arch/x86/entry/syscalls/syscall_64.tbl:322  64      execveat      stub_execveat
> arch/x86/entry/syscalls/syscall_64.tbl:513  x32     rt_sigreturn  stub_x32_rt_sigreturn
> arch/x86/entry/syscalls/syscall_64.tbl:520  x32     execve        stub_x32_execve
> arch/x86/entry/syscalls/syscall_64.tbl:545  x32     execveat      stub_x32_execveat
>
> and none of them are super performance critical system calls, so no way would I
> go for unconditionally saving/restoring all of ptregs, just to make it a bit
> simpler for these syscalls.

Let me qualify that: no way in the long run.

In the short run we can drop the optimization and reintroduce it later, to lower
the risks that the C conversion brings with it.

( That would also make it easier to re-analyze the cost/benefit ratio of the
  optimization. )

So feel free to introduce a simple ptregs save/restore pattern for now.
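
For illustration only, here is a minimal userspace sketch of the difference
between the partial save done by the fast path and a simple full save/restore.
It is not kernel code and not the actual entry asm: the struct merely follows
the field order of x86-64 struct pt_regs, and every model_* name and the
poison value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model only. Field order mirrors x86-64 struct pt_regs;
 * the type, macro and function names are hypothetical.
 */
struct model_regs {
	/* Callee-saved registers the 64-bit fast path leaves uninitialized. */
	unsigned long r15, r14, r13, r12, bp, bx;
	/* Registers the fast path always saves. */
	unsigned long r11, r10, r9, r8, ax, cx, dx, si, di;
	unsigned long orig_ax, ip, cs, flags, sp, ss;
};

/* Stand-in for "uninitialized" so the model stays deterministic. */
#define MODEL_POISON 0xdeadbeefUL

/* Number of registers skipped by the partial save (bx, bp, r12-r15). */
static size_t model_skipped_regs(void)
{
	return offsetof(struct model_regs, r11) / sizeof(unsigned long);
}

/* Fast-path style: save only r11 onward; the first six slots stay poisoned. */
static void model_save_partial(struct model_regs *frame,
			       const struct model_regs *cpu)
{
	size_t skip = offsetof(struct model_regs, r11);

	assert(frame && cpu);
	frame->r15 = frame->r14 = frame->r13 = MODEL_POISON;
	frame->r12 = frame->bp = frame->bx = MODEL_POISON;
	memcpy((char *)frame + skip, (const char *)cpu + skip,
	       sizeof(*frame) - skip);
}

/* Simple pattern: save everything, so C code never sees a partial frame. */
static void model_save_full(struct model_regs *frame,
			    const struct model_regs *cpu)
{
	assert(frame && cpu);
	*frame = *cpu;
}

/* Restore counterpart of the full save. */
static void model_restore_full(struct model_regs *cpu,
			       const struct model_regs *frame)
{
	assert(frame && cpu);
	*cpu = *frame;
}
```

In this model, a handler that needs bx, bp or r12-r15 (the fork/clone/execve
style cases listed above) can only trust a frame filled by model_save_full();
the price is the six extra saves and restores per syscall that the
optimization was avoiding.
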
Thanks,
Ingo