From: Ingo Molnar <mingo@kernel.org>
To: Andy Lutomirski <luto@kernel.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
Brian Gerst <brgerst@gmail.com>,
Denys Vlasenko <dvlasenk@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH v2 00/36] x86: Rewrite all syscall entries except native 64-bit
Date: Fri, 9 Oct 2015 15:06:54 +0200
Message-ID: <20151009130654.GA10456@gmail.com>
In-Reply-To: <cover.1444091584.git.luto@kernel.org>
* Andy Lutomirski <luto@kernel.org> wrote:
> The first two patches are optimizations that I'm surprised we didn't
> already have. I noticed them when I was looking at the generated
> asm.
>
> The next two patches are tests and some old stuff. There's a test
> that validates the vDSO AT_SYSINFO annotations. There's also a test
> that exercises some assumptions that signal handling and ptracers
> make about syscalls that currently do *not* hold on 64-bit AMD using
> 32-bit AT_SYSINFO.
>
> The next three patches are NT cleanups and a lockdep cleanup.
>
> It may pay to apply the beginning of the series (at most through
> "x86/entry/64/compat: After SYSENTER, move STI after the NT fixup")
> without waiting for everyone to wrap their heads around the rest.
>
> The rest is basically a rewrite of syscalls for all cases except
> 64-bit native. With these patches applied, there is a single 32-bit
> vDSO and it uses SYSCALL, SYSENTER, and INT80 almost interchangeably
> via alternatives. The semantics of SYSENTER and SYSCALL are defined
> as:
>
> 1. If SYSCALL, ESP = ECX
> 2. ECX = *ESP
> 3. IP = INT80 landing pad
> 4. Opportunistic SYSRET/SYSEXIT is enabled on return
>
> The vDSO is rearranged so that these semantics work. Anything that
> backs IP up by 2 ends up pointing at a bona fide int $0x80
> instruction with the expected regs.
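As an illustration only (this is not code from the patch series), the four steps quoted above can be modelled in plain user-space C. The struct names, the tiny memory model and the function names are all hypothetical; the sketch only mirrors the register rules as the cover letter states them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; nothing here is a kernel API. */
enum entry_kind { ENTRY_SYSENTER, ENTRY_SYSCALL };

struct model_regs {
    uint32_t ip;
    uint32_t sp;
    uint32_t cx;
    bool fast_return_ok;   /* step 4: opportunistic SYSRET/SYSEXIT allowed */
};

/* A toy stand-in for user memory: 16 words starting at 'base'. */
struct model_mem {
    uint32_t base;
    uint32_t words[16];
};

static uint32_t load32(const struct model_mem *m, uint32_t addr)
{
    return m->words[(addr - m->base) / 4];
}

/* Apply the four rules from the cover letter, in order. */
static void fast_entry_fixup(struct model_regs *r, enum entry_kind kind,
                             const struct model_mem *m, uint32_t landing_pad)
{
    if (kind == ENTRY_SYSCALL)
        r->sp = r->cx;            /* 1. If SYSCALL, ESP = ECX */
    r->cx = load32(m, r->sp);     /* 2. ECX = *ESP */
    r->ip = landing_pad;          /* 3. IP = INT80 landing pad */
    r->fast_return_ok = true;     /* 4. fast return may be attempted */
}

/*
 * int $0x80 encodes as the two bytes CD 80, so backing IP up by 2 from
 * the landing pad yields the address of the int $0x80 instruction.
 */
static uint32_t restart_ip(uint32_t landing_pad)
{
    return landing_pad - 2;
}
```

The point of the model is that, after the fixup, every fast entry looks as if it had arrived via the int $0x80 instruction that sits just before the landing pad.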
>
> In the process, the vDSO CFI annotations (which are actually used)
> get rewritten using normal CFI directives.
>
> Opportunistic SYSRET/SYSEXIT only happens on return when CS and SS
> are as expected, IP points to the INT80 landing pad, and flags are
> in good shape. (There is no longer any assumption that full
> fast-path 32-bit syscalls don't muck with the registers that matter
> for fast exits -- I played with maintaining an optimization like
> that with poor results. I may try again if it saves a few cycles.)
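A rough sketch of the eligibility test described above, written as stand-alone C rather than kernel code. The struct layout, the policy struct and the choice of TF and RF as the flags that rule out a fast return are assumptions made for illustration; the cover letter only says the flags must be "in good shape".

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 EFLAGS bit positions: TF is bit 8, RF is bit 16. */
#define EFLAGS_TF (1u << 8)
#define EFLAGS_RF (1u << 16)

/* Hypothetical snapshot of the registers that matter on return. */
struct ret_regs {
    uint32_t ip;
    uint32_t flags;
    uint16_t cs;
    uint16_t ss;
};

/* Hypothetical description of what "as expected" means on this system. */
struct fast_ret_policy {
    uint16_t user_cs;
    uint16_t user_ss;
    uint32_t landing_pad;
    uint32_t bad_flags;    /* assumption: any of these set forces the slow exit */
};

/* True only if every condition named in the cover letter holds. */
static bool fast_return_allowed(const struct ret_regs *r,
                                const struct fast_ret_policy *p)
{
    return r->cs == p->user_cs &&
           r->ss == p->user_ss &&
           r->ip == p->landing_pad &&
           (r->flags & p->bad_flags) == 0;
}
```

Because the check runs on every return instead of relying on the syscall body leaving registers alone, a ptracer or signal handler that moves IP or changes the segments simply causes the slower exit path to be taken.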
>
> Other than that, the system call entries are simplified to the bare
> minimum prologue and a call to a C function. Amusingly, SYSENTER
> and SYSCALL32 use the same C function.
>
> To make that work, I had to remove all the 32-bit syscall stubs
> except the clone argument hack. This is because, for C code to call
> through the system call table, the system call table entries need to
> be real function pointers with C-compatible ABIs.
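To make the function-pointer requirement concrete, here is a small user-space sketch of a table dispatch. Only the name sys_call_ptr_t comes from the series; the six-argument signature, the demo handlers and demo_dispatch() are assumptions for illustration and do not reproduce the kernel's actual definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed shape: every entry takes six word-sized arguments. */
typedef long (*sys_call_ptr_t)(unsigned long, unsigned long, unsigned long,
                               unsigned long, unsigned long, unsigned long);

/* Hypothetical handler standing in for an unimplemented syscall. */
static long demo_ni(unsigned long a, unsigned long b, unsigned long c,
                    unsigned long d, unsigned long e, unsigned long f)
{
    (void)a; (void)b; (void)c; (void)d; (void)e; (void)f;
    return -ENOSYS;
}

/* Hypothetical handler that uses only its first two arguments. */
static long demo_add(unsigned long a, unsigned long b, unsigned long c,
                     unsigned long d, unsigned long e, unsigned long f)
{
    (void)c; (void)d; (void)e; (void)f;
    return (long)(a + b);
}

static const sys_call_ptr_t demo_table[] = { demo_ni, demo_add };

/* Dispatch from C: bounds-check the number, then make an ordinary call. */
static long demo_dispatch(unsigned long nr, const unsigned long args[6])
{
    if (nr >= sizeof(demo_table) / sizeof(demo_table[0]))
        return -ENOSYS;
    return demo_table[nr](args[0], args[1], args[2],
                          args[3], args[4], args[5]);
}
```

A table entry that expected its arguments in some private register layout set up by an asm stub could not be called this way, which is why such stubs had to go.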
>
> There is nothing at all anymore that requires that x86_32 syscalls
> be asmlinkage. That could be removed in a subsequent patch.
>
> The upshot appears to be a ~16 cycle performance hit on 32-bit fast
> path syscalls. (On my system, my little prctl test takes 172 cycles
> before and 188 cycles with these patches applied.)
>
> The slow path is probably faster under most circumstances and, if
> the exit slow path gets hit, it'll be much faster because (as we
> already do in the 64-bit native case) we can still use
> SYSEXIT/SYSRET.
>
> The patchset is structured as a removal of the old fast syscall
> code, then the change that makes syscalls into real functions, then
> a clean re-implementation of fast syscalls.
>
> If we want some of the ~16 cycles back, we could consider open-coding
> a new C fast path.
>
> Changes from v1:
> - The unwind_vdso_32 test now warns on broken Debian installations
>   instead of failing. The problem is now fully understood, will
>   be fixed by Debian and possibly also fixed by upstream glibc.
> - execve was rather broken in v1.
> - It's quite a bit faster now (the optimizations at the end are mostly new).
> - int80 on 64-bit no longer clobbers extra regs (thanks Denys!).
> - The uaccess stuff is new.
> - Lots of other things that I forgot, I'm sure.
>
> Andy Lutomirski (36):
> x86/uaccess: Tell the compiler that uaccess is unlikely to fault
> x86/uaccess: __chk_range_not_ok is unlikely to return true
> selftests/x86: Add a test for vDSO unwinding
> selftests/x86: Add a test for syscall restart and arg modification
> x86/entry/64/compat: Fix SYSENTER's NT flag before user memory access
> x86/entry: Move lockdep_sys_exit to prepare_exit_to_usermode
> x86/entry/64/compat: After SYSENTER, move STI after the NT fixup
> x86/vdso: Remove runtime 32-bit vDSO selection
> x86/asm: Re-add manual CFI infrastructure
> x86/vdso: Define BUILD_VDSO while building and emit .eh_frame in asm
> x86/vdso: Replace hex int80 CFI annotations with gas directives
> x86/elf/64: Clear more registers in elf_common_init
> x86/vdso/32: Save extra registers in the INT80 vsyscall path
> x86/entry/64/compat: Disable SYSENTER and SYSCALL32 entries
> x86/entry/64/compat: Remove audit optimizations
> x86/entry/64/compat: Remove most of the fast system call machinery
> x86/entry/64/compat: Set up full pt_regs for all compat syscalls
> x86/entry/syscalls: Move syscall table declarations into asm/syscalls.h
> x86/syscalls: Give sys_call_ptr_t a useful type
> x86/entry: Add do_syscall_32, a C function to do 32-bit syscalls
> x86/entry/64/compat: Migrate the body of the syscall entry to C
> x86/entry: Add C code for fast system call entries
> x86/vdso/compat: Wire up SYSENTER and SYSCALL for compat userspace
> x86/entry/compat: Implement opportunistic SYSRETL for compat syscalls
> x86/entry/32: Open-code return tracking from fork and kthreads
> x86/entry/32: Switch INT80 to the new C syscall path
> x86/entry/32: Re-implement SYSENTER using the new C path
> x86/asm: Remove thread_info.sysenter_return
> x86/entry: Remove unnecessary IRQ twiddling in fast 32-bit syscalls
> x86/entry: Make irqs_disabled checks in exit code depend on lockdep
> x86/entry: Force inlining of 32-bit syscall code
> x86/entry: Micro-optimize compat fast syscall arg fetch
> x86/entry: Hide two syscall entry assertions behind CONFIG_DEBUG_ENTRY
> x86/entry: Use pt_regs_to_thread_info() in syscall entry tracing
> x86/entry: Split and inline prepare_exit_to_usermode
> x86/entry: Split and inline syscall_return_slowpath
>
> arch/x86/Makefile | 10 +-
> arch/x86/entry/common.c | 255 ++++++++--
> arch/x86/entry/entry_32.S | 184 +++----
> arch/x86/entry/entry_64.S | 9 +-
> arch/x86/entry/entry_64_compat.S | 541 +++++----------------
> arch/x86/entry/syscall_32.c | 9 +-
> arch/x86/entry/syscall_64.c | 4 +-
> arch/x86/entry/syscalls/syscall_32.tbl | 12 +-
> arch/x86/entry/vdso/Makefile | 39 +-
> arch/x86/entry/vdso/vdso2c.c | 2 +-
> arch/x86/entry/vdso/vdso32-setup.c | 28 +-
> arch/x86/entry/vdso/vdso32/int80.S | 56 ---
> arch/x86/entry/vdso/vdso32/syscall.S | 75 ---
> arch/x86/entry/vdso/vdso32/sysenter.S | 116 -----
> arch/x86/entry/vdso/vdso32/system_call.S | 57 +++
> arch/x86/entry/vdso/vma.c | 13 +-
> arch/x86/ia32/ia32_signal.c | 4 +-
> arch/x86/include/asm/dwarf2.h | 177 +++++++
> arch/x86/include/asm/elf.h | 10 +-
> arch/x86/include/asm/syscall.h | 14 +-
> arch/x86/include/asm/thread_info.h | 1 -
> arch/x86/include/asm/uaccess.h | 14 +-
> arch/x86/include/asm/vdso.h | 10 +-
> arch/x86/kernel/asm-offsets.c | 3 -
> arch/x86/kernel/signal.c | 4 +-
> arch/x86/um/sys_call_table_32.c | 7 +-
> arch/x86/um/sys_call_table_64.c | 7 +-
> arch/x86/xen/setup.c | 13 +-
> tools/testing/selftests/x86/Makefile | 5 +-
> tools/testing/selftests/x86/ptrace_syscall.c | 294 +++++++++++
> .../testing/selftests/x86/raw_syscall_helper_32.S | 46 ++
> tools/testing/selftests/x86/unwind_vdso.c | 209 ++++++++
> 32 files changed, 1258 insertions(+), 970 deletions(-)
> delete mode 100644 arch/x86/entry/vdso/vdso32/int80.S
> delete mode 100644 arch/x86/entry/vdso/vdso32/syscall.S
> delete mode 100644 arch/x86/entry/vdso/vdso32/sysenter.S
> create mode 100644 arch/x86/entry/vdso/vdso32/system_call.S
> create mode 100644 arch/x86/include/asm/dwarf2.h
> create mode 100644 tools/testing/selftests/x86/ptrace_syscall.c
> create mode 100644 tools/testing/selftests/x86/raw_syscall_helper_32.S
> create mode 100644 tools/testing/selftests/x86/unwind_vdso.c
Ok, so I applied all of them to tip:x86/asm, in two phases, with small (stylistic)
edits - it all seems to work fine for me so far, so I pushed it all out to -tip
and linux-next.
Thanks,
Ingo
Thread overview: 124+ messages
2015-10-06 0:47 [PATCH v2 00/36] x86: Rewrite all syscall entries except native 64-bit Andy Lutomirski
2015-10-06 0:47 ` [PATCH v2 01/36] x86/uaccess: Tell the compiler that uaccess is unlikely to fault Andy Lutomirski
2015-10-07 16:15 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:47 ` [PATCH v2 02/36] x86/uaccess: __chk_range_not_ok is unlikely to return true Andy Lutomirski
2015-10-07 10:59 ` Borislav Petkov
2015-10-07 16:23 ` Ingo Molnar
2015-10-07 16:16 ` [tip:x86/asm] x86/uaccess: Add unlikely() to __chk_range_not_ok() failure paths tip-bot for Andy Lutomirski
2015-10-06 0:47 ` [PATCH v2 03/36] selftests/x86: Add a test for vDSO unwinding Andy Lutomirski
2015-10-07 16:16 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:47 ` [PATCH v2 04/36] selftests/x86: Add a test for syscall restart and arg modification Andy Lutomirski
2015-10-07 16:17 ` [tip:x86/asm] selftests/x86: Add a test for ptrace " tip-bot for Andy Lutomirski
2015-10-06 0:47 ` [PATCH v2 05/36] x86/entry/64/compat: Fix SYSENTER's NT flag before user memory access Andy Lutomirski
2015-10-07 11:10 ` Borislav Petkov
2015-10-07 14:33 ` Brian Gerst
2015-10-07 15:05 ` Borislav Petkov
2015-10-09 17:08 ` [PATCH] x86/entry/64/compat: Document sysenter_fix_flags's reason for existence Borislav Petkov
2015-10-09 19:06 ` Andy Lutomirski
2015-10-11 9:09 ` [tip:x86/asm] x86/entry/64/compat: Document sysenter_fix_flags's " tip-bot for Borislav Petkov
2015-10-07 16:17 ` [tip:x86/asm] x86/entry/64/compat: Fix SYSENTER's NT flag before user memory access tip-bot for Andy Lutomirski
2015-10-06 0:47 ` [PATCH v2 06/36] x86/entry: Move lockdep_sys_exit to prepare_exit_to_usermode Andy Lutomirski
2015-10-07 16:17 ` [tip:x86/asm] x86/entry, locking/lockdep: Move lockdep_sys_exit() to prepare_exit_to_usermode() tip-bot for Andy Lutomirski
2015-10-08 8:59 ` Peter Zijlstra
2015-10-09 19:34 ` Andy Lutomirski
2015-10-06 0:47 ` [PATCH v2 07/36] x86/entry/64/compat: After SYSENTER, move STI after the NT fixup Andy Lutomirski
2015-10-07 16:18 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-07 17:39 ` [PATCH v2 07/36] " Denys Vlasenko
2015-10-07 19:02 ` Andy Lutomirski
2015-10-09 19:48 ` Andy Lutomirski
2015-10-12 17:48 ` Denys Vlasenko
2015-10-12 18:11 ` Brian Gerst
2015-10-06 0:47 ` [PATCH v2 08/36] x86/vdso: Remove runtime 32-bit vDSO selection Andy Lutomirski
2015-10-07 16:18 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-07 17:00 ` Brian Gerst
2015-10-06 0:47 ` [PATCH v2 09/36] x86/asm: Re-add manual CFI infrastructure Andy Lutomirski
2015-10-06 8:23 ` Ingo Molnar
2015-10-06 18:21 ` Andy Lutomirski
2015-10-08 13:11 ` Borislav Petkov
2015-10-08 14:14 ` Ingo Molnar
2015-10-09 13:06 ` [tip:x86/asm] x86/asm: Re-add parts of the " tip-bot for Andy Lutomirski
2015-10-06 0:47 ` [PATCH v2 10/36] x86/vdso: Define BUILD_VDSO while building and emit .eh_frame in asm Andy Lutomirski
2015-10-09 7:21 ` Ingo Molnar
2015-10-09 13:07 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:47 ` [PATCH v2 11/36] x86/vdso: Replace hex int80 CFI annotations with gas directives Andy Lutomirski
2015-10-09 13:07 ` [tip:x86/asm] x86/vdso: Replace hex int80 CFI annotations with GAS directives tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 12/36] x86/elf/64: Clear more registers in elf_common_init Andy Lutomirski
2015-10-09 13:08 ` [tip:x86/asm] x86/elf/64: Clear more registers in elf_common_init() tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 13/36] x86/vdso/32: Save extra registers in the INT80 vsyscall path Andy Lutomirski
2015-10-09 13:08 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 14/36] x86/entry/64/compat: Disable SYSENTER and SYSCALL32 entries Andy Lutomirski
2015-10-08 15:41 ` Borislav Petkov
2015-10-09 19:11 ` Andy Lutomirski
2015-10-09 13:08 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 15/36] x86/entry/64/compat: Remove audit optimizations Andy Lutomirski
2015-10-09 13:09 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 16/36] x86/entry/64/compat: Remove most of the fast system call machinery Andy Lutomirski
2015-10-09 13:09 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 17/36] x86/entry/64/compat: Set up full pt_regs for all compat syscalls Andy Lutomirski
2015-10-09 13:09 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 18/36] x86/entry/syscalls: Move syscall table declarations into asm/syscalls.h Andy Lutomirski
2015-10-09 13:10 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 19/36] x86/syscalls: Give sys_call_ptr_t a useful type Andy Lutomirski
2015-10-09 13:10 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 20/36] x86/entry: Add do_syscall_32, a C function to do 32-bit syscalls Andy Lutomirski
2015-10-09 13:10 ` [tip:x86/asm] x86/entry: Add do_syscall_32(), " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 21/36] x86/entry/64/compat: Migrate the body of the syscall entry to C Andy Lutomirski
2015-10-09 13:11 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 22/36] x86/entry: Add C code for fast system call entries Andy Lutomirski
2015-10-06 8:25 ` Linus Torvalds
2015-10-06 8:29 ` Linus Torvalds
2015-10-06 18:25 ` Andy Lutomirski
2015-10-09 13:11 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 23/36] x86/vdso/compat: Wire up SYSENTER and SYSCALL for compat userspace Andy Lutomirski
2015-10-09 13:11 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 24/36] x86/entry/compat: Implement opportunistic SYSRETL for compat syscalls Andy Lutomirski
2015-10-09 13:12 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-12 16:15 ` [PATCH v2 24/36] " Borislav Petkov
2015-10-14 16:25 ` Andy Lutomirski
2015-10-14 16:31 ` Borislav Petkov
2015-10-06 0:48 ` [PATCH v2 25/36] x86/entry/32: Open-code return tracking from fork and kthreads Andy Lutomirski
2015-10-09 13:12 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 26/36] x86/entry/32: Switch INT80 to the new C syscall path Andy Lutomirski
2015-10-09 13:12 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-15 18:09 ` Borislav Petkov
2015-10-15 19:09 ` Andy Lutomirski
2015-10-16 10:52 ` Borislav Petkov
[not found] ` <20151016140502.GG31612@pd.tnic>
2015-10-16 15:57 ` Andy Lutomirski
2015-10-16 17:14 ` Borislav Petkov
2015-10-16 15:59 ` Andy Lutomirski
2015-10-16 17:34 ` Borislav Petkov
2015-10-16 18:22 ` Brian Gerst
2015-10-16 18:32 ` Andy Lutomirski
2015-10-16 19:36 ` Brian Gerst
2015-10-06 0:48 ` [PATCH v2 27/36] x86/entry/32: Re-implement SYSENTER using the new C path Andy Lutomirski
2015-10-07 18:08 ` Denys Vlasenko
2015-10-07 19:06 ` Andy Lutomirski
2015-10-09 13:13 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 28/36] x86/asm: Remove thread_info.sysenter_return Andy Lutomirski
2015-10-09 13:13 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 29/36] x86/entry: Remove unnecessary IRQ twiddling in fast 32-bit syscalls Andy Lutomirski
2015-10-09 13:13 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 30/36] x86/entry: Make irqs_disabled checks in exit code depend on lockdep Andy Lutomirski
2015-10-09 13:14 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 31/36] x86/entry: Force inlining of 32-bit syscall code Andy Lutomirski
2015-10-09 13:14 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 32/36] x86/entry: Micro-optimize compat fast syscall arg fetch Andy Lutomirski
2015-10-09 7:32 ` Ingo Molnar
2015-10-09 19:28 ` Andy Lutomirski
2015-10-10 9:05 ` Ingo Molnar
2015-10-09 13:14 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 33/36] x86/entry: Hide two syscall entry assertions behind CONFIG_DEBUG_ENTRY Andy Lutomirski
2015-10-09 13:15 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 34/36] x86/entry: Use pt_regs_to_thread_info() in syscall entry tracing Andy Lutomirski
2015-10-09 13:15 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 35/36] x86/entry: Split and inline prepare_exit_to_usermode Andy Lutomirski
2015-10-09 13:15 ` [tip:x86/asm] x86/entry: Split and inline prepare_exit_to_usermode() tip-bot for Andy Lutomirski
2015-10-06 0:48 ` [PATCH v2 36/36] x86/entry: Split and inline syscall_return_slowpath Andy Lutomirski
2015-10-09 13:16 ` [tip:x86/asm] x86/entry: Split and inline syscall_return_slowpath() tip-bot for Andy Lutomirski
2015-10-06 8:39 ` [PATCH v2 00/36] x86: Rewrite all syscall entries except native 64-bit Linus Torvalds
2015-10-06 8:49 ` Ingo Molnar
2015-10-06 18:26 ` Andy Lutomirski
2015-10-09 13:06 ` Ingo Molnar [this message]
2015-10-12 18:30 ` Richard Weinberger
2015-10-12 18:41 ` Andy Lutomirski
2015-10-12 21:02 ` Richard Weinberger