linux-kernel.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@kernel.org>
To: Andy Lutomirski <luto@kernel.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Brian Gerst <brgerst@gmail.com>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH v2 00/36] x86: Rewrite all syscall entries except native 64-bit
Date: Fri, 9 Oct 2015 15:06:54 +0200
Message-ID: <20151009130654.GA10456@gmail.com>
In-Reply-To: <cover.1444091584.git.luto@kernel.org>


* Andy Lutomirski <luto@kernel.org> wrote:

> The first two patches are optimizations that I'm surprised we didn't
> already have.  I noticed them when I was looking at the generated
> asm.
> 
> The next two patches are tests and some old stuff.  There's a test
> that validates the vDSO AT_SYSINFO annotations.  There's also a test
> that exercises some assumptions that signal handling and ptracers
> make about syscalls that currently do *not* hold on 64-bit AMD using
> 32-bit AT_SYSINFO.
> 
> The next three patches are NT cleanups and a lockdep cleanup.
> 
> It may pay to apply the beginning of the series (at most through
> "x86/entry/64/compat: After SYSENTER, move STI after the NT fixup")
> without waiting for everyone to wrap their heads around the rest.
> 
> The rest is basically a rewrite of syscalls for all cases except
> 64-bit native.  With these patches applied, there is a single 32-bit
> vDSO and it uses SYSCALL, SYSENTER, and INT80 almost interchangeably
> via alternatives.  The semantics of SYSENTER and SYSCALL are defined
> as:
> 
>  1. If SYSCALL, ESP = ECX
>  2. ECX = *ESP
>  3. IP = INT80 landing pad
>  4. Opportunistic SYSRET/SYSEXIT is enabled on return
> 
> The vDSO is rearranged so that these semantics work.  Anything that
> backs IP up by 2 ends up pointing at a bona fide int $0x80
> instruction with the expected regs.
> 
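
The "backs IP up by 2" property above can be checked mechanically. A
minimal sketch, assuming only that "int $0x80" encodes as the two bytes
CD 80; the function name and the byte-array model of the vDSO text are
hypothetical and not taken from the patches:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* "int $0x80" is a two-byte instruction: opcode CD, immediate 80. */
#define DEMO_INT80_LEN 2

/*
 * Hypothetical helper: given a copy of the vDSO text and the offset of
 * the INT80 landing pad, report whether backing the IP up by two bytes
 * lands exactly on an "int $0x80" instruction.
 */
bool demo_restart_hits_int80(const uint8_t *text, size_t len,
                             size_t landing_pad)
{
    assert(text != NULL);

    if (landing_pad < DEMO_INT80_LEN || landing_pad > len)
        return false;

    size_t restart_ip = landing_pad - DEMO_INT80_LEN;

    return text[restart_ip] == 0xcd && text[restart_ip + 1] == 0x80;
}
```

A syscall-restart path that rewinds the saved IP by two bytes relies on
exactly this layout, whichever instruction was used to enter the kernel.
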
> In the process, the vDSO CFI annotations (which are actually used)
> get rewritten using normal CFI directives.
> 
> Opportunistic SYSRET/SYSEXIT only happens on return when CS and SS
> are as expected, IP points to the INT80 landing pad, and flags are
> in good shape.  (There is no longer any assumption that full
> fast-path 32-bit syscalls don't muck with the registers that matter
> for fast exits -- I played with maintaining an optimization like
> that with poor results.  I may try again if it saves a few cycles.)
> 
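
The exit-time conditions above amount to a single predicate. A rough
sketch, not the code from the series: the struct, the function name and
the selector values are placeholders, and reading "flags in good shape"
as "TF and RF both clear" is an assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder selector values, for illustration only. */
#define DEMO_USER_CS 0x23u
#define DEMO_USER_SS 0x2bu

/* EFLAGS bits: TF is bit 8, RF is bit 16. */
#define DEMO_FLAG_TF (1u << 8)
#define DEMO_FLAG_RF (1u << 16)

/* Hypothetical subset of the saved user register frame. */
struct demo_exit_regs {
    uint32_t ip;
    uint32_t cs;
    uint32_t flags;
    uint32_t ss;
};

/*
 * Return true if the fast exit (SYSRET/SYSEXIT) may be used; otherwise
 * the caller falls back to the slower IRET-style return.
 */
bool demo_can_fast_return(const struct demo_exit_regs *regs,
                          uint32_t landing_pad)
{
    assert(regs != NULL);

    return regs->cs == DEMO_USER_CS &&
           regs->ss == DEMO_USER_SS &&
           regs->ip == landing_pad &&
           (regs->flags & (DEMO_FLAG_TF | DEMO_FLAG_RF)) == 0;
}
```

Because the check runs on every return, nothing has to be assumed about
what the syscall body did to the saved registers.
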
> Other than that, the system call entries are simplified to the bare
> minimum prologue and a call to a C function.  Amusingly, SYSENTER
> and SYSCALL32 use the same C function.
> 
> To make that work, I had to remove all the 32-bit syscall stubs
> except the clone argument hack.  This is because, for C code to call
> through the system call table, the system call table entries need to
> be real function pointers with C-compatible ABIs.
> 
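
To illustrate why the table entries need to be real function pointers:
once every entry shares one C-callable signature, the dispatch becomes
an ordinary indexed call. A userspace sketch with made-up names (the
demo_* identifiers, the register struct and the two handlers are all
hypothetical, not the kernel's definitions):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* One common, C-compatible signature for every table entry. */
typedef long (*demo_sys_call_ptr_t)(unsigned long, unsigned long,
                                    unsigned long, unsigned long,
                                    unsigned long, unsigned long);

/* Hypothetical register snapshot: ax holds the syscall number on entry. */
struct demo_regs {
    long ax;
    unsigned long bx, cx, dx, si, di, bp;
};

static long demo_sys_answer(unsigned long a, unsigned long b, unsigned long c,
                            unsigned long d, unsigned long e, unsigned long f)
{
    (void)a; (void)b; (void)c; (void)d; (void)e; (void)f;
    return 42;
}

static long demo_sys_add(unsigned long a, unsigned long b, unsigned long c,
                         unsigned long d, unsigned long e, unsigned long f)
{
    (void)c; (void)d; (void)e; (void)f;
    return (long)(a + b);
}

static const demo_sys_call_ptr_t demo_sys_call_table[] = {
    [0] = demo_sys_answer,
    [1] = demo_sys_add,
};

#define DEMO_NR_SYSCALLS \
    (sizeof(demo_sys_call_table) / sizeof(demo_sys_call_table[0]))

/* Look up the handler by number and call it through the table. */
long demo_do_syscall_32(struct demo_regs *regs)
{
    assert(regs != NULL);

    /* A negative number becomes a huge unsigned value and is rejected. */
    unsigned long nr = (unsigned long)regs->ax;

    if (nr < DEMO_NR_SYSCALLS && demo_sys_call_table[nr] != NULL)
        regs->ax = demo_sys_call_table[nr](regs->bx, regs->cx, regs->dx,
                                           regs->si, regs->di, regs->bp);
    else
        regs->ax = -ENOSYS;

    return regs->ax;
}
```

An assembly stub with a private calling convention could not sit in
such a table, which is the constraint described above.
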
> There is nothing at all anymore that requires that x86_32 syscalls
> be asmlinkage.  That could be removed in a subsequent patch.
> 
> The upshot appears to be a ~16 cycle performance hit on 32-bit fast
> path syscalls.  (On my system, my little prctl test takes 172 cycles
> before and 188 cycles with these patches applied.)
> 
> The slow path is probably faster under most circumstances and, if
> the exit slow path gets hit, it'll be much faster because (as we
> already do in the 64-bit native case) we can still use
> SYSEXIT/SYSRET.
> 
> The patchset is structured as a removal of the old fast syscall
> code, then the change that makes syscalls into real functions, then
> a clean re-implementation of fast syscalls.
> 
> If we want some of the 16 cycles back, we could consider open-coding
> a new C fast path.
> 
> Changes from v1:
>  - The unwind_vdso_32 test now warns on broken Debian installations
>    instead of failing.  The problem is now fully understood, will
>    be fixed by Debian and possibly also fixed by upstream glibc.
>  - execve was rather broken in v1.
>  - It's quite a bit faster now (the optimizations at the end are mostly new).
>  - int80 on 64-bit no longer clobbers extra regs (thanks Denys!).
>  - The uaccess stuff is new.
>  - Lots of other things that I forgot, I'm sure.
> 
> Andy Lutomirski (36):
>   x86/uaccess: Tell the compiler that uaccess is unlikely to fault
>   x86/uaccess: __chk_range_not_ok is unlikely to return true
>   selftests/x86: Add a test for vDSO unwinding
>   selftests/x86: Add a test for syscall restart and arg modification
>   x86/entry/64/compat: Fix SYSENTER's NT flag before user memory access
>   x86/entry: Move lockdep_sys_exit to prepare_exit_to_usermode
>   x86/entry/64/compat: After SYSENTER, move STI after the NT fixup
>   x86/vdso: Remove runtime 32-bit vDSO selection
>   x86/asm: Re-add manual CFI infrastructure
>   x86/vdso: Define BUILD_VDSO while building and emit .eh_frame in asm
>   x86/vdso: Replace hex int80 CFI annotations with gas directives
>   x86/elf/64: Clear more registers in elf_common_init
>   x86/vdso/32: Save extra registers in the INT80 vsyscall path
>   x86/entry/64/compat: Disable SYSENTER and SYSCALL32 entries
>   x86/entry/64/compat: Remove audit optimizations
>   x86/entry/64/compat: Remove most of the fast system call machinery
>   x86/entry/64/compat: Set up full pt_regs for all compat syscalls
>   x86/entry/syscalls: Move syscall table declarations into
>     asm/syscalls.h
>   x86/syscalls: Give sys_call_ptr_t a useful type
>   x86/entry: Add do_syscall_32, a C function to do 32-bit syscalls
>   x86/entry/64/compat: Migrate the body of the syscall entry to C
>   x86/entry: Add C code for fast system call entries
>   x86/vdso/compat: Wire up SYSENTER and SYSCALL for compat userspace
>   x86/entry/compat: Implement opportunistic SYSRETL for compat syscalls
>   x86/entry/32: Open-code return tracking from fork and kthreads
>   x86/entry/32: Switch INT80 to the new C syscall path
>   x86/entry/32: Re-implement SYSENTER using the new C path
>   x86/asm: Remove thread_info.sysenter_return
>   x86/entry: Remove unnecessary IRQ twiddling in fast 32-bit syscalls
>   x86/entry: Make irqs_disabled checks in exit code depend on lockdep
>   x86/entry: Force inlining of 32-bit syscall code
>   x86/entry: Micro-optimize compat fast syscall arg fetch
>   x86/entry: Hide two syscall entry assertions behind CONFIG_DEBUG_ENTRY
>   x86/entry: Use pt_regs_to_thread_info() in syscall entry tracing
>   x86/entry: Split and inline prepare_exit_to_usermode
>   x86/entry: Split and inline syscall_return_slowpath
> 
>  arch/x86/Makefile                                  |  10 +-
>  arch/x86/entry/common.c                            | 255 ++++++++--
>  arch/x86/entry/entry_32.S                          | 184 +++----
>  arch/x86/entry/entry_64.S                          |   9 +-
>  arch/x86/entry/entry_64_compat.S                   | 541 +++++----------------
>  arch/x86/entry/syscall_32.c                        |   9 +-
>  arch/x86/entry/syscall_64.c                        |   4 +-
>  arch/x86/entry/syscalls/syscall_32.tbl             |  12 +-
>  arch/x86/entry/vdso/Makefile                       |  39 +-
>  arch/x86/entry/vdso/vdso2c.c                       |   2 +-
>  arch/x86/entry/vdso/vdso32-setup.c                 |  28 +-
>  arch/x86/entry/vdso/vdso32/int80.S                 |  56 ---
>  arch/x86/entry/vdso/vdso32/syscall.S               |  75 ---
>  arch/x86/entry/vdso/vdso32/sysenter.S              | 116 -----
>  arch/x86/entry/vdso/vdso32/system_call.S           |  57 +++
>  arch/x86/entry/vdso/vma.c                          |  13 +-
>  arch/x86/ia32/ia32_signal.c                        |   4 +-
>  arch/x86/include/asm/dwarf2.h                      | 177 +++++++
>  arch/x86/include/asm/elf.h                         |  10 +-
>  arch/x86/include/asm/syscall.h                     |  14 +-
>  arch/x86/include/asm/thread_info.h                 |   1 -
>  arch/x86/include/asm/uaccess.h                     |  14 +-
>  arch/x86/include/asm/vdso.h                        |  10 +-
>  arch/x86/kernel/asm-offsets.c                      |   3 -
>  arch/x86/kernel/signal.c                           |   4 +-
>  arch/x86/um/sys_call_table_32.c                    |   7 +-
>  arch/x86/um/sys_call_table_64.c                    |   7 +-
>  arch/x86/xen/setup.c                               |  13 +-
>  tools/testing/selftests/x86/Makefile               |   5 +-
>  tools/testing/selftests/x86/ptrace_syscall.c       | 294 +++++++++++
>  .../testing/selftests/x86/raw_syscall_helper_32.S  |  46 ++
>  tools/testing/selftests/x86/unwind_vdso.c          | 209 ++++++++
>  32 files changed, 1258 insertions(+), 970 deletions(-)
>  delete mode 100644 arch/x86/entry/vdso/vdso32/int80.S
>  delete mode 100644 arch/x86/entry/vdso/vdso32/syscall.S
>  delete mode 100644 arch/x86/entry/vdso/vdso32/sysenter.S
>  create mode 100644 arch/x86/entry/vdso/vdso32/system_call.S
>  create mode 100644 arch/x86/include/asm/dwarf2.h
>  create mode 100644 tools/testing/selftests/x86/ptrace_syscall.c
>  create mode 100644 tools/testing/selftests/x86/raw_syscall_helper_32.S
>  create mode 100644 tools/testing/selftests/x86/unwind_vdso.c

Ok, so I applied all of them to tip:x86/asm, in two phases, with small (stylistic) 
edits - it all seems to work fine for me so far, so I pushed it all out to -tip 
and linux-next.

Thanks,

	Ingo


Thread overview: 124+ messages
2015-10-06  0:47 [PATCH v2 00/36] x86: Rewrite all syscall entries except native 64-bit Andy Lutomirski
2015-10-06  0:47 ` [PATCH v2 01/36] x86/uaccess: Tell the compiler that uaccess is unlikely to fault Andy Lutomirski
2015-10-07 16:15   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:47 ` [PATCH v2 02/36] x86/uaccess: __chk_range_not_ok is unlikely to return true Andy Lutomirski
2015-10-07 10:59   ` Borislav Petkov
2015-10-07 16:23     ` Ingo Molnar
2015-10-07 16:16   ` [tip:x86/asm] x86/uaccess: Add unlikely() to __chk_range_not_ok() failure paths tip-bot for Andy Lutomirski
2015-10-06  0:47 ` [PATCH v2 03/36] selftests/x86: Add a test for vDSO unwinding Andy Lutomirski
2015-10-07 16:16   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:47 ` [PATCH v2 04/36] selftests/x86: Add a test for syscall restart and arg modification Andy Lutomirski
2015-10-07 16:17   ` [tip:x86/asm] selftests/x86: Add a test for ptrace " tip-bot for Andy Lutomirski
2015-10-06  0:47 ` [PATCH v2 05/36] x86/entry/64/compat: Fix SYSENTER's NT flag before user memory access Andy Lutomirski
2015-10-07 11:10   ` Borislav Petkov
2015-10-07 14:33     ` Brian Gerst
2015-10-07 15:05       ` Borislav Petkov
2015-10-09 17:08         ` [PATCH] x86/entry/64/compat: Document sysenter_fix_flags's reason for existence Borislav Petkov
2015-10-09 19:06           ` Andy Lutomirski
2015-10-11  9:09           ` [tip:x86/asm] x86/entry/64/compat: Document sysenter_fix_flags's " tip-bot for Borislav Petkov
2015-10-07 16:17   ` [tip:x86/asm] x86/entry/64/compat: Fix SYSENTER's NT flag before user memory access tip-bot for Andy Lutomirski
2015-10-06  0:47 ` [PATCH v2 06/36] x86/entry: Move lockdep_sys_exit to prepare_exit_to_usermode Andy Lutomirski
2015-10-07 16:17   ` [tip:x86/asm] x86/entry, locking/lockdep: Move lockdep_sys_exit() to prepare_exit_to_usermode() tip-bot for Andy Lutomirski
2015-10-08  8:59     ` Peter Zijlstra
2015-10-09 19:34       ` Andy Lutomirski
2015-10-06  0:47 ` [PATCH v2 07/36] x86/entry/64/compat: After SYSENTER, move STI after the NT fixup Andy Lutomirski
2015-10-07 16:18   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-07 17:39   ` [PATCH v2 07/36] " Denys Vlasenko
2015-10-07 19:02     ` Andy Lutomirski
2015-10-09 19:48     ` Andy Lutomirski
2015-10-12 17:48       ` Denys Vlasenko
2015-10-12 18:11         ` Brian Gerst
2015-10-06  0:47 ` [PATCH v2 08/36] x86/vdso: Remove runtime 32-bit vDSO selection Andy Lutomirski
2015-10-07 16:18   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-07 17:00     ` Brian Gerst
2015-10-06  0:47 ` [PATCH v2 09/36] x86/asm: Re-add manual CFI infrastructure Andy Lutomirski
2015-10-06  8:23   ` Ingo Molnar
2015-10-06 18:21     ` Andy Lutomirski
2015-10-08 13:11       ` Borislav Petkov
2015-10-08 14:14         ` Ingo Molnar
2015-10-09 13:06   ` [tip:x86/asm] x86/asm: Re-add parts of the " tip-bot for Andy Lutomirski
2015-10-06  0:47 ` [PATCH v2 10/36] x86/vdso: Define BUILD_VDSO while building and emit .eh_frame in asm Andy Lutomirski
2015-10-09  7:21   ` Ingo Molnar
2015-10-09 13:07   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:47 ` [PATCH v2 11/36] x86/vdso: Replace hex int80 CFI annotations with gas directives Andy Lutomirski
2015-10-09 13:07   ` [tip:x86/asm] x86/vdso: Replace hex int80 CFI annotations with GAS directives tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 12/36] x86/elf/64: Clear more registers in elf_common_init Andy Lutomirski
2015-10-09 13:08   ` [tip:x86/asm] x86/elf/64: Clear more registers in elf_common_init() tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 13/36] x86/vdso/32: Save extra registers in the INT80 vsyscall path Andy Lutomirski
2015-10-09 13:08   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 14/36] x86/entry/64/compat: Disable SYSENTER and SYSCALL32 entries Andy Lutomirski
2015-10-08 15:41   ` Borislav Petkov
2015-10-09 19:11     ` Andy Lutomirski
2015-10-09 13:08   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 15/36] x86/entry/64/compat: Remove audit optimizations Andy Lutomirski
2015-10-09 13:09   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 16/36] x86/entry/64/compat: Remove most of the fast system call machinery Andy Lutomirski
2015-10-09 13:09   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 17/36] x86/entry/64/compat: Set up full pt_regs for all compat syscalls Andy Lutomirski
2015-10-09 13:09   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 18/36] x86/entry/syscalls: Move syscall table declarations into asm/syscalls.h Andy Lutomirski
2015-10-09 13:10   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 19/36] x86/syscalls: Give sys_call_ptr_t a useful type Andy Lutomirski
2015-10-09 13:10   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 20/36] x86/entry: Add do_syscall_32, a C function to do 32-bit syscalls Andy Lutomirski
2015-10-09 13:10   ` [tip:x86/asm] x86/entry: Add do_syscall_32(), " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 21/36] x86/entry/64/compat: Migrate the body of the syscall entry to C Andy Lutomirski
2015-10-09 13:11   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 22/36] x86/entry: Add C code for fast system call entries Andy Lutomirski
2015-10-06  8:25   ` Linus Torvalds
2015-10-06  8:29     ` Linus Torvalds
2015-10-06 18:25       ` Andy Lutomirski
2015-10-09 13:11   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 23/36] x86/vdso/compat: Wire up SYSENTER and SYSCALL for compat userspace Andy Lutomirski
2015-10-09 13:11   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 24/36] x86/entry/compat: Implement opportunistic SYSRETL for compat syscalls Andy Lutomirski
2015-10-09 13:12   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-12 16:15   ` [PATCH v2 24/36] " Borislav Petkov
2015-10-14 16:25     ` Andy Lutomirski
2015-10-14 16:31       ` Borislav Petkov
2015-10-06  0:48 ` [PATCH v2 25/36] x86/entry/32: Open-code return tracking from fork and kthreads Andy Lutomirski
2015-10-09 13:12   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 26/36] x86/entry/32: Switch INT80 to the new C syscall path Andy Lutomirski
2015-10-09 13:12   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-15 18:09     ` Borislav Petkov
2015-10-15 19:09       ` Andy Lutomirski
2015-10-16 10:52         ` Borislav Petkov
     [not found]           ` <20151016140502.GG31612@pd.tnic>
2015-10-16 15:57             ` Andy Lutomirski
2015-10-16 17:14               ` Borislav Petkov
2015-10-16 15:59           ` Andy Lutomirski
2015-10-16 17:34             ` Borislav Petkov
2015-10-16 18:22               ` Brian Gerst
2015-10-16 18:32                 ` Andy Lutomirski
2015-10-16 19:36                   ` Brian Gerst
2015-10-06  0:48 ` [PATCH v2 27/36] x86/entry/32: Re-implement SYSENTER using the new C path Andy Lutomirski
2015-10-07 18:08   ` Denys Vlasenko
2015-10-07 19:06     ` Andy Lutomirski
2015-10-09 13:13   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 28/36] x86/asm: Remove thread_info.sysenter_return Andy Lutomirski
2015-10-09 13:13   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 29/36] x86/entry: Remove unnecessary IRQ twiddling in fast 32-bit syscalls Andy Lutomirski
2015-10-09 13:13   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 30/36] x86/entry: Make irqs_disabled checks in exit code depend on lockdep Andy Lutomirski
2015-10-09 13:14   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 31/36] x86/entry: Force inlining of 32-bit syscall code Andy Lutomirski
2015-10-09 13:14   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 32/36] x86/entry: Micro-optimize compat fast syscall arg fetch Andy Lutomirski
2015-10-09  7:32   ` Ingo Molnar
2015-10-09 19:28     ` Andy Lutomirski
2015-10-10  9:05       ` Ingo Molnar
2015-10-09 13:14   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 33/36] x86/entry: Hide two syscall entry assertions behind CONFIG_DEBUG_ENTRY Andy Lutomirski
2015-10-09 13:15   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 34/36] x86/entry: Use pt_regs_to_thread_info() in syscall entry tracing Andy Lutomirski
2015-10-09 13:15   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 35/36] x86/entry: Split and inline prepare_exit_to_usermode Andy Lutomirski
2015-10-09 13:15   ` [tip:x86/asm] x86/entry: Split and inline prepare_exit_to_usermode() tip-bot for Andy Lutomirski
2015-10-06  0:48 ` [PATCH v2 36/36] x86/entry: Split and inline syscall_return_slowpath Andy Lutomirski
2015-10-09 13:16   ` [tip:x86/asm] x86/entry: Split and inline syscall_return_slowpath() tip-bot for Andy Lutomirski
2015-10-06  8:39 ` [PATCH v2 00/36] x86: Rewrite all syscall entries except native 64-bit Linus Torvalds
2015-10-06  8:49   ` Ingo Molnar
2015-10-06 18:26   ` Andy Lutomirski
2015-10-09 13:06 ` Ingo Molnar [this message]
2015-10-12 18:30   ` Richard Weinberger
2015-10-12 18:41     ` Andy Lutomirski
2015-10-12 21:02       ` Richard Weinberger
