From: Andy Lutomirski <luto@kernel.org>
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
Borislav Petkov <bp@alien8.de>, Nadav Amit <nadav.amit@gmail.com>,
Kees Cook <keescook@chromium.org>,
Brian Gerst <brgerst@gmail.com>,
"kernel-hardening@lists.openwall.com"
<kernel-hardening@lists.openwall.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Josh Poimboeuf <jpoimboe@redhat.com>, Jann Horn <jann@thejh.net>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
Andy Lutomirski <luto@kernel.org>
Subject: [PATCH v4 00/29] virtually mapped stacks and thread_info cleanup
Date: Sun, 26 Jun 2016 14:55:22 -0700
Message-ID: <cover.1466974736.git.luto@kernel.org>
Hi all-

Since the dawn of time, a kernel stack overflow has been a real PITA
to debug, has caused nondeterministic crashes some time after the
actual overflow, and has generally been easy to exploit for root.

With this series, arches can enable HAVE_ARCH_VMAP_STACK. Arches
that enable it (just x86 for now) get virtually mapped stacks with
guard pages. This causes reliable faults when the stack overflows.
If the arch implements it well, we get a nice OOPS on stack overflow
(as opposed to panicking directly or otherwise exploding badly). On
x86, the OOPS is nice, has a usable call trace, and the overflowing
task is killed cleanly.
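
The guard-page idea can be illustrated outside the kernel. The following
is a minimal user-space sketch, not the series' implementation: it uses
mmap()/mprotect() as a stand-in for vmalloc, and the names guarded_stack,
guarded_stack_alloc, guarded_stack_free and overflow_faults are
hypothetical helpers invented for this example. It assumes a POSIX system
where touching a PROT_NONE page raises SIGSEGV (or SIGBUS).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical descriptor for this sketch; not a kernel structure. */
struct guarded_stack {
	char *map;    /* start of the mapping, i.e. the guard page */
	char *base;   /* lowest usable stack address */
	size_t size;  /* usable bytes */
	size_t guard; /* guard bytes (one page) */
};

/* Map guard + stack, then revoke all access to the lowest page. */
static int guarded_stack_alloc(struct guarded_stack *gs, size_t size)
{
	long page = sysconf(_SC_PAGESIZE);

	assert(gs != NULL);
	if (page <= 0 || size == 0 || size % (size_t)page != 0)
		return -1;

	void *map = mmap(NULL, size + (size_t)page, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	/* Stacks grow down, so the guard sits below the usable region. */
	if (mprotect(map, (size_t)page, PROT_NONE) != 0) {
		munmap(map, size + (size_t)page);
		return -1;
	}

	gs->map = map;
	gs->base = gs->map + page;
	gs->size = size;
	gs->guard = (size_t)page;
	return 0;
}

static void guarded_stack_free(struct guarded_stack *gs)
{
	munmap(gs->map, gs->size + gs->guard);
}

/* Returns 1 if writing one byte below the stack kills a child process. */
static int overflow_faults(const struct guarded_stack *gs)
{
	int status = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		volatile char *p = gs->base - 1; /* last byte of the guard */
		*p = 0;                          /* expected to fault here */
		_exit(0);                        /* reached only if no fault */
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFSIGNALED(status) &&
	       (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
}
```

The point of the sketch is that the overrun is caught at the first byte
past the stack, at the moment it happens, rather than silently corrupting
whatever memory sits below it.
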
This series (starting with this version, v4) also extensively cleans
up thread_info. thread_info has been partially redundant with
thread_struct for a long time -- both are places for arch code to
add additional per-task variables. thread_struct is much cleaner:
it's always in task_struct, and there's nothing particularly magical
about it. So this series contains a bunch of cleanups on x86 to
move almost everything from thread_info to thread_struct (which,
even by itself, deletes more code than it adds) and to remove x86's
dependence on thread_info's position on the stack. Then it opts x86
into a new config option THREAD_INFO_IN_TASK to get rid of
arch-specific thread_info entirely and simply embed a defanged
thread_info (containing only flags) and 'int cpu' into task_struct.
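
As a rough illustration of the end state described above, here is a
simplified model in plain C. The struct layouts are stand-ins, not the
real kernel definitions; the helper names mirror kernel helpers, but
their bodies and the TIF_NEED_RESCHED bit value are assumptions made for
the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Defanged thread_info: only the flags word remains. */
struct thread_info {
	unsigned long flags;
};

/* Simplified stand-in; the real task_struct has many more fields. */
struct task_struct {
	struct thread_info thread_info; /* embedded, no longer on the stack */
	int cpu;                        /* formerly thread_info::cpu */
	void *stack;                    /* the stack is now just memory */
};

#define TIF_NEED_RESCHED 3 /* illustrative bit number for this sketch */

/* No stack-pointer masking is needed: thread_info lives in the task. */
static inline struct thread_info *task_thread_info(struct task_struct *t)
{
	return &t->thread_info;
}

static inline int task_cpu(const struct task_struct *t)
{
	return t->cpu;
}

static inline void set_tsk_thread_flag(struct task_struct *t, int flag)
{
	task_thread_info(t)->flags |= 1UL << flag;
}

static inline int test_tsk_thread_flag(struct task_struct *t, int flag)
{
	return (task_thread_info(t)->flags >> flag) & 1UL;
}
```

In this model nothing the rest of the system needs to find lives at a
fixed position on the stack, so the stack can be released or replaced
without touching the per-task flags or CPU number.
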
Once thread_info stops being magical, there's another benefit: we
can free the thread stack as soon as the task is dead (without
waiting for RCU) and then, if vmapped stacks are in use, cache the
entire stack for reuse on the same cpu.

This seems to be an overall speedup of about 0.5-1 µs per
pthread_create/join in a simple test -- a percpu cache of vmalloced
stacks appears to be a bit faster than a high-order stack
allocation, at least when the cache hits. (I expect that workloads
with a low cache hit rate are likely to be dominated by other
effects anyway.)
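
The caching idea can be sketched as a tiny fixed-size cache in front of
a slow allocator. This is a user-space model for one CPU only: malloc()
stands in for the vmalloc path, STACK_SIZE is an assumed value, and the
names stack_cache, stack_alloc and stack_free are hypothetical. It
ignores the preemption and concurrency concerns that real per-cpu code
has to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CACHED_STACKS 2     /* "two thread stacks per cpu" */
#define STACK_SIZE (16 * 1024) /* assumed size for this sketch */

/* One instance models the cache belonging to a single CPU. */
struct stack_cache {
	void *slot[NR_CACHED_STACKS];
	unsigned long hits, misses;
};

/* Reuse a cached stack if one is available, else take the slow path. */
static void *stack_alloc(struct stack_cache *c)
{
	for (int i = 0; i < NR_CACHED_STACKS; i++) {
		if (c->slot[i]) {
			void *s = c->slot[i];

			c->slot[i] = NULL;
			c->hits++;
			return s;
		}
	}
	c->misses++;
	return malloc(STACK_SIZE); /* stand-in for the vmalloc path */
}

/* Park the stack in an empty slot, or really free it if the cache is full. */
static void stack_free(struct stack_cache *c, void *stack)
{
	for (int i = 0; i < NR_CACHED_STACKS; i++) {
		if (!c->slot[i]) {
			c->slot[i] = stack;
			return;
		}
	}
	free(stack);
}
```

A create/exit/create pattern on the same CPU hits the cache on the
second allocation, which is the case the timing above refers to.
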
This does not address interrupt stacks.

It's worth noting that s390 has an arch-specific gcc feature that
detects stack overflows by adjusting function prologues. Arches
with features like that may wish to avoid using vmapped stacks to
minimize the performance hit.
Known issues:
 - tcp md5, virtio_net, and virtio_console will have issues. Eric Dumazet
   has a patch for tcp md5, and Michael Tsirkin says he'll fix virtio_net
   and virtio_console.

Changes from v3:
 - Minor cleanups
 - Rebased onto Linus' tree
 - All the thread_info stuff is new

Changes from v2:
 - Delete kernel_unmap_pages_in_pgd rather than hardening it (Borislav)
 - Fix sub-page stack accounting better (Josh)

Changes from v1:
 - Fix rewind_stack_and_do_exit (Josh)
 - Fix deadlock under load
 - Clean up generic stack vmalloc code
 - Many other minor fixes

Andy Lutomirski (25):
  bluetooth: Switch SMP to crypto_cipher_encrypt_one()
  x86/cpa: In populate_pgd, don't set the pgd entry until it's populated
  x86/mm: Remove kernel_unmap_pages_in_pgd() and
    efi_cleanup_page_tables()
  mm: Track NR_KERNEL_STACK in KiB instead of number of stacks
  mm: Fix memcg stack accounting for sub-page stacks
  dma-api: Teach the "DMA-from-stack" check about vmapped stacks
  fork: Add generic vmalloced stack support
  x86/die: Don't try to recover from an OOPS on a non-default stack
  x86/dumpstack: When OOPSing, rewind the stack before do_exit
  x86/dumpstack: When dumping stack bytes due to OOPS, start with
    regs->sp
  x86/dumpstack: Try harder to get a call trace on stack overflow
  x86/dumpstack/64: Handle faults when printing the "Stack:" part of an
    OOPS
  x86/mm/64: Enable vmapped stacks
  x86/mm: Improve stack-overflow #PF handling
  x86: Move uaccess_err and sig_on_uaccess_err to thread_struct
  x86: Move addr_limit to thread_struct
  signal: Consolidate {TS,TLF}_RESTORE_SIGMASK code
  x86/smp: Remove stack_smp_processor_id()
  x86/smp: Remove unnecessary initialization of thread_info::cpu
  x86/asm: Move 'status' from struct thread_info to struct thread_struct
  kdb: Use task_cpu() instead of task_thread_info()->cpu
  sched: Allow putting thread_info into task_struct
  x86: Move thread_info into task_struct
  sched: Free the stack early if CONFIG_THREAD_INFO_IN_TASK
  fork: Cache two thread stacks per cpu if CONFIG_VMAP_STACK is set

Herbert Xu (1):
  rxrpc: Avoid using stack memory in SG lists in rxkad

Ingo Molnar (1):
  x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()

Linus Torvalds (2):
  x86/entry: Get rid of pt_regs_to_thread_info()
  um: Stop conflating task_struct::stack with thread_info
arch/Kconfig | 29 ++++++
arch/alpha/include/asm/thread_info.h | 27 -----
arch/ia64/include/asm/thread_info.h | 30 +-----
arch/microblaze/include/asm/thread_info.h | 27 -----
arch/powerpc/include/asm/thread_info.h | 25 -----
arch/sh/include/asm/thread_info.h | 26 -----
arch/sparc/include/asm/thread_info_64.h | 24 -----
arch/tile/include/asm/thread_info.h | 27 -----
arch/x86/Kconfig | 2 +
arch/x86/entry/common.c | 25 ++---
arch/x86/entry/entry_32.S | 11 +++
arch/x86/entry/entry_64.S | 20 +++-
arch/x86/entry/vsyscall/vsyscall_64.c | 6 +-
arch/x86/include/asm/checksum_32.h | 3 +-
arch/x86/include/asm/cpu.h | 1 -
arch/x86/include/asm/efi.h | 1 -
arch/x86/include/asm/pgtable_types.h | 2 -
arch/x86/include/asm/processor.h | 32 ++++--
arch/x86/include/asm/smp.h | 6 --
arch/x86/include/asm/switch_to.h | 34 ++++++-
arch/x86/include/asm/syscall.h | 23 +----
arch/x86/include/asm/thread_info.h | 102 +------------------
arch/x86/include/asm/traps.h | 6 ++
arch/x86/include/asm/uaccess.h | 10 +-
arch/x86/kernel/asm-offsets.c | 5 +-
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/kernel/dumpstack.c | 20 +++-
arch/x86/kernel/dumpstack_32.c | 4 +-
arch/x86/kernel/dumpstack_64.c | 16 ++-
arch/x86/kernel/fpu/init.c | 1 -
arch/x86/kernel/irq_64.c | 3 +-
arch/x86/kernel/process.c | 6 +-
arch/x86/kernel/process_64.c | 4 +-
arch/x86/kernel/ptrace.c | 2 +-
arch/x86/kernel/smpboot.c | 1 -
arch/x86/kernel/traps.c | 32 ++++++
arch/x86/lib/copy_user_64.S | 8 +-
arch/x86/lib/csum-wrappers_64.c | 1 +
arch/x86/lib/getuser.S | 20 ++--
arch/x86/lib/putuser.S | 10 +-
arch/x86/lib/usercopy_64.c | 2 +-
arch/x86/mm/extable.c | 2 +-
arch/x86/mm/fault.c | 41 +++++++-
arch/x86/mm/init_64.c | 27 -----
arch/x86/mm/pageattr.c | 32 +-----
arch/x86/mm/tlb.c | 15 +++
arch/x86/platform/efi/efi.c | 2 -
arch/x86/platform/efi/efi_32.c | 3 -
arch/x86/platform/efi/efi_64.c | 5 -
arch/x86/um/ptrace_32.c | 8 +-
drivers/base/node.c | 3 +-
drivers/pnp/isapnp/proc.c | 2 +-
fs/proc/meminfo.c | 2 +-
include/linux/init_task.h | 9 ++
include/linux/kdb.h | 2 +-
include/linux/memcontrol.h | 2 +-
include/linux/mmzone.h | 2 +-
include/linux/sched.h | 115 +++++++++++++++++++++-
include/linux/thread_info.h | 56 +++--------
init/Kconfig | 3 +
init/init_task.c | 7 +-
kernel/fork.c | 158 +++++++++++++++++++++++++-----
kernel/sched/core.c | 9 ++
kernel/sched/sched.h | 4 +
lib/bitmap.c | 2 +-
lib/dma-debug.c | 39 ++++++--
mm/memcontrol.c | 2 +-
mm/page_alloc.c | 3 +-
net/bluetooth/smp.c | 67 ++++++-------
net/rxrpc/ar-internal.h | 1 +
net/rxrpc/rxkad.c | 103 ++++++++-----------
71 files changed, 714 insertions(+), 648 deletions(-)
--
2.7.4