linux-arch.vger.kernel.org archive mirror
From: Andy Lutomirski <luto@kernel.org>
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	Borislav Petkov <bp@alien8.de>, Nadav Amit <nadav.amit@gmail.com>,
	Kees Cook <keescook@chromium.org>,
	Brian Gerst <brgerst@gmail.com>,
	"kernel-hardening@lists.openwall.com"
	<kernel-hardening@lists.openwall.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>, Jann Horn <jann@thejh.net>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Andy Lutomirski <luto@kernel.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH v4 28/29] sched: Free the stack early if CONFIG_THREAD_INFO_IN_TASK
Date: Sun, 26 Jun 2016 14:55:50 -0700
Message-ID: <ec58e505925c46bd43f9c4275c78292d4483af16.1466974736.git.luto@kernel.org>
In-Reply-To: <cover.1466974736.git.luto@kernel.org>

We currently keep every task's stack around until the task_struct
itself is freed.  This means that we keep the stack allocation alive
for longer than necessary and that, under load, we free stacks in
big batches whenever RCU drops the last task reference.  Neither of
these is good for reuse of cache-hot memory, and freeing in batches
prevents us from usefully caching small numbers of vmalloced stacks.

On architectures that have thread_info on the stack, we can't easily
change this, but on architectures that set THREAD_INFO_IN_TASK, we
can free the stack as soon as the task is dead.

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
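Not part of the patch: a minimal user-space sketch of the lifetime
change described in the changelog, for illustration only.  All model_*
names, the 16 KiB stack size, and the thread_info_in_task flag are
hypothetical stand-ins.  The kernel selects the behavior at build time
with CONFIG_THREAD_INFO_IN_TASK, and WARN_ON_ONCE() is modeled here as
a simple counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's task_struct. */
struct model_task {
	void *stack;	/* stack allocation; NULL once released */
	bool dead;	/* models prev->state == TASK_DEAD */
	int refs;	/* models the task_struct reference count */
};

static int live_stacks;	/* stack allocations currently alive */
static int warnings;	/* times the modeled WARN_ON_ONCE() fired */

static struct model_task *model_task_create(void)
{
	struct model_task *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	t->stack = malloc(16384);
	if (!t->stack) {
		free(t);
		return NULL;
	}
	t->refs = 1;
	live_stacks++;
	return t;
}

static void model_get_task(struct model_task *t)
{
	t->refs++;
}

/* Mirrors release_task_stack(): free the stack and clear the pointer. */
static void model_release_task_stack(struct model_task *t)
{
	free(t->stack);
	t->stack = NULL;
	live_stacks--;
}

/* Mirrors free_task(): free the stack late, or check it is already gone. */
static void model_free_task(struct model_task *t, bool thread_info_in_task)
{
	if (!thread_info_in_task)
		model_release_task_stack(t);
	else if (t->stack)
		warnings++;	/* as in the patch, only warn; do not free */
	free(t);
}

static void model_put_task(struct model_task *t, bool thread_info_in_task)
{
	assert(t->refs > 0);
	if (--t->refs == 0)
		model_free_task(t, thread_info_in_task);
}

/* Mirrors the hunk in finish_task_switch(): early free for dead tasks. */
static void model_finish_task_switch(struct model_task *prev,
				     bool thread_info_in_task)
{
	if (!prev->dead)
		return;
	if (thread_info_in_task)
		model_release_task_stack(prev);
	model_put_task(prev, thread_info_in_task);
}
```

With the flag set, live_stacks drops as soon as
model_finish_task_switch() sees the dead task, even while another
reference keeps the task structure alive.  With it clear, the stack
survives until the last model_put_task().
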
 include/linux/sched.h |  1 +
 kernel/fork.c         | 23 ++++++++++++++++++++++-
 kernel/sched/core.c   |  9 +++++++++
 3 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4108b4880b86..0b9486826d62 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2659,6 +2659,7 @@ static inline void kernel_signal_stop(void)
 }
 
 extern void release_task(struct task_struct * p);
+extern void release_task_stack(struct task_struct *tsk);
 extern int send_sig_info(int, struct siginfo *, struct task_struct *);
 extern int force_sigsegv(int, struct task_struct *);
 extern int force_sig_info(int, struct siginfo *, struct task_struct *);
diff --git a/kernel/fork.c b/kernel/fork.c
index 06761de69360..8dd1329e1bf8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -269,11 +269,32 @@ static void account_kernel_stack(struct task_struct *tsk, int account)
 	}
 }
 
-void free_task(struct task_struct *tsk)
+void release_task_stack(struct task_struct *tsk)
 {
 	account_kernel_stack(tsk, -1);
 	arch_release_thread_stack(tsk->stack);
 	free_thread_stack(tsk);
+	tsk->stack = NULL;
+#ifdef CONFIG_VMAP_STACK
+	tsk->stack_vm_area = NULL;
+#endif
+}
+
+void free_task(struct task_struct *tsk)
+{
+#ifndef CONFIG_THREAD_INFO_IN_TASK
+	/*
+	 * The task is finally done with both the stack and thread_info,
+	 * so free both.
+	 */
+	release_task_stack(tsk);
+#else
+	/*
+	 * If the task had a separate stack allocation, it should be gone
+	 * by now.
+	 */
+	WARN_ON_ONCE(tsk->stack);
+#endif
 	rt_mutex_debug_task_free(tsk);
 	ftrace_graph_exit_task(tsk);
 	put_seccomp_filter(tsk);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 51d7105f529a..00c9ba5cf605 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2742,6 +2742,15 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 		 * task and put them back on the free list.
 		 */
 		kprobe_flush_task(prev);
+
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+		/*
+		 * If thread_info is in task_struct, then the dead task no
+		 * longer needs its stack.  Free it right away.
+		 */
+		release_task_stack(prev);
+#endif
+
 		put_task_struct(prev);
 	}
 
-- 
2.7.4
