From: David Laight <david.laight.linux@gmail.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Huacai Chen <chenhuacai@kernel.org>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Paul Walmsley <pjw@kernel.org>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Kees Cook <kees@kernel.org>,
"Gustavo A. R. Silva" <gustavoars@kernel.org>,
Arnd Bergmann <arnd@arndb.de>,
Mark Rutland <mark.rutland@arm.com>,
"Jason A. Donenfeld" <Jason@zx2c4.com>,
Ard Biesheuvel <ardb@kernel.org>,
Jeremy Linton <jeremy.linton@arm.com>,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, loongarch@lists.linux.dev,
linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
linux-s390@vger.kernel.org, linux-hardening@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [PATCH v3 1/3] randomize_kstack: Maintain kstack_offset per task
Date: Fri, 2 Jan 2026 22:44:32 +0000 [thread overview]
Message-ID: <20260102224432.172b1247@pumpkin> (raw)
In-Reply-To: <20260102131156.3265118-2-ryan.roberts@arm.com>
On Fri, 2 Jan 2026 13:11:52 +0000
Ryan Roberts <ryan.roberts@arm.com> wrote:
> kstack_offset was previously maintained per-cpu, but this caused a
> couple of issues. So let's instead make it per-task.
>
> Issue 1: add_random_kstack_offset() and choose_random_kstack_offset()
> were expected and required to be called with interrupts and preemption
> disabled so that it could manipulate per-cpu state. But arm64, loongarch
> and risc-v are calling them with interrupts and preemption enabled. I
> don't _think_ this causes any functional issues, but it's certainly
> unexpected and could lead to manipulating the wrong cpu's state, which
> could cause a minor performance degradation due to bouncing the cache
> lines. By maintaining the state per-task those functions can safely be
> called in preemptible context.
>
> Issue 2: add_random_kstack_offset() is called before executing the
> syscall and expands the stack using a previously chosen rnadom offset.
<>
David
> choose_random_kstack_offset() is called after executing the syscall and
> chooses and stores a new random offset for the next syscall. With
> per-cpu storage for this offset, an attacker could force cpu migration
> during the execution of the syscall and prevent the offset from being
> updated for the original cpu such that it is predictable for the next
> syscall on that cpu. By maintaining the state per-task, this problem
> goes away because the per-task random offset is updated after the
> syscall regardless of which cpu it is executing on.
>
> Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall")
> Closes: https://lore.kernel.org/all/dd8c37bc-795f-4c7a-9086-69e584d8ab24@arm.com/
> Cc: stable@vger.kernel.org
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> include/linux/randomize_kstack.h | 26 +++++++++++++++-----------
> include/linux/sched.h | 4 ++++
> init/main.c | 1 -
> kernel/fork.c | 2 ++
> 4 files changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
> index 1d982dbdd0d0..5d3916ca747c 100644
> --- a/include/linux/randomize_kstack.h
> +++ b/include/linux/randomize_kstack.h
> @@ -9,7 +9,6 @@
>
> DECLARE_STATIC_KEY_MAYBE(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
> randomize_kstack_offset);
> -DECLARE_PER_CPU(u32, kstack_offset);
>
> /*
> * Do not use this anywhere else in the kernel. This is used here because
> @@ -50,15 +49,14 @@ DECLARE_PER_CPU(u32, kstack_offset);
> * add_random_kstack_offset - Increase stack utilization by previously
> * chosen random offset
> *
> - * This should be used in the syscall entry path when interrupts and
> - * preempt are disabled, and after user registers have been stored to
> - * the stack. For testing the resulting entropy, please see:
> - * tools/testing/selftests/lkdtm/stack-entropy.sh
> + * This should be used in the syscall entry path after user registers have been
> + * stored to the stack. Preemption may be enabled. For testing the resulting
> + * entropy, please see: tools/testing/selftests/lkdtm/stack-entropy.sh
> */
> #define add_random_kstack_offset() do { \
> if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
> &randomize_kstack_offset)) { \
> - u32 offset = raw_cpu_read(kstack_offset); \
> + u32 offset = current->kstack_offset; \
> u8 *ptr = __kstack_alloca(KSTACK_OFFSET_MAX(offset)); \
> /* Keep allocation even after "ptr" loses scope. */ \
> asm volatile("" :: "r"(ptr) : "memory"); \
> @@ -69,9 +67,9 @@ DECLARE_PER_CPU(u32, kstack_offset);
> * choose_random_kstack_offset - Choose the random offset for the next
> * add_random_kstack_offset()
> *
> - * This should only be used during syscall exit when interrupts and
> - * preempt are disabled. This position in the syscall flow is done to
> - * frustrate attacks from userspace attempting to learn the next offset:
> + * This should only be used during syscall exit. Preemption may be enabled. This
> + * position in the syscall flow is done to frustrate attacks from userspace
> + * attempting to learn the next offset:
> * - Maximize the timing uncertainty visible from userspace: if the
> * offset is chosen at syscall entry, userspace has much more control
> * over the timing between choosing offsets. "How long will we be in
> @@ -85,14 +83,20 @@ DECLARE_PER_CPU(u32, kstack_offset);
> #define choose_random_kstack_offset(rand) do { \
> if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
> &randomize_kstack_offset)) { \
> - u32 offset = raw_cpu_read(kstack_offset); \
> + u32 offset = current->kstack_offset; \
> offset = ror32(offset, 5) ^ (rand); \
> - raw_cpu_write(kstack_offset, offset); \
> + current->kstack_offset = offset; \
> } \
> } while (0)
> +
> +static inline void random_kstack_task_init(struct task_struct *tsk)
> +{
> + tsk->kstack_offset = 0;
> +}
> #else /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
> #define add_random_kstack_offset() do { } while (0)
> #define choose_random_kstack_offset(rand) do { } while (0)
> +#define random_kstack_task_init(tsk) do { } while (0)
> #endif /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
>
> #endif
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index d395f2810fac..9e0080ed1484 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1591,6 +1591,10 @@ struct task_struct {
> unsigned long prev_lowest_stack;
> #endif
>
> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
> + u32 kstack_offset;
> +#endif
> +
> #ifdef CONFIG_X86_MCE
> void __user *mce_vaddr;
> __u64 mce_kflags;
> diff --git a/init/main.c b/init/main.c
> index b84818ad9685..27fcbbde933e 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -830,7 +830,6 @@ static inline void initcall_debug_enable(void)
> #ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
> DEFINE_STATIC_KEY_MAYBE_RO(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
> randomize_kstack_offset);
> -DEFINE_PER_CPU(u32, kstack_offset);
>
> static int __init early_randomize_kstack_offset(char *buf)
> {
> diff --git a/kernel/fork.c b/kernel/fork.c
> index b1f3915d5f8e..b061e1edbc43 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -95,6 +95,7 @@
> #include <linux/thread_info.h>
> #include <linux/kstack_erase.h>
> #include <linux/kasan.h>
> +#include <linux/randomize_kstack.h>
> #include <linux/scs.h>
> #include <linux/io_uring.h>
> #include <linux/bpf.h>
> @@ -2231,6 +2232,7 @@ __latent_entropy struct task_struct *copy_process(
> if (retval)
> goto bad_fork_cleanup_io;
>
> + random_kstack_task_init(p);
> stackleak_task_init(p);
>
> if (pid != &init_struct_pid) {
Thread overview: 8+ messages
2026-01-02 13:11 [PATCH v3 0/3] Fix bugs and performance of kstack offset randomisation Ryan Roberts
2026-01-02 13:11 ` [PATCH v3 1/3] randomize_kstack: Maintain kstack_offset per task Ryan Roberts
2026-01-02 22:44 ` David Laight [this message]
2026-01-02 13:11 ` [PATCH v3 2/3] prandom: Convert prandom_u32_state() to __always_inline Ryan Roberts
2026-01-02 13:39 ` Jason A. Donenfeld
2026-01-02 14:09 ` Ryan Roberts
2026-01-02 22:54 ` David Laight
2026-01-02 13:11 ` [PATCH v3 3/3] randomize_kstack: Unify random source across arches Ryan Roberts