* [PATCH v4 1/3] randomize_kstack: Maintain kstack_offset per task
2026-01-19 13:01 [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation Ryan Roberts
@ 2026-01-19 13:01 ` Ryan Roberts
2026-01-19 16:10 ` Dave Hansen
2026-01-19 13:01 ` [PATCH v4 2/3] prandom: Add __always_inline version of prandom_u32_state() Ryan Roberts
` (3 subsequent siblings)
4 siblings, 1 reply; 28+ messages in thread
From: Ryan Roberts @ 2026-01-19 13:01 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Huacai Chen, Madhavan Srinivasan,
Michael Ellerman, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Kees Cook,
Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight
Cc: Ryan Roberts, linux-kernel, linux-arm-kernel, loongarch,
linuxppc-dev, linux-riscv, linux-s390, linux-hardening, stable
kstack_offset was previously maintained per-cpu, but this caused a
couple of issues. So let's instead make it per-task.
Issue 1: add_random_kstack_offset() and choose_random_kstack_offset()
were expected and required to be called with interrupts and preemption
disabled so that they could safely manipulate per-cpu state. But arm64,
loongarch and risc-v call them with interrupts and preemption enabled. I
don't _think_ this causes any functional issues, but it's certainly
unexpected and could lead to manipulating the wrong cpu's state, which
could cause a minor performance degradation due to bouncing the cache
lines. By maintaining the state per-task, those functions can safely be
called in preemptible context.
Issue 2: add_random_kstack_offset() is called before executing the
syscall and expands the stack using a previously chosen random offset.
choose_random_kstack_offset() is called after executing the syscall and
chooses and stores a new random offset for the next syscall. With
per-cpu storage for this offset, an attacker could force cpu migration
during the execution of the syscall and prevent the offset from being
updated for the original cpu such that it is predictable for the next
syscall on that cpu. By maintaining the state per-task, this problem
goes away because the per-task random offset is updated after the
syscall regardless of which cpu it is executing on.
Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall")
Closes: https://lore.kernel.org/all/dd8c37bc-795f-4c7a-9086-69e584d8ab24@arm.com/
Cc: stable@vger.kernel.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
include/linux/randomize_kstack.h | 26 +++++++++++++++-----------
include/linux/sched.h | 4 ++++
init/main.c | 1 -
kernel/fork.c | 2 ++
4 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
index 1d982dbdd0d0..5d3916ca747c 100644
--- a/include/linux/randomize_kstack.h
+++ b/include/linux/randomize_kstack.h
@@ -9,7 +9,6 @@
DECLARE_STATIC_KEY_MAYBE(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
randomize_kstack_offset);
-DECLARE_PER_CPU(u32, kstack_offset);
/*
* Do not use this anywhere else in the kernel. This is used here because
@@ -50,15 +49,14 @@ DECLARE_PER_CPU(u32, kstack_offset);
* add_random_kstack_offset - Increase stack utilization by previously
* chosen random offset
*
- * This should be used in the syscall entry path when interrupts and
- * preempt are disabled, and after user registers have been stored to
- * the stack. For testing the resulting entropy, please see:
- * tools/testing/selftests/lkdtm/stack-entropy.sh
+ * This should be used in the syscall entry path after user registers have been
+ * stored to the stack. Preemption may be enabled. For testing the resulting
+ * entropy, please see: tools/testing/selftests/lkdtm/stack-entropy.sh
*/
#define add_random_kstack_offset() do { \
if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
&randomize_kstack_offset)) { \
- u32 offset = raw_cpu_read(kstack_offset); \
+ u32 offset = current->kstack_offset; \
u8 *ptr = __kstack_alloca(KSTACK_OFFSET_MAX(offset)); \
/* Keep allocation even after "ptr" loses scope. */ \
asm volatile("" :: "r"(ptr) : "memory"); \
@@ -69,9 +67,9 @@ DECLARE_PER_CPU(u32, kstack_offset);
* choose_random_kstack_offset - Choose the random offset for the next
* add_random_kstack_offset()
*
- * This should only be used during syscall exit when interrupts and
- * preempt are disabled. This position in the syscall flow is done to
- * frustrate attacks from userspace attempting to learn the next offset:
+ * This should only be used during syscall exit. Preemption may be enabled. This
+ * position in the syscall flow is done to frustrate attacks from userspace
+ * attempting to learn the next offset:
* - Maximize the timing uncertainty visible from userspace: if the
* offset is chosen at syscall entry, userspace has much more control
* over the timing between choosing offsets. "How long will we be in
@@ -85,14 +83,20 @@ DECLARE_PER_CPU(u32, kstack_offset);
#define choose_random_kstack_offset(rand) do { \
if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
&randomize_kstack_offset)) { \
- u32 offset = raw_cpu_read(kstack_offset); \
+ u32 offset = current->kstack_offset; \
offset = ror32(offset, 5) ^ (rand); \
- raw_cpu_write(kstack_offset, offset); \
+ current->kstack_offset = offset; \
} \
} while (0)
+
+static inline void random_kstack_task_init(struct task_struct *tsk)
+{
+ tsk->kstack_offset = 0;
+}
#else /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
#define add_random_kstack_offset() do { } while (0)
#define choose_random_kstack_offset(rand) do { } while (0)
+#define random_kstack_task_init(tsk) do { } while (0)
#endif /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index da0133524d08..23081a702ecf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1591,6 +1591,10 @@ struct task_struct {
unsigned long prev_lowest_stack;
#endif
+#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
+ u32 kstack_offset;
+#endif
+
#ifdef CONFIG_X86_MCE
void __user *mce_vaddr;
__u64 mce_kflags;
diff --git a/init/main.c b/init/main.c
index b84818ad9685..27fcbbde933e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -830,7 +830,6 @@ static inline void initcall_debug_enable(void)
#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
DEFINE_STATIC_KEY_MAYBE_RO(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
randomize_kstack_offset);
-DEFINE_PER_CPU(u32, kstack_offset);
static int __init early_randomize_kstack_offset(char *buf)
{
diff --git a/kernel/fork.c b/kernel/fork.c
index b1f3915d5f8e..b061e1edbc43 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -95,6 +95,7 @@
#include <linux/thread_info.h>
#include <linux/kstack_erase.h>
#include <linux/kasan.h>
+#include <linux/randomize_kstack.h>
#include <linux/scs.h>
#include <linux/io_uring.h>
#include <linux/bpf.h>
@@ -2231,6 +2232,7 @@ __latent_entropy struct task_struct *copy_process(
if (retval)
goto bad_fork_cleanup_io;
+ random_kstack_task_init(p);
stackleak_task_init(p);
if (pid != &init_struct_pid) {
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v4 1/3] randomize_kstack: Maintain kstack_offset per task
2026-01-19 13:01 ` [PATCH v4 1/3] randomize_kstack: Maintain kstack_offset per task Ryan Roberts
@ 2026-01-19 16:10 ` Dave Hansen
2026-01-19 16:51 ` Ryan Roberts
0 siblings, 1 reply; 28+ messages in thread
From: Dave Hansen @ 2026-01-19 16:10 UTC (permalink / raw)
To: Ryan Roberts, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Mark Rutland, Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton,
David Laight
Cc: linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, linux-s390, linux-hardening, stable
On 1/19/26 05:01, Ryan Roberts wrote:
...
> Cc: stable@vger.kernel.org
Since this doesn't fix any known functional issues, if it were me, I'd
leave stable@ alone. It isn't clear that this is stable material.
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1591,6 +1591,10 @@ struct task_struct {
> unsigned long prev_lowest_stack;
> #endif
>
> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
> + u32 kstack_offset;
> +#endif
> +
> #ifdef CONFIG_X86_MCE
> void __user *mce_vaddr;
Nit: This seems to be throwing a u32 potentially in between a couple of
void*/ulong sized objects.
It probably doesn't matter with struct randomization and it's really
hard to get right among the web of task_struct #ifdefs. But, it would be
nice to at _least_ nestle this next to another int-sized thing.
Does it really even need to be 32 bits? x86 has this comment:
> /*
> * This value will get limited by KSTACK_OFFSET_MAX(), which is 10
> * bits. The actual entropy will be further reduced by the compiler
> * when applying stack alignment constraints (see cc_stack_align4/8 in
> * arch/x86/Makefile), which will remove the 3 (x86_64) or 2 (ia32)
> * low bits from any entropy chosen here.
> *
> * Therefore, final stack offset entropy will be 7 (x86_64) or
> * 8 (ia32) bits.
> */
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 1/3] randomize_kstack: Maintain kstack_offset per task
2026-01-19 16:10 ` Dave Hansen
@ 2026-01-19 16:51 ` Ryan Roberts
2026-01-19 16:53 ` Dave Hansen
0 siblings, 1 reply; 28+ messages in thread
From: Ryan Roberts @ 2026-01-19 16:51 UTC (permalink / raw)
To: Dave Hansen, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Mark Rutland, Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton,
David Laight
Cc: linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, linux-s390, linux-hardening, stable
Thanks for the review!
On 19/01/2026 16:10, Dave Hansen wrote:
> On 1/19/26 05:01, Ryan Roberts wrote:
> ...
>> Cc: stable@vger.kernel.org
>
> Since this doesn't fix any known functional issues, if it were me, I'd
> leave stable@ alone. It isn't clear that this is stable material.
I listed 2 issues in the commit log; I agree that issue 1 falls into the
category of "don't really care", but issue 2 means that kstack randomization is
currently trivial to defeat. That's the reason I thought it would be valuable in
stable.
But if you're saying don't bother and others agree, then this whole patch can be
dropped; this is just intended to be the backportable fix. Patch 3 reimplements
this entirely for upstream.
I'll wait and see if others have opinions if that's ok?
>
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1591,6 +1591,10 @@ struct task_struct {
>> unsigned long prev_lowest_stack;
>> #endif
>>
>> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
>> + u32 kstack_offset;
>> +#endif
>> +
>> #ifdef CONFIG_X86_MCE
>> void __user *mce_vaddr;
>
> Nit: This seems to be throwing a u32 potentially in between a couple of
> void*/ulong sized objects.
Yeah, I spent a bit of time with pahole but eventually concluded that it was
difficult to find somewhere to nestle it that would work reliably cross arch.
Eventually I just decided to group it with other stack meta data.
>
> It probably doesn't matter with struct randomization and it's really
> hard to get right among the web of task_struct #ifdefs. But, it would be
> nice to at _least_ nestle this next to another int-sized thing.
>
> Does it really even need to be 32 bits? x86 has this comment:
>
>> /*
>> * This value will get limited by KSTACK_OFFSET_MAX(), which is 10
>> * bits. The actual entropy will be further reduced by the compiler
>> * when applying stack alignment constraints (see cc_stack_align4/8 in
>> * arch/x86/Makefile), which will remove the 3 (x86_64) or 2 (ia32)
>> * low bits from any entropy chosen here.
>> *
>> * Therefore, final stack offset entropy will be 7 (x86_64) or
>> * 8 (ia32) bits.
>> */
For more recent kernels it's 6 bits shifted by 4 for 64-bit kernels or 8 bits
shifted by 2 for 32-bit kernels regardless of arch. So could probably make it
work with 8 bits of storage. Although I was deliberately trying to keep the
change simple, since it was intended for backporting. Patch 3 rips it out.
Overall I'd prefer to leave it all as is. But if people don't think we should
backport, then let's just drop the whole patch.
Thanks,
Ryan
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 1/3] randomize_kstack: Maintain kstack_offset per task
2026-01-19 16:51 ` Ryan Roberts
@ 2026-01-19 16:53 ` Dave Hansen
0 siblings, 0 replies; 28+ messages in thread
From: Dave Hansen @ 2026-01-19 16:53 UTC (permalink / raw)
To: Ryan Roberts, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Mark Rutland, Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton,
David Laight
Cc: linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, linux-s390, linux-hardening, stable
On 1/19/26 08:51, Ryan Roberts wrote:
>> Since this doesn't fix any known functional issues, if it were me, I'd
>> leave stable@ alone. It isn't clear that this is stable material.
> I listed 2 issues in the commit log; I agree that issue 1 falls into the
> category of "don't really care", but issue 2 means that kstack randomization is
> currently trivial to defeat. That's the reason I thought it would valuable in
> stable.
>
> But if you're saying don't bother and others agree, then this whole patch can be
> dropped; this is just intended to be the backportable fix. Patch 3 reimplements
> this entirely for upstream.
>
> I'll wait and see if others have opinions if that's ok?
Sure. I don't feel that strongly about it.
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v4 2/3] prandom: Add __always_inline version of prandom_u32_state()
2026-01-19 13:01 [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation Ryan Roberts
2026-01-19 13:01 ` [PATCH v4 1/3] randomize_kstack: Maintain kstack_offset per task Ryan Roberts
@ 2026-01-19 13:01 ` Ryan Roberts
2026-01-28 17:00 ` Jason A. Donenfeld
2026-01-19 13:01 ` [PATCH v4 3/3] randomize_kstack: Unify random source across arches Ryan Roberts
` (2 subsequent siblings)
4 siblings, 1 reply; 28+ messages in thread
From: Ryan Roberts @ 2026-01-19 13:01 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Huacai Chen, Madhavan Srinivasan,
Michael Ellerman, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Kees Cook,
Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight
Cc: Ryan Roberts, linux-kernel, linux-arm-kernel, loongarch,
linuxppc-dev, linux-riscv, linux-s390, linux-hardening
We will shortly use prandom_u32_state() to implement kstack offset
randomization and some arches need to call it from non-instrumentable
context. So let's implement prandom_u32_state() as an out-of-line
wrapper around a new __always_inline prandom_u32_state_inline(). kstack
offset randomization will use this new version.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
include/linux/prandom.h | 20 ++++++++++++++++++++
lib/random32.c | 8 +-------
2 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/include/linux/prandom.h b/include/linux/prandom.h
index ff7dcc3fa105..801188680a29 100644
--- a/include/linux/prandom.h
+++ b/include/linux/prandom.h
@@ -17,6 +17,26 @@ struct rnd_state {
__u32 s1, s2, s3, s4;
};
+/**
+ * prandom_u32_state_inline - seeded pseudo-random number generator.
+ * @state: pointer to state structure holding seeded state.
+ *
+ * This is used for pseudo-randomness with no outside seeding.
+ * For more random results, use get_random_u32().
+ * For use only where the out-of-line version, prandom_u32_state(), cannot be
+ * used (e.g. noinstr code).
+ */
+static __always_inline u32 prandom_u32_state_inline(struct rnd_state *state)
+{
+#define TAUSWORTHE(s, a, b, c, d) ((s & c) << d) ^ (((s << a) ^ s) >> b)
+ state->s1 = TAUSWORTHE(state->s1, 6U, 13U, 4294967294U, 18U);
+ state->s2 = TAUSWORTHE(state->s2, 2U, 27U, 4294967288U, 2U);
+ state->s3 = TAUSWORTHE(state->s3, 13U, 21U, 4294967280U, 7U);
+ state->s4 = TAUSWORTHE(state->s4, 3U, 12U, 4294967168U, 13U);
+
+ return (state->s1 ^ state->s2 ^ state->s3 ^ state->s4);
+}
+
u32 prandom_u32_state(struct rnd_state *state);
void prandom_bytes_state(struct rnd_state *state, void *buf, size_t nbytes);
void prandom_seed_full_state(struct rnd_state __percpu *pcpu_state);
diff --git a/lib/random32.c b/lib/random32.c
index 24e7acd9343f..2a02d82e91bc 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -51,13 +51,7 @@
*/
u32 prandom_u32_state(struct rnd_state *state)
{
-#define TAUSWORTHE(s, a, b, c, d) ((s & c) << d) ^ (((s << a) ^ s) >> b)
- state->s1 = TAUSWORTHE(state->s1, 6U, 13U, 4294967294U, 18U);
- state->s2 = TAUSWORTHE(state->s2, 2U, 27U, 4294967288U, 2U);
- state->s3 = TAUSWORTHE(state->s3, 13U, 21U, 4294967280U, 7U);
- state->s4 = TAUSWORTHE(state->s4, 3U, 12U, 4294967168U, 13U);
-
- return (state->s1 ^ state->s2 ^ state->s3 ^ state->s4);
+ return prandom_u32_state_inline(state);
}
EXPORT_SYMBOL(prandom_u32_state);
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v4 2/3] prandom: Add __always_inline version of prandom_u32_state()
2026-01-19 13:01 ` [PATCH v4 2/3] prandom: Add __always_inline version of prandom_u32_state() Ryan Roberts
@ 2026-01-28 17:00 ` Jason A. Donenfeld
2026-01-28 17:33 ` Ryan Roberts
2026-01-30 16:16 ` Christophe Leroy (CS GROUP)
0 siblings, 2 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2026-01-28 17:00 UTC (permalink / raw)
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Huacai Chen, Madhavan Srinivasan,
Michael Ellerman, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Kees Cook,
Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland, Ard Biesheuvel,
Jeremy Linton, David Laight, linux-kernel, linux-arm-kernel,
loongarch, linuxppc-dev, linux-riscv, linux-s390, linux-hardening
On Mon, Jan 19, 2026 at 01:01:09PM +0000, Ryan Roberts wrote:
> We will shortly use prandom_u32_state() to implement kstack offset
> randomization and some arches need to call it from non-instrumentable
> context. So let's implement prandom_u32_state() as an out-of-line
> wrapper around a new __always_inline prandom_u32_state_inline(). kstack
> offset randomization will use this new version.
>
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> include/linux/prandom.h | 20 ++++++++++++++++++++
> lib/random32.c | 8 +-------
> 2 files changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/prandom.h b/include/linux/prandom.h
> index ff7dcc3fa105..801188680a29 100644
> --- a/include/linux/prandom.h
> +++ b/include/linux/prandom.h
> @@ -17,6 +17,26 @@ struct rnd_state {
> __u32 s1, s2, s3, s4;
> };
>
> +/**
> + * prandom_u32_state_inline - seeded pseudo-random number generator.
> + * @state: pointer to state structure holding seeded state.
> + *
> + * This is used for pseudo-randomness with no outside seeding.
> + * For more random results, use get_random_u32().
> + * For use only where the out-of-line version, prandom_u32_state(), cannot be
> + * used (e.g. noinstr code).
> + */
> +static __always_inline u32 prandom_u32_state_inline(struct rnd_state *state)
This is pretty bikesheddy and I'm not really entirely convinced that my
intuition is correct here, but I thought I should at least ask. Do you
think this would be better called __prandom_u32_state(), where the "__"
is kind of a, "don't use this directly unless you know what you're doing
because it's sort of internal"? It seems like either we make this inline
for everybody, or if there's a good reason for having most users use the
non-inline version, then we should be careful that new users don't use
the inline version. I was thinking the __ would help with that.
Jason
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 2/3] prandom: Add __always_inline version of prandom_u32_state()
2026-01-28 17:00 ` Jason A. Donenfeld
@ 2026-01-28 17:33 ` Ryan Roberts
2026-01-28 18:32 ` David Laight
2026-01-30 16:16 ` Christophe Leroy (CS GROUP)
1 sibling, 1 reply; 28+ messages in thread
From: Ryan Roberts @ 2026-01-28 17:33 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: Catalin Marinas, Will Deacon, Huacai Chen, Madhavan Srinivasan,
Michael Ellerman, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Kees Cook,
Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland, Ard Biesheuvel,
Jeremy Linton, David Laight, linux-kernel, linux-arm-kernel,
loongarch, linuxppc-dev, linux-riscv, linux-s390, linux-hardening
On 28/01/2026 17:00, Jason A. Donenfeld wrote:
> On Mon, Jan 19, 2026 at 01:01:09PM +0000, Ryan Roberts wrote:
>> We will shortly use prandom_u32_state() to implement kstack offset
>> randomization and some arches need to call it from non-instrumentable
>> context. So let's implement prandom_u32_state() as an out-of-line
>> wrapper around a new __always_inline prandom_u32_state_inline(). kstack
>> offset randomization will use this new version.
>>
>> Acked-by: Mark Rutland <mark.rutland@arm.com>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>> include/linux/prandom.h | 20 ++++++++++++++++++++
>> lib/random32.c | 8 +-------
>> 2 files changed, 21 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/prandom.h b/include/linux/prandom.h
>> index ff7dcc3fa105..801188680a29 100644
>> --- a/include/linux/prandom.h
>> +++ b/include/linux/prandom.h
>> @@ -17,6 +17,26 @@ struct rnd_state {
>> __u32 s1, s2, s3, s4;
>> };
>>
>> +/**
>> + * prandom_u32_state_inline - seeded pseudo-random number generator.
>> + * @state: pointer to state structure holding seeded state.
>> + *
>> + * This is used for pseudo-randomness with no outside seeding.
>> + * For more random results, use get_random_u32().
>> + * For use only where the out-of-line version, prandom_u32_state(), cannot be
>> + * used (e.g. noinstr code).
>> + */
>> +static __always_inline u32 prandom_u32_state_inline(struct rnd_state *state)
>
> This is pretty bikesheddy and I'm not really entirely convinced that my
> intuition is correct here, but I thought I should at least ask. Do you
> think this would be better called __prandom_u32_state(), where the "__"
> is kind of a, "don't use this directly unless you know what you're doing
> because it's sort of internal"? It seems like either we make this inline
> for everybody, or if there's a good reason for having most users use the
> non-inline version, then we should be careful that new users don't use
> the inline version. I was thinking the __ would help with that.
I'm certainly happy to do that, if that's your preference. I have to respin this
anyway, given the noinstr issue.
>
> Jason
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 2/3] prandom: Add __always_inline version of prandom_u32_state()
2026-01-28 17:33 ` Ryan Roberts
@ 2026-01-28 18:32 ` David Laight
0 siblings, 0 replies; 28+ messages in thread
From: David Laight @ 2026-01-28 18:32 UTC (permalink / raw)
To: Ryan Roberts
Cc: Jason A. Donenfeld, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Mark Rutland, Ard Biesheuvel, Jeremy Linton, linux-kernel,
linux-arm-kernel, loongarch, linuxppc-dev, linux-riscv,
linux-s390, linux-hardening
On Wed, 28 Jan 2026 17:33:19 +0000
Ryan Roberts <ryan.roberts@arm.com> wrote:
> On 28/01/2026 17:00, Jason A. Donenfeld wrote:
> > On Mon, Jan 19, 2026 at 01:01:09PM +0000, Ryan Roberts wrote:
> >> We will shortly use prandom_u32_state() to implement kstack offset
> >> randomization and some arches need to call it from non-instrumentable
> >> context. So let's implement prandom_u32_state() as an out-of-line
> >> wrapper around a new __always_inline prandom_u32_state_inline(). kstack
> >> offset randomization will use this new version.
> >>
> >> Acked-by: Mark Rutland <mark.rutland@arm.com>
> >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> >> ---
> >> include/linux/prandom.h | 20 ++++++++++++++++++++
> >> lib/random32.c | 8 +-------
> >> 2 files changed, 21 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/include/linux/prandom.h b/include/linux/prandom.h
> >> index ff7dcc3fa105..801188680a29 100644
> >> --- a/include/linux/prandom.h
> >> +++ b/include/linux/prandom.h
> >> @@ -17,6 +17,26 @@ struct rnd_state {
> >> __u32 s1, s2, s3, s4;
> >> };
> >>
> >> +/**
> >> + * prandom_u32_state_inline - seeded pseudo-random number generator.
> >> + * @state: pointer to state structure holding seeded state.
> >> + *
> >> + * This is used for pseudo-randomness with no outside seeding.
> >> + * For more random results, use get_random_u32().
> >> + * For use only where the out-of-line version, prandom_u32_state(), cannot be
> >> + * used (e.g. noinstr code).
If you are going to respin:
(e.g. noinstr or performance critical code).
David
> >> + */
> >> +static __always_inline u32 prandom_u32_state_inline(struct rnd_state *state)
> >
> > This is pretty bikesheddy and I'm not really entirely convinced that my
> > intuition is correct here, but I thought I should at least ask. Do you
> > think this would be better called __prandom_u32_state(), where the "__"
> > is kind of a, "don't use this directly unless you know what you're doing
> > because it's sort of internal"? It seems like either we make this inline
> > for everybody, or if there's a good reason for having most users use the
> > non-inline version, then we should be careful that new users don't use
> > the inline version. I was thinking the __ would help with that.
>
> I'm certainly happy to do that, if that's your preference. I have to respin this
> anyway, given the noinstr issue.
>
> >
> > Jason
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 2/3] prandom: Add __always_inline version of prandom_u32_state()
2026-01-28 17:00 ` Jason A. Donenfeld
2026-01-28 17:33 ` Ryan Roberts
@ 2026-01-30 16:16 ` Christophe Leroy (CS GROUP)
1 sibling, 0 replies; 28+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2026-01-30 16:16 UTC (permalink / raw)
To: Jason A. Donenfeld, Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Huacai Chen, Madhavan Srinivasan,
Michael Ellerman, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Kees Cook,
Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland, Ard Biesheuvel,
Jeremy Linton, David Laight, linux-kernel, linux-arm-kernel,
loongarch, linuxppc-dev, linux-riscv, linux-s390, linux-hardening
Le 28/01/2026 à 18:00, Jason A. Donenfeld a écrit :
> On Mon, Jan 19, 2026 at 01:01:09PM +0000, Ryan Roberts wrote:
>> We will shortly use prandom_u32_state() to implement kstack offset
>> randomization and some arches need to call it from non-instrumentable
>> context. So let's implement prandom_u32_state() as an out-of-line
>> wrapper around a new __always_inline prandom_u32_state_inline(). kstack
>> offset randomization will use this new version.
>>
>> Acked-by: Mark Rutland <mark.rutland@arm.com>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>> include/linux/prandom.h | 20 ++++++++++++++++++++
>> lib/random32.c | 8 +-------
>> 2 files changed, 21 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/prandom.h b/include/linux/prandom.h
>> index ff7dcc3fa105..801188680a29 100644
>> --- a/include/linux/prandom.h
>> +++ b/include/linux/prandom.h
>> @@ -17,6 +17,26 @@ struct rnd_state {
>> __u32 s1, s2, s3, s4;
>> };
>>
>> +/**
>> + * prandom_u32_state_inline - seeded pseudo-random number generator.
>> + * @state: pointer to state structure holding seeded state.
>> + *
>> + * This is used for pseudo-randomness with no outside seeding.
>> + * For more random results, use get_random_u32().
>> + * For use only where the out-of-line version, prandom_u32_state(), cannot be
>> + * used (e.g. noinstr code).
>> + */
>> +static __always_inline u32 prandom_u32_state_inline(struct rnd_state *state)
>
> This is pretty bikesheddy and I'm not really entirely convinced that my
> intuition is correct here, but I thought I should at least ask. Do you
> think this would be better called __prandom_u32_state(), where the "__"
> is kind of a, "don't use this directly unless you know what you're doing
> because it's sort of internal"? It seems like either we make this inline
> for everybody, or if there's a good reason for having most users use the
> non-inline version, then we should be careful that new users don't use
> the inline version. I was thinking the __ would help with that.
I looked into kernel sources and there are several functions named
something_something_else_inline() and it doesn't mean those functions
get inlined, so I would also prefer __prandom_u32_state() which means
"If you use it you know what you are doing", just like __get_user() for
instance.
However maybe we could also reconsider making it inline for everyone. We
have spotted half a dozen places where the code size increases a lot
when forcing it inline, but those places deserve a local trampoline to
avoid code duplication, and then the compiler decides to inline or not.
Because there are also several places that benefit from the inlining
because it allows GCC to simplify the calculation, for instance when
some calculation is performed with the result like with
(prandom_u32_state(rng) % ceil) where ceil is 2 or 4.
That can of course be done as a followup patch but it means at the end
we will have to rename all __prandom_u32_state() to prandom_u32_state().
Or should we do the other way round ? Make __prandom_u32_state() the
out-of-line version and just change the few places where the size
explodes like drm_test_buddy_alloc_range_bias(), loss_gilb_ell(),
generate_random_testvec_config(), generate_random_sgl_divisions(),
mutate_buffer(), ... ?
Christophe
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v4 3/3] randomize_kstack: Unify random source across arches
2026-01-19 13:01 [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation Ryan Roberts
2026-01-19 13:01 ` [PATCH v4 1/3] randomize_kstack: Maintain kstack_offset per task Ryan Roberts
2026-01-19 13:01 ` [PATCH v4 2/3] prandom: Add __always_inline version of prandom_u32_state() Ryan Roberts
@ 2026-01-19 13:01 ` Ryan Roberts
2026-01-20 23:50 ` kernel test robot
2026-02-22 21:34 ` Thomas Gleixner
2026-01-19 16:00 ` [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation Dave Hansen
2026-01-19 16:25 ` Heiko Carstens
4 siblings, 2 replies; 28+ messages in thread
From: Ryan Roberts @ 2026-01-19 13:01 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Huacai Chen, Madhavan Srinivasan,
Michael Ellerman, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, Kees Cook,
Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight
Cc: Ryan Roberts, linux-kernel, linux-arm-kernel, loongarch,
linuxppc-dev, linux-riscv, linux-s390, linux-hardening
Previously, different architectures used random sources of differing
strength and cost to choose the random kstack offset. A number of
architectures (loongarch, powerpc, s390, x86) used their timestamp
counter, at whatever frequency it happened to run. Other arches (arm64,
riscv) used entropy from the crng via get_random_u16().
There have been concerns that in some cases the timestamp counters may
be too weak, because they can be easily guessed or influenced by user
space. And get_random_u16() has been shown to be too costly for the
level of protection kstack offset randomization provides.
So let's use a common, architecture-agnostic source of entropy: a
per-cpu prng, seeded at boot time from the crng. This has a few
benefits:
- We can remove choose_random_kstack_offset(); that was only there to
try to make the timestamp counter value a bit harder to influence
from user space [*].
- The architecture code is simplified. All it has to do now is call
add_random_kstack_offset() in the syscall path.
- The strength of the randomness can be reasoned about independently
of the architecture.
- Arches previously using get_random_u16() now have much faster
syscall paths; see the results below.
[*] Additionally, this gets rid of some redundant work on s390 and x86.
Before this patch, those architectures called
choose_random_kstack_offset() under arch_exit_to_user_mode_prepare(),
which is also called for exception returns to userspace that were *not*
syscalls (e.g. regular interrupts). Getting rid of
choose_random_kstack_offset() avoids a small amount of redundant work
for the non-syscall cases.
There have been some claims that a prng may be weaker than the
timestamp counter if not regularly reseeded. But the prng has a period
of about 2^113, so as long as its state remains secret, the next value
should not be guessable. If the prng state can be accessed, we have
bigger problems.
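For reference, here is a userspace transcription of the generator in question, prandom_u32_state() as I read it in lib/random32.c: four combined Tausworthe generators with a combined period of about 2^113 (the TAUSWORTHE constants should be double-checked against the actual tree):

```c
#include <assert.h>
#include <stdint.h>

struct rnd_state { uint32_t s1, s2, s3, s4; };

#define TAUSWORTHE(s, a, b, c, d) \
	((((s) & (c)) << (d)) ^ ((((s) << (a)) ^ (s)) >> (b)))

/* Userspace model of the kernel's prandom_u32_state():
 * four combined Tausworthe generators, period ~2^113. */
static uint32_t prandom_u32_state(struct rnd_state *st)
{
	st->s1 = TAUSWORTHE(st->s1,  6U, 13U, 4294967294U, 18U);
	st->s2 = TAUSWORTHE(st->s2,  2U, 27U, 4294967288U,  2U);
	st->s3 = TAUSWORTHE(st->s3, 13U, 21U, 4294967280U,  7U);
	st->s4 = TAUSWORTHE(st->s4,  3U, 12U, 4294967168U, 13U);
	return st->s1 ^ st->s2 ^ st->s3 ^ st->s4;
}
```

The masks are what enforce the "valid values" constraint mentioned later in the thread: each component has a minimum seed below which it would collapse to zero.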
Additionally, we are only consuming 6 bits to randomize the stack, so
there are only 64 possible random offsets. It would be trivial for an
attacker to brute force this by repeating their attack and waiting for
the random stack offset to be the desired one. The prng approach seems
entirely proportionate to this level of protection.
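The 64-offset figure can be checked mechanically. Using the arm64 KSTACK_OFFSET_MAX() mask from the patch (0b1111111100, keeping bits [9:2]) and the 16-byte SP alignment mandated by the AAPCS (discarding bits [3:0]), only bits [9:4] of the raw value survive:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count the distinct stack offsets that survive arm64's
 * KSTACK_OFFSET_MAX() mask plus 16-byte SP alignment. */
static unsigned int count_effective_offsets(void)
{
	bool seen[1024] = { false };
	unsigned int n = 0;

	for (uint32_t raw = 0; raw < 1024; raw++) {
		uint32_t sz = raw & 0x3FC;	/* KSTACK_OFFSET_MAX() */
		uint32_t eff = sz & ~0xFU;	/* 16-byte SP alignment */

		if (!seen[eff]) {
			seen[eff] = true;
			n++;
		}
	}
	return n;
}
```

The loop reports 64 distinct effective offsets, i.e. 6 bits of entropy in SP[9:4], matching the comments being removed from the arch code.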
Performance data are provided below. The baseline is v6.18 with rndstack
on for each respective arch. (I)/(R) indicate statistically significant
improvement/regression. arm64 platform is AWS Graviton3 (m7g.metal).
x86_64 platform is AWS Sapphire Rapids (m7i.24xlarge):
+-----------------+--------------+---------------+---------------+
| Benchmark | Result Class | per-task-prng | per-task-prng |
| | | arm64 (metal) | x86_64 (VM) |
+=================+==============+===============+===============+
| syscall/getpid | mean (ns) | (I) -9.50% | (I) -17.65% |
| | p99 (ns) | (I) -59.24% | (I) -24.41% |
| | p99.9 (ns) | (I) -59.52% | (I) -28.52% |
+-----------------+--------------+---------------+---------------+
| syscall/getppid | mean (ns) | (I) -9.52% | (I) -19.24% |
| | p99 (ns) | (I) -59.25% | (I) -25.03% |
| | p99.9 (ns) | (I) -59.50% | (I) -28.17% |
+-----------------+--------------+---------------+---------------+
| syscall/invalid | mean (ns) | (I) -10.31% | (I) -18.56% |
| | p99 (ns) | (I) -60.79% | (I) -20.06% |
| | p99.9 (ns) | (I) -61.04% | (I) -25.04% |
+-----------------+--------------+---------------+---------------+
I tested an earlier version of this change on x86 bare metal and it
showed a smaller but still significant improvement. The bare metal
system wasn't available this time around so testing was done in a VM
instance. I'm guessing the cost of rdtsc is higher for VMs.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/Kconfig | 5 ++-
arch/arm64/kernel/syscall.c | 11 ------
arch/loongarch/kernel/syscall.c | 11 ------
arch/powerpc/kernel/syscall.c | 12 -------
arch/riscv/kernel/traps.c | 12 -------
arch/s390/include/asm/entry-common.h | 8 -----
arch/x86/include/asm/entry-common.h | 12 -------
include/linux/randomize_kstack.h | 52 +++++++++-------------------
include/linux/sched.h | 4 ---
init/main.c | 8 +++++
kernel/fork.c | 1 -
11 files changed, 27 insertions(+), 109 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 31220f512b16..8591fe7b4ac1 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1516,9 +1516,8 @@ config HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
def_bool n
help
An arch should select this symbol if it can support kernel stack
- offset randomization with calls to add_random_kstack_offset()
- during syscall entry and choose_random_kstack_offset() during
- syscall exit. Careful removal of -fstack-protector-strong and
+ offset randomization with a call to add_random_kstack_offset()
+ during syscall entry. Careful removal of -fstack-protector-strong and
-fstack-protector should also be applied to the entry code and
closely examined, as the artificial stack bump looks like an array
to the compiler, so it will attempt to add canary checks regardless
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index c062badd1a56..358ddfbf1401 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -52,17 +52,6 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
}
syscall_set_return_value(current, regs, 0, ret);
-
- /*
- * This value will get limited by KSTACK_OFFSET_MAX(), which is 10
- * bits. The actual entropy will be further reduced by the compiler
- * when applying stack alignment constraints: the AAPCS mandates a
- * 16-byte aligned SP at function boundaries, which will remove the
- * 4 low bits from any entropy chosen here.
- *
- * The resulting 6 bits of entropy is seen in SP[9:4].
- */
- choose_random_kstack_offset(get_random_u16());
}
static inline bool has_syscall_work(unsigned long flags)
diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
index 1249d82c1cd0..85da7e050d97 100644
--- a/arch/loongarch/kernel/syscall.c
+++ b/arch/loongarch/kernel/syscall.c
@@ -79,16 +79,5 @@ void noinstr __no_stack_protector do_syscall(struct pt_regs *regs)
regs->regs[7], regs->regs[8], regs->regs[9]);
}
- /*
- * This value will get limited by KSTACK_OFFSET_MAX(), which is 10
- * bits. The actual entropy will be further reduced by the compiler
- * when applying stack alignment constraints: 16-bytes (i.e. 4-bits)
- * aligned, which will remove the 4 low bits from any entropy chosen
- * here.
- *
- * The resulting 6 bits of entropy is seen in SP[9:4].
- */
- choose_random_kstack_offset(get_cycles());
-
syscall_exit_to_user_mode(regs);
}
diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
index be159ad4b77b..b3d8b0f9823b 100644
--- a/arch/powerpc/kernel/syscall.c
+++ b/arch/powerpc/kernel/syscall.c
@@ -173,17 +173,5 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
}
#endif
- /*
- * Ultimately, this value will get limited by KSTACK_OFFSET_MAX(),
- * so the maximum stack offset is 1k bytes (10 bits).
- *
- * The actual entropy will be further reduced by the compiler when
- * applying stack alignment constraints: the powerpc architecture
- * may have two kinds of stack alignment (16-bytes and 8-bytes).
- *
- * So the resulting 6 or 7 bits of entropy is seen in SP[9:4] or SP[9:3].
- */
- choose_random_kstack_offset(mftb());
-
return ret;
}
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 47afea4ff1a8..dfb91ed3a243 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -344,18 +344,6 @@ void do_trap_ecall_u(struct pt_regs *regs)
syscall_handler(regs, syscall);
}
- /*
- * Ultimately, this value will get limited by KSTACK_OFFSET_MAX(),
- * so the maximum stack offset is 1k bytes (10 bits).
- *
- * The actual entropy will be further reduced by the compiler when
- * applying stack alignment constraints: 16-byte (i.e. 4-bit) aligned
- * for RV32I or RV64I.
- *
- * The resulting 6 bits of entropy is seen in SP[9:4].
- */
- choose_random_kstack_offset(get_random_u16());
-
syscall_exit_to_user_mode(regs);
} else {
irqentry_state_t state = irqentry_nmi_enter(regs);
diff --git a/arch/s390/include/asm/entry-common.h b/arch/s390/include/asm/entry-common.h
index 979af986a8fe..35450a485323 100644
--- a/arch/s390/include/asm/entry-common.h
+++ b/arch/s390/include/asm/entry-common.h
@@ -51,14 +51,6 @@ static __always_inline void arch_exit_to_user_mode(void)
#define arch_exit_to_user_mode arch_exit_to_user_mode
-static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
- unsigned long ti_work)
-{
- choose_random_kstack_offset(get_tod_clock_fast());
-}
-
-#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
-
static __always_inline bool arch_in_rcu_eqs(void)
{
if (IS_ENABLED(CONFIG_KVM))
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index ce3eb6d5fdf9..7535131c711b 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -82,18 +82,6 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
current_thread_info()->status &= ~(TS_COMPAT | TS_I386_REGS_POKED);
#endif
- /*
- * This value will get limited by KSTACK_OFFSET_MAX(), which is 10
- * bits. The actual entropy will be further reduced by the compiler
- * when applying stack alignment constraints (see cc_stack_align4/8 in
- * arch/x86/Makefile), which will remove the 3 (x86_64) or 2 (ia32)
- * low bits from any entropy chosen here.
- *
- * Therefore, final stack offset entropy will be 7 (x86_64) or
- * 8 (ia32) bits.
- */
- choose_random_kstack_offset(rdtsc());
-
/* Avoid unnecessary reads of 'x86_ibpb_exit_to_user' */
if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
this_cpu_read(x86_ibpb_exit_to_user)) {
diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
index 5d3916ca747c..eef39701e914 100644
--- a/include/linux/randomize_kstack.h
+++ b/include/linux/randomize_kstack.h
@@ -6,6 +6,7 @@
#include <linux/kernel.h>
#include <linux/jump_label.h>
#include <linux/percpu-defs.h>
+#include <linux/prandom.h>
DECLARE_STATIC_KEY_MAYBE(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
randomize_kstack_offset);
@@ -45,9 +46,22 @@ DECLARE_STATIC_KEY_MAYBE(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
#define KSTACK_OFFSET_MAX(x) ((x) & 0b1111111100)
#endif
+DECLARE_PER_CPU(struct rnd_state, kstack_rnd_state);
+
+static __always_inline u32 get_kstack_offset(void)
+{
+ struct rnd_state *state;
+ u32 rnd;
+
+ state = &get_cpu_var(kstack_rnd_state);
+ rnd = prandom_u32_state_inline(state);
+ put_cpu_var(kstack_rnd_state);
+
+ return rnd;
+}
+
/**
- * add_random_kstack_offset - Increase stack utilization by previously
- * chosen random offset
+ * add_random_kstack_offset - Increase stack utilization by a random offset.
*
* This should be used in the syscall entry path after user registers have been
* stored to the stack. Preemption may be enabled. For testing the resulting
@@ -56,47 +70,15 @@ DECLARE_STATIC_KEY_MAYBE(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
#define add_random_kstack_offset() do { \
if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
&randomize_kstack_offset)) { \
- u32 offset = current->kstack_offset; \
+ u32 offset = get_kstack_offset(); \
u8 *ptr = __kstack_alloca(KSTACK_OFFSET_MAX(offset)); \
/* Keep allocation even after "ptr" loses scope. */ \
asm volatile("" :: "r"(ptr) : "memory"); \
} \
} while (0)
-/**
- * choose_random_kstack_offset - Choose the random offset for the next
- * add_random_kstack_offset()
- *
- * This should only be used during syscall exit. Preemption may be enabled. This
- * position in the syscall flow is done to frustrate attacks from userspace
- * attempting to learn the next offset:
- * - Maximize the timing uncertainty visible from userspace: if the
- * offset is chosen at syscall entry, userspace has much more control
- * over the timing between choosing offsets. "How long will we be in
- * kernel mode?" tends to be more difficult to predict than "how long
- * will we be in user mode?"
- * - Reduce the lifetime of the new offset sitting in memory during
- * kernel mode execution. Exposure of "thread-local" memory content
- * (e.g. current, percpu, etc) tends to be easier than arbitrary
- * location memory exposure.
- */
-#define choose_random_kstack_offset(rand) do { \
- if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
- &randomize_kstack_offset)) { \
- u32 offset = current->kstack_offset; \
- offset = ror32(offset, 5) ^ (rand); \
- current->kstack_offset = offset; \
- } \
-} while (0)
-
-static inline void random_kstack_task_init(struct task_struct *tsk)
-{
- tsk->kstack_offset = 0;
-}
#else /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
#define add_random_kstack_offset() do { } while (0)
-#define choose_random_kstack_offset(rand) do { } while (0)
-#define random_kstack_task_init(tsk) do { } while (0)
#endif /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 23081a702ecf..da0133524d08 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1591,10 +1591,6 @@ struct task_struct {
unsigned long prev_lowest_stack;
#endif
-#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
- u32 kstack_offset;
-#endif
-
#ifdef CONFIG_X86_MCE
void __user *mce_vaddr;
__u64 mce_kflags;
diff --git a/init/main.c b/init/main.c
index 27fcbbde933e..8626e048095a 100644
--- a/init/main.c
+++ b/init/main.c
@@ -830,6 +830,14 @@ static inline void initcall_debug_enable(void)
#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
DEFINE_STATIC_KEY_MAYBE_RO(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
randomize_kstack_offset);
+DEFINE_PER_CPU(struct rnd_state, kstack_rnd_state);
+
+static int __init random_kstack_init(void)
+{
+ prandom_seed_full_state(&kstack_rnd_state);
+ return 0;
+}
+late_initcall(random_kstack_init);
static int __init early_randomize_kstack_offset(char *buf)
{
diff --git a/kernel/fork.c b/kernel/fork.c
index b061e1edbc43..68d9766288fd 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2232,7 +2232,6 @@ __latent_entropy struct task_struct *copy_process(
if (retval)
goto bad_fork_cleanup_io;
- random_kstack_task_init(p);
stackleak_task_init(p);
if (pid != &init_struct_pid) {
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
2026-01-19 13:01 ` [PATCH v4 3/3] randomize_kstack: Unify random source across arches Ryan Roberts
@ 2026-01-20 23:50 ` kernel test robot
2026-01-21 10:20 ` David Laight
2026-01-21 10:52 ` Ryan Roberts
2026-02-22 21:34 ` Thomas Gleixner
1 sibling, 2 replies; 28+ messages in thread
From: kernel test robot @ 2026-01-20 23:50 UTC (permalink / raw)
To: Ryan Roberts, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Mark Rutland, Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton,
David Laight
Cc: llvm, oe-kbuild-all, Ryan Roberts, linux-kernel, linux-arm-kernel,
loongarch, linuxppc-dev, linux-riscv, linux-s390
Hi Ryan,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on linus/master v6.19-rc6 next-20260119]
[cannot apply to tip/sched/core kees/for-next/hardening kees/for-next/execve]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ryan-Roberts/randomize_kstack-Maintain-kstack_offset-per-task/20260119-210329
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20260119130122.1283821-4-ryan.roberts%40arm.com
patch subject: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
config: x86_64-allmodconfig (https://download.01.org/0day-ci/archive/20260121/202601210752.6Nsv9et9-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260121/202601210752.6Nsv9et9-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202601210752.6Nsv9et9-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> vmlinux.o: warning: objtool: do_syscall_64+0x2c: call to preempt_count_add() leaves .noinstr.text section
>> vmlinux.o: warning: objtool: __do_fast_syscall_32+0x3d: call to preempt_count_add() leaves .noinstr.text section
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
2026-01-20 23:50 ` kernel test robot
@ 2026-01-21 10:20 ` David Laight
2026-01-21 14:48 ` David Laight
2026-01-21 10:52 ` Ryan Roberts
1 sibling, 1 reply; 28+ messages in thread
From: David Laight @ 2026-01-21 10:20 UTC (permalink / raw)
To: kernel test robot
Cc: Ryan Roberts, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Mark Rutland, Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton,
llvm, oe-kbuild-all, linux-kernel, linux-arm-kernel, loongarch,
linuxppc-dev, linux-riscv, linux-s390
On Wed, 21 Jan 2026 07:50:16 +0800
kernel test robot <lkp@intel.com> wrote:
> Hi Ryan,
>
> kernel test robot noticed the following build warnings:
>
> [...]
>
> All warnings (new ones prefixed by >>):
>
> >> vmlinux.o: warning: objtool: do_syscall_64+0x2c: call to preempt_count_add() leaves .noinstr.text section
> >> vmlinux.o: warning: objtool: __do_fast_syscall_32+0x3d: call to preempt_count_add() leaves .noinstr.text section
>
When CONFIG_DEBUG_PREEMPT or CONFIG_TRACE_PREEMPT_TOGGLE is set,
the preempt_[en|dis]able() calls inside [get|put]_cpu_var()
become real function calls.
Maybe __preempt_count_[inc|dec]() can be called instead (with this_cpu_ptr()).
David
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
2026-01-21 10:20 ` David Laight
@ 2026-01-21 14:48 ` David Laight
0 siblings, 0 replies; 28+ messages in thread
From: David Laight @ 2026-01-21 14:48 UTC (permalink / raw)
To: kernel test robot
Cc: Ryan Roberts, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Mark Rutland, Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton,
llvm, oe-kbuild-all, linux-kernel, linux-arm-kernel, loongarch,
linuxppc-dev, linux-riscv, linux-s390
On Wed, 21 Jan 2026 10:20:17 +0000
David Laight <david.laight.linux@gmail.com> wrote:
> On Wed, 21 Jan 2026 07:50:16 +0800
> kernel test robot <lkp@intel.com> wrote:
>
> > Hi Ryan,
> >
> > kernel test robot noticed the following build warnings:
> >
> > [...]
> >
> > All warnings (new ones prefixed by >>):
> >
> > >> vmlinux.o: warning: objtool: do_syscall_64+0x2c: call to preempt_count_add() leaves .noinstr.text section
> > >> vmlinux.o: warning: objtool: __do_fast_syscall_32+0x3d: call to preempt_count_add() leaves .noinstr.text section
> >
>
> When CONFIG_DEBUG_PREEMPT or CONFIG_TRACE_PREEMP_TOGGLE is set
> the preempt_count_[en|dis]able() calls inside [put|get]_cpu_var()
> become real functions.
>
> Maybe __preempt_count_[inc|dec]() can be called (with this_cpu_ptr()).
Or the code could just use the per-cpu data without disabling preemption.
Usually that isn't a good idea at all, but it can't matter in this case.
It might give a noticeable performance gain: disabling preemption is
non-trivial and/or an atomic operation on some architectures.
If anyone is worried about preemption causing the output to be repeated,
that would be (mostly) mitigated by checking that s[1234] haven't changed
prior to writing the new values.
I think a 'not locked at all' compare of two of the four values will
stop everything except two threads doing system calls at the same time
from getting the same output from the prng.
The whole thing is very unlikely and there will be much easier ways
to break the prng.
Provided s[1234] are only written with valid values (i.e. ones which
aren't effectively zero), it will continue generating numbers.
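As a userspace sketch of that idea (prandom_lockfree is a hypothetical name; the TAUSWORTHE constants are transcribed from my reading of lib/random32.c): compute the next state from a snapshot, and only publish it if two of the four words are still unchanged, so a concurrent updater that got in first simply wins and its newer state is kept:

```c
#include <assert.h>
#include <stdint.h>

struct rnd_state { uint32_t s1, s2, s3, s4; };

#define TAUSWORTHE(s, a, b, c, d) \
	((((s) & (c)) << (d)) ^ ((((s) << (a)) ^ (s)) >> (b)))

/* Optimistic, lock-free per-cpu prng step: no preemption disable.
 * If another updater ran between the snapshot and the store, the
 * "not locked at all" compare fails and we keep the newer state. */
static uint32_t prandom_lockfree(struct rnd_state *st)
{
	struct rnd_state snap = *st, next;
	uint32_t out;

	next.s1 = TAUSWORTHE(snap.s1,  6U, 13U, 4294967294U, 18U);
	next.s2 = TAUSWORTHE(snap.s2,  2U, 27U, 4294967288U,  2U);
	next.s3 = TAUSWORTHE(snap.s3, 13U, 21U, 4294967280U,  7U);
	next.s4 = TAUSWORTHE(snap.s4,  3U, 12U, 4294967168U, 13U);
	out = next.s1 ^ next.s2 ^ next.s3 ^ next.s4;

	/* compare two of the four values before committing */
	if (st->s1 == snap.s1 && st->s2 == snap.s2)
		*st = next;

	return out;
}
```

In the kernel the loads and stores would need READ_ONCE()/WRITE_ONCE() annotations, and since only valid state words are ever written, a lost update merely repeats one output rather than wedging the generator.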
David
>
> David
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
2026-01-20 23:50 ` kernel test robot
2026-01-21 10:20 ` David Laight
@ 2026-01-21 10:52 ` Ryan Roberts
2026-01-21 12:32 ` Mark Rutland
1 sibling, 1 reply; 28+ messages in thread
From: Ryan Roberts @ 2026-01-21 10:52 UTC (permalink / raw)
To: kernel test robot, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Mark Rutland, Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton,
David Laight
Cc: llvm, oe-kbuild-all, linux-kernel, linux-arm-kernel, loongarch,
linuxppc-dev, linux-riscv, linux-s390
On 20/01/2026 23:50, kernel test robot wrote:
> Hi Ryan,
>
> kernel test robot noticed the following build warnings:
>
> [...]
>
> All warnings (new ones prefixed by >>):
>
>>> vmlinux.o: warning: objtool: do_syscall_64+0x2c: call to preempt_count_add() leaves .noinstr.text section
>>> vmlinux.o: warning: objtool: __do_fast_syscall_32+0x3d: call to preempt_count_add() leaves .noinstr.text section
Hmm, clearly Dave was correct not to rush this through... yuck. I'll take a
look, but I guess there is no rush if this won't go into -next until shortly
after -rc1.
Thanks,
Ryan
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
2026-01-21 10:52 ` Ryan Roberts
@ 2026-01-21 12:32 ` Mark Rutland
2026-02-18 15:20 ` Ryan Roberts
0 siblings, 1 reply; 28+ messages in thread
From: Mark Rutland @ 2026-01-21 12:32 UTC (permalink / raw)
To: Ryan Roberts
Cc: kernel test robot, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight,
llvm, oe-kbuild-all, linux-kernel, linux-arm-kernel, loongarch,
linuxppc-dev, linux-riscv, linux-s390
On Wed, Jan 21, 2026 at 10:52:21AM +0000, Ryan Roberts wrote:
> On 20/01/2026 23:50, kernel test robot wrote:
> > Hi Ryan,
> >
> > kernel test robot noticed the following build warnings:
> >
> > [...]
> >
> > All warnings (new ones prefixed by >>):
> >
> >>> vmlinux.o: warning: objtool: do_syscall_64+0x2c: call to preempt_count_add() leaves .noinstr.text section
> >>> vmlinux.o: warning: objtool: __do_fast_syscall_32+0x3d: call to preempt_count_add() leaves .noinstr.text section
>
> Hmm, clearly Dave was correct not to rush this through... yuck. I'll take a
> look, but I guess there is no rush if this won't go into -next until shortly
> after -rc1.
Sorry, I should have checked the entry sequencing more thoroughly when I
reviewed this.
From a quick look, I suspect the right thing to do is to pull the call
to add_random_kstack_offset() a bit later in a few cases; after the
entry logic has run, and after instrumentation_begin() (if the arch code
uses that), such that it doesn't matter if this gets instrumented.
Considering the callers of add_random_kstack_offset(), if we did that:
* arm64 is fine as-is.
* loongarch is fine as-is.
* powerpc's system_call_exception() would need this moved after the
user_exit_irqoff(). Given that function is notrace rather than
noinstr, it looks like there are bigger extant issues here.
* riscv is fine as-is.
* s390's __do_syscall() would need this moved after
enter_from_user_mode().
* On x86:
- do_int80_emulation() is fine as-is.
- int80_emulation() is fine as-is.
- do_int80_syscall_32() would need this moved after
instrumentation_begin().
- __do_fast_syscall_32() would need this moved after
instrumentation_begin().
- do_syscall_64() would need this moved after instrumentation_begin().
Mark.
^ permalink raw reply [flat|nested] 28+ messages in thread* Re: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
2026-01-21 12:32 ` Mark Rutland
@ 2026-02-18 15:20 ` Ryan Roberts
0 siblings, 0 replies; 28+ messages in thread
From: Ryan Roberts @ 2026-02-18 15:20 UTC (permalink / raw)
To: Mark Rutland
Cc: kernel test robot, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight,
llvm, oe-kbuild-all, linux-kernel, linux-arm-kernel, loongarch,
linuxppc-dev, linux-riscv, linux-s390
On 21/01/2026 12:32, Mark Rutland wrote:
> On Wed, Jan 21, 2026 at 10:52:21AM +0000, Ryan Roberts wrote:
>> On 20/01/2026 23:50, kernel test robot wrote:
>>> Hi Ryan,
>>>
>>> kernel test robot noticed the following build warnings:
>>>
>>> [auto build test WARNING on akpm-mm/mm-everything]
>>> [also build test WARNING on linus/master v6.19-rc6 next-20260119]
>>> [cannot apply to tip/sched/core kees/for-next/hardening kees/for-next/execve]
>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>>> And when submitting patch, we suggest to use '--base' as documented in
>>> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>>>
>>> url: https://github.com/intel-lab-lkp/linux/commits/Ryan-Roberts/randomize_kstack-Maintain-kstack_offset-per-task/20260119-210329
>>> base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
>>> patch link: https://lore.kernel.org/r/20260119130122.1283821-4-ryan.roberts%40arm.com
>>> patch subject: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
>>> config: x86_64-allmodconfig (https://download.01.org/0day-ci/archive/20260121/202601210752.6Nsv9et9-lkp@intel.com/config)
>>> compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
>>> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260121/202601210752.6Nsv9et9-lkp@intel.com/reproduce)
>>>
>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>> the same patch/commit), kindly add following tags
>>> | Reported-by: kernel test robot <lkp@intel.com>
>>> | Closes: https://lore.kernel.org/oe-kbuild-all/202601210752.6Nsv9et9-lkp@intel.com/
>>>
>>> All warnings (new ones prefixed by >>):
>>>
>>>>> vmlinux.o: warning: objtool: do_syscall_64+0x2c: call to preempt_count_add() leaves .noinstr.text section
>>>>> vmlinux.o: warning: objtool: __do_fast_syscall_32+0x3d: call to preempt_count_add() leaves .noinstr.text section
>>
>> Hmm, clearly Dave was correct not to rush this through... yuck. I'll take a
>> look, but I guess there is no rush if this won't go into -next until shortly
>> after -rc1.
>
> Sorry, I should have checked the entry sequencing more thoroughly when I
> reviewed this.
>
> From a quick look, I suspect the right thing to do is to pull the call
> to add_random_kstack_offset() a bit later in a few cases; after the
> entry logic has run, and after instrumentation_begin() (if the arch code
> uses that), such that it doesn't matter if this gets instrumented.
>
> Considering the callers of add_random_kstack_offset(), if we did that:
>
> * arm64 is fine as-is.
>
> * loongarch is fine as-is.
>
> * powerpc's system_call_exception() would need this moved after the
> user_exit_irqoff(). Given that function is notrace rather than
> noinstr, it looks like there are bigger extant issues here.
>
> * riscv is fine as-is.
>
> * s390's __do_syscall() would need this moved after
> enter_from_user_mode().
>
> * On x86:
> - do_int80_emulation() is fine as-is.
> - int80_emulation() is fine as-is.
> - do_int80_syscall_32() would need this moved after
> instrumentation_begin().
> - __do_fast_syscall_32() would need this moved after
> instrumentation_begin().
> - do_syscall_64() would need this moved after instrumentation_begin().
Thanks for the detailed suggestions, Mark. I've taken this approach, and
assuming perf testing doesn't throw up any issues, I'm going to revert to
using the out-of-line version of prandom_u32_state() and will drop patch 2.
Thanks,
Ryan
>
> Mark.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
2026-01-19 13:01 ` [PATCH v4 3/3] randomize_kstack: Unify random source across arches Ryan Roberts
2026-01-20 23:50 ` kernel test robot
@ 2026-02-22 21:34 ` Thomas Gleixner
2026-02-23 9:41 ` David Laight
2026-03-03 14:43 ` Ryan Roberts
1 sibling, 2 replies; 28+ messages in thread
From: Thomas Gleixner @ 2026-02-22 21:34 UTC (permalink / raw)
To: Ryan Roberts, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Ingo Molnar, Borislav Petkov, Dave Hansen,
Kees Cook, Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight
Cc: Ryan Roberts, linux-kernel, linux-arm-kernel, loongarch,
linuxppc-dev, linux-riscv, linux-s390, linux-hardening
On Mon, Jan 19 2026 at 13:01, Ryan Roberts wrote:
> I tested an earlier version of this change on x86 bare metal and it
> showed a smaller but still significant improvement. The bare metal
> system wasn't available this time around so testing was done in a VM
> instance. I'm guessing the cost of rdtsc is higher for VMs.
No it's not, unless the hypervisor traps RDTSC, which would be insane as
that would cause massive regressions all over the place.
So guessing is not really helpful if you want to argue performance.
Thanks,
tglx
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
2026-02-22 21:34 ` Thomas Gleixner
@ 2026-02-23 9:41 ` David Laight
2026-03-03 14:43 ` Ryan Roberts
1 sibling, 0 replies; 28+ messages in thread
From: David Laight @ 2026-02-23 9:41 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Ryan Roberts, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Ingo Molnar, Borislav Petkov, Dave Hansen,
Kees Cook, Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, linux-kernel,
linux-arm-kernel, loongarch, linuxppc-dev, linux-riscv,
linux-s390, linux-hardening
On Sun, 22 Feb 2026 22:34:26 +0100
Thomas Gleixner <tglx@kernel.org> wrote:
> On Mon, Jan 19 2026 at 13:01, Ryan Roberts wrote:
> > I tested an earlier version of this change on x86 bare metal and it
> > showed a smaller but still significant improvement. The bare metal
> > system wasn't available this time around so testing was done in a VM
> > instance. I'm guessing the cost of rdtsc is higher for VMs.
>
> No it's not, unless the hypervisor traps RDTSC, which would be insane as
> that would cause massive regressions all over the place.
>
> So guessing is not really helpful if you want to argue performance.
The cost of rdtsc will depend on the cpu architecture.
To get valid comparisons you need to run on identical systems.
Regardless, the cost of rdtsc could easily be larger than the
cost of the prandom_u32_state() code (especially if inlined or
without all the return thunk 'crap').
David
>
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 3/3] randomize_kstack: Unify random source across arches
2026-02-22 21:34 ` Thomas Gleixner
2026-02-23 9:41 ` David Laight
@ 2026-03-03 14:43 ` Ryan Roberts
1 sibling, 0 replies; 28+ messages in thread
From: Ryan Roberts @ 2026-03-03 14:43 UTC (permalink / raw)
To: Thomas Gleixner, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Ingo Molnar, Borislav Petkov, Dave Hansen,
Kees Cook, Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight
Cc: linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, linux-s390, linux-hardening
On 22/02/2026 21:34, Thomas Gleixner wrote:
> On Mon, Jan 19 2026 at 13:01, Ryan Roberts wrote:
>> I tested an earlier version of this change on x86 bare metal and it
>> showed a smaller but still significant improvement. The bare metal
>> system wasn't available this time around so testing was done in a VM
>> instance. I'm guessing the cost of rdtsc is higher for VMs.
>
> No it's not, unless the hypervisor traps RDTSC, which would be insane as
> that would cause massive regressions all over the place.
>
> So guessing is not really helpful if you want to argue performance.
Sorry for the slow response. I no longer have access to a recent bare metal x86
system that I can do performance testing on. All I have is the Sapphire Rapids
(m7i.24xlarge) VM.
My original testing was on bare metal Sapphire Rapids (same number of CPUs and
RAM as the VM).
Just to be clear, these are the results I got with bare metal vs vm. Negative is
an improvement (less time). (I)/(R) means statistically significant
improvement/regression:
+-----------------+--------------+---------------+---------------+
| Benchmark | Result Class | x86_64 | x86_64 |
| | | bare metal | VM |
+=================+==============+===============+===============+
| syscall/getpid | mean (ns) | (I) -7.69% | (I) -17.65% |
| | p99 (ns) | 4.14% | (I) -24.41% |
| | p99.9 (ns) | 2.68% | (I) -28.52% |
+-----------------+--------------+---------------+---------------+
| syscall/getppid | mean (ns) | (I) -5.98% | (I) -19.24% |
| | p99 (ns) | -3.11% | (I) -25.03% |
| | p99.9 (ns) | (R) 9.84% | (I) -28.17% |
+-----------------+--------------+---------------+---------------+
| syscall/invalid | mean (ns) | (I) -6.94% | (I) -18.56% |
| | p99 (ns) | (I) -5.57% | (I) -20.06% |
| | p99.9 (ns) | (R) 10.53% | (I) -25.04% |
+-----------------+--------------+---------------+---------------+
So both sets of results represent an improvement, I would say.
Given the level of review that the series has had, I propose to repost today,
then hopefully Kees will be happy to put it in his branch so that it can get
plenty of linux-next soak testing and if there are any x86 regressions lurking,
hopefully ZeroDay will spot them?
Thanks,
Ryan
>
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation
2026-01-19 13:01 [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation Ryan Roberts
` (2 preceding siblings ...)
2026-01-19 13:01 ` [PATCH v4 3/3] randomize_kstack: Unify random source across arches Ryan Roberts
@ 2026-01-19 16:00 ` Dave Hansen
2026-01-19 16:44 ` Kees Cook
2026-01-19 16:25 ` Heiko Carstens
4 siblings, 1 reply; 28+ messages in thread
From: Dave Hansen @ 2026-01-19 16:00 UTC (permalink / raw)
To: Ryan Roberts, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Kees Cook, Gustavo A. R. Silva, Arnd Bergmann,
Mark Rutland, Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton,
David Laight
Cc: linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, linux-s390, linux-hardening
On 1/19/26 05:01, Ryan Roberts wrote:
> x86 (AWS Sapphire Rapids):
> +-----------------+--------------+-------------+---------------+
> | Benchmark | Result Class | v6.18-rc5 | per-task-prng |
> | | | rndstack-on | |
> | | | | |
> +=================+==============+=============+===============+
> | syscall/getpid | mean (ns) | (R) 13.32% | (R) 4.60% |
> | | p99 (ns) | (R) 13.38% | (R) 18.08% |
> | | p99.9 (ns) | 16.26% | (R) 19.38% |
Like you noted, this is surprising. This would be a good thing to make
sure it goes in very early after -rc1 and gets plenty of wide testing.
But I don't see any problems with the approach, and the move to common
code looks like a big win as well:
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation
2026-01-19 16:00 ` [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation Dave Hansen
@ 2026-01-19 16:44 ` Kees Cook
2026-01-19 16:51 ` Dave Hansen
2026-01-20 16:32 ` Ryan Roberts
0 siblings, 2 replies; 28+ messages in thread
From: Kees Cook @ 2026-01-19 16:44 UTC (permalink / raw)
To: Dave Hansen, Ryan Roberts, Catalin Marinas, Will Deacon,
Huacai Chen, Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight
Cc: linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, linux-s390, linux-hardening
On January 19, 2026 8:00:00 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
>On 1/19/26 05:01, Ryan Roberts wrote:
>> x86 (AWS Sapphire Rapids):
>> +-----------------+--------------+-------------+---------------+
>> | Benchmark | Result Class | v6.18-rc5 | per-task-prng |
>> | | | rndstack-on | |
>> | | | | |
>> +=================+==============+=============+===============+
>> | syscall/getpid | mean (ns) | (R) 13.32% | (R) 4.60% |
>> | | p99 (ns) | (R) 13.38% | (R) 18.08% |
>> | | p99.9 (ns) | 16.26% | (R) 19.38% |
>
>Like you noted, this is surprising. This would be a good thing to make
>sure it goes in very early after -rc1 and gets plenty of wide testing.
Right, we are pretty late in the dev cycle (rc6). It would be prudent to get this into -next after the coming rc1 (1 month from now).
On the other hand, the changes are pretty "binary" in the sense that mistakes should be VERY visible right away. Would it be better to take this into -next immediately instead?
>But I don't see any problems with the approach, and the move to common
>code looks like a big win as well:
Agreed; I think it's looking great.
--
Kees Cook
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation
2026-01-19 16:44 ` Kees Cook
@ 2026-01-19 16:51 ` Dave Hansen
2026-01-20 16:32 ` Ryan Roberts
1 sibling, 0 replies; 28+ messages in thread
From: Dave Hansen @ 2026-01-19 16:51 UTC (permalink / raw)
To: Kees Cook, Ryan Roberts, Catalin Marinas, Will Deacon,
Huacai Chen, Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight
Cc: linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, linux-s390, linux-hardening
On 1/19/26 08:44, Kees Cook wrote:
>> Like you noted, this is surprising. This would be a good thing to
>> make sure it goes in very early after -rc1 and gets plenty of wide
>> testing.
> Right, we are pretty late in the dev cycle (rc6). It would be
> prudent to get this into -next after the coming rc1 (1 month from
> now).
>
> On the other hand, the changes are pretty "binary" in the sense that
> mistakes should be VERY visible right away. Would it be better to
> take this into -next immediately instead?
I think it can go into -next ASAP. It's just a matter of when it goes to
Linus.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation
2026-01-19 16:44 ` Kees Cook
2026-01-19 16:51 ` Dave Hansen
@ 2026-01-20 16:32 ` Ryan Roberts
2026-01-20 16:37 ` Dave Hansen
1 sibling, 1 reply; 28+ messages in thread
From: Ryan Roberts @ 2026-01-20 16:32 UTC (permalink / raw)
To: Kees Cook, Dave Hansen, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight
Cc: linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, linux-s390, linux-hardening
On 19/01/2026 16:44, Kees Cook wrote:
>
>
> On January 19, 2026 8:00:00 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
>> On 1/19/26 05:01, Ryan Roberts wrote:
>>> x86 (AWS Sapphire Rapids):
>>> +-----------------+--------------+-------------+---------------+
>>> | Benchmark | Result Class | v6.18-rc5 | per-task-prng |
>>> | | | rndstack-on | |
>>> | | | | |
>>> +=================+==============+=============+===============+
>>> | syscall/getpid | mean (ns) | (R) 13.32% | (R) 4.60% |
>>> | | p99 (ns) | (R) 13.38% | (R) 18.08% |
>>> | | p99.9 (ns) | 16.26% | (R) 19.38% |
>>
>> Like you noted, this is surprising. This would be a good thing to make
>> sure it goes in very early after -rc1 and gets plenty of wide testing.
>
> Right, we are pretty late in the dev cycle (rc6). It would be prudent to get this into -next after the coming rc1 (1 month from now).
>
> On the other hand, the changes are pretty "binary" in the sense that mistakes should be VERY visible right away. Would it be better to take this into -next immediately instead?
I don't think this question was really addressed to me, but I'll give my opinion
anyway; I agree it's pretty binary - it will either work or it will explode.
I've tested on arm64 and x86_64 so I have high confidence that it works. If you
get it into -next ASAP it has 3 weeks to soak before the merge window opens
right? (Linus said he would do an -rc8 this cycle). That feels like enough time
to me. But it's your tree ;-)
Thanks,
Ryan
>
>> But I don't see any problems with the approach, and the move to common
>> code looks like a big win as well:
>
> Agreed; I think it's looking great.
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation
2026-01-20 16:32 ` Ryan Roberts
@ 2026-01-20 16:37 ` Dave Hansen
2026-01-20 16:45 ` Ryan Roberts
2026-01-20 18:45 ` David Laight
0 siblings, 2 replies; 28+ messages in thread
From: Dave Hansen @ 2026-01-20 16:37 UTC (permalink / raw)
To: Ryan Roberts, Kees Cook, Catalin Marinas, Will Deacon,
Huacai Chen, Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight
Cc: linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, linux-s390, linux-hardening
On 1/20/26 08:32, Ryan Roberts wrote:
> I don't think this question was really addressed to me, but I'll give my opinion
> anyway; I agree it's pretty binary - it will either work or it will explode.
> I've tested on arm64 and x86_64 so I have high confidence that it works. If you
> get it into -next ASAP it has 3 weeks to soak before the merge window opens
> right? (Linus said he would do an -rc8 this cycle). That feels like enough time
> to me. But it's your tree 😉
First of all, thank you for testing it on x86! Having that one data
point where it helped performance is super valuable.
I'm more worried that it's going to regress performance somewhere and
then it's going to be a pain to back out. I'm not super worried about
functional regressions.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation
2026-01-20 16:37 ` Dave Hansen
@ 2026-01-20 16:45 ` Ryan Roberts
2026-01-20 18:45 ` David Laight
1 sibling, 0 replies; 28+ messages in thread
From: Ryan Roberts @ 2026-01-20 16:45 UTC (permalink / raw)
To: Dave Hansen, Kees Cook, Catalin Marinas, Will Deacon, Huacai Chen,
Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, David Laight
Cc: linux-kernel, linux-arm-kernel, loongarch, linuxppc-dev,
linux-riscv, linux-s390, linux-hardening
On 20/01/2026 16:37, Dave Hansen wrote:
> On 1/20/26 08:32, Ryan Roberts wrote:
>> I don't think this question was really addressed to me, but I'll give my opinion
>> anyway; I agree it's pretty binary - it will either work or it will explode.
>> I've tested on arm64 and x86_64 so I have high confidence that it works. If you
>> get it into -next ASAP it has 3 weeks to soak before the merge window opens
>> right? (Linus said he would do an -rc8 this cycle). That feels like enough time
>> to me. But it's your tree 😉
>
> First of all, thank you for testing it on x86! Having that one data
> point where it helped performance is super valuable.
>
> I'm more worried that it's going to regress performance somewhere and
> then it's going to be a pain to back out. I'm not super worried about
> functional regressions.
Fair enough. Let's go slow then.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation
2026-01-20 16:37 ` Dave Hansen
2026-01-20 16:45 ` Ryan Roberts
@ 2026-01-20 18:45 ` David Laight
1 sibling, 0 replies; 28+ messages in thread
From: David Laight @ 2026-01-20 18:45 UTC (permalink / raw)
To: Dave Hansen
Cc: Ryan Roberts, Kees Cook, Catalin Marinas, Will Deacon,
Huacai Chen, Madhavan Srinivasan, Michael Ellerman, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, Gustavo A. R. Silva, Arnd Bergmann, Mark Rutland,
Jason A. Donenfeld, Ard Biesheuvel, Jeremy Linton, linux-kernel,
linux-arm-kernel, loongarch, linuxppc-dev, linux-riscv,
linux-s390, linux-hardening
On Tue, 20 Jan 2026 08:37:43 -0800
Dave Hansen <dave.hansen@intel.com> wrote:
> On 1/20/26 08:32, Ryan Roberts wrote:
> > I don't think this question was really addressed to me, but I'll give my opinion
> > anyway; I agree it's pretty binary - it will either work or it will explode.
> > I've tested on arm64 and x86_64 so I have high confidence that it works. If you
> > get it into -next ASAP it has 3 weeks to soak before the merge window opens
> > right? (Linus said he would do an -rc8 this cycle). That feels like enough time
> > to me. But it's your tree 😉
>
> First of all, thank you for testing it on x86! Having that one data
> point where it helped performance is super valuable.
>
> I'm more worried that it's going to regress performance somewhere and
> then it's going to be a pain to back out. I'm not super worried about
> functional regressions.
Unlikely: on x86 the 'rdtsc' is ~20 clocks on Intel cpus and even slower
on AMD (according to Agner).
(That is serialised against another rdtsc rather than other instructions.)
Whereas the four TAUSWORTHE() are independent so can execute in parallel.
IIRC each is a memory read and 5 ALU instructions - not much at all.
The slow bit will be the cache miss on the per-cpu data.
You lose a clock at the end because gcc will compile a | b | c | d
as (((a | b) | c) | d) rather than ((a | b) | (c | d)).
I think someone reported the 'new' version being faster on x86;
that might be why.
David
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation
2026-01-19 13:01 [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation Ryan Roberts
` (3 preceding siblings ...)
2026-01-19 16:00 ` [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation Dave Hansen
@ 2026-01-19 16:25 ` Heiko Carstens
4 siblings, 0 replies; 28+ messages in thread
From: Heiko Carstens @ 2026-01-19 16:25 UTC (permalink / raw)
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Huacai Chen, Madhavan Srinivasan,
Michael Ellerman, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Vasily Gorbik, Alexander Gordeev, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Kees Cook, Gustavo A. R. Silva,
Arnd Bergmann, Mark Rutland, Jason A. Donenfeld, Ard Biesheuvel,
Jeremy Linton, David Laight, linux-kernel, linux-arm-kernel,
loongarch, linuxppc-dev, linux-riscv, linux-s390, linux-hardening
On Mon, Jan 19, 2026 at 01:01:07PM +0000, Ryan Roberts wrote:
> As I reported at [1], kstack offset randomisation suffers from a couple of bugs
> and, on arm64 at least, the performance is poor. This series attempts to fix
> both; patch 1 provides back-portable fixes for the functional bugs. Patches 2-3
> propose a performance improvement approach.
>
> I've looked at a few different options but ultimately decided that Jeremy's
> original prng approach is the fastest. I made the argument that this approach is
> secure "enough" in the RFC [2] and the responses indicated agreement.
>
> More details in the commit logs.
>
>
> Performance
> ===========
>
> Mean and tail performance of 3 "small" syscalls was measured. syscall was made
> 10 million times and each individually measured and binned. These results have
> low noise so I'm confident that they are trustworthy.
>
> The baseline is v6.18-rc5 with stack randomization turned *off*. So I'm showing
> performance cost of turning it on without any changes to the implementation,
> then the reduced performance cost of turning it on with my changes applied.
This adds 16 instructions to the system call fast path on s390; however,
some quick measurements show that executing this extra code is within
the noise ratio, performance-wise.
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
^ permalink raw reply [flat|nested] 28+ messages in thread