* [PATCH v3] riscv: cif: reduce shadow stack size limit from 4GB to 2GB
@ 2026-05-14  7:50 Zong Li
  2026-05-14  8:56 ` David Laight
  0 siblings, 1 reply; 3+ messages in thread
From: Zong Li @ 2026-05-14  7:50 UTC
  To: pjw, palmer, aou, alex, debug, linux-riscv, linux-kernel,
	david.laight.linux
  Cc: Zong Li

Follow the ARM64 GCS (Guarded Control Stack) implementation approach
by reducing the shadow stack size allocation from min(RLIMIT_STACK, 4GB)
to min(RLIMIT_STACK/2, 2GB). See commit 506496bcbb42 ("arm64/gcs: Ensure
that new threads have a GCS").

Rationale:

1. Shadow stacks only store return addresses (8 bytes per entry), not
   local variables, function parameters, or saved registers. A 2GB
   shadow stack is far more than sufficient for any practical
   application, even with extremely deep recursion. Using half the size
   maintains an adequate margin while being more resource-efficient.

2. On memory-constrained systems (e.g., platforms with only 4GB of
   physical memory, which is a common configuration), allocating 4GB
   of virtual address space for shadow stack per process/thread can
   lead to virtual memory allocation failures when the overcommit mode
   is set to OVERCOMMIT_GUESS or OVERCOMMIT_NEVER:
   Error: "__vm_enough_memory: not enough memory for the allocation"

This reduces virtual address space consumption by 50% while maintaining
more than adequate space for return address storage.

Additionally, add a max(PAGE_SIZE, size) constraint, which covers the
case where RLIMIT_STACK is smaller than PAGE_SIZE.

Signed-off-by: Zong Li <zong.li@sifive.com>
---

Changed in v2:
- Add max() in case RLIMIT_STACK is smaller than PAGE_SIZE. Suggested by
  Paul Walmsley and Sashiko

Changed in v1:
- Use min() instead of min_t(). Suggested by David Laight

 arch/riscv/kernel/usercfi.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
index 6eaa0d94fdfe..0f75e8f5d0ec 100644
--- a/arch/riscv/kernel/usercfi.c
+++ b/arch/riscv/kernel/usercfi.c
@@ -109,15 +109,17 @@ void set_indir_lp_lock(struct task_struct *task, bool lock)
 	task->thread_info.user_cfi_state.ufcfi_locked = lock;
 }
 /*
- * If size is 0, then to be compatible with regular stack we want it to be as big as
- * regular stack. Else PAGE_ALIGN it and return back
+ * The shadow stack only stores the return address and not any variables
+ * 2G should be more than sufficient for most applications.
  */
 static unsigned long calc_shstk_size(unsigned long size)
 {
 	if (size)
 		return PAGE_ALIGN(size);
 
-	return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G));
+	size = PAGE_ALIGN(min(rlimit(RLIMIT_STACK) / 2, SZ_2G));
+
+	return max(PAGE_SIZE, size);
 }
 
 /*
-- 
2.43.7
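
The default-size calculation in the patch can be modelled outside the
kernel. What follows is only a minimal userspace sketch, not the kernel
code: the sketch_* names, the 4 KiB page size, and passing the stack
limit in as a parameter instead of calling rlimit(RLIMIT_STACK) are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration: 4 KiB pages, 64-bit arithmetic. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_SZ_2G     (2ULL << 30)
#define SKETCH_SZ_4G     (4ULL << 30)

/* Round x up to the next multiple of the page size, like PAGE_ALIGN(). */
static uint64_t sketch_page_align(uint64_t x)
{
	return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

static uint64_t sketch_min(uint64_t a, uint64_t b) { return a < b ? a : b; }
static uint64_t sketch_max(uint64_t a, uint64_t b) { return a > b ? a : b; }

/* Default before the patch: min(RLIMIT_STACK, 4G), page aligned. */
static uint64_t shstk_default_old(uint64_t stack_rlimit)
{
	return sketch_page_align(sketch_min(stack_rlimit, SKETCH_SZ_4G));
}

/* Default after the patch: min(RLIMIT_STACK / 2, 2G), at least one page. */
static uint64_t shstk_default_new(uint64_t stack_rlimit)
{
	uint64_t size;

	size = sketch_page_align(sketch_min(stack_rlimit / 2, SKETCH_SZ_2G));
	return sketch_max(SKETCH_PAGE_SIZE, size);
}

/* Hypothetical self-check: an 8 MiB limit and an effectively unlimited one. */
static void sketch_self_check(void)
{
	assert(shstk_default_old(8ULL << 20) == (8ULL << 20));
	assert(shstk_default_new(8ULL << 20) == (4ULL << 20));
	assert(shstk_default_old(UINT64_MAX) == SKETCH_SZ_4G);
	assert(shstk_default_new(UINT64_MAX) == SKETCH_SZ_2G);
}
```

In this model the max() only changes the result when the limit is 0 or
1 byte: any limit of 2 bytes or more already yields at least one page
after the round-up.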
* Re: [PATCH v3] riscv: cif: reduce shadow stack size limit from 4GB to 2GB
  2026-05-14  7:50 [PATCH v3] riscv: cif: reduce shadow stack size limit from 4GB to 2GB Zong Li
@ 2026-05-14  8:56 ` David Laight
  2026-05-15  3:42   ` Zong Li
  0 siblings, 1 reply; 3+ messages in thread
From: David Laight @ 2026-05-14  8:56 UTC
  To: Zong Li; +Cc: pjw, palmer, aou, alex, debug, linux-riscv, linux-kernel

On Thu, 14 May 2026 00:50:35 -0700
Zong Li <zong.li@sifive.com> wrote:

> Follow the ARM64 GCS (Guarded Control Stack) implementation approach
> by reducing the shadow stack size allocation from min(RLIMIT_STACK, 4GB)
> to min(RLIMIT_STACK/2, 2GB). See commit 506496bcbb42 ("arm64/gcs: Ensure
> that new threads have a GCS").
>
> Rationale:
>
> 1. Shadow stacks only store return addresses (8 bytes per entry), not
>    local variables, function parameters, or saved registers. A 2GB
>    shadow stack is far more than sufficient for any practical
>    application, even with extremely deep recursion. Using half the size
>    maintains an adequate margin while being more resource-efficient.
>
> 2. On memory-constrained systems (e.g., platforms with only 4GB of
>    physical memory, which is a common configuration), allocating 4GB
>    of virtual address space for shadow stack per process/thread can
>    lead to virtual memory allocation failures when the overcommit mode
>    is set to OVERCOMMIT_GUESS or OVERCOMMIT_NEVER:
>    Error: "__vm_enough_memory: not enough memory for the allocation"
>
> This reduces virtual address space consumption by 50% while maintaining
> more than adequate space for return address storage.
>
> Additionally, add a max(PAGE_SIZE, size) constraint, which covers the
> case where RLIMIT_STACK is smaller than PAGE_SIZE.
>
> Signed-off-by: Zong Li <zong.li@sifive.com>
> ---
>
> Changed in v2:
> - Add max() in case RLIMIT_STACK is smaller than PAGE_SIZE. Suggested by
>   Paul Walmsley and Sashiko
>
> Changed in v1:
> - Use min() instead of min_t(). Suggested by David Laight
>
>  arch/riscv/kernel/usercfi.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
> index 6eaa0d94fdfe..0f75e8f5d0ec 100644
> --- a/arch/riscv/kernel/usercfi.c
> +++ b/arch/riscv/kernel/usercfi.c
> @@ -109,15 +109,17 @@ void set_indir_lp_lock(struct task_struct *task, bool lock)
>  	task->thread_info.user_cfi_state.ufcfi_locked = lock;
>  }
>  /*
> - * If size is 0, then to be compatible with regular stack we want it to be as big as
> - * regular stack. Else PAGE_ALIGN it and return back
> + * The shadow stack only stores the return address and not any variables
> + * 2G should be more than sufficient for most applications.
>   */
>  static unsigned long calc_shstk_size(unsigned long size)
>  {
>  	if (size)
>  		return PAGE_ALIGN(size);
>
> -	return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G));
> +	size = PAGE_ALIGN(min(rlimit(RLIMIT_STACK) / 2, SZ_2G));

PAGE_ALIGN() already rounds up, so the only problem would be if
rlimit(STACK) were 0 or 1. I'm sure that would fail earlier (or not be
allowed).

I also don't understand the rationale for just /2 and the 2G upper limit.
You need 512 nested function calls to even use 4k.
That would have to be quite deep recursion.

-- David

> +
> +	return max(PAGE_SIZE, size);
>  }
>
>  /*
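
The "512 nested function calls" figure follows from the 8-byte entry
size given in the commit message: 4096 / 8 = 512. A small sketch of that
arithmetic is below; the function and macro names are hypothetical, and
it assumes every call pushes exactly one 8-byte entry and nothing else
is stored on the shadow stack.

```c
#include <assert.h>
#include <stdint.h>

/* One return address per call, 8 bytes each (per the commit message). */
#define SKETCH_SHSTK_ENTRY_BYTES 8ULL

/* Deepest call nesting that fits in a shadow stack of the given size. */
static uint64_t shstk_max_depth(uint64_t shstk_bytes)
{
	return shstk_bytes / SKETCH_SHSTK_ENTRY_BYTES;
}

/* Hypothetical self-check for the sizes discussed in this thread. */
static void shstk_depth_self_check(void)
{
	assert(shstk_max_depth(4096ULL) == 512ULL);             /* one 4k page */
	assert(shstk_max_depth(2ULL << 30) == 268435456ULL);    /* 2G: 2^28    */
	assert(shstk_max_depth(4ULL << 30) == 536870912ULL);    /* 4G: 2^29    */
}
```

Under these assumptions a 2G shadow stack holds 2^28 (about 268 million)
nested calls, which puts the proposed upper limit in perspective.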
* Re: [PATCH v3] riscv: cif: reduce shadow stack size limit from 4GB to 2GB
  2026-05-14  8:56 ` David Laight
@ 2026-05-15  3:42   ` Zong Li
  0 siblings, 0 replies; 3+ messages in thread
From: Zong Li @ 2026-05-15  3:42 UTC
  To: David Laight; +Cc: pjw, palmer, aou, alex, debug, linux-riscv, linux-kernel

On Thu, May 14, 2026 at 4:56 PM David Laight
<david.laight.linux@gmail.com> wrote:
>
> On Thu, 14 May 2026 00:50:35 -0700
> Zong Li <zong.li@sifive.com> wrote:
>
> > Follow the ARM64 GCS (Guarded Control Stack) implementation approach
> > by reducing the shadow stack size allocation from min(RLIMIT_STACK, 4GB)
> > to min(RLIMIT_STACK/2, 2GB). See commit 506496bcbb42 ("arm64/gcs: Ensure
> > that new threads have a GCS").
> >
> > Rationale:
> >
> > 1. Shadow stacks only store return addresses (8 bytes per entry), not
> >    local variables, function parameters, or saved registers. A 2GB
> >    shadow stack is far more than sufficient for any practical
> >    application, even with extremely deep recursion. Using half the size
> >    maintains an adequate margin while being more resource-efficient.
> >
> > 2. On memory-constrained systems (e.g., platforms with only 4GB of
> >    physical memory, which is a common configuration), allocating 4GB
> >    of virtual address space for shadow stack per process/thread can
> >    lead to virtual memory allocation failures when the overcommit mode
> >    is set to OVERCOMMIT_GUESS or OVERCOMMIT_NEVER:
> >    Error: "__vm_enough_memory: not enough memory for the allocation"
> >
> > This reduces virtual address space consumption by 50% while maintaining
> > more than adequate space for return address storage.
> >
> > Additionally, add a max(PAGE_SIZE, size) constraint, which covers the
> > case where RLIMIT_STACK is smaller than PAGE_SIZE.
> >
> > Signed-off-by: Zong Li <zong.li@sifive.com>
> > ---
> >
> > Changed in v2:
> > - Add max() in case RLIMIT_STACK is smaller than PAGE_SIZE. Suggested by
> >   Paul Walmsley and Sashiko
> >
> > Changed in v1:
> > - Use min() instead of min_t(). Suggested by David Laight
> >
> >  arch/riscv/kernel/usercfi.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
> > index 6eaa0d94fdfe..0f75e8f5d0ec 100644
> > --- a/arch/riscv/kernel/usercfi.c
> > +++ b/arch/riscv/kernel/usercfi.c
> > @@ -109,15 +109,17 @@ void set_indir_lp_lock(struct task_struct *task, bool lock)
> >  	task->thread_info.user_cfi_state.ufcfi_locked = lock;
> >  }
> >  /*
> > - * If size is 0, then to be compatible with regular stack we want it to be as big as
> > - * regular stack. Else PAGE_ALIGN it and return back
> > + * The shadow stack only stores the return address and not any variables
> > + * 2G should be more than sufficient for most applications.
> >   */
> >  static unsigned long calc_shstk_size(unsigned long size)
> >  {
> >  	if (size)
> >  		return PAGE_ALIGN(size);
> >
> > -	return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G));
> > +	size = PAGE_ALIGN(min(rlimit(RLIMIT_STACK) / 2, SZ_2G));
>
> PAGE_ALIGN() already rounds up, so the only problem would be if
> rlimit(STACK) were 0 or 1. I'm sure that would fail earlier (or not be
> allowed).

Thank you for pointing this out. If this case cannot happen, I will
remove the max() check.

>
> I also don't understand the rationale for just /2 and the 2G upper limit.
> You need 512 nested function calls to even use 4k.
> That would have to be quite deep recursion.

During the discussions about the ARM GCS v3 series, the community
pointed out that a 4G shadow stack might be too large. This size is hard
to support in memory-constrained environments like Android. However, the
size cannot be too small either, or we might face stack overflow issues.
At that time, no ideal size was agreed on. In later versions, the GCS
implementation cut the size in half, reducing it to 2G. I did not see
any detailed analysis for this change; it seemed to be just a simple
reduction:
https://patchew.org/linux/20230807-arm64-gcs-v4-0-68cfa37f9069@kernel.org/20230807-arm64-gcs-v4-19-68cfa37f9069@kernel.org/

My patch also aims to reduce the shadow stack size from 4G, as 4G might
be too large for many systems, especially embedded systems. For now, we
are simply aligning with the 2G limit used in GCS. Since 2G has already
been accepted by the community and by the Android system, it should be a
safe starting point. We can use 2G for now, until we find that it no
longer meets common needs.

I do not have a strong opinion on the default value. Do you have a
suggestion for the size? If so, I would happily use your suggested value
in the next version.

Thanks

>
> -- David
>
> > +
> > +	return max(PAGE_SIZE, size);
> >  }
> >
> >  /*
>