From: David Laight <david.laight.linux@gmail.com>
To: Zong Li <zong.li@sifive.com>
Cc: pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu,
	alex@ghiti.fr, debug@rivosinc.com,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] riscv: cif: reduce shadow stack size limit from 4GB to 2GB
Date: Mon, 18 May 2026 10:57:25 +0100
Message-ID: <20260518105725.7afe7a4c@pumpkin>
In-Reply-To: <CANXhq0qOjCVOQfCe5QqNU+UksAzK3F4OLYAFSfsBrRP3zo9GUg@mail.gmail.com>

On Mon, 18 May 2026 11:54:32 +0800
Zong Li <zong.li@sifive.com> wrote:

> On Sat, May 16, 2026 at 3:16 AM David Laight
> <david.laight.linux@gmail.com> wrote:
> >
> > On Fri, 15 May 2026 22:29:05 +0800
> > Zong Li <zong.li@sifive.com> wrote:
> >  
> > > On Fri, May 15, 2026 at 5:24 PM David Laight
> > > <david.laight.linux@gmail.com> wrote:  
> > > >
> > > > On Fri, 15 May 2026 11:42:45 +0800
> > > > Zong Li <zong.li@sifive.com> wrote:
> > > >  
> > > > > On Thu, May 14, 2026 at 4:56 PM David Laight  
> > > > ..  
> > > > > > > I also don't understand the rationale for just /2 and the 2G upper limit.
> > > > > > You need 512 nested function calls to even use 4k.
> > > > > > That would have to be quite deep recursion.  
> > > > >
> > > > > > During the discussions about the ARM GCS v3 series, the community pointed
> > > > > > out that a 4G shadow stack might be too large. This size is hard to
> > > > > support in memory-constrained environments like Android. However, the
> > > > > size cannot be too small either, or we might face stack overflow
> > > > > issues. At that time, a perfect size was not decided.  
> > > >
> > > > It is only VA not real memory so shouldn't make much difference to memory
> > > > use (except for nommu where the actual memory has to be allocated).
> > > >  
> > >
> > > You raise a valid point that shadow stacks are primarily a VA
> > > allocation. However, in Linux, the memory overcommit mechanism creates
> > > a practical link between VA allocation and physical memory capacity.
> > > As I mentioned in the commit message, memory allocation will fail when
> > > the overcommit mode is set to OVERCOMMIT_GUESS or OVERCOMMIT_NEVER.
> > >
> > > In __vm_enough_memory:
> > >         if (pages > totalram_pages() + total_swap_pages)
> > >                 goto error;
> > >
> > > Many page requests for VA will fail if the requested size exceeds the
> > > system's total RAM plus Swap. On memory-constrained systems,
> > > allocating a massive 4GB shadow stack per thread would immediately
> > > trigger this error.  
> >
> > But reducing the size by half makes little difference.
> > You'd need a much bigger reduction to make any real difference.
> >  
> 
> I agree with you that a smaller size would cover more cases. I am very
> open to your ideas regarding the size. Would you prefer to use 1GB or
> 512MB as the default instead?
> As I mentioned in my previous emails, using 2GB seems to be a safe
> starting point. This is because it is already accepted by the
> community and the Android system (in GCS implementation).
> Additionally, although the CFI feature doesn't support 32-bit systems
> yet, normal 32-bit systems can only support up to 4GB of physical
> memory. If the default shadow stack size is 4GB, it would be almost
> impossible to run on a 32-bit system. Using at least 2GB can help
> avoid this issue in the future. If you don't have a preferred default
> value, maybe we could start with 2G?

I've no real idea - note that the rlimit value should be small for 32bit
(or at least the actual stack is small regardless of the rlimit value).
The 2G is just an upper bound - probably matching the 4G upper bound
for the 64bit stack itself.

Don't focus on the 2G limit, but on the rlimit(STACK)/2 (or rather the size
of the normal stack).
On my systems the default soft limit is 8M; a 4M shadow stack supports 512k
nested function calls (64bit), none of which can have any local data.
In reality, programs that use a lot of stack allocate large buffers on the
stack; they don't have silly depths of recursive functions with no local data.
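
To make the arithmetic above concrete, here is a minimal sketch. It
assumes each nested call consumes exactly one shadow stack entry (8 bytes
on 64bit); shstk_max_depth() is a hypothetical helper for illustration,
not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the deepest call nesting a shadow stack of
 * shstk_bytes can hold, assuming one entry (a return address) per call.
 * entry_bytes is 8 on 64bit.
 */
static uint64_t shstk_max_depth(uint64_t shstk_bytes, uint64_t entry_bytes)
{
	assert(entry_bytes > 0);
	return shstk_bytes / entry_bytes;
}
```

With 8-byte entries, a single 4k page holds 512 nested calls, and a 4M
shadow stack (rlimit(STACK)/2 for an 8M soft limit) holds 512k.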

I've just looked at vmlinux.o - which won't be representative of userspace!
While there are a lot of functions with a small stack frame (sub $0x10,%rsp)
they tend to have saved a few registers on the stack first.
The majority will have a stack delta of over 64 bytes.
Each call then uses 8 bytes of shadow stack per 64 bytes of normal stack,
which corresponds to rlimit(STACK)/8, and even that is conservative.

I'd suspect you could safely halve that again.
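
A rough sketch of that sizing rule, assuming every call uses at least
min_frame_bytes of normal stack and one entry_bytes shadow stack entry,
so the normal stack runs out first. shstk_size_for_stack() and its
parameters are hypothetical, and rounding up to whole pages is my
assumption rather than anything in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: shadow stack size (rounded up to whole pages,
 * minimum one page) that the normal stack cannot outgrow, given a lower
 * bound on the normal stack consumed per call.
 */
static uint64_t shstk_size_for_stack(uint64_t stack_bytes,
				     uint64_t min_frame_bytes,
				     uint64_t entry_bytes,
				     uint64_t page_bytes)
{
	uint64_t depth, need, pages;

	assert(entry_bytes > 0 && page_bytes > 0);
	assert(min_frame_bytes >= entry_bytes);

	/* Deepest nesting the normal stack allows before it overflows. */
	depth = stack_bytes / min_frame_bytes;
	/* One shadow stack entry per nested call. */
	need = depth * entry_bytes;

	pages = (need + page_bytes - 1) / page_bytes;
	if (pages == 0)
		pages = 1;
	return pages * page_bytes;
}
```

For an 8M stack with 64-byte frames and 8-byte entries this gives 1M,
i.e. rlimit(STACK)/8; assuming 128-byte frames halves it to 512k.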

Actually would it be possible to initially just allocate one page?
If you get an overflow fault on the shadow stack I think you can
safely reallocate it at an entirely different user virtual address.
That would remove all the problems with committing a lot of swap.
Most threads will never do the 512 nested calls needed to blow the stack.

-- David

> 
> 
> > -- David
> >  
> > >  
> > > > But 32bit programs with lots of threads can run out of VA.
> > > > > Increasing the stack VA size by 50% might even give problems for 64bit
> > > > > programs - if they are already reducing the thread stack size to avoid
> > > > > running out of VA.
> > > >
> > > > > I've not checked, but pthread_attr_setstacksize() sets a limit for the
> > > > > thread stack size (which would otherwise default to rlimit(STACK)).
> > > > I don't believe it should update the rlimit value itself.
> > > > In which case you are using the wrong size.
> > > >
> > > > But for a thread with a very reduced stack (say 128k) you probably only
> > > > need 1 page of shadow stack, any more could easily lead to running out
> > > > of VA.
> > > >
> > > > -- David  
> > >  
> >  

