linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
@ 2025-09-21 13:21 Mark Brown
  2025-09-21 13:21 ` [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads Mark Brown
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Mark Brown @ 2025-09-21 13:21 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Christian Brauner,
	Adhemerval Zanella Netto, Shuah Khan
  Cc: Rick Edgecombe, Deepak Gupta, Wilco Dijkstra, Carlos O'Donell,
	Florian Weimer, Szabolcs Nagy, Rich Felker, libc-alpha,
	linux-arm-kernel, linux-kernel, linux-kselftest, Mark Brown

During the discussion of the clone3() support for shadow stacks, concerns
were raised from the glibc side because it is not possible to reuse the
allocated shadow stack[1]. This means that the benefit of being able to
manage allocations is greatly reduced; for example, it is not possible
to integrate the shadow stacks into the glibc thread stack cache. The
stack can be inspected, but otherwise it would have to be unmapped and
remapped before it could be used again, and it's not clear that this is
better than managing things in the kernel.

In that discussion I suggested that we could enable reuse by writing a
token to the shadow stack of exiting threads, mirroring how the
userspace stack pivot instructions write a token to the outgoing stack.
As mentioned by Florian[2], glibc already unwinds the stack and exits the
thread from the start routine, which would integrate nicely with this:
the shadow stack pointer will be at the same place as it was when the
thread started.

A token would not be written if the thread doesn't exit cleanly; that
seems viable to me - users should probably handle this by double
checking that a token is present after waiting for the thread.
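
To illustrate the kind of check I have in mind, here is a minimal sketch
(it reuses the GCS_CAP_ADDR_MASK and GCS_CAP_VALID_TOKEN definitions from
the arm64 selftests' gcs-util.h, the helper name and parameters are purely
illustrative, and it assumes the thread unwound fully before exiting so
GCSPR_EL0 is back where it started):

/*
 * Sketch only: after waiting for a thread that ran on a GCS we
 * allocated at gcs_base with map_shadow_stack(), check whether an
 * exit token was left in the top entry so the GCS can be handed to
 * a new thread.  Needs <stdint.h>/<stddef.h> plus the GCS_CAP_*
 * definitions from tools/testing/selftests/arm64/gcs/gcs-util.h.
 */
static bool gcs_reusable(uint64_t *gcs_base, size_t gcs_size)
{
        uint64_t *top = gcs_base + (gcs_size / sizeof(uint64_t)) - 1;
        uint64_t expected = ((uint64_t)top & GCS_CAP_ADDR_MASK) |
                            GCS_CAP_VALID_TOKEN;

        return *top == expected;
}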

This is tagged as an RFC since I put it together fairly quickly to
demonstrate the proposal, and the suggestion hasn't had much response
either way from the glibc developers.  At the very least we don't
currently handle scheduling during exit(), or distinguish why the thread
is exiting.  I've also not done anything about x86.

[1] https://marc.info/?l=glibc-alpha&m=175821637429537&w=2
[2] https://marc.info/?l=glibc-alpha&m=175733266913483&w=2

Signed-off-by: Mark Brown <broonie@kernel.org>
---
Mark Brown (3):
      arm64/gcs: Support reuse of GCS for exited threads
      kselftest/arm64: Validate PR_SHADOW_STACK_EXIT_TOKEN in basic-gcs
      kselftest/arm64: Add PR_SHADOW_STACK_EXIT_TOKEN to gcs-locking

 arch/arm64/include/asm/gcs.h                    |   3 +-
 arch/arm64/mm/gcs.c                             |  25 ++++-
 include/uapi/linux/prctl.h                      |   1 +
 tools/testing/selftests/arm64/gcs/basic-gcs.c   | 121 ++++++++++++++++++++++++
 tools/testing/selftests/arm64/gcs/gcs-locking.c |  23 +++++
 tools/testing/selftests/arm64/gcs/gcs-util.h    |   3 +-
 6 files changed, 173 insertions(+), 3 deletions(-)
---
base-commit: 0b67d4b724b4afed2690c21bef418b8a803c5be2
change-id: 20250919-arm64-gcs-exit-token-82c3c2570aad
prerequisite-change-id: 20231019-clone3-shadow-stack-15d40d2bf536

Best regards,
--  
Mark Brown <broonie@kernel.org>




* [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads
  2025-09-21 13:21 [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks Mark Brown
@ 2025-09-21 13:21 ` Mark Brown
  2025-09-25 16:46   ` Catalin Marinas
  2025-09-21 13:21 ` [PATCH RFC 2/3] kselftest/arm64: Validate PR_SHADOW_STACK_EXIT_TOKEN in basic-gcs Mark Brown
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Mark Brown @ 2025-09-21 13:21 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Christian Brauner,
	Adhemerval Zanella Netto, Shuah Khan
  Cc: Rick Edgecombe, Deepak Gupta, Wilco Dijkstra, Carlos O'Donell,
	Florian Weimer, Szabolcs Nagy, Rich Felker, libc-alpha,
	linux-arm-kernel, linux-kernel, linux-kselftest, Mark Brown

Currently, when a thread with a userspace allocated stack exits, it is not
possible for the GCS that was in use by the thread to be reused since no
stack switch token will be left on the stack, preventing use with
clone3() or userspace stack switching. The only thing userspace can
realistically do with the GCS is inspect it or unmap it, which is not
ideal.

Enable reuse by modelling thread exit like pivoting in userspace with
the stack pivot instructions, writing a stack switch token at the top
entry of the GCS of the exiting thread.  This allows userspace to switch
back to the GCS in future; the use of the current stack location should
work well with glibc's current behaviour of fully unwinding the stack of
threads that exit cleanly.

This patch is an RFC and should not be applied as-is. Currently the
token will only be written for the current thread, but will be written
regardless of the reason the thread is exiting. This version of the
patch does not handle scheduling during exit() at all; the code is racy.

The feature is gated behind a new GCS mode flag PR_SHADOW_STACK_EXIT_TOKEN
to ensure that userspace that does not wish to use the tokens never has
to see them.
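
For reference, userspace would opt in along the lines of the sketch below,
which mirrors the prctl() call made by the selftest added in the next
patch (the wrapper function and error handling are purely illustrative):

#include <stdio.h>
#include <sys/prctl.h>

/*
 * Sketch: enable GCS with exit tokens for the current thread.
 * PR_SHADOW_STACK_EXIT_TOKEN is the flag added by this patch, the
 * other constants are the existing shadow stack prctl() ABI.
 */
static void enable_gcs_with_exit_token(void)
{
        if (prctl(PR_SET_SHADOW_STACK_STATUS,
                  PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_EXIT_TOKEN,
                  0, 0, 0))
                perror("PR_SET_SHADOW_STACK_STATUS");
}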

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/gcs.h |  3 ++-
 arch/arm64/mm/gcs.c          | 25 ++++++++++++++++++++++++-
 include/uapi/linux/prctl.h   |  1 +
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/gcs.h b/arch/arm64/include/asm/gcs.h
index b4bbec9382a1..1ec359d0ad51 100644
--- a/arch/arm64/include/asm/gcs.h
+++ b/arch/arm64/include/asm/gcs.h
@@ -52,7 +52,8 @@ static inline u64 gcsss2(void)
 }
 
 #define PR_SHADOW_STACK_SUPPORTED_STATUS_MASK \
-	(PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | PR_SHADOW_STACK_PUSH)
+	(PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | \
+	 PR_SHADOW_STACK_PUSH | PR_SHADOW_STACK_EXIT_TOKEN)
 
 #ifdef CONFIG_ARM64_GCS
 
diff --git a/arch/arm64/mm/gcs.c b/arch/arm64/mm/gcs.c
index fd1d5a6655de..4649c2b107a7 100644
--- a/arch/arm64/mm/gcs.c
+++ b/arch/arm64/mm/gcs.c
@@ -199,14 +199,37 @@ void gcs_set_el0_mode(struct task_struct *task)
 
 void gcs_free(struct task_struct *task)
 {
+	unsigned long __user *cap_ptr;
+	unsigned long cap_val;
+	int ret;
+
 	if (!system_supports_gcs())
 		return;
 
 	if (!task->mm || task->mm != current->mm)
 		return;
 
-	if (task->thread.gcs_base)
+	if (task->thread.gcs_base) {
 		vm_munmap(task->thread.gcs_base, task->thread.gcs_size);
+	} else if (task == current &&
+		   task->thread.gcs_el0_mode & PR_SHADOW_STACK_EXIT_TOKEN) {
+		cap_ptr = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);
+		cap_ptr--;
+		cap_val = GCS_CAP(cap_ptr);
+
+		/*
+		 * We can't do anything constructive if this fails,
+		 * and the thread might be exiting due to being in a
+		 * bad state anyway.
+		 */
+		put_user_gcs(cap_val, cap_ptr, &ret);
+
+		/*
+		 * Ensure the new cap is ordered before standard
+		 * memory accesses to the same location.
+		 */
+		gcsb_dsync();
+	}
 
 	task->thread.gcspr_el0 = 0;
 	task->thread.gcs_base = 0;
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index ed3aed264aeb..c3c37c39639f 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -352,6 +352,7 @@ struct prctl_mm_map {
 # define PR_SHADOW_STACK_ENABLE         (1UL << 0)
 # define PR_SHADOW_STACK_WRITE		(1UL << 1)
 # define PR_SHADOW_STACK_PUSH		(1UL << 2)
+# define PR_SHADOW_STACK_EXIT_TOKEN	(1UL << 3)
 
 /*
  * Prevent further changes to the specified shadow stack

-- 
2.47.2




* [PATCH RFC 2/3] kselftest/arm64: Validate PR_SHADOW_STACK_EXIT_TOKEN in basic-gcs
  2025-09-21 13:21 [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks Mark Brown
  2025-09-21 13:21 ` [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads Mark Brown
@ 2025-09-21 13:21 ` Mark Brown
  2025-09-21 13:21 ` [PATCH RFC 3/3] kselftest/arm64: Add PR_SHADOW_STACK_EXIT_TOKEN to gcs-locking Mark Brown
  2025-09-25 20:40 ` [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks Edgecombe, Rick P
  3 siblings, 0 replies; 22+ messages in thread
From: Mark Brown @ 2025-09-21 13:21 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Christian Brauner,
	Adhemerval Zanella Netto, Shuah Khan
  Cc: Rick Edgecombe, Deepak Gupta, Wilco Dijkstra, Carlos O'Donell,
	Florian Weimer, Szabolcs Nagy, Rich Felker, libc-alpha,
	linux-arm-kernel, linux-kernel, linux-kselftest, Mark Brown

Add a simple test which validates that exit tokens can be written to the
GCS of exiting threads. The test goes in basic-gcs since this
functionality is expected to be used by libcs.

We should add further tests which validate behaviour when the option is absent.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 tools/testing/selftests/arm64/gcs/basic-gcs.c | 121 ++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

diff --git a/tools/testing/selftests/arm64/gcs/basic-gcs.c b/tools/testing/selftests/arm64/gcs/basic-gcs.c
index 54f9c888249d..5515a5425186 100644
--- a/tools/testing/selftests/arm64/gcs/basic-gcs.c
+++ b/tools/testing/selftests/arm64/gcs/basic-gcs.c
@@ -360,6 +360,126 @@ static bool test_vfork(void)
 	return pass;
 }
 
+/* We can reuse a shadow stack with an exit token */
+static bool test_exit_token(void)
+{
+	struct clone_args clone_args;
+	int ret, status;
+	static bool pass = true; /* These will be used in the thread */
+	static uint64_t expected_cap;
+	static int elem;
+	static uint64_t *buf;
+
+	/* Ensure we've got exit tokens enabled here */
+	ret = my_syscall5(__NR_prctl, PR_SET_SHADOW_STACK_STATUS,
+			  PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_EXIT_TOKEN,
+			  0, 0, 0);
+	if (ret != 0)
+		ksft_exit_fail_msg("Failed to enable exit token: %d\n", ret);
+
+	buf = (void *)my_syscall3(__NR_map_shadow_stack, 0, page_size,
+				  SHADOW_STACK_SET_TOKEN);
+	if (buf == MAP_FAILED) {
+		ksft_print_msg("Failed to map %lu byte GCS: %d\n",
+			       page_size, errno);
+		return false;
+	}
+	ksft_print_msg("Mapped GCS at %p-%p\n", buf,
+		       (void *)((uint64_t)buf + page_size));
+
+	/* We should have a cap token */
+	elem = (page_size / sizeof(uint64_t)) - 1;
+	expected_cap = ((uint64_t)buf + page_size - 8);
+	expected_cap &= GCS_CAP_ADDR_MASK;
+	expected_cap |= GCS_CAP_VALID_TOKEN;
+	if (buf[elem] != expected_cap) {
+		ksft_print_msg("Cap entry is 0x%llx not 0x%llx\n",
+			       buf[elem], expected_cap);
+		pass = false;
+	}
+	ksft_print_msg("cap token is 0x%llx\n", buf[elem]);
+
+	memset(&clone_args, 0, sizeof(clone_args));
+	clone_args.exit_signal = SIGCHLD;
+	clone_args.flags = CLONE_VM;
+	clone_args.shstk_token = (uint64_t)&(buf[elem]);
+	clone_args.stack = (uint64_t)malloc(page_size);
+	clone_args.stack_size = page_size;
+
+	if (!clone_args.stack) {
+		ksft_print_msg("Failed to allocate stack\n");
+		pass = false;
+	}
+
+	/* Don't try to clone if we're failing, we might hang */
+	if (!pass)
+		goto out;
+
+	/* There is no wrapper for clone3() in nolibc (or glibc) */
+	ret = my_syscall2(__NR_clone3, &clone_args, sizeof(clone_args));
+	if (ret == -1) {
+		ksft_print_msg("clone3() failed: %d\n", errno);
+		pass = false;
+		goto out;
+	}
+
+	if (ret == 0) {
+		/* In the child, make sure the token is gone */
+		if (buf[elem]) {
+			ksft_print_msg("GCS token was not consumed: %llx\n",
+				       buf[elem]);
+			pass = false;
+		}
+
+		/* Make sure we're using the right stack */
+		if ((uint64_t)get_gcspr() != (uint64_t)&buf[elem + 1]) {
+			ksft_print_msg("Child GCSPR_EL0 is %llx not %llx\n",
+				       (uint64_t)get_gcspr(),
+				       (uint64_t)&buf[elem + 1]);
+			pass = false;
+		}
+
+		/* We want to exit with *exactly* the same GCS pointer */
+		my_syscall1(__NR_exit, 0);
+	}
+
+	ksft_print_msg("Waiting for child %d\n", ret);
+
+	ret = waitpid(ret, &status, 0);
+	if (ret == -1) {
+		ksft_print_msg("Failed to wait for child: %d\n",
+			       errno);
+		pass = false;
+		goto out;
+	}
+
+	if (!WIFEXITED(status)) {
+		ksft_print_msg("Child exited due to signal %d\n",
+			       WTERMSIG(status));
+		pass = false;
+	} else {
+		if (WEXITSTATUS(status)) {
+			ksft_print_msg("Child exited with status %d\n",
+				       WEXITSTATUS(status));
+			pass = false;
+		}
+	}
+
+	/* The token should have been restored */
+	if (buf[elem] == expected_cap) {
+		ksft_print_msg("Cap entry restored\n");
+	} else {
+		ksft_print_msg("Cap entry is 0x%llx not 0x%llx\n",
+			       buf[elem], expected_cap);
+		pass = false;
+	}
+
+out:
+	free((void*)clone_args.stack);
+	munmap(buf, page_size);
+	return pass;
+}
+
 typedef bool (*gcs_test)(void);
 
 static struct {
@@ -377,6 +497,7 @@ static struct {
 	{ "map_guarded_stack", map_guarded_stack },
 	{ "fork", test_fork },
 	{ "vfork", test_vfork },
+	{ "exit_token", test_exit_token },
 };
 
 int main(void)

-- 
2.47.2




* [PATCH RFC 3/3] kselftest/arm64: Add PR_SHADOW_STACK_EXIT_TOKEN to gcs-locking
  2025-09-21 13:21 [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks Mark Brown
  2025-09-21 13:21 ` [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads Mark Brown
  2025-09-21 13:21 ` [PATCH RFC 2/3] kselftest/arm64: Validate PR_SHADOW_STACK_EXIT_TOKEN in basic-gcs Mark Brown
@ 2025-09-21 13:21 ` Mark Brown
  2025-09-25 20:40 ` [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks Edgecombe, Rick P
  3 siblings, 0 replies; 22+ messages in thread
From: Mark Brown @ 2025-09-21 13:21 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Christian Brauner,
	Adhemerval Zanella Netto, Shuah Khan
  Cc: Rick Edgecombe, Deepak Gupta, Wilco Dijkstra, Carlos O'Donell,
	Florian Weimer, Szabolcs Nagy, Rich Felker, libc-alpha,
	linux-arm-kernel, linux-kernel, linux-kselftest, Mark Brown

We have added PR_SHADOW_STACK_EXIT_TOKEN; ensure that locking works as
expected for it.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 tools/testing/selftests/arm64/gcs/gcs-locking.c | 23 +++++++++++++++++++++++
 tools/testing/selftests/arm64/gcs/gcs-util.h    |  3 ++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/arm64/gcs/gcs-locking.c b/tools/testing/selftests/arm64/gcs/gcs-locking.c
index 989f75a491b7..0e8928096918 100644
--- a/tools/testing/selftests/arm64/gcs/gcs-locking.c
+++ b/tools/testing/selftests/arm64/gcs/gcs-locking.c
@@ -77,6 +77,29 @@ FIXTURE_VARIANT_ADD(valid_modes, enable_write_push)
 		PR_SHADOW_STACK_PUSH,
 };
 
+FIXTURE_VARIANT_ADD(valid_modes, enable_token)
+{
+	.mode = PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_EXIT_TOKEN,
+};
+
+FIXTURE_VARIANT_ADD(valid_modes, enable_write_exit)
+{
+	.mode = PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE |
+		PR_SHADOW_STACK_EXIT_TOKEN,
+};
+
+FIXTURE_VARIANT_ADD(valid_modes, enable_push_exit)
+{
+	.mode = PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_PUSH |
+		PR_SHADOW_STACK_EXIT_TOKEN,
+};
+
+FIXTURE_VARIANT_ADD(valid_modes, enable_write_push_exit)
+{
+	.mode = PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE |
+		PR_SHADOW_STACK_PUSH | PR_SHADOW_STACK_EXIT_TOKEN,
+};
+
 FIXTURE_SETUP(valid_modes)
 {
 }
diff --git a/tools/testing/selftests/arm64/gcs/gcs-util.h b/tools/testing/selftests/arm64/gcs/gcs-util.h
index c99a6b39ac14..1abc9d122ac1 100644
--- a/tools/testing/selftests/arm64/gcs/gcs-util.h
+++ b/tools/testing/selftests/arm64/gcs/gcs-util.h
@@ -36,7 +36,8 @@ struct user_gcs {
 # define PR_SHADOW_STACK_PUSH		(1UL << 2)
 
 #define PR_SHADOW_STACK_ALL_MODES \
-	PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | PR_SHADOW_STACK_PUSH
+	PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | \
+	PR_SHADOW_STACK_PUSH | PR_SHADOW_STACK_EXIT_TOKEN
 
 #define SHADOW_STACK_SET_TOKEN (1ULL << 0)     /* Set up a restore token in the shadow stack */
 #define SHADOW_STACK_SET_MARKER (1ULL << 1)     /* Set up a top of stack merker in the shadow stack */

-- 
2.47.2




* Re: [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads
  2025-09-21 13:21 ` [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads Mark Brown
@ 2025-09-25 16:46   ` Catalin Marinas
  2025-09-25 17:01     ` Mark Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2025-09-25 16:46 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Christian Brauner, Adhemerval Zanella Netto,
	Shuah Khan, Rick Edgecombe, Deepak Gupta, Wilco Dijkstra,
	Carlos O'Donell, Florian Weimer, Szabolcs Nagy, Rich Felker,
	libc-alpha, linux-arm-kernel, linux-kernel, linux-kselftest

On Sun, Sep 21, 2025 at 02:21:35PM +0100, Mark Brown wrote:
> diff --git a/arch/arm64/mm/gcs.c b/arch/arm64/mm/gcs.c
> index fd1d5a6655de..4649c2b107a7 100644
> --- a/arch/arm64/mm/gcs.c
> +++ b/arch/arm64/mm/gcs.c
> @@ -199,14 +199,37 @@ void gcs_set_el0_mode(struct task_struct *task)
>  
>  void gcs_free(struct task_struct *task)
>  {
> +	unsigned long __user *cap_ptr;
> +	unsigned long cap_val;
> +	int ret;
> +
>  	if (!system_supports_gcs())
>  		return;
>  
>  	if (!task->mm || task->mm != current->mm)
>  		return;
> -	if (task->thread.gcs_base)
> +	if (task->thread.gcs_base) {
>  		vm_munmap(task->thread.gcs_base, task->thread.gcs_size);
> +	} else if (task == current &&
> +		   task->thread.gcs_el0_mode & PR_SHADOW_STACK_EXIT_TOKEN) {

I checked the code paths leading here and task is always current. But
better to keep the test in case the core code ever changes.

> +		cap_ptr = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);
> +		cap_ptr--;
> +		cap_val = GCS_CAP(cap_ptr);
> +
> +		/*
> +		 * We can't do anything constructive if this fails,
> +		 * and the thread might be exiting due to being in a
> +		 * bad state anyway.
> +		 */
> +		put_user_gcs(cap_val, cap_ptr, &ret);
> +
> +		/*
> +		 * Ensure the new cap is ordered before standard
> +		 * memory accesses to the same location.
> +		 */
> +		gcsb_dsync();
> +	}

The only downside is that, if the thread did not unwind properly, we
don't write the token where it was initially. We could save the token
address from clone3() and restore it there instead.

-- 
Catalin



* Re: [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads
  2025-09-25 16:46   ` Catalin Marinas
@ 2025-09-25 17:01     ` Mark Brown
  2025-09-25 18:36       ` Catalin Marinas
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Brown @ 2025-09-25 17:01 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Christian Brauner, Adhemerval Zanella Netto,
	Shuah Khan, Rick Edgecombe, Deepak Gupta, Wilco Dijkstra,
	Carlos O'Donell, Florian Weimer, Szabolcs Nagy, Rich Felker,
	libc-alpha, linux-arm-kernel, linux-kernel, linux-kselftest


On Thu, Sep 25, 2025 at 05:46:46PM +0100, Catalin Marinas wrote:
> On Sun, Sep 21, 2025 at 02:21:35PM +0100, Mark Brown wrote:

> > +	} else if (task == current &&
> > +		   task->thread.gcs_el0_mode & PR_SHADOW_STACK_EXIT_TOKEN) {

> I checked the code paths leading here and task is always current. But
> better to keep the test in case the core code ever changes.

We can't have scheduled?  That's actually a pleasant surprise, that was
the main hole I was thinking of in the cover letter.

> > +		/*
> > +		 * We can't do anything constructive if this fails,
> > +		 * and the thread might be exiting due to being in a
> > +		 * bad state anyway.
> > +		 */
> > +		put_user_gcs(cap_val, cap_ptr, &ret);
> > +
> > +		/*
> > +		 * Ensure the new cap is ordered before standard
> > +		 * memory accesses to the same location.
> > +		 */
> > +		gcsb_dsync();
> > +	}

> The only downside is that, if the thread did not unwind properly, we
> don't write the token where it was initially. We could save the token
> address from clone3() and restore it there instead.

If we do that and the thread pivots away to another GCS and exits from
there then we'll write the token onto a different stack.  Writing onto
the location that userspace provided when creating the thread should be
fine for glibc's needs but it feels like the wrong assumption to bake
in, to me it feels less bad to have to map a new GCS in the case where
we didn't unwind properly.  There will be overhead in doing that but the
thread is already exiting uncleanly so imposing a cost doesn't seem
disproportionate.



* Re: [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads
  2025-09-25 17:01     ` Mark Brown
@ 2025-09-25 18:36       ` Catalin Marinas
  2025-09-25 19:00         ` Mark Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2025-09-25 18:36 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Christian Brauner, Adhemerval Zanella Netto,
	Shuah Khan, Rick Edgecombe, Deepak Gupta, Wilco Dijkstra,
	Carlos O'Donell, Florian Weimer, Szabolcs Nagy, Rich Felker,
	libc-alpha, linux-arm-kernel, linux-kernel, linux-kselftest

On Thu, Sep 25, 2025 at 06:01:07PM +0100, Mark Brown wrote:
> On Thu, Sep 25, 2025 at 05:46:46PM +0100, Catalin Marinas wrote:
> > On Sun, Sep 21, 2025 at 02:21:35PM +0100, Mark Brown wrote:
> 
> > > +	} else if (task == current &&
> > > +		   task->thread.gcs_el0_mode & PR_SHADOW_STACK_EXIT_TOKEN) {
> 
> > I checked the code paths leading here and task is always current. But
> > better to keep the test in case the core code ever changes.
> 
> We can't have scheduled?  That's actually a pleasant surprise, that was
> the main hole I was thinking of in the cover letter.

Well, double-check. AFAICT, gcs_free() is only called on the exit_mm()
path when a thread dies.

I think gcs_free() may have been called in other contexts before the
cleanups you had in 6.16 (there were two more call sites for
gcs_free()). If that's the case, we could turn these checks into
WARN_ON_ONCE().

> > > +		/*
> > > +		 * We can't do anything constructive if this fails,
> > > +		 * and the thread might be exiting due to being in a
> > > +		 * bad state anyway.
> > > +		 */
> > > +		put_user_gcs(cap_val, cap_ptr, &ret);
> > > +
> > > +		/*
> > > +		 * Ensure the new cap is ordered before standard
> > > +		 * memory accesses to the same location.
> > > +		 */
> > > +		gcsb_dsync();
> > > +	}
> 
> > The only downside is that, if the thread did not unwind properly, we
> > don't write the token where it was initially. We could save the token
> > address from clone3() and restore it there instead.
> 
> If we do that and the thread pivots away to another GCS and exits from
> there then we'll write the token onto a different stack.  Writing onto
> the location that userspace provided when creating the thread should be
> fine for glibc's needs but it feels like the wrong assumption to bake
> in, to me it feels less bad to have to map a new GCS in the case where
> we didn't unwind properly.  There will be overhead in doing that but the
> thread is already exiting uncleanly so imposing a cost doesn't seem
> disproportionate.

You are right, that's the safest. glibc can always unmap the shadow
stack if the thread did not exit properly.

That said, does glibc ensure the thread unwinds its stack (and shadow
stack) on pthread_exit()? IIUC, it does, at least for the normal stack,
but I'm not familiar with the codebase.

-- 
Catalin



* Re: [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads
  2025-09-25 18:36       ` Catalin Marinas
@ 2025-09-25 19:00         ` Mark Brown
  2025-09-26 11:14           ` Catalin Marinas
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Brown @ 2025-09-25 19:00 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Christian Brauner, Adhemerval Zanella Netto,
	Shuah Khan, Rick Edgecombe, Deepak Gupta, Wilco Dijkstra,
	Carlos O'Donell, Florian Weimer, Szabolcs Nagy, Rich Felker,
	libc-alpha, linux-arm-kernel, linux-kernel, linux-kselftest


On Thu, Sep 25, 2025 at 07:36:50PM +0100, Catalin Marinas wrote:
> On Thu, Sep 25, 2025 at 06:01:07PM +0100, Mark Brown wrote:
> > On Thu, Sep 25, 2025 at 05:46:46PM +0100, Catalin Marinas wrote:
> > > On Sun, Sep 21, 2025 at 02:21:35PM +0100, Mark Brown wrote:

> > We can't have scheduled?  That's actually a pleasant surprise, that was
> > the main hole I was thinking of in the cover letter.

> Well, double-check. AFAICT, gcs_free() is only called on the exit_mm()
> path when a thread dies.

> I think gcs_free() may have been called in other contexts before the
> cleanups you had in 6.16 (there were two more call sites for
> gcs_free()). If that's the case, we could turn these checks into
> WARN_ON_ONCE().

Yeah, I just need to convince myself that we're always running the
exit_mm() path in the context of the exiting thread.  Like you say it
needs checking but hopefully you're right and the current code is more
correct than I had thought.

> > > The only downside is that, if the thread did not unwind properly, we
> > > don't write the token where it was initially. We could save the token
> > > address from clone3() and restore it there instead.

> > If we do that and the thread pivots away to another GCS and exits from
> > there then we'll write the token onto a different stack.  Writing onto
> > the location that userspace provided when creating the thread should be
> > fine for glibc's needs but it feels like the wrong assumption to bake
> > in, to me it feels less bad to have to map a new GCS in the case where
> > we didn't unwind properly.  There will be overhead in doing that but the
> > thread is already exiting uncleanly so imposing a cost doesn't seem
> > disproportionate.

> You are right, that's the safest. glibc can always unmap the shadow
> stack if the thread did not exit properly.

> That said, does glibc ensure the thread unwinds its stack (and shadow
> stack) on pthread_exit()? IIUC, it does, at least for the normal stack,
> but I'm not familiar with the codebase.

Florian indicated that it did in:

   https://marc.info/?l=glibc-alpha&m=175733266913483&w=2

I did look at the code to check, though I'm not at all familiar with the
codebase either so I'm not sure how much that check was worth.  If the
unwinder doesn't handle the shadow stack then userspace will be having a
very bad time anyway whenever it tries to run on an unwound stack so
doing so shouldn't be an additional cost there.



* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-21 13:21 [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks Mark Brown
                   ` (2 preceding siblings ...)
  2025-09-21 13:21 ` [PATCH RFC 3/3] kselftest/arm64: Add PR_SHADOW_STACK_EXIT_TOKEN to gcs-locking Mark Brown
@ 2025-09-25 20:40 ` Edgecombe, Rick P
  2025-09-25 23:22   ` Mark Brown
  2025-09-26 15:07   ` Yury Khrustalev
  3 siblings, 2 replies; 22+ messages in thread
From: Edgecombe, Rick P @ 2025-09-25 20:40 UTC (permalink / raw)
  To: shuah@kernel.org, broonie@kernel.org, brauner@kernel.org,
	will@kernel.org, catalin.marinas@arm.com,
	adhemerval.zanella@linaro.org
  Cc: nsz@port70.net, dalias@libc.org, debug@rivosinc.com,
	fweimer@redhat.com, libc-alpha@sourceware.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, wilco.dijkstra@arm.com,
	jeffxu@google.com, codonell@redhat.com,
	linux-kselftest@vger.kernel.org

On Sun, 2025-09-21 at 14:21 +0100, Mark Brown wrote:
> During the discussion of the clone3() support for shadow stacks, concerns
> were raised from the glibc side because it is not possible to reuse the
> allocated shadow stack[1]. This means that the benefit of being able to
> manage allocations is greatly reduced; for example, it is not possible
> to integrate the shadow stacks into the glibc thread stack cache. The
> stack can be inspected, but otherwise it would have to be unmapped and
> remapped before it could be used again, and it's not clear that this is
> better than managing things in the kernel.
> 
> In that discussion I suggested that we could enable reuse by writing a
> token to the shadow stack of exiting threads, mirroring how the
> userspace stack pivot instructions write a token to the outgoing stack.
> As mentioned by Florian[2] glibc already unwinds the stack and exits the
> thread from the start routine which would integrate nicely with this,
> the shadow stack pointer will be at the same place as it was when the
> thread started.
> 
> This would not write a token if the thread doesn't exit cleanly, that
> seems viable to me - users should probably handle this by double
> checking that a token is present after waiting for the thread.
> 
> This is tagged as a RFC since I put it together fairly quickly to
> demonstrate the proposal and the suggestion hasn't had much response
> either way from the glibc developers.  At the very least we don't
> currently handle scheduling during exit(), or distinguish why the thread
> is exiting.  I've also not done anything about x86.

Security-wise, it seems reasonable that if you are leaving a shadow stack,
you could leave a token behind. But for the userspace scheme to back up the SSP
by doing a longjmp() or similar I have some doubts. IIRC there were some cross
stack edge cases that we never figured out how to handle.

As far as re-using allocated shadow stacks, there is always the option to enable
WRSS (or similar) to write the shadow stack as well as longjmp at will.

I think we should see a fuller solution from the glibc side before adding new
kernel features like this. (apologies if I missed it). I wonder if we are
building something that will have an extremely complicated set of rules for what
types of stack operations should be expected to work.

Sort of related, I think we might think about msealing shadow stacks, which will
have trouble with a lot of these user managed shadow stack schemes. The reason
is that as long as shadow stacks can be unmapped while a thread is on them (say
a sleeping thread), a new shadow stack can be allocated in the same place with a
token. Then a second thread can consume the token and possibly corrupt the
shadow stack for the other thread with its own calls. I don't know how
realistic it is in practice, but it's something that guard gaps can't totally
prevent.

But for automatic thread created shadow stacks, there is no need to allow
userspace to unmap a shadow stack, so the automatically created stacks could
simply be msealed on creation and unmapped from the kernel. For a lot of apps
(most?) this would work perfectly fine.

I think we don't want 100 modes of shadow stack. If we have two, I'd think:
1. Msealed, simple more locked down kernel allocated shadow stack. Limited or
none user space managed shadow stacks.
2. WRSS enabled, clone3-preferred max compatibility shadow stack. Longjmp via
token writes and don't even have to think about taking signals while unwinding
across stacks, or whatever other edge case.

This RFC seems to be going down the path of addressing one edge case at a time.
Alone it's fine, but I'd rather punt these types of usages to (2) by default. 

Thoughts?


* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-25 20:40 ` [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks Edgecombe, Rick P
@ 2025-09-25 23:22   ` Mark Brown
  2025-09-25 23:58     ` Edgecombe, Rick P
  2025-09-26 15:07   ` Yury Khrustalev
  1 sibling, 1 reply; 22+ messages in thread
From: Mark Brown @ 2025-09-25 23:22 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: shuah@kernel.org, brauner@kernel.org, will@kernel.org,
	catalin.marinas@arm.com, adhemerval.zanella@linaro.org,
	nsz@port70.net, dalias@libc.org, debug@rivosinc.com,
	fweimer@redhat.com, libc-alpha@sourceware.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, wilco.dijkstra@arm.com,
	jeffxu@google.com, codonell@redhat.com, Yury Khrustalev,
	linux-kselftest@vger.kernel.org


On Thu, Sep 25, 2025 at 08:40:56PM +0000, Edgecombe, Rick P wrote:

> Security-wise, it seems reasonable that if you are leaving a shadow stack, that
> you could leave a token behind. But for the userspace scheme to back up the SSP
> by doing a longjmp() or similar I have some doubts. IIRC there were some cross
> stack edge cases that we never figured out how to handle.

I think those were around the use of alt stacks, which we don't
currently do for shadow stacks at all because we couldn't figure out
those edge cases.  Possibly there's others as well, though - the alt
stacks issues dominated discussion a bit.

AFAICT those issues exist anyway, if userspace is already unwinding as
part of thread exit then they'll exercise that code though perhaps be
saved from any issues by virtue of not actually doing any function
calls.  Anything that actually does a longjmp() with the intent to
continue will do so more thoroughly.

> As far as re-using allocated shadow stacks, there is always the option to enable
> WRSS (or similar) to write the shadow stack as well as longjmp at will.

That's obviously a substantial downgrade in security though.

> I think we should see a fuller solution from the glibc side before adding new
> kernel features like this. (apologies if I missed it). I wonder if we are

I agree that we want to see some userspace code here, I'm hoping this
can be used for prototyping.  Yury has some code for the clone3() part
of things in glibc on arm64 already, hopefully that can be extended to
include the shadow stack in the thread stack cache.

> building something that will have an extremely complicated set of rules for what
> types of stack operations should be expected to work.

I think restricted more than complex?

> Sort of related, I think we might think about msealing shadow stacks, which will
> have trouble with a lot of these user managed shadow stack schemes. The reason
> is that as long as shadow stacks can be unmapped while a thread is on them (say
> a sleeping thread), a new shadow stack can be allocated in the same place with a
> token. Then a second thread can consume the token and possibly corrupt the
> shadow stack for the other thread with it's own calls. I don't know how
> realistic it is in practice, but it's something that guard gaps can't totally
> prevent.

> But for automatic thread created shadow stacks, there is no need to allow
> userspace to unmap a shadow stack, so the automatically created stacks could
> simply be msealed on creation and unmapped from the kernel. For a lot of apps
> (most?) this would work perfectly fine.

Indeed, we should be able to just do that if we're mseal()ing system
mappings I think - most likely anything that has a problem with it
probably already has a problem with the existing mseal() stuff.  Yet another
reason we should be factoring more of this code out into the generic
code, like I say I'll try to look at that.

I do wonder if anyone would bother with those attacks if they've got
enough control over the process to do them, but equally a lot of this is
about how things chain together.

> I think we don't want 100 modes of shadow stack. If we have two, I'd think:
> 1. Msealed, simple more locked down kernel allocated shadow stack. Limited or
> none user space managed shadow stacks.
> 2. WRSS enabled, clone3-preferred max compatibility shadow stack. Longjmp via
> token writes and don't even have to think about taking signals while unwinding
> across stacks, or whatever other edge case.

I think the important thing from a kernel ABI point of view is to give
userspace the tools to do whatever it wants and get out of the way, and
that ideally this should include options that don't just make the shadow
stack writable since that's a substantial step down in protection.

That said your option 2 is already supported with the existing clone3()
on both arm64 and x86_64, policy for switching between that and kernel
managed stacks could be set by restricting the writable stacks flag on
the enable prctl(), and/or restricting map_shadow_stack().

> This RFC seems to be going down the path of addressing one edge case at a time.
> Alone it's fine, but I'd rather punt these types of usages to (2) by default. 

For me this is in the category of "oh, of course you should be able to
do that" where it feels like an obvious usability thing than an edge
case.



* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-25 23:22   ` Mark Brown
@ 2025-09-25 23:58     ` Edgecombe, Rick P
  2025-09-26  0:44       ` Mark Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Edgecombe, Rick P @ 2025-09-25 23:58 UTC (permalink / raw)
  To: broonie@kernel.org
  Cc: adhemerval.zanella@linaro.org, nsz@port70.net, brauner@kernel.org,
	shuah@kernel.org, debug@rivosinc.com, fweimer@redhat.com,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	dalias@libc.org, jeffxu@google.com, will@kernel.org,
	yury.khrustalev@arm.com, wilco.dijkstra@arm.com,
	linux-arm-kernel@lists.infradead.org, codonell@redhat.com,
	libc-alpha@sourceware.org, linux-kselftest@vger.kernel.org

On Fri, 2025-09-26 at 00:22 +0100, Mark Brown wrote:
> On Thu, Sep 25, 2025 at 08:40:56PM +0000, Edgecombe, Rick P wrote:
> 
> > Security-wise, it seems reasonable that if you are leaving a shadow stack, that
> > you could leave a token behind. But for the userspace scheme to back up the SSP
> > by doing a longjmp() or similar I have some doubts. IIRC there were some cross
> > stack edge cases that we never figured out how to handle.
> 
> I think those were around the use of alt stacks, which we don't
> currently do for shadow stacks at all because we couldn't figure out
> those edge cases.  Possibly there's others as well, though - the alt
> stacks issues dominated discussion a bit.

For longjmp, IIRC there were some plans to search for a token on the target
stack and use it, which seems somewhat at odds with the quick efficient jump
that longjmp() gets usually used for. But it also doesn't solve the problem of
taking a signal while you are unwinding.

Like say you do calls all the way to the end of a shadow stack, and it's about
to overflow. Then the thread swaps to another shadow stack. If you longjmp back
to the original stack you will have to transition to the end of the first stack
as you unwind. If at that point the thread gets a signal, it would overflow the
shadow stack. This is a subtle difference in behavior compared to non-shadow
stack. You also need to know that nothing else could consume that token on that
stack in the meantime. So doing it safely is not nearly as simple as normal
longjmp().

Anyway, I think you don't need alt shadow stack to hit that. Just normal
userspace threading?

> 
> AFAICT those issues exist anyway, if userspace is already unwinding as
> part of thread exit then they'll exercise that code though perhaps be
> saved from any issues by virtue of not actually doing any function
> calls.  Anything that actually does a longjmp() with the intent to
> continue will do so more thoroughly.
> 
> > As far as re-using allocated shadow stacks, there is always the option to enable
> > WRSS (or similar) to write the shadow stack as well as longjmp at will.
> 
> That's obviously a substantial downgrade in security though.

I don't know about substantial, but I'd love to hear some offensive security
person's analysis. There definitely was a school of thought though, that shadow
stack should be turned on as widely as possible. If we need WRSS to make that
happen in a sane way, you could argue there is sort of a security at scale
benefit.

> 
> > I think we should see a fuller solution from the glibc side before adding new
> > kernel features like this. (apologies if I missed it). I wonder if we are
> 
> I agree that we want to see some userspace code here, I'm hoping this
> can be used for prototyping.  Yury has some code for the clone3() part
> of things in glibc on arm64 already, hopefully that can be extended to
> include the shadow stack in the thread stack cache.
> 
> > building something that will have an extremely complicated set of rules for what
> > types of stack operations should be expected to work.
> 
> I think restricted more than complex?
> 
> > Sort of related, I think we might think about msealing shadow stacks, which will
> > have trouble with a lot of these user managed shadow stack schemes. The reason
> > is that as long as shadow stacks can be unmapped while a thread is on them (say
> > a sleeping thread), a new shadow stack can be allocated in the same place with a
> > token. Then a second thread can consume the token and possibly corrupt the
> > shadow stack for the other thread with it's own calls. I don't know how
> > realistic it is in practice, but it's something that guard gaps can't totally
> > prevent.
> 
> > But for automatic thread created shadow stacks, there is no need to allow
> > userspace to unmap a shadow stack, so the automatically created stacks could
> > simply be msealed on creation and unmapped from the kernel. For a lot of apps
> > (most?) this would work perfectly fine.
> 
> Indeed, we should be able to just do that if we're mseal()ing system
> mappings I think - most likely anything that has a problem with it
> probably already has a problem the existing mseal() stuff.  Yet another
> reason we should be factoring more of this code out into the generic
> code, like I say I'll try to look at that.

Agree. But for the mseal stuff, I think you would want to have map_shadow_stack
not available.

> 
> I do wonder if anyone would bother with those attacks if they've got
> enough control over the process to do them, but equally a lot of this is
> about how things chain together.

Yea, I don't know. But the guard gaps were added after a suggestion from Jann
Horn. This is sort of a similar concern of sharing a shadow stack, but the
difference is stack overflow vs controlling args to syscalls.

> 
> > I think we don't want 100 modes of shadow stack. If we have two, I'd think:
> > 1. Msealed, simple more locked down kernel allocated shadow stack. Limited or
> > none user space managed shadow stacks.
> > 2. WRSS enabled, clone3-preferred max compatibility shadow stack. Longjmp via
> > token writes and don't even have to think about taking signals while unwinding
> > across stacks, or whatever other edge case.
> 
> I think the important thing from a kernel ABI point of view is to give
> userspace the tools to do whatever it wants and get out of the way, and
> that ideally this should include options that don't just make the shadow
> stack writable since that's a substantial step down in protection.

Yes, I hear that. But we should also try to avoid creating maintenance issues
by adding features that don't turn out to be useful. It sounds like we agree
that we need more proof that this will work out in the long run.

> 
> That said your option 2 is already supported with the existing clone3()
> on both arm64 and x86_64, policy for switching between that and kernel
> managed stacks could be set by restricting the writable stacks flag on
> the enable prctl(), and/or restricting map_shadow_stack().

You mean userspace could already re-use shadow stacks if they enable writable
shadow stacks? Yes I agree.

> 
> > This RFC seems to be going down the path of addressing one edge case at a time.
> > Alone it's fine, but I'd rather punt these types of usages to (2) by default. 
> 
> For me this is in the category of "oh, of course you should be able to
> do that" where it feels like an obvious usability thing than an edge
> case.

True. I guess I was thinking more about the stack unwinding. Badly phrased,
sorry.



* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-25 23:58     ` Edgecombe, Rick P
@ 2025-09-26  0:44       ` Mark Brown
  2025-09-26 15:46         ` Edgecombe, Rick P
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Brown @ 2025-09-26  0:44 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: adhemerval.zanella@linaro.org, nsz@port70.net, brauner@kernel.org,
	shuah@kernel.org, debug@rivosinc.com, fweimer@redhat.com,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	dalias@libc.org, jeffxu@google.com, will@kernel.org,
	yury.khrustalev@arm.com, wilco.dijkstra@arm.com,
	linux-arm-kernel@lists.infradead.org, codonell@redhat.com,
	libc-alpha@sourceware.org, linux-kselftest@vger.kernel.org


On Thu, Sep 25, 2025 at 11:58:01PM +0000, Edgecombe, Rick P wrote:
> On Fri, 2025-09-26 at 00:22 +0100, Mark Brown wrote:

> > I think those were around the use of alt stacks, which we don't
> > currently do for shadow stacks at all because we couldn't figure out
> > those edge cases.  Possibly there's others as well, though - the alt
> > stacks issues dominated discussion a bit.

> For longjmp, IIRC there were some plans to search for a token on the target
> stack and use it, which seems somewhat at odds with the quick efficient jump
> that longjmp() gets usually used for. But it also doesn't solve the problem of
> taking a signal while you are unwinding.

Yeah, that all seemed very unclear to me.

> Like say you do calls all the way to the end of a shadow stack, and it's about
> to overflow. Then the thread swaps to another shadow stack. If you longjmp back
> to the original stack you will have to transition to the end of the first stack
> as you unwind. If at that point the thread gets a signal, it would overflow the
> shadow stack. This is a subtle difference in behavior compared to non-shadow
> stack. You also need to know that nothing else could consume that token on that
> stack in the meantime. So doing it safely is not nearly as simple as normal
> longjmp().

> Anyway, I think you don't need alt shadow stack to hit that. Just normal
> userspace threading?

Ah, the backtrack through a pivot case - yes, I don't think anyone had a
good story for how that was going to work sensibly without making the
stack writable.  Sorry, I'd written off that case entirely so it didn't
cross my mind.

> > > As far as re-using allocated shadow stacks, there is always the option to enable
> > > WRSS (or similar) to write the shadow stack as well as longjmp at will.

> > That's obviously a substantial downgrade in security though.

> I don't know about substantial, but I'd love to hear some offensive security
> persons analysis. There definitely was a school of thought though, that shadow
> stack should be turned on as widely as possible. If we need WRSS to make that
> happen in a sane way, you could argue there is sort of a security at scale
> benefit.

I agree it seems clearly better from a security point of view to have
writable shadow stacks than none at all, I don't think there's much
argument there other than the concerns about the memory consumption and
performance tradeoffs.

> > > But for automatic thread created shadow stacks, there is no need to allow
> > > userspace to unmap a shadow stack, so the automatically created stacks could
> > > simply be msealed on creation and unmapped from the kernel. For a lot of apps
> > > (most?) this would work perfectly fine.

> > Indeed, we should be able to just do that if we're mseal()ing system
> > mappings I think - most likely anything that has a problem with it
> > probably already has a problem the existing mseal() stuff.  Yet another
> > reason we should be factoring more of this code out into the generic
> > code, like I say I'll try to look at that.

> Agree. But for the mseal stuff, I think you would want to have map_shadow_stack
> not available.

That seems like something userspace could enforce with existing security
mechanisms?  I can imagine a system might want different policies for
different programs.

> > I think the important thing from a kernel ABI point of view is to give
> > userspace the tools to do whatever it wants and get out of the way, and
> > that ideally this should include options that don't just make the shadow
> > stack writable since that's a substantial step down in protection.

> Yes I hear that. But also try to avoid creating maintenance issues by adding
> features that didn't turn out to be useful. It sounds like we agree that we need
> more proof that this will work out in the long run.

Yes, we need at least some buy in from userspace.

> > That said your option 2 is already supported with the existing clone3()
> > on both arm64 and x86_64, policy for switching between that and kernel
> > managed stacks could be set by restricting the writable stacks flag on
> > the enable prctl(), and/or restricting map_shadow_stack().

> You mean userspace could already re-use shadow stacks if they enable writable
> shadow stacks? Yes I agree.

Yes, exactly.

> > > This RFC seems to be going down the path of addressing one edge case at a time.
> > > Alone it's fine, but I'd rather punt these types of usages to (2) by default. 

> > For me this is in the category of "oh, of course you should be able to
> > do that" where it feels like an obvious usability thing than an edge
> > case.

> True. I guess I was thinking more about the stack unwinding. Badly phrased,
> sorry.

I'm completely with you on the stack unwinding over pivot stuff; I wasn't
imagining we could address that.  There's so many landmines there that
we'd need a super solid story before doing anything specific to that
case.



* Re: [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads
  2025-09-25 19:00         ` Mark Brown
@ 2025-09-26 11:14           ` Catalin Marinas
  2025-09-26 11:37             ` Mark Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2025-09-26 11:14 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Christian Brauner, Adhemerval Zanella Netto,
	Shuah Khan, Rick Edgecombe, Deepak Gupta, Wilco Dijkstra,
	Carlos O'Donell, Florian Weimer, Szabolcs Nagy, Rich Felker,
	libc-alpha, linux-arm-kernel, linux-kernel, linux-kselftest

On Thu, Sep 25, 2025 at 08:00:40PM +0100, Mark Brown wrote:
> On Thu, Sep 25, 2025 at 07:36:50PM +0100, Catalin Marinas wrote:
> > On Thu, Sep 25, 2025 at 06:01:07PM +0100, Mark Brown wrote:
> > > On Thu, Sep 25, 2025 at 05:46:46PM +0100, Catalin Marinas wrote:
> > > > On Sun, Sep 21, 2025 at 02:21:35PM +0100, Mark Brown wrote:
> 
> > > We can't have scheduled?  That's actually a pleasant surprise, that was
> > > the main hole I was thinking of in the cover letter.
> 
> > Well, double-check. AFAICT, gcs_free() is only called on the exit_mm()
> > path when a thread dies.
> 
> > I think gcs_free() may have been called in other contexts before the
> > cleanups you had in 6.16 (there were two more call sites for
> > gcs_free()). If that's the case, we could turn these checks into
> > WARN_ON_ONCE().
> 
> Yeah, just I need to convince myself that we're always running the
> exit_mm() path in the context of the exiting thread.  Like you say it
> needs checking but hopefully you're right and the current code is more
> correct than I had thought.

The only path to gcs_free() is via mm_release() -> deactivate_mm().
mm_release() is called from either exit_mm_release() or
exec_mm_release(). These two functions are only called with current and
current->mm.

I guess for historical reasons, they take task and mm parameters but in
recent mainline, they don't seem to get anything other than current.

-- 
Catalin



* Re: [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads
  2025-09-26 11:14           ` Catalin Marinas
@ 2025-09-26 11:37             ` Mark Brown
  0 siblings, 0 replies; 22+ messages in thread
From: Mark Brown @ 2025-09-26 11:37 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Christian Brauner, Adhemerval Zanella Netto,
	Shuah Khan, Rick Edgecombe, Deepak Gupta, Wilco Dijkstra,
	Carlos O'Donell, Florian Weimer, Szabolcs Nagy, Rich Felker,
	libc-alpha, linux-arm-kernel, linux-kernel, linux-kselftest


On Fri, Sep 26, 2025 at 12:14:21PM +0100, Catalin Marinas wrote:
> On Thu, Sep 25, 2025 at 08:00:40PM +0100, Mark Brown wrote:

> > Yeah, just I need to convince myself that we're always running the
> > exit_mm() path in the context of the exiting thread.  Like you say it
> > needs checking but hopefully you're right and the current code is more
> > correct than I had thought.

> The only path to gcs_free() is via mm_release() -> deactivate_mm().
> mm_release() is called from either exit_mm_release() or
> exec_mm_release(). These two functions are only called with current and
> current->mm.

> I guess for historical reasons, they take task and mm parameters but in
> recent mainline, they don't seem to get anything other than current.

Thanks for checking for me.  I guess some refactoring might be in
order to make all this clear.



* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-25 20:40 ` [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks Edgecombe, Rick P
  2025-09-25 23:22   ` Mark Brown
@ 2025-09-26 15:07   ` Yury Khrustalev
  2025-09-26 15:39     ` Edgecombe, Rick P
  1 sibling, 1 reply; 22+ messages in thread
From: Yury Khrustalev @ 2025-09-26 15:07 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: shuah@kernel.org, broonie@kernel.org, brauner@kernel.org,
	will@kernel.org, catalin.marinas@arm.com,
	adhemerval.zanella@linaro.org, nsz@port70.net, dalias@libc.org,
	debug@rivosinc.com, fweimer@redhat.com, libc-alpha@sourceware.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, wilco.dijkstra@arm.com,
	jeffxu@google.com, codonell@redhat.com,
	linux-kselftest@vger.kernel.org

Hi,

On Thu, Sep 25, 2025 at 08:40:56PM +0000, Edgecombe, Rick P wrote:
> On Sun, 2025-09-21 at 14:21 +0100, Mark Brown wrote:
> > During the discussion of the clone3() support for shadow stacks, concerns
> > were raised from the glibc side because it is not possible to reuse the
> > allocated shadow stack[1]. This means that the benefit of being able to
> > ...
> >
> Security-wise, it seems reasonable that if you are leaving a shadow stack, that
> you could leave a token behind. But for the userspace scheme to back up the SSP
> by doing a longjmp() or similar I have some doubts. IIRC there were some cross
> stack edge cases that we never figured out how to handle.
> 
> As far as re-using allocated shadow stacks, there is always the option to enable
> WRSS (or similar) to write the shadow stack as well as longjmp at will.
> 
> I think we should see a fuller solution from the glibc side before adding new
> kernel features like this. (apologies if I missed it).

What do you mean by "a fuller solution from the glibc side"? A solution
for re-using shadow stacks? Right now Glibc cannot do anything about
shadow stacks for new threads because the clone3 interface doesn't allow it.

Thanks,
Yury




* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-26 15:07   ` Yury Khrustalev
@ 2025-09-26 15:39     ` Edgecombe, Rick P
  2025-09-26 16:03       ` Mark Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Edgecombe, Rick P @ 2025-09-26 15:39 UTC (permalink / raw)
  To: yury.khrustalev@arm.com
  Cc: adhemerval.zanella@linaro.org, nsz@port70.net, brauner@kernel.org,
	shuah@kernel.org, linux-kselftest@vger.kernel.org,
	debug@rivosinc.com, fweimer@redhat.com,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	dalias@libc.org, jeffxu@google.com, will@kernel.org,
	wilco.dijkstra@arm.com, linux-arm-kernel@lists.infradead.org,
	codonell@redhat.com, libc-alpha@sourceware.org,
	broonie@kernel.org

On Fri, 2025-09-26 at 16:07 +0100, Yury Khrustalev wrote:
> > I think we should see a fuller solution from the glibc side before
> > adding new
> > kernel features like this. (apologies if I missed it).
> 
> What do you mean by "a fuller solution from the glibc side"? A
> solution
> for re-using shadow stacks? 

I mean some code or a fuller explained solution that uses this new
kernel functionality. I think the scheme that Florian suggested in the
thread linked above (longjmp() to the start of the stack) will have
trouble if the thread pivots to a new shadow stack before exiting (e.g.
ucontext).

> Right now Glibc cannot do anything about
> shadow stacks for new threads because the clone3 interface doesn't
> allow it.

If you enable WRSS (or the arm equivalent) you can re-use shadow stacks
today by writing a token.
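
For illustration, a rough sketch of what "writing a token" could look like on
arm64, where the WRSS equivalent is GCSSTR.  This assumes the caller has
already enabled the write feature (PR_SHADOW_STACK_WRITE) via
PR_SET_SHADOW_STACK_STATUS, that the assembler understands GCSSTR, and that
the cap token encoding matches my reading of the kernel selftest headers
(address of the slot with the low bit set) - all of those are assumptions on
my part rather than anything stated in this thread:

#include <stddef.h>
#include <stdint.h>

/* Assumed cap token encoding, check against the real uapi/selftest headers. */
#define GCS_CAP_ADDR_MASK	(~0xfffUL)
#define GCS_CAP_VALID_TOKEN	1UL
#define GCS_CAP(x)	(((uint64_t)(x) & GCS_CAP_ADDR_MASK) | GCS_CAP_VALID_TOKEN)

/*
 * Re-arm an exited thread's shadow stack so it can be handed to a new
 * thread: write a cap token into the top slot of the mapping.  'gcs' and
 * 'size' describe a mapping obtained from map_shadow_stack().
 */
static void rearm_shadow_stack(void *gcs, size_t size)
{
	uint64_t *cap = (uint64_t *)((char *)gcs + size) - 1;
	uint64_t token = GCS_CAP(cap);

	/* Normal stores fault on GCS memory; GCSSTR is the permitted store. */
	asm volatile("gcsstr %0, [%1]" : : "r"(token), "r"(cap) : "memory");
}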

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-26  0:44       ` Mark Brown
@ 2025-09-26 15:46         ` Edgecombe, Rick P
  2025-09-26 16:09           ` Mark Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Edgecombe, Rick P @ 2025-09-26 15:46 UTC (permalink / raw)
  To: broonie@kernel.org
  Cc: adhemerval.zanella@linaro.org, nsz@port70.net, brauner@kernel.org,
	shuah@kernel.org, debug@rivosinc.com, fweimer@redhat.com,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	dalias@libc.org, jeffxu@google.com, will@kernel.org,
	yury.khrustalev@arm.com, wilco.dijkstra@arm.com,
	linux-arm-kernel@lists.infradead.org, codonell@redhat.com,
	libc-alpha@sourceware.org, linux-kselftest@vger.kernel.org

On Fri, 2025-09-26 at 01:44 +0100, Mark Brown wrote:
> > I don't know about substantial, but I'd love to hear some offensive
> > security person's analysis. There definitely was a school of thought
> > though, that shadow stack should be turned on as widely as
> > possible. If we need WRSS to make that happen in a sane way, you
> > could argue there is sort of a security at scale
> > benefit.
> 
> I agree it seems clearly better from a security point of view to have
> writable shadow stacks than none at all, I don't think there's much
> argument there other than the concerns about the memory consumption
> and performance tradeoffs.

IIRC the WRSS equivalent works the same for ARM where you need to use a
special instruction, right? So we are not talking about full writable
shadow stacks that could get attacked from any overflow, rather,
limited spots that have the WRSS (or similar) instruction. In the
presence of forward edge CFI, we might be able to worry less about
attackers being able to actually reach it? Still not quite as locked
down as having it disabled, but maybe not such a huge gap compared to
the mmap/munmap() stuff that is the alternative we are weighing.

> 
> > > > But for automatic thread created shadow stacks, there is no
> > > > need to allow userspace to unmap a shadow stack, so the
> > > > automatically created stacks could simply be msealed on
> > > > creation and unmapped from the kernel. For a lot of apps
> > > > (most?) this would work perfectly fine.
> 
> > > Indeed, we should be able to just do that if we're mseal()ing
> > > system mappings I think - most likely anything that has a problem
> > > with it probably already has a problem with the existing mseal()
> > > stuff.  Yet another reason we should be factoring more of this
> > > code out into the generic code, like I say I'll try to look at
> > > that.
> 
> > Agree. But for the mseal stuff, I think you would want to have
> > map_shadow_stack not available.
> 
> That seems like something userspace could enforce with existing
> security mechanisms?  I can imagine a system might want different
> policies for different programs.

Yes, you could already do it with seccomp or something. Not sure if it
will work for the distro-wide enabling schemes or not though.
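
As a sketch of the seccomp option: something along these lines would make
map_shadow_stack() fail for a given program, so only the automatically
created, kernel managed stacks are available.  Purely illustrative - a real
filter should also check seccomp_data.arch, the syscall number define is an
assumption to be checked against your headers, and a distro-wide scheme would
probably drive this through something like systemd rather than open coding it:

#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

#ifndef __NR_map_shadow_stack
#define __NR_map_shadow_stack 453	/* assumption, check your headers */
#endif

/* Deny map_shadow_stack() for this process, allow everything else. */
static int deny_map_shadow_stack(void)
{
	struct sock_filter filter[] = {
		/* Load the syscall number. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* map_shadow_stack -> EPERM, anything else -> allow. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_map_shadow_stack, 0, 1),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | EPERM),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = sizeof(filter) / sizeof(filter[0]),
		.filter = filter,
	};

	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return -1;
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}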

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-26 15:39     ` Edgecombe, Rick P
@ 2025-09-26 16:03       ` Mark Brown
  2025-09-26 19:17         ` Edgecombe, Rick P
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Brown @ 2025-09-26 16:03 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: yury.khrustalev@arm.com, adhemerval.zanella@linaro.org,
	nsz@port70.net, brauner@kernel.org, shuah@kernel.org,
	linux-kselftest@vger.kernel.org, debug@rivosinc.com,
	fweimer@redhat.com, linux-kernel@vger.kernel.org,
	catalin.marinas@arm.com, dalias@libc.org, jeffxu@google.com,
	will@kernel.org, wilco.dijkstra@arm.com,
	linux-arm-kernel@lists.infradead.org, codonell@redhat.com,
	libc-alpha@sourceware.org

[-- Attachment #1: Type: text/plain, Size: 608 bytes --]

On Fri, Sep 26, 2025 at 03:39:46PM +0000, Edgecombe, Rick P wrote:
> On Fri, 2025-09-26 at 16:07 +0100, Yury Khrustalev wrote:

> > What do you mean by "a fuller solution from the glibc side"? A
> > solution
> > for re-using shadow stacks? 

> I mean some code or a fuller explained solution that uses this new
> kernel functionality. I think the scheme that Florian suggested in the
> thread linked above (longjmp() to the start of the stack) will have
> trouble if the thread pivots to a new shadow stack before exiting (e.g.
> ucontext).

Is that supported even without user managed stacks?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-26 15:46         ` Edgecombe, Rick P
@ 2025-09-26 16:09           ` Mark Brown
  2025-09-29 18:37             ` Deepak Gupta
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Brown @ 2025-09-26 16:09 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: adhemerval.zanella@linaro.org, nsz@port70.net, brauner@kernel.org,
	shuah@kernel.org, debug@rivosinc.com, fweimer@redhat.com,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	dalias@libc.org, jeffxu@google.com, will@kernel.org,
	yury.khrustalev@arm.com, wilco.dijkstra@arm.com,
	linux-arm-kernel@lists.infradead.org, codonell@redhat.com,
	libc-alpha@sourceware.org, linux-kselftest@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 1036 bytes --]

On Fri, Sep 26, 2025 at 03:46:26PM +0000, Edgecombe, Rick P wrote:
> On Fri, 2025-09-26 at 01:44 +0100, Mark Brown wrote:

> > I agree it seems clearly better from a security point of view to have
> > writable shadow stacks than none at all, I don't think there's much
> > argument there other than the concerns about the memory consumption
> > and performance tradeoffs.

> IIRC the WRSS equivalent works the same for ARM where you need to use a
> special instruction, right? So we are not talking about full writable

Yes, it's GCSSTR for arm64.

> shadow stacks that could get attacked from any overflow, rather,
> limited spots that have the WRSS (or similar) instruction. In the
> presence of forward edge CFI, we might be able to worry less about
> attackers being able to actually reach it? Still not quite as locked
> down as having it disabled, but maybe not such a huge gap compared to
> the mmap/munmap() stuff that is the alternative we are weighing.

Agreed, as I said it's a definite win still - just not quite as strong.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-26 16:03       ` Mark Brown
@ 2025-09-26 19:17         ` Edgecombe, Rick P
  2025-09-29 15:47           ` Mark Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Edgecombe, Rick P @ 2025-09-26 19:17 UTC (permalink / raw)
  To: broonie@kernel.org
  Cc: adhemerval.zanella@linaro.org, nsz@port70.net, brauner@kernel.org,
	shuah@kernel.org, debug@rivosinc.com, fweimer@redhat.com,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	dalias@libc.org, jeffxu@google.com, will@kernel.org,
	yury.khrustalev@arm.com, wilco.dijkstra@arm.com,
	linux-arm-kernel@lists.infradead.org, codonell@redhat.com,
	linux-kselftest@vger.kernel.org, libc-alpha@sourceware.org

On Fri, 2025-09-26 at 17:03 +0100, Mark Brown wrote:
> On Fri, Sep 26, 2025 at 03:39:46PM +0000, Edgecombe, Rick P wrote:
> > On Fri, 2025-09-26 at 16:07 +0100, Yury Khrustalev wrote:
> 
> > > What do you mean by "a fuller solution from the glibc side"? A
> > > solution
> > > for re-using shadow stacks? 
> 
> > I mean some code or a fuller explained solution that uses this new
> > kernel functionality. I think the scheme that Florian suggested in
> > the
> > thread linked above (longjmp() to the start of the stack) will have
> > trouble if the thread pivots to a new shadow stack before exiting
> > (e.g.
> > ucontext).
> 
> Is that supported even without user managed stacks?

IIUC longjmp() is sometimes used to implement user level thread
switching. So for non-shadow stack my understanding is that longjmp()
between user level threads is supported.

For shadow stack, I think user level threads are intended to be
supported. So I don't see why a thread could never exit from another
shadow stack? Is that what you are asking? Maybe we need to discuss
this with the glibc folks.

What I read in that thread you linked was that Florian was considering
using longjmp() as a way to inherit the solution to these unwinding
problems:
   Not sure if it helps, but glibc always calls the exit system call from
   the start routine, after unwinding the stack (with longjmp if
   necessary).
https://marc.info/?l=glibc-alpha&m=175733266913483&w=2

Reading into the "...if necessary" a bit on my part. But if longjmp()
doesn't work to unwind from a non-thread shadow stack, then my question
would be how glibc will deal with an app that exits while on another
shadow stack? Just declare that shadow stack re-use is not supported with
any user shadow stack pivoting? Does it need a new ELF header bit then?

Again, I'm just thinking that we should vet the solution a bit more
before actually adding this to the kernel. If the combined shadow stack
effort eventually throws its hands up in frustration and goes with
WRSS/GCSSTR for apps that want to do more advanced threading patterns,
then we are already done on the kernel side.
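
To make the scheme under discussion concrete, a minimal sketch of the flow as
I understand Florian's description: the start routine records a jump buffer,
the exit path longjmp()s back to it so the shadow stack pointer is back at
its starting value, and only then does the thread exit (which is where this
series would have the kernel drop a token).  This is purely illustrative and
not glibc's actual code; setjmp()/longjmp() here stand in for whatever unwind
glibc really does:

#include <setjmp.h>
#include <unistd.h>
#include <sys/syscall.h>

static __thread jmp_buf unwind_to_start;

/* Called, possibly deep in the call stack, when the thread wants to exit. */
static void thread_exit(void)
{
	/* Unwinds the normal stack and (via GCSPOPM or similar in libc's
	 * longjmp) the shadow stack. */
	longjmp(unwind_to_start, 1);
}

/* The start routine every thread runs. */
static void thread_start(void (*fn)(void *), void *arg)
{
	if (!setjmp(unwind_to_start))
		fn(arg);		/* first pass: run the user's function */

	/*
	 * GCSPR is back where it was on entry, so with this series the
	 * kernel could write an exit token at a predictable location for
	 * whoever reuses the stack.
	 */
	syscall(SYS_exit, 0);
}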

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-26 19:17         ` Edgecombe, Rick P
@ 2025-09-29 15:47           ` Mark Brown
  0 siblings, 0 replies; 22+ messages in thread
From: Mark Brown @ 2025-09-29 15:47 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: adhemerval.zanella@linaro.org, nsz@port70.net, brauner@kernel.org,
	shuah@kernel.org, debug@rivosinc.com, fweimer@redhat.com,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	dalias@libc.org, jeffxu@google.com, will@kernel.org,
	yury.khrustalev@arm.com, wilco.dijkstra@arm.com,
	linux-arm-kernel@lists.infradead.org, codonell@redhat.com,
	linux-kselftest@vger.kernel.org, libc-alpha@sourceware.org

[-- Attachment #1: Type: text/plain, Size: 2094 bytes --]

On Fri, Sep 26, 2025 at 07:17:16PM +0000, Edgecombe, Rick P wrote:
> On Fri, 2025-09-26 at 17:03 +0100, Mark Brown wrote:
> > On Fri, Sep 26, 2025 at 03:39:46PM +0000, Edgecombe, Rick P wrote:

> > > > What do you mean by "a fuller solution from the glibc side"? A
> > > > solution
> > > > for re-using shadow stacks? 

> > > I mean some code or a fuller explained solution that uses this new
> > > kernel functionality. I think the scheme that Florian suggested in
> > > the
> > > thread linked above (longjmp() to the start of the stack) will have
> > > trouble if the thread pivots to a new shadow stack before exiting
> > > (e.g.
> > > ucontext).

> > Is that supported even without user managed stacks?

> IIUC longjmp() is sometimes used to implement user level thread
> switching. So for non-shadow stack my understanding is that longjmp()
> between user level threads is supported.

Yes, that was more a "does it actually work?" query than a "might
someone reasonably want to do this?".

> For shadow stack, I think user level threads are intended to be
> supported. So I don't see why a thread could never exit from another
> shadow stack? Is that what you are asking? Maybe we need to discuss
> this with the glibc folks.

There's a selection of them directly on this thread, and libc-alpha on
CC, so hopefully they'll chime in.  Unless I'm getting confused the code
does appear to be doing an unwind; there are a lot of layers there so I
might've got turned around though.

> Again, I'm just thinking that we should vet the solution a bit more
> before actually adding this to the kernel. If the combined shadow stack
> effort eventually throws its hands up in frustration and goes with
> WRSS/GCSSTR for apps that want to do more advanced threading patterns,
> then we are already done on the kernel side.

Some more input from the glibc side would indeed be useful; I've not
seen any thoughts on either this series or just turning on writability,
both of which are kind of orthogonal to clone3() (which has been dropped
from -next today).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks
  2025-09-26 16:09           ` Mark Brown
@ 2025-09-29 18:37             ` Deepak Gupta
  0 siblings, 0 replies; 22+ messages in thread
From: Deepak Gupta @ 2025-09-29 18:37 UTC (permalink / raw)
  To: Mark Brown
  Cc: Edgecombe, Rick P, adhemerval.zanella@linaro.org, nsz@port70.net,
	brauner@kernel.org, shuah@kernel.org, fweimer@redhat.com,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	dalias@libc.org, jeffxu@google.com, will@kernel.org,
	yury.khrustalev@arm.com, wilco.dijkstra@arm.com,
	linux-arm-kernel@lists.infradead.org, codonell@redhat.com,
	libc-alpha@sourceware.org, linux-kselftest@vger.kernel.org

On Fri, Sep 26, 2025 at 05:09:08PM +0100, Mark Brown wrote:
>On Fri, Sep 26, 2025 at 03:46:26PM +0000, Edgecombe, Rick P wrote:
>> On Fri, 2025-09-26 at 01:44 +0100, Mark Brown wrote:
>
>> > I agree it seems clearly better from a security point of view to have
>> > writable shadow stacks than none at all, I don't think there's much
>> > argument there other than the concerns about the memory consumption
>> > and performance tradeoffs.
>
>> IIRC the WRSS equivalent works the same for ARM where you need to use a
>> special instruction, right? So we are not talking about full writable
>
>Yes, it's GCSSTR for arm64.

sspush / ssamoswap on RISC-V provide write mechanisms for the shadow stack.


>
>> shadow stacks that could get attacked from any overflow, rather,
>> limited spots that have the WRSS (or similar) instruction. In the
>> presence of forward edge CFI, we might be able to worry less about
>> attackers being able to actually reach it? Still not quite as locked
>> down as having it disabled, but maybe not such a huge gap compared to
>> the mmap/munmap() stuff that is the alternative we are weighing.
>
>Agreed, as I said it's a definite win still - just not quite as strong.

If I have to put on a philosopher's hat: in order to get wider deployment and
adoption, it's better to have a better security posture for the majority of
users rather than an ultra secure system which is difficult to use.

This just means that the mechanism(s) for write-to-shadow-stack flows in user
space have to be done carefully.

- Sparse and not part of compiler codegen. They should mostly be hand coded
   and reviewed.

- The reachability of such gadgets and their use by an adversary should be
   threat modeled.


If forward CFI is enabled, I don't expect the write-to-shadow-stack gadget
itself to be reachable without disabling forward CFI or pivoting/corrupting
the shadow stack. The only other way to achieve something like that would be
to re-use an entire function (where the shadow stack write is present) to
achieve the desired effect. I think we should be focussing more on those.
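
One way of bounding that reachability on arm64 would be for a process that
never needs (or has finished with) such hand coded writes to lock the write
feature off.  A sketch along those lines is below; the constants are from my
memory of the prctl() interface and should be treated as assumptions to be
checked against the uapi headers:

#include <sys/prctl.h>

#ifndef PR_SET_SHADOW_STACK_STATUS
#define PR_SET_SHADOW_STACK_STATUS	75
#define PR_LOCK_SHADOW_STACK_STATUS	76
#define PR_SHADOW_STACK_ENABLE		(1UL << 0)
#define PR_SHADOW_STACK_WRITE		(1UL << 1)
#endif

/*
 * Run with GCS on but GCSSTR off, then lock the write bit so it can't
 * be re-enabled for the rest of the process's lifetime.
 */
static int lock_out_shadow_stack_writes(void)
{
	if (prctl(PR_SET_SHADOW_STACK_STATUS, PR_SHADOW_STACK_ENABLE, 0, 0, 0))
		return -1;
	return prctl(PR_LOCK_SHADOW_STACK_STATUS, PR_SHADOW_STACK_WRITE, 0, 0, 0);
}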

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2025-09-29 18:38 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-21 13:21 [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks Mark Brown
2025-09-21 13:21 ` [PATCH RFC 1/3] arm64/gcs: Support reuse of GCS for exited threads Mark Brown
2025-09-25 16:46   ` Catalin Marinas
2025-09-25 17:01     ` Mark Brown
2025-09-25 18:36       ` Catalin Marinas
2025-09-25 19:00         ` Mark Brown
2025-09-26 11:14           ` Catalin Marinas
2025-09-26 11:37             ` Mark Brown
2025-09-21 13:21 ` [PATCH RFC 2/3] kselftest/arm64: Validate PR_SHADOW_STACK_EXIT_TOKEN in basic-gcs Mark Brown
2025-09-21 13:21 ` [PATCH RFC 3/3] kselftest/arm64: Add PR_SHADOW_STACK_EXIT_TOKEN to gcs-locking Mark Brown
2025-09-25 20:40 ` [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow stacks Edgecombe, Rick P
2025-09-25 23:22   ` Mark Brown
2025-09-25 23:58     ` Edgecombe, Rick P
2025-09-26  0:44       ` Mark Brown
2025-09-26 15:46         ` Edgecombe, Rick P
2025-09-26 16:09           ` Mark Brown
2025-09-29 18:37             ` Deepak Gupta
2025-09-26 15:07   ` Yury Khrustalev
2025-09-26 15:39     ` Edgecombe, Rick P
2025-09-26 16:03       ` Mark Brown
2025-09-26 19:17         ` Edgecombe, Rick P
2025-09-29 15:47           ` Mark Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).