* [PATCH 0/3] Fix shadow stack issues in clone error handling
@ 2023-09-08 20:36 Rick Edgecombe
2023-09-08 20:36 ` [PATCH 1/3] x86/shstk: Handle vfork clone failure correctly Rick Edgecombe
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Rick Edgecombe @ 2023-09-08 20:36 UTC (permalink / raw)
To: x86, Thomas Gleixner, Ingo Molnar, Andy Lutomirski,
Borislav Petkov, Dave Hansen, H . Peter Anvin, Peter Zijlstra,
hjl.tools, linux-kernel
Cc: rick.p.edgecombe
HJ reported a crash when running a new glibc test added two weeks ago,
with his (out of tree) rebased glibc shadow stack patches on top. The
issue turned out to be a bug in shadow stack clone error handling. The new
test exercises clone(CLONE_PIDFD|CLONE_VFORK) with the fd limit exhausted,
which causes the clone to fail after the shadow stack handling. A failure
at that point will normally try to clean up the shadow stack that it
allocated before returning the error to userspace. But for vfork, it
should not do this as nothing was allocated and the thread's shadow stack
is the parent's shadow stack too. The logic for not unmapping the thread's
shadow stack for CLONE_VFORK children works in the normal (exec() or
exit()) case, but is buggy in the clone failure case. So on clone failure
with those particular args, the parent will end up with a missing shadow
stack.
When working on the solution I did a review of the surrounding logic and
found another more minor issue.
Patch 1 is the bug fix for the main issue, and 2 is the other issue. Patch
3 is optional and adds a new warning to catch the issue in patch 2.
Rick Edgecombe (3):
x86/shstk: Handle vfork clone failure correctly
x86/shstk: Remove useless clone error handling
x86/shstk: Add warning for shadow stack double unmap
arch/x86/include/asm/mmu_context.h | 3 +--
arch/x86/kernel/process.c | 7 -------
arch/x86/kernel/shstk.c | 33 ++++++++++++++++++++++++++++--
3 files changed, 32 insertions(+), 11 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 1/3] x86/shstk: Handle vfork clone failure correctly
2023-09-08 20:36 [PATCH 0/3] Fix shadow stack issues in clone error handling Rick Edgecombe
@ 2023-09-08 20:36 ` Rick Edgecombe
2023-09-08 20:36 ` [PATCH 2/3] x86/shstk: Remove useless clone error handling Rick Edgecombe
2023-09-08 20:36 ` [PATCH 3/3] x86/shstk: Add warning for shadow stack double unmap Rick Edgecombe
2 siblings, 0 replies; 4+ messages in thread
From: Rick Edgecombe @ 2023-09-08 20:36 UTC (permalink / raw)
To: x86, Thomas Gleixner, Ingo Molnar, Andy Lutomirski,
Borislav Petkov, Dave Hansen, H . Peter Anvin, Peter Zijlstra,
hjl.tools, linux-kernel
Cc: rick.p.edgecombe
Shadow stacks are allocated automatically and freed on exit, depending
on the clone flags. The two cases where new shadow stacks are not
allocated are !CLONE_VM (fork()) and CLONE_VFORK (vfork()). For
!CLONE_VM, although a new stack is not allocated, it can be freed normally
because it will happen in the child's copy of the VM.
However, for CLONE_VFORK the parent and the child are actually using the
same shadow stack. So the kernel doesn't need to allocate *or* free a
shadow stack for a CLONE_VFORK child. CLONE_VFORK children already need
special tracking to avoid returning to userspace until the child exits or
execs. Shadow stack uses this same tracking to avoid freeing CLONE_VFORK
shadow stacks.
However, the tracking is not setup until the clone has succeeded
(internally). Which means, if a CLONE_VFORK fails, the existing logic will
not know it is a CLONE_VFORK and proceed to unmap the parents shadow stack.
This error handling cleanup logic runs via exit_thread() in the
bad_fork_cleanup_thread label in copy_process(). The issue was seen in
the glibc test "posix/tst-spawn3-pidfd" while running with shadow stack
using currently out-of-tree glibc patches.
Fix it by not unmapping the vfork shadow stack in the error case as well.
Since clone is implemented in core code, it is not ideal to pass the clone
flags along the error path in order to have shadow stack code have
symmetric logic in the freeing half of the thread shadow stack handling.
Instead use the existing state for thread shadow stacks to track whether
the thread is managing its own shadow stack. For CLONE_VFORK, simply set
shstk->base and shstk->size to 0, and have it mean the thread is not
managing a shadow stack and so should skip cleanup work. Implement this
by breaking up the CLONE_VFORK and !CLONE_VM cases in
shstk_alloc_thread_stack() to separate conditionals since, the logic is
now different between them. In the case of CLONE_VFORK && !CLONE_VM, the
existing behavior is to not clean up the shadow stack in the child (which
should go away quickly with either be exit or exec), so maintain that
behavior by handling the CLONE_VFORK case first in the allocation path.
This new logioc cleanly handles the case of normal, successful
CLONE_VFORK's skipping cleaning up their shadow stack's on exit as well.
So remove the existing, vfork shadow stack freeing logic. This is in
deactivate_mm() where vfork_done is used to tell if it is a vfork child
that can skip cleaning up the thread shadow stack.
Reported-by: H.J. Lu <hjl.tools@gmail.com>
Tested-by: H.J. Lu <hjl.tools@gmail.com>
Fixes: b2926a36b97a ("x86/shstk: Handle thread shadow stack")
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
arch/x86/include/asm/mmu_context.h | 3 +--
arch/x86/kernel/shstk.c | 22 ++++++++++++++++++++--
2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 416901d406f8..8dac45a2c7fc 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -186,8 +186,7 @@ do { \
#else
#define deactivate_mm(tsk, mm) \
do { \
- if (!tsk->vfork_done) \
- shstk_free(tsk); \
+ shstk_free(tsk); \
load_gs_index(0); \
loadsegment(fs, 0); \
} while (0)
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index fd689921a1db..ad63252ebebc 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -205,10 +205,21 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, unsigned long cl
return 0;
/*
- * For CLONE_VM, except vfork, the child needs a separate shadow
+ * For CLONE_VFORK the child will share the parents shadow stack.
+ * Make sure to clear the internal tracking of the thread shadow
+ * stack so the freeing logic run for child knows to leave it alone.
+ */
+ if (clone_flags & CLONE_VFORK) {
+ shstk->base = 0;
+ shstk->size = 0;
+ return 0;
+ }
+
+ /*
+ * For !CLONE_VM the child will use a copy of the parents shadow
* stack.
*/
- if ((clone_flags & (CLONE_VFORK | CLONE_VM)) != CLONE_VM)
+ if (!(clone_flags & CLONE_VM))
return 0;
size = adjust_shstk_size(stack_size);
@@ -408,6 +419,13 @@ void shstk_free(struct task_struct *tsk)
if (!tsk->mm || tsk->mm != current->mm)
return;
+ /*
+ * If shstk->base is NULL, then this task is not managing its
+ * own shadow stack (CLONE_VFORK). So skip freeing it.
+ */
+ if (!shstk->base)
+ return;
+
unmap_shadow_stack(shstk->base, shstk->size);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/3] x86/shstk: Remove useless clone error handling
2023-09-08 20:36 [PATCH 0/3] Fix shadow stack issues in clone error handling Rick Edgecombe
2023-09-08 20:36 ` [PATCH 1/3] x86/shstk: Handle vfork clone failure correctly Rick Edgecombe
@ 2023-09-08 20:36 ` Rick Edgecombe
2023-09-08 20:36 ` [PATCH 3/3] x86/shstk: Add warning for shadow stack double unmap Rick Edgecombe
2 siblings, 0 replies; 4+ messages in thread
From: Rick Edgecombe @ 2023-09-08 20:36 UTC (permalink / raw)
To: x86, Thomas Gleixner, Ingo Molnar, Andy Lutomirski,
Borislav Petkov, Dave Hansen, H . Peter Anvin, Peter Zijlstra,
hjl.tools, linux-kernel
Cc: rick.p.edgecombe
When clone fails after the shadow stack is allocated, any allocated shadow
stack is cleaned up in exit_thread() in copy_process(). So the logic in
copy_thread() is unneeded, and also will not handle failures that happen
outside of copy_thread().
In addition, since there is a second attempt to unmap the same shadow
stack, there is a race where an newly mapped region could get unmapped.
So remove the logic in copy_thread() and rely on exit_thread() to handle
clone failure.
Fixes: b2926a36b97a ("x86/shstk: Handle thread shadow stack")
Tested-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
arch/x86/kernel/process.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 9f0909142a0a..b6f4e8399fca 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -257,13 +257,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
if (!ret && unlikely(test_tsk_thread_flag(current, TIF_IO_BITMAP)))
io_bitmap_share(p);
- /*
- * If copy_thread() if failing, don't leak the shadow stack possibly
- * allocated in shstk_alloc_thread_stack() above.
- */
- if (ret)
- shstk_free(p);
-
return ret;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 3/3] x86/shstk: Add warning for shadow stack double unmap
2023-09-08 20:36 [PATCH 0/3] Fix shadow stack issues in clone error handling Rick Edgecombe
2023-09-08 20:36 ` [PATCH 1/3] x86/shstk: Handle vfork clone failure correctly Rick Edgecombe
2023-09-08 20:36 ` [PATCH 2/3] x86/shstk: Remove useless clone error handling Rick Edgecombe
@ 2023-09-08 20:36 ` Rick Edgecombe
2 siblings, 0 replies; 4+ messages in thread
From: Rick Edgecombe @ 2023-09-08 20:36 UTC (permalink / raw)
To: x86, Thomas Gleixner, Ingo Molnar, Andy Lutomirski,
Borislav Petkov, Dave Hansen, H . Peter Anvin, Peter Zijlstra,
hjl.tools, linux-kernel
Cc: rick.p.edgecombe
There are several ways a thread's shadow stacks can get unmapped. This
can happen on exit or exec, as well as error handling in exec or clone.
The task struct already keeps track of the thread's shadow stack. Use the
size variable to keep track of if the shadow stack has already been freed.
When an attempt to double unmap the thread shadow stack is caught, warn
about it and abort the operation.
Tested-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
arch/x86/kernel/shstk.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index ad63252ebebc..59e15dd8d0f8 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -426,7 +426,18 @@ void shstk_free(struct task_struct *tsk)
if (!shstk->base)
return;
+ /*
+ * shstk->base is NULL for CLONE_VFORK child tasks, and so is
+ * normal. But size = 0 on a shstk->base is not normal and
+ * indicated an attempt to free the thread shadow stack twice.
+ * Warn about it.
+ */
+ if (WARN_ON(!shstk->size))
+ return;
+
unmap_shadow_stack(shstk->base, shstk->size);
+
+ shstk->size = 0;
}
static int wrss_control(bool enable)
--
2.34.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-09-08 20:38 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-08 20:36 [PATCH 0/3] Fix shadow stack issues in clone error handling Rick Edgecombe
2023-09-08 20:36 ` [PATCH 1/3] x86/shstk: Handle vfork clone failure correctly Rick Edgecombe
2023-09-08 20:36 ` [PATCH 2/3] x86/shstk: Remove useless clone error handling Rick Edgecombe
2023-09-08 20:36 ` [PATCH 3/3] x86/shstk: Add warning for shadow stack double unmap Rick Edgecombe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox