From: Catalin Marinas <catalin.marinas@arm.com>
To: Mark Brown <broonie@kernel.org>
Cc: "Rick P. Edgecombe" <rick.p.edgecombe@intel.com>,
Deepak Gupta <debug@rivosinc.com>,
Szabolcs Nagy <Szabolcs.Nagy@arm.com>,
"H.J. Lu" <hjl.tools@gmail.com>,
Florian Weimer <fweimer@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Christian Brauner <brauner@kernel.org>,
Shuah Khan <shuah@kernel.org>,
linux-kernel@vger.kernel.org, Will Deacon <will@kernel.org>,
jannh@google.com, linux-kselftest@vger.kernel.org,
linux-api@vger.kernel.org, Kees Cook <kees@kernel.org>
Subject: Re: [PATCH RFT v8 4/9] fork: Add shadow stack support to clone3()
Date: Tue, 13 Aug 2024 17:25:47 +0100 [thread overview]
Message-ID: <ZruJCyXDRNhw6U5A@arm.com> (raw)
In-Reply-To: <Zrag5A5K9pv1K9Uz@finisterre.sirena.org.uk>
On Sat, Aug 10, 2024 at 12:06:12AM +0100, Mark Brown wrote:
> On Fri, Aug 09, 2024 at 07:19:26PM +0100, Catalin Marinas wrote:
> > On Thu, Aug 08, 2024 at 09:15:25AM +0100, Mark Brown wrote:
> > > + /* This should really be an atomic cmpxchg. It is not. */
> > > + if (access_remote_vm(mm, addr, &val, sizeof(val),
> > > + FOLL_FORCE) != sizeof(val))
> > > + goto out;
>
> > If we restrict the shadow stack creation only to the CLONE_VM case, we'd
> > not need the remote vm access, it's in the current mm context already.
> > More on this below.
>
> The discussion in previous iterations was that it seemed better to allow
> even surprising use cases since it simplifies the analysis of what we
> have covered. If the user has specified a shadow stack we just do what
> they asked for and let them worry about if it's useful.
Thanks for the summary of the past discussions, the patch makes more
sense now. I guess it's easier to follow a clone*() syscall where one
can set a new stack pointer even in the !CLONE_VM case. Just let it set
the shadow stack as well with the new ABI.
However, the x86 would be slightly inconsistent here between clone() and
clone3(). I guess it depends how you look at it. The classic clone()
syscall, if one doesn't pass CLONE_VM but does set new stack, there's no
new shadow stack allocated which I'd expect since it's a new stack.
Well, I doubt anyone cares about this scenario. Are there real cases of
!CLONE_VM but with a new stack?
> > > + if (val != expected)
> > > + goto out;
>
> > I'm confused that we need to consume the token here. I could not find
> > the default shadow stack allocation doing this, only setting it via
> > create_rstor_token() (or I did not search enough). In the default case,
> > is the user consuming it? To me the only difference should been the
> > default allocation vs the one passed by the user via clone3(), with the
> > latter maybe requiring the user to set the token initially.
>
> As discussed for a couple of previous versions if we don't have the
> token and userspace can specify any old shadow stack page as the shadow
> stack this allows clone3() to be used to overwrite the shadow stack of
> another thread, you can point to a shadow stack page which is currently
> in use and then run some code that causes shadow stack writes. This
> could potentially then in turn be used as part of a bigger exploit
> chain, probably it's hard to get anything beyond just causing the other
> thread to fault but won't be impossible.
>
> With a kernel allocated shadow stack this is not an issue since we are
> placing the shadow stack in new memory, userspace can't control where we
> place it so it can't overwrite an existing shadow stack.
IIUC, the kernel-allocated shadow stack will have the token always set
while the user-allocated one will be cleared. I was looking to
understand the inconsistency between these two cases in terms of the
final layout of the new shadow stack: one with the token, the other
without. I can see the need for checking but maybe start with requiring
it to be 0 and setting the token before returning, for consistency with
clone().
In the kernel-allocated shadow stack, is the token used for anything? I
can see it's used for signal delivery and return but I couldn't figure
out what it is used for in a thread's shadow stack.
Also, can one not use the clone3() to point to the clone()-allocated
shadow stack? Maybe that's unlikely as an app tends to stick to one
syscall flavour or the other.
> > > + /*
> > > + * For CLONE_VFORK the child will share the parents
> > > + * shadow stack. Make sure to clear the internal
> > > + * tracking of the thread shadow stack so the freeing
> > > + * logic run for child knows to leave it alone.
> > > + */
> > > + if (clone_flags & CLONE_VFORK) {
> > > + shstk->base = 0;
> > > + shstk->size = 0;
> > > + return 0;
> > > + }
>
> > I think we should leave the CLONE_VFORK check on its own independent of
> > the clone3() arguments. If one passes both CLONE_VFORK and specific
> > shadow stack address/size, they should be ignored (or maybe return an
> > error if you want to make it stricter).
>
> This is existing logic from the current x86 code that's been reindented
> due to the addition of explicitly specified shadow stacks, it's not new
> behaviour. It is needed to stop the child thinking it has the parent's
> shadow stack in the CLONE_VFORK case.
I figured that. But similar to the current !CLONE_VM behaviour where no
new shadow stack is allocated even if a new stack is passed to clone(),
I was thinking of something similar here for consistency: don't set up a
shadow stack in the CLONE_VFORK case or at least allow it only if a new
stack is being set up (if we extend this to clone(), it would be a small
ABI change).
> > > - /*
> > > - * For !CLONE_VM the child will use a copy of the parents shadow
> > > - * stack.
> > > - */
> > > - if (!(clone_flags & CLONE_VM))
> > > - return 0;
> > > + /*
> > > + * For !CLONE_VM the child will use a copy of the
> > > + * parents shadow stack.
> > > + */
> > > + if (!(clone_flags & CLONE_VM))
> > > + return 0;
>
> > Is the !CLONE_VM case specific only to the default shadow stack
> > allocation? Sorry if this has been discussed already (or I completely
> > forgot) but I thought we'd only implement this for the thread creation
> > case. The typical fork() for a new process should inherit the parent's
> > layout, so applicable to the clone3() with the shadow stack arguments as
> > well (which should be ignored or maybe return an error with !CLONE_VM).
>
> This is again all existing behaviour for the case where the user has not
> specified a shadow stack reindented, as mentioned above if the user has
> specified one explicitly then we just do what we were asked. The
> existing behaviour is to only create a new shadow stack for the child in
> the CLONE_VM case and leave the child using the same shadow stack as the
> parent in the copied mm for !CLONE_VM.
I guess I was rather questioning the current choices than the new
clone3() ABI. But even for the new clone3() ABI, does it make sense to
set up a shadow stack if the current stack isn't changed? We'll end up
with a lot of possible combinations that will never get tested but
potentially become obscure ABI. Limiting the options to the sane choices
only helps with validation and unsurprising changes later on.
> > > @@ -2790,6 +2808,8 @@ pid_t kernel_clone(struct kernel_clone_args *args)
> > > */
> > > trace_sched_process_fork(current, p);
> > >
> > > + shstk_post_fork(p, args);
>
> > Do we need this post fork call? Can we not handle the setup via the
> > copy_thread() path in shstk_alloc_thread_stack()?
>
> It looks like we do actually have the new mm in the process before we
> call copy_thread() so we could move things into there though we'd loose
> a small bit of factoring out of the error handling (at one point I had
> more code factored out but right now it's quite small, looking again we
> could also factor out the get_task_mm()/mput()). ISTR having the new
> process' mm was the biggest reason for this initially but looking again
> I'm not sure why that was. It does still feel like even the small
> amount that's factored out currently is useful though, a bit less
> duplication in the architecture code which feels welcome here.
I think you can probably keep this. My comment was based on the
assumption that we only support the CLONE_VM case where we wouldn't need
the access_remote_vm(), just some direct write similar to
write_user_shstk_64().
I still think we should have limited this ABI to the CLONE_VM and
!CLONE_VFORK cases but I don't have a strong view if the consensus was
to allow it for classic fork() and vfork() like uses (I just think they
won't be used).
--
Catalin
next prev parent reply other threads:[~2024-08-13 16:25 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-08 8:15 [PATCH RFT v8 0/9] fork: Support shadow stacks in clone3() Mark Brown
2024-08-08 8:15 ` [PATCH RFT v8 1/9] Documentation: userspace-api: Add shadow stack API documentation Mark Brown
2024-08-14 10:40 ` Catalin Marinas
2024-08-08 8:15 ` [PATCH RFT v8 2/9] selftests: Provide helper header for shadow stack testing Mark Brown
2024-08-08 8:15 ` [PATCH RFT v8 3/9] mm: Introduce ARCH_HAS_USER_SHADOW_STACK Mark Brown
2024-08-14 10:41 ` Catalin Marinas
2024-08-08 8:15 ` [PATCH RFT v8 4/9] fork: Add shadow stack support to clone3() Mark Brown
2024-08-09 18:19 ` Catalin Marinas
2024-08-09 23:06 ` Mark Brown
2024-08-13 16:25 ` Catalin Marinas [this message]
2024-08-13 18:58 ` Mark Brown
2024-08-14 9:38 ` Catalin Marinas
2024-08-14 13:20 ` Mark Brown
2024-08-15 0:18 ` Edgecombe, Rick P
2024-08-15 14:24 ` Mark Brown
2024-08-16 8:44 ` Catalin Marinas
2024-08-16 10:51 ` Mark Brown
2024-08-16 15:29 ` Catalin Marinas
2024-08-16 15:46 ` Mark Brown
2024-08-16 14:52 ` Edgecombe, Rick P
2024-08-16 15:30 ` Mark Brown
2024-08-16 15:38 ` Catalin Marinas
2024-08-16 17:06 ` Mark Brown
2024-08-16 17:08 ` Jann Horn
2024-08-16 17:17 ` Mark Brown
2024-08-08 8:15 ` [PATCH RFT v8 5/9] selftests/clone3: Remove redundant flushes of output streams Mark Brown
2024-08-08 8:15 ` [PATCH RFT v8 6/9] selftests/clone3: Factor more of main loop into test_clone3() Mark Brown
2024-08-08 8:15 ` [PATCH RFT v8 7/9] selftests/clone3: Explicitly handle child exits due to signals Mark Brown
2024-08-08 8:15 ` [PATCH RFT v8 8/9] selftests/clone3: Allow tests to flag if -E2BIG is a valid error code Mark Brown
2024-08-08 8:15 ` [PATCH RFT v8 9/9] selftests/clone3: Test shadow stack support Mark Brown
2024-08-08 17:54 ` [PATCH RFT v8 0/9] fork: Support shadow stacks in clone3() Kees Cook
2024-08-15 0:19 ` Edgecombe, Rick P
2024-08-16 15:52 ` Jann Horn
2024-08-16 16:19 ` Mark Brown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZruJCyXDRNhw6U5A@arm.com \
--to=catalin.marinas@arm.com \
--cc=Szabolcs.Nagy@arm.com \
--cc=bp@alien8.de \
--cc=brauner@kernel.org \
--cc=broonie@kernel.org \
--cc=bsegall@google.com \
--cc=dave.hansen@linux.intel.com \
--cc=debug@rivosinc.com \
--cc=dietmar.eggemann@arm.com \
--cc=fweimer@redhat.com \
--cc=hjl.tools@gmail.com \
--cc=hpa@zytor.com \
--cc=jannh@google.com \
--cc=juri.lelli@redhat.com \
--cc=kees@kernel.org \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rick.p.edgecombe@intel.com \
--cc=rostedt@goodmis.org \
--cc=shuah@kernel.org \
--cc=tglx@linutronix.de \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
--cc=will@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox