* Re: [PATCH 2/3] seccomp: release task filters when the task exits
[not found] ` <CAEWA0a5dBvRwGAnztL56i=JV-WGGiaTd-GdJYdOxZmq1c+bdpg@mail.gmail.com>
@ 2024-05-16 9:34 ` Oleg Nesterov
2024-05-16 13:09 ` Oleg Nesterov
0 siblings, 1 reply; 8+ messages in thread
From: Oleg Nesterov @ 2024-05-16 9:34 UTC (permalink / raw)
To: Andrei Vagin
Cc: Kees Cook, Tycho Andersen, Andy Lutomirski, Will Drewry,
Jens Axboe, Christian Brauner, linux-kernel
(add lkml)
On 05/15, Andrei Vagin wrote:
>
> On Wed, May 15, 2024 at 5:52 AM Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Let me repeat I forgot everything about seccomp, but let me ask
> > a couple of questions...
>
> It seems you still remember something:). Thank you for the feedback.
Just I am still remember how to use grep ;)
> > > @@ -2126,6 +2137,11 @@ static struct seccomp_filter *get_nth_filter(struct task_struct *task,
> > > */
> > > spin_lock_irq(&task->sighand->siglock);
> > >
> > > + if (task->flags & PF_EXITING) {
> > > + spin_unlock_irq(&task->sighand->siglock);
> > > + return ERR_PTR(-EINVAL);
> > > + }
> >
> > Why do we need the PF_EXITING check here?
> >
> > This looks unnecessary even if get_nth_filter() could race with the
> > exiting task, but this doesn't matter.
> >
> > This race is not possible, get_nth_filter() is only called from ptrace()
> > paths, but the tracee can't stop in TASK_TRACED after exit_signals() which
> > sets PF_EXITING.
>
> If we rely on using seccomp_get_filter only from ptrace, you are right.
Plus it too does __get_seccomp_filter/__get_seccomp_filter, so I guess it
should be safe without this check even if it could be used outside of ptrace.
Just like proc_pid_seccomp_cache(), see below.
> > > @@ -2494,6 +2510,11 @@ int proc_pid_seccomp_cache(struct seq_file *m, struct pid_namespace *ns,
> > > if (!lock_task_sighand(task, &flags))
> > > return -ESRCH;
> > >
> > > + if (thread->flags & PF_EXITING) {
> > > + unlock_task_sighand(task, &flags);
> > > + return 0;
> >
> > Again, do we really need this check?
> >
> > It can race with the exiting task and (without this check) do
> > __get_seccomp_filter(f) right before seccomp_filter_release()
> > takes sighand->siglock. But why is it bad?
>
> I think you are right, this check isn't required.
>
> >
> > OTOH. I guess proc_pid_seccomp_cache() is the only reason why
> > seccomp_filter_release() takes ->siglock with your patch?
>
> seccomp_sync_threads and seccomp_can_sync_threads should be considered too.
Yes. But we only need to consider them in the multi-thread case, right?
In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
miss this flag, seccomp_filter_release() doesn't need to take siglock.
> If we check PF_EXITING in all of them, we don't need to take ->siglock in
> seccomp_filter_release. Does it sound right?
The problem is a single-threaded exiting task. In this case exit_signals()
sets PF_EXITING lockless. This means that in this case
- proc_pid_seccomp_cache() can't rely on the PF_EXITING check
but it can be safely removed.
- seccomp_filter_release() needs to take ->siglock to avoid the
race with proc_pid_seccomp_cache().
And this chunk from your patch
static void __seccomp_filter_orphan(struct seccomp_filter *orig)
{
+ lockdep_assert_held(¤t->sighand->siglock);
+
looks unnecessary too, seccomp_filter_release() can just do
spin_lock_irq(siglock);
orig = tsk->seccomp.filter;
tsk->seccomp.filter = NULL;
spin_unlock_irq(siglock);
__seccomp_filter_release(orig);
Right?
Oleg.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/3] seccomp: release task filters when the task exits
2024-05-16 9:34 ` [PATCH 2/3] seccomp: release task filters when the task exits Oleg Nesterov
@ 2024-05-16 13:09 ` Oleg Nesterov
2024-05-22 6:49 ` Andrei Vagin
0 siblings, 1 reply; 8+ messages in thread
From: Oleg Nesterov @ 2024-05-16 13:09 UTC (permalink / raw)
To: Andrei Vagin
Cc: Kees Cook, Tycho Andersen, Andy Lutomirski, Will Drewry,
Jens Axboe, Christian Brauner, linux-kernel
On 05/16, Oleg Nesterov wrote:
>
> On 05/15, Andrei Vagin wrote:
> >
> > seccomp_sync_threads and seccomp_can_sync_threads should be considered too.
>
> Yes. But we only need to consider them in the multi-thread case, right?
> In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
> miss this flag, seccomp_filter_release() doesn't need to take siglock.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Ah, no. seccomp_filter_release() does need to take ->siglock even if we
forget about proc_pid_seccomp_cache().
Without siglock
orig = tsk->seccomp.filter;
can leak into the critical section in exit_signals() (spin_unlock is the
one-way barrier) and this LOAD can be reordered with "flags |= PF_EXITING".
Hmm. I thought we have something smp_mb__after_unlock(), but it seems we
don't. So we can't add a fast-path
if (!tsk->seccomp.filter)
return;
check at the start of seccomp_filter_release().
Cough... Now that I look at seccomp_can_sync_threads() I think it too
doesn't need the PF_EXITING check.
If it is called before seccomp_filter_release(), this doesn't really
differ from the case when it is called before do_exit/exit_signals.
If it is called after seccomp_filter_release(), then is_ancestor()
must be true.
But perhaps I missed something, I won't insist, up to you.
> > If we check PF_EXITING in all of them, we don't need to take ->siglock in
> > seccomp_filter_release. Does it sound right?
>
> The problem is a single-threaded exiting task. In this case exit_signals()
> sets PF_EXITING lockless. This means that in this case
>
> - proc_pid_seccomp_cache() can't rely on the PF_EXITING check
> but it can be safely removed.
>
> - seccomp_filter_release() needs to take ->siglock to avoid the
> race with proc_pid_seccomp_cache().
>
> And this chunk from your patch
>
> static void __seccomp_filter_orphan(struct seccomp_filter *orig)
> {
> + lockdep_assert_held(¤t->sighand->siglock);
> +
>
> looks unnecessary too, seccomp_filter_release() can just do
>
> spin_lock_irq(siglock);
> orig = tsk->seccomp.filter;
> tsk->seccomp.filter = NULL;
> spin_unlock_irq(siglock);
>
> __seccomp_filter_release(orig);
>
> Right?
>
> Oleg.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/3] seccomp: release task filters when the task exits
2024-05-16 13:09 ` Oleg Nesterov
@ 2024-05-22 6:49 ` Andrei Vagin
2024-05-22 7:06 ` Andrei Vagin
0 siblings, 1 reply; 8+ messages in thread
From: Andrei Vagin @ 2024-05-22 6:49 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Kees Cook, Tycho Andersen, Andy Lutomirski, Will Drewry,
Jens Axboe, Christian Brauner, linux-kernel
On Thu, May 16, 2024 at 6:10 AM Oleg Nesterov <oleg@redhat.com> wrote:
>
> On 05/16, Oleg Nesterov wrote:
> >
> > On 05/15, Andrei Vagin wrote:
> > >
> > > seccomp_sync_threads and seccomp_can_sync_threads should be considered too.
> >
> > Yes. But we only need to consider them in the multi-thread case, right?
> > In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
> > miss this flag, seccomp_filter_release() doesn't need to take siglock.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Ah, no. seccomp_filter_release() does need to take ->siglock even if we
> forget about proc_pid_seccomp_cache().
>
> Without siglock
>
> orig = tsk->seccomp.filter;
>
> can leak into the critical section in exit_signals() (spin_unlock is the
> one-way barrier) and this LOAD can be reordered with "flags |= PF_EXITING".
>
> Hmm. I thought we have something smp_mb__after_unlock(), but it seems we
> don't. So we can't add a fast-path
We have smp_mb__after_unlock_lock in include/linux/rcupdate.h.
>
> if (!tsk->seccomp.filter)
> return;
>
> check at the start of seccomp_filter_release().
>
>
> Cough... Now that I look at seccomp_can_sync_threads() I think it too
> doesn't need the PF_EXITING check.
>
> If it is called before seccomp_filter_release(), this doesn't really
> differ from the case when it is called before do_exit/exit_signals.
>
> If it is called after seccomp_filter_release(), then is_ancestor()
> must be true.
>
> But perhaps I missed something, I won't insist, up to you.
>
> > > If we check PF_EXITING in all of them, we don't need to take ->siglock in
> > > seccomp_filter_release. Does it sound right?
> >
> > The problem is a single-threaded exiting task. In this case exit_signals()
> > sets PF_EXITING lockless. This means that in this case
> >
> > - proc_pid_seccomp_cache() can't rely on the PF_EXITING check
> > but it can be safely removed.
> >
> > - seccomp_filter_release() needs to take ->siglock to avoid the
> > race with proc_pid_seccomp_cache().
> >
> > And this chunk from your patch
> >
> > static void __seccomp_filter_orphan(struct seccomp_filter *orig)
> > {
> > + lockdep_assert_held(¤t->sighand->siglock);
> > +
> >
> > looks unnecessary too, seccomp_filter_release() can just do
> >
> > spin_lock_irq(siglock);
> > orig = tsk->seccomp.filter;
> > tsk->seccomp.filter = NULL;
> > spin_unlock_irq(siglock);
> >
> > __seccomp_filter_release(orig);
> >
> > Right?
> >
> > Oleg.
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/3] seccomp: release task filters when the task exits
2024-05-22 6:49 ` Andrei Vagin
@ 2024-05-22 7:06 ` Andrei Vagin
2024-05-22 10:35 ` Oleg Nesterov
0 siblings, 1 reply; 8+ messages in thread
From: Andrei Vagin @ 2024-05-22 7:06 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Kees Cook, Tycho Andersen, Andy Lutomirski, Will Drewry,
Jens Axboe, Christian Brauner, linux-kernel
> On Thu, May 16, 2024 at 6:10 AM Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > On 05/16, Oleg Nesterov wrote:
> > >
> > > On 05/15, Andrei Vagin wrote:
> > > >
> > > > seccomp_sync_threads and seccomp_can_sync_threads should be considered too.
> > >
> > > Yes. But we only need to consider them in the multi-thread case, right?
> > > In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
> > > miss this flag, seccomp_filter_release() doesn't need to take siglock.
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PF_EXITING is set without holding ->siglock if tsk->signal has the
SIGNAL_GROUP_EXIT flag. I think it can be a case when one thread is in
seccomp_sync_threads and others are exiting. The first thread can check
that PF_EXITING isn't set for another thread. Then, the second thread calls
exit_signals and seccomp_filter_release(), and finally, the first thread
sets its seccomp.filter to the second thread. If seccomp_filter_release takes
siglock, it will be handled properly.
> >
> > Ah, no. seccomp_filter_release() does need to take ->siglock even if we
> > forget about proc_pid_seccomp_cache().
> >
> > Without siglock
> >
> > orig = tsk->seccomp.filter;
> >
> > can leak into the critical section in exit_signals() (spin_unlock is the
> > one-way barrier) and this LOAD can be reordered with "flags |= PF_EXITING".
> >
> > Hmm. I thought we have something smp_mb__after_unlock(), but it seems we
> > don't. So we can't add a fast-path
We have smp_mb__after_unlock_lock in include/linux/rcupdate.h.
> >
> > if (!tsk->seccomp.filter)
> > return;
> >
> > check at the start of seccomp_filter_release().
> >
> >
> > Cough... Now that I look at seccomp_can_sync_threads() I think it too
> > doesn't need the PF_EXITING check.
> >
> > If it is called before seccomp_filter_release(), this doesn't really
> > differ from the case when it is called before do_exit/exit_signals.
> >
> > If it is called after seccomp_filter_release(), then is_ancestor()
> > must be true.
> >
> > But perhaps I missed something, I won't insist, up to you.
> >
You are right, this check isn't required in seccomp_can_sync_threads, but
I decided that it is better to be consistent with seccomp_sync_threads.
Thanks,
Andrei
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/3] seccomp: release task filters when the task exits
2024-05-22 7:06 ` Andrei Vagin
@ 2024-05-22 10:35 ` Oleg Nesterov
0 siblings, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2024-05-22 10:35 UTC (permalink / raw)
To: Andrei Vagin
Cc: Kees Cook, Tycho Andersen, Andy Lutomirski, Will Drewry,
Jens Axboe, Christian Brauner, linux-kernel
On 05/22, Andrei Vagin wrote:
>
> > On Thu, May 16, 2024 at 6:10 AM Oleg Nesterov <oleg@redhat.com> wrote:
> > >
> > > On 05/16, Oleg Nesterov wrote:
> > > >
> > > > On 05/15, Andrei Vagin wrote:
> > > > >
> > > > > seccomp_sync_threads and seccomp_can_sync_threads should be considered too.
> > > >
> > > > Yes. But we only need to consider them in the multi-thread case, right?
> > > > In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
> > > > miss this flag, seccomp_filter_release() doesn't need to take siglock.
> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> PF_EXITING is set without holding ->siglock if tsk->signal has the
> SIGNAL_GROUP_EXIT flag. I think it can be a case when one thread is in
> seccomp_sync_threads and others are exiting.
Yes, I forgot this.
> > > Hmm. I thought we have something smp_mb__after_unlock(), but it seems we
> > > don't. So we can't add a fast-path
>
> We have smp_mb__after_unlock_lock in include/linux/rcupdate.h.
This is another thing.
But sorry for confusion, this doesn't really matter, we could you a plain mb().
I mean, I was thinking about something like
seccomp_filter_release:
smp_mb();
if (!READ_ONCE(tsk->seccomp.filter))
return;
spin_lock_irq(siglock);
orig = tsk->seccomp.filter;
...
but then seccomp_sync_threads() should do something like
orig = READ_ONCE(thread->seccomp.filter);
smp_store_release(&thread->seccomp.filter,
caller->seccomp.filter);
smp_mb(); // pairs with mb() in seccomp_filter_release()
if (READ_ONCE(thread->flags) & PF_EXITING) {
WRITE_ONCE(thread->seccomp.filter, orig);
continue;
}
__seccomp_filter_release(orig);
...
too subtle even _if_ correct, and I am not sure at all this would be correct.
> > > Cough... Now that I look at seccomp_can_sync_threads() I think it too
> > > doesn't need the PF_EXITING check.
> > >
> > > If it is called before seccomp_filter_release(), this doesn't really
> > > differ from the case when it is called before do_exit/exit_signals.
> > >
> > > If it is called after seccomp_filter_release(), then is_ancestor()
> > > must be true.
> > >
> > > But perhaps I missed something, I won't insist, up to you.
> > >
>
> You are right, this check isn't required in seccomp_can_sync_threads, but
> I decided that it is better to be consistent with seccomp_sync_threads.
OK, agreed.
Oleg.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/3] seccomp: release task filters when the task exits
2024-05-23 1:45 [PATCH 0/3 v2] seccomp: improve handling of SECCOMP_IOCTL_NOTIF_RECV Andrei Vagin
@ 2024-05-23 1:45 ` Andrei Vagin
2024-05-23 9:00 ` Oleg Nesterov
2024-06-26 18:57 ` Kees Cook
0 siblings, 2 replies; 8+ messages in thread
From: Andrei Vagin @ 2024-05-23 1:45 UTC (permalink / raw)
To: Kees Cook, Andy Lutomirski, Will Drewry, Oleg Nesterov,
Christian Brauner
Cc: linux-kernel, Tycho Andersen, Andrei Vagin, Jens Axboe
Previously, seccomp filters were released in release_task(), which
required the process to exit and its zombie to be collected. However,
exited threads/processes can't trigger any seccomp events, making it
more logical to release filters upon task exits.
This adjustment simplifies scenarios where a parent is tracing its child
process. The parent process can now handle all events from a seccomp
listening descriptor and then call wait to collect a child zombie.
seccomp_filter_release takes the siglock to avoid races with
seccomp_sync_threads. There was an idea to bypass taking the lock by
checking PF_EXITING, but it can be set without holding siglock if
threads have SIGNAL_GROUP_EXIT. This means it can happen concurently
with seccomp_filter_release.
Signed-off-by: Andrei Vagin <avagin@google.com>
---
kernel/exit.c | 3 ++-
kernel/seccomp.c | 22 ++++++++++++++++------
2 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/kernel/exit.c b/kernel/exit.c
index 41a12630cbbc..23439c021d8d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -278,7 +278,6 @@ void release_task(struct task_struct *p)
}
write_unlock_irq(&tasklist_lock);
- seccomp_filter_release(p);
proc_flush_pid(thread_pid);
put_pid(thread_pid);
release_thread(p);
@@ -836,6 +835,8 @@ void __noreturn do_exit(long code)
io_uring_files_cancel();
exit_signals(tsk); /* sets PF_EXITING */
+ seccomp_filter_release(tsk);
+
acct_update_integrals(tsk);
group_dead = atomic_dec_and_test(&tsk->signal->live);
if (group_dead) {
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 35435e8f1035..67305e776dd3 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -502,6 +502,9 @@ static inline pid_t seccomp_can_sync_threads(void)
/* Skip current, since it is initiating the sync. */
if (thread == caller)
continue;
+ /* Skip exited threads. */
+ if (thread->flags & PF_EXITING)
+ continue;
if (thread->seccomp.mode == SECCOMP_MODE_DISABLED ||
(thread->seccomp.mode == SECCOMP_MODE_FILTER &&
@@ -563,18 +566,18 @@ static void __seccomp_filter_release(struct seccomp_filter *orig)
* @tsk: task the filter should be released from.
*
* This function should only be called when the task is exiting as
- * it detaches it from its filter tree. As such, READ_ONCE() and
- * barriers are not needed here, as would normally be needed.
+ * it detaches it from its filter tree. PF_EXITING has to be set
+ * for the task.
*/
void seccomp_filter_release(struct task_struct *tsk)
{
- struct seccomp_filter *orig = tsk->seccomp.filter;
-
- /* We are effectively holding the siglock by not having any sighand. */
- WARN_ON(tsk->sighand != NULL);
+ struct seccomp_filter *orig;
+ spin_lock_irq(¤t->sighand->siglock);
+ orig = tsk->seccomp.filter;
/* Detach task from its filter tree. */
tsk->seccomp.filter = NULL;
+ spin_unlock_irq(¤t->sighand->siglock);
__seccomp_filter_release(orig);
}
@@ -602,6 +605,13 @@ static inline void seccomp_sync_threads(unsigned long flags)
if (thread == caller)
continue;
+ /*
+ * Skip exited threads. seccomp_filter_release could have
+ * been already called for this task.
+ */
+ if (thread->flags & PF_EXITING)
+ continue;
+
/* Get a task reference for the new leaf node. */
get_seccomp_filter(caller);
--
2.45.1.288.g0e0cd299f1-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 2/3] seccomp: release task filters when the task exits
2024-05-23 1:45 ` [PATCH 2/3] seccomp: release task filters when the task exits Andrei Vagin
@ 2024-05-23 9:00 ` Oleg Nesterov
2024-06-26 18:57 ` Kees Cook
1 sibling, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2024-05-23 9:00 UTC (permalink / raw)
To: Andrei Vagin
Cc: Kees Cook, Andy Lutomirski, Will Drewry, Christian Brauner,
linux-kernel, Tycho Andersen, Jens Axboe
On 05/23, Andrei Vagin wrote:
>
> Previously, seccomp filters were released in release_task(), which
> required the process to exit and its zombie to be collected. However,
> exited threads/processes can't trigger any seccomp events, making it
> more logical to release filters upon task exits.
>
> This adjustment simplifies scenarios where a parent is tracing its child
> process. The parent process can now handle all events from a seccomp
> listening descriptor and then call wait to collect a child zombie.
>
> seccomp_filter_release takes the siglock to avoid races with
> seccomp_sync_threads. There was an idea to bypass taking the lock by
> checking PF_EXITING, but it can be set without holding siglock if
> threads have SIGNAL_GROUP_EXIT. This means it can happen concurently
> with seccomp_filter_release.
>
> Signed-off-by: Andrei Vagin <avagin@google.com>
> ---
> kernel/exit.c | 3 ++-
> kernel/seccomp.c | 22 ++++++++++++++++------
> 2 files changed, 18 insertions(+), 7 deletions(-)
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/3] seccomp: release task filters when the task exits
2024-05-23 1:45 ` [PATCH 2/3] seccomp: release task filters when the task exits Andrei Vagin
2024-05-23 9:00 ` Oleg Nesterov
@ 2024-06-26 18:57 ` Kees Cook
1 sibling, 0 replies; 8+ messages in thread
From: Kees Cook @ 2024-06-26 18:57 UTC (permalink / raw)
To: Andrei Vagin
Cc: Andy Lutomirski, Will Drewry, Oleg Nesterov, Christian Brauner,
linux-kernel, Tycho Andersen, Jens Axboe
On Thu, May 23, 2024 at 01:45:39AM +0000, Andrei Vagin wrote:
> Previously, seccomp filters were released in release_task(), which
> required the process to exit and its zombie to be collected. However,
> exited threads/processes can't trigger any seccomp events, making it
> more logical to release filters upon task exits.
>
> This adjustment simplifies scenarios where a parent is tracing its child
> process. The parent process can now handle all events from a seccomp
> listening descriptor and then call wait to collect a child zombie.
>
> seccomp_filter_release takes the siglock to avoid races with
> seccomp_sync_threads. There was an idea to bypass taking the lock by
> checking PF_EXITING, but it can be set without holding siglock if
> threads have SIGNAL_GROUP_EXIT. This means it can happen concurently
> with seccomp_filter_release.
>
> Signed-off-by: Andrei Vagin <avagin@google.com>
> ---
> kernel/exit.c | 3 ++-
> kernel/seccomp.c | 22 ++++++++++++++++------
> 2 files changed, 18 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 41a12630cbbc..23439c021d8d 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -278,7 +278,6 @@ void release_task(struct task_struct *p)
> }
>
> write_unlock_irq(&tasklist_lock);
> - seccomp_filter_release(p);
> proc_flush_pid(thread_pid);
> put_pid(thread_pid);
> release_thread(p);
> @@ -836,6 +835,8 @@ void __noreturn do_exit(long code)
> io_uring_files_cancel();
> exit_signals(tsk); /* sets PF_EXITING */
>
> + seccomp_filter_release(tsk);
> +
> acct_update_integrals(tsk);
> group_dead = atomic_dec_and_test(&tsk->signal->live);
> if (group_dead) {
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 35435e8f1035..67305e776dd3 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -502,6 +502,9 @@ static inline pid_t seccomp_can_sync_threads(void)
> /* Skip current, since it is initiating the sync. */
> if (thread == caller)
> continue;
> + /* Skip exited threads. */
> + if (thread->flags & PF_EXITING)
> + continue;
>
> if (thread->seccomp.mode == SECCOMP_MODE_DISABLED ||
> (thread->seccomp.mode == SECCOMP_MODE_FILTER &&
> @@ -563,18 +566,18 @@ static void __seccomp_filter_release(struct seccomp_filter *orig)
> * @tsk: task the filter should be released from.
> *
> * This function should only be called when the task is exiting as
> - * it detaches it from its filter tree. As such, READ_ONCE() and
> - * barriers are not needed here, as would normally be needed.
> + * it detaches it from its filter tree. PF_EXITING has to be set
> + * for the task.
Let's capture this requirement with a WARN_ON() (like was done for the
sighand case before). So before the spinlock, check for PF_EXITING and
fail safe (don't release):
if (WARN_ON((tsk->flags & PF_EXITING) == 0))
return;
> */
> void seccomp_filter_release(struct task_struct *tsk)
> {
> - struct seccomp_filter *orig = tsk->seccomp.filter;
> -
> - /* We are effectively holding the siglock by not having any sighand. */
> - WARN_ON(tsk->sighand != NULL);
> + struct seccomp_filter *orig;
>
> + spin_lock_irq(¤t->sighand->siglock);
Shouldn't this be "tsk" not "current"?
> + orig = tsk->seccomp.filter;
> /* Detach task from its filter tree. */
> tsk->seccomp.filter = NULL;
> + spin_unlock_irq(¤t->sighand->siglock);
Same.
> __seccomp_filter_release(orig);
> }
>
> @@ -602,6 +605,13 @@ static inline void seccomp_sync_threads(unsigned long flags)
> if (thread == caller)
> continue;
>
> + /*
> + * Skip exited threads. seccomp_filter_release could have
> + * been already called for this task.
> + */
> + if (thread->flags & PF_EXITING)
> + continue;
> +
> /* Get a task reference for the new leaf node. */
> get_seccomp_filter(caller);
>
> --
> 2.45.1.288.g0e0cd299f1-goog
>
Otherwise, looks good!
--
Kees Cook
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-06-26 18:57 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20240514175551.297237-1-avagin@google.com>
[not found] ` <20240514175551.297237-3-avagin@google.com>
[not found] ` <20240515125113.GC6821@redhat.com>
[not found] ` <CAEWA0a5dBvRwGAnztL56i=JV-WGGiaTd-GdJYdOxZmq1c+bdpg@mail.gmail.com>
2024-05-16 9:34 ` [PATCH 2/3] seccomp: release task filters when the task exits Oleg Nesterov
2024-05-16 13:09 ` Oleg Nesterov
2024-05-22 6:49 ` Andrei Vagin
2024-05-22 7:06 ` Andrei Vagin
2024-05-22 10:35 ` Oleg Nesterov
2024-05-23 1:45 [PATCH 0/3 v2] seccomp: improve handling of SECCOMP_IOCTL_NOTIF_RECV Andrei Vagin
2024-05-23 1:45 ` [PATCH 2/3] seccomp: release task filters when the task exits Andrei Vagin
2024-05-23 9:00 ` Oleg Nesterov
2024-06-26 18:57 ` Kees Cook
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox