* [PATCH 0/1] exit: change the release_task() paths to call flush_sigqueue() lockless
@ 2025-02-05 17:51 Oleg Nesterov
2025-02-05 17:51 ` [PATCH 1/1] " Oleg Nesterov
2025-02-05 17:53 ` [PATCH 0/1] " Mateusz Guzik
0 siblings, 2 replies; 8+ messages in thread
From: Oleg Nesterov @ 2025-02-05 17:51 UTC (permalink / raw)
To: Andrew Morton, Eric W. Biederman, Frederic Weisbecker,
Peter Zijlstra, Thomas Gleixner
Cc: Mateusz Guzik, linux-kernel
Thomas, Frederic, could you review?
Untested, and I'll recheck this patch tomorrow, perhaps I missed something...
Mateusz, this patch has minor conflicts with your
[PATCH v3 0/6] reduce tasklist_lock hold time on exit and do some
WIP changes. I can wait for v4 from you and rebase on top, but I'd like
to show this patch right now to simplify the review.
Oleg.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless
2025-02-05 17:51 [PATCH 0/1] exit: change the release_task() paths to call flush_sigqueue() lockless Oleg Nesterov
@ 2025-02-05 17:51 ` Oleg Nesterov
2025-02-05 22:18 ` Frederic Weisbecker
2025-02-06 15:57 ` Thomas Gleixner
2025-02-05 17:53 ` [PATCH 0/1] " Mateusz Guzik
1 sibling, 2 replies; 8+ messages in thread
From: Oleg Nesterov @ 2025-02-05 17:51 UTC (permalink / raw)
To: Andrew Morton, Eric W. Biederman, Frederic Weisbecker,
Peter Zijlstra, Thomas Gleixner
Cc: Mateusz Guzik, linux-kernel
A task can block a signal, accumulate up to RLIMIT_SIGPENDING sigqueues,
and exit. In this case __exit_signal()->flush_sigqueue() called with irqs
disabled can triger a hard lockup, see
https://lore.kernel.org/all/20190322114917.GC28876@redhat.com/
Fortunately, after the recent posixtimer changes sys_timer_delete() paths
no longer try to clear SIGQUEUE_PREALLOC and/or free tmr->sigq, and after
the exiting task passes __exit_signal() lock_task_sighand() can't succeed
and pid_task(tmr->it_pid) will return NULL.
This means that after __exit_signal(tsk) nobody can play with tsk->pending
or (if group_dead) with tsk->signal->shared_pending, so release_task() can
safely call flush_sigqueue() after write_unlock_irq(&tasklist_lock).
Also, kill clear_tsk_thread_flag(TIF_SIGPENDING), it was never needed.
TODO:
- we can probably shift posix_cpu_timers_exit() as well
- do_sigaction() can hit the similar problem
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/exit.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/kernel/exit.c b/kernel/exit.c
index 3485e5fc499e..bc2c24ea4181 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -200,20 +200,12 @@ static void __exit_signal(struct task_struct *tsk)
__unhash_process(tsk, group_dead);
write_sequnlock(&sig->stats_lock);
- /*
- * Do this under ->siglock, we can race with another thread
- * doing sigqueue_free() if we have SIGQUEUE_PREALLOC signals.
- */
- flush_sigqueue(&tsk->pending);
tsk->sighand = NULL;
spin_unlock(&sighand->siglock);
__cleanup_sighand(sighand);
- clear_tsk_thread_flag(tsk, TIF_SIGPENDING);
- if (group_dead) {
- flush_sigqueue(&sig->shared_pending);
+ if (group_dead)
tty_kref_put(tty);
- }
}
static void delayed_put_task_struct(struct rcu_head *rhp)
@@ -279,6 +271,11 @@ void release_task(struct task_struct *p)
proc_flush_pid(thread_pid);
put_pid(thread_pid);
release_thread(p);
+
+ flush_sigqueue(&p->pending);
+ if (thread_group_leader(p))
+ flush_sigqueue(&p->signal->shared_pending);
+
put_task_struct_rcu_user(p);
p = leader;
--
2.25.1.362.g51ebf55
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 0/1] exit: change the release_task() paths to call flush_sigqueue() lockless
2025-02-05 17:51 [PATCH 0/1] exit: change the release_task() paths to call flush_sigqueue() lockless Oleg Nesterov
2025-02-05 17:51 ` [PATCH 1/1] " Oleg Nesterov
@ 2025-02-05 17:53 ` Mateusz Guzik
1 sibling, 0 replies; 8+ messages in thread
From: Mateusz Guzik @ 2025-02-05 17:53 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Eric W. Biederman, Frederic Weisbecker,
Peter Zijlstra, Thomas Gleixner, linux-kernel
On Wed, Feb 5, 2025 at 6:52 PM Oleg Nesterov <oleg@redhat.com> wrote:
>
> Thomas, Frederic, could you review?
>
> Untested, and I'll recheck this patch tomorrow, perhaps I missed something...
>
> Mateusz, this patch has minor conflicts with your
>
> [PATCH v3 0/6] reduce tasklist_lock hold time on exit and do some
>
> WIP changes. I can wait for v4 from you and rebase on top, but I'd like
> to show this patch right now to simplify the review.
>
I'm still waiting for feedback from Eric.
I'll rebase on whatever patch which lands.
--
Mateusz Guzik <mjguzik gmail.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless
2025-02-05 17:51 ` [PATCH 1/1] " Oleg Nesterov
@ 2025-02-05 22:18 ` Frederic Weisbecker
2025-02-06 12:14 ` Oleg Nesterov
2025-02-06 15:57 ` Thomas Gleixner
1 sibling, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2025-02-05 22:18 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Eric W. Biederman, Peter Zijlstra, Thomas Gleixner,
Mateusz Guzik, linux-kernel
Le Wed, Feb 05, 2025 at 06:51:59PM +0100, Oleg Nesterov a écrit :
> A task can block a signal, accumulate up to RLIMIT_SIGPENDING sigqueues,
> and exit. In this case __exit_signal()->flush_sigqueue() called with irqs
> disabled can triger a hard lockup, see
> https://lore.kernel.org/all/20190322114917.GC28876@redhat.com/
>
> Fortunately, after the recent posixtimer changes sys_timer_delete() paths
> no longer try to clear SIGQUEUE_PREALLOC and/or free tmr->sigq, and after
> the exiting task passes __exit_signal() lock_task_sighand() can't succeed
> and pid_task(tmr->it_pid) will return NULL.
>
> This means that after __exit_signal(tsk) nobody can play with tsk->pending
> or (if group_dead) with tsk->signal->shared_pending, so release_task() can
> safely call flush_sigqueue() after write_unlock_irq(&tasklist_lock).
>
> Also, kill clear_tsk_thread_flag(TIF_SIGPENDING), it was never needed.
>
> TODO:
> - we can probably shift posix_cpu_timers_exit() as well
> - do_sigaction() can hit the similar problem
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
> kernel/exit.c | 15 ++++++---------
> 1 file changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 3485e5fc499e..bc2c24ea4181 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -200,20 +200,12 @@ static void __exit_signal(struct task_struct *tsk)
> __unhash_process(tsk, group_dead);
> write_sequnlock(&sig->stats_lock);
>
> - /*
> - * Do this under ->siglock, we can race with another thread
> - * doing sigqueue_free() if we have SIGQUEUE_PREALLOC signals.
> - */
> - flush_sigqueue(&tsk->pending);
> tsk->sighand = NULL;
> spin_unlock(&sighand->siglock);
>
> __cleanup_sighand(sighand);
> - clear_tsk_thread_flag(tsk, TIF_SIGPENDING);
Looks good to me, except for this TIF_SIGPENDING removal which I'm less
sure about, I see a lot of places where it is added/removed. Well it's
probably only checked locally on entry code. Would it make sense to move
this chunk to a separate preceding patch? Or keep it here but at least
explain on the changelog why it is safe to remove it?
Thanks!
> - if (group_dead) {
> - flush_sigqueue(&sig->shared_pending);
> + if (group_dead)
> tty_kref_put(tty);
> - }
> }
>
> static void delayed_put_task_struct(struct rcu_head *rhp)
> @@ -279,6 +271,11 @@ void release_task(struct task_struct *p)
> proc_flush_pid(thread_pid);
> put_pid(thread_pid);
> release_thread(p);
> +
> + flush_sigqueue(&p->pending);
> + if (thread_group_leader(p))
> + flush_sigqueue(&p->signal->shared_pending);
> +
> put_task_struct_rcu_user(p);
>
> p = leader;
> --
> 2.25.1.362.g51ebf55
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless
2025-02-05 22:18 ` Frederic Weisbecker
@ 2025-02-06 12:14 ` Oleg Nesterov
0 siblings, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2025-02-06 12:14 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Andrew Morton, Eric W. Biederman, Peter Zijlstra, Thomas Gleixner,
Mateusz Guzik, linux-kernel
On 02/05, Frederic Weisbecker wrote:
>
> Looks good to me, except for this TIF_SIGPENDING removal which I'm less
> sure about, I see a lot of places where it is added/removed. Well it's
> probably only checked locally on entry code. Would it make sense to move
> this chunk to a separate preceding patch?
OK, will do.
Thanks!
Oleg.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless
2025-02-05 17:51 ` [PATCH 1/1] " Oleg Nesterov
2025-02-05 22:18 ` Frederic Weisbecker
@ 2025-02-06 15:57 ` Thomas Gleixner
2025-02-06 16:23 ` Oleg Nesterov
1 sibling, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2025-02-06 15:57 UTC (permalink / raw)
To: Oleg Nesterov, Andrew Morton, Eric W. Biederman,
Frederic Weisbecker, Peter Zijlstra
Cc: Mateusz Guzik, linux-kernel
On Wed, Feb 05 2025 at 18:51, Oleg Nesterov wrote:
> A task can block a signal, accumulate up to RLIMIT_SIGPENDING sigqueues,
> and exit. In this case __exit_signal()->flush_sigqueue() called with irqs
> disabled can triger a hard lockup, see
> https://lore.kernel.org/all/20190322114917.GC28876@redhat.com/
>
> Fortunately, after the recent posixtimer changes sys_timer_delete() paths
> no longer try to clear SIGQUEUE_PREALLOC and/or free tmr->sigq, and after
> the exiting task passes __exit_signal() lock_task_sighand() can't succeed
> and pid_task(tmr->it_pid) will return NULL.
>
> This means that after __exit_signal(tsk) nobody can play with tsk->pending
> or (if group_dead) with tsk->signal->shared_pending, so release_task() can
> safely call flush_sigqueue() after write_unlock_irq(&tasklist_lock).
I can't find a problem with that.
> Also, kill clear_tsk_thread_flag(TIF_SIGPENDING), it was never needed.
I'm not entirely sure about that, but it does not hurt to clear it,
right?
> TODO:
> - we can probably shift posix_cpu_timers_exit() as well
I think so.
> - do_sigaction() can hit the similar problem
Indeed, but that's a tough on to solve.
Thanks,
tglx
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless
2025-02-06 15:57 ` Thomas Gleixner
@ 2025-02-06 16:23 ` Oleg Nesterov
2025-02-06 21:04 ` Thomas Gleixner
0 siblings, 1 reply; 8+ messages in thread
From: Oleg Nesterov @ 2025-02-06 16:23 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Andrew Morton, Eric W. Biederman, Frederic Weisbecker,
Peter Zijlstra, Mateusz Guzik, linux-kernel
On 02/06, Thomas Gleixner wrote:
>
> On Wed, Feb 05 2025 at 18:51, Oleg Nesterov wrote:
> > A task can block a signal, accumulate up to RLIMIT_SIGPENDING sigqueues,
> > and exit. In this case __exit_signal()->flush_sigqueue() called with irqs
> > disabled can triger a hard lockup, see
> > https://lore.kernel.org/all/20190322114917.GC28876@redhat.com/
> >
> > Fortunately, after the recent posixtimer changes sys_timer_delete() paths
> > no longer try to clear SIGQUEUE_PREALLOC and/or free tmr->sigq, and after
> > the exiting task passes __exit_signal() lock_task_sighand() can't succeed
> > and pid_task(tmr->it_pid) will return NULL.
> >
> > This means that after __exit_signal(tsk) nobody can play with tsk->pending
> > or (if group_dead) with tsk->signal->shared_pending, so release_task() can
> > safely call flush_sigqueue() after write_unlock_irq(&tasklist_lock).
>
> I can't find a problem with that.
Ah, good.
> > Also, kill clear_tsk_thread_flag(TIF_SIGPENDING), it was never needed.
>
> I'm not entirely sure about that, but it does not hurt to clear it,
> right?
Please see v2 which documents this change in a separate patch.
Again, it is not that it is really bad, just it looks very confusing
to me and I think it can confuse other readers of this code.
> > - do_sigaction() can hit the similar problem
>
> Indeed, but that's a tough on to solve.
Yeah... Although I have to admit that yesterday I had a very simple
(and wrong) solution in mind ;)
Thanks!
Oleg.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless
2025-02-06 16:23 ` Oleg Nesterov
@ 2025-02-06 21:04 ` Thomas Gleixner
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2025-02-06 21:04 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Eric W. Biederman, Frederic Weisbecker,
Peter Zijlstra, Mateusz Guzik, linux-kernel
Oleg!
On Thu, Feb 06 2025 at 17:23, Oleg Nesterov wrote:
> On 02/06, Thomas Gleixner wrote:
>> > - do_sigaction() can hit the similar problem
>>
>> Indeed, but that's a tough on to solve.
>
> Yeah... Although I have to admit that yesterday I had a very simple
> (and wrong) solution in mind ;)
I can relate. My initial "brilliant" idea to solve that turned out to be
broken after less than 10 seconds... :)
That needs some more thought to the whole signal handling business and
unfortunately the recent posix-timer changes do not help much in that
context. Let me think about it some more.
Thanks,
tglx
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-02-06 21:04 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-05 17:51 [PATCH 0/1] exit: change the release_task() paths to call flush_sigqueue() lockless Oleg Nesterov
2025-02-05 17:51 ` [PATCH 1/1] " Oleg Nesterov
2025-02-05 22:18 ` Frederic Weisbecker
2025-02-06 12:14 ` Oleg Nesterov
2025-02-06 15:57 ` Thomas Gleixner
2025-02-06 16:23 ` Oleg Nesterov
2025-02-06 21:04 ` Thomas Gleixner
2025-02-05 17:53 ` [PATCH 0/1] " Mateusz Guzik
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.