* [PATCH 0/1] exit: change the release_task() paths to call flush_sigqueue() lockless @ 2025-02-05 17:51 Oleg Nesterov 2025-02-05 17:51 ` [PATCH 1/1] " Oleg Nesterov 2025-02-05 17:53 ` [PATCH 0/1] " Mateusz Guzik 0 siblings, 2 replies; 8+ messages in thread From: Oleg Nesterov @ 2025-02-05 17:51 UTC (permalink / raw) To: Andrew Morton, Eric W. Biederman, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner Cc: Mateusz Guzik, linux-kernel Thomas, Frederic, could you review? Untested, and I'll recheck this patch tomorrow, perhaps I missed something... Mateusz, this patch has minor conflicts with your [PATCH v3 0/6] reduce tasklist_lock hold time on exit and do some WIP changes. I can wait for v4 from you and rebase on top, but I'd like to show this patch right now to simplify the review. Oleg. ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless 2025-02-05 17:51 [PATCH 0/1] exit: change the release_task() paths to call flush_sigqueue() lockless Oleg Nesterov @ 2025-02-05 17:51 ` Oleg Nesterov 2025-02-05 22:18 ` Frederic Weisbecker 2025-02-06 15:57 ` Thomas Gleixner 2025-02-05 17:53 ` [PATCH 0/1] " Mateusz Guzik 1 sibling, 2 replies; 8+ messages in thread From: Oleg Nesterov @ 2025-02-05 17:51 UTC (permalink / raw) To: Andrew Morton, Eric W. Biederman, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner Cc: Mateusz Guzik, linux-kernel A task can block a signal, accumulate up to RLIMIT_SIGPENDING sigqueues, and exit. In this case __exit_signal()->flush_sigqueue() called with irqs disabled can triger a hard lockup, see https://lore.kernel.org/all/20190322114917.GC28876@redhat.com/ Fortunately, after the recent posixtimer changes sys_timer_delete() paths no longer try to clear SIGQUEUE_PREALLOC and/or free tmr->sigq, and after the exiting task passes __exit_signal() lock_task_sighand() can't succeed and pid_task(tmr->it_pid) will return NULL. This means that after __exit_signal(tsk) nobody can play with tsk->pending or (if group_dead) with tsk->signal->shared_pending, so release_task() can safely call flush_sigqueue() after write_unlock_irq(&tasklist_lock). Also, kill clear_tsk_thread_flag(TIF_SIGPENDING), it was never needed. TODO: - we can probably shift posix_cpu_timers_exit() as well - do_sigaction() can hit the similar problem Signed-off-by: Oleg Nesterov <oleg@redhat.com> --- kernel/exit.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/kernel/exit.c b/kernel/exit.c index 3485e5fc499e..bc2c24ea4181 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -200,20 +200,12 @@ static void __exit_signal(struct task_struct *tsk) __unhash_process(tsk, group_dead); write_sequnlock(&sig->stats_lock); - /* - * Do this under ->siglock, we can race with another thread - * doing sigqueue_free() if we have SIGQUEUE_PREALLOC signals. - */ - flush_sigqueue(&tsk->pending); tsk->sighand = NULL; spin_unlock(&sighand->siglock); __cleanup_sighand(sighand); - clear_tsk_thread_flag(tsk, TIF_SIGPENDING); - if (group_dead) { - flush_sigqueue(&sig->shared_pending); + if (group_dead) tty_kref_put(tty); - } } static void delayed_put_task_struct(struct rcu_head *rhp) @@ -279,6 +271,11 @@ void release_task(struct task_struct *p) proc_flush_pid(thread_pid); put_pid(thread_pid); release_thread(p); + + flush_sigqueue(&p->pending); + if (thread_group_leader(p)) + flush_sigqueue(&p->signal->shared_pending); + put_task_struct_rcu_user(p); p = leader; -- 2.25.1.362.g51ebf55 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless 2025-02-05 17:51 ` [PATCH 1/1] " Oleg Nesterov @ 2025-02-05 22:18 ` Frederic Weisbecker 2025-02-06 12:14 ` Oleg Nesterov 2025-02-06 15:57 ` Thomas Gleixner 1 sibling, 1 reply; 8+ messages in thread From: Frederic Weisbecker @ 2025-02-05 22:18 UTC (permalink / raw) To: Oleg Nesterov Cc: Andrew Morton, Eric W. Biederman, Peter Zijlstra, Thomas Gleixner, Mateusz Guzik, linux-kernel Le Wed, Feb 05, 2025 at 06:51:59PM +0100, Oleg Nesterov a écrit : > A task can block a signal, accumulate up to RLIMIT_SIGPENDING sigqueues, > and exit. In this case __exit_signal()->flush_sigqueue() called with irqs > disabled can triger a hard lockup, see > https://lore.kernel.org/all/20190322114917.GC28876@redhat.com/ > > Fortunately, after the recent posixtimer changes sys_timer_delete() paths > no longer try to clear SIGQUEUE_PREALLOC and/or free tmr->sigq, and after > the exiting task passes __exit_signal() lock_task_sighand() can't succeed > and pid_task(tmr->it_pid) will return NULL. > > This means that after __exit_signal(tsk) nobody can play with tsk->pending > or (if group_dead) with tsk->signal->shared_pending, so release_task() can > safely call flush_sigqueue() after write_unlock_irq(&tasklist_lock). > > Also, kill clear_tsk_thread_flag(TIF_SIGPENDING), it was never needed. > > TODO: > - we can probably shift posix_cpu_timers_exit() as well > - do_sigaction() can hit the similar problem > > Signed-off-by: Oleg Nesterov <oleg@redhat.com> > --- > kernel/exit.c | 15 ++++++--------- > 1 file changed, 6 insertions(+), 9 deletions(-) > > diff --git a/kernel/exit.c b/kernel/exit.c > index 3485e5fc499e..bc2c24ea4181 100644 > --- a/kernel/exit.c > +++ b/kernel/exit.c > @@ -200,20 +200,12 @@ static void __exit_signal(struct task_struct *tsk) > __unhash_process(tsk, group_dead); > write_sequnlock(&sig->stats_lock); > > - /* > - * Do this under ->siglock, we can race with another thread > - * doing sigqueue_free() if we have SIGQUEUE_PREALLOC signals. > - */ > - flush_sigqueue(&tsk->pending); > tsk->sighand = NULL; > spin_unlock(&sighand->siglock); > > __cleanup_sighand(sighand); > - clear_tsk_thread_flag(tsk, TIF_SIGPENDING); Looks good to me, except for this TIF_SIGPENDING removal which I'm less sure about, I see a lot of places where it is added/removed. Well it's probably only checked locally on entry code. Would it make sense to move this chunk to a separate preceding patch? Or keep it here but at least explain on the changelog why it is safe to remove it? Thanks! > - if (group_dead) { > - flush_sigqueue(&sig->shared_pending); > + if (group_dead) > tty_kref_put(tty); > - } > } > > static void delayed_put_task_struct(struct rcu_head *rhp) > @@ -279,6 +271,11 @@ void release_task(struct task_struct *p) > proc_flush_pid(thread_pid); > put_pid(thread_pid); > release_thread(p); > + > + flush_sigqueue(&p->pending); > + if (thread_group_leader(p)) > + flush_sigqueue(&p->signal->shared_pending); > + > put_task_struct_rcu_user(p); > > p = leader; > -- > 2.25.1.362.g51ebf55 > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless 2025-02-05 22:18 ` Frederic Weisbecker @ 2025-02-06 12:14 ` Oleg Nesterov 0 siblings, 0 replies; 8+ messages in thread From: Oleg Nesterov @ 2025-02-06 12:14 UTC (permalink / raw) To: Frederic Weisbecker Cc: Andrew Morton, Eric W. Biederman, Peter Zijlstra, Thomas Gleixner, Mateusz Guzik, linux-kernel On 02/05, Frederic Weisbecker wrote: > > Looks good to me, except for this TIF_SIGPENDING removal which I'm less > sure about, I see a lot of places where it is added/removed. Well it's > probably only checked locally on entry code. Would it make sense to move > this chunk to a separate preceding patch? OK, will do. Thanks! Oleg. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless 2025-02-05 17:51 ` [PATCH 1/1] " Oleg Nesterov 2025-02-05 22:18 ` Frederic Weisbecker @ 2025-02-06 15:57 ` Thomas Gleixner 2025-02-06 16:23 ` Oleg Nesterov 1 sibling, 1 reply; 8+ messages in thread From: Thomas Gleixner @ 2025-02-06 15:57 UTC (permalink / raw) To: Oleg Nesterov, Andrew Morton, Eric W. Biederman, Frederic Weisbecker, Peter Zijlstra Cc: Mateusz Guzik, linux-kernel On Wed, Feb 05 2025 at 18:51, Oleg Nesterov wrote: > A task can block a signal, accumulate up to RLIMIT_SIGPENDING sigqueues, > and exit. In this case __exit_signal()->flush_sigqueue() called with irqs > disabled can triger a hard lockup, see > https://lore.kernel.org/all/20190322114917.GC28876@redhat.com/ > > Fortunately, after the recent posixtimer changes sys_timer_delete() paths > no longer try to clear SIGQUEUE_PREALLOC and/or free tmr->sigq, and after > the exiting task passes __exit_signal() lock_task_sighand() can't succeed > and pid_task(tmr->it_pid) will return NULL. > > This means that after __exit_signal(tsk) nobody can play with tsk->pending > or (if group_dead) with tsk->signal->shared_pending, so release_task() can > safely call flush_sigqueue() after write_unlock_irq(&tasklist_lock). I can't find a problem with that. > Also, kill clear_tsk_thread_flag(TIF_SIGPENDING), it was never needed. I'm not entirely sure about that, but it does not hurt to clear it, right? > TODO: > - we can probably shift posix_cpu_timers_exit() as well I think so. > - do_sigaction() can hit the similar problem Indeed, but that's a tough on to solve. Thanks, tglx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless 2025-02-06 15:57 ` Thomas Gleixner @ 2025-02-06 16:23 ` Oleg Nesterov 2025-02-06 21:04 ` Thomas Gleixner 0 siblings, 1 reply; 8+ messages in thread From: Oleg Nesterov @ 2025-02-06 16:23 UTC (permalink / raw) To: Thomas Gleixner Cc: Andrew Morton, Eric W. Biederman, Frederic Weisbecker, Peter Zijlstra, Mateusz Guzik, linux-kernel On 02/06, Thomas Gleixner wrote: > > On Wed, Feb 05 2025 at 18:51, Oleg Nesterov wrote: > > A task can block a signal, accumulate up to RLIMIT_SIGPENDING sigqueues, > > and exit. In this case __exit_signal()->flush_sigqueue() called with irqs > > disabled can triger a hard lockup, see > > https://lore.kernel.org/all/20190322114917.GC28876@redhat.com/ > > > > Fortunately, after the recent posixtimer changes sys_timer_delete() paths > > no longer try to clear SIGQUEUE_PREALLOC and/or free tmr->sigq, and after > > the exiting task passes __exit_signal() lock_task_sighand() can't succeed > > and pid_task(tmr->it_pid) will return NULL. > > > > This means that after __exit_signal(tsk) nobody can play with tsk->pending > > or (if group_dead) with tsk->signal->shared_pending, so release_task() can > > safely call flush_sigqueue() after write_unlock_irq(&tasklist_lock). > > I can't find a problem with that. Ah, good. > > Also, kill clear_tsk_thread_flag(TIF_SIGPENDING), it was never needed. > > I'm not entirely sure about that, but it does not hurt to clear it, > right? Please see v2 which documents this change in a separate patch. Again, it is not that it is really bad, just it looks very confusing to me and I think it can confuse other readers of this code. > > - do_sigaction() can hit the similar problem > > Indeed, but that's a tough on to solve. Yeah... Although I have to admit that yesterday I had a very simple (and wrong) solution in mind ;) Thanks! Oleg. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] exit: change the release_task() paths to call flush_sigqueue() lockless 2025-02-06 16:23 ` Oleg Nesterov @ 2025-02-06 21:04 ` Thomas Gleixner 0 siblings, 0 replies; 8+ messages in thread From: Thomas Gleixner @ 2025-02-06 21:04 UTC (permalink / raw) To: Oleg Nesterov Cc: Andrew Morton, Eric W. Biederman, Frederic Weisbecker, Peter Zijlstra, Mateusz Guzik, linux-kernel Oleg! On Thu, Feb 06 2025 at 17:23, Oleg Nesterov wrote: > On 02/06, Thomas Gleixner wrote: >> > - do_sigaction() can hit the similar problem >> >> Indeed, but that's a tough on to solve. > > Yeah... Although I have to admit that yesterday I had a very simple > (and wrong) solution in mind ;) I can relate. My initial "brilliant" idea to solve that turned out to be broken after less than 10 seconds... :) That needs some more thought to the whole signal handling business and unfortunately the recent posix-timer changes do not help much in that context. Let me think about it some more. Thanks, tglx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 0/1] exit: change the release_task() paths to call flush_sigqueue() lockless 2025-02-05 17:51 [PATCH 0/1] exit: change the release_task() paths to call flush_sigqueue() lockless Oleg Nesterov 2025-02-05 17:51 ` [PATCH 1/1] " Oleg Nesterov @ 2025-02-05 17:53 ` Mateusz Guzik 1 sibling, 0 replies; 8+ messages in thread From: Mateusz Guzik @ 2025-02-05 17:53 UTC (permalink / raw) To: Oleg Nesterov Cc: Andrew Morton, Eric W. Biederman, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner, linux-kernel On Wed, Feb 5, 2025 at 6:52 PM Oleg Nesterov <oleg@redhat.com> wrote: > > Thomas, Frederic, could you review? > > Untested, and I'll recheck this patch tomorrow, perhaps I missed something... > > Mateusz, this patch has minor conflicts with your > > [PATCH v3 0/6] reduce tasklist_lock hold time on exit and do some > > WIP changes. I can wait for v4 from you and rebase on top, but I'd like > to show this patch right now to simplify the review. > I'm still waiting for feedback from Eric. I'll rebase on whatever patch which lands. -- Mateusz Guzik <mjguzik gmail.com> ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-02-06 21:04 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-02-05 17:51 [PATCH 0/1] exit: change the release_task() paths to call flush_sigqueue() lockless Oleg Nesterov 2025-02-05 17:51 ` [PATCH 1/1] " Oleg Nesterov 2025-02-05 22:18 ` Frederic Weisbecker 2025-02-06 12:14 ` Oleg Nesterov 2025-02-06 15:57 ` Thomas Gleixner 2025-02-06 16:23 ` Oleg Nesterov 2025-02-06 21:04 ` Thomas Gleixner 2025-02-05 17:53 ` [PATCH 0/1] " Mateusz Guzik
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.