From: Oleg Nesterov <oleg@redhat.com>
To: Matt Fleming <mfleming@cloudflare.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
John Stultz <jstultz@google.com>,
kernel-team <kernel-team@cloudflare.com>,
LKML <linux-kernel@vger.kernel.org>,
Chris Arges <carges@cloudflare.com>
Subject: Re: Debugging lost task in wait_task_inactive() when delivering signal (6.12)
Date: Sun, 21 Sep 2025 21:27:01 +0200
Message-ID: <20250921192700.GA565@redhat.com>
In-Reply-To: <CAGis_TWHJva-gktrsvO9=m5mEFf4zzcN=rNEt+5+moqz=C7AEQ@mail.gmail.com>
Thanks Matt!
So I guess that this has nothing to do with coredump and wait_task_inactive()
is broken...
I am wondering if this code

	/*
	 * If task is sched_delayed, force dequeue it, to avoid always
	 * hitting the tick timeout in the queued case
	 */
	if (p->se.sched_delayed)
		dequeue_task(rq, p, DEQUEUE_SLEEP | DEQUEUE_DELAYED);

is actually correct, but I know nothing about the sched_delayed logic.
I will leave this to scheduler experts ;) I can't really help.
Oleg.
On 09/20, Matt Fleming wrote:
>
> On Fri, 19 Sept 2025 at 17:15, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > OK, thanks. Nothing "interesting" at first glance.
>
> Chris (Cc'd) and I managed to get a reproducer and I think I know
> what's happening now.
>
> When a task A gets the SIGKILL from whichever thread is handling the
> coredump (let's say task B) it might hit the delayed dequeue path in
> schedule() and call set_delayed(), e.g.
>
> dequeue_entity+1263
> dequeue_entities+216
> dequeue_task_fair+224
> __schedule+468
> schedule+39
> do_exit+221
> do_group_exit+48
> get_signal+2078
> arch_do_signal_or_restart+46
> irqentry_exit_to_user_mode+132
> asm_sysvec_apic_timer_interrupt+26
>
> At this point task A has ->on_rq=1, ->se.sched_delayed=1 and ->se.on_rq=1.
>
> Now when task B calls into wait_task_inactive(), it sees
> ->se.sched_delayed=1 and calls dequeue_task().
>
> At this point task A has ->on_rq=1, ->se.sched_delayed=0 and ->se.on_rq=0
>
> Unfortunately, task B still thinks that task A is scheduled because
> task_on_rq_queued(A) is true, but it's not runnable and will never run
> because it's no longer in the fair rbtree and the only task that will
> enqueue it again is task B once it leaves wait_task_inactive() and
> hits coredump_finish().
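
[Illustrative note, not part of the original exchange: the state
sequence described above can be sketched as a toy user-space model.
This is not kernel code. The toy_* names are hypothetical, the structs
only mirror the three fields named in the thread, and the queued check
is assumed to look at ->on_rq alone, as the description implies.]

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the fields named in the thread; not kernel types. */
struct toy_se {
	int on_rq;		/* models p->se.on_rq */
	int sched_delayed;	/* models p->se.sched_delayed */
};

struct toy_task {
	int on_rq;		/* models p->on_rq */
	struct toy_se se;
};

/* State reported after task A hits the delayed dequeue path in schedule(). */
void toy_delay_dequeue(struct toy_task *p)
{
	p->on_rq = 1;
	p->se.on_rq = 1;
	p->se.sched_delayed = 1;
}

/*
 * Effect of the forced dequeue as described for wait_task_inactive():
 * the entity fields are cleared, p->on_rq is left untouched.
 */
void toy_force_dequeue(struct toy_task *p)
{
	assert(p->se.sched_delayed);
	p->se.sched_delayed = 0;
	p->se.on_rq = 0;
}

/* Assumed model of task_on_rq_queued(): consults only p->on_rq. */
bool toy_task_on_rq_queued(const struct toy_task *p)
{
	return p->on_rq == 1;
}

/* "Lost": reported as queued while the entity is off the runqueue. */
bool toy_task_is_lost(const struct toy_task *p)
{
	return toy_task_on_rq_queued(p) && !p->se.on_rq;
}
```

[Calling toy_delay_dequeue() and then toy_force_dequeue() on a zeroed
toy_task leaves it in the state where toy_task_is_lost() returns true,
which is the inconsistency the waiter would keep observing.]
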
>
> > > do_exit+0xdd is here in coredump_task_wait():
> > >
> > > for (;;) {
> > > set_current_state(TASK_IDLE|TASK_FREEZABLE);
> > > if (!self.task) /* see coredump_finish() */
> > > break;
> > > schedule();
> > > }
> > >
> > > i.e. the task calls schedule() and never comes back.
> >
> > Are you sure it never comes back and doesn't loop?
>
> Yeah, positive:
>
> $ sudo perf stat -e cycles -t 1546531 -- sleep 30
>
> Performance counter stats for thread id '1546531':
>
> <not counted> cycles
>
> 30.001671072 seconds time elapsed
>