public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Vineeth Remanan Pillai <vineeth@bitbyteword.org>
Cc: Joel Fernandes <joel@joelfernandes.org>,
	Ilya Maximets <i.maximets@ovn.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	vineethrp@google.com, shraash@google.com,
	marcel.ziswiler@codethink.co.uk
Subject: Re: [v6.12] WARNING: at kernel/sched/deadline.c:1995 enqueue_dl_entity (task blocked for more than 28262 seconds)
Date: Mon, 9 Dec 2024 15:01:08 +0100
Message-ID: <20241209140108.GL8562@noisy.programming.kicks-ass.net>
In-Reply-To: <CAO7JXPj6_VF+T1ykwQsCmBjVhHQdpX0wJssPPRYOazJcciCCyA@mail.gmail.com>

On Mon, Dec 09, 2024 at 08:56:43AM -0500, Vineeth Remanan Pillai wrote:

> > So the scenario I had in mind was that we were doing something like:
> >
> >         current->state = TASK_INTERRUPTIBLE;
> >         schedule();
> >           deactivate_task()
> >             dl_server_stop();
> >           pick_next_task()
> >             pick_next_task_fair()
> >               sched_balance_newidle()
> >                 rq_unlock(this_rq)
> >
> > at which point another CPU can take our RQ-lock and do:
> >
> >         try_to_wake_up()
> >           ttwu_queue()
> >             rq_lock()
> >             ...
> >             activate_task()
> >               dl_server_start()
> >             wakeup_preempt() := check_preempt_wakeup_fair()
> >               update_curr()
> >                 update_curr_task()
> >                   if (current->dl_server)
> >                     dl_server_update()
> >                       enqueue_dl_entity()
> >
> >
> > Which then also goes *bang*. The above can't happen if we clear
> > current->dl_server in dl_server_stop().
> >
> I also thought this could be a possibility but the previous deactivate
> for this task would have cleared the dl_server no? 

That gets cleared in put_prev_set_next_task(), which gets called *after*
pick_next_task() completes. So until that time, current will have
dl_server set.
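
To make the ordering concrete, here is a toy userspace model of that
window. It is only a sketch: the toy_* types and functions are
hypothetical stand-ins, not the kernel's code, and the model only tracks
whether the dl_server_update() step is reached through a pointer that
has not been cleared yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are not the kernel's types. */
struct toy_server {
	int updates;                    /* times the dl_server_update() step ran */
};

struct toy_task {
	struct toy_server *dl_server;   /* models current->dl_server */
};

/* Models the update_curr_task() check: if (current->dl_server) ... */
static void toy_update_curr_task(struct toy_task *curr)
{
	assert(curr != NULL);
	if (curr->dl_server)
		curr->dl_server->updates++;
}

/* Models put_prev_set_next_task(): where the pointer gets cleared. */
static void toy_put_prev_set_next_task(struct toy_task *prev)
{
	prev->dl_server = NULL;
}

/*
 * Replays the ordering once. The remote wakeup either lands inside the
 * window (pick_next_task() has dropped the rq lock, the pointer is still
 * set) or after put_prev_set_next_task() has run.
 * Returns how many times the update step was reached.
 */
int toy_replay(bool wakeup_in_window)
{
	struct toy_server srv = { .updates = 0 };
	struct toy_task curr = { .dl_server = &srv };

	/* deactivate_task() stops the server here; the pointer is untouched. */

	if (wakeup_in_window)
		toy_update_curr_task(&curr);    /* remote try_to_wake_up() path */

	toy_put_prev_set_next_task(&curr);      /* runs after the pick completes */

	if (!wakeup_in_window)
		toy_update_curr_task(&curr);

	return srv.updates;
}
```

In this model toy_replay(true) returns 1 (the update step is reached
while the pointer is still set) and toy_replay(false) returns 0.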

> Soon after this in
> update_curr() we again call dl_server_update if p->dl_server !=
> rq->fair_server and this is also another possibility of a double
> enqueue.

Right, there are a few possible paths there; I've not fully mapped them. But
I think clearing ->dl_server in dl_server_stop() is the cleanest option
for this.
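
Roughly, the idea looks like the sketch below. This is a userspace toy
under stated assumptions, not the actual patch: the sketch_* names are
hypothetical, and the stop step here does nothing except drop the
current task's reference to the server being stopped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct sketch_server {
	int updates;                       /* times the update step ran */
};

struct sketch_task {
	struct sketch_server *dl_server;   /* models current->dl_server */
};

struct sketch_rq {
	struct sketch_task *curr;
	struct sketch_server fair_server;
};

/* Assumed variant of the stop step: also clear curr's pointer to srv. */
void sketch_server_stop(struct sketch_rq *rq, struct sketch_server *srv)
{
	assert(rq != NULL && srv != NULL);
	if (rq->curr && rq->curr->dl_server == srv)
		rq->curr->dl_server = NULL;
}

/* Models update_curr_task(): only charge a server the task still points at. */
void sketch_update_curr_task(struct sketch_rq *rq)
{
	struct sketch_task *p = rq->curr;

	if (p && p->dl_server)
		p->dl_server->updates++;
}

/* Stop the server, then let a remote wakeup run the update step. */
int sketch_updates_after_stop(void)
{
	struct sketch_rq rq = { .curr = NULL, .fair_server = { .updates = 0 } };
	struct sketch_task curr = { .dl_server = &rq.fair_server };

	rq.curr = &curr;
	sketch_server_stop(&rq, &rq.fair_server);   /* pointer cleared here */
	sketch_update_curr_task(&rq);               /* wakeup in the window */

	return rq.fair_server.updates;
}
```

With the pointer cleared at stop time, sketch_updates_after_stop()
returns 0: the update step no longer reaches the stopped server through
the current task.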


> This should work as well. I was planning to send a second patch with
> the dl_server active flag as it was not strictly the root cause of
> this. But the active flag serves the purpose here and this change
> looks good to me :-). I will test this on my end and let you know. It
> takes more than 12 hours to reproduce in my test case ;-)

Urgh... Thanks!

