From: Chen Yu <yu.c.chen@intel.com>
To: Valentin Schneider <vschneid@redhat.com>
Cc: <paulmck@kernel.org>, Peter Zijlstra <peterz@infradead.org>,
<linux-kernel@vger.kernel.org>, <sfr@canb.auug.org.au>,
<linux-next@vger.kernel.org>, <kernel-team@meta.com>
Subject: Re: [BUG almost bisected] Splat in dequeue_rt_stack() and build error
Date: Wed, 28 Aug 2024 21:44:08 +0800
Message-ID: <Zs8pqJjIYOFuPDiH@chenyu5-mobl2>
In-Reply-To: <xhsmha5gwome6.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
Hi,
On 2024-08-28 at 14:35:45 +0200, Valentin Schneider wrote:
> On 27/08/24 13:36, Paul E. McKenney wrote:
> > On Tue, Aug 27, 2024 at 10:30:24PM +0200, Valentin Schneider wrote:
> >> On 27/08/24 11:35, Paul E. McKenney wrote:
> >> > On Tue, Aug 27, 2024 at 10:33:13AM -0700, Paul E. McKenney wrote:
> >> >> On Tue, Aug 27, 2024 at 05:41:52PM +0200, Valentin Schneider wrote:
> >> >> > I've taken tip/sched/core and shuffled hunks around; I didn't re-order any
> >> >> > commit. I've also taken out the dequeue from switched_from_fair() and put
> >> >> > it at the very top of the branch which should hopefully help bisection.
> >> >> >
> >> >> > The final delta between that branch and tip/sched/core is empty, so it
> >> >> > really is just shuffling inbetween commits.
> >> >> >
> >> >> > Please find the branch at:
> >> >> >
> >> >> > https://gitlab.com/vschneid/linux.git -b mainline/sched/eevdf-complete-builderr
> >> >> >
> >> >> > I'll go stare at the BUG itself now.
> >> >>
> >> >> Thank you!
> >> >>
> >> >> I have fired up tests on the "BROKEN?" commit. If that fails, I will
> >> >> try its predecessor, and if that fails, I will bisect from e28b5f8bda01
> >> >> ("sched/fair: Assert {set_next,put_prev}_entity() are properly balanced"),
> >> >> which has stood up to heavy hammering in earlier testing.
> >> >
> >> > And 50 runs of TREE03 on the "BROKEN?" commit resulted in 32 failures.
> >> > Of these, 29 were the dequeue_rt_stack() failure. Two more were RCU
> >> > CPU stall warnings, and the last one was an oddball "kernel BUG at
> >> > kernel/sched/rt.c:1714" followed by an equally oddball "Oops: invalid
> >> > opcode: 0000 [#1] PREEMPT SMP PTI".
> >> >
> >> > Just to be specific, this is commit:
> >> >
> >> > df8fe34bfa36 ("BROKEN? sched/fair: Dequeue sched_delayed tasks when switching from fair")
> >> >
> >> > This commit's predecessor is this commit:
> >> >
> >> > 2f888533d073 ("sched/eevdf: Propagate min_slice up the cgroup hierarchy")
> >> >
> >> > This predecessor commit passes 50 runs of TREE03 with no failures.
> >> >
> >> > So that addition of that dequeue_task() call to the switched_from_fair()
> >> > function is looking quite suspicious to me. ;-)
> >> >
> >> > Thanx, Paul
> >>
> >> Thanks for the testing!
> >>
> >> The WARN_ON_ONCE(!rt_se->on_list); hit in __dequeue_rt_entity() feels like
> >> a put_prev/set_next kind of issue...
> >>
> >> So far I'd assumed a ->sched_delayed task can't be current during
> >> switched_from_fair(), I got confused because it's Mond^CCC Tuesday, but I
> >> think that still holds: we can't get a balance_dl() or balance_rt() to drop
> >> the RQ lock because prev would be fair, and we can't get a
> >> newidle_balance() with a ->sched_delayed task because we'd have
> >> sched_fair_runnable() := true.
> >>
> >> I'll pick this back up tomorrow, this is a task that requires either
> >> caffeine or booze and it's too late for either.
> >
> > Thank you for chasing this, and get some sleep! This one is of course
> > annoying, but it is not (yet) an emergency. I look forward to seeing
> > what you come up with.
> >
> > Also, I would of course be happy to apply debug patches.
> >
> > Thanx, Paul
>
> Chen Yu made me realize [1] that dequeue_task() really isn't enough; the
> dequeue_task() in e.g. __sched_setscheduler() won't have DEQUEUE_DELAYED,
> so stuff will just be left on the CFS tree.
>
One question: although there is no DEQUEUE_DELAYED flag, isn't it still
possible for the delayed task to be dequeued from the CFS tree? The dequeue
in __sched_setscheduler() does not have DEQUEUE_SLEEP, and in dequeue_entity():
	bool sleep = flags & DEQUEUE_SLEEP;

	if (flags & DEQUEUE_DELAYED) {
	} else {
		bool delay = sleep;

		if (sched_feat(DELAY_DEQUEUE) && delay && // false
		    !entity_eligible(cfs_rq, se)) {
			// do not dequeue
		}
	}
	// dequeue the task <---- we should reach here?
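To make the question concrete, here is a minimal stand-alone model of that
branch. It is only a sketch, not the kernel code: the MODEL_* flag values,
the function name model_delays_dequeue(), and the two boolean parameters
standing in for sched_feat(DELAY_DEQUEUE) and entity_eligible() are all
made up for illustration.

```c
#include <stdbool.h>

/* Illustrative flag bits; assumed values, not copied from the kernel headers. */
#define MODEL_DEQUEUE_SLEEP   0x01
#define MODEL_DEQUEUE_DELAYED 0x02

/*
 * Hypothetical model of the branch quoted above.
 * Returns true when the dequeue would be deferred (entity stays on the tree),
 * false when the code falls through to the real dequeue.
 */
static bool model_delays_dequeue(int flags, bool feat_delay_dequeue,
				 bool eligible)
{
	bool sleep = flags & MODEL_DEQUEUE_SLEEP;

	/* DELAYED set: the deferral check is skipped entirely. */
	if (flags & MODEL_DEQUEUE_DELAYED)
		return false;

	bool delay = sleep;

	/* Deferral needs the feature, a sleep dequeue, and an ineligible entity. */
	if (feat_delay_dequeue && delay && !eligible)
		return true;

	return false;
}
```

With neither bit set, e.g. model_delays_dequeue(0, true, false), the model
returns false: it falls through to the dequeue, which is the case the
question is about.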
thanks,
Chenyu
> Worse, what we need here is the __block_task() like we have at the end of
> dequeue_entities(), otherwise p stays ->on_rq and that's borked - AFAICT
> that explains the splat you're getting, because affine_move_task() ends up
> doing a move_queued_task() for what really is a dequeued task.
>
> I unfortunately couldn't reproduce the issue locally using your TREE03
> invocation. I've pushed a new patch on top of my branch, would you mind
> giving it a spin? It's a bit sketchy but should at least be going in the
> right direction...
>
> [1]: http://lore.kernel.org/r/Zs2d2aaC/zSyR94v@chenyu5-mobl2
>