public inbox for linux-kernel@vger.kernel.org
From: Tejun Heo <tj@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: David Vernet <void@manifault.com>,
	linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH sched_ext/for-6.12] sched_ext: Temporarily work around pick_task_scx() being called without balance_scx()
Date: Fri, 6 Sep 2024 08:16:32 -1000
Message-ID: <ZttHAAcXOGimyBuc@slm.duckdns.org>
In-Reply-To: <20240906090420.GU4723@noisy.programming.kicks-ass.net>

pick_task_scx() must be preceded by balance_scx(), but there is currently a
bug where fair can say yes on balance() and then no on pick_task(), which
ends up calling pick_task_scx() without a preceding balance_scx(). Work
around this by dropping the WARN_ON_ONCE() and ignoring the cases that
don't make sense, i.e. where @prev is not on SCX.

This isn't great and can theoretically lead to stalls. However, for
switch_all cases, this happens only while a BPF scheduler is being loaded or
unloaded, and, for partial cases, fair will likely keep triggering this CPU.

This will be reverted once the fair behavior is fixed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
---
Applying a simpler workaround to sched_ext/for-6.12 for the time being.

Thanks.

 kernel/sched/ext.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -2909,9 +2909,24 @@ static struct task_struct *pick_task_scx
 	 * If balance_scx() is telling us to keep running @prev, replenish slice
 	 * if necessary and keep running @prev. Otherwise, pop the first one
 	 * from the local DSQ.
+	 *
+	 * WORKAROUND:
+	 *
+	 * %SCX_RQ_BAL_KEEP should be set iff @prev is on SCX as it must just
+	 * have gone through balance_scx(). Unfortunately, there currently is a
+	 * bug where fair could say yes on balance() but no on pick_task(),
+	 * which then ends up calling pick_task_scx() without preceding
+	 * balance_scx().
+	 *
+	 * For now, ignore cases where @prev is not on SCX. This isn't great and
+	 * can theoretically lead to stalls. However, for switch_all cases, this
+	 * happens only while a BPF scheduler is being loaded or unloaded, and,
+	 * for partial cases, fair will likely keep triggering this CPU.
+	 *
+	 * Once fair is fixed, restore WARN_ON_ONCE().
 	 */
 	if ((rq->scx.flags & SCX_RQ_BAL_KEEP) &&
-	    !WARN_ON_ONCE(prev->sched_class != &ext_sched_class)) {
+	    prev->sched_class == &ext_sched_class) {
 		p = prev;
 		if (!p->scx.slice)
 			p->scx.slice = SCX_SLICE_DFL;
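
For readers who want to poke at the changed condition outside the kernel,
here is a minimal userspace sketch of it. Everything prefixed with model_ or
MODEL_ is a hypothetical stand-in invented for illustration and is not a
kernel definition, the slice value is arbitrary, and the local DSQ is reduced
to a single pointer that is returned without being popped. It only shows that
a KEEP flag seen together with a non-SCX @prev (presumably left over because
balance_scx() did not run) is now ignored instead of warned about.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are the kernel's types. */
#define MODEL_RQ_BAL_KEEP (1u << 0)	/* models SCX_RQ_BAL_KEEP */
#define MODEL_SLICE_DFL   20u		/* arbitrary stand-in for SCX_SLICE_DFL */

struct model_class { const char *name; };

static const struct model_class model_fair_class = { "fair" };
static const struct model_class model_ext_class  = { "ext" };

struct model_task {
	const struct model_class *sched_class;
	unsigned int slice;
};

struct model_rq {
	unsigned int flags;
	struct model_task *local_first;	/* first task on the local DSQ, or NULL */
};

/*
 * Models only the condition touched by the patch: keep running @prev when
 * the KEEP flag is set AND @prev is on the ext class. A KEEP flag seen with
 * a non-ext @prev is silently ignored rather than triggering a warning.
 */
static struct model_task *model_pick_task(struct model_rq *rq,
					  struct model_task *prev)
{
	assert(rq != NULL && prev != NULL);

	if ((rq->flags & MODEL_RQ_BAL_KEEP) &&
	    prev->sched_class == &model_ext_class) {
		/* Replenish an exhausted slice before continuing with @prev. */
		if (!prev->slice)
			prev->slice = MODEL_SLICE_DFL;
		return prev;
	}

	/* Simplification: the real code pops the task from the local DSQ. */
	return rq->local_first;
}
```

Calling model_pick_task() with MODEL_RQ_BAL_KEEP set and a @prev whose class
is model_fair_class falls through to rq->local_first, which may be NULL. That
NULL return is the model's version of the theoretical stall mentioned in the
description.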

Thread overview:
2024-09-05  0:47 [PATCH sched_ext/for-6.12] sched_ext: Handle cases where pick_task_scx() is called without preceding balance_scx() Tejun Heo
2024-09-05  9:28 ` Peter Zijlstra
2024-09-05 15:00   ` Peter Zijlstra
2024-09-05 16:41     ` Tejun Heo
2024-09-06  1:17       ` Tejun Heo
2024-09-06  2:08         ` Tejun Heo
2024-09-06  9:04         ` Peter Zijlstra
2024-09-06 18:16           ` Tejun Heo [this message]
2024-11-09 20:17           ` Tejun Heo
2024-11-09 20:50 ` [PATCH sched_ext/for-6.12-fixes v2] " Tejun Heo
