public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	John Stultz <jstultz@google.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	Christian Loehle <christian.loehle@arm.com>,
	Koba Ko <kobak@nvidia.com>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 05/10] sched_ext: Fix TOCTOU race in consume_remote_task()
Date: Wed,  6 May 2026 19:45:45 +0200	[thread overview]
Message-ID: <20260506174639.535232-6-arighi@nvidia.com> (raw)
In-Reply-To: <20260506174639.535232-1-arighi@nvidia.com>

When pulling a task from a non-local DSQ, consume_dispatch_q() checks if
the task can run on the destination rq via task_can_run_on_remote_rq().
However, it then drops the destination rq lock and locks the source rq
in consume_remote_task() -> unlink_dsq_and_lock_src_rq(). During this
window, the task might have become migration disabled, making it invalid
to migrate it to the destination rq.

Fix this by re-evaluating task_can_run_on_remote_rq() in
consume_remote_task() after the source rq is locked. If the task can no
longer be migrated, we clear its DSQ association, reset the holding CPU,
and enqueue it to the source rq's local DSQ instead.

Signed-off-by: Andrea Righi <arighi@nvidia.com>
---
 kernel/sched/ext.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index d64b1283fa851..a70f8693b906f 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -2418,13 +2418,24 @@ static bool unlink_dsq_and_lock_src_rq(struct task_struct *p,
 		!WARN_ON_ONCE(src_rq != task_rq(p));
 }
 
-static bool consume_remote_task(struct rq *this_rq,
+static bool consume_remote_task(struct scx_sched *sch, struct rq *this_rq,
 				struct task_struct *p, u64 enq_flags,
 				struct scx_dispatch_q *dsq, struct rq *src_rq)
 {
 	raw_spin_rq_unlock(this_rq);
 
 	if (unlink_dsq_and_lock_src_rq(p, dsq, src_rq)) {
+		if (unlikely(!task_can_run_on_remote_rq(sch, p, this_rq, true))) {
+			p->scx.dsq = NULL;
+			p->scx.holding_cpu = -1;
+			dispatch_enqueue(sch, src_rq, &src_rq->scx.local_dsq, p,
+					 enq_flags | SCX_ENQ_CLEAR_OPSS);
+			if (sched_class_above(p->sched_class, src_rq->donor->sched_class))
+				resched_curr(src_rq);
+			raw_spin_rq_unlock(src_rq);
+			raw_spin_rq_lock(this_rq);
+			return false;
+		}
 		move_remote_task_to_local_dsq(p, enq_flags, src_rq, this_rq);
 		return true;
 	} else {
@@ -2541,7 +2552,7 @@ static bool consume_dispatch_q(struct scx_sched *sch, struct rq *rq,
 		}
 
 		if (task_can_run_on_remote_rq(sch, p, rq, false)) {
-			if (likely(consume_remote_task(rq, p, enq_flags, dsq, task_rq)))
+			if (likely(consume_remote_task(sch, rq, p, enq_flags, dsq, task_rq)))
 				return true;
 			goto retry;
 		}
-- 
2.54.0


  parent reply	other threads:[~2026-05-06 17:47 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-06 17:45 [RFC PATCH sched_ext/for-7.2 0/10] sched: Make proxy execution compatible with sched_ext Andrea Righi
2026-05-06 17:45 ` [PATCH 01/10] sched/core: Skip migration disabled tasks in proxy execution Andrea Righi
2026-05-06 21:09   ` John Stultz
2026-05-07  3:34     ` K Prateek Nayak
2026-05-07  6:31       ` Andrea Righi
2026-05-07  7:45         ` K Prateek Nayak
2026-05-06 17:45 ` [PATCH 02/10] sched/core: Skip put_prev_task/set_next_task re-entry for sched_ext donors Andrea Righi
2026-05-06 17:45 ` [PATCH 03/10] sched/ext: Split curr|donor references properly Andrea Righi
2026-05-06 17:45 ` [PATCH 04/10] sched/ext: Avoid migrating blocked tasks with proxy execution Andrea Righi
2026-05-06 17:45 ` Andrea Righi [this message]
2026-05-06 17:45 ` [PATCH 06/10] sched_ext: Fix ops.running/stopping() pairing for proxy-exec donors Andrea Righi
2026-05-06 17:45 ` [PATCH 07/10] sched_ext: Save/restore kf_tasks[] when task ops nest Andrea Righi
2026-05-06 17:45 ` [PATCH 08/10] sched_ext: Skip ops.runnable() when nested in SCX_CALL_OP_TASK Andrea Righi
2026-05-06 17:45 ` [PATCH 09/10] sched/core: Disable proxy-exec context switch under sched_ext by default Andrea Righi
2026-05-06 17:45 ` [PATCH 10/10] sched: Allow enabling proxy exec with sched_ext Andrea Righi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260506174639.535232-6-arighi@nvidia.com \
    --to=arighi@nvidia.com \
    --cc=bsegall@google.com \
    --cc=changwoo@igalia.com \
    --cc=christian.loehle@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=joelagnelf@nvidia.com \
    --cc=jstultz@google.com \
    --cc=juri.lelli@redhat.com \
    --cc=kobak@nvidia.com \
    --cc=kprateek.nayak@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=tj@kernel.org \
    --cc=vincent.guittot@linaro.org \
    --cc=void@manifault.com \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox