public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: void@manifault.com
Cc: kernel-team@meta.com, linux-kernel@vger.kernel.org,
	Tejun Heo <tj@kernel.org>
Subject: [PATCH 02/11] sched_ext: Refactor consume_remote_task()
Date: Fri, 30 Aug 2024 01:03:46 -1000	[thread overview]
Message-ID: <20240830110415.116090-3-tj@kernel.org> (raw)
In-Reply-To: <20240830110415.116090-1-tj@kernel.org>

The tricky p->scx.holding_cpu handling was split across
consume_remote_task() body and move_task_to_local_dsq(). Refactor such that:

- All the tricky part is now in the new unlink_dsq_and_lock_task_rq() with
  consolidated documentation.

- move_task_to_local_dsq() now implements straightforward task migration
  making it easier to use in other places.

- dispatch_to_local_dsq() is another user move_task_to_local_dsq(). The
  usage is updated accordingly. This makes the local and remote cases more
  symmetric.

No functional changes intended.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/sched/ext.c | 145 ++++++++++++++++++++++++---------------------
 1 file changed, 76 insertions(+), 69 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 5423554a11af..3facfca73337 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -2109,49 +2109,13 @@ static bool yield_to_task_scx(struct rq *rq, struct task_struct *to)
  * @src_rq: rq to move the task from, locked on entry, released on return
  * @dst_rq: rq to move the task into, locked on return
  *
- * Move @p which is currently on @src_rq to @dst_rq's local DSQ. The caller
- * must:
- *
- * 1. Start with exclusive access to @p either through its DSQ lock or
- *    %SCX_OPSS_DISPATCHING flag.
- *
- * 2. Set @p->scx.holding_cpu to raw_smp_processor_id().
- *
- * 3. Remember task_rq(@p) as @src_rq. Release the exclusive access so that we
- *    don't deadlock with dequeue.
- *
- * 4. Lock @src_rq from #3.
- *
- * 5. Call this function.
- *
- * Returns %true if @p was successfully moved. %false after racing dequeue and
- * losing. On return, @src_rq is unlocked and @dst_rq is locked.
+ * Move @p which is currently on @src_rq to @dst_rq's local DSQ.
  */
-static bool move_task_to_local_dsq(struct task_struct *p, u64 enq_flags,
+static void move_task_to_local_dsq(struct task_struct *p, u64 enq_flags,
 				   struct rq *src_rq, struct rq *dst_rq)
 {
 	lockdep_assert_rq_held(src_rq);
 
-	/*
-	 * If dequeue got to @p while we were trying to lock @src_rq, it'd have
-	 * cleared @p->scx.holding_cpu to -1. While other cpus may have updated
-	 * it to different values afterwards, as this operation can't be
-	 * preempted or recurse, @p->scx.holding_cpu can never become
-	 * raw_smp_processor_id() again before we're done. Thus, we can tell
-	 * whether we lost to dequeue by testing whether @p->scx.holding_cpu is
-	 * still raw_smp_processor_id().
-	 *
-	 * @p->rq couldn't have changed if we're still the holding cpu.
-	 *
-	 * See dispatch_dequeue() for the counterpart.
-	 */
-	if (unlikely(p->scx.holding_cpu != raw_smp_processor_id()) ||
-	    WARN_ON_ONCE(src_rq != task_rq(p))) {
-		raw_spin_rq_unlock(src_rq);
-		raw_spin_rq_lock(dst_rq);
-		return false;
-	}
-
 	/* the following marks @p MIGRATING which excludes dequeue */
 	deactivate_task(src_rq, p, 0);
 	set_task_cpu(p, cpu_of(dst_rq));
@@ -2170,8 +2134,6 @@ static bool move_task_to_local_dsq(struct task_struct *p, u64 enq_flags,
 	dst_rq->scx.extra_enq_flags = enq_flags;
 	activate_task(dst_rq, p, 0);
 	dst_rq->scx.extra_enq_flags = 0;
-
-	return true;
 }
 
 #endif	/* CONFIG_SMP */
@@ -2236,28 +2198,69 @@ static bool task_can_run_on_remote_rq(struct task_struct *p, struct rq *rq,
 	return true;
 }
 
-static bool consume_remote_task(struct rq *rq, struct scx_dispatch_q *dsq,
-				struct task_struct *p, struct rq *task_rq)
+/**
+ * unlink_dsq_and_lock_task_rq() - Unlink task from its DSQ and lock its task_rq
+ * @p: target task
+ * @dsq: locked DSQ @p is currently on
+ * @task_rq: @p's task_rq, stable with @dsq locked
+ *
+ * Called with @dsq locked but no rq's locked. We want to move @p to a different
+ * DSQ, including any local DSQ, but are not locking @task_rq. Locking @task_rq
+ * is required when transferring into a local DSQ. Even when transferring into a
+ * non-local DSQ, it's better to use the same mechanism to protect against
+ * dequeues and maintain the invariant that @p->scx.dsq can only change while
+ * @task_rq is locked, which e.g. scx_dump_task() depends on.
+ *
+ * We want to grab @task_rq but that can deadlock if we try while locking @dsq,
+ * so we want to unlink @p from @dsq, drop its lock and then lock @task_rq. As
+ * this may race with dequeue, which can't drop the rq lock or fail, do a little
+ * dancing from our side.
+ *
+ * @p->scx.holding_cpu is set to this CPU before @dsq is unlocked. If @p gets
+ * dequeued after we unlock @dsq but before locking @task_rq, the holding_cpu
+ * would be cleared to -1. While other cpus may have updated it to different
+ * values afterwards, as this operation can't be preempted or recurse, the
+ * holding_cpu can never become this CPU again before we're done. Thus, we can
+ * tell whether we lost to dequeue by testing whether the holding_cpu still
+ * points to this CPU. See dispatch_dequeue() for the counterpart.
+ *
+ * On return, @dsq is unlocked and @task_rq is locked. Returns %true if @p is
+ * still valid. %false if lost to dequeue.
+ */
+static bool unlink_dsq_and_lock_task_rq(struct task_struct *p,
+					struct scx_dispatch_q *dsq,
+					struct rq *task_rq)
 {
-	lockdep_assert_held(&dsq->lock);	/* released on return */
+	s32 cpu = raw_smp_processor_id();
+
+	lockdep_assert_held(&dsq->lock);
 
-	/*
-	 * @dsq is locked and @p is on a remote rq. @p is currently protected by
-	 * @dsq->lock. We want to pull @p to @rq but may deadlock if we grab
-	 * @task_rq while holding @dsq and @rq locks. As dequeue can't drop the
-	 * rq lock or fail, do a little dancing from our side. See
-	 * move_task_to_local_dsq().
-	 */
 	WARN_ON_ONCE(p->scx.holding_cpu >= 0);
 	task_unlink_from_dsq(p, dsq);
 	dsq_mod_nr(dsq, -1);
-	p->scx.holding_cpu = raw_smp_processor_id();
-	raw_spin_unlock(&dsq->lock);
+	p->scx.holding_cpu = cpu;
 
-	raw_spin_rq_unlock(rq);
+	raw_spin_unlock(&dsq->lock);
 	raw_spin_rq_lock(task_rq);
 
-	return move_task_to_local_dsq(p, 0, task_rq, rq);
+	/* task_rq couldn't have changed if we're still the holding cpu */
+	return likely(p->scx.holding_cpu == cpu) &&
+		!WARN_ON_ONCE(task_rq != task_rq(p));
+}
+
+static bool consume_remote_task(struct rq *this_rq, struct scx_dispatch_q *dsq,
+				struct task_struct *p, struct rq *task_rq)
+{
+	raw_spin_rq_unlock(this_rq);
+
+	if (unlink_dsq_and_lock_task_rq(p, dsq, task_rq)) {
+		move_task_to_local_dsq(p, 0, task_rq, this_rq);
+		return true;
+	} else {
+		raw_spin_rq_unlock(task_rq);
+		raw_spin_rq_lock(this_rq);
+		return false;
+	}
 }
 #else	/* CONFIG_SMP */
 static inline bool task_can_run_on_remote_rq(struct task_struct *p, struct rq *rq, bool trigger_error) { return false; }
@@ -2361,7 +2364,8 @@ dispatch_to_local_dsq(struct rq *rq, u64 dsq_id, struct task_struct *p,
 		 * As DISPATCHING guarantees that @p is wholly ours, we can
 		 * pretend that we're moving from a DSQ and use the same
 		 * mechanism - mark the task under transfer with holding_cpu,
-		 * release DISPATCHING and then follow the same protocol.
+		 * release DISPATCHING and then follow the same protocol. See
+		 * unlink_dsq_and_lock_task_rq().
 		 */
 		p->scx.holding_cpu = raw_smp_processor_id();
 
@@ -2374,28 +2378,31 @@ dispatch_to_local_dsq(struct rq *rq, u64 dsq_id, struct task_struct *p,
 			raw_spin_rq_lock(src_rq);
 		}
 
-		if (src_rq == dst_rq) {
+		/* task_rq couldn't have changed if we're still the holding cpu */
+		dsp = p->scx.holding_cpu == raw_smp_processor_id() &&
+			!WARN_ON_ONCE(src_rq != task_rq(p));
+
+		if (likely(dsp)) {
 			/*
-			 * As @p is staying on the same rq, there's no need to
+			 * If @p is staying on the same rq, there's no need to
 			 * go through the full deactivate/activate cycle.
 			 * Optimize by abbreviating the operations in
 			 * move_task_to_local_dsq().
 			 */
-			dsp = p->scx.holding_cpu == raw_smp_processor_id();
-			if (likely(dsp)) {
+			if (src_rq == dst_rq) {
 				p->scx.holding_cpu = -1;
-				dispatch_enqueue(&dst_rq->scx.local_dsq, p,
-						 enq_flags);
+				dispatch_enqueue(&dst_rq->scx.local_dsq,
+						 p, enq_flags);
+			} else {
+				move_task_to_local_dsq(p, enq_flags,
+						       src_rq, dst_rq);
 			}
-		} else {
-			dsp = move_task_to_local_dsq(p, enq_flags,
-						     src_rq, dst_rq);
-		}
 
-		/* if the destination CPU is idle, wake it up */
-		if (dsp && sched_class_above(p->sched_class,
-					     dst_rq->curr->sched_class))
-			resched_curr(dst_rq);
+			/* if the destination CPU is idle, wake it up */
+			if (sched_class_above(p->sched_class,
+					      dst_rq->curr->sched_class))
+				resched_curr(dst_rq);
+		}
 
 		/* switch back to @rq lock */
 		if (rq != dst_rq) {
-- 
2.46.0


  parent reply	other threads:[~2024-08-30 11:04 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-30 11:03 [PATCHSET sched_ext/for-6.12] sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq() Tejun Heo
2024-08-30 11:03 ` [PATCH 01/11] sched_ext: Rename scx_kfunc_set_sleepable to unlocked and relocate Tejun Heo
2024-08-30 17:45   ` David Vernet
2024-08-30 11:03 ` Tejun Heo [this message]
2024-08-31  4:05   ` [PATCH 02/11] sched_ext: Refactor consume_remote_task() David Vernet
2024-08-31  5:33     ` Tejun Heo
2024-08-31 23:40       ` David Vernet
2024-08-30 11:03 ` [PATCH 03/11] sched_ext: Make find_dsq_for_dispatch() handle SCX_DSQ_LOCAL_ON Tejun Heo
2024-09-01  0:11   ` David Vernet
2024-08-30 11:03 ` [PATCH 04/11] sched_ext: Make dispatch_to_local_dsq() return void Tejun Heo
2024-08-30 17:44   ` [PATCH 04/11] sched_ext: Fix processs_ddsp_deferred_locals() by unifying DTL_INVALID handling Tejun Heo
2024-09-01  0:53     ` David Vernet
2024-09-01  0:56       ` David Vernet
2024-09-01  8:03         ` Tejun Heo
2024-09-01 15:35           ` David Vernet
2024-08-30 11:03 ` [PATCH 05/11] sched_ext: Restructure dispatch_to_local_dsq() Tejun Heo
2024-09-01  1:09   ` David Vernet
2024-08-30 11:03 ` [PATCH 06/11] sched_ext: Reorder args for consume_local/remote_task() Tejun Heo
2024-09-01  1:40   ` David Vernet
2024-09-01  6:37     ` Tejun Heo
2024-08-30 11:03 ` [PATCH 07/11] sched_ext: Move sanity check and dsq_mod_nr() into task_unlink_from_dsq() Tejun Heo
2024-09-01  1:42   ` David Vernet
2024-08-30 11:03 ` [PATCH 08/11] sched_ext: Move consume_local_task() upward Tejun Heo
2024-09-01  1:43   ` David Vernet
2024-08-30 11:03 ` [PATCH 09/11] sched_ext: Replace consume_local_task() with move_local_task_to_local_dsq() Tejun Heo
2024-09-01  1:55   ` David Vernet
2024-08-30 11:03 ` [PATCH 10/11] sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq() Tejun Heo
2024-08-31 14:30   ` Andrea Righi
2024-08-31 16:20     ` Tejun Heo
2024-08-31 21:15       ` Andrea Righi
2024-09-02  1:53         ` Changwoo Min
2024-09-02  5:59           ` Tejun Heo
2024-08-30 11:03 ` [PATCH 11/11] scx_qmap: Implement highpri boosting Tejun Heo
2024-08-30 20:59   ` [PATCH v2 " Tejun Heo
2024-08-30 17:31 ` [PATCHSET sched_ext/for-6.12] sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq() Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240830110415.116090-3-tj@kernel.org \
    --to=tj@kernel.org \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=void@manifault.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox