* [PATCH sched_ext/for-6.12] sched_ext: Make scx_rq_online() also test cpu_active() in addition to SCX_RQ_ONLINE
From: Tejun Heo @ 2024-08-07 22:13 UTC
  To: David Vernet; +Cc: linux-kernel, kernel-team

scx_rq_online() currently only tests SCX_RQ_ONLINE. This isn't fully correct
- e.g. consume_dispatch_q() uses task_can_run_on_remote_rq() which tests
scx_rq_online() to see whether the current rq can run the task, and, if so,
calls consume_remote_task() to migrate the task to @rq. While the test
itself is done with @rq locked, @rq can be temporarily unlocked by
consume_remote_task() and nothing prevents SCX_RQ_ONLINE from being cleared
before the migration takes place.

To address the issue, add a cpu_active() test to scx_rq_online(). There is a
synchronize_rcu() between cpu_active() being cleared and the rq going
offline, so if an on-going scheduling operation sees cpu_active(), the
associated rq is guaranteed to not go offline until the scheduling operation
is complete.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 60c27fb59f6c ("sched_ext: Implement sched_ext_ops.cpu_online/offline()")
---
 kernel/sched/ext.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -1818,7 +1818,14 @@ dispatch:
 
 static bool scx_rq_online(struct rq *rq)
 {
-	return likely(rq->scx.flags & SCX_RQ_ONLINE);
+	/*
+	 * Test both cpu_active() and %SCX_RQ_ONLINE. %SCX_RQ_ONLINE indicates
+	 * the online state as seen from the BPF scheduler. cpu_active() test
+	 * guarantees that, if this function returns %true, %SCX_RQ_ONLINE will
+	 * stay set until the current scheduling operation is complete even if
+	 * we aren't locking @rq.
+	 */
+	return likely((rq->scx.flags & SCX_RQ_ONLINE) && cpu_active(cpu_of(rq)));
 }
 
 static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
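
The reasoning in the changelog above can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not kernel
code: struct model_rq, MODEL_RQ_ONLINE and the model_*() helpers are
hypothetical names invented here, and the synchronize_rcu() grace period is
represented only by the ordering of the two hotplug steps. The point it
tries to show is that in the window after cpu_active() is cleared but before
the rq goes offline, the old check still passes while the new one already
fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for SCX_RQ_ONLINE; not the kernel's definition. */
#define MODEL_RQ_ONLINE (1u << 0)

/* Hypothetical stand-in for the parts of struct rq that the check reads. */
struct model_rq {
	unsigned int flags;	/* models rq->scx.flags */
	bool cpu_active;	/* models cpu_active(cpu_of(rq)) */
};

/* Pre-patch check: looks at the online flag only. */
static bool model_rq_online_old(const struct model_rq *rq)
{
	return (rq->flags & MODEL_RQ_ONLINE) != 0;
}

/* Post-patch check: online flag and CPU active state must both be set. */
static bool model_rq_online_new(const struct model_rq *rq)
{
	return (rq->flags & MODEL_RQ_ONLINE) != 0 && rq->cpu_active;
}

/* Hotplug step 1: the CPU stops being active. */
static void model_clear_active(struct model_rq *rq)
{
	rq->cpu_active = false;
}

/*
 * Hotplug step 2: the rq goes offline. In the kernel a synchronize_rcu()
 * sits between step 1 and step 2; here that is modelled only by always
 * calling this after model_clear_active().
 */
static void model_set_offline(struct model_rq *rq)
{
	rq->flags &= ~MODEL_RQ_ONLINE;
}
```

Walking one model_rq through the two steps in order, both checks agree
before step 1 and after step 2. They differ only in between, which is the
window in which an operation that trusted the old check could still see the
rq go offline underneath it.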

* Re: [PATCH sched_ext/for-6.12] sched_ext: Make scx_rq_online() also test cpu_active() in addition to SCX_RQ_ONLINE
From: David Vernet @ 2024-08-08 17:39 UTC
  To: Tejun Heo; +Cc: linux-kernel, kernel-team

On Wed, Aug 07, 2024 at 12:13:38PM -1000, Tejun Heo wrote:
> scx_rq_online() currently only tests SCX_RQ_ONLINE. This isn't fully correct
> - e.g. consume_dispatch_q() uses task_can_run_on_remote_rq() which tests
> scx_rq_online() to see whether the current rq can run the task, and, if so,
> calls consume_remote_task() to migrate the task to @rq. While the test
> itself is done with @rq locked, @rq can be temporarily unlocked by
> consume_remote_task() and nothing prevents SCX_RQ_ONLINE from being cleared
> before the migration takes place.
> 
> To address the issue, add a cpu_active() test to scx_rq_online(). There is a
> synchronize_rcu() between cpu_active() being cleared and the rq going
> offline, so if an on-going scheduling operation sees cpu_active(), the
> associated rq is guaranteed to not go offline until the scheduling operation
> is complete.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Fixes: 60c27fb59f6c ("sched_ext: Implement sched_ext_ops.cpu_online/offline()")

Acked-by: David Vernet <void@manifault.com>

* Re: [PATCH sched_ext/for-6.12] sched_ext: Make scx_rq_online() also test cpu_active() in addition to SCX_RQ_ONLINE
From: Tejun Heo @ 2024-08-08 23:38 UTC
  To: David Vernet; +Cc: linux-kernel, kernel-team

On Wed, Aug 07, 2024 at 12:13:38PM -1000, Tejun Heo wrote:
> scx_rq_online() currently only tests SCX_RQ_ONLINE. This isn't fully correct
> - e.g. consume_dispatch_q() uses task_can_run_on_remote_rq() which tests
> scx_rq_online() to see whether the current rq can run the task, and, if so,
> calls consume_remote_task() to migrate the task to @rq. While the test
> itself is done with @rq locked, @rq can be temporarily unlocked by
> consume_remote_task() and nothing prevents SCX_RQ_ONLINE from being cleared
> before the migration takes place.
> 
> To address the issue, add a cpu_active() test to scx_rq_online(). There is a
> synchronize_rcu() between cpu_active() being cleared and the rq going
> offline, so if an on-going scheduling operation sees cpu_active(), the
> associated rq is guaranteed to not go offline until the scheduling operation
> is complete.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Fixes: 60c27fb59f6c ("sched_ext: Implement sched_ext_ops.cpu_online/offline()")

Applied to sched_ext/for-6.12.

Thanks.

-- 
tejun
