[PATCH sched_ext/for-7.1] sched_ext: Use irq_work_queue_on() in schedule_deferred()
From: Tejun Heo @ 2026-03-22 20:33 UTC
  To: David Vernet, Andrea Righi, Changwoo Min
  Cc: Emil Tsalapatis, sched-ext, linux-kernel

schedule_deferred() uses irq_work_queue(), which always queues on the
calling CPU. The deferred work can correctly run from any CPU, and the
_locked() path already processes remote rqs from the calling CPU. However,
when falling through to the irq_work path, queueing on the target CPU is
preferable: delivered by IPI, the work can run sooner instead of waiting
for the calling CPU to re-enable IRQs.
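
For reference, the two interfaces differ only in where the work is
raised. A minimal sketch of the call patterns (comments illustrative):

	/* queue on the calling CPU; with IRQs disabled here, the work
	 * only runs once this CPU re-enables IRQs */
	irq_work_queue(&rq->scx.deferred_irq_work);

	/* queue on @rq's CPU; an IPI is sent, so the work can run as
	 * soon as that CPU can take the interrupt */
	irq_work_queue_on(&rq->scx.deferred_irq_work, cpu_of(rq));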

Currently, only reenqueue operations use this path: either BPF-initiated
reenqueue targeting a remote rq, or IMMED reenqueue when the target CPU is
busy running userspace (not in balance or wakeup, so the _locked() fast
paths aren't available). Use irq_work_queue_on() to target the owning CPU.

This improves IMMED reenqueue latency when tasks are dispatched to
remote local DSQs. Tested on a 24-CPU AMD Ryzen 9 3900X with scx_qmap
-I -F 50 (ALWAYS_ENQ_IMMED, every 50th enqueue forced to prev_cpu's
local DSQ) under heavy mixed load (2x CPU oversubscription, yield and
context-switch pressure, SCHED_FIFO bursts, periodic fork storms, mixed
nice levels, C-states disabled). Local DSQ residence time (insert to
remove) was measured over 5 x 120s runs (~1.2M tasks per set):

  >128us outliers:  71 -> 39  (-45%)
  >256us outliers:  59 -> 36  (-39%)

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/sched/ext.c |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -1164,10 +1164,18 @@ static void deferred_irq_workfn(struct i
 static void schedule_deferred(struct rq *rq)
 {
 	/*
-	 * Queue an irq work. They are executed on IRQ re-enable which may take
-	 * a bit longer than the scheduler hook in schedule_deferred_locked().
+	 * This is the fallback when schedule_deferred_locked() can't use
+	 * the cheaper balance callback or wakeup hook paths (the target
+	 * CPU is not in balance or wakeup). Currently, this is primarily
+	 * hit by reenqueue operations targeting a remote CPU.
+	 *
+	 * Queue on the target CPU. The deferred work can run from any CPU
+	 * correctly - the _locked() path already processes remote rqs from
+	 * the calling CPU - but targeting the owning CPU allows IPI delivery
+	 * without waiting for the calling CPU to re-enable IRQs and is
+	 * cheaper as the reenqueue runs locally.
 	 */
-	irq_work_queue(&rq->scx.deferred_irq_work);
+	irq_work_queue_on(&rq->scx.deferred_irq_work, cpu_of(rq));
 }

 /**
--
tejun
