From: Tejun Heo <tj@kernel.org>
To: David Vernet <void@manifault.com>,
Andrea Righi <arighi@nvidia.com>,
Changwoo Min <changwoo@igalia.com>
Cc: Emil Tsalapatis <emil@etsalapatis.com>,
sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH sched_ext/for-7.1] sched_ext: Use irq_work_queue_on() in schedule_deferred()
Date: Sun, 22 Mar 2026 10:33:08 -1000 [thread overview]
Message-ID: <a80d84c345be8fbbd5a1294b9a72ae37@kernel.org> (raw)
schedule_deferred() uses irq_work_queue() which always queues on the
calling CPU. The deferred work can run from any CPU correctly, and the
_locked() path already processes remote rqs from the calling CPU. However,
when falling through to the irq_work path, queuing on the target CPU is
preferable as the work can run sooner via IPI delivery rather than waiting
for the calling CPU to re-enable IRQs.
Currently, only reenqueue operations use this path - either BPF-initiated
reenqueue targeting a remote rq, or IMMED reenqueue when the target CPU is
busy running userspace (not in balance or wakeup, so the _locked() fast
paths aren't available). Use irq_work_queue_on() to target the owning CPU.
This improves IMMED reenqueue latency when tasks are dispatched to
remote local DSQs. Testing on a 24-CPU AMD Ryzen 3900X with scx_qmap
-I -F 50 (ALWAYS_ENQ_IMMED, every 50th enqueue forced to prev_cpu's
local DSQ) under heavy mixed load (2x CPU oversubscription, yield and
context-switch pressure, SCHED_FIFO bursts, periodic fork storms, mixed
nice levels, C-states disabled), measuring local DSQ residence time
(insert to remove) over 5 x 120s runs (~1.2M tasks per set):
>128us outliers: 71 -> 39 (-45%)
>256us outliers: 59 -> 36 (-39%)
Signed-off-by: Tejun Heo <tj@kernel.org>
---
kernel/sched/ext.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -1164,10 +1164,18 @@ static void deferred_irq_workfn(struct i
static void schedule_deferred(struct rq *rq)
{
/*
- * Queue an irq work. They are executed on IRQ re-enable which may take
- * a bit longer than the scheduler hook in schedule_deferred_locked().
+ * This is the fallback when schedule_deferred_locked() can't use
+ * the cheaper balance callback or wakeup hook paths (the target
+ * CPU is not in balance or wakeup). Currently, this is primarily
+ * hit by reenqueue operations targeting a remote CPU.
+ *
+ * Queue on the target CPU. The deferred work can run from any CPU
+ * correctly - the _locked() path already processes remote rqs from
+ * the calling CPU - but targeting the owning CPU allows IPI delivery
+ * without waiting for the calling CPU to re-enable IRQs and is
+ * cheaper as the reenqueue runs locally.
*/
- irq_work_queue(&rq->scx.deferred_irq_work);
+ irq_work_queue_on(&rq->scx.deferred_irq_work, cpu_of(rq));
}
/**
--
tejun
next reply other threads:[~2026-03-22 20:33 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-22 20:33 Tejun Heo [this message]
2026-03-22 23:34 ` [PATCH sched_ext/for-7.1] sched_ext: Use irq_work_queue_on() in schedule_deferred() Emil Tsalapatis
2026-03-23 0:07 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a80d84c345be8fbbd5a1294b9a72ae37@kernel.org \
--to=tj@kernel.org \
--cc=arighi@nvidia.com \
--cc=changwoo@igalia.com \
--cc=emil@etsalapatis.com \
--cc=linux-kernel@vger.kernel.org \
--cc=sched-ext@lists.linux.dev \
--cc=void@manifault.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox