* [PATCH] fuse: Wake requests on the same cpu
From: Bernd Schubert @ 2025-10-13 17:27 UTC (permalink / raw)
To: Miklos Szeredi, Ingo Molnar, Peter Zijlstra, Juri Lelli,
Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
Mel Gorman, Valentin Schneider
Cc: Joanne Koong, Luis Henriques, linux-fsdevel, Bernd Schubert
For io-uring it makes sense to wake the waiting application (synchronous
IO) on the same core.

With queue-per-core:

fio --directory=/tmp/dest --name=iops.\$jobnum --rw=randread --bs=4k \
    --size=1G --numjobs=1 --iodepth=1 --time_based --runtime=30s \
    --group_reporting --ioengine=psync --direct=1
no-io-uring
READ: bw=116MiB/s (122MB/s), 116MiB/s-116MiB/s
no-io-uring wake on the same core (not part of this patch)
READ: bw=115MiB/s (120MB/s), 115MiB/s-115MiB/s
unpatched
READ: bw=260MiB/s (273MB/s), 260MiB/s-260MiB/s
patched
READ: bw=345MiB/s (362MB/s), 345MiB/s-345MiB/s
Without io-uring and core-bound fuse-server queues there is almost no
difference. In fact, fio results fluctuate heavily during the run,
between 85MB/s and 205MB/s.
With --numjobs=8
unpatched
READ: bw=2378MiB/s (2493MB/s), 2378MiB/s-2378MiB/s
patched
READ: bw=2402MiB/s (2518MB/s), 2402MiB/s-2402MiB/s
(differences within the confidence interval)
With '-o io_uring_q_mask=0-3:8-11' (16-core / 32-SMT system) and --numjobs=8:
unpatched
READ: bw=1286MiB/s (1348MB/s), 1286MiB/s-1286MiB/s
patched
READ: bw=1561MiB/s (1637MB/s), 1561MiB/s-1561MiB/s
I.e. there is no difference with many application threads and
queue-per-core, but a perf gain with overloaded queues - a bit
surprising.
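For reference, the relative gains implied by the fio numbers above can be
sanity-checked with a few lines (illustration only, not part of the patch):

```python
# Percent change of patched over unpatched throughput, using the
# fio bandwidth numbers quoted above (MiB/s).
def gain(unpatched, patched):
    """Relative throughput change in percent."""
    return (patched - unpatched) / unpatched * 100

# numjobs=1, queue-per-core: 260 -> 345 MiB/s
print(f"numjobs=1: {gain(260, 345):+.1f}%")   # roughly +32.7%
# numjobs=8, queue-per-core: 2378 -> 2402 MiB/s
print(f"numjobs=8: {gain(2378, 2402):+.1f}%") # roughly +1.0%
# numjobs=8, restricted queue mask: 1286 -> 1561 MiB/s
print(f"q_mask:    {gain(1286, 1561):+.1f}%") # roughly +21.4%
```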
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
---
This was already part of the RFC series and was then removed on
request, to keep optimizations out of the main fuse-io-uring series.
I later hesitated to add it back: while working on reducing the
required number of queues/rings, I initially thought
wake-on-current-cpu would need to be conditional on queue-per-core
(or a reduced number of queues) being used.
After testing with a reduced number of queues, there is still a
measurable benefit - no condition is needed and the patch can be
handled independently of the queue size reduction.
---
fs/fuse/dev.c | 8 ++++++--
include/linux/wait.h | 6 +++---
kernel/sched/wait.c | 12 ++++++++++++
3 files changed, 21 insertions(+), 5 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 132f38619d70720ce74eedc002a7b8f31e760a61..0f73ef9f77b463b6dfd07e35262dc3375648c56f 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -499,8 +499,12 @@ void fuse_request_end(struct fuse_req *req)
flush_bg_queue(fc);
spin_unlock(&fc->bg_lock);
} else {
- /* Wake up waiter sleeping in request_wait_answer() */
- wake_up(&req->waitq);
+ if (test_bit(FR_URING, &req->flags)) {
+ wake_up_on_current_cpu(&req->waitq);
+ } else {
+ /* Wake up waiter sleeping in request_wait_answer() */
+ wake_up(&req->waitq);
+ }
}
if (test_bit(FR_ASYNC, &req->flags))
diff --git a/include/linux/wait.h b/include/linux/wait.h
index f648044466d5f55f2d65a3aa153b4dfe39f0b6dc..831a187b3f68f0707c75ceee919fec338db410b3 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -219,6 +219,7 @@ void __wake_up_sync(struct wait_queue_head *wq_head, unsigned int mode);
void __wake_up_pollfree(struct wait_queue_head *wq_head);
#define wake_up(x) __wake_up(x, TASK_NORMAL, 1, NULL)
+#define wake_up_on_current_cpu(x) __wake_up_on_current_cpu(x, TASK_NORMAL, NULL)
#define wake_up_nr(x, nr) __wake_up(x, TASK_NORMAL, nr, NULL)
#define wake_up_all(x) __wake_up(x, TASK_NORMAL, 0, NULL)
#define wake_up_locked(x) __wake_up_locked((x), TASK_NORMAL, 1)
@@ -479,9 +480,8 @@ do { \
__wait_event_cmd(wq_head, condition, cmd1, cmd2); \
} while (0)
-#define __wait_event_interruptible(wq_head, condition) \
- ___wait_event(wq_head, condition, TASK_INTERRUPTIBLE, 0, 0, \
- schedule())
+#define __wait_event_interruptible(wq_head, condition) \
+ ___wait_event(wq_head, condition, TASK_INTERRUPTIBLE, 0, 0, schedule())
/**
* wait_event_interruptible - sleep until a condition gets true
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 20f27e2cf7aec691af040fcf2236a20374ec66bf..1c6943a620ae389590a9d06577b998c320310923 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -147,10 +147,22 @@ int __wake_up(struct wait_queue_head *wq_head, unsigned int mode,
}
EXPORT_SYMBOL(__wake_up);
+/**
+ * __wake_up - wake up threads blocked on a waitqueue, on the current cpu
+ * @wq_head: the waitqueue
+ * @mode: which threads
+ * @nr_exclusive: how many wake-one or wake-many threads to wake up
+ * @key: is directly passed to the wakeup function
+ *
+ * If this function wakes up a task, it executes a full memory barrier
+ * before accessing the task state. Returns the number of exclusive
+ * tasks that were awaken.
+ */
void __wake_up_on_current_cpu(struct wait_queue_head *wq_head, unsigned int mode, void *key)
{
__wake_up_common_lock(wq_head, mode, 1, WF_CURRENT_CPU, key);
}
+EXPORT_SYMBOL_GPL(__wake_up_on_current_cpu);
/*
* Same as __wake_up but called with the spinlock in wait_queue_head_t held.
---
base-commit: ec714e371f22f716a04e6ecb2a24988c92b26911
change-id: 20251013-wake-same-cpu-b7ddb0b0688e
Best regards,
--
Bernd Schubert <bschubert@ddn.com>
* Re: [PATCH] fuse: Wake requests on the same cpu
From: Johannes Thumshirn @ 2025-10-14 7:25 UTC (permalink / raw)
To: Bernd Schubert, Miklos Szeredi, Ingo Molnar, Peter Zijlstra,
Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
Ben Segall, Mel Gorman, Valentin Schneider
Cc: Joanne Koong, Luis Henriques, linux-fsdevel@vger.kernel.org
On 10/13/25 9:01 PM, Bernd Schubert wrote:
> +/**
> + * __wake_up - wake up threads blocked on a waitqueue, on the current cpu
That needs to be __wake_up_on_current_cpu
[..]
> void __wake_up_on_current_cpu(struct wait_queue_head *wq_head, unsigned int mode, void *key)
> {
> __wake_up_common_lock(wq_head, mode, 1, WF_CURRENT_CPU, key);
> }
> +EXPORT_SYMBOL_GPL(__wake_up_on_current_cpu);
* Re: [PATCH] fuse: Wake requests on the same cpu
From: Bernd Schubert @ 2025-10-14 9:12 UTC (permalink / raw)
To: Johannes Thumshirn, Miklos Szeredi, Ingo Molnar, Peter Zijlstra,
Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
Ben Segall, Mel Gorman, Valentin Schneider
Cc: Joanne Koong, Luis Henriques, linux-fsdevel@vger.kernel.org
On 10/14/25 09:25, Johannes Thumshirn wrote:
> On 10/13/25 9:01 PM, Bernd Schubert wrote:
>> +/**
>> + * __wake_up - wake up threads blocked on a waitqueue, on the current cpu
> That needs to be __wake_up_on_current_cpu
> [..]
>> void __wake_up_on_current_cpu(struct wait_queue_head *wq_head, unsigned int mode, void *key)
>> {
>> __wake_up_common_lock(wq_head, mode, 1, WF_CURRENT_CPU, key);
>> }
>> +EXPORT_SYMBOL_GPL(__wake_up_on_current_cpu);
>
>
Oops, thanks for spotting! v2 is coming.
Thanks,
Bernd