From: Jens Axboe <axboe@kernel.dk>
To: io-uring@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: hch@lst.de, andres@anarazel.de, david@fromorbit.com,
stable@vger.kernel.org, Pavel Begunkov <asml.silence@gmail.com>,
linux-kernel@vger.kernel.org, Jens Axboe <axboe@kernel.dk>
Subject: [PATCH] io_uring: Use io_schedule* in cqring wait
Date: Tue, 18 Jul 2023 13:49:15 -0600 [thread overview]
Message-ID: <20230718194920.1472184-2-axboe@kernel.dk> (raw)
In-Reply-To: <20230718194920.1472184-1-axboe@kernel.dk>
From: Andres Freund <andres@anarazel.de>
I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().
The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0
This is repeatable with different filesystems, using raw block devices
and using different block devices.
Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.
After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).
There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.
Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
io_uring/io_uring.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e8096d502a7c..7505de2428e0 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2489,6 +2489,8 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq)
{
+ int token, ret;
+
if (unlikely(READ_ONCE(ctx->check_cq)))
return 1;
if (unlikely(!llist_empty(&ctx->work_llist)))
@@ -2499,11 +2501,20 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
return -EINTR;
if (unlikely(io_should_wake(iowq)))
return 0;
+
+ /*
+ * Use io_schedule_prepare/finish, so cpufreq can take into account
+ * that the task is waiting for IO - turns out to be important for low
+ * QD IO.
+ */
+ token = io_schedule_prepare();
+ ret = 0;
if (iowq->timeout == KTIME_MAX)
schedule();
else if (!schedule_hrtimeout(&iowq->timeout, HRTIMER_MODE_ABS))
- return -ETIME;
- return 0;
+ ret = -ETIME;
+ io_schedule_finish(token);
+ return ret;
}
/*
--
2.40.1
next prev parent reply other threads:[~2023-07-18 19:49 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-18 19:49 [PATCHSET v2 0/5] Improve async iomap DIO performance Jens Axboe
2023-07-18 19:49 ` Jens Axboe [this message]
2023-07-18 19:50 ` [PATCH] io_uring: Use io_schedule* in cqring wait Jens Axboe
2023-07-18 19:49 ` [PATCH 1/5] iomap: simplify logic for when a dio can get completed inline Jens Axboe
2023-07-18 22:56 ` Dave Chinner
2023-07-19 15:23 ` Jens Axboe
2023-07-18 19:49 ` [PATCH 2/5] fs: add IOCB flags related to passing back dio completions Jens Axboe
2023-07-18 19:49 ` [PATCH 3/5] io_uring/rw: add write support for IOCB_DIO_DEFER Jens Axboe
2023-07-18 19:49 ` [PATCH 4/5] iomap: add local 'iocb' variable in iomap_dio_bio_end_io() Jens Axboe
2023-07-18 19:49 ` [PATCH 5/5] iomap: support IOCB_DIO_DEFER Jens Axboe
2023-07-18 23:50 ` Dave Chinner
2023-07-19 19:55 ` Jens Axboe
-- strict thread matches above, loose matches on Subject: below --
2023-07-24 15:35 [PATCH] io_uring: Use io_schedule* in cqring wait Phil Elwell
2023-07-24 15:48 ` Greg KH
2023-07-24 15:50 ` Jens Axboe
2023-07-24 15:58 ` Jens Axboe
2023-07-24 16:07 ` Phil Elwell
2023-07-24 16:08 ` Jens Axboe
2023-07-24 16:48 ` Phil Elwell
2023-07-24 18:22 ` Jens Axboe
2023-07-24 19:22 ` Pavel Begunkov
2023-07-24 20:27 ` Jeff Moyer
2023-07-24 15:48 ` Jens Axboe
2023-07-24 16:16 ` Andres Freund
2023-07-24 16:20 ` Jens Axboe
2023-07-24 17:24 ` Andres Freund
2023-07-24 17:44 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230718194920.1472184-2-axboe@kernel.dk \
--to=axboe@kernel.dk \
--cc=andres@anarazel.de \
--cc=asml.silence@gmail.com \
--cc=david@fromorbit.com \
--cc=hch@lst.de \
--cc=io-uring@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.