* FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
@ 2023-07-16 8:41 gregkh
2023-07-16 18:13 ` Jens Axboe
0 siblings, 1 reply; 10+ messages in thread
From: gregkh @ 2023-07-16 8:41 UTC (permalink / raw)
To: andres, asml.silence, axboe; +Cc: stable
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..

Possible dependencies:

8a796565cec3 ("io_uring: Use io_schedule* in cqring wait")
d33a39e57768 ("io_uring: keep timeout in io_wait_queue")
46ae7eef44f6 ("io_uring: optimise non-timeout waiting")
846072f16eed ("io_uring: mimimise io_cqring_wait_schedule")
3fcf19d592d5 ("io_uring: parse check_cq out of wq waiting")
12521a5d5cb7 ("io_uring: fix CQ waiting timeout handling")
52ea806ad983 ("io_uring: finish waiting before flushing overflow entries")
35d90f95cfa7 ("io_uring: include task_work run after scheduling in wait for events")
1b346e4aa8e7 ("io_uring: don't check overflow flush failures")
a85381d8326d ("io_uring: skip overflow CQE posting for dying ring")

thanks,

greg k-h
------------------ original commit in Linus's tree ------------------
From 8a796565cec3601071cbbd27d6304e202019d014 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Fri, 7 Jul 2023 09:20:07 -0700
Subject: [PATCH] io_uring: Use io_schedule* in cqring wait

I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().

The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0

This is repeatable with different filesystems, using raw block devices
and using different block devices.

Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.

After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).

There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.

Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e8096d502a7c..7505de2428e0 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2489,6 +2489,8 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
 static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 					  struct io_wait_queue *iowq)
 {
+	int token, ret;
+
 	if (unlikely(READ_ONCE(ctx->check_cq)))
 		return 1;
 	if (unlikely(!llist_empty(&ctx->work_llist)))
@@ -2499,11 +2501,20 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 		return -EINTR;
 	if (unlikely(io_should_wake(iowq)))
 		return 0;
+
+	/*
+	 * Use io_schedule_prepare/finish, so cpufreq can take into account
+	 * that the task is waiting for IO - turns out to be important for low
+	 * QD IO.
+	 */
+	token = io_schedule_prepare();
+	ret = 0;
 	if (iowq->timeout == KTIME_MAX)
 		schedule();
 	else if (!schedule_hrtimeout(&iowq->timeout, HRTIMER_MODE_ABS))
-		return -ETIME;
-	return 0;
+		ret = -ETIME;
+	io_schedule_finish(token);
+	return ret;
 }
 
 /*
^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16  8:41 FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree gregkh
@ 2023-07-16 18:13 ` Jens Axboe
  2023-07-16 19:11   ` Andres Freund
  2023-07-17 16:39   ` Jens Axboe
  0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-16 18:13 UTC (permalink / raw)
  To: gregkh, andres, asml.silence; +Cc: stable

[-- Attachment #1: Type: text/plain, Size: 805 bytes --]

On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
> 
> The patch below does not apply to the 6.1-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@vger.kernel.org>.
> 
> To reproduce the conflict and resubmit, you may use the following commands:
> 
> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
> git checkout FETCH_HEAD
> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
> # <resolve conflicts, build, test, etc.>
> git commit -s
> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..

Here's one for 6.1-stable.

-- 
Jens Axboe

[-- Attachment #2: 0001-io_uring-Use-io_schedule-in-cqring-wait.patch --]
[-- Type: text/x-patch, Size: 2716 bytes --]

From 71fc76b239a1c980c11821916d1d6785bc177c5c Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Jul 2023 12:13:06 -0600
Subject: [PATCH] io_uring: Use io_schedule* in cqring wait

I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().

The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0

This is repeatable with different filesystems, using raw block devices
and using different block devices.

Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.

After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).

There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.

Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cc35aba1e495..de117d3424b2 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 					  struct io_wait_queue *iowq,
 					  ktime_t *timeout)
 {
-	int ret;
+	int token, ret;
 	unsigned long check_cq;
 
 	/* make sure we run task_work before checking for signals */
@@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
 			return -EBADR;
 	}
+
+	/*
+	 * Use io_schedule_prepare/finish, so cpufreq can take into account
+	 * that the task is waiting for IO - turns out to be important for low
+	 * QD IO.
+	 */
+	token = io_schedule_prepare();
+	ret = 0;
 	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
-		return -ETIME;
-	return 1;
+		ret = -ETIME;
+	io_schedule_finish(token);
+	return ret;
 }
 
 /*
-- 
2.40.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16 18:13 ` Jens Axboe
@ 2023-07-16 19:11   ` Andres Freund
  2023-07-16 19:19     ` Jens Axboe
  2023-07-17 16:39   ` Jens Axboe
  1 sibling, 1 reply; 10+ messages in thread
From: Andres Freund @ 2023-07-16 19:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: gregkh, asml.silence, stable

Hi,

On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
> Here's one for 6.1-stable.

Thanks for working on that!

> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index cc35aba1e495..de117d3424b2 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>  					  struct io_wait_queue *iowq,
>  					  ktime_t *timeout)
>  {
> -	int ret;
> +	int token, ret;
>  	unsigned long check_cq;
> 
>  	/* make sure we run task_work before checking for signals */
> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>  		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
>  			return -EBADR;
>  	}
> +
> +	/*
> +	 * Use io_schedule_prepare/finish, so cpufreq can take into account
> +	 * that the task is waiting for IO - turns out to be important for low
> +	 * QD IO.
> +	 */
> +	token = io_schedule_prepare();
> +	ret = 0;
>  	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
> -		return -ETIME;
> -	return 1;
> +		ret = -ETIME;
> +	io_schedule_finish(token);
> +	return ret;
>  }

To me it looks like this might have changed more than intended? Previously
io_cqring_wait_schedule() returned 1 in case schedule_hrtimeout() returned
non-zero, now io_cqring_wait_schedule() returns 0 in that case? Am I missing
something?

Greetings,

Andres Freund

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16 19:11   ` Andres Freund
@ 2023-07-16 19:19     ` Jens Axboe
  2023-07-16 19:29       ` Greg KH
  2023-07-17 16:32       ` Jens Axboe
  0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-16 19:19 UTC (permalink / raw)
  To: Andres Freund; +Cc: gregkh, asml.silence, stable

On 7/16/23 1:11?PM, Andres Freund wrote:
> Hi,
> 
> On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
>> Here's one for 6.1-stable.
> 
> Thanks for working on that!
> 
> 
>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index cc35aba1e495..de117d3424b2 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>>  					  struct io_wait_queue *iowq,
>>  					  ktime_t *timeout)
>>  {
>> -	int ret;
>> +	int token, ret;
>>  	unsigned long check_cq;
>> 
>>  	/* make sure we run task_work before checking for signals */
>> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>>  		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
>>  			return -EBADR;
>>  	}
>> +
>> +	/*
>> +	 * Use io_schedule_prepare/finish, so cpufreq can take into account
>> +	 * that the task is waiting for IO - turns out to be important for low
>> +	 * QD IO.
>> +	 */
>> +	token = io_schedule_prepare();
>> +	ret = 0;
>>  	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
>> -		return -ETIME;
>> -	return 1;
>> +		ret = -ETIME;
>> +	io_schedule_finish(token);
>> +	return ret;
>>  }
> 
> To me it looks like this might have changed more than intended? Previously
> io_cqring_wait_schedule() returned 1 in case schedule_hrtimeout() returned
> non-zero, now io_cqring_wait_schedule() returns 0 in that case? Am I missing
> something?

Ah shoot yes indeed. Greg, can you drop the 5.10/5.15/6.1 ones for now?
I'll get it sorted tomorrow. Sorry about that, and thanks for catching
that Andres!

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16 19:19     ` Jens Axboe
@ 2023-07-16 19:29       ` Greg KH
  2023-07-17 16:32       ` Jens Axboe
  1 sibling, 0 replies; 10+ messages in thread
From: Greg KH @ 2023-07-16 19:29 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andres Freund, asml.silence, stable

On Sun, Jul 16, 2023 at 01:19:31PM -0600, Jens Axboe wrote:
> On 7/16/23 1:11?PM, Andres Freund wrote:
> > Hi,
> > 
> > On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
> >> Here's one for 6.1-stable.
> > 
> > Thanks for working on that!
> > 
> > 
> >> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> >> index cc35aba1e495..de117d3424b2 100644
> >> --- a/io_uring/io_uring.c
> >> +++ b/io_uring/io_uring.c
> >> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
> >>  					  struct io_wait_queue *iowq,
> >>  					  ktime_t *timeout)
> >>  {
> >> -	int ret;
> >> +	int token, ret;
> >>  	unsigned long check_cq;
> >> 
> >>  	/* make sure we run task_work before checking for signals */
> >> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
> >>  		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
> >>  			return -EBADR;
> >>  	}
> >> +
> >> +	/*
> >> +	 * Use io_schedule_prepare/finish, so cpufreq can take into account
> >> +	 * that the task is waiting for IO - turns out to be important for low
> >> +	 * QD IO.
> >> +	 */
> >> +	token = io_schedule_prepare();
> >> +	ret = 0;
> >>  	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
> >> -		return -ETIME;
> >> -	return 1;
> >> +		ret = -ETIME;
> >> +	io_schedule_finish(token);
> >> +	return ret;
> >>  }
> > 
> > To me it looks like this might have changed more than intended? Previously
> > io_cqring_wait_schedule() returned 1 in case schedule_hrtimeout() returned
> > non-zero, now io_cqring_wait_schedule() returns 0 in that case? Am I missing
> > something?
> 
> Ah shoot yes indeed. Greg, can you drop the 5.10/5.15/6.1 ones for now?
> I'll get it sorted tomorrow. Sorry about that, and thanks for catching
> that Andres!

Sure, will go drop it now, thanks.

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16 19:19     ` Jens Axboe
  2023-07-16 19:29       ` Greg KH
@ 2023-07-17 16:32       ` Jens Axboe
  1 sibling, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-17 16:32 UTC (permalink / raw)
  To: Andres Freund; +Cc: gregkh, asml.silence, stable

[-- Attachment #1: Type: text/plain, Size: 1871 bytes --]

On 7/16/23 1:19?PM, Jens Axboe wrote:
> On 7/16/23 1:11?PM, Andres Freund wrote:
>> Hi,
>> 
>> On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
>>> Here's one for 6.1-stable.
>> 
>> Thanks for working on that!
>> 
>> 
>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>> index cc35aba1e495..de117d3424b2 100644
>>> --- a/io_uring/io_uring.c
>>> +++ b/io_uring/io_uring.c
>>> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>>>  					  struct io_wait_queue *iowq,
>>>  					  ktime_t *timeout)
>>>  {
>>> -	int ret;
>>> +	int token, ret;
>>>  	unsigned long check_cq;
>>> 
>>>  	/* make sure we run task_work before checking for signals */
>>> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>>>  		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
>>>  			return -EBADR;
>>>  	}
>>> +
>>> +	/*
>>> +	 * Use io_schedule_prepare/finish, so cpufreq can take into account
>>> +	 * that the task is waiting for IO - turns out to be important for low
>>> +	 * QD IO.
>>> +	 */
>>> +	token = io_schedule_prepare();
>>> +	ret = 0;
>>>  	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
>>> -		return -ETIME;
>>> -	return 1;
>>> +		ret = -ETIME;
>>> +	io_schedule_finish(token);
>>> +	return ret;
>>>  }
>> 
>> To me it looks like this might have changed more than intended? Previously
>> io_cqring_wait_schedule() returned 1 in case schedule_hrtimeout() returned
>> non-zero, now io_cqring_wait_schedule() returns 0 in that case? Am I missing
>> something?
> 
> Ah shoot yes indeed. Greg, can you drop the 5.10/5.15/6.1 ones for now?
> I'll get it sorted tomorrow. Sorry about that, and thanks for catching
> that Andres!

Greg, can you pick up these two for 5.10-stable and 5.15-stable? While
running testing, noticed another backport that was missing, so added
that as well.

-- 
Jens Axboe

[-- Attachment #2: 0002-io_uring-add-reschedule-point-to-handle_tw_list.patch --]
[-- Type: text/x-patch, Size: 1157 bytes --]

From 4e214e7e01158a87308a17766706159bca472855 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Mon, 17 Jul 2023 10:27:20 -0600
Subject: [PATCH 2/2] io_uring: add reschedule point to handle_tw_list()

Commit f58680085478dd292435727210122960d38e8014 upstream.

If CONFIG_PREEMPT_NONE is set and the task_work chains are long, we
could be running into issues blocking others for too long. Add a
reschedule check in handle_tw_list(), and flush the ctx if we need to
reschedule.

Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 33d4a2871dbb..eae7a3d89397 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2216,9 +2216,12 @@ static void tctx_task_work(struct callback_head *cb)
 			}
 			req->io_task_work.func(req, &locked);
 			node = next;
+			if (unlikely(need_resched())) {
+				ctx_flush_and_put(ctx, &locked);
+				ctx = NULL;
+				cond_resched();
+			}
 		} while (node);
-
-		cond_resched();
 	}
 
 	ctx_flush_and_put(ctx, &locked);
-- 
2.40.1

[-- Attachment #3: 0001-io_uring-Use-io_schedule-in-cqring-wait.patch --]
[-- Type: text/x-patch, Size: 2770 bytes --]

From c8c88d523c89e0ac8affbf2fd57def82e0d5d4bf Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Jul 2023 12:07:03 -0600
Subject: [PATCH 1/2] io_uring: Use io_schedule* in cqring wait

Commit 8a796565cec3601071cbbd27d6304e202019d014 upstream.

I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().

The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0

This is repeatable with different filesystems, using raw block devices
and using different block devices.

Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.

After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).

There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.

Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e633799c9cea..33d4a2871dbb 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -7785,7 +7785,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 					  struct io_wait_queue *iowq,
 					  ktime_t *timeout)
 {
-	int ret;
+	int token, ret;
 
 	/* make sure we run task_work before checking for signals */
 	ret = io_run_task_work_sig();
@@ -7795,9 +7795,17 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	if (test_bit(0, &ctx->check_cq_overflow))
 		return 1;
 
+	/*
+	 * Use io_schedule_prepare/finish, so cpufreq can take into account
+	 * that the task is waiting for IO - turns out to be important for low
+	 * QD IO.
+	 */
+	token = io_schedule_prepare();
+	ret = 1;
 	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
-		return -ETIME;
-	return 1;
+		ret = -ETIME;
+	io_schedule_finish(token);
+	return ret;
 }
 
 /*
-- 
2.40.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16 18:13 ` Jens Axboe
  2023-07-16 19:11   ` Andres Freund
@ 2023-07-17 16:39   ` Jens Axboe
  2023-07-17 17:33     ` Andres Freund
  2023-07-17 20:12     ` Greg KH
  1 sibling, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-17 16:39 UTC (permalink / raw)
  To: gregkh, andres, asml.silence; +Cc: stable

[-- Attachment #1: Type: text/plain, Size: 900 bytes --]

On 7/16/23 12:13 PM, Jens Axboe wrote:
> On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
>> 
>> The patch below does not apply to the 6.1-stable tree.
>> If someone wants it applied there, or to any other stable or longterm
>> tree, then please email the backport, including the original git commit
>> id to <stable@vger.kernel.org>.
>> 
>> To reproduce the conflict and resubmit, you may use the following commands:
>> 
>> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
>> git checkout FETCH_HEAD
>> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
>> # <resolve conflicts, build, test, etc.>
>> git commit -s
>> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
> 
> Here's one for 6.1-stable.

And here's a corrected one for 6.1.

-- 
Jens Axboe

[-- Attachment #2: 0001-io_uring-Use-io_schedule-in-cqring-wait.patch --]
[-- Type: text/x-patch, Size: 2775 bytes --]

From f5f24ec27340daf12177fd09c2d107a589cbf527 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Jul 2023 12:13:06 -0600
Subject: [PATCH] io_uring: Use io_schedule* in cqring wait

Commit 8a796565cec3601071cbbd27d6304e202019d014 upstream.

I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().

The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0

This is repeatable with different filesystems, using raw block devices
and using different block devices.

Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.

After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).

There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.

Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cc35aba1e495..6d7b358e71f1 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 					  struct io_wait_queue *iowq,
 					  ktime_t *timeout)
 {
-	int ret;
+	int token, ret;
 	unsigned long check_cq;
 
 	/* make sure we run task_work before checking for signals */
@@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
 			return -EBADR;
 	}
+
+	/*
+	 * Use io_schedule_prepare/finish, so cpufreq can take into account
+	 * that the task is waiting for IO - turns out to be important for low
+	 * QD IO.
+	 */
+	token = io_schedule_prepare();
+	ret = 1;
 	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
-		return -ETIME;
-	return 1;
+		ret = -ETIME;
+	io_schedule_finish(token);
+	return ret;
 }
 
 /*
-- 
2.40.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-17 16:39   ` Jens Axboe
@ 2023-07-17 17:33     ` Andres Freund
  2023-07-17 20:12     ` Greg KH
  0 siblings, 0 replies; 10+ messages in thread
From: Andres Freund @ 2023-07-17 17:33 UTC (permalink / raw)
  To: Jens Axboe; +Cc: gregkh, asml.silence, stable

Hi,

On 2023-07-17 10:39:51 -0600, Jens Axboe wrote:
> And here's a corrected one for 6.1.

Thanks! LGTM.

Greetings,

Andres Freund

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-17 16:39   ` Jens Axboe
  2023-07-17 17:33     ` Andres Freund
@ 2023-07-17 20:12     ` Greg KH
  2023-07-17 20:13       ` Jens Axboe
  1 sibling, 1 reply; 10+ messages in thread
From: Greg KH @ 2023-07-17 20:12 UTC (permalink / raw)
  To: Jens Axboe; +Cc: andres, asml.silence, stable

On Mon, Jul 17, 2023 at 10:39:51AM -0600, Jens Axboe wrote:
> On 7/16/23 12:13 PM, Jens Axboe wrote:
> > On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
> >>
> >> The patch below does not apply to the 6.1-stable tree.
> >> If someone wants it applied there, or to any other stable or longterm
> >> tree, then please email the backport, including the original git commit
> >> id to <stable@vger.kernel.org>.
> >>
> >> To reproduce the conflict and resubmit, you may use the following commands:
> >>
> >> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
> >> git checkout FETCH_HEAD
> >> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
> >> # <resolve conflicts, build, test, etc.>
> >> git commit -s
> >> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
> >
> > Here's one for 6.1-stable.
> 
> And here's a corrected one for 6.1.

All now queued up, thanks.

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-17 20:12     ` Greg KH
@ 2023-07-17 20:13       ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-17 20:13 UTC (permalink / raw)
  To: Greg KH; +Cc: andres, asml.silence, stable

On 7/17/23 2:12?PM, Greg KH wrote:
> On Mon, Jul 17, 2023 at 10:39:51AM -0600, Jens Axboe wrote:
>> On 7/16/23 12:13?PM, Jens Axboe wrote:
>>> On 7/16/23 2:41?AM, gregkh@linuxfoundation.org wrote:
>>>>
>>>> The patch below does not apply to the 6.1-stable tree.
>>>> If someone wants it applied there, or to any other stable or longterm
>>>> tree, then please email the backport, including the original git commit
>>>> id to <stable@vger.kernel.org>.
>>>>
>>>> To reproduce the conflict and resubmit, you may use the following commands:
>>>>
>>>> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
>>>> git checkout FETCH_HEAD
>>>> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
>>>> # <resolve conflicts, build, test, etc.>
>>>> git commit -s
>>>> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
>>>
>>> Here's one for 6.1-stable.
>>
>> And here's a corrected one for 6.1.
> 
> All now queued up, thanks.

Great, thanks Greg!

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-07-17 20:13 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-16  8:41 FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree gregkh
2023-07-16 18:13 ` Jens Axboe
2023-07-16 19:11   ` Andres Freund
2023-07-16 19:19     ` Jens Axboe
2023-07-16 19:29       ` Greg KH
2023-07-17 16:32       ` Jens Axboe
2023-07-17 16:39   ` Jens Axboe
2023-07-17 17:33     ` Andres Freund
2023-07-17 20:12     ` Greg KH
2023-07-17 20:13       ` Jens Axboe