* FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
@ 2023-07-16 8:41 gregkh
2023-07-16 18:13 ` Jens Axboe
0 siblings, 1 reply; 10+ messages in thread
From: gregkh @ 2023-07-16 8:41 UTC (permalink / raw)
To: andres, asml.silence, axboe; +Cc: stable
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
8a796565cec3 ("io_uring: Use io_schedule* in cqring wait")
d33a39e57768 ("io_uring: keep timeout in io_wait_queue")
46ae7eef44f6 ("io_uring: optimise non-timeout waiting")
846072f16eed ("io_uring: mimimise io_cqring_wait_schedule")
3fcf19d592d5 ("io_uring: parse check_cq out of wq waiting")
12521a5d5cb7 ("io_uring: fix CQ waiting timeout handling")
52ea806ad983 ("io_uring: finish waiting before flushing overflow entries")
35d90f95cfa7 ("io_uring: include task_work run after scheduling in wait for events")
1b346e4aa8e7 ("io_uring: don't check overflow flush failures")
a85381d8326d ("io_uring: skip overflow CQE posting for dying ring")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8a796565cec3601071cbbd27d6304e202019d014 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Fri, 7 Jul 2023 09:20:07 -0700
Subject: [PATCH] io_uring: Use io_schedule* in cqring wait
I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().
The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0
This is repeatable with different filesystems, using raw block devices
and using different block devices.
Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.
After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).
There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.
Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e8096d502a7c..7505de2428e0 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2489,6 +2489,8 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq)
{
+ int token, ret;
+
if (unlikely(READ_ONCE(ctx->check_cq)))
return 1;
if (unlikely(!llist_empty(&ctx->work_llist)))
@@ -2499,11 +2501,20 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
return -EINTR;
if (unlikely(io_should_wake(iowq)))
return 0;
+
+ /*
+ * Use io_schedule_prepare/finish, so cpufreq can take into account
+ * that the task is waiting for IO - turns out to be important for low
+ * QD IO.
+ */
+ token = io_schedule_prepare();
+ ret = 0;
if (iowq->timeout == KTIME_MAX)
schedule();
else if (!schedule_hrtimeout(&iowq->timeout, HRTIMER_MODE_ABS))
- return -ETIME;
- return 0;
+ ret = -ETIME;
+ io_schedule_finish(token);
+ return ret;
}
/*
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
2023-07-16 8:41 FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree gregkh
@ 2023-07-16 18:13 ` Jens Axboe
2023-07-16 19:11 ` Andres Freund
2023-07-17 16:39 ` Jens Axboe
0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-16 18:13 UTC (permalink / raw)
To: gregkh, andres, asml.silence; +Cc: stable
[-- Attachment #1: Type: text/plain, Size: 805 bytes --]
On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
>
> The patch below does not apply to the 6.1-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@vger.kernel.org>.
>
> To reproduce the conflict and resubmit, you may use the following commands:
>
> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
> git checkout FETCH_HEAD
> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
> # <resolve conflicts, build, test, etc.>
> git commit -s
> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Here's one for 6.1-stable.
--
Jens Axboe
[-- Attachment #2: 0001-io_uring-Use-io_schedule-in-cqring-wait.patch --]
[-- Type: text/x-patch, Size: 2716 bytes --]
From 71fc76b239a1c980c11821916d1d6785bc177c5c Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Jul 2023 12:13:06 -0600
Subject: [PATCH] io_uring: Use io_schedule* in cqring wait
I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().
The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0
This is repeatable with different filesystems, using raw block devices
and using different block devices.
Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.
After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).
There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.
Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
io_uring/io_uring.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cc35aba1e495..de117d3424b2 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq,
ktime_t *timeout)
{
- int ret;
+ int token, ret;
unsigned long check_cq;
/* make sure we run task_work before checking for signals */
@@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
return -EBADR;
}
+
+ /*
+ * Use io_schedule_prepare/finish, so cpufreq can take into account
+ * that the task is waiting for IO - turns out to be important for low
+ * QD IO.
+ */
+ token = io_schedule_prepare();
+ ret = 0;
if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
- return -ETIME;
- return 1;
+ ret = -ETIME;
+ io_schedule_finish(token);
+ return ret;
}
/*
--
2.40.1
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
2023-07-16 18:13 ` Jens Axboe
@ 2023-07-16 19:11 ` Andres Freund
2023-07-16 19:19 ` Jens Axboe
2023-07-17 16:39 ` Jens Axboe
1 sibling, 1 reply; 10+ messages in thread
From: Andres Freund @ 2023-07-16 19:11 UTC (permalink / raw)
To: Jens Axboe; +Cc: gregkh, asml.silence, stable
Hi,
On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
> Here's one for 6.1-stable.
Thanks for working on that!
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index cc35aba1e495..de117d3424b2 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
> struct io_wait_queue *iowq,
> ktime_t *timeout)
> {
> - int ret;
> + int token, ret;
> unsigned long check_cq;
>
> /* make sure we run task_work before checking for signals */
> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
> if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
> return -EBADR;
> }
> +
> + /*
> + * Use io_schedule_prepare/finish, so cpufreq can take into account
> + * that the task is waiting for IO - turns out to be important for low
> + * QD IO.
> + */
> + token = io_schedule_prepare();
> + ret = 0;
> if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
> - return -ETIME;
> - return 1;
> + ret = -ETIME;
> + io_schedule_finish(token);
> + return ret;
> }
To me it looks like this might have changed more than intended? Previously
io_cqring_wait_schedule() returned 0 in case schedule_hrtimeout() returned
non-zero, now io_cqring_wait_schedule() returns 1 in that case? Am I missing
something?
Greetings,
Andres Freund
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
2023-07-16 19:11 ` Andres Freund
@ 2023-07-16 19:19 ` Jens Axboe
2023-07-16 19:29 ` Greg KH
2023-07-17 16:32 ` Jens Axboe
0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-16 19:19 UTC (permalink / raw)
To: Andres Freund; +Cc: gregkh, asml.silence, stable
On 7/16/23 1:11 PM, Andres Freund wrote:
> Hi,
>
> On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
>> Here's one for 6.1-stable.
>
> Thanks for working on that!
>
>
>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index cc35aba1e495..de117d3424b2 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>> struct io_wait_queue *iowq,
>> ktime_t *timeout)
>> {
>> - int ret;
>> + int token, ret;
>> unsigned long check_cq;
>>
>> /* make sure we run task_work before checking for signals */
>> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>> if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
>> return -EBADR;
>> }
>> +
>> + /*
>> + * Use io_schedule_prepare/finish, so cpufreq can take into account
>> + * that the task is waiting for IO - turns out to be important for low
>> + * QD IO.
>> + */
>> + token = io_schedule_prepare();
>> + ret = 0;
>> if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
>> - return -ETIME;
>> - return 1;
>> + ret = -ETIME;
>> + io_schedule_finish(token);
>> + return ret;
>> }
>
> To me it looks like this might have changed more than intended? Previously
> io_cqring_wait_schedule() returned 0 in case schedule_hrtimeout() returned
> non-zero, now io_cqring_wait_schedule() returns 1 in that case? Am I missing
> something?
Ah shoot yes indeed. Greg, can you drop the 5.10/5.15/6.1 ones for now?
I'll get it sorted tomorrow. Sorry about that, and thanks for catching
that Andres!
--
Jens Axboe
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
2023-07-16 19:19 ` Jens Axboe
@ 2023-07-16 19:29 ` Greg KH
2023-07-17 16:32 ` Jens Axboe
1 sibling, 0 replies; 10+ messages in thread
From: Greg KH @ 2023-07-16 19:29 UTC (permalink / raw)
To: Jens Axboe; +Cc: Andres Freund, asml.silence, stable
On Sun, Jul 16, 2023 at 01:19:31PM -0600, Jens Axboe wrote:
> On 7/16/23 1:11 PM, Andres Freund wrote:
> > Hi,
> >
> > On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
> >> Here's one for 6.1-stable.
> >
> > Thanks for working on that!
> >
> >
> >> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> >> index cc35aba1e495..de117d3424b2 100644
> >> --- a/io_uring/io_uring.c
> >> +++ b/io_uring/io_uring.c
> >> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
> >> struct io_wait_queue *iowq,
> >> ktime_t *timeout)
> >> {
> >> - int ret;
> >> + int token, ret;
> >> unsigned long check_cq;
> >>
> >> /* make sure we run task_work before checking for signals */
> >> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
> >> if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
> >> return -EBADR;
> >> }
> >> +
> >> + /*
> >> + * Use io_schedule_prepare/finish, so cpufreq can take into account
> >> + * that the task is waiting for IO - turns out to be important for low
> >> + * QD IO.
> >> + */
> >> + token = io_schedule_prepare();
> >> + ret = 0;
> >> if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
> >> - return -ETIME;
> >> - return 1;
> >> + ret = -ETIME;
> >> + io_schedule_finish(token);
> >> + return ret;
> >> }
> >
> > To me it looks like this might have changed more than intended? Previously
> > io_cqring_wait_schedule() returned 0 in case schedule_hrtimeout() returned
> > non-zero, now io_cqring_wait_schedule() returns 1 in that case? Am I missing
> > something?
>
> Ah shoot yes indeed. Greg, can you drop the 5.10/5.15/6.1 ones for now?
> I'll get it sorted tomorrow. Sorry about that, and thanks for catching
> that Andres!
Sure, will go drop it now, thanks.
greg k-h
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
2023-07-16 19:19 ` Jens Axboe
2023-07-16 19:29 ` Greg KH
@ 2023-07-17 16:32 ` Jens Axboe
1 sibling, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-17 16:32 UTC (permalink / raw)
To: Andres Freund; +Cc: gregkh, asml.silence, stable
[-- Attachment #1: Type: text/plain, Size: 1871 bytes --]
On 7/16/23 1:19 PM, Jens Axboe wrote:
> On 7/16/23 1:11 PM, Andres Freund wrote:
>> Hi,
>>
>> On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
>>> Here's one for 6.1-stable.
>>
>> Thanks for working on that!
>>
>>
>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>> index cc35aba1e495..de117d3424b2 100644
>>> --- a/io_uring/io_uring.c
>>> +++ b/io_uring/io_uring.c
>>> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>>> struct io_wait_queue *iowq,
>>> ktime_t *timeout)
>>> {
>>> - int ret;
>>> + int token, ret;
>>> unsigned long check_cq;
>>>
>>> /* make sure we run task_work before checking for signals */
>>> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>>> if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
>>> return -EBADR;
>>> }
>>> +
>>> + /*
>>> + * Use io_schedule_prepare/finish, so cpufreq can take into account
>>> + * that the task is waiting for IO - turns out to be important for low
>>> + * QD IO.
>>> + */
>>> + token = io_schedule_prepare();
>>> + ret = 0;
>>> if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
>>> - return -ETIME;
>>> - return 1;
>>> + ret = -ETIME;
>>> + io_schedule_finish(token);
>>> + return ret;
>>> }
>>
>> To me it looks like this might have changed more than intended? Previously
>> io_cqring_wait_schedule() returned 0 in case schedule_hrtimeout() returned
>> non-zero, now io_cqring_wait_schedule() returns 1 in that case? Am I missing
>> something?
>
> Ah shoot yes indeed. Greg, can you drop the 5.10/5.15/6.1 ones for now?
> I'll get it sorted tomorrow. Sorry about that, and thanks for catching
> that Andres!
Greg, can you pick up these two for 5.10-stable and 5.15-stable? While
running testing, noticed another backport that was missing, so added
that as well.
--
Jens Axboe
[-- Attachment #2: 0002-io_uring-add-reschedule-point-to-handle_tw_list.patch --]
[-- Type: text/x-patch, Size: 1157 bytes --]
From 4e214e7e01158a87308a17766706159bca472855 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Mon, 17 Jul 2023 10:27:20 -0600
Subject: [PATCH 2/2] io_uring: add reschedule point to handle_tw_list()
Commit f58680085478dd292435727210122960d38e8014 upstream.
If CONFIG_PREEMPT_NONE is set and the task_work chains are long, we
could be running into issues blocking others for too long. Add a
reschedule check in handle_tw_list(), and flush the ctx if we need to
reschedule.
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
io_uring/io_uring.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 33d4a2871dbb..eae7a3d89397 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2216,9 +2216,12 @@ static void tctx_task_work(struct callback_head *cb)
}
req->io_task_work.func(req, &locked);
node = next;
+ if (unlikely(need_resched())) {
+ ctx_flush_and_put(ctx, &locked);
+ ctx = NULL;
+ cond_resched();
+ }
} while (node);
-
- cond_resched();
}
ctx_flush_and_put(ctx, &locked);
--
2.40.1
[-- Attachment #3: 0001-io_uring-Use-io_schedule-in-cqring-wait.patch --]
[-- Type: text/x-patch, Size: 2770 bytes --]
From c8c88d523c89e0ac8affbf2fd57def82e0d5d4bf Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Jul 2023 12:07:03 -0600
Subject: [PATCH 1/2] io_uring: Use io_schedule* in cqring wait
Commit 8a796565cec3601071cbbd27d6304e202019d014 upstream.
I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().
The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0
This is repeatable with different filesystems, using raw block devices
and using different block devices.
Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.
After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).
There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.
Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
io_uring/io_uring.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e633799c9cea..33d4a2871dbb 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -7785,7 +7785,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq,
ktime_t *timeout)
{
- int ret;
+ int token, ret;
/* make sure we run task_work before checking for signals */
ret = io_run_task_work_sig();
@@ -7795,9 +7795,17 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
if (test_bit(0, &ctx->check_cq_overflow))
return 1;
+ /*
+ * Use io_schedule_prepare/finish, so cpufreq can take into account
+ * that the task is waiting for IO - turns out to be important for low
+ * QD IO.
+ */
+ token = io_schedule_prepare();
+ ret = 1;
if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
- return -ETIME;
- return 1;
+ ret = -ETIME;
+ io_schedule_finish(token);
+ return ret;
}
/*
--
2.40.1
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
2023-07-16 18:13 ` Jens Axboe
2023-07-16 19:11 ` Andres Freund
@ 2023-07-17 16:39 ` Jens Axboe
2023-07-17 17:33 ` Andres Freund
2023-07-17 20:12 ` Greg KH
1 sibling, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-17 16:39 UTC (permalink / raw)
To: gregkh, andres, asml.silence; +Cc: stable
[-- Attachment #1: Type: text/plain, Size: 900 bytes --]
On 7/16/23 12:13 PM, Jens Axboe wrote:
> On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
>>
>> The patch below does not apply to the 6.1-stable tree.
>> If someone wants it applied there, or to any other stable or longterm
>> tree, then please email the backport, including the original git commit
>> id to <stable@vger.kernel.org>.
>>
>> To reproduce the conflict and resubmit, you may use the following commands:
>>
>> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
>> git checkout FETCH_HEAD
>> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
>> # <resolve conflicts, build, test, etc.>
>> git commit -s
>> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
>
> Here's one for 6.1-stable.
And here's a corrected one for 6.1.
--
Jens Axboe
[-- Attachment #2: 0001-io_uring-Use-io_schedule-in-cqring-wait.patch --]
[-- Type: text/x-patch, Size: 2775 bytes --]
From f5f24ec27340daf12177fd09c2d107a589cbf527 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Jul 2023 12:13:06 -0600
Subject: [PATCH] io_uring: Use io_schedule* in cqring wait
Commit 8a796565cec3601071cbbd27d6304e202019d014 upstream.
I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().
The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0
This is repeatable with different filesystems, using raw block devices
and using different block devices.
Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.
After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).
There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.
Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
io_uring/io_uring.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cc35aba1e495..6d7b358e71f1 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq,
ktime_t *timeout)
{
- int ret;
+ int token, ret;
unsigned long check_cq;
/* make sure we run task_work before checking for signals */
@@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
return -EBADR;
}
+
+ /*
+ * Use io_schedule_prepare/finish, so cpufreq can take into account
+ * that the task is waiting for IO - turns out to be important for low
+ * QD IO.
+ */
+ token = io_schedule_prepare();
+ ret = 1;
if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
- return -ETIME;
- return 1;
+ ret = -ETIME;
+ io_schedule_finish(token);
+ return ret;
}
/*
--
2.40.1
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
2023-07-17 16:39 ` Jens Axboe
@ 2023-07-17 17:33 ` Andres Freund
2023-07-17 20:12 ` Greg KH
1 sibling, 0 replies; 10+ messages in thread
From: Andres Freund @ 2023-07-17 17:33 UTC (permalink / raw)
To: Jens Axboe; +Cc: gregkh, asml.silence, stable
Hi,
On 2023-07-17 10:39:51 -0600, Jens Axboe wrote:
> And here's a corrected one for 6.1.
Thanks! LGTM.
Greetings,
Andres Freund
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
2023-07-17 16:39 ` Jens Axboe
2023-07-17 17:33 ` Andres Freund
@ 2023-07-17 20:12 ` Greg KH
2023-07-17 20:13 ` Jens Axboe
1 sibling, 1 reply; 10+ messages in thread
From: Greg KH @ 2023-07-17 20:12 UTC (permalink / raw)
To: Jens Axboe; +Cc: andres, asml.silence, stable
On Mon, Jul 17, 2023 at 10:39:51AM -0600, Jens Axboe wrote:
> On 7/16/23 12:13 PM, Jens Axboe wrote:
> > On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
> >>
> >> The patch below does not apply to the 6.1-stable tree.
> >> If someone wants it applied there, or to any other stable or longterm
> >> tree, then please email the backport, including the original git commit
> >> id to <stable@vger.kernel.org>.
> >>
> >> To reproduce the conflict and resubmit, you may use the following commands:
> >>
> >> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
> >> git checkout FETCH_HEAD
> >> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
> >> # <resolve conflicts, build, test, etc.>
> >> git commit -s
> >> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
> >
> > Here's one for 6.1-stable.
>
> And here's a corrected one for 6.1.
All now queued up, thanks.
greg k-h
* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
2023-07-17 20:12 ` Greg KH
@ 2023-07-17 20:13 ` Jens Axboe
0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-17 20:13 UTC (permalink / raw)
To: Greg KH; +Cc: andres, asml.silence, stable
On 7/17/23 2:12 PM, Greg KH wrote:
> On Mon, Jul 17, 2023 at 10:39:51AM -0600, Jens Axboe wrote:
>> On 7/16/23 12:13 PM, Jens Axboe wrote:
>>> On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
>>>>
>>>> The patch below does not apply to the 6.1-stable tree.
>>>> If someone wants it applied there, or to any other stable or longterm
>>>> tree, then please email the backport, including the original git commit
>>>> id to <stable@vger.kernel.org>.
>>>>
>>>> To reproduce the conflict and resubmit, you may use the following commands:
>>>>
>>>> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
>>>> git checkout FETCH_HEAD
>>>> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
>>>> # <resolve conflicts, build, test, etc.>
>>>> git commit -s
>>>> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
>>>
>>> Here's one for 6.1-stable.
>>
>> And here's a corrected one for 6.1.
>
> All now queued up, thanks.
Great, thanks Greg!
--
Jens Axboe