public inbox for stable@vger.kernel.org
* FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
@ 2023-07-16  8:41 gregkh
  2023-07-16 18:13 ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: gregkh @ 2023-07-16  8:41 UTC (permalink / raw)
  To: andres, asml.silence, axboe; +Cc: stable


The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..

Possible dependencies:

8a796565cec3 ("io_uring: Use io_schedule* in cqring wait")
d33a39e57768 ("io_uring: keep timeout in io_wait_queue")
46ae7eef44f6 ("io_uring: optimise non-timeout waiting")
846072f16eed ("io_uring: mimimise io_cqring_wait_schedule")
3fcf19d592d5 ("io_uring: parse check_cq out of wq waiting")
12521a5d5cb7 ("io_uring: fix CQ waiting timeout handling")
52ea806ad983 ("io_uring: finish waiting before flushing overflow entries")
35d90f95cfa7 ("io_uring: include task_work run after scheduling in wait for events")
1b346e4aa8e7 ("io_uring: don't check overflow flush failures")
a85381d8326d ("io_uring: skip overflow CQE posting for dying ring")

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 8a796565cec3601071cbbd27d6304e202019d014 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Fri, 7 Jul 2023 09:20:07 -0700
Subject: [PATCH] io_uring: Use io_schedule* in cqring wait

I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().

The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0

This is repeatable with different filesystems, using raw block devices
and using different block devices.

Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.

After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).

There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.

Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e8096d502a7c..7505de2428e0 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2489,6 +2489,8 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
 static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 					  struct io_wait_queue *iowq)
 {
+	int token, ret;
+
 	if (unlikely(READ_ONCE(ctx->check_cq)))
 		return 1;
 	if (unlikely(!llist_empty(&ctx->work_llist)))
@@ -2499,11 +2501,20 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 		return -EINTR;
 	if (unlikely(io_should_wake(iowq)))
 		return 0;
+
+	/*
+	 * Use io_schedule_prepare/finish, so cpufreq can take into account
+	 * that the task is waiting for IO - turns out to be important for low
+	 * QD IO.
+	 */
+	token = io_schedule_prepare();
+	ret = 0;
 	if (iowq->timeout == KTIME_MAX)
 		schedule();
 	else if (!schedule_hrtimeout(&iowq->timeout, HRTIMER_MODE_ABS))
-		return -ETIME;
-	return 0;
+		ret = -ETIME;
+	io_schedule_finish(token);
+	return ret;
 }
 
 /*


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16  8:41 FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree gregkh
@ 2023-07-16 18:13 ` Jens Axboe
  2023-07-16 19:11   ` Andres Freund
  2023-07-17 16:39   ` Jens Axboe
  0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-16 18:13 UTC (permalink / raw)
  To: gregkh, andres, asml.silence; +Cc: stable

[-- Attachment #1: Type: text/plain, Size: 805 bytes --]

On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
> 
> The patch below does not apply to the 6.1-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@vger.kernel.org>.
> 
> To reproduce the conflict and resubmit, you may use the following commands:
> 
> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
> git checkout FETCH_HEAD
> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
> # <resolve conflicts, build, test, etc.>
> git commit -s
> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..

Here's one for 6.1-stable.

-- 
Jens Axboe


[-- Attachment #2: 0001-io_uring-Use-io_schedule-in-cqring-wait.patch --]
[-- Type: text/x-patch, Size: 2716 bytes --]

From 71fc76b239a1c980c11821916d1d6785bc177c5c Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Jul 2023 12:13:06 -0600
Subject: [PATCH] io_uring: Use io_schedule* in cqring wait

I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().

The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0

This is repeatable with different filesystems, using raw block devices
and using different block devices.

Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.

After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).

There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.

Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cc35aba1e495..de117d3424b2 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 					  struct io_wait_queue *iowq,
 					  ktime_t *timeout)
 {
-	int ret;
+	int token, ret;
 	unsigned long check_cq;
 
 	/* make sure we run task_work before checking for signals */
@@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
 			return -EBADR;
 	}
+
+	/*
+	 * Use io_schedule_prepare/finish, so cpufreq can take into account
+	 * that the task is waiting for IO - turns out to be important for low
+	 * QD IO.
+	 */
+	token = io_schedule_prepare();
+	ret = 0;
 	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
-		return -ETIME;
-	return 1;
+		ret = -ETIME;
+	io_schedule_finish(token);
+	return ret;
 }
 
 /*
-- 
2.40.1



* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16 18:13 ` Jens Axboe
@ 2023-07-16 19:11   ` Andres Freund
  2023-07-16 19:19     ` Jens Axboe
  2023-07-17 16:39   ` Jens Axboe
  1 sibling, 1 reply; 10+ messages in thread
From: Andres Freund @ 2023-07-16 19:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: gregkh, asml.silence, stable

Hi,

On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
> Here's one for 6.1-stable.

Thanks for working on that!


> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index cc35aba1e495..de117d3424b2 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>  					  struct io_wait_queue *iowq,
>  					  ktime_t *timeout)
>  {
> -	int ret;
> +	int token, ret;
>  	unsigned long check_cq;
>  
>  	/* make sure we run task_work before checking for signals */
> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>  		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
>  			return -EBADR;
>  	}
> +
> +	/*
> +	 * Use io_schedule_prepare/finish, so cpufreq can take into account
> +	 * that the task is waiting for IO - turns out to be important for low
> +	 * QD IO.
> +	 */
> +	token = io_schedule_prepare();
> +	ret = 0;
>  	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
> -		return -ETIME;
> -	return 1;
> +		ret = -ETIME;
> +	io_schedule_finish(token);
> +	return ret;
>  }

To me it looks like this might have changed more than intended? Previously
io_cqring_wait_schedule() returned 1 in case schedule_hrtimeout() returned
non-zero, now io_cqring_wait_schedule() returns 0 in that case?  Am I missing
something?

Greetings,

Andres Freund


* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16 19:11   ` Andres Freund
@ 2023-07-16 19:19     ` Jens Axboe
  2023-07-16 19:29       ` Greg KH
  2023-07-17 16:32       ` Jens Axboe
  0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-16 19:19 UTC (permalink / raw)
  To: Andres Freund; +Cc: gregkh, asml.silence, stable

On 7/16/23 1:11 PM, Andres Freund wrote:
> Hi,
> 
> On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
>> Here's one for 6.1-stable.
> 
> Thanks for working on that!
> 
> 
>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index cc35aba1e495..de117d3424b2 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>>  					  struct io_wait_queue *iowq,
>>  					  ktime_t *timeout)
>>  {
>> -	int ret;
>> +	int token, ret;
>>  	unsigned long check_cq;
>>  
>>  	/* make sure we run task_work before checking for signals */
>> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>>  		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
>>  			return -EBADR;
>>  	}
>> +
>> +	/*
>> +	 * Use io_schedule_prepare/finish, so cpufreq can take into account
>> +	 * that the task is waiting for IO - turns out to be important for low
>> +	 * QD IO.
>> +	 */
>> +	token = io_schedule_prepare();
>> +	ret = 0;
>>  	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
>> -		return -ETIME;
>> -	return 1;
>> +		ret = -ETIME;
>> +	io_schedule_finish(token);
>> +	return ret;
>>  }
> 
> To me it looks like this might have changed more than intended? Previously
> io_cqring_wait_schedule() returned 1 in case schedule_hrtimeout() returned
> non-zero, now io_cqring_wait_schedule() returns 0 in that case?  Am I missing
> something?

Ah shoot yes indeed. Greg, can you drop the 5.10/5.15/6.1 ones for now?
I'll get it sorted tomorrow. Sorry about that, and thanks for catching
that Andres!

-- 
Jens Axboe



* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16 19:19     ` Jens Axboe
@ 2023-07-16 19:29       ` Greg KH
  2023-07-17 16:32       ` Jens Axboe
  1 sibling, 0 replies; 10+ messages in thread
From: Greg KH @ 2023-07-16 19:29 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andres Freund, asml.silence, stable

On Sun, Jul 16, 2023 at 01:19:31PM -0600, Jens Axboe wrote:
> On 7/16/23 1:11 PM, Andres Freund wrote:
> > Hi,
> > 
> > On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
> >> Here's one for 6.1-stable.
> > 
> > Thanks for working on that!
> > 
> > 
> >> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> >> index cc35aba1e495..de117d3424b2 100644
> >> --- a/io_uring/io_uring.c
> >> +++ b/io_uring/io_uring.c
> >> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
> >>  					  struct io_wait_queue *iowq,
> >>  					  ktime_t *timeout)
> >>  {
> >> -	int ret;
> >> +	int token, ret;
> >>  	unsigned long check_cq;
> >>  
> >>  	/* make sure we run task_work before checking for signals */
> >> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
> >>  		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
> >>  			return -EBADR;
> >>  	}
> >> +
> >> +	/*
> >> +	 * Use io_schedule_prepare/finish, so cpufreq can take into account
> >> +	 * that the task is waiting for IO - turns out to be important for low
> >> +	 * QD IO.
> >> +	 */
> >> +	token = io_schedule_prepare();
> >> +	ret = 0;
> >>  	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
> >> -		return -ETIME;
> >> -	return 1;
> >> +		ret = -ETIME;
> >> +	io_schedule_finish(token);
> >> +	return ret;
> >>  }
> > 
> > To me it looks like this might have changed more than intended? Previously
> > io_cqring_wait_schedule() returned 1 in case schedule_hrtimeout() returned
> > non-zero, now io_cqring_wait_schedule() returns 0 in that case?  Am I missing
> > something?
> 
> Ah shoot yes indeed. Greg, can you drop the 5.10/5.15/6.1 ones for now?
> I'll get it sorted tomorrow. Sorry about that, and thanks for catching
> that Andres!

Sure, will go drop it now, thanks.

greg k-h


* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16 19:19     ` Jens Axboe
  2023-07-16 19:29       ` Greg KH
@ 2023-07-17 16:32       ` Jens Axboe
  1 sibling, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-17 16:32 UTC (permalink / raw)
  To: Andres Freund; +Cc: gregkh, asml.silence, stable

[-- Attachment #1: Type: text/plain, Size: 1871 bytes --]

On 7/16/23 1:19 PM, Jens Axboe wrote:
> On 7/16/23 1:11 PM, Andres Freund wrote:
>> Hi,
>>
>> On 2023-07-16 12:13:45 -0600, Jens Axboe wrote:
>>> Here's one for 6.1-stable.
>>
>> Thanks for working on that!
>>
>>
>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>> index cc35aba1e495..de117d3424b2 100644
>>> --- a/io_uring/io_uring.c
>>> +++ b/io_uring/io_uring.c
>>> @@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>>>  					  struct io_wait_queue *iowq,
>>>  					  ktime_t *timeout)
>>>  {
>>> -	int ret;
>>> +	int token, ret;
>>>  	unsigned long check_cq;
>>>  
>>>  	/* make sure we run task_work before checking for signals */
>>> @@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
>>>  		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
>>>  			return -EBADR;
>>>  	}
>>> +
>>> +	/*
>>> +	 * Use io_schedule_prepare/finish, so cpufreq can take into account
>>> +	 * that the task is waiting for IO - turns out to be important for low
>>> +	 * QD IO.
>>> +	 */
>>> +	token = io_schedule_prepare();
>>> +	ret = 0;
>>>  	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
>>> -		return -ETIME;
>>> -	return 1;
>>> +		ret = -ETIME;
>>> +	io_schedule_finish(token);
>>> +	return ret;
>>>  }
>>
>> To me it looks like this might have changed more than intended? Previously
>> io_cqring_wait_schedule() returned 1 in case schedule_hrtimeout() returned
>> non-zero, now io_cqring_wait_schedule() returns 0 in that case?  Am I missing
>> something?
> 
> Ah shoot yes indeed. Greg, can you drop the 5.10/5.15/6.1 ones for now?
> I'll get it sorted tomorrow. Sorry about that, and thanks for catching
> that Andres!

Greg, can you pick up these two for 5.10-stable and 5.15-stable? While
running testing, I noticed another backport that was missing, so added
that as well.

-- 
Jens Axboe

[-- Attachment #2: 0002-io_uring-add-reschedule-point-to-handle_tw_list.patch --]
[-- Type: text/x-patch, Size: 1157 bytes --]

From 4e214e7e01158a87308a17766706159bca472855 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Mon, 17 Jul 2023 10:27:20 -0600
Subject: [PATCH 2/2] io_uring: add reschedule point to handle_tw_list()

Commit f58680085478dd292435727210122960d38e8014 upstream.

If CONFIG_PREEMPT_NONE is set and the task_work chains are long, we
could be running into issues blocking others for too long. Add a
reschedule check in handle_tw_list(), and flush the ctx if we need to
reschedule.

Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 33d4a2871dbb..eae7a3d89397 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2216,9 +2216,12 @@ static void tctx_task_work(struct callback_head *cb)
 			}
 			req->io_task_work.func(req, &locked);
 			node = next;
+			if (unlikely(need_resched())) {
+				ctx_flush_and_put(ctx, &locked);
+				ctx = NULL;
+				cond_resched();
+			}
 		} while (node);
-
-		cond_resched();
 	}
 
 	ctx_flush_and_put(ctx, &locked);
-- 
2.40.1


[-- Attachment #3: 0001-io_uring-Use-io_schedule-in-cqring-wait.patch --]
[-- Type: text/x-patch, Size: 2770 bytes --]

From c8c88d523c89e0ac8affbf2fd57def82e0d5d4bf Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Jul 2023 12:07:03 -0600
Subject: [PATCH 1/2] io_uring: Use io_schedule* in cqring wait

Commit 8a796565cec3601071cbbd27d6304e202019d014 upstream.

I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().

The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0

This is repeatable with different filesystems, using raw block devices
and using different block devices.

Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.

After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).

There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.

Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e633799c9cea..33d4a2871dbb 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -7785,7 +7785,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 					  struct io_wait_queue *iowq,
 					  ktime_t *timeout)
 {
-	int ret;
+	int token, ret;
 
 	/* make sure we run task_work before checking for signals */
 	ret = io_run_task_work_sig();
@@ -7795,9 +7795,17 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	if (test_bit(0, &ctx->check_cq_overflow))
 		return 1;
 
+	/*
+	 * Use io_schedule_prepare/finish, so cpufreq can take into account
+	 * that the task is waiting for IO - turns out to be important for low
+	 * QD IO.
+	 */
+	token = io_schedule_prepare();
+	ret = 1;
 	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
-		return -ETIME;
-	return 1;
+		ret = -ETIME;
+	io_schedule_finish(token);
+	return ret;
 }
 
 /*
-- 
2.40.1



* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-16 18:13 ` Jens Axboe
  2023-07-16 19:11   ` Andres Freund
@ 2023-07-17 16:39   ` Jens Axboe
  2023-07-17 17:33     ` Andres Freund
  2023-07-17 20:12     ` Greg KH
  1 sibling, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-17 16:39 UTC (permalink / raw)
  To: gregkh, andres, asml.silence; +Cc: stable

[-- Attachment #1: Type: text/plain, Size: 900 bytes --]

On 7/16/23 12:13 PM, Jens Axboe wrote:
> On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
>>
>> The patch below does not apply to the 6.1-stable tree.
>> If someone wants it applied there, or to any other stable or longterm
>> tree, then please email the backport, including the original git commit
>> id to <stable@vger.kernel.org>.
>>
>> To reproduce the conflict and resubmit, you may use the following commands:
>>
>> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
>> git checkout FETCH_HEAD
>> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
>> # <resolve conflicts, build, test, etc.>
>> git commit -s
>> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
> 
> Here's one for 6.1-stable.

And here's a corrected one for 6.1.

-- 
Jens Axboe


[-- Attachment #2: 0001-io_uring-Use-io_schedule-in-cqring-wait.patch --]
[-- Type: text/x-patch, Size: 2775 bytes --]

From f5f24ec27340daf12177fd09c2d107a589cbf527 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sun, 16 Jul 2023 12:13:06 -0600
Subject: [PATCH] io_uring: Use io_schedule* in cqring wait

Commit 8a796565cec3601071cbbd27d6304e202019d014 upstream.

I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().

The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0

This is repeatable with different filesystems, using raw block devices
and using different block devices.

Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.

After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).

There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.

Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cc35aba1e495..6d7b358e71f1 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2346,7 +2346,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 					  struct io_wait_queue *iowq,
 					  ktime_t *timeout)
 {
-	int ret;
+	int token, ret;
 	unsigned long check_cq;
 
 	/* make sure we run task_work before checking for signals */
@@ -2362,9 +2362,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
 			return -EBADR;
 	}
+
+	/*
+	 * Use io_schedule_prepare/finish, so cpufreq can take into account
+	 * that the task is waiting for IO - turns out to be important for low
+	 * QD IO.
+	 */
+	token = io_schedule_prepare();
+	ret = 1;
 	if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
-		return -ETIME;
-	return 1;
+		ret = -ETIME;
+	io_schedule_finish(token);
+	return ret;
 }
 
 /*
-- 
2.40.1



* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-17 16:39   ` Jens Axboe
@ 2023-07-17 17:33     ` Andres Freund
  2023-07-17 20:12     ` Greg KH
  1 sibling, 0 replies; 10+ messages in thread
From: Andres Freund @ 2023-07-17 17:33 UTC (permalink / raw)
  To: Jens Axboe; +Cc: gregkh, asml.silence, stable

Hi,

On 2023-07-17 10:39:51 -0600, Jens Axboe wrote:
> And here's a corrected one for 6.1.

Thanks! LGTM.

Greetings,

Andres Freund


* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-17 16:39   ` Jens Axboe
  2023-07-17 17:33     ` Andres Freund
@ 2023-07-17 20:12     ` Greg KH
  2023-07-17 20:13       ` Jens Axboe
  1 sibling, 1 reply; 10+ messages in thread
From: Greg KH @ 2023-07-17 20:12 UTC (permalink / raw)
  To: Jens Axboe; +Cc: andres, asml.silence, stable

On Mon, Jul 17, 2023 at 10:39:51AM -0600, Jens Axboe wrote:
> On 7/16/23 12:13 PM, Jens Axboe wrote:
> > On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
> >>
> >> The patch below does not apply to the 6.1-stable tree.
> >> If someone wants it applied there, or to any other stable or longterm
> >> tree, then please email the backport, including the original git commit
> >> id to <stable@vger.kernel.org>.
> >>
> >> To reproduce the conflict and resubmit, you may use the following commands:
> >>
> >> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
> >> git checkout FETCH_HEAD
> >> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
> >> # <resolve conflicts, build, test, etc.>
> >> git commit -s
> >> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
> > 
> > Here's one for 6.1-stable.
> 
> And here's a corrected one for 6.1.

All now queued up, thanks.

greg k-h


* Re: FAILED: patch "[PATCH] io_uring: Use io_schedule* in cqring wait" failed to apply to 6.1-stable tree
  2023-07-17 20:12     ` Greg KH
@ 2023-07-17 20:13       ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-07-17 20:13 UTC (permalink / raw)
  To: Greg KH; +Cc: andres, asml.silence, stable

On 7/17/23 2:12 PM, Greg KH wrote:
> On Mon, Jul 17, 2023 at 10:39:51AM -0600, Jens Axboe wrote:
>> On 7/16/23 12:13 PM, Jens Axboe wrote:
>>> On 7/16/23 2:41 AM, gregkh@linuxfoundation.org wrote:
>>>>
>>>> The patch below does not apply to the 6.1-stable tree.
>>>> If someone wants it applied there, or to any other stable or longterm
>>>> tree, then please email the backport, including the original git commit
>>>> id to <stable@vger.kernel.org>.
>>>>
>>>> To reproduce the conflict and resubmit, you may use the following commands:
>>>>
>>>> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
>>>> git checkout FETCH_HEAD
>>>> git cherry-pick -x 8a796565cec3601071cbbd27d6304e202019d014
>>>> # <resolve conflicts, build, test, etc.>
>>>> git commit -s
>>>> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2023071620-litigate-debunk-939a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
>>>
>>> Here's one for 6.1-stable.
>>
>> And here's a corrected one for 6.1.
> 
> All now queued up, thanks.

Great, thanks Greg!

-- 
Jens Axboe


