public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
* [REGRESSION] ioprio performance hangs, bisected
@ 2024-11-25 15:44 Chris Bainbridge
  2024-11-25 17:16 ` Chris Bainbridge
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Bainbridge @ 2024-11-25 15:44 UTC (permalink / raw)
  To: hch; +Cc: LKML, axboe, bvanassche, Linux regressions mailing list,
	linux-block

The commit 6975c1a486a4 ("block: remove the ioprio field from struct
request") appears to have introduced a performance regression. Test
case is the script /etc/cron.daily/locate from package `locate` (which
is from findutils locate on Debian). The script runs:

  ionice -c 3 -p 3965
  nice -n 10 updatedb.findutils

This locate script will usually complete in about 1m20s on my machine.
But, following commit 6975c1a486a4, the script doesn't seem to
complete in a reasonable time (I waited 20 minutes). `top` shows that
no or little CPU time seems to be dedicated to the find process. Also
it seems that this affects other processes too - once find is in a
"hung" state, doing `ls -R` on the affected drive will also hang, even
though the priority of the terminal/bash/ls process hasn't changed,
and is supposedly independent of the locate/find script. I'm running
btrfs, on an external USB SSD drive.

#regzbot introduced: 6975c1a486a40446b5bc77a89d9c520f8296fd08

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] ioprio performance hangs, bisected
  2024-11-25 15:44 [REGRESSION] ioprio performance hangs, bisected Chris Bainbridge
@ 2024-11-25 17:16 ` Chris Bainbridge
  2024-11-26  6:52   ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Bainbridge @ 2024-11-25 17:16 UTC (permalink / raw)
  To: hch
  Cc: LKML, axboe, bvanassche, Linux regressions mailing list,
	linux-block, semen.protsenko

On Mon, 25 Nov 2024 at 15:44, Chris Bainbridge
<chris.bainbridge@gmail.com> wrote:
>
> The commit 6975c1a486a4 ("block: remove the ioprio field from struct
> request") appears to have introduced a performance regression. Test
> case is the script /etc/cron.daily/locate from package `locate` (which
> is from findutils locate on Debian). The script runs:

I did a bit of debugging.

The problem is the function req_get_ioprio. The commit changed:

 static inline unsigned short req_get_ioprio(struct request *req)
 {
-       return req->ioprio;
+       if (req->bio)
+               return req->bio->bi_ioprio;
+       return 0;
 }

So when req->bio is NULL this function now returns 0. But previously the values
of req->ioprio were sometimes non-zero. If I revert the commit and then
instrument req_get_ioprio to log where the new return value differs:

  static inline unsigned short req_get_ioprio(struct request *req)
 {
-       return req->ioprio;
+       if (req->bio)
+       {
+              if (req->bio->bi_ioprio != req->ioprio)
+                      printk("req->bio->bi_ioprio != req->ioprio\n");
+               return req->bio->bi_ioprio;
+       }
+       if (req->ioprio != 0)
+               printk("bad ioprio 0 != %u\n", (unsigned int)req->ioprio);
+       return 0;
 }

then log shows:

[   36.922906] bad ioprio 0 != 16387
[   36.923061] bad ioprio 0 != 16387
[   36.930186] bad ioprio 0 != 16387
[   36.930680] bad ioprio 0 != 16387
[   78.875421] bad ioprio 0 != 24583
[   79.228801] bad ioprio 0 != 24583
[   87.411118] bad ioprio 0 != 24583
[   97.419607] bad ioprio 0 != 24583
[   97.421059] bad ioprio 0 != 24583
[  107.210364] bad ioprio 0 != 24583
[  107.210775] bad ioprio 0 != 24583

So, the new function is returning 0 when it would've previously returned these
non-zero values, and returning zero breaks things.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] ioprio performance hangs, bisected
  2024-11-25 17:16 ` Chris Bainbridge
@ 2024-11-26  6:52   ` Christoph Hellwig
  2024-11-26  9:18     ` Chris Bainbridge
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2024-11-26  6:52 UTC (permalink / raw)
  To: Chris Bainbridge
  Cc: hch, LKML, axboe, bvanassche, Linux regressions mailing list,
	linux-block, semen.protsenko

On Mon, Nov 25, 2024 at 05:16:39PM +0000, Chris Bainbridge wrote:
> I did a bit of debugging.

Thanks, this was extremely helpful!

mq-deadlink not only looks at the priority in the submission path,
but also in the completion path, which is rather unexpected.  Now
for drivers that consume bios, req->bio will eventually become
NULL before the completion.

Fortunately fixing this is not only easy but also improves the
code in mq-deadline.  Can you test the patch below?

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index acdc28756d9d..91b3789f710e 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -685,10 +685,9 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 
 	prio = ioprio_class_to_prio[ioprio_class];
 	per_prio = &dd->per_prio[prio];
-	if (!rq->elv.priv[0]) {
+	if (!rq->elv.priv[0])
 		per_prio->stats.inserted++;
-		rq->elv.priv[0] = (void *)(uintptr_t)1;
-	}
+	rq->elv.priv[0] = per_prio;
 
 	if (blk_mq_sched_try_insert_merge(q, rq, free))
 		return;
@@ -753,18 +752,14 @@ static void dd_prepare_request(struct request *rq)
  */
 static void dd_finish_request(struct request *rq)
 {
-	struct request_queue *q = rq->q;
-	struct deadline_data *dd = q->elevator->elevator_data;
-	const u8 ioprio_class = dd_rq_ioclass(rq);
-	const enum dd_prio prio = ioprio_class_to_prio[ioprio_class];
-	struct dd_per_prio *per_prio = &dd->per_prio[prio];
+	struct dd_per_prio *per_prio = rq->elv.priv[0];
 
 	/*
 	 * The block layer core may call dd_finish_request() without having
 	 * called dd_insert_requests(). Skip requests that bypassed I/O
 	 * scheduling. See also blk_mq_request_bypass_insert().
 	 */
-	if (rq->elv.priv[0])
+	if (per_prio)
 		atomic_inc(&per_prio->stats.completed);
 }
 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] ioprio performance hangs, bisected
  2024-11-26  6:52   ` Christoph Hellwig
@ 2024-11-26  9:18     ` Chris Bainbridge
  2024-11-30  6:09       ` David Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Bainbridge @ 2024-11-26  9:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: LKML, axboe, bvanassche, Linux regressions mailing list,
	linux-block, semen.protsenko

On Tue, 26 Nov 2024 at 06:52, Christoph Hellwig <hch@lst.de> wrote:
>
> Fortunately fixing this is not only easy but also improves the
> code in mq-deadline.  Can you test the patch below?

Yes, the patch fixes my test case.

Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] ioprio performance hangs, bisected
  2024-11-26  9:18     ` Chris Bainbridge
@ 2024-11-30  6:09       ` David Wang
  2024-11-30 16:00         ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: David Wang @ 2024-11-30  6:09 UTC (permalink / raw)
  To: chris.bainbridge
  Cc: axboe, bvanassche, hch, linux-block, linux-kernel, regressions,
	semen.protsenko

Would fix/revert reach 6.13-rc1? I think this regression has
significant nagative impact on daily usage. From time to time,
my system would *stuck* for seconds, sometimes even when
 just `cat` some small file it would take seconds, and once,
my desktop failed to resume from a suspend, I had to power cycle the system;
Polling /proc/diskstats,  I noticed Block-IO read latency,
delta(# of milliseconds spent reading)/delta(# of reads completed),
is very high, ~200ms average, and frequently reaches 10s.
 (Strangely block-io write latency seems normal.)
My bisect also land on this commit, revert or apply this patch would
fix it.

Kind of think that 6.13-rc1 would be very unpleseant to use without
a fix/revert for this.


David


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] ioprio performance hangs, bisected
  2024-11-30  6:09       ` David Wang
@ 2024-11-30 16:00         ` Jens Axboe
  2024-11-30 16:59           ` David Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2024-11-30 16:00 UTC (permalink / raw)
  To: David Wang, chris.bainbridge
  Cc: bvanassche, hch, linux-block, linux-kernel, regressions,
	semen.protsenko

On 11/29/24 11:09 PM, David Wang wrote:
> Would fix/revert reach 6.13-rc1? I think this regression has
> significant nagative impact on daily usage. From time to time,
> my system would *stuck* for seconds, sometimes even when
>  just `cat` some small file it would take seconds, and once,
> my desktop failed to resume from a suspend, I had to power cycle the system;
> Polling /proc/diskstats,  I noticed Block-IO read latency,
> delta(# of milliseconds spent reading)/delta(# of reads completed),
> is very high, ~200ms average, and frequently reaches 10s.
>  (Strangely block-io write latency seems normal.)
> My bisect also land on this commit, revert or apply this patch would
> fix it.
> 
> Kind of think that 6.13-rc1 would be very unpleseant to use without
> a fix/revert for this.

The pull including this fix has been sent off to Linus yesterday,
it should make -rc1.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] ioprio performance hangs, bisected
  2024-11-30 16:00         ` Jens Axboe
@ 2024-11-30 16:59           ` David Wang
  0 siblings, 0 replies; 7+ messages in thread
From: David Wang @ 2024-11-30 16:59 UTC (permalink / raw)
  To: Jens Axboe
  Cc: chris.bainbridge, bvanassche, hch, linux-block, linux-kernel,
	regressions, semen.protsenko


At 2024-12-01 00:00:20, "Jens Axboe" <axboe@kernel.dk> wrote:
>On 11/29/24 11:09 PM, David Wang wrote:
>> Would fix/revert reach 6.13-rc1? I think this regression has
>> significant nagative impact on daily usage. From time to time,
>> my system would *stuck* for seconds, sometimes even when
>>  just `cat` some small file it would take seconds, and once,
>> my desktop failed to resume from a suspend, I had to power cycle the system;
>> Polling /proc/diskstats,  I noticed Block-IO read latency,
>> delta(# of milliseconds spent reading)/delta(# of reads completed),
>> is very high, ~200ms average, and frequently reaches 10s.
>>  (Strangely block-io write latency seems normal.)
>> My bisect also land on this commit, revert or apply this patch would
>> fix it.
>> 
>> Kind of think that 6.13-rc1 would be very unpleseant to use without
>> a fix/revert for this.
>
>The pull including this fix has been sent off to Linus yesterday,
>it should make -rc1.
>
>-- 
>Jens Axboe

Good to know~

Thanks
David

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-11-30 17:00 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-25 15:44 [REGRESSION] ioprio performance hangs, bisected Chris Bainbridge
2024-11-25 17:16 ` Chris Bainbridge
2024-11-26  6:52   ` Christoph Hellwig
2024-11-26  9:18     ` Chris Bainbridge
2024-11-30  6:09       ` David Wang
2024-11-30 16:00         ` Jens Axboe
2024-11-30 16:59           ` David Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox