[PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"

public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"
@ 2024-03-13 21:42 Bart Van Assche
  2024-03-13 21:56 ` Jens Axboe
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Bart Van Assche @ 2024-03-13 21:42 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, stable,
	Damien Le Moal, Harshit Mogalapalli, Zhiguo Niu

The code "max(1U, 3 * (1U << shift)  / 4)" comes from the Kyber I/O
scheduler. The Kyber I/O scheduler maintains one internal queue per hwq
and hence derives its async_depth from the number of hwq tags. Using
this approach for the mq-deadline scheduler is wrong since the
mq-deadline scheduler maintains one internal queue for all hwqs
combined. Hence this revert.

Cc: stable@vger.kernel.org
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Cc: Zhiguo Niu <Zhiguo.Niu@unisoc.com>
Fixes: d47f9717e5cf ("block/mq-deadline: use correct way to throttling write requests")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index f958e79277b8..02a916ba62ee 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -646,9 +646,8 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
 	struct request_queue *q = hctx->queue;
 	struct deadline_data *dd = q->elevator->elevator_data;
 	struct blk_mq_tags *tags = hctx->sched_tags;
-	unsigned int shift = tags->bitmap_tags.sb.shift;
 
-	dd->async_depth = max(1U, 3 * (1U << shift)  / 4);
+	dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
 
 	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
 }

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"
  2024-03-13 21:42 [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests" Bart Van Assche
@ 2024-03-13 21:56 ` Jens Axboe
  2024-03-14  1:03 ` 答复: " 牛志国 (Zhiguo Niu)
  2024-03-14  7:58 ` Harshit Mogalapalli
  2 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2024-03-13 21:56 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, Christoph Hellwig, stable, Damien Le Moal,
	Harshit Mogalapalli, Zhiguo Niu


On Wed, 13 Mar 2024 14:42:18 -0700, Bart Van Assche wrote:
> The code "max(1U, 3 * (1U << shift)  / 4)" comes from the Kyber I/O
> scheduler. The Kyber I/O scheduler maintains one internal queue per hwq
> and hence derives its async_depth from the number of hwq tags. Using
> this approach for the mq-deadline scheduler is wrong since the
> mq-deadline scheduler maintains one internal queue for all hwqs
> combined. Hence this revert.
> 
> [...]

Applied, thanks!

[1/1] Revert "block/mq-deadline: use correct way to throttling write requests"
      commit: 256aab46e31683d76d45ccbedc287b4d3f3e322b

Best regards,
-- 
Jens Axboe




^ permalink raw reply	[flat|nested] 6+ messages in thread

* 答复: [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"
  2024-03-13 21:42 [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests" Bart Van Assche
  2024-03-13 21:56 ` Jens Axboe
@ 2024-03-14  1:03 ` 牛志国 (Zhiguo Niu)
  2024-03-14 17:08   ` Bart Van Assche
  2024-03-14  7:58 ` Harshit Mogalapalli
  2 siblings, 1 reply; 6+ messages in thread
From: 牛志国 (Zhiguo Niu) @ 2024-03-14  1:03 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block@vger.kernel.org, Christoph Hellwig,
	stable@vger.kernel.org, Damien Le Moal, Harshit Mogalapalli,
	金红宇 (Hongyu Jin)

Hi Bart,

Just as mentioned in original patch, "dd->async_depth = max(1UL, 3 * q->nr_requests / 4);", this limitation methods look likes won't have a limit effect, because tag allocated is based on sbitmap, not based the whole nr_requests.
Right?
Thanks!

For write requests, when we assign a tags from sched_tags,
data->shallow_depth will be passed to sbitmap_find_bit,
see the following code:

nr = sbitmap_find_bit_in_word(&sb->map[index],
			min_t (unsigned int,
			__map_depth(sb, index),
			depth),
			alloc_hint, wrap);

The smaller of data->shallow_depth and __map_depth(sb, index)
will be used as the maximum range when allocating bits.

For a mmc device (one hw queue, deadline I/O scheduler):
q->nr_requests = sched_tags = 128, so according to the previous
calculation method, dd->async_depth = data->shallow_depth = 96,
and the platform is 64bits with 8 cpus, sched_tags.bitmap_tags.sb.shift=5,
sb.maps[]=32/32/32/32, 32 is smaller than 96, whether it is a read or
a write I/O, tags can be allocated to the maximum range each time,
which has not throttling effect.

-----邮件原件-----
发件人: Bart Van Assche <bvanassche@acm.org> 
发送时间: 2024年3月14日 5:42
收件人: Jens Axboe <axboe@kernel.dk>
抄送: linux-block@vger.kernel.org; Christoph Hellwig <hch@lst.de>; Bart Van Assche <bvanassche@acm.org>; stable@vger.kernel.org; Damien Le Moal <dlemoal@kernel.org>; Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>; 牛志国 (Zhiguo Niu) <Zhiguo.Niu@unisoc.com>
主题: [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"

注意: 这封邮件来自于外部。除非你确定邮件内容安全，否则不要点击任何链接和附件。
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.

The code "max(1U, 3 * (1U << shift)  / 4)" comes from the Kyber I/O scheduler. The Kyber I/O scheduler maintains one internal queue per hwq and hence derives its async_depth from the number of hwq tags. Using this approach for the mq-deadline scheduler is wrong since the mq-deadline scheduler maintains one internal queue for all hwqs combined. Hence this revert.

Cc: stable@vger.kernel.org
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Cc: Zhiguo Niu <Zhiguo.Niu@unisoc.com>
Fixes: d47f9717e5cf ("block/mq-deadline: use correct way to throttling write requests")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c index f958e79277b8..02a916ba62ee 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -646,9 +646,8 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
        struct request_queue *q = hctx->queue;
        struct deadline_data *dd = q->elevator->elevator_data;
        struct blk_mq_tags *tags = hctx->sched_tags;
-       unsigned int shift = tags->bitmap_tags.sb.shift;

-       dd->async_depth = max(1U, 3 * (1U << shift)  / 4);
+       dd->async_depth = max(1UL, 3 * q->nr_requests / 4);

        sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);  }

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"
  2024-03-13 21:42 [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests" Bart Van Assche
  2024-03-13 21:56 ` Jens Axboe
  2024-03-14  1:03 ` 答复: " 牛志国 (Zhiguo Niu)
@ 2024-03-14  7:58 ` Harshit Mogalapalli
  2 siblings, 0 replies; 6+ messages in thread
From: Harshit Mogalapalli @ 2024-03-14  7:58 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, stable, Damien Le Moal,
	Zhiguo Niu

Hi Bart, Jens and Zhiguo,

On 14/03/24 03:12, Bart Van Assche wrote:
> The code "max(1U, 3 * (1U << shift)  / 4)" comes from the Kyber I/O
> scheduler. The Kyber I/O scheduler maintains one internal queue per hwq
> and hence derives its async_depth from the number of hwq tags. Using
> this approach for the mq-deadline scheduler is wrong since the
> mq-deadline scheduler maintains one internal queue for all hwqs
> combined. Hence this revert.
> 

Thanks a lot for helping with this performance regression[1].


Regards,
Harshit

[1] 
https://lore.kernel.org/all/5ce2ae5d-61e2-4ede-ad55-551112602401@oracle.com/

> Cc: stable@vger.kernel.org
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
> Cc: Zhiguo Niu <Zhiguo.Niu@unisoc.com>
> Fixes: d47f9717e5cf ("block/mq-deadline: use correct way to throttling write requests")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>   block/mq-deadline.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index f958e79277b8..02a916ba62ee 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -646,9 +646,8 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
>   	struct request_queue *q = hctx->queue;
>   	struct deadline_data *dd = q->elevator->elevator_data;
>   	struct blk_mq_tags *tags = hctx->sched_tags;
> -	unsigned int shift = tags->bitmap_tags.sb.shift;
>   
> -	dd->async_depth = max(1U, 3 * (1U << shift)  / 4);
> +	dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
>   
>   	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
>   }


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 答复: [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"
  2024-03-14  1:03 ` 答复: " 牛志国 (Zhiguo Niu)
@ 2024-03-14 17:08   ` Bart Van Assche
  2024-03-14 19:31     ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2024-03-14 17:08 UTC (permalink / raw)
  To: 牛志国 (Zhiguo Niu), Jens Axboe
  Cc: linux-block@vger.kernel.org, Christoph Hellwig,
	stable@vger.kernel.org, Damien Le Moal, Harshit Mogalapalli,
	金红宇 (Hongyu Jin)

On 3/13/24 18:03, 牛志国 (Zhiguo Niu) wrote:
> Just as mentioned in original patch, "dd->async_depth = max(1UL, 3 * q->nr_requests / 4);", this limitation methods look likes won't have a limit effect, because tag allocated is based on sbitmap, not based the whole nr_requests.
> Right?
> Thanks!
> 
> For write requests, when we assign a tags from sched_tags,
> data->shallow_depth will be passed to sbitmap_find_bit,
> see the following code:
> 
> nr = sbitmap_find_bit_in_word(&sb->map[index],
> 			min_t (unsigned int,
> 			__map_depth(sb, index),
> 			depth),
> 			alloc_hint, wrap);
> 
> The smaller of data->shallow_depth and __map_depth(sb, index)
> will be used as the maximum range when allocating bits.
> 
> For a mmc device (one hw queue, deadline I/O scheduler):
> q->nr_requests = sched_tags = 128, so according to the previous
> calculation method, dd->async_depth = data->shallow_depth = 96,
> and the platform is 64bits with 8 cpus, sched_tags.bitmap_tags.sb.shift=5,
> sb.maps[]=32/32/32/32, 32 is smaller than 96, whether it is a read or
> a write I/O, tags can be allocated to the maximum range each time,
> which has not throttling effect.
Whether or not the code in my patch effectively performs throttling,
we need this revert to be merged. The patch that is being reverted
("block/mq-deadline: use correct way to throttling write requests")
ended up in Greg KH's stable branches. Hence, the first step is to
revert that patch and tag it with "Cc: stable" such that the revert
lands in the stable branches.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 答复: [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"
  2024-03-14 17:08   ` Bart Van Assche
@ 2024-03-14 19:31     ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2024-03-14 19:31 UTC (permalink / raw)
  To: Bart Van Assche, 牛志国 (Zhiguo Niu)
  Cc: linux-block@vger.kernel.org, Christoph Hellwig,
	stable@vger.kernel.org, Damien Le Moal, Harshit Mogalapalli,
	金红宇 (Hongyu Jin)

On 3/14/24 11:08 AM, Bart Van Assche wrote:
> On 3/13/24 18:03, ??? (Zhiguo Niu) wrote:
>> Just as mentioned in original patch, "dd->async_depth = max(1UL, 3 * q->nr_requests / 4);", this limitation methods look likes won't have a limit effect, because tag allocated is based on sbitmap, not based the whole nr_requests.
>> Right?
>> Thanks!
>>
>> For write requests, when we assign a tags from sched_tags,
>> data->shallow_depth will be passed to sbitmap_find_bit,
>> see the following code:
>>
>> nr = sbitmap_find_bit_in_word(&sb->map[index],
>>             min_t (unsigned int,
>>             __map_depth(sb, index),
>>             depth),
>>             alloc_hint, wrap);
>>
>> The smaller of data->shallow_depth and __map_depth(sb, index)
>> will be used as the maximum range when allocating bits.
>>
>> For a mmc device (one hw queue, deadline I/O scheduler):
>> q->nr_requests = sched_tags = 128, so according to the previous
>> calculation method, dd->async_depth = data->shallow_depth = 96,
>> and the platform is 64bits with 8 cpus, sched_tags.bitmap_tags.sb.shift=5,
>> sb.maps[]=32/32/32/32, 32 is smaller than 96, whether it is a read or
>> a write I/O, tags can be allocated to the maximum range each time,
>> which has not throttling effect.
> Whether or not the code in my patch effectively performs throttling,
> we need this revert to be merged. The patch that is being reverted
> ("block/mq-deadline: use correct way to throttling write requests")
> ended up in Greg KH's stable branches. Hence, the first step is to
> revert that patch and tag it with "Cc: stable" such that the revert
> lands in the stable branches.

Indeed, no amount of arguing is going to change that fact. Zhiguo, it
caused a regression. Rather than argue on why the change is correct,
it'd be much more productive to figure out a future solution.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-03-14 19:31 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-13 21:42 [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests" Bart Van Assche
2024-03-13 21:56 ` Jens Axboe
2024-03-14  1:03 ` 答复: " 牛志国 (Zhiguo Niu)
2024-03-14 17:08   ` Bart Van Assche
2024-03-14 19:31     ` Jens Axboe
2024-03-14  7:58 ` Harshit Mogalapalli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox