* [PATCH v2 0/2] Fix the mq-deadline async_depth implementation
@ 2024-05-09 17:01 Bart Van Assche
2024-05-09 17:01 ` [PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set Bart Van Assche
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Bart Van Assche @ 2024-05-09 17:01 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, Christoph Hellwig, Bart Van Assche
Hi Jens,
The mq-deadline 'async_depth' sysfs attribute doesn't behave as intended. This
patch series fixes the implementation of that attribute.
Please consider this patch series for the next merge window.
Thanks,
Bart.
Changes compared to v1: replaced patch 1/2 with a patch from Christoph.
Bart Van Assche (2):
block: Call .limit_depth() after .hctx has been set
block/mq-deadline: Fix the tag reservation code
block/blk-mq.c | 12 ++++++------
block/mq-deadline.c | 20 +++++++++++++++++---
2 files changed, 23 insertions(+), 9 deletions(-)
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set
2024-05-09 17:01 [PATCH v2 0/2] Fix the mq-deadline async_depth implementation Bart Van Assche
@ 2024-05-09 17:01 ` Bart Van Assche
2024-07-02 8:39 ` RE: " 牛志国 (Zhiguo Niu)
2024-07-02 12:08 ` Christoph Hellwig
2024-05-09 17:01 ` [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code Bart Van Assche
` (2 subsequent siblings)
3 siblings, 2 replies; 17+ messages in thread
From: Bart Van Assche @ 2024-05-09 17:01 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, Damien Le Moal,
Zhiguo Niu
Call .limit_depth() after data->hctx has been set such that data->hctx can
be used in .limit_depth() implementations.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Zhiguo Niu <zhiguo.niu@unisoc.com>
Fixes: 07757588e507 ("block/mq-deadline: Reserve 25% of scheduler tags for synchronous requests")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
block/blk-mq.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9f677ea85a52..a6310a550b78 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -448,6 +448,10 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
if (data->cmd_flags & REQ_NOWAIT)
data->flags |= BLK_MQ_REQ_NOWAIT;
+retry:
+ data->ctx = blk_mq_get_ctx(q);
+ data->hctx = blk_mq_map_queue(q, data->cmd_flags, data->ctx);
+
if (q->elevator) {
/*
* All requests use scheduler tags when an I/O scheduler is
@@ -469,13 +473,9 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
if (ops->limit_depth)
ops->limit_depth(data->cmd_flags, data);
}
- }
-
-retry:
- data->ctx = blk_mq_get_ctx(q);
- data->hctx = blk_mq_map_queue(q, data->cmd_flags, data->ctx);
- if (!(data->rq_flags & RQF_SCHED_TAGS))
+ } else {
blk_mq_tag_busy(data->hctx);
+ }
if (data->flags & BLK_MQ_REQ_RESERVED)
data->rq_flags |= RQF_RESV;
^ permalink raw reply related [flat|nested] 17+ messages in thread
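The ordering constraint this patch enforces can be sketched abstractly. The Python below uses hypothetical stand-ins (`AllocData`, the `word_depth` value), not kernel code: a `.limit_depth()` implementation that reads `data->hctx` can only work if the hardware context has been mapped first, which is why the `retry:` label moves above the elevator branch.

```python
class AllocData:
    """Hypothetical stand-in for struct blk_mq_alloc_data."""
    def __init__(self):
        self.hctx = None          # set by the blk_mq_map_queue() step
        self.shallow_depth = 0    # set by the .limit_depth() hook

def limit_depth(data):
    # Like dd_limit_depth() after patch 2/2, this hook dereferences
    # data->hctx, so it must run only after hctx has been assigned.
    assert data.hctx is not None, ".hctx must be set before .limit_depth()"
    data.shallow_depth = data.hctx["word_depth"]

def alloc_request(data):
    # Order after this patch: map ctx/hctx first, then apply throttling.
    data.hctx = {"word_depth": 48}   # stand-in for blk_mq_map_queue()
    limit_depth(data)
    return data.shallow_depth

print(alloc_request(AllocData()))  # prints 48
```

Before the patch, `limit_depth(data)` ran before the `data.hctx = ...` assignment, so any hook touching `hctx` dereferenced an unset pointer.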
* [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-05-09 17:01 [PATCH v2 0/2] Fix the mq-deadline async_depth implementation Bart Van Assche
2024-05-09 17:01 ` [PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set Bart Van Assche
@ 2024-05-09 17:01 ` Bart Van Assche
2024-05-16 8:14 ` YangYang
` (2 more replies)
[not found] ` <202407020202.46222wCt089266@SHSPAM01.spreadtrum.com>
2024-07-02 14:48 ` [PATCH v2 0/2] Fix the mq-deadline async_depth implementation Jens Axboe
3 siblings, 3 replies; 17+ messages in thread
From: Bart Van Assche @ 2024-05-09 17:01 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, Damien Le Moal,
Zhiguo Niu
The current tag reservation code is based on a misunderstanding of the
meaning of data->shallow_depth. Fix the tag reservation code as follows:
* By default, do not reserve any tags for synchronous requests because
for certain use cases reserving tags reduces performance. See also
Harshit Mogalapalli, [bug-report] Performance regression with fio
sequential-write on a multipath setup, 2024-03-07
(https://lore.kernel.org/linux-block/5ce2ae5d-61e2-4ede-ad55-551112602401@oracle.com/)
* Reduce min_shallow_depth to one because min_shallow_depth must be less
than or equal to any shallow_depth value.
* Scale dd->async_depth from the range [1, nr_requests] to [1,
bits_per_sbitmap_word].
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Zhiguo Niu <zhiguo.niu@unisoc.com>
Fixes: 07757588e507 ("block/mq-deadline: Reserve 25% of scheduler tags for synchronous requests")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
block/mq-deadline.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 94eede4fb9eb..acdc28756d9d 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -487,6 +487,20 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
return rq;
}
+/*
+ * 'depth' is a number in the range 1..INT_MAX representing a number of
+ * requests. Scale it with a factor (1 << bt->sb.shift) / q->nr_requests since
+ * 1..(1 << bt->sb.shift) is the range expected by sbitmap_get_shallow().
+ * Values larger than q->nr_requests have the same effect as q->nr_requests.
+ */
+static int dd_to_word_depth(struct blk_mq_hw_ctx *hctx, unsigned int qdepth)
+{
+ struct sbitmap_queue *bt = &hctx->sched_tags->bitmap_tags;
+ const unsigned int nrr = hctx->queue->nr_requests;
+
+ return ((qdepth << bt->sb.shift) + nrr - 1) / nrr;
+}
+
/*
* Called by __blk_mq_alloc_request(). The shallow_depth value set by this
* function is used by __blk_mq_get_tag().
@@ -503,7 +517,7 @@ static void dd_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data)
* Throttle asynchronous requests and writes such that these requests
* do not block the allocation of synchronous requests.
*/
- data->shallow_depth = dd->async_depth;
+ data->shallow_depth = dd_to_word_depth(data->hctx, dd->async_depth);
}
/* Called by blk_mq_update_nr_requests(). */
@@ -513,9 +527,9 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
struct deadline_data *dd = q->elevator->elevator_data;
struct blk_mq_tags *tags = hctx->sched_tags;
- dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
+ dd->async_depth = q->nr_requests;
- sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
+ sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
}
/* Called by blk_mq_init_hctx() and blk_mq_init_sched(). */
^ permalink raw reply related [flat|nested] 17+ messages in thread
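The scaling performed by dd_to_word_depth() above can be checked with a small standalone sketch. The queue depth and word size below are hypothetical example values; in the kernel the word size comes from bt->sb.shift:

```python
def dd_to_word_depth(qdepth: int, sb_shift: int, nr_requests: int) -> int:
    """Scale a depth in [1, nr_requests] to [1, 1 << sb_shift], mirroring
    the patch's ((qdepth << bt->sb.shift) + nrr - 1) / nrr ceiling division."""
    nrr = nr_requests
    return ((qdepth << sb_shift) + nrr - 1) // nrr

# Hypothetical queue: 256 scheduler tags, 64-bit sbitmap words (shift == 6).
assert dd_to_word_depth(256, 6, 256) == 64  # full depth -> whole word, no limit
assert dd_to_word_depth(192, 6, 256) == 48  # 75% of the queue depth
assert dd_to_word_depth(1, 6, 256) == 1     # rounding up keeps the result >= 1
```

The rounding up guarantees a nonzero per-word shallow depth even for async_depth values far below one sbitmap word, which is why min_shallow_depth can safely be lowered to one.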
* Re: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-05-09 17:01 ` [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code Bart Van Assche
@ 2024-05-16 8:14 ` YangYang
2024-05-16 21:27 ` Bart Van Assche
2024-07-02 12:09 ` Christoph Hellwig
2024-12-07 2:17 ` Yu Kuai
2 siblings, 1 reply; 17+ messages in thread
From: YangYang @ 2024-05-16 8:14 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-block, Christoph Hellwig, Damien Le Moal, Zhiguo Niu,
Jens Axboe
On 2024/5/10 1:01, Bart Van Assche wrote:
> The current tag reservation code is based on a misunderstanding of the
> meaning of data->shallow_depth. Fix the tag reservation code as follows:
> * By default, do not reserve any tags for synchronous requests because
> for certain use cases reserving tags reduces performance. See also
> Harshit Mogalapalli, [bug-report] Performance regression with fio
> sequential-write on a multipath setup, 2024-03-07
> (https://lore.kernel.org/linux-block/5ce2ae5d-61e2-4ede-ad55-551112602401@oracle.com/)
> * Reduce min_shallow_depth to one because min_shallow_depth must be less
> than or equal to any shallow_depth value.
> * Scale dd->async_depth from the range [1, nr_requests] to [1,
> bits_per_sbitmap_word].
>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Zhiguo Niu <zhiguo.niu@unisoc.com>
> Fixes: 07757588e507 ("block/mq-deadline: Reserve 25% of scheduler tags for synchronous requests")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> block/mq-deadline.c | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index 94eede4fb9eb..acdc28756d9d 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -487,6 +487,20 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
> return rq;
> }
>
> +/*
> + * 'depth' is a number in the range 1..INT_MAX representing a number of
> + * requests. Scale it with a factor (1 << bt->sb.shift) / q->nr_requests since
> + * 1..(1 << bt->sb.shift) is the range expected by sbitmap_get_shallow().
> + * Values larger than q->nr_requests have the same effect as q->nr_requests.
> + */
> +static int dd_to_word_depth(struct blk_mq_hw_ctx *hctx, unsigned int qdepth)
> +{
> + struct sbitmap_queue *bt = &hctx->sched_tags->bitmap_tags;
> + const unsigned int nrr = hctx->queue->nr_requests;
> +
> + return ((qdepth << bt->sb.shift) + nrr - 1) / nrr;
> +}
> +
> /*
> * Called by __blk_mq_alloc_request(). The shallow_depth value set by this
> * function is used by __blk_mq_get_tag().
> @@ -503,7 +517,7 @@ static void dd_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data)
> * Throttle asynchronous requests and writes such that these requests
> * do not block the allocation of synchronous requests.
> */
> - data->shallow_depth = dd->async_depth;
> + data->shallow_depth = dd_to_word_depth(data->hctx, dd->async_depth);
> }
>
> /* Called by blk_mq_update_nr_requests(). */
> @@ -513,9 +527,9 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
> struct deadline_data *dd = q->elevator->elevator_data;
> struct blk_mq_tags *tags = hctx->sched_tags;
>
> - dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
> + dd->async_depth = q->nr_requests;
>
> - sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
> + sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
If sbq->min_shallow_depth is set to 1, sbq->wake_batch will also be set
to 1. I guess this may result in batch wakeup not working as expected.
> }
>
> /* Called by blk_mq_init_hctx() and blk_mq_init_sched(). */
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-05-16 8:14 ` YangYang
@ 2024-05-16 21:27 ` Bart Van Assche
2024-06-04 9:08 ` RE: " 牛志国 (Zhiguo Niu)
0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2024-05-16 21:27 UTC (permalink / raw)
To: YangYang
Cc: linux-block, Christoph Hellwig, Damien Le Moal, Zhiguo Niu,
Jens Axboe
On 5/16/24 02:14, YangYang wrote:
>> @@ -513,9 +527,9 @@ static void dd_depth_updated(struct blk_mq_hw_ctx
>> *hctx)
>> struct deadline_data *dd = q->elevator->elevator_data;
>> struct blk_mq_tags *tags = hctx->sched_tags;
>> - dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
>> + dd->async_depth = q->nr_requests;
>> - sbitmap_queue_min_shallow_depth(&tags->bitmap_tags,
>> dd->async_depth);
>> + sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
>
> If sbq->min_shallow_depth is set to 1, sbq->wake_batch will also be set
> to 1. I guess this may result in batch wakeup not working as expected.
The value of the sbq->min_shallow_depth parameter may affect performance
but does not affect correctness. See also the comment above the
sbitmap_queue_min_shallow_depth() declaration. Is this sufficient to
address your concern?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-05-16 21:27 ` Bart Van Assche
@ 2024-06-04 9:08 ` 牛志国 (Zhiguo Niu)
2024-06-04 11:48 ` Bart Van Assche
0 siblings, 1 reply; 17+ messages in thread
From: 牛志国 (Zhiguo Niu) @ 2024-06-04 9:08 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
王皓 (Hao_hao Wang)
Hi Bart Van Assche and Jens Axboe
Sorry to disturb you.
Do you have a plan for when this patch series will be merged into the mainline?
Thanks!
-----Original Message-----
From: Bart Van Assche <bvanassche@acm.org>
Sent: May 17, 2024 5:28
To: YangYang <yang.yang@vivo.com>
Cc: linux-block@vger.kernel.org; Christoph Hellwig <hch@lst.de>; Damien Le Moal <dlemoal@kernel.org>; 牛志国 (Zhiguo Niu) <Zhiguo.Niu@unisoc.com>; Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
On 5/16/24 02:14, YangYang wrote:
>> @@ -513,9 +527,9 @@ static void dd_depth_updated(struct blk_mq_hw_ctx
>> *hctx)
>> struct deadline_data *dd = q->elevator->elevator_data;
>> struct blk_mq_tags *tags = hctx->sched_tags;
>> - dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
>> + dd->async_depth = q->nr_requests;
>> - sbitmap_queue_min_shallow_depth(&tags->bitmap_tags,
>> dd->async_depth);
>> + sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
>
> If sbq->min_shallow_depth is set to 1, sbq->wake_batch will also be
> set to 1. I guess this may result in batch wakeup not working as expected.
The value of the sbq->min_shallow_depth parameter may affect performance but does not affect correctness. See also the comment above the
sbitmap_queue_min_shallow_depth() declaration. Is this sufficient to address your concern?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: RE: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-06-04 9:08 ` RE: " 牛志国 (Zhiguo Niu)
@ 2024-06-04 11:48 ` Bart Van Assche
2024-07-01 1:30 ` RE: " 牛志国 (Zhiguo Niu)
0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2024-06-04 11:48 UTC (permalink / raw)
To: 牛志国 (Zhiguo Niu), Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
王皓 (Hao_hao Wang)
On 6/4/24 02:08, 牛志国 (Zhiguo Niu) wrote:
> Do you have a plan for when this patch series will be merged into the mainline?
These patches still apply without any changes to Jens' block/for-next
branch. I think the next step is that someone helps by posting
Reviewed-by or Tested-by tags for these patches.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: RE: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-06-04 11:48 ` Bart Van Assche
@ 2024-07-01 1:30 ` 牛志国 (Zhiguo Niu)
2024-07-01 23:21 ` Bart Van Assche
0 siblings, 1 reply; 17+ messages in thread
From: 牛志国 (Zhiguo Niu) @ 2024-07-01 1:30 UTC (permalink / raw)
To: Jens Axboe, Christoph Hellwig, Damien Le Moal
Cc: linux-block@vger.kernel.org, Bart Van Assche,
王皓 (Hao_hao Wang)
Hi block developers,
Can you help review this patch series from Bart Van Assche?
https://lore.kernel.org/all/20240509170149.7639-1-bvanassche@acm.org/
[PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set
[PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
These patches fix the issue where a warning may be triggered if dd async_depth is set from user space.
For more information about the warning, please refer to the commit message:
https://lore.kernel.org/all/CAHJ8P3KEOC_DXQmZK3u7PHgZFmWpMVzPa6pgkOgpyoH7wgT5nw@mail.gmail.com/
If you need any tests, please ask me; I can help where my test environment allows.
Thanks!
-----Original Message-----
From: Bart Van Assche <bvanassche@acm.org>
Sent: June 4, 2024 19:49
To: 牛志国 (Zhiguo Niu) <Zhiguo.Niu@unisoc.com>; Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org; Christoph Hellwig <hch@lst.de>; Damien Le Moal <dlemoal@kernel.org>; 王皓 (Hao_hao Wang) <Hao_hao.Wang@unisoc.com>
Subject: Re: RE: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
On 6/4/24 02:08, 牛志国 (Zhiguo Niu) wrote:
> Do you have a plan for when this patch series will be merged into the mainline?
These patches still apply without any changes to Jens' block/for-next branch. I think the next step is that someone helps by posting Reviewed-by or Tested-by tags for these patches.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: RE: RE: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-07-01 1:30 ` RE: " 牛志国 (Zhiguo Niu)
@ 2024-07-01 23:21 ` Bart Van Assche
0 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2024-07-01 23:21 UTC (permalink / raw)
To: 牛志国 (Zhiguo Niu), Jens Axboe,
Christoph Hellwig, Damien Le Moal
Cc: linux-block@vger.kernel.org, 王皓 (Hao_hao Wang)
On 6/30/24 6:30 PM, 牛志国 (Zhiguo Niu) wrote:
> Can you help review this patch series from Bart Van Assche?
> https://lore.kernel.org/all/20240509170149.7639-1-bvanassche@acm.org/
> [PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set
> [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
>
> These patches fix the issue where a warning may be triggered if dd async_depth is set from user space.
> For more information about the warning, please refer to the commit message:
> https://lore.kernel.org/all/CAHJ8P3KEOC_DXQmZK3u7PHgZFmWpMVzPa6pgkOgpyoH7wgT5nw@mail.gmail.com/
Hi Zhiguo,
If these patches pass your tests then you can reply to these patches
with your own Tested-by.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set
2024-05-09 17:01 ` [PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set Bart Van Assche
@ 2024-07-02 8:39 ` 牛志国 (Zhiguo Niu)
2024-07-02 12:08 ` Christoph Hellwig
1 sibling, 0 replies; 17+ messages in thread
From: 牛志国 (Zhiguo Niu) @ 2024-07-02 8:39 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
王皓 (Hao_hao Wang)
-----Original Message-----
From: Bart Van Assche <bvanassche@acm.org>
Sent: May 10, 2024 1:02
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org; Christoph Hellwig <hch@lst.de>; Bart Van Assche <bvanassche@acm.org>; Damien Le Moal <dlemoal@kernel.org>; 牛志国 (Zhiguo Niu) <Zhiguo.Niu@unisoc.com>
Subject: [PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set
Call .limit_depth() after data->hctx has been set such that data->hctx can be used in .limit_depth() implementations.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Zhiguo Niu <zhiguo.niu@unisoc.com>
Fixes: 07757588e507 ("block/mq-deadline: Reserve 25% of scheduler tags for synchronous requests")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
block/blk-mq.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9f677ea85a52..a6310a550b78 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -448,6 +448,10 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
if (data->cmd_flags & REQ_NOWAIT)
data->flags |= BLK_MQ_REQ_NOWAIT;
+retry:
+ data->ctx = blk_mq_get_ctx(q);
+ data->hctx = blk_mq_map_queue(q, data->cmd_flags, data->ctx);
+
Hi Bart,
I tested the basic functionality of this patch series and the results pass: no warning is reported after setting the async_depth value via sysfs. One small question, though: on a "retry", the following if block is re-run as well, so
is there something wrong with this, or can it be improved? Otherwise,
Tested-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Thanks!
if (q->elevator) {
/*
		 * All requests use scheduler tags when an I/O scheduler is
@@ -469,13 +473,9 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
if (ops->limit_depth)
ops->limit_depth(data->cmd_flags, data);
}
- }
-
-retry:
- data->ctx = blk_mq_get_ctx(q);
- data->hctx = blk_mq_map_queue(q, data->cmd_flags, data->ctx);
- if (!(data->rq_flags & RQF_SCHED_TAGS))
+ } else {
blk_mq_tag_busy(data->hctx);
+ }
if (data->flags & BLK_MQ_REQ_RESERVED)
data->rq_flags |= RQF_RESV;
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
[not found] ` <202407020202.46222wCt089266@SHSPAM01.spreadtrum.com>
@ 2024-07-02 8:41 ` 牛志国 (Zhiguo Niu)
0 siblings, 0 replies; 17+ messages in thread
From: 牛志国 (Zhiguo Niu) @ 2024-07-02 8:41 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
王皓 (Hao_hao Wang)
-----Original Message-----
From: Bart Van Assche <bvanassche@acm.org>
Sent: May 10, 2024 1:02
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org; Christoph Hellwig <hch@lst.de>; Bart Van Assche <bvanassche@acm.org>; Damien Le Moal <dlemoal@kernel.org>; 牛志国 (Zhiguo Niu) <Zhiguo.Niu@unisoc.com>
Subject: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
The current tag reservation code is based on a misunderstanding of the meaning of data->shallow_depth. Fix the tag reservation code as follows:
* By default, do not reserve any tags for synchronous requests because
for certain use cases reserving tags reduces performance. See also
Harshit Mogalapalli, [bug-report] Performance regression with fio
sequential-write on a multipath setup, 2024-03-07
(https://lore.kernel.org/linux-block/5ce2ae5d-61e2-4ede-ad55-551112602401@oracle.com/)
* Reduce min_shallow_depth to one because min_shallow_depth must be less
than or equal to any shallow_depth value.
* Scale dd->async_depth from the range [1, nr_requests] to [1,
bits_per_sbitmap_word].
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Zhiguo Niu <zhiguo.niu@unisoc.com>
Fixes: 07757588e507 ("block/mq-deadline: Reserve 25% of scheduler tags for synchronous requests")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
block/mq-deadline.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 94eede4fb9eb..acdc28756d9d 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -487,6 +487,20 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
return rq;
}
+/*
+ * 'depth' is a number in the range 1..INT_MAX representing a number of
+ * requests. Scale it with a factor (1 << bt->sb.shift) / q->nr_requests since
+ * 1..(1 << bt->sb.shift) is the range expected by sbitmap_get_shallow().
+ * Values larger than q->nr_requests have the same effect as q->nr_requests.
+ */
+static int dd_to_word_depth(struct blk_mq_hw_ctx *hctx, unsigned int qdepth)
+{
+	struct sbitmap_queue *bt = &hctx->sched_tags->bitmap_tags;
+	const unsigned int nrr = hctx->queue->nr_requests;
+
+	return ((qdepth << bt->sb.shift) + nrr - 1) / nrr;
+}
+
/*
* Called by __blk_mq_alloc_request(). The shallow_depth value set by this
* function is used by __blk_mq_get_tag().
@@ -503,7 +517,7 @@ static void dd_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data)
* Throttle asynchronous requests and writes such that these requests
* do not block the allocation of synchronous requests.
*/
- data->shallow_depth = dd->async_depth;
+	data->shallow_depth = dd_to_word_depth(data->hctx, dd->async_depth);
}
 /* Called by blk_mq_update_nr_requests(). */
@@ -513,9 +527,9 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
struct deadline_data *dd = q->elevator->elevator_data;
struct blk_mq_tags *tags = hctx->sched_tags;
- dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
+ dd->async_depth = q->nr_requests;
- sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
+ sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
}
Hi Bart,
I tested the basic functionality of this patch series and the results pass: no warning is reported after setting the async_depth value via sysfs, so
Tested-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Thanks!
/* Called by blk_mq_init_hctx() and blk_mq_init_sched(). */
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set
2024-05-09 17:01 ` [PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set Bart Van Assche
2024-07-02 8:39 ` RE: " 牛志国 (Zhiguo Niu)
@ 2024-07-02 12:08 ` Christoph Hellwig
1 sibling, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-07-02 12:08 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Damien Le Moal,
Zhiguo Niu
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-05-09 17:01 ` [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code Bart Van Assche
2024-05-16 8:14 ` YangYang
@ 2024-07-02 12:09 ` Christoph Hellwig
2024-12-07 2:17 ` Yu Kuai
2 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-07-02 12:09 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Damien Le Moal,
Zhiguo Niu
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 0/2] Fix the mq-deadline async_depth implementation
2024-05-09 17:01 [PATCH v2 0/2] Fix the mq-deadline async_depth implementation Bart Van Assche
` (2 preceding siblings ...)
[not found] ` <202407020202.46222wCt089266@SHSPAM01.spreadtrum.com>
@ 2024-07-02 14:48 ` Jens Axboe
3 siblings, 0 replies; 17+ messages in thread
From: Jens Axboe @ 2024-07-02 14:48 UTC (permalink / raw)
To: Bart Van Assche; +Cc: linux-block, Christoph Hellwig
On Thu, 09 May 2024 10:01:47 -0700, Bart Van Assche wrote:
> The mq-deadline 'async_depth' sysfs attribute doesn't behave as intended. This
> patch series fixes the implementation of that attribute.
>
> Please consider this patch series for the next merge window.
>
> Thanks,
>
> [...]
Applied, thanks!
[1/2] block: Call .limit_depth() after .hctx has been set
commit: 6259151c04d4e0085e00d2dcb471ebdd1778e72e
[2/2] block/mq-deadline: Fix the tag reservation code
commit: 39823b47bbd40502632ffba90ebb34fff7c8b5e8
Best regards,
--
Jens Axboe
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-05-09 17:01 ` [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code Bart Van Assche
2024-05-16 8:14 ` YangYang
2024-07-02 12:09 ` Christoph Hellwig
@ 2024-12-07 2:17 ` Yu Kuai
2024-12-09 19:08 ` Bart Van Assche
2 siblings, 1 reply; 17+ messages in thread
From: Yu Kuai @ 2024-12-07 2:17 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe
Cc: linux-block, Christoph Hellwig, Damien Le Moal, Zhiguo Niu,
yukuai (C)
Hi, Bart
On 2024/05/10 1:01, Bart Van Assche wrote:
> The current tag reservation code is based on a misunderstanding of the
> meaning of data->shallow_depth. Fix the tag reservation code as follows:
> * By default, do not reserve any tags for synchronous requests because
> for certain use cases reserving tags reduces performance. See also
> Harshit Mogalapalli, [bug-report] Performance regression with fio
> sequential-write on a multipath setup, 2024-03-07
> (https://lore.kernel.org/linux-block/5ce2ae5d-61e2-4ede-ad55-551112602401@oracle.com/)
> * Reduce min_shallow_depth to one because min_shallow_depth must be less
> than or equal to any shallow_depth value.
> * Scale dd->async_depth from the range [1, nr_requests] to [1,
> bits_per_sbitmap_word].
>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Zhiguo Niu <zhiguo.niu@unisoc.com>
> Fixes: 07757588e507 ("block/mq-deadline: Reserve 25% of scheduler tags for synchronous requests")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> block/mq-deadline.c | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index 94eede4fb9eb..acdc28756d9d 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -487,6 +487,20 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
> return rq;
> }
>
> +/*
> + * 'depth' is a number in the range 1..INT_MAX representing a number of
> + * requests. Scale it with a factor (1 << bt->sb.shift) / q->nr_requests since
> + * 1..(1 << bt->sb.shift) is the range expected by sbitmap_get_shallow().
> + * Values larger than q->nr_requests have the same effect as q->nr_requests.
> + */
> +static int dd_to_word_depth(struct blk_mq_hw_ctx *hctx, unsigned int qdepth)
> +{
> + struct sbitmap_queue *bt = &hctx->sched_tags->bitmap_tags;
> + const unsigned int nrr = hctx->queue->nr_requests;
> +
> + return ((qdepth << bt->sb.shift) + nrr - 1) / nrr;
> +}
> +
> /*
> * Called by __blk_mq_alloc_request(). The shallow_depth value set by this
> * function is used by __blk_mq_get_tag().
> @@ -503,7 +517,7 @@ static void dd_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data)
> * Throttle asynchronous requests and writes such that these requests
> * do not block the allocation of synchronous requests.
> */
> - data->shallow_depth = dd->async_depth;
> + data->shallow_depth = dd_to_word_depth(data->hctx, dd->async_depth);
> }
>
> /* Called by blk_mq_update_nr_requests(). */
> @@ -513,9 +527,9 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
> struct deadline_data *dd = q->elevator->elevator_data;
> struct blk_mq_tags *tags = hctx->sched_tags;
>
> - dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
> + dd->async_depth = q->nr_requests;
We're comparing v6.6 and v5.10 performance in a downstream kernel; we
hit a regression and bisected it to this patch. And during review, I don't
understand the above change.
For example, if dd->async_depth is nr_requests, then dd_to_word_depth()
will just return 1 << bt->sb.shift, so nothing will be throttled.
The regression is a corner test case that is unlikely to happen in the real
world; I can share more details if you're interested.
Thanks,
Kuai
>
> - sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
> + sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
> }
>
> /* Called by blk_mq_init_hctx() and blk_mq_init_sched(). */
>
> .
>
^ permalink raw reply [flat|nested] 17+ messages in thread
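Yu Kuai's arithmetic above can be checked numerically against the dd_to_word_depth() formula from the patch. The queue parameters below are hypothetical examples: with the post-patch default async_depth == nr_requests, the scaled shallow depth equals the full sbitmap word size, so no throttling takes place until the value is lowered via sysfs.

```python
def dd_to_word_depth(qdepth, sb_shift, nr_requests):
    # Mirrors the kernel helper: ((qdepth << bt->sb.shift) + nrr - 1) / nrr
    return ((qdepth << sb_shift) + nr_requests - 1) // nr_requests

SB_SHIFT, NR_REQUESTS = 6, 256  # hypothetical: 64-bit words, 256 scheduler tags

# Post-patch default: shallow depth == 1 << shift, i.e. an entire word,
# which places no limit on asynchronous tag allocation.
assert dd_to_word_depth(NR_REQUESTS, SB_SHIFT, NR_REQUESTS) == 1 << SB_SHIFT

# Lowering async_depth through sysfs is what re-enables throttling.
assert dd_to_word_depth(NR_REQUESTS // 2, SB_SHIFT, NR_REQUESTS) == 32
```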
* Re: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-12-07 2:17 ` Yu Kuai
@ 2024-12-09 19:08 ` Bart Van Assche
2024-12-10 1:09 ` Yu Kuai
0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2024-12-09 19:08 UTC (permalink / raw)
To: Yu Kuai, Jens Axboe
Cc: linux-block, Christoph Hellwig, Damien Le Moal, Zhiguo Niu,
yukuai (C)
On 12/6/24 6:17 PM, Yu Kuai wrote:
> We're comparing v6.6 and v5.10 performance in a downstream kernel; we
> hit a regression and bisected it to this patch. And during review, I don't
> understand the above change.
>
> For example, if dd->async_depth is nr_requests, then dd_to_word_depth()
> will just return 1 << bt->sb.shift, so nothing will be throttled.
(just back from traveling)
As explained in the description of commit 39823b47bbd4 ("block/mq-
deadline: Fix the tag reservation code"), the default value of
async_depth has been changed from 3 * q->nr_requests / 4 into
q->nr_requests to fix a performance regression. The value of
dd->async_depth can be changed through sysfs.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code
2024-12-09 19:08 ` Bart Van Assche
@ 2024-12-10 1:09 ` Yu Kuai
0 siblings, 0 replies; 17+ messages in thread
From: Yu Kuai @ 2024-12-10 1:09 UTC (permalink / raw)
To: Bart Van Assche, Yu Kuai, Jens Axboe
Cc: linux-block, Christoph Hellwig, Damien Le Moal, Zhiguo Niu,
yukuai (C)
Hi,
On 2024/12/10 3:08, Bart Van Assche wrote:
> On 12/6/24 6:17 PM, Yu Kuai wrote:
>> We're comparing v6.6 and v5.10 performance in a downstream kernel; we
>> hit a regression and bisected it to this patch. And during review, I don't
>> understand the above change.
>>
>> For example, if dd->async_depth is nr_requests, then dd_to_word_depth()
>> will just return 1 << bt->sb.shift, so nothing will be throttled.
>
> (just back from traveling)
>
> As explained in the description of commit 39823b47bbd4 ("block/mq-
> deadline: Fix the tag reservation code"), the default value of
> async_depth has been changed from 3 * q->nr_requests / 4 into
> q->nr_requests to fix a performance regression. The value of
> dd->async_depth can be changed through sysfs.
Then I think what we should do next is to fix the sysfs API as well;
it is not correct for the default value right now.
Thanks,
Kuai
>
> Thanks,
>
> Bart.
> .
>
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2024-12-10 1:09 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-09 17:01 [PATCH v2 0/2] Fix the mq-deadline async_depth implementation Bart Van Assche
2024-05-09 17:01 ` [PATCH v2 1/2] block: Call .limit_depth() after .hctx has been set Bart Van Assche
2024-07-02 8:39 ` RE: " 牛志国 (Zhiguo Niu)
2024-07-02 12:08 ` Christoph Hellwig
2024-05-09 17:01 ` [PATCH v2 2/2] block/mq-deadline: Fix the tag reservation code Bart Van Assche
2024-05-16 8:14 ` YangYang
2024-05-16 21:27 ` Bart Van Assche
2024-06-04 9:08 ` RE: " 牛志国 (Zhiguo Niu)
2024-06-04 11:48 ` Bart Van Assche
2024-07-01 1:30 ` RE: " 牛志国 (Zhiguo Niu)
2024-07-01 23:21 ` Bart Van Assche
2024-07-02 12:09 ` Christoph Hellwig
2024-12-07 2:17 ` Yu Kuai
2024-12-09 19:08 ` Bart Van Assche
2024-12-10 1:09 ` Yu Kuai
[not found] ` <202407020202.46222wCt089266@SHSPAM01.spreadtrum.com>
2024-07-02 8:41 ` RE: " 牛志国 (Zhiguo Niu)
2024-07-02 14:48 ` [PATCH v2 0/2] Fix the mq-deadline async_depth implementation Jens Axboe