* [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation
@ 2024-12-21 9:37 Yu Kuai
2024-12-21 9:37 ` [PATCH RFC v3 1/7] lib/sbitmap: convert shallow_depth from one word to the whole sbitmap Yu Kuai
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: Yu Kuai @ 2024-12-21 9:37 UTC (permalink / raw)
To: axboe, akpm, ming.lei, yang.yang, yukuai3, bvanassche
Cc: linux-block, linux-kernel, yukuai1, yi.zhang, yangerkun
From: Yu Kuai <yukuai3@huawei.com>
Changes in RFC v3:
- only patches 1 and 2 are from v1/v2; the others are new patches that
change async_depth into a request_queue attribute.
Changes in RFC v2:
- update commit message for patch 1;
- also handle min_shallow_depth in patch 2;
- add patch 3 to choose none elevator by default;
- add patch 4 to fix default wake_batch;
Yu Kuai (7):
lib/sbitmap: convert shallow_depth from one word to the whole sbitmap
lib/sbitmap: make sbitmap_get_shallow() internal
block/elevator: add new ops async_depth_updated()
block: change the field nr_requests to unsigned int in request_queue
block: add new queue sysfs api async_depth
block/mq-deadline: switch to use queue async_depth
block/kyber-iosched: switch to use queue async_depth
block/blk-sysfs.c | 35 ++++++++++++++++++++
block/elevator.h | 1 +
block/kyber-iosched.c | 31 +++--------------
block/mq-deadline.c | 57 ++++++--------------------------
include/linux/blkdev.h | 8 ++++-
include/linux/sbitmap.h | 19 +----------
lib/sbitmap.c | 73 +++++++++++++++++++++++++----------------
7 files changed, 103 insertions(+), 121 deletions(-)
--
2.39.2
* [PATCH RFC v3 1/7] lib/sbitmap: convert shallow_depth from one word to the whole sbitmap
2024-12-21 9:37 [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation Yu Kuai
@ 2024-12-21 9:37 ` Yu Kuai
2024-12-21 9:37 ` [PATCH RFC v3 2/7] lib/sbitmap: make sbitmap_get_shallow() internal Yu Kuai
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Yu Kuai @ 2024-12-21 9:37 UTC (permalink / raw)
To: axboe, akpm, ming.lei, yang.yang, yukuai3, bvanassche
Cc: linux-block, linux-kernel, yukuai1, yi.zhang, yangerkun
From: Yu Kuai <yukuai3@huawei.com>
Both kyber and mq-deadline record an internal 'async_depth' to throttle
asynchronous requests, and both calculate shallow_depth based on
sb->shift, on the assumption that (1 << sb->shift) is the number of
available tags in one word.
However, the last word may hold fewer than (1 << sb->shift) tags, see
__map_depth():
if (index == sb->map_nr - 1)
return sb->depth - (index << sb->shift);
What's worse, bfq just calculates shallow_depth as the number of allowed
tags for the whole sbitmap.
On the one hand, callers don't know whether the last word of the bitmap
will be used; on the other hand, bfq can end up allowing only 1 request
for the whole sbitmap. Fix these problems by applying shallow_depth to
the whole sbitmap, and change kyber and mq-deadline to follow.
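For example, with made-up numbers (a 100-tag queue, 64-bit words and
kyber's 75% ratio; illustrative values only, not kernel code), limiting
each word to 75% of a full word does not limit the whole bitmap to 75%
of the tags. A minimal user-space sketch:

#include <stdio.h>

/* Model of __map_depth(): number of usable bits in word 'index'. */
static unsigned int map_depth(unsigned int depth, unsigned int shift,
			      unsigned int map_nr, unsigned int index)
{
	if (index == map_nr - 1)
		return depth - (index << shift);
	return 1U << shift;
}

int main(void)
{
	unsigned int depth = 100, shift = 6;	/* words of 64 bits */
	unsigned int map_nr = (depth + (1U << shift) - 1) >> shift;
	unsigned int per_word = (1U << shift) * 75 / 100;	/* old per-word limit: 48 */
	unsigned int i, total = 0;

	for (i = 0; i < map_nr; i++) {
		unsigned int d = map_depth(depth, shift, map_nr, i);

		total += d < per_word ? d : per_word;
	}

	/* Prints 84, although the intent was 75% of 100 = 75 tags. */
	printf("allowed across bitmap: %u\n", total);
	return 0;
}

With the last word holding only 36 bits, the per-word limit of 48 ends
up allowing 48 + 36 = 84 tags instead of the intended 75.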
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
block/kyber-iosched.c | 9 ++-----
block/mq-deadline.c | 16 +-----------
include/linux/sbitmap.h | 6 ++---
lib/sbitmap.c | 55 +++++++++++++++++++++--------------------
4 files changed, 34 insertions(+), 52 deletions(-)
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index 4155594aefc6..ccfefa6a3669 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -157,10 +157,7 @@ struct kyber_queue_data {
*/
struct sbitmap_queue domain_tokens[KYBER_NUM_DOMAINS];
- /*
- * Async request percentage, converted to per-word depth for
- * sbitmap_get_shallow().
- */
+ /* Number of allowed async requests. */
unsigned int async_depth;
struct kyber_cpu_latency __percpu *cpu_latency;
@@ -454,10 +451,8 @@ static void kyber_depth_updated(struct blk_mq_hw_ctx *hctx)
{
struct kyber_queue_data *kqd = hctx->queue->elevator->elevator_data;
struct blk_mq_tags *tags = hctx->sched_tags;
- unsigned int shift = tags->bitmap_tags.sb.shift;
-
- kqd->async_depth = (1U << shift) * KYBER_ASYNC_PERCENT / 100U;
+ kqd->async_depth = hctx->queue->nr_requests * KYBER_ASYNC_PERCENT / 100U;
sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, kqd->async_depth);
}
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 5528347b5fcf..853985bd13d4 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -487,20 +487,6 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
return rq;
}
-/*
- * 'depth' is a number in the range 1..INT_MAX representing a number of
- * requests. Scale it with a factor (1 << bt->sb.shift) / q->nr_requests since
- * 1..(1 << bt->sb.shift) is the range expected by sbitmap_get_shallow().
- * Values larger than q->nr_requests have the same effect as q->nr_requests.
- */
-static int dd_to_word_depth(struct blk_mq_hw_ctx *hctx, unsigned int qdepth)
-{
- struct sbitmap_queue *bt = &hctx->sched_tags->bitmap_tags;
- const unsigned int nrr = hctx->queue->nr_requests;
-
- return ((qdepth << bt->sb.shift) + nrr - 1) / nrr;
-}
-
/*
* Called by __blk_mq_alloc_request(). The shallow_depth value set by this
* function is used by __blk_mq_get_tag().
@@ -517,7 +503,7 @@ static void dd_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data)
* Throttle asynchronous requests and writes such that these requests
* do not block the allocation of synchronous requests.
*/
- data->shallow_depth = dd_to_word_depth(data->hctx, dd->async_depth);
+ data->shallow_depth = dd->async_depth;
}
/* Called by blk_mq_update_nr_requests(). */
diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 189140bf11fc..4adf4b364fcd 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -213,12 +213,12 @@ int sbitmap_get(struct sbitmap *sb);
* sbitmap_get_shallow() - Try to allocate a free bit from a &struct sbitmap,
* limiting the depth used from each word.
* @sb: Bitmap to allocate from.
- * @shallow_depth: The maximum number of bits to allocate from a single word.
+ * @shallow_depth: The maximum number of bits to allocate from the bitmap.
*
* This rather specific operation allows for having multiple users with
* different allocation limits. E.g., there can be a high-priority class that
* uses sbitmap_get() and a low-priority class that uses sbitmap_get_shallow()
- * with a @shallow_depth of (1 << (@sb->shift - 1)). Then, the low-priority
+ * with a @shallow_depth of (sb->depth >> 1). Then, the low-priority
* class can only allocate half of the total bits in the bitmap, preventing it
* from starving out the high-priority class.
*
@@ -478,7 +478,7 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
* sbitmap_queue, limiting the depth used from each word, with preemption
* already disabled.
* @sbq: Bitmap queue to allocate from.
- * @shallow_depth: The maximum number of bits to allocate from a single word.
+ * @shallow_depth: The maximum number of bits to allocate from the queue.
* See sbitmap_get_shallow().
*
* If you call this, make sure to call sbitmap_queue_min_shallow_depth() after
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index d3412984170c..f2e90ac6b56e 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -208,8 +208,27 @@ static int sbitmap_find_bit_in_word(struct sbitmap_word *map,
return nr;
}
+static unsigned int __map_depth_with_shallow(const struct sbitmap *sb,
+ int index,
+ unsigned int shallow_depth)
+{
+ unsigned int lower_bound = 0;
+
+ if (shallow_depth >= sb->depth)
+ return __map_depth(sb, index);
+
+ if (index > 0)
+		lower_bound += index << sb->shift;
+
+ if (shallow_depth <= lower_bound)
+ return 0;
+
+ return min_t(unsigned int, __map_depth(sb, index),
+ shallow_depth - lower_bound);
+}
+
static int sbitmap_find_bit(struct sbitmap *sb,
- unsigned int depth,
+ unsigned int shallow_depth,
unsigned int index,
unsigned int alloc_hint,
bool wrap)
@@ -218,12 +237,12 @@ static int sbitmap_find_bit(struct sbitmap *sb,
int nr = -1;
for (i = 0; i < sb->map_nr; i++) {
- nr = sbitmap_find_bit_in_word(&sb->map[index],
- min_t(unsigned int,
- __map_depth(sb, index),
- depth),
- alloc_hint, wrap);
+ unsigned int depth = __map_depth_with_shallow(sb, index,
+ shallow_depth);
+ if (depth)
+ nr = sbitmap_find_bit_in_word(&sb->map[index], depth,
+ alloc_hint, wrap);
if (nr != -1) {
nr += index << sb->shift;
break;
@@ -406,27 +425,9 @@ EXPORT_SYMBOL_GPL(sbitmap_bitmap_show);
static unsigned int sbq_calc_wake_batch(struct sbitmap_queue *sbq,
unsigned int depth)
{
- unsigned int wake_batch;
- unsigned int shallow_depth;
-
- /*
- * Each full word of the bitmap has bits_per_word bits, and there might
- * be a partial word. There are depth / bits_per_word full words and
- * depth % bits_per_word bits left over. In bitwise arithmetic:
- *
- * bits_per_word = 1 << shift
- * depth / bits_per_word = depth >> shift
- * depth % bits_per_word = depth & ((1 << shift) - 1)
- *
- * Each word can be limited to sbq->min_shallow_depth bits.
- */
- shallow_depth = min(1U << sbq->sb.shift, sbq->min_shallow_depth);
- depth = ((depth >> sbq->sb.shift) * shallow_depth +
- min(depth & ((1U << sbq->sb.shift) - 1), shallow_depth));
- wake_batch = clamp_t(unsigned int, depth / SBQ_WAIT_QUEUES, 1,
- SBQ_WAKE_BATCH);
-
- return wake_batch;
+ return clamp_t(unsigned int,
+ min(depth, sbq->min_shallow_depth) / SBQ_WAIT_QUEUES,
+ 1, SBQ_WAKE_BATCH);
}
int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,
--
2.39.2
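Continuing the made-up 100-tag example from the commit message above,
the following user-space sketch models the intended semantics of the new
__map_depth_with_shallow() helper (a model of the behaviour under those
assumed values, not a copy of the kernel code): each word may only
contribute the part of the whole-bitmap shallow_depth that is left over
after all lower-numbered words.

#include <stdio.h>

static unsigned int map_depth(unsigned int depth, unsigned int shift,
			      unsigned int map_nr, unsigned int index)
{
	if (index == map_nr - 1)
		return depth - (index << shift);
	return 1U << shift;
}

static unsigned int depth_with_shallow(unsigned int depth, unsigned int shift,
				       unsigned int map_nr, unsigned int index,
				       unsigned int shallow_depth)
{
	unsigned int lower_bound = index << shift;	/* bits in lower-numbered words */
	unsigned int word = map_depth(depth, shift, map_nr, index);

	if (shallow_depth >= depth)
		return word;
	if (shallow_depth <= lower_bound)
		return 0;
	return word < shallow_depth - lower_bound ? word : shallow_depth - lower_bound;
}

int main(void)
{
	unsigned int depth = 100, shift = 6;	/* two words: 64 + 36 bits */
	unsigned int map_nr = (depth + (1U << shift) - 1) >> shift;
	unsigned int i, total = 0;

	for (i = 0; i < map_nr; i++)
		total += depth_with_shallow(depth, shift, map_nr, i, 75);

	printf("allowed across bitmap: %u\n", total);	/* prints 75 */
	return 0;
}

Word 0 may still use all of its 64 bits, while word 1 is limited to the
remaining 11, so the whole bitmap is capped at exactly 75 tags.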
* [PATCH RFC v3 2/7] lib/sbitmap: make sbitmap_get_shallow() internal
2024-12-21 9:37 [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation Yu Kuai
2024-12-21 9:37 ` [PATCH RFC v3 1/7] lib/sbitmap: convert shallow_depth from one word to the whole sbitmap Yu Kuai
@ 2024-12-21 9:37 ` Yu Kuai
2024-12-21 9:37 ` [PATCH RFC v3 3/7] block/elevator: add new ops async_depth_updated() Yu Kuai
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Yu Kuai @ 2024-12-21 9:37 UTC (permalink / raw)
To: axboe, akpm, ming.lei, yang.yang, yukuai3, bvanassche
Cc: linux-block, linux-kernel, yukuai1, yi.zhang, yangerkun
From: Yu Kuai <yukuai3@huawei.com>
sbitmap_get_shallow() is only used inside sbitmap.c, so make it static
and move its kernel-doc comment there.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
include/linux/sbitmap.h | 17 -----------------
lib/sbitmap.c | 18 ++++++++++++++++--
2 files changed, 16 insertions(+), 19 deletions(-)
diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 4adf4b364fcd..ffb9907c7070 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -209,23 +209,6 @@ void sbitmap_resize(struct sbitmap *sb, unsigned int depth);
*/
int sbitmap_get(struct sbitmap *sb);
-/**
- * sbitmap_get_shallow() - Try to allocate a free bit from a &struct sbitmap,
- * limiting the depth used from each word.
- * @sb: Bitmap to allocate from.
- * @shallow_depth: The maximum number of bits to allocate from the bitmap.
- *
- * This rather specific operation allows for having multiple users with
- * different allocation limits. E.g., there can be a high-priority class that
- * uses sbitmap_get() and a low-priority class that uses sbitmap_get_shallow()
- * with a @shallow_depth of (sb->depth >> 1). Then, the low-priority
- * class can only allocate half of the total bits in the bitmap, preventing it
- * from starving out the high-priority class.
- *
- * Return: Non-negative allocated bit number if successful, -1 otherwise.
- */
-int sbitmap_get_shallow(struct sbitmap *sb, unsigned long shallow_depth);
-
/**
* sbitmap_any_bit_set() - Check for a set bit in a &struct sbitmap.
* @sb: Bitmap to check.
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index f2e90ac6b56e..5e3c35086253 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -306,7 +306,22 @@ static int __sbitmap_get_shallow(struct sbitmap *sb,
return sbitmap_find_bit(sb, shallow_depth, index, alloc_hint, true);
}
-int sbitmap_get_shallow(struct sbitmap *sb, unsigned long shallow_depth)
+/**
+ * sbitmap_get_shallow() - Try to allocate a free bit from a &struct sbitmap,
+ * limiting the depth used from each word.
+ * @sb: Bitmap to allocate from.
+ * @shallow_depth: The maximum number of bits to allocate from the bitmap.
+ *
+ * This rather specific operation allows for having multiple users with
+ * different allocation limits. E.g., there can be a high-priority class that
+ * uses sbitmap_get() and a low-priority class that uses sbitmap_get_shallow()
+ * with a @shallow_depth of (sb->depth >> 1). Then, the low-priority
+ * class can only allocate half of the total bits in the bitmap, preventing it
+ * from starving out the high-priority class.
+ *
+ * Return: Non-negative allocated bit number if successful, -1 otherwise.
+ */
+static int sbitmap_get_shallow(struct sbitmap *sb, unsigned long shallow_depth)
{
int nr;
unsigned int hint, depth;
@@ -321,7 +336,6 @@ int sbitmap_get_shallow(struct sbitmap *sb, unsigned long shallow_depth)
return nr;
}
-EXPORT_SYMBOL_GPL(sbitmap_get_shallow);
bool sbitmap_any_bit_set(const struct sbitmap *sb)
{
--
2.39.2
* [PATCH RFC v3 3/7] block/elevator: add new ops async_depth_updated()
2024-12-21 9:37 [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation Yu Kuai
2024-12-21 9:37 ` [PATCH RFC v3 1/7] lib/sbitmap: convert shallow_depth from one word to the whole sbitmap Yu Kuai
2024-12-21 9:37 ` [PATCH RFC v3 2/7] lib/sbitmap: make sbitmap_get_shallow() internal Yu Kuai
@ 2024-12-21 9:37 ` Yu Kuai
2024-12-21 9:37 ` [PATCH RFC v3 4/7] block: change the field nr_requests to unsigned int in request_queue Yu Kuai
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Yu Kuai @ 2024-12-21 9:37 UTC (permalink / raw)
To: axboe, akpm, ming.lei, yang.yang, yukuai3, bvanassche
Cc: linux-block, linux-kernel, yukuai1, yi.zhang, yangerkun
From: Yu Kuai <yukuai3@huawei.com>
The new op will be used by mq-deadline and kyber in the following
patches, in preparation for refactoring async_depth.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
block/elevator.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/block/elevator.h b/block/elevator.h
index dbf357ef4fab..21a431873f48 100644
--- a/block/elevator.h
+++ b/block/elevator.h
@@ -29,6 +29,7 @@ struct elevator_mq_ops {
int (*init_hctx)(struct blk_mq_hw_ctx *, unsigned int);
void (*exit_hctx)(struct blk_mq_hw_ctx *, unsigned int);
void (*depth_updated)(struct blk_mq_hw_ctx *);
+ int (*async_depth_updated)(struct request_queue *, unsigned int);
bool (*allow_merge)(struct request_queue *, struct request *, struct bio *);
bool (*bio_merge)(struct request_queue *, struct bio *, unsigned int);
--
2.39.2
* [PATCH RFC v3 4/7] block: change the field nr_requests to unsigned int in request_queue
2024-12-21 9:37 [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation Yu Kuai
` (2 preceding siblings ...)
2024-12-21 9:37 ` [PATCH RFC v3 3/7] block/elevator: add new ops async_depth_updated() Yu Kuai
@ 2024-12-21 9:37 ` Yu Kuai
2024-12-21 9:37 ` [PATCH RFC v3 5/7] block: add new queue sysfs api async_depth Yu Kuai
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Yu Kuai @ 2024-12-21 9:37 UTC (permalink / raw)
To: axboe, akpm, ming.lei, yang.yang, yukuai3, bvanassche
Cc: linux-block, linux-kernel, yukuai1, yi.zhang, yangerkun
From: Yu Kuai <yukuai3@huawei.com>
All callers currently use it as an unsigned int, so change the field
type to match.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
include/linux/blkdev.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 5d40af2ef971..8656ac4c3069 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -523,7 +523,7 @@ struct request_queue {
/*
* queue settings
*/
- unsigned long nr_requests; /* Max # of requests */
+ unsigned int nr_requests; /* Max # of requests */
#ifdef CONFIG_BLK_INLINE_ENCRYPTION
struct blk_crypto_profile *crypto_profile;
--
2.39.2
* [PATCH RFC v3 5/7] block: add new queue sysfs api async_depth
2024-12-21 9:37 [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation Yu Kuai
` (3 preceding siblings ...)
2024-12-21 9:37 ` [PATCH RFC v3 4/7] block: change the field nr_requests to unsigned int in request_queue Yu Kuai
@ 2024-12-21 9:37 ` Yu Kuai
2024-12-21 9:37 ` [PATCH RFC v3 6/7] block/mq-deadline: switch to use queue async_depth Yu Kuai
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Yu Kuai @ 2024-12-21 9:37 UTC (permalink / raw)
To: axboe, akpm, ming.lei, yang.yang, yukuai3, bvanassche
Cc: linux-block, linux-kernel, yukuai1, yi.zhang, yangerkun
From: Yu Kuai <yukuai3@huawei.com>
Prepare to refactor async_depth for mq-deadline and kyber.
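As a rough user-space model of the semantics implemented by the diff
below (all names here are stand-ins for illustration, not kernel code):
writing fails when the current elevator has no async_depth_updated() op,
out-of-range values fall back to 0 meaning unlimited, and reading
reports nr_requests while async_depth is 0.

#include <stdio.h>

struct demo_queue {
	unsigned int nr_requests;
	unsigned int async_depth;	/* 0 == unlimited */
	int (*async_depth_updated)(struct demo_queue *q, unsigned int depth);
};

static int demo_update(struct demo_queue *q, unsigned int depth)
{
	/* A real elevator would also refresh min_shallow_depth here. */
	q->async_depth = depth;
	return 0;
}

/* Mirrors queue_async_depth_store(); -1 stands in for -EINVAL. */
static int demo_store(struct demo_queue *q, unsigned long nr)
{
	if (!q->async_depth_updated)
		return -1;
	if (nr >= q->nr_requests)
		nr = 0;
	return q->async_depth_updated(q, nr);
}

/* Mirrors queue_async_depth_show(): 0 reads back as nr_requests. */
static unsigned int demo_show(const struct demo_queue *q)
{
	return q->async_depth ? q->async_depth : q->nr_requests;
}

int main(void)
{
	struct demo_queue q = { .nr_requests = 64,
				.async_depth_updated = demo_update };

	demo_store(&q, 48);
	printf("%u\n", demo_show(&q));		/* 48 */
	demo_store(&q, 1000);			/* out of range: unlimited */
	printf("%u\n", demo_show(&q));		/* 64 */
	return 0;
}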
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
block/blk-sysfs.c | 35 +++++++++++++++++++++++++++++++++++
include/linux/blkdev.h | 6 ++++++
2 files changed, 41 insertions(+)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 64f70c713d2f..3ee2fb8a5077 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -76,6 +76,39 @@ queue_requests_store(struct gendisk *disk, const char *page, size_t count)
return ret;
}
+static ssize_t queue_async_depth_show(struct gendisk *disk, char *page)
+{
+ if (disk->queue->async_depth)
+ return queue_var_show(disk->queue->async_depth, page);
+
+ return queue_requests_show(disk, page);
+}
+
+static ssize_t
+queue_async_depth_store(struct gendisk *disk, const char *page, size_t count)
+{
+ struct request_queue *q = disk->queue;
+ struct elevator_queue *e = q->elevator;
+ unsigned long nr;
+ int ret, err;
+
+ if (!e || !e->type->ops.async_depth_updated)
+ return -EINVAL;
+
+ ret = queue_var_store(&nr, page, count);
+ if (ret < 0)
+ return ret;
+
+ if (nr < 0 || nr >= q->nr_requests)
+ nr = 0;
+
+ err = e->type->ops.async_depth_updated(q, nr);
+ if (err)
+ return err;
+
+ return ret;
+}
+
static ssize_t queue_ra_show(struct gendisk *disk, char *page)
{
return queue_var_show(disk->bdi->ra_pages << (PAGE_SHIFT - 10), page);
@@ -440,6 +473,7 @@ static struct queue_sysfs_entry _prefix##_entry = { \
}
QUEUE_RW_ENTRY(queue_requests, "nr_requests");
+QUEUE_RW_ENTRY(queue_async_depth, "async_depth");
QUEUE_RW_ENTRY(queue_ra, "read_ahead_kb");
QUEUE_RW_ENTRY(queue_max_sectors, "max_sectors_kb");
QUEUE_RO_ENTRY(queue_max_hw_sectors, "max_hw_sectors_kb");
@@ -621,6 +655,7 @@ static struct attribute *queue_attrs[] = {
/* Request-based queue attributes that are not relevant for bio-based queues. */
static struct attribute *blk_mq_queue_attrs[] = {
&queue_requests_entry.attr,
+ &queue_async_depth_entry.attr,
&elv_iosched_entry.attr,
&queue_rq_affinity_entry.attr,
&queue_io_timeout_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8656ac4c3069..2fcc1ba39a28 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -524,6 +524,12 @@ struct request_queue {
* queue settings
*/
unsigned int nr_requests; /* Max # of requests */
+ /*
+ * Max number of async requests, used by elevator.
+ * Value range: [0, nr_requests)
+ * 0 is the default value, means unlimited.
+ */
+ unsigned int async_depth;
#ifdef CONFIG_BLK_INLINE_ENCRYPTION
struct blk_crypto_profile *crypto_profile;
--
2.39.2
* [PATCH RFC v3 6/7] block/mq-deadline: switch to use queue async_depth
2024-12-21 9:37 [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation Yu Kuai
` (4 preceding siblings ...)
2024-12-21 9:37 ` [PATCH RFC v3 5/7] block: add new queue sysfs api async_depth Yu Kuai
@ 2024-12-21 9:37 ` Yu Kuai
2024-12-21 9:37 ` [PATCH RFC v3 7/7] block/kyber-iosched: " Yu Kuai
2025-02-13 2:47 ` [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation Yu Kuai
7 siblings, 0 replies; 9+ messages in thread
From: Yu Kuai @ 2024-12-21 9:37 UTC (permalink / raw)
To: axboe, akpm, ming.lei, yang.yang, yukuai3, bvanassche
Cc: linux-block, linux-kernel, yukuai1, yi.zhang, yangerkun
From: Yu Kuai <yukuai3@huawei.com>
min_shallow_depth must be less than or equal to any shallow_depth value,
and mq-deadline currently uses 1, which drops the default wake_batch to
1 and causes performance degradation for fast disks under high
concurrency.
Fix the problem by using the queue's async_depth, so that
min_shallow_depth can be updated whenever the user sets a new value;
hence it is no longer necessary to use 1.
Fixes: 39823b47bbd4 ("block/mq-deadline: Fix the tag reservation code")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
block/mq-deadline.c | 43 ++++++++++---------------------------------
1 file changed, 10 insertions(+), 33 deletions(-)
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 853985bd13d4..8d19685cce3e 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -98,7 +98,6 @@ struct deadline_data {
int fifo_batch;
int writes_starved;
int front_merges;
- u32 async_depth;
int prio_aging_expire;
spinlock_t lock;
@@ -493,8 +492,6 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
*/
static void dd_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data)
{
- struct deadline_data *dd = data->q->elevator->elevator_data;
-
/* Do not throttle synchronous reads. */
if (op_is_sync(opf) && !op_is_write(opf))
return;
@@ -503,25 +500,19 @@ static void dd_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data)
* Throttle asynchronous requests and writes such that these requests
* do not block the allocation of synchronous requests.
*/
- data->shallow_depth = dd->async_depth;
+ data->shallow_depth = data->q->async_depth;
}
-/* Called by blk_mq_update_nr_requests(). */
-static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
+static int dd_async_depth_updated(struct request_queue *q,
+ unsigned int async_depth)
{
- struct request_queue *q = hctx->queue;
- struct deadline_data *dd = q->elevator->elevator_data;
- struct blk_mq_tags *tags = hctx->sched_tags;
-
- dd->async_depth = q->nr_requests;
+ struct blk_mq_hw_ctx *hctx;
+ unsigned long i;
- sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
-}
-
-/* Called by blk_mq_init_hctx() and blk_mq_init_sched(). */
-static int dd_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
-{
- dd_depth_updated(hctx);
+ q->async_depth = async_depth;
+ queue_for_each_hw_ctx(q, hctx, i)
+ sbitmap_queue_min_shallow_depth(&hctx->sched_tags->bitmap_tags,
+ async_depth ? async_depth : UINT_MAX);
return 0;
}
@@ -781,7 +772,6 @@ SHOW_JIFFIES(deadline_write_expire_show, dd->fifo_expire[DD_WRITE]);
SHOW_JIFFIES(deadline_prio_aging_expire_show, dd->prio_aging_expire);
SHOW_INT(deadline_writes_starved_show, dd->writes_starved);
SHOW_INT(deadline_front_merges_show, dd->front_merges);
-SHOW_INT(deadline_async_depth_show, dd->async_depth);
SHOW_INT(deadline_fifo_batch_show, dd->fifo_batch);
#undef SHOW_INT
#undef SHOW_JIFFIES
@@ -811,7 +801,6 @@ STORE_JIFFIES(deadline_write_expire_store, &dd->fifo_expire[DD_WRITE], 0, INT_MA
STORE_JIFFIES(deadline_prio_aging_expire_store, &dd->prio_aging_expire, 0, INT_MAX);
STORE_INT(deadline_writes_starved_store, &dd->writes_starved, INT_MIN, INT_MAX);
STORE_INT(deadline_front_merges_store, &dd->front_merges, 0, 1);
-STORE_INT(deadline_async_depth_store, &dd->async_depth, 1, INT_MAX);
STORE_INT(deadline_fifo_batch_store, &dd->fifo_batch, 0, INT_MAX);
#undef STORE_FUNCTION
#undef STORE_INT
@@ -825,7 +814,6 @@ static struct elv_fs_entry deadline_attrs[] = {
DD_ATTR(write_expire),
DD_ATTR(writes_starved),
DD_ATTR(front_merges),
- DD_ATTR(async_depth),
DD_ATTR(fifo_batch),
DD_ATTR(prio_aging_expire),
__ATTR_NULL
@@ -912,15 +900,6 @@ static int deadline_starved_show(void *data, struct seq_file *m)
return 0;
}
-static int dd_async_depth_show(void *data, struct seq_file *m)
-{
- struct request_queue *q = data;
- struct deadline_data *dd = q->elevator->elevator_data;
-
- seq_printf(m, "%u\n", dd->async_depth);
- return 0;
-}
-
static int dd_queued_show(void *data, struct seq_file *m)
{
struct request_queue *q = data;
@@ -1030,7 +1009,6 @@ static const struct blk_mq_debugfs_attr deadline_queue_debugfs_attrs[] = {
DEADLINE_NEXT_RQ_ATTR(write2),
{"batching", 0400, deadline_batching_show},
{"starved", 0400, deadline_starved_show},
- {"async_depth", 0400, dd_async_depth_show},
{"dispatch0", 0400, .seq_ops = &deadline_dispatch0_seq_ops},
{"dispatch1", 0400, .seq_ops = &deadline_dispatch1_seq_ops},
{"dispatch2", 0400, .seq_ops = &deadline_dispatch2_seq_ops},
@@ -1043,7 +1021,6 @@ static const struct blk_mq_debugfs_attr deadline_queue_debugfs_attrs[] = {
static struct elevator_type mq_deadline = {
.ops = {
- .depth_updated = dd_depth_updated,
.limit_depth = dd_limit_depth,
.insert_requests = dd_insert_requests,
.dispatch_request = dd_dispatch_request,
@@ -1058,7 +1035,7 @@ static struct elevator_type mq_deadline = {
.has_work = dd_has_work,
.init_sched = dd_init_sched,
.exit_sched = dd_exit_sched,
- .init_hctx = dd_init_hctx,
+ .async_depth_updated = dd_async_depth_updated,
},
#ifdef CONFIG_BLK_DEBUG_FS
--
2.39.2
* [PATCH RFC v3 7/7] block/kyber-iosched: switch to use queue async_depth
2024-12-21 9:37 [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation Yu Kuai
` (5 preceding siblings ...)
2024-12-21 9:37 ` [PATCH RFC v3 6/7] block/mq-deadline: switch to use queue async_depth Yu Kuai
@ 2024-12-21 9:37 ` Yu Kuai
2025-02-13 2:47 ` [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation Yu Kuai
7 siblings, 0 replies; 9+ messages in thread
From: Yu Kuai @ 2024-12-21 9:37 UTC (permalink / raw)
To: axboe, akpm, ming.lei, yang.yang, yukuai3, bvanassche
Cc: linux-block, linux-kernel, yukuai1, yi.zhang, yangerkun
From: Yu Kuai <yukuai3@huawei.com>
Make the code cleaner.
Note that for kyber, async_depth is still always 75% of nr_requests, and
the user can't set a new value for now.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
block/kyber-iosched.c | 26 +++++---------------------
1 file changed, 5 insertions(+), 21 deletions(-)
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index ccfefa6a3669..0909851f22c6 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -157,9 +157,6 @@ struct kyber_queue_data {
*/
struct sbitmap_queue domain_tokens[KYBER_NUM_DOMAINS];
- /* Number of allowed async requests. */
- unsigned int async_depth;
-
struct kyber_cpu_latency __percpu *cpu_latency;
/* Timer for stats aggregation and adjusting domain tokens. */
@@ -449,11 +446,11 @@ static void kyber_ctx_queue_init(struct kyber_ctx_queue *kcq)
static void kyber_depth_updated(struct blk_mq_hw_ctx *hctx)
{
- struct kyber_queue_data *kqd = hctx->queue->elevator->elevator_data;
+ struct request_queue *q = hctx->queue;
struct blk_mq_tags *tags = hctx->sched_tags;
- kqd->async_depth = hctx->queue->nr_requests * KYBER_ASYNC_PERCENT / 100U;
- sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, kqd->async_depth);
+ q->async_depth = hctx->queue->nr_requests * KYBER_ASYNC_PERCENT / 100U;
+ sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, q->async_depth);
}
static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
@@ -552,11 +549,8 @@ static void kyber_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data)
* We use the scheduler tags as per-hardware queue queueing tokens.
* Async requests can be limited at this stage.
*/
- if (!op_is_sync(opf)) {
- struct kyber_queue_data *kqd = data->q->elevator->elevator_data;
-
- data->shallow_depth = kqd->async_depth;
- }
+ if (!op_is_sync(opf))
+ data->shallow_depth = data->q->async_depth;
}
static bool kyber_bio_merge(struct request_queue *q, struct bio *bio,
@@ -952,15 +946,6 @@ KYBER_DEBUGFS_DOMAIN_ATTRS(KYBER_DISCARD, discard)
KYBER_DEBUGFS_DOMAIN_ATTRS(KYBER_OTHER, other)
#undef KYBER_DEBUGFS_DOMAIN_ATTRS
-static int kyber_async_depth_show(void *data, struct seq_file *m)
-{
- struct request_queue *q = data;
- struct kyber_queue_data *kqd = q->elevator->elevator_data;
-
- seq_printf(m, "%u\n", kqd->async_depth);
- return 0;
-}
-
static int kyber_cur_domain_show(void *data, struct seq_file *m)
{
struct blk_mq_hw_ctx *hctx = data;
@@ -986,7 +971,6 @@ static const struct blk_mq_debugfs_attr kyber_queue_debugfs_attrs[] = {
KYBER_QUEUE_DOMAIN_ATTRS(write),
KYBER_QUEUE_DOMAIN_ATTRS(discard),
KYBER_QUEUE_DOMAIN_ATTRS(other),
- {"async_depth", 0400, kyber_async_depth_show},
{},
};
#undef KYBER_QUEUE_DOMAIN_ATTRS
--
2.39.2
* Re: [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation
2024-12-21 9:37 [PATCH RFC v3 0/7] lib/sbitmap: fix shallow_depth tag allocation Yu Kuai
` (6 preceding siblings ...)
2024-12-21 9:37 ` [PATCH RFC v3 7/7] block/kyber-iosched: " Yu Kuai
@ 2025-02-13 2:47 ` Yu Kuai
7 siblings, 0 replies; 9+ messages in thread
From: Yu Kuai @ 2025-02-13 2:47 UTC (permalink / raw)
To: Yu Kuai, axboe, akpm, ming.lei, yang.yang, bvanassche
Cc: linux-block, linux-kernel, yi.zhang, yangerkun, yukuai (C)
On 2024/12/21 17:37, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
>
> Changes in RFC v3:
> - only patch 1,2 is from v1/v2, others are new patches, in order to change
> async_depth to request queue attribute.
>
> Changes in RFC v2:
> - update commit message for patch 1;
> - also handle min_shallow_depth in patch 2;
> - add patch 3 to choose none elevator by default;
> - add patch 4 to fix default wake_batch;
>
Friendly ping ...
> Yu Kuai (7):
> lib/sbitmap: convert shallow_depth from one word to the whole sbitmap
> lib/sbitmap: make sbitmap_get_shallow() internal
> block/elevator: add new ops async_depth_updated()
> block: change the field nr_requests to unsigned int in request_queue
> block: add new queue sysfs api async_depth
> block/mq-deadline: switch to use queue async_depth
> block/kyber-iosched: switch to use queue async_depth
>
> block/blk-sysfs.c | 35 ++++++++++++++++++++
> block/elevator.h | 1 +
> block/kyber-iosched.c | 31 +++--------------
> block/mq-deadline.c | 57 ++++++--------------------------
> include/linux/blkdev.h | 8 ++++-
> include/linux/sbitmap.h | 19 +----------
> lib/sbitmap.c | 73 +++++++++++++++++++++++++----------------
> 7 files changed, 103 insertions(+), 121 deletions(-)
>