* RFC: fix queue freeze and limit locking order (alt take)
@ 2025-01-06 10:06 Christoph Hellwig
2025-01-06 10:06 ` [PATCH 01/10] block: fix docs for freezing of queue limits updates Christoph Hellwig
` (9 more replies)
0 siblings, 10 replies; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
Hi all,
this is my version of Damien's "Fix queue freeze and limit locking order".
A lot looks very similar, but it was done independently based on the
previous discussion.
The changelogs are very sparse as the moment as I started it before the
holidays and just finished it enough to send it out as a RFC today.
Diffstat:
block/blk-core.c | 14 +++-
block/blk-integrity.c | 4 -
block/blk-mq.c | 19 ------
block/blk-mq.h | 6 +
block/blk-settings.c | 27 ++++++++
block/blk-sysfs.c | 128 +++++++++++++++++++----------------------
block/blk-zoned.c | 7 --
drivers/block/loop.c | 7 ++
drivers/block/nbd.c | 17 -----
drivers/block/virtio_blk.c | 4 -
drivers/nvme/host/core.c | 9 +-
drivers/scsi/sd.c | 17 +----
drivers/scsi/sr.c | 5 -
drivers/usb/storage/scsiglue.c | 5 -
include/linux/blkdev.h | 5 -
15 files changed, 133 insertions(+), 141 deletions(-)
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 01/10] block: fix docs for freezing of queue limits updates
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
@ 2025-01-06 10:06 ` Christoph Hellwig
2025-01-06 10:39 ` Damien Le Moal
2025-01-06 10:06 ` [PATCH 02/10] block: add a queue_limits_commit_update_frozen helper Christoph Hellwig
` (8 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
queue_limits_commit_update is the function that needs to operate on a
frozen queue, not queue_limits_start_update. Update the kerneldoc
comments to reflect that.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/blk-settings.c | 3 ++-
include/linux/blkdev.h | 3 +--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 8f09e33f41f6..4187c3e8a07f 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -413,7 +413,8 @@ int blk_set_default_limits(struct queue_limits *lim)
* @lim: limits to apply
*
* Apply the limits in @lim that were obtained from queue_limits_start_update()
- * and updated by the caller to @q.
+ * and updated by the caller to @q. The caller must have frozen the queue or
+ * ensured that there is outstanding I/O by other means.
*
* Returns 0 if successful, else a negative error code.
*/
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 5d40af2ef971..e781d4e6f92d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -944,8 +944,7 @@ static inline unsigned int blk_boundary_sectors_left(sector_t offset,
* the caller can modify. The caller must call queue_limits_commit_update()
* to finish the update.
*
- * Context: process context. The caller must have frozen the queue or ensured
- * that there is outstanding I/O by other means.
+ * Context: process context.
*/
static inline struct queue_limits
queue_limits_start_update(struct request_queue *q)
--
2.45.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 02/10] block: add a queue_limits_commit_update_frozen helper
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
2025-01-06 10:06 ` [PATCH 01/10] block: fix docs for freezing of queue limits updates Christoph Hellwig
@ 2025-01-06 10:06 ` Christoph Hellwig
2025-01-06 10:41 ` Damien Le Moal
2025-01-06 10:06 ` [PATCH 03/10] block: add a store_limit operations for sysfs entries Christoph Hellwig
` (7 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
Add a helper that freezes the queue, updates the queue limits and
unfreezes the queue and convert all open coded versions of that to the
new helper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/blk-integrity.c | 4 +---
block/blk-settings.c | 24 ++++++++++++++++++++++++
block/blk-zoned.c | 7 +------
drivers/block/virtio_blk.c | 2 --
drivers/scsi/sd.c | 17 +++++------------
drivers/scsi/sr.c | 5 +----
include/linux/blkdev.h | 2 ++
7 files changed, 34 insertions(+), 27 deletions(-)
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index b180cac61a9d..013469faa5e7 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -218,9 +218,7 @@ static ssize_t flag_store(struct device *dev, const char *page, size_t count,
else
lim.integrity.flags |= flag;
- blk_mq_freeze_queue(q);
- err = queue_limits_commit_update(q, &lim);
- blk_mq_unfreeze_queue(q);
+ err = queue_limits_commit_update_frozen(q, &lim);
if (err)
return err;
return count;
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 4187c3e8a07f..4a1d0729f34d 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -444,6 +444,30 @@ int queue_limits_commit_update(struct request_queue *q,
}
EXPORT_SYMBOL_GPL(queue_limits_commit_update);
+/**
+ * queue_limits_commit_update_frozen - commit an atomic update of queue limits
+ * @q: queue to update
+ * @lim: limits to apply
+ *
+ * Apply the limits in @lim that were obtained from queue_limits_start_update()
+ * and updated by the caller to @q. Freezes the queue before the updated and
+ * unfreezes it after.
+ *
+ * Returns 0 if successful, else a negative error code.
+ */
+int queue_limits_commit_update_frozen(struct request_queue *q,
+ struct queue_limits *lim)
+{
+ int ret;
+
+ blk_mq_freeze_queue(q);
+ ret = queue_limits_commit_update(q, lim);
+ blk_mq_unfreeze_queue(q);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(queue_limits_commit_update_frozen);
+
/**
* queue_limits_set - apply queue limits to queue
* @q: queue to update
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 4b0be40a8ea7..9d08a54c201e 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -1444,7 +1444,6 @@ static int disk_update_zone_resources(struct gendisk *disk,
unsigned int nr_seq_zones, nr_conv_zones;
unsigned int pool_size;
struct queue_limits lim;
- int ret;
disk->nr_zones = args->nr_zones;
disk->zone_capacity = args->zone_capacity;
@@ -1495,11 +1494,7 @@ static int disk_update_zone_resources(struct gendisk *disk,
}
commit:
- blk_mq_freeze_queue(q);
- ret = queue_limits_commit_update(q, &lim);
- blk_mq_unfreeze_queue(q);
-
- return ret;
+ return queue_limits_commit_update_frozen(q, &lim);
}
static int blk_revalidate_conv_zone(struct blk_zone *zone, unsigned int idx,
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 71a7ffeafb32..0a987f195630 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -1105,9 +1105,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
lim.features |= BLK_FEAT_WRITE_CACHE;
else
lim.features &= ~BLK_FEAT_WRITE_CACHE;
- blk_mq_freeze_queue(disk->queue);
i = queue_limits_commit_update(disk->queue, &lim);
- blk_mq_unfreeze_queue(disk->queue);
if (i)
return i;
return count;
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 8947dab132d7..af62a8ed8620 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -177,9 +177,8 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
lim = queue_limits_start_update(sdkp->disk->queue);
sd_set_flush_flag(sdkp, &lim);
- blk_mq_freeze_queue(sdkp->disk->queue);
- ret = queue_limits_commit_update(sdkp->disk->queue, &lim);
- blk_mq_unfreeze_queue(sdkp->disk->queue);
+ ret = queue_limits_commit_update_frozen(sdkp->disk->queue,
+ &lim);
if (ret)
return ret;
return count;
@@ -483,9 +482,7 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
lim = queue_limits_start_update(sdkp->disk->queue);
sd_config_discard(sdkp, &lim, mode);
- blk_mq_freeze_queue(sdkp->disk->queue);
- err = queue_limits_commit_update(sdkp->disk->queue, &lim);
- blk_mq_unfreeze_queue(sdkp->disk->queue);
+ err = queue_limits_commit_update_frozen(sdkp->disk->queue, &lim);
if (err)
return err;
return count;
@@ -594,9 +591,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
lim = queue_limits_start_update(sdkp->disk->queue);
sd_config_write_same(sdkp, &lim);
- blk_mq_freeze_queue(sdkp->disk->queue);
- err = queue_limits_commit_update(sdkp->disk->queue, &lim);
- blk_mq_unfreeze_queue(sdkp->disk->queue);
+ err = queue_limits_commit_update_frozen(sdkp->disk->queue, &lim);
if (err)
return err;
return count;
@@ -3803,9 +3798,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
sd_config_write_same(sdkp, &lim);
kfree(buffer);
- blk_mq_freeze_queue(sdkp->disk->queue);
- err = queue_limits_commit_update(sdkp->disk->queue, &lim);
- blk_mq_unfreeze_queue(sdkp->disk->queue);
+ err = queue_limits_commit_update_frozen(sdkp->disk->queue, &lim);
if (err)
return err;
diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index 198bec87bb8e..b17796d5ee66 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -797,10 +797,7 @@ static int get_sectorsize(struct scsi_cd *cd)
lim = queue_limits_start_update(q);
lim.logical_block_size = sector_size;
- blk_mq_freeze_queue(q);
- err = queue_limits_commit_update(q, &lim);
- blk_mq_unfreeze_queue(q);
- return err;
+ return queue_limits_commit_update_frozen(q, &lim);
}
static int get_capabilities(struct scsi_cd *cd)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e781d4e6f92d..13d353351c37 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -952,6 +952,8 @@ queue_limits_start_update(struct request_queue *q)
mutex_lock(&q->limits_lock);
return q->limits;
}
+int queue_limits_commit_update_frozen(struct request_queue *q,
+ struct queue_limits *lim);
int queue_limits_commit_update(struct request_queue *q,
struct queue_limits *lim);
int queue_limits_set(struct request_queue *q, struct queue_limits *lim);
--
2.45.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 03/10] block: add a store_limit operations for sysfs entries
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
2025-01-06 10:06 ` [PATCH 01/10] block: fix docs for freezing of queue limits updates Christoph Hellwig
2025-01-06 10:06 ` [PATCH 02/10] block: add a queue_limits_commit_update_frozen helper Christoph Hellwig
@ 2025-01-06 10:06 ` Christoph Hellwig
2025-01-06 10:46 ` Damien Le Moal
2025-01-06 10:06 ` [PATCH 04/10] block: use queue_limits_commit_update in queue_attr_store Christoph Hellwig
` (6 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
De-duplicate the code for updating queue limits by adding a store_limit
method that allows having common code handle the actual queue limits
update.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/blk-sysfs.c | 128 ++++++++++++++++++++++------------------------
1 file changed, 61 insertions(+), 67 deletions(-)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 767598e719ab..8d69315e986d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -24,6 +24,8 @@ struct queue_sysfs_entry {
struct attribute attr;
ssize_t (*show)(struct gendisk *disk, char *page);
ssize_t (*store)(struct gendisk *disk, const char *page, size_t count);
+ int (*store_limit)(struct gendisk *disk, const char *page,
+ size_t count, struct queue_limits *lim);
void (*load_module)(struct gendisk *disk, const char *page, size_t count);
};
@@ -153,13 +155,11 @@ QUEUE_SYSFS_SHOW_CONST(discard_zeroes_data, 0)
QUEUE_SYSFS_SHOW_CONST(write_same_max, 0)
QUEUE_SYSFS_SHOW_CONST(poll_delay, -1)
-static ssize_t queue_max_discard_sectors_store(struct gendisk *disk,
- const char *page, size_t count)
+static int queue_max_discard_sectors_store(struct gendisk *disk,
+ const char *page, size_t count, struct queue_limits *lim)
{
unsigned long max_discard_bytes;
- struct queue_limits lim;
ssize_t ret;
- int err;
ret = queue_var_store(&max_discard_bytes, page, count);
if (ret < 0)
@@ -171,38 +171,28 @@ static ssize_t queue_max_discard_sectors_store(struct gendisk *disk,
if ((max_discard_bytes >> SECTOR_SHIFT) > UINT_MAX)
return -EINVAL;
- lim = queue_limits_start_update(disk->queue);
- lim.max_user_discard_sectors = max_discard_bytes >> SECTOR_SHIFT;
- err = queue_limits_commit_update(disk->queue, &lim);
- if (err)
- return err;
- return ret;
+ lim->max_user_discard_sectors = max_discard_bytes >> SECTOR_SHIFT;
+ return 0;
}
-static ssize_t
-queue_max_sectors_store(struct gendisk *disk, const char *page, size_t count)
+static int
+queue_max_sectors_store(struct gendisk *disk, const char *page, size_t count,
+ struct queue_limits *lim)
{
unsigned long max_sectors_kb;
- struct queue_limits lim;
ssize_t ret;
- int err;
ret = queue_var_store(&max_sectors_kb, page, count);
if (ret < 0)
return ret;
- lim = queue_limits_start_update(disk->queue);
- lim.max_user_sectors = max_sectors_kb << 1;
- err = queue_limits_commit_update(disk->queue, &lim);
- if (err)
- return err;
- return ret;
+ lim->max_user_sectors = max_sectors_kb << 1;
+ return 0;
}
static ssize_t queue_feature_store(struct gendisk *disk, const char *page,
- size_t count, blk_features_t feature)
+ size_t count, struct queue_limits *lim, blk_features_t feature)
{
- struct queue_limits lim;
unsigned long val;
ssize_t ret;
@@ -210,15 +200,11 @@ static ssize_t queue_feature_store(struct gendisk *disk, const char *page,
if (ret < 0)
return ret;
- lim = queue_limits_start_update(disk->queue);
if (val)
- lim.features |= feature;
+ lim->features |= feature;
else
- lim.features &= ~feature;
- ret = queue_limits_commit_update(disk->queue, &lim);
- if (ret)
- return ret;
- return count;
+ lim->features &= ~feature;
+ return 0;
}
#define QUEUE_SYSFS_FEATURE(_name, _feature) \
@@ -227,10 +213,10 @@ static ssize_t queue_##_name##_show(struct gendisk *disk, char *page) \
return sysfs_emit(page, "%u\n", \
!!(disk->queue->limits.features & _feature)); \
} \
-static ssize_t queue_##_name##_store(struct gendisk *disk, \
- const char *page, size_t count) \
+static int queue_##_name##_store(struct gendisk *disk, \
+ const char *page, size_t count, struct queue_limits *lim) \
{ \
- return queue_feature_store(disk, page, count, _feature); \
+ return queue_feature_store(disk, page, count, lim, _feature); \
}
QUEUE_SYSFS_FEATURE(rotational, BLK_FEAT_ROTATIONAL)
@@ -266,10 +252,9 @@ static ssize_t queue_iostats_passthrough_show(struct gendisk *disk, char *page)
return queue_var_show(!!blk_queue_passthrough_stat(disk->queue), page);
}
-static ssize_t queue_iostats_passthrough_store(struct gendisk *disk,
- const char *page, size_t count)
+static int queue_iostats_passthrough_store(struct gendisk *disk,
+ const char *page, size_t count, struct queue_limits *lim)
{
- struct queue_limits lim;
unsigned long ios;
ssize_t ret;
@@ -277,18 +262,13 @@ static ssize_t queue_iostats_passthrough_store(struct gendisk *disk,
if (ret < 0)
return ret;
- lim = queue_limits_start_update(disk->queue);
if (ios)
- lim.flags |= BLK_FLAG_IOSTATS_PASSTHROUGH;
+ lim->flags |= BLK_FLAG_IOSTATS_PASSTHROUGH;
else
- lim.flags &= ~BLK_FLAG_IOSTATS_PASSTHROUGH;
-
- ret = queue_limits_commit_update(disk->queue, &lim);
- if (ret)
- return ret;
-
- return count;
+ lim->flags &= ~BLK_FLAG_IOSTATS_PASSTHROUGH;
+ return 0;
}
+
static ssize_t queue_nomerges_show(struct gendisk *disk, char *page)
{
return queue_var_show((blk_queue_nomerges(disk->queue) << 1) |
@@ -391,12 +371,10 @@ static ssize_t queue_wc_show(struct gendisk *disk, char *page)
return sysfs_emit(page, "write through\n");
}
-static ssize_t queue_wc_store(struct gendisk *disk, const char *page,
- size_t count)
+static int queue_wc_store(struct gendisk *disk, const char *page,
+ size_t count, struct queue_limits *lim)
{
- struct queue_limits lim;
bool disable;
- int err;
if (!strncmp(page, "write back", 10)) {
disable = false;
@@ -407,15 +385,11 @@ static ssize_t queue_wc_store(struct gendisk *disk, const char *page,
return -EINVAL;
}
- lim = queue_limits_start_update(disk->queue);
if (disable)
- lim.flags |= BLK_FLAG_WRITE_CACHE_DISABLED;
+ lim->flags |= BLK_FLAG_WRITE_CACHE_DISABLED;
else
- lim.flags &= ~BLK_FLAG_WRITE_CACHE_DISABLED;
- err = queue_limits_commit_update(disk->queue, &lim);
- if (err)
- return err;
- return count;
+ lim->flags &= ~BLK_FLAG_WRITE_CACHE_DISABLED;
+ return 0;
}
#define QUEUE_RO_ENTRY(_prefix, _name) \
@@ -431,6 +405,13 @@ static struct queue_sysfs_entry _prefix##_entry = { \
.store = _prefix##_store, \
};
+#define QUEUE_LIM_RW_ENTRY(_prefix, _name) \
+static struct queue_sysfs_entry _prefix##_entry = { \
+ .attr = { .name = _name, .mode = 0644 }, \
+ .show = _prefix##_show, \
+ .store_limit = _prefix##_store, \
+}
+
#define QUEUE_RW_LOAD_MODULE_ENTRY(_prefix, _name) \
static struct queue_sysfs_entry _prefix##_entry = { \
.attr = { .name = _name, .mode = 0644 }, \
@@ -441,7 +422,7 @@ static struct queue_sysfs_entry _prefix##_entry = { \
QUEUE_RW_ENTRY(queue_requests, "nr_requests");
QUEUE_RW_ENTRY(queue_ra, "read_ahead_kb");
-QUEUE_RW_ENTRY(queue_max_sectors, "max_sectors_kb");
+QUEUE_LIM_RW_ENTRY(queue_max_sectors, "max_sectors_kb");
QUEUE_RO_ENTRY(queue_max_hw_sectors, "max_hw_sectors_kb");
QUEUE_RO_ENTRY(queue_max_segments, "max_segments");
QUEUE_RO_ENTRY(queue_max_integrity_segments, "max_integrity_segments");
@@ -457,7 +438,7 @@ QUEUE_RO_ENTRY(queue_io_opt, "optimal_io_size");
QUEUE_RO_ENTRY(queue_max_discard_segments, "max_discard_segments");
QUEUE_RO_ENTRY(queue_discard_granularity, "discard_granularity");
QUEUE_RO_ENTRY(queue_max_hw_discard_sectors, "discard_max_hw_bytes");
-QUEUE_RW_ENTRY(queue_max_discard_sectors, "discard_max_bytes");
+QUEUE_LIM_RW_ENTRY(queue_max_discard_sectors, "discard_max_bytes");
QUEUE_RO_ENTRY(queue_discard_zeroes_data, "discard_zeroes_data");
QUEUE_RO_ENTRY(queue_atomic_write_max_sectors, "atomic_write_max_bytes");
@@ -477,11 +458,11 @@ QUEUE_RO_ENTRY(queue_max_open_zones, "max_open_zones");
QUEUE_RO_ENTRY(queue_max_active_zones, "max_active_zones");
QUEUE_RW_ENTRY(queue_nomerges, "nomerges");
-QUEUE_RW_ENTRY(queue_iostats_passthrough, "iostats_passthrough");
+QUEUE_LIM_RW_ENTRY(queue_iostats_passthrough, "iostats_passthrough");
QUEUE_RW_ENTRY(queue_rq_affinity, "rq_affinity");
QUEUE_RW_ENTRY(queue_poll, "io_poll");
QUEUE_RW_ENTRY(queue_poll_delay, "io_poll_delay");
-QUEUE_RW_ENTRY(queue_wc, "write_cache");
+QUEUE_LIM_RW_ENTRY(queue_wc, "write_cache");
QUEUE_RO_ENTRY(queue_fua, "fua");
QUEUE_RO_ENTRY(queue_dax, "dax");
QUEUE_RW_ENTRY(queue_io_timeout, "io_timeout");
@@ -494,10 +475,10 @@ static struct queue_sysfs_entry queue_hw_sector_size_entry = {
.show = queue_logical_block_size_show,
};
-QUEUE_RW_ENTRY(queue_rotational, "rotational");
-QUEUE_RW_ENTRY(queue_iostats, "iostats");
-QUEUE_RW_ENTRY(queue_add_random, "add_random");
-QUEUE_RW_ENTRY(queue_stable_writes, "stable_writes");
+QUEUE_LIM_RW_ENTRY(queue_rotational, "rotational");
+QUEUE_LIM_RW_ENTRY(queue_iostats, "iostats");
+QUEUE_LIM_RW_ENTRY(queue_add_random, "add_random");
+QUEUE_LIM_RW_ENTRY(queue_stable_writes, "stable_writes");
#ifdef CONFIG_BLK_WBT
static ssize_t queue_var_store64(s64 *var, const char *page)
@@ -695,7 +676,7 @@ queue_attr_store(struct kobject *kobj, struct attribute *attr,
struct request_queue *q = disk->queue;
ssize_t res;
- if (!entry->store)
+ if (!entry->store_limit && !entry->store)
return -EIO;
/*
@@ -706,11 +687,24 @@ queue_attr_store(struct kobject *kobj, struct attribute *attr,
if (entry->load_module)
entry->load_module(disk, page, length);
- blk_mq_freeze_queue(q);
mutex_lock(&q->sysfs_lock);
- res = entry->store(disk, page, length);
- mutex_unlock(&q->sysfs_lock);
+ blk_mq_freeze_queue(q);
+ if (entry->store_limit) {
+ struct queue_limits lim = queue_limits_start_update(q);
+
+ res = entry->store_limit(disk, page, length, &lim);
+ if (res < 0) {
+ queue_limits_cancel_update(q);
+ } else {
+ res = queue_limits_commit_update(q, &lim);
+ if (!res)
+ res = length;
+ }
+ } else {
+ res = entry->store(disk, page, length);
+ }
blk_mq_unfreeze_queue(q);
+ mutex_unlock(&q->sysfs_lock);
return res;
}
--
2.45.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 04/10] block: use queue_limits_commit_update in queue_attr_store
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
` (2 preceding siblings ...)
2025-01-06 10:06 ` [PATCH 03/10] block: add a store_limit operations for sysfs entries Christoph Hellwig
@ 2025-01-06 10:06 ` Christoph Hellwig
2025-01-06 10:48 ` Damien Le Moal
2025-01-06 10:06 ` [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues Christoph Hellwig
` (5 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
Use queue_limits_commit_update to apply a consistent freeze vs limits
lock order in queue_attr_store. Also remove taking the sysfs lock
as it doesn't protect anything here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/blk-sysfs.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 8d69315e986d..3bac27fcd635 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -687,22 +687,24 @@ queue_attr_store(struct kobject *kobj, struct attribute *attr,
if (entry->load_module)
entry->load_module(disk, page, length);
- mutex_lock(&q->sysfs_lock);
- blk_mq_freeze_queue(q);
if (entry->store_limit) {
struct queue_limits lim = queue_limits_start_update(q);
res = entry->store_limit(disk, page, length, &lim);
if (res < 0) {
queue_limits_cancel_update(q);
- } else {
- res = queue_limits_commit_update(q, &lim);
- if (!res)
- res = length;
+ return res;
}
- } else {
- res = entry->store(disk, page, length);
+
+ res = queue_limits_commit_update_frozen(q, &lim);
+ if (res)
+ return res;
+ return length;
}
+
+ mutex_lock(&q->sysfs_lock);
+ blk_mq_freeze_queue(q);
+ res = entry->store(disk, page, length);
blk_mq_unfreeze_queue(q);
mutex_unlock(&q->sysfs_lock);
return res;
--
2.45.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
` (3 preceding siblings ...)
2025-01-06 10:06 ` [PATCH 04/10] block: use queue_limits_commit_update in queue_attr_store Christoph Hellwig
@ 2025-01-06 10:06 ` Christoph Hellwig
2025-01-06 10:55 ` Damien Le Moal
2025-01-06 11:01 ` Nilay Shroff
2025-01-06 10:06 ` [PATCH 06/10] virtio_blk: use queue_limits_commit_update_frozen in cache_type_store Christoph Hellwig
` (4 subsequent siblings)
9 siblings, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
When __blk_mq_update_nr_hw_queues changes the number of tag sets, it
might have to disable poll queues. Currently it does so by adjusting
the BLK_FEAT_POLL, which is a bit against the intent of features that
describe hardware / driver capabilities, but more importantly causes
nasty lock order problems with the broadly held freeze when updating the
number of hardware queues and the limits lock. Fix this by leaving
BLK_FEAT_POLL alone, and instead check for the number of sets and poll
queues in the bio submission and poll handler. While this adds extra
work to the fast path, the variables are in cache lines used by these
operations anyway, so it should be cheap enough.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/blk-core.c | 14 +++++++++++---
block/blk-mq.c | 19 +------------------
block/blk-mq.h | 6 ++++++
3 files changed, 18 insertions(+), 21 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 666efe8fa202..483c14a50d9f 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -753,6 +753,15 @@ static blk_status_t blk_validate_atomic_write_op_size(struct request_queue *q,
return BLK_STS_OK;
}
+static bool bdev_can_poll(struct block_device *bdev)
+{
+ struct request_queue *q = bdev_get_queue(bdev);
+
+ if (queue_is_mq(q))
+ return blk_mq_can_poll(q->tag_set);
+ return q->limits.features & BLK_FEAT_POLL;
+}
+
/**
* submit_bio_noacct - re-submit a bio to the block device layer for I/O
* @bio: The bio describing the location in memory and on the device.
@@ -805,8 +814,7 @@ void submit_bio_noacct(struct bio *bio)
}
}
- if (!(q->limits.features & BLK_FEAT_POLL) &&
- (bio->bi_opf & REQ_POLLED)) {
+ if ((bio->bi_opf & REQ_POLLED) && !bdev_can_poll(bdev)) {
bio_clear_polled(bio);
goto not_supported;
}
@@ -935,7 +943,7 @@ int bio_poll(struct bio *bio, struct io_comp_batch *iob, unsigned int flags)
return 0;
q = bdev_get_queue(bdev);
- if (cookie == BLK_QC_T_NONE || !(q->limits.features & BLK_FEAT_POLL))
+ if (cookie == BLK_QC_T_NONE || !bdev_can_poll(bdev))
return 0;
blk_flush_plug(current->plug, false);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 17f10683d640..0a7f059735fa 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4321,12 +4321,6 @@ void blk_mq_release(struct request_queue *q)
blk_mq_sysfs_deinit(q);
}
-static bool blk_mq_can_poll(struct blk_mq_tag_set *set)
-{
- return set->nr_maps > HCTX_TYPE_POLL &&
- set->map[HCTX_TYPE_POLL].nr_queues;
-}
-
struct request_queue *blk_mq_alloc_queue(struct blk_mq_tag_set *set,
struct queue_limits *lim, void *queuedata)
{
@@ -4336,9 +4330,7 @@ struct request_queue *blk_mq_alloc_queue(struct blk_mq_tag_set *set,
if (!lim)
lim = &default_lim;
- lim->features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT;
- if (blk_mq_can_poll(set))
- lim->features |= BLK_FEAT_POLL;
+ lim->features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT | BLK_FEAT_POLL;
q = blk_alloc_queue(lim, set->numa_node);
if (IS_ERR(q))
@@ -5025,8 +5017,6 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
fallback:
blk_mq_update_queue_map(set);
list_for_each_entry(q, &set->tag_list, tag_set_list) {
- struct queue_limits lim;
-
blk_mq_realloc_hw_ctxs(set, q);
if (q->nr_hw_queues != set->nr_hw_queues) {
@@ -5040,13 +5030,6 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
set->nr_hw_queues = prev_nr_hw_queues;
goto fallback;
}
- lim = queue_limits_start_update(q);
- if (blk_mq_can_poll(set))
- lim.features |= BLK_FEAT_POLL;
- else
- lim.features &= ~BLK_FEAT_POLL;
- if (queue_limits_commit_update(q, &lim) < 0)
- pr_warn("updating the poll flag failed\n");
blk_mq_map_swqueue(q);
}
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 89a20fffa4b1..ecd7bd7ec609 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -111,6 +111,12 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
return ctx->hctxs[blk_mq_get_hctx_type(opf)];
}
+static inline bool blk_mq_can_poll(struct blk_mq_tag_set *set)
+{
+ return set->nr_maps > HCTX_TYPE_POLL &&
+ set->map[HCTX_TYPE_POLL].nr_queues;
+}
+
/*
* sysfs helpers
*/
--
2.45.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 06/10] virtio_blk: use queue_limits_commit_update_frozen in cache_type_store
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
` (4 preceding siblings ...)
2025-01-06 10:06 ` [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues Christoph Hellwig
@ 2025-01-06 10:06 ` Christoph Hellwig
2025-01-06 10:56 ` Damien Le Moal
2025-01-06 10:06 ` [PATCH 07/10] usb-storage: use queue_limits_commit_update_frozen in max_sectors_store Christoph Hellwig
` (3 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
So far cache_type_store didn't freeze the queue, fix that by using the
queue_limits_commit_update_frozen helper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/virtio_blk.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 0a987f195630..bbaa26b523b8 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -1105,7 +1105,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
lim.features |= BLK_FEAT_WRITE_CACHE;
else
lim.features &= ~BLK_FEAT_WRITE_CACHE;
- i = queue_limits_commit_update(disk->queue, &lim);
+ i = queue_limits_commit_update_frozen(disk->queue, &lim);
if (i)
return i;
return count;
--
2.45.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 07/10] usb-storage: use queue_limits_commit_update_frozen in max_sectors_store
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
` (5 preceding siblings ...)
2025-01-06 10:06 ` [PATCH 06/10] virtio_blk: use queue_limits_commit_update_frozen in cache_type_store Christoph Hellwig
@ 2025-01-06 10:06 ` Christoph Hellwig
2025-01-06 10:06 ` [PATCH 08/10] nvme: freeze queue after taking limits lock Christoph Hellwig
` (2 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
Use queue_limits_commit_update_frozen so that limits lock is taken
with the queue frozen.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/usb/storage/scsiglue.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 8c8b5e6041cc..dc98ceecb724 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -592,12 +592,9 @@ static ssize_t max_sectors_store(struct device *dev, struct device_attribute *at
if (sscanf(buf, "%hu", &ms) <= 0)
return -EINVAL;
- blk_mq_freeze_queue(sdev->request_queue);
lim = queue_limits_start_update(sdev->request_queue);
lim.max_hw_sectors = ms;
- ret = queue_limits_commit_update(sdev->request_queue, &lim);
- blk_mq_unfreeze_queue(sdev->request_queue);
-
+ ret = queue_limits_commit_update_frozen(sdev->request_queue, &lim);
if (ret)
return ret;
return count;
--
2.45.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 08/10] nvme: freeze queue after taking limits lock
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
` (6 preceding siblings ...)
2025-01-06 10:06 ` [PATCH 07/10] usb-storage: use queue_limits_commit_update_frozen in max_sectors_store Christoph Hellwig
@ 2025-01-06 10:06 ` Christoph Hellwig
2025-01-06 10:06 ` [PATCH 09/10] loop: document why loop_clear_limits updates queue limits without freezing Christoph Hellwig
2025-01-06 10:06 ` [PATCH 10/10] nbd: use queue_limits_commit_update_frozen in nbd_set_size Christoph Hellwig
9 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
Work towards a consistent locking order by always freezing the queue
inside the limits lock.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/nvme/host/core.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 42283d268500..c6b7884bb343 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2128,9 +2128,10 @@ static int nvme_update_ns_info_generic(struct nvme_ns *ns,
struct queue_limits lim;
int ret;
- blk_mq_freeze_queue(ns->disk->queue);
lim = queue_limits_start_update(ns->disk->queue);
nvme_set_ctrl_limits(ns->ctrl, &lim);
+
+ blk_mq_freeze_queue(ns->disk->queue);
ret = queue_limits_commit_update(ns->disk->queue, &lim);
set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
blk_mq_unfreeze_queue(ns->disk->queue);
@@ -2177,12 +2178,12 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
goto out;
}
+ lim = queue_limits_start_update(ns->disk->queue);
+
blk_mq_freeze_queue(ns->disk->queue);
ns->head->lba_shift = id->lbaf[lbaf].ds;
ns->head->nuse = le64_to_cpu(id->nuse);
capacity = nvme_lba_to_sect(ns->head, le64_to_cpu(id->nsze));
-
- lim = queue_limits_start_update(ns->disk->queue);
nvme_set_ctrl_limits(ns->ctrl, &lim);
nvme_configure_metadata(ns->ctrl, ns->head, id, nvm, info);
nvme_set_chunk_sectors(ns, id, &lim);
@@ -2285,6 +2286,7 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
struct queue_limits *ns_lim = &ns->disk->queue->limits;
struct queue_limits lim;
+ lim = queue_limits_start_update(ns->head->disk->queue);
blk_mq_freeze_queue(ns->head->disk->queue);
/*
* queue_limits mixes values that are the hardware limitations
@@ -2301,7 +2303,6 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
* the splitting limits in to make sure we still obey possibly
* lower limitations of other controllers.
*/
- lim = queue_limits_start_update(ns->head->disk->queue);
lim.logical_block_size = ns_lim->logical_block_size;
lim.physical_block_size = ns_lim->physical_block_size;
lim.io_min = ns_lim->io_min;
--
2.45.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 09/10] loop: document why loop_clear_limits updates queue limits without freezing
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
` (7 preceding siblings ...)
2025-01-06 10:06 ` [PATCH 08/10] nvme: freeze queue after taking limits lock Christoph Hellwig
@ 2025-01-06 10:06 ` Christoph Hellwig
2025-01-06 10:06 ` [PATCH 10/10] nbd: use queue_limits_commit_update_frozen in nbd_set_size Christoph Hellwig
9 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/loop.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 836a53eef4b4..84b007b9c38c 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -311,6 +311,13 @@ static void loop_clear_limits(struct loop_device *lo, int mode)
lim.discard_granularity = 0;
}
+ /*
+ * XXX: this updates the queue limits without freezing the queue, which
+ * is against the locking protocol and dangerous. But we can't just
+ * freeze the queue as we're inside the ->queue_rq method here. So this
+ * should move out into a workqueue unless we get the file operations to
+ * advertise if they support specific fallocate operations.
+ */
queue_limits_commit_update(lo->lo_queue, &lim);
}
--
2.45.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 10/10] nbd: use queue_limits_commit_update_frozen in nbd_set_size
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
` (8 preceding siblings ...)
2025-01-06 10:06 ` [PATCH 09/10] loop: document why loop_clear_limits updates queue limits without freezing Christoph Hellwig
@ 2025-01-06 10:06 ` Christoph Hellwig
9 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:06 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Ming Lei, Nilay Shroff, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
Apply the consistent freeze vs limit lock order in nbd_set_size by
using queue_limits_commit_update_frozen. This also allows to get rid of
the lower level __nbd_set_size helper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/nbd.c | 17 ++---------------
1 file changed, 2 insertions(+), 15 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index b1a5af69a66d..ca8bd2f3b941 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -327,8 +327,7 @@ static void nbd_mark_nsock_dead(struct nbd_device *nbd, struct nbd_sock *nsock,
nsock->sent = 0;
}
-static int __nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
- loff_t blksize)
+static int nbd_set_size(struct nbd_device *nbd, loff_t bytesize, loff_t blksize)
{
struct queue_limits lim;
int error;
@@ -368,7 +367,7 @@ static int __nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
lim.logical_block_size = blksize;
lim.physical_block_size = blksize;
- error = queue_limits_commit_update(nbd->disk->queue, &lim);
+ error = queue_limits_commit_update_frozen(nbd->disk->queue, &lim);
if (error)
return error;
@@ -379,18 +378,6 @@ static int __nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
return 0;
}
-static int nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
- loff_t blksize)
-{
- int error;
-
- blk_mq_freeze_queue(nbd->disk->queue);
- error = __nbd_set_size(nbd, bytesize, blksize);
- blk_mq_unfreeze_queue(nbd->disk->queue);
-
- return error;
-}
-
static void nbd_complete_rq(struct request *req)
{
struct nbd_cmd *cmd = blk_mq_rq_to_pdu(req);
--
2.45.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 01/10] block: fix docs for freezing of queue limits updates
2025-01-06 10:06 ` [PATCH 01/10] block: fix docs for freezing of queue limits updates Christoph Hellwig
@ 2025-01-06 10:39 ` Damien Le Moal
0 siblings, 0 replies; 27+ messages in thread
From: Damien Le Moal @ 2025-01-06 10:39 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Ming Lei, Nilay Shroff, linux-block, linux-nvme, nbd,
virtualization, linux-scsi, usb-storage
On 1/6/25 7:06 PM, Christoph Hellwig wrote:
> queue_limits_commit_update is the function that needs to operate on a
> frozen queue, not queue_limits_start_update. Update the kerneldoc
> comments to reflect that.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> block/blk-settings.c | 3 ++-
> include/linux/blkdev.h | 3 +--
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 8f09e33f41f6..4187c3e8a07f 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -413,7 +413,8 @@ int blk_set_default_limits(struct queue_limits *lim)
> * @lim: limits to apply
> *
> * Apply the limits in @lim that were obtained from queue_limits_start_update()
> - * and updated by the caller to @q.
> + * and updated by the caller to @q. The caller must have frozen the queue or
> + * ensured that there is outstanding I/O by other means.
...ensure that there are no outstanding I/Os by other means.
> *
> * Returns 0 if successful, else a negative error code.
> */
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 5d40af2ef971..e781d4e6f92d 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -944,8 +944,7 @@ static inline unsigned int blk_boundary_sectors_left(sector_t offset,
> * the caller can modify. The caller must call queue_limits_commit_update()
> * to finish the update.
> *
> - * Context: process context. The caller must have frozen the queue or ensured
> - * that there is outstanding I/O by other means.
> + * Context: process context.
> */
> static inline struct queue_limits
> queue_limits_start_update(struct request_queue *q)
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 02/10] block: add a queue_limits_commit_update_frozen helper
2025-01-06 10:06 ` [PATCH 02/10] block: add a queue_limits_commit_update_frozen helper Christoph Hellwig
@ 2025-01-06 10:41 ` Damien Le Moal
0 siblings, 0 replies; 27+ messages in thread
From: Damien Le Moal @ 2025-01-06 10:41 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Ming Lei, Nilay Shroff, linux-block, linux-nvme, nbd,
virtualization, linux-scsi, usb-storage
On 1/6/25 7:06 PM, Christoph Hellwig wrote:
> Add a helper that freezes the queue, updates the queue limits and
> unfreezes the queue and convert all open coded versions of that to the
> new helper.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
[...]
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 71a7ffeafb32..0a987f195630 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -1105,9 +1105,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
> lim.features |= BLK_FEAT_WRITE_CACHE;
> else
> lim.features &= ~BLK_FEAT_WRITE_CACHE;
> - blk_mq_freeze_queue(disk->queue);
> i = queue_limits_commit_update(disk->queue, &lim);
You need to change this line to use queue_limits_commit_update_frozen().
Other than that, looks good !
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 03/10] block: add a store_limit operations for sysfs entries
2025-01-06 10:06 ` [PATCH 03/10] block: add a store_limit operations for sysfs entries Christoph Hellwig
@ 2025-01-06 10:46 ` Damien Le Moal
2025-01-06 10:57 ` Christoph Hellwig
0 siblings, 1 reply; 27+ messages in thread
From: Damien Le Moal @ 2025-01-06 10:46 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Ming Lei, Nilay Shroff, linux-block, linux-nvme, nbd,
virtualization, linux-scsi, usb-storage
On 1/6/25 7:06 PM, Christoph Hellwig wrote:
> De-duplicate the code for updating queue limits by adding a store_limit
> method that allows having common code handle the actual queue limits
> update.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
[...]
> @@ -706,11 +687,24 @@ queue_attr_store(struct kobject *kobj, struct attribute *attr,
> if (entry->load_module)
> entry->load_module(disk, page, length);
>
> - blk_mq_freeze_queue(q);
> mutex_lock(&q->sysfs_lock);
> - res = entry->store(disk, page, length);
> - mutex_unlock(&q->sysfs_lock);
> + blk_mq_freeze_queue(q);
The freeze must NOT be done for the "if (entry->store_limit)" case. So this
needs to go int the "else". Otherwise, you still have freeze the take limit
lock order which can cause the ABBA deadlocks in SCSI sd.
> + if (entry->store_limit) {
> + struct queue_limits lim = queue_limits_start_update(q);
> +
> + res = entry->store_limit(disk, page, length, &lim);
> + if (res < 0) {
> + queue_limits_cancel_update(q);
> + } else {
> + res = queue_limits_commit_update(q, &lim);
Here you must use queue_limits_commit_update_frozen().
> + if (!res)
> + res = length;
> + }
> + } else {
> + res = entry->store(disk, page, length);
> + }
> blk_mq_unfreeze_queue(q);
And this one needs to go in the else after the call to entry->store().
> + mutex_unlock(&q->sysfs_lock);
> return res;
> }
>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 04/10] block: use queue_limits_commit_update in queue_attr_store
2025-01-06 10:06 ` [PATCH 04/10] block: use queue_limits_commit_update in queue_attr_store Christoph Hellwig
@ 2025-01-06 10:48 ` Damien Le Moal
0 siblings, 0 replies; 27+ messages in thread
From: Damien Le Moal @ 2025-01-06 10:48 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Ming Lei, Nilay Shroff, linux-block, linux-nvme, nbd,
virtualization, linux-scsi, usb-storage
On 1/6/25 7:06 PM, Christoph Hellwig wrote:
> Use queue_limits_commit_update to apply a consistent freeze vs limits
> lock order in queue_attr_store. Also remove taking the sysfs lock
> as it doesn't protect anything here.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Oops. OK. so please disregard my comments on patch 3. This is all fixed here.
May be sqash this patch with patch 3 ?
> ---
> block/blk-sysfs.c | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 8d69315e986d..3bac27fcd635 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -687,22 +687,24 @@ queue_attr_store(struct kobject *kobj, struct attribute *attr,
> if (entry->load_module)
> entry->load_module(disk, page, length);
>
> - mutex_lock(&q->sysfs_lock);
> - blk_mq_freeze_queue(q);
> if (entry->store_limit) {
> struct queue_limits lim = queue_limits_start_update(q);
>
> res = entry->store_limit(disk, page, length, &lim);
> if (res < 0) {
> queue_limits_cancel_update(q);
> - } else {
> - res = queue_limits_commit_update(q, &lim);
> - if (!res)
> - res = length;
> + return res;
> }
> - } else {
> - res = entry->store(disk, page, length);
> +
> + res = queue_limits_commit_update_frozen(q, &lim);
> + if (res)
> + return res;
> + return length;
> }
> +
> + mutex_lock(&q->sysfs_lock);
> + blk_mq_freeze_queue(q);
> + res = entry->store(disk, page, length);
> blk_mq_unfreeze_queue(q);
> mutex_unlock(&q->sysfs_lock);
> return res;
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues
2025-01-06 10:06 ` [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues Christoph Hellwig
@ 2025-01-06 10:55 ` Damien Le Moal
2025-01-06 10:59 ` Christoph Hellwig
2025-01-06 11:01 ` Nilay Shroff
1 sibling, 1 reply; 27+ messages in thread
From: Damien Le Moal @ 2025-01-06 10:55 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Ming Lei, Nilay Shroff, linux-block, linux-nvme, nbd,
virtualization, linux-scsi, usb-storage
On 1/6/25 7:06 PM, Christoph Hellwig wrote:
> When __blk_mq_update_nr_hw_queues changes the number of tag sets, it
> might have to disable poll queues. Currently it does so by adjusting
> the BLK_FEAT_POLL, which is a bit against the intent of features that
> describe hardware / driver capabilities, but more importantly causes
> nasty lock order problems with the broadly held freeze when updating the
> number of hardware queues and the limits lock. Fix this by leaving
> BLK_FEAT_POLL alone, and instead check for the number of sets and poll
> queues in the bio submission and poll handler. While this adds extra
> work to the fast path, the variables are in cache lines used by these
> operations anyway, so it should be cheap enough.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> block/blk-core.c | 14 +++++++++++---
> block/blk-mq.c | 19 +------------------
> block/blk-mq.h | 6 ++++++
> 3 files changed, 18 insertions(+), 21 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 666efe8fa202..483c14a50d9f 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -753,6 +753,15 @@ static blk_status_t blk_validate_atomic_write_op_size(struct request_queue *q,
> return BLK_STS_OK;
> }
>
> +static bool bdev_can_poll(struct block_device *bdev)
> +{
> + struct request_queue *q = bdev_get_queue(bdev);
> +
> + if (queue_is_mq(q))
> + return blk_mq_can_poll(q->tag_set);
> + return q->limits.features & BLK_FEAT_POLL;
> +}
> +
> /**
> * submit_bio_noacct - re-submit a bio to the block device layer for I/O
> * @bio: The bio describing the location in memory and on the device.
> @@ -805,8 +814,7 @@ void submit_bio_noacct(struct bio *bio)
> }
> }
>
> - if (!(q->limits.features & BLK_FEAT_POLL) &&
> - (bio->bi_opf & REQ_POLLED)) {
> + if ((bio->bi_opf & REQ_POLLED) && !bdev_can_poll(bdev)) {
> bio_clear_polled(bio);
> goto not_supported;
> }
> @@ -935,7 +943,7 @@ int bio_poll(struct bio *bio, struct io_comp_batch *iob, unsigned int flags)
> return 0;
>
> q = bdev_get_queue(bdev);
> - if (cookie == BLK_QC_T_NONE || !(q->limits.features & BLK_FEAT_POLL))
> + if (cookie == BLK_QC_T_NONE || !bdev_can_poll(bdev))
> return 0;
>
> blk_flush_plug(current->plug, false);
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 17f10683d640..0a7f059735fa 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -4321,12 +4321,6 @@ void blk_mq_release(struct request_queue *q)
> blk_mq_sysfs_deinit(q);
> }
>
> -static bool blk_mq_can_poll(struct blk_mq_tag_set *set)
> -{
> - return set->nr_maps > HCTX_TYPE_POLL &&
> - set->map[HCTX_TYPE_POLL].nr_queues;
> -}
> -
> struct request_queue *blk_mq_alloc_queue(struct blk_mq_tag_set *set,
> struct queue_limits *lim, void *queuedata)
> {
> @@ -4336,9 +4330,7 @@ struct request_queue *blk_mq_alloc_queue(struct blk_mq_tag_set *set,
>
> if (!lim)
> lim = &default_lim;
> - lim->features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT;
> - if (blk_mq_can_poll(set))
> - lim->features |= BLK_FEAT_POLL;
> + lim->features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT | BLK_FEAT_POLL;
Why set BLK_FEAT_POLL unconditionally ? This is changing the current default
for many devices, no ?
>
> q = blk_alloc_queue(lim, set->numa_node);
> if (IS_ERR(q))
> @@ -5025,8 +5017,6 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
> fallback:
> blk_mq_update_queue_map(set);
> list_for_each_entry(q, &set->tag_list, tag_set_list) {
> - struct queue_limits lim;
> -
> blk_mq_realloc_hw_ctxs(set, q);
>
> if (q->nr_hw_queues != set->nr_hw_queues) {
> @@ -5040,13 +5030,6 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
> set->nr_hw_queues = prev_nr_hw_queues;
> goto fallback;
> }
> - lim = queue_limits_start_update(q);
> - if (blk_mq_can_poll(set))
> - lim.features |= BLK_FEAT_POLL;
> - else
> - lim.features &= ~BLK_FEAT_POLL;
> - if (queue_limits_commit_update(q, &lim) < 0)
> - pr_warn("updating the poll flag failed\n");
> blk_mq_map_swqueue(q);
> }
>
> diff --git a/block/blk-mq.h b/block/blk-mq.h
> index 89a20fffa4b1..ecd7bd7ec609 100644
> --- a/block/blk-mq.h
> +++ b/block/blk-mq.h
> @@ -111,6 +111,12 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
> return ctx->hctxs[blk_mq_get_hctx_type(opf)];
> }
>
> +static inline bool blk_mq_can_poll(struct blk_mq_tag_set *set)
> +{
> + return set->nr_maps > HCTX_TYPE_POLL &&
> + set->map[HCTX_TYPE_POLL].nr_queues;
> +}
> +
> /*
> * sysfs helpers
> */
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 06/10] virtio_blk: use queue_limits_commit_update_frozen in cache_type_store
2025-01-06 10:06 ` [PATCH 06/10] virtio_blk: use queue_limits_commit_update_frozen in cache_type_store Christoph Hellwig
@ 2025-01-06 10:56 ` Damien Le Moal
2025-01-06 10:59 ` Christoph Hellwig
0 siblings, 1 reply; 27+ messages in thread
From: Damien Le Moal @ 2025-01-06 10:56 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Ming Lei, Nilay Shroff, linux-block, linux-nvme, nbd,
virtualization, linux-scsi, usb-storage
On 1/6/25 7:06 PM, Christoph Hellwig wrote:
> So far cache_type_store didn't freeze the queue, fix that by using the
> queue_limits_commit_update_frozen helper.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
This should be squashed in patch 2, no ?
> ---
> drivers/block/virtio_blk.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 0a987f195630..bbaa26b523b8 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -1105,7 +1105,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
> lim.features |= BLK_FEAT_WRITE_CACHE;
> else
> lim.features &= ~BLK_FEAT_WRITE_CACHE;
> - i = queue_limits_commit_update(disk->queue, &lim);
> + i = queue_limits_commit_update_frozen(disk->queue, &lim);
> if (i)
> return i;
> return count;
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 03/10] block: add a store_limit operations for sysfs entries
2025-01-06 10:46 ` Damien Le Moal
@ 2025-01-06 10:57 ` Christoph Hellwig
0 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:57 UTC (permalink / raw)
To: Damien Le Moal
Cc: Christoph Hellwig, Jens Axboe, Ming Lei, Nilay Shroff,
linux-block, linux-nvme, nbd, virtualization, linux-scsi,
usb-storage
On Mon, Jan 06, 2025 at 07:46:19PM +0900, Damien Le Moal wrote:
> The freeze must NOT be done for the "if (entry->store_limit)" case. So this
> needs to go int the "else". Otherwise, you still have freeze the take limit
> lock order which can cause the ABBA deadlocks in SCSI sd.
That is fixed in the next patch. This one is pure refactoring without
behavior change.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues
2025-01-06 10:55 ` Damien Le Moal
@ 2025-01-06 10:59 ` Christoph Hellwig
0 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:59 UTC (permalink / raw)
To: Damien Le Moal
Cc: Christoph Hellwig, Jens Axboe, Ming Lei, Nilay Shroff,
linux-block, linux-nvme, nbd, virtualization, linux-scsi,
usb-storage
On Mon, Jan 06, 2025 at 07:55:30PM +0900, Damien Le Moal wrote:
> > + lim->features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT | BLK_FEAT_POLL;
>
> Why set BLK_FEAT_POLL unconditionally ? This is changing the current default
> for many devices, no ?
Due to the runtime check it doesn't actually change behavior. But
it does change the value read from sysfs, which also need extra check
for poll queues. But the entire point is that we don't have to update
this field when updating the queues, so yes it should be set
unconditonally.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 06/10] virtio_blk: use queue_limits_commit_update_frozen in cache_type_store
2025-01-06 10:56 ` Damien Le Moal
@ 2025-01-06 10:59 ` Christoph Hellwig
2025-01-06 13:09 ` Damien Le Moal
0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 10:59 UTC (permalink / raw)
To: Damien Le Moal
Cc: Christoph Hellwig, Jens Axboe, Ming Lei, Nilay Shroff,
linux-block, linux-nvme, nbd, virtualization, linux-scsi,
usb-storage
On Mon, Jan 06, 2025 at 07:56:19PM +0900, Damien Le Moal wrote:
> On 1/6/25 7:06 PM, Christoph Hellwig wrote:
> > So far cache_type_store didn't freeze the queue, fix that by using the
> > queue_limits_commit_update_frozen helper.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> This should be squashed in patch 2, no ?
patch 2 is supposed to just be a mechanical conversion, and each
behavior change should be in it's own patch.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues
2025-01-06 10:06 ` [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues Christoph Hellwig
2025-01-06 10:55 ` Damien Le Moal
@ 2025-01-06 11:01 ` Nilay Shroff
2025-01-06 11:05 ` Christoph Hellwig
1 sibling, 1 reply; 27+ messages in thread
From: Nilay Shroff @ 2025-01-06 11:01 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Damien Le Moal, Ming Lei, linux-block, linux-nvme, nbd,
virtualization, linux-scsi, usb-storage
On 1/6/25 3:36 PM, Christoph Hellwig wrote:
> When __blk_mq_update_nr_hw_queues changes the number of tag sets, it
> might have to disable poll queues. Currently it does so by adjusting
> the BLK_FEAT_POLL, which is a bit against the intent of features that
> describe hardware / driver capabilities, but more importantly causes
> nasty lock order problems with the broadly held freeze when updating the
> number of hardware queues and the limits lock. Fix this by leaving
> BLK_FEAT_POLL alone, and instead check for the number of sets and poll
> queues in the bio submission and poll handler. While this adds extra
> work to the fast path, the variables are in cache lines used by these
> operations anyway, so it should be cheap enough.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> block/blk-core.c | 14 +++++++++++---
> block/blk-mq.c | 19 +------------------
> block/blk-mq.h | 6 ++++++
> 3 files changed, 18 insertions(+), 21 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 666efe8fa202..483c14a50d9f 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -753,6 +753,15 @@ static blk_status_t blk_validate_atomic_write_op_size(struct request_queue *q,
> return BLK_STS_OK;
> }
>
> +static bool bdev_can_poll(struct block_device *bdev)
> +{
> + struct request_queue *q = bdev_get_queue(bdev);
> +
> + if (queue_is_mq(q))
> + return blk_mq_can_poll(q->tag_set);
> + return q->limits.features & BLK_FEAT_POLL;
> +}
> +
Should we make bdev_can_poll() inline ?
> /**
> * submit_bio_noacct - re-submit a bio to the block device layer for I/O
> * @bio: The bio describing the location in memory and on the device.
> @@ -805,8 +814,7 @@ void submit_bio_noacct(struct bio *bio)
> }
> }
>
> - if (!(q->limits.features & BLK_FEAT_POLL) &&
> - (bio->bi_opf & REQ_POLLED)) {
> + if ((bio->bi_opf & REQ_POLLED) && !bdev_can_poll(bdev)) {
> bio_clear_polled(bio);
> goto not_supported;
> }
> @@ -935,7 +943,7 @@ int bio_poll(struct bio *bio, struct io_comp_batch *iob, unsigned int flags)
> return 0;
>
> q = bdev_get_queue(bdev);
> - if (cookie == BLK_QC_T_NONE || !(q->limits.features & BLK_FEAT_POLL))
> + if (cookie == BLK_QC_T_NONE || !bdev_can_poll(bdev))
> return 0;
>
> blk_flush_plug(current->plug, false);
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 17f10683d640..0a7f059735fa 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -4321,12 +4321,6 @@ void blk_mq_release(struct request_queue *q)
> blk_mq_sysfs_deinit(q);
> }
>
> -static bool blk_mq_can_poll(struct blk_mq_tag_set *set)
> -{
> - return set->nr_maps > HCTX_TYPE_POLL &&
> - set->map[HCTX_TYPE_POLL].nr_queues;
> -}
> -
> struct request_queue *blk_mq_alloc_queue(struct blk_mq_tag_set *set,
> struct queue_limits *lim, void *queuedata)
> {
> @@ -4336,9 +4330,7 @@ struct request_queue *blk_mq_alloc_queue(struct blk_mq_tag_set *set,
>
> if (!lim)
> lim = &default_lim;
> - lim->features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT;
> - if (blk_mq_can_poll(set))
> - lim->features |= BLK_FEAT_POLL;
> + lim->features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT | BLK_FEAT_POLL;
>
> q = blk_alloc_queue(lim, set->numa_node);
> if (IS_ERR(q))
> @@ -5025,8 +5017,6 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
> fallback:
> blk_mq_update_queue_map(set);
> list_for_each_entry(q, &set->tag_list, tag_set_list) {
> - struct queue_limits lim;
> -
> blk_mq_realloc_hw_ctxs(set, q);
>
> if (q->nr_hw_queues != set->nr_hw_queues) {
> @@ -5040,13 +5030,6 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
> set->nr_hw_queues = prev_nr_hw_queues;
> goto fallback;
> }
> - lim = queue_limits_start_update(q);
> - if (blk_mq_can_poll(set))
> - lim.features |= BLK_FEAT_POLL;
> - else
> - lim.features &= ~BLK_FEAT_POLL;
> - if (queue_limits_commit_update(q, &lim) < 0)
> - pr_warn("updating the poll flag failed\n");
> blk_mq_map_swqueue(q);
> }
>
> diff --git a/block/blk-mq.h b/block/blk-mq.h
> index 89a20fffa4b1..ecd7bd7ec609 100644
> --- a/block/blk-mq.h
> +++ b/block/blk-mq.h
> @@ -111,6 +111,12 @@ static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
> return ctx->hctxs[blk_mq_get_hctx_type(opf)];
> }
>
> +static inline bool blk_mq_can_poll(struct blk_mq_tag_set *set)
> +{
> + return set->nr_maps > HCTX_TYPE_POLL &&
> + set->map[HCTX_TYPE_POLL].nr_queues;
> +}
> +
> /*
> * sysfs helpers
> */
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues
2025-01-06 11:01 ` Nilay Shroff
@ 2025-01-06 11:05 ` Christoph Hellwig
2025-01-06 12:06 ` Nilay Shroff
0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 11:05 UTC (permalink / raw)
To: Nilay Shroff
Cc: Christoph Hellwig, Jens Axboe, Damien Le Moal, Ming Lei,
linux-block, linux-nvme, nbd, virtualization, linux-scsi,
usb-storage
On Mon, Jan 06, 2025 at 04:31:23PM +0530, Nilay Shroff wrote:
> > +static bool bdev_can_poll(struct block_device *bdev)
> > +{
> > + struct request_queue *q = bdev_get_queue(bdev);
> > +
> > + if (queue_is_mq(q))
> > + return blk_mq_can_poll(q->tag_set);
> > + return q->limits.features & BLK_FEAT_POLL;
> > +}
> > +
>
> Should we make bdev_can_poll() inline ?
I don't really see the point. It's file local and doesn't have any
magic that could confuse the code generator, so we might as well leave
it to the compiler. Although that might be about to change per the
discussion with Damien, which could require it in blk-sysfs, in
which case it should become an inline in a header.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues
2025-01-06 11:05 ` Christoph Hellwig
@ 2025-01-06 12:06 ` Nilay Shroff
2025-01-06 15:27 ` Christoph Hellwig
0 siblings, 1 reply; 27+ messages in thread
From: Nilay Shroff @ 2025-01-06 12:06 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Damien Le Moal, Ming Lei, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
On 1/6/25 4:35 PM, Christoph Hellwig wrote:
> On Mon, Jan 06, 2025 at 04:31:23PM +0530, Nilay Shroff wrote:
>>> +static bool bdev_can_poll(struct block_device *bdev)
>>> +{
>>> + struct request_queue *q = bdev_get_queue(bdev);
>>> +
>>> + if (queue_is_mq(q))
>>> + return blk_mq_can_poll(q->tag_set);
>>> + return q->limits.features & BLK_FEAT_POLL;
>>> +}
>>> +
>>
>> Should we make bdev_can_poll() inline ?
>
> I don't really see the point. It's file local and doesn't have any
> magic that could confuse the code generator, so we might as well leave
> it to the compiler. Although that might be about to change per the
> discussion with Damien, which could require it in blk-sysfs, in
> which case it should become an inline in a header.
>
Oh yes, I saw that you moved blk_mq_can_poll() to blk-mq.h and
made it inline so thought why bdev_can_poll() can't be made inline?
BTW, bdev_can_poll() is called from IO fastpath and so making it inline
may slightly improve performance.
On another note, I do see that blk_mq_can_poll() is now only called
from bdev_can_poll(). So you may want to merge these two functions
in a single call and make that inline.
Thanks,
--Nilay
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 06/10] virtio_blk: use queue_limits_commit_update_frozen in cache_type_store
2025-01-06 10:59 ` Christoph Hellwig
@ 2025-01-06 13:09 ` Damien Le Moal
2025-01-06 15:27 ` Christoph Hellwig
0 siblings, 1 reply; 27+ messages in thread
From: Damien Le Moal @ 2025-01-06 13:09 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Ming Lei, Nilay Shroff, linux-block, linux-nvme, nbd,
virtualization, linux-scsi, usb-storage
On 1/6/25 19:59, Christoph Hellwig wrote:
> On Mon, Jan 06, 2025 at 07:56:19PM +0900, Damien Le Moal wrote:
>> On 1/6/25 7:06 PM, Christoph Hellwig wrote:
>>> So far cache_type_store didn't freeze the queue, fix that by using the
>>> queue_limits_commit_update_frozen helper.
>>>
>>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>>
>> This should be squashed in patch 2, no ?
>
> patch 2 is supposed to just be a mechanical conversion, and each
> behavior change should be in it's own patch.
Sounds good to me, but let's be consistent then: do not remove the
freeze/unfreeze from virtio_blk in patch 2 and do it here in this patch.
Otherwise, patch 2 *does* change the behavior of virtio_blk, introducing a bug
(missing freeze around commit update).
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues
2025-01-06 12:06 ` Nilay Shroff
@ 2025-01-06 15:27 ` Christoph Hellwig
2025-01-07 7:06 ` Nilay Shroff
0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 15:27 UTC (permalink / raw)
To: Nilay Shroff
Cc: Christoph Hellwig, Jens Axboe, Damien Le Moal, Ming Lei,
linux-block, linux-nvme, nbd, virtualization, linux-scsi,
usb-storage
On Mon, Jan 06, 2025 at 05:36:52PM +0530, Nilay Shroff wrote:
> Oh yes, I saw that you moved blk_mq_can_poll() to blk-mq.h and
> made it inline so thought why bdev_can_poll() can't be made inline?
It can be, but why would you want it to? What do you gain from forcing
the compiler to inline it, when sane compilers with a sane inlining
threshold will do that anyway.
> BTW, bdev_can_poll() is called from IO fastpath and so making it inline
> may slightly improve performance.
> On another note, I do see that blk_mq_can_poll() is now only called
> from bdev_can_poll(). So you may want to merge these two functions
> in a single call and make that inline.
I'd rather keep generic block layer logic separate from blk-mq logic.
We tend to do a few direct calls into blk-mq from the core code to
avoid the indirect call overhead, but we should still keep the code
as separate as possible to keep it somewhat modular.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 06/10] virtio_blk: use queue_limits_commit_update_frozen in cache_type_store
2025-01-06 13:09 ` Damien Le Moal
@ 2025-01-06 15:27 ` Christoph Hellwig
0 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2025-01-06 15:27 UTC (permalink / raw)
To: Damien Le Moal
Cc: Christoph Hellwig, Jens Axboe, Ming Lei, Nilay Shroff,
linux-block, linux-nvme, nbd, virtualization, linux-scsi,
usb-storage
On Mon, Jan 06, 2025 at 10:09:16PM +0900, Damien Le Moal wrote:
> Sounds good to me, but let's be consistent then: do not remove the
> freeze/unfreeze from virtio_blk in patch 2 and do it here in this patch.
> Otherwise, patch 2 *does* change the behavior of virtio_blk, introducing a bug
> (missing freeze around commit update).
Yeah, I'll fix it up. As mentioned in the cover letter I just brushed
this up just enough to be presentable for now.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues
2025-01-06 15:27 ` Christoph Hellwig
@ 2025-01-07 7:06 ` Nilay Shroff
0 siblings, 0 replies; 27+ messages in thread
From: Nilay Shroff @ 2025-01-07 7:06 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Damien Le Moal, Ming Lei, linux-block, linux-nvme,
nbd, virtualization, linux-scsi, usb-storage
On 1/6/25 8:57 PM, Christoph Hellwig wrote:
> On Mon, Jan 06, 2025 at 05:36:52PM +0530, Nilay Shroff wrote:
>> Oh yes, I saw that you moved blk_mq_can_poll() to blk-mq.h and
>> made it inline so thought why bdev_can_poll() can't be made inline?
>
> It can be, but why would you want it to? What do you gain from forcing
> the compiler to inline it, when sane compilers with a sane inlining
> threshold will do that anyway.
Hmm, ok, I was thinking just in case we want to force compiler. What if
in case compiler doesn't inline it? However, if we're moving this function
to a header then it would be made inline, anyways.
>
>> BTW, bdev_can_poll() is called from IO fastpath and so making it inline
>> may slightly improve performance.
>> On another note, I do see that blk_mq_can_poll() is now only called
>> from bdev_can_poll(). So you may want to merge these two functions
>> in a single call and make that inline.
>
> I'd rather keep generic block layer logic separate from blk-mq logic.
> We tend to do a few direct calls into blk-mq from the core code to
> avoid the indirect call overhead, but we should still keep the code
> as separate as possible to keep it somewhat modular.
>
Okay, make sense.
Thanks,
--Nilay
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2025-01-07 7:07 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-06 10:06 RFC: fix queue freeze and limit locking order (alt take) Christoph Hellwig
2025-01-06 10:06 ` [PATCH 01/10] block: fix docs for freezing of queue limits updates Christoph Hellwig
2025-01-06 10:39 ` Damien Le Moal
2025-01-06 10:06 ` [PATCH 02/10] block: add a queue_limits_commit_update_frozen helper Christoph Hellwig
2025-01-06 10:41 ` Damien Le Moal
2025-01-06 10:06 ` [PATCH 03/10] block: add a store_limit operations for sysfs entries Christoph Hellwig
2025-01-06 10:46 ` Damien Le Moal
2025-01-06 10:57 ` Christoph Hellwig
2025-01-06 10:06 ` [PATCH 04/10] block: use queue_limits_commit_update in queue_attr_store Christoph Hellwig
2025-01-06 10:48 ` Damien Le Moal
2025-01-06 10:06 ` [PATCH 05/10] block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues Christoph Hellwig
2025-01-06 10:55 ` Damien Le Moal
2025-01-06 10:59 ` Christoph Hellwig
2025-01-06 11:01 ` Nilay Shroff
2025-01-06 11:05 ` Christoph Hellwig
2025-01-06 12:06 ` Nilay Shroff
2025-01-06 15:27 ` Christoph Hellwig
2025-01-07 7:06 ` Nilay Shroff
2025-01-06 10:06 ` [PATCH 06/10] virtio_blk: use queue_limits_commit_update_frozen in cache_type_store Christoph Hellwig
2025-01-06 10:56 ` Damien Le Moal
2025-01-06 10:59 ` Christoph Hellwig
2025-01-06 13:09 ` Damien Le Moal
2025-01-06 15:27 ` Christoph Hellwig
2025-01-06 10:06 ` [PATCH 07/10] usb-storage: use queue_limits_commit_update_frozen in max_sectors_store Christoph Hellwig
2025-01-06 10:06 ` [PATCH 08/10] nvme: freeze queue after taking limits lock Christoph Hellwig
2025-01-06 10:06 ` [PATCH 09/10] loop: document why loop_clear_limits updates queue limits without freezing Christoph Hellwig
2025-01-06 10:06 ` [PATCH 10/10] nbd: use queue_limits_commit_update_frozen in nbd_set_size Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).