linux-block.vger.kernel.org archive mirror
* convert the SCSI ULDs to the atomic queue limits API
@ 2024-05-29  5:04 Christoph Hellwig
  2024-05-29  5:04 ` [PATCH 01/12] ubd: untangle discard vs write zeroes not supported handling Christoph Hellwig
                   ` (11 more replies)
  0 siblings, 12 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Hi all,

this series converts the SCSI upper level drivers to the atomic queue
limits API.

The first patch is a bug fix for ubd that later patches depend on and
might be worth picking up for 6.10.

The second patch changes the max_sectors calculation to take the optimal
I/O size into account so that sd, nbd and rbd don't have to mess with
the user max_sector value.  I'd love to see a careful review from the
nbd and rbd maintainers for this one!

The following patches clean up a few loose ends in the sd driver, and
then convert sd and sr to the atomic queue limits API.  The final
patches remove the now unused block APIs, and convert a few remaining
ones to be specific to their now narrower use cases.
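
All conversions in this series follow the same update pattern; a
minimal sketch (the variable names and the particular limit shown are
illustrative, error handling is left out):

	struct queue_limits lim;
	int err;

	lim = queue_limits_start_update(q);
	/* adjust the limits in the local copy, e.g.: */
	lim.logical_block_size = sector_size;
	blk_mq_freeze_queue(q);
	err = queue_limits_commit_update(q, &lim);
	blk_mq_unfreeze_queue(q);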

The patches are against Jens' block-6.10 tree.  Due to the amount of
block layer changes in here, and others that will depend on it, it
would be good if this could eventually be merged through the block
tree, or at least a shared branch between the SCSI and block trees.

Diffstat:
 arch/um/drivers/ubd_kern.c   |   10 +
 block/blk-settings.c         |  238 +------------------------------------------
 drivers/block/nbd.c          |    2 
 drivers/block/rbd.c          |    1 
 drivers/block/xen-blkfront.c |    4 
 drivers/scsi/sd.c            |  218 ++++++++++++++++++++-------------------
 drivers/scsi/sd.h            |    6 -
 drivers/scsi/sd_zbc.c        |   27 ++--
 drivers/scsi/sr.c            |   42 ++++---
 include/linux/blkdev.h       |   40 +++----
 10 files changed, 196 insertions(+), 392 deletions(-)


* [PATCH 01/12] ubd: untangle discard vs write zeroes not supported handling
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:00   ` Damien Le Moal
  2024-05-30 19:44   ` Bart Van Assche
  2024-05-29  5:04 ` [PATCH 02/12] block: take io_opt and io_min into account for max_sectors Christoph Hellwig
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Discard and Write Zeroes are different operations and are implemented
by different fallocate opcodes in ubd.  If one fails, the other can
still work, and vice versa.

Split the code to disable the operations in ubd_handler to only
disable the operation that actually failed.
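
For context, the two operations end up as different fallocate requests
on the host side, roughly along these lines (a sketch only; the exact
flags ubd passes are an assumption here, not quoted from the driver):

	/* discard */
	ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			off, len);
	/* write zeroes */
	ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len);

The backing file may accept one of these calls while rejecting the
other, which is why the two operations need to be disabled
independently.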

Fixes: 50109b5a03b4 ("um: Add support for DISCARD in the UBD Driver")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/um/drivers/ubd_kern.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index ef805eaa9e013d..a79a3b7c33a647 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -471,9 +471,14 @@ static void ubd_handler(void)
 		for (count = 0; count < n/sizeof(struct io_thread_req *); count++) {
 			struct io_thread_req *io_req = (*irq_req_buffer)[count];
 
-			if ((io_req->error == BLK_STS_NOTSUPP) && (req_op(io_req->req) == REQ_OP_DISCARD)) {
-				blk_queue_max_discard_sectors(io_req->req->q, 0);
-				blk_queue_max_write_zeroes_sectors(io_req->req->q, 0);
+			if (io_req->error == BLK_STS_NOTSUPP) {
+				struct request_queue *q = io_req->req->q;
+
+				if (req_op(io_req->req) == REQ_OP_DISCARD)
+					blk_queue_max_discard_sectors(q, 0);
+				if (req_op(io_req->req) == REQ_OP_WRITE_ZEROES)
+					blk_queue_max_write_zeroes_sectors(q,
+							0);
 			}
 			blk_mq_end_request(io_req->req, io_req->error);
 			kfree(io_req);
-- 
2.43.0



* [PATCH 02/12] block: take io_opt and io_min into account for max_sectors
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
  2024-05-29  5:04 ` [PATCH 01/12] ubd: untangle discard vs write zeroes not supported handling Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:05   ` Damien Le Moal
                     ` (2 more replies)
  2024-05-29  5:04 ` [PATCH 03/12] sd: simplify the ZBC case in provisioning_mode_store Christoph Hellwig
                   ` (9 subsequent siblings)
  11 siblings, 3 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

The soft max_sectors limit is normally capped by the hardware limits and
an arbitrary upper limit enforced by the kernel, but can be modified by
the user.  A few drivers want to increase this limit (nbd, rbd) or
adjust it up or down based on hardware capabilities (sd).

Change blk_validate_limits to default max_sectors to the optimal I/O
size, or, if no optimal I/O size is provided, to upgrade it to the
preferred minimum I/O size if that is larger than the kernel default.
This is based on the logic in the sd driver.

This keeps the existing kernel default for drivers that do not provide
an io_opt or a very big io_min value, but picks a much more useful
default for those that provide these hints, and allows removing the
hacks that set the user max_sectors limit in nbd, rbd and sd.

Note that rbd picks a different value for the optimal I/O size than for
the user max_sectors value, so this is a bit of a behavior change that
could use careful review from people familiar with rbd.
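
To illustrate the new default with the nbd values from the hunk below:

	struct queue_limits lim = {
		.max_hw_sectors	= 65536,
		.io_opt		= 256 << SECTOR_SHIFT,
	};
	/*
	 * blk_validate_limits() now defaults max_sectors to
	 * min(max_hw_sectors, io_opt >> SECTOR_SHIFT) = 256,
	 * so the driver no longer needs to set max_user_sectors.
	 */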

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-settings.c |  7 +++++++
 drivers/block/nbd.c  |  2 +-
 drivers/block/rbd.c  |  1 -
 drivers/scsi/sd.c    | 29 +++++------------------------
 4 files changed, 13 insertions(+), 26 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index effeb9a639bb45..a49abdb3554834 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -153,6 +153,13 @@ static int blk_validate_limits(struct queue_limits *lim)
 		if (lim->max_user_sectors < PAGE_SIZE / SECTOR_SIZE)
 			return -EINVAL;
 		lim->max_sectors = min(max_hw_sectors, lim->max_user_sectors);
+	} else if (lim->io_opt) {
+		lim->max_sectors =
+			min(max_hw_sectors, lim->io_opt >> SECTOR_SHIFT);
+	} else if (lim->io_min &&
+		   lim->io_min > (BLK_DEF_MAX_SECTORS_CAP << SECTOR_SHIFT)) {
+		lim->max_sectors =
+			min(max_hw_sectors, lim->io_min >> SECTOR_SHIFT);
 	} else {
 		lim->max_sectors = min(max_hw_sectors, BLK_DEF_MAX_SECTORS_CAP);
 	}
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 22a79a62cc4eab..ad887d614d5b3f 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1808,7 +1808,7 @@ static struct nbd_device *nbd_dev_add(int index, unsigned int refs)
 {
 	struct queue_limits lim = {
 		.max_hw_sectors		= 65536,
-		.max_user_sectors	= 256,
+		.io_opt			= 256 << SECTOR_SHIFT,
 		.max_segments		= USHRT_MAX,
 		.max_segment_size	= UINT_MAX,
 	};
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 26ff5cd2bf0abc..05096172f334a3 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -4954,7 +4954,6 @@ static int rbd_init_disk(struct rbd_device *rbd_dev)
 	    rbd_dev->layout.object_size * rbd_dev->layout.stripe_count;
 	struct queue_limits lim = {
 		.max_hw_sectors		= objset_bytes >> SECTOR_SHIFT,
-		.max_user_sectors	= objset_bytes >> SECTOR_SHIFT,
 		.io_min			= rbd_dev->opts->alloc_size,
 		.io_opt			= rbd_dev->opts->alloc_size,
 		.max_segments		= USHRT_MAX,
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index f6c822c9cbd2d3..3dff9150ce11e2 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3593,7 +3593,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	struct request_queue *q = sdkp->disk->queue;
 	sector_t old_capacity = sdkp->capacity;
 	unsigned char *buffer;
-	unsigned int dev_max, rw_max;
+	unsigned int dev_max;
 
 	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp,
 				      "sd_revalidate_disk\n"));
@@ -3675,34 +3675,15 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	else
 		blk_queue_io_min(sdkp->disk->queue, 0);
 
-	if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
-		q->limits.io_opt = logical_to_bytes(sdp, sdkp->opt_xfer_blocks);
-		rw_max = logical_to_sectors(sdp, sdkp->opt_xfer_blocks);
-	} else {
-		q->limits.io_opt = 0;
-		rw_max = min_not_zero(logical_to_sectors(sdp, dev_max),
-				      (sector_t)BLK_DEF_MAX_SECTORS_CAP);
-	}
-
 	/*
 	 * Limit default to SCSI host optimal sector limit if set. There may be
 	 * an impact on performance for when the size of a request exceeds this
 	 * host limit.
 	 */
-	rw_max = min_not_zero(rw_max, sdp->host->opt_sectors);
-
-	/* Do not exceed controller limit */
-	rw_max = min(rw_max, queue_max_hw_sectors(q));
-
-	/*
-	 * Only update max_sectors if previously unset or if the current value
-	 * exceeds the capabilities of the hardware.
-	 */
-	if (sdkp->first_scan ||
-	    q->limits.max_sectors > q->limits.max_dev_sectors ||
-	    q->limits.max_sectors > q->limits.max_hw_sectors) {
-		q->limits.max_sectors = rw_max;
-		q->limits.max_user_sectors = rw_max;
+	q->limits.io_opt = sdp->host->opt_sectors << SECTOR_SHIFT;
+	if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
+		q->limits.io_opt = min_not_zero(q->limits.io_opt,
+				logical_to_bytes(sdp, sdkp->opt_xfer_blocks));
 	}
 
 	sdkp->first_scan = 0;
-- 
2.43.0



* [PATCH 03/12] sd: simplify the ZBC case in provisioning_mode_store
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
  2024-05-29  5:04 ` [PATCH 01/12] ubd: untangle discard vs write zeroes not supported handling Christoph Hellwig
  2024-05-29  5:04 ` [PATCH 02/12] block: take io_opt and io_min into account for max_sectors Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:07   ` Damien Le Moal
  2024-05-30 19:48   ` Bart Van Assche
  2024-05-29  5:04 ` [PATCH 04/12] sd: add a sd_disable_discard helper Christoph Hellwig
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Don't reset the discard settings to no-op over and over when a user
writes to the provisioning attribute as that is already the default
mode for ZBC devices.  In hindsight we should have made writing to
the attribute fail for ZBC devices, but the code has probably been
around for far too long to change this now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 3dff9150ce11e2..15d0035048d902 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -461,14 +461,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (sd_is_zoned(sdkp)) {
-		sd_config_discard(sdkp, SD_LBP_DISABLE);
-		return count;
-	}
-
 	if (sdp->type != TYPE_DISK)
 		return -EINVAL;
 
+	/* ignore the provisioning mode for ZBC devices */
+	if (sd_is_zoned(sdkp))
+		return count;
+
 	mode = sysfs_match_string(lbp_mode, buf);
 	if (mode < 0)
 		return -EINVAL;
-- 
2.43.0



* [PATCH 04/12] sd: add a sd_disable_discard helper
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
                   ` (2 preceding siblings ...)
  2024-05-29  5:04 ` [PATCH 03/12] sd: simplify the ZBC case in provisioning_mode_store Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:10   ` Damien Le Moal
  2024-05-30 19:50   ` Bart Van Assche
  2024-05-29  5:04 ` [PATCH 05/12] sd: add a sd_disable_write_same helper Christoph Hellwig
                   ` (7 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Add helper to disable discard when it is not supported and use it
instead of sd_config_discard in the I/O completion handler.  This avoids
touching more fields than required in the I/O completion handler and
prepares for converting sd to use the atomic queue limits API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 15d0035048d902..a8838381823254 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -821,6 +821,12 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
 	return protect;
 }
 
+static void sd_disable_discard(struct scsi_disk *sdkp)
+{
+	sdkp->provisioning_mode = SD_LBP_DISABLE;
+	blk_queue_max_discard_sectors(sdkp->disk->queue, 0);
+}
+
 static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
 {
 	struct request_queue *q = sdkp->disk->queue;
@@ -2245,12 +2251,12 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 		case 0x24:	/* INVALID FIELD IN CDB */
 			switch (SCpnt->cmnd[0]) {
 			case UNMAP:
-				sd_config_discard(sdkp, SD_LBP_DISABLE);
+				sd_disable_discard(sdkp);
 				break;
 			case WRITE_SAME_16:
 			case WRITE_SAME:
 				if (SCpnt->cmnd[1] & 8) { /* UNMAP */
-					sd_config_discard(sdkp, SD_LBP_DISABLE);
+					sd_disable_discard(sdkp);
 				} else {
 					sdkp->device->no_write_same = 1;
 					sd_config_write_same(sdkp);
-- 
2.43.0



* [PATCH 05/12] sd: add a sd_disable_write_same helper
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
                   ` (3 preceding siblings ...)
  2024-05-29  5:04 ` [PATCH 04/12] sd: add a sd_disable_discard helper Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:12   ` Damien Le Moal
  2024-05-30 19:51   ` Bart Van Assche
  2024-05-29  5:04 ` [PATCH 06/12] sd: simplify the disable case in sd_config_discard Christoph Hellwig
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Add helper to disable WRITE SAME when it is not supported and use it
instead of sd_config_write_same in the I/O completion handler.  This
avoids touching more fields than required in the I/O completion handler
and prepares for converting sd to use the atomic queue limits API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index a8838381823254..09ffe9d826aeac 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1004,6 +1004,13 @@ static blk_status_t sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd)
 	return sd_setup_write_same10_cmnd(cmd, false);
 }
 
+static void sd_disable_write_same(struct scsi_disk *sdkp)
+{
+	sdkp->device->no_write_same = 1;
+	sdkp->max_ws_blocks = 0;
+	blk_queue_max_write_zeroes_sectors(sdkp->disk->queue, 0);
+}
+
 static void sd_config_write_same(struct scsi_disk *sdkp)
 {
 	struct request_queue *q = sdkp->disk->queue;
@@ -2258,8 +2265,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 				if (SCpnt->cmnd[1] & 8) { /* UNMAP */
 					sd_disable_discard(sdkp);
 				} else {
-					sdkp->device->no_write_same = 1;
-					sd_config_write_same(sdkp);
+					sd_disable_write_same(sdkp);
 					req->rq_flags |= RQF_QUIET;
 				}
 				break;
-- 
2.43.0



* [PATCH 06/12] sd: simplify the disable case in sd_config_discard
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
                   ` (4 preceding siblings ...)
  2024-05-29  5:04 ` [PATCH 05/12] sd: add a sd_disable_write_same helper Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:13   ` Damien Le Moal
  2024-05-30 20:02   ` Bart Van Assche
  2024-05-29  5:04 ` [PATCH 07/12] sd: factor out a sd_discard_mode helper Christoph Hellwig
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Given that max_blocks has been initialized to zero above, fall through
to the main call to blk_queue_max_discard_sectors instead of
duplicating the call.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 09ffe9d826aeac..437743e3a0d37d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -844,8 +844,7 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
 
 	case SD_LBP_FULL:
 	case SD_LBP_DISABLE:
-		blk_queue_max_discard_sectors(q, 0);
-		return;
+		break;
 
 	case SD_LBP_UNMAP:
 		max_blocks = min_not_zero(sdkp->max_unmap_blocks,
-- 
2.43.0



* [PATCH 07/12] sd: factor out a sd_discard_mode helper
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
                   ` (5 preceding siblings ...)
  2024-05-29  5:04 ` [PATCH 06/12] sd: simplify the disable case in sd_config_discard Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:14   ` Damien Le Moal
  2024-05-29 21:11   ` Bart Van Assche
  2024-05-29  5:04 ` [PATCH 08/12] sd: cleanup zoned queue limits initialization Christoph Hellwig
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Split the logic to pick the right discard mode into a little helper
to prepare for further changes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c | 37 ++++++++++++++++++++-----------------
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 437743e3a0d37d..2d08b69154b995 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3201,6 +3201,25 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
 	return;
 }
 
+static unsigned int sd_discard_mode(struct scsi_disk *sdkp)
+{
+	if (!sdkp->lbpvpd) {
+		/* LBP VPD page not provided */
+		if (sdkp->max_unmap_blocks)
+			return SD_LBP_UNMAP;
+		return SD_LBP_WS16;
+	}
+
+	/* LBP VPD page tells us what to use */
+	if (sdkp->lbpu && sdkp->max_unmap_blocks)
+		return SD_LBP_UNMAP;
+	if (sdkp->lbpws)
+		return SD_LBP_WS16;
+	if (sdkp->lbpws10)
+		return SD_LBP_WS10;
+	return SD_LBP_DISABLE;
+}
+
 /**
  * sd_read_block_limits - Query disk device for preferred I/O sizes.
  * @sdkp: disk to query
@@ -3239,23 +3258,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
 			sdkp->unmap_alignment =
 				get_unaligned_be32(&vpd->data[32]) & ~(1 << 31);
 
-		if (!sdkp->lbpvpd) { /* LBP VPD page not provided */
-
-			if (sdkp->max_unmap_blocks)
-				sd_config_discard(sdkp, SD_LBP_UNMAP);
-			else
-				sd_config_discard(sdkp, SD_LBP_WS16);
-
-		} else {	/* LBP VPD page tells us what to use */
-			if (sdkp->lbpu && sdkp->max_unmap_blocks)
-				sd_config_discard(sdkp, SD_LBP_UNMAP);
-			else if (sdkp->lbpws)
-				sd_config_discard(sdkp, SD_LBP_WS16);
-			else if (sdkp->lbpws10)
-				sd_config_discard(sdkp, SD_LBP_WS10);
-			else
-				sd_config_discard(sdkp, SD_LBP_DISABLE);
-		}
+		sd_config_discard(sdkp, sd_discard_mode(sdkp));
 	}
 
  out:
-- 
2.43.0



* [PATCH 08/12] sd: cleanup zoned queue limits initialization
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
                   ` (6 preceding siblings ...)
  2024-05-29  5:04 ` [PATCH 07/12] sd: factor out a sd_discard_mode helper Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:18   ` Damien Le Moal
  2024-05-30 20:07   ` Bart Van Assche
  2024-05-29  5:04 ` [PATCH 09/12] sd: convert to the atomic queue limits API Christoph Hellwig
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Consolidate setting the zone-related queue limits in sd_zbc_read_zones
instead of splitting them between sd_zbc_revalidate_zones and
sd_zbc_read_zones, and move the early_zone_info initialization in
sd_zbc_read_zones above the queue limits setup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd_zbc.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 806036e48abeda..1c24c844e8d178 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -565,12 +565,6 @@ int sd_zbc_revalidate_zones(struct scsi_disk *sdkp)
 	sdkp->zone_info.zone_blocks = zone_blocks;
 	sdkp->zone_info.nr_zones = nr_zones;
 
-	blk_queue_chunk_sectors(q,
-			logical_to_sectors(sdkp->device, zone_blocks));
-
-	/* Enable block layer zone append emulation */
-	blk_queue_max_zone_append_sectors(q, 0);
-
 	flags = memalloc_noio_save();
 	ret = blk_revalidate_disk_zones(disk);
 	memalloc_noio_restore(flags);
@@ -625,6 +619,10 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, u8 buf[SD_BUF_SIZE])
 	if (ret != 0)
 		goto err;
 
+	nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
+	sdkp->early_zone_info.nr_zones = nr_zones;
+	sdkp->early_zone_info.zone_blocks = zone_blocks;
+
 	/* The drive satisfies the kernel restrictions: set it up */
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 	if (sdkp->zones_max_open == U32_MAX)
@@ -632,10 +630,10 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, u8 buf[SD_BUF_SIZE])
 	else
 		disk_set_max_open_zones(disk, sdkp->zones_max_open);
 	disk_set_max_active_zones(disk, 0);
-	nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
-
-	sdkp->early_zone_info.nr_zones = nr_zones;
-	sdkp->early_zone_info.zone_blocks = zone_blocks;
+	blk_queue_chunk_sectors(q,
+			logical_to_sectors(sdkp->device, zone_blocks));
+	/* Enable block layer zone append emulation */
+	blk_queue_max_zone_append_sectors(q, 0);
 
 	return 0;
 
-- 
2.43.0



* [PATCH 09/12] sd: convert to the atomic queue limits API
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
                   ` (7 preceding siblings ...)
  2024-05-29  5:04 ` [PATCH 08/12] sd: cleanup zoned queue limits initialization Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:23   ` Damien Le Moal
  2024-05-30  9:16   ` John Garry
  2024-05-29  5:04 ` [PATCH 10/12] sr: " Christoph Hellwig
                   ` (2 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Assign all queue limits through a local queue_limits variable and
queue_limits_commit_update so that we can't race updating them from
multiple places, and freeze the queue while updating them so that
in-progress I/O submissions don't see half-updated limits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c     | 126 ++++++++++++++++++++++++------------------
 drivers/scsi/sd.h     |   6 +-
 drivers/scsi/sd_zbc.c |  15 ++---
 3 files changed, 84 insertions(+), 63 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2d08b69154b995..03e67936b27928 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -101,12 +101,13 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
 
 #define SD_MINORS	16
 
-static void sd_config_discard(struct scsi_disk *, unsigned int);
-static void sd_config_write_same(struct scsi_disk *);
+static void sd_config_discard(struct scsi_disk *sdkp, struct queue_limits *lim,
+		unsigned int mode);
+static void sd_config_write_same(struct scsi_disk *sdkp,
+		struct queue_limits *lim);
 static int  sd_revalidate_disk(struct gendisk *);
 static void sd_unlock_native_capacity(struct gendisk *disk);
 static void sd_shutdown(struct device *);
-static void sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer);
 static void scsi_disk_release(struct device *cdev);
 
 static DEFINE_IDA(sd_index_ida);
@@ -456,7 +457,8 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
 {
 	struct scsi_disk *sdkp = to_scsi_disk(dev);
 	struct scsi_device *sdp = sdkp->device;
-	int mode;
+	struct queue_limits lim;
+	int mode, err;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
@@ -472,8 +474,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
 	if (mode < 0)
 		return -EINVAL;
 
-	sd_config_discard(sdkp, mode);
-
+	lim = queue_limits_start_update(sdkp->disk->queue);
+	sd_config_discard(sdkp, &lim, mode);
+	blk_mq_freeze_queue(sdkp->disk->queue);
+	err = queue_limits_commit_update(sdkp->disk->queue, &lim);
+	blk_mq_unfreeze_queue(sdkp->disk->queue);
+	if (err)
+		return err;
 	return count;
 }
 static DEVICE_ATTR_RW(provisioning_mode);
@@ -556,6 +563,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
 {
 	struct scsi_disk *sdkp = to_scsi_disk(dev);
 	struct scsi_device *sdp = sdkp->device;
+	struct queue_limits lim;
 	unsigned long max;
 	int err;
 
@@ -577,8 +585,13 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
 		sdkp->max_ws_blocks = max;
 	}
 
-	sd_config_write_same(sdkp);
-
+	lim = queue_limits_start_update(sdkp->disk->queue);
+	sd_config_write_same(sdkp, &lim);
+	blk_mq_freeze_queue(sdkp->disk->queue);
+	err = queue_limits_commit_update(sdkp->disk->queue, &lim);
+	blk_mq_unfreeze_queue(sdkp->disk->queue);
+	if (err)
+		return err;
 	return count;
 }
 static DEVICE_ATTR_RW(max_write_same_blocks);
@@ -827,17 +840,15 @@ static void sd_disable_discard(struct scsi_disk *sdkp)
 	blk_queue_max_discard_sectors(sdkp->disk->queue, 0);
 }
 
-static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
+static void sd_config_discard(struct scsi_disk *sdkp, struct queue_limits *lim,
+		unsigned int mode)
 {
-	struct request_queue *q = sdkp->disk->queue;
 	unsigned int logical_block_size = sdkp->device->sector_size;
 	unsigned int max_blocks = 0;
 
-	q->limits.discard_alignment =
-		sdkp->unmap_alignment * logical_block_size;
-	q->limits.discard_granularity =
-		max(sdkp->physical_block_size,
-		    sdkp->unmap_granularity * logical_block_size);
+	lim->discard_alignment = sdkp->unmap_alignment * logical_block_size;
+	lim->discard_granularity = max(sdkp->physical_block_size,
+			sdkp->unmap_granularity * logical_block_size);
 	sdkp->provisioning_mode = mode;
 
 	switch (mode) {
@@ -875,7 +886,8 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
 		break;
 	}
 
-	blk_queue_max_discard_sectors(q, max_blocks * (logical_block_size >> 9));
+	lim->max_hw_discard_sectors = max_blocks *
+		(logical_block_size >> SECTOR_SHIFT);
 }
 
 static void *sd_set_special_bvec(struct request *rq, unsigned int data_len)
@@ -1010,9 +1022,9 @@ static void sd_disable_write_same(struct scsi_disk *sdkp)
 	blk_queue_max_write_zeroes_sectors(sdkp->disk->queue, 0);
 }
 
-static void sd_config_write_same(struct scsi_disk *sdkp)
+static void sd_config_write_same(struct scsi_disk *sdkp,
+		struct queue_limits *lim)
 {
-	struct request_queue *q = sdkp->disk->queue;
 	unsigned int logical_block_size = sdkp->device->sector_size;
 
 	if (sdkp->device->no_write_same) {
@@ -1066,8 +1078,8 @@ static void sd_config_write_same(struct scsi_disk *sdkp)
 	}
 
 out:
-	blk_queue_max_write_zeroes_sectors(q, sdkp->max_ws_blocks *
-					 (logical_block_size >> 9));
+	lim->max_write_zeroes_sectors =
+		sdkp->max_ws_blocks * (logical_block_size >> 9);
 }
 
 static blk_status_t sd_setup_flush_cmnd(struct scsi_cmnd *cmd)
@@ -2523,7 +2535,7 @@ static void read_capacity_error(struct scsi_disk *sdkp, struct scsi_device *sdp,
 #define READ_CAPACITY_RETRIES_ON_RESET	10
 
 static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
-						unsigned char *buffer)
+		struct queue_limits *lim, unsigned char *buffer)
 {
 	unsigned char cmd[16];
 	struct scsi_sense_hdr sshdr;
@@ -2597,7 +2609,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 
 	/* Lowest aligned logical block */
 	alignment = ((buffer[14] & 0x3f) << 8 | buffer[15]) * sector_size;
-	blk_queue_alignment_offset(sdp->request_queue, alignment);
+	lim->alignment_offset = alignment;
 	if (alignment && sdkp->first_scan)
 		sd_printk(KERN_NOTICE, sdkp,
 			  "physical block alignment offset: %u\n", alignment);
@@ -2608,7 +2620,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 		if (buffer[14] & 0x40) /* LBPRZ */
 			sdkp->lbprz = 1;
 
-		sd_config_discard(sdkp, SD_LBP_WS16);
+		sd_config_discard(sdkp, lim, SD_LBP_WS16);
 	}
 
 	sdkp->capacity = lba + 1;
@@ -2711,13 +2723,14 @@ static int sd_try_rc16_first(struct scsi_device *sdp)
  * read disk capacity
  */
 static void
-sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer)
+sd_read_capacity(struct scsi_disk *sdkp, struct queue_limits *lim,
+		unsigned char *buffer)
 {
 	int sector_size;
 	struct scsi_device *sdp = sdkp->device;
 
 	if (sd_try_rc16_first(sdp)) {
-		sector_size = read_capacity_16(sdkp, sdp, buffer);
+		sector_size = read_capacity_16(sdkp, sdp, lim, buffer);
 		if (sector_size == -EOVERFLOW)
 			goto got_data;
 		if (sector_size == -ENODEV)
@@ -2737,7 +2750,7 @@ sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer)
 			int old_sector_size = sector_size;
 			sd_printk(KERN_NOTICE, sdkp, "Very big device. "
 					"Trying to use READ CAPACITY(16).\n");
-			sector_size = read_capacity_16(sdkp, sdp, buffer);
+			sector_size = read_capacity_16(sdkp, sdp, lim, buffer);
 			if (sector_size < 0) {
 				sd_printk(KERN_NOTICE, sdkp,
 					"Using 0xffffffff as device size\n");
@@ -2796,9 +2809,8 @@ sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer)
 		 */
 		sector_size = 512;
 	}
-	blk_queue_logical_block_size(sdp->request_queue, sector_size);
-	blk_queue_physical_block_size(sdp->request_queue,
-				      sdkp->physical_block_size);
+	lim->logical_block_size = sector_size;
+	lim->physical_block_size = sdkp->physical_block_size;
 	sdkp->device->sector_size = sector_size;
 
 	if (sdkp->capacity > 0xffffffff)
@@ -3220,11 +3232,11 @@ static unsigned int sd_discard_mode(struct scsi_disk *sdkp)
 	return SD_LBP_DISABLE;
 }
 
-/**
- * sd_read_block_limits - Query disk device for preferred I/O sizes.
- * @sdkp: disk to query
+/*
+ * Query disk device for preferred I/O sizes.
  */
-static void sd_read_block_limits(struct scsi_disk *sdkp)
+static void sd_read_block_limits(struct scsi_disk *sdkp,
+		struct queue_limits *lim)
 {
 	struct scsi_vpd *vpd;
 
@@ -3258,7 +3270,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
 			sdkp->unmap_alignment =
 				get_unaligned_be32(&vpd->data[32]) & ~(1 << 31);
 
-		sd_config_discard(sdkp, sd_discard_mode(sdkp));
+		sd_config_discard(sdkp, lim, sd_discard_mode(sdkp));
 	}
 
  out:
@@ -3278,10 +3290,10 @@ static void sd_read_block_limits_ext(struct scsi_disk *sdkp)
 }
 
 /**
- * sd_read_block_characteristics - Query block dev. characteristics
- * @sdkp: disk to query
+ * Query block dev. characteristics
  */
-static void sd_read_block_characteristics(struct scsi_disk *sdkp)
+static void sd_read_block_characteristics(struct scsi_disk *sdkp,
+		struct queue_limits *lim)
 {
 	struct request_queue *q = sdkp->disk->queue;
 	struct scsi_vpd *vpd;
@@ -3307,29 +3319,26 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 
 #ifdef CONFIG_BLK_DEV_ZONED /* sd_probe rejects ZBD devices early otherwise */
 	if (sdkp->device->type == TYPE_ZBC) {
-		/*
-		 * Host-managed.
-		 */
-		disk_set_zoned(sdkp->disk);
+		lim->zoned = true;
 
 		/*
 		 * Per ZBC and ZAC specifications, writes in sequential write
 		 * required zones of host-managed devices must be aligned to
 		 * the device physical block size.
 		 */
-		blk_queue_zone_write_granularity(q, sdkp->physical_block_size);
+		lim->zone_write_granularity = sdkp->physical_block_size;
 	} else {
 		/*
 		 * Host-aware devices are treated as conventional.
 		 */
-		WARN_ON_ONCE(blk_queue_is_zoned(q));
+		lim->zoned = false;
 	}
 #endif /* CONFIG_BLK_DEV_ZONED */
 
 	if (!sdkp->first_scan)
 		return;
 
-	if (blk_queue_is_zoned(q))
+	if (lim->zoned)
 		sd_printk(KERN_NOTICE, sdkp, "Host-managed zoned block device\n");
 	else if (sdkp->zoned == 1)
 		sd_printk(KERN_NOTICE, sdkp, "Host-aware SMR disk used as regular disk\n");
@@ -3605,8 +3614,10 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	struct scsi_device *sdp = sdkp->device;
 	struct request_queue *q = sdkp->disk->queue;
 	sector_t old_capacity = sdkp->capacity;
+	struct queue_limits lim;
 	unsigned char *buffer;
 	unsigned int dev_max;
+	int err;
 
 	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp,
 				      "sd_revalidate_disk\n"));
@@ -3627,12 +3638,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
 
 	sd_spinup_disk(sdkp);
 
+	lim = queue_limits_start_update(sdkp->disk->queue);
+
 	/*
 	 * Without media there is no reason to ask; moreover, some devices
 	 * react badly if we do.
 	 */
 	if (sdkp->media_present) {
-		sd_read_capacity(sdkp, buffer);
+		sd_read_capacity(sdkp, &lim, buffer);
 		/*
 		 * Some USB/UAS devices return generic values for mode pages
 		 * until the media has been accessed. Trigger a READ operation
@@ -3651,10 +3664,10 @@ static int sd_revalidate_disk(struct gendisk *disk)
 
 		if (scsi_device_supports_vpd(sdp)) {
 			sd_read_block_provisioning(sdkp);
-			sd_read_block_limits(sdkp);
+			sd_read_block_limits(sdkp, &lim);
 			sd_read_block_limits_ext(sdkp);
-			sd_read_block_characteristics(sdkp);
-			sd_zbc_read_zones(sdkp, buffer);
+			sd_read_block_characteristics(sdkp, &lim);
+			sd_zbc_read_zones(sdkp, &lim, buffer);
 			sd_read_cpr(sdkp);
 		}
 
@@ -3683,28 +3696,33 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	q->limits.max_dev_sectors = logical_to_sectors(sdp, dev_max);
 
 	if (sd_validate_min_xfer_size(sdkp))
-		blk_queue_io_min(sdkp->disk->queue,
-				 logical_to_bytes(sdp, sdkp->min_xfer_blocks));
+		lim.io_min = logical_to_bytes(sdp, sdkp->min_xfer_blocks);
 	else
-		blk_queue_io_min(sdkp->disk->queue, 0);
+		lim.io_min = 0;
 
 	/*
 	 * Limit default to SCSI host optimal sector limit if set. There may be
 	 * an impact on performance for when the size of a request exceeds this
 	 * host limit.
 	 */
-	q->limits.io_opt = sdp->host->opt_sectors << SECTOR_SHIFT;
+	lim.io_opt = sdp->host->opt_sectors << SECTOR_SHIFT;
 	if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
-		q->limits.io_opt = min_not_zero(q->limits.io_opt,
+		lim.io_opt = min_not_zero(lim.io_opt,
 				logical_to_bytes(sdp, sdkp->opt_xfer_blocks));
 	}
 
 	sdkp->first_scan = 0;
 
 	set_capacity_and_notify(disk, logical_to_sectors(sdp, sdkp->capacity));
-	sd_config_write_same(sdkp);
+	sd_config_write_same(sdkp, &lim);
 	kfree(buffer);
 
+	blk_mq_freeze_queue(sdkp->disk->queue);
+	err = queue_limits_commit_update(sdkp->disk->queue, &lim);
+	blk_mq_unfreeze_queue(sdkp->disk->queue);
+	if (err)
+		return err;
+
 	/*
 	 * For a zoned drive, revalidating the zones can be done only once
 	 * the gendisk capacity is set. So if this fails, set back the gendisk
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 49dd600bfa4825..b4170b17bad47a 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -239,7 +239,8 @@ static inline int sd_is_zoned(struct scsi_disk *sdkp)
 
 #ifdef CONFIG_BLK_DEV_ZONED
 
-int sd_zbc_read_zones(struct scsi_disk *sdkp, u8 buf[SD_BUF_SIZE]);
+int sd_zbc_read_zones(struct scsi_disk *sdkp, struct queue_limits *lim,
+		u8 buf[SD_BUF_SIZE]);
 int sd_zbc_revalidate_zones(struct scsi_disk *sdkp);
 blk_status_t sd_zbc_setup_zone_mgmt_cmnd(struct scsi_cmnd *cmd,
 					 unsigned char op, bool all);
@@ -250,7 +251,8 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
 
 #else /* CONFIG_BLK_DEV_ZONED */
 
-static inline int sd_zbc_read_zones(struct scsi_disk *sdkp, u8 buf[SD_BUF_SIZE])
+static inline int sd_zbc_read_zones(struct scsi_disk *sdkp,
+		struct queue_limits *lim, u8 buf[SD_BUF_SIZE])
 {
 	return 0;
 }
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 1c24c844e8d178..f685838d9ed214 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -582,13 +582,15 @@ int sd_zbc_revalidate_zones(struct scsi_disk *sdkp)
 /**
  * sd_zbc_read_zones - Read zone information and update the request queue
  * @sdkp: SCSI disk pointer.
+ * @lim: queue limits to read into
  * @buf: 512 byte buffer used for storing SCSI command output.
  *
  * Read zone information and update the request queue zone characteristics and
  * also the zoned device information in *sdkp. Called by sd_revalidate_disk()
  * before the gendisk capacity has been set.
  */
-int sd_zbc_read_zones(struct scsi_disk *sdkp, u8 buf[SD_BUF_SIZE])
+int sd_zbc_read_zones(struct scsi_disk *sdkp, struct queue_limits *lim,
+		u8 buf[SD_BUF_SIZE])
 {
 	struct gendisk *disk = sdkp->disk;
 	struct request_queue *q = disk->queue;
@@ -626,14 +628,13 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, u8 buf[SD_BUF_SIZE])
 	/* The drive satisfies the kernel restrictions: set it up */
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 	if (sdkp->zones_max_open == U32_MAX)
-		disk_set_max_open_zones(disk, 0);
+		lim->max_open_zones = 0;
 	else
-		disk_set_max_open_zones(disk, sdkp->zones_max_open);
-	disk_set_max_active_zones(disk, 0);
-	blk_queue_chunk_sectors(q,
-			logical_to_sectors(sdkp->device, zone_blocks));
+		lim->max_open_zones = sdkp->zones_max_open;
+	lim->max_active_zones = 0;
+	lim->chunk_sectors = logical_to_sectors(sdkp->device, zone_blocks);
 	/* Enable block layer zone append emulation */
-	blk_queue_max_zone_append_sectors(q, 0);
+	lim->max_zone_append_sectors = 0;
 
 	return 0;
 
-- 
2.43.0



* [PATCH 10/12] sr: convert to the atomic queue limits API
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
                   ` (8 preceding siblings ...)
  2024-05-29  5:04 ` [PATCH 09/12] sd: convert to the atomic queue limits API Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:25   ` Damien Le Moal
  2024-05-29  5:04 ` [PATCH 11/12] block: remove unused " Christoph Hellwig
  2024-05-29  5:04 ` [PATCH 12/12] block: add special APIs for run-time disabling of discard and friends Christoph Hellwig
  11 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Assign all queue limits through a local queue_limits variable and
queue_limits_commit_update so that we can't race updating them from
multiple places, and freeze the queue while updating them so that
in-progress I/O submissions don't see half-updated limits.

Also use the chance to clean up variable names to standard ones.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sr.c | 42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index 7ab000942b97fc..3f491019103e0c 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -111,7 +111,7 @@ static struct lock_class_key sr_bio_compl_lkclass;
 static int sr_open(struct cdrom_device_info *, int);
 static void sr_release(struct cdrom_device_info *);
 
-static void get_sectorsize(struct scsi_cd *);
+static int get_sectorsize(struct scsi_cd *);
 static int get_capabilities(struct scsi_cd *);
 
 static unsigned int sr_check_events(struct cdrom_device_info *cdi,
@@ -473,15 +473,15 @@ static blk_status_t sr_init_command(struct scsi_cmnd *SCpnt)
 	return BLK_STS_IOERR;
 }
 
-static void sr_revalidate_disk(struct scsi_cd *cd)
+static int sr_revalidate_disk(struct scsi_cd *cd)
 {
 	struct scsi_sense_hdr sshdr;
 
 	/* if the unit is not ready, nothing more to do */
 	if (scsi_test_unit_ready(cd->device, SR_TIMEOUT, MAX_RETRIES, &sshdr))
-		return;
+		return 0;
 	sr_cd_check(&cd->cdi);
-	get_sectorsize(cd);
+	return get_sectorsize(cd);
 }
 
 static int sr_block_open(struct gendisk *disk, blk_mode_t mode)
@@ -494,13 +494,16 @@ static int sr_block_open(struct gendisk *disk, blk_mode_t mode)
 		return -ENXIO;
 
 	scsi_autopm_get_device(sdev);
-	if (disk_check_media_change(disk))
-		sr_revalidate_disk(cd);
+	if (disk_check_media_change(disk)) {
+		ret = sr_revalidate_disk(cd);
+		if (ret)
+			goto out;
+	}
 
 	mutex_lock(&cd->lock);
 	ret = cdrom_open(&cd->cdi, mode);
 	mutex_unlock(&cd->lock);
-
+out:
 	scsi_autopm_put_device(sdev);
 	if (ret)
 		scsi_device_put(cd->device);
@@ -685,7 +688,9 @@ static int sr_probe(struct device *dev)
 	blk_pm_runtime_init(sdev->request_queue, dev);
 
 	dev_set_drvdata(dev, cd);
-	sr_revalidate_disk(cd);
+	error = sr_revalidate_disk(cd);
+	if (error)
+		goto unregister_cdrom;
 
 	error = device_add_disk(&sdev->sdev_gendev, disk, NULL);
 	if (error)
@@ -714,13 +719,14 @@ static int sr_probe(struct device *dev)
 }
 
 
-static void get_sectorsize(struct scsi_cd *cd)
+static int get_sectorsize(struct scsi_cd *cd)
 {
+	struct request_queue *q = cd->device->request_queue;
 	static const u8 cmd[10] = { READ_CAPACITY };
 	unsigned char buffer[8] = { };
-	int the_result;
+	struct queue_limits lim;
+	int err;
 	int sector_size;
-	struct request_queue *queue;
 	struct scsi_failure failure_defs[] = {
 		{
 			.result = SCMD_FAILURE_RESULT_ANY,
@@ -736,10 +742,10 @@ static void get_sectorsize(struct scsi_cd *cd)
 	};
 
 	/* Do the command and wait.. */
-	the_result = scsi_execute_cmd(cd->device, cmd, REQ_OP_DRV_IN, buffer,
+	err = scsi_execute_cmd(cd->device, cmd, REQ_OP_DRV_IN, buffer,
 				      sizeof(buffer), SR_TIMEOUT, MAX_RETRIES,
 				      &exec_args);
-	if (the_result) {
+	if (err) {
 		cd->capacity = 0x1fffff;
 		sector_size = 2048;	/* A guess, just in case */
 	} else {
@@ -789,10 +795,12 @@ static void get_sectorsize(struct scsi_cd *cd)
 		set_capacity(cd->disk, cd->capacity);
 	}
 
-	queue = cd->device->request_queue;
-	blk_queue_logical_block_size(queue, sector_size);
-
-	return;
+	lim = queue_limits_start_update(q);
+	lim.logical_block_size = sector_size;
+	blk_mq_freeze_queue(q);
+	err = queue_limits_commit_update(q, &lim);
+	blk_mq_unfreeze_queue(q);
+	return err;
 }
 
 static int get_capabilities(struct scsi_cd *cd)
-- 
2.43.0



* [PATCH 11/12] block: remove unused queue limits API
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
                   ` (9 preceding siblings ...)
  2024-05-29  5:04 ` [PATCH 10/12] sr: " Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:28   ` Damien Le Moal
                     ` (2 more replies)
  2024-05-29  5:04 ` [PATCH 12/12] block: add special APIs for run-time disabling of discard and friends Christoph Hellwig
  11 siblings, 3 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

Remove all APIs that are unused now that sd and sr have been converted
to the atomic queue limits API.
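
Callers instead set the corresponding queue_limits fields directly
inside an atomic update, as the sd and sr conversions earlier in this
series do, e.g. for the logical block size:

	/* before */
	blk_queue_logical_block_size(q, sector_size);

	/* after, within a queue_limits_start_update/commit_update section */
	lim.logical_block_size = sector_size;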

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-settings.c   | 190 -----------------------------------------
 include/linux/blkdev.h |  12 ---
 2 files changed, 202 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index a49abdb3554834..0b038729608f4b 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -293,24 +293,6 @@ int queue_limits_set(struct request_queue *q, struct queue_limits *lim)
 }
 EXPORT_SYMBOL_GPL(queue_limits_set);
 
-/**
- * blk_queue_chunk_sectors - set size of the chunk for this queue
- * @q:  the request queue for the device
- * @chunk_sectors:  chunk sectors in the usual 512b unit
- *
- * Description:
- *    If a driver doesn't want IOs to cross a given chunk size, it can set
- *    this limit and prevent merging across chunks. Note that the block layer
- *    must accept a page worth of data at any offset. So if the crossing of
- *    chunks is a hard limitation in the driver, it must still be prepared
- *    to split single page bios.
- **/
-void blk_queue_chunk_sectors(struct request_queue *q, unsigned int chunk_sectors)
-{
-	q->limits.chunk_sectors = chunk_sectors;
-}
-EXPORT_SYMBOL(blk_queue_chunk_sectors);
-
 /**
  * blk_queue_max_discard_sectors - set max sectors for a single discard
  * @q:  the request queue for the device
@@ -352,139 +334,6 @@ void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
 }
 EXPORT_SYMBOL(blk_queue_max_write_zeroes_sectors);
 
-/**
- * blk_queue_max_zone_append_sectors - set max sectors for a single zone append
- * @q:  the request queue for the device
- * @max_zone_append_sectors: maximum number of sectors to write per command
- *
- * Sets the maximum number of sectors allowed for zone append commands. If
- * Specifying 0 for @max_zone_append_sectors indicates that the queue does
- * not natively support zone append operations and that the block layer must
- * emulate these operations using regular writes.
- **/
-void blk_queue_max_zone_append_sectors(struct request_queue *q,
-		unsigned int max_zone_append_sectors)
-{
-	unsigned int max_sectors = 0;
-
-	if (WARN_ON(!blk_queue_is_zoned(q)))
-		return;
-
-	if (max_zone_append_sectors) {
-		max_sectors = min(q->limits.max_hw_sectors,
-				  max_zone_append_sectors);
-		max_sectors = min(q->limits.chunk_sectors, max_sectors);
-
-		/*
-		 * Signal eventual driver bugs resulting in the max_zone_append
-		 * sectors limit being 0 due to the chunk_sectors limit (zone
-		 * size) not set or the max_hw_sectors limit not set.
-		 */
-		WARN_ON_ONCE(!max_sectors);
-	}
-
-	q->limits.max_zone_append_sectors = max_sectors;
-}
-EXPORT_SYMBOL_GPL(blk_queue_max_zone_append_sectors);
-
-/**
- * blk_queue_logical_block_size - set logical block size for the queue
- * @q:  the request queue for the device
- * @size:  the logical block size, in bytes
- *
- * Description:
- *   This should be set to the lowest possible block size that the
- *   storage device can address.  The default of 512 covers most
- *   hardware.
- **/
-void blk_queue_logical_block_size(struct request_queue *q, unsigned int size)
-{
-	struct queue_limits *limits = &q->limits;
-
-	limits->logical_block_size = size;
-
-	if (limits->discard_granularity < limits->logical_block_size)
-		limits->discard_granularity = limits->logical_block_size;
-
-	if (limits->physical_block_size < size)
-		limits->physical_block_size = size;
-
-	if (limits->io_min < limits->physical_block_size)
-		limits->io_min = limits->physical_block_size;
-
-	limits->max_hw_sectors =
-		round_down(limits->max_hw_sectors, size >> SECTOR_SHIFT);
-	limits->max_sectors =
-		round_down(limits->max_sectors, size >> SECTOR_SHIFT);
-}
-EXPORT_SYMBOL(blk_queue_logical_block_size);
-
-/**
- * blk_queue_physical_block_size - set physical block size for the queue
- * @q:  the request queue for the device
- * @size:  the physical block size, in bytes
- *
- * Description:
- *   This should be set to the lowest possible sector size that the
- *   hardware can operate on without reverting to read-modify-write
- *   operations.
- */
-void blk_queue_physical_block_size(struct request_queue *q, unsigned int size)
-{
-	q->limits.physical_block_size = size;
-
-	if (q->limits.physical_block_size < q->limits.logical_block_size)
-		q->limits.physical_block_size = q->limits.logical_block_size;
-
-	if (q->limits.discard_granularity < q->limits.physical_block_size)
-		q->limits.discard_granularity = q->limits.physical_block_size;
-
-	if (q->limits.io_min < q->limits.physical_block_size)
-		q->limits.io_min = q->limits.physical_block_size;
-}
-EXPORT_SYMBOL(blk_queue_physical_block_size);
-
-/**
- * blk_queue_zone_write_granularity - set zone write granularity for the queue
- * @q:  the request queue for the zoned device
- * @size:  the zone write granularity size, in bytes
- *
- * Description:
- *   This should be set to the lowest possible size allowing to write in
- *   sequential zones of a zoned block device.
- */
-void blk_queue_zone_write_granularity(struct request_queue *q,
-				      unsigned int size)
-{
-	if (WARN_ON_ONCE(!blk_queue_is_zoned(q)))
-		return;
-
-	q->limits.zone_write_granularity = size;
-
-	if (q->limits.zone_write_granularity < q->limits.logical_block_size)
-		q->limits.zone_write_granularity = q->limits.logical_block_size;
-}
-EXPORT_SYMBOL_GPL(blk_queue_zone_write_granularity);
-
-/**
- * blk_queue_alignment_offset - set physical block alignment offset
- * @q:	the request queue for the device
- * @offset: alignment offset in bytes
- *
- * Description:
- *   Some devices are naturally misaligned to compensate for things like
- *   the legacy DOS partition table 63-sector offset.  Low-level drivers
- *   should call this function for devices whose first sector is not
- *   naturally aligned.
- */
-void blk_queue_alignment_offset(struct request_queue *q, unsigned int offset)
-{
-	q->limits.alignment_offset =
-		offset & (q->limits.physical_block_size - 1);
-	q->limits.misaligned = 0;
-}
-EXPORT_SYMBOL(blk_queue_alignment_offset);
-
 void disk_update_readahead(struct gendisk *disk)
 {
 	blk_apply_bdi_limits(disk->bdi, &disk->queue->limits);
@@ -514,26 +363,6 @@ void blk_limits_io_min(struct queue_limits *limits, unsigned int min)
 }
 EXPORT_SYMBOL(blk_limits_io_min);
 
-/**
- * blk_queue_io_min - set minimum request size for the queue
- * @q:	the request queue for the device
- * @min:  smallest I/O size in bytes
- *
- * Description:
- *   Storage devices may report a granularity or preferred minimum I/O
- *   size which is the smallest request the device can perform without
- *   incurring a performance penalty.  For disk drives this is often the
- *   physical block size.  For RAID arrays it is often the stripe chunk
- *   size.  A properly aligned multiple of minimum_io_size is the
- *   preferred request size for workloads where a high number of I/O
- *   operations is desired.
- */
-void blk_queue_io_min(struct request_queue *q, unsigned int min)
-{
-	blk_limits_io_min(&q->limits, min);
-}
-EXPORT_SYMBOL(blk_queue_io_min);
-
 /**
  * blk_limits_io_opt - set optimal request size for a device
  * @limits: the queue limits
@@ -841,25 +670,6 @@ void blk_queue_write_cache(struct request_queue *q, bool wc, bool fua)
 }
 EXPORT_SYMBOL_GPL(blk_queue_write_cache);
 
-/**
- * disk_set_zoned - inidicate a zoned device
- * @disk:	gendisk to configure
- */
-void disk_set_zoned(struct gendisk *disk)
-{
-	struct request_queue *q = disk->queue;
-
-	WARN_ON_ONCE(!IS_ENABLED(CONFIG_BLK_DEV_ZONED));
-
-	/*
-	 * Set the zone write granularity to the device logical block
-	 * size by default. The driver can change this value if needed.
-	 */
-	q->limits.zoned = true;
-	blk_queue_zone_write_granularity(q, queue_logical_block_size(q));
-}
-EXPORT_SYMBOL_GPL(disk_set_zoned);
-
 int bdev_alignment_offset(struct block_device *bdev)
 {
 	struct request_queue *q = bdev_get_queue(bdev);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index aefdda9f4ec711..bee71deb8ca066 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -331,8 +331,6 @@ struct queue_limits {
 typedef int (*report_zones_cb)(struct blk_zone *zone, unsigned int idx,
 			       void *data);
 
-void disk_set_zoned(struct gendisk *disk);
-
 #define BLK_ALL_ZONES  ((unsigned int)-1)
 int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 		unsigned int nr_zones, report_zones_cb cb, void *data);
@@ -928,24 +926,14 @@ static inline void queue_limits_cancel_update(struct request_queue *q)
 /*
  * Access functions for manipulating queue properties
  */
-extern void blk_queue_chunk_sectors(struct request_queue *, unsigned int);
 void blk_queue_max_secure_erase_sectors(struct request_queue *q,
 		unsigned int max_sectors);
 extern void blk_queue_max_discard_sectors(struct request_queue *q,
 		unsigned int max_discard_sectors);
 extern void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
 		unsigned int max_write_same_sectors);
-extern void blk_queue_logical_block_size(struct request_queue *, unsigned int);
-extern void blk_queue_max_zone_append_sectors(struct request_queue *q,
-		unsigned int max_zone_append_sectors);
-extern void blk_queue_physical_block_size(struct request_queue *, unsigned int);
-void blk_queue_zone_write_granularity(struct request_queue *q,
-				      unsigned int size);
-extern void blk_queue_alignment_offset(struct request_queue *q,
-				       unsigned int alignment);
 void disk_update_readahead(struct gendisk *disk);
 extern void blk_limits_io_min(struct queue_limits *limits, unsigned int min);
-extern void blk_queue_io_min(struct request_queue *q, unsigned int min);
 extern void blk_limits_io_opt(struct queue_limits *limits, unsigned int opt);
 extern void blk_set_queue_depth(struct request_queue *q, unsigned int depth);
 extern void blk_set_stacking_limits(struct queue_limits *lim);
-- 
2.43.0



* [PATCH 12/12] block: add special APIs for run-time disabling of discard and friends
  2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
                   ` (10 preceding siblings ...)
  2024-05-29  5:04 ` [PATCH 11/12] block: remove unused " Christoph Hellwig
@ 2024-05-29  5:04 ` Christoph Hellwig
  2024-05-29  8:30   ` Damien Le Moal
                     ` (2 more replies)
  11 siblings, 3 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-29  5:04 UTC (permalink / raw)
  To: Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

A few drivers optimistically try to support discard, write zeroes and
secure erase and disable the features from the I/O completion handler
if the hardware can't support them.  This disabling can't be done using
the atomic queue limits API because the I/O completion handlers can't
take sleeping locks or freeze the queue.  Keep the existing behavior of
clearing the relevant field to zero, but replace the old blk_queue_max_*
APIs with new disable APIs that force the value to 0.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/um/drivers/ubd_kern.c   |  5 ++---
 block/blk-settings.c         | 41 ------------------------------------
 drivers/block/xen-blkfront.c |  4 ++--
 drivers/scsi/sd.c            |  4 ++--
 include/linux/blkdev.h       | 28 ++++++++++++++++++------
 5 files changed, 28 insertions(+), 54 deletions(-)

diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index a79a3b7c33a647..7eae1519300fbd 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -475,10 +475,9 @@ static void ubd_handler(void)
 				struct request_queue *q = io_req->req->q;
 
 				if (req_op(io_req->req) == REQ_OP_DISCARD)
-					blk_queue_max_discard_sectors(q, 0);
+					blk_queue_disable_discard(q);
 				if (req_op(io_req->req) == REQ_OP_WRITE_ZEROES)
-					blk_queue_max_write_zeroes_sectors(q,
-							0);
+					blk_queue_disable_write_zeroes(q);
 			}
 			blk_mq_end_request(io_req->req, io_req->error);
 			kfree(io_req);
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 0b038729608f4b..996f247fc98e80 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -293,47 +293,6 @@ int queue_limits_set(struct request_queue *q, struct queue_limits *lim)
 }
 EXPORT_SYMBOL_GPL(queue_limits_set);
 
-/**
- * blk_queue_max_discard_sectors - set max sectors for a single discard
- * @q:  the request queue for the device
- * @max_discard_sectors: maximum number of sectors to discard
- **/
-void blk_queue_max_discard_sectors(struct request_queue *q,
-		unsigned int max_discard_sectors)
-{
-	struct queue_limits *lim = &q->limits;
-
-	lim->max_hw_discard_sectors = max_discard_sectors;
-	lim->max_discard_sectors =
-		min(max_discard_sectors, lim->max_user_discard_sectors);
-}
-EXPORT_SYMBOL(blk_queue_max_discard_sectors);
-
-/**
- * blk_queue_max_secure_erase_sectors - set max sectors for a secure erase
- * @q:  the request queue for the device
- * @max_sectors: maximum number of sectors to secure_erase
- **/
-void blk_queue_max_secure_erase_sectors(struct request_queue *q,
-		unsigned int max_sectors)
-{
-	q->limits.max_secure_erase_sectors = max_sectors;
-}
-EXPORT_SYMBOL(blk_queue_max_secure_erase_sectors);
-
-/**
- * blk_queue_max_write_zeroes_sectors - set max sectors for a single
- *                                      write zeroes
- * @q:  the request queue for the device
- * @max_write_zeroes_sectors: maximum number of sectors to write per command
- **/
-void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
-		unsigned int max_write_zeroes_sectors)
-{
-	q->limits.max_write_zeroes_sectors = max_write_zeroes_sectors;
-}
-EXPORT_SYMBOL(blk_queue_max_write_zeroes_sectors);
-
 void disk_update_readahead(struct gendisk *disk)
 {
 	blk_apply_bdi_limits(disk->bdi, &disk->queue->limits);
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index fd7c0ff2139cee..9b4ec3e4908cce 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1605,8 +1605,8 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 				blkif_req(req)->error = BLK_STS_NOTSUPP;
 				info->feature_discard = 0;
 				info->feature_secdiscard = 0;
-				blk_queue_max_discard_sectors(rq, 0);
-				blk_queue_max_secure_erase_sectors(rq, 0);
+				blk_queue_disable_discard(rq);
+				blk_queue_disable_secure_erase(rq);
 			}
 			break;
 		case BLKIF_OP_FLUSH_DISKCACHE:
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 03e67936b27928..56fd523b3987a5 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -837,7 +837,7 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
 static void sd_disable_discard(struct scsi_disk *sdkp)
 {
 	sdkp->provisioning_mode = SD_LBP_DISABLE;
-	blk_queue_max_discard_sectors(sdkp->disk->queue, 0);
+	blk_queue_disable_discard(sdkp->disk->queue);
 }
 
 static void sd_config_discard(struct scsi_disk *sdkp, struct queue_limits *lim,
@@ -1019,7 +1019,7 @@ static void sd_disable_write_same(struct scsi_disk *sdkp)
 {
 	sdkp->device->no_write_same = 1;
 	sdkp->max_ws_blocks = 0;
-	blk_queue_max_write_zeroes_sectors(sdkp->disk->queue, 0);
+	blk_queue_disable_write_zeroes(sdkp->disk->queue);
 }
 
 static void sd_config_write_same(struct scsi_disk *sdkp,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index bee71deb8ca066..b83441da12456a 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -923,15 +923,31 @@ static inline void queue_limits_cancel_update(struct request_queue *q)
 	mutex_unlock(&q->limits_lock);
 }
 
+/*
+ * These helpers are for drivers that have sloppy feature negotiation and might
+ * have to disable DISCARD, WRITE_ZEROES or SECURE_DISCARD from the I/O
+ * completion handler when the device returned an indicator that the respective
+ * feature is not actually supported.  They are racy and the driver needs to
+ * cope with that.  Try to avoid this scheme if you can.
+ */
+static inline void blk_queue_disable_discard(struct request_queue *q)
+{
+	q->limits.max_discard_sectors = 0;
+}
+
+static inline void blk_queue_disable_secure_erase(struct request_queue *q)
+{
+	q->limits.max_secure_erase_sectors = 0;
+}
+
+static inline void blk_queue_disable_write_zeroes(struct request_queue *q)
+{
+	q->limits.max_write_zeroes_sectors = 0;
+}
+
 /*
  * Access functions for manipulating queue properties
  */
-void blk_queue_max_secure_erase_sectors(struct request_queue *q,
-		unsigned int max_sectors);
-extern void blk_queue_max_discard_sectors(struct request_queue *q,
-		unsigned int max_discard_sectors);
-extern void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
-		unsigned int max_write_same_sectors);
 void disk_update_readahead(struct gendisk *disk);
 extern void blk_limits_io_min(struct queue_limits *limits, unsigned int min);
 extern void blk_limits_io_opt(struct queue_limits *limits, unsigned int opt);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread
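
For contrast with the helpers added above, this is roughly what the
same disable would have to look like through the full atomic limits
API, and why that is off limits in an I/O completion handler.  This is
an illustrative sketch only, not code from the series; it just strings
together the queue_limits update calls already shown elsewhere in this
thread, for some struct request_queue *q:

	struct queue_limits lim;
	int err;

	/*
	 * Not callable from an I/O completion handler:
	 * queue_limits_start_update() takes q->limits_lock (a sleeping
	 * mutex), and the commit wants the queue frozen, which cannot
	 * happen while the failed request is still in flight.
	 */
	lim = queue_limits_start_update(q);
	lim.max_hw_discard_sectors = 0;
	lim.max_discard_sectors = 0;
	blk_mq_freeze_queue(q);
	err = queue_limits_commit_update(q, &lim);
	blk_mq_unfreeze_queue(q);

The new blk_queue_disable_discard() and friends avoid all of that with
a single store of 0 into the limit, racy by design as the new comment
in the blkdev.h hunk spells out.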

* Re: [PATCH 01/12] ubd: untagle discard vs write zeroes not support handling
  2024-05-29  5:04 ` [PATCH 01/12] ubd: untagle discard vs write zeroes not support handling Christoph Hellwig
@ 2024-05-29  8:00   ` Damien Le Moal
  2024-05-30 19:44   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:00 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> Discard and Write Zeroes are different operation and implemented
> by different fallocate opcodes for ubd.  If one fails the other one
> can work and vice versa.
> 
> Split the code to disable the operations in ubd_handler to only
> disable the operation that actually failed.
> 
> Fixes: 50109b5a03b4 ("um: Add support for DISCARD in the UBD Driver")
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  arch/um/drivers/ubd_kern.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
> index ef805eaa9e013d..a79a3b7c33a647 100644
> --- a/arch/um/drivers/ubd_kern.c
> +++ b/arch/um/drivers/ubd_kern.c
> @@ -471,9 +471,14 @@ static void ubd_handler(void)
>  		for (count = 0; count < n/sizeof(struct io_thread_req *); count++) {
>  			struct io_thread_req *io_req = (*irq_req_buffer)[count];
>  
> -			if ((io_req->error == BLK_STS_NOTSUPP) && (req_op(io_req->req) == REQ_OP_DISCARD)) {
> -				blk_queue_max_discard_sectors(io_req->req->q, 0);
> -				blk_queue_max_write_zeroes_sectors(io_req->req->q, 0);
> +			if (io_req->error == BLK_STS_NOTSUPP) {
> +				struct request_queue *q = io_req->req->q;
> +
> +				if (req_op(io_req->req) == REQ_OP_DISCARD)
> +					blk_queue_max_discard_sectors(q, 0);
> +				if (req_op(io_req->req) == REQ_OP_WRITE_ZEROES)

Nit: this can be an "else if".

Otherwise, looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

> +					blk_queue_max_write_zeroes_sectors(q,
> +							0);
>  			}
>  			blk_mq_end_request(io_req->req, io_req->error);
>  			kfree(io_req);

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/12] block: take io_opt and io_min into account for max_sectors
  2024-05-29  5:04 ` [PATCH 02/12] block: take io_opt and io_min into account for max_sectors Christoph Hellwig
@ 2024-05-29  8:05   ` Damien Le Moal
  2024-05-30 19:47   ` Bart Van Assche
  2024-05-30 19:48   ` Ilya Dryomov
  2 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:05 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> The soft max_sectors limit is normally capped by the hardware limits and
> an arbitrary upper limit enforced by the kernel, but can be modified by
> the user.  A few drivers want to increase this limit (nbd, rbd) or
> adjust it up or down based on hardware capabilities (sd).
> 
> Change blk_validate_limits to default max_sectors to the optimal I/O
> size, or upgrade it to the preferred minimal I/O size if that is
> larger than the kernel default if no optimal I/O size is provided based
> on the logic in the SD driver.
> 
> This keeps the existing kernel default for drivers that do not provide
> an io_opt or very big io_min value, but picks a much more useful
> default for those who provide these hints, and allows to remove the
> hacks to set the user max_sectors limit in nbd, rbd and sd.
> 
> Note that rd picks a different value for the optimal I/O size vs the
> user max_sectors value, so this is a bit of a behavior change that
> could use careful review from people familiar with rbd.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 03/12] sd: simplify the ZBC case in provisioning_mode_store
  2024-05-29  5:04 ` [PATCH 03/12] sd: simplify the ZBC case in provisioning_mode_store Christoph Hellwig
@ 2024-05-29  8:07   ` Damien Le Moal
  2024-05-30 19:48   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:07 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> Don't reset the discard settings to no-op over and over when a user
> writes to the provisioning attribute as that is already the default
> mode for ZBC devices.  In hindsight we should have made writing to
> the attribute fail for ZBC devices, but the code has probably been
> around for far too long to change this now.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/scsi/sd.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 3dff9150ce11e2..15d0035048d902 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -461,14 +461,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EACCES;
>  
> -	if (sd_is_zoned(sdkp)) {
> -		sd_config_discard(sdkp, SD_LBP_DISABLE);
> -		return count;
> -	}
> -
>  	if (sdp->type != TYPE_DISK)
>  		return -EINVAL;
>  
> +	/* ignore the proivisioning mode for ZBB devices */

s/proivisioning/provisioning
s/ZBB/ZBC

With that fixed,

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

> +	if (sd_is_zoned(sdkp))
> +		return count;
> +
>  	mode = sysfs_match_string(lbp_mode, buf);
>  	if (mode < 0)
>  		return -EINVAL;

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] sd: add a sd_disable_discard helper
  2024-05-29  5:04 ` [PATCH 04/12] sd: add a sd_disable_discard helper Christoph Hellwig
@ 2024-05-29  8:10   ` Damien Le Moal
  2024-05-30 19:50   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:10 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> Add helper to disable discard when it is not supported and use it
> instead of sd_config_discard in the I/O completion handler.  This avoids
> touching more fields than required in the I/O completion handler and
> prepares for converting sd to use the atomic queue limits API.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/12] sd: add a sd_disable_write_same helper
  2024-05-29  5:04 ` [PATCH 05/12] sd: add a sd_disable_write_same helper Christoph Hellwig
@ 2024-05-29  8:12   ` Damien Le Moal
  2024-05-30 19:51   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:12 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> Add helper to disable WRITE SAME when it is not supported and use it
> instead of sd_config_write_same in the I/O completion handler.  This
> avoids touching more fields than required in the I/O completion handler
> and  prepares for converting sd to use the atomic queue limits API.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/12] sd: simplify the disable case in sd_config_discard
  2024-05-29  5:04 ` [PATCH 06/12] sd: simplify the disable case in sd_config_discard Christoph Hellwig
@ 2024-05-29  8:13   ` Damien Le Moal
  2024-05-30 20:02   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:13 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> Fall through to the main call to blk_queue_max_discard_sectors given that
> max_blocks has been initialized to zero above instead of duplicating the
> call.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/12] sd: factor out a sd_discard_mode helper
  2024-05-29  5:04 ` [PATCH 07/12] sd: factor out a sd_discard_mode helper Christoph Hellwig
@ 2024-05-29  8:14   ` Damien Le Moal
  2024-05-29 21:11   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:14 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> Split the logic to pick the right discard mode into a little helper
> to prepare for further changes.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/12] sd: cleanup zoned queue limits initialization
  2024-05-29  5:04 ` [PATCH 08/12] sd: cleanup zoned queue limits initialization Christoph Hellwig
@ 2024-05-29  8:18   ` Damien Le Moal
  2024-05-30 20:07   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:18 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> Consolidate setting zone-related queue limits in sd_zbc_read_zones
> instead of splitting them between sd_zbc_revalidate_zones and
> sd_zbc_read_zones, and move the early_zone_information initialization
> in sd_zbc_read_zones above setting up the queue limits.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 09/12] sd: convert to the atomic queue limits API
  2024-05-29  5:04 ` [PATCH 09/12] sd: convert to the atomic queue limits API Christoph Hellwig
@ 2024-05-29  8:23   ` Damien Le Moal
  2024-05-30  9:16   ` John Garry
  1 sibling, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:23 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> Assign all queue limits through a local queue_limits variable and
> queue_limits_commit_update so that we can't race updating them from
> multiple places, and free the queue when updating them so that

s/free/freeze

> in-progress I/O submissions don't see half-updated limits.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

With the above fixed, looks good to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/12] sr: convert to the atomic queue limits API
  2024-05-29  5:04 ` [PATCH 10/12] sr: " Christoph Hellwig
@ 2024-05-29  8:25   ` Damien Le Moal
  0 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:25 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> Assign all queue limits through a local queue_limits variable and
> queue_limits_commit_update so that we can't race updating them from
> multiple places, and free the queue when updating them so that
> in-progress I/O submissions don't see half-updated limits.
> 
> Also use the chance to clean up variable names to standard ones.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/12] block: remove unused queue limits API
  2024-05-29  5:04 ` [PATCH 11/12] block: remove unused " Christoph Hellwig
@ 2024-05-29  8:28   ` Damien Le Moal
  2024-05-30 20:08   ` Bart Van Assche
  2024-05-31  9:14   ` John Garry
  2 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:28 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> Remove all APIs that are unused now that sd and sr have been converted
> to the atomic queue limits API.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

I think that disk_set_max_open_zones() and disk_set_max_active_zones() can also
go away.

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/12] block: add special APIs for run-time disabling of discard and friends
  2024-05-29  5:04 ` [PATCH 12/12] block: add special APIs for run-time disabling of discard and friends Christoph Hellwig
@ 2024-05-29  8:30   ` Damien Le Moal
  2024-05-30 20:09   ` Bart Van Assche
  2024-05-31  9:13   ` John Garry
  2 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2024-05-29  8:30 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/29/24 14:04, Christoph Hellwig wrote:
> A few drivers optimistically try to support discard, write zeroes and
> secure erase and disable the features from the I/O completion handler
> if the hardware can't support them.  This disable can't be done using
> the atomic queue limits API because the I/O completion handlers can't
> take sleeping locks or freezer the queue.  Keep the existing clearing

s/freezer/freeze

> of the relevant field to zero, but replace the old blk_queue_max_*
> APIs with new disable APIs that force the value to 0.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

With the typo fixed, looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/12] sd: factor out a sd_discard_mode helper
  2024-05-29  5:04 ` [PATCH 07/12] sd: factor out a sd_discard_mode helper Christoph Hellwig
  2024-05-29  8:14   ` Damien Le Moal
@ 2024-05-29 21:11   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Bart Van Assche @ 2024-05-29 21:11 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/28/24 22:04, Christoph Hellwig wrote:
> Split the logic to pick the right discard mode into a little helper
> to prepare for further changes.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 09/12] sd: convert to the atomic queue limits API
  2024-05-29  5:04 ` [PATCH 09/12] sd: convert to the atomic queue limits API Christoph Hellwig
  2024-05-29  8:23   ` Damien Le Moal
@ 2024-05-30  9:16   ` John Garry
  2024-05-31  5:48     ` Christoph Hellwig
  1 sibling, 1 reply; 43+ messages in thread
From: John Garry @ 2024-05-30  9:16 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 29/05/2024 06:04, Christoph Hellwig wrote:
> Assign all queue limits through a local queue_limits variable and
> queue_limits_commit_update so that we can't race updating them from
> multiple places, and free the queue when updating them so that
> in-progress I/O submissions don't see half-updated limits.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/scsi/sd.c     | 126 ++++++++++++++++++++++++------------------
>   drivers/scsi/sd.h     |   6 +-
>   drivers/scsi/sd_zbc.c |  15 ++---
>   3 files changed, 84 insertions(+), 63 deletions(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 2d08b69154b995..03e67936b27928 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -101,12 +101,13 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
>   
>   #define SD_MINORS	16
>   
> -static void sd_config_discard(struct scsi_disk *, unsigned int);
> -static void sd_config_write_same(struct scsi_disk *);
> +static void sd_config_discard(struct scsi_disk *sdkp, struct queue_limits *lim,
> +		unsigned int mode);

Are there any reasons why we keep forward declarations like this? 
AFAICS, this sd_config_discard forward declaration could be removed.

> +static void sd_config_write_same(struct scsi_disk *sdkp,
> +		struct queue_limits *lim);
>   static int  sd_revalidate_disk(struct gendisk *);
>   static void sd_unlock_native_capacity(struct gendisk *disk);
>   static void sd_shutdown(struct device *);
> -static void sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer);
>   static void scsi_disk_release(struct device *cdev);
>   
>   static DEFINE_IDA(sd_index_ida);
> @@ -456,7 +457,8 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
>   {
>   	struct scsi_disk *sdkp = to_scsi_disk(dev);
>   	struct scsi_device *sdp = sdkp->device;
> -	int mode;
> +	struct queue_limits lim;
> +	int mode, err;
>   
>   	if (!capable(CAP_SYS_ADMIN))
>   		return -EACCES;
> @@ -472,8 +474,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
>   	if (mode < 0)
>   		return -EINVAL;
>   
> -	sd_config_discard(sdkp, mode);
> -
> +	lim = queue_limits_start_update(sdkp->disk->queue);
> +	sd_config_discard(sdkp, &lim, mode);
> +	blk_mq_freeze_queue(sdkp->disk->queue);
> +	err = queue_limits_commit_update(sdkp->disk->queue, &lim);
> +	blk_mq_unfreeze_queue(sdkp->disk->queue);
> +	if (err)
> +		return err;
>   	return count;
>   }
>   static DEVICE_ATTR_RW(provisioning_mode);
> @@ -556,6 +563,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
>   {
>   	struct scsi_disk *sdkp = to_scsi_disk(dev);
>   	struct scsi_device *sdp = sdkp->device;
> +	struct queue_limits lim;
>   	unsigned long max;
>   	int err;
>   
> @@ -577,8 +585,13 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
>   		sdkp->max_ws_blocks = max;
>   	}
>   
> -	sd_config_write_same(sdkp);
> -
> +	lim = queue_limits_start_update(sdkp->disk->queue);
> +	sd_config_write_same(sdkp, &lim);
> +	blk_mq_freeze_queue(sdkp->disk->queue);
> +	err = queue_limits_commit_update(sdkp->disk->queue, &lim);
> +	blk_mq_unfreeze_queue(sdkp->disk->queue);
> +	if (err)
> +		return err;
>   	return count;
>   }
>   static DEVICE_ATTR_RW(max_write_same_blocks);
> @@ -827,17 +840,15 @@ static void sd_disable_discard(struct scsi_disk *sdkp)
>   	blk_queue_max_discard_sectors(sdkp->disk->queue, 0);
>   }
>   
> -static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
> +static void sd_config_discard(struct scsi_disk *sdkp, struct queue_limits *lim,
> +		unsigned int mode)
>   {
> -	struct request_queue *q = sdkp->disk->queue;
>   	unsigned int logical_block_size = sdkp->device->sector_size;
>   	unsigned int max_blocks = 0;
>   
> -	q->limits.discard_alignment =
> -		sdkp->unmap_alignment * logical_block_size;
> -	q->limits.discard_granularity =
> -		max(sdkp->physical_block_size,
> -		    sdkp->unmap_granularity * logical_block_size);
> +	lim->discard_alignment = sdkp->unmap_alignment * logical_block_size;
> +	lim->discard_granularity = max(sdkp->physical_block_size,
> +			sdkp->unmap_granularity * logical_block_size);
>   	sdkp->provisioning_mode = mode;
>   
>   	switch (mode) {
> @@ -875,7 +886,8 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
>   		break;
>   	}
>   
> -	blk_queue_max_discard_sectors(q, max_blocks * (logical_block_size >> 9));
> +	lim->max_hw_discard_sectors = max_blocks *
> +		(logical_block_size >> SECTOR_SHIFT);
>   }
>   
>   static void *sd_set_special_bvec(struct request *rq, unsigned int data_len)
> @@ -1010,9 +1022,9 @@ static void sd_disable_write_same(struct scsi_disk *sdkp)
>   	blk_queue_max_write_zeroes_sectors(sdkp->disk->queue, 0);
>   }
>   
> -static void sd_config_write_same(struct scsi_disk *sdkp)
> +static void sd_config_write_same(struct scsi_disk *sdkp,
> +		struct queue_limits *lim)
>   {
> -	struct request_queue *q = sdkp->disk->queue;
>   	unsigned int logical_block_size = sdkp->device->sector_size;
>   
>   	if (sdkp->device->no_write_same) {
> @@ -1066,8 +1078,8 @@ static void sd_config_write_same(struct scsi_disk *sdkp)
>   	}
>   
>   out:
> -	blk_queue_max_write_zeroes_sectors(q, sdkp->max_ws_blocks *
> -					 (logical_block_size >> 9));
> +	lim->max_write_zeroes_sectors =
> +		sdkp->max_ws_blocks * (logical_block_size >> 9);

Would it be ok to use SECTOR_SHIFT here? A similar change is made in 
sd_config_discard(), above

>   }
>   
>   static blk_status_t sd_setup_flush_cmnd(struct scsi_cmnd *cmd)
> @@ -2523,7 +2535,7 @@ static void read_capacity_error(struct scsi_disk *sdkp, struct scsi_device *sdp,
>   #define READ_CAPACITY_RETRIES_ON_RESET	10
>   
>   static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
> -						unsigned char *buffer)
> +		struct queue_limits *lim, unsigned char *buffer)
>   {
>   	unsigned char cmd[16];
>   	struct scsi_sense_hdr sshdr;
> @@ -2597,7 +2609,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
>   
>   	/* Lowest aligned logical block */
>   	alignment = ((buffer[14] & 0x3f) << 8 | buffer[15]) * sector_size;
> -	blk_queue_alignment_offset(sdp->request_queue, alignment);
> +	lim->alignment_offset = alignment;
>   	if (alignment && sdkp->first_scan)
>   		sd_printk(KERN_NOTICE, sdkp,
>   			  "physical block alignment offset: %u\n", alignment);
> @@ -2608,7 +2620,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
>   		if (buffer[14] & 0x40) /* LBPRZ */
>   			sdkp->lbprz = 1;
>   
> -		sd_config_discard(sdkp, SD_LBP_WS16);
> +		sd_config_discard(sdkp, lim, SD_LBP_WS16);
>   	}
>   
>   	sdkp->capacity = lba + 1;
> @@ -2711,13 +2723,14 @@ static int sd_try_rc16_first(struct scsi_device *sdp)
>    * read disk capacity
>    */
>   static void
> -sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer)
> +sd_read_capacity(struct scsi_disk *sdkp, struct queue_limits *lim,
> +		unsigned char *buffer)
>   {
>   	int sector_size;
>   	struct scsi_device *sdp = sdkp->device;
>   
>   	if (sd_try_rc16_first(sdp)) {
> -		sector_size = read_capacity_16(sdkp, sdp, buffer);
> +		sector_size = read_capacity_16(sdkp, sdp, lim, buffer);
>   		if (sector_size == -EOVERFLOW)
>   			goto got_data;
>   		if (sector_size == -ENODEV)
> @@ -2737,7 +2750,7 @@ sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer)
>   			int old_sector_size = sector_size;
>   			sd_printk(KERN_NOTICE, sdkp, "Very big device. "
>   					"Trying to use READ CAPACITY(16).\n");
> -			sector_size = read_capacity_16(sdkp, sdp, buffer);
> +			sector_size = read_capacity_16(sdkp, sdp, lim, buffer);
>   			if (sector_size < 0) {
>   				sd_printk(KERN_NOTICE, sdkp,
>   					"Using 0xffffffff as device size\n");
> @@ -2796,9 +2809,8 @@ sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer)
>   		 */
>   		sector_size = 512;
>   	}
> -	blk_queue_logical_block_size(sdp->request_queue, sector_size);
> -	blk_queue_physical_block_size(sdp->request_queue,
> -				      sdkp->physical_block_size);
> +	lim->logical_block_size = sector_size;
> +	lim->physical_block_size = sdkp->physical_block_size;
>   	sdkp->device->sector_size = sector_size;
>   
>   	if (sdkp->capacity > 0xffffffff)
> @@ -3220,11 +3232,11 @@ static unsigned int sd_discard_mode(struct scsi_disk *sdkp)
>   	return SD_LBP_DISABLE;
>   }
>   
> -/**
> - * sd_read_block_limits - Query disk device for preferred I/O sizes.
> - * @sdkp: disk to query
> +/*
> + * Query disk device for preferred I/O sizes.
>    */
> -static void sd_read_block_limits(struct scsi_disk *sdkp)
> +static void sd_read_block_limits(struct scsi_disk *sdkp,
> +		struct queue_limits *lim)
>   {
>   	struct scsi_vpd *vpd;
>   
> @@ -3258,7 +3270,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
>   			sdkp->unmap_alignment =
>   				get_unaligned_be32(&vpd->data[32]) & ~(1 << 31);
>   
> -		sd_config_discard(sdkp, sd_discard_mode(sdkp));
> +		sd_config_discard(sdkp, lim, sd_discard_mode(sdkp));
>   	}
>   
>    out:
> @@ -3278,10 +3290,10 @@ static void sd_read_block_limits_ext(struct scsi_disk *sdkp)
>   }
>   
>   /**

below is not a kernel doc comment

> - * sd_read_block_characteristics - Query block dev. characteristics
> - * @sdkp: disk to query
> + * Query block dev. characteristics
>    */
> -static void sd_read_block_characteristics(struct scsi_disk *sdkp)
> +static void sd_read_block_characteristics(struct scsi_disk *sdkp,
> +		struct queue_limits *lim)
>   {
>   	struct request_queue *q = sdkp->disk->queue;
>   	struct scsi_vpd *vpd;
> @@ -3307,29 +3319,26 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>   
>   #ifdef CONFIG_BLK_DEV_ZONED /* sd_probe rejects ZBD devices early otherwise */
>   	if (sdkp->device->type == TYPE_ZBC) {
> -		/*
> -		 * Host-managed.
> -		 */
> -		disk_set_zoned(sdkp->disk);
> +		lim->zoned = true;
>   
>   		/*
>   		 * Per ZBC and ZAC specifications, writes in sequential write
>   		 * required zones of host-managed devices must be aligned to
>   		 * the device physical block size.
>   		 */
> -		blk_queue_zone_write_granularity(q, sdkp->physical_block_size);
> +		lim->zone_write_granularity = sdkp->physical_block_size;
>   	} else {
>   		/*
>   		 * Host-aware devices are treated as conventional.
>   		 */
> -		WARN_ON_ONCE(blk_queue_is_zoned(q));
> +		lim->zoned = false;
>   	}
>   #endif /* CONFIG_BLK_DEV_ZONED */
>   
>   	if (!sdkp->first_scan)
>   		return;
>   
> -	if (blk_queue_is_zoned(q))
> +	if (lim->zoned)
>   		sd_printk(KERN_NOTICE, sdkp, "Host-managed zoned block device\n");
>   	else if (sdkp->zoned == 1)
>   		sd_printk(KERN_NOTICE, sdkp, "Host-aware SMR disk used as regular disk\n");
> @@ -3605,8 +3614,10 @@ static int sd_revalidate_disk(struct gendisk *disk)
>   	struct scsi_device *sdp = sdkp->device;
>   	struct request_queue *q = sdkp->disk->queue;
>   	sector_t old_capacity = sdkp->capacity;
> +	struct queue_limits lim;
>   	unsigned char *buffer;
>   	unsigned int dev_max;
> +	int err;
>   
>   	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp,
>   				      "sd_revalidate_disk\n"));
> @@ -3627,12 +3638,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
>   
>   	sd_spinup_disk(sdkp);
>   
> +	lim = queue_limits_start_update(sdkp->disk->queue);
> +
>   	/*
>   	 * Without media there is no reason to ask; moreover, some devices
>   	 * react badly if we do.
>   	 */
>   	if (sdkp->media_present) {
> -		sd_read_capacity(sdkp, buffer);
> +		sd_read_capacity(sdkp, &lim, buffer);
>   		/*
>   		 * Some USB/UAS devices return generic values for mode pages
>   		 * until the media has been accessed. Trigger a READ operation
> @@ -3651,10 +3664,10 @@ static int sd_revalidate_disk(struct gendisk *disk)
>   
>   		if (scsi_device_supports_vpd(sdp)) {
>   			sd_read_block_provisioning(sdkp);
> -			sd_read_block_limits(sdkp);
> +			sd_read_block_limits(sdkp, &lim);
>   			sd_read_block_limits_ext(sdkp);
> -			sd_read_block_characteristics(sdkp);
> -			sd_zbc_read_zones(sdkp, buffer);
> +			sd_read_block_characteristics(sdkp, &lim);
> +			sd_zbc_read_zones(sdkp, &lim, buffer);
>   			sd_read_cpr(sdkp);
>   		}
>   
> @@ -3683,28 +3696,33 @@ static int sd_revalidate_disk(struct gendisk *disk)
>   	q->limits.max_dev_sectors = logical_to_sectors(sdp, dev_max);


is setting q->limits.max_dev_sectors directly proper?

>   
>   	if (sd_validate_min_xfer_size(sdkp))
> -		blk_queue_io_min(sdkp->disk->queue,
> -				 logical_to_bytes(sdp, sdkp->min_xfer_blocks));
> +		lim.io_min = logical_to_bytes(sdp, sdkp->min_xfer_blocks);
>   	else
> -		blk_queue_io_min(sdkp->disk->queue, 0);
> +		lim.io_min = 0;
>   
>   	/*
>   	 * Limit default to SCSI host optimal sector limit if set. There may be
>   	 * an impact on performance for when the size of a request exceeds this
>   	 * host limit.
>   	 */
> -	q->limits.io_opt = sdp->host->opt_sectors << SECTOR_SHIFT;
> +	lim.io_opt = sdp->host->opt_sectors << SECTOR_SHIFT;
>   	if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
> -		q->limits.io_opt = min_not_zero(q->limits.io_opt,
> +		lim.io_opt = min_not_zero(lim.io_opt,
>   				logical_to_bytes(sdp, sdkp->opt_xfer_blocks));
>   	}
>   
>   	sdkp->first_scan = 0;
>   
>   	set_capacity_and_notify(disk, logical_to_sectors(sdp, sdkp->capacity));
> -	sd_config_write_same(sdkp);
> +	sd_config_write_same(sdkp, &lim);
>   	kfree(buffer);
>   


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/12] ubd: untagle discard vs write zeroes not support handling
  2024-05-29  5:04 ` [PATCH 01/12] ubd: untagle discard vs write zeroes not support handling Christoph Hellwig
  2024-05-29  8:00   ` Damien Le Moal
@ 2024-05-30 19:44   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Bart Van Assche @ 2024-05-30 19:44 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/28/24 22:04, Christoph Hellwig wrote:
> Discard and Write Zeroes are different operation and implemented
> by different fallocate opcodes for ubd.  If one fails the other one
> can work and vice versa.
> 
> Split the code to disable the operations in ubd_handler to only
> disable the operation that actually failed.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/12] block: take io_opt and io_min into account for max_sectors
  2024-05-29  5:04 ` [PATCH 02/12] block: take io_opt and io_min into account for max_sectors Christoph Hellwig
  2024-05-29  8:05   ` Damien Le Moal
@ 2024-05-30 19:47   ` Bart Van Assche
  2024-05-30 19:48   ` Ilya Dryomov
  2 siblings, 0 replies; 43+ messages in thread
From: Bart Van Assche @ 2024-05-30 19:47 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/28/24 22:04, Christoph Hellwig wrote:
> The soft max_sectors limit is normally capped by the hardware limits and
> an arbitrary upper limit enforced by the kernel, but can be modified by
> the user.  A few drivers want to increase this limit (nbd, rbd) or
> adjust it up or down based on hardware capabilities (sd).
> 
> Change blk_validate_limits to default max_sectors to the optimal I/O
> size, or upgrade it to the preferred minimal I/O size if that is
> larger than the kernel default if no optimal I/O size is provided based
> on the logic in the SD driver.
> 
> This keeps the existing kernel default for drivers that do not provide
> an io_opt or very big io_min value, but picks a much more useful
> default for those who provide these hints, and allows to remove the
> hacks to set the user max_sectors limit in nbd, rbd and sd.
> 
> Note that rd picks a different value for the optimal I/O size vs the
             ^^
             rbd?

> user max_sectors value, so this is a bit of a behavior change that
> could use careful review from people familiar with rbd.

Anyway:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/12] block: take io_opt and io_min into account for max_sectors
  2024-05-29  5:04 ` [PATCH 02/12] block: take io_opt and io_min into account for max_sectors Christoph Hellwig
  2024-05-29  8:05   ` Damien Le Moal
  2024-05-30 19:47   ` Bart Van Assche
@ 2024-05-30 19:48   ` Ilya Dryomov
  2024-05-31  5:54     ` Christoph Hellwig
  2 siblings, 1 reply; 43+ messages in thread
From: Ilya Dryomov @ 2024-05-30 19:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Richard Weinberger, Anton Ivanov,
	Johannes Berg, Josef Bacik, Dongsheng Yang, Roger Pau Monné,
	linux-um, linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On Wed, May 29, 2024 at 7:05 AM Christoph Hellwig <hch@lst.de> wrote:
>
> The soft max_sectors limit is normally capped by the hardware limits and
> an arbitrary upper limit enforced by the kernel, but can be modified by
> the user.  A few drivers want to increase this limit (nbd, rbd) or
> adjust it up or down based on hardware capabilities (sd).
>
> Change blk_validate_limits to default max_sectors to the optimal I/O
> size, or upgrade it to the preferred minimal I/O size if that is
> larger than the kernel default if no optimal I/O size is provided based
> on the logic in the SD driver.
>
> This keeps the existing kernel default for drivers that do not provide
> an io_opt or very big io_min value, but picks a much more useful
> default for those who provide these hints, and allows to remove the
> hacks to set the user max_sectors limit in nbd, rbd and sd.
>
> Note that rd picks a different value for the optimal I/O size vs the
> user max_sectors value, so this is a bit of a behavior change that
> could use careful review from people familiar with rbd.

Hi Christoph,

For rbd, this change effectively lowers max_sectors from 4M to 64K or
less and that is definitely not desirable.  From previous interactions
with users we want max_sectors to match max_hw_sectors -- this has come
up quite a few times over the years.  Some people just aren't aware
of the soft cap and the fact that it's adjustable and get frustrated
over the time poured into debugging their iostat numbers for workloads
that can send object (set) size I/Os.

Looking at the git history, we lowered io_opt from objset_bytes to
opts->alloc_size in commit [1], but I guess io_opt was lowered just
along for the ride.  What that commit was concerned with is really
discard_granularity and to a smaller extent io_min.

How much difference does io_opt make in the real world?  If what rbd
does stands in the way of a tree-wide cleanup, I would much rather bump
io_opt back to objset_bytes (i.e. what max_user_sectors is currently
set to).

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=16d80c54ad42c573a897ae7bcf5a9816be54e6fe

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 43+ messages in thread
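
A rough sketch of the max_sectors defaulting that patch 02 describes,
reconstructed from its commit message rather than quoted from the patch
itself, so treat the exact branch structure as an approximation;
BLK_DEF_MAX_SECTORS_CAP and SECTOR_SHIFT are the existing block layer
constants, and io_opt/io_min are in bytes while the sector limits are
in 512-byte units:

	/* inside blk_validate_limits(), after max_hw_sectors is set up */
	if (lim->max_user_sectors) {
		/* an explicit sysfs override still wins */
		lim->max_sectors = min(max_hw_sectors, lim->max_user_sectors);
	} else if (lim->io_opt) {
		/* default to the driver-provided optimal I/O size */
		lim->max_sectors = min(max_hw_sectors,
				       lim->io_opt >> SECTOR_SHIFT);
	} else if (lim->io_min >
		   (BLK_DEF_MAX_SECTORS_CAP << SECTOR_SHIFT)) {
		/* a very large preferred minimum raises the default too */
		lim->max_sectors = min(max_hw_sectors,
				       lim->io_min >> SECTOR_SHIFT);
	} else {
		/* otherwise keep the traditional kernel default */
		lim->max_sectors = min(max_hw_sectors,
				       BLK_DEF_MAX_SECTORS_CAP);
	}

With rbd setting io_opt to opts->alloc_size (64k by default), the
io_opt branch above is what pulls max_sectors down from the 4M object
set size Ilya describes.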

* Re: [PATCH 03/12] sd: simplify the ZBC case in provisioning_mode_store
  2024-05-29  5:04 ` [PATCH 03/12] sd: simplify the ZBC case in provisioning_mode_store Christoph Hellwig
  2024-05-29  8:07   ` Damien Le Moal
@ 2024-05-30 19:48   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Bart Van Assche @ 2024-05-30 19:48 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/28/24 22:04, Christoph Hellwig wrote:
> Don't reset the discard settings to no-op over and over when a user
> writes to the provisioning attribute as that is already the default
> mode for ZBC devices.  In hindsight we should have made writing to
> the attribute fail for ZBC devices, but the code has probably been
> around for far too long to change this now.

If Damien's feedback gets addressed, feel free to add:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] sd: add a sd_disable_discard helper
  2024-05-29  5:04 ` [PATCH 04/12] sd: add a sd_disable_discard helper Christoph Hellwig
  2024-05-29  8:10   ` Damien Le Moal
@ 2024-05-30 19:50   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Bart Van Assche @ 2024-05-30 19:50 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/28/24 22:04, Christoph Hellwig wrote:
> Add helper to disable discard when it is not supported and use it
> instead of sd_config_discard in the I/O completion handler.  This avoids
> touching more fields than required in the I/O completion handler and
> prepares for converting sd to use the atomic queue limits API.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/12] sd: add a sd_disable_write_same helper
  2024-05-29  5:04 ` [PATCH 05/12] sd: add a sd_disable_write_same helper Christoph Hellwig
  2024-05-29  8:12   ` Damien Le Moal
@ 2024-05-30 19:51   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Bart Van Assche @ 2024-05-30 19:51 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/28/24 22:04, Christoph Hellwig wrote:
> Add helper to disable WRITE SAME when it is not supported and use it
> instead of sd_config_write_same in the I/O completion handler.  This
> avoids touching more fields than required in the I/O completion handler
> and  prepares for converting sd to use the atomic queue limits API.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/12] sd: simplify the disable case in sd_config_discard
  2024-05-29  5:04 ` [PATCH 06/12] sd: simplify the disable case in sd_config_discard Christoph Hellwig
  2024-05-29  8:13   ` Damien Le Moal
@ 2024-05-30 20:02   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Bart Van Assche @ 2024-05-30 20:02 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/28/24 22:04, Christoph Hellwig wrote:
> Fall through to the main call to blk_queue_max_discard_sectors given that
> max_blocks has been initialized to zero above instead of duplicating the
> call.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/12] sd: cleanup zoned queue limits initialization
  2024-05-29  5:04 ` [PATCH 08/12] sd: cleanup zoned queue limits initialization Christoph Hellwig
  2024-05-29  8:18   ` Damien Le Moal
@ 2024-05-30 20:07   ` Bart Van Assche
  1 sibling, 0 replies; 43+ messages in thread
From: Bart Van Assche @ 2024-05-30 20:07 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/28/24 22:04, Christoph Hellwig wrote:
> Consolidate setting zone-related queue limits in sd_zbc_read_zones
> instead of splitting them between sd_zbc_revalidate_zones and
> sd_zbc_read_zones, and move the early_zone_information initialization
> in sd_zbc_read_zones above setting up the queue limits.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/12] block: remove unused queue limits API
  2024-05-29  5:04 ` [PATCH 11/12] block: remove unused " Christoph Hellwig
  2024-05-29  8:28   ` Damien Le Moal
@ 2024-05-30 20:08   ` Bart Van Assche
  2024-05-31  9:14   ` John Garry
  2 siblings, 0 replies; 43+ messages in thread
From: Bart Van Assche @ 2024-05-30 20:08 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/28/24 22:04, Christoph Hellwig wrote:
> Remove all APIs that are unused now that sd and sr have been converted
> to the atomic queue limits API.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/12] block: add special APIs for run-time disabling of discard and friends
  2024-05-29  5:04 ` [PATCH 12/12] block: add special APIs for run-time disabling of discard and friends Christoph Hellwig
  2024-05-29  8:30   ` Damien Le Moal
@ 2024-05-30 20:09   ` Bart Van Assche
  2024-05-31  9:13   ` John Garry
  2 siblings, 0 replies; 43+ messages in thread
From: Bart Van Assche @ 2024-05-30 20:09 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 5/28/24 22:04, Christoph Hellwig wrote:
> A few drivers optimistically try to support discard, write zeroes and
> secure erase and disable the features from the I/O completion handler
> if the hardware can't support them.  This disable can't be done using
> the atomic queue limits API because the I/O completion handlers can't
> take sleeping locks or freezer the queue.  Keep the existing clearing
> of the relevant field to zero, but replace the old blk_queue_max_*
> APIs with new disable APIs that force the value to 0.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 09/12] sd: convert to the atomic queue limits API
  2024-05-30  9:16   ` John Garry
@ 2024-05-31  5:48     ` Christoph Hellwig
  0 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-31  5:48 UTC (permalink / raw)
  To: John Garry
  Cc: Christoph Hellwig, Jens Axboe, Martin K. Petersen,
	Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On Thu, May 30, 2024 at 10:16:33AM +0100, John Garry wrote:
>> -static void sd_config_write_same(struct scsi_disk *);
>> +static void sd_config_discard(struct scsi_disk *sdkp, struct queue_limits *lim,
>> +		unsigned int mode);
>
> Are there any reasons why we keep forward declarations like this? AFAICS, 
> this sd_config_discard forward declaration could be removed.

Mostly to avoid churn.  This is a series that needs to feed into the
block tree, so I don't want major churn in sd.c.  Maybe after the dust
has settled it would be nice to bring sd.c into a natural order.

>> -	blk_queue_max_write_zeroes_sectors(q, sdkp->max_ws_blocks *
>> -					 (logical_block_size >> 9));
>> +	lim->max_write_zeroes_sectors =
>> +		sdkp->max_ws_blocks * (logical_block_size >> 9);
>
> Would it be ok to use SECTOR_SHIFT here? A similar change is made in 
> sd_config_discard(), above

Sure.

>> +		sd_config_discard(sdkp, lim, sd_discard_mode(sdkp));
>>   	}
>>      out:
>> @@ -3278,10 +3290,10 @@ static void sd_read_block_limits_ext(struct scsi_disk *sdkp)
>>   }
>>     /**
>
> below is not a kernel doc comment

And that is on the one hand intentional to avoid documenting all the
obvious parameters in a local function, but on the other hand requires
removing the double *. Fixed.

>>   @@ -3683,28 +3696,33 @@ static int sd_revalidate_disk(struct gendisk 
>> *disk)
>>   	q->limits.max_dev_sectors = logical_to_sectors(sdp, dev_max);
>
>
> is setting q->limits.max_dev_sectors directly proper?

No, and I've already fixed it in my local tree.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/12] block: take io_opt and io_min into account for max_sectors
  2024-05-30 19:48   ` Ilya Dryomov
@ 2024-05-31  5:54     ` Christoph Hellwig
  2024-05-31  6:48       ` Ilya Dryomov
  0 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-31  5:54 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Christoph Hellwig, Jens Axboe, Martin K. Petersen,
	Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Dongsheng Yang, Roger Pau Monné, linux-um, linux-block, nbd,
	ceph-devel, xen-devel, linux-scsi

On Thu, May 30, 2024 at 09:48:06PM +0200, Ilya Dryomov wrote:
> For rbd, this change effectively lowers max_sectors from 4M to 64K or
> less and that is definitely not desirable.  From previous interactions
> with users we want max_sectors to match max_hw_sectors -- this has come
> up quite a few times over the years.  Some people just aren't aware
> of the soft cap and the fact that it's adjustable and get frustrated
> over the time poured into debugging their iostat numbers for workloads
> that can send object (set) size I/Os.
> 
> Looking at the git history, we lowered io_opt from objset_bytes to
> opts->alloc_size in commit [1], but I guess io_opt was lowered just
> along for the ride.  What that commit was concerned with is really
> discard_granularity and to a smaller extent io_min.
> 
> How much difference does io_opt make in the real world?  If what rbd
> does stands in the way of a tree-wide cleanup, I would much rather bump
> io_opt back to objset_bytes (i.e. what max_user_sectors is currently
> set to).

The only existing in-kernel usage is to set the readahead size.
Based on your comments it seems like we should revert io_opt to
objset to ->alloc_size in a prep patch?


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/12] block: take io_opt and io_min into account for max_sectors
  2024-05-31  5:54     ` Christoph Hellwig
@ 2024-05-31  6:48       ` Ilya Dryomov
  2024-05-31  6:56         ` Christoph Hellwig
  0 siblings, 1 reply; 43+ messages in thread
From: Ilya Dryomov @ 2024-05-31  6:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Richard Weinberger, Anton Ivanov,
	Johannes Berg, Josef Bacik, Dongsheng Yang, Roger Pau Monné,
	linux-um, linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On Fri, May 31, 2024 at 7:54 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, May 30, 2024 at 09:48:06PM +0200, Ilya Dryomov wrote:
> > For rbd, this change effectively lowers max_sectors from 4M to 64K or
> > less and that is definitely not desirable.  From previous interactions
> > with users we want max_sectors to match max_hw_sectors -- this has come
> > up quite a few times over the years.  Some people just aren't aware
> > of the soft cap and the fact that it's adjustable and get frustrated
> > over the time poured into debugging their iostat numbers for workloads
> > that can send object (set) size I/Os.
> >
> > Looking at the git history, we lowered io_opt from objset_bytes to
> > opts->alloc_size in commit [1], but I guess io_opt was lowered just
> > along for the ride.  What that commit was concerned with is really
> > discard_granularity and to a smaller extent io_min.
> >
> > How much difference does io_opt make in the real world?  If what rbd
> > does stands in the way of a tree-wide cleanup, I would much rather bump
> > io_opt back to objset_bytes (i.e. what max_user_sectors is currently
> > set to).
>
> The only existing in-kernel usage is to set the readahead size.
> Based on your comments it seems like we should revert io_opt to
> objset to ->alloc_size in a prep patch?

We should revert io_opt from opts->alloc_size to objset_bytes (I think
it's what you meant to say but typoed).

How do you want to handle it?  I can put together a patch, send it to
ceph-devel and it will be picked by linux-next sometime next week.  Then
this patch would grow a contextual conflict and the description would
need to be updated to not mention a behavior change for rbd anymore.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/12] block: take io_opt and io_min into account for max_sectors
  2024-05-31  6:48       ` Ilya Dryomov
@ 2024-05-31  6:56         ` Christoph Hellwig
  0 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2024-05-31  6:56 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Christoph Hellwig, Jens Axboe, Martin K. Petersen,
	Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Dongsheng Yang, Roger Pau Monné, linux-um, linux-block, nbd,
	ceph-devel, xen-devel, linux-scsi

On Fri, May 31, 2024 at 08:48:12AM +0200, Ilya Dryomov wrote:
> We should revert io_opt from opts->alloc_size to objset_bytes (I think
> it's what you meant to say but typoed).

Yes, sorry.

> How do you want to handle it?  I can put together a patch, send it to
> ceph-devel and it will be picked by linux-next sometime next week.  Then
> this patch would grow a contextual conflict and the description would
> need to be updated to not mention a behavior change for rbd anymore.

If that works for you I'd much prefer to include it with this series.
I'd be happy to write it myself.
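
A sketch of what that prep patch could boil down to, assuming the limits
are still built where rbd sets up the queue_limits passed to the disk
allocation (the context line is illustrative; only the io_opt change is
the point):

	--- a/drivers/block/rbd.c
	+++ b/drivers/block/rbd.c
	 		.io_min			= rbd_dev->opts->alloc_size,
	-		.io_opt			= rbd_dev->opts->alloc_size,
	+		.io_opt			= objset_bytes,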


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/12] block: add special APIs for run-time disabling of discard and friends
  2024-05-29  5:04 ` [PATCH 12/12] block: add special APIs for run-time disabling of discard and friends Christoph Hellwig
  2024-05-29  8:30   ` Damien Le Moal
  2024-05-30 20:09   ` Bart Van Assche
@ 2024-05-31  9:13   ` John Garry
  2 siblings, 0 replies; 43+ messages in thread
From: John Garry @ 2024-05-31  9:13 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 29/05/2024 06:04, Christoph Hellwig wrote:
> A few drivers optimistically try to support discard, write zeroes and
> secure erase and disable the features from the I/O completion handler
> if the hardware can't support them.  This disable can't be done using
> the atomic queue limits API because the I/O completion handlers can't
> take sleeping locks or freezer the queue.  Keep the existing clearing
> of the relevant field to zero, but replace the old blk_queue_max_*
> APIs with new disable APIs that force the value to 0.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

It would be nice to improve this eventually (to use the atomic queue 
limits API). Anyway:
Reviewed-by: John Garry <john.g.garry@oracle.com>
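
For reference, the disable helpers described in the quoted changelog
amount to forcing the relevant limit field to zero without taking any
sleeping lock or freezing the queue; a sketch with assumed helper and
field names (inferred from the patch subject and current queue_limits
naming, not quoted from the patch itself):

	static inline void blk_queue_disable_discard(struct request_queue *q)
	{
		q->limits.max_discard_sectors = 0;
	}

	static inline void blk_queue_disable_secure_erase(struct request_queue *q)
	{
		q->limits.max_secure_erase_sectors = 0;
	}

	static inline void blk_queue_disable_write_zeroes(struct request_queue *q)
	{
		q->limits.max_write_zeroes_sectors = 0;
	}

A completion handler would then call e.g. blk_queue_disable_write_zeroes(q)
where it previously called blk_queue_max_write_zeroes_sectors(q, 0).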

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/12] block: remove unused queue limits API
  2024-05-29  5:04 ` [PATCH 11/12] block: remove unused " Christoph Hellwig
  2024-05-29  8:28   ` Damien Le Moal
  2024-05-30 20:08   ` Bart Van Assche
@ 2024-05-31  9:14   ` John Garry
  2 siblings, 0 replies; 43+ messages in thread
From: John Garry @ 2024-05-31  9:14 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Martin K. Petersen
  Cc: Richard Weinberger, Anton Ivanov, Johannes Berg, Josef Bacik,
	Ilya Dryomov, Dongsheng Yang, Roger Pau Monné, linux-um,
	linux-block, nbd, ceph-devel, xen-devel, linux-scsi

On 29/05/2024 06:04, Christoph Hellwig wrote:
> Remove all APIs that are unused now that sd and sr have been converted
> to the atomic queue limits API.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---


Reviewed-by: John Garry <john.g.garry@oracle.com>
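
As a reminder of what replaces the removed helpers: a driver now stages
all limits and commits them in one go.  A minimal sketch, with placeholder
values that a driver would have probed from the hardware (not taken from
the sd or sr patches):

	static int example_update_limits(struct request_queue *q,
			unsigned int dev_max_sectors, unsigned int dev_opt_bytes)
	{
		struct queue_limits lim;

		/*
		 * Old style, removed by this patch:
		 *	blk_queue_max_hw_sectors(q, dev_max_sectors);
		 *	blk_queue_io_opt(q, dev_opt_bytes);
		 */
		lim = queue_limits_start_update(q);
		lim.max_hw_sectors = dev_max_sectors;
		lim.io_opt = dev_opt_bytes;
		return queue_limits_commit_update(q, &lim);
	}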

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2024-05-31  9:14 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-29  5:04 convert the SCSI ULDs to the atomic queue limits API Christoph Hellwig
2024-05-29  5:04 ` [PATCH 01/12] ubd: untagle discard vs write zeroes not support handling Christoph Hellwig
2024-05-29  8:00   ` Damien Le Moal
2024-05-30 19:44   ` Bart Van Assche
2024-05-29  5:04 ` [PATCH 02/12] block: take io_opt and io_min into account for max_sectors Christoph Hellwig
2024-05-29  8:05   ` Damien Le Moal
2024-05-30 19:47   ` Bart Van Assche
2024-05-30 19:48   ` Ilya Dryomov
2024-05-31  5:54     ` Christoph Hellwig
2024-05-31  6:48       ` Ilya Dryomov
2024-05-31  6:56         ` Christoph Hellwig
2024-05-29  5:04 ` [PATCH 03/12] sd: simplify the ZBC case in provisioning_mode_store Christoph Hellwig
2024-05-29  8:07   ` Damien Le Moal
2024-05-30 19:48   ` Bart Van Assche
2024-05-29  5:04 ` [PATCH 04/12] sd: add a sd_disable_discard helper Christoph Hellwig
2024-05-29  8:10   ` Damien Le Moal
2024-05-30 19:50   ` Bart Van Assche
2024-05-29  5:04 ` [PATCH 05/12] sd: add a sd_disable_write_same helper Christoph Hellwig
2024-05-29  8:12   ` Damien Le Moal
2024-05-30 19:51   ` Bart Van Assche
2024-05-29  5:04 ` [PATCH 06/12] sd: simplify the disable case in sd_config_discard Christoph Hellwig
2024-05-29  8:13   ` Damien Le Moal
2024-05-30 20:02   ` Bart Van Assche
2024-05-29  5:04 ` [PATCH 07/12] sd: factor out a sd_discard_mode helper Christoph Hellwig
2024-05-29  8:14   ` Damien Le Moal
2024-05-29 21:11   ` Bart Van Assche
2024-05-29  5:04 ` [PATCH 08/12] sd: cleanup zoned queue limits initialization Christoph Hellwig
2024-05-29  8:18   ` Damien Le Moal
2024-05-30 20:07   ` Bart Van Assche
2024-05-29  5:04 ` [PATCH 09/12] sd: convert to the atomic queue limits API Christoph Hellwig
2024-05-29  8:23   ` Damien Le Moal
2024-05-30  9:16   ` John Garry
2024-05-31  5:48     ` Christoph Hellwig
2024-05-29  5:04 ` [PATCH 10/12] sr: " Christoph Hellwig
2024-05-29  8:25   ` Damien Le Moal
2024-05-29  5:04 ` [PATCH 11/12] block: remove unused " Christoph Hellwig
2024-05-29  8:28   ` Damien Le Moal
2024-05-30 20:08   ` Bart Van Assche
2024-05-31  9:14   ` John Garry
2024-05-29  5:04 ` [PATCH 12/12] block: add special APIs for run-time disabling of discard and friends Christoph Hellwig
2024-05-29  8:30   ` Damien Le Moal
2024-05-30 20:09   ` Bart Van Assche
2024-05-31  9:13   ` John Garry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).