* fix discard limits
@ 2023-07-07  9:46 Christoph Hellwig
  2023-07-07  9:46 ` [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors Christoph Hellwig
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Christoph Hellwig @ 2023-07-07  9:46 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch, Sagi Grimberg; +Cc: linux-block, linux-nvme

Hi all,

this series fixes a few issues related to the max_discard_sectors limits
in the block layer and nvme.

Diffstat:
 block/blk-settings.c     |    4 +++-
 drivers/nvme/host/core.c |   35 +++++++++++++----------------------
 drivers/nvme/host/nvme.h |    3 +--
 3 files changed, 17 insertions(+), 25 deletions(-)


* [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors
  2023-07-07  9:46 fix discard limits Christoph Hellwig
@ 2023-07-07  9:46 ` Christoph Hellwig
  2023-07-10  3:53   ` Damien Le Moal
                     ` (3 more replies)
  2023-07-07  9:46 ` [PATCH 2/4] nvme: update discard limits in nvme_config_discard Christoph Hellwig
                   ` (2 subsequent siblings)
  3 siblings, 4 replies; 17+ messages in thread
From: Christoph Hellwig @ 2023-07-07  9:46 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch, Sagi Grimberg; +Cc: linux-block, linux-nvme

max_discard_sectors is split into a hardware and a tunable value, but
blk_queue_max_discard_sectors sets both unconditionally, thus dropping
any user stored value on a rescan.  Fix blk_queue_max_discard_sectors to
only set max_discard_sectors if it either wasn't set, or the new hardware
limit is smaller than the previous user limit.

Fixes: 0034af036554 ("block: make /sys/block/<dev>/queue/discard_max_bytes writeable")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-settings.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 0046b447268f91..978d2e1fd67a51 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -179,7 +179,9 @@ void blk_queue_max_discard_sectors(struct request_queue *q,
 		unsigned int max_discard_sectors)
 {
 	q->limits.max_hw_discard_sectors = max_discard_sectors;
-	q->limits.max_discard_sectors = max_discard_sectors;
+	if (!q->limits.max_discard_sectors ||
+	     q->limits.max_discard_sectors > max_discard_sectors)
+		q->limits.max_discard_sectors = max_discard_sectors;
 }
 EXPORT_SYMBOL(blk_queue_max_discard_sectors);
 
-- 
2.39.2
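
To illustrate the change in patch 1, here is a minimal user-space sketch
of the old and new setter behaviour across a rescan.  It is not kernel
code: the names limits_model, set_discard_old, set_discard_new and
rescan_scenario are made up for illustration, and the direct assignment
that stands in for a write to discard_max_bytes is an assumption about
how the user value ends up in max_discard_sectors.

```c
#include <assert.h>

/* Hypothetical stand-in for the two discard fields of struct queue_limits. */
struct limits_model {
	unsigned int max_hw_discard_sectors;	/* what the driver reports */
	unsigned int max_discard_sectors;	/* effective, user-tunable value */
};

/* Pre-patch behaviour: both fields are overwritten on every call. */
static void set_discard_old(struct limits_model *l, unsigned int hw)
{
	l->max_hw_discard_sectors = hw;
	l->max_discard_sectors = hw;
}

/* Post-patch behaviour, following the condition in the hunk above. */
static void set_discard_new(struct limits_model *l, unsigned int hw)
{
	l->max_hw_discard_sectors = hw;
	if (!l->max_discard_sectors || l->max_discard_sectors > hw)
		l->max_discard_sectors = hw;
}

/*
 * Initial scan, user tuning, then a rescan with an unchanged hardware
 * limit.  Returns the effective limit after the rescan.
 */
static unsigned int rescan_scenario(
		void (*set)(struct limits_model *, unsigned int),
		unsigned int hw, unsigned int user)
{
	struct limits_model l = { 0, 0 };

	/* The model only considers user values within the hardware limit. */
	assert(user <= hw);

	set(&l, hw);			/* initial scan */
	l.max_discard_sectors = user;	/* models the sysfs write (assumption) */
	set(&l, hw);			/* rescan */
	return l.max_discard_sectors;
}
```

With hw = 8192 and user = 1024, the old setter returns 8192 from
rescan_scenario() (the user value is lost), while the new one returns
1024.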



* [PATCH 2/4] nvme: update discard limits in nvme_config_discard
  2023-07-07  9:46 fix discard limits Christoph Hellwig
  2023-07-07  9:46 ` [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors Christoph Hellwig
@ 2023-07-07  9:46 ` Christoph Hellwig
  2023-07-10  3:54   ` Damien Le Moal
  2023-07-10  9:29   ` Sagi Grimberg
  2023-07-07  9:46 ` [PATCH 3/4] nvme: fix max_discard_sectors calculation Christoph Hellwig
  2023-07-07  9:46 ` [PATCH 4/4] nvme: simplify the max_discard_segments calculation Christoph Hellwig
  3 siblings, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2023-07-07  9:46 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch, Sagi Grimberg; +Cc: linux-block, linux-nvme

nvme_config_discard currently skips updating the discard limits if they
were set before because blk_queue_max_discard_sectors used to update the
configurable max_discard_sectors limit unconditionally.  Now that this
has been fixed, we can update the discard limits even if they were
already set, to deal with the case of a reset changing the limits after
e.g. a firmware update.

Fixes: 3831761eb859 ("nvme: only reconfigure discard if necessary")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 47d7ba2827ff29..2d6c1f4ad7f5c8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1734,10 +1734,6 @@ static void nvme_config_discard(struct gendisk *disk, struct nvme_ns *ns)
 
 	queue->limits.discard_granularity = size;
 
-	/* If discard is already enabled, don't reset queue limits */
-	if (queue->limits.max_discard_sectors)
-		return;
-
 	blk_queue_max_discard_sectors(queue, ctrl->max_discard_sectors);
 	blk_queue_max_discard_segments(queue, ctrl->max_discard_segments);
 
-- 
2.39.2



* [PATCH 3/4] nvme: fix max_discard_sectors calculation
  2023-07-07  9:46 fix discard limits Christoph Hellwig
  2023-07-07  9:46 ` [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors Christoph Hellwig
  2023-07-07  9:46 ` [PATCH 2/4] nvme: update discard limits in nvme_config_discard Christoph Hellwig
@ 2023-07-07  9:46 ` Christoph Hellwig
  2023-07-10  3:57   ` Damien Le Moal
  2023-07-10  9:31   ` Sagi Grimberg
  2023-07-07  9:46 ` [PATCH 4/4] nvme: simplify the max_discard_segments calculation Christoph Hellwig
  3 siblings, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2023-07-07  9:46 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch, Sagi Grimberg; +Cc: linux-block, linux-nvme

ctrl->max_discard_sectors stores a value that is potentially based on
the DMRSL field in Identify Controller, which is in units of LBAs and
thus dependent on the Format of a namespace.

Fix this by moving the calculation of max_discard_sectors entirely
into nvme_config_discard and replacing the ctrl->max_discard_sectors
value with a local variable so that the calculation is always
namespace-specific.

Fixes: 1a86924e4f46 ("nvme: fix interpretation of DMRSL")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 22 ++++++++++------------
 drivers/nvme/host/nvme.h |  1 -
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 2d6c1f4ad7f5c8..05372bec3b7aff 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1721,20 +1721,21 @@ static void nvme_config_discard(struct gendisk *disk, struct nvme_ns *ns)
 	struct request_queue *queue = disk->queue;
 	u32 size = queue_logical_block_size(queue);
 
-	if (ctrl->dmrsl && ctrl->dmrsl <= nvme_sect_to_lba(ns, UINT_MAX))
-		ctrl->max_discard_sectors = nvme_lba_to_sect(ns, ctrl->dmrsl);
+	BUILD_BUG_ON(PAGE_SIZE / sizeof(struct nvme_dsm_range) <
+			NVME_DSM_MAX_RANGES);
 
-	if (ctrl->max_discard_sectors == 0) {
+	if (ctrl->dmrsl && ctrl->dmrsl <= nvme_sect_to_lba(ns, UINT_MAX)) {
+		blk_queue_max_discard_sectors(queue,
+				nvme_lba_to_sect(ns, ctrl->dmrsl));
+	} else if (ctrl->oncs & NVME_CTRL_ONCS_DSM) {
+		blk_queue_max_discard_sectors(queue, UINT_MAX);
+	} else {
 		blk_queue_max_discard_sectors(queue, 0);
 		return;
 	}
 
-	BUILD_BUG_ON(PAGE_SIZE / sizeof(struct nvme_dsm_range) <
-			NVME_DSM_MAX_RANGES);
-
 	queue->limits.discard_granularity = size;
 
-	blk_queue_max_discard_sectors(queue, ctrl->max_discard_sectors);
 	blk_queue_max_discard_segments(queue, ctrl->max_discard_segments);
 
 	if (ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
@@ -2870,13 +2871,10 @@ static int nvme_init_non_mdts_limits(struct nvme_ctrl *ctrl)
 	struct nvme_id_ctrl_nvm *id;
 	int ret;
 
-	if (ctrl->oncs & NVME_CTRL_ONCS_DSM) {
-		ctrl->max_discard_sectors = UINT_MAX;
+	if (ctrl->oncs & NVME_CTRL_ONCS_DSM)
 		ctrl->max_discard_segments = NVME_DSM_MAX_RANGES;
-	} else {
-		ctrl->max_discard_sectors = 0;
+	else
 		ctrl->max_discard_segments = 0;
-	}
 
 	/*
 	 * Even though NVMe spec explicitly states that MDTS is not applicable
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index f35647c470afad..d59ed2ba1c37ca 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -296,7 +296,6 @@ struct nvme_ctrl {
 	u32 max_hw_sectors;
 	u32 max_segments;
 	u32 max_integrity_segments;
-	u32 max_discard_sectors;
 	u32 max_discard_segments;
 	u32 max_zeroes_sectors;
 #ifdef CONFIG_BLK_DEV_ZONED
-- 
2.39.2
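
The unit problem described in patch 3 can be sketched in user space as
follows.  This is an illustration under assumptions, not the driver
code: lba_to_sect() and sect_to_lba() are simplified stand-ins for
nvme_lba_to_sect() and nvme_sect_to_lba() that take the LBA shift
directly instead of a struct nvme_ns, and discard_sectors_for_ns() is a
hypothetical helper that mirrors the if/else chain of the patched
nvme_config_discard() and returns the value that would be passed to
blk_queue_max_discard_sectors().

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* block layer sectors are 512 bytes */

/* Stand-in for nvme_lba_to_sect(): LBAs to 512-byte sectors. */
static uint64_t lba_to_sect(unsigned int lba_shift, uint64_t lba)
{
	assert(lba_shift >= SECTOR_SHIFT);
	return lba << (lba_shift - SECTOR_SHIFT);
}

/* Stand-in for nvme_sect_to_lba(): 512-byte sectors to LBAs. */
static uint64_t sect_to_lba(unsigned int lba_shift, uint64_t sect)
{
	assert(lba_shift >= SECTOR_SHIFT);
	return sect >> (lba_shift - SECTOR_SHIFT);
}

/*
 * Hypothetical helper mirroring the patched if/else chain.  dsm is
 * nonzero when the controller advertises DSM support in ONCS.
 */
static unsigned int discard_sectors_for_ns(unsigned int lba_shift,
		uint32_t dmrsl, int dsm)
{
	/* DMRSL is in LBAs, so the result depends on the namespace format. */
	if (dmrsl && dmrsl <= sect_to_lba(lba_shift, UINT_MAX))
		return (unsigned int)lba_to_sect(lba_shift, dmrsl);
	if (dsm)
		return UINT_MAX;
	return 0;
}
```

For the same DMRSL of 1024, a namespace with 512-byte LBAs (shift 9)
gets 1024 sectors and one with 4096-byte LBAs (shift 12) gets 8192
sectors, which is why a single per-controller value cannot be right for
both.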



* [PATCH 4/4] nvme: simplify the max_discard_segments calculation
  2023-07-07  9:46 fix discard limits Christoph Hellwig
                   ` (2 preceding siblings ...)
  2023-07-07  9:46 ` [PATCH 3/4] nvme: fix max_discard_sectors calculation Christoph Hellwig
@ 2023-07-07  9:46 ` Christoph Hellwig
  2023-07-10  9:32   ` Sagi Grimberg
  3 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2023-07-07  9:46 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch, Sagi Grimberg; +Cc: linux-block, linux-nvme

Just stash away the DMRL value in the nvme_ctrl structure, and leave
all interpretation to nvme_config_discard, where we know DSM is
supported by the time we're configuring the number of segments.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/core.c | 13 +++++--------
 drivers/nvme/host/nvme.h |  2 +-
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 05372bec3b7aff..f5814aa1b33910 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1736,7 +1736,10 @@ static void nvme_config_discard(struct gendisk *disk, struct nvme_ns *ns)
 
 	queue->limits.discard_granularity = size;
 
-	blk_queue_max_discard_segments(queue, ctrl->max_discard_segments);
+	if (ctrl->dmrl)
+		blk_queue_max_discard_segments(queue, ctrl->dmrl);
+	else
+		blk_queue_max_discard_segments(queue, NVME_DSM_MAX_RANGES);
 
 	if (ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
 		blk_queue_max_write_zeroes_sectors(queue, UINT_MAX);
@@ -2871,11 +2874,6 @@ static int nvme_init_non_mdts_limits(struct nvme_ctrl *ctrl)
 	struct nvme_id_ctrl_nvm *id;
 	int ret;
 
-	if (ctrl->oncs & NVME_CTRL_ONCS_DSM)
-		ctrl->max_discard_segments = NVME_DSM_MAX_RANGES;
-	else
-		ctrl->max_discard_segments = 0;
-
 	/*
 	 * Even though NVMe spec explicitly states that MDTS is not applicable
 	 * to the write-zeroes, we are cautious and limit the size to the
@@ -2905,8 +2903,7 @@ static int nvme_init_non_mdts_limits(struct nvme_ctrl *ctrl)
 	if (ret)
 		goto free_data;
 
-	if (id->dmrl)
-		ctrl->max_discard_segments = id->dmrl;
+	ctrl->dmrl = id->dmrl;
 	ctrl->dmrsl = le32_to_cpu(id->dmrsl);
 	if (id->wzsl)
 		ctrl->max_zeroes_sectors = nvme_mps_to_sectors(ctrl, id->wzsl);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index d59ed2ba1c37ca..1bfe172f9268a0 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -296,13 +296,13 @@ struct nvme_ctrl {
 	u32 max_hw_sectors;
 	u32 max_segments;
 	u32 max_integrity_segments;
-	u32 max_discard_segments;
 	u32 max_zeroes_sectors;
 #ifdef CONFIG_BLK_DEV_ZONED
 	u32 max_zone_append;
 #endif
 	u16 crdt[3];
 	u16 oncs;
+	u8 dmrl;
 	u32 dmrsl;
 	u16 oacs;
 	u16 sqsize;
-- 
2.39.2



* Re: [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors
  2023-07-07  9:46 ` [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors Christoph Hellwig
@ 2023-07-10  3:53   ` Damien Le Moal
  2023-07-10  9:29   ` Sagi Grimberg
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Damien Le Moal @ 2023-07-10  3:53 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg
  Cc: linux-block, linux-nvme

On 7/7/23 18:46, Christoph Hellwig wrote:
> max_discard_sectors is split into a hardware and a tunable value, but
> blk_queue_max_discard_sectors sets both unconditionally, thus dropping
> any user stored value on a rescan.  Fix blk_queue_max_discard_sectors to
> only set max_discard_sectors if it either wasn't set, or the new hardware
> limit is smaller than the previous user limit.
> 
> Fixes: 0034af036554 ("block: make /sys/block/<dev>/queue/discard_max_bytes writeable")
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>


> ---
>  block/blk-settings.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 0046b447268f91..978d2e1fd67a51 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -179,7 +179,9 @@ void blk_queue_max_discard_sectors(struct request_queue *q,
>  		unsigned int max_discard_sectors)
>  {
>  	q->limits.max_hw_discard_sectors = max_discard_sectors;
> -	q->limits.max_discard_sectors = max_discard_sectors;
> +	if (!q->limits.max_discard_sectors ||
> +	     q->limits.max_discard_sectors > max_discard_sectors)
> +		q->limits.max_discard_sectors = max_discard_sectors;
>  }
>  EXPORT_SYMBOL(blk_queue_max_discard_sectors);
>  

-- 
Damien Le Moal
Western Digital Research



* Re: [PATCH 2/4] nvme: update discard limits in nvme_config_discard
  2023-07-07  9:46 ` [PATCH 2/4] nvme: update discard limits in nvme_config_discard Christoph Hellwig
@ 2023-07-10  3:54   ` Damien Le Moal
  2023-07-10  9:29   ` Sagi Grimberg
  1 sibling, 0 replies; 17+ messages in thread
From: Damien Le Moal @ 2023-07-10  3:54 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg
  Cc: linux-block, linux-nvme

On 7/7/23 18:46, Christoph Hellwig wrote:
> nvme_config_discard currently skips updating the discard limits if they
> were set before because blk_queue_max_discard_sectors used to update the
> configurable max_discard_sectors limit unconditionally.  Now that this
> has been fixed we can update the discard limits even if they were set
> to deal with the case of a reset changing the limits after e.g. a
> firmware update.
> 
> Fixes: 3831761eb859 ("nvme: only reconfigure discard if necessary")
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

> ---
>  drivers/nvme/host/core.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 47d7ba2827ff29..2d6c1f4ad7f5c8 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1734,10 +1734,6 @@ static void nvme_config_discard(struct gendisk *disk, struct nvme_ns *ns)
>  
>  	queue->limits.discard_granularity = size;
>  
> -	/* If discard is already enabled, don't reset queue limits */
> -	if (queue->limits.max_discard_sectors)
> -		return;
> -
>  	blk_queue_max_discard_sectors(queue, ctrl->max_discard_sectors);
>  	blk_queue_max_discard_segments(queue, ctrl->max_discard_segments);
>  

-- 
Damien Le Moal
Western Digital Research



* Re: [PATCH 3/4] nvme: fix max_discard_sectors calculation
  2023-07-07  9:46 ` [PATCH 3/4] nvme: fix max_discard_sectors calculation Christoph Hellwig
@ 2023-07-10  3:57   ` Damien Le Moal
  2023-07-10  6:39     ` Christoph Hellwig
  2023-07-10  9:31   ` Sagi Grimberg
  1 sibling, 1 reply; 17+ messages in thread
From: Damien Le Moal @ 2023-07-10  3:57 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg
  Cc: linux-block, linux-nvme

On 7/7/23 18:46, Christoph Hellwig wrote:
> ctrl->max_discard_sectors stores a value that is potentially based on
> the DMRSL field in Identify Controller, which is in units of LBAs and
> thus dependent on the Format of a namespace.
> 
> Fix this by moving the calculation of max_discard_sectors entirely
> into nvme_config_discard and replacing the ctrl->max_discard_sectors
> value with a local variable so that the calculation is always

I do not see a local variable replacement... Maybe you meant direct calls to
blk_queue_max_discard_sectors()?

Other than that, looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

> namespace-specific.
> 
> Fixes: 1a86924e4f46 ("nvme: fix interpretation of DMRSL")
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/nvme/host/core.c | 22 ++++++++++------------
>  drivers/nvme/host/nvme.h |  1 -
>  2 files changed, 10 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 2d6c1f4ad7f5c8..05372bec3b7aff 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1721,20 +1721,21 @@ static void nvme_config_discard(struct gendisk *disk, struct nvme_ns *ns)
>  	struct request_queue *queue = disk->queue;
>  	u32 size = queue_logical_block_size(queue);
>  
> -	if (ctrl->dmrsl && ctrl->dmrsl <= nvme_sect_to_lba(ns, UINT_MAX))
> -		ctrl->max_discard_sectors = nvme_lba_to_sect(ns, ctrl->dmrsl);
> +	BUILD_BUG_ON(PAGE_SIZE / sizeof(struct nvme_dsm_range) <
> +			NVME_DSM_MAX_RANGES);
>  
> -	if (ctrl->max_discard_sectors == 0) {
> +	if (ctrl->dmrsl && ctrl->dmrsl <= nvme_sect_to_lba(ns, UINT_MAX)) {
> +		blk_queue_max_discard_sectors(queue,
> +				nvme_lba_to_sect(ns, ctrl->dmrsl));
> +	} else if (ctrl->oncs & NVME_CTRL_ONCS_DSM) {
> +		blk_queue_max_discard_sectors(queue, UINT_MAX);
> +	} else {
>  		blk_queue_max_discard_sectors(queue, 0);
>  		return;
>  	}
>  
> -	BUILD_BUG_ON(PAGE_SIZE / sizeof(struct nvme_dsm_range) <
> -			NVME_DSM_MAX_RANGES);
> -
>  	queue->limits.discard_granularity = size;
>  
> -	blk_queue_max_discard_sectors(queue, ctrl->max_discard_sectors);
>  	blk_queue_max_discard_segments(queue, ctrl->max_discard_segments);
>  
>  	if (ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
> @@ -2870,13 +2871,10 @@ static int nvme_init_non_mdts_limits(struct nvme_ctrl *ctrl)
>  	struct nvme_id_ctrl_nvm *id;
>  	int ret;
>  
> -	if (ctrl->oncs & NVME_CTRL_ONCS_DSM) {
> -		ctrl->max_discard_sectors = UINT_MAX;
> +	if (ctrl->oncs & NVME_CTRL_ONCS_DSM)
>  		ctrl->max_discard_segments = NVME_DSM_MAX_RANGES;
> -	} else {
> -		ctrl->max_discard_sectors = 0;
> +	else
>  		ctrl->max_discard_segments = 0;
> -	}
>  
>  	/*
>  	 * Even though NVMe spec explicitly states that MDTS is not applicable
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index f35647c470afad..d59ed2ba1c37ca 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -296,7 +296,6 @@ struct nvme_ctrl {
>  	u32 max_hw_sectors;
>  	u32 max_segments;
>  	u32 max_integrity_segments;
> -	u32 max_discard_sectors;
>  	u32 max_discard_segments;
>  	u32 max_zeroes_sectors;
>  #ifdef CONFIG_BLK_DEV_ZONED

-- 
Damien Le Moal
Western Digital Research



* Re: [PATCH 3/4] nvme: fix max_discard_sectors calculation
  2023-07-10  3:57   ` Damien Le Moal
@ 2023-07-10  6:39     ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2023-07-10  6:39 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
	linux-block, linux-nvme

On Mon, Jul 10, 2023 at 12:57:56PM +0900, Damien Le Moal wrote:
> On 7/7/23 18:46, Christoph Hellwig wrote:
> > ctrl->max_discard_sectors stores a value that is potentially based on
> > the DMRSL field in Identify Controller, which is in units of LBAs and
> > thus dependent on the Format of a namespace.
> > 
> > Fix this by moving the calculation of max_discard_sectors entirely
> > into nvme_config_discard and replacing the ctrl->max_discard_sectors
> > value with a local variable so that the calculation is always
> 
> I do not see a local variable replacement... Maybe you meant direct calls to
> blk_queue_max_discard_sectors()?

Yeah, I used a local variable first, but then noticed it was
pointless as we can just call blk_queue_max_discard_sectors directly,
and I didn't update the commit log.


* Re: [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors
  2023-07-07  9:46 ` [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors Christoph Hellwig
  2023-07-10  3:53   ` Damien Le Moal
@ 2023-07-10  9:29   ` Sagi Grimberg
  2023-07-10 10:42   ` Ming Lei
  2023-07-10 15:01   ` Keith Busch
  3 siblings, 0 replies; 17+ messages in thread
From: Sagi Grimberg @ 2023-07-10  9:29 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch; +Cc: linux-block, linux-nvme

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


* Re: [PATCH 2/4] nvme: update discard limits in nvme_config_discard
  2023-07-07  9:46 ` [PATCH 2/4] nvme: update discard limits in nvme_config_discard Christoph Hellwig
  2023-07-10  3:54   ` Damien Le Moal
@ 2023-07-10  9:29   ` Sagi Grimberg
  1 sibling, 0 replies; 17+ messages in thread
From: Sagi Grimberg @ 2023-07-10  9:29 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch; +Cc: linux-block, linux-nvme

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


* Re: [PATCH 3/4] nvme: fix max_discard_sectors calculation
  2023-07-07  9:46 ` [PATCH 3/4] nvme: fix max_discard_sectors calculation Christoph Hellwig
  2023-07-10  3:57   ` Damien Le Moal
@ 2023-07-10  9:31   ` Sagi Grimberg
  1 sibling, 0 replies; 17+ messages in thread
From: Sagi Grimberg @ 2023-07-10  9:31 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch; +Cc: linux-block, linux-nvme

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


* Re: [PATCH 4/4] nvme: simplify the max_discard_segments calculation
  2023-07-07  9:46 ` [PATCH 4/4] nvme: simplify the max_discard_segments calculation Christoph Hellwig
@ 2023-07-10  9:32   ` Sagi Grimberg
  0 siblings, 0 replies; 17+ messages in thread
From: Sagi Grimberg @ 2023-07-10  9:32 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch; +Cc: linux-block, linux-nvme

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


* Re: [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors
  2023-07-07  9:46 ` [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors Christoph Hellwig
  2023-07-10  3:53   ` Damien Le Moal
  2023-07-10  9:29   ` Sagi Grimberg
@ 2023-07-10 10:42   ` Ming Lei
  2023-07-12 16:23     ` Christoph Hellwig
  2023-07-10 15:01   ` Keith Busch
  3 siblings, 1 reply; 17+ messages in thread
From: Ming Lei @ 2023-07-10 10:42 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Sagi Grimberg, linux-block, linux-nvme,
	ming.lei

On Fri, Jul 07, 2023 at 11:46:13AM +0200, Christoph Hellwig wrote:
> max_discard_sectors is split into a hardware and a tunable value, but
> blk_queue_max_discard_sectors sets both unconditionally, thus dropping
> any user stored value on a rescan.  Fix blk_queue_max_discard_sectors to
> only set max_discard_sectors if it either wasn't set, or the new hardware
> limit is smaller than the previous user limit.
> 
> Fixes: 0034af036554 ("block: make /sys/block/<dev>/queue/discard_max_bytes writeable")

It is hard to call this a fix, given that discard_max_bytes can still be
changed by the kernel. I'd suggest documenting this behavior in
Documentation/ABI/stable/sysfs-block.

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/blk-settings.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 0046b447268f91..978d2e1fd67a51 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -179,7 +179,9 @@ void blk_queue_max_discard_sectors(struct request_queue *q,
>  		unsigned int max_discard_sectors)
>  {
>  	q->limits.max_hw_discard_sectors = max_discard_sectors;
> -	q->limits.max_discard_sectors = max_discard_sectors;
> +	if (!q->limits.max_discard_sectors ||
> +	     q->limits.max_discard_sectors > max_discard_sectors)
> +		q->limits.max_discard_sectors = max_discard_sectors;
>  }

Userspace may write 0 to discard_max_bytes, and this patch can still
override the user setting.

Thanks,
Ming



* Re: [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors
  2023-07-07  9:46 ` [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors Christoph Hellwig
                     ` (2 preceding siblings ...)
  2023-07-10 10:42   ` Ming Lei
@ 2023-07-10 15:01   ` Keith Busch
  3 siblings, 0 replies; 17+ messages in thread
From: Keith Busch @ 2023-07-10 15:01 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, Sagi Grimberg, linux-block, linux-nvme

On Fri, Jul 07, 2023 at 11:46:13AM +0200, Christoph Hellwig wrote:
>  {
>  	q->limits.max_hw_discard_sectors = max_discard_sectors;
> -	q->limits.max_discard_sectors = max_discard_sectors;
> +	if (!q->limits.max_discard_sectors ||
> +	     q->limits.max_discard_sectors > max_discard_sectors)
> +		q->limits.max_discard_sectors = max_discard_sectors;

Could simplify to min_not_zero().

But this only allows you to make the limit smaller. If the user never
set max_discard_sectors before, and a firmware update allows a larger
max_hw_discard_sectors, the subsequent rescan won't use the new limit.
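
Both points can be checked with a small user-space sketch.  The names
min_not_zero_u, patched_rule and demo are made up for illustration;
min_not_zero_u is a plain-function rendering of the kernel's
min_not_zero() macro for unsigned int, written from memory of its
semantics (zero is treated as "unset"), so treat it as an assumption
rather than the macro itself.

```c
#include <assert.h>

/* Function form of min_not_zero(): zero means "unset" on either side. */
static unsigned int min_not_zero_u(unsigned int x, unsigned int y)
{
	if (x == 0)
		return y;
	if (y == 0)
		return x;
	return x < y ? x : y;
}

/* The open-coded rule from the patch; returns the new effective limit. */
static unsigned int patched_rule(unsigned int cur, unsigned int hw)
{
	if (!cur || cur > hw)
		return hw;
	return cur;
}

static void demo(void)
{
	unsigned int cur;

	/* For a nonzero hardware limit the two formulations agree. */
	assert(patched_rule(0, 8192) == min_not_zero_u(0, 8192));
	assert(patched_rule(1024, 8192) == min_not_zero_u(1024, 8192));
	assert(patched_rule(8192, 4096) == min_not_zero_u(8192, 4096));

	/* They differ when the new hardware limit is 0 (discard disabled). */
	assert(patched_rule(4096, 0) == 0);
	assert(min_not_zero_u(4096, 0) == 4096);

	/* The limit can only shrink: a larger hw limit is not picked up. */
	cur = patched_rule(0, 4096);	/* first scan, nothing set by the user */
	cur = patched_rule(cur, 8192);	/* rescan after the hw limit grew */
	assert(cur == 4096);
}
```

The hw == 0 case suggests a min_not_zero() conversion would not be a
pure simplification, at least in this model.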


* Re: [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors
  2023-07-10 10:42   ` Ming Lei
@ 2023-07-12 16:23     ` Christoph Hellwig
  2023-07-12 16:38       ` Keith Busch
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2023-07-12 16:23 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
	linux-block, linux-nvme

On Mon, Jul 10, 2023 at 06:42:36PM +0800, Ming Lei wrote:
> Userspace may write 0 to discard_max_bytes, and this patch can still
> override the user setting.

True.  Maybe the right thing is to have a user_limit field, and just
look at the min of that and the hw limit everywhere.  These hardware
vs user limits are a pain, and we'll probably need some proper
infrastructure for them :P
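
A rough user-space sketch of that idea, purely as an illustration: the
struct and helper names below are hypothetical and not from any kernel
tree, and using UINT_MAX as the "user never set a limit" sentinel is an
assumption made here so that a user-written 0 stays distinguishable
from "unset".

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical layout: the two limits are stored separately. */
struct discard_limits_model {
	unsigned int hw_limit;		/* reported by the driver on each scan */
	unsigned int user_limit;	/* sysfs value; UINT_MAX means no user cap */
};

static void model_init(struct discard_limits_model *l)
{
	l->hw_limit = 0;
	l->user_limit = UINT_MAX;
}

/* Driver side: a (re)scan never touches the user value. */
static void model_set_hw(struct discard_limits_model *l, unsigned int hw)
{
	l->hw_limit = hw;
}

/* sysfs side: the user value is stored as written, including 0. */
static void model_set_user(struct discard_limits_model *l, unsigned int user)
{
	l->user_limit = user;
}

/* The effective limit is derived on demand as min(user, hw). */
static unsigned int model_effective(const struct discard_limits_model *l)
{
	return l->user_limit < l->hw_limit ? l->user_limit : l->hw_limit;
}

static void demo(void)
{
	struct discard_limits_model l;

	model_init(&l);
	model_set_hw(&l, 4096);
	assert(model_effective(&l) == 4096);

	/* A larger hw limit after a rescan is picked up when no user cap exists. */
	model_set_hw(&l, 8192);
	assert(model_effective(&l) == 8192);

	/* A user-written 0 survives a rescan. */
	model_set_user(&l, 0);
	model_set_hw(&l, 8192);
	assert(model_effective(&l) == 0);

	/* A user cap is honoured again once the hw limit grows back. */
	model_set_user(&l, 1024);
	model_set_hw(&l, 512);
	assert(model_effective(&l) == 512);
	model_set_hw(&l, 8192);
	assert(model_effective(&l) == 1024);
}
```

Keeping the two values separate and deriving the effective limit avoids
having the setter guess whether the stored value came from the user or
from an earlier scan.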



* Re: [PATCH 1/4] block: don't unconditionally set max_discard_sectors in blk_queue_max_discard_sectors
  2023-07-12 16:23     ` Christoph Hellwig
@ 2023-07-12 16:38       ` Keith Busch
  0 siblings, 0 replies; 17+ messages in thread
From: Keith Busch @ 2023-07-12 16:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ming Lei, Jens Axboe, Sagi Grimberg, linux-block, linux-nvme

On Wed, Jul 12, 2023 at 06:23:10PM +0200, Christoph Hellwig wrote:
> On Mon, Jul 10, 2023 at 06:42:36PM +0800, Ming Lei wrote:
> > Userspace may write 0 to discard_max_bytes, and this patch can still
> > override the user setting.
> 
> True.  Maybe the right thing is to have a user_limit field, and just
> look at the min of that and the hw limit everywhere.  These hardware
> vs user limits are a pain, and we'll probably need some proper
> infrastructure for them :P

Yeah, I had to do something very similar for the max_sectors limit too:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=c9c77418a98273fe96835c42666f7427b3883f48
