linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@kernel.org>
To: Ming Lei <ming.lei@redhat.com>, Christoph Hellwig <hch@lst.de>,
	axboe@kernel.dk
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
	Marco Patalano <mpatalan@redhat.com>,
	Ewan Milne <emilne@redhat.com>,
	linux-raid@vger.kernel.org
Subject: [PATCH for-6.10-rc1] block: fix blk_validate_limits() to properly handle stacked devices
Date: Thu, 23 May 2024 11:38:21 -0400	[thread overview]
Message-ID: <Zk9i7V2GRoHxBPRu@kernel.org> (raw)
In-Reply-To: <Zk6haNVa5JXxlOf1@fedora>

On Thu, May 23, 2024 at 09:52:40AM +0800, Ming Lei wrote:
> On Wed, May 22, 2024 at 12:48:59PM -0400, Mike Snitzer wrote:
> > On Wed, May 22, 2024 at 04:24:58PM +0200, Christoph Hellwig wrote:
> > > On Tue, May 21, 2024 at 10:51:17PM -0400, Mike Snitzer wrote:
> > > > Otherwise, blk_validate_limits() will throw-away the max_sectors that
> > > > was stacked from underlying device(s). In doing so it can set a
> > > > max_sectors limit that violates underlying device limits.
> > > 
> > > Hmm, yes it sort of is "throwing the limit away", but it really
> > > recalculates it from max_hw_sectors, max_dev_sectors and user_max_sectors.
> > 
> > Yes, but it needs to do that recalculation at each level of a stacked
> > device.  And then we need to combine them via blk_stack_limits() -- as
> > is done with the various limits stacking loops in
> > drivers/md/dm-table.c:dm_calculate_queue_limits
> 
> This way looks one stacking specific requirement, just wondering why not
> put the logic into blk_validate_limits() by passing 'stacking' parameter?
> Then raid can benefit from it too.

Sure, we could elevate it to blk_validate_limits (and callers) but
adding a 'stacking' parameter is more intrusive on an API level.

Best to just update blk_set_stacking_limits() to set a new 'stacking'
flag in struct queue_limits, and update blk_stack_limits() to stack
that flag up.

I've verified this commit to work and have staged it in linux-next via
linux-dm.git's 'for-next', see:

https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=cedc03d697ff255dd5b600146521434e2e921815

Jens (and obviously: Christoph, Ming and others), I'm happy to send
this to Linus tomorrow morning if you could please provide your
Reviewed-by or Acked-by.  I'd prefer to keep the intermediate DM fix
just to "show the work and testing".

Thanks,
Mike

From cedc03d697ff255dd5b600146521434e2e921815 Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer@kernel.org>
Date: Thu, 23 May 2024 11:19:29 -0400
Subject: [PATCH] block: fix blk_validate_limits() to properly handle stacked devices

For the benefit of other stacking block drivers, e.g. MD, elevate the
DM fix from commit 0ead1c8e8e48 ("dm: retain stacked max_sectors when
setting queue_limits") to block core.

Switches to using a bool bitfield in struct queue_limits (for old
member 'zoned' and new member 'stacking') to not grow that struct.

Suggested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 block/blk-settings.c   | 30 ++++++++++++++++++++++++++++--
 drivers/md/dm-table.c  |  8 --------
 include/linux/blkdev.h |  3 ++-
 3 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index cdbaef159c4b..24c799072f6c 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -35,6 +35,7 @@ EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
 void blk_set_stacking_limits(struct queue_limits *lim)
 {
 	memset(lim, 0, sizeof(*lim));
+	lim->stacking = true;
 	lim->logical_block_size = SECTOR_SIZE;
 	lim->physical_block_size = SECTOR_SIZE;
 	lim->io_min = SECTOR_SIZE;
@@ -103,7 +104,7 @@ static int blk_validate_zoned_limits(struct queue_limits *lim)
  */
 static int blk_validate_limits(struct queue_limits *lim)
 {
-	unsigned int max_hw_sectors;
+	unsigned int max_hw_sectors, stacked_max_hw_sectors = 0;
 
 	/*
 	 * Unless otherwise specified, default to 512 byte logical blocks and a
@@ -121,6 +122,23 @@ static int blk_validate_limits(struct queue_limits *lim)
 	if (lim->io_min < lim->physical_block_size)
 		lim->io_min = lim->physical_block_size;
 
+
+	/*
+	 * For stacked block devices, don't throw-away stacked max_sectors.
+	 */
+	if (lim->stacking && lim->max_hw_sectors) {
+		/*
+		 * lim->max_sectors and lim->max_hw_sectors were already
+		 * validated, relative underlying device(s) in this stacked
+		 * block device.
+		 */
+		stacked_max_hw_sectors = lim->max_hw_sectors;
+		/*
+		 * Impose stacked max_sectors as upper-bound for code below.
+		 */
+		lim->max_hw_sectors = lim->max_sectors;
+	}
+
 	/*
 	 * max_hw_sectors has a somewhat weird default for historical reason,
 	 * but driver really should set their own instead of relying on this
@@ -155,6 +173,11 @@ static int blk_validate_limits(struct queue_limits *lim)
 	lim->max_sectors = round_down(lim->max_sectors,
 			lim->logical_block_size >> SECTOR_SHIFT);
 
+	if (stacked_max_hw_sectors) {
+		/* Restore previously validated stacked max_hw_sectors */
+		lim->max_hw_sectors = max_hw_sectors;
+	}
+
 	/*
 	 * Random default for the maximum number of segments.  Driver should not
 	 * rely on this and set their own.
@@ -881,11 +904,14 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 						   b->max_secure_erase_sectors);
 	t->zone_write_granularity = max(t->zone_write_granularity,
 					b->zone_write_granularity);
-	t->zoned = max(t->zoned, b->zoned);
+	t->zoned |= b->zoned;
 	if (!t->zoned) {
 		t->zone_write_granularity = 0;
 		t->max_zone_append_sectors = 0;
 	}
+
+	t->stacking |= b->stacking;
+
 	return ret;
 }
 EXPORT_SYMBOL(blk_stack_limits);
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 6463b4afeaa4..88114719fe18 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1961,7 +1961,6 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 			      struct queue_limits *limits)
 {
 	bool wc = false, fua = false;
-	unsigned int max_hw_sectors;
 	int r;
 
 	if (dm_table_supports_nowait(t))
@@ -1982,16 +1981,9 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	if (!dm_table_supports_secure_erase(t))
 		limits->max_secure_erase_sectors = 0;
 
-	/* Don't allow queue_limits_set() to throw-away stacked max_sectors */
-	max_hw_sectors = limits->max_hw_sectors;
-	limits->max_hw_sectors = limits->max_sectors;
 	r = queue_limits_set(q, limits);
 	if (r)
 		return r;
-	/* Restore stacked max_hw_sectors */
-	mutex_lock(&q->limits_lock);
-	limits->max_hw_sectors = max_hw_sectors;
-	mutex_unlock(&q->limits_lock);
 
 	if (dm_table_supports_flush(t, (1UL << QUEUE_FLAG_WC))) {
 		wc = true;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c3e8f7cf96be..ad1b00e5cc3e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -307,7 +307,8 @@ struct queue_limits {
 	unsigned char		misaligned;
 	unsigned char		discard_misaligned;
 	unsigned char		raid_partial_stripes_expensive;
-	bool			zoned;
+	bool			zoned:1;
+	bool			stacking:1;
 	unsigned int		max_open_zones;
 	unsigned int		max_active_zones;
 
-- 
2.44.0


  reply	other threads:[~2024-05-23 15:38 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20240522025117.75568-1-snitzer@kernel.org>
     [not found] ` <20240522142458.GB7502@lst.de>
     [not found]   ` <Zk4h-6f2M0XmraJV@kernel.org>
2024-05-23  1:52     ` dm: retain stacked max_sectors when setting queue_limits Ming Lei
2024-05-23 15:38       ` Mike Snitzer [this message]
2024-05-23 15:44         ` [PATCH for-6.10-rc1] block: fix blk_validate_limits() to properly handle stacked devices Christoph Hellwig
2024-05-23 15:48           ` Mike Snitzer
2024-05-23 15:52             ` Christoph Hellwig
2024-05-23 16:38               ` Mike Snitzer
2024-05-23 17:05                 ` Christoph Hellwig
2024-05-23 17:14                   ` Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zk9i7V2GRoHxBPRu@kernel.org \
    --to=snitzer@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=dm-devel@lists.linux.dev \
    --cc=emilne@redhat.com \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=mpatalan@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).