From: Ming Lei <ming.lei@redhat.com>
To: Daniel Gomez <da.gomez@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Yi Zhang <yi.zhang@redhat.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	John Garry <john.g.garry@oracle.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH V2] block: make segment size limit workable for > 4K PAGE_SIZE
Date: Fri, 14 Feb 2025 19:19:45 +0800
Message-ID: <Z68m0X9o3Mw_oPsU@fedora>
In-Reply-To: <ifgg2za26r6frfco4cky6wxywgdj3l7r6hx6sbqarizqltshfx@kccnmlr3x7nq>

On Fri, Feb 14, 2025 at 10:38:36AM +0100, Daniel Gomez wrote:
> On Mon, Feb 10, 2025 at 05:03:19PM +0100, Ming Lei wrote:
> > PAGE_SIZE is used in some block device queue limit checks, which is
> > fragile and wrong:
> > 
> > - queue limits are read from hardware and are usually a read-only
> > hardware property
> > 
> > - PAGE_SIZE is a config option that can be changed at build time.
> > 
> > In the RH lab, the max segment size of some mmc cards was found to be
> > less than 64K, so such cards can't work with a 64K PAGE_SIZE.
> > 
> > Fix this by using BLK_MIN_SEGMENT_SIZE in the code that deals with
> > queue limits and that checks whether a bio needs no split. Define
> > BLK_MIN_SEGMENT_SIZE as 4K (the minimum PAGE_SIZE).
> > 
> > Cc: Yi Zhang <yi.zhang@redhat.com>
> > Cc: Luis Chamberlain <mcgrof@kernel.org>
> > Cc: John Garry <john.g.garry@oracle.com>
> > Cc: Bart Van Assche <bvanassche@acm.org>
> > Cc: Keith Busch <kbusch@kernel.org>
> > Link: https://lore.kernel.org/linux-block/20250102015620.500754-1-ming.lei@redhat.com/
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> > V2:
> > 	- cover bio_split_rw_at()
> > 	- add BLK_MIN_SEGMENT_SIZE
> > 
> >  block/blk-merge.c      | 2 +-
> >  block/blk-settings.c   | 6 +++---
> >  block/blk.h            | 2 +-
> >  include/linux/blkdev.h | 1 +
> >  4 files changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > index 15cd231d560c..b55c52a42303 100644
> > --- a/block/blk-merge.c
> > +++ b/block/blk-merge.c
> > @@ -329,7 +329,7 @@ int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim,
> >  
> >  		if (nsegs < lim->max_segments &&
> >  		    bytes + bv.bv_len <= max_bytes &&
> > -		    bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
> > +		    bv.bv_offset + bv.bv_len <= BLK_MIN_SEGMENT_SIZE) {
> >  			nsegs++;
> >  			bytes += bv.bv_len;
> >  		} else {
> > diff --git a/block/blk-settings.c b/block/blk-settings.c
> > index c44dadc35e1e..539a64ad7989 100644
> > --- a/block/blk-settings.c
> > +++ b/block/blk-settings.c
> > @@ -303,7 +303,7 @@ int blk_validate_limits(struct queue_limits *lim)
> >  	max_hw_sectors = min_not_zero(lim->max_hw_sectors,
> >  				lim->max_dev_sectors);
> >  	if (lim->max_user_sectors) {
> > -		if (lim->max_user_sectors < PAGE_SIZE / SECTOR_SIZE)
> > +		if (lim->max_user_sectors < BLK_MIN_SEGMENT_SIZE / SECTOR_SIZE)
> >  			return -EINVAL;
> >  		lim->max_sectors = min(max_hw_sectors, lim->max_user_sectors);
> >  	} else if (lim->io_opt > (BLK_DEF_MAX_SECTORS_CAP << SECTOR_SHIFT)) {
> > @@ -341,7 +341,7 @@ int blk_validate_limits(struct queue_limits *lim)
> >  	 */
> >  	if (!lim->seg_boundary_mask)
> >  		lim->seg_boundary_mask = BLK_SEG_BOUNDARY_MASK;
> > -	if (WARN_ON_ONCE(lim->seg_boundary_mask < PAGE_SIZE - 1))
> > +	if (WARN_ON_ONCE(lim->seg_boundary_mask < BLK_MIN_SEGMENT_SIZE - 1))
> >  		return -EINVAL;
> >  
> >  	/*
> > @@ -362,7 +362,7 @@ int blk_validate_limits(struct queue_limits *lim)
> >  		 */
> >  		if (!lim->max_segment_size)
> >  			lim->max_segment_size = BLK_MAX_SEGMENT_SIZE;
> > -		if (WARN_ON_ONCE(lim->max_segment_size < PAGE_SIZE))
> > +		if (WARN_ON_ONCE(lim->max_segment_size < BLK_MIN_SEGMENT_SIZE))
> >  			return -EINVAL;
> >  	}
> >  
> > diff --git a/block/blk.h b/block/blk.h
> > index 90fa5f28ccab..cbfa8a3d4e42 100644
> > --- a/block/blk.h
> > +++ b/block/blk.h
> > @@ -359,7 +359,7 @@ static inline bool bio_may_need_split(struct bio *bio,
> >  		const struct queue_limits *lim)
> >  {
> >  	return lim->chunk_sectors || bio->bi_vcnt != 1 ||
> > -		bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset > PAGE_SIZE;
> > +		bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset > BLK_MIN_SEGMENT_SIZE;
> >  }
> >  
> >  /**
> > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> > index 248416ecd01c..32188af4051e 100644
> > --- a/include/linux/blkdev.h
> > +++ b/include/linux/blkdev.h
> > @@ -1163,6 +1163,7 @@ static inline bool bdev_is_partition(struct block_device *bdev)
> >  enum blk_default_limits {
> >  	BLK_MAX_SEGMENTS	= 128,
> >  	BLK_SAFE_MAX_SECTORS	= 255,
> > +	BLK_MIN_SEGMENT_SIZE	= 4096, /* min(PAGE_SIZE) */
> 
> I think it would be useful to expose this value to the queue_limits and

Can you share what it would be useful for?

> sysfs (and remove it from here). We can default it to PAGE_SIZE (as it has
> always been) and allow it to be overridden when the block driver initializes the
Which device driver needs to initialize it?

> limits. This would make it visible that we are no longer in a PAGE_SIZE to
> max_segment_size 'world' but in a min_segment_size to max_segment_size one.
> Unless there's a reason not to grow the queue_limits data struct?

Unless you can point to real hardware that needs this, I don't think
a min_segment_size limit is useful.
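
For reference, here is a minimal userspace sketch of the two kinds of
checks the patch touches. It is not kernel code: struct limits_model and
both helper names are hypothetical stand-ins, and only the value 4096 for
BLK_MIN_SEGMENT_SIZE comes from the patch itself. It models the
max_segment_size validation in blk_validate_limits() and the single-bvec
part of bio_may_need_split(), with the compared constant passed in so the
old (PAGE_SIZE) and new (BLK_MIN_SEGMENT_SIZE) behaviour can be compared.

```c
#include <stdbool.h>

/* Value taken from the patch: the smallest PAGE_SIZE the kernel supports. */
#define BLK_MIN_SEGMENT_SIZE 4096u

/* Hypothetical stand-in for the one queue_limits field used below. */
struct limits_model {
	unsigned int max_segment_size;	/* bytes, as reported by the device */
};

/*
 * Models the max_segment_size check in blk_validate_limits().
 * 'floor' is PAGE_SIZE before the patch and BLK_MIN_SEGMENT_SIZE after it.
 * Returns true when the limit would be accepted.
 */
static bool segment_limit_valid(const struct limits_model *lim,
				unsigned int floor)
{
	return lim->max_segment_size >= floor;
}

/*
 * Models only the single-bvec part of bio_may_need_split(): a bvec whose
 * end (offset + length) goes past 'threshold' cannot take the no-split
 * fast path. The chunk_sectors and bi_vcnt conditions are left out.
 */
static bool bvec_may_need_split(unsigned int bv_offset, unsigned int bv_len,
				unsigned int threshold)
{
	return bv_offset + bv_len > threshold;
}
```

With an assumed device limit of 32K (chosen only because it is below 64K),
segment_limit_valid() rejects the device when the floor is a 64K PAGE_SIZE
and accepts it when the floor is BLK_MIN_SEGMENT_SIZE. On the split side,
a 4K bvec at offset 512 stays on the fast path under a 64K threshold but
is sent to the split path under the 4K threshold, which is the safe
direction for devices with small segment limits.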


Thanks,
Ming


