From: Ming Lei <ming.lei@redhat.com>
To: Daniel Gomez <da.gomez@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Paul Bunyan <pbunyan@redhat.com>,
Yi Zhang <yi.zhang@redhat.com>,
Luis Chamberlain <mcgrof@kernel.org>,
John Garry <john.g.garry@oracle.com>,
Bart Van Assche <bvanassche@acm.org>,
Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH V3] block: make segment size limit workable for > 4K PAGE_SIZE
Date: Tue, 18 Feb 2025 19:42:13 +0800 [thread overview]
Message-ID: <Z7RyFb6SkxDGzlY_@fedora> (raw)
In-Reply-To: <rthanqn7zv66456urv23nh36l7rhdav2ubldz4e3r5e52ow5a5@dicfrbdak5ia>
On Tue, Feb 18, 2025 at 11:42:43AM +0100, Daniel Gomez wrote:
> On Thu, Feb 13, 2025 at 08:00:40PM +0100, Ming Lei wrote:
> > PAGE_SIZE is applied in validating block device queue limits, this way is
> > very fragile and is wrong:
> >
> > - queue limits are read from hardware, which is often one readonly hardware
> > property
> >
> > - PAGE_SIZE is one config option which can be changed during build time.
> >
> > In RH lab, it has been found that max segment size of some mmc card is
> > less than 64K, then this kind of card can't be probed successfully when
> > same kernel is re-built with 64K PAGE_SIZE.
> >
> > Fix this issue by using BLK_MIN_SEGMENT_SIZE in related code for dealing
> > with queue limits and checking if bio needn't split as a hint. Define
> > BLK_MIN_SEGMENT_SIZE as 4K(minimized PAGE_SIZE).
> >
> > The following commits are depended for backporting:
> >
> > commit 6aeb4f836480 ("block: remove bio_add_pc_page")
> > commit 02ee5d69e3ba ("block: remove blk_rq_bio_prep")
> > commit b7175e24d6ac ("block: add a dma mapping iterator")
> >
> > Cc: Paul Bunyan <pbunyan@redhat.com>
> > Cc: Yi Zhang <yi.zhang@redhat.com>
> > Cc: Luis Chamberlain <mcgrof@kernel.org>
> > Cc: John Garry <john.g.garry@oracle.com>
> > Cc: Bart Van Assche <bvanassche@acm.org>
> > Cc: Keith Busch <kbusch@kernel.org>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> > V3:
> > - rephrase commit log & fix patch style(Christoph)
> > - more comment log(Christoph)
> > V2:
> > - cover bio_split_rw_at()
> > - add BLK_MIN_SEGMENT_SIZE
> >
> > block/blk-merge.c | 2 +-
> > block/blk-settings.c | 6 +++---
> > block/blk.h | 8 ++++++--
> > include/linux/blkdev.h | 2 ++
> > 4 files changed, 12 insertions(+), 6 deletions(-)
> >
> > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > index 15cd231d560c..b55c52a42303 100644
> > --- a/block/blk-merge.c
> > +++ b/block/blk-merge.c
> > @@ -329,7 +329,7 @@ int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim,
> >
> > if (nsegs < lim->max_segments &&
> > bytes + bv.bv_len <= max_bytes &&
> > - bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
> > + bv.bv_offset + bv.bv_len <= BLK_MIN_SEGMENT_SIZE) {
>
> In which cases this "new" condition (for systems where PAGE_SIZE >
> BLK_MIN_SEGMENT_SIZE) is going to be true? In my test case below, is always
Yes.
> false, so it defaults to the else path. And I think that is going to be the
> "normal" case in these systems, is that right?
It depends on block size in your workload.
>
> Doing a 'quick' test using next-20250213 on a 16k PAGE_SIZE system with the NVMe
> driver and a 4k lbs disk + ~1h fio sequential writes, I get results indicating
> a write performance degradation of ~0.8%. This is due to the new loop condition
> doing 4k steps rather than PS. I guess it's going to be slighly worse the larger
> the PAGE_SIZE the system is, and bio? So, why not decreasing the minimum segment
No, just one extra bvec_split_segs() is called once for any >4k page size
Probably the opposite, effect on 64K page size could be smaller since
bio may have less bvec on same workload.
> size for the cases it's actually needed rather than making it now the default?
>
> I've measured bio_split_rw_at latency in the above test with the following
> results:
Fine, I will add one min segment size for covering this case since you care the
little perf drop on >4k PAGE_SIZE.
Thanks,
Ming
prev parent reply other threads:[~2025-02-18 11:42 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-13 12:00 [PATCH V3] block: make segment size limit workable for > 4K PAGE_SIZE Ming Lei
2025-02-14 17:37 ` Bart Van Assche
2025-02-18 10:42 ` Daniel Gomez
2025-02-18 11:42 ` Ming Lei [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z7RyFb6SkxDGzlY_@fedora \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=da.gomez@kernel.org \
--cc=john.g.garry@oracle.com \
--cc=kbusch@kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=pbunyan@redhat.com \
--cc=yi.zhang@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox