From: Hannes Reinecke <hare@suse.de>
To: Sean Anderson <seanga2@gmail.com>, Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org
Cc: Miquel Raynal <miquel.raynal@bootlin.com>,
Richard Weinberger <richard@nod.at>,
Vignesh Raghavendra <vigneshr@ti.com>,
linux-mtd@lists.infradead.org,
Zhihao Cheng <chengzhihao1@huawei.com>
Subject: Re: bio segment constraints
Date: Mon, 7 Apr 2025 09:10:20 +0200
Message-ID: <8a232716-74f8-4bba-a514-d0f766492344@suse.de>
In-Reply-To: <8dfd97ac-59e7-ae69-238a-85b7a2dae4f1@gmail.com>
On 4/6/25 21:40, Sean Anderson wrote:
> Hi all,
>
> I'm not really sure what guarantees the block layer makes regarding the
> segments in a bio as part of a request submitted to a block driver. As
> far as I can tell this is not documented anywhere. In particular,
>
> - Is bv_len aligned to SECTOR_SIZE?
The block layer always counts in 512-byte sectors (SECTOR_SIZE), so yes.
> - To logical_sector_size?
Not necessarily. Bvecs are a consecutive list of byte ranges which
make up the data portion of a bio.
The logical sector size is a property of the request queue, which is
applied when a request is formed from one or several bios.
For the request, the overall length needs to be a multiple of the logical
sector size, but the individual bios need not be.
> - What if logical_sector_size > PAGE_SIZE?
See above.
> - What about bv_offset?
Same story. The eventual request needs to ensure that the offset
and the length are aligned to the logical block size, but the individual
bios might not be.
> - Is it possible to have a bio where the total length is a multiple of
> logical_sector_size, but the data is split across several segments
> where each segment is a multiple of SECTOR_SIZE?
Sure.
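
To make that case concrete, here is a minimal userspace sketch. 'struct seg' is a hypothetical stand-in for the kernel's 'struct bio_vec' and is not kernel code; the 4096-byte logical sector size is likewise just an assumed example value. It shows three segments that are each a multiple of SECTOR_SIZE while only their sum is a multiple of the logical sector size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SECTOR_SIZE 512u /* block-layer sector unit */

/* Hypothetical userspace stand-in for the kernel's struct bio_vec. */
struct seg {
	unsigned int len;    /* plays the role of bv_len */
	unsigned int offset; /* plays the role of bv_offset */
};

/* Sum of all segment lengths, i.e. the data length of the whole bio. */
static unsigned int total_len(const struct seg *segs, size_t n)
{
	unsigned int sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += segs[i].len;
	return sum;
}

/* True if every individual segment length is a multiple of 'align'. */
static bool each_seg_aligned(const struct seg *segs, size_t n,
			     unsigned int align)
{
	assert(align != 0);
	for (size_t i = 0; i < n; i++)
		if (segs[i].len % align)
			return false;
	return true;
}
```

With segments of 512, 1536 and 2048 bytes, total_len() gives 4096, each_seg_aligned() holds for SECTOR_SIZE, and it fails for 4096.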
> - Is it possible to have segments not even aligned to SECTOR_SIZE?
Nope.
> - Can I somehow request to only get segments with bv_len aligned to
> logical_sector_size? Or do I need to do my own coalescing and bounce
> buffering for that?
>
The driver surely can. You should be able to set 'max_segment_size' to
the logical block size, and that should give you what you want.
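
As a rough illustration of what a segment size cap does, the following userspace sketch splits one contiguous byte range into chunks no larger than a given maximum. split_range() is a hypothetical helper, not a kernel function, and it models only the size cap; it says nothing about how the kernel validates or applies the real limit, or about memory alignment.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split a contiguous range of 'len' bytes into chunks of at most
 * 'max_seg' bytes and store the chunk lengths in 'out'.
 * Returns the number of chunks, or 0 if 'len' is 0 or 'out' is too small.
 */
static size_t split_range(unsigned int len, unsigned int max_seg,
			  unsigned int *out, size_t out_cap)
{
	size_t n = 0;

	assert(max_seg != 0);
	while (len) {
		unsigned int chunk = len < max_seg ? len : max_seg;

		if (n == out_cap)
			return 0; /* caller's array cannot hold all chunks */
		out[n++] = chunk;
		len -= chunk;
	}
	return n;
}
```

In this model a 12288-byte range with a 4096-byte cap yields three chunks of 4096 bytes, while a 10240-byte range yields 4096, 4096 and 2048. So every chunk is a whole logical block only when the contiguous range itself is a multiple of the block size.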
> I've been reading some drivers (as well as stuff in block/) to try and
> figure things out, but it's hard to figure out all the places where
> constraints are enforced. In particular, I've read several drivers that
> make some big assumptions (which might be bugs?) For example, in
> drivers/mtd/mtd_blkdevs.c, do_blktrans_request looks like:
>
In general, the block layer has two major data items, bios and requests.
'struct bio' is the central structure for any 'upper' layers to submit
data (via the 'submit_bio()' function), and 'struct request' is the
central structure for drivers to fetch data for submission to the
hardware (via the 'queue_rq()' callback in 'struct blk_mq_ops').
And the task of the block layer is to convert 'struct bio' into
'struct request'.
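
The relationship can be sketched in userspace as follows. 'struct bio_model' and 'struct request_model' are hypothetical, heavily simplified stand-ins for the real structures, and request_back_merge() models only the simplest case, where a bio starts exactly where the request currently ends. The point is that the request accumulates the total length while the individual bios keep their own sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors */

/* Hypothetical, simplified stand-in for struct bio. */
struct bio_model {
	uint64_t sector;   /* start position in 512-byte sectors */
	uint32_t nr_bytes; /* data length of this bio */
	struct bio_model *next;
};

/* Hypothetical, simplified stand-in for struct request. */
struct request_model {
	uint64_t sector;   /* start position of the whole request */
	uint32_t nr_bytes; /* total length over all bios */
	struct bio_model *head, *tail;
};

/* Start a new request from a single bio. */
static void request_init(struct request_model *rq, struct bio_model *bio)
{
	bio->next = NULL;
	rq->sector = bio->sector;
	rq->nr_bytes = bio->nr_bytes;
	rq->head = rq->tail = bio;
}

/* Append 'bio' if it starts exactly where the request currently ends. */
static bool request_back_merge(struct request_model *rq,
			       struct bio_model *bio)
{
	uint64_t end;

	assert(rq->tail != NULL);
	end = rq->sector + (rq->nr_bytes >> SECTOR_SHIFT);
	if (bio->sector != end)
		return false;
	bio->next = NULL;
	rq->tail->next = bio;
	rq->tail = bio;
	rq->nr_bytes += bio->nr_bytes;
	return true;
}
```

For example, a 512-byte bio at sector 8 followed by a 3584-byte bio at sector 9 merges into one 4096-byte request made of two bios, neither of which is 4096 bytes long on its own.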
[ .. ]
> For context, tr->blkshift is either 512 or 4096, depending on the
> backend. From what I can tell, this code assumes the following:
>
MTD is probably not a good example, as MTD has its own set of
limitations which might result in certain shortcuts being taken.
> - There is only one bio in a request. This one is a bit of a soft
> assumption since we should only flush the pages in the bio and not the
> whole request otherwise.
> - There is only one segment in a bio. This one could be reasonable if
> max_segments was set to 1, but it's not as far as I can tell. So I
> guess we just go off the end of the bio if there's a second segment?
> - The data is in lowmem OR bv_offset + bv_len <= PAGE_SIZE. kmap() only
> maps a single page, so if we go past one page we end up in adjacent
> kmapped pages.
>
Well, that code _does_ look suspicious. It really should be converted
to using the iov iterators.
But then again, it _might_ be okay if there are underlying MTD
restrictions which would devolve into MTD only having a single bvec.
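
For comparison, a driver that makes none of the three assumptions walks every segment of every bio in the request; in the kernel, helpers such as rq_for_each_segment() exist for this. The userspace sketch below only models that loop structure. 'struct vec', 'struct bio_m' and copy_request() are hypothetical names, and the plain data pointer stands in for the page mapping a real driver would have to do per segment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one segment (bio_vec) with mapped data. */
struct vec {
	const unsigned char *data;
	size_t len;
};

/* Hypothetical stand-in for a bio: an array of segments plus a link. */
struct bio_m {
	const struct vec *vecs;
	size_t nr_vecs;
	const struct bio_m *next;
};

/*
 * Copy every segment of every bio in the chain into 'dst'.
 * Returns the number of bytes copied, or 0 if 'dst' is too small
 * (in which case 'dst' may hold a partial copy).
 */
static size_t copy_request(const struct bio_m *bio, unsigned char *dst,
			   size_t cap)
{
	size_t done = 0;

	assert(dst != NULL);
	for (; bio; bio = bio->next) {
		for (size_t i = 0; i < bio->nr_vecs; i++) {
			const struct vec *v = &bio->vecs[i];

			if (v->len > cap - done)
				return 0;
			memcpy(dst + done, v->data, v->len);
			done += v->len;
		}
	}
	return done;
}
```

The outer loop covers the "more than one bio" case and the inner loop covers the "more than one segment" case, so nothing past the first segment is silently skipped.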
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich