From: Sean Anderson <seanga2@gmail.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
Miquel Raynal <miquel.raynal@bootlin.com>,
Richard Weinberger <richard@nod.at>,
Vignesh Raghavendra <vigneshr@ti.com>,
linux-mtd@lists.infradead.org,
Zhihao Cheng <chengzhihao1@huawei.com>
Subject: Re: bio segment constraints
Date: Mon, 7 Apr 2025 09:59:16 -0400
Message-ID: <28cd9608-5c62-7acc-ed52-41c9a74e8724@gmail.com>
In-Reply-To: <Z_N5nxLDOBb5NDAM@infradead.org>

On 4/7/25 03:07, Christoph Hellwig wrote:
> On Sun, Apr 06, 2025 at 03:40:04PM -0400, Sean Anderson wrote:
>> Hi all,
>>
>> I'm not really sure what guarantees the block layer makes regarding the
>> segments in a bio as part of a request submitted to a block driver. As
>> far as I can tell this is not documented anywhere. In particular,
>
> First you need to define what segment you mean. We have at least two and
> a half historical uses of the name. One is for each bio_vec attached to
> the bio, either directly as submitted into ->submit_bio for bio based
> drivers (case 1a), or generated by bio_split_to_limits (case 1b), which
> is called for every blk-mq driver before calling into ->queue_rq(s) or
> explicitly called by a few bio based drivers.
>
> The other is the bio_vec synthesized by bio_for_each_segment (case 2).

I'm referring to the bio_vecs you get in ->queue_rq, which I think is the
latter.

>> - Is bv_len aligned to SECTOR_SIZE?
>
> Yes.
>
>> - To logical_sector_size?
>
> Yes.

OK, but...

>> - What if logical_sector_size > PAGE_SIZE?
>
> Still always aligned to logical_sector_size.
>
>> - What about bv_offset?
>
> bv_offset is a memory offset and must only be aligned to the
> dma_alignment limit.
>
>> - Is it possible to have a bio where the total length is a multiple of
>> logical_sector_size, but the data is split across several segments
>> where each segment is a multiple of SECTOR_SIZE?
>
> Yes.

...if this is the case, then wouldn't bv_len for some of those segments
fail to be a multiple of logical_sector_size?

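To make the apparent contradiction concrete, here is a minimal userspace
sketch. It is not kernel code: struct seg and both helpers are made up for
illustration and only model bv_len.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SECTOR_SIZE 512u

/* Hypothetical stand-in for the length part of a bio_vec (models bv_len). */
struct seg {
	unsigned int len;
};

/* Sum of all segment lengths, i.e. the total size of the modeled bio. */
static unsigned int total_len(const struct seg *segs, size_t n)
{
	unsigned int sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += segs[i].len;
	return sum;
}

/* True if every segment length is a multiple of align (a power of two). */
static bool all_len_aligned(const struct seg *segs, size_t n,
			    unsigned int align)
{
	assert(align && (align & (align - 1)) == 0);
	for (size_t i = 0; i < n; i++)
		if (segs[i].len & (align - 1))
			return false;
	return true;
}
```

With a logical_sector_size of 4096 and two segments of 1536 and 2560 bytes,
total_len() is 4096 and every length is a multiple of SECTOR_SIZE, but
all_len_aligned(segs, 2, 4096) is false. That is the case I am asking about.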
>> - Is it possible to have segments not even aligned to SECTOR_SIZE?
>
> No.
>
>> - Can I somehow request to only get segments with bv_len aligned to
>> logical_sector_size?
>
> For drivers that use bio_split_to_limits implicitly or explicitly you can
> do that by setting the right seg_boundary_mask.

Is that the right knob? It operates on the physical address, so it looked
more like something for broken DMA engines. For example (if I recall correctly)
MMC SDMA can't cross a page boundary, so you could use seg_boundary_mask to
enforce that.

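To illustrate what I mean by "operates on the physical address", here is a
small userspace model of the masking arithmetic as I understand it. The
function name is made up and this is not code from the block layer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model only: how many bytes starting at physical address paddr
 * fit before crossing a boundary described by mask (mask = boundary size
 * minus one, e.g. 0xfff for 4 KiB), capped at len.
 */
static uint64_t bytes_until_boundary(uint64_t paddr, uint64_t mask,
				     uint64_t len)
{
	assert(((mask + 1) & mask) == 0);	/* mask must be 2^n - 1 */
	assert(len > 0);

	/* Bytes left before the boundary, minus one to avoid overflow. */
	uint64_t room_minus_1 = mask - (paddr & mask);

	return (len - 1 <= room_minus_1) ? len : room_minus_1 + 1;
}
```

With mask 0xfff, a 512-byte transfer starting at 0x2000 comes back whole,
but the same transfer starting at 0x1e04 is cut to 508 bytes. The cut point
depends only on where the buffer sits in memory, not on the sector size.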
>> make some big assumptions (which might be bugs?) For example, in
>> drivers/mtd/mtd_blkdevs.c, do_blktrans_request looks like:
>
>> - There is only one bio in a request. This one is a bit of a soft
>> assumption since we should only flush the pages in the bio and not the
>> whole request otherwise.
>
> It always operates on the first bio in the request and then uses
> blk_update_request to move the context past that. It is an old
> and somewhat arcane way to write drivers, but should work. The
> rq_for_each_segment loops to call flush_dcache_page look horribly
> wrong for this model, though.
>
>> - The data is in lowmem OR bv_offset + bv_len <= PAGE_SIZE. kmap() only
>> maps a single page, so if we go past one page we end up in adjacent
>> kmapped pages.
>
> Yes, this looks broken.
>
>> Am I missing something here? Handling highmem seems like a persistent
>> issue. E.g. drivers/mtd/ubi/block.c doesn't even bother doing a kmap.
>> Should both of these have BLK_FEAT_BOUNCE_HIGH?
>
> BLK_FEAT_BOUNCE_HIGH needs to go away sooner rather than later.
>
> In the short run the best fix would be to synthesize a
> bio_for_each_segment-like bio_vec that stays inside a single page
> (using bio_iter_iovec) at the top of do_blktrans_request and use
> that for all references to the data.

OK, but if you have to stay inside a single page couldn't you end up
with a sector spanning a page boundary due to only being aligned to
dma_alignment? Or maybe we set seg_boundary_mask to PAGE_SIZE - 1 to
enforce that?

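A rough userspace sketch of the situation I am worried about (not kernel
code; the function name is made up, and PAGE_SIZE is assumed to be 4096):

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE   4096u
#define SECTOR_SIZE 512u

/*
 * Userspace model only: walk a buffer that starts `offset` bytes into its
 * first page and is `len` bytes long, in single-page chunks (roughly what
 * a per-page iterator would hand out). Returns true if any chunk length is
 * not a multiple of SECTOR_SIZE, i.e. some sector straddles a page
 * boundary.
 */
static bool sector_straddles_page(unsigned int offset, unsigned int len)
{
	assert(len % SECTOR_SIZE == 0);

	while (len) {
		unsigned int in_page = offset % PAGE_SIZE;
		unsigned int chunk = PAGE_SIZE - in_page;

		if (chunk > len)
			chunk = len;
		if (chunk % SECTOR_SIZE)
			return true;
		offset += chunk;
		len -= chunk;
	}
	return false;
}
```

A 4096-byte buffer at offset 512 splits into 3584 + 512 bytes, so no sector
straddles a page. The same buffer at offset 4, which a 4-byte dma_alignment
would allow, splits into 4092 + 4 bytes and the eighth sector crosses the
page boundary.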
--Sean