From: Benjamin Marzinski <bmarzins@redhat.com>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>,
Alasdair Kergon <agk@redhat.com>, DMML <dm-devel@lists.linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
Mike Snitzer <snitzer@redhat.com>, Christoph Hellwig <hch@lst.de>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Date: Tue, 18 Nov 2025 15:36:13 -0500
Message-ID: <aRzYvYCLW66Zhcda@redhat.com>
In-Reply-To: <faafd90d-e41c-dcfc-cc25-7f29ff4f958c@redhat.com>

On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
>
>
> On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
>
> > On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > >
> > >
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > >
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > >
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > >
> > > Hi
> > >
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > users would also benefit from the fix.
> >
> > This looks to me like it should accomplish the same thing as
> > Uladzislau's patch. But I think there could still be problems with other
> > dm-bufio users, for devices where the blocksize is larger than 4k.
> >
> > In dm_bufio_client_create() I think we want to make sure that block_size
> > is a multiple of bdev_logical_block_size(bdev), instead of 512b.
>
> I could add WARN_ON(block_size < bdev_logical_block_size(bdev)) to
> dm_bufio_client_create. But I think it's too late in this development
> cycle, I would add it after the next merge window closes, when I open a
> new patch series for the kernel 6.20 (or 7.0).
>
> > Otherwise block_to_sector() can return sectors that are not addressable
> > on the device. Unfortunately, I don't think all users of dm-bufio will
> > pass in block_sizes that are larger than 4k (uds_make_bufio() in
> > dm-vdo/indexer/io-factory.c for instance).
> >
> > -Ben
> >
> > > Please try this patch - does it fix it?
> > >
> > > Mikulas
>
> I changed the patch below, so that it aligns write bios on
> max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev),
> bdev_physical_block_size(b->c->bdev)); - so that if physical block size is
> greater than logical block size, the writes are aligned so that the device
> doesn't do read-modify-write.

This will really only help if the bufio client block_size is a multiple
of the underlying device's physical block size, and the device is
aligned to the physical block size. Perhaps we should figure out the
alignment in dm_bufio_client_create(), with something like:

	c->align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(bdev));
	if (block_size & -bdev_physical_block_size(bdev) &&
	    bdev_alignment_offset(bdev) == 0)
		c->align = bdev_physical_block_size(bdev);

I suppose pre-calculating this could cause problems if the underlying
device was another dm device, and it switched tables in a way that
changed its limits. I dunno if we care about that, however.
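
To make the selection rule easier to poke at outside the kernel, here is
a rough userspace sketch of it. Everything in it is an assumption made
for illustration: struct fake_limits and pick_write_align() are
hypothetical names standing in for the bdev queue limits and for the
logic that would live in dm_bufio_client_create(). It also deviates from
the snippet above in two ways: it tests "multiple of the physical block
size" with an explicit modulo, and it never lets the result drop below
the logical-block-size based alignment.

```c
/* Same value dm-bufio.c uses for its default write alignment. */
#define DM_BUFIO_WRITE_ALIGN 4096u

/* Hypothetical stand-in for the limits reported by the underlying bdev. */
struct fake_limits {
	unsigned int logical_block_size;
	unsigned int physical_block_size;
	int alignment_offset;	/* 0: device start sits on a physical block */
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Hypothetical helper: choose the write alignment once, at client creation. */
static unsigned int pick_write_align(unsigned int block_size,
				     const struct fake_limits *lim)
{
	/* Baseline: never write anything smaller than a logical block. */
	unsigned int align = max_u(DM_BUFIO_WRITE_ALIGN,
				   lim->logical_block_size);

	/*
	 * Only widen to the physical block size when a bufio block holds
	 * whole physical blocks and the device reports no alignment
	 * offset. In the other cases, aligning inside the buffer is
	 * unlikely to line up with physical blocks on the media.
	 */
	if (block_size % lim->physical_block_size == 0 &&
	    lim->alignment_offset == 0)
		align = max_u(align, lim->physical_block_size);

	return align;
}
```

With a 512b logical / 16k physical device, a client using 16k blocks
would get 16k alignment, while a client using 4k blocks would stay at
4k, since its blocks cannot cover a whole physical block anyway.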

-Ben

> Mikulas
>
> > > From: Mikulas Patocka <mpatocka@redhat.com>
> > >
> > > There may be devices with logical block size larger than 4k. Fix
> > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > with 8k logical block size created with qemu parameters:
> > >
> > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > >
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Cc: stable@vger.kernel.org
> > >
> > > ---
> > > drivers/md/dm-bufio.c | 9 +++++----
> > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > >
> > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > ===================================================================
> > > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > >  {
> > >  	unsigned int n_sectors;
> > >  	sector_t sector;
> > > -	unsigned int offset, end;
> > > +	unsigned int offset, end, align;
> > >
> > >  	b->end_io = end_io;
> > >
> > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > >  			b->c->write_callback(b);
> > >  		offset = b->write_start;
> > >  		end = b->write_end;
> > > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > > +		offset &= -align;
> > > +		end += align - 1;
> > > +		end &= -align;
> > >  		if (unlikely(end > b->c->block_size))
> > >  			end = b->c->block_size;
> > >
> > >
> > >
> >
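For anyone who wants to check the arithmetic in the hunk above without
booting a kernel, the rounding can be reproduced in userspace. This is
only a sketch: struct write_range and round_write_range() are
hypothetical names, the function just copies the mask-and-clamp steps
from submit_io(), and it assumes align is a power of two, as it is for
DM_BUFIO_WRITE_ALIGN and for logical block sizes.

```c
/* Hypothetical container for the byte range that will be written. */
struct write_range {
	unsigned int offset;	/* first byte written, rounded down */
	unsigned int end;	/* one past the last byte, rounded up */
};

/*
 * Round the dirty range [write_start, write_end) outward to multiples of
 * align, then clamp the end to the buffer size, mirroring the steps in
 * the patched submit_io(). align must be a power of two for the masks
 * to work.
 */
static struct write_range round_write_range(unsigned int write_start,
					    unsigned int write_end,
					    unsigned int align,
					    unsigned int block_size)
{
	struct write_range r = { write_start, write_end };

	r.offset &= -align;		/* round start down */
	r.end += align - 1;		/* round end up ... */
	r.end &= -align;
	if (r.end > block_size)		/* ... but not past the buffer */
		r.end = block_size;

	return r;
}
```

Take an 8k buffer with bytes 5000-6000 dirty. With the old fixed 4k
alignment the write covers bytes 4096-8192, which starts in the middle
of a logical block on a device with 8k logical blocks. With align set to
8192 the write covers the whole 0-8192 buffer instead.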