* [PATCH v2] dm-ebs: Mark full buffer dirty even on partial write
@ 2025-10-20 12:33 Uladzislau Rezki (Sony)
2025-10-20 12:48 ` [PATCH] dm-bufio: align write boundary on bdev_logical_block_size Mikulas Patocka
0 siblings, 1 reply; 14+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-10-20 12:33 UTC (permalink / raw)
To: Mikulas Patocka, Alasdair Kergon, DMML
Cc: Andrew Morton, Mike Snitzer, Christoph Hellwig, LKML,
Uladzislau Rezki
When performing a read-modify-write(RMW) operation, any modification
to a buffered block must cause the entire buffer to be marked dirty.
Marking only a subrange as dirty is incorrect because the underlying
device block size(ubs) defines the minimum read/write granularity. A
lower device can perform I/O only on regions which are fully aligned
and sized to ubs.
This change ensures that write-back operations always occur in full
ubs-sized chunks, matching the intended emulation semantics of the
EBS target.
As for the user-space-visible impact: a device that accepts only
ubs-sized, ubs-aligned I/O rejects sub-ubs and misaligned requests,
which can lead to data loss. Example:
1) Create an 8K NVMe device in qemu by adding
-device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
2) Set up dm-ebs to emulate a 512B to 8K mapping
urezki@pc638:~/bin$ cat dmsetup.sh
lower=/dev/nvme0n1
len=$(blockdev --getsz "$lower")
echo "0 $len ebs $lower 0 1 16" | dmsetup create nvme-8k
urezki@pc638:~/bin$
Here offset is 0, ebs=1 and ubs=16 (both in 512-byte sectors).
3) Create an ext4 filesystem(default 4K block size)
urezki@pc638:~/bin$ sudo mkfs.ext4 -F /dev/dm-0
mke2fs 1.47.0 (5-Feb-2023)
Discarding device blocks: done
Creating filesystem with 2072576 4k blocks and 518144 inodes
Filesystem UUID: bd0b6ca6-0506-4e31-86da-8d22c9d50b63
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: mkfs.ext4: Input/output error while writing out and closing file system
urezki@pc638:~/bin$ dmesg
<snip>
[ 1618.875449] buffer_io_error: 1028 callbacks suppressed
[ 1618.875456] Buffer I/O error on dev dm-0, logical block 0, lost async page write
[ 1618.875527] Buffer I/O error on dev dm-0, logical block 1, lost async page write
[ 1618.875602] Buffer I/O error on dev dm-0, logical block 2, lost async page write
[ 1618.875620] Buffer I/O error on dev dm-0, logical block 3, lost async page write
[ 1618.875639] Buffer I/O error on dev dm-0, logical block 4, lost async page write
[ 1618.894316] Buffer I/O error on dev dm-0, logical block 5, lost async page write
[ 1618.894358] Buffer I/O error on dev dm-0, logical block 6, lost async page write
[ 1618.894380] Buffer I/O error on dev dm-0, logical block 7, lost async page write
[ 1618.894405] Buffer I/O error on dev dm-0, logical block 8, lost async page write
[ 1618.894427] Buffer I/O error on dev dm-0, logical block 9, lost async page write
<snip>
There are many I/O errors because the lower 8K device rejects
sub-ubs/misaligned requests.
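The rejection rule can be sketched in userspace C as follows. This is only an illustration under stated assumptions: io_is_addressable() is a hypothetical helper, not a kernel function, and it models just the "aligned and sized to the logical block size" condition described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): returns true if a request that
 * starts at byte offset 'off' and spans 'len' bytes is both aligned to
 * and a multiple of the logical block size 'lbs'. 'lbs' is assumed to be
 * a power of two, as block sizes are.
 */
static bool io_is_addressable(uint64_t off, uint32_t len, uint32_t lbs)
{
	assert(lbs != 0 && (lbs & (lbs - 1)) == 0);

	return len != 0 &&
	       (off & (lbs - 1)) == 0 &&
	       (len & (lbs - 1)) == 0;
}
```

With lbs = 8192, a single 4K filesystem block written at offset 0 fails the length check, which is consistent with the buffer I/O errors in the log above; an 8192-byte write at offset 0 passes.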
With the patch:
urezki@pc638:~/bin$ sudo mkfs.ext4 -F /dev/dm-0
mke2fs 1.47.0 (5-Feb-2023)
Discarding device blocks: done
Creating filesystem with 2072576 4k blocks and 518144 inodes
Filesystem UUID: 9b54f44f-ef55-4bd4-9e40-c8b775a616ac
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
urezki@pc638:~/bin$ sudo mount /dev/dm-0 /mnt/
urezki@pc638:~/bin$ ls -al /mnt/
total 24
drwxr-xr-x 3 root root 4096 Oct 17 15:13 .
drwxr-xr-x 19 root root 4096 Jul 10 19:42 ..
drwx------ 2 root root 16384 Oct 17 15:13 lost+found
urezki@pc638:~/bin$
After this change: mkfs completes; mount succeeds.
v1 -> v2:
- reflect a user space visible impact in the commit message.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
drivers/md/dm-ebs-target.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/dm-ebs-target.c b/drivers/md/dm-ebs-target.c
index 6abb31ca9662..b354e74a670e 100644
--- a/drivers/md/dm-ebs-target.c
+++ b/drivers/md/dm-ebs-target.c
@@ -103,7 +103,7 @@ static int __ebs_rw_bvec(struct ebs_c *ec, enum req_op op, struct bio_vec *bv,
} else {
flush_dcache_page(bv->bv_page);
memcpy(ba, pa, cur_len);
- dm_bufio_mark_partial_buffer_dirty(b, buf_off, buf_off + cur_len);
+ dm_bufio_mark_buffer_dirty(b);
}
dm_bufio_release(b);
--
2.47.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-10-20 12:33 [PATCH v2] dm-ebs: Mark full buffer dirty even on partial write Uladzislau Rezki (Sony)
@ 2025-10-20 12:48 ` Mikulas Patocka
2025-10-28 8:47 ` Uladzislau Rezki
2025-11-18 4:00 ` Benjamin Marzinski
0 siblings, 2 replies; 14+ messages in thread
From: Mikulas Patocka @ 2025-10-20 12:48 UTC (permalink / raw)
To: Uladzislau Rezki (Sony)
Cc: Alasdair Kergon, DMML, Andrew Morton, Mike Snitzer,
Christoph Hellwig, LKML
On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> When performing a read-modify-write(RMW) operation, any modification
> to a buffered block must cause the entire buffer to be marked dirty.
>
> Marking only a subrange as dirty is incorrect because the underlying
> device block size(ubs) defines the minimum read/write granularity. A
> lower device can perform I/O only on regions which are fully aligned
> and sized to ubs.
Hi
I think it would be better to fix this in dm-bufio, so that other dm-bufio
users would also benefit from the fix. Please try this patch - does it fix
it?
Mikulas
From: Mikulas Patocka <mpatocka@redhat.com>
There may be devices with logical block size larger than 4k. Fix
dm-bufio, so that it will align I/O on logical block size. This commit
fixes I/O errors on the dm-ebs target on the top of emulated nvme device
with 8k logical block size created with qemu parameters:
-device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
---
drivers/md/dm-bufio.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
+++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
@@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
{
unsigned int n_sectors;
sector_t sector;
- unsigned int offset, end;
+ unsigned int offset, end, align;
b->end_io = end_io;
@@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
b->c->write_callback(b);
offset = b->write_start;
end = b->write_end;
- offset &= -DM_BUFIO_WRITE_ALIGN;
- end += DM_BUFIO_WRITE_ALIGN - 1;
- end &= -DM_BUFIO_WRITE_ALIGN;
+ align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
+ offset &= -align;
+ end += align - 1;
+ end &= -align;
if (unlikely(end > b->c->block_size))
end = b->c->block_size;
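The rounding done in the hunk above can be tried out in userspace with a sketch like the following. It assumes the alignment is a power of two; round_write_range() and struct byte_range are hypothetical names used only for illustration, and the 4096 passed in the example stands in for DM_BUFIO_WRITE_ALIGN, which is an assumption about its value.

```c
#include <assert.h>

/* Hypothetical type for illustration: a half-open byte range [start, end). */
struct byte_range {
	unsigned int start;
	unsigned int end;
};

/*
 * Userspace mirror of the rounding in the patch: round the dirty range
 * down/up to 'align' and clamp the end to the buffer's block size.
 * 'align' must be a power of two so that -align acts as a mask.
 */
static struct byte_range round_write_range(unsigned int write_start,
					   unsigned int write_end,
					   unsigned int align,
					   unsigned int block_size)
{
	struct byte_range r;

	assert(align != 0 && (align & (align - 1)) == 0);

	r.start = write_start & -align;
	r.end = (write_end + align - 1) & -align;
	if (r.end > block_size)
		r.end = block_size;

	return r;
}
```

For a dirty range of [512, 1024) in an 8192-byte buffer, align = 4096 gives [0, 4096), which an 8K device would still reject, while align = 8192 gives [0, 8192), i.e. the whole buffer.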
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-10-20 12:48 ` [PATCH] dm-bufio: align write boundary on bdev_logical_block_size Mikulas Patocka
@ 2025-10-28 8:47 ` Uladzislau Rezki
2025-10-28 13:18 ` Uladzislau Rezki
2025-11-18 4:00 ` Benjamin Marzinski
1 sibling, 1 reply; 14+ messages in thread
From: Uladzislau Rezki @ 2025-10-28 8:47 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Uladzislau Rezki (Sony), Alasdair Kergon, DMML, Andrew Morton,
Mike Snitzer, Christoph Hellwig, LKML
Hello!
Sorry, I missed your email for some reason. It is probably because
you replied with a different subject than the one I sent
initially.
>
> On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
>
> > When performing a read-modify-write(RMW) operation, any modification
> > to a buffered block must cause the entire buffer to be marked dirty.
> >
> > Marking only a subrange as dirty is incorrect because the underlying
> > device block size(ubs) defines the minimum read/write granularity. A
> > lower device can perform I/O only on regions which are fully aligned
> > and sized to ubs.
>
> Hi
>
> I think it would be better to fix this in dm-bufio, so that other dm-bufio
> users would also benefit from the fix. Please try this patch - does it fix
> it?
>
If it solves what I describe, I do not mind :)
>
>
> From: Mikulas Patocka <mpatocka@redhat.com>
>
> There may be devices with logical block size larger than 4k. Fix
> dm-bufio, so that it will align I/O on logical block size. This commit
> fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> with 8k logical block size created with qemu parameters:
>
> -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
>
> ---
> drivers/md/dm-bufio.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> Index: linux-2.6/drivers/md/dm-bufio.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> {
> unsigned int n_sectors;
> sector_t sector;
> - unsigned int offset, end;
> + unsigned int offset, end, align;
>
> b->end_io = end_io;
>
> @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> b->c->write_callback(b);
> offset = b->write_start;
> end = b->write_end;
> - offset &= -DM_BUFIO_WRITE_ALIGN;
> - end += DM_BUFIO_WRITE_ALIGN - 1;
> - end &= -DM_BUFIO_WRITE_ALIGN;
> + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> + offset &= -align;
> + end += align - 1;
> + end &= -align;
> if (unlikely(end > b->c->block_size))
> end = b->c->block_size;
>
>
I will check it and get back soon.
Thank you.
--
Uladzislau Rezki
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-10-28 8:47 ` Uladzislau Rezki
@ 2025-10-28 13:18 ` Uladzislau Rezki
2025-10-29 10:24 ` Mikulas Patocka
0 siblings, 1 reply; 14+ messages in thread
From: Uladzislau Rezki @ 2025-10-28 13:18 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Mikulas Patocka, Alasdair Kergon, DMML, Andrew Morton,
Mike Snitzer, Christoph Hellwig, LKML
On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> Hello!
>
> Sorry i have missed you email for unknown reason to me. It is
> probably because you answered to email with different subject
> i sent initially.
>
> >
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> >
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > >
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> >
> > Hi
> >
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > users would also benefit from the fix. Please try this patch - does it fix
> > it?
> >
> If it solves what i describe i do not mind :)
>
> >
> >
> > From: Mikulas Patocka <mpatocka@redhat.com>
> >
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> >
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> >
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> >
> > ---
> > drivers/md/dm-bufio.c | 9 +++++----
> > 1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > {
> > unsigned int n_sectors;
> > sector_t sector;
> > - unsigned int offset, end;
> > + unsigned int offset, end, align;
> >
> > b->end_io = end_io;
> >
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > b->c->write_callback(b);
> > offset = b->write_start;
> > end = b->write_end;
> > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > - end &= -DM_BUFIO_WRITE_ALIGN;
> > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
>
Should it be the physical_block_size of the device? That is the minimum
I/O the device can perform. The point is that a user sets a "ubs" size,
which should correspond to the smallest I/O the device can physically write.
--
Uladzislau Rezki
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-10-28 13:18 ` Uladzislau Rezki
@ 2025-10-29 10:24 ` Mikulas Patocka
2025-10-29 13:06 ` Uladzislau Rezki
0 siblings, 1 reply; 14+ messages in thread
From: Mikulas Patocka @ 2025-10-29 10:24 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Alasdair Kergon, DMML, Andrew Morton, Mike Snitzer,
Christoph Hellwig, LKML
On Tue, 28 Oct 2025, Uladzislau Rezki wrote:
> On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > Hello!
> >
> > Sorry i have missed you email for unknown reason to me. It is
> > probably because you answered to email with different subject
> > i sent initially.
> >
> > >
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > >
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > >
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > >
> > > Hi
> > >
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > users would also benefit from the fix. Please try this patch - does it fix
> > > it?
> > >
> > If it solves what i describe i do not mind :)
> >
> > >
> > >
> > > From: Mikulas Patocka <mpatocka@redhat.com>
> > >
> > > There may be devices with logical block size larger than 4k. Fix
> > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > with 8k logical block size created with qemu parameters:
> > >
> > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > >
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Cc: stable@vger.kernel.org
> > >
> > > ---
> > > drivers/md/dm-bufio.c | 9 +++++----
> > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > >
> > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > ===================================================================
> > > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > {
> > > unsigned int n_sectors;
> > > sector_t sector;
> > > - unsigned int offset, end;
> > > + unsigned int offset, end, align;
> > >
> > > b->end_io = end_io;
> > >
> > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > b->c->write_callback(b);
> > > offset = b->write_start;
> > > end = b->write_end;
> > > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > > - end &= -DM_BUFIO_WRITE_ALIGN;
> > > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> >
> Should it be physical_block_size of device? It is a min_io the device
> can perform. The point is, a user sets "ubs" size which should correspond
> to the smallest I/O the device can write, i.e. physically.
physical_block_size is unreliable - some SSDs report physical block size
512 bytes, some 4k. Regardless of what they report, all current SSDs have
4k sector size internally and they do slow read-modify-write cycle on
requests that are not aligned on 4k boundary.
Mikulas
> --
> Uladzislau Rezki
>
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-10-29 10:24 ` Mikulas Patocka
@ 2025-10-29 13:06 ` Uladzislau Rezki
2025-11-10 10:26 ` Uladzislau Rezki
0 siblings, 1 reply; 14+ messages in thread
From: Uladzislau Rezki @ 2025-10-29 13:06 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Uladzislau Rezki, Alasdair Kergon, DMML, Andrew Morton,
Mike Snitzer, Christoph Hellwig, LKML
On Wed, Oct 29, 2025 at 11:24:25AM +0100, Mikulas Patocka wrote:
>
>
> On Tue, 28 Oct 2025, Uladzislau Rezki wrote:
>
> > On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > > Hello!
> > >
> > > Sorry i have missed you email for unknown reason to me. It is
> > > probably because you answered to email with different subject
> > > i sent initially.
> > >
> > > >
> > > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > >
> > > > > When performing a read-modify-write(RMW) operation, any modification
> > > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > >
> > > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > > lower device can perform I/O only on regions which are fully aligned
> > > > > and sized to ubs.
> > > >
> > > > Hi
> > > >
> > > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > > users would also benefit from the fix. Please try this patch - does it fix
> > > > it?
> > > >
> > > If it solves what i describe i do not mind :)
> > >
> > > >
> > > >
> > > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > >
> > > > There may be devices with logical block size larger than 4k. Fix
> > > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > > with 8k logical block size created with qemu parameters:
> > > >
> > > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > >
> > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > Cc: stable@vger.kernel.org
> > > >
> > > > ---
> > > > drivers/md/dm-bufio.c | 9 +++++----
> > > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > > >
> > > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > > > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > > {
> > > > unsigned int n_sectors;
> > > > sector_t sector;
> > > > - unsigned int offset, end;
> > > > + unsigned int offset, end, align;
> > > >
> > > > b->end_io = end_io;
> > > >
> > > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > > b->c->write_callback(b);
> > > > offset = b->write_start;
> > > > end = b->write_end;
> > > > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > > > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > > > - end &= -DM_BUFIO_WRITE_ALIGN;
> > > > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > >
> > Should it be physical_block_size of device? It is a min_io the device
> > can perform. The point is, a user sets "ubs" size which should correspond
> > to the smallest I/O the device can write, i.e. physically.
>
> physical_block_size is unreliable - some SSDs report physical block size
> 512 bytes, some 4k. Regardless of what they report, all current SSDs have
> 4k sector size internally and they do slow read-modify-write cycle on
> requests that are not aligned on 4k boundary.
>
I see. Some NVMe devices have buggy firmware, which is why we have so
many quirk flags. I agree it is a mess.
The change does not help my project or my case. I posted the patch to fix
dm-ebs because the code writes back a partial range instead of the full
ubs size, which is what the user actually asks for. When a target is
created, the physical_block_size corresponds to ubs.
I would really appreciate it if you took the fix I posted. Your patch can
be sent out separately.
Does that work for you?
Thank you!
--
Uladzislau Rezki
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-10-29 13:06 ` Uladzislau Rezki
@ 2025-11-10 10:26 ` Uladzislau Rezki
0 siblings, 0 replies; 14+ messages in thread
From: Uladzislau Rezki @ 2025-11-10 10:26 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Mikulas Patocka, Alasdair Kergon, DMML, Andrew Morton,
Mike Snitzer, Christoph Hellwig, LKML
On Wed, Oct 29, 2025 at 02:06:31PM +0100, Uladzislau Rezki wrote:
> On Wed, Oct 29, 2025 at 11:24:25AM +0100, Mikulas Patocka wrote:
> >
> >
> > On Tue, 28 Oct 2025, Uladzislau Rezki wrote:
> >
> > > On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > > > Hello!
> > > >
> > > > Sorry i have missed you email for unknown reason to me. It is
> > > > probably because you answered to email with different subject
> > > > i sent initially.
> > > >
> > > > >
> > > > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > > >
> > > > > > When performing a read-modify-write(RMW) operation, any modification
> > > > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > > >
> > > > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > > > lower device can perform I/O only on regions which are fully aligned
> > > > > > and sized to ubs.
> > > > >
> > > > > Hi
> > > > >
> > > > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > > > users would also benefit from the fix. Please try this patch - does it fix
> > > > > it?
> > > > >
> > > > If it solves what i describe i do not mind :)
> > > >
> > > > >
> > > > >
> > > > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > > >
> > > > > There may be devices with logical block size larger than 4k. Fix
> > > > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > > > with 8k logical block size created with qemu parameters:
> > > > >
> > > > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > > >
> > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > > Cc: stable@vger.kernel.org
> > > > >
> > > > > ---
> > > > > drivers/md/dm-bufio.c | 9 +++++----
> > > > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > > > >
> > > > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > > > ===================================================================
> > > > > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > > > > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > > > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > > > {
> > > > > unsigned int n_sectors;
> > > > > sector_t sector;
> > > > > - unsigned int offset, end;
> > > > > + unsigned int offset, end, align;
> > > > >
> > > > > b->end_io = end_io;
> > > > >
> > > > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > > > b->c->write_callback(b);
> > > > > offset = b->write_start;
> > > > > end = b->write_end;
> > > > > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > > > > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > > > > - end &= -DM_BUFIO_WRITE_ALIGN;
> > > > > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > > >
> > > Should it be physical_block_size of device? It is a min_io the device
> > > can perform. The point is, a user sets "ubs" size which should correspond
> > > to the smallest I/O the device can write, i.e. physically.
> >
> > physical_block_size is unreliable - some SSDs report physical block size
> > 512 bytes, some 4k. Regardless of what they report, all current SSDs have
> > 4k sector size internally and they do slow read-modify-write cycle on
> > requests that are not aligned on 4k boundary.
> >
> I see. Some NVMEs have buggy firmwares therefore we have a lot of quicks
> flags. I agree there is mess there.
>
> The change does not help my project and case. I posted the patch to fix
> the dm-ebs as the code offloads partial size instead of ubs size, what
> actually a user asking for. When a target is created, the physical_block_size
> corresponds to ubs.
>
> I really appreciate if you take the fix i posted. Your patch can be
> sent out separately.
>
> Does it work for you?
>
Any feedback or comments on it?
--
Uladzislau Rezki
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-10-20 12:48 ` [PATCH] dm-bufio: align write boundary on bdev_logical_block_size Mikulas Patocka
2025-10-28 8:47 ` Uladzislau Rezki
@ 2025-11-18 4:00 ` Benjamin Marzinski
2025-11-18 11:15 ` Mikulas Patocka
2025-11-18 17:45 ` Mikulas Patocka
1 sibling, 2 replies; 14+ messages in thread
From: Benjamin Marzinski @ 2025-11-18 4:00 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Uladzislau Rezki (Sony), Alasdair Kergon, DMML, Andrew Morton,
Mike Snitzer, Christoph Hellwig, LKML
On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
>
>
> On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
>
> > When performing a read-modify-write(RMW) operation, any modification
> > to a buffered block must cause the entire buffer to be marked dirty.
> >
> > Marking only a subrange as dirty is incorrect because the underlying
> > device block size(ubs) defines the minimum read/write granularity. A
> > lower device can perform I/O only on regions which are fully aligned
> > and sized to ubs.
>
> Hi
>
> I think it would be better to fix this in dm-bufio, so that other dm-bufio
> users would also benefit from the fix.
This looks to me like it should accomplish the same thing as
Uladzislau's patch. But I think there could still be problems with other
dm-bufio users, for devices where the blocksize is larger than 4k.
In dm_bufio_client_create() I think we want to make sure that block_size
is a multiple of bdev_logical_block_size(bdev), instead of 512 bytes.
Otherwise block_to_sector() can return sectors that are not addressable
on the device. Unfortunately, I don't think all users of dm-bufio will
pass in block_sizes that are larger than 4k (uds_make_bufio() in
dm-vdo/indexer/io-factory.c for instance).
-Ben
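The concern above can be illustrated with a small userspace sketch. All names here are hypothetical and the block-to-sector mapping is deliberately simplified (no start offset, plain multiplication); it is not the dm-bufio implementation, only a model of why a block_size that is not a multiple of the logical block size produces unaddressable sectors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Simplified stand-in for a block-to-sector mapping: no start offset. */
static uint64_t block_to_sector_sketch(uint64_t block, uint32_t block_size)
{
	return block * (block_size / SECTOR_SIZE);
}

/* The suggested check: block_size must be a multiple of the logical block size. */
static bool block_size_fits_device(uint32_t block_size, uint32_t lbs)
{
	assert(lbs >= SECTOR_SIZE);

	return block_size >= lbs && block_size % lbs == 0;
}

/* True if 'sector' starts on a logical block boundary of the device. */
static bool sector_is_addressable(uint64_t sector, uint32_t lbs)
{
	return (sector * SECTOR_SIZE) % lbs == 0;
}
```

With block_size = 4096 on a device with an 8192-byte logical block size, block 1 maps to sector 8 (byte offset 4096), which does not start on a logical block boundary; with block_size = 8192 every block does.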
> Please try this patch - does it fix it?
>
> Mikulas
>
>
>
> From: Mikulas Patocka <mpatocka@redhat.com>
>
> There may be devices with logical block size larger than 4k. Fix
> dm-bufio, so that it will align I/O on logical block size. This commit
> fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> with 8k logical block size created with qemu parameters:
>
> -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
>
> ---
> drivers/md/dm-bufio.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> Index: linux-2.6/drivers/md/dm-bufio.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> {
> unsigned int n_sectors;
> sector_t sector;
> - unsigned int offset, end;
> + unsigned int offset, end, align;
>
> b->end_io = end_io;
>
> @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> b->c->write_callback(b);
> offset = b->write_start;
> end = b->write_end;
> - offset &= -DM_BUFIO_WRITE_ALIGN;
> - end += DM_BUFIO_WRITE_ALIGN - 1;
> - end &= -DM_BUFIO_WRITE_ALIGN;
> + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> + offset &= -align;
> + end += align - 1;
> + end &= -align;
> if (unlikely(end > b->c->block_size))
> end = b->c->block_size;
>
>
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-11-18 4:00 ` Benjamin Marzinski
@ 2025-11-18 11:15 ` Mikulas Patocka
2025-11-18 12:42 ` Uladzislau Rezki
2025-11-18 17:45 ` Mikulas Patocka
1 sibling, 1 reply; 14+ messages in thread
From: Mikulas Patocka @ 2025-11-18 11:15 UTC (permalink / raw)
To: Benjamin Marzinski
Cc: Uladzislau Rezki (Sony), Alasdair Kergon, DMML, Andrew Morton,
Mike Snitzer, Christoph Hellwig, LKML
On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
> On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> >
> >
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> >
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > >
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> >
> > Hi
> >
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > users would also benefit from the fix.
>
> This looks to me like it should accomplish the same thing as
> Uladzislau's patch. But I think there could still be problems with other
> dm-bufio users, for devices where the blocksize is larger than 4k.
Yes, but Uladzislau said that this patch doesn't work for him. So, I
suspect that he has "logical_block_size" set incorrectly.
Mikulas
> In dm_bufio_client_create() I think we want to make sure that block_size
> is a multiple of bdev_logical_block_size(bdev), instead of 512b.
> Otherwise block_to_sector() can return sectors that are not addressable
> on the device. Unfortunatley, I don't think all users of dm-bufio will
> pass in block_sizes that are larger than 4k (uds_make_bufio() in
> dm-vdp/indexer/io-factory.c for instance).
>
> -Ben
>
> > Please try this patch - does it fix it?
> >
> > Mikulas
> >
> >
> >
> > From: Mikulas Patocka <mpatocka@redhat.com>
> >
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> >
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> >
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> >
> > ---
> > drivers/md/dm-bufio.c | 9 +++++----
> > 1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > {
> > unsigned int n_sectors;
> > sector_t sector;
> > - unsigned int offset, end;
> > + unsigned int offset, end, align;
> >
> > b->end_io = end_io;
> >
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > b->c->write_callback(b);
> > offset = b->write_start;
> > end = b->write_end;
> > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > - end &= -DM_BUFIO_WRITE_ALIGN;
> > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > + offset &= -align;
> > + end += align - 1;
> > + end &= -align;
> > if (unlikely(end > b->c->block_size))
> > end = b->c->block_size;
> >
> >
>
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-11-18 11:15 ` Mikulas Patocka
@ 2025-11-18 12:42 ` Uladzislau Rezki
0 siblings, 0 replies; 14+ messages in thread
From: Uladzislau Rezki @ 2025-11-18 12:42 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Benjamin Marzinski, Uladzislau Rezki (Sony), Alasdair Kergon,
DMML, Andrew Morton, Mike Snitzer, Christoph Hellwig, LKML
On Tue, Nov 18, 2025 at 12:15:43PM +0100, Mikulas Patocka wrote:
>
>
> On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
>
> > On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > >
> > >
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > >
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > >
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > >
> > > Hi
> > >
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > users would also benefit from the fix.
> >
> > This looks to me like it should accomplish the same thing as
> > Uladzislau's patch. But I think there could still be problems with other
> > dm-bufio users, for devices where the blocksize is larger than 4k.
>
> Yes, but Uladzislau said that this patch doesn't work for him. So, I
> suspect that he has "logical_block_size" set incorrectly.
>
Indeed, because logical is < physical in my case. Your change does not fix
it because the I/O size is equal to the physical block size.
--
Uladzislau Rezki
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-11-18 4:00 ` Benjamin Marzinski
2025-11-18 11:15 ` Mikulas Patocka
@ 2025-11-18 17:45 ` Mikulas Patocka
2025-11-18 20:36 ` Benjamin Marzinski
2025-11-19 5:45 ` Christoph Hellwig
1 sibling, 2 replies; 14+ messages in thread
From: Mikulas Patocka @ 2025-11-18 17:45 UTC (permalink / raw)
To: Benjamin Marzinski
Cc: Uladzislau Rezki (Sony), Alasdair Kergon, DMML, Andrew Morton,
Mike Snitzer, Christoph Hellwig, LKML
On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
> On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> >
> >
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> >
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > >
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> >
> > Hi
> >
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > users would also benefit from the fix.
>
> This looks to me like it should accomplish the same thing as
> Uladzislau's patch. But I think there could still be problems with other
> dm-bufio users, for devices where the blocksize is larger than 4k.
>
> In dm_bufio_client_create() I think we want to make sure that block_size
> is a multiple of bdev_logical_block_size(bdev), instead of 512b.
I could add WARN_ON(block_size < bdev_logical_block_size(bdev)) to
dm_bufio_client_create. But I think it's too late in this development
cycle, I would add it after the next merge window closes, when I open a
new patch series for the kernel 6.20 (or 7.0).
> Otherwise block_to_sector() can return sectors that are not addressable
> on the device. Unfortunately, I don't think all users of dm-bufio will
> pass in block_sizes that are larger than 4k (uds_make_bufio() in
> dm-vdo/indexer/io-factory.c for instance).
>
> -Ben
>
> > Please try this patch - does it fix it?
> >
> > Mikulas
I changed the patch below, so that it aligns write bios on
max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev),
bdev_physical_block_size(b->c->bdev)); - so that if physical block size is
greater than logical block size, the writes are aligned so that the device
doesn't do read-modify-write.
Mikulas
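
For illustration only, here is a minimal user-space sketch of the rounding described above. It is not the kernel code: WRITE_ALIGN stands in for DM_BUFIO_WRITE_ALIGN (4096 is assumed here), and max3u() and align_write_range() are hypothetical helper names. The alignment is assumed to be a power of two, which is what makes the "&= -align" mask work.

```c
#include <stdio.h>

/* Assumed stand-in for the kernel's DM_BUFIO_WRITE_ALIGN. */
#define WRITE_ALIGN 4096u

/* Hypothetical helper: largest of three unsigned values. */
static unsigned int max3u(unsigned int a, unsigned int b, unsigned int c)
{
	unsigned int m = a > b ? a : b;

	return m > c ? m : c;
}

/*
 * Widen the dirty byte range [*offset, *end) so that both ends fall on
 * a multiple of align (a power of two), then clamp the end to the
 * buffer size, as the patched submit_io() hunk does.
 */
static void align_write_range(unsigned int *offset, unsigned int *end,
			      unsigned int align, unsigned int block_size)
{
	*offset &= -align;	/* round the start down */
	*end += align - 1;	/* round the end up ... */
	*end &= -align;
	if (*end > block_size)	/* ... but never past the buffer */
		*end = block_size;
}
```

With a 512-byte logical and 8192-byte physical block size, max3u(WRITE_ALIGN, 512, 8192) is 8192, so a dirty range of bytes 512..1024 in an 8192-byte buffer is widened to the whole buffer (0..8192).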
> > From: Mikulas Patocka <mpatocka@redhat.com>
> >
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> >
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> >
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> >
> > ---
> > drivers/md/dm-bufio.c | 9 +++++----
> > 1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > {
> > unsigned int n_sectors;
> > sector_t sector;
> > - unsigned int offset, end;
> > + unsigned int offset, end, align;
> >
> > b->end_io = end_io;
> >
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > b->c->write_callback(b);
> > offset = b->write_start;
> > end = b->write_end;
> > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > - end &= -DM_BUFIO_WRITE_ALIGN;
> > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > + offset &= -align;
> > + end += align - 1;
> > + end &= -align;
> > if (unlikely(end > b->c->block_size))
> > end = b->c->block_size;
> >
> >
>
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-11-18 17:45 ` Mikulas Patocka
@ 2025-11-18 20:36 ` Benjamin Marzinski
2025-11-19 5:45 ` Christoph Hellwig
1 sibling, 0 replies; 14+ messages in thread
From: Benjamin Marzinski @ 2025-11-18 20:36 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Uladzislau Rezki (Sony), Alasdair Kergon, DMML, Andrew Morton,
Mike Snitzer, Christoph Hellwig, LKML
On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
>
>
> On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
>
> > On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > >
> > >
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > >
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > >
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > >
> > > Hi
> > >
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > users would also benefit from the fix.
> >
> > This looks to me like it should accomplish the same thing as
> > Uladzislau's patch. But I think there could still be problems with other
> > dm-bufio users, for devices where the blocksize is larger than 4k.
> >
> > In dm_bufio_client_create() I think we want to make sure that block_size
> > is a multiple of bdev_logical_block_size(bdev), instead of 512b.
>
> I could add WARN_ON(block_size < bdev_logical_block_size(bdev)) to
> dm_bufio_client_create. But I think it's too late in this development
> cycle, I would add it after the next merge window closes, when I open a
> new patch series for the kernel 6.20 (or 7.0).
>
> > Otherwise block_to_sector() can return sectors that are not addressable
> > on the device. Unfortunately, I don't think all users of dm-bufio will
> > pass in block_sizes that are larger than 4k (uds_make_bufio() in
> > dm-vdo/indexer/io-factory.c for instance).
> >
> > -Ben
> >
> > > Please try this patch - does it fix it?
> > >
> > > Mikulas
>
> I changed the patch below, so that it aligns write bios on
> max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev),
> bdev_physical_block_size(b->c->bdev)); - so that if physical block size is
> greater than logical block size, the writes are aligned so that the device
> doesn't do read-modify-write.
This will really only help if the bufio client block_size is a multiple
of the underlying device's physical block size, and the device is
aligned to the physical block size. Perhaps we should figure
out the alignment in dm_bufio_client_create(), with something like:
	c->align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(bdev));
	if (!(block_size & (bdev_physical_block_size(bdev) - 1)) &&
	    bdev_alignment_offset(bdev) == 0)
		c->align = bdev_physical_block_size(bdev);
I suppose pre-calculating this could cause problems if the underlying
device was another dm device, and it switched tables in a way that
changed its limits. I dunno if we care about that, however.
-Ben
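
As a rough user-space sketch of the selection logic suggested above (not a patch): struct dev_limits and pick_write_align() are hypothetical names, the struct fields merely mirror what bdev_logical_block_size(), bdev_physical_block_size() and bdev_alignment_offset() would report, and WRITE_ALIGN is an assumed stand-in for DM_BUFIO_WRITE_ALIGN. The multiple-of test is written as a mask against pbs - 1, which assumes a power-of-two physical block size. Unlike the snippet above, the sketch only ever raises the alignment, which is an editorial choice rather than something stated in the thread.

```c
#include <stdio.h>

/* Assumed stand-in for the kernel's DM_BUFIO_WRITE_ALIGN. */
#define WRITE_ALIGN 4096u

/* Hypothetical mirror of the queue limits of the underlying device. */
struct dev_limits {
	unsigned int logical_block_size;
	unsigned int physical_block_size;
	int alignment_offset;	/* 0 when the device start is aligned */
};

/*
 * Pick the write alignment once, at client creation time. Start from
 * max(WRITE_ALIGN, logical block size) and move up to the physical
 * block size only when the client block size is a multiple of it and
 * the device itself starts on a physical block boundary.
 */
static unsigned int pick_write_align(unsigned int block_size,
				     const struct dev_limits *lim)
{
	unsigned int pbs = lim->physical_block_size;
	unsigned int align = lim->logical_block_size > WRITE_ALIGN ?
			     lim->logical_block_size : WRITE_ALIGN;

	/* (block_size & (pbs - 1)) == 0 means "multiple of pbs". */
	if ((block_size & (pbs - 1)) == 0 &&
	    lim->alignment_offset == 0 && pbs > align)
		align = pbs;

	return align;
}
```

For example, an 8192-byte client block on a device with 512-byte logical and 8192-byte physical blocks gets an 8192-byte alignment, while a 12288-byte client block on the same device stays at 4096 because it is not a multiple of the physical block size.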
> Mikulas
>
> > > From: Mikulas Patocka <mpatocka@redhat.com>
> > >
> > > There may be devices with logical block size larger than 4k. Fix
> > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > with 8k logical block size created with qemu parameters:
> > >
> > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > >
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Cc: stable@vger.kernel.org
> > >
> > > ---
> > > drivers/md/dm-bufio.c | 9 +++++----
> > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > >
> > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > ===================================================================
> > > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > {
> > > unsigned int n_sectors;
> > > sector_t sector;
> > > - unsigned int offset, end;
> > > + unsigned int offset, end, align;
> > >
> > > b->end_io = end_io;
> > >
> > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > b->c->write_callback(b);
> > > offset = b->write_start;
> > > end = b->write_end;
> > > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > > - end &= -DM_BUFIO_WRITE_ALIGN;
> > > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > > + offset &= -align;
> > > + end += align - 1;
> > > + end &= -align;
> > > if (unlikely(end > b->c->block_size))
> > > end = b->c->block_size;
> > >
> > >
> >
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-11-18 17:45 ` Mikulas Patocka
2025-11-18 20:36 ` Benjamin Marzinski
@ 2025-11-19 5:45 ` Christoph Hellwig
2025-11-19 17:13 ` Mikulas Patocka
1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2025-11-19 5:45 UTC (permalink / raw)
To: Mikulas Patocka
Cc: Benjamin Marzinski, Uladzislau Rezki (Sony), Alasdair Kergon,
DMML, Andrew Morton, Mike Snitzer, Christoph Hellwig, LKML
On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
> I changed the patch below, so that it aligns write bios on
> max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev),
> bdev_physical_block_size(b->c->bdev)); - so that if physical block size is
> greater than logical block size, the writes are aligned so that the device
> doesn't do read-modify-write.
That doesn't make any sense whatsoever. The physical block size must
be >= logical block size, and the block layer enforces that.
* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
2025-11-19 5:45 ` Christoph Hellwig
@ 2025-11-19 17:13 ` Mikulas Patocka
0 siblings, 0 replies; 14+ messages in thread
From: Mikulas Patocka @ 2025-11-19 17:13 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Benjamin Marzinski, Uladzislau Rezki (Sony), Alasdair Kergon,
DMML, Andrew Morton, Mike Snitzer, LKML
On Wed, 19 Nov 2025, Christoph Hellwig wrote:
> On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
> > I changed the patch below, so that it aligns write bios on
> > max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev),
> > bdev_physical_block_size(b->c->bdev)); - so that if physical block size is
> > greater than logical block size, the writes are aligned so that the device
> > doesn't do read-modify-write.
>
> That doesn't make any sense whatsoever. The physical block size must
> be >= logical block size, and the block layer enforces that.
OK, so I changed it to max(DM_BUFIO_WRITE_ALIGN,
bdev_physical_block_size(b->c->bdev))
Mikulas
Thread overview: 14+ messages
2025-10-20 12:33 [PATCH v2] dm-ebs: Mark full buffer dirty even on partial write Uladzislau Rezki (Sony)
2025-10-20 12:48 ` [PATCH] dm-bufio: align write boundary on bdev_logical_block_size Mikulas Patocka
2025-10-28 8:47 ` Uladzislau Rezki
2025-10-28 13:18 ` Uladzislau Rezki
2025-10-29 10:24 ` Mikulas Patocka
2025-10-29 13:06 ` Uladzislau Rezki
2025-11-10 10:26 ` Uladzislau Rezki
2025-11-18 4:00 ` Benjamin Marzinski
2025-11-18 11:15 ` Mikulas Patocka
2025-11-18 12:42 ` Uladzislau Rezki
2025-11-18 17:45 ` Mikulas Patocka
2025-11-18 20:36 ` Benjamin Marzinski
2025-11-19 5:45 ` Christoph Hellwig
2025-11-19 17:13 ` Mikulas Patocka