* [PATCH v2] dm-ebs: Mark full buffer dirty even on partial write
@ 2025-10-20 12:33 Uladzislau Rezki (Sony)
  2025-10-20 12:48 ` [PATCH] dm-bufio: align write boundary on bdev_logical_block_size Mikulas Patocka
  0 siblings, 1 reply; 14+ messages in thread
From: Uladzislau Rezki (Sony) @ 2025-10-20 12:33 UTC
  To: Mikulas Patocka, Alasdair Kergon, DMML
  Cc: Andrew Morton, Mike Snitzer, Christoph Hellwig, LKML,
	Uladzislau Rezki

When performing a read-modify-write (RMW) operation, any modification
to a buffered block must cause the entire buffer to be marked dirty.

Marking only a subrange as dirty is incorrect because the underlying
device block size (ubs) defines the minimum read/write granularity. A
lower device can perform I/O only on regions which are fully aligned
and sized to ubs.

This change ensures that write-back operations always occur in full
ubs-sized chunks, matching the intended emulation semantics of the
EBS target.
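
To illustrate the difference between the two marking calls, here is a
minimal userspace model of per-buffer dirty-range tracking. The struct
and function names are hypothetical; they only mimic the roles of
dm_bufio_mark_partial_buffer_dirty() and dm_bufio_mark_buffer_dirty()
and are not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for a buffered block; NOT the kernel's struct dm_buffer. */
struct buf_model {
	unsigned int block_size;  /* buffer size in bytes (one ubs) */
	unsigned int write_start; /* first dirty byte */
	unsigned int write_end;   /* one past the last dirty byte */
	bool dirty;
};

/* Track only the modified sub-range, merging it with any earlier dirty range. */
static void mark_partial_dirty(struct buf_model *b,
			       unsigned int start, unsigned int end)
{
	assert(start <= end && end <= b->block_size);

	if (!b->dirty) {
		b->write_start = start;
		b->write_end = end;
		b->dirty = true;
		return;
	}
	if (start < b->write_start)
		b->write_start = start;
	if (end > b->write_end)
		b->write_end = end;
}

/* Mark the whole buffer dirty so that write-back covers a full ubs. */
static void mark_full_dirty(struct buf_model *b)
{
	b->write_start = 0;
	b->write_end = b->block_size;
	b->dirty = true;
}
```

In this model, a 4K write at offset 4096 into an 8K buffer leaves a 4K
dirty range under partial marking, which is smaller than ubs. Marking
the full buffer always yields the range 0..8192.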

As for the user-space-visible impact: a device that accepts only
ubs-sized, ubs-aligned I/O rejects sub-ubs and misaligned requests,
which can lead to data loss. Example:

1) Create an 8K NVMe device in QEMU by adding

-device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192

2) Set up dm-ebs to emulate a 512B-to-8K mapping

urezki@pc638:~/bin$ cat dmsetup.sh

lower=/dev/nvme0n1
len=$(blockdev --getsz "$lower")

echo "0 $len ebs $lower 0 1 16" | dmsetup create nvme-8k
urezki@pc638:~/bin$

Here offset=0, ebs=1 and ubs=16 (in 512-byte sectors).
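
The sector arithmetic behind these table parameters can be sketched as
follows. This is only an illustration; the helper names are made up for
this example and the 512-byte sector size is the usual block-layer unit.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u /* block-layer sector size in bytes */

/* Convert a count of 512-byte sectors to bytes. */
static inline uint64_t sectors_to_bytes(uint64_t sectors)
{
	return sectors * SECTOR_SIZE;
}

/* Number of emulated blocks that share one underlying block (hypothetical helper). */
static inline uint64_t ebs_per_ubs(uint64_t ebs_sectors, uint64_t ubs_sectors)
{
	assert(ebs_sectors != 0 && ubs_sectors % ebs_sectors == 0);
	return ubs_sectors / ebs_sectors;
}
```

With ebs=1 and ubs=16, the emulated block is 512 bytes, the underlying
block is 8192 bytes, and sixteen emulated blocks map onto each
underlying block.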

3) Create an ext4 filesystem (default 4K block size)

urezki@pc638:~/bin$ sudo mkfs.ext4 -F /dev/dm-0
mke2fs 1.47.0 (5-Feb-2023)
Discarding device blocks: done
Creating filesystem with 2072576 4k blocks and 518144 inodes
Filesystem UUID: bd0b6ca6-0506-4e31-86da-8d22c9d50b63
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: mkfs.ext4: Input/output error while writing out and closing file system
urezki@pc638:~/bin$ dmesg

<snip>
[ 1618.875449] buffer_io_error: 1028 callbacks suppressed
[ 1618.875456] Buffer I/O error on dev dm-0, logical block 0, lost async page write
[ 1618.875527] Buffer I/O error on dev dm-0, logical block 1, lost async page write
[ 1618.875602] Buffer I/O error on dev dm-0, logical block 2, lost async page write
[ 1618.875620] Buffer I/O error on dev dm-0, logical block 3, lost async page write
[ 1618.875639] Buffer I/O error on dev dm-0, logical block 4, lost async page write
[ 1618.894316] Buffer I/O error on dev dm-0, logical block 5, lost async page write
[ 1618.894358] Buffer I/O error on dev dm-0, logical block 6, lost async page write
[ 1618.894380] Buffer I/O error on dev dm-0, logical block 7, lost async page write
[ 1618.894405] Buffer I/O error on dev dm-0, logical block 8, lost async page write
[ 1618.894427] Buffer I/O error on dev dm-0, logical block 9, lost async page write
<snip>

There are many I/O errors because the lower 8K device rejects
sub-ubs/misaligned requests.

With the patch applied:

urezki@pc638:~/bin$ sudo mkfs.ext4 -F /dev/dm-0
mke2fs 1.47.0 (5-Feb-2023)
Discarding device blocks: done
Creating filesystem with 2072576 4k blocks and 518144 inodes
Filesystem UUID: 9b54f44f-ef55-4bd4-9e40-c8b775a616ac
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

urezki@pc638:~/bin$ sudo mount /dev/dm-0 /mnt/
urezki@pc638:~/bin$ ls -al /mnt/
total 24
drwxr-xr-x  3 root root  4096 Oct 17 15:13 .
drwxr-xr-x 19 root root  4096 Jul 10 19:42 ..
drwx------  2 root root 16384 Oct 17 15:13 lost+found
urezki@pc638:~/bin$

After this change: mkfs completes; mount succeeds.

v1 -> v2:
 - describe the user-space-visible impact in the commit message.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 drivers/md/dm-ebs-target.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm-ebs-target.c b/drivers/md/dm-ebs-target.c
index 6abb31ca9662..b354e74a670e 100644
--- a/drivers/md/dm-ebs-target.c
+++ b/drivers/md/dm-ebs-target.c
@@ -103,7 +103,7 @@ static int __ebs_rw_bvec(struct ebs_c *ec, enum req_op op, struct bio_vec *bv,
 			} else {
 				flush_dcache_page(bv->bv_page);
 				memcpy(ba, pa, cur_len);
-				dm_bufio_mark_partial_buffer_dirty(b, buf_off, buf_off + cur_len);
+				dm_bufio_mark_buffer_dirty(b);
 			}
 
 			dm_bufio_release(b);
-- 
2.47.3



* [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-10-20 12:33 [PATCH v2] dm-ebs: Mark full buffer dirty even on partial write Uladzislau Rezki (Sony)
@ 2025-10-20 12:48 ` Mikulas Patocka
  2025-10-28  8:47   ` Uladzislau Rezki
  2025-11-18  4:00   ` Benjamin Marzinski
  0 siblings, 2 replies; 14+ messages in thread
From: Mikulas Patocka @ 2025-10-20 12:48 UTC
  To: Uladzislau Rezki (Sony)
  Cc: Alasdair Kergon, DMML, Andrew Morton, Mike Snitzer,
	Christoph Hellwig, LKML



On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:

> When performing a read-modify-write(RMW) operation, any modification
> to a buffered block must cause the entire buffer to be marked dirty.
> 
> Marking only a subrange as dirty is incorrect because the underlying
> device block size(ubs) defines the minimum read/write granularity. A
> lower device can perform I/O only on regions which are fully aligned
> and sized to ubs.

Hi

I think it would be better to fix this in dm-bufio, so that other dm-bufio 
users would also benefit from the fix. Please try this patch - does it fix 
it?

Mikulas



From: Mikulas Patocka <mpatocka@redhat.com>

There may be devices with a logical block size larger than 4k. Fix
dm-bufio so that it aligns I/O on the logical block size. This commit
fixes I/O errors on the dm-ebs target on top of an emulated NVMe device
with an 8k logical block size, created with the qemu parameters:

-device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/md/dm-bufio.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
+++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
@@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
 {
 	unsigned int n_sectors;
 	sector_t sector;
-	unsigned int offset, end;
+	unsigned int offset, end, align;
 
 	b->end_io = end_io;
 
@@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
 			b->c->write_callback(b);
 		offset = b->write_start;
 		end = b->write_end;
-		offset &= -DM_BUFIO_WRITE_ALIGN;
-		end += DM_BUFIO_WRITE_ALIGN - 1;
-		end &= -DM_BUFIO_WRITE_ALIGN;
+		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
+		offset &= -align;
+		end += align - 1;
+		end &= -align;
 		if (unlikely(end > b->c->block_size))
 			end = b->c->block_size;
 

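The rounding done by the patched lines can be tried out in userspace
with the sketch below. It assumes DM_BUFIO_WRITE_ALIGN is 4096 and that
the alignment is a power of two; expand_write_range() and struct
write_range are hypothetical names for this illustration, not kernel
symbols.

```c
#include <assert.h>

#define WRITE_ALIGN 4096u /* assumed value of DM_BUFIO_WRITE_ALIGN */

struct write_range {
	unsigned int offset; /* first byte to write */
	unsigned int end;    /* one past the last byte to write */
};

/*
 * Expand the dirty range [start, end) to alignment boundaries, where the
 * alignment is the larger of WRITE_ALIGN and the logical block size (lbs),
 * then clamp the end to the buffer size.
 */
static struct write_range expand_write_range(unsigned int start, unsigned int end,
					     unsigned int lbs, unsigned int block_size)
{
	unsigned int align = lbs > WRITE_ALIGN ? lbs : WRITE_ALIGN;
	struct write_range r;

	/* The "& -align" mask trick is only valid for a power-of-two alignment. */
	assert(align != 0 && (align & (align - 1)) == 0);

	r.offset = start & -align;           /* round down */
	r.end = (end + align - 1) & -align;  /* round up */
	if (r.end > block_size)
		r.end = block_size;
	return r;
}
```

For a dirty range of 4096..8192 in an 8K buffer, a 512-byte logical
block size leaves the range unchanged, while an 8192-byte logical block
size expands it to 0..8192.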


* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-10-20 12:48 ` [PATCH] dm-bufio: align write boundary on bdev_logical_block_size Mikulas Patocka
@ 2025-10-28  8:47   ` Uladzislau Rezki
  2025-10-28 13:18     ` Uladzislau Rezki
  2025-11-18  4:00   ` Benjamin Marzinski
  1 sibling, 1 reply; 14+ messages in thread
From: Uladzislau Rezki @ 2025-10-28  8:47 UTC
  To: Mikulas Patocka
  Cc: Uladzislau Rezki (Sony), Alasdair Kergon, DMML, Andrew Morton,
	Mike Snitzer, Christoph Hellwig, LKML

Hello!

Sorry, I missed your email for some reason. It is probably because
you replied with a different subject than the one I sent initially.

> 
> On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> 
> > When performing a read-modify-write(RMW) operation, any modification
> > to a buffered block must cause the entire buffer to be marked dirty.
> > 
> > Marking only a subrange as dirty is incorrect because the underlying
> > device block size(ubs) defines the minimum read/write granularity. A
> > lower device can perform I/O only on regions which are fully aligned
> > and sized to ubs.
> 
> Hi
> 
> I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> users would also benefit from the fix. Please try this patch - does it fix 
> it?
> 
If it solves what I describe, I do not mind :)

> 
> 
> From: Mikulas Patocka <mpatocka@redhat.com>
> 
> There may be devices with logical block size larger than 4k. Fix
> dm-bufio, so that it will align I/O on logical block size. This commit
> fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> with 8k logical block size created with qemu parameters:
> 
> -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
> 
> ---
>  drivers/md/dm-bufio.c |    9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> Index: linux-2.6/drivers/md/dm-bufio.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
>  {
>  	unsigned int n_sectors;
>  	sector_t sector;
> -	unsigned int offset, end;
> +	unsigned int offset, end, align;
>  
>  	b->end_io = end_io;
>  
> @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
>  			b->c->write_callback(b);
>  		offset = b->write_start;
>  		end = b->write_end;
> -		offset &= -DM_BUFIO_WRITE_ALIGN;
> -		end += DM_BUFIO_WRITE_ALIGN - 1;
> -		end &= -DM_BUFIO_WRITE_ALIGN;
> +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> +		offset &= -align;
> +		end += align - 1;
> +		end &= -align;
>  		if (unlikely(end > b->c->block_size))
>  			end = b->c->block_size;
>  
> 
I will check it and get back soon.

Thank you.

--
Uladzislau Rezki


* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-10-28  8:47   ` Uladzislau Rezki
@ 2025-10-28 13:18     ` Uladzislau Rezki
  2025-10-29 10:24       ` Mikulas Patocka
  0 siblings, 1 reply; 14+ messages in thread
From: Uladzislau Rezki @ 2025-10-28 13:18 UTC
  To: Mikulas Patocka
  Cc: Mikulas Patocka, Alasdair Kergon, DMML, Andrew Morton,
	Mike Snitzer, Christoph Hellwig, LKML

On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> Hello!
> 
> Sorry i have missed you email for unknown reason to me. It is
> probably because you answered to email with different subject
> i sent initially.
> 
> > 
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > 
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > > 
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> > 
> > Hi
> > 
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > users would also benefit from the fix. Please try this patch - does it fix 
> > it?
> > 
> If it solves what i describe i do not mind :)
> 
> > 
> > 
> > From: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> > 
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> > 
> > ---
> >  drivers/md/dm-bufio.c |    9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> >  {
> >  	unsigned int n_sectors;
> >  	sector_t sector;
> > -	unsigned int offset, end;
> > +	unsigned int offset, end, align;
> >  
> >  	b->end_io = end_io;
> >  
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> >  			b->c->write_callback(b);
> >  		offset = b->write_start;
> >  		end = b->write_end;
> > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
>
Should it be the physical_block_size of the device? It is the minimum
I/O the device can perform. The point is that a user sets the "ubs" size,
which should correspond to the smallest I/O the device can physically write.

--
Uladzislau Rezki


* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-10-28 13:18     ` Uladzislau Rezki
@ 2025-10-29 10:24       ` Mikulas Patocka
  2025-10-29 13:06         ` Uladzislau Rezki
  0 siblings, 1 reply; 14+ messages in thread
From: Mikulas Patocka @ 2025-10-29 10:24 UTC
  To: Uladzislau Rezki
  Cc: Alasdair Kergon, DMML, Andrew Morton, Mike Snitzer,
	Christoph Hellwig, LKML



On Tue, 28 Oct 2025, Uladzislau Rezki wrote:

> On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > Hello!
> > 
> > Sorry i have missed you email for unknown reason to me. It is
> > probably because you answered to email with different subject
> > i sent initially.
> > 
> > > 
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > 
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > 
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > > 
> > > Hi
> > > 
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > > users would also benefit from the fix. Please try this patch - does it fix 
> > > it?
> > > 
> > If it solves what i describe i do not mind :)
> > 
> > > 
> > > 
> > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > 
> > > There may be devices with logical block size larger than 4k. Fix
> > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > with 8k logical block size created with qemu parameters:
> > > 
> > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > 
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Cc: stable@vger.kernel.org
> > > 
> > > ---
> > >  drivers/md/dm-bufio.c |    9 +++++----
> > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > 
> > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > ===================================================================
> > > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > >  {
> > >  	unsigned int n_sectors;
> > >  	sector_t sector;
> > > -	unsigned int offset, end;
> > > +	unsigned int offset, end, align;
> > >  
> > >  	b->end_io = end_io;
> > >  
> > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > >  			b->c->write_callback(b);
> > >  		offset = b->write_start;
> > >  		end = b->write_end;
> > > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> >
> Should it be physical_block_size of device? It is a min_io the device
> can perform. The point is, a user sets "ubs" size which should correspond
> to the smallest I/O the device can write, i.e. physically.

physical_block_size is unreliable - some SSDs report a physical block size
of 512 bytes, some 4k. Regardless of what they report, all current SSDs have
a 4k sector size internally and they perform a slow read-modify-write cycle
on requests that are not aligned on a 4k boundary.
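
For anyone who wants to compare the two values a device reports, a
userspace sketch using the BLKSSZGET and BLKPBSZGET ioctls might look
like this. query_block_sizes() is a hypothetical helper written for this
illustration; the caller is assumed to pass a file descriptor opened on
a block device.

```c
#include <assert.h>
#include <linux/fs.h>  /* BLKSSZGET, BLKPBSZGET */
#include <sys/ioctl.h>

/*
 * Read the logical and physical block sizes reported for an open block
 * device. Returns 0 on success, or -1 if either ioctl fails (for example
 * when fd does not refer to a block device).
 */
static int query_block_sizes(int fd, unsigned int *logical,
			     unsigned int *physical)
{
	int lbs = 0;          /* BLKSSZGET stores an int */
	unsigned int pbs = 0; /* BLKPBSZGET stores an unsigned int */

	assert(logical && physical);

	if (ioctl(fd, BLKSSZGET, &lbs) < 0)
		return -1;
	if (ioctl(fd, BLKPBSZGET, &pbs) < 0)
		return -1;

	*logical = (unsigned int)lbs;
	*physical = pbs;
	return 0;
}
```

The same values are also exposed in sysfs under the device's queue
directory as logical_block_size and physical_block_size.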

Mikulas

> --
> Uladzislau Rezki
> 



* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-10-29 10:24       ` Mikulas Patocka
@ 2025-10-29 13:06         ` Uladzislau Rezki
  2025-11-10 10:26           ` Uladzislau Rezki
  0 siblings, 1 reply; 14+ messages in thread
From: Uladzislau Rezki @ 2025-10-29 13:06 UTC
  To: Mikulas Patocka
  Cc: Uladzislau Rezki, Alasdair Kergon, DMML, Andrew Morton,
	Mike Snitzer, Christoph Hellwig, LKML

On Wed, Oct 29, 2025 at 11:24:25AM +0100, Mikulas Patocka wrote:
> 
> 
> On Tue, 28 Oct 2025, Uladzislau Rezki wrote:
> 
> > On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > > Hello!
> > > 
> > > Sorry i have missed you email for unknown reason to me. It is
> > > probably because you answered to email with different subject
> > > i sent initially.
> > > 
> > > > 
> > > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > > 
> > > > > When performing a read-modify-write(RMW) operation, any modification
> > > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > > 
> > > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > > lower device can perform I/O only on regions which are fully aligned
> > > > > and sized to ubs.
> > > > 
> > > > Hi
> > > > 
> > > > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > > > users would also benefit from the fix. Please try this patch - does it fix 
> > > > it?
> > > > 
> > > If it solves what i describe i do not mind :)
> > > 
> > > > 
> > > > 
> > > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > > 
> > > > There may be devices with logical block size larger than 4k. Fix
> > > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > > with 8k logical block size created with qemu parameters:
> > > > 
> > > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > > 
> > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > Cc: stable@vger.kernel.org
> > > > 
> > > > ---
> > > >  drivers/md/dm-bufio.c |    9 +++++----
> > > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > > 
> > > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > > > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > >  {
> > > >  	unsigned int n_sectors;
> > > >  	sector_t sector;
> > > > -	unsigned int offset, end;
> > > > +	unsigned int offset, end, align;
> > > >  
> > > >  	b->end_io = end_io;
> > > >  
> > > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > >  			b->c->write_callback(b);
> > > >  		offset = b->write_start;
> > > >  		end = b->write_end;
> > > > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > > > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > > > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > > > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > >
> > Should it be physical_block_size of device? It is a min_io the device
> > can perform. The point is, a user sets "ubs" size which should correspond
> > to the smallest I/O the device can write, i.e. physically.
> 
> physical_block_size is unreliable - some SSDs report physical block size 
> 512 bytes, some 4k. Regardless of what they report, all current SSDs have 
> 4k sector size internally and they do slow read-modify-write cycle on 
> requests that are not aligned on 4k boundary.
> 
I see. Some NVMe devices have buggy firmware, which is why we have a lot
of quirk flags. I agree it is a mess.
 
Your change does not help my project and use case. I posted the patch to
fix dm-ebs because the code writes back a partial range instead of the full
ubs size, which is what the user actually asks for. When a target is created,
the physical_block_size corresponds to ubs.
 
I would really appreciate it if you took the fix I posted. Your patch can
be sent out separately.
 
Does that work for you?
 
Thank you!
 
--
Uladzislau Rezki



* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-10-29 13:06         ` Uladzislau Rezki
@ 2025-11-10 10:26           ` Uladzislau Rezki
  0 siblings, 0 replies; 14+ messages in thread
From: Uladzislau Rezki @ 2025-11-10 10:26 UTC
  To: Mikulas Patocka
  Cc: Mikulas Patocka, Alasdair Kergon, DMML, Andrew Morton,
	Mike Snitzer, Christoph Hellwig, LKML

On Wed, Oct 29, 2025 at 02:06:31PM +0100, Uladzislau Rezki wrote:
> On Wed, Oct 29, 2025 at 11:24:25AM +0100, Mikulas Patocka wrote:
> > 
> > 
> > On Tue, 28 Oct 2025, Uladzislau Rezki wrote:
> > 
> > > On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > > > Hello!
> > > > 
> > > > Sorry i have missed you email for unknown reason to me. It is
> > > > probably because you answered to email with different subject
> > > > i sent initially.
> > > > 
> > > > > 
> > > > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > > > 
> > > > > > When performing a read-modify-write(RMW) operation, any modification
> > > > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > > > 
> > > > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > > > lower device can perform I/O only on regions which are fully aligned
> > > > > > and sized to ubs.
> > > > > 
> > > > > Hi
> > > > > 
> > > > > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > > > > users would also benefit from the fix. Please try this patch - does it fix 
> > > > > it?
> > > > > 
> > > > If it solves what i describe i do not mind :)
> > > > 
> > > > > 
> > > > > 
> > > > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > > > 
> > > > > There may be devices with logical block size larger than 4k. Fix
> > > > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > > > with 8k logical block size created with qemu parameters:
> > > > > 
> > > > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > > > 
> > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > > Cc: stable@vger.kernel.org
> > > > > 
> > > > > ---
> > > > >  drivers/md/dm-bufio.c |    9 +++++----
> > > > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > > > 
> > > > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > > > ===================================================================
> > > > > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > > > > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > > > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > > >  {
> > > > >  	unsigned int n_sectors;
> > > > >  	sector_t sector;
> > > > > -	unsigned int offset, end;
> > > > > +	unsigned int offset, end, align;
> > > > >  
> > > > >  	b->end_io = end_io;
> > > > >  
> > > > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > > >  			b->c->write_callback(b);
> > > > >  		offset = b->write_start;
> > > > >  		end = b->write_end;
> > > > > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > > > > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > > > > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > > > > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > > >
> > > Should it be physical_block_size of device? It is a min_io the device
> > > can perform. The point is, a user sets "ubs" size which should correspond
> > > to the smallest I/O the device can write, i.e. physically.
> > 
> > physical_block_size is unreliable - some SSDs report physical block size 
> > 512 bytes, some 4k. Regardless of what they report, all current SSDs have 
> > 4k sector size internally and they do slow read-modify-write cycle on 
> > requests that are not aligned on 4k boundary.
> > 
> I see. Some NVMe devices have buggy firmware, which is why we have a lot
> of quirk flags. I agree it is a mess.
>  
> The change does not help my project and case. I posted the patch to fix
> the dm-ebs as the code offloads partial size instead of ubs size, what
> actually a user asking for. When a target is created, the physical_block_size
> corresponds to ubs.
>  
> I really appreciate if you take the fix i posted. Your patch can be
> sent out separately.
>  
> Does it work for you?
>  
Any feedback or comments on it?

--
Uladzislau Rezki


* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-10-20 12:48 ` [PATCH] dm-bufio: align write boundary on bdev_logical_block_size Mikulas Patocka
  2025-10-28  8:47   ` Uladzislau Rezki
@ 2025-11-18  4:00   ` Benjamin Marzinski
  2025-11-18 11:15     ` Mikulas Patocka
  2025-11-18 17:45     ` Mikulas Patocka
  1 sibling, 2 replies; 14+ messages in thread
From: Benjamin Marzinski @ 2025-11-18  4:00 UTC
  To: Mikulas Patocka
  Cc: Uladzislau Rezki (Sony), Alasdair Kergon, DMML, Andrew Morton,
	Mike Snitzer, Christoph Hellwig, LKML

On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> 
> 
> On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> 
> > When performing a read-modify-write(RMW) operation, any modification
> > to a buffered block must cause the entire buffer to be marked dirty.
> > 
> > Marking only a subrange as dirty is incorrect because the underlying
> > device block size(ubs) defines the minimum read/write granularity. A
> > lower device can perform I/O only on regions which are fully aligned
> > and sized to ubs.
> 
> Hi
> 
> I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> users would also benefit from the fix.

This looks to me like it should accomplish the same thing as
Uladzislau's patch. But I think there could still be problems with other
dm-bufio users, for devices where the blocksize is larger than 4k.

In dm_bufio_client_create() I think we want to make sure that block_size
is a multiple of bdev_logical_block_size(bdev), instead of 512b.
Otherwise block_to_sector() can return sectors that are not addressable
on the device. Unfortunately, I don't think all users of dm-bufio will
pass in block_sizes that are larger than 4k (uds_make_bufio() in
dm-vdo/indexer/io-factory.c for instance).
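
A rough userspace model of that concern, assuming block_to_sector()
essentially multiplies the block number by the block size in 512-byte
sectors (ignoring any start offset). All function names below are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors */

/* Simplified model of mapping a buffer block number to a start sector. */
static uint64_t block_to_sector_model(uint64_t block, unsigned int block_size)
{
	assert(block_size >= (1u << SECTOR_SHIFT));
	return block * (block_size >> SECTOR_SHIFT);
}

/* A sector is addressable only if it starts on a logical block boundary. */
static bool sector_is_addressable(uint64_t sector, unsigned int lbs)
{
	assert(lbs != 0);
	return ((sector << SECTOR_SHIFT) % lbs) == 0;
}

/* The proposed check: block_size must be a multiple of the logical block size. */
static bool block_size_is_valid(unsigned int block_size, unsigned int lbs)
{
	assert(lbs != 0);
	return block_size != 0 && block_size % lbs == 0;
}
```

With block_size 4096 on a device with an 8192-byte logical block size,
block 1 maps to sector 8 (byte offset 4096), which does not start on a
logical block boundary of that device.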

-Ben

> Please try this patch - does it fix it?
> 
> Mikulas
> 
> 
> 
> From: Mikulas Patocka <mpatocka@redhat.com>
> 
> There may be devices with logical block size larger than 4k. Fix
> dm-bufio, so that it will align I/O on logical block size. This commit
> fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> with 8k logical block size created with qemu parameters:
> 
> -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
> 
> ---
>  drivers/md/dm-bufio.c |    9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> Index: linux-2.6/drivers/md/dm-bufio.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
>  {
>  	unsigned int n_sectors;
>  	sector_t sector;
> -	unsigned int offset, end;
> +	unsigned int offset, end, align;
>  
>  	b->end_io = end_io;
>  
> @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
>  			b->c->write_callback(b);
>  		offset = b->write_start;
>  		end = b->write_end;
> -		offset &= -DM_BUFIO_WRITE_ALIGN;
> -		end += DM_BUFIO_WRITE_ALIGN - 1;
> -		end &= -DM_BUFIO_WRITE_ALIGN;
> +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> +		offset &= -align;
> +		end += align - 1;
> +		end &= -align;
>  		if (unlikely(end > b->c->block_size))
>  			end = b->c->block_size;
>  
> 



* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-11-18  4:00   ` Benjamin Marzinski
@ 2025-11-18 11:15     ` Mikulas Patocka
  2025-11-18 12:42       ` Uladzislau Rezki
  2025-11-18 17:45     ` Mikulas Patocka
  1 sibling, 1 reply; 14+ messages in thread
From: Mikulas Patocka @ 2025-11-18 11:15 UTC
  To: Benjamin Marzinski
  Cc: Uladzislau Rezki (Sony), Alasdair Kergon, DMML, Andrew Morton,
	Mike Snitzer, Christoph Hellwig, LKML



On Mon, 17 Nov 2025, Benjamin Marzinski wrote:

> On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > 
> > 
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > 
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > > 
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> > 
> > Hi
> > 
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > users would also benefit from the fix.
> 
> This looks to me like it should accomplish the same thing as
> Uladzislau's patch. But I think there could still be problems with other
> dm-bufio users, for devices where the blocksize is larger than 4k.

Yes, but Uladzislau said that this patch doesn't work for him. So, I 
suspect that he has "logical_block_size" set incorrectly.

Mikulas

> In dm_bufio_client_create() I think we want to make sure that block_size
> is a multiple of bdev_logical_block_size(bdev), instead of 512b.
> Otherwise block_to_sector() can return sectors that are not addressable
> on the device. Unfortunately, I don't think all users of dm-bufio will
> pass in block_sizes that are larger than 4k (uds_make_bufio() in
> dm-vdo/indexer/io-factory.c for instance).
> 
> -Ben
> 
> > Please try this patch - does it fix it?
> > 
> > Mikulas
> > 
> > 
> > 
> > From: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> > 
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> > 
> > ---
> >  drivers/md/dm-bufio.c |    9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> >  {
> >  	unsigned int n_sectors;
> >  	sector_t sector;
> > -	unsigned int offset, end;
> > +	unsigned int offset, end, align;
> >  
> >  	b->end_io = end_io;
> >  
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> >  			b->c->write_callback(b);
> >  		offset = b->write_start;
> >  		end = b->write_end;
> > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > +		offset &= -align;
> > +		end += align - 1;
> > +		end &= -align;
> >  		if (unlikely(end > b->c->block_size))
> >  			end = b->c->block_size;
> >  
> > 
> 
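
The rounding done by the quoted hunk can be tried out in user space. The following is only a minimal sketch: WRITE_ALIGN stands in for DM_BUFIO_WRITE_ALIGN and is assumed to be 4096, and struct range and align_write_range() are hypothetical names that do not exist in dm-bufio. It widens a dirty byte range to the larger of WRITE_ALIGN and the logical block size, then clamps it to the buffer size, as submit_io() does in the patch.

```c
#include <assert.h>

/* Assumed stand-in for DM_BUFIO_WRITE_ALIGN; the value 4096 is an assumption. */
#define WRITE_ALIGN 4096u

/* Hypothetical helper type: byte range [offset, end) inside one buffer. */
struct range {
	unsigned int offset;
	unsigned int end;
};

/*
 * Widen the dirty range [start, end) to multiples of
 * max(WRITE_ALIGN, logical_block_size), clamped to block_size.
 * Both alignment values must be powers of two for the mask trick to work.
 */
struct range align_write_range(unsigned int start, unsigned int end,
			       unsigned int logical_block_size,
			       unsigned int block_size)
{
	unsigned int align = logical_block_size > WRITE_ALIGN ?
			     logical_block_size : WRITE_ALIGN;
	struct range r;

	assert(align != 0 && (align & (align - 1)) == 0);

	r.offset = start & -align;		/* round down */
	r.end = (end + align - 1) & -align;	/* round up */
	if (r.end > block_size)
		r.end = block_size;
	return r;
}
```

With an 8192-byte logical block size, a dirty range of bytes 512..1024 in an 8192-byte buffer becomes the whole buffer; with a 512-byte logical block size the same range is only widened to the first 4096 bytes.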



* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-11-18 11:15     ` Mikulas Patocka
@ 2025-11-18 12:42       ` Uladzislau Rezki
  0 siblings, 0 replies; 14+ messages in thread
From: Uladzislau Rezki @ 2025-11-18 12:42 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Benjamin Marzinski, Uladzislau Rezki (Sony), Alasdair Kergon,
	DMML, Andrew Morton, Mike Snitzer, Christoph Hellwig, LKML

On Tue, Nov 18, 2025 at 12:15:43PM +0100, Mikulas Patocka wrote:
> 
> 
> On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
> 
> > On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > 
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > 
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > > 
> > > Hi
> > > 
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > > users would also benefit from the fix.
> > 
> > This looks to me like it should accomplish the same thing as
> > Uladzislau's patch. But I think there could still be problems with other
> > dm-bufio users, for devices where the blocksize is larger than 4k.
> 
> Yes, but Uladzislau said that this patch doesn't work for him. So, I 
> suspect that he has "logical_block_size" set incorrectly.
> 
Indeed, because the logical block size is smaller than the physical one in
my case. Your change does not fix it because the I/O size is equal to the
physical block size.

--
Uladzislau Rezki
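
To illustrate why aligning on the logical block size is not enough in that situation, here is a minimal user-space sketch. The sizes in the usage note (512-byte logical, 8192-byte physical) are assumed for illustration and are not taken from the report; WRITE_ALIGN stands in for DM_BUFIO_WRITE_ALIGN (assumed 4096) and write_is_physically_aligned() is a hypothetical helper.

```c
#include <stdbool.h>

/* Assumed stand-in for DM_BUFIO_WRITE_ALIGN; the value 4096 is an assumption. */
#define WRITE_ALIGN 4096u

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Round the dirty range [start, end) to max(WRITE_ALIGN, align_block_size)
 * and report whether the resulting write starts on a physical block
 * boundary and covers a whole number of physical blocks.
 */
bool write_is_physically_aligned(unsigned int start, unsigned int end,
				 unsigned int align_block_size,
				 unsigned int physical_block_size,
				 unsigned int block_size)
{
	unsigned int align = max_u(WRITE_ALIGN, align_block_size);
	unsigned int offset = start & -align;
	unsigned int stop = (end + align - 1) & -align;

	if (stop > block_size)
		stop = block_size;

	return offset % physical_block_size == 0 &&
	       (stop - offset) % physical_block_size == 0;
}
```

For a dirty range of bytes 4096..4608 in an 8192-byte buffer, aligning on a 512-byte logical block size yields a 4096-byte write at offset 4096, which is not a whole 8192-byte physical block; aligning on the physical block size yields the full buffer.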


* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-11-18  4:00   ` Benjamin Marzinski
  2025-11-18 11:15     ` Mikulas Patocka
@ 2025-11-18 17:45     ` Mikulas Patocka
  2025-11-18 20:36       ` Benjamin Marzinski
  2025-11-19  5:45       ` Christoph Hellwig
  1 sibling, 2 replies; 14+ messages in thread
From: Mikulas Patocka @ 2025-11-18 17:45 UTC (permalink / raw)
  To: Benjamin Marzinski
  Cc: Uladzislau Rezki (Sony), Alasdair Kergon, DMML, Andrew Morton,
	Mike Snitzer, Christoph Hellwig, LKML



On Mon, 17 Nov 2025, Benjamin Marzinski wrote:

> On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > 
> > 
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > 
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > > 
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> > 
> > Hi
> > 
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > users would also benefit from the fix.
> 
> This looks to me like it should accomplish the same thing as
> Uladzislau's patch. But I think there could still be problems with other
> dm-bufio users, for devices where the blocksize is larger than 4k.
> 
> In dm_bufio_client_create() I think we want to make sure that block_size
> is a multiple of bdev_logical_block_size(bdev), instead of 512b.

I could add WARN_ON(block_size < bdev_logical_block_size(bdev)) to 
dm_bufio_client_create(). But I think it's too late in this development 
cycle; I would add it after the next merge window closes, when I open a 
new patch series for kernel 6.20 (or 7.0).

> Otherwise block_to_sector() can return sectors that are not addressable
> on the device. Unfortunately, I don't think all users of dm-bufio will
> pass in block_sizes that are larger than 4k (uds_make_bufio() in
> dm-vdo/indexer/io-factory.c for instance).
> 
> -Ben
> 
> > Please try this patch - does it fix it?
> > 
> > Mikulas

I changed the patch below, so that it aligns write bios on 
max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev), 
bdev_physical_block_size(b->c->bdev)); - so that if physical block size is 
greater than logical block size, the writes are aligned so that the device 
doesn't do read-modify-write.

Mikulas

> > From: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> > 
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> > 
> > ---
> >  drivers/md/dm-bufio.c |    9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> >  {
> >  	unsigned int n_sectors;
> >  	sector_t sector;
> > -	unsigned int offset, end;
> > +	unsigned int offset, end, align;
> >  
> >  	b->end_io = end_io;
> >  
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> >  			b->c->write_callback(b);
> >  		offset = b->write_start;
> >  		end = b->write_end;
> > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > +		offset &= -align;
> > +		end += align - 1;
> > +		end &= -align;
> >  		if (unlikely(end > b->c->block_size))
> >  			end = b->c->block_size;
> >  
> > 
> 
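
The addressability concern about block_to_sector() can be shown with plain arithmetic. This is a simplified user-space sketch, not the dm-bufio code: block_to_sector_sketch() ignores the client's start offset, and sector_is_addressable() and block_size_is_valid() are hypothetical helpers. The last one checks the stricter "multiple of the logical block size" condition rather than only "not smaller than".

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors */

/* Simplified mapping of a block number to its first 512-byte sector. */
uint64_t block_to_sector_sketch(uint64_t block, unsigned int block_size)
{
	return block * (block_size >> SECTOR_SHIFT);
}

/* A sector is addressable only if it starts on a logical block boundary. */
bool sector_is_addressable(uint64_t sector, unsigned int logical_block_size)
{
	return sector % (logical_block_size >> SECTOR_SHIFT) == 0;
}

/* Hypothetical creation-time check: block_size is a nonzero multiple. */
bool block_size_is_valid(unsigned int block_size,
			 unsigned int logical_block_size)
{
	return block_size >= logical_block_size &&
	       block_size % logical_block_size == 0;
}
```

With a 4096-byte client block size on a device with an 8192-byte logical block size, block 1 maps to sector 8, which does not fall on a 16-sector boundary, so the device cannot address it.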



* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-11-18 17:45     ` Mikulas Patocka
@ 2025-11-18 20:36       ` Benjamin Marzinski
  2025-11-19  5:45       ` Christoph Hellwig
  1 sibling, 0 replies; 14+ messages in thread
From: Benjamin Marzinski @ 2025-11-18 20:36 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Uladzislau Rezki (Sony), Alasdair Kergon, DMML, Andrew Morton,
	Mike Snitzer, Christoph Hellwig, LKML

On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
> 
> 
> On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
> 
> > On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > 
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > 
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > > 
> > > Hi
> > > 
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > > users would also benefit from the fix.
> > 
> > This looks to me like it should accomplish the same thing as
> > Uladzislau's patch. But I think there could still be problems with other
> > dm-bufio users, for devices where the blocksize is larger than 4k.
> > 
> > In dm_bufio_client_create() I think we want to make sure that block_size
> > is a multiple of bdev_logical_block_size(bdev), instead of 512b.
> 
> I could add WARN_ON(block_size < bdev_logical_block_size(bdev)) to 
> dm_bufio_client_create(). But I think it's too late in this development 
> cycle; I would add it after the next merge window closes, when I open a 
> new patch series for kernel 6.20 (or 7.0).
> 
> > Otherwise block_to_sector() can return sectors that are not addressable
> > on the device. Unfortunately, I don't think all users of dm-bufio will
> > pass in block_sizes that are larger than 4k (uds_make_bufio() in
> > dm-vdo/indexer/io-factory.c for instance).
> > 
> > -Ben
> > 
> > > Please try this patch - does it fix it?
> > > 
> > > Mikulas
> 
> I changed the patch below, so that it aligns write bios on 
> max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev), 
> bdev_physical_block_size(b->c->bdev)); - so that if physical block size is 
> greater than logical block size, the writes are aligned so that the device 
> doesn't do read-modify-write.

This will really only help if the bufio client block_size is a multiple
of the underlying device's physical block size, and the device is
aligned to the physical block size. Perhaps we should figure
out the alignment in dm_bufio_client_create(), with something like:

	c->align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(bdev));
	/* use the physical block size only if block_size is a multiple of it */
	if (!(block_size & (bdev_physical_block_size(bdev) - 1)) &&
	    bdev_alignment_offset(bdev) == 0)
		c->align = max(c->align, bdev_physical_block_size(bdev));

I suppose pre-calculating this could cause problems if the underlying
device was another dm device, and it switched tables in a way that
changed its limits. I dunno if we care about that, however.

-Ben 

> Mikulas
> 
> > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > 
> > > There may be devices with logical block size larger than 4k. Fix
> > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > with 8k logical block size created with qemu parameters:
> > > 
> > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > 
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Cc: stable@vger.kernel.org
> > > 
> > > ---
> > >  drivers/md/dm-bufio.c |    9 +++++----
> > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > 
> > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > ===================================================================
> > > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > >  {
> > >  	unsigned int n_sectors;
> > >  	sector_t sector;
> > > -	unsigned int offset, end;
> > > +	unsigned int offset, end, align;
> > >  
> > >  	b->end_io = end_io;
> > >  
> > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > >  			b->c->write_callback(b);
> > >  		offset = b->write_start;
> > >  		end = b->write_end;
> > > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > > +		offset &= -align;
> > > +		end += align - 1;
> > > +		end &= -align;
> > >  		if (unlikely(end > b->c->block_size))
> > >  			end = b->c->block_size;
> > >  
> > > 
> > 
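
The pre-calculation suggested in this message can be modelled in user space as follows. This is a sketch under stated assumptions: WRITE_ALIGN stands in for DM_BUFIO_WRITE_ALIGN (assumed 4096), choose_write_align() is a hypothetical name, the "multiple of the physical block size" condition is written as a modulo test, and keeping WRITE_ALIGN as a lower bound in the physical case is a choice of this sketch. The alignment_offset parameter models the value that bdev_alignment_offset() would report.

```c
/* Assumed stand-in for DM_BUFIO_WRITE_ALIGN; the value 4096 is an assumption. */
#define WRITE_ALIGN 4096u

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Pick the write alignment once, at client creation time.
 * Fall back to the logical block size unless the client block size is a
 * multiple of the physical block size and the device itself is aligned.
 */
unsigned int choose_write_align(unsigned int block_size,
				unsigned int logical_block_size,
				unsigned int physical_block_size,
				int alignment_offset)
{
	unsigned int align = max_u(WRITE_ALIGN, logical_block_size);

	if (block_size % physical_block_size == 0 && alignment_offset == 0)
		align = max_u(align, physical_block_size);

	return align;
}
```

For a 512-byte logical and 8192-byte physical block size, an 8192-byte client block size on an aligned device gets 8192-byte alignment, while a 4096-byte or 12288-byte client block size, or a device with a nonzero alignment offset, stays at 4096. As noted above, a value computed once would go stale if the underlying device later changed its limits.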



* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-11-18 17:45     ` Mikulas Patocka
  2025-11-18 20:36       ` Benjamin Marzinski
@ 2025-11-19  5:45       ` Christoph Hellwig
  2025-11-19 17:13         ` Mikulas Patocka
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2025-11-19  5:45 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Benjamin Marzinski, Uladzislau Rezki (Sony), Alasdair Kergon,
	DMML, Andrew Morton, Mike Snitzer, Christoph Hellwig, LKML

On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
> I changed the patch below, so that it aligns write bios on 
> max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev), 
> bdev_physical_block_size(b->c->bdev)); - so that if physical block size is 
> greater than logical block size, the writes are aligned so that the device 
> doesn't do read-modify-write.

That doesn't make any sense whatsoever.  The physical block size must
be >= logical block size, and the block layer enforces that.



* Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
  2025-11-19  5:45       ` Christoph Hellwig
@ 2025-11-19 17:13         ` Mikulas Patocka
  0 siblings, 0 replies; 14+ messages in thread
From: Mikulas Patocka @ 2025-11-19 17:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Benjamin Marzinski, Uladzislau Rezki (Sony), Alasdair Kergon,
	DMML, Andrew Morton, Mike Snitzer, LKML



On Wed, 19 Nov 2025, Christoph Hellwig wrote:

> On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
> > I changed the patch below, so that it aligns write bios on 
> > max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev), 
> > bdev_physical_block_size(b->c->bdev)); - so that if physical block size is 
> > greater than logical block size, the writes are aligned so that the device 
> > doesn't do read-modify-write.
> 
> That doesn't make any sense whatsoever.  The physical block size must
> be >= logical block size, and the block layer enforces that.

OK, so I changed it to max(DM_BUFIO_WRITE_ALIGN, 
bdev_physical_block_size(b->c->bdev))

Mikulas
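
Why the logical block size can be dropped from the expression: when the physical block size is never smaller than the logical one, the three-way maximum and the two-way maximum give the same result. A minimal user-space sketch, with WRITE_ALIGN standing in for DM_BUFIO_WRITE_ALIGN (assumed 4096) and both function names hypothetical:

```c
/* Assumed stand-in for DM_BUFIO_WRITE_ALIGN; the value 4096 is an assumption. */
#define WRITE_ALIGN 4096u

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Earlier variant: maximum of all three values. */
unsigned int max3_write_align(unsigned int logical_block_size,
			      unsigned int physical_block_size)
{
	return max_u(WRITE_ALIGN,
		     max_u(logical_block_size, physical_block_size));
}

/* Final variant: the logical block size is dropped. */
unsigned int final_write_align(unsigned int physical_block_size)
{
	return max_u(WRITE_ALIGN, physical_block_size);
}
```

For any pair with physical >= logical, for example 512/8192 or 8192/8192, both functions return the same alignment.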



end of thread, other threads:[~2025-11-19 17:14 UTC | newest]

Thread overview: 14+ messages
2025-10-20 12:33 [PATCH v2] dm-ebs: Mark full buffer dirty even on partial write Uladzislau Rezki (Sony)
2025-10-20 12:48 ` [PATCH] dm-bufio: align write boundary on bdev_logical_block_size Mikulas Patocka
2025-10-28  8:47   ` Uladzislau Rezki
2025-10-28 13:18     ` Uladzislau Rezki
2025-10-29 10:24       ` Mikulas Patocka
2025-10-29 13:06         ` Uladzislau Rezki
2025-11-10 10:26           ` Uladzislau Rezki
2025-11-18  4:00   ` Benjamin Marzinski
2025-11-18 11:15     ` Mikulas Patocka
2025-11-18 12:42       ` Uladzislau Rezki
2025-11-18 17:45     ` Mikulas Patocka
2025-11-18 20:36       ` Benjamin Marzinski
2025-11-19  5:45       ` Christoph Hellwig
2025-11-19 17:13         ` Mikulas Patocka
