* [PATCH] iomap: enforce DIO alignment check in iomap
@ 2026-06-10 14:52 cem
2026-06-11 5:57 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: cem @ 2026-06-10 14:52 UTC (permalink / raw)
To: brauner
Cc: linux-block, linux-fsdevel, linux-ext4, linux-xfs,
Carlos Maiolino, Keith Busch, Hannes Reinecke, Martin K. Petersen,
Christoph Hellwig, Jens Axboe
From: Carlos Maiolino <cem@kernel.org>
The DIO alignment check has been lifted from iomap layer to rely on the
block layer to enforce proper alignment when issuing direct IO
operations. This though, depending on the IO size and buffer address
passed to the IO operation may lead to user-visible behavior change.
This has been caught initially by LTP test diotest4 running on
PPC architecture, where the test fails because a read() operation
with a supposedly misaligned buffer succeeds instead of an expected
-EINVAL.
This has no direct relationship with PPC, but seems to do with the
IO size crossing page borders or not.
The test allocates a 4k buffer, and then increments the buffer pointer
by a single byte to enforce a misaligned address. It then issues a 4k
read() using such buffer. The read is supposed to return an -EINVAL but
it ends up succeeding.
The allocated buffer is at least a single page, so the read() size being
smaller will end up most of the time within the very same page initially
allocated which seems to suffice the block layer to accept the IO.
On x86 though, the same 4k read will end up crossing page boundaries
causing a bio_split which ends up properly checking the address and
rejecting it due to misalignment.
The test itself is buggy (which seems by design) because it ends up
attempting to read 4096 bytes into a 4095, but I believe the test
expected the address to be rejected prior to any write attempt.
The problematic behavior is reproducible on x86 by reducing the IO size
to something < PAGE_SIZE, so the misaligned read()s will also be accepted
by the block layer.
Fixing this is just a matter of enforcing daddr and memory
alignment back into iomap.
This behavior is reproducible in ext4 and xfs due to both relying on
iomap layer, btrfs does not present this behavior change as it does its
own DIO alignment checking.
Fixes: 7eac33186957 ("iomap: simplify direct io validity check")
Cc: Keith Busch <kbusch@kernel.org>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
---
While I didn't spot any memory/disk corruption looking into this, it
changes the user behavior that dictates buffer addresses must be
properly aligned when issuing direct IO operations so I thought making
iomap check again for the buffer address alignment is reasonable.
fs/iomap/direct-io.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 95254aa1b654..0064984e64e5 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -400,6 +400,9 @@ static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
if ((pos | length) & (alignment - 1))
return -EINVAL;
+ if (iov_iter_alignment(dio->submit.iter) & (alignment - 1))
+ return -EINVAL;
+
if (dio->flags & IOMAP_DIO_WRITE) {
bool need_completion_work = true;
--
2.54.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] iomap: enforce DIO alignment check in iomap
2026-06-10 14:52 [PATCH] iomap: enforce DIO alignment check in iomap cem
@ 2026-06-11 5:57 ` Christoph Hellwig
2026-06-11 10:05 ` Carlos Maiolino
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2026-06-11 5:57 UTC (permalink / raw)
To: cem
Cc: brauner, linux-block, linux-fsdevel, linux-ext4, linux-xfs,
Keith Busch, Hannes Reinecke, Martin K. Petersen,
Christoph Hellwig, Jens Axboe
On Wed, Jun 10, 2026 at 04:52:11PM +0200, cem@kernel.org wrote:
> From: Carlos Maiolino <cem@kernel.org>
>
> The DIO alignment check has been lifted from iomap layer to rely on the
> block layer to enforce proper alignment when issuing direct IO
> operations. This though, depending on the IO size and buffer address
> passed to the IO operation may lead to user-visible behavior change.
>
> This has been caught initially by LTP test diotest4 running on
> PPC architecture, where the test fails because a read() operation
> with a supposedly misaligned buffer succeeds instead of an expected
> -EINVAL.
> This has no direct relationship with PPC, but seems to do with the
> IO size crossing page borders or not.
I don't understand the problem here. Why do we want to insist on a
failure when we can support it? I think the test is just broken.
> The problematic behavior is reproducible on x86 by reducing the IO size
> to something < PAGE_SIZE, so the misaligned read()s will also be accepted
> by the block layer.
What do you mean with misaligned here? For a long time the kernel
supports basically arbitrary low memory alignment for diret I/O,
just bounded by the device capabilities (typical 4 byte alignment).
The supported memory alignment is reported in the statx
dio_mem_align. What does that say compared to the alignment
expectations in this test?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] iomap: enforce DIO alignment check in iomap
2026-06-11 5:57 ` Christoph Hellwig
@ 2026-06-11 10:05 ` Carlos Maiolino
2026-06-11 12:57 ` Keith Busch
0 siblings, 1 reply; 6+ messages in thread
From: Carlos Maiolino @ 2026-06-11 10:05 UTC (permalink / raw)
To: Christoph Hellwig
Cc: brauner, linux-block, linux-fsdevel, linux-ext4, linux-xfs,
Keith Busch, Hannes Reinecke, Martin K. Petersen, Jens Axboe
On Thu, Jun 11, 2026 at 07:57:44AM +0200, Christoph Hellwig wrote:
> On Wed, Jun 10, 2026 at 04:52:11PM +0200, cem@kernel.org wrote:
> > From: Carlos Maiolino <cem@kernel.org>
> >
> > The DIO alignment check has been lifted from iomap layer to rely on the
> > block layer to enforce proper alignment when issuing direct IO
> > operations. This though, depending on the IO size and buffer address
> > passed to the IO operation may lead to user-visible behavior change.
> >
> > This has been caught initially by LTP test diotest4 running on
> > PPC architecture, where the test fails because a read() operation
> > with a supposedly misaligned buffer succeeds instead of an expected
> > -EINVAL.
> > This has no direct relationship with PPC, but seems to do with the
> > IO size crossing page borders or not.
>
> I don't understand the problem here. Why do we want to insist on a
> failure when we can support it? I think the test is just broken.
The problem I see here from my POV is this changed the behavior expected
from the syscalls when the passed in buffer is misaligned as the read()
(in the test) succeeds when the passed in buffer does not match the
alignment requirements (see below).
I am pretty happy in declaring this a test bug, but I thought it would be
worth starting a discussion about the sudden/unexpected behavior change.
Not to mention now different filesystems will have different alignment
requirements which seems at least "weird" to me. I mean, now suddenly
iomap-based filesystems have a more relaxed alignment constraint than
for example btrfs.
>
> > The problematic behavior is reproducible on x86 by reducing the IO size
> > to something < PAGE_SIZE, so the misaligned read()s will also be accepted
> > by the block layer.
>
> What do you mean with misaligned here? For a long time the kernel
> supports basically arbitrary low memory alignment for diret I/O,
> just bounded by the device capabilities (typical 4 byte alignment).
The test sends to read() a buffer misplaced by 1 byte (see below) which
doesn't match the system's alignment constraints at least from the user
passed buffer perspective.
I've been assuming it should match device's dma_alignment constraints.
The typical 4 byte alignment indeed is the requirement from my PPC
machine, but not for my x86:
>
> The supported memory alignment is reported in the statx
> dio_mem_align. What does that say compared to the alignment
> expectations in this test?
From my x86:
dio_mem_align: 512
dio_offset_align: 512
From PPC:
dio_mem_align: 4
dio_offset_align: 512
But this does not explain how the following call would succeed in either
case (below one taken from PPC):
openat(dirfd=AT_FDCWD, pathname="testdata-4.135256", flags=O_RDWR|O_DIRECT) = 3
_llseek(fd=3, offset=4096, result=[4096], whence=SEEK_SET) = 0
read(arg1=0x3, arg2=0x1003af80001, arg3=0x1000) = 0x1000
The passed in address 0x1003af80001 is one byte misaligned and shouldn't
(at least in theory) ever be accepted no? Or am I missing something
else?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] iomap: enforce DIO alignment check in iomap
2026-06-11 10:05 ` Carlos Maiolino
@ 2026-06-11 12:57 ` Keith Busch
2026-06-11 13:38 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: Keith Busch @ 2026-06-11 12:57 UTC (permalink / raw)
To: Carlos Maiolino
Cc: Christoph Hellwig, brauner, linux-block, linux-fsdevel,
linux-ext4, linux-xfs, Hannes Reinecke, Martin K. Petersen,
Jens Axboe
On Thu, Jun 11, 2026 at 12:05:22PM +0200, Carlos Maiolino wrote:
> The passed in address 0x1003af80001 is one byte misaligned and shouldn't
> (at least in theory) ever be accepted no? Or am I missing something
> else?
It's entirely possible a device supports byte aligned addresses. The
block layer just doesn't let a driver report that. So either it really
was successful because you found a bug that skips the alignment checks,
or your device silently corrupted your payload.
Anyway, my earlier suggestion should work. Ming thinks it may go to far,
though, in not taking the optimization when it was possible. So here's
an alternative suggestion that should get things working as expected:
---
diff --git a/block/blk.h b/block/blk.h
index 1a2d9101bba04..4c31762d6fb5f 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -404,6 +404,9 @@ static inline bool bio_may_need_split(struct bio *bio,
bv = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
if (bio->bi_iter.bi_size > bv->bv_len - bio->bi_iter.bi_bvec_done)
return true;
+
+ if ((bv->bv_offset | bv->bv_len) & lim->dma_alignment)
+ return true;
return bv->bv_len + bv->bv_offset > lim->max_fast_segment_size;
}
--
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] iomap: enforce DIO alignment check in iomap
2026-06-11 12:57 ` Keith Busch
@ 2026-06-11 13:38 ` Christoph Hellwig
2026-06-11 15:47 ` Carlos Maiolino
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2026-06-11 13:38 UTC (permalink / raw)
To: Keith Busch
Cc: Carlos Maiolino, Christoph Hellwig, brauner, linux-block,
linux-fsdevel, linux-ext4, linux-xfs, Hannes Reinecke,
Martin K. Petersen, Jens Axboe
On Thu, Jun 11, 2026 at 06:57:47AM -0600, Keith Busch wrote:
> It's entirely possible a device supports byte aligned addresses. The
> block layer just doesn't let a driver report that. So either it really
> was successful because you found a bug that skips the alignment checks,
> or your device silently corrupted your payload.
>
> Anyway, my earlier suggestion should work. Ming thinks it may go to far,
> though, in not taking the optimization when it was possible. So here's
> an alternative suggestion that should get things working as expected:
The fix below looks like it is addressing a real bug. I'm not sure if
Carlos is hitting it, but we were missing the alignment checks for
single-bvec fast path bios so far indeed.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] iomap: enforce DIO alignment check in iomap
2026-06-11 13:38 ` Christoph Hellwig
@ 2026-06-11 15:47 ` Carlos Maiolino
0 siblings, 0 replies; 6+ messages in thread
From: Carlos Maiolino @ 2026-06-11 15:47 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, brauner, linux-block, linux-fsdevel, linux-ext4,
linux-xfs, Hannes Reinecke, Martin K. Petersen, Jens Axboe
On Thu, Jun 11, 2026 at 03:38:33PM +0200, Christoph Hellwig wrote:
> On Thu, Jun 11, 2026 at 06:57:47AM -0600, Keith Busch wrote:
> > It's entirely possible a device supports byte aligned addresses. The
> > block layer just doesn't let a driver report that. So either it really
> > was successful because you found a bug that skips the alignment checks,
> > or your device silently corrupted your payload.
I tried this on different hardware, I find it hard to say all those
devices were corrupting the payload.
> >
> > Anyway, my earlier suggestion should work. Ming thinks it may go to far,
> > though, in not taking the optimization when it was possible. So here's
> > an alternative suggestion that should get things working as expected:
>
> The fix below looks like it is addressing a real bug. I'm not sure if
> Carlos is hitting it, but we were missing the alignment checks for
> single-bvec fast path bios so far indeed.
You left context out so I'm assuming by the fix you meant Keith's patch.
I can give it a spin and see if it fixes the behavior I'm talking
about. Give me some time as I have a bunch of stuff to do tonight so
likely I will only manage to try this tomorrow.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-06-11 15:47 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-10 14:52 [PATCH] iomap: enforce DIO alignment check in iomap cem
2026-06-11 5:57 ` Christoph Hellwig
2026-06-11 10:05 ` Carlos Maiolino
2026-06-11 12:57 ` Keith Busch
2026-06-11 13:38 ` Christoph Hellwig
2026-06-11 15:47 ` Carlos Maiolino
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.