From: Li Wang <liwang@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>, linux-fsdevel@vger.kernel.org
Cc: rppt@kernel.org, david@kernel.org, ljs@kernel.org,
Liam.Howlett@oracle.com, vbabka@kernel.org, surenb@google.com,
mhocko@suse.com, shuah@kernel.org, aubaker@redhat.com,
linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] selftests/mm: skip hugetlb_dio tests when DIO alignment is incompatible
Date: Mon, 30 Mar 2026 13:19:41 +0800 [thread overview]
Message-ID: <acoH7TALgBmBkk2h@redhat.com> (raw)
In-Reply-To: <20260327121350.858a127fa49ed6e1eb4a40a7@linux-foundation.org>
Reply Sashiko comment:
> https://sashiko.dev/#/patchset/20260327120305.58653-1-liwang@redhat.com
>
> > + if (writesize % dio_align != 0) {
> + ksft_test_result_skip("DIO alignment (%u) incompatible with offset %zu\n",
> + dio_align, writesize);
> + return;
> + }
>
> Is this alignment check complete?
>
> Direct I/O requires both the transfer length and the memory buffer address
> to be aligned. Later in this function, start_off is used as the buffer offset:
> buffer = orig_buffer;
> buffer += start_off;
> If start_off is pagesize / 2 (e.g., 2048) and writesize is pagesize * 3
> (e.g., 12288), writesize is a multiple of a 4096-byte alignment, so the test
> is not skipped.
>
> However, the memory buffer itself is only 2048-byte aligned. Will the
> subsequent write() still fail with -EINVAL on 4K-sector devices?
TL;DR: Yes, we should do both buffer address and writesize alignment
checks to satisfy all FS types.
Looking at the kernel code: fs/iomap/direct-dio.c, the only alignment
check there is at line#413, which checks file's pos and write length.
EXT4:
ext4_file_write_iter
ext4_dio_write_iter
iomap_dio_rw
__iomap_dio_rw
iomap_dio_iter
iomap_dio_bio_iter
390 static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
391 {
...
403
404 /*
405 * File systems that write out of place and always allocate new blocks
406 * need each bio to be block aligned as that's the unit of allocation.
407 */
408 if (dio->flags & IOMAP_DIO_FSBLOCK_ALIGNED)
409 alignment = fs_block_size;
410 else
411 alignment = bdev_logical_block_size(iomap->bdev);
412
413 if ((pos | length) & (alignment - 1))
414 return -EINVAL;
415 ...
Sashiko points out the buffer-address should do alignment check as well,
I firstly suspect it based on the FS extra check before the iomap_dio_rw:
ext4_file_write_iter
ext4_dio_write_iter
ext4_should_use_dio
iov_iter_alignment <--- do buffer/writesize alignment check
842 unsigned long iov_iter_alignment(const struct iov_iter *i)
843 {
...
853 return iov_iter_alignment_iovec(i);
...
865 }
799 static unsigned long iov_iter_alignment_iovec(const struct iov_iter *i)
800 {
...
809 res |= (unsigned long)iov->iov_base + skip;
812 res |= len;
...
818 return res;
819 }
But eventually I found that this is only fallback to the buffer I/O
when the direct I/O is unsupported (go to: ext4_buffered_write_iter).
This wouldn't happen in the test as it open with O_DIRECT flag.
Then, I turned to look at Btrfs path:
btrfs_file_write_iter
btrfs_do_write_iter
btrfs_direct_write
check_direct_IO <--- do buffer alignment check
...
btrfs_dio_write
__iomap_dio_rw <--- do samething like ext4
778 static ssize_t check_direct_IO(struct btrfs_fs_info *fs_info,
779 const struct iov_iter *iter, loff_t offset)
780 {
781 const u32 blocksize_mask = fs_info->sectorsize - 1;
782
783 if (offset & blocksize_mask)
784 return -EINVAL;
785
786 if (iov_iter_alignment(iter) & blocksize_mask)
787 return -EINVAL;
788 return 0;
789 }
Yes, here I found the evendice that iov_iter_alignment(iter) & blocksize_mask)
do the alignment check.
Unlike ext4 which never reaches the check for normal files, btrfs always checks
buffer alignment for every DIO operation. And it's a hard -EINVAL, not a silent
fallback to buffered I/O.
--
Regards,
Li Wang
prev parent reply other threads:[~2026-03-30 5:19 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-27 12:03 [PATCH v2] selftests/mm: skip hugetlb_dio tests when DIO alignment is incompatible Li Wang
2026-03-27 19:13 ` Andrew Morton
2026-03-28 0:28 ` Li Wang
2026-03-30 5:19 ` Li Wang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=acoH7TALgBmBkk2h@redhat.com \
--to=liwang@redhat.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=aubaker@redhat.com \
--cc=david@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ljs@kernel.org \
--cc=mhocko@suse.com \
--cc=rppt@kernel.org \
--cc=shuah@kernel.org \
--cc=surenb@google.com \
--cc=vbabka@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox