linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] iomap: add IOMAP_DIO_FSBLOCK_ALIGNED flag
@ 2025-10-13  9:05 Qu Wenruo
  2025-10-13 11:14 ` Johannes Thumshirn
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Qu Wenruo @ 2025-10-13  9:05 UTC (permalink / raw)
  To: linux-btrfs; +Cc: brauner, djwong, linux-xfs, linux-fsdevel

Btrfs requires all of its bios to be fs block aligned, normally it's
totally fine but with the incoming block size larger than page size
(bs > ps) support, the requirement is no longer met for direct IOs.

Because iomap_dio_bio_iter() calls bio_iov_iter_get_pages(), only
requiring alignment to be bdev_logical_block_size().

In the real world that value is either 512 or 4K, on 4K page sized
systems it means bio_iov_iter_get_pages() can break the bio at any page
boundary, breaking btrfs' requirement for bs > ps cases.

To address this problem, introduce a new public iomap dio flag,
IOMAP_DIO_FSBLOCK_ALIGNED.

When calling __iomap_dio_rw() with that new flag, iomap_dio::flags will
inherit that new flag, and iomap_dio_bio_iter() will take fs block size
into the calculation of the alignment, and pass the alignment to
bio_iov_iter_get_pages(), respecting the fs block size requirement.

The initial user of this flag will be btrfs, which needs to calculate the
checksum for direct read and thus requires the biovec to be fs block
aligned for the incoming bs > ps support.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Fix too long lines
  To follow the old 80 chars limits.

- Reword the commit messages a little

- Rename the public IOMAP flag to IOMAP_DIO_FSBLOCK_ALIGNED

- Make the calculation of alignment eaiser to read
  With a short comment explaing why we need to use the larger value of
  bdev and fs block size.

- Remove the btrfs part that utilize the new flag
  Now it's in the enablement patch of btrfs' bs > ps direct IO support.

 fs/iomap/direct-io.c  | 13 ++++++++++++-
 include/linux/iomap.h |  8 ++++++++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 5d5d63efbd57..ce9cbd2bace0 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -336,10 +336,18 @@ static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
 	int nr_pages, ret = 0;
 	u64 copied = 0;
 	size_t orig_count;
+	unsigned int alignment = bdev_logical_block_size(iomap->bdev);
 
 	if ((pos | length) & (bdev_logical_block_size(iomap->bdev) - 1))
 		return -EINVAL;
 
+	/*
+	 * Align to the larger one of bdev and fs block size, to meet the
+	 * alignment requirement of both layers.
+	 */
+	if (dio->flags & IOMAP_DIO_FSBLOCK_ALIGNED)
+		alignment = max(alignment, fs_block_size);
+
 	if (dio->flags & IOMAP_DIO_WRITE) {
 		bio_opf |= REQ_OP_WRITE;
 
@@ -434,7 +442,7 @@ static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
 		bio->bi_end_io = iomap_dio_bio_end_io;
 
 		ret = bio_iov_iter_get_pages(bio, dio->submit.iter,
-				bdev_logical_block_size(iomap->bdev) - 1);
+					     alignment - 1);
 		if (unlikely(ret)) {
 			/*
 			 * We have to stop part way through an IO. We must fall
@@ -639,6 +647,9 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	if (iocb->ki_flags & IOCB_NOWAIT)
 		iomi.flags |= IOMAP_NOWAIT;
 
+	if (dio_flags & IOMAP_DIO_FSBLOCK_ALIGNED)
+		dio->flags |= IOMAP_DIO_FSBLOCK_ALIGNED;
+
 	if (iov_iter_rw(iter) == READ) {
 		/* reads can always complete inline */
 		dio->flags |= IOMAP_DIO_INLINE_COMP;
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 73dceabc21c8..4da13fe24ce8 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -518,6 +518,14 @@ struct iomap_dio_ops {
  */
 #define IOMAP_DIO_PARTIAL		(1 << 2)
 
+/*
+ * Ensure each bio is aligned to fs block size.
+ *
+ * For filesystems which need to calculate/verify the checksum of each fs
+ * block. Otherwise they may not be able to handle unaligned bios.
+ */
+#define IOMAP_DIO_FSBLOCK_ALIGNED	(1 << 3)
+
 ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops, const struct iomap_dio_ops *dops,
 		unsigned int dio_flags, void *private, size_t done_before);
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-10-29 12:45 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-13  9:05 [PATCH v2] iomap: add IOMAP_DIO_FSBLOCK_ALIGNED flag Qu Wenruo
2025-10-13 11:14 ` Johannes Thumshirn
2025-10-13 20:39   ` Qu Wenruo
2025-10-14  4:36 ` Christoph Hellwig
2025-10-14  4:55   ` Qu Wenruo
2025-10-14  4:58     ` Christoph Hellwig
2025-10-14  5:10       ` Qu Wenruo
2025-10-16 12:10 ` Pankaj Raghav (Samsung)
2025-10-21 13:03 ` Christian Brauner
2025-10-22  7:23   ` Christoph Hellwig
2025-10-29 12:45     ` Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).