linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] direct-io: even more flexible io vectors
@ 2025-08-01 23:47 Keith Busch
  2025-08-01 23:47 ` [PATCH 1/7] block: check for valid bio while splitting Keith Busch
                   ` (7 more replies)
  0 siblings, 8 replies; 29+ messages in thread
From: Keith Busch @ 2025-08-01 23:47 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

From: Keith Busch <kbusch@kernel.org>

In furthering direct IO use from user space buffers without bouncing to
align to unnecessary kernel software constraints, this series removes
the requirement that io vector lengths align to the logical block size.
The downside (if want to call it that) is that mis-aligned io vectors
are caught further down the block stack rather than closer to the
syscall.

This change also removes one walking of the io vector, so that's nice
too.

Keith Busch (7):
  block: check for valid bio while splitting
  block: align the bio after building it
  block: simplify direct io validity check
  iomap: simplify direct io validity check
  block: remove bdev_iter_is_aligned
  blk-integrity: use simpler alignment check
  iov_iter: remove iov_iter_is_aligned

 block/bio-integrity.c  |  4 +-
 block/bio.c            | 58 +++++++++++++++++---------
 block/blk-merge.c      |  5 +++
 block/fops.c           |  4 +-
 fs/iomap/direct-io.c   |  3 +-
 include/linux/blkdev.h |  7 ----
 include/linux/uio.h    |  2 -
 lib/iov_iter.c         | 95 ------------------------------------------
 8 files changed, 49 insertions(+), 129 deletions(-)

-- 
2.47.3


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/7] block: check for valid bio while splitting
  2025-08-01 23:47 [PATCH 0/7] direct-io: even more flexible io vectors Keith Busch
@ 2025-08-01 23:47 ` Keith Busch
  2025-08-04  6:54   ` Hannes Reinecke
  2025-08-01 23:47 ` [PATCH 2/7] block: align the bio after building it Keith Busch
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Keith Busch @ 2025-08-01 23:47 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

From: Keith Busch <kbusch@kernel.org>

We're already iterating every segment, so check alignment for a valid
IO at the same time. We had depended on these constraints were already
checked prior to splitting, but let's put more responsibility here since
splitting iterates each segment before dispatching to the driver anyway.
This way, upper layers don't need to concern themselves with it.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 block/blk-merge.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 70d704615be52..81bdad915699a 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -298,6 +298,9 @@ int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim,
 	unsigned nsegs = 0, bytes = 0;
 
 	bio_for_each_bvec(bv, bio, iter) {
+		if (bv.bv_offset & lim->dma_alignment)
+			return -EINVAL;
+
 		/*
 		 * If the queue doesn't support SG gaps and adding this
 		 * offset would create a gap, disallow it.
@@ -341,6 +344,8 @@ int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim,
 	 * we do not use the full hardware limits.
 	 */
 	bytes = ALIGN_DOWN(bytes, bio_split_alignment(bio, lim));
+	if (!bytes)
+		return -EINVAL;
 
 	/*
 	 * Bio splitting may cause subtle trouble such as hang when doing sync
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/7] block: align the bio after building it
  2025-08-01 23:47 [PATCH 0/7] direct-io: even more flexible io vectors Keith Busch
  2025-08-01 23:47 ` [PATCH 1/7] block: check for valid bio while splitting Keith Busch
@ 2025-08-01 23:47 ` Keith Busch
  2025-08-04  6:54   ` Hannes Reinecke
  2025-08-01 23:47 ` [PATCH 3/7] block: simplify direct io validity check Keith Busch
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Keith Busch @ 2025-08-01 23:47 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

From: Keith Busch <kbusch@kernel.org>

Ensure the entire size is aligned after it's built instead of ensuring
each vector is block size aligned while constructing it. This makes it
more flexible to accepting device valid vectors that would otherwise get
rejected by overzealous alignment checks.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 block/bio.c | 58 +++++++++++++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 92c512e876c8d..c050903e1be0c 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1227,13 +1227,6 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	if (bio->bi_bdev && blk_queue_pci_p2pdma(bio->bi_bdev->bd_disk->queue))
 		extraction_flags |= ITER_ALLOW_P2PDMA;
 
-	/*
-	 * Each segment in the iov is required to be a block size multiple.
-	 * However, we may not be able to get the entire segment if it spans
-	 * more pages than bi_max_vecs allows, so we have to ALIGN_DOWN the
-	 * result to ensure the bio's total size is correct. The remainder of
-	 * the iov data will be picked up in the next bio iteration.
-	 */
 	size = iov_iter_extract_pages(iter, &pages,
 				      UINT_MAX - bio->bi_iter.bi_size,
 				      nr_pages, extraction_flags, &offset);
@@ -1241,18 +1234,6 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		return size ? size : -EFAULT;
 
 	nr_pages = DIV_ROUND_UP(offset + size, PAGE_SIZE);
-
-	if (bio->bi_bdev) {
-		size_t trim = size & (bdev_logical_block_size(bio->bi_bdev) - 1);
-		iov_iter_revert(iter, trim);
-		size -= trim;
-	}
-
-	if (unlikely(!size)) {
-		ret = -EFAULT;
-		goto out;
-	}
-
 	for (left = size, i = 0; left > 0; left -= len, i += num_pages) {
 		struct page *page = pages[i];
 		struct folio *folio = page_folio(page);
@@ -1297,6 +1278,44 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	return ret;
 }
 
+static inline void bio_revert(struct bio *bio, unsigned int nbytes)
+{
+	bio->bi_iter.bi_size -= nbytes;
+
+	while (nbytes) {
+		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
+
+		if (nbytes < bv->bv_len) {
+			bv->bv_len -= nbytes;
+			return;
+		}
+
+		bio_release_page(bio, bv->bv_page);
+		bio->bi_vcnt--;
+		nbytes -= bv->bv_len;
+       }
+}
+
+static int bio_align_to_lbs(struct bio *bio, struct iov_iter *iter)
+{
+	struct block_device *bdev = bio->bi_bdev;
+	size_t nbytes;
+
+	if (!bdev)
+		return 0;
+
+	nbytes = bio->bi_iter.bi_size & (bdev_logical_block_size(bdev) - 1);
+	if (!nbytes)
+		return 0;
+
+	bio_revert(bio, nbytes);
+	iov_iter_revert(iter, nbytes);
+	if (!bio->bi_iter.bi_size)
+		return -EFAULT;
+
+	return 0;
+}
+
 /**
  * bio_iov_iter_get_pages - add user or kernel pages to a bio
  * @bio: bio to add pages to
@@ -1336,6 +1355,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		ret = __bio_iov_iter_get_pages(bio, iter);
 	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
 
+	ret = bio_align_to_lbs(bio, iter);
 	return bio->bi_vcnt ? 0 : ret;
 }
 EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/7] block: simplify direct io validity check
  2025-08-01 23:47 [PATCH 0/7] direct-io: even more flexible io vectors Keith Busch
  2025-08-01 23:47 ` [PATCH 1/7] block: check for valid bio while splitting Keith Busch
  2025-08-01 23:47 ` [PATCH 2/7] block: align the bio after building it Keith Busch
@ 2025-08-01 23:47 ` Keith Busch
  2025-08-04  6:55   ` Hannes Reinecke
  2025-08-01 23:47 ` [PATCH 4/7] iomap: " Keith Busch
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Keith Busch @ 2025-08-01 23:47 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

From: Keith Busch <kbusch@kernel.org>

The block layer checks all the segments for validity later, so no need
for an early check. Just reduce it to a simple position and total length
and defer the segment checks to the block layer.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 block/fops.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index 82451ac8ff25d..820902cf10730 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -38,8 +38,8 @@ static blk_opf_t dio_bio_write_op(struct kiocb *iocb)
 static bool blkdev_dio_invalid(struct block_device *bdev, struct kiocb *iocb,
 				struct iov_iter *iter)
 {
-	return iocb->ki_pos & (bdev_logical_block_size(bdev) - 1) ||
-		!bdev_iter_is_aligned(bdev, iter);
+	return (iocb->ki_pos | iov_iter_count(iter)) &
+			(bdev_logical_block_size(bdev) - 1);
 }
 
 #define DIO_INLINE_BIO_VECS 4
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 4/7] iomap: simplify direct io validity check
  2025-08-01 23:47 [PATCH 0/7] direct-io: even more flexible io vectors Keith Busch
                   ` (2 preceding siblings ...)
  2025-08-01 23:47 ` [PATCH 3/7] block: simplify direct io validity check Keith Busch
@ 2025-08-01 23:47 ` Keith Busch
  2025-08-04  6:57   ` Hannes Reinecke
  2025-08-01 23:47 ` [PATCH 5/7] block: remove bdev_iter_is_aligned Keith Busch
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Keith Busch @ 2025-08-01 23:47 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

From: Keith Busch <kbusch@kernel.org>

The block layer checks all the segments for validity later, so no need
for an early check. Just reduce it to a simple position and total length
and defer the segment checks to the block layer.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 fs/iomap/direct-io.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 6f25d4cfea9f7..2c1a45e46dc75 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -337,8 +337,7 @@ static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
 	u64 copied = 0;
 	size_t orig_count;
 
-	if ((pos | length) & (bdev_logical_block_size(iomap->bdev) - 1) ||
-	    !bdev_iter_is_aligned(iomap->bdev, dio->submit.iter))
+	if ((pos | length) & (bdev_logical_block_size(iomap->bdev) - 1))
 		return -EINVAL;
 
 	if (dio->flags & IOMAP_DIO_WRITE) {
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 5/7] block: remove bdev_iter_is_aligned
  2025-08-01 23:47 [PATCH 0/7] direct-io: even more flexible io vectors Keith Busch
                   ` (3 preceding siblings ...)
  2025-08-01 23:47 ` [PATCH 4/7] iomap: " Keith Busch
@ 2025-08-01 23:47 ` Keith Busch
  2025-08-04  6:57   ` Hannes Reinecke
  2025-08-01 23:47 ` [PATCH 6/7] blk-integrity: use simpler alignment check Keith Busch
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Keith Busch @ 2025-08-01 23:47 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

From: Keith Busch <kbusch@kernel.org>

No more callers.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 include/linux/blkdev.h | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 95886b404b16b..f86aecbb87385 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1589,13 +1589,6 @@ static inline unsigned int bdev_dma_alignment(struct block_device *bdev)
 	return queue_dma_alignment(bdev_get_queue(bdev));
 }
 
-static inline bool bdev_iter_is_aligned(struct block_device *bdev,
-					struct iov_iter *iter)
-{
-	return iov_iter_is_aligned(iter, bdev_dma_alignment(bdev),
-				   bdev_logical_block_size(bdev) - 1);
-}
-
 static inline unsigned int
 blk_lim_dma_alignment_and_pad(struct queue_limits *lim)
 {
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 6/7] blk-integrity: use simpler alignment check
  2025-08-01 23:47 [PATCH 0/7] direct-io: even more flexible io vectors Keith Busch
                   ` (4 preceding siblings ...)
  2025-08-01 23:47 ` [PATCH 5/7] block: remove bdev_iter_is_aligned Keith Busch
@ 2025-08-01 23:47 ` Keith Busch
  2025-08-04  6:58   ` Hannes Reinecke
  2025-08-01 23:47 ` [PATCH 7/7] iov_iter: remove iov_iter_is_aligned Keith Busch
  2025-08-02 15:37 ` [PATCH 0/7] direct-io: even more flexible io vectors Jens Axboe
  7 siblings, 1 reply; 29+ messages in thread
From: Keith Busch @ 2025-08-01 23:47 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

From: Keith Busch <kbusch@kernel.org>

We're checking length and addresses against the same alignment value, so
use the more simple iterator check.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 block/bio-integrity.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 6b077ca937f6b..6d069a49b4aad 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -262,7 +262,6 @@ static unsigned int bvec_from_pages(struct bio_vec *bvec, struct page **pages,
 int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
 {
 	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
-	unsigned int align = blk_lim_dma_alignment_and_pad(&q->limits);
 	struct page *stack_pages[UIO_FASTIOV], **pages = stack_pages;
 	struct bio_vec stack_vec[UIO_FASTIOV], *bvec = stack_vec;
 	size_t offset, bytes = iter->count;
@@ -285,7 +284,8 @@ int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
 		pages = NULL;
 	}
 
-	copy = !iov_iter_is_aligned(iter, align, align);
+	copy = iov_iter_alignment(iter) &
+			blk_lim_dma_alignment_and_pad(&q->limits);
 	ret = iov_iter_extract_pages(iter, &pages, bytes, nr_vecs, 0, &offset);
 	if (unlikely(ret < 0))
 		goto free_bvec;
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 7/7] iov_iter: remove iov_iter_is_aligned
  2025-08-01 23:47 [PATCH 0/7] direct-io: even more flexible io vectors Keith Busch
                   ` (5 preceding siblings ...)
  2025-08-01 23:47 ` [PATCH 6/7] blk-integrity: use simpler alignment check Keith Busch
@ 2025-08-01 23:47 ` Keith Busch
  2025-08-02  2:02   ` Mike Snitzer
  2025-08-05  0:24   ` Mike Snitzer
  2025-08-02 15:37 ` [PATCH 0/7] direct-io: even more flexible io vectors Jens Axboe
  7 siblings, 2 replies; 29+ messages in thread
From: Keith Busch @ 2025-08-01 23:47 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

From: Keith Busch <kbusch@kernel.org>

No more callers.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 include/linux/uio.h |  2 -
 lib/iov_iter.c      | 95 ---------------------------------------------
 2 files changed, 97 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 2e86c653186c6..5b127043a1519 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -286,8 +286,6 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
 #endif
 
 size_t iov_iter_zero(size_t bytes, struct iov_iter *);
-bool iov_iter_is_aligned(const struct iov_iter *i, unsigned addr_mask,
-			unsigned len_mask);
 unsigned long iov_iter_alignment(const struct iov_iter *i);
 unsigned long iov_iter_gap_alignment(const struct iov_iter *i);
 void iov_iter_init(struct iov_iter *i, unsigned int direction, const struct iovec *iov,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index f9193f952f499..2fe66a6b8789e 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -784,101 +784,6 @@ void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count)
 }
 EXPORT_SYMBOL(iov_iter_discard);
 
-static bool iov_iter_aligned_iovec(const struct iov_iter *i, unsigned addr_mask,
-				   unsigned len_mask)
-{
-	const struct iovec *iov = iter_iov(i);
-	size_t size = i->count;
-	size_t skip = i->iov_offset;
-
-	do {
-		size_t len = iov->iov_len - skip;
-
-		if (len > size)
-			len = size;
-		if (len & len_mask)
-			return false;
-		if ((unsigned long)(iov->iov_base + skip) & addr_mask)
-			return false;
-
-		iov++;
-		size -= len;
-		skip = 0;
-	} while (size);
-
-	return true;
-}
-
-static bool iov_iter_aligned_bvec(const struct iov_iter *i, unsigned addr_mask,
-				  unsigned len_mask)
-{
-	const struct bio_vec *bvec = i->bvec;
-	unsigned skip = i->iov_offset;
-	size_t size = i->count;
-
-	do {
-		size_t len = bvec->bv_len - skip;
-
-		if (len > size)
-			len = size;
-		if (len & len_mask)
-			return false;
-		if ((unsigned long)(bvec->bv_offset + skip) & addr_mask)
-			return false;
-
-		bvec++;
-		size -= len;
-		skip = 0;
-	} while (size);
-
-	return true;
-}
-
-/**
- * iov_iter_is_aligned() - Check if the addresses and lengths of each segments
- * 	are aligned to the parameters.
- *
- * @i: &struct iov_iter to restore
- * @addr_mask: bit mask to check against the iov element's addresses
- * @len_mask: bit mask to check against the iov element's lengths
- *
- * Return: false if any addresses or lengths intersect with the provided masks
- */
-bool iov_iter_is_aligned(const struct iov_iter *i, unsigned addr_mask,
-			 unsigned len_mask)
-{
-	if (likely(iter_is_ubuf(i))) {
-		if (i->count & len_mask)
-			return false;
-		if ((unsigned long)(i->ubuf + i->iov_offset) & addr_mask)
-			return false;
-		return true;
-	}
-
-	if (likely(iter_is_iovec(i) || iov_iter_is_kvec(i)))
-		return iov_iter_aligned_iovec(i, addr_mask, len_mask);
-
-	if (iov_iter_is_bvec(i))
-		return iov_iter_aligned_bvec(i, addr_mask, len_mask);
-
-	/* With both xarray and folioq types, we're dealing with whole folios. */
-	if (iov_iter_is_xarray(i)) {
-		if (i->count & len_mask)
-			return false;
-		if ((i->xarray_start + i->iov_offset) & addr_mask)
-			return false;
-	}
-	if (iov_iter_is_folioq(i)) {
-		if (i->count & len_mask)
-			return false;
-		if (i->iov_offset & addr_mask)
-			return false;
-	}
-
-	return true;
-}
-EXPORT_SYMBOL_GPL(iov_iter_is_aligned);
-
 static unsigned long iov_iter_alignment_iovec(const struct iov_iter *i)
 {
 	const struct iovec *iov = iter_iov(i);
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 7/7] iov_iter: remove iov_iter_is_aligned
  2025-08-01 23:47 ` [PATCH 7/7] iov_iter: remove iov_iter_is_aligned Keith Busch
@ 2025-08-02  2:02   ` Mike Snitzer
  2025-08-04 14:16     ` Keith Busch
  2025-08-05  0:24   ` Mike Snitzer
  1 sibling, 1 reply; 29+ messages in thread
From: Mike Snitzer @ 2025-08-02  2:02 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-fsdevel, linux-kernel, axboe, dw, brauner,
	Keith Busch, Chuck Lever, linux-nfs

On Fri, Aug 01, 2025 at 04:47:36PM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> No more callers.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>

You had me up until this last patch.

I'm actually making use of iov_iter_is_aligned() in a series of
changes for both NFS and NFSD.  Chuck has included some of the
NFSD changes in his nfsd-testing branch, see:
https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-testing&id=5d78ac1e674b45f9c9e3769b48efb27c44f4e4d3

And the balance of my work that is pending review/inclusion is:
https://lore.kernel.org/linux-nfs/20250731230633.89983-1-snitzer@kernel.org/
https://lore.kernel.org/linux-nfs/20250801171049.94235-1-snitzer@kernel.org/

I only need iov_iter_aligned_bvec, but recall I want to relax its
checking with this patch:
https://lore.kernel.org/linux-nfs/20250708160619.64800-5-snitzer@kernel.org/

Should I just add iov_iter_aligned_bvec() to fs/nfs_common/ so that
both NFS and NFSD can use it?

Thanks,
Mike

> ---
>  include/linux/uio.h |  2 -
>  lib/iov_iter.c      | 95 ---------------------------------------------
>  2 files changed, 97 deletions(-)
> 
> diff --git a/include/linux/uio.h b/include/linux/uio.h
> index 2e86c653186c6..5b127043a1519 100644
> --- a/include/linux/uio.h
> +++ b/include/linux/uio.h
> @@ -286,8 +286,6 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
>  #endif
>  
>  size_t iov_iter_zero(size_t bytes, struct iov_iter *);
> -bool iov_iter_is_aligned(const struct iov_iter *i, unsigned addr_mask,
> -			unsigned len_mask);
>  unsigned long iov_iter_alignment(const struct iov_iter *i);
>  unsigned long iov_iter_gap_alignment(const struct iov_iter *i);
>  void iov_iter_init(struct iov_iter *i, unsigned int direction, const struct iovec *iov,
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index f9193f952f499..2fe66a6b8789e 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -784,101 +784,6 @@ void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count)
>  }
>  EXPORT_SYMBOL(iov_iter_discard);
>  
> -static bool iov_iter_aligned_iovec(const struct iov_iter *i, unsigned addr_mask,
> -				   unsigned len_mask)
> -{
> -	const struct iovec *iov = iter_iov(i);
> -	size_t size = i->count;
> -	size_t skip = i->iov_offset;
> -
> -	do {
> -		size_t len = iov->iov_len - skip;
> -
> -		if (len > size)
> -			len = size;
> -		if (len & len_mask)
> -			return false;
> -		if ((unsigned long)(iov->iov_base + skip) & addr_mask)
> -			return false;
> -
> -		iov++;
> -		size -= len;
> -		skip = 0;
> -	} while (size);
> -
> -	return true;
> -}
> -
> -static bool iov_iter_aligned_bvec(const struct iov_iter *i, unsigned addr_mask,
> -				  unsigned len_mask)
> -{
> -	const struct bio_vec *bvec = i->bvec;
> -	unsigned skip = i->iov_offset;
> -	size_t size = i->count;
> -
> -	do {
> -		size_t len = bvec->bv_len - skip;
> -
> -		if (len > size)
> -			len = size;
> -		if (len & len_mask)
> -			return false;
> -		if ((unsigned long)(bvec->bv_offset + skip) & addr_mask)
> -			return false;
> -
> -		bvec++;
> -		size -= len;
> -		skip = 0;
> -	} while (size);
> -
> -	return true;
> -}
> -
> -/**
> - * iov_iter_is_aligned() - Check if the addresses and lengths of each segments
> - * 	are aligned to the parameters.
> - *
> - * @i: &struct iov_iter to restore
> - * @addr_mask: bit mask to check against the iov element's addresses
> - * @len_mask: bit mask to check against the iov element's lengths
> - *
> - * Return: false if any addresses or lengths intersect with the provided masks
> - */
> -bool iov_iter_is_aligned(const struct iov_iter *i, unsigned addr_mask,
> -			 unsigned len_mask)
> -{
> -	if (likely(iter_is_ubuf(i))) {
> -		if (i->count & len_mask)
> -			return false;
> -		if ((unsigned long)(i->ubuf + i->iov_offset) & addr_mask)
> -			return false;
> -		return true;
> -	}
> -
> -	if (likely(iter_is_iovec(i) || iov_iter_is_kvec(i)))
> -		return iov_iter_aligned_iovec(i, addr_mask, len_mask);
> -
> -	if (iov_iter_is_bvec(i))
> -		return iov_iter_aligned_bvec(i, addr_mask, len_mask);
> -
> -	/* With both xarray and folioq types, we're dealing with whole folios. */
> -	if (iov_iter_is_xarray(i)) {
> -		if (i->count & len_mask)
> -			return false;
> -		if ((i->xarray_start + i->iov_offset) & addr_mask)
> -			return false;
> -	}
> -	if (iov_iter_is_folioq(i)) {
> -		if (i->count & len_mask)
> -			return false;
> -		if (i->iov_offset & addr_mask)
> -			return false;
> -	}
> -
> -	return true;
> -}
> -EXPORT_SYMBOL_GPL(iov_iter_is_aligned);
> -
>  static unsigned long iov_iter_alignment_iovec(const struct iov_iter *i)
>  {
>  	const struct iovec *iov = iter_iov(i);
> -- 
> 2.47.3
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] direct-io: even more flexible io vectors
  2025-08-01 23:47 [PATCH 0/7] direct-io: even more flexible io vectors Keith Busch
                   ` (6 preceding siblings ...)
  2025-08-01 23:47 ` [PATCH 7/7] iov_iter: remove iov_iter_is_aligned Keith Busch
@ 2025-08-02 15:37 ` Jens Axboe
  2025-08-04 17:06   ` Keith Busch
  7 siblings, 1 reply; 29+ messages in thread
From: Jens Axboe @ 2025-08-02 15:37 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, dw, brauner, Keith Busch

On 8/1/25 5:47 PM, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> In furthering direct IO use from user space buffers without bouncing to
> align to unnecessary kernel software constraints, this series removes
> the requirement that io vector lengths align to the logical block size.
> The downside (if want to call it that) is that mis-aligned io vectors
> are caught further down the block stack rather than closer to the
> syscall.

That's not a downside imho, it's much nicer to have the correct/expected
case be fast, and catch the unexpected error case down the line when we
have to iterate the vecs anyway.

IOW, I love this patchset. I'll spend some time going over the details.
Did you write some test cases for this?

> This change also removes one walking of the io vector, so that's nice
> too.
> 
> Keith Busch (7):
>   block: check for valid bio while splitting
>   block: align the bio after building it
>   block: simplify direct io validity check
>   iomap: simplify direct io validity check
>   block: remove bdev_iter_is_aligned
>   blk-integrity: use simpler alignment check
>   iov_iter: remove iov_iter_is_aligned
> 
>  block/bio-integrity.c  |  4 +-
>  block/bio.c            | 58 +++++++++++++++++---------
>  block/blk-merge.c      |  5 +++
>  block/fops.c           |  4 +-
>  fs/iomap/direct-io.c   |  3 +-
>  include/linux/blkdev.h |  7 ----
>  include/linux/uio.h    |  2 -
>  lib/iov_iter.c         | 95 ------------------------------------------
>  8 files changed, 49 insertions(+), 129 deletions(-)

Now that's a beautiful diffstat.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/7] block: align the bio after building it
  2025-08-01 23:47 ` [PATCH 2/7] block: align the bio after building it Keith Busch
@ 2025-08-04  6:54   ` Hannes Reinecke
  2025-08-04 14:08     ` Keith Busch
  0 siblings, 1 reply; 29+ messages in thread
From: Hannes Reinecke @ 2025-08-04  6:54 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

On 8/2/25 01:47, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> Ensure the entire size is aligned after it's built instead of ensuring
> each vector is block size aligned while constructing it. This makes it
> more flexible to accepting device valid vectors that would otherwise get
> rejected by overzealous alignment checks.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>   block/bio.c | 58 +++++++++++++++++++++++++++++++++++------------------
>   1 file changed, 39 insertions(+), 19 deletions(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 92c512e876c8d..c050903e1be0c 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1227,13 +1227,6 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>   	if (bio->bi_bdev && blk_queue_pci_p2pdma(bio->bi_bdev->bd_disk->queue))
>   		extraction_flags |= ITER_ALLOW_P2PDMA;
>   
> -	/*
> -	 * Each segment in the iov is required to be a block size multiple.
> -	 * However, we may not be able to get the entire segment if it spans
> -	 * more pages than bi_max_vecs allows, so we have to ALIGN_DOWN the
> -	 * result to ensure the bio's total size is correct. The remainder of
> -	 * the iov data will be picked up in the next bio iteration.
> -	 */
>   	size = iov_iter_extract_pages(iter, &pages,
>   				      UINT_MAX - bio->bi_iter.bi_size,
>   				      nr_pages, extraction_flags, &offset);
> @@ -1241,18 +1234,6 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>   		return size ? size : -EFAULT;
>   
>   	nr_pages = DIV_ROUND_UP(offset + size, PAGE_SIZE);
> -
> -	if (bio->bi_bdev) {
> -		size_t trim = size & (bdev_logical_block_size(bio->bi_bdev) - 1);
> -		iov_iter_revert(iter, trim);
> -		size -= trim;
> -	}
> -
> -	if (unlikely(!size)) {
> -		ret = -EFAULT;
> -		goto out;
> -	}
> -
>   	for (left = size, i = 0; left > 0; left -= len, i += num_pages) {
>   		struct page *page = pages[i];
>   		struct folio *folio = page_folio(page);
> @@ -1297,6 +1278,44 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>   	return ret;
>   }
>   
> +static inline void bio_revert(struct bio *bio, unsigned int nbytes)
> +{
> +	bio->bi_iter.bi_size -= nbytes;
> +
> +	while (nbytes) {
> +		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
> +
> +		if (nbytes < bv->bv_len) {
> +			bv->bv_len -= nbytes;
> +			return;
> +		}
> +
> +		bio_release_page(bio, bv->bv_page);
> +		bio->bi_vcnt--;
> +		nbytes -= bv->bv_len;
> +       }
> +}
> +
> +static int bio_align_to_lbs(struct bio *bio, struct iov_iter *iter)
> +{
> +	struct block_device *bdev = bio->bi_bdev;
> +	size_t nbytes;
> +
> +	if (!bdev)
> +		return 0;
> +
> +	nbytes = bio->bi_iter.bi_size & (bdev_logical_block_size(bdev) - 1);
> +	if (!nbytes)
> +		return 0;
> +
> +	bio_revert(bio, nbytes);
> +	iov_iter_revert(iter, nbytes);
> +	if (!bio->bi_iter.bi_size)
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
>   /**
>    * bio_iov_iter_get_pages - add user or kernel pages to a bio
>    * @bio: bio to add pages to
> @@ -1336,6 +1355,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>   		ret = __bio_iov_iter_get_pages(bio, iter);
>   	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
>   
> +	ret = bio_align_to_lbs(bio, iter);
>   	return bio->bi_vcnt ? 0 : ret;

Wouldn't that cause the error from bio_align_to_lba() to be ignored
if bio->bi_vcnt is greater than 0?

>   }
>   EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);

Cheers,

Hsnnes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] block: check for valid bio while splitting
  2025-08-01 23:47 ` [PATCH 1/7] block: check for valid bio while splitting Keith Busch
@ 2025-08-04  6:54   ` Hannes Reinecke
  0 siblings, 0 replies; 29+ messages in thread
From: Hannes Reinecke @ 2025-08-04  6:54 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

On 8/2/25 01:47, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> We're already iterating every segment, so check alignment for a valid
> IO at the same time. We had depended on these constraints were already
> checked prior to splitting, but let's put more responsibility here since
> splitting iterates each segment before dispatching to the driver anyway.
> This way, upper layers don't need to concern themselves with it.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>   block/blk-merge.c | 5 +++++
>   1 file changed, 5 insertions(+)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/7] block: simplify direct io validity check
  2025-08-01 23:47 ` [PATCH 3/7] block: simplify direct io validity check Keith Busch
@ 2025-08-04  6:55   ` Hannes Reinecke
  2025-08-04 17:11     ` Keith Busch
  0 siblings, 1 reply; 29+ messages in thread
From: Hannes Reinecke @ 2025-08-04  6:55 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

On 8/2/25 01:47, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> The block layer checks all the segments for validity later, so no need
> for an early check. Just reduce it to a simple position and total length
> and defer the segment checks to the block layer.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>   block/fops.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/fops.c b/block/fops.c
> index 82451ac8ff25d..820902cf10730 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -38,8 +38,8 @@ static blk_opf_t dio_bio_write_op(struct kiocb *iocb)
>   static bool blkdev_dio_invalid(struct block_device *bdev, struct kiocb *iocb,
>   				struct iov_iter *iter)
>   {
> -	return iocb->ki_pos & (bdev_logical_block_size(bdev) - 1) ||
> -		!bdev_iter_is_aligned(bdev, iter);
> +	return (iocb->ki_pos | iov_iter_count(iter)) &
> +			(bdev_logical_block_size(bdev) - 1);

Bitwise or? Sure?

>   }
>   
>   #define DIO_INLINE_BIO_VECS 4

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/7] iomap: simplify direct io validity check
  2025-08-01 23:47 ` [PATCH 4/7] iomap: " Keith Busch
@ 2025-08-04  6:57   ` Hannes Reinecke
  0 siblings, 0 replies; 29+ messages in thread
From: Hannes Reinecke @ 2025-08-04  6:57 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

On 8/2/25 01:47, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> The block layer checks all the segments for validity later, so no need
> for an early check. Just reduce it to a simple position and total length
> and defer the segment checks to the block layer.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>   fs/iomap/direct-io.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 6f25d4cfea9f7..2c1a45e46dc75 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -337,8 +337,7 @@ static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
>   	u64 copied = 0;
>   	size_t orig_count;
>   
> -	if ((pos | length) & (bdev_logical_block_size(iomap->bdev) - 1) ||
> -	    !bdev_iter_is_aligned(iomap->bdev, dio->submit.iter))
> +	if ((pos | length) & (bdev_logical_block_size(iomap->bdev) - 1))
>   		return -EINVAL;
>   
>   	if (dio->flags & IOMAP_DIO_WRITE) {

Similar here: why a bitwise or?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/7] block: remove bdev_iter_is_aligned
  2025-08-01 23:47 ` [PATCH 5/7] block: remove bdev_iter_is_aligned Keith Busch
@ 2025-08-04  6:57   ` Hannes Reinecke
  0 siblings, 0 replies; 29+ messages in thread
From: Hannes Reinecke @ 2025-08-04  6:57 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

On 8/2/25 01:47, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> No more callers.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>   include/linux/blkdev.h | 7 -------
>   1 file changed, 7 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/7] blk-integrity: use simpler alignment check
  2025-08-01 23:47 ` [PATCH 6/7] blk-integrity: use simpler alignment check Keith Busch
@ 2025-08-04  6:58   ` Hannes Reinecke
  0 siblings, 0 replies; 29+ messages in thread
From: Hannes Reinecke @ 2025-08-04  6:58 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-fsdevel, linux-kernel
  Cc: snitzer, axboe, dw, brauner, Keith Busch

On 8/2/25 01:47, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> We're checking length and addresses against the same alignment value, so
> use the more simple iterator check.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>   block/bio-integrity.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/7] block: align the bio after building it
  2025-08-04  6:54   ` Hannes Reinecke
@ 2025-08-04 14:08     ` Keith Busch
  2025-08-04 16:47       ` Keith Busch
  2025-08-05  6:54       ` Hannes Reinecke
  0 siblings, 2 replies; 29+ messages in thread
From: Keith Busch @ 2025-08-04 14:08 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, snitzer,
	axboe, dw, brauner

On Mon, Aug 04, 2025 at 08:54:00AM +0200, Hannes Reinecke wrote:
> On 8/2/25 01:47, Keith Busch wrote:
> > +static int bio_align_to_lbs(struct bio *bio, struct iov_iter *iter)
> > +{
> > +	struct block_device *bdev = bio->bi_bdev;
> > +	size_t nbytes;
> > +
> > +	if (!bdev)
> > +		return 0;
> > +
> > +	nbytes = bio->bi_iter.bi_size & (bdev_logical_block_size(bdev) - 1);
> > +	if (!nbytes)
> > +		return 0;
> > +
> > +	bio_revert(bio, nbytes);
> > +	iov_iter_revert(iter, nbytes);
> > +	if (!bio->bi_iter.bi_size)
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}
> > +
> >   /**
> >    * bio_iov_iter_get_pages - add user or kernel pages to a bio
> >    * @bio: bio to add pages to
> > @@ -1336,6 +1355,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> >   		ret = __bio_iov_iter_get_pages(bio, iter);
> >   	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
> > +	ret = bio_align_to_lbs(bio, iter);
> >   	return bio->bi_vcnt ? 0 : ret;
> 
> Wouldn't that cause the error from bio_align_to_lba() to be ignored
> if bio->bi_vcnt is greater than 0?

That returns an error only if the alignment reduces the size to 0, so
there would be a bug somewhere if bi_vcnt is not also 0 in that case.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 7/7] iov_iter: remove iov_iter_is_aligned
  2025-08-02  2:02   ` Mike Snitzer
@ 2025-08-04 14:16     ` Keith Busch
  2025-08-04 15:25       ` Mike Snitzer
  0 siblings, 1 reply; 29+ messages in thread
From: Keith Busch @ 2025-08-04 14:16 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, axboe, dw,
	brauner, Chuck Lever, linux-nfs

On Fri, Aug 01, 2025 at 10:02:49PM -0400, Mike Snitzer wrote:
> On Fri, Aug 01, 2025 at 04:47:36PM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@kernel.org>
> > 
> > No more callers.
> > 
> > Signed-off-by: Keith Busch <kbusch@kernel.org>
> 
> You had me up until this last patch.
> 
> I'm actually making use of iov_iter_is_aligned() in a series of
> changes for both NFS and NFSD.  Chuck has included some of the
> NFSD changes in his nfsd-testing branch, see:
> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-testing&id=5d78ac1e674b45f9c9e3769b48efb27c44f4e4d3
> 
> And the balance of my work that is pending review/inclusion is:
> https://lore.kernel.org/linux-nfs/20250731230633.89983-1-snitzer@kernel.org/
> https://lore.kernel.org/linux-nfs/20250801171049.94235-1-snitzer@kernel.org/
> 
> I only need iov_iter_aligned_bvec, but recall I want to relax its
> checking with this patch:
> https://lore.kernel.org/linux-nfs/20250708160619.64800-5-snitzer@kernel.org/
> 
> Should I just add iov_iter_aligned_bvec() to fs/nfs_common/ so that
> both NFS and NFSD can use it?

If at all possible, I recommend finding a place that already walks the
vectors and do an opprotunistic check for the alignments there. This
will save CPU cycles. For example, nfsd_iter_read already iterates the
bvec while setting each page. Could you check the alignment while doing
that instead of iterating a second time immediately after?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 7/7] iov_iter: remove iov_iter_is_aligned
  2025-08-04 14:16     ` Keith Busch
@ 2025-08-04 15:25       ` Mike Snitzer
  2025-08-04 15:27         ` Mike Snitzer
  2025-08-04 22:26         ` Mike Snitzer
  0 siblings, 2 replies; 29+ messages in thread
From: Mike Snitzer @ 2025-08-04 15:25 UTC (permalink / raw)
  To: Keith Busch
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, axboe, dw,
	brauner, Chuck Lever, linux-nfs

On Mon, Aug 04, 2025 at 08:16:39AM -0600, Keith Busch wrote:
> On Fri, Aug 01, 2025 at 10:02:49PM -0400, Mike Snitzer wrote:
> > On Fri, Aug 01, 2025 at 04:47:36PM -0700, Keith Busch wrote:
> > > From: Keith Busch <kbusch@kernel.org>
> > > 
> > > No more callers.
> > > 
> > > Signed-off-by: Keith Busch <kbusch@kernel.org>
> > 
> > You had me up until this last patch.
> > 
> > I'm actually making use of iov_iter_is_aligned() in a series of
> > changes for both NFS and NFSD.  Chuck has included some of the
> > NFSD changes in his nfsd-testing branch, see:
> > https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-testing&id=5d78ac1e674b45f9c9e3769b48efb27c44f4e4d3
> > 
> > And the balance of my work that is pending review/inclusion is:
> > https://lore.kernel.org/linux-nfs/20250731230633.89983-1-snitzer@kernel.org/
> > https://lore.kernel.org/linux-nfs/20250801171049.94235-1-snitzer@kernel.org/
> > 
> > I only need iov_iter_aligned_bvec, but recall I want to relax its
> > checking with this patch:
> > https://lore.kernel.org/linux-nfs/20250708160619.64800-5-snitzer@kernel.org/
> > 
> > Should I just add iov_iter_aligned_bvec() to fs/nfs_common/ so that
> > both NFS and NFSD can use it?
> 
> If at all possible, I recommend finding a place that already walks the
> vectors and do an opprotunistic check for the alignments there. This
> will save CPU cycles. For example, nfsd_iter_read already iterates the
> bvec while setting each page. Could you check the alignment while doing
> that instead of iterating a second time immediately after?

Nice goal, I'll see if I can pull it off.

I'm currently using iov_iter_is_aligned() in 3 new call sites (for
READ and WRITE) in both NFS and NFSD: nfs_local_iter_init,
nfsd_iter_read, nfsd_vfs_write

nfsd_vfs_write is the only one that doesn't iterate the bvec as-is,
but it does work that _should_ obviate the need to doable check the
alignment.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 7/7] iov_iter: remove iov_iter_is_aligned
  2025-08-04 15:25       ` Mike Snitzer
@ 2025-08-04 15:27         ` Mike Snitzer
  2025-08-04 22:26         ` Mike Snitzer
  1 sibling, 0 replies; 29+ messages in thread
From: Mike Snitzer @ 2025-08-04 15:27 UTC (permalink / raw)
  To: Keith Busch
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, axboe, dw,
	brauner, Chuck Lever, linux-nfs

On Mon, Aug 04, 2025 at 11:25:08AM -0400, Mike Snitzer wrote:
> On Mon, Aug 04, 2025 at 08:16:39AM -0600, Keith Busch wrote:
> > On Fri, Aug 01, 2025 at 10:02:49PM -0400, Mike Snitzer wrote:
> > > On Fri, Aug 01, 2025 at 04:47:36PM -0700, Keith Busch wrote:
> > > > From: Keith Busch <kbusch@kernel.org>
> > > > 
> > > > No more callers.
> > > > 
> > > > Signed-off-by: Keith Busch <kbusch@kernel.org>
> > > 
> > > You had me up until this last patch.
> > > 
> > > I'm actually making use of iov_iter_is_aligned() in a series of
> > > changes for both NFS and NFSD.  Chuck has included some of the
> > > NFSD changes in his nfsd-testing branch, see:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-testing&id=5d78ac1e674b45f9c9e3769b48efb27c44f4e4d3
> > > 
> > > And the balance of my work that is pending review/inclusion is:
> > > https://lore.kernel.org/linux-nfs/20250731230633.89983-1-snitzer@kernel.org/
> > > https://lore.kernel.org/linux-nfs/20250801171049.94235-1-snitzer@kernel.org/
> > > 
> > > I only need iov_iter_aligned_bvec, but recall I want to relax its
> > > checking with this patch:
> > > https://lore.kernel.org/linux-nfs/20250708160619.64800-5-snitzer@kernel.org/
> > > 
> > > Should I just add iov_iter_aligned_bvec() to fs/nfs_common/ so that
> > > both NFS and NFSD can use it?
> > 
> > If at all possible, I recommend finding a place that already walks the
> > vectors and do an opprotunistic check for the alignments there. This
> > will save CPU cycles. For example, nfsd_iter_read already iterates the
> > bvec while setting each page. Could you check the alignment while doing
> > that instead of iterating a second time immediately after?
> 
> Nice goal, I'll see if I can pull it off.
> 
> I'm currently using iov_iter_is_aligned() in 3 new call sites (for
> READ and WRITE) in both NFS and NFSD: nfs_local_iter_init,
> nfsd_iter_read, nfsd_vfs_write
> 
> nfsd_vfs_write is the only one that doesn't iterate the bvec as-is,
> but it does work that _should_ obviate the need to doable check the
> alignment.

Freudian typo: s/doable/double/ :)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/7] block: align the bio after building it
  2025-08-04 14:08     ` Keith Busch
@ 2025-08-04 16:47       ` Keith Busch
  2025-08-05  6:54       ` Hannes Reinecke
  1 sibling, 0 replies; 29+ messages in thread
From: Keith Busch @ 2025-08-04 16:47 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, snitzer,
	axboe, dw, brauner

On Mon, Aug 04, 2025 at 08:08:38AM -0600, Keith Busch wrote:
> On Mon, Aug 04, 2025 at 08:54:00AM +0200, Hannes Reinecke wrote:
> > On 8/2/25 01:47, Keith Busch wrote:
> > > +static int bio_align_to_lbs(struct bio *bio, struct iov_iter *iter)
> > > +{
> > > +	struct block_device *bdev = bio->bi_bdev;
> > > +	size_t nbytes;
> > > +
> > > +	if (!bdev)
> > > +		return 0;
> > > +
> > > +	nbytes = bio->bi_iter.bi_size & (bdev_logical_block_size(bdev) - 1);
> > > +	if (!nbytes)
> > > +		return 0;
> > > +
> > > +	bio_revert(bio, nbytes);
> > > +	iov_iter_revert(iter, nbytes);
> > > +	if (!bio->bi_iter.bi_size)
> > > +		return -EFAULT;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >   /**
> > >    * bio_iov_iter_get_pages - add user or kernel pages to a bio
> > >    * @bio: bio to add pages to
> > > @@ -1336,6 +1355,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> > >   		ret = __bio_iov_iter_get_pages(bio, iter);
> > >   	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
> > > +	ret = bio_align_to_lbs(bio, iter);
> > >   	return bio->bi_vcnt ? 0 : ret;
> > 
> > Wouldn't that cause the error from bio_align_to_lba() to be ignored
> > if bio->bi_vcnt is greater than 0?
> 
> That returns an error only if the alignment reduces the size to 0, so
> there would be a bug somewhere if bi_vcnt is not also 0 in that case.

But there is definitely a problem the other-way-around: if bi_vcnt is
already 0 because the first vector was a bad address, then my patch here
is mistakenly overriding 'ret' to indicate success when it wasn't.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] direct-io: even more flexible io vectors
  2025-08-02 15:37 ` [PATCH 0/7] direct-io: even more flexible io vectors Jens Axboe
@ 2025-08-04 17:06   ` Keith Busch
  2025-08-04 23:45     ` Keith Busch
  0 siblings, 1 reply; 29+ messages in thread
From: Keith Busch @ 2025-08-04 17:06 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, snitzer,
	dw, brauner

On Sat, Aug 02, 2025 at 09:37:32AM -0600, Jens Axboe wrote:
> Did you write some test cases for this?

I have some crude unit tests to hit specific conditions that might
happen with nvme.

Note, the "second" test here will fail with the wrong result with this
version of the patchset due to the issue I mentioned on patch 2, but
I've a fix for it ready for the next version.

---
/*
 * This test is aligned to NVMe's PRP virtual boundary. It is intended to
 * execute on such a device with 4k formatted logical block size.
 *
 * The first test will submit a vectored read with a total size aligned to a 4k
 * block, but individual vectors may not be. This should be successful.
 *
 * The second test will submit a vectored read with a total size aligned to a
 * 4k block, but the first vector contains an invalid address. This should get
 * EFAULT.
 *
 * The third one will submit an IO with a total size aligned to a 4k block,
 * but it will fail the virtual boundary condition, which should result in a
 * split to a 0 length bio. This should get an EINVAL.
 *
 * The fourth test will submit IO with a total size aligned to a 4k block, but
 * with invalid DMA offsets. This should get an EINVAL.
 *
 * The last test will submit a large IO with a page offset that should exceed
 * the bio max vectors limit, resulting in reverting part of a bio iteration.
 * This should be successful.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <sys/uio.h>
#include <string.h>

#define BSIZE (8 * 1024 * 1024)
#define VECS 4

int main(int argc, char **argv)
{
        int fd, ret, i, j;
        struct iovec iov[VECS];
        char *buf;

        if (argc < 2)
                return -1;

        fd = open(argv[1], O_RDONLY | O_DIRECT);
        if (fd < 0)
                return fd;

        ret = posix_memalign((void **)&buf, 4096, BSIZE);
        if (ret)
                return ret;

        memset(buf, 0, BSIZE);

        iov[0].iov_base = buf + 3072;
        iov[0].iov_len = 1024;

        iov[1].iov_base = buf + (2 * 4096);
        iov[1].iov_len = 4096;

        iov[2].iov_base = buf + (8 * 4096);
        iov[2].iov_len = 4096;

        iov[3].iov_base = buf + (16 * 4096);
        iov[3].iov_len = 3072;

        ret = preadv(fd, iov, VECS, 0);
        if (ret < 0)
                perror("unexpected read failure");

        iov[0].iov_base = 0;
        ret = preadv(fd, iov, VECS, 0);
        if (ret < 0)
                perror("expected read failure for invalid address");

        iov[0].iov_base = buf;
        iov[0].iov_len = 1024;

        iov[1].iov_base = buf + (2 * 4096);
        iov[1].iov_len = 1024;

        iov[2].iov_base = buf + (8 * 4096);
        iov[2].iov_len = 1024;

        iov[3].iov_base = buf + (16 * 4096);
        iov[3].iov_len = 1024;

        ret = preadv(fd, iov, VECS, 0);
        if (ret < 0)
                perror("expected read for invalid virtual boundary");

        iov[0].iov_base = buf + 3072;
        iov[0].iov_len = 1025;

        iov[1].iov_base = buf + (2 * 4096);
        iov[1].iov_len = 4096;

        iov[2].iov_base = buf + (8 * 4096);
        iov[2].iov_len = 4096;

        iov[3].iov_base = buf + (16 * 4096);
        iov[3].iov_len = 3073;

        ret = preadv(fd, iov, VECS, 0);
        if (ret < 0)
                perror("expected read for invalid dma boundary");

        ret = pread(fd, buf + 2048, BSIZE - 8192, 0);
        if (ret < 0)
                perror("unexpected large read failure");

        free(buf);
        return errno;
}
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/7] block: simplify direct io validity check
  2025-08-04  6:55   ` Hannes Reinecke
@ 2025-08-04 17:11     ` Keith Busch
  2025-08-05  6:57       ` Hannes Reinecke
  0 siblings, 1 reply; 29+ messages in thread
From: Keith Busch @ 2025-08-04 17:11 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, snitzer,
	axboe, dw, brauner

On Mon, Aug 04, 2025 at 08:55:36AM +0200, Hannes Reinecke wrote:
> On 8/2/25 01:47, Keith Busch wrote:
> >   static bool blkdev_dio_invalid(struct block_device *bdev, struct kiocb *iocb,
> >   				struct iov_iter *iter)
> >   {
> > -	return iocb->ki_pos & (bdev_logical_block_size(bdev) - 1) ||
> > -		!bdev_iter_is_aligned(bdev, iter);
> > +	return (iocb->ki_pos | iov_iter_count(iter)) &
> > +			(bdev_logical_block_size(bdev) - 1);
> 
> Bitwise or? Sure?

Yep, this is correct. We need to return an error if either the size or
offset are not aligned to the block size. "Or"ing the two together gets
us a single check against the logical block size mask instead of doing
each individually. 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 7/7] iov_iter: remove iov_iter_is_aligned
  2025-08-04 15:25       ` Mike Snitzer
  2025-08-04 15:27         ` Mike Snitzer
@ 2025-08-04 22:26         ` Mike Snitzer
  2025-08-04 22:57           ` Keith Busch
  1 sibling, 1 reply; 29+ messages in thread
From: Mike Snitzer @ 2025-08-04 22:26 UTC (permalink / raw)
  To: Keith Busch
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, axboe, dw,
	brauner, Chuck Lever, linux-nfs

On Mon, Aug 04, 2025 at 11:25:08AM -0400, Mike Snitzer wrote:
> On Mon, Aug 04, 2025 at 08:16:39AM -0600, Keith Busch wrote:
> > On Fri, Aug 01, 2025 at 10:02:49PM -0400, Mike Snitzer wrote:
> > > On Fri, Aug 01, 2025 at 04:47:36PM -0700, Keith Busch wrote:
> > > > From: Keith Busch <kbusch@kernel.org>
> > > > 
> > > > No more callers.
> > > > 
> > > > Signed-off-by: Keith Busch <kbusch@kernel.org>
> > > 
> > > You had me up until this last patch.
> > > 
> > > I'm actually making use of iov_iter_is_aligned() in a series of
> > > changes for both NFS and NFSD.  Chuck has included some of the
> > > NFSD changes in his nfsd-testing branch, see:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-testing&id=5d78ac1e674b45f9c9e3769b48efb27c44f4e4d3
> > > 
> > > And the balance of my work that is pending review/inclusion is:
> > > https://lore.kernel.org/linux-nfs/20250731230633.89983-1-snitzer@kernel.org/
> > > https://lore.kernel.org/linux-nfs/20250801171049.94235-1-snitzer@kernel.org/
> > > 
> > > I only need iov_iter_aligned_bvec, but recall I want to relax its
> > > checking with this patch:
> > > https://lore.kernel.org/linux-nfs/20250708160619.64800-5-snitzer@kernel.org/
> > > 
> > > Should I just add iov_iter_aligned_bvec() to fs/nfs_common/ so that
> > > both NFS and NFSD can use it?
> > 
> > If at all possible, I recommend finding a place that already walks the
> > vectors and do an opprotunistic check for the alignments there. This
> > will save CPU cycles. For example, nfsd_iter_read already iterates the
> > bvec while setting each page. Could you check the alignment while doing
> > that instead of iterating a second time immediately after?
> 
> Nice goal, I'll see if I can pull it off.
> 
> I'm currently using iov_iter_is_aligned() in 3 new call sites (for
> READ and WRITE) in both NFS and NFSD: nfs_local_iter_init,
> nfsd_iter_read, nfsd_vfs_write
> 
> nfsd_vfs_write is the only one that doesn't iterate the bvec as-is,
> but it does work that _should_ obviate the need to doable check the
> alignment.

FYI, I was able to avoid using iov_iter_is_aligned() in favor of
checks in earlier code (in both NFSD and NFS).

Thanks,
Mike

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 7/7] iov_iter: remove iov_iter_is_aligned
  2025-08-04 22:26         ` Mike Snitzer
@ 2025-08-04 22:57           ` Keith Busch
  0 siblings, 0 replies; 29+ messages in thread
From: Keith Busch @ 2025-08-04 22:57 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, axboe, dw,
	brauner, Chuck Lever, linux-nfs

On Mon, Aug 04, 2025 at 06:26:55PM -0400, Mike Snitzer wrote:
> FYI, I was able to avoid using iov_iter_is_aligned() in favor of
> checks in earlier code (in both NFSD and NFS).

Excellent! I promise removing the extra iteration is totally worth it. ;)

I just know of one error case bug mentioned in patch 2, so unless I hear
anything else, I'll spin out the new version with a fix for just that
tomorrow.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] direct-io: even more flexible io vectors
  2025-08-04 17:06   ` Keith Busch
@ 2025-08-04 23:45     ` Keith Busch
  0 siblings, 0 replies; 29+ messages in thread
From: Keith Busch @ 2025-08-04 23:45 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, snitzer,
	dw, brauner

On Mon, Aug 04, 2025 at 11:06:12AM -0600, Keith Busch wrote:
> On Sat, Aug 02, 2025 at 09:37:32AM -0600, Jens Axboe wrote:
> > Did you write some test cases for this?
> 
> I have some crude unit tests to hit specific conditions that might
> happen with nvme.

I've made imporvements today that make these targeted tests fit into
blktests framework.

Just fyi, I took a look at what 'fio' needs in order to exercise these
new use cases. This patchset requires multiple io-vectors for anything
interesting to happen, which 'fio' currently doesn't do. I'm not even
sure what new command line parameters could best convey how you want to
construct iovecs! Maybe just make it random within some alignment
constraints?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 7/7] iov_iter: remove iov_iter_is_aligned
  2025-08-01 23:47 ` [PATCH 7/7] iov_iter: remove iov_iter_is_aligned Keith Busch
  2025-08-02  2:02   ` Mike Snitzer
@ 2025-08-05  0:24   ` Mike Snitzer
  1 sibling, 0 replies; 29+ messages in thread
From: Mike Snitzer @ 2025-08-05  0:24 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-fsdevel, linux-kernel, axboe, dw, brauner,
	Keith Busch

On Fri, Aug 01, 2025 at 04:47:36PM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> No more callers.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>

Reviewed-by: Mike Snitzer <snitzer@kernel.org>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/7] block: align the bio after building it
  2025-08-04 14:08     ` Keith Busch
  2025-08-04 16:47       ` Keith Busch
@ 2025-08-05  6:54       ` Hannes Reinecke
  1 sibling, 0 replies; 29+ messages in thread
From: Hannes Reinecke @ 2025-08-05  6:54 UTC (permalink / raw)
  To: Keith Busch
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, snitzer,
	axboe, dw, brauner

On 8/4/25 16:08, Keith Busch wrote:
> On Mon, Aug 04, 2025 at 08:54:00AM +0200, Hannes Reinecke wrote:
>> On 8/2/25 01:47, Keith Busch wrote:
>>> +static int bio_align_to_lbs(struct bio *bio, struct iov_iter *iter)
>>> +{
>>> +	struct block_device *bdev = bio->bi_bdev;
>>> +	size_t nbytes;
>>> +
>>> +	if (!bdev)
>>> +		return 0;
>>> +
>>> +	nbytes = bio->bi_iter.bi_size & (bdev_logical_block_size(bdev) - 1);
>>> +	if (!nbytes)
>>> +		return 0;
>>> +
>>> +	bio_revert(bio, nbytes);
>>> +	iov_iter_revert(iter, nbytes);
>>> +	if (!bio->bi_iter.bi_size)
>>> +		return -EFAULT;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>>    /**
>>>     * bio_iov_iter_get_pages - add user or kernel pages to a bio
>>>     * @bio: bio to add pages to
>>> @@ -1336,6 +1355,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>>>    		ret = __bio_iov_iter_get_pages(bio, iter);
>>>    	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
>>> +	ret = bio_align_to_lbs(bio, iter);
>>>    	return bio->bi_vcnt ? 0 : ret;
>>
>> Wouldn't that cause the error from bio_align_to_lba() to be ignored
>> if bio->bi_vcnt is greater than 0?
> 
> That returns an error only if the alignment reduces the size to 0, so
> there would be a bug somewhere if bi_vcnt is not also 0 in that case.

It would, but we wouldn't be seeing it as 'ret' would be obscured
if 'bio->bi_vcnt' continues to be greater than zero and 'ret' is set.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/7] block: simplify direct io validity check
  2025-08-04 17:11     ` Keith Busch
@ 2025-08-05  6:57       ` Hannes Reinecke
  0 siblings, 0 replies; 29+ messages in thread
From: Hannes Reinecke @ 2025-08-05  6:57 UTC (permalink / raw)
  To: Keith Busch
  Cc: Keith Busch, linux-block, linux-fsdevel, linux-kernel, snitzer,
	axboe, dw, brauner

On 8/4/25 19:11, Keith Busch wrote:
> On Mon, Aug 04, 2025 at 08:55:36AM +0200, Hannes Reinecke wrote:
>> On 8/2/25 01:47, Keith Busch wrote:
>>>    static bool blkdev_dio_invalid(struct block_device *bdev, struct kiocb *iocb,
>>>    				struct iov_iter *iter)
>>>    {
>>> -	return iocb->ki_pos & (bdev_logical_block_size(bdev) - 1) ||
>>> -		!bdev_iter_is_aligned(bdev, iter);
>>> +	return (iocb->ki_pos | iov_iter_count(iter)) &
>>> +			(bdev_logical_block_size(bdev) - 1);
>>
>> Bitwise or? Sure?
> 
> Yep, this is correct. We need to return an error if either the size or
> offset are not aligned to the block size. "Or"ing the two together gets
> us a single check against the logical block size mask instead of doing
> each individually.

Oh, my. Wasn't aware that we're running an obfuscated C-contest ...

Anyway.

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2025-08-05  6:57 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-01 23:47 [PATCH 0/7] direct-io: even more flexible io vectors Keith Busch
2025-08-01 23:47 ` [PATCH 1/7] block: check for valid bio while splitting Keith Busch
2025-08-04  6:54   ` Hannes Reinecke
2025-08-01 23:47 ` [PATCH 2/7] block: align the bio after building it Keith Busch
2025-08-04  6:54   ` Hannes Reinecke
2025-08-04 14:08     ` Keith Busch
2025-08-04 16:47       ` Keith Busch
2025-08-05  6:54       ` Hannes Reinecke
2025-08-01 23:47 ` [PATCH 3/7] block: simplify direct io validity check Keith Busch
2025-08-04  6:55   ` Hannes Reinecke
2025-08-04 17:11     ` Keith Busch
2025-08-05  6:57       ` Hannes Reinecke
2025-08-01 23:47 ` [PATCH 4/7] iomap: " Keith Busch
2025-08-04  6:57   ` Hannes Reinecke
2025-08-01 23:47 ` [PATCH 5/7] block: remove bdev_iter_is_aligned Keith Busch
2025-08-04  6:57   ` Hannes Reinecke
2025-08-01 23:47 ` [PATCH 6/7] blk-integrity: use simpler alignment check Keith Busch
2025-08-04  6:58   ` Hannes Reinecke
2025-08-01 23:47 ` [PATCH 7/7] iov_iter: remove iov_iter_is_aligned Keith Busch
2025-08-02  2:02   ` Mike Snitzer
2025-08-04 14:16     ` Keith Busch
2025-08-04 15:25       ` Mike Snitzer
2025-08-04 15:27         ` Mike Snitzer
2025-08-04 22:26         ` Mike Snitzer
2025-08-04 22:57           ` Keith Busch
2025-08-05  0:24   ` Mike Snitzer
2025-08-02 15:37 ` [PATCH 0/7] direct-io: even more flexible io vectors Jens Axboe
2025-08-04 17:06   ` Keith Busch
2025-08-04 23:45     ` Keith Busch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).