linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv4 0/8]
@ 2025-08-27 14:12 Keith Busch
  2025-08-27 14:12 ` [PATCHv4 1/8] block: check for valid bio while splitting Keith Busch
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Keith Busch @ 2025-08-27 14:12 UTC (permalink / raw)
  To: linux-block, linux-fsdevel; +Cc: linux-xfs, linux-ext4, hch, axboe, Keith Busch

From: Keith Busch <kbusch@kernel.org>

Previous version:

  https://lore.kernel.org/linux-block/20250819164922.640964-1-kbusch@meta.com/

This series removes the direct io requirement that io vector lengths
align to the logical block size. There are two primary benefits from
doing this:

  1. It allows user space more flexibility in what kind of io vectors
     are accepted, removing the need to bounce their data to specially
     aligned buffers.

  2. By moving the alignment checks to later when the segments are
     already being checked, we remove one more iov walk per IO, reducing
     CPU utilization and submission latency.

Same as previously, I've tested direct IO on raw block, xfs, ext4, and btrfs.

Changes from v3:

  - Added reviews

  - Code style and comment updates

Keith Busch (8):
  block: check for valid bio while splitting
  block: add size alignment to bio_iov_iter_get_pages
  block: align the bio after building it
  block: simplify direct io validity check
  iomap: simplify direct io validity check
  block: remove bdev_iter_is_aligned
  blk-integrity: use simpler alignment check
  iov_iter: remove iov_iter_is_aligned

 block/bio-integrity.c  |  4 +-
 block/bio.c            | 64 ++++++++++++++++++----------
 block/blk-map.c        |  2 +-
 block/blk-merge.c      | 21 ++++++++--
 block/fops.c           | 10 ++---
 fs/iomap/direct-io.c   |  5 +--
 include/linux/bio.h    | 13 ++++--
 include/linux/blkdev.h | 21 ++++++----
 include/linux/uio.h    |  2 -
 lib/iov_iter.c         | 95 ------------------------------------------
 10 files changed, 92 insertions(+), 145 deletions(-)

-- 
2.47.3


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCHv4 1/8] block: check for valid bio while splitting
  2025-08-27 14:12 [PATCHv4 0/8] Keith Busch
@ 2025-08-27 14:12 ` Keith Busch
  2025-08-31  0:40   ` Martin K. Petersen
  2025-08-27 14:12 ` [PATCHv4 2/8] block: add size alignment to bio_iov_iter_get_pages Keith Busch
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Keith Busch @ 2025-08-27 14:12 UTC (permalink / raw)
  To: linux-block, linux-fsdevel; +Cc: linux-xfs, linux-ext4, hch, axboe, Keith Busch

From: Keith Busch <kbusch@kernel.org>

We're already iterating every segment, so check these for a valid IO
lengths at the same time. Individual segment lengths will not be checked
on passthrough commands. The read/write command segments must be sized
to the dma alignment.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-map.c        |  2 +-
 block/blk-merge.c      | 21 +++++++++++++++++----
 include/linux/bio.h    |  4 ++--
 include/linux/blkdev.h |  7 +++++++
 4 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 23e5d5ebe59ec..6d1268aa82715 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -443,7 +443,7 @@ int blk_rq_append_bio(struct request *rq, struct bio *bio)
 	int ret;
 
 	/* check that the data layout matches the hardware restrictions */
-	ret = bio_split_rw_at(bio, lim, &nr_segs, max_bytes);
+	ret = bio_split_io_at(bio, lim, &nr_segs, max_bytes, 0);
 	if (ret) {
 		/* if we would have to split the bio, copy instead */
 		if (ret > 0)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 70d704615be52..cffc0fe48d8a3 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -279,25 +279,30 @@ static unsigned int bio_split_alignment(struct bio *bio,
 }
 
 /**
- * bio_split_rw_at - check if and where to split a read/write bio
+ * bio_split_io_at - check if and where to split a bio
  * @bio:  [in] bio to be split
  * @lim:  [in] queue limits to split based on
  * @segs: [out] number of segments in the bio with the first half of the sectors
  * @max_bytes: [in] maximum number of bytes per bio
+ * @len_align_mask: [in] length alignment mask for each vector
  *
  * Find out if @bio needs to be split to fit the queue limits in @lim and a
  * maximum size of @max_bytes.  Returns a negative error number if @bio can't be
  * split, 0 if the bio doesn't have to be split, or a positive sector offset if
  * @bio needs to be split.
  */
-int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim,
-		unsigned *segs, unsigned max_bytes)
+int bio_split_io_at(struct bio *bio, const struct queue_limits *lim,
+		unsigned *segs, unsigned max_bytes, unsigned len_align_mask)
 {
 	struct bio_vec bv, bvprv, *bvprvp = NULL;
 	struct bvec_iter iter;
 	unsigned nsegs = 0, bytes = 0;
 
 	bio_for_each_bvec(bv, bio, iter) {
+		if (bv.bv_offset & lim->dma_alignment ||
+		    bv.bv_len & len_align_mask)
+			return -EINVAL;
+
 		/*
 		 * If the queue doesn't support SG gaps and adding this
 		 * offset would create a gap, disallow it.
@@ -339,8 +344,16 @@ int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim,
 	 * Individual bvecs might not be logical block aligned. Round down the
 	 * split size so that each bio is properly block size aligned, even if
 	 * we do not use the full hardware limits.
+	 *
+	 * It is possible to submit a bio that can't be split into a valid io:
+	 * there may either be too many discontiguous vectors for the max
+	 * segments limit, or contain virtual boundary gaps without having a
+	 * valid block sized split. A zero byte result means one of those
+	 * conditions occured.
 	 */
 	bytes = ALIGN_DOWN(bytes, bio_split_alignment(bio, lim));
+	if (!bytes)
+		return -EINVAL;
 
 	/*
 	 * Bio splitting may cause subtle trouble such as hang when doing sync
@@ -350,7 +363,7 @@ int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim,
 	bio_clear_polled(bio);
 	return bytes >> SECTOR_SHIFT;
 }
-EXPORT_SYMBOL_GPL(bio_split_rw_at);
+EXPORT_SYMBOL_GPL(bio_split_io_at);
 
 struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim,
 		unsigned *nr_segs)
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 46ffac5caab78..519a1d59805f8 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -322,8 +322,8 @@ static inline void bio_next_folio(struct folio_iter *fi, struct bio *bio)
 void bio_trim(struct bio *bio, sector_t offset, sector_t size);
 extern struct bio *bio_split(struct bio *bio, int sectors,
 			     gfp_t gfp, struct bio_set *bs);
-int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim,
-		unsigned *segs, unsigned max_bytes);
+int bio_split_io_at(struct bio *bio, const struct queue_limits *lim,
+		unsigned *segs, unsigned max_bytes, unsigned len_align);
 
 /**
  * bio_next_split - get next @sectors from a bio, splitting if necessary
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index fe1797bbec420..d75c77eb8cb97 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1870,6 +1870,13 @@ bdev_atomic_write_unit_max_bytes(struct block_device *bdev)
 	return queue_atomic_write_unit_max_bytes(bdev_get_queue(bdev));
 }
 
+static inline int bio_split_rw_at(struct bio *bio,
+		const struct queue_limits *lim,
+		unsigned *segs, unsigned max_bytes)
+{
+	return bio_split_io_at(bio, lim, segs, max_bytes, lim->dma_alignment);
+}
+
 #define DEFINE_IO_COMP_BATCH(name)	struct io_comp_batch name = { }
 
 #endif /* _LINUX_BLKDEV_H */
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv4 2/8] block: add size alignment to bio_iov_iter_get_pages
  2025-08-27 14:12 [PATCHv4 0/8] Keith Busch
  2025-08-27 14:12 ` [PATCHv4 1/8] block: check for valid bio while splitting Keith Busch
@ 2025-08-27 14:12 ` Keith Busch
  2025-08-31  0:40   ` Martin K. Petersen
  2025-08-27 14:12 ` [PATCHv4 3/8] block: align the bio after building it Keith Busch
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Keith Busch @ 2025-08-27 14:12 UTC (permalink / raw)
  To: linux-block, linux-fsdevel; +Cc: linux-xfs, linux-ext4, hch, axboe, Keith Busch

From: Keith Busch <kbusch@kernel.org>

The block layer tries to align bio vectors to the block device's logical
block size. Some cases don't have a block device, or we may need to
align to something larger, which we can't derive it from the queue
limits. Have the caller specify what they want, or allow any length
alignment if nothing was specified. Since the most common use case
relies on the block device's limits, a helper function is provided.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 block/bio.c            | 19 +++++++++++--------
 block/fops.c           |  6 +++---
 fs/iomap/direct-io.c   |  2 +-
 include/linux/bio.h    |  9 ++++++++-
 include/linux/blkdev.h |  7 +++++++
 5 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 3b371a5da159e..44286db14355f 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1204,7 +1204,8 @@ static unsigned int get_contig_folio_len(unsigned int *num_pages,
  * For a multi-segment *iter, this function only adds pages from the next
  * non-empty segment of the iov iterator.
  */
-static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
+static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter,
+				    unsigned len_align_mask)
 {
 	iov_iter_extraction_t extraction_flags = 0;
 	unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
@@ -1213,7 +1214,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	struct page **pages = (struct page **)bv;
 	ssize_t size;
 	unsigned int num_pages, i = 0;
-	size_t offset, folio_offset, left, len;
+	size_t offset, folio_offset, left, len, trim;
 	int ret = 0;
 
 	/*
@@ -1242,8 +1243,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 
 	nr_pages = DIV_ROUND_UP(offset + size, PAGE_SIZE);
 
-	if (bio->bi_bdev) {
-		size_t trim = size & (bdev_logical_block_size(bio->bi_bdev) - 1);
+	trim = size & len_align_mask;
+	if (trim) {
 		iov_iter_revert(iter, trim);
 		size -= trim;
 	}
@@ -1298,9 +1299,10 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 }
 
 /**
- * bio_iov_iter_get_pages - add user or kernel pages to a bio
+ * bio_iov_iter_get_pages_aligned - add user or kernel pages to a bio
  * @bio: bio to add pages to
  * @iter: iov iterator describing the region to be added
+ * @len_align_mask: the mask to align each vector size to, 0 for any length
  *
  * This takes either an iterator pointing to user memory, or one pointing to
  * kernel pages (BVEC iterator). If we're adding user pages, we pin them and
@@ -1317,7 +1319,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
  * MM encounters an error pinning the requested pages, it stops. Error
  * is returned only if 0 pages could be pinned.
  */
-int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
+int bio_iov_iter_get_pages_aligned(struct bio *bio, struct iov_iter *iter,
+			   unsigned len_align_mask)
 {
 	int ret = 0;
 
@@ -1333,12 +1336,12 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	if (iov_iter_extract_will_pin(iter))
 		bio_set_flag(bio, BIO_PAGE_PINNED);
 	do {
-		ret = __bio_iov_iter_get_pages(bio, iter);
+		ret = __bio_iov_iter_get_pages(bio, iter, len_align_mask);
 	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
 
 	return bio->bi_vcnt ? 0 : ret;
 }
-EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
+EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages_aligned);
 
 static void submit_bio_wait_endio(struct bio *bio)
 {
diff --git a/block/fops.c b/block/fops.c
index 82451ac8ff25d..d136fb5f6b6ab 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -78,7 +78,7 @@ static ssize_t __blkdev_direct_IO_simple(struct kiocb *iocb,
 	if (iocb->ki_flags & IOCB_ATOMIC)
 		bio.bi_opf |= REQ_ATOMIC;
 
-	ret = bio_iov_iter_get_pages(&bio, iter);
+	ret = bio_iov_iter_get_bdev_pages(&bio, iter, bdev);
 	if (unlikely(ret))
 		goto out;
 	ret = bio.bi_iter.bi_size;
@@ -212,7 +212,7 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 		bio->bi_end_io = blkdev_bio_end_io;
 		bio->bi_ioprio = iocb->ki_ioprio;
 
-		ret = bio_iov_iter_get_pages(bio, iter);
+		ret = bio_iov_iter_get_bdev_pages(bio, iter, bdev);
 		if (unlikely(ret)) {
 			bio->bi_status = BLK_STS_IOERR;
 			bio_endio(bio);
@@ -348,7 +348,7 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
 		 */
 		bio_iov_bvec_set(bio, iter);
 	} else {
-		ret = bio_iov_iter_get_pages(bio, iter);
+		ret = bio_iov_iter_get_bdev_pages(bio, iter, bdev);
 		if (unlikely(ret))
 			goto out_bio_put;
 	}
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index b84f6af2eb4c8..fea23fa6a402f 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -434,7 +434,7 @@ static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
 		bio->bi_private = dio;
 		bio->bi_end_io = iomap_dio_bio_end_io;
 
-		ret = bio_iov_iter_get_pages(bio, dio->submit.iter);
+		ret = bio_iov_iter_get_bdev_pages(bio, dio->submit.iter, iomap->bdev);
 		if (unlikely(ret)) {
 			/*
 			 * We have to stop part way through an IO. We must fall
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 519a1d59805f8..788a50ff319e3 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -441,7 +441,14 @@ int submit_bio_wait(struct bio *bio);
 int bdev_rw_virt(struct block_device *bdev, sector_t sector, void *data,
 		size_t len, enum req_op op);
 
-int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
+int bio_iov_iter_get_pages_aligned(struct bio *bio, struct iov_iter *iter,
+		unsigned len_align_mask);
+
+static inline int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
+{
+	return bio_iov_iter_get_pages_aligned(bio, iter, 0);
+}
+
 void bio_iov_bvec_set(struct bio *bio, const struct iov_iter *iter);
 void __bio_release_pages(struct bio *bio, bool mark_dirty);
 extern void bio_set_pages_dirty(struct bio *bio);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d75c77eb8cb97..36500d576d7e9 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1877,6 +1877,13 @@ static inline int bio_split_rw_at(struct bio *bio,
 	return bio_split_io_at(bio, lim, segs, max_bytes, lim->dma_alignment);
 }
 
+static inline int bio_iov_iter_get_bdev_pages(struct bio *bio,
+		struct iov_iter *iter, struct block_device *bdev)
+{
+	return bio_iov_iter_get_pages_aligned(bio, iter,
+					bdev_logical_block_size(bdev) - 1);
+}
+
 #define DEFINE_IO_COMP_BATCH(name)	struct io_comp_batch name = { }
 
 #endif /* _LINUX_BLKDEV_H */
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv4 3/8] block: align the bio after building it
  2025-08-27 14:12 [PATCHv4 0/8] Keith Busch
  2025-08-27 14:12 ` [PATCHv4 1/8] block: check for valid bio while splitting Keith Busch
  2025-08-27 14:12 ` [PATCHv4 2/8] block: add size alignment to bio_iov_iter_get_pages Keith Busch
@ 2025-08-27 14:12 ` Keith Busch
  2025-08-31  0:41   ` Martin K. Petersen
  2025-09-02  5:23   ` Christoph Hellwig
  2025-08-27 14:12 ` [PATCHv4 4/8] block: simplify direct io validity check Keith Busch
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 13+ messages in thread
From: Keith Busch @ 2025-08-27 14:12 UTC (permalink / raw)
  To: linux-block, linux-fsdevel; +Cc: linux-xfs, linux-ext4, hch, axboe, Keith Busch

From: Keith Busch <kbusch@kernel.org>

Instead of ensuring each vector is block size aligned while constructing
the bio, just ensure the entire size is aligned after it's built. This
makes getting bio pages more flexible to accepting device valid io
vectors that would otherwise get rejected by alignment checks.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 block/bio.c | 65 ++++++++++++++++++++++++++++++++---------------------
 1 file changed, 40 insertions(+), 25 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 44286db14355f..8703be4748db8 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1204,8 +1204,7 @@ static unsigned int get_contig_folio_len(unsigned int *num_pages,
  * For a multi-segment *iter, this function only adds pages from the next
  * non-empty segment of the iov iterator.
  */
-static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter,
-				    unsigned len_align_mask)
+static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 {
 	iov_iter_extraction_t extraction_flags = 0;
 	unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
@@ -1214,7 +1213,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter,
 	struct page **pages = (struct page **)bv;
 	ssize_t size;
 	unsigned int num_pages, i = 0;
-	size_t offset, folio_offset, left, len, trim;
+	size_t offset, folio_offset, left, len;
 	int ret = 0;
 
 	/*
@@ -1228,13 +1227,6 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter,
 	if (bio->bi_bdev && blk_queue_pci_p2pdma(bio->bi_bdev->bd_disk->queue))
 		extraction_flags |= ITER_ALLOW_P2PDMA;
 
-	/*
-	 * Each segment in the iov is required to be a block size multiple.
-	 * However, we may not be able to get the entire segment if it spans
-	 * more pages than bi_max_vecs allows, so we have to ALIGN_DOWN the
-	 * result to ensure the bio's total size is correct. The remainder of
-	 * the iov data will be picked up in the next bio iteration.
-	 */
 	size = iov_iter_extract_pages(iter, &pages,
 				      UINT_MAX - bio->bi_iter.bi_size,
 				      nr_pages, extraction_flags, &offset);
@@ -1242,18 +1234,6 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter,
 		return size ? size : -EFAULT;
 
 	nr_pages = DIV_ROUND_UP(offset + size, PAGE_SIZE);
-
-	trim = size & len_align_mask;
-	if (trim) {
-		iov_iter_revert(iter, trim);
-		size -= trim;
-	}
-
-	if (unlikely(!size)) {
-		ret = -EFAULT;
-		goto out;
-	}
-
 	for (left = size, i = 0; left > 0; left -= len, i += num_pages) {
 		struct page *page = pages[i];
 		struct folio *folio = page_folio(page);
@@ -1298,11 +1278,44 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter,
 	return ret;
 }
 
+/*
+ * Aligns the bio size to the len_align_mask, releasing excessive bio vecs that
+ * __bio_iov_iter_get_pages may have inserted, and reverts the trimmed length
+ * for the next iteration.
+ */
+static int bio_iov_iter_align_down(struct bio *bio, struct iov_iter *iter,
+			    unsigned len_align_mask)
+{
+	size_t nbytes = bio->bi_iter.bi_size & len_align_mask;
+
+	if (!nbytes)
+		return 0;
+
+	iov_iter_revert(iter, nbytes);
+	bio->bi_iter.bi_size -= nbytes;
+	do {
+		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
+
+		if (nbytes < bv->bv_len) {
+			bv->bv_len -= nbytes;
+			break;
+		}
+
+		bio_release_page(bio, bv->bv_page);
+		bio->bi_vcnt--;
+		nbytes -= bv->bv_len;
+	} while (nbytes);
+
+	if (!bio->bi_vcnt)
+		return -EFAULT;
+	return 0;
+}
+
 /**
  * bio_iov_iter_get_pages_aligned - add user or kernel pages to a bio
  * @bio: bio to add pages to
  * @iter: iov iterator describing the region to be added
- * @len_align_mask: the mask to align each vector size to, 0 for any length
+ * @len_align_mask: the mask to align the total size to, 0 for any length
  *
  * This takes either an iterator pointing to user memory, or one pointing to
  * kernel pages (BVEC iterator). If we're adding user pages, we pin them and
@@ -1336,10 +1349,12 @@ int bio_iov_iter_get_pages_aligned(struct bio *bio, struct iov_iter *iter,
 	if (iov_iter_extract_will_pin(iter))
 		bio_set_flag(bio, BIO_PAGE_PINNED);
 	do {
-		ret = __bio_iov_iter_get_pages(bio, iter, len_align_mask);
+		ret = __bio_iov_iter_get_pages(bio, iter);
 	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
 
-	return bio->bi_vcnt ? 0 : ret;
+	if (bio->bi_vcnt)
+		return bio_iov_iter_align_down(bio, iter, len_align_mask);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages_aligned);
 
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv4 4/8] block: simplify direct io validity check
  2025-08-27 14:12 [PATCHv4 0/8] Keith Busch
                   ` (2 preceding siblings ...)
  2025-08-27 14:12 ` [PATCHv4 3/8] block: align the bio after building it Keith Busch
@ 2025-08-27 14:12 ` Keith Busch
  2025-08-27 14:12 ` [PATCHv4 5/8] iomap: " Keith Busch
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Keith Busch @ 2025-08-27 14:12 UTC (permalink / raw)
  To: linux-block, linux-fsdevel
  Cc: linux-xfs, linux-ext4, hch, axboe, Keith Busch, Hannes Reinecke,
	Martin K. Petersen

From: Keith Busch <kbusch@kernel.org>

The block layer checks all the segments for validity later, so no need
for an early check. Just reduce it to a simple position and total length
check, and defer the more invasive segment checks to the block layer.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 block/fops.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index d136fb5f6b6ab..19814bddd77e2 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -38,8 +38,8 @@ static blk_opf_t dio_bio_write_op(struct kiocb *iocb)
 static bool blkdev_dio_invalid(struct block_device *bdev, struct kiocb *iocb,
 				struct iov_iter *iter)
 {
-	return iocb->ki_pos & (bdev_logical_block_size(bdev) - 1) ||
-		!bdev_iter_is_aligned(bdev, iter);
+	return (iocb->ki_pos | iov_iter_count(iter)) &
+			(bdev_logical_block_size(bdev) - 1);
 }
 
 #define DIO_INLINE_BIO_VECS 4
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv4 5/8] iomap: simplify direct io validity check
  2025-08-27 14:12 [PATCHv4 0/8] Keith Busch
                   ` (3 preceding siblings ...)
  2025-08-27 14:12 ` [PATCHv4 4/8] block: simplify direct io validity check Keith Busch
@ 2025-08-27 14:12 ` Keith Busch
  2025-08-27 14:12 ` [PATCHv4 6/8] block: remove bdev_iter_is_aligned Keith Busch
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Keith Busch @ 2025-08-27 14:12 UTC (permalink / raw)
  To: linux-block, linux-fsdevel
  Cc: linux-xfs, linux-ext4, hch, axboe, Keith Busch, Hannes Reinecke,
	Martin K. Petersen

From: Keith Busch <kbusch@kernel.org>

The block layer checks all the segments for validity later, so no need
for an early check. Just reduce it to a simple position and total length
check, and defer the more invasive segment checks to the block layer.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/direct-io.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index fea23fa6a402f..c06e41fd4d0af 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -337,8 +337,7 @@ static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
 	u64 copied = 0;
 	size_t orig_count;
 
-	if ((pos | length) & (bdev_logical_block_size(iomap->bdev) - 1) ||
-	    !bdev_iter_is_aligned(iomap->bdev, dio->submit.iter))
+	if ((pos | length) & (bdev_logical_block_size(iomap->bdev) - 1))
 		return -EINVAL;
 
 	if (dio->flags & IOMAP_DIO_WRITE) {
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv4 6/8] block: remove bdev_iter_is_aligned
  2025-08-27 14:12 [PATCHv4 0/8] Keith Busch
                   ` (4 preceding siblings ...)
  2025-08-27 14:12 ` [PATCHv4 5/8] iomap: " Keith Busch
@ 2025-08-27 14:12 ` Keith Busch
  2025-08-27 14:12 ` [PATCHv4 7/8] blk-integrity: use simpler alignment check Keith Busch
  2025-08-27 14:12 ` [PATCHv4 8/8] iov_iter: remove iov_iter_is_aligned Keith Busch
  7 siblings, 0 replies; 13+ messages in thread
From: Keith Busch @ 2025-08-27 14:12 UTC (permalink / raw)
  To: linux-block, linux-fsdevel
  Cc: linux-xfs, linux-ext4, hch, axboe, Keith Busch, Hannes Reinecke,
	Martin K. Petersen

From: Keith Busch <kbusch@kernel.org>

No more callers.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/blkdev.h | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 36500d576d7e9..221f6d7c0beb1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1590,13 +1590,6 @@ static inline unsigned int bdev_dma_alignment(struct block_device *bdev)
 	return queue_dma_alignment(bdev_get_queue(bdev));
 }
 
-static inline bool bdev_iter_is_aligned(struct block_device *bdev,
-					struct iov_iter *iter)
-{
-	return iov_iter_is_aligned(iter, bdev_dma_alignment(bdev),
-				   bdev_logical_block_size(bdev) - 1);
-}
-
 static inline unsigned int
 blk_lim_dma_alignment_and_pad(struct queue_limits *lim)
 {
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv4 7/8] blk-integrity: use simpler alignment check
  2025-08-27 14:12 [PATCHv4 0/8] Keith Busch
                   ` (5 preceding siblings ...)
  2025-08-27 14:12 ` [PATCHv4 6/8] block: remove bdev_iter_is_aligned Keith Busch
@ 2025-08-27 14:12 ` Keith Busch
  2025-08-27 14:12 ` [PATCHv4 8/8] iov_iter: remove iov_iter_is_aligned Keith Busch
  7 siblings, 0 replies; 13+ messages in thread
From: Keith Busch @ 2025-08-27 14:12 UTC (permalink / raw)
  To: linux-block, linux-fsdevel
  Cc: linux-xfs, linux-ext4, hch, axboe, Keith Busch, Hannes Reinecke,
	Martin K. Petersen

From: Keith Busch <kbusch@kernel.org>

We're checking length and addresses against the same alignment value, so
use the more simple iterator check.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 block/bio-integrity.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 6b077ca937f6b..6d069a49b4aad 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -262,7 +262,6 @@ static unsigned int bvec_from_pages(struct bio_vec *bvec, struct page **pages,
 int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
 {
 	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
-	unsigned int align = blk_lim_dma_alignment_and_pad(&q->limits);
 	struct page *stack_pages[UIO_FASTIOV], **pages = stack_pages;
 	struct bio_vec stack_vec[UIO_FASTIOV], *bvec = stack_vec;
 	size_t offset, bytes = iter->count;
@@ -285,7 +284,8 @@ int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
 		pages = NULL;
 	}
 
-	copy = !iov_iter_is_aligned(iter, align, align);
+	copy = iov_iter_alignment(iter) &
+			blk_lim_dma_alignment_and_pad(&q->limits);
 	ret = iov_iter_extract_pages(iter, &pages, bytes, nr_vecs, 0, &offset);
 	if (unlikely(ret < 0))
 		goto free_bvec;
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv4 8/8] iov_iter: remove iov_iter_is_aligned
  2025-08-27 14:12 [PATCHv4 0/8] Keith Busch
                   ` (6 preceding siblings ...)
  2025-08-27 14:12 ` [PATCHv4 7/8] blk-integrity: use simpler alignment check Keith Busch
@ 2025-08-27 14:12 ` Keith Busch
  7 siblings, 0 replies; 13+ messages in thread
From: Keith Busch @ 2025-08-27 14:12 UTC (permalink / raw)
  To: linux-block, linux-fsdevel
  Cc: linux-xfs, linux-ext4, hch, axboe, Keith Busch, Hannes Reinecke,
	Martin K. Petersen

From: Keith Busch <kbusch@kernel.org>

No more callers.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/uio.h |  2 -
 lib/iov_iter.c      | 95 ---------------------------------------------
 2 files changed, 97 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 2e86c653186c6..5b127043a1519 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -286,8 +286,6 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
 #endif
 
 size_t iov_iter_zero(size_t bytes, struct iov_iter *);
-bool iov_iter_is_aligned(const struct iov_iter *i, unsigned addr_mask,
-			unsigned len_mask);
 unsigned long iov_iter_alignment(const struct iov_iter *i);
 unsigned long iov_iter_gap_alignment(const struct iov_iter *i);
 void iov_iter_init(struct iov_iter *i, unsigned int direction, const struct iovec *iov,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index f9193f952f499..2fe66a6b8789e 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -784,101 +784,6 @@ void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count)
 }
 EXPORT_SYMBOL(iov_iter_discard);
 
-static bool iov_iter_aligned_iovec(const struct iov_iter *i, unsigned addr_mask,
-				   unsigned len_mask)
-{
-	const struct iovec *iov = iter_iov(i);
-	size_t size = i->count;
-	size_t skip = i->iov_offset;
-
-	do {
-		size_t len = iov->iov_len - skip;
-
-		if (len > size)
-			len = size;
-		if (len & len_mask)
-			return false;
-		if ((unsigned long)(iov->iov_base + skip) & addr_mask)
-			return false;
-
-		iov++;
-		size -= len;
-		skip = 0;
-	} while (size);
-
-	return true;
-}
-
-static bool iov_iter_aligned_bvec(const struct iov_iter *i, unsigned addr_mask,
-				  unsigned len_mask)
-{
-	const struct bio_vec *bvec = i->bvec;
-	unsigned skip = i->iov_offset;
-	size_t size = i->count;
-
-	do {
-		size_t len = bvec->bv_len - skip;
-
-		if (len > size)
-			len = size;
-		if (len & len_mask)
-			return false;
-		if ((unsigned long)(bvec->bv_offset + skip) & addr_mask)
-			return false;
-
-		bvec++;
-		size -= len;
-		skip = 0;
-	} while (size);
-
-	return true;
-}
-
-/**
- * iov_iter_is_aligned() - Check if the addresses and lengths of each segments
- * 	are aligned to the parameters.
- *
- * @i: &struct iov_iter to restore
- * @addr_mask: bit mask to check against the iov element's addresses
- * @len_mask: bit mask to check against the iov element's lengths
- *
- * Return: false if any addresses or lengths intersect with the provided masks
- */
-bool iov_iter_is_aligned(const struct iov_iter *i, unsigned addr_mask,
-			 unsigned len_mask)
-{
-	if (likely(iter_is_ubuf(i))) {
-		if (i->count & len_mask)
-			return false;
-		if ((unsigned long)(i->ubuf + i->iov_offset) & addr_mask)
-			return false;
-		return true;
-	}
-
-	if (likely(iter_is_iovec(i) || iov_iter_is_kvec(i)))
-		return iov_iter_aligned_iovec(i, addr_mask, len_mask);
-
-	if (iov_iter_is_bvec(i))
-		return iov_iter_aligned_bvec(i, addr_mask, len_mask);
-
-	/* With both xarray and folioq types, we're dealing with whole folios. */
-	if (iov_iter_is_xarray(i)) {
-		if (i->count & len_mask)
-			return false;
-		if ((i->xarray_start + i->iov_offset) & addr_mask)
-			return false;
-	}
-	if (iov_iter_is_folioq(i)) {
-		if (i->count & len_mask)
-			return false;
-		if (i->iov_offset & addr_mask)
-			return false;
-	}
-
-	return true;
-}
-EXPORT_SYMBOL_GPL(iov_iter_is_aligned);
-
 static unsigned long iov_iter_alignment_iovec(const struct iov_iter *i)
 {
 	const struct iovec *iov = iter_iov(i);
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCHv4 1/8] block: check for valid bio while splitting
  2025-08-27 14:12 ` [PATCHv4 1/8] block: check for valid bio while splitting Keith Busch
@ 2025-08-31  0:40   ` Martin K. Petersen
  0 siblings, 0 replies; 13+ messages in thread
From: Martin K. Petersen @ 2025-08-31  0:40 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-fsdevel, linux-xfs, linux-ext4, hch, axboe,
	Keith Busch


Keith,

> We're already iterating every segment, so check these for a valid IO
> lengths at the same time. Individual segment lengths will not be
> checked on passthrough commands. The read/write command segments must
> be sized to the dma alignment.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv4 2/8] block: add size alignment to bio_iov_iter_get_pages
  2025-08-27 14:12 ` [PATCHv4 2/8] block: add size alignment to bio_iov_iter_get_pages Keith Busch
@ 2025-08-31  0:40   ` Martin K. Petersen
  0 siblings, 0 replies; 13+ messages in thread
From: Martin K. Petersen @ 2025-08-31  0:40 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-fsdevel, linux-xfs, linux-ext4, hch, axboe,
	Keith Busch


Keith,

> The block layer tries to align bio vectors to the block device's
> logical block size. Some cases don't have a block device, or we may
> need to align to something larger, which we can't derive it from the
> queue limits. Have the caller specify what they want, or allow any
> length alignment if nothing was specified. Since the most common use
> case relies on the block device's limits, a helper function is
> provided.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv4 3/8] block: align the bio after building it
  2025-08-27 14:12 ` [PATCHv4 3/8] block: align the bio after building it Keith Busch
@ 2025-08-31  0:41   ` Martin K. Petersen
  2025-09-02  5:23   ` Christoph Hellwig
  1 sibling, 0 replies; 13+ messages in thread
From: Martin K. Petersen @ 2025-08-31  0:41 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-fsdevel, linux-xfs, linux-ext4, hch, axboe,
	Keith Busch


Keith,

> Instead of ensuring each vector is block size aligned while
> constructing the bio, just ensure the entire size is aligned after
> it's built. This makes getting bio pages more flexible to accepting
> device valid io vectors that would otherwise get rejected by alignment
> checks.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv4 3/8] block: align the bio after building it
  2025-08-27 14:12 ` [PATCHv4 3/8] block: align the bio after building it Keith Busch
  2025-08-31  0:41   ` Martin K. Petersen
@ 2025-09-02  5:23   ` Christoph Hellwig
  1 sibling, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2025-09-02  5:23 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-fsdevel, linux-xfs, linux-ext4, hch, axboe,
	Keith Busch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-09-02  5:23 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-27 14:12 [PATCHv4 0/8] Keith Busch
2025-08-27 14:12 ` [PATCHv4 1/8] block: check for valid bio while splitting Keith Busch
2025-08-31  0:40   ` Martin K. Petersen
2025-08-27 14:12 ` [PATCHv4 2/8] block: add size alignment to bio_iov_iter_get_pages Keith Busch
2025-08-31  0:40   ` Martin K. Petersen
2025-08-27 14:12 ` [PATCHv4 3/8] block: align the bio after building it Keith Busch
2025-08-31  0:41   ` Martin K. Petersen
2025-09-02  5:23   ` Christoph Hellwig
2025-08-27 14:12 ` [PATCHv4 4/8] block: simplify direct io validity check Keith Busch
2025-08-27 14:12 ` [PATCHv4 5/8] iomap: " Keith Busch
2025-08-27 14:12 ` [PATCHv4 6/8] block: remove bdev_iter_is_aligned Keith Busch
2025-08-27 14:12 ` [PATCHv4 7/8] blk-integrity: use simpler alignment check Keith Busch
2025-08-27 14:12 ` [PATCHv4 8/8] iov_iter: remove iov_iter_is_aligned Keith Busch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).