linux-block.vger.kernel.org archive mirror
* [PATCHv2 0/7] blk dma iter for metadata
@ 2025-07-20 18:40 Keith Busch
  2025-07-20 18:40 ` [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter Keith Busch
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Keith Busch @ 2025-07-20 18:40 UTC (permalink / raw)
  To: linux-block, linux-nvme, hch; +Cc: axboe, leonro, Keith Busch

From: Keith Busch <kbusch@kernel.org>

Here's a new take on the dma iteration for integrity requests.

The current code still has the integrity payloads subscribe to the
"virt_boundary" queue limit when considering merging and coalescing in
iova space. This is an unnecessary limit for nvme-pci metadata SGLs, and
it currently makes testing merges a bit difficult.

Changes since v1:

  Provided a bunch of prep patches to make the current dma iteration
  more generic to reduce code duplication with integrity metadata.

  An nvme optimization for single or coalesced segments.

Keith Busch (7):
  blk-mq-dma: move the bio and bvec_iter to blk_dma_iter
  blk-mq-dma: set the bvec being iterated
  blk-mq-dma: require unmap caller provide p2p map type
  blk-mq: remove REQ_P2PDMA flag
  blk-mq-dma: move common dma start code to a helper
  blk-mq-dma: add support for mapping integrity metadata
  nvme: convert metadata mapping to dma iter

 block/bio.c                   |   2 +-
 block/blk-mq-dma.c            | 173 ++++++++++++++++++++++------------
 drivers/nvme/host/pci.c       | 135 +++++++++++++-------------
 include/linux/blk-integrity.h |  17 ++++
 include/linux/blk-mq-dma.h    |  10 +-
 include/linux/blk_types.h     |   1 -
 6 files changed, 207 insertions(+), 131 deletions(-)

-- 
2.47.1



* [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter
  2025-07-20 18:40 [PATCHv2 0/7] blk dma iter for metadata Keith Busch
@ 2025-07-20 18:40 ` Keith Busch
  2025-07-21  7:37   ` Christoph Hellwig
  2025-07-21  7:42   ` Christoph Hellwig
  2025-07-20 18:40 ` [PATCHv2 2/7] blk-mq-dma: set the bvec being iterated Keith Busch
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Keith Busch @ 2025-07-20 18:40 UTC (permalink / raw)
  To: linux-block, linux-nvme, hch; +Cc: axboe, leonro, Keith Busch

From: Keith Busch <kbusch@kernel.org>

The req_iterator just happens to have similar fields to what the dma
iterator needs, but we're not necessarily iterating a bio_vec here. Have
the dma iterator define its private fields directly. It also helps to
remove eyesores like "iter->iter.iter".

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 block/blk-mq-dma.c         | 14 +++++++-------
 include/linux/blk-mq-dma.h |  3 ++-
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index ad283017caef2..21da3d8941b23 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -10,7 +10,7 @@ struct phys_vec {
 	u32		len;
 };
 
-static bool blk_map_iter_next(struct request *req, struct req_iterator *iter,
+static bool blk_map_iter_next(struct request *req, struct blk_dma_iter *iter,
 			      struct phys_vec *vec)
 {
 	unsigned int max_size;
@@ -114,7 +114,7 @@ static bool blk_rq_dma_map_iova(struct request *req, struct device *dma_dev,
 		if (error)
 			break;
 		mapped += vec->len;
-	} while (blk_map_iter_next(req, &iter->iter, vec));
+	} while (blk_map_iter_next(req, iter, vec));
 
 	error = dma_iova_sync(dma_dev, state, 0, mapped);
 	if (error) {
@@ -153,8 +153,8 @@ bool blk_rq_dma_map_iter_start(struct request *req, struct device *dma_dev,
 	unsigned int total_len = blk_rq_payload_bytes(req);
 	struct phys_vec vec;
 
-	iter->iter.bio = req->bio;
-	iter->iter.iter = req->bio->bi_iter;
+	iter->bio = req->bio;
+	iter->iter = req->bio->bi_iter;
 	memset(&iter->p2pdma, 0, sizeof(iter->p2pdma));
 	iter->status = BLK_STS_OK;
 
@@ -162,7 +162,7 @@ bool blk_rq_dma_map_iter_start(struct request *req, struct device *dma_dev,
 	 * Grab the first segment ASAP because we'll need it to check for P2P
 	 * transfers.
 	 */
-	if (!blk_map_iter_next(req, &iter->iter, &vec))
+	if (!blk_map_iter_next(req, iter, &vec))
 		return false;
 
 	if (IS_ENABLED(CONFIG_PCI_P2PDMA) && (req->cmd_flags & REQ_P2PDMA)) {
@@ -213,7 +213,7 @@ bool blk_rq_dma_map_iter_next(struct request *req, struct device *dma_dev,
 {
 	struct phys_vec vec;
 
-	if (!blk_map_iter_next(req, &iter->iter, &vec))
+	if (!blk_map_iter_next(req, iter, &vec))
 		return false;
 
 	if (iter->p2pdma.map == PCI_P2PDMA_MAP_BUS_ADDR)
@@ -246,7 +246,7 @@ blk_next_sg(struct scatterlist **sg, struct scatterlist *sglist)
 int __blk_rq_map_sg(struct request *rq, struct scatterlist *sglist,
 		    struct scatterlist **last_sg)
 {
-	struct req_iterator iter = {
+	struct blk_dma_iter iter = {
 		.bio	= rq->bio,
 	};
 	struct phys_vec vec;
diff --git a/include/linux/blk-mq-dma.h b/include/linux/blk-mq-dma.h
index c26a01aeae006..e1c01ba1e2e58 100644
--- a/include/linux/blk-mq-dma.h
+++ b/include/linux/blk-mq-dma.h
@@ -14,7 +14,8 @@ struct blk_dma_iter {
 	blk_status_t			status;
 
 	/* Internal to blk_rq_dma_map_iter_* */
-	struct req_iterator		iter;
+	struct bvec_iter		iter;
+	struct bio			*bio;
 	struct pci_p2pdma_map_state	p2pdma;
 };
 
-- 
2.47.1



* [PATCHv2 2/7] blk-mq-dma: set the bvec being iterated
  2025-07-20 18:40 [PATCHv2 0/7] blk dma iter for metadata Keith Busch
  2025-07-20 18:40 ` [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter Keith Busch
@ 2025-07-20 18:40 ` Keith Busch
  2025-07-21  7:38   ` Christoph Hellwig
  2025-07-20 18:40 ` [PATCHv2 3/7] blk-mq-dma: require unmap caller provide p2p map type Keith Busch
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Keith Busch @ 2025-07-20 18:40 UTC (permalink / raw)
  To: linux-block, linux-nvme, hch; +Cc: axboe, leonro, Keith Busch

From: Keith Busch <kbusch@kernel.org>

This will make it easier to add different bvec sources, like the
upcoming integrity support. It also makes iterating "special" payloads
more consistent with iterating normal data bi_io_vecs.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 block/blk-mq-dma.c         | 30 ++++++++++++++++--------------
 include/linux/blk-mq-dma.h |  1 +
 2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index 21da3d8941b23..7f3ad162b8ff2 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -16,23 +16,17 @@ static bool blk_map_iter_next(struct request *req, struct blk_dma_iter *iter,
 	unsigned int max_size;
 	struct bio_vec bv;
 
-	if (req->rq_flags & RQF_SPECIAL_PAYLOAD) {
-		if (!iter->bio)
-			return false;
-		vec->paddr = bvec_phys(&req->special_vec);
-		vec->len = req->special_vec.bv_len;
-		iter->bio = NULL;
-		return true;
-	}
-
 	if (!iter->iter.bi_size)
 		return false;
 
-	bv = mp_bvec_iter_bvec(iter->bio->bi_io_vec, iter->iter);
+	bv = mp_bvec_iter_bvec(iter->bvec, iter->iter);
 	vec->paddr = bvec_phys(&bv);
 	max_size = get_max_segment_size(&req->q->limits, vec->paddr, UINT_MAX);
 	bv.bv_len = min(bv.bv_len, max_size);
-	bio_advance_iter_single(iter->bio, &iter->iter, bv.bv_len);
+	bvec_iter_advance_single(iter->bvec, &iter->iter, bv.bv_len);
+
+	if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
+		return true;
 
 	/*
 	 * If we are entirely done with this bi_io_vec entry, check if the next
@@ -47,15 +41,16 @@ static bool blk_map_iter_next(struct request *req, struct blk_dma_iter *iter,
 				break;
 			iter->bio = iter->bio->bi_next;
 			iter->iter = iter->bio->bi_iter;
+			iter->bvec = iter->bio->bi_io_vec;
 		}
 
-		next = mp_bvec_iter_bvec(iter->bio->bi_io_vec, iter->iter);
+		next = mp_bvec_iter_bvec(iter->bvec, iter->iter);
 		if (bv.bv_len + next.bv_len > max_size ||
 		    !biovec_phys_mergeable(req->q, &bv, &next))
 			break;
 
 		bv.bv_len += next.bv_len;
-		bio_advance_iter_single(iter->bio, &iter->iter, next.bv_len);
+		bvec_iter_advance_single(iter->bvec, &iter->iter, next.bv_len);
 	}
 
 	vec->len = bv.bv_len;
@@ -158,6 +153,11 @@ bool blk_rq_dma_map_iter_start(struct request *req, struct device *dma_dev,
 	memset(&iter->p2pdma, 0, sizeof(iter->p2pdma));
 	iter->status = BLK_STS_OK;
 
+	if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
+		iter->bvec = &req->special_vec;
+	else
+		iter->bvec = req->bio->bi_io_vec;
+
 	/*
 	 * Grab the first segment ASAP because we'll need it to check for P2P
 	 * transfers.
@@ -253,8 +253,10 @@ int __blk_rq_map_sg(struct request *rq, struct scatterlist *sglist,
 	int nsegs = 0;
 
 	/* the internal flush request may not have bio attached */
-	if (iter.bio)
+	if (iter.bio) {
 		iter.iter = iter.bio->bi_iter;
+		iter.bvec = iter.bio->bi_io_vec;
+	}
 
 	while (blk_map_iter_next(rq, &iter, &vec)) {
 		*last_sg = blk_next_sg(last_sg, sglist);
diff --git a/include/linux/blk-mq-dma.h b/include/linux/blk-mq-dma.h
index e1c01ba1e2e58..c2cf74be6a3d6 100644
--- a/include/linux/blk-mq-dma.h
+++ b/include/linux/blk-mq-dma.h
@@ -14,6 +14,7 @@ struct blk_dma_iter {
 	blk_status_t			status;
 
 	/* Internal to blk_rq_dma_map_iter_* */
+	struct bio_vec			*bvec;
 	struct bvec_iter		iter;
 	struct bio			*bio;
 	struct pci_p2pdma_map_state	p2pdma;
-- 
2.47.1



* [PATCHv2 3/7] blk-mq-dma: require unmap caller provide p2p map type
  2025-07-20 18:40 [PATCHv2 0/7] blk dma iter for metadata Keith Busch
  2025-07-20 18:40 ` [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter Keith Busch
  2025-07-20 18:40 ` [PATCHv2 2/7] blk-mq-dma: set the bvec being iterated Keith Busch
@ 2025-07-20 18:40 ` Keith Busch
  2025-07-21  7:39   ` Christoph Hellwig
  2025-07-20 18:40 ` [PATCHv2 4/7] blk-mq: remove REQ_P2PDMA flag Keith Busch
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Keith Busch @ 2025-07-20 18:40 UTC (permalink / raw)
  To: linux-block, linux-nvme, hch; +Cc: axboe, leonro, Keith Busch

From: Keith Busch <kbusch@kernel.org>

In preparation for integrity dma mappings, we can't rely on the request
flag because data and metadata may have different mapping types.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/nvme/host/pci.c    | 9 ++++++++-
 include/linux/blk-mq-dma.h | 5 +++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 071efec25346f..6cefa8344f670 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -261,6 +261,9 @@ enum nvme_iod_flags {
 
 	/* single segment dma mapping */
 	IOD_SINGLE_SEGMENT	= 1U << 2,
+
+	/* DMA mapped with PCI_P2PDMA_MAP_BUS_ADDR */
+	IOD_P2P_BUS_ADDR	= 1U << 3,
 };
 
 struct nvme_dma_vec {
@@ -725,7 +728,8 @@ static void nvme_unmap_data(struct request *req)
 		return;
 	}
 
-	if (!blk_rq_dma_unmap(req, dma_dev, &iod->dma_state, iod->total_len)) {
+	if (!blk_rq_dma_unmap(req, dma_dev, &iod->dma_state, iod->total_len,
+				iod->flags & IOD_P2P_BUS_ADDR)) {
 		if (nvme_pci_cmd_use_sgl(&iod->cmd))
 			nvme_free_sgls(req);
 		else
@@ -1000,6 +1004,9 @@ static blk_status_t nvme_map_data(struct request *req)
 	if (!blk_rq_dma_map_iter_start(req, dev->dev, &iod->dma_state, &iter))
 		return iter.status;
 
+	if (iter.p2pdma.map == PCI_P2PDMA_MAP_BUS_ADDR)
+		iod->flags |= IOD_P2P_BUS_ADDR;
+
 	if (use_sgl == SGL_FORCED ||
 	    (use_sgl == SGL_SUPPORTED &&
 	     (sgl_threshold && nvme_pci_avg_seg_size(req) >= sgl_threshold)))
diff --git a/include/linux/blk-mq-dma.h b/include/linux/blk-mq-dma.h
index c2cf74be6a3d6..cdbaacc3db065 100644
--- a/include/linux/blk-mq-dma.h
+++ b/include/linux/blk-mq-dma.h
@@ -43,14 +43,15 @@ static inline bool blk_rq_dma_map_coalesce(struct dma_iova_state *state)
  * @dma_dev:	device to unmap from
  * @state:	DMA IOVA state
  * @mapped_len: number of bytes to unmap
+ * @p2p:	true if mapped with PCI_P2PDMA_MAP_BUS_ADDR
  *
  * Returns %false if the callers need to manually unmap every DMA segment
  * mapped using @iter or %true if no work is left to be done.
  */
 static inline bool blk_rq_dma_unmap(struct request *req, struct device *dma_dev,
-		struct dma_iova_state *state, size_t mapped_len)
+		struct dma_iova_state *state, size_t mapped_len, bool p2p)
 {
-	if (req->cmd_flags & REQ_P2PDMA)
+	if (p2p)
 		return true;
 
 	if (dma_use_iova(state)) {
-- 
2.47.1



* [PATCHv2 4/7] blk-mq: remove REQ_P2PDMA flag
  2025-07-20 18:40 [PATCHv2 0/7] blk dma iter for metadata Keith Busch
                   ` (2 preceding siblings ...)
  2025-07-20 18:40 ` [PATCHv2 3/7] blk-mq-dma: require unmap caller provide p2p map type Keith Busch
@ 2025-07-20 18:40 ` Keith Busch
  2025-07-21  7:39   ` Christoph Hellwig
  2025-07-20 18:40 ` [PATCHv2 5/7] blk-mq-dma: move common dma start code to a helper Keith Busch
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Keith Busch @ 2025-07-20 18:40 UTC (permalink / raw)
  To: linux-block, linux-nvme, hch; +Cc: axboe, leonro, Keith Busch

From: Keith Busch <kbusch@kernel.org>

It's not serving any particular purpose anymore. pci_p2pdma_state()
already has all the appropriate checks, so the flag isn't guarding
anything.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 block/bio.c               |  2 +-
 block/blk-mq-dma.c        | 30 ++++++++++++++----------------
 include/linux/blk_types.h |  1 -
 3 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 92c512e876c8d..f56d285e6958e 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -981,7 +981,7 @@ void __bio_add_page(struct bio *bio, struct page *page,
 	WARN_ON_ONCE(bio_full(bio, len));
 
 	if (is_pci_p2pdma_page(page))
-		bio->bi_opf |= REQ_P2PDMA | REQ_NOMERGE;
+		bio->bi_opf |= REQ_NOMERGE;
 
 	bvec_set_page(&bio->bi_io_vec[bio->bi_vcnt], page, len, off);
 	bio->bi_iter.bi_size += len;
diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index 7f3ad162b8ff2..56158ed504747 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -165,22 +165,20 @@ bool blk_rq_dma_map_iter_start(struct request *req, struct device *dma_dev,
 	if (!blk_map_iter_next(req, iter, &vec))
 		return false;
 
-	if (IS_ENABLED(CONFIG_PCI_P2PDMA) && (req->cmd_flags & REQ_P2PDMA)) {
-		switch (pci_p2pdma_state(&iter->p2pdma, dma_dev,
-					 phys_to_page(vec.paddr))) {
-		case PCI_P2PDMA_MAP_BUS_ADDR:
-			return blk_dma_map_bus(iter, &vec);
-		case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
-			/*
-			 * P2P transfers through the host bridge are treated the
-			 * same as non-P2P transfers below and during unmap.
-			 */
-			req->cmd_flags &= ~REQ_P2PDMA;
-			break;
-		default:
-			iter->status = BLK_STS_INVAL;
-			return false;
-		}
+	switch (pci_p2pdma_state(&iter->p2pdma, dma_dev,
+				 phys_to_page(vec.paddr))) {
+	case PCI_P2PDMA_MAP_BUS_ADDR:
+		return blk_dma_map_bus(iter, &vec);
+	case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
+		/*
+		 * P2P transfers through the host bridge are treated the
+		 * same as non-P2P transfers below and during unmap.
+		 */
+	case PCI_P2PDMA_MAP_NONE:
+		break;
+	default:
+		iter->status = BLK_STS_INVAL;
+		return false;
 	}
 
 	if (blk_can_dma_map_iova(req, dma_dev) &&
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 09b99d52fd365..0a29b20939d17 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -419,7 +419,6 @@ enum req_flag_bits {
 #define REQ_DRV		(__force blk_opf_t)(1ULL << __REQ_DRV)
 #define REQ_FS_PRIVATE	(__force blk_opf_t)(1ULL << __REQ_FS_PRIVATE)
 #define REQ_ATOMIC	(__force blk_opf_t)(1ULL << __REQ_ATOMIC)
-#define REQ_P2PDMA	(__force blk_opf_t)(1ULL << __REQ_P2PDMA)
 
 #define REQ_NOUNMAP	(__force blk_opf_t)(1ULL << __REQ_NOUNMAP)
 
-- 
2.47.1



* [PATCHv2 5/7] blk-mq-dma: move common dma start code to a helper
  2025-07-20 18:40 [PATCHv2 0/7] blk dma iter for metadata Keith Busch
                   ` (3 preceding siblings ...)
  2025-07-20 18:40 ` [PATCHv2 4/7] blk-mq: remove REQ_P2PDMA flag Keith Busch
@ 2025-07-20 18:40 ` Keith Busch
  2025-07-21  7:40   ` Christoph Hellwig
  2025-07-20 18:40 ` [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata Keith Busch
  2025-07-20 18:40 ` [PATCHv2 7/7] nvme: convert metadata mapping to dma iter Keith Busch
  6 siblings, 1 reply; 22+ messages in thread
From: Keith Busch @ 2025-07-20 18:40 UTC (permalink / raw)
  To: linux-block, linux-nvme, hch; +Cc: axboe, leonro, Keith Busch

From: Keith Busch <kbusch@kernel.org>

In preparation for dma mapping integrity metadata, move the common dma
setup to a helper.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 block/blk-mq-dma.c | 69 +++++++++++++++++++++++++---------------------
 1 file changed, 38 insertions(+), 31 deletions(-)

diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index 56158ed504747..1e2e93c1cf5e2 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -120,44 +120,16 @@ static bool blk_rq_dma_map_iova(struct request *req, struct device *dma_dev,
 	return true;
 }
 
-/**
- * blk_rq_dma_map_iter_start - map the first DMA segment for a request
- * @req:	request to map
- * @dma_dev:	device to map to
- * @state:	DMA IOVA state
- * @iter:	block layer DMA iterator
- *
- * Start DMA mapping @req to @dma_dev.  @state and @iter are provided by the
- * caller and don't need to be initialized.  @state needs to be stored for use
- * at unmap time, @iter is only needed at map time.
- *
- * Returns %false if there is no segment to map, including due to an error, or
- * %true if it did map a segment.
- *
- * If a segment was mapped, the DMA address for it is returned in @iter.addr and
- * the length in @iter.len.  If no segment was mapped the status code is
- * returned in @iter.status.
- *
- * The caller can call blk_rq_dma_map_coalesce() to check if further segments
- * need to be mapped after this, or go straight to blk_rq_dma_map_iter_next()
- * to try to map the following segments.
- */
-bool blk_rq_dma_map_iter_start(struct request *req, struct device *dma_dev,
-		struct dma_iova_state *state, struct blk_dma_iter *iter)
+static bool blk_dma_map_iter_start(struct request *req, struct device *dma_dev,
+		struct dma_iova_state *state, struct blk_dma_iter *iter,
+		unsigned int total_len)
 {
-	unsigned int total_len = blk_rq_payload_bytes(req);
 	struct phys_vec vec;
 
 	iter->bio = req->bio;
-	iter->iter = req->bio->bi_iter;
 	memset(&iter->p2pdma, 0, sizeof(iter->p2pdma));
 	iter->status = BLK_STS_OK;
 
-	if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
-		iter->bvec = &req->special_vec;
-	else
-		iter->bvec = req->bio->bi_io_vec;
-
 	/*
 	 * Grab the first segment ASAP because we'll need it to check for P2P
 	 * transfers.
@@ -186,6 +158,41 @@ bool blk_rq_dma_map_iter_start(struct request *req, struct device *dma_dev,
 		return blk_rq_dma_map_iova(req, dma_dev, state, iter, &vec);
 	return blk_dma_map_direct(req, dma_dev, iter, &vec);
 }
+
+/**
+ * blk_rq_dma_map_iter_start - map the first DMA segment for a request
+ * @req:	request to map
+ * @dma_dev:	device to map to
+ * @state:	DMA IOVA state
+ * @iter:	block layer DMA iterator
+ *
+ * Start DMA mapping @req to @dma_dev.  @state and @iter are provided by the
+ * caller and don't need to be initialized.  @state needs to be stored for use
+ * at unmap time, @iter is only needed at map time.
+ *
+ * Returns %false if there is no segment to map, including due to an error, or
+ * %true if it did map a segment.
+ *
+ * If a segment was mapped, the DMA address for it is returned in @iter.addr and
+ * the length in @iter.len.  If no segment was mapped the status code is
+ * returned in @iter.status.
+ *
+ * The caller can call blk_rq_dma_map_coalesce() to check if further segments
+ * need to be mapped after this, or go straight to blk_rq_dma_map_iter_next()
+ * to try to map the following segments.
+ */
+bool blk_rq_dma_map_iter_start(struct request *req, struct device *dma_dev,
+		struct dma_iova_state *state, struct blk_dma_iter *iter)
+{
+	iter->iter = req->bio->bi_iter;
+	if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
+		iter->bvec = &req->special_vec;
+	else
+		iter->bvec = req->bio->bi_io_vec;
+
+	return blk_dma_map_iter_start(req, dma_dev, state, iter,
+				      blk_rq_payload_bytes(req));
+}
 EXPORT_SYMBOL_GPL(blk_rq_dma_map_iter_start);
 
 /**
-- 
2.47.1



* [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata
  2025-07-20 18:40 [PATCHv2 0/7] blk dma iter for metadata Keith Busch
                   ` (4 preceding siblings ...)
  2025-07-20 18:40 ` [PATCHv2 5/7] blk-mq-dma: move common dma start code to a helper Keith Busch
@ 2025-07-20 18:40 ` Keith Busch
  2025-07-20 22:51   ` kernel test robot
                     ` (2 more replies)
  2025-07-20 18:40 ` [PATCHv2 7/7] nvme: convert metadata mapping to dma iter Keith Busch
  6 siblings, 3 replies; 22+ messages in thread
From: Keith Busch @ 2025-07-20 18:40 UTC (permalink / raw)
  To: linux-block, linux-nvme, hch; +Cc: axboe, leonro, Keith Busch

From: Keith Busch <kbusch@kernel.org>

Provide integrity metadata helpers equivalent to the data payload
helpers for iterating a request for dma setup.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 block/blk-mq-dma.c            | 58 ++++++++++++++++++++++++++++++-----
 include/linux/blk-integrity.h | 17 ++++++++++
 include/linux/blk-mq-dma.h    |  1 +
 3 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index 1e2e93c1cf5e2..a4ae043e2cd72 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -2,6 +2,7 @@
 /*
  * Copyright (C) 2025 Christoph Hellwig
  */
+#include <linux/blk-integrity.h>
 #include <linux/blk-mq-dma.h>
 #include "blk.h"
 
@@ -10,6 +11,24 @@ struct phys_vec {
 	u32		len;
 };
 
+static bool blk_dma_iter_next(struct blk_dma_iter *iter)
+{
+	if (iter->iter.bi_size)
+		return true;
+	if (!iter->bio->bi_next)
+		return false;
+
+	iter->bio = iter->bio->bi_next;
+	if (iter->integrity) {
+		iter->iter = iter->bio->bi_integrity->bip_iter;
+		iter->bvec = iter->bio->bi_integrity->bip_vec;
+	} else {
+		iter->iter = iter->bio->bi_iter;
+		iter->bvec = iter->bio->bi_io_vec;
+	}
+	return true;
+}
+
 static bool blk_map_iter_next(struct request *req, struct blk_dma_iter *iter,
 			      struct phys_vec *vec)
 {
@@ -36,13 +55,8 @@ static bool blk_map_iter_next(struct request *req, struct blk_dma_iter *iter,
 	while (!iter->iter.bi_size || !iter->iter.bi_bvec_done) {
 		struct bio_vec next;
 
-		if (!iter->iter.bi_size) {
-			if (!iter->bio->bi_next)
-				break;
-			iter->bio = iter->bio->bi_next;
-			iter->iter = iter->bio->bi_iter;
-			iter->bvec = iter->bio->bi_io_vec;
-		}
+		if (!blk_dma_iter_next(iter))
+			break;
 
 		next = mp_bvec_iter_bvec(iter->bvec, iter->iter);
 		if (bv.bv_len + next.bv_len > max_size ||
@@ -185,6 +199,7 @@ bool blk_rq_dma_map_iter_start(struct request *req, struct device *dma_dev,
 		struct dma_iova_state *state, struct blk_dma_iter *iter)
 {
 	iter->iter = req->bio->bi_iter;
+	iter->integrity = false;
 	if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
 		iter->bvec = &req->special_vec;
 	else
@@ -227,6 +242,35 @@ bool blk_rq_dma_map_iter_next(struct request *req, struct device *dma_dev,
 }
 EXPORT_SYMBOL_GPL(blk_rq_dma_map_iter_next);
 
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+bool blk_rq_integrity_dma_map_iter_start(struct request *req,
+		struct device *dma_dev,  struct dma_iova_state *state,
+		struct blk_dma_iter *iter)
+{
+	unsigned len = bio_integrity_bytes(&req->q->limits.integrity,
+					   blk_rq_sectors(req));
+	iter->iter = req->bio->bi_integrity->bip_iter;
+	iter->bvec = req->bio->bi_integrity->bip_vec;
+	iter->integrity = true;
+	return blk_dma_map_iter_start(req, dma_dev, state, iter, len);
+}
+EXPORT_SYMBOL_GPL(blk_rq_integrity_dma_map_iter_start);
+
+bool blk_rq_integrity_dma_map_iter_next(struct request *req,
+               struct device *dma_dev, struct blk_dma_iter *iter)
+{
+	struct phys_vec vec;
+
+	if (!blk_map_iter_next(req, iter, &vec))
+		return false;
+
+	if (iter->p2pdma.map == PCI_P2PDMA_MAP_BUS_ADDR)
+		return blk_dma_map_bus(iter, &vec);
+	return blk_dma_map_direct(req, dma_dev, iter, &vec);
+}
+EXPORT_SYMBOL_GPL(blk_rq_integrity_dma_map_iter_next);
+#endif
+
 static inline struct scatterlist *
 blk_next_sg(struct scatterlist **sg, struct scatterlist *sglist)
 {
diff --git a/include/linux/blk-integrity.h b/include/linux/blk-integrity.h
index c7eae0bfb013f..d66ba850bb2a5 100644
--- a/include/linux/blk-integrity.h
+++ b/include/linux/blk-integrity.h
@@ -4,6 +4,7 @@
 
 #include <linux/blk-mq.h>
 #include <linux/bio-integrity.h>
+#include <linux/blk-mq-dma.h>
 
 struct request;
 
@@ -29,6 +30,11 @@ int blk_rq_map_integrity_sg(struct request *, struct scatterlist *);
 int blk_rq_count_integrity_sg(struct request_queue *, struct bio *);
 int blk_rq_integrity_map_user(struct request *rq, void __user *ubuf,
 			      ssize_t bytes);
+bool blk_rq_integrity_dma_map_iter_start(struct request *req,
+		struct device *dma_dev,  struct dma_iova_state *state,
+		struct blk_dma_iter *iter);
+bool blk_rq_integrity_dma_map_iter_next(struct request *req,
+		struct device *dma_dev, struct blk_dma_iter *iter);
 
 static inline bool
 blk_integrity_queue_supports_integrity(struct request_queue *q)
@@ -108,6 +114,17 @@ static inline int blk_rq_integrity_map_user(struct request *rq,
 {
 	return -EINVAL;
 }
+bool blk_rq_integrity_dma_map_iter_start(struct request *req,
+		struct device *dma_dev,  struct dma_iova_state *state,
+		struct blk_dma_iter *iter)
+{
+	return false;
+}
+bool blk_rq_integrity_dma_map_iter_next(struct request *req,
+		struct device *dma_dev, struct blk_dma_iter *iter)
+{
+	return false;
+}
 static inline struct blk_integrity *bdev_get_integrity(struct block_device *b)
 {
 	return NULL;
diff --git a/include/linux/blk-mq-dma.h b/include/linux/blk-mq-dma.h
index cdbaacc3db065..8b06659bf67dd 100644
--- a/include/linux/blk-mq-dma.h
+++ b/include/linux/blk-mq-dma.h
@@ -18,6 +18,7 @@ struct blk_dma_iter {
 	struct bvec_iter		iter;
 	struct bio			*bio;
 	struct pci_p2pdma_map_state	p2pdma;
+	bool				integrity;
 };
 
 bool blk_rq_dma_map_iter_start(struct request *req, struct device *dma_dev,
-- 
2.47.1



* [PATCHv2 7/7] nvme: convert metadata mapping to dma iter
  2025-07-20 18:40 [PATCHv2 0/7] blk dma iter for metadata Keith Busch
                   ` (5 preceding siblings ...)
  2025-07-20 18:40 ` [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata Keith Busch
@ 2025-07-20 18:40 ` Keith Busch
  2025-07-21  7:50   ` Christoph Hellwig
  6 siblings, 1 reply; 22+ messages in thread
From: Keith Busch @ 2025-07-20 18:40 UTC (permalink / raw)
  To: linux-block, linux-nvme, hch; +Cc: axboe, leonro, Keith Busch

From: Keith Busch <kbusch@kernel.org>

Aligns metadata to the same dma mapping scheme as data and removes one
more user of the scatter-gather dma mapping.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/nvme/host/pci.c | 126 ++++++++++++++++++++--------------------
 1 file changed, 62 insertions(+), 64 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 6cefa8344f670..248093976a04e 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -172,9 +172,7 @@ struct nvme_dev {
 	u32 last_ps;
 	bool hmb;
 	struct sg_table *hmb_sgt;
-
 	mempool_t *dmavec_mempool;
-	mempool_t *iod_meta_mempool;
 
 	/* shadow doorbell buffer support: */
 	__le32 *dbbuf_dbs;
@@ -264,6 +262,9 @@ enum nvme_iod_flags {
 
 	/* DMA mapped with PCI_P2PDMA_MAP_BUS_ADDR */
 	IOD_P2P_BUS_ADDR	= 1U << 3,
+
+	/* Metadata DMA mapped with PCI_P2PDMA_MAP_BUS_ADDR */
+	IOD_META_P2P_BUS_ADDR	= 1U << 4,
 };
 
 struct nvme_dma_vec {
@@ -279,15 +280,17 @@ struct nvme_iod {
 	struct nvme_command cmd;
 	u8 flags;
 	u8 nr_descriptors;
+	u8 nr_meta_descriptors;
 
 	unsigned int total_len;
+	unsigned int meta_total_len;
 	struct dma_iova_state dma_state;
+	struct dma_iova_state meta_dma_state;
 	void *descriptors[NVME_MAX_NR_DESCRIPTORS];
 	struct nvme_dma_vec *dma_vecs;
 	unsigned int nr_dma_vecs;
 
 	dma_addr_t meta_dma;
-	struct sg_table meta_sgt;
 	struct nvme_sgl_desc *meta_descriptor;
 };
 
@@ -644,6 +647,11 @@ static inline struct dma_pool *nvme_dma_pool(struct nvme_queue *nvmeq,
 	return nvmeq->descriptor_pools.large;
 }
 
+static inline bool nvme_pci_cmd_use_meta_sgl(struct nvme_command *cmd)
+{
+	return (cmd->common.flags & NVME_CMD_SGL_ALL) == NVME_CMD_SGL_METASEG;
+}
+
 static inline bool nvme_pci_cmd_use_sgl(struct nvme_command *cmd)
 {
 	return cmd->common.flags &
@@ -1014,70 +1022,53 @@ static blk_status_t nvme_map_data(struct request *req)
 	return nvme_pci_setup_data_prp(req, &iter);
 }
 
-static void nvme_pci_sgl_set_data_sg(struct nvme_sgl_desc *sge,
-		struct scatterlist *sg)
-{
-	sge->addr = cpu_to_le64(sg_dma_address(sg));
-	sge->length = cpu_to_le32(sg_dma_len(sg));
-	sge->type = NVME_SGL_FMT_DATA_DESC << 4;
-}
-
 static blk_status_t nvme_pci_setup_meta_sgls(struct request *req)
 {
 	struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
-	struct nvme_dev *dev = nvmeq->dev;
+	unsigned int entries = req->nr_integrity_segments;
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+	struct nvme_dev *dev = nvmeq->dev;
 	struct nvme_sgl_desc *sg_list;
-	struct scatterlist *sgl, *sg;
-	unsigned int entries;
+	struct blk_dma_iter iter;
 	dma_addr_t sgl_dma;
-	int rc, i;
-
-	iod->meta_sgt.sgl = mempool_alloc(dev->iod_meta_mempool, GFP_ATOMIC);
-	if (!iod->meta_sgt.sgl)
-		return BLK_STS_RESOURCE;
+	int i = 0;
 
-	sg_init_table(iod->meta_sgt.sgl, req->nr_integrity_segments);
-	iod->meta_sgt.orig_nents = blk_rq_map_integrity_sg(req,
-							   iod->meta_sgt.sgl);
-	if (!iod->meta_sgt.orig_nents)
-		goto out_free_sg;
+	if (!blk_rq_integrity_dma_map_iter_start(req, dev->dev,
+						&iod->meta_dma_state, &iter))
+		return iter.status;
 
-	rc = dma_map_sgtable(dev->dev, &iod->meta_sgt, rq_dma_dir(req),
-			     DMA_ATTR_NO_WARN);
-	if (rc)
-		goto out_free_sg;
+	if (iter.p2pdma.map == PCI_P2PDMA_MAP_BUS_ADDR)
+		iod->flags |= IOD_META_P2P_BUS_ADDR;
+	else if (blk_rq_dma_map_coalesce(&iod->meta_dma_state))
+		entries = 1;
 
 	sg_list = dma_pool_alloc(nvmeq->descriptor_pools.small, GFP_ATOMIC,
 			&sgl_dma);
 	if (!sg_list)
-		goto out_unmap_sg;
+		return BLK_STS_RESOURCE;
 
-	entries = iod->meta_sgt.nents;
 	iod->meta_descriptor = sg_list;
 	iod->meta_dma = sgl_dma;
 
 	iod->cmd.common.flags = NVME_CMD_SGL_METASEG;
 	iod->cmd.common.metadata = cpu_to_le64(sgl_dma);
 
-	sgl = iod->meta_sgt.sgl;
 	if (entries == 1) {
-		nvme_pci_sgl_set_data_sg(sg_list, sgl);
+		iod->meta_total_len = iter.len;
+		nvme_pci_sgl_set_data(sg_list, &iter);
+		iod->nr_meta_descriptors = 0;
 		return BLK_STS_OK;
 	}
 
 	sgl_dma += sizeof(*sg_list);
-	nvme_pci_sgl_set_seg(sg_list, sgl_dma, entries);
-	for_each_sg(sgl, sg, entries, i)
-		nvme_pci_sgl_set_data_sg(&sg_list[i + 1], sg);
-
-	return BLK_STS_OK;
+	do {
+		nvme_pci_sgl_set_data(&sg_list[++i], &iter);
+		iod->meta_total_len += iter.len;
+	} while (blk_rq_integrity_dma_map_iter_next(req, dev->dev, &iter));
 
-out_unmap_sg:
-	dma_unmap_sgtable(dev->dev, &iod->meta_sgt, rq_dma_dir(req), 0);
-out_free_sg:
-	mempool_free(iod->meta_sgt.sgl, dev->iod_meta_mempool);
-	return BLK_STS_RESOURCE;
+	nvme_pci_sgl_set_seg(sg_list, sgl_dma, i);
+	iod->nr_meta_descriptors = i;
+	return iter.status;
 }
 
 static blk_status_t nvme_pci_setup_meta_mptr(struct request *req)
@@ -1090,6 +1081,7 @@ static blk_status_t nvme_pci_setup_meta_mptr(struct request *req)
 	if (dma_mapping_error(nvmeq->dev->dev, iod->meta_dma))
 		return BLK_STS_IOERR;
 	iod->cmd.common.metadata = cpu_to_le64(iod->meta_dma);
+	iod->nr_meta_descriptors = 0;
 	return BLK_STS_OK;
 }
 
@@ -1111,7 +1103,6 @@ static blk_status_t nvme_prep_rq(struct request *req)
 	iod->flags = 0;
 	iod->nr_descriptors = 0;
 	iod->total_len = 0;
-	iod->meta_sgt.nents = 0;
 
 	ret = nvme_setup_cmd(req->q->queuedata, req);
 	if (ret)
@@ -1222,23 +1213,43 @@ static void nvme_queue_rqs(struct rq_list *rqlist)
 	*rqlist = requeue_list;
 }
 
+static void nvme_free_meta_sgls(struct nvme_iod *iod, struct device *dma_dev,
+				enum dma_data_direction dir)
+{
+	struct nvme_sgl_desc *sg_list = iod->meta_descriptor;
+	int i;
+
+	if (!iod->nr_meta_descriptors) {
+		dma_unmap_page(dma_dev, le64_to_cpu(sg_list->addr),
+				le32_to_cpu(sg_list->length), dir);
+		return;
+	}
+
+	for (i = 1; i <= iod->nr_meta_descriptors; i++)
+		dma_unmap_page(dma_dev, le64_to_cpu(sg_list[i].addr),
+				le32_to_cpu(sg_list[i].length), dir);
+}
+
 static __always_inline void nvme_unmap_metadata(struct request *req)
 {
-	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 	struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
-	struct nvme_dev *dev = nvmeq->dev;
+	enum dma_data_direction dir = rq_dma_dir(req);
+	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+	struct device *dma_dev = nvmeq->dev->dev;
 
-	if (!iod->meta_sgt.nents) {
-		dma_unmap_page(dev->dev, iod->meta_dma,
-			       rq_integrity_vec(req).bv_len,
-			       rq_dma_dir(req));
+	if (!nvme_pci_cmd_use_meta_sgl(&iod->cmd)) {
+		dma_unmap_page(dma_dev, iod->meta_dma,
+			       rq_integrity_vec(req).bv_len, dir);
 		return;
 	}
 
+	if (!blk_rq_dma_unmap(req, dma_dev, &iod->meta_dma_state,
+				iod->meta_total_len,
+				iod->flags & IOD_META_P2P_BUS_ADDR))
+		nvme_free_meta_sgls(iod, dma_dev, dir);
+
 	dma_pool_free(nvmeq->descriptor_pools.small, iod->meta_descriptor,
 			iod->meta_dma);
-	dma_unmap_sgtable(dev->dev, &iod->meta_sgt, rq_dma_dir(req), 0);
-	mempool_free(iod->meta_sgt.sgl, dev->iod_meta_mempool);
 }
 
 static __always_inline void nvme_pci_unmap_rq(struct request *req)
@@ -3046,7 +3057,6 @@ static int nvme_disable_prepare_reset(struct nvme_dev *dev, bool shutdown)
 
 static int nvme_pci_alloc_iod_mempool(struct nvme_dev *dev)
 {
-	size_t meta_size = sizeof(struct scatterlist) * (NVME_MAX_META_SEGS + 1);
 	size_t alloc_size = sizeof(struct nvme_dma_vec) * NVME_MAX_SEGS;
 
 	dev->dmavec_mempool = mempool_create_node(1,
@@ -3055,17 +3065,7 @@ static int nvme_pci_alloc_iod_mempool(struct nvme_dev *dev)
 			dev_to_node(dev->dev));
 	if (!dev->dmavec_mempool)
 		return -ENOMEM;
-
-	dev->iod_meta_mempool = mempool_create_node(1,
-			mempool_kmalloc, mempool_kfree,
-			(void *)meta_size, GFP_KERNEL,
-			dev_to_node(dev->dev));
-	if (!dev->iod_meta_mempool)
-		goto free;
 	return 0;
-free:
-	mempool_destroy(dev->dmavec_mempool);
-	return -ENOMEM;
 }
 
 static void nvme_free_tagset(struct nvme_dev *dev)
@@ -3515,7 +3515,6 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	nvme_free_queues(dev, 0);
 out_release_iod_mempool:
 	mempool_destroy(dev->dmavec_mempool);
-	mempool_destroy(dev->iod_meta_mempool);
 out_dev_unmap:
 	nvme_dev_unmap(dev);
 out_uninit_ctrl:
@@ -3579,7 +3578,6 @@ static void nvme_remove(struct pci_dev *pdev)
 	nvme_dbbuf_dma_free(dev);
 	nvme_free_queues(dev, 0);
 	mempool_destroy(dev->dmavec_mempool);
-	mempool_destroy(dev->iod_meta_mempool);
 	nvme_release_descriptor_pools(dev);
 	nvme_dev_unmap(dev);
 	nvme_uninit_ctrl(&dev->ctrl);
-- 
2.47.1



* Re: [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata
  2025-07-20 18:40 ` [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata Keith Busch
@ 2025-07-20 22:51   ` kernel test robot
  2025-07-21  3:18   ` kernel test robot
  2025-07-21  7:46   ` Christoph Hellwig
  2 siblings, 0 replies; 22+ messages in thread
From: kernel test robot @ 2025-07-20 22:51 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-nvme, hch
  Cc: llvm, oe-kbuild-all, axboe, leonro, Keith Busch

Hi Keith,

kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[cannot apply to linux-nvme/for-next hch-configfs/for-next linus/master v6.16-rc6 next-20250718]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Keith-Busch/blk-mq-dma-move-the-bio-and-bvec_iter-to-blk_dma_iter/20250721-024616
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20250720184040.2402790-7-kbusch%40meta.com
patch subject: [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata
config: um-randconfig-002-20250721 (https://download.01.org/0day-ci/archive/20250721/202507210629.vtm1boC0-lkp@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250721/202507210629.vtm1boC0-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202507210629.vtm1boC0-lkp@intel.com/

All errors (new ones prefixed by >>):

>> ld.lld: error: duplicate symbol: blk_rq_integrity_dma_map_iter_start
   >>> defined at dm-table.c
   >>>            drivers/md/dm-table.o:(blk_rq_integrity_dma_map_iter_start)
   >>> defined at dm-io-rewind.c
   >>>            drivers/md/dm-io-rewind.o:(.text+0x0)
--
>> ld.lld: error: duplicate symbol: blk_rq_integrity_dma_map_iter_next
   >>> defined at dm-table.c
   >>>            drivers/md/dm-table.o:(blk_rq_integrity_dma_map_iter_next)
   >>> defined at dm-io-rewind.c
   >>>            drivers/md/dm-io-rewind.o:(.text+0x13)
--
>> ld.lld: error: duplicate symbol: blk_rq_integrity_dma_map_iter_start
   >>> defined at core.c
   >>>            drivers/nvme/host/core.o:(blk_rq_integrity_dma_map_iter_start)
   >>> defined at ioctl.c
   >>>            drivers/nvme/host/ioctl.o:(.text+0x0)
--
>> ld.lld: error: duplicate symbol: blk_rq_integrity_dma_map_iter_next
   >>> defined at core.c
   >>>            drivers/nvme/host/core.o:(blk_rq_integrity_dma_map_iter_next)
   >>> defined at ioctl.c
   >>>            drivers/nvme/host/ioctl.o:(.text+0x13)

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata
  2025-07-20 18:40 ` [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata Keith Busch
  2025-07-20 22:51   ` kernel test robot
@ 2025-07-21  3:18   ` kernel test robot
  2025-07-21  7:46   ` Christoph Hellwig
  2 siblings, 0 replies; 22+ messages in thread
From: kernel test robot @ 2025-07-21  3:18 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-nvme, hch
  Cc: llvm, oe-kbuild-all, axboe, leonro, Keith Busch

Hi Keith,

kernel test robot noticed the following build warnings:

[auto build test WARNING on axboe-block/for-next]
[cannot apply to linux-nvme/for-next hch-configfs/for-next linus/master v6.16-rc7 next-20250718]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Keith-Busch/blk-mq-dma-move-the-bio-and-bvec_iter-to-blk_dma_iter/20250721-024616
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20250720184040.2402790-7-kbusch%40meta.com
patch subject: [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata
config: riscv-randconfig-001-20250721 (https://download.01.org/0day-ci/archive/20250721/202507211037.lzkMQLrY-lkp@intel.com/config)
compiler: clang version 21.0.0git (https://github.com/llvm/llvm-project 16534d19bf50bde879a83f0ae62875e2c5120e64)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250721/202507211037.lzkMQLrY-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202507211037.lzkMQLrY-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from block/blk-mq.c:13:
>> include/linux/blk-integrity.h:117:6: warning: no previous prototype for function 'blk_rq_integrity_dma_map_iter_start' [-Wmissing-prototypes]
     117 | bool blk_rq_integrity_dma_map_iter_start(struct request *req,
         |      ^
   include/linux/blk-integrity.h:117:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
     117 | bool blk_rq_integrity_dma_map_iter_start(struct request *req,
         | ^
         | static 
>> include/linux/blk-integrity.h:123:6: warning: no previous prototype for function 'blk_rq_integrity_dma_map_iter_next' [-Wmissing-prototypes]
     123 | bool blk_rq_integrity_dma_map_iter_next(struct request *req,
         |      ^
   include/linux/blk-integrity.h:123:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
     123 | bool blk_rq_integrity_dma_map_iter_next(struct request *req,
         | ^
         | static 
   2 warnings generated.
--
   In file included from block/blk-mq-dma.c:5:
>> include/linux/blk-integrity.h:117:6: warning: no previous prototype for function 'blk_rq_integrity_dma_map_iter_start' [-Wmissing-prototypes]
     117 | bool blk_rq_integrity_dma_map_iter_start(struct request *req,
         |      ^
   include/linux/blk-integrity.h:117:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
     117 | bool blk_rq_integrity_dma_map_iter_start(struct request *req,
         | ^
         | static 
>> include/linux/blk-integrity.h:123:6: warning: no previous prototype for function 'blk_rq_integrity_dma_map_iter_next' [-Wmissing-prototypes]
     123 | bool blk_rq_integrity_dma_map_iter_next(struct request *req,
         |      ^
   include/linux/blk-integrity.h:123:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
     123 | bool blk_rq_integrity_dma_map_iter_next(struct request *req,
         | ^
         | static 
   block/blk-mq-dma.c:23:27: error: no member named 'bi_integrity' in 'struct bio'
      23 |                 iter->iter = iter->bio->bi_integrity->bip_iter;
         |                              ~~~~~~~~~  ^
   block/blk-mq-dma.c:24:27: error: no member named 'bi_integrity' in 'struct bio'
      24 |                 iter->bvec = iter->bio->bi_integrity->bip_vec;
         |                              ~~~~~~~~~  ^
   2 warnings and 2 errors generated.


vim +/blk_rq_integrity_dma_map_iter_start +117 include/linux/blk-integrity.h

    90	
    91	/*
    92	 * Return the current bvec that contains the integrity data. bip_iter may be
    93	 * advanced to iterate over the integrity data.
    94	 */
    95	static inline struct bio_vec rq_integrity_vec(struct request *rq)
    96	{
    97		return mp_bvec_iter_bvec(rq->bio->bi_integrity->bip_vec,
    98					 rq->bio->bi_integrity->bip_iter);
    99	}
   100	#else /* CONFIG_BLK_DEV_INTEGRITY */
   101	static inline int blk_rq_count_integrity_sg(struct request_queue *q,
   102						    struct bio *b)
   103	{
   104		return 0;
   105	}
   106	static inline int blk_rq_map_integrity_sg(struct request *q,
   107						  struct scatterlist *s)
   108	{
   109		return 0;
   110	}
   111	static inline int blk_rq_integrity_map_user(struct request *rq,
   112						    void __user *ubuf,
   113						    ssize_t bytes)
   114	{
   115		return -EINVAL;
   116	}
 > 117	bool blk_rq_integrity_dma_map_iter_start(struct request *req,
   118			struct device *dma_dev,  struct dma_iova_state *state,
   119			struct blk_dma_iter *iter)
   120	{
   121		return false;
   122	}
 > 123	bool blk_rq_integrity_dma_map_iter_next(struct request *req,
   124			struct device *dma_dev, struct blk_dma_iter *iter)
   125	{
   126		return false;
   127	}
   128	static inline struct blk_integrity *bdev_get_integrity(struct block_device *b)
   129	{
   130		return NULL;
   131	}
   132	static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
   133	{
   134		return NULL;
   135	}
   136	static inline bool
   137	blk_integrity_queue_supports_integrity(struct request_queue *q)
   138	{
   139		return false;
   140	}
   141	static inline unsigned short
   142	queue_max_integrity_segments(const struct request_queue *q)
   143	{
   144		return 0;
   145	}
   146	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter
  2025-07-20 18:40 ` [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter Keith Busch
@ 2025-07-21  7:37   ` Christoph Hellwig
  2025-07-21  7:42   ` Christoph Hellwig
  1 sibling, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-07-21  7:37 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-block, linux-nvme, hch, axboe, leonro, Keith Busch

On Sun, Jul 20, 2025 at 11:40:34AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> The req_iterator just happens to have similar fields to what the dma
> iterator needs, but we're not necessarily iterating a bio_vec here. Have
> the dma iterator define its private fields directly. It also helps to
> remove eyesores like "iter->iter.iter".

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCHv2 2/7] blk-mq-dma: set the bvec being iterated
  2025-07-20 18:40 ` [PATCHv2 2/7] blk-mq-dma: set the bvec being iterated Keith Busch
@ 2025-07-21  7:38   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-07-21  7:38 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-block, linux-nvme, hch, axboe, leonro, Keith Busch

On Sun, Jul 20, 2025 at 11:40:35AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> This will make it easier to add different bvec sources, like the
> upcoming integrity support. It also makes iterating "special" payloads
> more consistent with iterating normal data bi_io_vecs.

You're not really setting the bvec, but the bvec table, i.e. the start
of the contiguous chunk of bvecs for the bio / bip.  Can you make this
a bit more clear both in the commit message and the structure member
naming?  This really confused me and took a bit of careful reading
to understand.
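
Roughly, the distinction is (a sketch using the existing bvec.h
helpers, assuming a struct bio *bio in scope; "table" and "cur" are
made-up local names):

	/* the bvec table: the bio's contiguous array of segments */
	struct bio_vec *table = bio->bi_io_vec;
	/* the iterator: a cursor (index + intra-bvec offset) into the table */
	struct bvec_iter cur = bio->bi_iter;
	/* the current multi-page bvec is derived from table + cursor */
	struct bio_vec bv = mp_bvec_iter_bvec(table, cur);

So the new member points at the table; it is not itself the bvec being
iterated.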

Otherwise this looks fine.


* Re: [PATCHv2 3/7] blk-mq-dma: require unmap caller provide p2p map type
  2025-07-20 18:40 ` [PATCHv2 3/7] blk-mq-dma: require unmap caller provide p2p map type Keith Busch
@ 2025-07-21  7:39   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-07-21  7:39 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-block, linux-nvme, hch, axboe, leonro, Keith Busch

On Sun, Jul 20, 2025 at 11:40:36AM -0700, Keith Busch wrote:
>  static inline bool blk_rq_dma_unmap(struct request *req, struct device *dma_dev,
> -		struct dma_iova_state *state, size_t mapped_len)
> +		struct dma_iova_state *state, size_t mapped_len, bool p2p)

Nit: can we name this is_p2p, as that makes reading these boolean flags
a bit easier?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCHv2 4/7] blk-mq: remove REQ_P2PDMA flag
  2025-07-20 18:40 ` [PATCHv2 4/7] blk-mq: remove REQ_P2PDMA flag Keith Busch
@ 2025-07-21  7:39   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-07-21  7:39 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-block, linux-nvme, hch, axboe, leonro, Keith Busch

On Sun, Jul 20, 2025 at 11:40:37AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> It's not serving any particular purpose anymore. pci_p2pdma_state()
> already has all the appropriate checks, so the flag isn't guarding
> anything.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCHv2 5/7] blk-mq-dma: move common dma start code to a helper
  2025-07-20 18:40 ` [PATCHv2 5/7] blk-mq-dma: move common dma start code to a helper Keith Busch
@ 2025-07-21  7:40   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-07-21  7:40 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-block, linux-nvme, hch, axboe, leonro, Keith Busch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter
  2025-07-20 18:40 ` [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter Keith Busch
  2025-07-21  7:37   ` Christoph Hellwig
@ 2025-07-21  7:42   ` Christoph Hellwig
  2025-07-22  2:33     ` Keith Busch
  1 sibling, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2025-07-21  7:42 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-block, linux-nvme, hch, axboe, leonro, Keith Busch

On Sun, Jul 20, 2025 at 11:40:34AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> The req_iterator just happens to have similar fields to what the dma
> iterator needs, but we're not necessarily iterating a bio_vec here. Have
> the dma iterator define its private fields directly. It also helps to
> remove eyesores like "iter->iter.iter".

Going back to this after looking at the later patches.  The level
for which this iter exists is only called blk_map_* as there is
nothing dma specific about it, just mapping to physical addresses.

So maybe call it blk_map_iter instead?


* Re: [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata
  2025-07-20 18:40 ` [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata Keith Busch
  2025-07-20 22:51   ` kernel test robot
  2025-07-21  3:18   ` kernel test robot
@ 2025-07-21  7:46   ` Christoph Hellwig
  2 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-07-21  7:46 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-block, linux-nvme, hch, axboe, leonro, Keith Busch

On Sun, Jul 20, 2025 at 11:40:39AM -0700, Keith Busch wrote:
> +static bool blk_dma_iter_next(struct blk_dma_iter *iter)

This is at the blk_map_iter level and one level below dma, so it
should not have dma in the name.  I'm not entirely sure what a good
name is, though.  Maybe blk_map_iter_next_bio, which was the original
intention behind the refactored loop, although it isn't quite true
with the later !iter->iter.bi_bvec_done addition.

> +#ifdef CONFIG_BLK_DEV_INTEGRITY
> +bool blk_rq_integrity_dma_map_iter_start(struct request *req,
> +		struct device *dma_dev,  struct dma_iova_state *state,
> +		struct blk_dma_iter *iter)
> +{
> +	unsigned len = bio_integrity_bytes(&req->q->limits.integrity,
> +					   blk_rq_sectors(req));
> +	iter->iter = req->bio->bi_integrity->bip_iter;
> +	iter->bvec = req->bio->bi_integrity->bip_vec;
> +	iter->integrity = true;
> +	return blk_dma_map_iter_start(req, dma_dev, state, iter, len);
> +}
> +EXPORT_SYMBOL_GPL(blk_rq_integrity_dma_map_iter_start);
> +
> +bool blk_rq_integrity_dma_map_iter_next(struct request *req,
> +               struct device *dma_dev, struct blk_dma_iter *iter)

These want kernel doc comments, which can be mostly copied and pasted
from the data mapping version.

Also it would be great to convert the integrity sg mapping and
count macros to blk_dma_map_iter_ just like I did for the data
version.

> +bool blk_rq_integrity_dma_map_iter_start(struct request *req,
> +		struct device *dma_dev,  struct dma_iova_state *state,
> +		struct blk_dma_iter *iter)
> +{
> +	return false;
> +}
> +bool blk_rq_integrity_dma_map_iter_next(struct request *req,

As you've probably noticed from the buildbot these need to be inline.

> @@ -18,6 +18,7 @@ struct blk_dma_iter {
>  	struct bvec_iter		iter;
>  	struct bio			*bio;
>  	struct pci_p2pdma_map_state	p2pdma;
> +	bool				integrity;

Add an is_ prefix for the new field?  Or replace it with checks
for ->bvec (or ->bvec_table as it should be) != bio->bi_io_vec?



* Re: [PATCHv2 7/7] nvme: convert metadata mapping to dma iter
  2025-07-20 18:40 ` [PATCHv2 7/7] nvme: convert metadata mapping to dma iter Keith Busch
@ 2025-07-21  7:50   ` Christoph Hellwig
  2025-07-21 13:15     ` Keith Busch
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2025-07-21  7:50 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-block, linux-nvme, hch, axboe, leonro, Keith Busch

> +	if (!blk_rq_integrity_dma_map_iter_start(req, dev->dev,
> +						&iod->meta_dma_state, &iter))

Do the normal two-tab indent here to make this a bit more readable?

>  	if (entries == 1) {
> -		nvme_pci_sgl_set_data_sg(sg_list, sgl);
> +		iod->meta_total_len = iter.len;
> +		nvme_pci_sgl_set_data(sg_list, &iter);
> +		iod->nr_meta_descriptors = 0;

This should probably just set up the linear metadata pointer instead
of a single-segment SGL.

> +	if (!iod->nr_meta_descriptors) {
> +		dma_unmap_page(dma_dev, le64_to_cpu(sg_list->addr),
> +				le32_to_cpu(sg_list->length), dir);
> +		return;
> +	}
> +
> +	for (i = 1; i <= iod->nr_meta_descriptors; i++)
> +		dma_unmap_page(dma_dev, le64_to_cpu(sg_list[i].addr),
> +				le32_to_cpu(sg_list[i].length), dir);
> +}

The use of nr_meta_descriptors is still incorrect here.  nr_descriptors
counts the number of descriptors we got from the dma pools, which
currently is always 1 for metadata SGLs.  The length of the SGL
descriptor simply comes from le32_to_cpu(sg_list[0].length) divided
by the sgl entry size.
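
I.e. something like this (an untested sketch):

	unsigned int nents = le32_to_cpu(sg_list[0].length) /
				sizeof(struct nvme_sgl_desc);
	int i;

	for (i = 1; i <= nents; i++)
		dma_unmap_page(dma_dev, le64_to_cpu(sg_list[i].addr),
				le32_to_cpu(sg_list[i].length), dir);

so no separate count needs to be stored in the iod at all.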



* Re: [PATCHv2 7/7] nvme: convert metadata mapping to dma iter
  2025-07-21  7:50   ` Christoph Hellwig
@ 2025-07-21 13:15     ` Keith Busch
  2025-07-22  5:49       ` Christoph Hellwig
  0 siblings, 1 reply; 22+ messages in thread
From: Keith Busch @ 2025-07-21 13:15 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Keith Busch, linux-block, linux-nvme, axboe, leonro

On Mon, Jul 21, 2025 at 09:50:53AM +0200, Christoph Hellwig wrote:
> >  	if (entries == 1) {
> > -		nvme_pci_sgl_set_data_sg(sg_list, sgl);
> > +		iod->meta_total_len = iter.len;
> > +		nvme_pci_sgl_set_data(sg_list, &iter);
> > +		iod->nr_meta_descriptors = 0;
> 
> This should probably just set up the linear metadata pointer instead
> of a single-segment SGL.

Okay, but we should still use SGLs with user passthrough commands for
memory safety. Even if we have an iommu protecting access, there's still
a possibility of corrupting adjacent IOVAs if using MPTR.
 
> > +	if (!iod->nr_meta_descriptors) {
> > +		dma_unmap_page(dma_dev, le64_to_cpu(sg_list->addr),
> > +				le32_to_cpu(sg_list->length), dir);
> > +		return;
> > +	}
> > +
> > +	for (i = 1; i <= iod->nr_meta_descriptors; i++)
> > +		dma_unmap_page(dma_dev, le64_to_cpu(sg_list[i].addr),
> > +				le32_to_cpu(sg_list[i].length), dir);
> > +}
> 
> The use of nr_meta_descriptors is still incorrect here.  nr_descriptors
> counts the number of descriptors we got from the dma pools, which
> currently is always 1 for metadata SGLs.  The length of the SGL
> descriptor simplify comes from le32_to_cpu(sg_list[0].length) divided
> by the sgl entry size.

In this patch, the nr_meta_descriptors value matches the sg_list length.
The only real reason I need this 'nr_' value is to distinguish the
single data descriptor condition from the segment descriptor use, but I
can just add an iod flag for that too and save some space.
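
Something like this, for example (untested; IOD_SINGLE_META_SGL is a
made-up name):

	/* metadata mapped as a single SGL data descriptor, no segment list */
	IOD_SINGLE_META_SGL	= 1U << 5,

and then the unmap path tests that flag instead of looking at a
descriptor count.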


* Re: [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter
  2025-07-21  7:42   ` Christoph Hellwig
@ 2025-07-22  2:33     ` Keith Busch
  2025-07-22  5:53       ` Christoph Hellwig
  0 siblings, 1 reply; 22+ messages in thread
From: Keith Busch @ 2025-07-22  2:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Keith Busch, linux-block, linux-nvme, axboe, leonro

On Mon, Jul 21, 2025 at 09:42:23AM +0200, Christoph Hellwig wrote:
> On Sun, Jul 20, 2025 at 11:40:34AM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@kernel.org>
> > 
> > The req_iterator just happens to have similar fields to what the dma
> > iterator needs, but we're not necessarily iterating a bio_vec here. Have
> > the dma iterator define its private fields directly. It also helps to
> > remove eyesores like "iter->iter.iter".
> 
> Going back to this after looking at the later patches.  The level
> for which this iter exists is only called blk_map_* as there is
> nothing dma specific about it, just mapping to physical addresses.
> 
> So maybe call it blk_map_iter instead?

But the structure yields "dma_addr_t" type values to the consumer.
Whether those are physical addresses or remapped io virtual addresses,
they're still used for direct memory access, so I think the name as-is
is a pretty good fit.


* Re: [PATCHv2 7/7] nvme: convert metadata mapping to dma iter
  2025-07-21 13:15     ` Keith Busch
@ 2025-07-22  5:49       ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-07-22  5:49 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, Keith Busch, linux-block, linux-nvme, axboe,
	leonro

On Mon, Jul 21, 2025 at 07:15:41AM -0600, Keith Busch wrote:
> > This should probably just set up the linear metadata pointer instead
> > of a single-segment SGL.
> 
> Okay, but we should still use SGL with user passthrough commands for
> memory safety. Even if we have an iommu protecting access, there's still
> a possibility of corrupting adjacent iova's if using MPTR.

True.

> In this patch, the nr_meta_descriptors value matches the sg_list length.
> The only real reason I need this 'nr_' value is to distinguish the
> single data descriptor condition from the segment descriptor use, but I
> can just add an iod flag for that too and save some space.

Yes, that would be great.


* Re: [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter
  2025-07-22  2:33     ` Keith Busch
@ 2025-07-22  5:53       ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-07-22  5:53 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, Keith Busch, linux-block, linux-nvme, axboe,
	leonro

On Mon, Jul 21, 2025 at 08:33:02PM -0600, Keith Busch wrote:
> On Mon, Jul 21, 2025 at 09:42:23AM +0200, Christoph Hellwig wrote:
> > On Sun, Jul 20, 2025 at 11:40:34AM -0700, Keith Busch wrote:
> > > From: Keith Busch <kbusch@kernel.org>
> > > 
> > > The req_iterator just happens to have similar fields to what the dma
> > > iterator needs, but we're not necessarily iterating a bio_vec here. Have
> > > the dma iterator define its private fields directly. It also helps to
> > > remove eyesores like "iter->iter.iter".
> > 
> > Going back to this after looking at the later patches.  The level
> > for which this iter exists is only called blk_map_* as there is
> > nothing dma specific about it, just mapping to physical addresses.
> > 
> > So maybe call it blk_map_iter instead?
> 
> But the structure yields "dma_addr_t" type values to the consumer.
> Whether those are physical addresses or remapped io virtual addresses,
> they're still used for direct memory access, so I think the name as-is
> is a pretty good fit.

There's two layers:

The lower layer implemented by blk_map_iter_next just yields a phys_vec.

The upper layer built on top does the dma mapping in case of the new
blk_rq_dma_map_iter_start / blk_rq_dma_map_iter_next API.  But in case of
the old __blk_rq_map_sg API it doesn't even directly do the DMA mapping
yet.  And some drivers at least historically used the blk_map_sg without
then doing a DMA mapping, either for MMIO or platform-specific things
using physical addresses.  My hope was to eventually migrate them to use
blk_map_iter_next.
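
For reference, a consumer of the upper layer ends up with a loop like
this (condensed from the nvme-pci conversion later in this series;
assumes req, dma_dev and a stored dma_iova_state "state" in scope):

	struct blk_dma_iter iter;

	if (!blk_rq_dma_map_iter_start(req, dma_dev, &state, &iter))
		return iter.status;
	do {
		/* program iter.addr / iter.len into a device descriptor */
	} while (blk_rq_dma_map_iter_next(req, dma_dev, &iter));

while the lower blk_map_iter_next level only ever sees a struct
phys_vec and knows nothing about the device.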


Thread overview: 22 messages
2025-07-20 18:40 [PATCHv2 0/7] blk dma iter for metadata Keith Busch
2025-07-20 18:40 ` [PATCHv2 1/7] blk-mq-dma: move the bio and bvec_iter to blk_dma_iter Keith Busch
2025-07-21  7:37   ` Christoph Hellwig
2025-07-21  7:42   ` Christoph Hellwig
2025-07-22  2:33     ` Keith Busch
2025-07-22  5:53       ` Christoph Hellwig
2025-07-20 18:40 ` [PATCHv2 2/7] blk-mq-dma: set the bvec being iterated Keith Busch
2025-07-21  7:38   ` Christoph Hellwig
2025-07-20 18:40 ` [PATCHv2 3/7] blk-mq-dma: require unmap caller provide p2p map type Keith Busch
2025-07-21  7:39   ` Christoph Hellwig
2025-07-20 18:40 ` [PATCHv2 4/7] blk-mq: remove REQ_P2PDMA flag Keith Busch
2025-07-21  7:39   ` Christoph Hellwig
2025-07-20 18:40 ` [PATCHv2 5/7] blk-mq-dma: move common dma start code to a helper Keith Busch
2025-07-21  7:40   ` Christoph Hellwig
2025-07-20 18:40 ` [PATCHv2 6/7] blk-mq-dma: add support for mapping integrity metadata Keith Busch
2025-07-20 22:51   ` kernel test robot
2025-07-21  3:18   ` kernel test robot
2025-07-21  7:46   ` Christoph Hellwig
2025-07-20 18:40 ` [PATCHv2 7/7] nvme: convert metadata mapping to dma iter Keith Busch
2025-07-21  7:50   ` Christoph Hellwig
2025-07-21 13:15     ` Keith Busch
2025-07-22  5:49       ` Christoph Hellwig
