* [PATCH RFC 0/9] Exploring biovec support in (R)DMA API
@ 2023-10-19 15:25 ` Chuck Lever
  0 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:25 UTC (permalink / raw)
  Cc: Marek Szyprowski, Chuck Lever, Robin Murphy, Alexander Potapenko,
	linux-mm, linux-rdma, Jens Axboe, kasan-dev, David Howells, iommu,
	Christoph Hellwig

The SunRPC stack manages pages (and eventually, folios) via an
array of struct bio_vec items within struct xdr_buf. We have not
fully committed to replacing the struct page array in xdr_buf
because, although the socket API supports bio_vec arrays, the RDMA
stack uses struct scatterlist rather than struct bio_vec.

This (incomplete) series explores what it might look like if the
RDMA core API could support struct bio_vec array arguments. The
series compiles on x86, but I haven't tested it further. I'm posting
early in hopes of starting further discussion.

Are there other upper layer API consumers, besides SunRPC, who might
prefer the use of bio_vec over scatterlist?

Besides handling folios as well as single pages in bv_page, what
other work might be needed in the DMA layer?

What RDMA core APIs should be converted? IMO a DMA mapping and
registration API for bio_vec arrays would be needed. Maybe RDMA Read
and Write too?

---

Chuck Lever (9):
      dma-debug: Fix a typo in a debugging eye-catcher
      bvec: Add bio_vec fields to manage DMA mapping
      dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
      mm: kmsan: Add support for DMA mapping bio_vec arrays
      dma-direct: Support direct mapping bio_vec arrays
      DMA-API: Add dma_sync_bvecs_for_cpu() and dma_sync_bvecs_for_device()
      DMA: Add dma_map_bvecs_attrs()
      iommu/dma: Support DMA-mapping a bio_vec array
      RDMA: Add helpers for DMA-mapping an array of bio_vecs


 drivers/iommu/dma-iommu.c   | 368 ++++++++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c       |  58 ++++++
 include/linux/bvec.h        | 143 ++++++++++++++
 include/linux/dma-map-ops.h |   8 +
 include/linux/dma-mapping.h |   9 +
 include/linux/iommu.h       |   4 +
 include/linux/kmsan.h       |  20 ++
 include/rdma/ib_verbs.h     |  29 +++
 kernel/dma/debug.c          | 165 +++++++++++++++-
 kernel/dma/debug.h          |  38 ++++
 kernel/dma/direct.c         |  92 +++++++++
 kernel/dma/direct.h         |  17 ++
 kernel/dma/mapping.c        |  93 +++++++++
 mm/kmsan/hooks.c            |  13 ++
 14 files changed, 1056 insertions(+), 1 deletion(-)

--
Chuck Lever


* [PATCH RFC 1/9] dma-debug: Fix a typo in a debugging eye-catcher
  2023-10-19 15:25 ` Chuck Lever
  (?)
@ 2023-10-19 15:25 ` Chuck Lever
  2023-10-20  4:49   ` Christoph Hellwig
  -1 siblings, 1 reply; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:25 UTC (permalink / raw)
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy, iommu,
	Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Cc: Christoph Hellwig <hch@lst.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: iommu@lists.linux.dev
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 kernel/dma/debug.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 06366acd27b0..3de494375b7b 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -139,7 +139,7 @@ static const char *const maperr2str[] = {
 
 static const char *type2name[] = {
 	[dma_debug_single] = "single",
-	[dma_debug_sg] = "scather-gather",
+	[dma_debug_sg] = "scatter-gather",
 	[dma_debug_coherent] = "coherent",
 	[dma_debug_resource] = "resource",
 };




* [PATCH RFC 2/9] bvec: Add bio_vec fields to manage DMA mapping
  2023-10-19 15:25 ` Chuck Lever
@ 2023-10-19 15:25   ` Chuck Lever
  -1 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:25 UTC (permalink / raw)
  Cc: Jens Axboe, Christoph Hellwig, David Howells, iommu, linux-rdma,
	Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

These are roughly equivalent to the fields used for managing
scatterlist DMA mapping.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: iommu@lists.linux.dev
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/bvec.h |  143 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 143 insertions(+)

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index 555aae5448ae..1074f34a4e8f 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -13,6 +13,7 @@
 #include <linux/limits.h>
 #include <linux/minmax.h>
 #include <linux/types.h>
+#include <asm/io.h>
 
 struct page;
 
@@ -32,6 +33,13 @@ struct bio_vec {
 	struct page	*bv_page;
 	unsigned int	bv_len;
 	unsigned int	bv_offset;
+	dma_addr_t	bv_dma_address;
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+	unsigned int	bv_dma_length;
+#endif
+#ifdef CONFIG_NEED_SG_DMA_FLAGS
+	unsigned int	bv_dma_flags;
+#endif
 };
 
 /**
@@ -74,6 +82,24 @@ static inline void bvec_set_virt(struct bio_vec *bv, void *vaddr,
 	bvec_set_page(bv, virt_to_page(vaddr), len, offset_in_page(vaddr));
 }
 
+/**
+ * bv_phys - return physical address of a bio_vec
+ * @bv:		bio_vec
+ */
+static inline phys_addr_t bv_phys(struct bio_vec *bv)
+{
+	return page_to_phys(bv->bv_page) + bv->bv_offset;
+}
+
+/**
+ * bv_virt - return virtual address of a bio_vec
+ * @bv:		bio_vec
+ */
+static inline void *bv_virt(struct bio_vec *bv)
+{
+	return page_address(bv->bv_page) + bv->bv_offset;
+}
+
 struct bvec_iter {
 	sector_t		bi_sector;	/* device address in 512 byte
 						   sectors */
@@ -280,4 +306,121 @@ static inline void *bvec_virt(struct bio_vec *bvec)
 	return page_address(bvec->bv_page) + bvec->bv_offset;
 }
 
+/*
+ * These macros should be used after a dma_map_bvecs call has been done
+ * to get bus addresses of each of the bio_vec array entries and their
+ * lengths. You should work only with the number of bio_vec array entries
+ * dma_map_bvecs returns, or alternatively stop on the first bv_dma_len(bv)
+ * which is 0.
+ */
+#define bv_dma_address(bv)	((bv)->bv_dma_address)
+
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+#define bv_dma_len(bv)		((bv)->bv_dma_length)
+#else
+#define bv_dma_len(bv)		((bv)->bv_len)
+#endif
+
+/*
+ * On 64-bit architectures there is a 4-byte padding in struct scatterlist
+ * (assuming also CONFIG_NEED_SG_DMA_LENGTH is set). Use this padding for DMA
+ * flags bits to indicate when a specific dma address is a bus address or the
+ * buffer may have been bounced via SWIOTLB.
+ */
+#ifdef CONFIG_NEED_SG_DMA_FLAGS
+
+#define BV_DMA_BUS_ADDRESS	BIT(0)
+#define BV_DMA_SWIOTLB		BIT(1)
+
+/**
+ * bv_dma_is_bus_address - Return whether a given segment was marked
+ *			   as a bus address
+ * @bv:		 bio_vec array entry
+ *
+ * Description:
+ *   Returns true if bv_dma_mark_bus_address() has been called on
+ *   this bio_vec.
+ **/
+static inline bool bv_dma_is_bus_address(struct bio_vec *bv)
+{
+	return bv->bv_dma_flags & BV_DMA_BUS_ADDRESS;
+}
+
+/**
+ * bv_dma_mark_bus_address - Mark the bio_vec entry as a bus address
+ * @bv:		 bio_vec array entry
+ *
+ * Description:
+ *   Marks the passed-in bv entry to indicate that the dma_address is
+ *   a bus address and doesn't need to be unmapped. This should only be
+ *   used by dma_map_bvecs() implementations to mark bus addresses
+ *   so they can be properly cleaned up in dma_unmap_bvecs().
+ **/
+static inline void bv_dma_mark_bus_address(struct bio_vec *bv)
+{
+	bv->bv_dma_flags |= BV_DMA_BUS_ADDRESS;
+}
+
+/**
+ * bv_dma_unmark_bus_address - Unmark the bio_vec entry as a bus address
+ * @bv:		 bio_vec array entry
+ *
+ * Description:
+ *   Clears the bus address mark.
+ **/
+static inline void bv_dma_unmark_bus_address(struct bio_vec *bv)
+{
+	bv->bv_dma_flags &= ~BV_DMA_BUS_ADDRESS;
+}
+
+/**
+ * bv_dma_is_swiotlb - Return whether the bio_vec was marked for SWIOTLB
+ *		       bouncing
+ * @bv:		bio_vec array entry
+ *
+ * Description:
+ *   Returns true if the bio_vec was marked for SWIOTLB bouncing. Not all
+ *   elements may have been bounced, so the caller would have to check
+ *   individual BV entries with is_swiotlb_buffer().
+ */
+static inline bool bv_dma_is_swiotlb(struct bio_vec *bv)
+{
+	return bv->bv_dma_flags & BV_DMA_SWIOTLB;
+}
+
+/**
+ * bv_dma_mark_swiotlb - Mark the bio_vec for SWIOTLB bouncing
+ * @bv:		bio_vec array entry
+ *
+ * Description:
+ *   Marks a bio_vec for SWIOTLB bounce. Not all bio_vec entries may
+ *   be bounced.
+ */
+static inline void bv_dma_mark_swiotlb(struct bio_vec *bv)
+{
+	bv->bv_dma_flags |= BV_DMA_SWIOTLB;
+}
+
+#else
+
+static inline bool bv_dma_is_bus_address(struct bio_vec *bv)
+{
+	return false;
+}
+static inline void bv_dma_mark_bus_address(struct bio_vec *bv)
+{
+}
+static inline void bv_dma_unmark_bus_address(struct bio_vec *bv)
+{
+}
+static inline bool bv_dma_is_swiotlb(struct bio_vec *bv)
+{
+	return false;
+}
+static inline void bv_dma_mark_swiotlb(struct bio_vec *bv)
+{
+}
+
+#endif	/* CONFIG_NEED_SG_DMA_FLAGS */
+
 #endif /* __LINUX_BVEC_H */
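
For readers who want to poke at the accessors outside the kernel, here
is a rough userspace sketch of how a consumer might walk a mapped array.
Everything in it is a stand-in: struct mock_bio_vec, the uint64_t
dma_addr_t typedef, total_mapped_bytes() and bv_demo() are hypothetical
names used only for illustration. The sketch assumes both
CONFIG_NEED_SG_DMA_LENGTH and CONFIG_NEED_SG_DMA_FLAGS are set, and
nothing is actually DMA-mapped; the demo simply pretends a mapper
merged two entries.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins, not the kernel definitions. */
typedef uint64_t dma_addr_t;

#define BV_DMA_BUS_ADDRESS	(1u << 0)
#define BV_DMA_SWIOTLB		(1u << 1)

struct mock_bio_vec {
	void		*bv_page;	/* struct page * in the kernel */
	unsigned int	bv_len;
	unsigned int	bv_offset;
	dma_addr_t	bv_dma_address;
	unsigned int	bv_dma_length;
	unsigned int	bv_dma_flags;
};

#define bv_dma_address(bv)	((bv)->bv_dma_address)
#define bv_dma_len(bv)		((bv)->bv_dma_length)

static inline int bv_dma_is_bus_address(const struct mock_bio_vec *bv)
{
	return (bv->bv_dma_flags & BV_DMA_BUS_ADDRESS) != 0;
}

static inline void bv_dma_mark_bus_address(struct mock_bio_vec *bv)
{
	bv->bv_dma_flags |= BV_DMA_BUS_ADDRESS;
}

/*
 * Hypothetical helper: add up the DMA lengths of the mapped entries,
 * stopping at nents or at the first entry whose DMA length is 0.
 */
static inline unsigned long total_mapped_bytes(const struct mock_bio_vec *bvecs,
					       int nents)
{
	unsigned long total = 0;
	int i;

	for (i = 0; i < nents && bv_dma_len(&bvecs[i]) != 0; i++)
		total += bv_dma_len(&bvecs[i]);
	return total;
}

/* Hypothetical self-check: pretend a mapper merged entries 0 and 1. */
static inline void bv_demo(void)
{
	struct mock_bio_vec v[2] = {
		{ .bv_len = 4096, .bv_dma_address = 0x10000, .bv_dma_length = 8192 },
		{ .bv_len = 4096, .bv_dma_length = 0 },
	};

	assert(total_mapped_bytes(v, 2) == 8192);
	bv_dma_mark_bus_address(&v[0]);
	assert(bv_dma_is_bus_address(&v[0]));
}
```

Calling bv_demo() from a small test main() exercises both the
"stop on the first zero bv_dma_len" rule from the comment in the patch
and the bus-address flag helpers.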



* [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
  2023-10-19 15:25 ` Chuck Lever
@ 2023-10-19 15:25   ` Chuck Lever
  -1 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:25 UTC (permalink / raw)
  Cc: iommu, linux-rdma, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Cc: iommu@lists.linux.dev
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/dma-mapping.h |    1 
 kernel/dma/debug.c          |  163 +++++++++++++++++++++++++++++++++++++++++++
 kernel/dma/debug.h          |   38 ++++++++++
 3 files changed, 202 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index f0ccca16a0ac..f511ec546f4d 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -9,6 +9,7 @@
 #include <linux/err.h>
 #include <linux/dma-direction.h>
 #include <linux/scatterlist.h>
+#include <linux/bvec.h>
 #include <linux/bug.h>
 #include <linux/mem_encrypt.h>
 
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 3de494375b7b..efb4a2eaf9a0 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -39,6 +39,7 @@ enum {
 	dma_debug_sg,
 	dma_debug_coherent,
 	dma_debug_resource,
+	dma_debug_bv,
 };
 
 enum map_err_types {
@@ -142,6 +143,7 @@ static const char *type2name[] = {
 	[dma_debug_sg] = "scatter-gather",
 	[dma_debug_coherent] = "coherent",
 	[dma_debug_resource] = "resource",
+	[dma_debug_bv] = "bio-vec",
 };
 
 static const char *dir2name[] = {
@@ -1189,6 +1191,32 @@ static void check_sg_segment(struct device *dev, struct scatterlist *sg)
 #endif
 }
 
+static void check_bv_segment(struct device *dev, struct bio_vec *bv)
+{
+#ifdef CONFIG_DMA_API_DEBUG_SG
+	unsigned int max_seg = dma_get_max_seg_size(dev);
+	u64 start, end, boundary = dma_get_seg_boundary(dev);
+
+	/*
+	 * Either the driver forgot to set dma_parms appropriately, or
+	 * whoever generated the list forgot to check them.
+	 */
+	if (bv->bv_len > max_seg)
+		err_printk(dev, NULL, "mapping bv entry longer than device claims to support [len=%u] [max=%u]\n",
+			   bv->bv_len, max_seg);
+	/*
+	 * In some cases this could potentially be the DMA API
+	 * implementation's fault, but it would usually imply that
+	 * the bio_vec array was built inappropriately to begin with.
+	 */
+	start = bv_dma_address(bv);
+	end = start + bv_dma_len(bv) - 1;
+	if ((start ^ end) & ~boundary)
+		err_printk(dev, NULL, "mapping bv entry across boundary [start=0x%016llx] [end=0x%016llx] [boundary=0x%016llx]\n",
+			   start, end, boundary);
+#endif
+}
+
 void debug_dma_map_single(struct device *dev, const void *addr,
 			    unsigned long len)
 {
@@ -1333,6 +1361,47 @@ void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 	}
 }
 
+void debug_dma_map_bvecs(struct device *dev, struct bio_vec *bvecs,
+			 int nents, int mapped_ents, int direction,
+			 unsigned long attrs)
+{
+	struct dma_debug_entry *entry;
+	struct bio_vec *bv;
+	int i;
+
+	if (unlikely(dma_debug_disabled()))
+		return;
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		check_for_stack(dev, bv->bv_page, bv->bv_offset);
+		if (!PageHighMem(bv->bv_page))
+			check_for_illegal_area(dev, bv_virt(bv), bv->bv_len);
+	}
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+
+		entry = dma_entry_alloc();
+		if (!entry)
+			return;
+
+		entry->type           = dma_debug_bv;
+		entry->dev            = dev;
+		entry->pfn	      = page_to_pfn(bv->bv_page);
+		entry->offset	      = bv->bv_offset;
+		entry->size           = bv_dma_len(bv);
+		entry->dev_addr       = bv_dma_address(bv);
+		entry->direction      = direction;
+		entry->sg_call_ents   = nents;
+		entry->sg_mapped_ents = mapped_ents;
+
+		check_bv_segment(dev, bv);
+
+		add_dma_entry(entry, attrs);
+	}
+}
+
 static int get_nr_mapped_entries(struct device *dev,
 				 struct dma_debug_entry *ref)
 {
@@ -1384,6 +1453,37 @@ void debug_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 	}
 }
 
+void debug_dma_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
+			   int nelems, int dir)
+{
+	int mapped_ents = 0, i;
+
+	if (unlikely(dma_debug_disabled()))
+		return;
+
+	for (i = 0; i < nelems; i++) {
+		struct bio_vec *bv = &bvecs[i];
+		struct dma_debug_entry ref = {
+			.type           = dma_debug_bv,
+			.dev            = dev,
+			.pfn		= page_to_pfn(bv->bv_page),
+			.offset		= bv->bv_offset,
+			.dev_addr       = bv_dma_address(bv),
+			.size           = bv_dma_len(bv),
+			.direction      = dir,
+			.sg_call_ents   = nelems,
+		};
+
+		if (mapped_ents && i >= mapped_ents)
+			break;
+
+		if (!i)
+			mapped_ents = get_nr_mapped_entries(dev, &ref);
+
+		check_unmap(&ref);
+	}
+}
+
 void debug_dma_alloc_coherent(struct device *dev, size_t size,
 			      dma_addr_t dma_addr, void *virt,
 			      unsigned long attrs)
@@ -1588,6 +1688,69 @@ void debug_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 	}
 }
 
+void debug_dma_sync_bvecs_for_cpu(struct device *dev, struct bio_vec *bvecs,
+				  int nelems, int direction)
+{
+	int mapped_ents = 0, i;
+	struct bio_vec *bv;
+
+	if (unlikely(dma_debug_disabled()))
+		return;
+
+	for (i = 0; i < nelems; i++) {
+		struct bio_vec *bv = &bvecs[i];
+		struct dma_debug_entry ref = {
+			.type           = dma_debug_bv,
+			.dev            = dev,
+			.pfn		= page_to_pfn(bv->bv_page),
+			.offset		= bv->bv_offset,
+			.dev_addr       = bv_dma_address(bv),
+			.size           = bv_dma_len(bv),
+			.direction      = direction,
+			.sg_call_ents   = nelems,
+		};
+
+		if (!i)
+			mapped_ents = get_nr_mapped_entries(dev, &ref);
+
+		if (i >= mapped_ents)
+			break;
+
+		check_sync(dev, &ref, true);
+	}
+}
+
+void debug_dma_sync_bvecs_for_device(struct device *dev, struct bio_vec *bvecs,
+				     int nelems, int direction)
+{
+	int mapped_ents = 0, i;
+	struct bio_vec *bv;
+
+	if (unlikely(dma_debug_disabled()))
+		return;
+
+	for (i = 0; i < nelems; i++) {
+		struct bio_vec *bv = &bvecs[i];
+		struct dma_debug_entry ref = {
+			.type           = dma_debug_bv,
+			.dev            = dev,
+			.pfn		= page_to_pfn(bv->bv_page),
+			.offset		= bv->bv_offset,
+			.dev_addr       = bv_dma_address(bv),
+			.size           = bv_dma_len(bv),
+			.direction      = direction,
+			.sg_call_ents   = nelems,
+		};
+		if (!i)
+			mapped_ents = get_nr_mapped_entries(dev, &ref);
+
+		if (i >= mapped_ents)
+			break;
+
+		check_sync(dev, &ref, false);
+	}
+}
+
 static int __init dma_debug_driver_setup(char *str)
 {
 	int i;
diff --git a/kernel/dma/debug.h b/kernel/dma/debug.h
index f525197d3cae..dff7e8a2f594 100644
--- a/kernel/dma/debug.h
+++ b/kernel/dma/debug.h
@@ -24,6 +24,13 @@ extern void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 extern void debug_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 			       int nelems, int dir);
 
+extern void debug_dma_map_bvecs(struct device *dev, struct bio_vec *bvecs,
+				int nents, int mapped_ents, int direction,
+				unsigned long attrs);
+
+extern void debug_dma_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
+				  int nelems, int dir);
+
 extern void debug_dma_alloc_coherent(struct device *dev, size_t size,
 				     dma_addr_t dma_addr, void *virt,
 				     unsigned long attrs);
@@ -54,6 +61,14 @@ extern void debug_dma_sync_sg_for_cpu(struct device *dev,
 extern void debug_dma_sync_sg_for_device(struct device *dev,
 					 struct scatterlist *sg,
 					 int nelems, int direction);
+
+extern void debug_dma_sync_bvecs_for_cpu(struct device *dev,
+					 struct bio_vec *bvecs,
+					 int nelems, int direction);
+
+extern void debug_dma_sync_bvecs_for_device(struct device *dev,
+					    struct bio_vec *bvecs,
+					    int nelems, int direction);
 #else /* CONFIG_DMA_API_DEBUG */
 static inline void debug_dma_map_page(struct device *dev, struct page *page,
 				      size_t offset, size_t size,
@@ -79,6 +94,17 @@ static inline void debug_dma_unmap_sg(struct device *dev,
 {
 }
 
+static inline void debug_dma_map_bvecs(struct device *dev, struct bio_vec *bvecs,
+				       int nents, int mapped_ents, int direction,
+				       unsigned long attrs)
+{
+}
+
+static inline void debug_dma_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
+					 int nelems, int dir)
+{
+}
+
 static inline void debug_dma_alloc_coherent(struct device *dev, size_t size,
 					    dma_addr_t dma_addr, void *virt,
 					    unsigned long attrs)
@@ -126,5 +152,17 @@ static inline void debug_dma_sync_sg_for_device(struct device *dev,
 						int nelems, int direction)
 {
 }
+
+static inline void debug_dma_sync_bvecs_for_cpu(struct device *dev,
+						struct bio_vec *bvecs,
+						int nelems, int direction)
+{
+}
+
+static inline void debug_dma_sync_bvecs_for_device(struct device *dev,
+						   struct bio_vec *bvecs,
+						   int nelems, int direction)
+{
+}
 #endif /* CONFIG_DMA_API_DEBUG */
 #endif /* _KERNEL_DMA_DEBUG_H */
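
The boundary test in check_bv_segment() is terse, so here is a small
userspace sketch of the same arithmetic. The function name
crosses_seg_boundary() is hypothetical, and the sketch assumes the
boundary argument is a mask (for example 0xffffffff for a 4 GiB
boundary) and that len is greater than zero. XOR-ing start and end
leaves a 1 in every bit position where the two addresses differ; if
any differing bit lies above the mask, the segment begins and ends in
different boundary-aligned windows.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when [start, start + len - 1] straddles a boundary.
 * boundary is a mask such as 0xffffffff for a 4 GiB boundary.
 * Assumes len > 0.
 */
static inline bool crosses_seg_boundary(uint64_t start, uint64_t len,
					uint64_t boundary)
{
	uint64_t end = start + len - 1;

	return ((start ^ end) & ~boundary) != 0;
}
```

With a 64 KiB mask (0xffff), a 4 KiB segment starting at 0x1f000 ends
at 0x1ffff and stays inside one window, while extending it by a single
byte pushes the end to 0x20000 and trips the check.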



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
@ 2023-10-19 15:25   ` Chuck Lever
  0 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:25 UTC (permalink / raw)
  Cc: iommu, linux-rdma, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Cc: iommu@lists.linux.dev
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/dma-mapping.h |    1 
 kernel/dma/debug.c          |  163 +++++++++++++++++++++++++++++++++++++++++++
 kernel/dma/debug.h          |   38 ++++++++++
 3 files changed, 202 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index f0ccca16a0ac..f511ec546f4d 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -9,6 +9,7 @@
 #include <linux/err.h>
 #include <linux/dma-direction.h>
 #include <linux/scatterlist.h>
+#include <linux/bvec.h>
 #include <linux/bug.h>
 #include <linux/mem_encrypt.h>
 
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 3de494375b7b..efb4a2eaf9a0 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -39,6 +39,7 @@ enum {
 	dma_debug_sg,
 	dma_debug_coherent,
 	dma_debug_resource,
+	dma_debug_bv,
 };
 
 enum map_err_types {
@@ -142,6 +143,7 @@ static const char *type2name[] = {
 	[dma_debug_sg] = "scatter-gather",
 	[dma_debug_coherent] = "coherent",
 	[dma_debug_resource] = "resource",
+	[dma_debug_bv] = "bio-vec",
 };
 
 static const char *dir2name[] = {
@@ -1189,6 +1191,32 @@ static void check_sg_segment(struct device *dev, struct scatterlist *sg)
 #endif
 }
 
+static void check_bv_segment(struct device *dev, struct bio_vec *bv)
+{
+#ifdef CONFIG_DMA_API_DEBUG_SG
+	unsigned int max_seg = dma_get_max_seg_size(dev);
+	u64 start, end, boundary = dma_get_seg_boundary(dev);
+
+	/*
+	 * Either the driver forgot to set dma_parms appropriately, or
+	 * whoever generated the list forgot to check them.
+	 */
+	if (bv->length > max_seg)
+		err_printk(dev, NULL, "mapping bv entry longer than device claims to support [len=%u] [max=%u]\n",
+			   bv->length, max_seg);
+	/*
+	 * In some cases this could potentially be the DMA API
+	 * implementation's fault, but it would usually imply that
+	 * the scatterlist was built inappropriately to begin with.
+	 */
+	start = bv_dma_address(bv);
+	end = start + bv_dma_len(bv) - 1;
+	if ((start ^ end) & ~boundary)
+		err_printk(dev, NULL, "mapping bv entry across boundary [start=0x%016llx] [end=0x%016llx] [boundary=0x%016llx]\n",
+			   start, end, boundary);
+#endif
+}
+
 void debug_dma_map_single(struct device *dev, const void *addr,
 			    unsigned long len)
 {
@@ -1333,6 +1361,47 @@ void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 	}
 }
 
+void debug_dma_map_bvecs(struct device *dev, struct bio_vec *bvecs,
+			 int nents, int mapped_ents, int direction,
+			 unsigned long attrs)
+{
+	struct dma_debug_entry *entry;
+	struct bio_vec *bv;
+	int i;
+
+	if (unlikely(dma_debug_disabled()))
+		return;
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		check_for_stack(dev, bv_page(bv), bv->offset);
+		if (!PageHighMem(bv_page(bv)))
+			check_for_illegal_area(dev, bv_virt(bv), bv->length);
+	}
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+
+		entry = dma_entry_alloc();
+		if (!entry)
+			return;
+
+		entry->type           = dma_debug_bv;
+		entry->dev            = dev;
+		entry->pfn	      = page_to_pfn(bv_page(bv));
+		entry->offset	      = bv->offset;
+		entry->size           = bv_dma_len(bv);
+		entry->dev_addr       = bv_dma_address(bv);
+		entry->direction      = direction;
+		entry->sg_call_ents   = nents;
+		entry->sg_mapped_ents = mapped_ents;
+
+		check_bv_segment(dev, bv);
+
+		add_dma_entry(entry, attrs);
+	}
+}
+
 static int get_nr_mapped_entries(struct device *dev,
 				 struct dma_debug_entry *ref)
 {
@@ -1384,6 +1453,37 @@ void debug_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 	}
 }
 
+void debug_dma_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
+			   int nelems, int dir)
+{
+	int mapped_ents = 0, i;
+
+	if (unlikely(dma_debug_disabled()))
+		return;
+
+	for (i = 0; i < nents; i++) {
+		struct bio_vec *bv = &bvecs[i];
+		struct dma_debug_entry ref = {
+			.type           = dma_debug_bv,
+			.dev            = dev,
+			.pfn		= page_to_pfn(bv_page(bv)),
+			.offset		= bv->offset,
+			.dev_addr       = bv_dma_address(bv),
+			.size           = bv_dma_len(bv),
+			.direction      = dir,
+			.sg_call_ents   = nelems,
+		};
+
+		if (mapped_ents && i >= mapped_ents)
+			break;
+
+		if (!i)
+			mapped_ents = get_nr_mapped_entries(dev, &ref);
+
+		check_unmap(&ref);
+	}
+}
+
 void debug_dma_alloc_coherent(struct device *dev, size_t size,
 			      dma_addr_t dma_addr, void *virt,
 			      unsigned long attrs)
@@ -1588,6 +1688,69 @@ void debug_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 	}
 }
 
+void debug_dma_sync_bvecs_for_cpu(struct device *dev, struct bio_vec *bvecs,
+				  int nelems, int direction)
+{
+	int mapped_ents = 0, i;
+	struct bio_vec *bv;
+
+	if (unlikely(dma_debug_disabled()))
+		return;
+
+	for (i = 0; i < nents; i++) {
+		struct bio_vec *bv = &bvecs[i];
+		struct dma_debug_entry ref = {
+			.type           = dma_debug_bv,
+			.dev            = dev,
+			.pfn		= page_to_pfn(bv->bv_page),
+			.offset		= bv->bv_offset,
+			.dev_addr       = bv_dma_address(bv),
+			.size           = bv_dma_len(bv),
+			.direction      = direction,
+			.sg_call_ents   = nelems,
+		};
+
+		if (!i)
+			mapped_ents = get_nr_mapped_entries(dev, &ref);
+
+		if (i >= mapped_ents)
+			break;
+
+		check_sync(dev, &ref, true);
+	}
+}
+
+void debug_dma_sync_bvecs_for_device(struct device *dev, struct bio_vec *bvecs,
+				     int nelems, int direction)
+{
+	int mapped_ents = 0;
+	int i;
+
+	if (unlikely(dma_debug_disabled()))
+		return;
+
+	for (i = 0; i < nelems; i++) {
+		struct bio_vec *bv = &bvecs[i];
+		struct dma_debug_entry ref = {
+			.type           = dma_debug_bv,
+			.dev            = dev,
+			.pfn		= page_to_pfn(bv->bv_page),
+			.offset		= bv->bv_offset,
+			.dev_addr       = bv_dma_address(bv),
+			.size           = bv_dma_len(bv),
+			.direction      = direction,
+			.sg_call_ents   = nelems,
+		};
+		if (!i)
+			mapped_ents = get_nr_mapped_entries(dev, &ref);
+
+		if (i >= mapped_ents)
+			break;
+
+		check_sync(dev, &ref, false);
+	}
+}
+
 static int __init dma_debug_driver_setup(char *str)
 {
 	int i;
diff --git a/kernel/dma/debug.h b/kernel/dma/debug.h
index f525197d3cae..dff7e8a2f594 100644
--- a/kernel/dma/debug.h
+++ b/kernel/dma/debug.h
@@ -24,6 +24,13 @@ extern void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 extern void debug_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 			       int nelems, int dir);
 
+extern void debug_dma_map_bvecs(struct device *dev, struct bio_vec *bvecs,
+				int nents, int mapped_ents, int direction,
+				unsigned long attrs);
+
+extern void debug_dma_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
+				  int nelems, int dir);
+
 extern void debug_dma_alloc_coherent(struct device *dev, size_t size,
 				     dma_addr_t dma_addr, void *virt,
 				     unsigned long attrs);
@@ -54,6 +61,14 @@ extern void debug_dma_sync_sg_for_cpu(struct device *dev,
 extern void debug_dma_sync_sg_for_device(struct device *dev,
 					 struct scatterlist *sg,
 					 int nelems, int direction);
+
+extern void debug_dma_sync_bvecs_for_cpu(struct device *dev,
+					 struct bio_vec *bvecs,
+					 int nelems, int direction);
+
+extern void debug_dma_sync_bvecs_for_device(struct device *dev,
+					    struct bio_vec *bvecs,
+					    int nelems, int direction);
 #else /* CONFIG_DMA_API_DEBUG */
 static inline void debug_dma_map_page(struct device *dev, struct page *page,
 				      size_t offset, size_t size,
@@ -79,6 +94,17 @@ static inline void debug_dma_unmap_sg(struct device *dev,
 {
 }
 
+static inline void debug_dma_map_bvecs(struct device *dev, struct bio_vec *bvecs,
+				       int nents, int mapped_ents, int direction,
+				       unsigned long attrs)
+{
+}
+
+static inline void debug_dma_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
+					 int nelems, int dir)
+{
+}
+
 static inline void debug_dma_alloc_coherent(struct device *dev, size_t size,
 					    dma_addr_t dma_addr, void *virt,
 					    unsigned long attrs)
@@ -126,5 +152,17 @@ static inline void debug_dma_sync_sg_for_device(struct device *dev,
 						int nelems, int direction)
 {
 }
+
+static inline void debug_dma_sync_bvecs_for_cpu(struct device *dev,
+						struct bio_vec *bvecs,
+						int nelems, int direction)
+{
+}
+
+static inline void debug_dma_sync_bvecs_for_device(struct device *dev,
+						   struct bio_vec *bvecs,
+						   int nelems, int direction)
+{
+}
 #endif /* CONFIG_DMA_API_DEBUG */
 #endif /* _KERNEL_DMA_DEBUG_H */



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH RFC 4/9] mm: kmsan: Add support for DMA mapping bio_vec arrays
  2023-10-19 15:25 ` Chuck Lever
@ 2023-10-19 15:25   ` Chuck Lever
  -1 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:25 UTC (permalink / raw)
  Cc: Alexander Potapenko, kasan-dev, linux-mm, iommu, linux-rdma,
	Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Cc: Alexander Potapenko <glider@google.com>
Cc: kasan-dev@googlegroups.com
Cc: linux-mm@kvack.org
Cc: iommu@lists.linux.dev
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/kmsan.h |   20 ++++++++++++++++++++
 mm/kmsan/hooks.c      |   13 +++++++++++++
 2 files changed, 33 insertions(+)
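
For readers skimming the series, here is a minimal userspace sketch of
what the new hook does: it walks the array and hands each element to
the single-buffer handler, which acts according to the DMA direction as
the kernel-doc below describes. All demo_* names, the struct layout, and
the byte counters are hypothetical stand-ins for illustration; none of
this is kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio_vec; not the kernel definition. */
struct demo_bvec {
	void *page;
	unsigned int len;
	unsigned int offset;
};

/* Hypothetical mirror of the three directions named in the kernel-doc. */
enum demo_dir { DEMO_BIDIRECTIONAL, DEMO_TO_DEVICE, DEMO_FROM_DEVICE };

/* Byte counters standing in for KMSAN's check and initialize actions. */
struct demo_stats {
	size_t checked;
	size_t initialized;
};

/* Models the single-buffer handler: act on one buffer according to dir. */
void demo_handle_dma(struct demo_stats *st, const struct demo_bvec *bv,
		     enum demo_dir dir)
{
	assert(st && bv);
	if (dir == DEMO_TO_DEVICE || dir == DEMO_BIDIRECTIONAL)
		st->checked += bv->len;
	if (dir == DEMO_FROM_DEVICE || dir == DEMO_BIDIRECTIONAL)
		st->initialized += bv->len;
}

/* Models the array hook: one handler call per element, in array order. */
void demo_handle_dma_bvecs(struct demo_stats *st,
			   const struct demo_bvec *bvecs, int nents,
			   enum demo_dir dir)
{
	int i;

	for (i = 0; i < nents; i++)
		demo_handle_dma(st, &bvecs[i], dir);
}
```

Because a bio_vec array is contiguous, plain indexing is enough here; no
chained-list iterator is needed.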

diff --git a/include/linux/kmsan.h b/include/linux/kmsan.h
index e0c23a32cdf0..36c581a18b30 100644
--- a/include/linux/kmsan.h
+++ b/include/linux/kmsan.h
@@ -18,6 +18,7 @@ struct page;
 struct kmem_cache;
 struct task_struct;
 struct scatterlist;
+struct bio_vec;
 struct urb;
 
 #ifdef CONFIG_KMSAN
@@ -209,6 +210,20 @@ void kmsan_handle_dma(struct page *page, size_t offset, size_t size,
 void kmsan_handle_dma_sg(struct scatterlist *sg, int nents,
 			 enum dma_data_direction dir);
 
+/**
+ * kmsan_handle_dma_bvecs() - Handle a DMA transfer using a bio_vec array.
+ * @bvecs: bio_vec array holding DMA buffers.
+ * @nents: number of bio_vec entries.
+ * @dir:   one of the possible dma_data_direction values.
+ *
+ * Depending on @dir, KMSAN:
+ * * checks the buffers in the bio_vec array, if they are copied to device;
+ * * initializes the buffers, if they are copied from device;
+ * * does both, if this is a DMA_BIDIRECTIONAL transfer.
+ */
+void kmsan_handle_dma_bvecs(struct bio_vec *bvecs, int nents,
+			    enum dma_data_direction dir);
+
 /**
  * kmsan_handle_urb() - Handle a USB data transfer.
  * @urb:    struct urb pointer.
@@ -321,6 +336,11 @@ static inline void kmsan_handle_dma_sg(struct scatterlist *sg, int nents,
 {
 }
 
+static inline void kmsan_handle_dma_bvecs(struct bio_vec *bv, int nents,
+					  enum dma_data_direction dir)
+{
+}
+
 static inline void kmsan_handle_urb(const struct urb *urb, bool is_out)
 {
 }
diff --git a/mm/kmsan/hooks.c b/mm/kmsan/hooks.c
index 5d6e2dee5692..87846011c9bd 100644
--- a/mm/kmsan/hooks.c
+++ b/mm/kmsan/hooks.c
@@ -358,6 +358,19 @@ void kmsan_handle_dma_sg(struct scatterlist *sg, int nents,
 				 dir);
 }
 
+void kmsan_handle_dma_bvecs(struct bio_vec *bvecs, int nents,
+			    enum dma_data_direction dir)
+{
+	struct bio_vec *item;
+	int i;
+
+	for (i = 0; i < nents; i++) {
+		item = &bvecs[i];
+		kmsan_handle_dma(bv_page(item), item->bv_offset, item->bv_len,
+				 dir);
+	}
+}
+
 /* Functions from kmsan-checks.h follow. */
 void kmsan_poison_memory(const void *address, size_t size, gfp_t flags)
 {



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH RFC 5/9] dma-direct: Support direct mapping bio_vec arrays
  2023-10-19 15:25 ` Chuck Lever
@ 2023-10-19 15:26   ` Chuck Lever
  -1 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:26 UTC (permalink / raw)
  Cc: iommu, linux-rdma, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Cc: iommu@lists.linux.dev
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 kernel/dma/direct.c |   92 +++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/dma/direct.h |   17 +++++++++
 2 files changed, 109 insertions(+)
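
The error path in dma_direct_map_bvecs() may be easier to follow with a
small userspace model: map elements in order and, if element i fails,
unmap only the i elements that were already mapped before returning
-EIO. The sketch below assumes a made-up identity mapper that fails
above an address limit; every demo_* name and DEMO_MAPPING_ERROR are
hypothetical stand-ins, not the kernel helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_MAPPING_ERROR (~(uint64_t)0)

/* Hypothetical stand-in for a bio_vec carrying DMA mapping fields. */
struct demo_bvec {
	uint64_t phys;
	unsigned int len;
	uint64_t dma_address;
	unsigned int dma_len;
};

/* Hypothetical single-buffer mapper: identity map, failing above limit. */
uint64_t demo_map_one(uint64_t phys, uint64_t limit)
{
	return phys <= limit ? phys : DEMO_MAPPING_ERROR;
}

/* Clears the mapping state of the first nents elements. */
void demo_unmap_bvecs(struct demo_bvec *bvecs, int nents)
{
	int i;

	for (i = 0; i < nents; i++) {
		bvecs[i].dma_address = 0;
		bvecs[i].dma_len = 0;
	}
}

/* Returns nents on success, or -EIO after unwinding a partial mapping. */
int demo_map_bvecs(struct demo_bvec *bvecs, int nents, uint64_t limit)
{
	int i;

	assert(bvecs && nents > 0);
	for (i = 0; i < nents; i++) {
		bvecs[i].dma_address = demo_map_one(bvecs[i].phys, limit);
		if (bvecs[i].dma_address == DEMO_MAPPING_ERROR)
			goto out_unmap;
		bvecs[i].dma_len = bvecs[i].len;
	}
	return nents;

out_unmap:
	/* Only elements 0..i-1 were mapped, so unwind exactly i of them. */
	demo_unmap_bvecs(bvecs, i);
	return -EIO;
}
```

Passing i rather than nents to the unwind step is the important detail:
the failing element and everything after it were never mapped.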

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 9596ae1aa0da..7587c5c3d051 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -423,6 +423,26 @@ void dma_direct_sync_sg_for_device(struct device *dev,
 					dir);
 	}
 }
+
+void dma_direct_sync_bvecs_for_device(struct device *dev,
+		struct bio_vec *bvecs, int nents, enum dma_data_direction dir)
+{
+	struct bio_vec *bv;
+	int i;
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		phys_addr_t paddr = dma_to_phys(dev, bv_dma_address(bv));
+
+		if (unlikely(is_swiotlb_buffer(dev, paddr)))
+			swiotlb_sync_single_for_device(dev, paddr, bv->bv_len,
+						       dir);
+
+		if (!dev_is_dma_coherent(dev))
+			arch_sync_dma_for_device(paddr, bv->bv_len,
+					dir);
+	}
+}
 #endif
 
 #if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
@@ -516,6 +536,78 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
 	return ret;
 }
 
+void dma_direct_sync_bvecs_for_cpu(struct device *dev,
+		struct bio_vec *bvecs, int nents, enum dma_data_direction dir)
+{
+	struct bio_vec *bv;
+	int i;
+
+	for (i = 0; i < nents; i++) {
+		phys_addr_t paddr;
+
+		bv = &bvecs[i];
+		paddr = dma_to_phys(dev, bv_dma_address(bv));
+
+		if (!dev_is_dma_coherent(dev))
+			arch_sync_dma_for_cpu(paddr, bv->bv_len, dir);
+
+		if (unlikely(is_swiotlb_buffer(dev, paddr)))
+			swiotlb_sync_single_for_cpu(dev, paddr, bv->bv_len,
+						    dir);
+
+		if (dir == DMA_FROM_DEVICE)
+			arch_dma_mark_clean(paddr, bv->bv_len);
+	}
+
+	if (!dev_is_dma_coherent(dev))
+		arch_sync_dma_for_cpu_all();
+}
+
+/*
+ * Unmaps segments, except for ones marked as pci_p2pdma which do not
+ * require any further action as they contain a bus address.
+ */
+void dma_direct_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
+			    int nents, enum dma_data_direction dir,
+			    unsigned long attrs)
+{
+	struct bio_vec *bv;
+	int i;
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		if (bv_dma_is_bus_address(bv))
+			bv_dma_unmark_bus_address(bv);
+		else
+			dma_direct_unmap_page(dev, bv_dma_address(bv),
+					      bv_dma_len(bv), dir, attrs);
+	}
+
+}
+
+int dma_direct_map_bvecs(struct device *dev, struct bio_vec *bvecs, int nents,
+			 enum dma_data_direction dir, unsigned long attrs)
+{
+	struct bio_vec *bv;
+	int i;
+
+	/* p2p DMA mapping support can be added later */
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		bv->bv_dma_address = dma_direct_map_page(dev, bv->bv_page,
+				bv->bv_offset, bv->bv_len, dir, attrs);
+		if (bv->bv_dma_address == DMA_MAPPING_ERROR)
+			goto out_unmap;
+		bv_dma_len(bv) = bv->bv_len;
+	}
+
+	return nents;
+
+out_unmap:
+	dma_direct_unmap_bvecs(dev, bvecs, i, dir, attrs | DMA_ATTR_SKIP_CPU_SYNC);
+	return -EIO;
+}
+
 dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
index 97ec892ea0b5..6db1ccd04d21 100644
--- a/kernel/dma/direct.h
+++ b/kernel/dma/direct.h
@@ -20,17 +20,26 @@ int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
 bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr);
 int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
 		enum dma_data_direction dir, unsigned long attrs);
+int dma_direct_map_bvecs(struct device *dev, struct bio_vec *bvecs, int nents,
+		enum dma_data_direction dir, unsigned long attrs);
 size_t dma_direct_max_mapping_size(struct device *dev);
 
 #if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
     defined(CONFIG_SWIOTLB)
 void dma_direct_sync_sg_for_device(struct device *dev, struct scatterlist *sgl,
 		int nents, enum dma_data_direction dir);
+void dma_direct_sync_bvecs_for_device(struct device *dev, struct bio_vec *bvecs,
+		int nents, enum dma_data_direction dir);
 #else
 static inline void dma_direct_sync_sg_for_device(struct device *dev,
 		struct scatterlist *sgl, int nents, enum dma_data_direction dir)
 {
 }
+
+static inline void dma_direct_sync_bvecs_for_device(struct device *dev,
+		struct bio_vec *bvecs, int nents, enum dma_data_direction dir)
+{
+}
 #endif
 
 #if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
@@ -40,6 +49,10 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
 		int nents, enum dma_data_direction dir, unsigned long attrs);
 void dma_direct_sync_sg_for_cpu(struct device *dev,
 		struct scatterlist *sgl, int nents, enum dma_data_direction dir);
+void dma_direct_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
+		int nents, enum dma_data_direction dir, unsigned long attrs);
+void dma_direct_sync_bvecs_for_cpu(struct device *dev,
+		struct bio_vec *bvecs, int nents, enum dma_data_direction dir);
 #else
 static inline void dma_direct_unmap_sg(struct device *dev,
 		struct scatterlist *sgl, int nents, enum dma_data_direction dir,
@@ -50,6 +63,10 @@ static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
 		struct scatterlist *sgl, int nents, enum dma_data_direction dir)
 {
 }
+static inline void dma_direct_sync_bvecs_for_cpu(struct device *dev,
+		struct bio_vec *bvecs, int nents, enum dma_data_direction dir)
+{
+}
 #endif
 
 static inline void dma_direct_sync_single_for_device(struct device *dev,



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH RFC 6/9] DMA-API: Add dma_sync_bvecs_for_cpu() and dma_sync_bvecs_for_device()
  2023-10-19 15:25 ` Chuck Lever
@ 2023-10-19 15:26   ` Chuck Lever
  -1 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:26 UTC (permalink / raw)
  Cc: iommu, linux-rdma, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Cc: iommu@lists.linux.dev
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/dma-map-ops.h |    4 ++++
 include/linux/dma-mapping.h |    4 ++++
 kernel/dma/mapping.c        |   28 ++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+)
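
As a rough illustration of the dispatch in the two new sync entry
points: the direct path is taken when direct mapping applies, otherwise
the optional dma_map_ops callback is called if the backend provides one.
The userspace sketch below assumes, for simplicity, that "direct" means
"no ops table installed"; the real check may consider more than that.
All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct demo_dev;

/* Hypothetical ops table with one optional callback. */
struct demo_ops {
	void (*sync_bvecs_for_cpu)(struct demo_dev *dev, int nelems);
};

/* Hypothetical device; the counters record which path ran. */
struct demo_dev {
	const struct demo_ops *ops;
	int direct_synced;
	int ops_synced;
};

/* Stand-in for the dma-direct implementation. */
void demo_direct_sync(struct demo_dev *dev, int nelems)
{
	dev->direct_synced += nelems;
}

/* Stand-in for a backend (for example an IOMMU driver) implementation. */
void demo_backend_sync(struct demo_dev *dev, int nelems)
{
	dev->ops_synced += nelems;
}

/* Dispatch: direct path first, then the callback if one is set. */
void demo_sync_bvecs_for_cpu(struct demo_dev *dev, int nelems)
{
	assert(dev);
	if (!dev->ops)
		demo_direct_sync(dev, nelems);
	else if (dev->ops->sync_bvecs_for_cpu)
		dev->ops->sync_bvecs_for_cpu(dev, nelems);
	/* An ops table without the callback makes the sync a no-op. */
}
```

The third case is deliberate: a backend that leaves the callback unset
is treated as needing no synchronization.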

diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index f2fc203fb8a1..de2a50d9207a 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -75,6 +75,10 @@ struct dma_map_ops {
 			int nents, enum dma_data_direction dir);
 	void (*sync_sg_for_device)(struct device *dev, struct scatterlist *sg,
 			int nents, enum dma_data_direction dir);
+	void (*sync_bvecs_for_cpu)(struct device *dev, struct bio_vec *bvecs,
+			int nents, enum dma_data_direction dir);
+	void (*sync_bvecs_for_device)(struct device *dev, struct bio_vec *bvecs,
+			int nents, enum dma_data_direction dir);
 	void (*cache_sync)(struct device *dev, void *vaddr, size_t size,
 			enum dma_data_direction direction);
 	int (*dma_supported)(struct device *dev, u64 mask);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index f511ec546f4d..9fb422f376b6 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -126,6 +126,10 @@ void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 		    int nelems, enum dma_data_direction dir);
 void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 		       int nelems, enum dma_data_direction dir);
+void dma_sync_bvecs_for_cpu(struct device *dev, struct bio_vec *bvecs,
+			    int nelems, enum dma_data_direction dir);
+void dma_sync_bvecs_for_device(struct device *dev, struct bio_vec *bvecs,
+			       int nelems, enum dma_data_direction dir);
 void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		gfp_t flag, unsigned long attrs);
 void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index e323ca48f7f2..94cffc9b45a5 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -385,6 +385,34 @@ void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 }
 EXPORT_SYMBOL(dma_sync_sg_for_device);
 
+void dma_sync_bvecs_for_cpu(struct device *dev, struct bio_vec *bvecs,
+			    int nelems, enum dma_data_direction dir)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	BUG_ON(!valid_dma_direction(dir));
+	if (dma_map_direct(dev, ops))
+		dma_direct_sync_bvecs_for_cpu(dev, bvecs, nelems, dir);
+	else if (ops->sync_bvecs_for_cpu)
+		ops->sync_bvecs_for_cpu(dev, bvecs, nelems, dir);
+	debug_dma_sync_bvecs_for_cpu(dev, bvecs, nelems, dir);
+}
+EXPORT_SYMBOL(dma_sync_bvecs_for_cpu);
+
+void dma_sync_bvecs_for_device(struct device *dev, struct bio_vec *bvecs,
+			       int nelems, enum dma_data_direction dir)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	BUG_ON(!valid_dma_direction(dir));
+	if (dma_map_direct(dev, ops))
+		dma_direct_sync_bvecs_for_device(dev, bvecs, nelems, dir);
+	else if (ops->sync_bvecs_for_device)
+		ops->sync_bvecs_for_device(dev, bvecs, nelems, dir);
+	debug_dma_sync_bvecs_for_device(dev, bvecs, nelems, dir);
+}
+EXPORT_SYMBOL(dma_sync_bvecs_for_device);
+
 /*
  * The whole dma_get_sgtable() idea is fundamentally unsafe - it seems
  * that the intention is to allow exporting memory allocated via the



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH RFC 7/9] DMA: Add dma_map_bvecs_attrs()
  2023-10-19 15:25 ` Chuck Lever
@ 2023-10-19 15:26   ` Chuck Lever
  -1 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:26 UTC (permalink / raw)
  Cc: iommu, linux-rdma, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Cc: iommu@lists.linux.dev
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/dma-map-ops.h |    4 +++
 include/linux/dma-mapping.h |    4 +++
 kernel/dma/mapping.c        |   65 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 73 insertions(+)
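
The calling convention documented in the kernel-doc below (zero on any
error, and unmap with the original nents rather than the returned
count) can be sketched in userspace as follows. This is only a model
under those stated assumptions: the demo_* functions, the "fail" flag,
and the "unmapped" out-parameter are hypothetical and exist purely to
make the pairing rule checkable.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical backend: returns nents on success or a negative errno. */
int demo_backend_map(int nents, int fail)
{
	return fail ? -ENOMEM : nents;
}

/* Public-style wrapper: collapse every failure to zero. */
unsigned int demo_map_bvecs_attrs(int nents, int fail)
{
	int ents = demo_backend_map(nents, fail);

	return ents > 0 ? (unsigned int)ents : 0;
}

/* Records the count handed to unmap so the pairing rule can be checked. */
void demo_unmap_bvecs_attrs(int nents, int *unmapped)
{
	*unmapped = nents;
}

/* Caller pattern: test for zero, and unmap with the original nents. */
int demo_do_io(int nents, int fail, int *unmapped)
{
	unsigned int mapped = demo_map_bvecs_attrs(nents, fail);

	assert(unmapped);
	if (!mapped)
		return -EIO;
	/* Device I/O would be issued here using the mapped entries. */
	demo_unmap_bvecs_attrs(nents, unmapped);
	return 0;
}
```

Returning an unsigned count means a caller cannot distinguish error
causes; it only needs to compare the result against zero.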

diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index de2a50d9207a..69ecfd403249 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -60,6 +60,10 @@ struct dma_map_ops {
 			enum dma_data_direction dir, unsigned long attrs);
 	void (*unmap_sg)(struct device *dev, struct scatterlist *sg, int nents,
 			enum dma_data_direction dir, unsigned long attrs);
+	int (*map_bvecs)(struct device *dev, struct bio_vec *bvecs, int nents,
+			enum dma_data_direction dir, unsigned long attrs);
+	void (*unmap_bvecs)(struct device *dev, struct bio_vec *bvecs, int nents,
+			enum dma_data_direction dir, unsigned long attrs);
 	dma_addr_t (*map_resource)(struct device *dev, phys_addr_t phys_addr,
 			size_t size, enum dma_data_direction dir,
 			unsigned long attrs);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 9fb422f376b6..6f522a82cfe3 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -114,6 +114,10 @@ void dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg,
 				      unsigned long attrs);
 int dma_map_sgtable(struct device *dev, struct sg_table *sgt,
 		enum dma_data_direction dir, unsigned long attrs);
+unsigned int dma_map_bvecs_attrs(struct device *dev, struct bio_vec *bvecs,
+		int nents, enum dma_data_direction dir, unsigned long attrs);
+void dma_unmap_bvecs_attrs(struct device *dev, struct bio_vec *bvecs,
+		int nents, enum dma_data_direction dir, unsigned long attrs);
 dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
 		size_t size, enum dma_data_direction dir, unsigned long attrs);
 void dma_unmap_resource(struct device *dev, dma_addr_t addr, size_t size,
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 94cffc9b45a5..f53cc4da2797 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -296,6 +296,71 @@ void dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg,
 }
 EXPORT_SYMBOL(dma_unmap_sg_attrs);
 
+/**
+ * dma_map_bvecs_attrs - Map the given buffer for DMA
+ * @dev:	The device for which to perform the DMA operation
+ * @bvecs:	The bio_vec array describing the buffer
+ * @nents:	Number of bio_vecs to map
+ * @dir:	DMA direction
+ * @attrs:	Optional DMA attributes for the map operation
+ *
+ * Maps a buffer described by a bio_vec array passed in the bvecs
+ * argument with nents segments for the @dir DMA operation by the
+ * @dev device.
+ *
+ * Returns the number of mapped entries (which can be less than nents)
+ * on success. Zero is returned for any error.
+ *
+ * dma_unmap_bvecs_attrs() should be used to unmap the buffer with the
+ * original bvecs and original nents (not the value returned by this
+ * function).
+ */
+unsigned int dma_map_bvecs_attrs(struct device *dev, struct bio_vec *bvecs,
+				 int nents, enum dma_data_direction dir,
+				 unsigned long attrs)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+	int ents;
+
+	BUG_ON(!valid_dma_direction(dir));
+
+	if (WARN_ON_ONCE(!dev->dma_mask))
+		return 0;
+
+	if (dma_map_direct(dev, ops))
+		ents = dma_direct_map_bvecs(dev, bvecs, nents, dir, attrs);
+	else
+		ents = ops->map_bvecs(dev, bvecs, nents, dir, attrs);
+
+	if (ents > 0) {
+		kmsan_handle_dma_bvecs(bvecs, nents, dir);
+		debug_dma_map_bvecs(dev, bvecs, nents, ents, dir, attrs);
+	} else if (WARN_ON_ONCE(ents != -EINVAL && ents != -ENOMEM &&
+				ents != -EIO && ents != -EREMOTEIO)) {
+		return 0;
+	}
+
+	return ents > 0 ? ents : 0;
+}
+EXPORT_SYMBOL_GPL(dma_map_bvecs_attrs);
+
+void dma_unmap_bvecs_attrs(struct device *dev, struct bio_vec *bvecs,
+			   int nents, enum dma_data_direction dir,
+			   unsigned long attrs)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	BUG_ON(!valid_dma_direction(dir));
+
+	debug_dma_unmap_bvecs(dev, bvecs, nents, dir);
+
+	if (dma_map_direct(dev, ops))
+		dma_direct_unmap_bvecs(dev, bvecs, nents, dir, attrs);
+	else if (ops->unmap_bvecs)
+		ops->unmap_bvecs(dev, bvecs, nents, dir, attrs);
+}
+EXPORT_SYMBOL(dma_unmap_bvecs_attrs);
+
 dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
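
Note for reviewers: the kernel-doc above documents a zero return for
any error, while the map_bvecs op reports failures as a negative
errno. A minimal userspace sketch of folding one convention into the
other, assuming a hypothetical helper name (fold_map_result() is not
part of this patch):

```c
#include <assert.h>
#include <errno.h>	/* errno values such as ENOMEM that callers pass in */

/*
 * Hypothetical helper, for illustration only: fold the signed result of
 * a map_bvecs-style op (a positive entry count, or a negative errno)
 * into the unsigned "0 means failure" convention the kernel-doc states.
 */
static unsigned int fold_map_result(int ents)
{
	return ents > 0 ? (unsigned int)ents : 0;
}
```

A caller written against this convention only needs to test the result
for zero; the specific errno is not visible through the unsigned
return value.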



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH RFC 8/9] iommu/dma: Support DMA-mapping a bio_vec array
  2023-10-19 15:25 ` Chuck Lever
@ 2023-10-19 15:26   ` Chuck Lever
  -1 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:26 UTC (permalink / raw)
  Cc: iommu, linux-rdma, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Cc: iommu@lists.linux.dev
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
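Note for reviewers: the in-place alignment done by
iommu_dma_map_bvecs() in this patch may be easier to follow with a toy
model. The following is a minimal userspace sketch, not kernel code.
struct toy_bvec and the toy_*() helpers are hypothetical names used
only for illustration; the dma_addr and dma_len members stand in for
the bv_dma_address() and bv_dma_len() fields, and a power-of-two
granule is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct bio_vec plus its DMA fields (illustration only). */
struct toy_bvec {
	size_t offset;		/* models bv_offset */
	size_t len;		/* models bv_len */
	size_t dma_addr;	/* models bv_dma_address() */
	size_t dma_len;		/* models bv_dma_len() */
};

/* Round len up to a multiple of granule, which must be a power of two. */
static size_t granule_align(size_t len, size_t granule)
{
	return (len + granule - 1) & ~(granule - 1);
}

/*
 * Align one element to the granule in place, stashing the original
 * sub-granule offset and length in the not-yet-used DMA fields.
 * Returns the aligned length, i.e. the IOVA space the element needs.
 */
static size_t toy_align_and_stash(struct toy_bvec *bv, size_t granule)
{
	size_t off;

	assert(granule && !(granule & (granule - 1)));
	off = bv->offset & (granule - 1);

	bv->dma_addr = off;
	bv->dma_len = bv->len;
	bv->offset -= off;
	bv->len = granule_align(bv->len + off, granule);
	return bv->len;
}

/*
 * Undo toy_align_and_stash(). The kernel code invalidates the address
 * with DMA_MAPPING_ERROR; 0 is a simplification made in this sketch.
 */
static void toy_restore(struct toy_bvec *bv)
{
	bv->offset += bv->dma_addr;
	bv->len = bv->dma_len;
	bv->dma_addr = 0;
	bv->dma_len = 0;
}
```

In the patch, __invalidate_bvecs() plays the role of toy_restore() on
the error path, and __finalise_bvecs() performs the same restoration
before filling in the real DMA address and length.
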
 drivers/iommu/dma-iommu.c |  368 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c     |   58 +++++++
 include/linux/iommu.h     |    4 
 3 files changed, 430 insertions(+)
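
Note for reviewers: the pad_len arithmetic in iommu_dma_map_bvecs()
keeps an element from straddling a segment boundary. A minimal
userspace sketch of just that arithmetic follows. toy_pad_len() and
toy_needs_pad() are hypothetical names; mask models the value of
dma_get_seg_boundary(), assumed to be a power of two minus one.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Distance from iova_len to the next segment boundary, or 0 when
 * iova_len already sits on one. mask is the boundary size minus one.
 */
static size_t toy_pad_len(size_t iova_len, size_t mask)
{
	assert(!(mask & (mask + 1)));	/* mask + 1 must be a power of two */
	return (mask - iova_len + 1) & mask;
}

/* Mirrors the patch's test for growing the previous element by pad_len. */
static int toy_needs_pad(size_t pad_len, size_t s_length)
{
	return pad_len && pad_len < s_length - 1;
}
```

With a 64 KiB boundary (mask 0xffff) and 0xf000 bytes already laid
out, toy_pad_len() yields 0x1000. An element of 0x2000 bytes would
straddle the boundary, so the previous element is grown by 0x1000 and
the new element then starts exactly on the boundary.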

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4b1a88f514c9..5ed15eac9a4a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -554,6 +554,34 @@ static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
 	return false;
 }
 
+static bool dev_use_bvecs_swiotlb(struct device *dev, struct bio_vec *bvecs,
+				  int nents, enum dma_data_direction dir)
+{
+	struct bio_vec *bv;
+	int i;
+
+	if (!IS_ENABLED(CONFIG_SWIOTLB))
+		return false;
+
+	if (dev_is_untrusted(dev))
+		return true;
+
+	/*
+	 * If kmalloc() buffers are not DMA-safe for this device and
+	 * direction, check the individual lengths in the bio_vec array. If any
+	 * element is deemed unsafe, use the swiotlb for bouncing.
+	 */
+	if (!dma_kmalloc_safe(dev, dir)) {
+		for (i = 0; i < nents; i++) {
+			bv = &bvecs[i];
+			if (!dma_kmalloc_size_aligned(bv->bv_len))
+				return true;
+		}
+	}
+
+	return false;
+}
+
 /**
  * iommu_dma_init_domain - Initialise a DMA mapping domain
  * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
@@ -1026,6 +1054,49 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
 			arch_sync_dma_for_device(sg_phys(sg), sg->length, dir);
 }
 
+static void iommu_dma_sync_bvecs_for_cpu(struct device *dev,
+		struct bio_vec *bvecs, int nelems,
+		enum dma_data_direction dir)
+{
+	struct bio_vec *bv;
+	int i;
+
+	if (bv_dma_is_swiotlb(bvecs)) {
+		for (i = 0; i < nelems; i++) {
+			bv = &bvecs[i];
+			iommu_dma_sync_single_for_cpu(dev, bv_dma_address(bv),
+						      bv->bv_len, dir);
+		}
+	} else if (!dev_is_dma_coherent(dev)) {
+		for (i = 0; i < nelems; i++) {
+			bv = &bvecs[i];
+			arch_sync_dma_for_cpu(bv_phys(bv), bv->bv_len, dir);
+		}
+	}
+}
+
+static void iommu_dma_sync_bvecs_for_device(struct device *dev,
+		struct bio_vec *bvecs, int nelems,
+		enum dma_data_direction dir)
+{
+	struct bio_vec *bv;
+	int i;
+
+	if (bv_dma_is_swiotlb(bvecs)) {
+		for (i = 0; i < nelems; i++) {
+			bv = &bvecs[i];
+			iommu_dma_sync_single_for_device(dev,
+							 bv_dma_address(bv),
+							 bv->bv_len, dir);
+		}
+	} else if (!dev_is_dma_coherent(dev)) {
+		for (i = 0; i < nelems; i++) {
+			bv = &bvecs[i];
+			arch_sync_dma_for_device(bv_phys(bv), bv->bv_len, dir);
+		}
+	}
+}
+
 static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, enum dma_data_direction dir,
 		unsigned long attrs)
@@ -1405,6 +1476,299 @@ static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
 		__iommu_dma_unmap(dev, start, end - start);
 }
 
+/*
+ * Prepare a successfully-mapped bio_vec array to give back to the caller.
+ *
+ * At this point the elements are already laid out by iommu_dma_map_bvecs()
+ * to avoid individually crossing any boundaries, so we merely need to check
+ * an element's start address to avoid concatenating across one.
+ */
+static int __finalise_bvecs(struct device *dev, struct bio_vec *bvecs,
+		int nents, dma_addr_t dma_addr)
+{
+	unsigned int cur_len = 0, max_len = dma_get_max_seg_size(dev);
+	unsigned long seg_mask = dma_get_seg_boundary(dev);
+	struct bio_vec *cur = bvecs;
+	int i, count = 0;
+
+	for (i = 0; i < nents; i++) {
+		struct bio_vec *bv = &bvecs[i];
+
+		/* Restore this segment's original unaligned fields first */
+		dma_addr_t s_dma_addr = bv_dma_address(bv);
+		unsigned int s_iova_off = bv_dma_address(bv);
+		unsigned int s_length = bv_dma_len(bv);
+		unsigned int s_iova_len = bv->bv_len;
+
+		bv_dma_address(bv) = DMA_MAPPING_ERROR;
+		bv_dma_len(bv) = 0;
+
+		if (bv_dma_is_bus_address(bv)) {
+			if (i > 0)
+				cur++;
+
+			bv_dma_unmark_bus_address(bv);
+			bv_dma_address(cur) = s_dma_addr;
+			bv_dma_len(cur) = s_length;
+			bv_dma_mark_bus_address(cur);
+			count++;
+			cur_len = 0;
+			continue;
+		}
+
+		bv->bv_offset += s_iova_off;
+		bv->bv_len = s_length;
+
+		/*
+		 * Now fill in the real DMA data. If...
+		 * - there is a valid output segment to append to
+		 * - and this segment starts on an IOVA page boundary
+		 * - but doesn't fall at a segment boundary
+		 * - and wouldn't make the resulting output segment too long
+		 */
+		if (cur_len && !s_iova_off && (dma_addr & seg_mask) &&
+		    (max_len - cur_len >= s_length)) {
+			/* ...then concatenate it with the previous one */
+			cur_len += s_length;
+		} else {
+			/* Otherwise start the next output segment */
+			if (i > 0)
+				cur++;
+			cur_len = s_length;
+			count++;
+
+			bv_dma_address(cur) = dma_addr + s_iova_off;
+		}
+
+		bv_dma_len(cur) = cur_len;
+		dma_addr += s_iova_len;
+
+		if (s_length + s_iova_off < s_iova_len)
+			cur_len = 0;
+	}
+	return count;
+}
+
+/*
+ * If mapping failed, then just restore the original list,
+ * but making sure the DMA fields are invalidated.
+ */
+static void __invalidate_bvecs(struct bio_vec *bvecs, int nents)
+{
+	struct bio_vec *bv;
+	int i;
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		if (bv_dma_is_bus_address(bv)) {
+			bv_dma_unmark_bus_address(bv);
+		} else {
+			if (bv_dma_address(bv) != DMA_MAPPING_ERROR)
+				bv->bv_offset += bv_dma_address(bv);
+			if (bv_dma_len(bv))
+				bv->bv_len = bv_dma_len(bv);
+		}
+		bv_dma_address(bv) = DMA_MAPPING_ERROR;
+		bv_dma_len(bv) = 0;
+	}
+}
+
+static void iommu_dma_unmap_bvecs_swiotlb(struct device *dev,
+		struct bio_vec *bvecs, int nents, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	struct bio_vec *bv;
+	int i;
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		iommu_dma_unmap_page(dev, bv_dma_address(bv),
+				     bv_dma_len(bv), dir, attrs);
+	}
+}
+
+static int iommu_dma_map_bvecs_swiotlb(struct device *dev, struct bio_vec *bvecs,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
+{
+	struct bio_vec *bv;
+	int i;
+
+	bv_dma_mark_swiotlb(bvecs);
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		bv_dma_address(bv) = iommu_dma_map_page(dev, bv->bv_page,
+				bv->bv_offset, bv->bv_len, dir, attrs);
+		if (bv_dma_address(bv) == DMA_MAPPING_ERROR)
+			goto out_unmap;
+		bv_dma_len(bv) = bv->bv_len;
+	}
+
+	return nents;
+
+out_unmap:
+	iommu_dma_unmap_bvecs_swiotlb(dev, bvecs, i, dir,
+				      attrs | DMA_ATTR_SKIP_CPU_SYNC);
+	return -EIO;
+}
+
+/*
+ * The DMA API client is passing in an array of bio_vecs which could
+ * describe any old buffer layout, but the IOMMU API requires everything
+ * to be aligned to IOMMU pages. Hence the need for this complicated bit
+ * of impedance-matching, to be able to hand off a suitably-aligned list,
+ * but still preserve the original offsets and sizes for the caller.
+ */
+static int iommu_dma_map_bvecs(struct device *dev, struct bio_vec *bvecs,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
+{
+	int prot = dma_info_to_prot(dir, dev_is_dma_coherent(dev), attrs);
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	unsigned long mask = dma_get_seg_boundary(dev);
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_len = 0;
+	dma_addr_t iova;
+	ssize_t ret;
+	struct bio_vec *prev = NULL;
+	int i;
+
+	if (static_branch_unlikely(&iommu_deferred_attach_enabled)) {
+		ret = iommu_deferred_attach(dev, domain);
+		if (ret)
+			goto out;
+	}
+
+	if (dev_use_bvecs_swiotlb(dev, bvecs, nents, dir))
+		return iommu_dma_map_bvecs_swiotlb(dev, bvecs, nents,
+						   dir, attrs);
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		iommu_dma_sync_bvecs_for_device(dev, bvecs, nents, dir);
+
+	/*
+	 * Work out how much IOVA space we need, and align the segments to
+	 * IOVA granules for the IOMMU driver to handle. With some clever
+	 * trickery we can modify the list in-place, but reversibly, by
+	 * stashing the unaligned parts in the as-yet-unused DMA fields.
+	 */
+	for (i = 0; i < nents; i++) {
+		struct bio_vec *bv = &bvecs[i];
+		size_t s_iova_off = iova_offset(iovad, bv->bv_offset);
+		size_t pad_len = (mask - iova_len + 1) & mask;
+		size_t s_length = bv->bv_len;
+
+		bv_dma_address(bv) = s_iova_off;
+		bv_dma_len(bv) = s_length;
+		bv->bv_offset -= s_iova_off;
+		s_length = iova_align(iovad, s_length + s_iova_off);
+		bv->bv_len = s_length;
+
+		/*
+		 * Due to the alignment of our single IOVA allocation, we can
+		 * depend on these assumptions about the segment boundary mask:
+		 * - If mask size >= IOVA size, then the IOVA range cannot
+		 *   possibly fall across a boundary, so we don't care.
+		 * - If mask size < IOVA size, then the IOVA range must start
+		 *   exactly on a boundary, therefore we can lay things out
+		 *   based purely on segment lengths without needing to know
+		 *   the actual addresses beforehand.
+		 * - The mask must be a power of 2, so pad_len == 0 if
+		 *   iova_len == 0, thus we cannot dereference prev the first
+		 *   time through here (i.e. before it has a meaningful value).
+		 */
+		if (pad_len && pad_len < s_length - 1) {
+			prev->bv_len += pad_len;
+			iova_len += pad_len;
+		}
+
+		iova_len += s_length;
+		prev = bv;
+	}
+
+	if (!iova_len)
+		return __finalise_bvecs(dev, bvecs, nents, 0);
+
+	iova = iommu_dma_alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
+	if (!iova) {
+		ret = -ENOMEM;
+		goto out_restore_sg;
+	}
+
+	/*
+	 * We'll leave any physical concatenation to the IOMMU driver's
+	 * implementation - it knows better than we do.
+	 */
+	ret = iommu_map_bvecs(domain, iova, bvecs, nents, prot, GFP_ATOMIC);
+	if (ret < 0 || ret < iova_len)
+		goto out_free_iova;
+
+	return __finalise_bvecs(dev, bvecs, nents, iova);
+
+out_free_iova:
+	iommu_dma_free_iova(cookie, iova, iova_len, NULL);
+out_restore_sg:
+	__invalidate_bvecs(bvecs, nents);
+out:
+	if (ret != -ENOMEM && ret != -EREMOTEIO)
+		return -EINVAL;
+	return ret;
+}
+
+static void iommu_dma_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
+{
+	dma_addr_t end = 0, start;
+	struct bio_vec *bv;
+	int i;
+
+	if (bv_dma_is_swiotlb(bvecs)) {
+		iommu_dma_unmap_bvecs_swiotlb(dev, bvecs, nents, dir, attrs);
+		return;
+	}
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		iommu_dma_sync_bvecs_for_cpu(dev, bvecs, nents, dir);
+
+	/*
+	 * The bio_vec array elements are mapped into a single
+	 * contiguous IOVA allocation, the start and end points
+	 * just have to be determined.
+	 */
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+
+		if (bv_dma_is_bus_address(bv)) {
+			bv_dma_unmark_bus_address(bv);
+			continue;
+		}
+
+		if (bv_dma_len(bv) == 0)
+			break;
+
+		start = bv_dma_address(bv);
+		break;
+	}
+
+	/* Resume at the first mapped element to find the end point */
+	for (; i < nents; i++) {
+		bv = &bvecs[i];
+
+		if (bv_dma_is_bus_address(bv)) {
+			bv_dma_unmark_bus_address(bv);
+			continue;
+		}
+
+		if (bv_dma_len(bv) == 0)
+			break;
+
+		end = bv_dma_address(bv) + bv_dma_len(bv);
+	}
+
+	if (end)
+		__iommu_dma_unmap(dev, start, end - start);
+}
+
 static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
@@ -1613,10 +1977,14 @@ static const struct dma_map_ops iommu_dma_ops = {
 	.unmap_page		= iommu_dma_unmap_page,
 	.map_sg			= iommu_dma_map_sg,
 	.unmap_sg		= iommu_dma_unmap_sg,
+	.map_bvecs		= iommu_dma_map_bvecs,
+	.unmap_bvecs		= iommu_dma_unmap_bvecs,
 	.sync_single_for_cpu	= iommu_dma_sync_single_for_cpu,
 	.sync_single_for_device	= iommu_dma_sync_single_for_device,
 	.sync_sg_for_cpu	= iommu_dma_sync_sg_for_cpu,
 	.sync_sg_for_device	= iommu_dma_sync_sg_for_device,
+	.sync_bvecs_for_cpu	= iommu_dma_sync_bvecs_for_cpu,
+	.sync_bvecs_for_device	= iommu_dma_sync_bvecs_for_device,
 	.map_resource		= iommu_dma_map_resource,
 	.unmap_resource		= iommu_dma_unmap_resource,
 	.get_merge_boundary	= iommu_dma_get_merge_boundary,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3bfc56df4f78..a117917bf9d0 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2669,6 +2669,64 @@ ssize_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 }
 EXPORT_SYMBOL_GPL(iommu_map_sg);
 
+ssize_t iommu_map_bvecs(struct iommu_domain *domain, unsigned long iova,
+			struct bio_vec *bv, unsigned int nents, int prot,
+			gfp_t gfp)
+{
+	const struct iommu_domain_ops *ops = domain->ops;
+	size_t len = 0, mapped = 0;
+	unsigned int i = 0;
+	phys_addr_t start;
+	int ret;
+
+	might_sleep_if(gfpflags_allow_blocking(gfp));
+
+	/* Discourage passing strange GFP flags */
+	if (WARN_ON_ONCE(gfp & (__GFP_COMP | __GFP_DMA | __GFP_DMA32 |
+				__GFP_HIGHMEM)))
+		return -EINVAL;
+
+	while (i <= nents) {
+		phys_addr_t b_phys = bv_phys(bv);
+
+		if (len && b_phys != start + len) {
+			ret = __iommu_map(domain, iova + mapped, start,
+					len, prot, gfp);
+
+			if (ret)
+				goto out_err;
+
+			mapped += len;
+			len = 0;
+		}
+
+		if (bv_dma_is_bus_address(bv))
+			goto next;
+
+		if (len) {
+			len += bv->bv_len;
+		} else {
+			len = bv->bv_len;
+			start = b_phys;
+		}
+
+next:
+		if (++i < nents)
+			bv++;
+	}
+
+	if (ops->iotlb_sync_map)
+		ops->iotlb_sync_map(domain, iova, mapped);
+	return mapped;
+
+out_err:
+	/* undo mappings already done */
+	iommu_unmap(domain, iova, mapped);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_map_bvecs);
+
 /**
  * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
  * @domain: the iommu domain where the fault has happened
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c50a769d569a..9f7120314fda 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -8,6 +8,7 @@
 #define __LINUX_IOMMU_H
 
 #include <linux/scatterlist.h>
+#include <linux/bvec.h>
 #include <linux/device.h>
 #include <linux/types.h>
 #include <linux/errno.h>
@@ -485,6 +486,9 @@ extern size_t iommu_unmap_fast(struct iommu_domain *domain,
 extern ssize_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			    struct scatterlist *sg, unsigned int nents,
 			    int prot, gfp_t gfp);
+extern ssize_t iommu_map_bvecs(struct iommu_domain *domain, unsigned long iova,
+			       struct bio_vec *bvecs, unsigned int nents,
+			       int prot, gfp_t gfp);
 extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova);
 extern void iommu_set_fault_handler(struct iommu_domain *domain,
 			iommu_fault_handler_t handler, void *token);
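
Note for reviewers: iommu_map_bvecs() above maps each run of
physically contiguous elements with a single __iommu_map() call. A
minimal userspace sketch of that coalescing idea follows. It is a
simplified stand-in, not the loop in the patch: struct toy_seg and
toy_coalesce() are hypothetical names, and bus-address elements are
not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy physical segment: start address and length (illustration only). */
struct toy_seg {
	unsigned long phys;
	size_t len;
};

/*
 * Merge physically contiguous neighbours of in[] into out[] and return
 * the number of resulting runs. out[] must have room for n entries.
 * Each run corresponds to one mapping call in the real code.
 */
static size_t toy_coalesce(const struct toy_seg *in, size_t n,
			   struct toy_seg *out)
{
	size_t i, runs = 0;

	assert(n == 0 || (in && out));

	for (i = 0; i < n; i++) {
		struct toy_seg *last = runs ? &out[runs - 1] : NULL;

		if (last && last->phys + last->len == in[i].phys)
			last->len += in[i].len;	/* extend the current run */
		else
			out[runs++] = in[i];	/* start a new run */
	}
	return runs;
}
```

For example, segments at 0x1000 and 0x2000 of 0x1000 bytes each,
followed by one at 0x8000, collapse into two runs: 0x1000 with length
0x2000, and 0x8000 with its original length.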



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH RFC 8/9] iommu/dma: Support DMA-mapping a bio_vec array
@ 2023-10-19 15:26   ` Chuck Lever
  0 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:26 UTC (permalink / raw)
  Cc: iommu, linux-rdma, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Cc: iommu@lists.linux.dev
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 drivers/iommu/dma-iommu.c |  368 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c     |   58 +++++++
 include/linux/iommu.h     |    4 
 3 files changed, 430 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4b1a88f514c9..5ed15eac9a4a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -554,6 +554,34 @@ static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
 	return false;
 }
 
+static bool dev_use_bvecs_swiotlb(struct device *dev, struct bio_vec *bvecs,
+				  int nents, enum dma_data_direction dir)
+{
+	struct bio_vec *bv;
+	int i;
+
+	if (!IS_ENABLED(CONFIG_SWIOTLB))
+		return false;
+
+	if (dev_is_untrusted(dev))
+		return true;
+
+	/*
+	 * If kmalloc() buffers are not DMA-safe for this device and
+	 * direction, check the individual lengths in the bio_vec array. If any
+	 * element is deemed unsafe, use the swiotlb for bouncing.
+	 */
+	if (!dma_kmalloc_safe(dev, dir)) {
+		for (i = 0; i < nents; i++) {
+			bv = &bvecs[i];
+			if (!dma_kmalloc_size_aligned(bv->bv_len))
+				return true;
+		}
+	}
+
+	return false;
+}
+
 /**
  * iommu_dma_init_domain - Initialise a DMA mapping domain
  * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
@@ -1026,6 +1054,49 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
 			arch_sync_dma_for_device(sg_phys(sg), sg->length, dir);
 }
 
+static void iommu_dma_sync_bvecs_for_cpu(struct device *dev,
+		struct bio_vec *bvecs, int nelems,
+		enum dma_data_direction dir)
+{
+	struct bio_vec *bv;
+	int i;
+
+	if (bv_dma_is_swiotlb(bvecs)) {
+		for (i = 0; i < nelems; i++) {
+			bv = &bvecs[i];
+			iommu_dma_sync_single_for_cpu(dev, bv_dma_address(bv),
+						      bv->bv_len, dir);
+		}
+	} else if (!dev_is_dma_coherent(dev)) {
+		for (i = 0; i < nelems; i++) {
+			bv = &bvecs[i];
+			arch_sync_dma_for_cpu(bv_phys(bv), bv->bv_len, dir);
+		}
+	}
+}
+
+static void iommu_dma_sync_bvecs_for_device(struct device *dev,
+		struct bio_vec *bvecs, int nelems,
+		enum dma_data_direction dir)
+{
+	struct bio_vec *bv;
+	int i;
+
+	if (bv_dma_is_swiotlb(bvecs)) {
+		for (i = 0; i < nelems; i++) {
+			bv = &bvecs[i];
+			iommu_dma_sync_single_for_device(dev,
+							 bv_dma_address(bv),
+							 bv->bv_len, dir);
+		}
+	} else if (!dev_is_dma_coherent(dev)) {
+		for (i = 0; i < nelems; i++) {
+			bv = &bvecs[i];
+			arch_sync_dma_for_device(bv_phys(bv), bv->bv_len, dir);
+		}
+	}
+}
+
 static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, enum dma_data_direction dir,
 		unsigned long attrs)
@@ -1405,6 +1476,299 @@ static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
 		__iommu_dma_unmap(dev, start, end - start);
 }
 
+/*
+ * Prepare a successfully-mapped bio_vec array to give back to the caller.
+ *
+ * At this point the elements are already laid out by iommu_dma_map_bvecs()
+ * to avoid individually crossing any boundaries, so we merely need to check
+ * an element's start address to avoid concatenating across one.
+ */
+static int __finalise_bvecs(struct device *dev, struct bio_vec *bvecs,
+		int nents, dma_addr_t dma_addr)
+{
+	unsigned int cur_len = 0, max_len = dma_get_max_seg_size(dev);
+	unsigned long seg_mask = dma_get_seg_boundary(dev);
+	struct bio_vec *cur = bvecs;
+	int i, count = 0;
+
+	for (i = 0; i < nents; i++) {
+		struct bio_vec *bv = &bvecs[i];
+
+		/* Restore this segment's original unaligned fields first */
+		dma_addr_t s_dma_addr = bv_dma_address(bv);
+		unsigned int s_iova_off = bv_dma_address(bv);
+		unsigned int s_length = bv_dma_len(bv);
+		unsigned int s_iova_len = bv->bv_len;
+
+		bv_dma_address(bv) = DMA_MAPPING_ERROR;
+		bv_dma_len(bv) = 0;
+
+		if (bv_dma_is_bus_address(bv)) {
+			if (i > 0)
+				cur++;
+
+			bv_dma_unmark_bus_address(bv);
+			bv_dma_address(cur) = s_dma_addr;
+			bv_dma_len(cur) = s_length;
+			bv_dma_mark_bus_address(cur);
+			count++;
+			cur_len = 0;
+			continue;
+		}
+
+		bv->bv_offset += s_iova_off;
+		bv->bv_len = s_length;
+
+		/*
+		 * Now fill in the real DMA data. If...
+		 * - there is a valid output segment to append to
+		 * - and this segment starts on an IOVA page boundary
+		 * - but doesn't fall at a segment boundary
+		 * - and wouldn't make the resulting output segment too long
+		 */
+		if (cur_len && !s_iova_off && (dma_addr & seg_mask) &&
+		    (max_len - cur_len >= s_length)) {
+			/* ...then concatenate it with the previous one */
+			cur_len += s_length;
+		} else {
+			/* Otherwise start the next output segment */
+			if (i > 0)
+				cur++;
+			cur_len = s_length;
+			count++;
+
+			bv_dma_address(cur) = dma_addr + s_iova_off;
+		}
+
+		bv_dma_len(cur) = cur_len;
+		dma_addr += s_iova_len;
+
+		if (s_length + s_iova_off < s_iova_len)
+			cur_len = 0;
+	}
+	return count;
+}
+
+/*
+ * If mapping failed, then just restore the original list,
+ * but making sure the DMA fields are invalidated.
+ */
+static void __invalidate_bvecs(struct bio_vec *bvecs, int nents)
+{
+	struct bio_vec *bv;
+	int i;
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		if (bv_dma_is_bus_address(bv)) {
+			bv_dma_unmark_bus_address(bv);
+		} else {
+			if (bv_dma_address(bv) != DMA_MAPPING_ERROR)
+				bv->bv_offset += bv_dma_address(bv);
+			if (bv_dma_len(bv))
+				bv->bv_len = bv_dma_len(bv);
+		}
+		bv_dma_address(bv) = DMA_MAPPING_ERROR;
+		bv_dma_len(bv) = 0;
+	}
+}
+
+static void iommu_dma_unmap_bvecs_swiotlb(struct device *dev,
+		struct bio_vec *bvecs, int nents, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	struct bio_vec *bv;
+	int i;
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		iommu_dma_unmap_page(dev, bv_dma_address(bv),
+				     bv_dma_len(bv), dir, attrs);
+	}
+}
+
+static int iommu_dma_map_bvecs_swiotlb(struct device *dev, struct bio_vec *bvecs,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
+{
+	struct bio_vec *bv;
+	int i;
+
+	bv_dma_mark_swiotlb(bvecs);
+
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+		bv_dma_address(bv) = iommu_dma_map_page(dev, bv->bv_page,
+				bv->bv_offset, bv->bv_len, dir, attrs);
+		if (bv_dma_address(bv) == DMA_MAPPING_ERROR)
+			goto out_unmap;
+		bv_dma_len(bv) = bv->bv_len;
+	}
+
+	return nents;
+
+out_unmap:
+	iommu_dma_unmap_bvecs_swiotlb(dev, bvecs, i, dir,
+				      attrs | DMA_ATTR_SKIP_CPU_SYNC);
+	return -EIO;
+}
+
+/*
+ * The DMA API client is passing in an array of bio_vecs which could
+ * describe any old buffer layout, but the IOMMU API requires everything
+ * to be aligned to IOMMU pages. Hence the need for this complicated bit
+ * of impedance-matching, to be able to hand off a suitably-aligned list,
+ * but still preserve the original offsets and sizes for the caller.
+ */
+static int iommu_dma_map_bvecs(struct device *dev, struct bio_vec *bvecs,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
+{
+	int prot = dma_info_to_prot(dir, dev_is_dma_coherent(dev), attrs);
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	unsigned long mask = dma_get_seg_boundary(dev);
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_len = 0;
+	dma_addr_t iova;
+	ssize_t ret;
+	struct bio_vec *prev = NULL;
+	int i;
+
+	if (static_branch_unlikely(&iommu_deferred_attach_enabled)) {
+		ret = iommu_deferred_attach(dev, domain);
+		if (ret)
+			goto out;
+	}
+
+	if (dev_use_bvecs_swiotlb(dev, bvecs, nents, dir))
+		return iommu_dma_map_bvecs_swiotlb(dev, bvecs, nents,
+						   dir, attrs);
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		iommu_dma_sync_bvecs_for_device(dev, bvecs, nents, dir);
+
+	/*
+	 * Work out how much IOVA space we need, and align the segments to
+	 * IOVA granules for the IOMMU driver to handle. With some clever
+	 * trickery we can modify the list in-place, but reversibly, by
+	 * stashing the unaligned parts in the as-yet-unused DMA fields.
+	 */
+	for (i = 0; i < nents; i++) {
+		struct bio_vec *bv = &bvecs[i];
+		size_t s_iova_off = iova_offset(iovad, bv->bv_offset);
+		size_t pad_len = (mask - iova_len + 1) & mask;
+		size_t s_length = bv->bv_len;
+
+		bv_dma_address(bv) = s_iova_off;
+		bv_dma_len(bv) = s_length;
+		bv->bv_offset -= s_iova_off;
+		s_length = iova_align(iovad, s_length + s_iova_off);
+		bv->bv_len = s_length;
+
+		/*
+		 * Due to the alignment of our single IOVA allocation, we can
+		 * depend on these assumptions about the segment boundary mask:
+		 * - If mask size >= IOVA size, then the IOVA range cannot
+		 *   possibly fall across a boundary, so we don't care.
+		 * - If mask size < IOVA size, then the IOVA range must start
+		 *   exactly on a boundary, therefore we can lay things out
+		 *   based purely on segment lengths without needing to know
+		 *   the actual addresses beforehand.
+		 * - The mask must be a power of 2, so pad_len == 0 if
+		 *   iova_len == 0, thus we cannot dereference prev the first
+		 *   time through here (i.e. before it has a meaningful value).
+		 */
+		if (pad_len && pad_len < s_length - 1) {
+			prev->bv_len += pad_len;
+			iova_len += pad_len;
+		}
+
+		iova_len += s_length;
+		prev = bv;
+	}
+
+	if (!iova_len)
+		return __finalise_bvecs(dev, bvecs, nents, 0);
+
+	iova = iommu_dma_alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
+	if (!iova) {
+		ret = -ENOMEM;
+		goto out_restore_sg;
+	}
+
+	/*
+	 * We'll leave any physical concatenation to the IOMMU driver's
+	 * implementation - it knows better than we do.
+	 */
+	ret = iommu_map_bvecs(domain, iova, bvecs, nents, prot, GFP_ATOMIC);
+	if (ret < 0 || ret < iova_len)
+		goto out_free_iova;
+
+	return __finalise_bvecs(dev, bvecs, nents, iova);
+
+out_free_iova:
+	iommu_dma_free_iova(cookie, iova, iova_len, NULL);
+out_restore_sg:
+	__invalidate_bvecs(bvecs, nents);
+out:
+	if (ret != -ENOMEM && ret != -EREMOTEIO)
+		return -EINVAL;
+	return ret;
+}
+
+static void iommu_dma_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
+{
+	dma_addr_t end = 0, start;
+	struct bio_vec *bv;
+	int i;
+
+	if (bv_dma_is_swiotlb(bvecs)) {
+		iommu_dma_unmap_bvecs_swiotlb(dev, bvecs, nents, dir, attrs);
+		return;
+	}
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		iommu_dma_sync_bvecs_for_cpu(dev, bvecs, nents, dir);
+
+	/*
+	 * The bio_vec array elements are mapped into a single
+	 * contiguous IOVA allocation, the start and end points
+	 * just have to be determined.
+	 */
+	for (i = 0; i < nents; i++) {
+		bv = &bvecs[i];
+
+		if (bv_dma_is_bus_address(bv)) {
+			bv_dma_unmark_bus_address(bv);
+			continue;
+		}
+
+		if (bv_dma_len(bv) == 0)
+			break;
+
+		start = bv_dma_address(bv);
+		break;
+	}
+
+	/* Resume from the first mapped element to find the end point. */
+	for (; i < nents; i++) {
+		bv = &bvecs[i];
+
+		if (bv_dma_is_bus_address(bv)) {
+			bv_dma_unmark_bus_address(bv);
+			continue;
+		}
+
+		if (bv_dma_len(bv) == 0)
+			break;
+
+		end = bv_dma_address(bv) + bv_dma_len(bv);
+	}
+
+	if (end)
+		__iommu_dma_unmap(dev, start, end - start);
+}
+
 static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
@@ -1613,10 +1977,14 @@ static const struct dma_map_ops iommu_dma_ops = {
 	.unmap_page		= iommu_dma_unmap_page,
 	.map_sg			= iommu_dma_map_sg,
 	.unmap_sg		= iommu_dma_unmap_sg,
+	.map_bvecs		= iommu_dma_map_bvecs,
+	.unmap_bvecs		= iommu_dma_unmap_bvecs,
 	.sync_single_for_cpu	= iommu_dma_sync_single_for_cpu,
 	.sync_single_for_device	= iommu_dma_sync_single_for_device,
 	.sync_sg_for_cpu	= iommu_dma_sync_sg_for_cpu,
 	.sync_sg_for_device	= iommu_dma_sync_sg_for_device,
+	.sync_bvecs_for_cpu	= iommu_dma_sync_bvecs_for_cpu,
+	.sync_bvecs_for_device	= iommu_dma_sync_bvecs_for_device,
 	.map_resource		= iommu_dma_map_resource,
 	.unmap_resource		= iommu_dma_unmap_resource,
 	.get_merge_boundary	= iommu_dma_get_merge_boundary,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3bfc56df4f78..a117917bf9d0 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2669,6 +2669,64 @@ ssize_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 }
 EXPORT_SYMBOL_GPL(iommu_map_sg);
 
+ssize_t iommu_map_bvecs(struct iommu_domain *domain, unsigned long iova,
+			struct bio_vec *bv, unsigned int nents, int prot,
+			 gfp_t gfp)
+{
+	const struct iommu_domain_ops *ops = domain->ops;
+	size_t len = 0, mapped = 0;
+	unsigned int i = 0;
+	phys_addr_t start;
+	int ret;
+
+	might_sleep_if(gfpflags_allow_blocking(gfp));
+
+	/* Discourage passing strange GFP flags */
+	if (WARN_ON_ONCE(gfp & (__GFP_COMP | __GFP_DMA | __GFP_DMA32 |
+				__GFP_HIGHMEM)))
+		return -EINVAL;
+
+	while (i <= nents) {
+		phys_addr_t b_phys = bv_phys(bv);
+
+		if (len && b_phys != start + len) {
+			ret = __iommu_map(domain, iova + mapped, start,
+					len, prot, gfp);
+
+			if (ret)
+				goto out_err;
+
+			mapped += len;
+			len = 0;
+		}
+
+		if (bv_dma_is_bus_address(bv))
+			goto next;
+
+		if (len) {
+			len += bv->bv_len;
+		} else {
+			len = bv->bv_len;
+			start = b_phys;
+		}
+
+next:
+		if (++i < nents)
+			bv++;
+	}
+
+	if (ops->iotlb_sync_map)
+		ops->iotlb_sync_map(domain, iova, mapped);
+	return mapped;
+
+out_err:
+	/* undo mappings already done */
+	iommu_unmap(domain, iova, mapped);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_map_bvecs);
+
 /**
  * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
  * @domain: the iommu domain where the fault has happened
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c50a769d569a..9f7120314fda 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -8,6 +8,7 @@
 #define __LINUX_IOMMU_H
 
 #include <linux/scatterlist.h>
+#include <linux/bvec.h>
 #include <linux/device.h>
 #include <linux/types.h>
 #include <linux/errno.h>
@@ -485,6 +486,9 @@ extern size_t iommu_unmap_fast(struct iommu_domain *domain,
 extern ssize_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			    struct scatterlist *sg, unsigned int nents,
 			    int prot, gfp_t gfp);
+extern ssize_t iommu_map_bvecs(struct iommu_domain *domain, unsigned long iova,
+			       struct bio_vec *bvecs, unsigned int nents,
+			       int prot, gfp_t gfp);
 extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova);
 extern void iommu_set_fault_handler(struct iommu_domain *domain,
 			iommu_fault_handler_t handler, void *token);
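
A note on the loop shape in iommu_map_bvecs() above, which can look
like an off-by-one at first glance. The body runs nents + 1 times: the
last pass revisits the final element only so that the pending run gets
flushed. What follows is a minimal userspace sketch of that idiom, not
the kernel code itself. struct seg, struct run and coalesce() are names
made up for illustration, the bus-address skip is left out, and the
flush simply records a run where the patch calls __iommu_map(). It
assumes every segment has a non-zero length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the physical view of one bio_vec. */
struct seg {
	uint64_t phys;
	size_t len;
};

/* Hypothetical record of one physically contiguous run. */
struct run {
	uint64_t phys;
	size_t len;
};

/*
 * Coalesce physically contiguous segments into runs. @out must have
 * room for @nents entries. Returns the number of runs written.
 */
static size_t coalesce(const struct seg *s, size_t nents, struct run *out)
{
	size_t i = 0, len = 0, nruns = 0;
	uint64_t start = 0;

	assert(s && out && nents > 0);

	/* Note "<=": one extra pass to flush the last pending run. */
	while (i <= nents) {
		uint64_t phys = s->phys;

		/* Current segment does not extend the pending run. */
		if (len && phys != start + len) {
			out[nruns].phys = start;
			out[nruns].len = len;
			nruns++;
			len = 0;
		}

		if (len) {
			len += s->len;
		} else {
			len = s->len;
			start = phys;
		}

		/* Only advance while a next element exists. */
		if (++i < nents)
			s++;
	}
	return nruns;
}
```

With three segments at 0x1000 (0x1000 bytes), 0x2000 (0x1000 bytes)
and 0x5000 (0x800 bytes), the first two merge into one run and the
third becomes its own run, which is only emitted on the extra pass.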



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH RFC 9/9] RDMA: Add helpers for DMA-mapping an array of bio_vecs
  2023-10-19 15:25 ` Chuck Lever
@ 2023-10-19 15:26   ` Chuck Lever
  -1 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 15:26 UTC (permalink / raw)
  Cc: iommu, linux-rdma, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Cc: iommu@lists.linux.dev
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/rdma/ib_verbs.h |   29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 533ab92684d8..5e205fda90f9 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -4220,6 +4220,35 @@ static inline void ib_dma_unmap_sg(struct ib_device *dev,
 	ib_dma_unmap_sg_attrs(dev, sg, nents, direction, 0);
 }
 
+/**
+ * ib_dma_map_bvecs - Map an array of bio_vecs to DMA addresses
+ * @dev: The device for which the DMA addresses are to be created
+ * @bvecs: The array of bio_vec entries to map
+ * @nents: The number of entries in the array
+ * @direction: The direction of the DMA
+ */
+static inline int ib_dma_map_bvecs(struct ib_device *dev,
+				   struct bio_vec *bvecs, int nents,
+				   enum dma_data_direction direction)
+{
+	return dma_map_bvecs_attrs(dev->dma_device, bvecs, nents, direction, 0);
+}
+
+/**
+ * ib_dma_unmap_bvecs - Unmap a DMA-mapped bio_vec array
+ * @dev: The device for which the DMA addresses were created
+ * @bvecs: The array of bio_vec entries to unmap
+ * @nents: The number of entries in the array
+ * @direction: The direction of the DMA
+ */
+static inline void ib_dma_unmap_bvecs(struct ib_device *dev,
+				      struct bio_vec *bvecs, int nents,
+				      enum dma_data_direction direction)
+{
+	if (!ib_uses_virt_dma(dev))
+		dma_unmap_bvecs_attrs(dev->dma_device, bvecs, nents, direction);
+}
+
 /**
  * ib_dma_max_seg_size - Return the size limit of a single DMA transfer
  * @dev: The device to query



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 0/9] Exploring biovec support in (R)DMA API
  2023-10-19 15:25 ` Chuck Lever
                   ` (9 preceding siblings ...)
  (?)
@ 2023-10-19 15:53 ` Matthew Wilcox
  2023-10-19 17:48   ` Chuck Lever
  2023-10-20  4:58   ` Christoph Hellwig
  -1 siblings, 2 replies; 35+ messages in thread
From: Matthew Wilcox @ 2023-10-19 15:53 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Marek Szyprowski, Chuck Lever, Robin Murphy, Alexander Potapenko,
	linux-mm, linux-rdma, Jens Axboe, kasan-dev, David Howells, iommu,
	Christoph Hellwig

On Thu, Oct 19, 2023 at 11:25:31AM -0400, Chuck Lever wrote:
> The SunRPC stack manages pages (and eventually, folios) via an
> array of struct biovec items within struct xdr_buf. We have not
> fully committed to replacing the struct page array in xdr_buf
> because, although the socket API supports biovec arrays, the RDMA
> stack uses struct scatterlist rather than struct biovec.
> 
> This (incomplete) series explores what it might look like if the
> RDMA core API could support struct biovec array arguments. The
> series compiles on x86, but I haven't tested it further. I'm posting
> early in hopes of starting further discussion.

Good call, because I think patch 2/9 is a complete non-starter.

The fundamental problem with scatterlist is that it is both input
and output for the mapping operation.  You're replicating this mistake
in a different data structure.

My vision for the future is that we have phyr as our input structure.
That looks something like:

struct phyr {
	phys_addr_t start;
	size_t len;
};

On 32-bit, that's 8 or 12 bytes; on 64-bit it's 16 bytes.  This is
better than biovec because biovec is sometimes larger than that, and
it allows specifying IO to memory that does not have a struct page.

Our output structure can continue being called the scatterlist, but
it needs to go on a diet and look more like:

struct scatterlist {
	dma_addr_t dma_address;
	size_t dma_length;
};

Getting to this point is going to be a huge amount of work, and I need
to finish folios first.  Or somebody else can work on it ;-)
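
To make the input/output split concrete, here is a rough userspace
sketch of a mapping call that reads an array of phyr and writes a
separate array of DMA ranges. It is an illustration under assumptions,
not a proposed kernel API: the output type is called struct dma_range
rather than scatterlist, map_phyrs() is a made-up name, and a fixed
offset stands in for whatever address translation a real IOMMU or
direct mapping would perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types; the widths are assumptions. */
typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

/* Input: a physical range, as described in the message above. */
struct phyr {
	phys_addr_t start;
	size_t len;
};

/* Output: hypothetical name for the slimmed-down result entry. */
struct dma_range {
	dma_addr_t dma_address;
	size_t dma_length;
};

/*
 * Hypothetical mapper: translate each input range by a fixed offset
 * and merge ranges whose translated addresses are contiguous. Returns
 * the number of output entries written, or -1 if @out is too small.
 * The input array is never modified.
 */
static int map_phyrs(const struct phyr *in, size_t nr_in,
		     struct dma_range *out, size_t nr_out,
		     dma_addr_t offset)
{
	size_t i, n = 0;

	assert(in && out);

	for (i = 0; i < nr_in; i++) {
		dma_addr_t addr = in[i].start + offset;

		if (!in[i].len)
			continue;

		/* Extend the previous output entry when contiguous. */
		if (n && out[n - 1].dma_address + out[n - 1].dma_length == addr) {
			out[n - 1].dma_length += in[i].len;
			continue;
		}

		if (n == nr_out)
			return -1;

		out[n].dma_address = addr;
		out[n].dma_length = in[i].len;
		n++;
	}
	return (int)n;
}
```

The point of the shape is that the two arrays can have different
lengths without either one carrying dead fields: three inputs may
produce two outputs, and the caller's input stays intact for reuse.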

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 0/9] Exploring biovec support in (R)DMA API
  2023-10-19 15:25 ` Chuck Lever
                   ` (10 preceding siblings ...)
  (?)
@ 2023-10-19 16:43 ` Robin Murphy
  2023-10-19 17:53   ` Jason Gunthorpe
  -1 siblings, 1 reply; 35+ messages in thread
From: Robin Murphy @ 2023-10-19 16:43 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Marek Szyprowski, Chuck Lever, Alexander Potapenko, linux-mm,
	linux-rdma, Jens Axboe, kasan-dev, David Howells, iommu,
	Christoph Hellwig, Jason Gunthorpe

On 19/10/2023 4:25 pm, Chuck Lever wrote:
> The SunRPC stack manages pages (and eventually, folios) via an
> array of struct biovec items within struct xdr_buf. We have not
> fully committed to replacing the struct page array in xdr_buf
> because, although the socket API supports biovec arrays, the RDMA
> stack uses struct scatterlist rather than struct biovec.
> 
> This (incomplete) series explores what it might look like if the
> RDMA core API could support struct biovec array arguments. The
> series compiles on x86, but I haven't tested it further. I'm posting
> early in hopes of starting further discussion.
> 
> Are there other upper layer API consumers, besides SunRPC, who might
> prefer the use of biovec over scatterlist?
> 
> Besides handling folios as well as single pages in bv_page, what
> other work might be needed in the DMA layer?

Eww, please no. It's already well established that the scatterlist 
design is horrible and we want to move to something sane and actually 
suitable for modern DMA scenarios. Something where callers can pass a 
set of pages/physical address ranges in, and get a (separate) set of DMA 
ranges out. Without any bonkers packing of different-length lists into 
the same list structure. IIRC Jason did a bit of prototyping a while 
back, but it may be looking for someone else to pick up the idea and 
give it some more attention.
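
For readers less familiar with the packing problem, the following
userspace model may help. It mimics the way a scatterlist-style entry
holds both the caller's input and the mapper's output, so that after
merging, the DMA fields of entry N no longer describe the memory in
entry N. struct packed_ent and map_packed() are invented for this
sketch, and the fixed offset is only a stand-in for a real mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry carries both the input and the output of the mapping. */
struct packed_ent {
	uint64_t phys;		/* input: physical address */
	size_t length;		/* input: byte count */
	uint64_t dma_address;	/* output: valid for the first N entries only */
	size_t dma_length;	/* output: valid for the first N entries only */
};

/*
 * Hypothetical in-place mapper. Merged DMA ranges are packed into the
 * front of the same array. Returns N, the number of valid DMA entries,
 * which may be smaller than @nents.
 */
static int map_packed(struct packed_ent *e, int nents, uint64_t offset)
{
	int i, n = 0;

	assert(e && nents > 0);

	for (i = 0; i < nents; i++) {
		uint64_t addr = e[i].phys + offset;

		if (n && e[n - 1].dma_address + e[n - 1].dma_length == addr) {
			e[n - 1].dma_length += e[i].length;
			continue;
		}
		e[n].dma_address = addr;
		e[n].dma_length = e[i].length;
		n++;
	}

	/* Trailing entries keep their input but have no DMA meaning. */
	for (i = n; i < nents; i++) {
		e[i].dma_address = 0;
		e[i].dma_length = 0;
	}
	return n;
}
```

After mapping three entries where the first two are contiguous, the
call returns 2. Entry 1 then holds the second input segment in its
phys/length fields and the third segment's DMA range in its
dma_address/dma_length fields, so a caller has to walk the same array
with two different counts.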

What we definitely don't want at this point is a copy-paste of the same 
bad design with all the same problems. I would have to NAK patch 8 on 
principle, because the existing iommu_dma_map_sg() stuff has always been 
utterly mad, but it had to be to work around the limitations of the 
existing scatterlist design while bridging between two other established 
APIs; there's no good excuse for having *two* copies of all that to 
maintain if one doesn't have an existing precedent to fit into.

Thanks,
Robin.

> What RDMA core APIs should be converted? IMO a DMA mapping and
> registration API for biovecs would be needed. Maybe RDMA Read and
> Write too?
> 
> ---
> 
> Chuck Lever (9):
>        dma-debug: Fix a typo in a debugging eye-catcher
>        bvec: Add bio_vec fields to manage DMA mapping
>        dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
>        mm: kmsan: Add support for DMA mapping bio_vec arrays
>        dma-direct: Support direct mapping bio_vec arrays
>        DMA-API: Add dma_sync_bvecs_for_cpu() and dma_sync_bvecs_for_device()
>        DMA: Add dma_map_bvecs_attrs()
>        iommu/dma: Support DMA-mapping a bio_vec array
>        RDMA: Add helpers for DMA-mapping an array of bio_vecs
> 
> 
>   drivers/iommu/dma-iommu.c   | 368 ++++++++++++++++++++++++++++++++++++
>   drivers/iommu/iommu.c       |  58 ++++++
>   include/linux/bvec.h        | 143 ++++++++++++++
>   include/linux/dma-map-ops.h |   8 +
>   include/linux/dma-mapping.h |   9 +
>   include/linux/iommu.h       |   4 +
>   include/linux/kmsan.h       |  20 ++
>   include/rdma/ib_verbs.h     |  29 +++
>   kernel/dma/debug.c          | 165 +++++++++++++++-
>   kernel/dma/debug.h          |  38 ++++
>   kernel/dma/direct.c         |  92 +++++++++
>   kernel/dma/direct.h         |  17 ++
>   kernel/dma/mapping.c        |  93 +++++++++
>   mm/kmsan/hooks.c            |  13 ++
>   14 files changed, 1056 insertions(+), 1 deletion(-)
> 
> --
> Chuck Lever
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 0/9] Exploring biovec support in (R)DMA API
  2023-10-19 15:53 ` [PATCH RFC 0/9] Exploring biovec support in (R)DMA API Matthew Wilcox
@ 2023-10-19 17:48   ` Chuck Lever
  2023-10-20  4:58   ` Christoph Hellwig
  1 sibling, 0 replies; 35+ messages in thread
From: Chuck Lever @ 2023-10-19 17:48 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Chuck Lever, Marek Szyprowski, Robin Murphy, Alexander Potapenko,
	linux-mm, linux-rdma, Jens Axboe, kasan-dev, David Howells, iommu,
	Christoph Hellwig

On Thu, Oct 19, 2023 at 04:53:43PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 19, 2023 at 11:25:31AM -0400, Chuck Lever wrote:
> > The SunRPC stack manages pages (and eventually, folios) via an
> > array of struct biovec items within struct xdr_buf. We have not
> > fully committed to replacing the struct page array in xdr_buf
> > because, although the socket API supports biovec arrays, the RDMA
> > stack uses struct scatterlist rather than struct biovec.
> > 
> > This (incomplete) series explores what it might look like if the
> > RDMA core API could support struct biovec array arguments. The
> > series compiles on x86, but I haven't tested it further. I'm posting
> > early in hopes of starting further discussion.
> 
> Good call, because I think patch 2/9 is a complete non-starter.
> 
> The fundamental problem with scatterlist is that it is both input
> and output for the mapping operation.  You're replicating this mistake
> in a different data structure.

Fwiw, I'm not at all wedded to the "copy-and-paste SGL" approach.


> My vision for the future is that we have phyr as our input structure.
> That looks something like:
> 
> struct phyr {
> 	phys_addr_t start;
> 	size_t len;
> };
> 
> On 32-bit, that's 8 or 12 bytes; on 64-bit it's 16 bytes.  This is
> better than biovec because biovec is sometimes larger than that, and
> it allows specifying IO to memory that does not have a struct page.

Passing a folio rather than a page is indeed one of the benefits we
would like to gain for SunRPC.


> Our output structure can continue being called the scatterlist, but
> it needs to go on a diet and look more like:
> 
> struct scatterlist {
> 	dma_addr_t dma_address;
> 	size_t dma_length;
> };
> 
> Getting to this point is going to be a huge amount of work, and I need
> to finish folios first.  Or somebody else can work on it ;-)

I would like to see forward progress, as SunRPC has some skin in
this game. I'm happy to contribute code or review.

If there is some consensus on your proposed approach, I can start
with that.

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 0/9] Exploring biovec support in (R)DMA API
  2023-10-19 16:43 ` Robin Murphy
@ 2023-10-19 17:53   ` Jason Gunthorpe
  0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2023-10-19 17:53 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Chuck Lever, Marek Szyprowski, Chuck Lever, Alexander Potapenko,
	linux-mm, linux-rdma, Jens Axboe, kasan-dev, David Howells, iommu,
	Christoph Hellwig

On Thu, Oct 19, 2023 at 05:43:11PM +0100, Robin Murphy wrote:
> On 19/10/2023 4:25 pm, Chuck Lever wrote:
> > The SunRPC stack manages pages (and eventually, folios) via an
> > array of struct biovec items within struct xdr_buf. We have not
> > fully committed to replacing the struct page array in xdr_buf
> > because, although the socket API supports biovec arrays, the RDMA
> > stack uses struct scatterlist rather than struct biovec.
> > 
> > This (incomplete) series explores what it might look like if the
> > RDMA core API could support struct biovec array arguments. The
> > series compiles on x86, but I haven't tested it further. I'm posting
> > early in hopes of starting further discussion.
> > 
> > Are there other upper layer API consumers, besides SunRPC, who might
> > prefer the use of biovec over scatterlist?
> > 
> > Besides handling folios as well as single pages in bv_page, what
> > other work might be needed in the DMA layer?
> 
> Eww, please no. It's already well established that the scatterlist design is
> horrible and we want to move to something sane and actually suitable for
> modern DMA scenarios. Something where callers can pass a set of
> pages/physical address ranges in, and get a (separate) set of DMA ranges
> out. Without any bonkers packing of different-length lists into the same
> list structure. IIRC Jason did a bit of prototyping a while back, but it may
> be looking for someone else to pick up the idea and give it some more
> attention.

I put it aside for the moment as the direction changed after the
conference somewhat.

> What we definitely don't want at this point is a copy-paste of the same bad
> design with all the same problems. I would have to NAK patch 8 on principle,
> because the existing iommu_dma_map_sg() stuff has always been utterly mad,
> but it had to be to work around the limitations of the existing scatterlist
> design while bridging between two other established APIs; there's no good
> excuse for having *two* copies of all that to maintain if one doesn't have
> an existing precedent to fit into.

The idea from HCH I've been going toward was to allow each subsystem
to do what made sense for it. The dma api would provide some more
generic interfaces that could be used to implement a map_sg without
having to be tightly coupled to the DMA subsystem itself.

The concept would be to allow something like NVMe to go directly from
current BIO into its native HW format, without having to do a round
trip into an intermediate storage array.
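
One way to picture "no intermediate array" is a mapper that hands each
mapped range to a caller-supplied callback, which writes it straight
into the device's own descriptor format. The sketch below is purely
illustrative and is not taken from any posted series: struct hw_desc,
struct hw_ring, map_each() and ring_emit() are all hypothetical, and a
fixed offset again stands in for the real address translation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical input range. */
struct phys_range {
	uint64_t start;
	size_t len;
};

/* Hypothetical hardware descriptor and ring, owned by the driver. */
struct hw_desc {
	uint64_t addr;
	uint32_t len;
	uint32_t flags;
};

struct hw_ring {
	struct hw_desc *desc;
	size_t cap;
	size_t used;
};

/* Callback invoked once per mapped range; non-zero aborts the walk. */
typedef int (*emit_fn)(void *ctx, uint64_t dma_addr, size_t len);

/* Map each range and pass the result directly to @emit. */
static int map_each(const struct phys_range *in, size_t n,
		    uint64_t offset, emit_fn emit, void *ctx)
{
	size_t i;
	int ret;

	assert(in && emit);

	for (i = 0; i < n; i++) {
		ret = emit(ctx, in[i].start + offset, in[i].len);
		if (ret)
			return ret;
	}
	return 0;
}

/* Driver-side callback: fill the next free hardware descriptor. */
static int ring_emit(void *ctx, uint64_t dma_addr, size_t len)
{
	struct hw_ring *r = ctx;

	if (r->used == r->cap || len > UINT32_MAX)
		return -1;

	r->desc[r->used].addr = dma_addr;
	r->desc[r->used].len = (uint32_t)len;
	r->desc[r->used].flags = 0;
	r->used++;
	return 0;
}
```

A driver would call map_each() with ring_emit and its ring as the
context, so the only per-request storage is the descriptor ring the
hardware needs anyway.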

How this formulates to RDMA work requests I haven't thought about,
this is a large enough thing that I need some mlx5 driver support to
do the first step and that was supposed to be this month but a war has
caused some delay :(

RDMA has a complicated historical relationship to the dma_api, sadly.

This plan also wants the significant archs to all use the common
dma-iommu - now that S390 is migrated only power remains...

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
  2023-10-19 15:25   ` Chuck Lever
  (?)
@ 2023-10-19 21:38   ` kernel test robot
  2023-10-19 23:21     ` Chuck Lever III
  -1 siblings, 1 reply; 35+ messages in thread
From: kernel test robot @ 2023-10-19 21:38 UTC (permalink / raw)
  To: Chuck Lever; +Cc: oe-kbuild-all

Hi Chuck,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on rdma/for-next linus/master v6.6-rc6 next-20231019]
[cannot apply to joro-iommu/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/dma-debug-Fix-a-typo-in-a-debugging-eye-catcher/20231020-032859
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/169772915215.5232.10127407258544978465.stgit%40klimt.1015granger.net
patch subject: [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20231020/202310200545.ScAzFYdK-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231020/202310200545.ScAzFYdK-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310200545.ScAzFYdK-lkp@intel.com/

All warnings (new ones prefixed by >>):

   kernel/dma/debug.c: In function 'check_bv_segment':
   kernel/dma/debug.c:1204:15: error: 'struct bio_vec' has no member named 'length'
    1204 |         if (bv->length > max_seg)
         |               ^~
   In file included from arch/m68k/include/asm/bug.h:32,
                    from include/linux/bug.h:5,
                    from include/linux/thread_info.h:13,
                    from include/asm-generic/preempt.h:5,
                    from ./arch/m68k/include/generated/asm/preempt.h:1,
                    from include/linux/preempt.h:79,
                    from arch/m68k/include/asm/irqflags.h:6,
                    from include/linux/irqflags.h:17,
                    from arch/m68k/include/asm/atomic.h:6,
                    from include/linux/atomic.h:7,
                    from include/linux/rcupdate.h:25,
                    from include/linux/rculist.h:11,
                    from include/linux/pid.h:5,
                    from include/linux/sched.h:14,
                    from include/linux/sched/task_stack.h:9,
                    from kernel/dma/debug.c:10:
   kernel/dma/debug.c:1206:30: error: 'struct bio_vec' has no member named 'length'
    1206 |                            bv->length, max_seg);
         |                              ^~
   include/asm-generic/bug.h:99:62: note: in definition of macro '__WARN_printf'
      99 |                 warn_slowpath_fmt(__FILE__, __LINE__, taint, arg);      \
         |                                                              ^~~
   kernel/dma/debug.c:224:25: note: in expansion of macro 'WARN'
     224 |                         WARN(1, pr_fmt("%s %s: ") format,               \
         |                         ^~~~
   kernel/dma/debug.c:1205:17: note: in expansion of macro 'err_printk'
    1205 |                 err_printk(dev, NULL, "mapping bv entry longer than device claims to support [len=%u] [max=%u]\n",
         |                 ^~~~~~~~~~
   kernel/dma/debug.c: In function 'debug_dma_map_bvecs':
   kernel/dma/debug.c:1377:38: error: implicit declaration of function 'bv_page'; did you mean 'sg_page'? [-Werror=implicit-function-declaration]
    1377 |                 check_for_stack(dev, bv_page(bv), bv->offset);
         |                                      ^~~~~~~
         |                                      sg_page
   kernel/dma/debug.c:1377:55: error: 'struct bio_vec' has no member named 'offset'; did you mean 'bv_offset'?
    1377 |                 check_for_stack(dev, bv_page(bv), bv->offset);
         |                                                       ^~~~~~
         |                                                       bv_offset
>> kernel/dma/debug.c:1377:38: warning: passing argument 2 of 'check_for_stack' makes pointer from integer without a cast [-Wint-conversion]
    1377 |                 check_for_stack(dev, bv_page(bv), bv->offset);
         |                                      ^~~~~~~~~~~
         |                                      |
         |                                      int
   kernel/dma/debug.c:1059:42: note: expected 'struct page *' but argument is of type 'int'
    1059 |                             struct page *page, size_t offset)
         |                             ~~~~~~~~~~~~~^~~~
>> kernel/dma/debug.c:1378:34: warning: passing argument 1 of 'PageHighMem' makes pointer from integer without a cast [-Wint-conversion]
    1378 |                 if (!PageHighMem(bv_page(bv)))
         |                                  ^~~~~~~~~~~
         |                                  |
         |                                  int
   In file included from include/linux/mmzone.h:23,
                    from include/linux/gfp.h:7,
                    from include/linux/mm.h:7,
                    from include/linux/scatterlist.h:8,
                    from kernel/dma/debug.c:11:
   include/linux/page-flags.h:437:50: note: expected 'const struct page *' but argument is of type 'int'
     437 | static inline int Page##uname(const struct page *page) { return 0; }
         |                               ~~~~~~~~~~~~~~~~~~~^~~~
   include/linux/page-flags.h:461:38: note: in expansion of macro 'TESTPAGEFLAG_FALSE'
     461 | #define PAGEFLAG_FALSE(uname, lname) TESTPAGEFLAG_FALSE(uname, lname)   \
         |                                      ^~~~~~~~~~~~~~~~~~
   include/linux/page-flags.h:531:1: note: in expansion of macro 'PAGEFLAG_FALSE'
     531 | PAGEFLAG_FALSE(HighMem, highmem)
         | ^~~~~~~~~~~~~~
   kernel/dma/debug.c:1379:68: error: 'struct bio_vec' has no member named 'length'
    1379 |                         check_for_illegal_area(dev, bv_virt(bv), bv->length);
         |                                                                    ^~
   In file included from arch/m68k/include/asm/page.h:66,
                    from arch/m68k/include/asm/thread_info.h:6,
                    from include/linux/thread_info.h:60:
   include/asm-generic/memory_model.h:19:57: error: invalid operands to binary - (have 'int' and 'struct page *')
      19 | #define __page_to_pfn(page)     ((unsigned long)((page) - mem_map) + \
         |                                                  ~~~~~~ ^
   include/asm-generic/memory_model.h:64:21: note: in expansion of macro '__page_to_pfn'
      64 | #define page_to_pfn __page_to_pfn
         |                     ^~~~~~~~~~~~~
   kernel/dma/debug.c:1391:41: note: in expansion of macro 'page_to_pfn'
    1391 |                 entry->pfn            = page_to_pfn(bv_page(bv));
         |                                         ^~~~~~~~~~~
   kernel/dma/debug.c:1392:45: error: 'struct bio_vec' has no member named 'offset'; did you mean 'bv_offset'?
    1392 |                 entry->offset         = bv->offset;
         |                                             ^~~~~~
         |                                             bv_offset
   kernel/dma/debug.c: In function 'debug_dma_unmap_bvecs':
   kernel/dma/debug.c:1464:25: error: 'nents' undeclared (first use in this function); did you mean 'net'?
    1464 |         for (i = 0; i < nents; i++) {
         |                         ^~~~~
         |                         net
   kernel/dma/debug.c:1464:25: note: each undeclared identifier is reported only once for each function it appears in
   include/asm-generic/memory_model.h:19:57: error: invalid operands to binary - (have 'int' and 'struct page *')
      19 | #define __page_to_pfn(page)     ((unsigned long)((page) - mem_map) + \
         |                                                  ~~~~~~ ^
   include/asm-generic/memory_model.h:64:21: note: in expansion of macro '__page_to_pfn'
      64 | #define page_to_pfn __page_to_pfn
         |                     ^~~~~~~~~~~~~
   kernel/dma/debug.c:1469:43: note: in expansion of macro 'page_to_pfn'
    1469 |                         .pfn            = page_to_pfn(bv_page(bv)),
         |                                           ^~~~~~~~~~~
   kernel/dma/debug.c:1470:47: error: 'struct bio_vec' has no member named 'offset'; did you mean 'bv_offset'?
    1470 |                         .offset         = bv->offset,
         |                                               ^~~~~~
         |                                               bv_offset
   kernel/dma/debug.c: In function 'debug_dma_sync_bvecs_for_cpu':
   kernel/dma/debug.c:1700:25: error: 'nents' undeclared (first use in this function); did you mean 'net'?
    1700 |         for (i = 0; i < nents; i++) {
         |                         ^~~~~
         |                         net
>> kernel/dma/debug.c:1695:25: warning: unused variable 'bv' [-Wunused-variable]
    1695 |         struct bio_vec *bv;
         |                         ^~
   kernel/dma/debug.c: In function 'debug_dma_sync_bvecs_for_device':
   kernel/dma/debug.c:1732:25: error: 'nents' undeclared (first use in this function); did you mean 'net'?
    1732 |         for (i = 0; i < nents; i++) {
         |                         ^~~~~
         |                         net
   kernel/dma/debug.c:1727:25: warning: unused variable 'bv' [-Wunused-variable]
    1727 |         struct bio_vec *bv;
         |                         ^~
   cc1: some warnings being treated as errors


vim +/check_for_stack +1377 kernel/dma/debug.c

  1363	
  1364	void debug_dma_map_bvecs(struct device *dev, struct bio_vec *bvecs,
  1365				 int nents, int mapped_ents, int direction,
  1366				 unsigned long attrs)
  1367	{
  1368		struct dma_debug_entry *entry;
  1369		struct bio_vec *bv;
  1370		int i;
  1371	
  1372		if (unlikely(dma_debug_disabled()))
  1373			return;
  1374	
  1375		for (i = 0; i < nents; i++) {
  1376			bv = &bvecs[i];
> 1377			check_for_stack(dev, bv_page(bv), bv->offset);
> 1378			if (!PageHighMem(bv_page(bv)))
  1379				check_for_illegal_area(dev, bv_virt(bv), bv->length);
  1380		}
  1381	
  1382		for (i = 0; i < nents; i++) {
  1383			bv = &bvecs[i];
  1384	
  1385			entry = dma_entry_alloc();
  1386			if (!entry)
  1387				return;
  1388	
  1389			entry->type           = dma_debug_bv;
  1390			entry->dev            = dev;
  1391			entry->pfn	      = page_to_pfn(bv_page(bv));
  1392			entry->offset	      = bv->offset;
  1393			entry->size           = bv_dma_len(bv);
  1394			entry->dev_addr       = bv_dma_address(bv);
  1395			entry->direction      = direction;
  1396			entry->sg_call_ents   = nents;
  1397			entry->sg_mapped_ents = mapped_ents;
  1398	
  1399			check_bv_segment(dev, bv);
  1400	
  1401			add_dma_entry(entry, attrs);
  1402		}
  1403	}
  1404	
  1405	static int get_nr_mapped_entries(struct device *dev,
  1406					 struct dma_debug_entry *ref)
  1407	{
  1408		struct dma_debug_entry *entry;
  1409		struct hash_bucket *bucket;
  1410		unsigned long flags;
  1411		int mapped_ents;
  1412	
  1413		bucket       = get_hash_bucket(ref, &flags);
  1414		entry        = bucket_find_exact(bucket, ref);
  1415		mapped_ents  = 0;
  1416	
  1417		if (entry)
  1418			mapped_ents = entry->sg_mapped_ents;
  1419		put_hash_bucket(bucket, flags);
  1420	
  1421		return mapped_ents;
  1422	}
  1423	
  1424	void debug_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
  1425				int nelems, int dir)
  1426	{
  1427		struct scatterlist *s;
  1428		int mapped_ents = 0, i;
  1429	
  1430		if (unlikely(dma_debug_disabled()))
  1431			return;
  1432	
  1433		for_each_sg(sglist, s, nelems, i) {
  1434	
  1435			struct dma_debug_entry ref = {
  1436				.type           = dma_debug_sg,
  1437				.dev            = dev,
  1438				.pfn		= page_to_pfn(sg_page(s)),
  1439				.offset		= s->offset,
  1440				.dev_addr       = sg_dma_address(s),
  1441				.size           = sg_dma_len(s),
  1442				.direction      = dir,
  1443				.sg_call_ents   = nelems,
  1444			};
  1445	
  1446			if (mapped_ents && i >= mapped_ents)
  1447				break;
  1448	
  1449			if (!i)
  1450				mapped_ents = get_nr_mapped_entries(dev, &ref);
  1451	
  1452			check_unmap(&ref);
  1453		}
  1454	}
  1455	
  1456	void debug_dma_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
  1457				   int nelems, int dir)
  1458	{
  1459		int mapped_ents = 0, i;
  1460	
  1461		if (unlikely(dma_debug_disabled()))
  1462			return;
  1463	
  1464		for (i = 0; i < nents; i++) {
  1465			struct bio_vec *bv = &bvecs[i];
  1466			struct dma_debug_entry ref = {
  1467				.type           = dma_debug_bv,
  1468				.dev            = dev,
  1469				.pfn		= page_to_pfn(bv_page(bv)),
> 1470				.offset		= bv->offset,
  1471				.dev_addr       = bv_dma_address(bv),
  1472				.size           = bv_dma_len(bv),
  1473				.direction      = dir,
  1474				.sg_call_ents   = nelems,
  1475			};
  1476	
  1477			if (mapped_ents && i >= mapped_ents)
  1478				break;
  1479	
  1480			if (!i)
  1481				mapped_ents = get_nr_mapped_entries(dev, &ref);
  1482	
  1483			check_unmap(&ref);
  1484		}
  1485	}
  1486	
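
Two of the error classes in this log are naming mismatches that can be read straight off the listing: GCC suggests 'bv_offset' where the patch writes bv->offset, and debug_dma_unmap_bvecs() declares its count parameter as 'nelems' while the loop tests 'nents'. A minimal user-space sketch of the loop with consistent names follows. struct bio_vec_sketch, struct ref_sketch and build_unmap_refs() are hypothetical stand-ins invented for illustration, not kernel code; bv_len is taken to be the mainline name of the length member.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's struct bio_vec. The page
 * pointer is replaced by a plain page frame number so the sketch
 * builds outside the kernel; bv_len and bv_offset keep the member
 * names the compiler output points to.
 */
struct bio_vec_sketch {
	unsigned long pfn;
	unsigned int  bv_len;
	unsigned int  bv_offset;
};

/* Hypothetical, trimmed-down counterpart of struct dma_debug_entry. */
struct ref_sketch {
	unsigned long pfn;
	size_t        offset;
	size_t        size;
	int           sg_call_ents;
};

/*
 * Fill one reference entry per bio_vec and return the number of
 * entries written. The loop bound uses the same name as the
 * parameter (nelems), which is the mismatch reported for line 1464.
 */
int build_unmap_refs(const struct bio_vec_sketch *bvecs, int nelems,
		     struct ref_sketch *refs)
{
	int i;

	for (i = 0; i < nelems; i++) {
		const struct bio_vec_sketch *bv = &bvecs[i];

		refs[i].pfn          = bv->pfn;
		refs[i].offset       = bv->bv_offset;
		refs[i].size         = bv->bv_len;
		refs[i].sg_call_ents = nelems;
	}
	return i;
}
```

The sketch leaves out the hash-bucket lookup and check_unmap() call from the listing; it only illustrates the member and parameter names.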

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
  2023-10-19 15:25   ` Chuck Lever
  (?)
  (?)
@ 2023-10-19 21:49   ` kernel test robot
  -1 siblings, 0 replies; 35+ messages in thread
From: kernel test robot @ 2023-10-19 21:49 UTC (permalink / raw)
  To: Chuck Lever; +Cc: oe-kbuild-all

Hi Chuck,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on rdma/for-next linus/master v6.6-rc6 next-20231019]
[cannot apply to joro-iommu/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/dma-debug-Fix-a-typo-in-a-debugging-eye-catcher/20231020-032859
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/169772915215.5232.10127407258544978465.stgit%40klimt.1015granger.net
patch subject: [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
config: alpha-allyesconfig (https://download.01.org/0day-ci/archive/20231020/202310200500.S3lG80My-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231020/202310200500.S3lG80My-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310200500.S3lG80My-lkp@intel.com/

All warnings (new ones prefixed by >>):

   kernel/dma/debug.c: In function 'check_bv_segment':
   kernel/dma/debug.c:1204:15: error: 'struct bio_vec' has no member named 'length'
    1204 |         if (bv->length > max_seg)
         |               ^~
   In file included from arch/alpha/include/asm/bug.h:23,
                    from include/linux/bug.h:5,
                    from include/linux/thread_info.h:13,
                    from include/asm-generic/current.h:6,
                    from ./arch/alpha/include/generated/asm/current.h:1,
                    from include/linux/sched.h:12,
                    from include/linux/sched/task_stack.h:9,
                    from kernel/dma/debug.c:10:
   kernel/dma/debug.c:1206:30: error: 'struct bio_vec' has no member named 'length'
    1206 |                            bv->length, max_seg);
         |                              ^~
   include/asm-generic/bug.h:99:62: note: in definition of macro '__WARN_printf'
      99 |                 warn_slowpath_fmt(__FILE__, __LINE__, taint, arg);      \
         |                                                              ^~~
   kernel/dma/debug.c:224:25: note: in expansion of macro 'WARN'
     224 |                         WARN(1, pr_fmt("%s %s: ") format,               \
         |                         ^~~~
   kernel/dma/debug.c:1205:17: note: in expansion of macro 'err_printk'
    1205 |                 err_printk(dev, NULL, "mapping bv entry longer than device claims to support [len=%u] [max=%u]\n",
         |                 ^~~~~~~~~~
   kernel/dma/debug.c: In function 'debug_dma_map_bvecs':
   kernel/dma/debug.c:1377:38: error: implicit declaration of function 'bv_page'; did you mean 'sg_page'? [-Werror=implicit-function-declaration]
    1377 |                 check_for_stack(dev, bv_page(bv), bv->offset);
         |                                      ^~~~~~~
         |                                      sg_page
   kernel/dma/debug.c:1377:55: error: 'struct bio_vec' has no member named 'offset'; did you mean 'bv_offset'?
    1377 |                 check_for_stack(dev, bv_page(bv), bv->offset);
         |                                                       ^~~~~~
         |                                                       bv_offset
   kernel/dma/debug.c:1377:38: warning: passing argument 2 of 'check_for_stack' makes pointer from integer without a cast [-Wint-conversion]
    1377 |                 check_for_stack(dev, bv_page(bv), bv->offset);
         |                                      ^~~~~~~~~~~
         |                                      |
         |                                      int
   kernel/dma/debug.c:1059:42: note: expected 'struct page *' but argument is of type 'int'
    1059 |                             struct page *page, size_t offset)
         |                             ~~~~~~~~~~~~~^~~~
   kernel/dma/debug.c:1378:34: warning: passing argument 1 of 'PageHighMem' makes pointer from integer without a cast [-Wint-conversion]
    1378 |                 if (!PageHighMem(bv_page(bv)))
         |                                  ^~~~~~~~~~~
         |                                  |
         |                                  int
   In file included from include/linux/mmzone.h:23,
                    from include/linux/gfp.h:7,
                    from include/linux/mm.h:7,
                    from include/linux/scatterlist.h:8,
                    from kernel/dma/debug.c:11:
   include/linux/page-flags.h:437:50: note: expected 'const struct page *' but argument is of type 'int'
     437 | static inline int Page##uname(const struct page *page) { return 0; }
         |                               ~~~~~~~~~~~~~~~~~~~^~~~
   include/linux/page-flags.h:461:38: note: in expansion of macro 'TESTPAGEFLAG_FALSE'
     461 | #define PAGEFLAG_FALSE(uname, lname) TESTPAGEFLAG_FALSE(uname, lname)   \
         |                                      ^~~~~~~~~~~~~~~~~~
   include/linux/page-flags.h:531:1: note: in expansion of macro 'PAGEFLAG_FALSE'
     531 | PAGEFLAG_FALSE(HighMem, highmem)
         | ^~~~~~~~~~~~~~
   kernel/dma/debug.c:1379:68: error: 'struct bio_vec' has no member named 'length'
    1379 |                         check_for_illegal_area(dev, bv_virt(bv), bv->length);
         |                                                                    ^~
   In file included from arch/alpha/include/asm/page.h:89,
                    from include/linux/shm.h:6,
                    from include/linux/sched.h:16:
>> include/asm-generic/memory_model.h:46:35: warning: initialization of 'const struct page *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
      46 | ({      const struct page *__pg = (pg);                         \
         |                                   ^
   include/asm-generic/memory_model.h:64:21: note: in expansion of macro '__page_to_pfn'
      64 | #define page_to_pfn __page_to_pfn
         |                     ^~~~~~~~~~~~~
   kernel/dma/debug.c:1391:41: note: in expansion of macro 'page_to_pfn'
    1391 |                 entry->pfn            = page_to_pfn(bv_page(bv));
         |                                         ^~~~~~~~~~~
   kernel/dma/debug.c:1392:45: error: 'struct bio_vec' has no member named 'offset'; did you mean 'bv_offset'?
    1392 |                 entry->offset         = bv->offset;
         |                                             ^~~~~~
         |                                             bv_offset
   kernel/dma/debug.c: In function 'debug_dma_unmap_bvecs':
   kernel/dma/debug.c:1464:25: error: 'nents' undeclared (first use in this function); did you mean 'net'?
    1464 |         for (i = 0; i < nents; i++) {
         |                         ^~~~~
         |                         net
   kernel/dma/debug.c:1464:25: note: each undeclared identifier is reported only once for each function it appears in
>> include/asm-generic/memory_model.h:46:35: warning: initialization of 'const struct page *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
      46 | ({      const struct page *__pg = (pg);                         \
         |                                   ^
   include/asm-generic/memory_model.h:64:21: note: in expansion of macro '__page_to_pfn'
      64 | #define page_to_pfn __page_to_pfn
         |                     ^~~~~~~~~~~~~
   kernel/dma/debug.c:1469:43: note: in expansion of macro 'page_to_pfn'
    1469 |                         .pfn            = page_to_pfn(bv_page(bv)),
         |                                           ^~~~~~~~~~~
   include/asm-generic/memory_model.h:46:35: note: (near initialization for 'ref')
      46 | ({      const struct page *__pg = (pg);                         \
         |                                   ^
   include/asm-generic/memory_model.h:64:21: note: in expansion of macro '__page_to_pfn'
      64 | #define page_to_pfn __page_to_pfn
         |                     ^~~~~~~~~~~~~
   kernel/dma/debug.c:1469:43: note: in expansion of macro 'page_to_pfn'
    1469 |                         .pfn            = page_to_pfn(bv_page(bv)),
         |                                           ^~~~~~~~~~~
   kernel/dma/debug.c:1470:47: error: 'struct bio_vec' has no member named 'offset'; did you mean 'bv_offset'?
    1470 |                         .offset         = bv->offset,
         |                                               ^~~~~~
         |                                               bv_offset
   kernel/dma/debug.c: In function 'debug_dma_sync_bvecs_for_cpu':
   kernel/dma/debug.c:1700:25: error: 'nents' undeclared (first use in this function); did you mean 'net'?
    1700 |         for (i = 0; i < nents; i++) {
         |                         ^~~~~
         |                         net
   kernel/dma/debug.c:1695:25: warning: unused variable 'bv' [-Wunused-variable]
    1695 |         struct bio_vec *bv;
         |                         ^~
   kernel/dma/debug.c: In function 'debug_dma_sync_bvecs_for_device':
   kernel/dma/debug.c:1732:25: error: 'nents' undeclared (first use in this function); did you mean 'net'?
    1732 |         for (i = 0; i < nents; i++) {
         |                         ^~~~~
         |                         net
   kernel/dma/debug.c:1727:25: warning: unused variable 'bv' [-Wunused-variable]
    1727 |         struct bio_vec *bv;
         |                         ^~
   cc1: some warnings being treated as errors


vim +46 include/asm-generic/memory_model.h

8f6aac419bd590 Christoph Lameter 2007-10-16  39  
a117e66ed45ac0 KAMEZAWA Hiroyuki 2006-03-27  40  #elif defined(CONFIG_SPARSEMEM)
a117e66ed45ac0 KAMEZAWA Hiroyuki 2006-03-27  41  /*
1a49123b343461 Zhang Yanfei      2013-10-03  42   * Note: section's mem_map is encoded to reflect its start_pfn.
a117e66ed45ac0 KAMEZAWA Hiroyuki 2006-03-27  43   * section[i].section_mem_map == mem_map's address - start_pfn;
a117e66ed45ac0 KAMEZAWA Hiroyuki 2006-03-27  44   */
67de648211fa04 Andy Whitcroft    2006-06-23  45  #define __page_to_pfn(pg)					\
aa462abe8aaf21 Ian Campbell      2011-08-17 @46  ({	const struct page *__pg = (pg);				\
a117e66ed45ac0 KAMEZAWA Hiroyuki 2006-03-27  47  	int __sec = page_to_section(__pg);			\
f05b6284ee5d3b Randy Dunlap      2007-02-10  48  	(unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec)));	\
a117e66ed45ac0 KAMEZAWA Hiroyuki 2006-03-27  49  })
a117e66ed45ac0 KAMEZAWA Hiroyuki 2006-03-27  50  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH RFC 7/9] DMA: Add dma_map_bvecs_attrs()
  2023-10-19 15:26   ` Chuck Lever
  (?)
@ 2023-10-19 22:10   ` kernel test robot
  -1 siblings, 0 replies; 35+ messages in thread
From: kernel test robot @ 2023-10-19 22:10 UTC (permalink / raw)
  To: Chuck Lever; +Cc: oe-kbuild-all

Hi Chuck,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on rdma/for-next linus/master v6.6-rc6 next-20231019]
[cannot apply to joro-iommu/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/dma-debug-Fix-a-typo-in-a-debugging-eye-catcher/20231020-032859
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/169772917833.5232.13488378553385610086.stgit%40klimt.1015granger.net
patch subject: [PATCH RFC 7/9] DMA: Add dma_map_bvecs_attrs()
config: loongarch-allyesconfig (https://download.01.org/0day-ci/archive/20231020/202310200655.lcp1jrR1-lkp@intel.com/config)
compiler: loongarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231020/202310200655.lcp1jrR1-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310200655.lcp1jrR1-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> kernel/dma/mapping.c:321: warning: expecting prototype for dma_map_sg_attrs(). Prototype was for dma_map_bvecs_attrs() instead


vim +321 kernel/dma/mapping.c

   298	
   299	/**
   300	 * dma_map_sg_attrs - Map the given buffer for DMA
   301	 * @dev:	The device for which to perform the DMA operation
   302	 * @bvecs:	The bio_vec array describing the buffer
   303	 * @nents:	Number of bio_vecs to map
   304	 * @dir:	DMA direction
   305	 * @attrs:	Optional DMA attributes for the map operation
   306	 *
   307	 * Maps a buffer described by a bio_vec array passed in the bvecs
   308	 * argument with nents segments for the @dir DMA operation by the
   309	 * @dev device.
   310	 *
   311	 * Returns the number of mapped entries (which can be less than nents)
   312	 * on success. Zero is returned for any error.
   313	 *
   314	 * dma_unmap_bvecs_attrs() should be used to unmap the buffer with the
   315	 * original bvecs and original nents (not the value returned by this
   316	 * function).
   317	 */
   318	unsigned int dma_map_bvecs_attrs(struct device *dev, struct bio_vec *bvecs,
   319					 int nents, enum dma_data_direction dir,
   320					 unsigned long attrs)
 > 321	{
   322		const struct dma_map_ops *ops = get_dma_ops(dev);
   323		int ents;
   324	
   325		BUG_ON(!valid_dma_direction(dir));
   326	
   327		if (WARN_ON_ONCE(!dev->dma_mask))
   328			return 0;
   329	
   330		if (dma_map_direct(dev, ops))
   331			ents = dma_direct_map_bvecs(dev, bvecs, nents, dir, attrs);
   332		else
   333			ents = ops->map_bvecs(dev, bvecs, nents, dir, attrs);
   334	
   335		if (ents > 0) {
   336			kmsan_handle_dma_bvecs(bvecs, nents, dir);
   337			debug_dma_map_bvecs(dev, bvecs, nents, ents, dir, attrs);
   338		} else if (WARN_ON_ONCE(ents != -EINVAL && ents != -ENOMEM &&
   339					ents != -EIO && ents != -EREMOTEIO)) {
   340			return -EIO;
   341		}
   342	
   343		return ents;
   344	}
   345	EXPORT_SYMBOL_GPL(dma_map_bvecs_attrs);
   346	
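
The kernel-doc warning arises because the name on the first line of the /** comment (dma_map_sg_attrs) does not match the function that follows it (dma_map_bvecs_attrs). The sketch below shows a comment whose name matches its function, and also makes the return value agree with the "Zero is returned for any error" sentence by clamping negative results to zero. It is a user-space illustration under stated assumptions: both function names, the 'fail' knob and the -5 error value are hypothetical and do not come from the patch.

```c
/*
 * Hypothetical stand-in for the inner mapping routine: returns the
 * number of mapped entries, or a negative errno-style value when
 * 'fail' is non-zero.
 */
int map_bvecs_inner_sketch(int nents, int fail)
{
	return fail ? -5 : nents;
}

/**
 * dma_map_bvecs_attrs_sketch - Map the given buffer for DMA (sketch)
 * @nents:	Number of bio_vecs to map
 * @fail:	Hypothetical knob that forces the inner routine to fail
 *
 * Returns the number of mapped entries on success. Zero is returned
 * for any error, so a negative inner result never leaks out through
 * the unsigned return type.
 */
unsigned int dma_map_bvecs_attrs_sketch(int nents, int fail)
{
	int ents = map_bvecs_inner_sketch(nents, fail);

	if (ents < 0)
		return 0;
	return ents;
}
```

The listing above returns 'ents' unchanged, and -EIO on one path, from a function declared unsigned int, so callers testing for zero would miss those errors; the clamp is one possible way to reconcile the code with its comment.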

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
  2023-10-19 21:38   ` kernel test robot
@ 2023-10-19 23:21     ` Chuck Lever III
  2023-10-23  2:43       ` Liu, Yujie
  0 siblings, 1 reply; 35+ messages in thread
From: Chuck Lever III @ 2023-10-19 23:21 UTC (permalink / raw)
  To: kernel test robot; +Cc: Chuck Lever, oe-kbuild-all@lists.linux.dev



> On Oct 19, 2023, at 5:38 PM, kernel test robot <lkp@intel.com> wrote:
> 
> Hi Chuck,
> 
> [This is a private test report for your RFC patch.]
> kernel test robot noticed the following build warnings:
> 
> [auto build test WARNING on akpm-mm/mm-everything]
> [also build test WARNING on rdma/for-next linus/master v6.6-rc6 next-20231019]
> [cannot apply to joro-iommu/next]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/dma-debug-Fix-a-typo-in-a-debugging-eye-catcher/20231020-032859
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link:    https://lore.kernel.org/r/169772915215.5232.10127407258544978465.stgit%40klimt.1015granger.net
> patch subject: [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
> config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20231020/202310200545.ScAzFYdK-lkp@intel.com/config)
> compiler: m68k-linux-gcc (GCC) 13.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231020/202310200545.ScAzFYdK-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202310200545.ScAzFYdK-lkp@intel.com/

Fwiw, this was an RFC series. Reviewers have already rejected
this approach, so I won't be posting another version of this.

Is there a way to flush out the test pipeline so no more tests
are run on the series?


> All warnings (new ones prefixed by >>):
> 
>   kernel/dma/debug.c: In function 'check_bv_segment':
>   kernel/dma/debug.c:1204:15: error: 'struct bio_vec' has no member named 'length'
>    1204 |         if (bv->length > max_seg)
>         |               ^~
>   In file included from arch/m68k/include/asm/bug.h:32,
>                    from include/linux/bug.h:5,
>                    from include/linux/thread_info.h:13,
>                    from include/asm-generic/preempt.h:5,
>                    from ./arch/m68k/include/generated/asm/preempt.h:1,
>                    from include/linux/preempt.h:79,
>                    from arch/m68k/include/asm/irqflags.h:6,
>                    from include/linux/irqflags.h:17,
>                    from arch/m68k/include/asm/atomic.h:6,
>                    from include/linux/atomic.h:7,
>                    from include/linux/rcupdate.h:25,
>                    from include/linux/rculist.h:11,
>                    from include/linux/pid.h:5,
>                    from include/linux/sched.h:14,
>                    from include/linux/sched/task_stack.h:9,
>                    from kernel/dma/debug.c:10:
>   kernel/dma/debug.c:1206:30: error: 'struct bio_vec' has no member named 'length'
>    1206 |                            bv->length, max_seg);
>         |                              ^~
>   include/asm-generic/bug.h:99:62: note: in definition of macro '__WARN_printf'
>      99 |                 warn_slowpath_fmt(__FILE__, __LINE__, taint, arg);      \
>         |                                                              ^~~
>   kernel/dma/debug.c:224:25: note: in expansion of macro 'WARN'
>     224 |                         WARN(1, pr_fmt("%s %s: ") format,               \
>         |                         ^~~~
>   kernel/dma/debug.c:1205:17: note: in expansion of macro 'err_printk'
>    1205 |                 err_printk(dev, NULL, "mapping bv entry longer than device claims to support [len=%u] [max=%u]\n",
>         |                 ^~~~~~~~~~
>   kernel/dma/debug.c: In function 'debug_dma_map_bvecs':
>   kernel/dma/debug.c:1377:38: error: implicit declaration of function 'bv_page'; did you mean 'sg_page'? [-Werror=implicit-function-declaration]
>    1377 |                 check_for_stack(dev, bv_page(bv), bv->offset);
>         |                                      ^~~~~~~
>         |                                      sg_page
>   kernel/dma/debug.c:1377:55: error: 'struct bio_vec' has no member named 'offset'; did you mean 'bv_offset'?
>    1377 |                 check_for_stack(dev, bv_page(bv), bv->offset);
>         |                                                       ^~~~~~
>         |                                                       bv_offset
>>> kernel/dma/debug.c:1377:38: warning: passing argument 2 of 'check_for_stack' makes pointer from integer without a cast [-Wint-conversion]
>    1377 |                 check_for_stack(dev, bv_page(bv), bv->offset);
>         |                                      ^~~~~~~~~~~
>         |                                      |
>         |                                      int
>   kernel/dma/debug.c:1059:42: note: expected 'struct page *' but argument is of type 'int'
>    1059 |                             struct page *page, size_t offset)
>         |                             ~~~~~~~~~~~~~^~~~
>>> kernel/dma/debug.c:1378:34: warning: passing argument 1 of 'PageHighMem' makes pointer from integer without a cast [-Wint-conversion]
>    1378 |                 if (!PageHighMem(bv_page(bv)))
>         |                                  ^~~~~~~~~~~
>         |                                  |
>         |                                  int
>   In file included from include/linux/mmzone.h:23,
>                    from include/linux/gfp.h:7,
>                    from include/linux/mm.h:7,
>                    from include/linux/scatterlist.h:8,
>                    from kernel/dma/debug.c:11:
>   include/linux/page-flags.h:437:50: note: expected 'const struct page *' but argument is of type 'int'
>     437 | static inline int Page##uname(const struct page *page) { return 0; }
>         |                               ~~~~~~~~~~~~~~~~~~~^~~~
>   include/linux/page-flags.h:461:38: note: in expansion of macro 'TESTPAGEFLAG_FALSE'
>     461 | #define PAGEFLAG_FALSE(uname, lname) TESTPAGEFLAG_FALSE(uname, lname)   \
>         |                                      ^~~~~~~~~~~~~~~~~~
>   include/linux/page-flags.h:531:1: note: in expansion of macro 'PAGEFLAG_FALSE'
>     531 | PAGEFLAG_FALSE(HighMem, highmem)
>         | ^~~~~~~~~~~~~~
>   kernel/dma/debug.c:1379:68: error: 'struct bio_vec' has no member named 'length'
>    1379 |                         check_for_illegal_area(dev, bv_virt(bv), bv->length);
>         |                                                                    ^~
>   In file included from arch/m68k/include/asm/page.h:66,
>                    from arch/m68k/include/asm/thread_info.h:6,
>                    from include/linux/thread_info.h:60:
>   include/asm-generic/memory_model.h:19:57: error: invalid operands to binary - (have 'int' and 'struct page *')
>      19 | #define __page_to_pfn(page)     ((unsigned long)((page) - mem_map) + \
>         |                                                  ~~~~~~ ^
>   include/asm-generic/memory_model.h:64:21: note: in expansion of macro '__page_to_pfn'
>      64 | #define page_to_pfn __page_to_pfn
>         |                     ^~~~~~~~~~~~~
>   kernel/dma/debug.c:1391:41: note: in expansion of macro 'page_to_pfn'
>    1391 |                 entry->pfn            = page_to_pfn(bv_page(bv));
>         |                                         ^~~~~~~~~~~
>   kernel/dma/debug.c:1392:45: error: 'struct bio_vec' has no member named 'offset'; did you mean 'bv_offset'?
>    1392 |                 entry->offset         = bv->offset;
>         |                                             ^~~~~~
>         |                                             bv_offset
>   kernel/dma/debug.c: In function 'debug_dma_unmap_bvecs':
>   kernel/dma/debug.c:1464:25: error: 'nents' undeclared (first use in this function); did you mean 'net'?
>    1464 |         for (i = 0; i < nents; i++) {
>         |                         ^~~~~
>         |                         net
>   kernel/dma/debug.c:1464:25: note: each undeclared identifier is reported only once for each function it appears in
>   include/asm-generic/memory_model.h:19:57: error: invalid operands to binary - (have 'int' and 'struct page *')
>      19 | #define __page_to_pfn(page)     ((unsigned long)((page) - mem_map) + \
>         |                                                  ~~~~~~ ^
>   include/asm-generic/memory_model.h:64:21: note: in expansion of macro '__page_to_pfn'
>      64 | #define page_to_pfn __page_to_pfn
>         |                     ^~~~~~~~~~~~~
>   kernel/dma/debug.c:1469:43: note: in expansion of macro 'page_to_pfn'
>    1469 |                         .pfn            = page_to_pfn(bv_page(bv)),
>         |                                           ^~~~~~~~~~~
>   kernel/dma/debug.c:1470:47: error: 'struct bio_vec' has no member named 'offset'; did you mean 'bv_offset'?
>    1470 |                         .offset         = bv->offset,
>         |                                               ^~~~~~
>         |                                               bv_offset
>   kernel/dma/debug.c: In function 'debug_dma_sync_bvecs_for_cpu':
>   kernel/dma/debug.c:1700:25: error: 'nents' undeclared (first use in this function); did you mean 'net'?
>    1700 |         for (i = 0; i < nents; i++) {
>         |                         ^~~~~
>         |                         net
>>> kernel/dma/debug.c:1695:25: warning: unused variable 'bv' [-Wunused-variable]
>    1695 |         struct bio_vec *bv;
>         |                         ^~
>   kernel/dma/debug.c: In function 'debug_dma_sync_bvecs_for_device':
>   kernel/dma/debug.c:1732:25: error: 'nents' undeclared (first use in this function); did you mean 'net'?
>    1732 |         for (i = 0; i < nents; i++) {
>         |                         ^~~~~
>         |                         net
>   kernel/dma/debug.c:1727:25: warning: unused variable 'bv' [-Wunused-variable]
>    1727 |         struct bio_vec *bv;
>         |                         ^~
>   cc1: some warnings being treated as errors
> 
> 
> vim +/check_for_stack +1377 kernel/dma/debug.c
> 
>  1363 
>  1364 void debug_dma_map_bvecs(struct device *dev, struct bio_vec *bvecs,
>  1365 int nents, int mapped_ents, int direction,
>  1366 unsigned long attrs)
>  1367 {
>  1368 struct dma_debug_entry *entry;
>  1369 struct bio_vec *bv;
>  1370 int i;
>  1371 
>  1372 if (unlikely(dma_debug_disabled()))
>  1373 return;
>  1374 
>  1375 for (i = 0; i < nents; i++) {
>  1376 bv = &bvecs[i];
>> 1377 check_for_stack(dev, bv_page(bv), bv->offset);
>> 1378 if (!PageHighMem(bv_page(bv)))
>  1379 check_for_illegal_area(dev, bv_virt(bv), bv->length);
>  1380 }
>  1381 
>  1382 for (i = 0; i < nents; i++) {
>  1383 bv = &bvecs[i];
>  1384 
>  1385 entry = dma_entry_alloc();
>  1386 if (!entry)
>  1387 return;
>  1388 
>  1389 entry->type           = dma_debug_bv;
>  1390 entry->dev            = dev;
>  1391 entry->pfn      = page_to_pfn(bv_page(bv));
>  1392 entry->offset      = bv->offset;
>  1393 entry->size           = bv_dma_len(bv);
>  1394 entry->dev_addr       = bv_dma_address(bv);
>  1395 entry->direction      = direction;
>  1396 entry->sg_call_ents   = nents;
>  1397 entry->sg_mapped_ents = mapped_ents;
>  1398 
>  1399 check_bv_segment(dev, bv);
>  1400 
>  1401 add_dma_entry(entry, attrs);
>  1402 }
>  1403 }
>  1404 
>  1405 static int get_nr_mapped_entries(struct device *dev,
>  1406 struct dma_debug_entry *ref)
>  1407 {
>  1408 struct dma_debug_entry *entry;
>  1409 struct hash_bucket *bucket;
>  1410 unsigned long flags;
>  1411 int mapped_ents;
>  1412 
>  1413 bucket       = get_hash_bucket(ref, &flags);
>  1414 entry        = bucket_find_exact(bucket, ref);
>  1415 mapped_ents  = 0;
>  1416 
>  1417 if (entry)
>  1418 mapped_ents = entry->sg_mapped_ents;
>  1419 put_hash_bucket(bucket, flags);
>  1420 
>  1421 return mapped_ents;
>  1422 }
>  1423 
>  1424 void debug_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
>  1425 int nelems, int dir)
>  1426 {
>  1427 struct scatterlist *s;
>  1428 int mapped_ents = 0, i;
>  1429 
>  1430 if (unlikely(dma_debug_disabled()))
>  1431 return;
>  1432 
>  1433 for_each_sg(sglist, s, nelems, i) {
>  1434 
>  1435 struct dma_debug_entry ref = {
>  1436 .type           = dma_debug_sg,
>  1437 .dev            = dev,
>  1438 .pfn = page_to_pfn(sg_page(s)),
>  1439 .offset = s->offset,
>  1440 .dev_addr       = sg_dma_address(s),
>  1441 .size           = sg_dma_len(s),
>  1442 .direction      = dir,
>  1443 .sg_call_ents   = nelems,
>  1444 };
>  1445 
>  1446 if (mapped_ents && i >= mapped_ents)
>  1447 break;
>  1448 
>  1449 if (!i)
>  1450 mapped_ents = get_nr_mapped_entries(dev, &ref);
>  1451 
>  1452 check_unmap(&ref);
>  1453 }
>  1454 }
>  1455 
>  1456 void debug_dma_unmap_bvecs(struct device *dev, struct bio_vec *bvecs,
>  1457   int nelems, int dir)
>  1458 {
>  1459 int mapped_ents = 0, i;
>  1460 
>  1461 if (unlikely(dma_debug_disabled()))
>  1462 return;
>  1463 
>  1464 for (i = 0; i < nents; i++) {
>  1465 struct bio_vec *bv = &bvecs[i];
>  1466 struct dma_debug_entry ref = {
>  1467 .type           = dma_debug_bv,
>  1468 .dev            = dev,
>  1469 .pfn = page_to_pfn(bv_page(bv)),
>> 1470 .offset = bv->offset,
>  1471 .dev_addr       = bv_dma_address(bv),
>  1472 .size           = bv_dma_len(bv),
>  1473 .direction      = dir,
>  1474 .sg_call_ents   = nelems,
>  1475 };
>  1476 
>  1477 if (mapped_ents && i >= mapped_ents)
>  1478 break;
>  1479 
>  1480 if (!i)
>  1481 mapped_ents = get_nr_mapped_entries(dev, &ref);
>  1482 
>  1483 check_unmap(&ref);
>  1484 }
>  1485 }
>  1486 
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki

--
Chuck Lever




* Re: [PATCH RFC 1/9] dma-debug: Fix a typo in a debugging eye-catcher
  2023-10-19 15:25 ` [PATCH RFC 1/9] dma-debug: Fix a typo in a debugging eye-catcher Chuck Lever
@ 2023-10-20  4:49   ` Christoph Hellwig
  2023-10-20 13:38     ` Chuck Lever III
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2023-10-20  4:49 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy, iommu,
	Chuck Lever

Thanks,

I'll add this to the dma-mapping tree.

FYI, I only got patches 1, 2 and the cover letter.  Something seems to
be broken in your mailer setup.


* Re: [PATCH RFC 0/9] Exploring biovec support in (R)DMA API
  2023-10-19 15:53 ` [PATCH RFC 0/9] Exploring biovec support in (R)DMA API Matthew Wilcox
  2023-10-19 17:48   ` Chuck Lever
@ 2023-10-20  4:58   ` Christoph Hellwig
  2023-10-20 10:30     ` Robin Murphy
  1 sibling, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2023-10-20  4:58 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Chuck Lever, Marek Szyprowski, Chuck Lever, Robin Murphy,
	Alexander Potapenko, linux-mm, linux-rdma, Jens Axboe, kasan-dev,
	David Howells, iommu, Christoph Hellwig

On Thu, Oct 19, 2023 at 04:53:43PM +0100, Matthew Wilcox wrote:
> > RDMA core API could support struct biovec array arguments. The
> > series compiles on x86, but I haven't tested it further. I'm posting
> > early in hopes of starting further discussion.
> 
> Good call, because I think patch 2/9 is a complete non-starter.
> 
> The fundamental problem with scatterlist is that it is both input
> and output for the mapping operation.  You're replicating this mistake
> in a different data structure.

Agreed.

> 
> My vision for the future is that we have phyr as our input structure.
> That looks something like:
> 
> struct phyr {
> 	phys_addr_t start;
> 	size_t len;
> };

So my plan was always to turn the bio_vec into that structure, since
before you came up with the phyr name.  But that's really a separate
discussion as we might as well support multiple input formats if we
really have to.

> Our output structure can continue being called the scatterlist, but
> it needs to go on a diet and look more like:
> 
> struct scatterlist {
> 	dma_addr_t dma_address;
> 	size_t dma_length;
> };

I called it a dma_vec in my years old proposal I can't find any more.

> Getting to this point is going to be a huge amount of work, and I need
> to finish folios first.  Or somebody else can work on it ;-)

Well, we can stage this.  I wish I could find my old proposal about the
dma_batch API (I remember Robin commented on it, maybe he is better at
finding it than me).  I think that mostly still stands, independent
of the transformation of the input structure.  The basic idea is that
we add a dma batching API, where you start a batch with one call,
and then add new physically discontiguous vectors to it until
it is full, and then finalize it.  Very similar to how the iommu API
works internally.  We'd then only use this API if we actually have
an iommu (or, if we want to be fancy, swiotlb, which could do the same
linearization); for the direct map we'd still do the equivalent
of dma_map_page for each element as we need one output vector per
input vector anyway.
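
To make the batching idea concrete, here is a minimal userspace sketch of
what such an interface might look like. Everything in it is an assumption
for illustration: the names dma_batch_start(), dma_batch_add(),
dma_batch_finalize() and dma_map_direct() are hypothetical and are not an
existing kernel API, struct phyr and struct dma_vec follow the shapes
quoted above, and the "IOMMU" is modelled as nothing more than a
pre-reserved IOVA window that is filled front to back.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types, modelled on the structures discussed in this thread. */
struct phyr    { uint64_t start; size_t len; };               /* input: physical range */
struct dma_vec { uint64_t dma_address; size_t dma_length; };  /* output: DMA range */

/* Hypothetical batch state: one contiguous IOVA window, filled front to back. */
struct dma_batch {
	uint64_t iova_base;  /* start of the reserved IOVA window */
	size_t   capacity;   /* size of the window in bytes */
	size_t   used;       /* bytes consumed so far */
};

/* Start a batch over a pre-reserved IOVA window. */
void dma_batch_start(struct dma_batch *b, uint64_t iova_base, size_t capacity)
{
	b->iova_base = iova_base;
	b->capacity = capacity;
	b->used = 0;
}

/*
 * Add one physical range to the batch. A real implementation would
 * program IOMMU page tables here; this model only advances the offset.
 * Returns 0 on success, -1 if the range does not fit (batch is full).
 */
int dma_batch_add(struct dma_batch *b, const struct phyr *in)
{
	if (in->len > b->capacity - b->used)
		return -1;
	b->used += in->len;
	return 0;
}

/* Finalize: the whole batch is described by a single output vector. */
struct dma_vec dma_batch_finalize(const struct dma_batch *b)
{
	struct dma_vec out = { b->iova_base, b->used };
	return out;
}

/* Direct-map fallback: one output vector per input vector, no merging. */
void dma_map_direct(const struct phyr *in, struct dma_vec *out, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		out[i].dma_address = in[i].start;  /* identity mapping assumed */
		out[i].dma_length = in[i].len;
	}
}
```

The point of the sketch is only the shape of the two paths: the batch path
collapses many discontiguous inputs into one output vector, while the
direct path necessarily produces as many outputs as inputs.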

As Jason pointed out the only fancy implementation we need for now
is the IOMMU API.  arm32 and powerpc will need to do the work
to convert to it or do their own work.


* Re: [PATCH RFC 0/9] Exploring biovec support in (R)DMA API
  2023-10-20  4:58   ` Christoph Hellwig
@ 2023-10-20 10:30     ` Robin Murphy
  2023-10-23  5:59       ` Christoph Hellwig
  0 siblings, 1 reply; 35+ messages in thread
From: Robin Murphy @ 2023-10-20 10:30 UTC (permalink / raw)
  To: Christoph Hellwig, Matthew Wilcox
  Cc: Chuck Lever, Marek Szyprowski, Chuck Lever, Alexander Potapenko,
	linux-mm, linux-rdma, Jens Axboe, kasan-dev, David Howells, iommu

On 2023-10-20 05:58, Christoph Hellwig wrote:
> On Thu, Oct 19, 2023 at 04:53:43PM +0100, Matthew Wilcox wrote:
>>> RDMA core API could support struct biovec array arguments. The
>>> series compiles on x86, but I haven't tested it further. I'm posting
>>> early in hopes of starting further discussion.
>>
>> Good call, because I think patch 2/9 is a complete non-starter.
>>
>> The fundamental problem with scatterlist is that it is both input
>> and output for the mapping operation.  You're replicating this mistake
>> in a different data structure.
> 
> Agreed.
> 
>>
>> My vision for the future is that we have phyr as our input structure.
>> That looks something like:
>>
>> struct phyr {
>> 	phys_addr_t start;
>> 	size_t len;
>> };
> 
> So my plan was always to turn the bio_vec into that structure, since
> before you came up with the phyr name.  But that's really a separate
> discussion as we might as well support multiple input formats if we
> really have to.
> 
>> Our output structure can continue being called the scatterlist, but
>> it needs to go on a diet and look more like:
>>
>> struct scatterlist {
>> 	dma_addr_t dma_address;
>> 	size_t dma_length;
>> };
> 
> I called it a dma_vec in my years old proposal I can't find any more.
> 
>> Getting to this point is going to be a huge amount of work, and I need
>> to finish folios first.  Or somebody else can work on it ;-)
> 
> Well, we can stage this.  I wish I could find my old proposal about the
> dma_batch API (I remember Robin commented on it, maybe he is better at
> finding it than me).

Heh, the dirty secret is that Office 365 is surprisingly effective at 
searching 9 years worth of email I haven't deleted :)

https://lore.kernel.org/linux-iommu/79926b59-0eb9-2b88-b1bb-1bd472b10370@arm.com/

>  I think that mostly still stands, independent
> of the transformation of the input structure.  The basic idea is that
> we add a dma batching API, where you start a batch with one call,
> and then add new physically discontiguous vectors to it until
> it is full, and then finalize it.  Very similar to how the iommu API
> works internally.  We'd then only use this API if we actually have
> an iommu (or, if we want to be fancy, swiotlb, which could do the same
> linearization); for the direct map we'd still do the equivalent
> of dma_map_page for each element as we need one output vector per
> input vector anyway.

The other thing that's clear by now is that I think we definitely want 
distinct APIs for "please map this bunch of disjoint things" for true 
scatter-gather cases like biovecs where it's largely just convenient to 
keep them grouped together (but opportunistic merging might still be a 
bonus), vs. "please give me a linearised DMA mapping of these pages (and 
fail if you can't)" for the dma-buf style cases.
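
As a rough illustration of that split, the two entry points might differ
like this. This is a sketch under stated assumptions, not a proposed
interface: map_disjoint() and map_linear() are hypothetical names, the
mapping is modelled as an identity (no IOMMU), and "linearised" is reduced
to "the result is exactly one contiguous range".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types, following the shapes quoted earlier in the thread. */
struct phyr    { uint64_t start; size_t len; };
struct dma_vec { uint64_t dma_address; size_t dma_length; };

/*
 * "Please map this bunch of disjoint things": always succeeds, and
 * merges neighbours that happen to be physically adjacent.
 * Returns the number of output vectors written (at most n).
 */
size_t map_disjoint(const struct phyr *in, size_t n, struct dma_vec *out)
{
	size_t nout = 0;

	for (size_t i = 0; i < n; i++) {
		if (nout &&
		    out[nout - 1].dma_address + out[nout - 1].dma_length == in[i].start) {
			out[nout - 1].dma_length += in[i].len;  /* opportunistic merge */
			continue;
		}
		out[nout].dma_address = in[i].start;  /* identity mapping assumed */
		out[nout].dma_length = in[i].len;
		nout++;
	}
	return nout;
}

/*
 * "Please give me a linearised mapping (and fail if you can't)":
 * succeeds only if the whole input collapses into one contiguous range.
 * Returns 0 on success, -1 otherwise.
 */
int map_linear(const struct phyr *in, size_t n, struct dma_vec *out)
{
	if (n == 0)
		return -1;

	out->dma_address = in[0].start;
	out->dma_length = in[0].len;
	for (size_t i = 1; i < n; i++) {
		if (out->dma_address + out->dma_length != in[i].start)
			return -1;  /* gap: cannot linearise in this model */
		out->dma_length += in[i].len;
	}
	return 0;
}
```

The difference a caller sees is in the contract: the first call treats
merging as a bonus and never fails because of layout, while the second
treats a single contiguous result as a requirement.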

Cheers,
Robin.

> As Jason pointed out the only fancy implementation we need for now
> is the IOMMU API.  arm32 and powerpc will need to do the work
> to convert to it or do their own work.


* Re: [PATCH RFC 1/9] dma-debug: Fix a typo in a debugging eye-catcher
  2023-10-20  4:49   ` Christoph Hellwig
@ 2023-10-20 13:38     ` Chuck Lever III
  2023-10-23  5:56       ` Christoph Hellwig
  0 siblings, 1 reply; 35+ messages in thread
From: Chuck Lever III @ 2023-10-20 13:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chuck Lever, Marek Szyprowski, Robin Murphy,
	iommu@lists.linux.dev



> On Oct 20, 2023, at 12:49 AM, Christoph Hellwig <hch@lst.de> wrote:
> 
> Thanks,
> 
> I'll add this to the dma-mapping tree.
> 
> FYI, I only got patches 1, 2 and the cover letter.  Something seems to
> be broken in your mailer setup.

I think you were explicitly copied on only the first two.
I assumed you would see the others on the mailing lists.

I used information in MAINTAINERS for the files and
subsystems those patches modified. Is that information
up to date?


--
Chuck Lever




* Re: [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
  2023-10-19 23:21     ` Chuck Lever III
@ 2023-10-23  2:43       ` Liu, Yujie
  2023-10-23 14:27         ` Chuck Lever III
  0 siblings, 1 reply; 35+ messages in thread
From: Liu, Yujie @ 2023-10-23  2:43 UTC (permalink / raw)
  To: chuck.lever@oracle.com; +Cc: cel@kernel.org, oe-kbuild-all@lists.linux.dev, lkp

Hi Chuck,

On Thu, 2023-10-19 at 23:21 +0000, Chuck Lever III wrote:
> 
> 
> > On Oct 19, 2023, at 5:38 PM, kernel test robot <lkp@intel.com> wrote:
> > 
> > Hi Chuck,
> > 
> > [This is a private test report for your RFC patch.]
> > kernel test robot noticed the following build warnings:
> > 
> > [auto build test WARNING on akpm-mm/mm-everything]
> > [also build test WARNING on rdma/for-next linus/master v6.6-rc6 next-20231019]
> > [cannot apply to joro-iommu/next]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch#_base_tree_information]
> > 
> > url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/dma-debug-Fix-a-typo-in-a-debugging-eye-catcher/20231020-032859
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> > patch link:    https://lore.kernel.org/r/169772915215.5232.10127407258544978465.stgit%40klimt.1015granger.net
> > patch subject: [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
> > config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20231020/202310200545.ScAzFYdK-lkp@intel.com/config)
> > compiler: m68k-linux-gcc (GCC) 13.2.0
> > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231020/202310200545.ScAzFYdK-lkp@intel.com/reproduce)
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > > Reported-by: kernel test robot <lkp@intel.com>
> > > Closes: https://lore.kernel.org/oe-kbuild-all/202310200545.ScAzFYdK-lkp@intel.com/
> 
> Fwiw, this was an RFC series. Reviewers have already rejected
> this approach, so I won't be posting another version of this.
> 
> Is there a way to flush out the test pipeline so no more tests
> are run on the series?

Sorry we are not able to flush out this patch series from the bot after
the test was triggered. We are not sure if there will be follow-up
reports for this series, but the reports for RFC patches are sent to
you privately and won't go to the mailing list. Please kindly ignore
them if the series is obsolete. Sorry for any inconvenience.

Best Regards,
Yujie


* Re: [PATCH RFC 1/9] dma-debug: Fix a typo in a debugging eye-catcher
  2023-10-20 13:38     ` Chuck Lever III
@ 2023-10-23  5:56       ` Christoph Hellwig
  0 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2023-10-23  5:56 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Christoph Hellwig, Chuck Lever, Marek Szyprowski, Robin Murphy,
	iommu@lists.linux.dev

On Fri, Oct 20, 2023 at 01:38:01PM +0000, Chuck Lever III wrote:
> I think you were explicitly copied only the first two.
> I assumed you would see the others on the mailing lists.
> 
> I used information in MAINTAINERS for the files and
> subsystems those patches modified. Is that information
> up to date?

You must always send the entire series to all recipients.  Without
that proper review is impossible.


* Re: [PATCH RFC 0/9] Exploring biovec support in (R)DMA API
  2023-10-20 10:30     ` Robin Murphy
@ 2023-10-23  5:59       ` Christoph Hellwig
  0 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2023-10-23  5:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Matthew Wilcox, Chuck Lever, Marek Szyprowski,
	Chuck Lever, Alexander Potapenko, linux-mm, linux-rdma,
	Jens Axboe, kasan-dev, David Howells, iommu

On Fri, Oct 20, 2023 at 11:30:06AM +0100, Robin Murphy wrote:
>> Well, we can stage this.  I wish I could find my old proposal about the
>> dma_batch API (I remember Robin commented on it, maybe he is better at
>> finding it than me).
>
> Heh, the dirty secret is that Office 365 is surprisingly effective at 
> searching 9 years worth of email I haven't deleted :)
>
> https://lore.kernel.org/linux-iommu/79926b59-0eb9-2b88-b1bb-1bd472b10370@arm.com/

Perfect, thanks!

> The other thing that's clear by now is that I think we definitely want 
> distinct APIs for "please map this bunch of disjoint things" for true 
> scatter-gather cases like biovecs where it's largely just convenient to 
> keep them grouped together (but opportunistic merging might still be a 
> bonus), vs. "please give me a linearised DMA mapping of these pages (and 
> fail if you can't)" for the dma-buf style cases.

Hmm, I'm not sure I agree.  For both the iommu and swiotlb case we
get the linear mapping for free with small limitations:

 - for the iommu case the alignment needs to be a multiple of the iommu
   page size
 - for swiotlb the size of each mapping is very limited

If these conditions are met we can linearize for free, if they aren't
we can't linearize at all.
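
One way to read the iommu condition is sketched below. It assumes that an
IOMMU maps whole pages of a power-of-two size (the "granule"), so ranges
can only be placed back to back in IOVA space if every boundary between
two neighbouring ranges falls on a granule boundary; only the start of the
first range and the end of the last may be unaligned. The function name
iommu_can_linearize() and struct phyr are hypothetical, and the swiotlb
size limit is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical input type, following the shape quoted earlier in the thread. */
struct phyr { uint64_t start; size_t len; };

/*
 * Can these ranges be laid out back to back in one IOVA range?
 * `granule` is the IOMMU page size and must be a power of two.
 */
bool iommu_can_linearize(const struct phyr *v, size_t n, uint64_t granule)
{
	uint64_t mask = granule - 1;

	for (size_t i = 0; i < n; i++) {
		/* Every range except the first must start on a granule boundary. */
		if (i > 0 && (v[i].start & mask))
			return false;
		/* Every range except the last must end on a granule boundary. */
		if (i + 1 < n && ((v[i].start + v[i].len) & mask))
			return false;
	}
	return true;
}
```

Under this reading the check costs one pass over the input and no extra
mapping work, which matches the "for free, or not at all" framing above.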

But maybe I'm missing something?



* Re: [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
  2023-10-23  2:43       ` Liu, Yujie
@ 2023-10-23 14:27         ` Chuck Lever III
  0 siblings, 0 replies; 35+ messages in thread
From: Chuck Lever III @ 2023-10-23 14:27 UTC (permalink / raw)
  To: Liu, Yujie; +Cc: Chuck Lever, oe-kbuild-all@lists.linux.dev, lkp



> On Oct 22, 2023, at 10:43 PM, Liu, Yujie <yujie.liu@intel.com> wrote:
> 
> Hi Chuck,
> 
> On Thu, 2023-10-19 at 23:21 +0000, Chuck Lever III wrote:
>> 
>> 
>>> On Oct 19, 2023, at 5:38 PM, kernel test robot <lkp@intel.com> wrote:
>>> 
>>> Hi Chuck,
>>> 
>>> [This is a private test report for your RFC patch.]
>>> kernel test robot noticed the following build warnings:
>>> 
>>> [auto build test WARNING on akpm-mm/mm-everything]
>>> [also build test WARNING on rdma/for-next linus/master v6.6-rc6 next-20231019]
>>> [cannot apply to joro-iommu/next]
>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>>> And when submitting patch, we suggest to use '--base' as documented in
>>> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>>> 
>>> url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/dma-debug-Fix-a-typo-in-a-debugging-eye-catcher/20231020-032859
>>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
>>> patch link:    https://lore.kernel.org/r/169772915215.5232.10127407258544978465.stgit%40klimt.1015granger.net
>>> patch subject: [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays
>>> config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20231020/202310200545.ScAzFYdK-lkp@intel.com/config)
>>> compiler: m68k-linux-gcc (GCC) 13.2.0
>>> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231020/202310200545.ScAzFYdK-lkp@intel.com/reproduce)
>>> 
>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>> the same patch/commit), kindly add following tags
>>>> Reported-by: kernel test robot <lkp@intel.com>
>>>> Closes: https://lore.kernel.org/oe-kbuild-all/202310200545.ScAzFYdK-lkp@intel.com/
>> 
>> Fwiw, this was an RFC series. Reviewers have already rejected
>> this approach, so I won't be posting another version of this.
>> 
>> Is there a way to flush out the test pipeline so no more tests
>> are run on the series?
> 
> Sorry we are not able to flush out this patch series from the bot after
> the test was triggered. We are not sure if there will be follow-up
> reports for this series, but the reports for RFC patches are sent to
> you privately and won't go to the mailing list. Please kindly ignore
> them if the series is obsolete. Sorry for any inconvenience.

No inconvenience... but wanted to make sure that these were not counted
as regressions, and also that if you have a database looking for the
reported issues to be addressed, there would be no fix forthcoming ;-)


--
Chuck Lever




end of thread, other threads:[~2023-10-23 14:27 UTC | newest]

Thread overview: 35+ messages
2023-10-19 15:25 [PATCH RFC 0/9] Exploring biovec support in (R)DMA API Chuck Lever
2023-10-19 15:25 ` Chuck Lever
2023-10-19 15:25 ` [PATCH RFC 1/9] dma-debug: Fix a typo in a debugging eye-catcher Chuck Lever
2023-10-20  4:49   ` Christoph Hellwig
2023-10-20 13:38     ` Chuck Lever III
2023-10-23  5:56       ` Christoph Hellwig
2023-10-19 15:25 ` [PATCH RFC 2/9] bvec: Add bio_vec fields to manage DMA mapping Chuck Lever
2023-10-19 15:25   ` Chuck Lever
2023-10-19 15:25 ` [PATCH RFC 3/9] dma-debug: Add dma_debug_ helpers for mapping bio_vec arrays Chuck Lever
2023-10-19 15:25   ` Chuck Lever
2023-10-19 21:38   ` kernel test robot
2023-10-19 23:21     ` Chuck Lever III
2023-10-23  2:43       ` Liu, Yujie
2023-10-23 14:27         ` Chuck Lever III
2023-10-19 21:49   ` kernel test robot
2023-10-19 15:25 ` [PATCH RFC 4/9] mm: kmsan: Add support for DMA " Chuck Lever
2023-10-19 15:25   ` Chuck Lever
2023-10-19 15:26 ` [PATCH RFC 5/9] dma-direct: Support direct " Chuck Lever
2023-10-19 15:26   ` Chuck Lever
2023-10-19 15:26 ` [PATCH RFC 6/9] DMA-API: Add dma_sync_bvecs_for_cpu() and dma_sync_bvecs_for_device() Chuck Lever
2023-10-19 15:26   ` Chuck Lever
2023-10-19 15:26 ` [PATCH RFC 7/9] DMA: Add dma_map_bvecs_attrs() Chuck Lever
2023-10-19 15:26   ` Chuck Lever
2023-10-19 22:10   ` kernel test robot
2023-10-19 15:26 ` [PATCH RFC 8/9] iommu/dma: Support DMA-mapping a bio_vec array Chuck Lever
2023-10-19 15:26   ` Chuck Lever
2023-10-19 15:26 ` [PATCH RFC 9/9] RDMA: Add helpers for DMA-mapping an array of bio_vecs Chuck Lever
2023-10-19 15:26   ` Chuck Lever
2023-10-19 15:53 ` [PATCH RFC 0/9] Exploring biovec support in (R)DMA API Matthew Wilcox
2023-10-19 17:48   ` Chuck Lever
2023-10-20  4:58   ` Christoph Hellwig
2023-10-20 10:30     ` Robin Murphy
2023-10-23  5:59       ` Christoph Hellwig
2023-10-19 16:43 ` Robin Murphy
2023-10-19 17:53   ` Jason Gunthorpe
