[RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects

dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed

* [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects
@ 2025-10-27  4:44 Vivek Kasireddy
  2025-10-27  4:44 ` [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for interconnects Vivek Kasireddy
                   ` (8 more replies)
  0 siblings, 9 replies; 24+ messages in thread
From: Vivek Kasireddy @ 2025-10-27  4:44 UTC (permalink / raw)
  To: dri-devel, intel-xe, linux-media, linaro-mm-sig
  Cc: Vivek Kasireddy, Jason Gunthorpe, Leon Romanovsky,
	Christian Koenig, Sumit Semwal, Thomas Hellström,
	Simona Vetter, Matthew Brost, Dongwon Kim

In a typical dma-buf use case, a dmabuf exporter makes its buffer
buffer available to an importer by mapping it using DMA APIs
such as dma_map_sgtable() or dma_map_resource(). However, this
is not desirable in some cases where the exporter and importer
are directly connected via a physical or virtual link (or
interconnect) and the importer can access the buffer without
having it DMA mapped.

So, to address this scenario, this patch series adds APIs to map/
unmap dmabufs via interconnects and also provides a helper to
identify the first common interconnect between the exporter and
importer. Furthermore, this patch series also adds support for
IOV interconnect in the vfio-pci driver and Intel Xe driver.

The IOV interconnect is a virtual interconnect between an SRIOV
physical function (PF) and its virtual functions (VFs). And, for
the IOV interconnect, the addresses associated with a buffer are
shared using an xarray (instead of an sg_table) that is populated
with entries of type struct range. 

The dma-buf patches in this series are based on ideas/suggestions
provided by Jason Gunthorpe, Christian Koenig and Thomas Hellström.

Changelog:
RFC -> RFCv2:
- Add documentation for the new dma-buf APIs and types (Thomas)
- Change the interconnect type from enum to unique pointer (Thomas)
- Moved the new dma-buf APIs to a separate file
- Store a copy of the interconnect matching data in the attachment
- Simplified the macros to create and match interconnects
- Use struct device instead of struct pci_dev in match data
- Replace DRM_INTERCONNECT_DRIVER with XE_INTERCONNECT_VRAM during
  address encoding (Matt, Thomas)
- Drop is_devmem_external and instead rely on bo->dma_data.dma_addr
  to check for imported VRAM BOs (Matt)
- Pass XE_PAGE_SIZE as the last parameter to xe_bo_addr (Matt)
- Add a check to prevent malicious VF from accessing other VF's
  addresses (Thomas)
- Fallback to legacy (map_dma_buf) mapping method if mapping via
  interconnect fails

Patchset overview:
Patch 1-3: Add dma-buf APIs to map/unmap and match
Patch 4: Add support for IOV interconnect in vfio-pci driver
Patch 5: Add support for IOV interconnect in Xe driver
Patch 6-8: Create and use a new dma_addr array for LMEM based
           dmabuf BOs to store translated addresses (DPAs)

This series is rebased on top of the following repo:
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=dmabuf-vfio-v5

Associated Qemu patch series:
https://lore.kernel.org/qemu-devel/20251003234138.85820-1-vivek.kasireddy@intel.com/
Associated vfio-pci patch series:
https://lore.kernel.org/dri-devel/cover.1760368250.git.leon@kernel.org/

This series is tested using the following method:
- Run Qemu with the following relevant options:
  qemu-system-x86_64 -m 4096m ....
  -device ioh3420,id=root_port1,bus=pcie.0
  -device x3130-upstream,id=upstream1,bus=root_port1
  -device xio3130-downstream,id=downstream1,bus=upstream1,chassis=9
  -device xio3130-downstream,id=downstream2,bus=upstream1,chassis=10
  -device vfio-pci,host=0000:03:00.1,bus=downstream1
  -device virtio-gpu,max_outputs=1,blob=true,xres=1920,yres=1080,bus=downstream2
  -display gtk,gl=on
  -object memory-backend-memfd,id=mem1,size=4096M
  -machine q35,accel=kvm,memory-backend=mem1 ...
- Run Gnome Wayland with the following options in the Guest VM:
  # cat /usr/lib/udev/rules.d/61-mutter-primary-gpu.rules
  ENV{DEVNAME}=="/dev/dri/card1", TAG+="mutter-device-preferred-primary", TAG+="mutter-device-disable-kms-modifiers"
  # XDG_SESSION_TYPE=wayland dbus-run-session -- /usr/bin/gnome-shell --wayland --no-x11 &

Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>

Vivek Kasireddy (8):
  dma-buf: Add support for map/unmap APIs for interconnects
  dma-buf: Add a helper to match interconnects between exporter/importer
  dma-buf: Create and expose IOV interconnect to all exporters/importers
  vfio/pci/dmabuf: Add support for IOV interconnect
  drm/xe/dma_buf: Add support for IOV interconnect
  drm/xe/pf: Add a helper function to get a VF's backing object in LMEM
  drm/xe/bo: Create new dma_addr array for dmabuf BOs associated with
    VFs
  drm/xe/pt: Add an additional check for dmabuf BOs while doing bind

 drivers/dma-buf/Makefile                   |   2 +-
 drivers/dma-buf/dma-buf-interconnect.c     | 164 +++++++++++++++++++++
 drivers/dma-buf/dma-buf.c                  |  12 +-
 drivers/gpu/drm/xe/xe_bo.c                 | 162 ++++++++++++++++++--
 drivers/gpu/drm/xe/xe_bo_types.h           |   6 +
 drivers/gpu/drm/xe/xe_dma_buf.c            |  17 ++-
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c |  24 +++
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h |   1 +
 drivers/gpu/drm/xe/xe_pt.c                 |   8 +
 drivers/gpu/drm/xe/xe_sriov_pf_types.h     |  19 +++
 drivers/vfio/pci/vfio_pci_dmabuf.c         | 135 ++++++++++++++++-
 include/linux/dma-buf-interconnect.h       | 122 +++++++++++++++
 include/linux/dma-buf.h                    |  41 ++++++
 13 files changed, 691 insertions(+), 22 deletions(-)
 create mode 100644 drivers/dma-buf/dma-buf-interconnect.c
 create mode 100644 include/linux/dma-buf-interconnect.h

-- 
2.50.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for interconnects
  2025-10-27  4:44 [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Vivek Kasireddy
@ 2025-10-27  4:44 ` Vivek Kasireddy
  2025-10-27 17:47   ` Jason Gunthorpe
  2025-10-27  4:44 ` [RFC v2 2/8] dma-buf: Add a helper to match interconnects between exporter/importer Vivek Kasireddy
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Vivek Kasireddy @ 2025-10-27  4:44 UTC (permalink / raw)
  To: dri-devel, intel-xe, linux-media, linaro-mm-sig
  Cc: Vivek Kasireddy, Jason Gunthorpe, Christian Koenig, Sumit Semwal,
	Thomas Hellström, Simona Vetter

For the map operation, the dma-buf core will create an xarray but
the exporter needs to populate it with the interconnect specific
addresses. And, similarly for unmap, the exporter is expected to
cleanup the individual entries of the xarray.

Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 drivers/dma-buf/Makefile               |  2 +-
 drivers/dma-buf/dma-buf-interconnect.c | 96 ++++++++++++++++++++++++++
 drivers/dma-buf/dma-buf.c              |  6 --
 include/linux/dma-buf-interconnect.h   | 79 +++++++++++++++++++++
 include/linux/dma-buf.h                | 27 ++++++++
 5 files changed, 203 insertions(+), 7 deletions(-)
 create mode 100644 drivers/dma-buf/dma-buf-interconnect.c
 create mode 100644 include/linux/dma-buf-interconnect.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 70ec901edf2c..fff39b973f28 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
-	 dma-fence-unwrap.o dma-resv.o
+	 dma-fence-unwrap.o dma-resv.o dma-buf-interconnect.o
 obj-$(CONFIG_DMABUF_HEAPS)	+= dma-heap.o
 obj-$(CONFIG_DMABUF_HEAPS)	+= heaps/
 obj-$(CONFIG_SYNC_FILE)		+= sync_file.o
diff --git a/drivers/dma-buf/dma-buf-interconnect.c b/drivers/dma-buf/dma-buf-interconnect.c
new file mode 100644
index 000000000000..690423b6682f
--- /dev/null
+++ b/drivers/dma-buf/dma-buf-interconnect.c
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: MIT */
+
+#include <linux/dma-buf.h>
+#include <linux/dma-resv.h>
+
+/**
+ * dma_buf_map_interconnect - Returns the xarray wrapped in dma_buf_ranges
+ * that contains the buffer addresses that the importer would be able to use.
+ * It is a wrapper for map_interconnect() of the dma_buf_ops.
+ * @attach:	[in]	attachment whose xarray is to be returned
+ *
+ * On success, the buffer addresses are returned in a type that is specific
+ * to the interconnect implementation. For example, for IOV interconnect,
+ * struct range is the type used to represent the buffer addresses.
+ * On failure, appropriate ERR_PTR is returned or -EOPNOTSUPP if the importer
+ * is not allowed to use interconnect mappings.
+ *
+ * The importer must eventually call dma_buf_unmap_interconnect() after it is
+ * done using the buffer addresses. Note that, only dynamic importers are
+ * allowed to use this interface.
+ */
+struct dma_buf_ranges *
+dma_buf_map_interconnect(struct dma_buf_attachment *attach)
+{
+	const struct dma_buf_interconnect_ops *ic_ops;
+	struct dma_buf *dmabuf = attach->dmabuf;
+	struct dma_buf_ranges *ranges;
+	int ret;
+
+	might_sleep();
+
+	if (WARN_ON(!attach || !attach->dmabuf))
+		return ERR_PTR(-EINVAL);
+
+	if (!attach->allow_ic)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	dma_resv_assert_held(attach->dmabuf->resv);
+
+	if (!dma_buf_attachment_is_dynamic(attach))
+		return ERR_PTR(-EINVAL);
+
+	ic_ops = dmabuf->ops->interconnect_ops;
+	if (!ic_ops || !ic_ops->map_interconnect)
+		return ERR_PTR(-EINVAL);
+
+	ranges = kzalloc(sizeof(*ranges), GFP_KERNEL);
+	if (!ranges)
+		return ERR_PTR(-ENOMEM);
+
+	xa_init(&ranges->ranges);
+	ret = ic_ops->map_interconnect(attach, ranges);
+	if (ret)
+		goto err_free_ranges;
+
+	return ranges;
+
+err_free_ranges:
+	xa_destroy(&ranges->ranges);
+	kfree(ranges);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_NS_GPL(dma_buf_map_interconnect, "DMA_BUF");
+
+/**
+ * dma_buf_unmap_interconnect - destroys the xarray specific to this attachment
+ * and the importer. It is a wrapper for unmap_interconnect() of dma_buf_ops.
+ * @attach:	[in]	attachment to destroy xarray from
+ * @ranges:	[in]	dma_buf_ranges that contains the xarray to be destroyed
+ *
+ * This destroys the xarray that was created by dma_buf_map_interconnect().
+ */
+void dma_buf_unmap_interconnect(struct dma_buf_attachment *attach,
+				struct dma_buf_ranges *ranges)
+{
+	const struct dma_buf_interconnect_ops *ic_ops;
+	struct dma_buf *dmabuf = attach->dmabuf;
+
+	if (WARN_ON(!attach || !attach->dmabuf || !ranges))
+		return;
+
+	if (!attach->allow_ic)
+		return;
+
+	ic_ops = dmabuf->ops->interconnect_ops;
+	if (!ic_ops || !ic_ops->unmap_interconnect)
+		return;
+
+	dma_resv_assert_held(attach->dmabuf->resv);
+
+	ic_ops->unmap_interconnect(attach, ranges);
+
+	xa_destroy(&ranges->ranges);
+	kfree(ranges);
+}
+EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_interconnect, "DMA_BUF");
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 2bcf9ceca997..daa993503052 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -845,12 +845,6 @@ static void mangle_sg_table(struct sg_table *sg_table)
 
 }
 
-static inline bool
-dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
-{
-	return !!attach->importer_ops;
-}
-
 static bool
 dma_buf_pin_on_map(struct dma_buf_attachment *attach)
 {
diff --git a/include/linux/dma-buf-interconnect.h b/include/linux/dma-buf-interconnect.h
new file mode 100644
index 000000000000..50fc7a8272ce
--- /dev/null
+++ b/include/linux/dma-buf-interconnect.h
@@ -0,0 +1,79 @@
+/* SPDX-License-Identifier: MIT */
+
+#ifndef __DMA_BUF_INTERCONNECT_H__
+#define __DMA_BUF_INTERCONNECT_H__
+
+#include <linux/xarray.h>
+
+#define CREATE_INTERCONNECT(type)					       \
+	static const struct dma_buf_interconnect __##type##_interconnect = {   \
+		.name = #type"_interconnect",				       \
+	};								       \
+	const struct dma_buf_interconnect *type##_interconnect =	       \
+		&__##type##_interconnect;   				       \
+
+struct dma_buf_attachment;
+
+/**
+ * struct dma_buf_interconnect - holds info associated with an interconnect
+ * @name: name of the interconnect.
+ *
+ * The exporter is expected to use CREATE_INTERCONNECT() macro to create a
+ * unique instance of this structure for each interconnect type it supports.
+ */
+struct dma_buf_interconnect {
+	const char *name;
+};
+
+/**
+ * struct dma_buf_ranges - holds info about interconnect address ranges
+ * @ranges: xarray that contains the address ranges
+ * @nranges: total number of ranges populated in the xarray
+ *
+ * The exporter is expected to populate this structure with xarray entries
+ * of type specific to the interconnect that would contain the address ranges
+ * associated with the shared buffer.
+ */
+struct dma_buf_ranges {
+	struct xarray ranges;
+	unsigned int nranges;
+};
+
+/**
+ * struct dma_buf_interconnect_ops - operations for using dma-buf interconnects
+ *
+ * These operations would be implemented by the exporter.
+ */
+struct dma_buf_interconnect_ops {
+	/**
+	 * @map_interconnect:
+	 *
+	 * This is called by dma_buf_map_interconnect() and is used to fill an
+	 * xarray with addresses wrapped in type specific to the interconnect
+	 * for the given attachment. It can only be called if @attach->allow_ic
+	 * has been set to true.
+	 *
+	 * Returns:
+	 *
+	 * A zero is returned on success, which means that the xarray is
+	 * successfully populated with addresses for all ranges.
+	 * On failure, a negative error value is returned.
+	 *
+	 * Note that only dynamic importers are expected to use this interface.
+	 */
+	int (*map_interconnect)(struct dma_buf_attachment *attach,
+				struct dma_buf_ranges *ranges);
+	/**
+	 * @unmap_interconnect:
+	 *
+	 * This is called by dma_buf_unmap_interconnect() and is used to clean
+	 * up the xarray entries allocated in @map_interconnect.
+	 */
+	void (*unmap_interconnect)(struct dma_buf_attachment *attach,
+				   struct dma_buf_ranges *ranges);
+};
+
+struct dma_buf_ranges *dma_buf_map_interconnect(struct dma_buf_attachment *);
+void dma_buf_unmap_interconnect(struct dma_buf_attachment *,
+				struct dma_buf_ranges *);
+#endif
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index d58e329ac0e7..a675bc89a69c 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -23,6 +23,8 @@
 #include <linux/dma-fence.h>
 #include <linux/wait.h>
 
+#include <linux/dma-buf-interconnect.h>
+
 struct device;
 struct dma_buf;
 struct dma_buf_attachment;
@@ -276,6 +278,16 @@ struct dma_buf_ops {
 
 	int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map);
 	void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
+
+	/**
+	 * @interconnect_ops:
+	 *
+	 * This contains the operations required for supporting dma-buf
+	 * interconnects for the exporter.
+	 *
+	 * This interface is optional.
+	 */
+	const struct dma_buf_interconnect_ops *interconnect_ops;
 };
 
 /**
@@ -483,6 +495,7 @@ struct dma_buf_attach_ops {
  * @dev: device attached to the buffer.
  * @node: list of dma_buf_attachment, protected by dma_resv lock of the dmabuf.
  * @peer2peer: true if the importer can handle peer resources without pages.
+ * @allow_ic: true if the importer is allowed to use interconnect ops.
  * @priv: exporter specific attachment data.
  * @importer_ops: importer operations for this attachment, if provided
  * dma_buf_map/unmap_attachment() must be called with the dma_resv lock held.
@@ -502,6 +515,7 @@ struct dma_buf_attachment {
 	struct device *dev;
 	struct list_head node;
 	bool peer2peer;
+	bool allow_ic;
 	const struct dma_buf_attach_ops *importer_ops;
 	void *importer_priv;
 	void *priv;
@@ -568,6 +582,19 @@ static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
 	return !!dmabuf->ops->pin;
 }
 
+/**
+ * dma_buf_attachment_is_dynamic - check if the importer can handle move_notify.
+ * @attach: the attachment to check
+ *
+ * Returns true if a DMA-buf importer has indicated that it can handle dmabuf
+ * location changes through the move_notify callback.
+ */
+static inline bool
+dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
+{
+	return !!attach->importer_ops;
+}
+
 struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
 					  struct device *dev);
 struct dma_buf_attachment *
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for interconnects
  2025-10-27  4:44 ` [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for interconnects Vivek Kasireddy
@ 2025-10-27 17:47   ` Jason Gunthorpe
  2025-10-28  5:39     ` Kasireddy, Vivek
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Gunthorpe @ 2025-10-27 17:47 UTC (permalink / raw)
  To: Vivek Kasireddy
  Cc: dri-devel, intel-xe, linux-media, linaro-mm-sig, Christian Koenig,
	Sumit Semwal, Thomas Hellström, Simona Vetter

On Sun, Oct 26, 2025 at 09:44:13PM -0700, Vivek Kasireddy wrote:
> For the map operation, the dma-buf core will create an xarray but
> the exporter needs to populate it with the interconnect specific
> addresses. And, similarly for unmap, the exporter is expected to
> cleanup the individual entries of the xarray.

I don't think we should limit this to xarrays, nor do I think it is a
great datastructure for what is usually needed here..

I just posted the patches showing what iommufd needs, and it wants
something like

struct mapping {
   struct p2p_provider *provider;
   size_t nelms;
   struct phys_vec *phys;
};

Which is not something that make sense as an xarray.

I think the interconnect should have its own functions for map/unmap,
ie instead of trying to have them as a commmon
dma_buf_interconnect_ops do something like

struct dma_buf_interconnect_ops {
        const char *name;
        bool (*supports_interconnects)(struct dma_buf_attachment *attach,
                                      const struct dma_buf_interconnect_match *,
                                      unsigned int num_ics);
};

struct dma_buf_iov_interconnect_ops {
     struct dma_buf_interconnect_ops ic_ops;
     struct xx *(*map)(struct dma_buf_attachment *attach,
     	 		   unsigned int *bar_number,
			   size_t *nelms);
     // No unmap for iov
};

static inline struct xx *dma_buf_iov_map(struct dma_buf_attachment *attach,
     	 		   unsigned int *bar_number,
			   size_t *nelms)
{
     return container_of(attach->ic_ops, struct dma_buf_iov_interconnect_ops, ic_ops)->map(
                 attach, bar_number, nelms));
}

> +/**
> + * dma_buf_attachment_is_dynamic - check if the importer can handle move_notify.
> + * @attach: the attachment to check
> + *
> + * Returns true if a DMA-buf importer has indicated that it can handle dmabuf
> + * location changes through the move_notify callback.
> + */
> +static inline bool
> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
> +{
> +	return !!attach->importer_ops;
> +}

Why is this in this patch?

I also think this patch should be second in the series, it makes more
sense to figure out how to attach with an interconnect then show how
to map/unmap with that interconnect

Like I'm not sure why this introduces allow_ic?

Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for interconnects
  2025-10-27 17:47   ` Jason Gunthorpe
@ 2025-10-28  5:39     ` Kasireddy, Vivek
  2025-10-28 12:21       ` Jason Gunthorpe
  0 siblings, 1 reply; 24+ messages in thread
From: Kasireddy, Vivek @ 2025-10-28  5:39 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
	Christian Koenig, Sumit Semwal, Thomas Hellström,
	Simona Vetter

Hi Jason,

> Subject: Re: [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for
> interconnects
> 
> On Sun, Oct 26, 2025 at 09:44:13PM -0700, Vivek Kasireddy wrote:
> > For the map operation, the dma-buf core will create an xarray but
> > the exporter needs to populate it with the interconnect specific
> > addresses. And, similarly for unmap, the exporter is expected to
> > cleanup the individual entries of the xarray.
> 
> I don't think we should limit this to xarrays, nor do I think it is a
> great datastructure for what is usually needed here..
One of the goals (as suggested by Christian) is to have a container that
can be used with an iterator. So, instead of creating a new data structure,
I figured using an xarray would make sense here. And, since the entries
of an xarray can be of any type, I think another advantage is that the
dma-buf core only needs to be aware of the xarray but the exporter can
use an interconnect specific type to populate the entries that the importer
would be aware of.

> 
> I just posted the patches showing what iommufd needs, and it wants
> something like
> 
> struct mapping {
>    struct p2p_provider *provider;
>    size_t nelms;
>    struct phys_vec *phys;
> };
> 
> Which is not something that make sense as an xarray.
If we do not want to use an xarray, I guess we can try to generalize the
struct that holds the addresses and any additional info (such as provider).
Would any of the following look OK to you:
struct dma_buf_mapping {
        struct phys_vec *phys;
        unsigned int nents;
        void *map_data;
};

Or

struct dma_buf_ranges {
        struct range *ranges;
        unsigned int nranges;
        void *ranges_data;
};

> 
> I think the interconnect should have its own functions for map/unmap,
> ie instead of trying to have them as a commmon
> dma_buf_interconnect_ops do something like
In my current design, the exporter would call the interconnect specific
map/unmap functions from its common map() callback. But I guess I can
try to implement and test your suggestions to see if they are more robust/elegant.

> 
> struct dma_buf_interconnect_ops {
>         const char *name;
>         bool (*supports_interconnects)(struct dma_buf_attachment *attach,
I have this as part of dma_buf_attach_ops for the importer but I'll explore your
idea in more detail.

>                                       const struct dma_buf_interconnect_match *,
>                                       unsigned int num_ics);
> };
> 
> struct dma_buf_iov_interconnect_ops {
>      struct dma_buf_interconnect_ops ic_ops;
>      struct xx *(*map)(struct dma_buf_attachment *attach,
Do we want each specific interconnect to have its own return type for map?

>      	 		   unsigned int *bar_number,
> 			   size_t *nelms);
>      // No unmap for iov
> };
> 
> static inline struct xx *dma_buf_iov_map(struct dma_buf_attachment
> *attach,
>      	 		   unsigned int *bar_number,
> 			   size_t *nelms)
> {
>      return container_of(attach->ic_ops, struct dma_buf_iov_interconnect_ops,
> ic_ops)->map(
>                  attach, bar_number, nelms));
> }
> 
> > +/**
> > + * dma_buf_attachment_is_dynamic - check if the importer can handle
> move_notify.
> > + * @attach: the attachment to check
> > + *
> > + * Returns true if a DMA-buf importer has indicated that it can handle
> dmabuf
> > + * location changes through the move_notify callback.
> > + */
> > +static inline bool
> > +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
> > +{
> > +	return !!attach->importer_ops;
> > +}
> 
> Why is this in this patch?
I figured it makes sense to limit map/unmap interconnect ops to dynamic
importers (that register a move_notify callback) only. I guess I could move the
above change into a separate patch.

> 
> I also think this patch should be second in the series, it makes more
> sense to figure out how to attach with an interconnect then show how
> to map/unmap with that interconnect
> 
> Like I'm not sure why this introduces allow_ic?
Ok, I'll move it to the other patch that introduces dma_buf_match_interconnects().

Thanks,
Vivek

> 
> Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for interconnects
  2025-10-28  5:39     ` Kasireddy, Vivek
@ 2025-10-28 12:21       ` Jason Gunthorpe
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Gunthorpe @ 2025-10-28 12:21 UTC (permalink / raw)
  To: Kasireddy, Vivek
  Cc: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
	Christian Koenig, Sumit Semwal, Thomas Hellström,
	Simona Vetter

On Tue, Oct 28, 2025 at 05:39:39AM +0000, Kasireddy, Vivek wrote:
> Hi Jason,
> 
> > Subject: Re: [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for
> > interconnects
> > 
> > On Sun, Oct 26, 2025 at 09:44:13PM -0700, Vivek Kasireddy wrote:
> > > For the map operation, the dma-buf core will create an xarray but
> > > the exporter needs to populate it with the interconnect specific
> > > addresses. And, similarly for unmap, the exporter is expected to
> > > cleanup the individual entries of the xarray.
> > 
> > I don't think we should limit this to xarrays, nor do I think it is a
> > great datastructure for what is usually needed here..
> One of the goals (as suggested by Christian) is to have a container that
> can be used with an iterator.

I thought Christian was suggesting to avoid the container and have
some kind of iterator?

> So, instead of creating a new data structure,
> I figured using an xarray would make sense here. And, since the entries
> of an xarray can be of any type, I think another advantage is that the
> dma-buf core only needs to be aware of the xarray but the exporter can
> use an interconnect specific type to populate the entries that the importer
> would be aware of.

It is excessively memory wasteful.

> > I just posted the patches showing what iommufd needs, and it wants
> > something like
> > 
> > struct mapping {
> >    struct p2p_provider *provider;
> >    size_t nelms;
> >    struct phys_vec *phys;
> > };
> > 
> > Which is not something that make sense as an xarray.
> If we do not want to use an xarray, I guess we can try to generalize the
> struct that holds the addresses and any additional info (such as provider).
> Would any of the following look OK to you:

I think just don't try to have a general struct, it is not required
once we have interconnects. Each interconnect can define what makes
sense for it.

> struct dma_buf_ranges {
>         struct range *ranges;
>         unsigned int nranges;
>         void *ranges_data;
> };

Like this is just pointless, it destroys type safety for no benifit.

> > struct dma_buf_iov_interconnect_ops {
> >      struct dma_buf_interconnect_ops ic_ops;
> >      struct xx *(*map)(struct dma_buf_attachment *attach,
> Do we want each specific interconnect to have its own return type for map?

I think yes, then you have type safety and so on. The types should all
be different. We need to get away from using dma_addr_t or phys_addr_t
for something that is not in those address spaces.

Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC v2 2/8] dma-buf: Add a helper to match interconnects between exporter/importer
  2025-10-27  4:44 [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Vivek Kasireddy
  2025-10-27  4:44 ` [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for interconnects Vivek Kasireddy
@ 2025-10-27  4:44 ` Vivek Kasireddy
  2025-10-27 18:18   ` Jason Gunthorpe
  2025-10-27  4:44 ` [RFC v2 3/8] dma-buf: Create and expose IOV interconnect to all exporters/importers Vivek Kasireddy
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Vivek Kasireddy @ 2025-10-27  4:44 UTC (permalink / raw)
  To: dri-devel, intel-xe, linux-media, linaro-mm-sig
  Cc: Vivek Kasireddy, Jason Gunthorpe, Christian Koenig, Sumit Semwal,
	Thomas Hellström, Simona Vetter

If the importer provides a callback for supports_interconnects(),
the exporter starts the matching (or negotiation) process (during
attach) by invoking the supports_interconnects() callback which
would then call this helper to identify the first common
interconnect supported by both exporter and importer.

Note that whether an interconnect is supported between an
exporter/importer is ultimately determined by the exporter via
the match callback it is expected to provide.

Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 drivers/dma-buf/dma-buf-interconnect.c | 65 ++++++++++++++++++++++++++
 drivers/dma-buf/dma-buf.c              |  6 ++-
 include/linux/dma-buf-interconnect.h   | 36 ++++++++++++++
 include/linux/dma-buf.h                | 14 ++++++
 4 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/drivers/dma-buf/dma-buf-interconnect.c b/drivers/dma-buf/dma-buf-interconnect.c
index 690423b6682f..12db77e6b9f1 100644
--- a/drivers/dma-buf/dma-buf-interconnect.c
+++ b/drivers/dma-buf/dma-buf-interconnect.c
@@ -94,3 +94,68 @@ void dma_buf_unmap_interconnect(struct dma_buf_attachment *attach,
 	kfree(ranges);
 }
 EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_interconnect, "DMA_BUF");
+
+/**
+ * dma_buf_match_interconnects - determine if there is a specific interconnect
+ * that is supported by both exporter and importer.
+ * @attach:	[in]	attachment to populate ic_match field
+ * @exp:	[in]	array of interconnects supported by exporter
+ * @exp_ics:	[in]	number of interconnects supported by exporter
+ * @imp:	[in]	array of interconnects supported by importer
+ * @imp_ics:	[in]	number of interconnects supported by importer
+ *
+ * This helper function iterates through the list interconnects supported by
+ * both exporter and importer to find a match. A successful match means that
+ * a common interconnect type is supported by both parties and the exporter's
+ * match_interconnect() callback also confirms that the importer is compatible
+ * with the exporter for that interconnect type.
+ *
+ * If a match is found, the attach->ic_match field is populated with a copy
+ * of the exporter's match data.
+ * Return: true if a match is found, false otherwise.
+ */
+bool dma_buf_match_interconnects(struct dma_buf_attachment *attach,
+				 const struct dma_buf_interconnect_match *exp,
+				 unsigned int exp_ics,
+				 const struct dma_buf_interconnect_match *imp,
+				 unsigned int imp_ics)
+{
+	const struct dma_buf_interconnect_ops *ic_ops;
+	struct dma_buf_interconnect_match *ic_match;
+	struct dma_buf *dmabuf = attach->dmabuf;
+	unsigned int i, j;
+
+	if (!exp || !imp)
+		return false;
+
+	if (!attach->allow_ic)
+		return false;
+
+	ic_ops = dmabuf->ops->interconnect_ops;
+	if (!ic_ops || !ic_ops->match_interconnect)
+		return false;
+
+	ic_match = kzalloc(sizeof(*ic_match), GFP_KERNEL);
+	if (!ic_match)
+		return false;
+
+	for (i = 0; i < exp_ics; i++) {
+		for (j = 0; j < imp_ics; j++) {
+			if (exp[i].type == imp[j].type) {
+				if (ic_ops->match_interconnect(&exp[i],
+							       &imp[j])) {
+					memcpy(ic_match, &exp[i],
+					       sizeof(*ic_match));
+
+					attach->ic_match = ic_match;
+					return true;
+				}
+			}
+		}
+	}
+
+	attach->allow_ic = false;
+	kfree(ic_match);
+	return false;
+}
+EXPORT_SYMBOL_NS_GPL(dma_buf_match_interconnects, "DMA_BUF");
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index daa993503052..a6977375f11e 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -959,8 +959,11 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
 
 	attach->dev = dev;
 	attach->dmabuf = dmabuf;
-	if (importer_ops)
+	if (importer_ops) {
 		attach->peer2peer = importer_ops->allow_peer2peer;
+		if (importer_ops->supports_interconnects)
+			attach->allow_ic = true;
+	}
 	attach->importer_ops = importer_ops;
 	attach->importer_priv = importer_priv;
 
@@ -1017,6 +1020,7 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
 	if (dmabuf->ops->detach)
 		dmabuf->ops->detach(dmabuf, attach);
 
+	kfree(attach->ic_match);
 	kfree(attach);
 }
 EXPORT_SYMBOL_NS_GPL(dma_buf_detach, "DMA_BUF");
diff --git a/include/linux/dma-buf-interconnect.h b/include/linux/dma-buf-interconnect.h
index 50fc7a8272ce..efe3ca1c354a 100644
--- a/include/linux/dma-buf-interconnect.h
+++ b/include/linux/dma-buf-interconnect.h
@@ -3,8 +3,14 @@
 #ifndef __DMA_BUF_INTERCONNECT_H__
 #define __DMA_BUF_INTERCONNECT_H__
 
+#include <linux/device.h>
 #include <linux/xarray.h>
 
+#define MATCH_INTERCONNECT(interconnect, ...)				\
+	((const struct dma_buf_interconnect_match) {			\
+		.type = interconnect __VA_OPT__(, __VA_ARGS__)		\
+	})								\
+
 #define CREATE_INTERCONNECT(type)					       \
 	static const struct dma_buf_interconnect __##type##_interconnect = {   \
 		.name = #type"_interconnect",				       \
@@ -25,6 +31,22 @@ struct dma_buf_interconnect {
 	const char *name;
 };
 
+/**
+ * struct dma_buf_interconnect_match - holds data used to match interconnects
+ * @type: pointer to the interconnect instance
+ * @dev: the device associated with a given exporter or importer
+ * @bar: the BAR index associated with the device
+ *
+ * The exporter and importer are expected to populate this structure with
+ * their respective device and BAR information for each interconnect type they
+ * support. This data is used to determine if a match exists between them.
+ */
+struct dma_buf_interconnect_match {
+	const struct dma_buf_interconnect *type;
+	struct device *dev;
+	unsigned int bar;
+};
+
 /**
  * struct dma_buf_ranges - holds info about interconnect address ranges
  * @ranges: xarray that contains the address ranges
@@ -71,9 +93,23 @@ struct dma_buf_interconnect_ops {
 	 */
 	void (*unmap_interconnect)(struct dma_buf_attachment *attach,
 				   struct dma_buf_ranges *ranges);
+	/**
+	 * @match_interconnect:
+	 *
+	 * This is called by dma_buf_match_interconnects() and is used by
+	 * the exporter to determine if the importer is compatible for a
+	 * given interconnect type.
+	 */
+	bool (*match_interconnect)(const struct dma_buf_interconnect_match *,
+				   const struct dma_buf_interconnect_match *);
 };
 
 struct dma_buf_ranges *dma_buf_map_interconnect(struct dma_buf_attachment *);
 void dma_buf_unmap_interconnect(struct dma_buf_attachment *,
 				struct dma_buf_ranges *);
+bool dma_buf_match_interconnects(struct dma_buf_attachment *attach,
+				 const struct dma_buf_interconnect_match *,
+				 unsigned int exp_ics,
+				 const struct dma_buf_interconnect_match *,
+				 unsigned int imp_ics);
 #endif
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index a675bc89a69c..f7d0b0dbcb24 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -487,6 +487,18 @@ struct dma_buf_attach_ops {
 	 * point to the new location of the DMA-buf.
 	 */
 	void (*move_notify)(struct dma_buf_attachment *attach);
+
+	/**
+	 * @supports_interconnects: [optional] indicate interconnect support
+	 *
+	 * If this callback is provided, it means that the importer would
+	 * provide a list of interconnects that it supports and would
+	 * invoke dma_buf_match_interconnects() to identify a match with the
+	 * exporter's interconnects.
+	 */
+	bool (*supports_interconnects)(struct dma_buf_attachment *attach,
+				       const struct dma_buf_interconnect_match *,
+				       unsigned int num_ics);
 };
 
 /**
@@ -498,6 +510,7 @@ struct dma_buf_attach_ops {
  * @allow_ic: true if the importer is allowed to use interconnect ops.
  * @priv: exporter specific attachment data.
  * @importer_ops: importer operations for this attachment, if provided
+ * @ic_match: copy of exporter's interconnect match data.
  * dma_buf_map/unmap_attachment() must be called with the dma_resv lock held.
  * @importer_priv: importer specific attachment data.
  *
@@ -517,6 +530,7 @@ struct dma_buf_attachment {
 	bool peer2peer;
 	bool allow_ic;
 	const struct dma_buf_attach_ops *importer_ops;
+	struct dma_buf_interconnect_match *ic_match;
 	void *importer_priv;
 	void *priv;
 };
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC v2 2/8] dma-buf: Add a helper to match interconnects between exporter/importer
  2025-10-27  4:44 ` [RFC v2 2/8] dma-buf: Add a helper to match interconnects between exporter/importer Vivek Kasireddy
@ 2025-10-27 18:18   ` Jason Gunthorpe
  2025-10-28  6:04     ` Kasireddy, Vivek
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Gunthorpe @ 2025-10-27 18:18 UTC (permalink / raw)
  To: Vivek Kasireddy
  Cc: dri-devel, intel-xe, linux-media, linaro-mm-sig, Christian Koenig,
	Sumit Semwal, Thomas Hellström, Simona Vetter

On Sun, Oct 26, 2025 at 09:44:14PM -0700, Vivek Kasireddy wrote:
> +/**
> + * dma_buf_match_interconnects - determine if there is a specific interconnect
> + * that is supported by both exporter and importer.
> + * @attach:	[in]	attachment to populate ic_match field
> + * @exp:	[in]	array of interconnects supported by exporter
> + * @exp_ics:	[in]	number of interconnects supported by exporter
> + * @imp:	[in]	array of interconnects supported by importer
> + * @imp_ics:	[in]	number of interconnects supported by importer
> + *
> + * This helper function iterates through the list interconnects supported by
> + * both exporter and importer to find a match. A successful match means that
> + * a common interconnect type is supported by both parties and the exporter's
> + * match_interconnect() callback also confirms that the importer is compatible
> + * with the exporter for that interconnect type.

Document which of the exporter/importer is supposed to call this

> + *
> + * If a match is found, the attach->ic_match field is populated with a copy
> + * of the exporter's match data.

> + * Return: true if a match is found, false otherwise.
> + */
> +bool dma_buf_match_interconnects(struct dma_buf_attachment *attach,
> +				 const struct dma_buf_interconnect_match *exp,
> +				 unsigned int exp_ics,
> +				 const struct dma_buf_interconnect_match *imp,
> +				 unsigned int imp_ics)
> +{
> +	const struct dma_buf_interconnect_ops *ic_ops;
> +	struct dma_buf_interconnect_match *ic_match;
> +	struct dma_buf *dmabuf = attach->dmabuf;
> +	unsigned int i, j;
> +
> +	if (!exp || !imp)
> +		return false;
> +
> +	if (!attach->allow_ic)
> +		return false;

Seems redundant with this check for ic_ops == NULL:

> +	ic_ops = dmabuf->ops->interconnect_ops;
> +	if (!ic_ops || !ic_ops->match_interconnect)
> +		return false;

This seems like too much of a maze to me..

I think you should structure it like this. First declare an interconnect:

struct dma_buf_interconnect iov_interconnect {
   .name = "IOV interconnect",
   .match =..
}

Then the exporters "subclass"

struct dma_buf_interconnect_ops vfio_iov_interconnect {
    .interconnect = &iov_interconnect,
    .map = vfio_map,
}

I guess no container_of technique..

Then in VFIO's attach trigger the new code:

   const struct dma_buf_interconnect_match vfio_exp_ics[] = {
        {&vfio_iov_interconnect},
    };

   dma_buf_match_interconnects(attach, &vfio_exp_ics))

Which will callback to the importer:

static const struct dma_buf_attach_ops xe_dma_buf_attach_ops = {
   .get_importer_interconnects
}

dma_buf_match_interconnects() would call
aops->get_importer_interconnects
and matchs first on .interconnect, then call the interconnect->match
function with exp/inpt match structs if not NULL.

> +struct dma_buf_interconnect_match {
> +	const struct dma_buf_interconnect *type;
> +	struct device *dev;
> +	unsigned int bar;
> +};

This should be more general, dev and bar are unique to the iov
importer. Maybe just simple:

struct dma_buf_interconnect_match {
    struct dma_buf_interconnect *ic; // no need for type
    const struct dma_buf_interconnct_ops *exporter_ic_ops;
    u64 match_data[2]; // dev and bar are IOV specific, generalize
};

Then some helper

       const struct dma_buf_interconnect_match supports_ics[] = {
          IOV_INTERCONNECT(&vfio_iov_interconnect, dev, bar),
       }

And it would be nice if interconnect aware drivers could more easially
interwork with non-interconnect importers.

So I'd add a exporter type of 'p2p dma mapped scatterlist' that just
matches the legacy importer.

Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC v2 2/8] dma-buf: Add a helper to match interconnects between exporter/importer
  2025-10-27 18:18   ` Jason Gunthorpe
@ 2025-10-28  6:04     ` Kasireddy, Vivek
  0 siblings, 0 replies; 24+ messages in thread
From: Kasireddy, Vivek @ 2025-10-28  6:04 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
	Christian Koenig, Sumit Semwal, Thomas Hellström,
	Simona Vetter

Hi Jason,

> Subject: Re: [RFC v2 2/8] dma-buf: Add a helper to match interconnects
> between exporter/importer
> 
> On Sun, Oct 26, 2025 at 09:44:14PM -0700, Vivek Kasireddy wrote:
> > +/**
> > + * dma_buf_match_interconnects - determine if there is a specific
> interconnect
> > + * that is supported by both exporter and importer.
> > + * @attach:	[in]	attachment to populate ic_match field
> > + * @exp:	[in]	array of interconnects supported by exporter
> > + * @exp_ics:	[in]	number of interconnects supported by exporter
> > + * @imp:	[in]	array of interconnects supported by importer
> > + * @imp_ics:	[in]	number of interconnects supported by importer
> > + *
> > + * This helper function iterates through the list interconnects supported by
> > + * both exporter and importer to find a match. A successful match means
> that
> > + * a common interconnect type is supported by both parties and the
> exporter's
> > + * match_interconnect() callback also confirms that the importer is
> compatible
> > + * with the exporter for that interconnect type.
> 
> Document which of the exporter/importer is supposed to call this
I missed adding that part. The importer is expected to call this in my current design.

> 
> > + *
> > + * If a match is found, the attach->ic_match field is populated with a copy
> > + * of the exporter's match data.
> 
> > + * Return: true if a match is found, false otherwise.
> > + */
> > +bool dma_buf_match_interconnects(struct dma_buf_attachment *attach,
> > +				 const struct dma_buf_interconnect_match
> *exp,
> > +				 unsigned int exp_ics,
> > +				 const struct dma_buf_interconnect_match
> *imp,
> > +				 unsigned int imp_ics)
> > +{
> > +	const struct dma_buf_interconnect_ops *ic_ops;
> > +	struct dma_buf_interconnect_match *ic_match;
> > +	struct dma_buf *dmabuf = attach->dmabuf;
> > +	unsigned int i, j;
> > +
> > +	if (!exp || !imp)
> > +		return false;
> > +
> > +	if (!attach->allow_ic)
> > +		return false;
> 
> Seems redundant with this check for ic_ops == NULL:
Not really; attach->allow_ic would indicate if a successful match is
found or not. And, ic_ops is for the exporter to indicate whether it
supports interconnect ops or not.

> 
> > +	ic_ops = dmabuf->ops->interconnect_ops;
> > +	if (!ic_ops || !ic_ops->match_interconnect)
> > +		return false;
> 
> This seems like too much of a maze to me..
> 
> I think you should structure it like this. First declare an interconnect:
> 
> struct dma_buf_interconnect iov_interconnect {
>    .name = "IOV interconnect",
>    .match =..
> }
> 
> Then the exporters "subclass"
> 
> struct dma_buf_interconnect_ops vfio_iov_interconnect {
>     .interconnect = &iov_interconnect,
>     .map = vfio_map,
> }
> 
> I guess no container_of technique..
> 
> Then in VFIO's attach trigger the new code:
> 
>    const struct dma_buf_interconnect_match vfio_exp_ics[] = {
>         {&vfio_iov_interconnect},
>     };
> 
>    dma_buf_match_interconnects(attach, &vfio_exp_ics))
> 
> Which will callback to the importer:
> 
> static const struct dma_buf_attach_ops xe_dma_buf_attach_ops = {
>    .get_importer_interconnects
> }
> 
> dma_buf_match_interconnects() would call
> aops->get_importer_interconnects
> and matchs first on .interconnect, then call the interconnect->match
> function with exp/inpt match structs if not NULL.
Ok, I'll try to test your suggestions. 

> 
> > +struct dma_buf_interconnect_match {
> > +	const struct dma_buf_interconnect *type;
> > +	struct device *dev;
> > +	unsigned int bar;
> > +};
> 
> This should be more general, dev and bar are unique to the iov
> importer. Maybe just simple:
> 
> struct dma_buf_interconnect_match {
>     struct dma_buf_interconnect *ic; // no need for type
>     const struct dma_buf_interconnct_ops *exporter_ic_ops;
>     u64 match_data[2]; // dev and bar are IOV specific, generalize
I am wondering what kind of match data would be needed for other
interconnects, so that we can try to generalize dma_buf_interconnect_match
or probably have interconnect specific implementations subclass it.

> };
> 
> Then some helper
> 
>        const struct dma_buf_interconnect_match supports_ics[] = {
>           IOV_INTERCONNECT(&vfio_iov_interconnect, dev, bar),
>        }
I have done mostly the same thing as you suggest in patches 4 and 5 of this
series that add IOV interconnect support for vfio-pci and Xe drivers. Did you
get a chance to look at them?

> 
> And it would be nice if interconnect aware drivers could more easially
> interwork with non-interconnect importers.
> 
> So I'd add a exporter type of 'p2p dma mapped scatterlist' that just
> matches the legacy importer.
IIUC, even interconnect aware drivers (or exporters) would need to implement
map_dma_buf() op (which is mandatory) which would return an sg_table.
So, if match_interconnects() fails, then the exporter/importer would need to
fallback to using the legacy path.

Thanks,
Vivek

> 
> Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC v2 3/8] dma-buf: Create and expose IOV interconnect to all exporters/importers
  2025-10-27  4:44 [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Vivek Kasireddy
  2025-10-27  4:44 ` [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for interconnects Vivek Kasireddy
  2025-10-27  4:44 ` [RFC v2 2/8] dma-buf: Add a helper to match interconnects between exporter/importer Vivek Kasireddy
@ 2025-10-27  4:44 ` Vivek Kasireddy
  2025-10-27  4:44 ` [RFC v2 4/8] vfio/pci/dmabuf: Add support for IOV interconnect Vivek Kasireddy
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Vivek Kasireddy @ 2025-10-27  4:44 UTC (permalink / raw)
  To: dri-devel, intel-xe, linux-media, linaro-mm-sig
  Cc: Vivek Kasireddy, Jason Gunthorpe, Christian Koenig, Sumit Semwal,
	Thomas Hellström, Simona Vetter

The IOV interconnect is a virtual interconnect between an SRIOV
physical function (PF) and its virtual functions (VFs). In order
for negotiation (or match) to succeed, the exporter is expected
to be a VF while the importer is expected to be the PF.

Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 drivers/dma-buf/dma-buf-interconnect.c | 3 +++
 include/linux/dma-buf-interconnect.h   | 7 +++++++
 2 files changed, 10 insertions(+)

diff --git a/drivers/dma-buf/dma-buf-interconnect.c b/drivers/dma-buf/dma-buf-interconnect.c
index 12db77e6b9f1..492e4d3fe4c8 100644
--- a/drivers/dma-buf/dma-buf-interconnect.c
+++ b/drivers/dma-buf/dma-buf-interconnect.c
@@ -159,3 +159,6 @@ bool dma_buf_match_interconnects(struct dma_buf_attachment *attach,
 	return false;
 }
 EXPORT_SYMBOL_NS_GPL(dma_buf_match_interconnects, "DMA_BUF");
+
+CREATE_INTERCONNECT(iov)
+EXPORT_SYMBOL_NS_GPL(iov_interconnect, "DMA_BUF");
diff --git a/include/linux/dma-buf-interconnect.h b/include/linux/dma-buf-interconnect.h
index efe3ca1c354a..37dee1a26f24 100644
--- a/include/linux/dma-buf-interconnect.h
+++ b/include/linux/dma-buf-interconnect.h
@@ -20,6 +20,13 @@
 
 struct dma_buf_attachment;
 
+/**
+ * The iov interconnect instance would be created and exported out of
+ * dma-buf-interconnect.c as it is a global interconnect that is expected
+ * to be supported by different exporters and importers.
+ */
+extern const struct dma_buf_interconnect *iov_interconnect;
+
 /**
  * struct dma_buf_interconnect - holds info associated with an interconnect
  * @name: name of the interconnect.
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC v2 4/8] vfio/pci/dmabuf: Add support for IOV interconnect
  2025-10-27  4:44 [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Vivek Kasireddy
                   ` (2 preceding siblings ...)
  2025-10-27  4:44 ` [RFC v2 3/8] dma-buf: Create and expose IOV interconnect to all exporters/importers Vivek Kasireddy
@ 2025-10-27  4:44 ` Vivek Kasireddy
  2025-10-28  2:00   ` Matthew Brost
  2025-10-27  4:44 ` [RFC v2 5/8] drm/xe/dma_buf: " Vivek Kasireddy
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Vivek Kasireddy @ 2025-10-27  4:44 UTC (permalink / raw)
  To: dri-devel, intel-xe, linux-media, linaro-mm-sig
  Cc: Vivek Kasireddy, Jason Gunthorpe, Christian Koenig, Sumit Semwal,
	Thomas Hellström, Simona Vetter

Add support for IOV interconnect by provding ops for map/unmap and
match interconnect. Note that the xarray is populated with entries
of type struct range. The range type contains the start and end
addresses of the memory region.

Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 drivers/vfio/pci/vfio_pci_dmabuf.c | 135 ++++++++++++++++++++++++++++-
 1 file changed, 134 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index eaba010777f3..d2b7b5410e5a 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -4,6 +4,7 @@
 #include <linux/dma-buf.h>
 #include <linux/pci-p2pdma.h>
 #include <linux/dma-resv.h>
+#include <linux/range.h>
 
 #include "vfio_pci_priv.h"
 
@@ -16,15 +17,132 @@ struct vfio_pci_dma_buf {
 	size_t size;
 	struct phys_vec *phys_vec;
 	struct p2pdma_provider *provider;
+	struct dma_buf_interconnect_match *ic_match;
 	u32 nr_ranges;
 	u8 revoked : 1;
 };
 
+static int
+vfio_pci_create_match(struct vfio_pci_dma_buf *priv,
+			  struct vfio_device_feature_dma_buf *dma_buf)
+{
+	struct dma_buf_interconnect_match *ic_match;
+
+	ic_match = kzalloc(sizeof(*ic_match), GFP_KERNEL);
+	if (!ic_match)
+		return -ENOMEM;
+
+	ic_match->dev = &priv->vdev->pdev->dev;
+	ic_match->bar = dma_buf->region_index;
+
+	priv->ic_match = ic_match;
+	return 0;
+}
+
+static int vfio_pci_map_iov_interconnect(struct vfio_pci_dma_buf *priv,
+					 struct xarray *ranges)
+{
+	struct phys_vec *phys_vec = priv->phys_vec;
+	struct range *range;
+	unsigned long i;
+	void *entry;
+	int ret;
+
+	range = kmalloc_array(priv->nr_ranges, sizeof(*range), GFP_KERNEL);
+	if (!range)
+		return -ENOMEM;
+
+	for (i = 0; i < priv->nr_ranges; i++) {
+		entry = &range[i];
+		range[i].start = phys_vec[i].paddr;
+		range[i].end = phys_vec[i].paddr + phys_vec[i].len - 1;
+
+		entry = xa_store(ranges, i, entry, GFP_KERNEL);
+		if (xa_is_err(entry)) {
+			ret = xa_err(entry);
+			goto err_free_range;
+		}
+	}
+	return 0;
+
+err_free_range:
+	kfree(range);
+	return ret;
+}
+
+static int vfio_pci_map_interconnect(struct dma_buf_attachment *attachment,
+				     struct dma_buf_ranges *ranges)
+{
+	const struct dma_buf_interconnect *ic = attachment->ic_match->type;
+	struct vfio_pci_dma_buf *priv = attachment->dmabuf->priv;
+	int ret = -EINVAL;
+
+	ranges->nranges = priv->nr_ranges;
+
+	if (ic == iov_interconnect)
+		ret = vfio_pci_map_iov_interconnect(priv, &ranges->ranges);
+
+	return ret;
+}
+
+static void vfio_pci_unmap_interconnect(struct dma_buf_attachment *attachment,
+					struct dma_buf_ranges *ranges)
+{
+	void *entry;
+
+	entry = xa_load(&ranges->ranges, 0);
+	kfree(entry);
+}
+
+static bool
+vfio_pci_match_iov_interconnect(const struct dma_buf_interconnect_match *exp,
+				const struct dma_buf_interconnect_match *imp)
+{
+	struct pci_dev *exp_pdev = to_pci_dev(exp->dev);
+	struct pci_dev *imp_pdev = to_pci_dev(imp->dev);
+
+	return imp_pdev == pci_physfn(exp_pdev) && imp->bar == exp->bar;
+}
+
+static bool
+vfio_pci_match_interconnect(const struct dma_buf_interconnect_match *exp,
+			    const struct dma_buf_interconnect_match *imp)
+{
+	const struct dma_buf_interconnect *ic = exp->type;
+
+	if (ic == iov_interconnect)
+		return vfio_pci_match_iov_interconnect(exp, imp);
+
+	return false;
+}
+
+static bool
+vfio_pci_match_interconnects(struct vfio_pci_dma_buf *priv,
+			     struct dma_buf_attachment *attachment)
+{
+	const struct dma_buf_attach_ops *aops = attachment->importer_ops;
+	const struct dma_buf_interconnect_match supports_ics[] = {
+		MATCH_INTERCONNECT(iov_interconnect,
+				   priv->ic_match->dev, priv->ic_match->bar),
+	};
+
+	if (attachment->allow_ic) {
+		if (aops->supports_interconnects(attachment, supports_ics,
+						 ARRAY_SIZE(supports_ics)))
+			return true;
+	}
+	return false;
+}
+
 static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf,
 				   struct dma_buf_attachment *attachment)
 {
 	struct vfio_pci_dma_buf *priv = dmabuf->priv;
 
+	if (vfio_pci_match_interconnects(priv, attachment)) {
+		return 0;
+	}
+
 	if (!attachment->peer2peer)
 		return -EOPNOTSUPP;
 
@@ -189,6 +307,7 @@ vfio_pci_dma_buf_map(struct dma_buf_attachment *attachment,
 	return ERR_PTR(ret);
 }
 
+
 static void vfio_pci_dma_buf_unmap(struct dma_buf_attachment *attachment,
 				   struct sg_table *sgt,
 				   enum dma_data_direction dir)
@@ -228,15 +347,23 @@ static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf)
 		vfio_device_put_registration(&priv->vdev->vdev);
 	}
 	kfree(priv->phys_vec);
+	kfree(priv->ic_match);
 	kfree(priv);
 }
 
+static const struct dma_buf_interconnect_ops vfio_pci_interconnect_ops = {
+	.match_interconnect = vfio_pci_match_interconnect,
+	.map_interconnect = vfio_pci_map_interconnect,
+	.unmap_interconnect = vfio_pci_unmap_interconnect,
+};
+
 static const struct dma_buf_ops vfio_pci_dmabuf_ops = {
 	.attach = vfio_pci_dma_buf_attach,
 	.detach = vfio_pci_dma_buf_detach,
 	.map_dma_buf = vfio_pci_dma_buf_map,
 	.release = vfio_pci_dma_buf_release,
 	.unmap_dma_buf = vfio_pci_dma_buf_unmap,
+	.interconnect_ops = &vfio_pci_interconnect_ops,
 };
 
 static void dma_ranges_to_p2p_phys(struct vfio_pci_dma_buf *priv,
@@ -365,6 +492,10 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
 		goto err_free_phys;
 	}
 
+	ret = vfio_pci_create_match(priv, &get_dma_buf);
+	if (ret)
+		goto err_dev_put;
+
 	exp_info.ops = &vfio_pci_dmabuf_ops;
 	exp_info.size = priv->size;
 	exp_info.flags = get_dma_buf.open_flags;
@@ -373,7 +504,7 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
 	priv->dmabuf = dma_buf_export(&exp_info);
 	if (IS_ERR(priv->dmabuf)) {
 		ret = PTR_ERR(priv->dmabuf);
-		goto err_dev_put;
+		goto err_free_iov;
 	}
 
 	/* dma_buf_put() now frees priv */
@@ -391,6 +522,8 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
 	 */
 	return dma_buf_fd(priv->dmabuf, get_dma_buf.open_flags);
 
+err_free_iov:
+	kfree(priv->ic_match);
 err_dev_put:
 	vfio_device_put_registration(&vdev->vdev);
 err_free_phys:
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC v2 4/8] vfio/pci/dmabuf: Add support for IOV interconnect
  2025-10-27  4:44 ` [RFC v2 4/8] vfio/pci/dmabuf: Add support for IOV interconnect Vivek Kasireddy
@ 2025-10-28  2:00   ` Matthew Brost
  2025-10-28  5:05     ` Kasireddy, Vivek
  0 siblings, 1 reply; 24+ messages in thread
From: Matthew Brost @ 2025-10-28  2:00 UTC (permalink / raw)
  To: Vivek Kasireddy
  Cc: dri-devel, intel-xe, linux-media, linaro-mm-sig, Jason Gunthorpe,
	Christian Koenig, Sumit Semwal, Thomas Hellström,
	Simona Vetter

On Sun, Oct 26, 2025 at 09:44:16PM -0700, Vivek Kasireddy wrote:
> Add support for IOV interconnect by provding ops for map/unmap and
> match interconnect. Note that the xarray is populated with entries
> of type struct range. The range type contains the start and end
> addresses of the memory region.
> 
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Simona Vetter <simona.vetter@ffwll.ch>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> ---
>  drivers/vfio/pci/vfio_pci_dmabuf.c | 135 ++++++++++++++++++++++++++++-
>  1 file changed, 134 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> index eaba010777f3..d2b7b5410e5a 100644
> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c

In drm-tip vfio_pci_dmabuf.c does not exist as a file? Is this series
based on another series / branch where vfio_pci_dmabuf.c hasn't made it
into drm-tip yet?

Matt

> @@ -4,6 +4,7 @@
>  #include <linux/dma-buf.h>
>  #include <linux/pci-p2pdma.h>
>  #include <linux/dma-resv.h>
> +#include <linux/range.h>
>  
>  #include "vfio_pci_priv.h"
>  
> @@ -16,15 +17,132 @@ struct vfio_pci_dma_buf {
>  	size_t size;
>  	struct phys_vec *phys_vec;
>  	struct p2pdma_provider *provider;
> +	struct dma_buf_interconnect_match *ic_match;
>  	u32 nr_ranges;
>  	u8 revoked : 1;
>  };
>  
> +static int
> +vfio_pci_create_match(struct vfio_pci_dma_buf *priv,
> +			  struct vfio_device_feature_dma_buf *dma_buf)
> +{
> +	struct dma_buf_interconnect_match *ic_match;
> +
> +	ic_match = kzalloc(sizeof(*ic_match), GFP_KERNEL);
> +	if (!ic_match)
> +		return -ENOMEM;
> +
> +	ic_match->dev = &priv->vdev->pdev->dev;
> +	ic_match->bar = dma_buf->region_index;
> +
> +	priv->ic_match = ic_match;
> +	return 0;
> +}
> +
> +static int vfio_pci_map_iov_interconnect(struct vfio_pci_dma_buf *priv,
> +					 struct xarray *ranges)
> +{
> +	struct phys_vec *phys_vec = priv->phys_vec;
> +	struct range *range;
> +	unsigned long i;
> +	void *entry;
> +	int ret;
> +
> +	range = kmalloc_array(priv->nr_ranges, sizeof(*range), GFP_KERNEL);
> +	if (!range)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < priv->nr_ranges; i++) {
> +		entry = &range[i];
> +		range[i].start = phys_vec[i].paddr;
> +		range[i].end = phys_vec[i].paddr + phys_vec[i].len - 1;
> +
> +		entry = xa_store(ranges, i, entry, GFP_KERNEL);
> +		if (xa_is_err(entry)) {
> +			ret = xa_err(entry);
> +			goto err_free_range;
> +		}
> +	}
> +	return 0;
> +
> +err_free_range:
> +	kfree(range);
> +	return ret;
> +}
> +
> +static int vfio_pci_map_interconnect(struct dma_buf_attachment *attachment,
> +				     struct dma_buf_ranges *ranges)
> +{
> +	const struct dma_buf_interconnect *ic = attachment->ic_match->type;
> +	struct vfio_pci_dma_buf *priv = attachment->dmabuf->priv;
> +	int ret = -EINVAL;
> +
> +	ranges->nranges = priv->nr_ranges;
> +
> +	if (ic == iov_interconnect)
> +		ret = vfio_pci_map_iov_interconnect(priv, &ranges->ranges);
> +
> +	return ret;
> +}
> +
> +static void vfio_pci_unmap_interconnect(struct dma_buf_attachment *attachment,
> +					struct dma_buf_ranges *ranges)
> +{
> +	void *entry;
> +
> +	entry = xa_load(&ranges->ranges, 0);
> +	kfree(entry);
> +}
> +
> +static bool
> +vfio_pci_match_iov_interconnect(const struct dma_buf_interconnect_match *exp,
> +				const struct dma_buf_interconnect_match *imp)
> +{
> +	struct pci_dev *exp_pdev = to_pci_dev(exp->dev);
> +	struct pci_dev *imp_pdev = to_pci_dev(imp->dev);
> +
> +	return imp_pdev == pci_physfn(exp_pdev) && imp->bar == exp->bar;
> +}
> +
> +static bool
> +vfio_pci_match_interconnect(const struct dma_buf_interconnect_match *exp,
> +			    const struct dma_buf_interconnect_match *imp)
> +{
> +	const struct dma_buf_interconnect *ic = exp->type;
> +
> +	if (ic == iov_interconnect)
> +		return vfio_pci_match_iov_interconnect(exp, imp);
> +
> +	return false;
> +}
> +
> +static bool
> +vfio_pci_match_interconnects(struct vfio_pci_dma_buf *priv,
> +			     struct dma_buf_attachment *attachment)
> +{
> +	const struct dma_buf_attach_ops *aops = attachment->importer_ops;
> +	const struct dma_buf_interconnect_match supports_ics[] = {
> +		MATCH_INTERCONNECT(iov_interconnect,
> +				   priv->ic_match->dev, priv->ic_match->bar),
> +	};
> +
> +	if (attachment->allow_ic) {
> +		if (aops->supports_interconnects(attachment, supports_ics,
> +						 ARRAY_SIZE(supports_ics)))
> +			return true;
> +	}
> +	return false;
> +}
> +
>  static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf,
>  				   struct dma_buf_attachment *attachment)
>  {
>  	struct vfio_pci_dma_buf *priv = dmabuf->priv;
>  
> +	if (vfio_pci_match_interconnects(priv, attachment)) {
> +		return 0;
> +	}
> +
>  	if (!attachment->peer2peer)
>  		return -EOPNOTSUPP;
>  
> @@ -189,6 +307,7 @@ vfio_pci_dma_buf_map(struct dma_buf_attachment *attachment,
>  	return ERR_PTR(ret);
>  }
>  
> +
>  static void vfio_pci_dma_buf_unmap(struct dma_buf_attachment *attachment,
>  				   struct sg_table *sgt,
>  				   enum dma_data_direction dir)
> @@ -228,15 +347,23 @@ static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf)
>  		vfio_device_put_registration(&priv->vdev->vdev);
>  	}
>  	kfree(priv->phys_vec);
> +	kfree(priv->ic_match);
>  	kfree(priv);
>  }
>  
> +static const struct dma_buf_interconnect_ops vfio_pci_interconnect_ops = {
> +	.match_interconnect = vfio_pci_match_interconnect,
> +	.map_interconnect = vfio_pci_map_interconnect,
> +	.unmap_interconnect = vfio_pci_unmap_interconnect,
> +};
> +
>  static const struct dma_buf_ops vfio_pci_dmabuf_ops = {
>  	.attach = vfio_pci_dma_buf_attach,
>  	.detach = vfio_pci_dma_buf_detach,
>  	.map_dma_buf = vfio_pci_dma_buf_map,
>  	.release = vfio_pci_dma_buf_release,
>  	.unmap_dma_buf = vfio_pci_dma_buf_unmap,
> +	.interconnect_ops = &vfio_pci_interconnect_ops,
>  };
>  
>  static void dma_ranges_to_p2p_phys(struct vfio_pci_dma_buf *priv,
> @@ -365,6 +492,10 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
>  		goto err_free_phys;
>  	}
>  
> +	ret = vfio_pci_create_match(priv, &get_dma_buf);
> +	if (ret)
> +		goto err_dev_put;
> +
>  	exp_info.ops = &vfio_pci_dmabuf_ops;
>  	exp_info.size = priv->size;
>  	exp_info.flags = get_dma_buf.open_flags;
> @@ -373,7 +504,7 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
>  	priv->dmabuf = dma_buf_export(&exp_info);
>  	if (IS_ERR(priv->dmabuf)) {
>  		ret = PTR_ERR(priv->dmabuf);
> -		goto err_dev_put;
> +		goto err_free_iov;
>  	}
>  
>  	/* dma_buf_put() now frees priv */
> @@ -391,6 +522,8 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
>  	 */
>  	return dma_buf_fd(priv->dmabuf, get_dma_buf.open_flags);
>  
> +err_free_iov:
> +	kfree(priv->ic_match);
>  err_dev_put:
>  	vfio_device_put_registration(&vdev->vdev);
>  err_free_phys:
> -- 
> 2.50.1
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC v2 4/8] vfio/pci/dmabuf: Add support for IOV interconnect
  2025-10-28  2:00   ` Matthew Brost
@ 2025-10-28  5:05     ` Kasireddy, Vivek
  0 siblings, 0 replies; 24+ messages in thread
From: Kasireddy, Vivek @ 2025-10-28  5:05 UTC (permalink / raw)
  To: Brost, Matthew
  Cc: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
	Jason Gunthorpe, Christian Koenig, Sumit Semwal,
	Thomas Hellström, Simona Vetter

Hi Matt,

> Subject: Re: [RFC v2 4/8] vfio/pci/dmabuf: Add support for IOV interconnect
> 
> On Sun, Oct 26, 2025 at 09:44:16PM -0700, Vivek Kasireddy wrote:
> > Add support for IOV interconnect by provding ops for map/unmap and
> > match interconnect. Note that the xarray is populated with entries
> > of type struct range. The range type contains the start and end
> > addresses of the memory region.
> >
> > Cc: Jason Gunthorpe <jgg@nvidia.com>
> > Cc: Christian Koenig <christian.koenig@amd.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Simona Vetter <simona.vetter@ffwll.ch>
> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> > ---
> >  drivers/vfio/pci/vfio_pci_dmabuf.c | 135
> ++++++++++++++++++++++++++++-
> >  1 file changed, 134 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c
> b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > index eaba010777f3..d2b7b5410e5a 100644
> > --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> 
> In drm-tip vfio_pci_dmabuf.c does not exist as a file? Is this series
> based on another series / branch where vfio_pci_dmabuf.c hasn't made it
> into drm-tip yet?
That file is part of [1] which hasn't been merged yet. The last patch in [1]
adds the dmabuf feature to vfio-pci.

[1]: https://lore.kernel.org/dri-devel/cover.1760368250.git.leon@kernel.org/

Thanks,
Vivek

> 
> Matt
> 
> > @@ -4,6 +4,7 @@
> >  #include <linux/dma-buf.h>
> >  #include <linux/pci-p2pdma.h>
> >  #include <linux/dma-resv.h>
> > +#include <linux/range.h>
> >
> >  #include "vfio_pci_priv.h"
> >
> > @@ -16,15 +17,132 @@ struct vfio_pci_dma_buf {
> >  	size_t size;
> >  	struct phys_vec *phys_vec;
> >  	struct p2pdma_provider *provider;
> > +	struct dma_buf_interconnect_match *ic_match;
> >  	u32 nr_ranges;
> >  	u8 revoked : 1;
> >  };
> >
> > +static int
> > +vfio_pci_create_match(struct vfio_pci_dma_buf *priv,
> > +			  struct vfio_device_feature_dma_buf *dma_buf)
> > +{
> > +	struct dma_buf_interconnect_match *ic_match;
> > +
> > +	ic_match = kzalloc(sizeof(*ic_match), GFP_KERNEL);
> > +	if (!ic_match)
> > +		return -ENOMEM;
> > +
> > +	ic_match->dev = &priv->vdev->pdev->dev;
> > +	ic_match->bar = dma_buf->region_index;
> > +
> > +	priv->ic_match = ic_match;
> > +	return 0;
> > +}
> > +
> > +static int vfio_pci_map_iov_interconnect(struct vfio_pci_dma_buf *priv,
> > +					 struct xarray *ranges)
> > +{
> > +	struct phys_vec *phys_vec = priv->phys_vec;
> > +	struct range *range;
> > +	unsigned long i;
> > +	void *entry;
> > +	int ret;
> > +
> > +	range = kmalloc_array(priv->nr_ranges, sizeof(*range), GFP_KERNEL);
> > +	if (!range)
> > +		return -ENOMEM;
> > +
> > +	for (i = 0; i < priv->nr_ranges; i++) {
> > +		entry = &range[i];
> > +		range[i].start = phys_vec[i].paddr;
> > +		range[i].end = phys_vec[i].paddr + phys_vec[i].len - 1;
> > +
> > +		entry = xa_store(ranges, i, entry, GFP_KERNEL);
> > +		if (xa_is_err(entry)) {
> > +			ret = xa_err(entry);
> > +			goto err_free_range;
> > +		}
> > +	}
> > +	return 0;
> > +
> > +err_free_range:
> > +	kfree(range);
> > +	return ret;
> > +}
> > +
> > +static int vfio_pci_map_interconnect(struct dma_buf_attachment
> *attachment,
> > +				     struct dma_buf_ranges *ranges)
> > +{
> > +	const struct dma_buf_interconnect *ic = attachment->ic_match->type;
> > +	struct vfio_pci_dma_buf *priv = attachment->dmabuf->priv;
> > +	int ret = -EINVAL;
> > +
> > +	ranges->nranges = priv->nr_ranges;
> > +
> > +	if (ic == iov_interconnect)
> > +		ret = vfio_pci_map_iov_interconnect(priv, &ranges->ranges);
> > +
> > +	return ret;
> > +}
> > +
> > +static void vfio_pci_unmap_interconnect(struct dma_buf_attachment
> *attachment,
> > +					struct dma_buf_ranges *ranges)
> > +{
> > +	void *entry;
> > +
> > +	entry = xa_load(&ranges->ranges, 0);
> > +	kfree(entry);
> > +}
> > +
> > +static bool
> > +vfio_pci_match_iov_interconnect(const struct
> dma_buf_interconnect_match *exp,
> > +				const struct dma_buf_interconnect_match
> *imp)
> > +{
> > +	struct pci_dev *exp_pdev = to_pci_dev(exp->dev);
> > +	struct pci_dev *imp_pdev = to_pci_dev(imp->dev);
> > +
> > +	return imp_pdev == pci_physfn(exp_pdev) && imp->bar == exp->bar;
> > +}
> > +
> > +static bool
> > +vfio_pci_match_interconnect(const struct dma_buf_interconnect_match
> *exp,
> > +			    const struct dma_buf_interconnect_match *imp)
> > +{
> > +	const struct dma_buf_interconnect *ic = exp->type;
> > +
> > +	if (ic == iov_interconnect)
> > +		return vfio_pci_match_iov_interconnect(exp, imp);
> > +
> > +	return false;
> > +}
> > +
> > +static bool
> > +vfio_pci_match_interconnects(struct vfio_pci_dma_buf *priv,
> > +			     struct dma_buf_attachment *attachment)
> > +{
> > +	const struct dma_buf_attach_ops *aops = attachment->importer_ops;
> > +	const struct dma_buf_interconnect_match supports_ics[] = {
> > +		MATCH_INTERCONNECT(iov_interconnect,
> > +				   priv->ic_match->dev, priv->ic_match->bar),
> > +	};
> > +
> > +	if (attachment->allow_ic) {
> > +		if (aops->supports_interconnects(attachment, supports_ics,
> > +						 ARRAY_SIZE(supports_ics)))
> > +			return true;
> > +	}
> > +	return false;
> > +}
> > +
> >  static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf,
> >  				   struct dma_buf_attachment *attachment)
> >  {
> >  	struct vfio_pci_dma_buf *priv = dmabuf->priv;
> >
> > +	if (vfio_pci_match_interconnects(priv, attachment)) {
> > +		return 0;
> > +	}
> > +
> >  	if (!attachment->peer2peer)
> >  		return -EOPNOTSUPP;
> >
> > @@ -189,6 +307,7 @@ vfio_pci_dma_buf_map(struct dma_buf_attachment
> *attachment,
> >  	return ERR_PTR(ret);
> >  }
> >
> > +
> >  static void vfio_pci_dma_buf_unmap(struct dma_buf_attachment
> *attachment,
> >  				   struct sg_table *sgt,
> >  				   enum dma_data_direction dir)
> > @@ -228,15 +347,23 @@ static void vfio_pci_dma_buf_release(struct
> dma_buf *dmabuf)
> >  		vfio_device_put_registration(&priv->vdev->vdev);
> >  	}
> >  	kfree(priv->phys_vec);
> > +	kfree(priv->ic_match);
> >  	kfree(priv);
> >  }
> >
> > +static const struct dma_buf_interconnect_ops vfio_pci_interconnect_ops =
> {
> > +	.match_interconnect = vfio_pci_match_interconnect,
> > +	.map_interconnect = vfio_pci_map_interconnect,
> > +	.unmap_interconnect = vfio_pci_unmap_interconnect,
> > +};
> > +
> >  static const struct dma_buf_ops vfio_pci_dmabuf_ops = {
> >  	.attach = vfio_pci_dma_buf_attach,
> >  	.detach = vfio_pci_dma_buf_detach,
> >  	.map_dma_buf = vfio_pci_dma_buf_map,
> >  	.release = vfio_pci_dma_buf_release,
> >  	.unmap_dma_buf = vfio_pci_dma_buf_unmap,
> > +	.interconnect_ops = &vfio_pci_interconnect_ops,
> >  };
> >
> >  static void dma_ranges_to_p2p_phys(struct vfio_pci_dma_buf *priv,
> > @@ -365,6 +492,10 @@ int vfio_pci_core_feature_dma_buf(struct
> vfio_pci_core_device *vdev, u32 flags,
> >  		goto err_free_phys;
> >  	}
> >
> > +	ret = vfio_pci_create_match(priv, &get_dma_buf);
> > +	if (ret)
> > +		goto err_dev_put;
> > +
> >  	exp_info.ops = &vfio_pci_dmabuf_ops;
> >  	exp_info.size = priv->size;
> >  	exp_info.flags = get_dma_buf.open_flags;
> > @@ -373,7 +504,7 @@ int vfio_pci_core_feature_dma_buf(struct
> vfio_pci_core_device *vdev, u32 flags,
> >  	priv->dmabuf = dma_buf_export(&exp_info);
> >  	if (IS_ERR(priv->dmabuf)) {
> >  		ret = PTR_ERR(priv->dmabuf);
> > -		goto err_dev_put;
> > +		goto err_free_iov;
> >  	}
> >
> >  	/* dma_buf_put() now frees priv */
> > @@ -391,6 +522,8 @@ int vfio_pci_core_feature_dma_buf(struct
> vfio_pci_core_device *vdev, u32 flags,
> >  	 */
> >  	return dma_buf_fd(priv->dmabuf, get_dma_buf.open_flags);
> >
> > +err_free_iov:
> > +	kfree(priv->ic_match);
> >  err_dev_put:
> >  	vfio_device_put_registration(&vdev->vdev);
> >  err_free_phys:
> > --
> > 2.50.1
> >

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC v2 5/8] drm/xe/dma_buf: Add support for IOV interconnect
  2025-10-27  4:44 [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Vivek Kasireddy
                   ` (3 preceding siblings ...)
  2025-10-27  4:44 ` [RFC v2 4/8] vfio/pci/dmabuf: Add support for IOV interconnect Vivek Kasireddy
@ 2025-10-27  4:44 ` Vivek Kasireddy
  2025-10-27  4:44 ` [RFC v2 6/8] drm/xe/pf: Add a helper function to get a VF's backing object in LMEM Vivek Kasireddy
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Vivek Kasireddy @ 2025-10-27  4:44 UTC (permalink / raw)
  To: dri-devel, intel-xe, linux-media, linaro-mm-sig
  Cc: Vivek Kasireddy, Jason Gunthorpe, Christian Koenig, Sumit Semwal,
	Thomas Hellström, Simona Vetter

Provide a callback for supports_interconnects() to indicate to
the dma-buf core and to the exporter that Xe supports interconnects.
Note that Xe would support IOV interconnect only if the buffer
is located in VRAM region.

Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 drivers/gpu/drm/xe/xe_dma_buf.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index a7d67725c3ee..4a1aa47efbc6 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -13,6 +13,7 @@
 #include <drm/drm_prime.h>
 #include <drm/ttm/ttm_tt.h>
 
+#include "regs/xe_bars.h"
 #include "tests/xe_test.h"
 #include "xe_bo.h"
 #include "xe_device.h"
@@ -274,9 +275,23 @@ static void xe_dma_buf_move_notify(struct dma_buf_attachment *attach)
 	XE_WARN_ON(xe_bo_evict(bo, exec));
 }
 
+static bool
+xe_dma_buf_supports_interconnects(struct dma_buf_attachment *attach,
+				  const struct dma_buf_interconnect_match *exp,
+				  unsigned int exp_ics)
+{
+	const struct dma_buf_interconnect_match supports_ics[] = {
+		MATCH_INTERCONNECT(iov_interconnect, attach->dev, LMEM_BAR),
+	};
+
+	return dma_buf_match_interconnects(attach, exp, exp_ics, supports_ics,
+					   ARRAY_SIZE(supports_ics));
+}
+
 static const struct dma_buf_attach_ops xe_dma_buf_attach_ops = {
 	.allow_peer2peer = true,
-	.move_notify = xe_dma_buf_move_notify
+	.move_notify = xe_dma_buf_move_notify,
+	.supports_interconnects = xe_dma_buf_supports_interconnects,
 };
 
 #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC v2 6/8] drm/xe/pf: Add a helper function to get a VF's backing object in LMEM
  2025-10-27  4:44 [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Vivek Kasireddy
                   ` (4 preceding siblings ...)
  2025-10-27  4:44 ` [RFC v2 5/8] drm/xe/dma_buf: " Vivek Kasireddy
@ 2025-10-27  4:44 ` Vivek Kasireddy
  2025-10-27  4:44 ` [RFC v2 7/8] drm/xe/bo: Create new dma_addr array for dmabuf BOs associated with VFs Vivek Kasireddy
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Vivek Kasireddy @ 2025-10-27  4:44 UTC (permalink / raw)
  To: dri-devel, intel-xe, linux-media, linaro-mm-sig; +Cc: Vivek Kasireddy

To properly import a dmabuf that is associated with a VF (or that
originates in a Guest VM that includes a VF), we need to know where
in LMEM the VF's allocated regions exist. Therefore, introduce a
new helper to return the object that backs the VF's regions in LMEM.

v2:
- Make the helper return the LMEM object instead of the start address

v3:
- Move the declaration of the helper under other lmem helpers (Michal)

v4:
- Take a ref on the LMEM obj while holding the PF master mutex (Matt)

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 24 ++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h |  1 +
 2 files changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index 6344b5205c08..2f09b67438fc 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -1535,6 +1535,30 @@ u64 xe_gt_sriov_pf_config_get_lmem(struct xe_gt *gt, unsigned int vfid)
 	return size;
 }
 
+/**
+ * xe_gt_sriov_pf_config_get_lmem_obj - Get VF's LMEM BO.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function can only be called on PF.
+ *
+ * Return: BO that is backing VF's quota in LMEM.
+ */
+struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt,
+						 unsigned int vfid)
+{
+	struct xe_gt_sriov_config *config;
+	struct xe_bo *lmem_obj;
+
+	mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
+	config = pf_pick_vf_config(gt, vfid);
+	lmem_obj = config->lmem_obj;
+	xe_bo_get(lmem_obj);
+	mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
+
+	return lmem_obj;
+}
+
 /**
  * xe_gt_sriov_pf_config_set_lmem - Provision VF with LMEM.
  * @gt: the &xe_gt (can't be media)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
index 513e6512a575..bbc5c238cbf6 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
@@ -36,6 +36,7 @@ int xe_gt_sriov_pf_config_set_lmem(struct xe_gt *gt, unsigned int vfid, u64 size
 int xe_gt_sriov_pf_config_set_fair_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs);
 int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs,
 					u64 size);
+struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt, unsigned int vfid);
 
 u32 xe_gt_sriov_pf_config_get_exec_quantum(struct xe_gt *gt, unsigned int vfid);
 int xe_gt_sriov_pf_config_set_exec_quantum(struct xe_gt *gt, unsigned int vfid, u32 exec_quantum);
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC v2 7/8] drm/xe/bo: Create new dma_addr array for dmabuf BOs associated with VFs
  2025-10-27  4:44 [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Vivek Kasireddy
                   ` (5 preceding siblings ...)
  2025-10-27  4:44 ` [RFC v2 6/8] drm/xe/pf: Add a helper function to get a VF's backing object in LMEM Vivek Kasireddy
@ 2025-10-27  4:44 ` Vivek Kasireddy
  2025-10-27  4:44 ` [RFC v2 8/8] drm/xe/pt: Add an additional check for dmabuf BOs while doing bind Vivek Kasireddy
  2025-10-29  0:27 ` [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Jason Gunthorpe
  8 siblings, 0 replies; 24+ messages in thread
From: Vivek Kasireddy @ 2025-10-27  4:44 UTC (permalink / raw)
  To: dri-devel, intel-xe, linux-media, linaro-mm-sig
  Cc: Vivek Kasireddy, Matthew Brost, Thomas Hellström

For BOs of type ttm_bo_type_sg, that are backed by PCI BAR addresses
associated with a VF, we need to adjust and translate these addresses
to VRAM addresses to make the BOs usable by the PF. Otherwise, the
BOs (i.e, PCI BAR addresses) are only accessible by the CPU and not
by the GPU.

In order to do the above, we first need to identify if the addresses
associated with an imported BO (type ttm_bo_type_sg) belong to System
RAM or a VF or other PCI devices. After we confirm that they belong to
a VF, we convert the BAR addresses to DPAs and create a new dma_addr
array (of type drm_pagemap_dma_addr) and populate it with the new
addresses along with the segment sizes.

Note that, all the above is only done if we are able to map the
dmabuf via the IOV interconnect. If not, we fallback to the legacy
mapping route using the sg table.

v2:
- Use dma_addr array instead of sg table to store translated addresses
  (Matt)

v3:
- Remove the usage of iommu_iova_to_phys() as the imported BO would no
  longer contain IOVAs and would instead have BAR addresses.

v4:
- Take a reference on the VF's backing object in VRAM (Michal)
- Create a new type for storing dma data

v5:
- Replace DRM_INTERCONNECT_DRIVER with XE_INTERCONNECT_VRAM during
  address encoding (Matt, Thomas)
- Drop is_devmem_external and instead rely on bo->dma_data.dma_addr
  to check for imported VRAM BOs (Matt)
- Add a check to prevent malicious VF from accessing other VF's
  addresses (Thomas)
- Fallback to legacy (map_dma_buf) mapping method if mapping via
  interconnect fails
- Pass XE_PAGE_SIZE as the last parameter to xe_bo_addr (Matt)

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c             | 162 +++++++++++++++++++++++--
 drivers/gpu/drm/xe/xe_bo_types.h       |   6 +
 drivers/gpu/drm/xe/xe_sriov_pf_types.h |  19 +++
 3 files changed, 175 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4410e28dee54..49fc8f66e8aa 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -21,11 +21,13 @@
 
 #include <trace/events/gpu_mem.h>
 
+#include "regs/xe_bars.h"
 #include "xe_device.h"
 #include "xe_dma_buf.h"
 #include "xe_drm_client.h"
 #include "xe_ggtt.h"
 #include "xe_gt.h"
+#include "xe_gt_sriov_pf_config.h"
 #include "xe_map.h"
 #include "xe_migrate.h"
 #include "xe_pm.h"
@@ -34,6 +36,7 @@
 #include "xe_res_cursor.h"
 #include "xe_shrinker.h"
 #include "xe_sriov_vf_ccs.h"
+#include "xe_sriov_pf_helpers.h"
 #include "xe_trace_bo.h"
 #include "xe_ttm_stolen_mgr.h"
 #include "xe_vm.h"
@@ -346,6 +349,7 @@ struct xe_ttm_tt {
 	struct ttm_tt ttm;
 	struct sg_table sgt;
 	struct sg_table *sg;
+	struct dma_buf_ranges *ranges;
 	/** @purgeable: Whether the content of the pages of @ttm is purgeable. */
 	bool purgeable;
 };
@@ -679,6 +683,103 @@ static int xe_bo_trigger_rebind(struct xe_device *xe, struct xe_bo *bo,
 	return ret;
 }
 
+static struct pci_dev *xe_find_vf_dev(struct xe_device *xe,
+				      phys_addr_t phys)
+{
+	struct pci_dev *pdev, *pf_pdev = to_pci_dev(xe->drm.dev);
+	resource_size_t io_start, io_size;
+
+	list_for_each_entry(pdev, &pf_pdev->bus->devices, bus_list) {
+		if (pdev->is_physfn)
+			continue;
+
+		io_start = pci_resource_start(pdev, LMEM_BAR);
+		io_size = pci_resource_len(pdev, LMEM_BAR);
+
+		if (phys >= io_start &&
+		    phys < (io_start + io_size - PAGE_SIZE))
+			return pdev;
+	}
+
+	return NULL;
+}
+
+static int xe_bo_translate_io_addr_to_dpa(struct dma_buf_ranges *ranges,
+					  struct ttm_buffer_object *ttm_bo)
+{
+	struct xe_bo *lmem_obj = NULL, *bo = ttm_to_xe_bo(ttm_bo);
+	struct dma_buf_attachment *attach = ttm_bo->base.import_attach;
+	struct xe_device *xe = xe_bo_device(bo);
+	struct xe_gt *gt = xe_root_mmio_gt(xe);
+	struct drm_pagemap_addr *dma_addr;
+	resource_size_t io_start;
+	unsigned long i, offset;
+	struct pci_dev *vf_pdev;
+	struct range *range;
+	dma_addr_t addr;
+	int vfid, ret;
+	void *entry;
+
+	if (!IS_SRIOV_PF(xe))
+		return -EINVAL;
+
+	dma_addr = kmalloc_array(ranges->nranges, sizeof(*dma_addr),
+				 GFP_KERNEL);
+	if (!dma_addr)
+		return -ENOMEM;
+
+	xa_for_each(&ranges->ranges, i, entry) {
+		range = entry;
+		if (page_is_ram(PFN_DOWN(range->start))) {
+			ret = -EINVAL;
+			goto err_xlat;
+		}
+
+		vf_pdev = xe_find_vf_dev(xe, range->start);
+		if (!vf_pdev) {
+			ret = -EINVAL;
+			goto err_xlat;
+		}
+
+
+		/*
+		 * The below check prevents a malicious VF from accessing
+		 * another VF's addresses.
+		 */
+		vfid = pci_iov_vf_id(vf_pdev);
+		if (vfid < 0 ||
+		    vfid != pci_iov_vf_id(to_pci_dev(attach->ic_match->dev))) {
+			ret = -EPERM;
+			goto err_xlat;
+		}
+
+		if (!lmem_obj) {
+			lmem_obj = xe_gt_sriov_pf_config_get_lmem_obj(gt,
+								      vfid + 1);
+			if (!lmem_obj) {
+				ret = -EINVAL;
+				goto err_xlat;
+			}
+		}
+
+		io_start = pci_resource_start(vf_pdev, LMEM_BAR);
+		offset = range->start - io_start;
+		addr = xe_bo_addr(lmem_obj, offset, XE_PAGE_SIZE);
+
+		dma_addr[i] = drm_pagemap_addr_encode(addr,
+						XE_INTERCONNECT_VRAM,
+						get_order(range_len(range)),
+						DMA_BIDIRECTIONAL);
+	}
+
+	bo->dma_data.dma_addr = dma_addr;
+	return 0;
+err_xlat:
+	kfree(dma_addr);
+	xe_bo_put(lmem_obj);
+	return ret;
+}
+
 /*
  * The dma-buf map_attachment() / unmap_attachment() is hooked up here.
  * Note that unmapping the attachment is deferred to the next
@@ -692,11 +793,15 @@ static int xe_bo_move_dmabuf(struct ttm_buffer_object *ttm_bo,
 			     struct ttm_resource *new_res)
 {
 	struct dma_buf_attachment *attach = ttm_bo->base.import_attach;
+	struct dma_buf_interconnect_match *ic_match = attach->ic_match;
 	struct xe_ttm_tt *xe_tt = container_of(ttm_bo->ttm, struct xe_ttm_tt,
 					       ttm);
 	struct xe_device *xe = ttm_to_xe_device(ttm_bo->bdev);
 	bool device_unplugged = drm_dev_is_unplugged(&xe->drm);
-	struct sg_table *sg;
+	struct dma_buf_ranges *ranges;
+	struct sg_table *sg = NULL;
+	bool allow_ic = false;
+	int ret = 0;
 
 	xe_assert(xe, attach);
 	xe_assert(xe, ttm_bo->ttm);
@@ -716,10 +821,32 @@ static int xe_bo_move_dmabuf(struct ttm_buffer_object *ttm_bo,
 		dma_buf_unmap_attachment(attach, ttm_bo->sg, DMA_BIDIRECTIONAL);
 		ttm_bo->sg = NULL;
 	}
+	if (xe_tt->ranges) {
+		dma_buf_unmap_interconnect(attach, xe_tt->ranges);
+		xe_tt->ranges = NULL;
+	}
 
-	sg = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
-	if (IS_ERR(sg))
-		return PTR_ERR(sg);
+	if (attach->allow_ic && ic_match &&
+	    ic_match->type == iov_interconnect) {
+		allow_ic = true;
+
+		ranges = dma_buf_map_interconnect(attach);
+		if (IS_ERR(ranges)) {
+			allow_ic = false;
+		} else {
+			if (xe_bo_translate_io_addr_to_dpa(ranges, ttm_bo)) {
+				dma_buf_unmap_interconnect(attach, ranges);
+				allow_ic = false;
+			}
+			xe_tt->ranges = ranges;
+		}
+		attach->allow_ic = allow_ic;
+	}
+	if (!allow_ic) {
+		sg = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
+		if (IS_ERR(sg))
+			return PTR_ERR(sg);
+	}
 
 	ttm_bo->sg = sg;
 	xe_tt->sg = sg;
@@ -727,7 +854,7 @@ static int xe_bo_move_dmabuf(struct ttm_buffer_object *ttm_bo,
 out:
 	ttm_bo_move_null(ttm_bo, new_res);
 
-	return 0;
+	return ret;
 }
 
 /**
@@ -1537,6 +1664,7 @@ static void xe_ttm_bo_release_notify(struct ttm_buffer_object *ttm_bo)
 static void xe_ttm_bo_delete_mem_notify(struct ttm_buffer_object *ttm_bo)
 {
 	struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
+	struct xe_ttm_tt *xe_tt;
 
 	if (!xe_bo_is_xe_bo(ttm_bo))
 		return;
@@ -1544,18 +1672,28 @@ static void xe_ttm_bo_delete_mem_notify(struct ttm_buffer_object *ttm_bo)
 	if (IS_VF_CCS_READY(ttm_to_xe_device(ttm_bo->bdev)))
 		xe_sriov_vf_ccs_detach_bo(bo);
 
+	xe_tt = container_of(ttm_bo->ttm, struct xe_ttm_tt, ttm);
 	/*
 	 * Object is idle and about to be destroyed. Release the
 	 * dma-buf attachment.
 	 */
-	if (ttm_bo->type == ttm_bo_type_sg && ttm_bo->sg) {
-		struct xe_ttm_tt *xe_tt = container_of(ttm_bo->ttm,
-						       struct xe_ttm_tt, ttm);
+	if (ttm_bo->type == ttm_bo_type_sg) {
+		if (ttm_bo->sg) {
+			dma_buf_unmap_attachment(ttm_bo->base.import_attach,
+						 ttm_bo->sg, DMA_BIDIRECTIONAL);
+			ttm_bo->sg = NULL;
+			xe_tt->sg = NULL;
+		}
+		if (xe_tt->ranges) {
+			dma_buf_unmap_interconnect(ttm_bo->base.import_attach,
+						   xe_tt->ranges);
+			xe_tt->ranges = NULL;
+		}
 
-		dma_buf_unmap_attachment(ttm_bo->base.import_attach, ttm_bo->sg,
-					 DMA_BIDIRECTIONAL);
-		ttm_bo->sg = NULL;
-		xe_tt->sg = NULL;
+		if (bo->dma_data.dma_addr) {
+			xe_bo_put(bo->dma_data.lmem_obj);
+			kfree(bo->dma_data.dma_addr);
+		}
 	}
 }
 
diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h
index d4fe3c8dca5b..525e46608341 100644
--- a/drivers/gpu/drm/xe/xe_bo_types.h
+++ b/drivers/gpu/drm/xe/xe_bo_types.h
@@ -108,6 +108,12 @@ struct xe_bo {
 	 * from default
 	 */
 	u64 min_align;
+
+	/**
+	 * @dma_data: DMA related data for an imported dmabuf BO that is VRAM
+	 * based.
+	 */
+	struct xe_sriov_dma_data dma_data;
 };
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
index 956a88f9f213..6d5f923f7fc4 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
@@ -11,6 +11,8 @@
 
 #include "xe_sriov_pf_service_types.h"
 
+struct xe_bo;
+
 /**
  * struct xe_sriov_metadata - per-VF device level metadata
  */
@@ -42,4 +44,21 @@ struct xe_device_pf {
 	struct xe_sriov_metadata *vfs;
 };
 
+/**
+ * struct xe_sriov_dma_data - DMA related data for LMEM based imported dmabuf
+ * BOs that are associated with a sriov VF.
+ *
+ * The data in this structure is valid only if driver is running in the
+ * @XE_SRIOV_MODE_PF mode.
+ */
+struct xe_sriov_dma_data {
+	/**
+	 * @dma_addr: An array to store DMA addresses (DPAs) for imported
+	 * dmabuf BOs that are LMEM based.
+	 */
+	struct drm_pagemap_addr *dma_addr;
+
+	/** @lmem_obj: Ref taken on the LMEM obj that backs a VF's quota */
+	struct xe_bo *lmem_obj;
+};
 #endif
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC v2 8/8] drm/xe/pt: Add an additional check for dmabuf BOs while doing bind
  2025-10-27  4:44 [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Vivek Kasireddy
                   ` (6 preceding siblings ...)
  2025-10-27  4:44 ` [RFC v2 7/8] drm/xe/bo: Create new dma_addr array for dmabuf BOs associated with VFs Vivek Kasireddy
@ 2025-10-27  4:44 ` Vivek Kasireddy
  2025-10-29  0:27 ` [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Jason Gunthorpe
  8 siblings, 0 replies; 24+ messages in thread
From: Vivek Kasireddy @ 2025-10-27  4:44 UTC (permalink / raw)
  To: dri-devel, intel-xe, linux-media, linaro-mm-sig
  Cc: Vivek Kasireddy, Matthew Brost

If a BO's dma_data.dma_addr pointer is valid, it means that it is
an imported dmabuf BO that has a backing store in VRAM. Therefore,
in this case, we need to iterate over its dma_addr array.

v2:
- Use a cursor to iterate over the entries in the dma_addr array
  instead of relying on SG iterator (Matt)

v3:
- Since XE_PPGTT_PTE_DM is added to the PTE flags in all cases,
  remove the bo->is_devmem_external check added in v2

v4:
- Drop is_devmem_external and instead rely on bo->dma_data.dma_addr
  to check for imported VRAM BOs (Matt)

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 drivers/gpu/drm/xe/xe_pt.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index a1c88f9a6c76..18f959247e8d 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -759,6 +759,10 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
 
 	xe_walk.default_vram_pte |= XE_PPGTT_PTE_DM;
 	xe_walk.dma_offset = bo ? vram_region_gpu_offset(bo->ttm.resource) : 0;
+
+	if (bo && bo->dma_data.dma_addr)
+		xe_walk.dma_offset = 0;
+
 	if (!range)
 		xe_bo_assert_held(bo);
 
@@ -769,6 +773,10 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
 		else if (xe_bo_is_vram(bo) || xe_bo_is_stolen(bo))
 			xe_res_first(bo->ttm.resource, xe_vma_bo_offset(vma),
 				     xe_vma_size(vma), &curs);
+		else if (bo && bo->dma_data.dma_addr)
+			xe_res_first_dma(bo->dma_data.dma_addr,
+					 xe_vma_bo_offset(vma),
+					 xe_vma_size(vma), &curs);
 		else
 			xe_res_first_sg(xe_bo_sg(bo), xe_vma_bo_offset(vma),
 					xe_vma_size(vma), &curs);
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects
  2025-10-27  4:44 [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Vivek Kasireddy
                   ` (7 preceding siblings ...)
  2025-10-27  4:44 ` [RFC v2 8/8] drm/xe/pt: Add an additional check for dmabuf BOs while doing bind Vivek Kasireddy
@ 2025-10-29  0:27 ` Jason Gunthorpe
  2025-10-29  9:25   ` Leon Romanovsky
  2025-10-30  6:17   ` Kasireddy, Vivek
  8 siblings, 2 replies; 24+ messages in thread
From: Jason Gunthorpe @ 2025-10-29  0:27 UTC (permalink / raw)
  To: Vivek Kasireddy
  Cc: dri-devel, intel-xe, linux-media, linaro-mm-sig, Leon Romanovsky,
	Christian Koenig, Sumit Semwal, Thomas Hellström,
	Simona Vetter, Matthew Brost, Dongwon Kim

On Sun, Oct 26, 2025 at 09:44:12PM -0700, Vivek Kasireddy wrote:
> In a typical dma-buf use case, a dmabuf exporter makes its buffer
> buffer available to an importer by mapping it using DMA APIs
> such as dma_map_sgtable() or dma_map_resource(). However, this
> is not desirable in some cases where the exporter and importer
> are directly connected via a physical or virtual link (or
> interconnect) and the importer can access the buffer without
> having it DMA mapped.

I think my explanation was not so clear, I spent a few hours and typed
in what I was thinking about here:

https://github.com/jgunthorpe/linux/commits/dmabuf_map_type

I didn't type in the last patch for iommufd side, hopefully it is
clear enough. Adding iov should follow the pattern of the "physical
address list" patch.

I think the use of EXPORT_SYMBOL_FOR_MODULES() to lock down the
physical addres list mapping type to iommufd is clever and I'm hoping
addresses Chrsitian's concerns about abuse.

Single GPU drivers can easilly declare their own mapping type for
their own private interconnect without needing to change the core
code.

This seems to be fairly straightforward and reasonably type safe..

What do you think?

Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects
  2025-10-29  0:27 ` [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Jason Gunthorpe
@ 2025-10-29  9:25   ` Leon Romanovsky
  2025-10-29 11:53     ` Jason Gunthorpe
  2025-10-30  6:17   ` Kasireddy, Vivek
  1 sibling, 1 reply; 24+ messages in thread
From: Leon Romanovsky @ 2025-10-29  9:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Vivek Kasireddy, dri-devel, intel-xe, linux-media, linaro-mm-sig,
	Christian Koenig, Sumit Semwal, Thomas Hellström,
	Simona Vetter, Matthew Brost, Dongwon Kim

On Tue, Oct 28, 2025 at 09:27:26PM -0300, Jason Gunthorpe wrote:
> On Sun, Oct 26, 2025 at 09:44:12PM -0700, Vivek Kasireddy wrote:
> > In a typical dma-buf use case, a dmabuf exporter makes its buffer
> > buffer available to an importer by mapping it using DMA APIs
> > such as dma_map_sgtable() or dma_map_resource(). However, this
> > is not desirable in some cases where the exporter and importer
> > are directly connected via a physical or virtual link (or
> > interconnect) and the importer can access the buffer without
> > having it DMA mapped.
> 
> I think my explanation was not so clear, I spent a few hours and typed
> in what I was thinking about here:
> 
> https://github.com/jgunthorpe/linux/commits/dmabuf_map_type
> 
> I didn't type in the last patch for iommufd side, hopefully it is
> clear enough. Adding iov should follow the pattern of the "physical
> address list" patch.
> 
> I think the use of EXPORT_SYMBOL_FOR_MODULES() to lock down the
> physical addres list mapping type to iommufd is clever and I'm hoping
> addresses Chrsitian's concerns about abuse.
> 
> Single GPU drivers can easilly declare their own mapping type for
> their own private interconnect without needing to change the core
> code.
> 
> This seems to be fairly straightforward and reasonably type safe..

It makes me wonder what am I supposed to do with my series now [1]?
How do you see submission plan now?

[1] https://lore.kernel.org/all/cover.1760368250.git.leon@kernel.org/


> 
> What do you think?
> 
> Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects
  2025-10-29  9:25   ` Leon Romanovsky
@ 2025-10-29 11:53     ` Jason Gunthorpe
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Gunthorpe @ 2025-10-29 11:53 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Vivek Kasireddy, dri-devel, intel-xe, linux-media, linaro-mm-sig,
	Christian Koenig, Sumit Semwal, Thomas Hellström,
	Simona Vetter, Matthew Brost, Dongwon Kim

On Wed, Oct 29, 2025 at 11:25:34AM +0200, Leon Romanovsky wrote:
> On Tue, Oct 28, 2025 at 09:27:26PM -0300, Jason Gunthorpe wrote:
> > On Sun, Oct 26, 2025 at 09:44:12PM -0700, Vivek Kasireddy wrote:
> > > In a typical dma-buf use case, a dmabuf exporter makes its buffer
> > > buffer available to an importer by mapping it using DMA APIs
> > > such as dma_map_sgtable() or dma_map_resource(). However, this
> > > is not desirable in some cases where the exporter and importer
> > > are directly connected via a physical or virtual link (or
> > > interconnect) and the importer can access the buffer without
> > > having it DMA mapped.
> > 
> > I think my explanation was not so clear, I spent a few hours and typed
> > in what I was thinking about here:
> > 
> > https://github.com/jgunthorpe/linux/commits/dmabuf_map_type
> > 
> > I didn't type in the last patch for iommufd side, hopefully it is
> > clear enough. Adding iov should follow the pattern of the "physical
> > address list" patch.
> > 
> > I think the use of EXPORT_SYMBOL_FOR_MODULES() to lock down the
> > physical addres list mapping type to iommufd is clever and I'm hoping
> > addresses Chrsitian's concerns about abuse.
> > 
> > Single GPU drivers can easilly declare their own mapping type for
> > their own private interconnect without needing to change the core
> > code.
> > 
> > This seems to be fairly straightforward and reasonably type safe..
> 
> It makes me wonder what am I supposed to do with my series now [1]?
> How do you see submission plan now?
> 
> [1] https://lore.kernel.org/all/cover.1760368250.git.leon@kernel.org/

IMHO that series needs the small tweaks and should go this merge
window, ideally along with the iommufd half.

I think this thread is a topic for the next cycle, I expect it will
take some time to converge on the dmabuf core changes, and adapting
your series is quite simple.

Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects
  2025-10-29  0:27 ` [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Jason Gunthorpe
  2025-10-29  9:25   ` Leon Romanovsky
@ 2025-10-30  6:17   ` Kasireddy, Vivek
  2025-10-30 13:43     ` Jason Gunthorpe
  1 sibling, 1 reply; 24+ messages in thread
From: Kasireddy, Vivek @ 2025-10-30  6:17 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky, Christian Koenig, Sumit Semwal,
	Thomas Hellström, Simona Vetter, Brost, Matthew,
	Kim, Dongwon
  Cc: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org

Hi Jason,

> Subject: Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via
> interconnects
> 
> On Sun, Oct 26, 2025 at 09:44:12PM -0700, Vivek Kasireddy wrote:
> > In a typical dma-buf use case, a dmabuf exporter makes its buffer
> > buffer available to an importer by mapping it using DMA APIs
> > such as dma_map_sgtable() or dma_map_resource(). However, this
> > is not desirable in some cases where the exporter and importer
> > are directly connected via a physical or virtual link (or
> > interconnect) and the importer can access the buffer without
> > having it DMA mapped.
> 
> I think my explanation was not so clear, I spent a few hours and typed
> in what I was thinking about here:
> 
> https://github.com/jgunthorpe/linux/commits/dmabuf_map_type
> 
> I didn't type in the last patch for iommufd side, hopefully it is
> clear enough. Adding iov should follow the pattern of the "physical
> address list" patch.
> 
> I think the use of EXPORT_SYMBOL_FOR_MODULES() to lock down the
> physical addres list mapping type to iommufd is clever and I'm hoping
> addresses Chrsitian's concerns about abuse.
> 
> Single GPU drivers can easilly declare their own mapping type for
> their own private interconnect without needing to change the core
> code.
> 
> This seems to be fairly straightforward and reasonably type safe..
> 
> What do you think?
It mostly looks OK to me but there are a few things that I want to discuss,
after briefly looking at the patches in your branch:
- I am wondering what is the benefit of the SGT compatibility stuff especially
when Christian suggested that he'd like to see SGT usage gone from dma-buf
eventually. Also, if matching fails, IMO, indicating that to the importer (allow_ic)
and having both exporter/importer fallback to the current legacy mechanism
would be simpler than the SGT compatibility stuff.

- Also, I thought PCIe P2P (along with SGT) use-cases are already well handled
by the existing map_dma_buf() and other interfaces. So, it might be confusing
if the newer interfaces also provide a mechanism to handle P2P although a
bit differently. I might be missing something here but shouldn't the existing
allow_peer2peer and other related stuff be left alone?

- You are also adding custom attach/detach ops for each mapping_type. I think
it makes sense to reuse existing attach/detach ops if possible and initiate the
matching process from there, at-least initially.

- Looks like your design doesn't call for a dma_buf_map_interconnect() or other
similar helpers provided by dma-buf core that the importers can use. Is that
because the return type would not be known to the core?

- And, just to confirm, with your design if I want to add a new interconnect/
mapping_type (not just IOV but in general), all that is needed is to provide custom
attach/detach, match ops and one or more ops to map/unmap the address list
right? Does this mean that the role of dma-buf core would be limited to just
match and the exporters are expected to do most of the heavy lifting and
checking for stuff like dynamic importers, resv lock held, etc?

Thanks,
Vivek

> 
> Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects
  2025-10-30  6:17   ` Kasireddy, Vivek
@ 2025-10-30 13:43     ` Jason Gunthorpe
  2025-10-31  5:15       ` Kasireddy, Vivek
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Gunthorpe @ 2025-10-30 13:43 UTC (permalink / raw)
  To: Kasireddy, Vivek
  Cc: Leon Romanovsky, Christian Koenig, Sumit Semwal,
	Thomas Hellström, Simona Vetter, Brost, Matthew,
	Kim, Dongwon, dri-devel@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org

On Thu, Oct 30, 2025 at 06:17:11AM +0000, Kasireddy, Vivek wrote:
> It mostly looks OK to me but there are a few things that I want to discuss,
> after briefly looking at the patches in your branch:
> - I am wondering what is the benefit of the SGT compatibility stuff especially
> when Christian suggested that he'd like to see SGT usage gone from
> dma-buf

I think to get rid of SGT we do need to put it in a little well
defined box and then create alternatives and remove things using
SGT. This is a long journey, and I think this is the first step.

If SGT is some special case it will be harder to excise.

So the next steps would be to make all the exporters directly declare
a SGT and then remove the SGT related ops from dma_ops itself and
remove the compat sgt in the attach logic. This is not hard, it is all
simple mechanical work.

This way the only compat requirement is to automatically give an
import match list for a SGT only importer which is very little code in
the core.

The point is we make the SGT stuff nonspecial and fully aligned with
the mapping type in small steps. This way neither importer nor
exporter should have any special code to deal with interworking.

To remove SGT we'd want to teach the core code how to create some kind
of conversion mapping type, eg exporter uses SGT importer uses NEW so
the magic conversion mapping type does the adapatation.

In this way we can convert importers and exporters to use NEW in any
order and they still interwork with each other.

> eventually. Also, if matching fails, IMO, indicating that to the
> importer (allow_ic) and having both exporter/importer fallback to
> the current legacy mechanism would be simpler than the SGT
> compatibility stuff.

I don't want to have three paths in importers.

If the importer supports SGT it should declare it in a match and the
core code should always return a SGT match for the importer to use

The importer should not have to code 'oh it is sgt but it somehow a
little different' via an allow_ic type idea.

> - Also, I thought PCIe P2P (along with SGT) use-cases are already well handled
> by the existing map_dma_buf() and other interfaces. So, it might be confusing
> if the newer interfaces also provide a mechanism to handle P2P although a
> bit differently. I might be missing something here but shouldn't the existing
> allow_peer2peer and other related stuff be left alone?

P2P is part of SGT, it gets pulled into the SGT stuff as steps toward
isolating SGT properly. Again as we move things to use native SGT
exporters we would remove the exporter related allow_peer2peer items
when they become unused.

> - You are also adding custom attach/detach ops for each mapping_type. I think
> it makes sense to reuse existing attach/detach ops if possible and initiate the
> matching process from there, at-least initially.

I started there, but as soon as I went to adding PAL I realized the
attach/detach logic was completely different for each of the mapping
types. So this is looking alot simpler.

If the driver wants to share the same attach/detach ops for some of
its mapping types then it can just set the same function pointer to
all of them and pick up the mapping type from the attach->map_type.

> - Looks like your design doesn't call for a dma_buf_map_interconnect() or other
> similar helpers provided by dma-buf core that the importers can use. Is that
> because the return type would not be known to the core?

I don't want to have a single shared 'map' operation, that is the
whole point of this design. Each mapping type has its own ops, own
types, own function signatures that the client calls directly.

No more type confusion or trying to abuse phys_addr_t, dma_addr_t, or
scatterlist for in appropriate things. If your driver wants something
special, like IOV, then give it proper clear types so it is
understandable.

> - And, just to confirm, with your design if I want to add a new interconnect/
> mapping_type (not just IOV but in general), all that is needed is to provide custom
> attach/detach, match ops and one or more ops to map/unmap the address list
> right? Does this mean that the role of dma-buf core would be limited to just
> match and the exporters are expected to do most of the heavy lifting and
> checking for stuff like dynamic importers, resv lock held, etc?

I expect the core code would continue to provide wrappers and helpers
to call the ops that can do any required common stuff.

However, keep in mind, when the importer moves to use mapping type it
also must be upgraded to use the dynamic importer flow as this API
doesn't support non-dynamic importers using mapping type.

I will add some of these remarks to the commit messages..

Thanks!
Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects
  2025-10-30 13:43     ` Jason Gunthorpe
@ 2025-10-31  5:15       ` Kasireddy, Vivek
  2025-10-31  7:46         ` Christian König
  0 siblings, 1 reply; 24+ messages in thread
From: Kasireddy, Vivek @ 2025-10-31  5:15 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Christian Koenig, Sumit Semwal,
	Thomas Hellström, Simona Vetter, Brost, Matthew,
	Kim, Dongwon, dri-devel@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org

Hi Jason,

> Subject: Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via
> interconnects
> 
> On Thu, Oct 30, 2025 at 06:17:11AM +0000, Kasireddy, Vivek wrote:
> > It mostly looks OK to me but there are a few things that I want to discuss,
> > after briefly looking at the patches in your branch:
> > - I am wondering what is the benefit of the SGT compatibility stuff especially
> > when Christian suggested that he'd like to see SGT usage gone from
> > dma-buf
> 
> I think to get rid of SGT we do need to put it in a little well
> defined box and then create alternatives and remove things using
> SGT. This is a long journey, and I think this is the first step.
> 
> If SGT is some special case it will be harder to excise.
> 
> So the next steps would be to make all the exporters directly declare
> a SGT and then remove the SGT related ops from dma_ops itself and
> remove the compat sgt in the attach logic. This is not hard, it is all
> simple mechanical work.
IMO, this SGT compatibility stuff should ideally be a separate follow-on
effort (and patch series) that would also probably include updates to
various drivers to add the SGT mapping type.

> 
> This way the only compat requirement is to automatically give an
> import match list for a SGT only importer which is very little code in
> the core.
> 
> The point is we make the SGT stuff nonspecial and fully aligned with
> the mapping type in small steps. This way neither importer nor
> exporter should have any special code to deal with interworking.
> 
> To remove SGT we'd want to teach the core code how to create some kind
> of conversion mapping type, eg exporter uses SGT importer uses NEW so
> the magic conversion mapping type does the adapatation.
> 
> In this way we can convert importers and exporters to use NEW in any
> order and they still interwork with each other.
> 
> > eventually. Also, if matching fails, IMO, indicating that to the
> > importer (allow_ic) and having both exporter/importer fallback to
> > the current legacy mechanism would be simpler than the SGT
> > compatibility stuff.
> 
> I don't want to have three paths in importers.
> 
> If the importer supports SGT it should declare it in a match and the
> core code should always return a SGT match for the importer to use
> 
> The importer should not have to code 'oh it is sgt but it somehow a
> little different' via an allow_ic type idea.
> 
> > - Also, I thought PCIe P2P (along with SGT) use-cases are already well
> handled
> > by the existing map_dma_buf() and other interfaces. So, it might be
> confusing
> > if the newer interfaces also provide a mechanism to handle P2P although a
> > bit differently. I might be missing something here but shouldn't the existing
> > allow_peer2peer and other related stuff be left alone?
> 
> P2P is part of SGT, it gets pulled into the SGT stuff as steps toward
> isolating SGT properly. Again as we move things to use native SGT
> exporters we would remove the exporter related allow_peer2peer items
> when they become unused.
> 
> > - You are also adding custom attach/detach ops for each mapping_type. I
> think
> > it makes sense to reuse existing attach/detach ops if possible and initiate
> the
> > matching process from there, at-least initially.
> 
> I started there, but as soon as I went to adding PAL I realized the
> attach/detach logic was completely different for each of the mapping
> types. So this is looking alot simpler.
> 
> If the driver wants to share the same attach/detach ops for some of
> its mapping types then it can just set the same function pointer to
> all of them and pick up the mapping type from the attach->map_type.
> 
> > - Looks like your design doesn't call for a dma_buf_map_interconnect() or
> other
> > similar helpers provided by dma-buf core that the importers can use. Is that
> > because the return type would not be known to the core?
> 
> I don't want to have a single shared 'map' operation, that is the
> whole point of this design. Each mapping type has its own ops, own
> types, own function signatures that the client calls directly.
> 
> No more type confusion or trying to abuse phys_addr_t, dma_addr_t, or
> scatterlist for in appropriate things. If your driver wants something
> special, like IOV, then give it proper clear types so it is
> understandable.
> 
> > - And, just to confirm, with your design if I want to add a new interconnect/
> > mapping_type (not just IOV but in general), all that is needed is to provide
> custom
> > attach/detach, match ops and one or more ops to map/unmap the address
> list
> > right? Does this mean that the role of dma-buf core would be limited to just
> > match and the exporters are expected to do most of the heavy lifting and
> > checking for stuff like dynamic importers, resv lock held, etc?
> 
> I expect the core code would continue to provide wrappers and helpers
> to call the ops that can do any required common stuff.
> 
> However, keep in mind, when the importer moves to use mapping type it
> also must be upgraded to use the dynamic importer flow as this API
> doesn't support non-dynamic importers using mapping type.
> 
> I will add some of these remarks to the commit messages..
Sounds good. I'll start testing/working on IOV interconnect patches based on
your design.

Thanks,
Vivek
> 
> Thanks!
> Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects
  2025-10-31  5:15       ` Kasireddy, Vivek
@ 2025-10-31  7:46         ` Christian König
  2025-10-31 13:16           ` Jason Gunthorpe
  0 siblings, 1 reply; 24+ messages in thread
From: Christian König @ 2025-10-31  7:46 UTC (permalink / raw)
  To: Kasireddy, Vivek, Jason Gunthorpe
  Cc: Leon Romanovsky, Sumit Semwal, Thomas Hellström,
	Simona Vetter, Brost, Matthew, Kim, Dongwon,
	dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org

On 10/31/25 06:15, Kasireddy, Vivek wrote:
> Hi Jason,
> 
>> Subject: Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via
>> interconnects
>>
>> On Thu, Oct 30, 2025 at 06:17:11AM +0000, Kasireddy, Vivek wrote:
>>> It mostly looks OK to me but there are a few things that I want to discuss,
>>> after briefly looking at the patches in your branch:
>>> - I am wondering what is the benefit of the SGT compatibility stuff especially
>>> when Christian suggested that he'd like to see SGT usage gone from
>>> dma-buf
>>
>> I think to get rid of SGT we do need to put it in a little well
>> defined box and then create alternatives and remove things using
>> SGT. This is a long journey, and I think this is the first step.
>>
>> If SGT is some special case it will be harder to excise.
>>
>> So the next steps would be to make all the exporters directly declare
>> a SGT and then remove the SGT related ops from dma_ops itself and
>> remove the compat sgt in the attach logic. This is not hard, it is all
>> simple mechanical work.
> IMO, this SGT compatibility stuff should ideally be a separate follow-on
> effort (and patch series) that would also probably include updates to
> various drivers to add the SGT mapping type.

Nope, just the other way around. In other words the SGT compatibility is a pre-requisite.

We should first demonstrate with existing drivers that the new interface works and does what it promised to do and then extend it with new functionality.

Regards,
Christian.

> 
>>
>> This way the only compat requirement is to automatically give an
>> import match list for a SGT only importer which is very little code in
>> the core.
>>
>> The point is we make the SGT stuff nonspecial and fully aligned with
>> the mapping type in small steps. This way neither importer nor
>> exporter should have any special code to deal with interworking.
>>
>> To remove SGT we'd want to teach the core code how to create some kind
>> of conversion mapping type, eg exporter uses SGT importer uses NEW so
>> the magic conversion mapping type does the adapatation.
>>
>> In this way we can convert importers and exporters to use NEW in any
>> order and they still interwork with each other.
>>
>>> eventually. Also, if matching fails, IMO, indicating that to the
>>> importer (allow_ic) and having both exporter/importer fallback to
>>> the current legacy mechanism would be simpler than the SGT
>>> compatibility stuff.
>>
>> I don't want to have three paths in importers.
>>
>> If the importer supports SGT it should declare it in a match and the
>> core code should always return a SGT match for the importer to use
>>
>> The importer should not have to code 'oh it is sgt but it somehow a
>> little different' via an allow_ic type idea.
>>
>>> - Also, I thought PCIe P2P (along with SGT) use-cases are already well
>> handled
>>> by the existing map_dma_buf() and other interfaces. So, it might be
>> confusing
>>> if the newer interfaces also provide a mechanism to handle P2P although a
>>> bit differently. I might be missing something here but shouldn't the existing
>>> allow_peer2peer and other related stuff be left alone?
>>
>> P2P is part of SGT, it gets pulled into the SGT stuff as steps toward
>> isolating SGT properly. Again as we move things to use native SGT
>> exporters we would remove the exporter related allow_peer2peer items
>> when they become unused.
>>
>>> - You are also adding custom attach/detach ops for each mapping_type. I
>> think
>>> it makes sense to reuse existing attach/detach ops if possible and initiate
>> the
>>> matching process from there, at-least initially.
>>
>> I started there, but as soon as I went to adding PAL I realized the
>> attach/detach logic was completely different for each of the mapping
>> types. So this is looking alot simpler.
>>
>> If the driver wants to share the same attach/detach ops for some of
>> its mapping types then it can just set the same function pointer to
>> all of them and pick up the mapping type from the attach->map_type.
>>
>>> - Looks like your design doesn't call for a dma_buf_map_interconnect() or
>> other
>>> similar helpers provided by dma-buf core that the importers can use. Is that
>>> because the return type would not be known to the core?
>>
>> I don't want to have a single shared 'map' operation, that is the
>> whole point of this design. Each mapping type has its own ops, own
>> types, own function signatures that the client calls directly.
>>
>> No more type confusion or trying to abuse phys_addr_t, dma_addr_t, or
>> scatterlist for in appropriate things. If your driver wants something
>> special, like IOV, then give it proper clear types so it is
>> understandable.
>>
>>> - And, just to confirm, with your design if I want to add a new interconnect/
>>> mapping_type (not just IOV but in general), all that is needed is to provide
>> custom
>>> attach/detach, match ops and one or more ops to map/unmap the address
>> list
>>> right? Does this mean that the role of dma-buf core would be limited to just
>>> match and the exporters are expected to do most of the heavy lifting and
>>> checking for stuff like dynamic importers, resv lock held, etc?
>>
>> I expect the core code would continue to provide wrappers and helpers
>> to call the ops that can do any required common stuff.
>>
>> However, keep in mind, when the importer moves to use mapping type it
>> also must be upgraded to use the dynamic importer flow as this API
>> doesn't support non-dynamic importers using mapping type.
>>
>> I will add some of these remarks to the commit messages..
> Sounds good. I'll start testing/working on IOV interconnect patches based on
> your design.
> 
> Thanks,
> Vivek
>>
>> Thanks!
>> Jason


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects
  2025-10-31  7:46         ` Christian König
@ 2025-10-31 13:16           ` Jason Gunthorpe
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Gunthorpe @ 2025-10-31 13:16 UTC (permalink / raw)
  To: Christian König
  Cc: Kasireddy, Vivek, Leon Romanovsky, Sumit Semwal,
	Thomas Hellström, Simona Vetter, Brost, Matthew,
	Kim, Dongwon, dri-devel@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org

On Fri, Oct 31, 2025 at 08:46:34AM +0100, Christian König wrote:
> On 10/31/25 06:15, Kasireddy, Vivek wrote:
> > Hi Jason,
> > 
> >> Subject: Re: [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via
> >> interconnects
> >>
> >> On Thu, Oct 30, 2025 at 06:17:11AM +0000, Kasireddy, Vivek wrote:
> >>> It mostly looks OK to me but there are a few things that I want to discuss,
> >>> after briefly looking at the patches in your branch:
> >>> - I am wondering what is the benefit of the SGT compatibility stuff especially
> >>> when Christian suggested that he'd like to see SGT usage gone from
> >>> dma-buf
> >>
> >> I think to get rid of SGT we do need to put it in a little well
> >> defined box and then create alternatives and remove things using
> >> SGT. This is a long journey, and I think this is the first step.
> >>
> >> If SGT is some special case it will be harder to excise.
> >>
> >> So the next steps would be to make all the exporters directly declare
> >> a SGT and then remove the SGT related ops from dma_ops itself and
> >> remove the compat sgt in the attach logic. This is not hard, it is all
> >> simple mechanical work.
> > IMO, this SGT compatibility stuff should ideally be a separate follow-on
> > effort (and patch series) that would also probably include updates to
> > various drivers to add the SGT mapping type.
> 
> Nope, just the other way around. In other words the SGT
> compatibility is a pre-requisite.
> 
> We should first demonstrate with existing drivers that the new
> interface works and does what it promised to do and then extend it
> with new functionality.

Ok, so I think that is what my github is showing.

Everything interworks, non-mapping-type aware code simply acts exactly
as though it is using a SGT mapping type from the perspective of aware
code.

I see a fairly easy path to do some driver upgrades to make more
things more mapping aware and remove some of the compatibility parts.

Let me see if I can post a RFC version next week, got a big pile of
other stuff to do still..

Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2025-10-31 13:16 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-27  4:44 [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Vivek Kasireddy
2025-10-27  4:44 ` [RFC v2 1/8] dma-buf: Add support for map/unmap APIs for interconnects Vivek Kasireddy
2025-10-27 17:47   ` Jason Gunthorpe
2025-10-28  5:39     ` Kasireddy, Vivek
2025-10-28 12:21       ` Jason Gunthorpe
2025-10-27  4:44 ` [RFC v2 2/8] dma-buf: Add a helper to match interconnects between exporter/importer Vivek Kasireddy
2025-10-27 18:18   ` Jason Gunthorpe
2025-10-28  6:04     ` Kasireddy, Vivek
2025-10-27  4:44 ` [RFC v2 3/8] dma-buf: Create and expose IOV interconnect to all exporters/importers Vivek Kasireddy
2025-10-27  4:44 ` [RFC v2 4/8] vfio/pci/dmabuf: Add support for IOV interconnect Vivek Kasireddy
2025-10-28  2:00   ` Matthew Brost
2025-10-28  5:05     ` Kasireddy, Vivek
2025-10-27  4:44 ` [RFC v2 5/8] drm/xe/dma_buf: " Vivek Kasireddy
2025-10-27  4:44 ` [RFC v2 6/8] drm/xe/pf: Add a helper function to get a VF's backing object in LMEM Vivek Kasireddy
2025-10-27  4:44 ` [RFC v2 7/8] drm/xe/bo: Create new dma_addr array for dmabuf BOs associated with VFs Vivek Kasireddy
2025-10-27  4:44 ` [RFC v2 8/8] drm/xe/pt: Add an additional check for dmabuf BOs while doing bind Vivek Kasireddy
2025-10-29  0:27 ` [RFC v2 0/8] dma-buf: Add support for mapping dmabufs via interconnects Jason Gunthorpe
2025-10-29  9:25   ` Leon Romanovsky
2025-10-29 11:53     ` Jason Gunthorpe
2025-10-30  6:17   ` Kasireddy, Vivek
2025-10-30 13:43     ` Jason Gunthorpe
2025-10-31  5:15       ` Kasireddy, Vivek
2025-10-31  7:46         ` Christian König
2025-10-31 13:16           ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).