* [RFC 0/2] Retrieve tph from dmabuf for PCIe P2P memory access
@ 2026-02-09 17:53 Zhiping Zhang
2026-02-09 17:53 ` [RFC 1/2] Vfio: add callback to get tph info for dmabuf Zhiping Zhang
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Zhiping Zhang @ 2026-02-09 17:53 UTC (permalink / raw)
To: Jason Gunthorpe, Leon Romanovsky, Bjorn Helgaas, linux-rdma,
linux-pci, netdev, Keith Busch, Yochai Cohen, Yishai Hadas
Cc: Bjorn Helgaas, Zhiping Zhang
Currently, the steering tag can be used for a CPU on the motherboard; an
ACPI check is in place to query and obtain the supported tph settings. Here
we intend to use the tph info to improve RDMA NIC memory access on a vfio-based
accelerator device via PCIe peer-to-peer. When an application registers an
RDMA memory region with DMABUF for the RDMA NIC to access the device memory,
the tph associated with the memory region can be retrieved and used to set the
steering tag / processing hint (ph). The tph contains additional instructions
or hints to the GPU or accelerator device for advanced memory operations,
such as read cache selection.
Note that this RFC is meant for discussing the direction and is not intended
to be a complete implementation. Once the direction is agreed on, we will work
on the implementation of a real patch set.
Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
[RFC 1/2] Vfio: add callback to get tph info for dmabuf
[RFC 2/2] RMDA MLX5: get tph for p2p access when registering dmabuf
* [RFC 1/2] Vfio: add callback to get tph info for dmabuf
2026-02-09 17:53 [RFC 0/2] Retrieve tph from dmabuf for PCIe P2P memory access Zhiping Zhang
@ 2026-02-09 17:53 ` Zhiping Zhang
2026-02-09 17:53 ` [RFC 2/2] RMDA MLX5: get tph for p2p access when registering dmabuf mr Zhiping Zhang
2026-02-09 18:13 ` [RFC 0/2] Retrieve tph from dmabuf for PCIe P2P memory access Jason Gunthorpe
2 siblings, 0 replies; 5+ messages in thread
From: Zhiping Zhang @ 2026-02-09 17:53 UTC (permalink / raw)
To: Jason Gunthorpe, Leon Romanovsky, Bjorn Helgaas, linux-rdma,
linux-pci, netdev, Keith Busch, Yochai Cohen, Yishai Hadas
Cc: Bjorn Helgaas, Zhiping Zhang
This RFC patch adds a callback that lets DMA buffer exporters report their tph
info. The tph info includes both the steering tag and the processing hint (ph).
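
For discussion, the intended contract of the optional callback can be modelled
in plain user-space C. This is only a sketch: struct buf, struct buf_ops,
demo_get_tph() and buf_query_tph() are hypothetical stand-ins for illustration
and are not the kernel dma-buf API. It shows an exporter that copies out the
stored values and an importer that treats a missing callback as "not supported".

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; NOT the kernel dma-buf types. */
struct buf;

struct buf_ops {
	/* Optional: NULL means the exporter has no TPH to offer. */
	int (*get_tph)(struct buf *b, uint16_t *steering_tag, uint8_t *ph);
};

struct buf {
	const struct buf_ops *ops;
	uint16_t steering_tag;	/* recorded when the buffer was exported */
	uint8_t ph;		/* processing hint recorded at export time */
};

/* Exporter side: copy out the values stored at export time. */
static int demo_get_tph(struct buf *b, uint16_t *steering_tag, uint8_t *ph)
{
	if (!b || !steering_tag || !ph)
		return -EINVAL;

	*steering_tag = b->steering_tag;
	*ph = b->ph;
	return 0;
}

static const struct buf_ops demo_ops_with_tph = { .get_tph = demo_get_tph };
static const struct buf_ops demo_ops_without_tph = { .get_tph = NULL };

/* Importer side: a missing callback is reported as "not supported". */
static int buf_query_tph(struct buf *b, uint16_t *steering_tag, uint8_t *ph)
{
	if (!b || !b->ops || !b->ops->get_tph)
		return -EOPNOTSUPP;

	return b->ops->get_tph(b, steering_tag, ph);
}
```

An importer would call buf_query_tph() once per buffer and leave its own
defaults untouched on any non-zero return.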
Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
---
drivers/vfio/pci/vfio_pci_dmabuf.c | 15 ++++++++++++++-
include/linux/dma-buf.h | 30 ++++++++++++++++++++++++++++++
include/uapi/linux/vfio.h | 2 ++
3 files changed, 46 insertions(+), 1 deletion(-)
diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index d4d0f7d08c53..4da1a6cc306f 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -17,6 +17,8 @@ struct vfio_pci_dma_buf {
struct dma_buf_phys_vec *phys_vec;
struct p2pdma_provider *provider;
u32 nr_ranges;
+ u16 steering_tag;
+ u8 ph;
u8 revoked : 1;
};
@@ -50,6 +52,15 @@ vfio_pci_dma_buf_map(struct dma_buf_attachment *attachment,
priv->size, dir);
}
+static int vfio_pci_dma_buf_get_tph(struct dma_buf *dmabuf, u16 *steering_tag,
+ u8 *ph)
+{
+ struct vfio_pci_dma_buf *priv = dmabuf->priv;
+ *steering_tag = priv->steering_tag;
+ *ph = priv->ph;
+ return 0;
+}
+
static void vfio_pci_dma_buf_unmap(struct dma_buf_attachment *attachment,
struct sg_table *sgt,
enum dma_data_direction dir)
@@ -78,6 +89,7 @@ static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf)
static const struct dma_buf_ops vfio_pci_dmabuf_ops = {
.attach = vfio_pci_dma_buf_attach,
.map_dma_buf = vfio_pci_dma_buf_map,
+ .get_tph = vfio_pci_dma_buf_get_tph,
.unmap_dma_buf = vfio_pci_dma_buf_unmap,
.release = vfio_pci_dma_buf_release,
};
@@ -274,7 +286,8 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
ret = PTR_ERR(priv->dmabuf);
goto err_dev_put;
}
-
+ priv->steering_tag = get_dma_buf.steering_tag;
+ priv->ph = get_dma_buf.ph;
/* dma_buf_put() now frees priv */
INIT_LIST_HEAD(&priv->dmabufs_elm);
down_write(&vdev->memory_lock);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 0bc492090237..466290c02ebf 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -113,6 +113,36 @@ struct dma_buf_ops {
*/
void (*unpin)(struct dma_buf_attachment *attach);
+ /**
+ * @get_tph:
+ *
+ * Get the TPH (TLP Processing Hints) for this DMA buffer.
+ *
+ * This callback allows DMA buffer exporters to provide TPH including
+ * both the steering tag and the processing hint (ph), which can be used
+ * to optimize peer-to-peer (P2P) memory access. The TPH info is typically
+ * used in scenarios where:
+ * - A PCIe device (e.g., RDMA NIC) needs to access memory on another
+ * PCIe device (e.g., GPU),
+ * - The system supports TPH and can use steering tags / ph to optimize
+ * cache placement and memory access patterns,
+ * - The memory is exported via DMABUF for cross-device sharing.
+ *
+ * @dmabuf: [in] The DMA buffer for which to retrieve TPH
+ * @steering_tag: [out] Pointer to store the 16-bit TPH steering tag value
+ * @ph: [out] Pointer to store the 8-bit TPH processing-hint value
+ *
+ * Returns:
+ * * 0 - Success, values stored in @steering_tag and @ph
+ * * -EOPNOTSUPP - TPH steering tags not supported for this buffer
+ * * -EINVAL - Invalid parameters
+ *
+ * This callback is optional. If not implemented, the buffer does not
+ * support TPH.
+ *
+ */
+ int (*get_tph)(struct dma_buf *dmabuf, u16 *steering_tag, u8 *ph);
+
/**
* @map_dma_buf:
*
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index ac2329f24141..bff2f5f7e38d 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1501,6 +1501,8 @@ struct vfio_region_dma_range {
struct vfio_device_feature_dma_buf {
__u32 region_index;
__u32 open_flags;
+ __u16 steering_tag;
+ __u8 ph;
__u32 flags;
__u32 nr_ranges;
struct vfio_region_dma_range dma_ranges[] __counted_by(nr_ranges);
--
2.47.3
* [RFC 2/2] RMDA MLX5: get tph for p2p access when registering dmabuf mr
2026-02-09 17:53 [RFC 0/2] Retrieve tph from dmabuf for PCIe P2P memory access Zhiping Zhang
2026-02-09 17:53 ` [RFC 1/2] Vfio: add callback to get tph info for dmabuf Zhiping Zhang
@ 2026-02-09 17:53 ` Zhiping Zhang
2026-02-09 18:13 ` [RFC 0/2] Retrieve tph from dmabuf for PCIe P2P memory access Jason Gunthorpe
2 siblings, 0 replies; 5+ messages in thread
From: Zhiping Zhang @ 2026-02-09 17:53 UTC (permalink / raw)
To: Jason Gunthorpe, Leon Romanovsky, Bjorn Helgaas, linux-rdma,
linux-pci, netdev, Keith Busch, Yochai Cohen, Yishai Hadas
Cc: Bjorn Helgaas, Zhiping Zhang
This patch adds a local function that checks for and retrieves the tph info,
when available, during dmabuf mr registration. Note that the DMAH workflow for
the CPU still takes precedence. Currently, it only works with the direct
st_mode. Compatibility with other st_modes will be added in the formal patch set.
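
The precedence described above can be sketched as a small user-space model.
All names and values here (demo_tph, demo_tph_source, demo_resolve_tph(),
DEMO_NO_ST_INDEX, DEMO_NO_PH) are hypothetical placeholders rather than mlx5
definitions, and the DMAH side is simplified to a single valid flag: a DMAH
supplied by the caller wins, the dmabuf tph is consulted only otherwise, and
the "no steering tag / no ph" defaults apply when neither source has data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder values, NOT the real mlx5 constants. */
#define DEMO_NO_ST_INDEX 0xffffu
#define DEMO_NO_PH 0xffu

struct demo_tph {
	uint16_t st_index;
	uint8_t ph;
};

/* One possible source of tph info; valid == false means "nothing to offer". */
struct demo_tph_source {
	bool valid;
	struct demo_tph tph;
};

/*
 * Pick the tph for a dmabuf mr: DMAH first, then the dmabuf exporter,
 * then the "no tph" defaults. Either pointer may be NULL.
 */
static struct demo_tph demo_resolve_tph(const struct demo_tph_source *dmah,
					const struct demo_tph_source *dmabuf)
{
	struct demo_tph none = { DEMO_NO_ST_INDEX, DEMO_NO_PH };

	if (dmah && dmah->valid)
		return dmah->tph;
	if (dmabuf && dmabuf->valid)
		return dmabuf->tph;
	return none;
}
```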
Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
---
drivers/infiniband/hw/mlx5/mr.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 325fa04cbe8a..c3eb5b24ef29 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -46,6 +46,8 @@
#include "data_direct.h"
#include "dmah.h"
+MODULE_IMPORT_NS("DMA_BUF");
+
enum {
MAX_PENDING_REG_MR = 8,
};
@@ -1623,6 +1625,32 @@ static struct dma_buf_attach_ops mlx5_ib_dmabuf_attach_ops = {
.move_notify = mlx5_ib_dmabuf_invalidate_cb,
};
+static void get_tph_mr_dmabuf(struct mlx5_ib_dev *dev, int fd, u16 *st_index,
+ u8 *ph)
+{
+ int ret;
+ struct dma_buf *dmabuf;
+ struct mlx5_core_dev *mdev = dev->mdev;
+
+ dmabuf = dma_buf_get(fd);
+ if (IS_ERR(dmabuf))
+ return;
+
+ if (!dmabuf->ops->get_tph)
+ goto end_dbuf_put;
+
+ ret = dmabuf->ops->get_tph(dmabuf, st_index, ph);
+ if (ret) {
+ *st_index = MLX5_MKC_PCIE_TPH_NO_STEERING_TAG_INDEX;
+ *ph = MLX5_IB_NO_PH;
+ mlx5_ib_dbg(dev, "get_tph failed (%d)\n", ret);
+ goto end_dbuf_put;
+ }
+
+end_dbuf_put:
+ dma_buf_put(dmabuf);
+}
+
static struct ib_mr *
reg_user_mr_dmabuf(struct ib_pd *pd, struct device *dma_device,
u64 offset, u64 length, u64 virt_addr,
@@ -1662,6 +1690,8 @@ reg_user_mr_dmabuf(struct ib_pd *pd, struct device *dma_device,
ph = dmah->ph;
if (dmah->valid_fields & BIT(IB_DMAH_CPU_ID_EXISTS))
st_index = mdmah->st_index;
+ } else {
+ get_tph_mr_dmabuf(dev, fd, &st_index, &ph);
}
mr = alloc_cacheable_mr(pd, &umem_dmabuf->umem, virt_addr,
--
2.47.3
* Re: [RFC 0/2] Retrieve tph from dmabuf for PCIe P2P memory access
2026-02-09 17:53 [RFC 0/2] Retrieve tph from dmabuf for PCIe P2P memory access Zhiping Zhang
2026-02-09 17:53 ` [RFC 1/2] Vfio: add callback to get tph info for dmabuf Zhiping Zhang
2026-02-09 17:53 ` [RFC 2/2] RMDA MLX5: get tph for p2p access when registering dmabuf mr Zhiping Zhang
@ 2026-02-09 18:13 ` Jason Gunthorpe
2026-02-09 23:28 ` Zhiping Zhang
2 siblings, 1 reply; 5+ messages in thread
From: Jason Gunthorpe @ 2026-02-09 18:13 UTC (permalink / raw)
To: Zhiping Zhang
Cc: Leon Romanovsky, Bjorn Helgaas, linux-rdma, linux-pci, netdev,
Keith Busch, Yochai Cohen, Yishai Hadas, Bjorn Helgaas
On Mon, Feb 09, 2026 at 09:53:10AM -0800, Zhiping Zhang wrote:
> Currently, the steering tag can be used for a CPU on the motherboard; the
> ACPI check is in place to query and obtain the supported tph settings. Here
> we intend to use the tph info to improve RDMA NIC memory access on a vfio-based
> accelerator device via PCIe peer-to-peer. When an application registers an
> RDMA memory region with DMABUF for the RDMA NIC to access the device memory,
> the tph associated with the memory region can be retrieved and used to set the
> steering tag / process hint (ph). The tph contains additional instructions
> or hints to the GPU or accelerator device for advanced memory operations,
> such as read cache selection.
>
> Note this RFC is for the discussion on the direction and is not intended to be
> a complete implementation. Once the direction is agreed on, we will work on the
> implementation or a real patch set.
you didn't cc the DRM people who really need to look at any changes to
the dmabuf contract.
Jason
* Re: [RFC 0/2] Retrieve tph from dmabuf for PCIe P2P memory access
2026-02-09 18:13 ` [RFC 0/2] Retrieve tph from dmabuf for PCIe P2P memory access Jason Gunthorpe
@ 2026-02-09 23:28 ` Zhiping Zhang
0 siblings, 0 replies; 5+ messages in thread
From: Zhiping Zhang @ 2026-02-09 23:28 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, Bjorn Helgaas, linux-rdma, linux-pci, netdev,
Keith Busch, Yochai Cohen, Yishai Hadas, Zhiping Zhang
On Mon, Feb 09, 2026 at 10:13:00AM -0800, Jason Gunthorpe wrote:
> On Mon, Feb 09, 2026 at 09:53:10AM -0800, Zhiping Zhang wrote:
> > Currently, the steering tag can be used for a CPU on the motherboard; the
> > ACPI check is in place to query and obtain the supported tph settings. Here
> > we intend to use the tph info to improve RDMA NIC memory access on a vfio-based
> > accelerator device via PCIe peer-to-peer. When an application registers an
> > RDMA memory region with DMABUF for the RDMA NIC to access the device memory,
> > the tph associated with the memory region can be retrieved and used to set the
> > steering tag / process hint (ph). The tph contains additional instructions
> > or hints to the GPU or accelerator device for advanced memory operations,
> > such as read cache selection.
> >
> > Note this RFC is for the discussion on the direction and is not intended to be
> > a complete implementation. Once the direction is agreed on, we will work on the
> > implementation or a real patch set.
>
> you didn't cc the DRM people who really need to look at any changes to
> the dmabuf contract.
>
> Jason
Thanks, I will resubmit and cc the DRM people via the
dri-devel@lists.freedesktop.org mailing list.
Zhiping