* Re: [PATCH v7 1/5] net/mlx5: free mlx5_st_idx_data on final dealloc
       [not found] ` <20260610193158.2614209-2-zhipingz@meta.com>
@ 2026-06-11 7:47 ` Christian König
  2026-06-11 22:53   ` Zhiping Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Christian König @ 2026-06-11 7:47 UTC
To: Zhiping Zhang, Alex Williamson, Jason Gunthorpe, Leon Romanovsky,
    Sumit Semwal
Cc: Bjorn Helgaas, kvm, linux-rdma, linux-pci, netdev, dri-devel,
    Keith Busch, Yochai Cohen, Yishai Hadas

On 6/10/26 21:31, Zhiping Zhang wrote:
> When the last reference to an ST table entry is dropped,
> mlx5_st_dealloc_index() removed the entry from idx_xa but leaked the
> backing mlx5_st_idx_data allocation. Repeated alloc/dealloc cycles
> therefore accumulate one struct mlx5_st_idx_data per cycle.
>
> Free idx_data after the xa_erase() so the lifetime of the bookkeeping
> struct matches the lifetime of the ST entry it tracks.
>
> Fixes: 888a7776f4fb ("net/mlx5: Add support for device steering tag")
> Signed-off-by: Zhiping Zhang <zhipingz@meta.com>

Since this is an obvious bug fix, I think it shouldn't be part of this
patch set and should go upstream completely independently.

Regards,
Christian.

> ---
>  drivers/net/ethernet/mellanox/mlx5/core/lib/st.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c
> index 997be91f0a13..7cedc348790d 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c
> @@ -175,6 +175,7 @@ int mlx5_st_dealloc_index(struct mlx5_core_dev *dev, u16 st_index)
>
>  	if (refcount_dec_and_test(&idx_data->usecount)) {
>  		xa_erase(&st->idx_xa, st_index);
> +		kfree(idx_data);
>  		/* We leave PCI config space as was before, no mkey will refer to it */
>  	}
>
* Re: [PATCH v7 1/5] net/mlx5: free mlx5_st_idx_data on final dealloc
  2026-06-11 7:47 ` [PATCH v7 1/5] net/mlx5: free mlx5_st_idx_data on final dealloc Christian König
@ 2026-06-11 22:53 ` Zhiping Zhang
  2026-06-11 23:45   ` Zhiping Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Zhiping Zhang @ 2026-06-11 22:53 UTC
To: Christian König
Cc: Alex Williamson, Jason Gunthorpe, Leon Romanovsky, Sumit Semwal,
    Bjorn Helgaas, kvm, linux-rdma, linux-pci, netdev, dri-devel,
    Keith Busch, Yochai Cohen, Yishai Hadas

On Thu, Jun 11, 2026 at 12:47 AM Christian König <christian.koenig@amd.com> wrote:
>
>
>
> On 6/10/26 21:31, Zhiping Zhang wrote:
> > When the last reference to an ST table entry is dropped,
> > mlx5_st_dealloc_index() removed the entry from idx_xa but leaked the
> > backing mlx5_st_idx_data allocation. Repeated alloc/dealloc cycles
> > therefore accumulate one struct mlx5_st_idx_data per cycle.
> >
> > Free idx_data after the xa_erase() so the lifetime of the bookkeeping
> > struct matches the lifetime of the ST entry it tracks.
> >
> > Fixes: 888a7776f4fb ("net/mlx5: Add support for device steering tag")
> > Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
>
> Since this is an obvious bug fix I think it shouldn't be part of this patch set and go upstream completely independent.
>
> Regards,
> Christian.
>

Sure, Michael replied that he has made a patch to fix it, I will rebase
on top.
* Re: [PATCH v7 1/5] net/mlx5: free mlx5_st_idx_data on final dealloc
  2026-06-11 22:53 ` Zhiping Zhang
@ 2026-06-11 23:45 ` Zhiping Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Zhiping Zhang @ 2026-06-11 23:45 UTC
To: Christian König
Cc: Alex Williamson, Jason Gunthorpe, Leon Romanovsky, Sumit Semwal,
    Bjorn Helgaas, kvm, linux-rdma, linux-pci, netdev, dri-devel,
    Keith Busch, Yochai Cohen, Yishai Hadas

On Thu, Jun 11, 2026 at 3:53 PM Zhiping Zhang <zhipingz@meta.com> wrote:
>
> On Thu, Jun 11, 2026 at 12:47 AM Christian König
> <christian.koenig@amd.com> wrote:
> >
> >
> >
> > On 6/10/26 21:31, Zhiping Zhang wrote:
> > > When the last reference to an ST table entry is dropped,
> > > mlx5_st_dealloc_index() removed the entry from idx_xa but leaked the
> > > backing mlx5_st_idx_data allocation. Repeated alloc/dealloc cycles
> > > therefore accumulate one struct mlx5_st_idx_data per cycle.
> > >
> > > Free idx_data after the xa_erase() so the lifetime of the bookkeeping
> > > struct matches the lifetime of the ST entry it tracks.
> > >
> > > Fixes: 888a7776f4fb ("net/mlx5: Add support for device steering tag")
> > > Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
> >
> > Since this is an obvious bug fix I think it shouldn't be part of this patch set and go upstream completely independent.
> >
> > Regards,
> > Christian.
> >
>
> Sure, Michael replied that he has made a patch to fix it, i will rebase on top.

Never mind, it seems Michael's patch did not contain the fix. Let me
submit a separate set.

Thanks,
Zhiping
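For readers following the patch 1/5 discussion, the leak and its fix can be illustrated outside the kernel. The following is only a minimal userspace sketch, not the mlx5 code: `struct idx_data`, `table`, `idx_get()`, `idx_put()` and `live_allocs` are hypothetical names, a fixed array stands in for the xarray, and a plain counter stands in for `refcount_t`. The point it shows is that the final put must both unlink the entry and free its backing allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's per-index bookkeeping struct. */
struct idx_data {
	unsigned int usecount;
};

#define TABLE_SIZE 8

/* Hypothetical stand-in for the xarray: one slot per ST index. */
static struct idx_data *table[TABLE_SIZE];

/* Counts live allocations so that a leak would be observable. */
static int live_allocs;

/* Take a reference on idx, allocating the entry on first use. */
int idx_get(unsigned int idx)
{
	if (idx >= TABLE_SIZE)
		return -1;
	if (!table[idx]) {
		table[idx] = calloc(1, sizeof(*table[idx]));
		if (!table[idx])
			return -1;
		live_allocs++;
	}
	table[idx]->usecount++;
	return 0;
}

/* Drop a reference; on the last one, unlink the entry and free it. */
int idx_put(unsigned int idx)
{
	struct idx_data *d;

	if (idx >= TABLE_SIZE || !table[idx])
		return -1;
	d = table[idx];
	assert(d->usecount > 0);
	if (--d->usecount == 0) {
		table[idx] = NULL;	/* plays the role of xa_erase() */
		free(d);		/* plays the role of the added kfree() */
		live_allocs--;
	}
	return 0;
}
```

Without the `free(d)` line, every get/put cycle on a fresh index would leave one allocation behind, which is the behaviour the commit message describes.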
[parent not found: <20260610193158.2614209-4-zhipingz@meta.com>]
* Re: [PATCH v7 3/5] dma-buf: add optional get_tph() callback
       [not found] ` <20260610193158.2614209-4-zhipingz@meta.com>
@ 2026-06-11 10:35 ` Christian König
  2026-06-11 23:07   ` Zhiping Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Christian König @ 2026-06-11 10:35 UTC
To: Zhiping Zhang, Alex Williamson, Jason Gunthorpe, Leon Romanovsky,
    Sumit Semwal
Cc: Bjorn Helgaas, kvm, linux-rdma, linux-pci, netdev, dri-devel,
    Keith Busch, Yochai Cohen, Yishai Hadas

On 6/10/26 21:31, Zhiping Zhang wrote:
> Add an optional dma_buf_ops.get_tph callback and a dma_buf_get_tph()
> wrapper for importers.
>
> 8-bit ST and 16-bit Extended ST are distinct PCIe TPH namespaces, so
> the importer requests the namespace it can emit and the exporter
> returns the matching ST/PH tuple or -EOPNOTSUPP.
>
> dma_buf_get_tph() is the importer entry point. It returns -EOPNOTSUPP
> when the exporter lacks the callback and requires dmabuf->resv to be
> held while the callback runs.
>
> The first user is VFIO_DEVICE_FEATURE_DMA_BUF_TPH in vfio-pci, with
> mlx5 as the first importer.
>
> Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
> ---
>  drivers/dma-buf/dma-buf.c | 25 +++++++++++++++++++++++++
>  include/linux/dma-buf.h   | 21 +++++++++++++++++++++
>  2 files changed, 46 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index d504c636dc29..aff79ea12e43 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -1144,6 +1144,31 @@ void dma_buf_unpin(struct dma_buf_attachment *attach)
>  }
>  EXPORT_SYMBOL_NS_GPL(dma_buf_unpin, "DMA_BUF");
>
> +/**
> + * dma_buf_get_tph - Retrieve TPH metadata from an exporter
> + * @dmabuf: DMA buffer to query
> + * @extended: false for 8-bit ST, true for 16-bit Extended ST
> + * @steering_tag: returns the raw steering tag for the requested namespace
> + * @ph: returns the TPH processing hint
> + *
> + * Wrapper for the optional &dma_buf_ops.get_tph callback.
> + *
> + * Must be called with &dma_buf.resv held. Returns -EOPNOTSUPP if the
> + * exporter does not implement the callback or has no metadata for the
> + * requested namespace.
> + */
> +int dma_buf_get_tph(struct dma_buf *dmabuf, bool extended,
> +		    u16 *steering_tag, u8 *ph)

That name needs improvement, maybe something like dma_buf_get_pci_tph().

It also needs a brief explanation of what TPH is, maybe a reference to
the PCIe spec name etc...

And document in the list of functions that this one should be called
with the lock held.

> +{
> +	dma_resv_assert_held(dmabuf->resv);
> +
> +	if (!dmabuf->ops->get_tph)
> +		return -EOPNOTSUPP;
> +
> +	return dmabuf->ops->get_tph(dmabuf, extended, steering_tag, ph);
> +}
> +EXPORT_SYMBOL_NS_GPL(dma_buf_get_tph, "DMA_BUF");
> +
>  /**
>   * dma_buf_map_attachment - Returns the scatterlist table of the attachment;
>   * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index d1203da56fc5..6a54e0f251a2 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -113,6 +113,25 @@ struct dma_buf_ops {
>  	 */
>  	void (*unpin)(struct dma_buf_attachment *attach);
>
> +	/**
> +	 * @get_tph:
> +	 * @dmabuf: DMA buffer for which to retrieve TPH metadata
> +	 * @extended: false for 8-bit ST, true for 16-bit Extended ST
> +	 * @steering_tag: Returns the raw TPH steering tag for the requested
> +	 *                namespace
> +	 * @ph: Returns the TPH processing hint (2-bit value)
> +	 *
> +	 * Return TPH metadata for the namespace selected by @extended. Return
> +	 * 0 on success, or -EOPNOTSUPP if no metadata is available.
> +	 *
> +	 * This callback is optional. Importers must not call it directly;
> +	 * the dma_buf_get_tph() wrapper is the only entry point and handles
> +	 * the NULL-callback case. The callback is invoked with
> +	 * &dma_buf.resv held.

Most of that should be obvious; we only need to say that it's optional
and that the lock should be held. Everything else can be dropped.

And most of the description/documentation should be on the wrapper
function; exporters who implement the callback should know what they
are doing.

Regards,
Christian.

> +	 */
> +	int (*get_tph)(struct dma_buf *dmabuf, bool extended,
> +		       u16 *steering_tag, u8 *ph);
> +
>  	/**
>  	 * @map_dma_buf:
>  	 *
> @@ -563,6 +582,8 @@ void dma_buf_detach(struct dma_buf *dmabuf,
>  		    struct dma_buf_attachment *attach);
>  int dma_buf_pin(struct dma_buf_attachment *attach);
>  void dma_buf_unpin(struct dma_buf_attachment *attach);
> +int dma_buf_get_tph(struct dma_buf *dmabuf, bool extended,
> +		    u16 *steering_tag, u8 *ph);
>
>  struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
>
* Re: [PATCH v7 3/5] dma-buf: add optional get_tph() callback
  2026-06-11 10:35 ` [PATCH v7 3/5] dma-buf: add optional get_tph() callback Christian König
@ 2026-06-11 23:07 ` Zhiping Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Zhiping Zhang @ 2026-06-11 23:07 UTC
To: Christian König
Cc: Alex Williamson, Jason Gunthorpe, Leon Romanovsky, Sumit Semwal,
    Bjorn Helgaas, kvm, linux-rdma, linux-pci, netdev, dri-devel,
    Keith Busch, Yochai Cohen, Yishai Hadas

On Thu, Jun 11, 2026 at 3:35 AM Christian König <christian.koenig@amd.com> wrote:
>
>
>
> On 6/10/26 21:31, Zhiping Zhang wrote:
> > Add an optional dma_buf_ops.get_tph callback and a dma_buf_get_tph()
> > wrapper for importers.
> >
> > 8-bit ST and 16-bit Extended ST are distinct PCIe TPH namespaces, so
> > the importer requests the namespace it can emit and the exporter
> > returns the matching ST/PH tuple or -EOPNOTSUPP.
> >
> > dma_buf_get_tph() is the importer entry point. It returns -EOPNOTSUPP
> > when the exporter lacks the callback and requires dmabuf->resv to be
> > held while the callback runs.
> >
> > The first user is VFIO_DEVICE_FEATURE_DMA_BUF_TPH in vfio-pci, with
> > mlx5 as the first importer.
> >
> > Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
> > ---
> >  drivers/dma-buf/dma-buf.c | 25 +++++++++++++++++++++++++
> >  include/linux/dma-buf.h   | 21 +++++++++++++++++++++
> >  2 files changed, 46 insertions(+)
> >
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index d504c636dc29..aff79ea12e43 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -1144,6 +1144,31 @@ void dma_buf_unpin(struct dma_buf_attachment *attach)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(dma_buf_unpin, "DMA_BUF");
> >
> > +/**
> > + * dma_buf_get_tph - Retrieve TPH metadata from an exporter
> > + * @dmabuf: DMA buffer to query
> > + * @extended: false for 8-bit ST, true for 16-bit Extended ST
> > + * @steering_tag: returns the raw steering tag for the requested namespace
> > + * @ph: returns the TPH processing hint
> > + *
> > + * Wrapper for the optional &dma_buf_ops.get_tph callback.
> > + *
> > + * Must be called with &dma_buf.resv held. Returns -EOPNOTSUPP if the
> > + * exporter does not implement the callback or has no metadata for the
> > + * requested namespace.
> > + */
> > +int dma_buf_get_tph(struct dma_buf *dmabuf, bool extended,
> > +		    u16 *steering_tag, u8 *ph)
>
> That name needs improvement, maybe something like dma_buf_get_pci_tph().
>
> It also needs some brief explanation what TPH is, maybe a reference to the PCIe spec name etc...
>
> And document in the list of functions that this one should be called with the lock held.
>
> > +{
> > +	dma_resv_assert_held(dmabuf->resv);
> > +
> > +	if (!dmabuf->ops->get_tph)
> > +		return -EOPNOTSUPP;
> > +
> > +	return dmabuf->ops->get_tph(dmabuf, extended, steering_tag, ph);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(dma_buf_get_tph, "DMA_BUF");
> > +
> >  /**
> >   * dma_buf_map_attachment - Returns the scatterlist table of the attachment;
> >   * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the
> > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > index d1203da56fc5..6a54e0f251a2 100644
> > --- a/include/linux/dma-buf.h
> > +++ b/include/linux/dma-buf.h
> > @@ -113,6 +113,25 @@ struct dma_buf_ops {
> >  	 */
> >  	void (*unpin)(struct dma_buf_attachment *attach);
> >
> > +	/**
> > +	 * @get_tph:
> > +	 * @dmabuf: DMA buffer for which to retrieve TPH metadata
> > +	 * @extended: false for 8-bit ST, true for 16-bit Extended ST
> > +	 * @steering_tag: Returns the raw TPH steering tag for the requested
> > +	 *                namespace
> > +	 * @ph: Returns the TPH processing hint (2-bit value)
> > +	 *
> > +	 * Return TPH metadata for the namespace selected by @extended. Return
> > +	 * 0 on success, or -EOPNOTSUPP if no metadata is available.
> > +	 *
> > +	 * This callback is optional. Importers must not call it directly;
> > +	 * the dma_buf_get_tph() wrapper is the only entry point and handles
> > +	 * the NULL-callback case. The callback is invoked with
> > +	 * &dma_buf.resv held.
>
> That most of that should be obvious, we only need that it's optional and that the lock should be held. Everything else can be dropped.
>
> And most of the description/documentation should be on the wrapper function, exporters who implement the callback should know what they are doing.
>
> Regards,
> Christian.
>

Sure, will do.

Thanks,
Zhiping
[parent not found: <20260610193158.2614209-6-zhipingz@meta.com>]
* Re: [PATCH v7 5/5] RDMA/mlx5: get tph for p2p access when registering dma-buf mr
       [not found] ` <20260610193158.2614209-6-zhipingz@meta.com>
@ 2026-06-11 12:44 ` Michael Gur
  2026-06-11 23:09   ` Zhiping Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Gur @ 2026-06-11 12:44 UTC
To: Zhiping Zhang, Alex Williamson, Jason Gunthorpe, Leon Romanovsky,
    Sumit Semwal, Christian Konig
Cc: Bjorn Helgaas, kvm, linux-rdma, linux-pci, netdev, dri-devel,
    Keith Busch, Yochai Cohen, Yishai Hadas

On 6/10/2026 10:31 PM, Zhiping Zhang wrote:
> Query dma-buf TPH metadata when registering a dma-buf MR for peer-to-
> peer access to a PCIe endpoint and use it to program requester-side TPH
> on the outbound mkey. If the exporter has no metadata, fall back to the
> existing no-TPH path.
>
> For TPH-backed FRMRs, make the extra ST-table reference belong to the
> hardware mkey handle rather than the transient MR object. Extend the
> FRMR pool API so reuse and final destroy can transfer and drop that ref
> at the handle lifetime boundaries, and add mlx5_st_get_index() to take
> a ref on an already-known ST index.

I'd keep the ST reference tied to MRs, where the ST is actually in use.
There's no functional need to couple ST refcounting to mkey lifetime.
Once an MR is destroyed and its mkey revoked, the mkey can no longer
generate traffic, it's just an idle entry in the FRMR pool waiting to be
aged out or reused.
This lets us drop all FRMR pool changes from this patch and keep a
simple flow of 'acquire on MR create, release on MR destroy'.

> Also decode PH from kernel_vendor_key when recreating pooled mkeys so
> the requester hint matches the pool key.

I've fixed that in a series I've sent earlier this week, please rebase
next version on top of it.

Thanks,
Michael

> Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
> ---
>  drivers/infiniband/core/frmr_pools.c          |  20 +++-
>  drivers/infiniband/hw/mlx5/mr.c               | 111 +++++++++++++++++-
>  .../net/ethernet/mellanox/mlx5/core/lib/st.c  |  49 ++++++--
>  include/linux/mlx5/driver.h                   |  12 ++
>  include/rdma/frmr_pools.h                     |   5 +-
>  5 files changed, 178 insertions(+), 19 deletions(-)
* Re: [PATCH v7 5/5] RDMA/mlx5: get tph for p2p access when registering dma-buf mr
  2026-06-11 12:44 ` [PATCH v7 5/5] RDMA/mlx5: get tph for p2p access when registering dma-buf mr Michael Gur
@ 2026-06-11 23:09 ` Zhiping Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Zhiping Zhang @ 2026-06-11 23:09 UTC
To: Michael Gur
Cc: Alex Williamson, Jason Gunthorpe, Leon Romanovsky, Sumit Semwal,
    Christian Konig, Bjorn Helgaas, kvm, linux-rdma, linux-pci, netdev,
    dri-devel, Keith Busch, Yochai Cohen, Yishai Hadas

On Thu, Jun 11, 2026 at 5:44 AM Michael Gur <michaelgur@nvidia.com> wrote:
>
>
>
>
> On 6/10/2026 10:31 PM, Zhiping Zhang wrote:
> > Query dma-buf TPH metadata when registering a dma-buf MR for peer-to-
> > peer access to a PCIe endpoint and use it to program requester-side TPH
> > on the outbound mkey. If the exporter has no metadata, fall back to the
> > existing no-TPH path.
> >
> > For TPH-backed FRMRs, make the extra ST-table reference belong to the
> > hardware mkey handle rather than the transient MR object. Extend the
> > FRMR pool API so reuse and final destroy can transfer and drop that ref
> > at the handle lifetime boundaries, and add mlx5_st_get_index() to take
> > a ref on an already-known ST index.
> I'd keep the ST reference tied to MRs, where the ST is actually in use.
> There's no functional need to couple ST refcounting to mkey lifetime.
> Once an MR is destroyed and its mkey revoked, the mkey can no longer
> generate traffic, it's just an idle entry in the FRMR pool waiting to be
> aged out or reused.
> This lets us drop all FRMR pool changes from this patch and keep a
> simple flow of 'acquire on MR create, release on MR destroy'.
> > Also decode PH from kernel_vendor_key when recreating pooled mkeys so
> > the requester hint matches the pool key.
> I've fixed that in a series I've sent earlier this week, please rebase
> next version on top of it.
>
> Thanks,
> Michael

ack, thanks!
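The 'acquire on MR create, release on MR destroy' flow suggested in this sub-thread can be sketched as follows. This is an illustration under stated assumptions, not the mlx5 implementation: `st_refs`, `st_get()`, `st_put()`, `struct mr`, `mr_create()` and `mr_destroy()` are all hypothetical, and a single counter stands in for the per-index ST table reference. What it shows is that the ST reference follows the MR object only, so nothing about the pooled mkey needs to know about it.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference count for one ST table index. */
unsigned int st_refs;

void st_get(void) { st_refs++; }
void st_put(void) { st_refs--; }

/* Hypothetical MR object: remembers whether it holds an ST reference. */
struct mr {
	bool uses_tph;
};

/* Acquire the ST reference when the MR is created with TPH. */
struct mr *mr_create(bool uses_tph)
{
	struct mr *mr = calloc(1, sizeof(*mr));

	if (!mr)
		return NULL;
	mr->uses_tph = uses_tph;
	if (uses_tph)
		st_get();
	return mr;
}

/* Release the ST reference when the MR is destroyed. */
void mr_destroy(struct mr *mr)
{
	if (mr->uses_tph)
		st_put();
	/* In the real driver the revoked mkey would now sit idle in a pool. */
	free(mr);
}
```

With this pairing, the pool that recycles hardware handles never has to transfer or drop ST references, which is the simplification the review asks for.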
* [PATCH v7 0/5] vfio/dma-buf: add TPH support for peer-to-peer access
@ 2026-06-11 16:11 Zhiping Zhang
  2026-06-11 16:11 ` [PATCH v7 3/5] dma-buf: add optional get_tph() callback Zhiping Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Zhiping Zhang @ 2026-06-11 16:11 UTC
To: netdev; +Cc: kvm, linux-rdma, linux-pci, dri-devel, Zhiping Zhang

This series adds TLP Processing Hints (TPH) support to the VFIO dma-buf
export path, allowing importing drivers (e.g. mlx5) to use the
exporter's steering tag when performing peer-to-peer DMA into a
VFIO-owned device.

There is no separate in-tree vendor kernel driver for the target
device: vfio-pci is the in-tree driver and the targeted device is
managed from userspace via VFIO passthrough. That is why the ST has to
flow through a uAPI: userspace owns the device and its ST table, so it
is the entity that can publish a meaningful value for a given dma-buf.
The kernel-visible participants are still in-tree: vfio-pci exports the
dma-buf and mlx5 imports it.

On the effect: the endpoint's PCIe ingress block uses the 8-bit ST as
an in-band instruction for the incoming P2P TLP -- selecting a target
cache partition and, on writes, an in-flight operation on the data
before it lands. The dma-buf callback keeps this opaque to the
framework -- only the producer (userspace owner of the VFIO device) and
the consumer (endpoint block) need to interpret the value. The dma-buf
get_tph callback itself is optional, but for workloads that depend on
the endpoint's in-flight operation, the no-TPH fallback does not
produce the same result.

The dma-buf hook is intentionally generic and discoverable rather than
a private side channel. The exporter owns the completing address space
for the dma-buf and decides whether it can provide a meaningful ST/PH
tuple for that completer; the dma-buf core keeps the tuple opaque, and
importers merely request the namespace they support and place the
returned value on generated TLPs. Exporters that cannot derive a
meaningful tuple simply return -EOPNOTSUPP.

Patch 1 is a pre-existing fix split out from the series:
mlx5_st_dealloc_index() removed the xarray entry but never freed the
backing struct, so repeated alloc/dealloc cycles leaked memory.

Patch 2 adds small PCI/TPH type helpers so drivers can query the
enabled TPH requester mode and the device's TPH Completer Supported
field without reaching into pci_dev internals (and so callers in
CONFIG_PCIE_TPH=n builds get a clean fallback).

Patch 3 adds the optional dma_buf_ops::get_tph callback plus the
dma_buf_get_tph() importer wrapper so importers can fetch TPH metadata
from an exporter under dmabuf->resv.

Patch 4 implements get_tph in vfio-pci and adds the new uAPI
(VFIO_DEVICE_FEATURE_DMA_BUF_TPH) for userspace to attach the metadata.

Patch 5 wires up the mlx5 RDMA driver as a consumer.

Build-tested with both CONFIG_PCIE_TPH=y and CONFIG_PCIE_TPH=n.

Functional validation on the target topology: PCIe analyzer captures on
the P2P TLPs confirm the ST emitted by mlx5 matches the value published
through VFIO_DEVICE_FEATURE_DMA_BUF_TPH, and the end-to-end P2P
workload only produces results consistent with the endpoint's
ST-selected in-flight operation. For example, with userspace publishing
8-bit ST=0xf0 and PH=2, an analyzer capture of a peer-to-peer MWr64
shows "STP MWr64 TC=0 OHC=2 ..." followed by "OHC-B ST=F0h PH=2 HV=1":

  (TLP Captures)
  08000260 -> STP MWr64 TC=0 OHC=2 TS=0 Attr=0 L=8
  F0000004 -> RID=4h:0h.0h EP- Tag=F0h
  E0200000 -> AddrH=000020E0h
  00080006 -> AddrL=06000800h
  90F00000 -> OHC-B ST=F0h PH=2 HV=1 AMA=0 AV-

Previous versions:
v6: https://lore.kernel.org/dri-devel/20260608185646.4085127-1-zhipingz@meta.com/
v5: https://lore.kernel.org/dri-devel/20260526144401.1485788-1-zhipingz@meta.com/
v4: https://lore.kernel.org/linux-pci/20260519201401.1558410-1-zhipingz@meta.com/
v3: https://lore.kernel.org/linux-pci/20260512184755.4137227-1-zhipingz@meta.com/
v2: https://lore.kernel.org/linux-pci/20260430200704.352228-1-zhipingz@meta.com/

Zhiping Zhang (5):
  net/mlx5: free mlx5_st_idx_data on final dealloc
  PCI/TPH: Add requester/completer type helpers
  dma-buf: add optional get_tph() callback
  vfio/pci: implement get_tph and DMA_BUF_TPH feature
  RDMA/mlx5: get tph for p2p access when registering dma-buf mr

 drivers/dma-buf/dma-buf.c                     |  25 ++++
 drivers/infiniband/core/frmr_pools.c          |  20 +++-
 drivers/infiniband/hw/mlx5/mr.c               | 111 +++++++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/lib/st.c  |  50 ++++++--
 drivers/pci/tph.c                             |  43 +++++++
 drivers/vfio/pci/vfio_pci_core.c              |   3 +
 drivers/vfio/pci/vfio_pci_dmabuf.c            |  94 ++++++++++++++-
 drivers/vfio/pci/vfio_pci_priv.h              |  12 ++
 include/linux/dma-buf.h                       |  21 ++++
 include/linux/mlx5/driver.h                   |  12 ++
 include/linux/pci-tph.h                       |   8 ++
 include/rdma/frmr_pools.h                     |   5 +-
 include/uapi/linux/vfio.h                     |  37 ++++++
 13 files changed, 421 insertions(+), 20 deletions(-)

--
2.53.0-Meta
* [PATCH v7 3/5] dma-buf: add optional get_tph() callback
  2026-06-11 16:11 [PATCH v7 0/5] vfio/dma-buf: add TPH support for peer-to-peer access Zhiping Zhang
@ 2026-06-11 16:11 ` Zhiping Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Zhiping Zhang @ 2026-06-11 16:11 UTC
To: netdev; +Cc: kvm, linux-rdma, linux-pci, dri-devel, Zhiping Zhang

Add an optional dma_buf_ops.get_tph callback and a dma_buf_get_tph()
wrapper for importers.

8-bit ST and 16-bit Extended ST are distinct PCIe TPH namespaces, so
the importer requests the namespace it can emit and the exporter
returns the matching ST/PH tuple or -EOPNOTSUPP.

dma_buf_get_tph() is the importer entry point. It returns -EOPNOTSUPP
when the exporter lacks the callback and requires dmabuf->resv to be
held while the callback runs.

The first user is VFIO_DEVICE_FEATURE_DMA_BUF_TPH in vfio-pci, with
mlx5 as the first importer.

Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
---
 drivers/dma-buf/dma-buf.c | 25 +++++++++++++++++++++++++
 include/linux/dma-buf.h   | 21 +++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index d504c636dc29..aff79ea12e43 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1144,6 +1144,31 @@ void dma_buf_unpin(struct dma_buf_attachment *attach)
 }
 EXPORT_SYMBOL_NS_GPL(dma_buf_unpin, "DMA_BUF");
 
+/**
+ * dma_buf_get_tph - Retrieve TPH metadata from an exporter
+ * @dmabuf: DMA buffer to query
+ * @extended: false for 8-bit ST, true for 16-bit Extended ST
+ * @steering_tag: returns the raw steering tag for the requested namespace
+ * @ph: returns the TPH processing hint
+ *
+ * Wrapper for the optional &dma_buf_ops.get_tph callback.
+ *
+ * Must be called with &dma_buf.resv held. Returns -EOPNOTSUPP if the
+ * exporter does not implement the callback or has no metadata for the
+ * requested namespace.
+ */
+int dma_buf_get_tph(struct dma_buf *dmabuf, bool extended,
+		    u16 *steering_tag, u8 *ph)
+{
+	dma_resv_assert_held(dmabuf->resv);
+
+	if (!dmabuf->ops->get_tph)
+		return -EOPNOTSUPP;
+
+	return dmabuf->ops->get_tph(dmabuf, extended, steering_tag, ph);
+}
+EXPORT_SYMBOL_NS_GPL(dma_buf_get_tph, "DMA_BUF");
+
 /**
  * dma_buf_map_attachment - Returns the scatterlist table of the attachment;
  * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index d1203da56fc5..6a54e0f251a2 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -113,6 +113,25 @@ struct dma_buf_ops {
 	 */
 	void (*unpin)(struct dma_buf_attachment *attach);
 
+	/**
+	 * @get_tph:
+	 * @dmabuf: DMA buffer for which to retrieve TPH metadata
+	 * @extended: false for 8-bit ST, true for 16-bit Extended ST
+	 * @steering_tag: Returns the raw TPH steering tag for the requested
+	 *                namespace
+	 * @ph: Returns the TPH processing hint (2-bit value)
+	 *
+	 * Return TPH metadata for the namespace selected by @extended. Return
+	 * 0 on success, or -EOPNOTSUPP if no metadata is available.
+	 *
+	 * This callback is optional. Importers must not call it directly;
+	 * the dma_buf_get_tph() wrapper is the only entry point and handles
+	 * the NULL-callback case. The callback is invoked with
+	 * &dma_buf.resv held.
+	 */
+	int (*get_tph)(struct dma_buf *dmabuf, bool extended,
+		       u16 *steering_tag, u8 *ph);
+
 	/**
 	 * @map_dma_buf:
 	 *
@@ -563,6 +582,8 @@ void dma_buf_detach(struct dma_buf *dmabuf,
 		    struct dma_buf_attachment *attach);
 int dma_buf_pin(struct dma_buf_attachment *attach);
 void dma_buf_unpin(struct dma_buf_attachment *attach);
+int dma_buf_get_tph(struct dma_buf *dmabuf, bool extended,
+		    u16 *steering_tag, u8 *ph);
 
 struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
 
--
2.53.0-Meta
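The optional-callback contract described in the commit message of patch 3/5 can be tried out in plain userspace C. This is a rough sketch only: `struct buf`, `struct buf_ops`, `example_get_tph()` and `buf_get_tph()` are hypothetical stand-ins for the dma-buf types, the exporter shown supports just the 8-bit ST namespace as an assumption, and the `dmabuf->resv` locking requirement is deliberately left out because it has no userspace equivalent here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct buf;

/* Hypothetical ops table with one optional callback. */
struct buf_ops {
	int (*get_tph)(struct buf *b, bool extended, uint16_t *st, uint8_t *ph);
};

/* Hypothetical buffer carrying the metadata an exporter may publish. */
struct buf {
	const struct buf_ops *ops;
	bool has_st8;	/* exporter has an 8-bit ST to publish */
	uint8_t st8;	/* raw 8-bit steering tag */
	uint8_t ph;	/* processing hint, a 2-bit value */
};

/* Example exporter: only the 8-bit ST namespace is supported. */
int example_get_tph(struct buf *b, bool extended, uint16_t *st, uint8_t *ph)
{
	if (extended || !b->has_st8)
		return -EOPNOTSUPP;
	*st = b->st8;
	*ph = b->ph & 0x3;
	return 0;
}

/* Importer entry point: hides the NULL-callback case from callers. */
int buf_get_tph(struct buf *b, bool extended, uint16_t *st, uint8_t *ph)
{
	if (!b->ops->get_tph)
		return -EOPNOTSUPP;
	return b->ops->get_tph(b, extended, st, ph);
}
```

An importer written against this shape sees a single error code for both "exporter has no callback" and "exporter has no tuple for this namespace", so it can fall back to its no-TPH path in either case.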
Thread overview: 8+ messages
[not found] <20260610193158.2614209-1-zhipingz@meta.com>
[not found] ` <20260610193158.2614209-2-zhipingz@meta.com>
2026-06-11 7:47 ` [PATCH v7 1/5] net/mlx5: free mlx5_st_idx_data on final dealloc Christian König
2026-06-11 22:53 ` Zhiping Zhang
2026-06-11 23:45 ` Zhiping Zhang
[not found] ` <20260610193158.2614209-4-zhipingz@meta.com>
2026-06-11 10:35 ` [PATCH v7 3/5] dma-buf: add optional get_tph() callback Christian König
2026-06-11 23:07 ` Zhiping Zhang
[not found] ` <20260610193158.2614209-6-zhipingz@meta.com>
2026-06-11 12:44 ` [PATCH v7 5/5] RDMA/mlx5: get tph for p2p access when registering dma-buf mr Michael Gur
2026-06-11 23:09 ` Zhiping Zhang
2026-06-11 16:11 [PATCH v7 0/5] vfio/dma-buf: add TPH support for peer-to-peer access Zhiping Zhang
2026-06-11 16:11 ` [PATCH v7 3/5] dma-buf: add optional get_tph() callback Zhiping Zhang