Linux RDMA and InfiniBand development
* Re: [PATCH v10 2/4] dma-buf: add optional get_pci_tph() callback
       [not found] ` <20260630224328.3218796-3-zhipingz@meta.com>
@ 2026-07-01  8:25   ` Christian König
  2026-07-01 17:53     ` Zhiping Zhang
  0 siblings, 1 reply; 5+ messages in thread
From: Christian König @ 2026-07-01  8:25 UTC (permalink / raw)
  To: Zhiping Zhang, Jason Gunthorpe, Leon Romanovsky, Michael Guralnik,
	Sumit Semwal, Alex Williamson, Bjorn Helgaas
  Cc: kvm, linux-rdma, linux-pci, dri-devel

On 7/1/26 00:42, Zhiping Zhang wrote:
> Add an optional dma_buf_ops.get_pci_tph callback and a
> DMA-buf importer wrapper, dma_buf_get_pci_tph().
> 
> TPH is PCIe TLP Processing Hint. 8-bit ST and 16-bit Extended ST are
> distinct PCIe TPH namespaces, so the importer requests the namespace it
> can emit and the exporter returns the matching ST/PH tuple or
> -EOPNOTSUPP.
> 
> dma_buf_get_pci_tph() is the importer entry point. It requires
> &dmabuf->resv to be held while the callback runs and returns
> -EOPNOTSUPP when the exporter does not provide PCI TPH metadata.
> 
> The first user is VFIO_DEVICE_FEATURE_DMA_BUF_TPH in vfio-pci, with
> mlx5 as the first importer.
> 
> Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
> ---
>  drivers/dma-buf/dma-buf.c | 25 +++++++++++++++++++++++++
>  include/linux/dma-buf.h   | 22 ++++++++++++++++++++++
>  2 files changed, 47 insertions(+)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index d504c636dc29..7a4c9b0d5dab 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -1144,6 +1144,31 @@ void dma_buf_unpin(struct dma_buf_attachment *attach)
>  }
>  EXPORT_SYMBOL_NS_GPL(dma_buf_unpin, "DMA_BUF");
>  
> +/**
> + * dma_buf_get_pci_tph - Retrieve PCIe TLP Processing Hint (TPH) metadata
> + * @dmabuf: DMA buffer to query
> + * @extended: false for 8-bit ST, true for 16-bit Extended ST
> + * @steering_tag: returns the raw steering tag for the requested namespace
> + * @ph: returns the TPH processing hint
> + *
> + * Wrapper for the optional &dma_buf_ops.get_pci_tph callback.
> + *
> + * Must be called with &dma_buf.resv held. Returns -EOPNOTSUPP if the
> + * exporter does not implement the callback or has no metadata for the
> + * requested namespace.

Please add something like this:

 * The returned information is only valid until the next
 * invalidate_mappings() callback from the exporter and should be
 * re-queried when a new mapping is created after invalidation.

Apart from that it looks good to me, but I still think we need some kind of example that this works for other DMA-buf users as well.

Just demonstrating that this also works with some simple FPGA or similar PCIe endpoint should be sufficient.

Regards,
Christian.

> + */
> +int dma_buf_get_pci_tph(struct dma_buf *dmabuf, bool extended,
> +			u16 *steering_tag, u8 *ph)
> +{
> +	dma_resv_assert_held(dmabuf->resv);
> +
> +	if (!dmabuf->ops->get_pci_tph)
> +		return -EOPNOTSUPP;
> +
> +	return dmabuf->ops->get_pci_tph(dmabuf, extended, steering_tag, ph);
> +}
> +EXPORT_SYMBOL_NS_GPL(dma_buf_get_pci_tph, "DMA_BUF");
> +
>  /**
>   * dma_buf_map_attachment - Returns the scatterlist table of the attachment;
>   * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index d1203da56fc5..53b2686ad8fc 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -113,6 +113,26 @@ struct dma_buf_ops {
>  	 */
>  	void (*unpin)(struct dma_buf_attachment *attach);
>  
> +	/**
> +	 * @get_pci_tph:
> +	 *
> +	 * Retrieve PCIe TLP Processing Hint (TPH) steering metadata for
> +	 * this buffer so an importer can program a matching ST/PH hint on
> +	 * outbound TLPs targeting the exporter for peer-to-peer DMA.
> +	 *
> +	 * @dmabuf: DMA buffer for which to retrieve TPH metadata
> +	 * @extended: false for 8-bit ST, true for 16-bit Extended ST
> +	 * @steering_tag: Returns the raw TPH steering tag for the requested
> +	 *                namespace
> +	 * @ph: Returns the TPH processing hint (2-bit value)
> +	 *
> +	 * Optional callback for dma_buf_get_pci_tph(). Called with
> +	 * &dma_buf.resv held. Returns 0 on success or -EOPNOTSUPP when
> +	 * the exporter has no metadata for the requested namespace.
> +	 */
> +	int (*get_pci_tph)(struct dma_buf *dmabuf, bool extended,
> +			   u16 *steering_tag, u8 *ph);
> +
>  	/**
>  	 * @map_dma_buf:
>  	 *
> @@ -563,6 +583,8 @@ void dma_buf_detach(struct dma_buf *dmabuf,
>  		    struct dma_buf_attachment *attach);
>  int dma_buf_pin(struct dma_buf_attachment *attach);
>  void dma_buf_unpin(struct dma_buf_attachment *attach);
> +int dma_buf_get_pci_tph(struct dma_buf *dmabuf, bool extended,
> +			u16 *steering_tag, u8 *ph);
>  
>  struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
>  
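
[Editorial sketch, not part of the patch or the review] The validity rule
requested above (a returned ST/PH tuple is only good until the exporter's
next invalidate_mappings() callback) can be illustrated with a small
user-space model. Everything prefixed model_ is hypothetical and only
mimics the shape of the kernel API: resv_held stands in for
dma_resv_assert_held(), and the importer cache is an assumption about how
an importer might track validity, not what mlx5 does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space model only: these types mimic, but are not, the kernel structures. */
struct model_dmabuf;

struct model_dmabuf_ops {
	/* Optional, like dma_buf_ops.get_pci_tph in the patch. */
	int (*get_pci_tph)(struct model_dmabuf *buf, bool extended,
			   uint16_t *steering_tag, uint8_t *ph);
};

struct model_dmabuf {
	const struct model_dmabuf_ops *ops;
	bool resv_held;		/* stands in for dma_resv_assert_held() */
	uint16_t st;		/* exporter's current steering tag */
	uint8_t ph;		/* exporter's current processing hint */
};

/* Mirrors the wrapper in the patch: check the lock, then dispatch. */
static int model_get_pci_tph(struct model_dmabuf *buf, bool extended,
			     uint16_t *steering_tag, uint8_t *ph)
{
	assert(buf->resv_held);
	if (!buf->ops->get_pci_tph)
		return -EOPNOTSUPP;
	return buf->ops->get_pci_tph(buf, extended, steering_tag, ph);
}

/* Hypothetical importer state: a cached tuple plus a validity flag. */
struct model_importer {
	struct model_dmabuf *buf;
	bool tph_valid;
	uint16_t st;
	uint8_t ph;
};

/* Would be called from the importer's invalidate_mappings() handler. */
static void model_importer_invalidate(struct model_importer *imp)
{
	imp->tph_valid = false;
}

/* Would be called whenever the importer creates a new mapping. */
static int model_importer_map(struct model_importer *imp, bool extended)
{
	int ret;

	if (imp->tph_valid)
		return 0;	/* no invalidation since the last query */
	ret = model_get_pci_tph(imp->buf, extended, &imp->st, &imp->ph);
	if (ret)
		return ret;	/* e.g. -EOPNOTSUPP: caller maps without a hint */
	imp->tph_valid = true;
	return 0;
}

/* Example exporter callback that only serves the 8-bit ST namespace. */
static int model_exporter_get_tph(struct model_dmabuf *buf, bool extended,
				  uint16_t *steering_tag, uint8_t *ph)
{
	if (extended)
		return -EOPNOTSUPP;
	*steering_tag = buf->st;
	*ph = buf->ph;
	return 0;
}
```

In this model the importer keeps using its cached tuple until
model_importer_invalidate() runs, and the next model_importer_map() call
re-queries the exporter, which is the behaviour the proposed kernel-doc
sentence asks importers to follow.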



* Re: [PATCH v10 2/4] dma-buf: add optional get_pci_tph() callback
  2026-07-01  8:25   ` [PATCH v10 2/4] dma-buf: add optional get_pci_tph() callback Christian König
@ 2026-07-01 17:53     ` Zhiping Zhang
  2026-07-02  7:06       ` Christian König
  0 siblings, 1 reply; 5+ messages in thread
From: Zhiping Zhang @ 2026-07-01 17:53 UTC (permalink / raw)
  To: Christian König
  Cc: Jason Gunthorpe, Leon Romanovsky, Michael Guralnik, Sumit Semwal,
	Alex Williamson, Bjorn Helgaas, kvm, linux-rdma, linux-pci,
	dri-devel

> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index d504c636dc29..7a4c9b0d5dab 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -1144,6 +1144,31 @@ void dma_buf_unpin(struct dma_buf_attachment *attach)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(dma_buf_unpin, "DMA_BUF");
> >
> > +/**
> > + * dma_buf_get_pci_tph - Retrieve PCIe TLP Processing Hint (TPH) metadata
> > + * @dmabuf: DMA buffer to query
> > + * @extended: false for 8-bit ST, true for 16-bit Extended ST
> > + * @steering_tag: returns the raw steering tag for the requested namespace
> > + * @ph: returns the TPH processing hint
> > + *
> > + * Wrapper for the optional &dma_buf_ops.get_pci_tph callback.
> > + *
> > + * Must be called with &dma_buf.resv held. Returns -EOPNOTSUPP if the
> > + * exporter does not implement the callback or has no metadata for the
> > + * requested namespace.
>
> Please add something like this:
>
> * The returned information is only valid till the next invalidate_mappings() callback from the exporter and should be re-queried when a new mapping is created after invalidation.
>

Thanks, will do in v11!

> Apart from that it looks good to me, but I still think we need some kind of example that this works for other DMA-buf users as well.
>
> Just demonstrating that this also works with some simple FPGA or similar PCIe endpoint should be sufficient.
>
> Regards,
> Christian.
>

On v10, I have validated a second importer: another vendor's NIC (driver
not upstream yet, so locally patched to call dma_buf_get_pci_tph()). A
PCIe analyzer confirms that the TLP steering tag matches the exporter's
for both mlx5/ConnectX-8 and this second NIC, so two unrelated importer
drivers exercise the API end-to-end.

Thanks,
Zhiping


* Re: [PATCH v10 3/4] vfio/pci: implement get_pci_tph and DMA_BUF_TPH feature
       [not found] ` <20260630224328.3218796-4-zhipingz@meta.com>
@ 2026-07-01 18:07   ` Alex Williamson
  2026-07-01 21:07     ` Zhiping Zhang
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Williamson @ 2026-07-01 18:07 UTC (permalink / raw)
  To: Zhiping Zhang
  Cc: Jason Gunthorpe, Leon Romanovsky, Michael Guralnik, Sumit Semwal,
	Christian König, Bjorn Helgaas, kvm, linux-rdma, linux-pci,
	dri-devel, alex

On Tue, 30 Jun 2026 15:42:25 -0700
Zhiping Zhang <zhipingz@meta.com> wrote:

> Implement dma-buf get_pci_tph for vfio-pci exported dma-bufs and add
> VFIO_DEVICE_FEATURE_DMA_BUF_TPH so userspace can publish TPH metadata
> for a VFIO-owned device.
> 
> 8-bit ST and 16-bit Extended ST are distinct PCIe TPH namespaces; the
> uAPI carries both with explicit validity flags, and get_pci_tph()
> returns the value matching the importer's requested namespace or
> -EOPNOTSUPP.
> 
> Publish and read the TPH descriptor under dmabuf->resv, matching the
> locking used for other importer-visible dma-buf state. The SET ioctl
> takes dma_resv_lock_interruptible(), while the callback runs under
> DMA-buf's asserted resv lock.
> 
> Reject requests the device cannot consume as a completer:
> pcie_tph_completer_type() must report at least
> PCI_EXP_DEVCAP2_TPH_COMP_TPH_ONLY, and Extended ST requires
> PCI_EXP_DEVCAP2_TPH_COMP_EXT_TPH. Make PROBE follow the same hardware
> gate so the feature only probes as supported when the device can really
> consume it.
> 
> Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
> ---
>  drivers/vfio/pci/vfio_pci_core.c   |  3 +
>  drivers/vfio/pci/vfio_pci_dmabuf.c | 99 +++++++++++++++++++++++++++++-
>  drivers/vfio/pci/vfio_pci_priv.h   | 12 ++++
>  include/uapi/linux/vfio.h          | 43 +++++++++++++
>  4 files changed, 155 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index a28f1e99362c..c7d6902bc61b 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1572,6 +1572,9 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
>  		return vfio_pci_core_feature_token(vdev, flags, arg, argsz);
>  	case VFIO_DEVICE_FEATURE_DMA_BUF:
>  		return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz);
> +	case VFIO_DEVICE_FEATURE_DMA_BUF_TPH:
> +		return vfio_pci_core_feature_dma_buf_tph(vdev, flags, arg,
> +							 argsz);
>  	default:
>  		return -ENOTTY;
>  	}
> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> index c16f460c01d6..8de72f9e7502 100644
> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> @@ -3,6 +3,7 @@
>   */
>  #include <linux/dma-buf-mapping.h>
>  #include <linux/pci-p2pdma.h>
> +#include <linux/pci-tph.h>
>  #include <linux/dma-resv.h>
>  
>  #include "vfio_pci_priv.h"
> @@ -19,7 +20,14 @@ struct vfio_pci_dma_buf {
>  	u32 nr_ranges;
>  	struct kref kref;
>  	struct completion comp;
> -	u8 revoked : 1;
> +
> +	/* Protected by dmabuf->resv. */
> +	u16 tph_st_ext;
> +	u8 tph_st;
> +	bool revoked;
> +	u8 tph_st_valid:1;
> +	u8 tph_st_ext_valid:1;
> +	u8 tph_ph:2;

Since it seems there will be a v11, note again the comment made here on
v9:

On Tue, 23 Jun 2026 22:24:54 -0700
Zhiping Zhang <zhipingz@meta.com> wrote:
> On Tue, Jun 23, 2026 at 11:17 AM Alex Williamson <alex@shazbot.org> wrote:
> >
> > Nit, it would be more accurate to say:
> >
> >         /*
> >          * Updates protected by dmabuf->resv, @revoked additionally
> >          * protected by memory_lock.
> >          */
> >
> > revoked also has an unprotected read, but it's previously existing and
> > benign, and likely just needs a READ_ONCE() annotation.
> >  
> 
> Agreed, I'll update the comment and add READ_ONCE() as well.

The READ_ONCE was added, but the comment remains as in v9.  The
READ_ONCE rationale should be described in the commit log too.  Thanks,

Alex
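
[Editorial sketch, not part of the patch or the review] The namespace
handling described in the quoted commit message (separate 8-bit ST and
16-bit Extended ST values with validity flags, and a completer-capability
gate on the SET path) can be modelled in user space as below. All names
are hypothetical: enum model_completer only stands in for the completer
types named in the commit message, the error codes on the SET path and
the range checks are assumptions, and setting one namespace per call is a
simplification of a uAPI that carries both. In the kernel the state would
be read and written under dmabuf->resv.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the completer types the commit message names. */
enum model_completer {
	MODEL_COMP_NONE,	/* device cannot consume TPH */
	MODEL_COMP_TPH_ONLY,	/* 8-bit ST only */
	MODEL_COMP_EXT_TPH,	/* 8-bit ST and 16-bit Extended ST */
};

/* Same idea as the quoted hunk: two tags, two validity bits, one 2-bit PH. */
struct model_tph {
	uint16_t st_ext;
	uint8_t st;
	unsigned int st_valid:1;
	unsigned int st_ext_valid:1;
	unsigned int ph:2;
};

/* Models the SET path: reject what the device cannot consume as a completer. */
static int model_tph_set(struct model_tph *t, enum model_completer comp,
			 bool extended, uint16_t st, uint8_t ph)
{
	assert(t);
	if (comp == MODEL_COMP_NONE)
		return -EOPNOTSUPP;	/* assumed error code */
	if (extended && comp != MODEL_COMP_EXT_TPH)
		return -EOPNOTSUPP;	/* assumed error code */
	if (ph > 3 || (!extended && st > UINT8_MAX))
		return -EINVAL;		/* assumed range checks */
	if (extended) {
		t->st_ext = st;
		t->st_ext_valid = 1;
	} else {
		t->st = (uint8_t)st;
		t->st_valid = 1;
	}
	t->ph = ph;
	return 0;
}

/* Models get_pci_tph(): answer only for the namespace the importer asked for. */
static int model_tph_get(const struct model_tph *t, bool extended,
			 uint16_t *st, uint8_t *ph)
{
	assert(t && st && ph);
	if (extended ? !t->st_ext_valid : !t->st_valid)
		return -EOPNOTSUPP;
	*st = extended ? t->st_ext : t->st;
	*ph = (uint8_t)t->ph;
	return 0;
}
```

Keeping a validity bit per namespace means an importer that can only emit
Extended ST gets -EOPNOTSUPP rather than a truncated or widened 8-bit tag,
which matches the "distinct namespaces" wording in the commit message.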


* Re: [PATCH v10 3/4] vfio/pci: implement get_pci_tph and DMA_BUF_TPH feature
  2026-07-01 18:07   ` [PATCH v10 3/4] vfio/pci: implement get_pci_tph and DMA_BUF_TPH feature Alex Williamson
@ 2026-07-01 21:07     ` Zhiping Zhang
  0 siblings, 0 replies; 5+ messages in thread
From: Zhiping Zhang @ 2026-07-01 21:07 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Jason Gunthorpe, Leon Romanovsky, Michael Guralnik, Sumit Semwal,
	Christian König, Bjorn Helgaas, kvm, linux-rdma, linux-pci,
	dri-devel

> Since it seems there will be a v11, note again the comment made here on
> v9:
>
> On Tue, 23 Jun 2026 22:24:54 -0700
> Zhiping Zhang <zhipingz@meta.com> wrote:
> > On Tue, Jun 23, 2026 at 11:17 AM Alex Williamson <alex@shazbot.org> wrote:
> > >
> > > Nit, it would be more accurate to say:
> > >
> > >         /*
> > >          * Updates protected by dmabuf->resv, @revoked additionally
> > >          * protected by memory_lock.
> > >          */
> > >
> > > revoked also has an unprotected read, but it's previously existing and
> > > benign, and likely just needs a READ_ONCE() annotation.
> > >
> >
> > Agreed, I'll update the comment and add READ_ONCE() as well.
>
> The READ_ONCE was added, but the comment remains as in v9.  The
> READ_ONCE rationale should be described in the commit log too.  Thanks,
>
> Alex

Sure, sorry for the miss. Will do!

Thanks,
Zhiping


* Re: [PATCH v10 2/4] dma-buf: add optional get_pci_tph() callback
  2026-07-01 17:53     ` Zhiping Zhang
@ 2026-07-02  7:06       ` Christian König
  0 siblings, 0 replies; 5+ messages in thread
From: Christian König @ 2026-07-02  7:06 UTC (permalink / raw)
  To: Zhiping Zhang
  Cc: Jason Gunthorpe, Leon Romanovsky, Michael Guralnik, Sumit Semwal,
	Alex Williamson, Bjorn Helgaas, kvm, linux-rdma, linux-pci,
	dri-devel

On 7/1/26 19:53, Zhiping Zhang wrote:
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index d504c636dc29..7a4c9b0d5dab 100644
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -1144,6 +1144,31 @@ void dma_buf_unpin(struct dma_buf_attachment *attach)
>>>  }
>>>  EXPORT_SYMBOL_NS_GPL(dma_buf_unpin, "DMA_BUF");
>>>
>>> +/**
>>> + * dma_buf_get_pci_tph - Retrieve PCIe TLP Processing Hint (TPH) metadata
>>> + * @dmabuf: DMA buffer to query
>>> + * @extended: false for 8-bit ST, true for 16-bit Extended ST
>>> + * @steering_tag: returns the raw steering tag for the requested namespace
>>> + * @ph: returns the TPH processing hint
>>> + *
>>> + * Wrapper for the optional &dma_buf_ops.get_pci_tph callback.
>>> + *
>>> + * Must be called with &dma_buf.resv held. Returns -EOPNOTSUPP if the
>>> + * exporter does not implement the callback or has no metadata for the
>>> + * requested namespace.
>>
>> Please add something like this:
>>
>> * The returned information is only valid till the next invalidate_mappings() callback from the exporter and should be re-queried when a new mapping is created after invalidation.
>>
> 
> Thanks, will do in v11!
> 
>> Apart from that it looks good to me, but I still think we need some kind of example that this works for other DMA-buf users as well.
>>
>> Just demonstrating that this also works with some simple FPGA or similar PCIe endpoint should be sufficient.
>>
>> Regards,
>> Christian.
>>
> 
> On v10, I have validated a second importer: another vendor's NIC (driver
> not upstream yet, so locally patched to call dma_buf_get_pci_tph()). A
> PCIe analyzer confirms that the TLP steering tag matches the exporter's
> for both mlx5/ConnectX-8 and this second NIC, so two unrelated importer
> drivers exercise the API end-to-end.

That sounds like it would be sufficient, yes.

Thanks,
Christian.

> 
> Thanks,
> Zhiping

