Netdev List
    [PATCH v7 0/5] vfio/dma-buf: add TPH support for peer-to-peer access
    From: Zhiping Zhang @ 2026-06-11 16:11 UTC
    To: netdev; +Cc: kvm, linux-rdma, linux-pci, dri-devel, Zhiping Zhang
    
    This series adds TLP Processing Hints (TPH) support to the VFIO dma-buf
    export path, allowing importing drivers (e.g. mlx5) to use the
    exporter's steering tag when performing peer-to-peer DMA into a
    VFIO-owned device.
    
    There is no separate in-tree vendor kernel driver for the target device:
    vfio-pci is the in-tree driver and the targeted device is managed
    from userspace via VFIO passthrough. That is why the ST has to flow
    through a uAPI: userspace owns the device and its ST table, so it is the
    entity that can publish a meaningful value for a given dma-buf. The
    kernel-visible participants are still in-tree: vfio-pci exports the
    dma-buf and mlx5 imports it.
    
    On the effect: the endpoint's PCIe ingress block uses the 8-bit ST as
    an in-band instruction for the incoming P2P TLP -- selecting a target
    cache partition and, on writes, an in-flight operation on the data
    before it lands. The dma-buf callback keeps this opaque to the
    framework -- only the producer (userspace owner of the VFIO device)
    and the consumer (endpoint block) need to interpret the value. The
    dma-buf get_tph callback itself is optional, so an importer can fall
    back to TLPs without the exporter's hint. For workloads that depend on
    the endpoint's in-flight operation, that fallback does not produce
    the same result.
    
    The dma-buf hook is intentionally generic and discoverable rather than
    a private side channel. The exporter owns the completing address
    space for the dma-buf and decides whether it can provide a meaningful
    ST/PH tuple for that completer; the dma-buf core keeps the tuple opaque,
    and importers merely request the namespace they support and place the
    returned value on generated TLPs. Exporters that cannot derive a
    meaningful tuple simply return -EOPNOTSUPP.
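
The optional-callback contract described above can be sketched in plain userspace C. This is only an illustrative model: every demo_* name is hypothetical and is not the signature added by the series, the assumption that the wrapper returns -EOPNOTSUPP when the callback is absent is mine, and the locking under dmabuf->resv is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are the kernel API. */
struct demo_tph {
	uint8_t st;	/* steering tag, opaque to the framework */
	uint8_t ph;	/* processing hint */
};

struct demo_buf;

struct demo_buf_ops {
	/* Optional: NULL when the exporter has no TPH metadata at all. */
	int (*get_tph)(struct demo_buf *buf, struct demo_tph *out);
};

struct demo_buf {
	const struct demo_buf_ops *ops;
	int has_tph;		/* set once the owner published a tuple */
	struct demo_tph tph;
};

/* Importer-facing wrapper: a missing callback reads as "not supported". */
static int demo_buf_get_tph(struct demo_buf *buf, struct demo_tph *out)
{
	assert(buf && buf->ops && out);
	if (!buf->ops->get_tph)
		return -EOPNOTSUPP;
	return buf->ops->get_tph(buf, out);
}

/* Exporter callback: hand back the published tuple, or decline. */
static int demo_export_get_tph(struct demo_buf *buf, struct demo_tph *out)
{
	if (!buf->has_tph)
		return -EOPNOTSUPP;
	*out = buf->tph;
	return 0;
}

static const struct demo_buf_ops demo_ops_with_tph = {
	.get_tph = demo_export_get_tph,
};
static const struct demo_buf_ops demo_ops_plain = {
	.get_tph = NULL,
};
```

In this model the importer sees the same error code whether the callback is missing or the exporter declines, so it needs only one fallback path.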
    
    Patch 1 fixes a pre-existing leak and is split out as its own patch:
    mlx5_st_dealloc_index() removed the xarray entry but never freed the
    backing struct, so repeated alloc/dealloc cycles leaked memory.
    Patch 2 adds small PCI/TPH type helpers so drivers can query the enabled
    TPH requester mode and the device's TPH Completer Supported field
    without reaching into pci_dev internals (and so callers in
    CONFIG_PCIE_TPH=n builds get a clean fallback).
    Patch 3 adds the optional dma_buf_ops::get_tph callback plus the
    dma_buf_get_tph() importer wrapper so importers can fetch TPH metadata
    from an exporter under dmabuf->resv.
    Patch 4 implements get_tph in vfio-pci and adds the new uAPI
    (VFIO_DEVICE_FEATURE_DMA_BUF_TPH) for userspace to attach the metadata.
    Patch 5 wires up the mlx5 RDMA driver as a consumer.
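
The class of leak that patch 1 addresses can be illustrated with a small userspace model. This is not the mlx5 code: the fixed-size table stands in for the xarray, the reference counting is an assumption drawn from the "final dealloc" wording, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_SLOTS 4

/* Hypothetical stand-in for a per-index record; not the mlx5 structure. */
struct demo_idx_data {
	int refs;
};

static struct demo_idx_data *demo_table[DEMO_SLOTS];
static int demo_live_allocs;	/* outstanding heap records, for checking */

/* Take a reference on idx, allocating the record on first use. */
static int demo_alloc_index(unsigned int idx)
{
	if (idx >= DEMO_SLOTS)
		return -1;
	if (!demo_table[idx]) {
		demo_table[idx] = calloc(1, sizeof(*demo_table[idx]));
		if (!demo_table[idx])
			return -1;
		demo_live_allocs++;
	}
	demo_table[idx]->refs++;
	return 0;
}

/* Drop a reference; the final drop releases both entry and record. */
static int demo_dealloc_index(unsigned int idx)
{
	struct demo_idx_data *d;

	if (idx >= DEMO_SLOTS || !demo_table[idx])
		return -1;
	d = demo_table[idx];
	assert(d->refs > 0);
	if (--d->refs > 0)
		return 0;
	demo_table[idx] = NULL;	/* remove the lookup entry ... */
	free(d);		/* ... and free the backing record as well */
	demo_live_allocs--;
	return 0;
}
```

Leaving out the free() call reproduces the reported symptom in the model: the table entry disappears while demo_live_allocs keeps growing across alloc/dealloc cycles.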
    
    Build-tested with both CONFIG_PCIE_TPH=y and CONFIG_PCIE_TPH=n.
    Functional validation on the target topology: PCIe analyzer captures
    on the P2P TLPs confirm the ST emitted by mlx5 matches the value
    published through VFIO_DEVICE_FEATURE_DMA_BUF_TPH, and the end-to-end
    P2P workload only produces results consistent with the endpoint's
    ST-selected in-flight operation. For example, with userspace
    publishing 8-bit ST=0xf0 and PH=2, an analyzer capture of a peer-to-
    peer MWr64 shows "STP MWr64 TC=0 OHC=2 ..." followed by "OHC-B
    ST=F0h PH=2 HV=1":
    (TLP Captures)
    08000260 -> STP MWr64 TC=0 OHC=2 TS=0 Attr=0 L=8
    F0000004 -> RID=4h:0h.0h EP- Tag=F0h
    E0200000 -> AddrH=000020E0h
    00080006 -> AddrL=06000800h
    90F00000 -> OHC-B ST=F0h PH=2 HV=1 AMA=0 AV-
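
When reading the capture, it helps to notice that the raw dwords on the left appear to be printed byte-reversed relative to the decoded fields on the right (E0200000 against AddrH=000020E0h, 00080006 against AddrL=06000800h). The helper below is a hypothetical reading aid that checks this observation against those two lines only; it makes no claim about the bit layout of the OHC-B word or of any other field.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit capture word. */
static uint32_t demo_swap32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8) |
	       ((v & 0xff000000u) >> 24);
}

/* Check the helper against the two address lines of the capture. */
static void demo_check_capture(void)
{
	assert(demo_swap32(0xE0200000u) == 0x000020E0u);	/* AddrH */
	assert(demo_swap32(0x00080006u) == 0x06000800u);	/* AddrL */
}
```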
    
    Previous versions:
    v6: https://lore.kernel.org/dri-devel/20260608185646.4085127-1-zhipingz@meta.com/
    v5: https://lore.kernel.org/dri-devel/20260526144401.1485788-1-zhipingz@meta.com/
    v4: https://lore.kernel.org/linux-pci/20260519201401.1558410-1-zhipingz@meta.com/
    v3: https://lore.kernel.org/linux-pci/20260512184755.4137227-1-zhipingz@meta.com/
    v2: https://lore.kernel.org/linux-pci/20260430200704.352228-1-zhipingz@meta.com/
    
    Zhiping Zhang (5):
      net/mlx5: free mlx5_st_idx_data on final dealloc
      PCI/TPH: Add requester/completer type helpers
      dma-buf: add optional get_tph() callback
      vfio/pci: implement get_tph and DMA_BUF_TPH feature
      RDMA/mlx5: get tph for p2p access when registering dma-buf mr
    
     drivers/dma-buf/dma-buf.c                     |  25 ++++
     drivers/infiniband/core/frmr_pools.c          |  20 +++-
     drivers/infiniband/hw/mlx5/mr.c               | 111 +++++++++++++++++-
     .../net/ethernet/mellanox/mlx5/core/lib/st.c  |  50 ++++++--
     drivers/pci/tph.c                             |  43 +++++++
     drivers/vfio/pci/vfio_pci_core.c              |   3 +
     drivers/vfio/pci/vfio_pci_dmabuf.c            |  94 ++++++++++++++-
     drivers/vfio/pci/vfio_pci_priv.h              |  12 ++
     include/linux/dma-buf.h                       |  21 ++++
     include/linux/mlx5/driver.h                   |  12 ++
     include/linux/pci-tph.h                       |   8 ++
     include/rdma/frmr_pools.h                     |   5 +-
     include/uapi/linux/vfio.h                     |  37 ++++++
     13 files changed, 421 insertions(+), 20 deletions(-)
    
    -- 
    2.53.0-Meta
    