public inbox for linux-rdma@vger.kernel.org
From: Leon Romanovsky <leon@kernel.org>
To: Zhiping Zhang <zhipingz@meta.com>
Cc: Alex Williamson <alex@shazbot.org>,
	Stanislav Fomichev <sdf@meta.com>,
	Keith Busch <kbusch@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
	Bjorn Helgaas <helgaas@kernel.org>,
	linux-rdma@vger.kernel.org, linux-pci@vger.kernel.org,
	netdev@vger.kernel.org, dri-devel@lists.freedesktop.org,
	Yochai Cohen <yochai@nvidia.com>,
	Yishai Hadas <yishaih@nvidia.com>
Subject: Re: [PATCH v1 1/2] vfio: add callback to get tph info for dma-buf
Date: Mon, 27 Apr 2026 21:35:13 +0300
Message-ID: <20260427183513.GK440345@unreal>
In-Reply-To: <CAH3zFs2Sy0mv=QkK4VSV+MVR=ef_CdoxMhTFgzaqoZ+uSOpxoQ@mail.gmail.com>

On Mon, Apr 27, 2026 at 07:28:57AM -0700, Zhiping Zhang wrote:
> On Mon, Apr 27, 2026 at 6:37 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > >
> > On Wed, Apr 22, 2026 at 09:23:27AM -0600, Alex Williamson wrote:
> > > On Mon, 20 Apr 2026 11:39:15 -0700
> > > Zhiping Zhang <zhipingz@meta.com> wrote:
> > >
> > > > Add a dma-buf callback that returns raw TPH metadata from the exporter
> > > > so peer devices can reuse the steering tag and processing hint
> > > > associated with a VFIO-exported buffer.
> > > >
> > > > Keep the existing VFIO_DEVICE_FEATURE_DMA_BUF uAPI layout intact by
> > > > using a flag plus one extra trailing entries[] object for the optional
> > > > TPH metadata. Rename the uAPI field dma_ranges to entries. The
> > > > nr_ranges field remains the DMA range count; when VFIO_DMABUF_FLAG_TPH
> > > > is set the kernel reads one extra entry beyond nr_ranges for the TPH
> > > > metadata.
> > > >
> > > > Add an st_width parameter to get_tph() so the exporter can reject
> > > > steering tags that exceed the consumer's supported width (8 vs 16 bit).
> > > > When no TPH metadata was supplied, make get_tph() return -EOPNOTSUPP.
> > > >
> > > > Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
> > > > ---
> > > >  drivers/vfio/pci/vfio_pci_dmabuf.c | 62 +++++++++++++++++++++++-------
> > > >  include/linux/dma-buf.h            | 17 ++++++++
> > > >  include/uapi/linux/vfio.h          | 28 ++++++++++++--
> > > >  3 files changed, 89 insertions(+), 18 deletions(-)
> >
> > <...>
> >
> > > > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > > > index bb7b89330d35..a0bd24623c52 100644
> > > > --- a/include/uapi/linux/vfio.h
> > > > +++ b/include/uapi/linux/vfio.h
> > > > @@ -1490,16 +1490,36 @@ struct vfio_device_feature_bus_master {
> > > >   * open_flags are the typical flags passed to open(2), eg O_RDWR, O_CLOEXEC,
> > > >   * etc. offset/length specify a slice of the region to create the dmabuf from.
> > > >   * nr_ranges is the total number of (P2P DMA) ranges that comprise the dmabuf.
> > > > + * When VFIO_DMABUF_FLAG_TPH is set, entries[] contains one extra trailing
> > > > + * object after the nr_ranges DMA ranges carrying the TPH steering tag and
> > > > + * processing hint.
> > >
> > > I really don't think we want to design an API where entries is
> > > implicitly one-off from what's actually there.  This feeds back into
> > > the below removal of the __counted by attribute, which is a red flag
> > > that this is the wrong approach.
> >
> > I believe removing `__counted` is a mistake. In my proposal, the intent
> > was to adjust the meaning of the storage object based on the flag bit.
> > The size of the array should still be represented correctly.
> >
> > Thanks
> 
> Thanks Leon — you're right that __counted_by should be preserved. In
> your approach, when the flag is set, the last entry in the array
> carries the TPH data, so the effective DMA range count is nr_ranges -
> 1.

That is correct only if you keep the original name *nr_ranges*. However,
the field is worth renaming to something clearer, such as
*nr_array_elements*, so that it counts every array slot, including the TPH
one.
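
To illustrate the counting point, here is a minimal userspace sketch, not
the actual uAPI. Every sketch_* name, the flag value, and the TPH field
layout are assumptions made up for this example; only the idea (the count
covers every slot, and the flag turns the last slot into TPH metadata)
comes from the discussion.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical flag bit; the value in the actual patch may differ. */
#define SKETCH_DMABUF_FLAG_TPH (1u << 0)

/* A DMA range slot: offset and length within the region. */
struct sketch_dma_range {
	uint64_t offset;
	uint64_t length;
};

/* Hypothetical TPH payload; field names and widths are assumptions. */
struct sketch_tph {
	uint16_t steering_tag;
	uint8_t  ph;
};

/* One storage object whose meaning depends on the flag and its position. */
union sketch_entry {
	struct sketch_dma_range range;
	struct sketch_tph tph;
};

/* The overlay must not change the element size seen by existing users. */
static_assert(sizeof(union sketch_entry) == sizeof(struct sketch_dma_range),
	      "TPH overlay must fit in one range slot");

struct sketch_get_dma_buf {
	uint32_t flags;
	uint32_t nr_array_elements;	/* counts every slot, TPH included */
	/* in the kernel this would carry __counted_by(nr_array_elements) */
	union sketch_entry entries[];
};

/* Allocate a zeroed request with nr slots; returns NULL on failure. */
static struct sketch_get_dma_buf *sketch_alloc(uint32_t nr)
{
	struct sketch_get_dma_buf *req;

	req = calloc(1, sizeof(*req) + (size_t)nr * sizeof(req->entries[0]));
	if (req)
		req->nr_array_elements = nr;
	return req;
}

/*
 * Store the number of real DMA ranges in *out.
 * Returns 0 on success, -1 if the array is too short for the flags given.
 */
static int sketch_nr_dma_ranges(const struct sketch_get_dma_buf *req,
				uint32_t *out)
{
	uint32_t n = req->nr_array_elements;

	if (req->flags & SKETCH_DMABUF_FLAG_TPH) {
		if (n < 2)	/* need one range plus the TPH slot */
			return -1;
		n -= 1;		/* last slot is metadata, not a range */
	} else if (n == 0) {
		return -1;
	}
	*out = n;
	return 0;
}
```

With this shape the array bound always matches the allocation, so the
annotation stays truthful, and the "minus one" lives in a single helper
rather than in the meaning of the count field.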

> 
> That said, after discussing internally, we're leaning toward
> introducing a new VFIO device feature with dedicated TPH fields (as
> Alex suggested too), to avoid overloading vfio_region_dma_range with a
> union that changes semantics based on position.
> 
> Would you have concerns with that direction? I'll post a v3 with the
> new approach.

I don’t have any concerns. My only worry was about “stealing” too many
bits from the flags variable, and you’ve avoided that here.

Thanks

Thread overview: 13+ messages
2026-04-20 18:39 [PATCH v1 0/2] Retrieve TPH from dma-buf for PCIe P2P memory access Zhiping Zhang
2026-04-20 18:39 ` [PATCH v1 1/2] vfio: add callback to get tph info for dma-buf Zhiping Zhang
2026-04-22 15:23   ` Alex Williamson
2026-04-22 16:29     ` Jason Gunthorpe
2026-04-22 19:27       ` Alex Williamson
2026-04-23 14:28         ` Jason Gunthorpe
2026-04-23 19:20           ` Alex Williamson
2026-04-23 22:46             ` Jason Gunthorpe
2026-04-24  5:41               ` Zhiping Zhang
2026-04-27 13:37     ` Leon Romanovsky
2026-04-27 14:28       ` Zhiping Zhang
2026-04-27 18:35         ` Leon Romanovsky [this message]
2026-04-20 18:39 ` [PATCH v1 2/2] RDMA/mlx5: get tph for p2p access when registering dma-buf mr Zhiping Zhang
