linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: "Kasireddy, Vivek" <vivek.kasireddy@intel.com>
Cc: "Alex Williamson" <alex.williamson@redhat.com>,
	"Christoph Hellwig" <hch@lst.de>,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Christian König" <christian.koenig@amd.com>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"Jens Axboe" <axboe@kernel.dk>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linaro-mm-sig@lists.linaro.org" <linaro-mm-sig@lists.linaro.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-media@vger.kernel.org" <linux-media@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"Logan Gunthorpe" <logang@deltatee.com>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Will Deacon" <will@kernel.org>
Subject: Re: [PATCH 10/10] vfio/pci: Add dma-buf export support for MMIO regions
Date: Thu, 24 Jul 2025 08:44:43 +0300	[thread overview]
Message-ID: <20250724054443.GP402218@unreal> (raw)
In-Reply-To: <IA0PR11MB7185E487736B8B4CD70600DEF85EA@IA0PR11MB7185.namprd11.prod.outlook.com>

On Thu, Jul 24, 2025 at 05:13:49AM +0000, Kasireddy, Vivek wrote:
> Hi Leon,
> 
> > Subject: [PATCH 10/10] vfio/pci: Add dma-buf export support for MMIO
> > regions
> > 
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > Add support for exporting PCI device MMIO regions through dma-buf,
> > enabling safe sharing of non-struct page memory with controlled
> > lifetime management. This allows RDMA and other subsystems to import
> > dma-buf FDs and build them into memory regions for PCI P2P operations.
> > 
> > The implementation provides a revocable attachment mechanism using
> > dma-buf move operations. MMIO regions are normally pinned as BARs
> > don't change physical addresses, but access is revoked when the VFIO
> > device is closed or a PCI reset is issued. This ensures kernel
> > self-defense against potentially hostile userspace.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >  drivers/vfio/pci/Kconfig           |  20 ++
> >  drivers/vfio/pci/Makefile          |   2 +
> >  drivers/vfio/pci/vfio_pci_config.c |  22 +-
> >  drivers/vfio/pci/vfio_pci_core.c   |  25 ++-
> >  drivers/vfio/pci/vfio_pci_dmabuf.c | 321 +++++++++++++++++++++++++++++
> >  drivers/vfio/pci/vfio_pci_priv.h   |  23 +++
> >  include/linux/dma-buf.h            |   1 +
> >  include/linux/vfio_pci_core.h      |   3 +
> >  include/uapi/linux/vfio.h          |  19 ++
> >  9 files changed, 431 insertions(+), 5 deletions(-)
> >  create mode 100644 drivers/vfio/pci/vfio_pci_dmabuf.c

<...>

> > +static int validate_dmabuf_input(struct vfio_pci_core_device *vdev,
> > +				 struct vfio_device_feature_dma_buf *dma_buf)
> > +{
> > +	struct pci_dev *pdev = vdev->pdev;
> > +	u32 bar = dma_buf->region_index;
> > +	u64 offset = dma_buf->offset;
> > +	u64 len = dma_buf->length;
> > +	resource_size_t bar_size;
> > +	u64 sum;
> > +
> > +	/*
> > +	 * For PCI the region_index is the BAR number like  everything else.
> > +	 */
> > +	if (bar >= VFIO_PCI_ROM_REGION_INDEX)
> > +		return -ENODEV;

<...>

> > +/**
> > + * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the
> > + * regions selected.
> > + *
> > + * open_flags are the typical flags passed to open(2), eg O_RDWR,
> > O_CLOEXEC,
> > + * etc. offset/length specify a slice of the region to create the dmabuf from.
> > + * nr_ranges is the total number of (P2P DMA) ranges that comprise the
> > dmabuf.
> Any particular reason why you dropped the option (nr_ranges) of creating a
> single dmabuf from multiple ranges of an MMIO region?

I did it for two reasons. First, I wanted to simplify the code in order
to speed-up discussion over the patchset itself. Second, I failed to
find justification for need of multiple ranges, as the number of BARs
are limited by VFIO_PCI_ROM_REGION_INDEX (6) and same functionality
can be achieved by multiple calls to DMABUF import.

> 
> Restricting the dmabuf to a single range (or having to create multiple dmabufs
> to represent multiple regions/ranges associated with a single scattered buffer)
> would be very limiting and may not work in all cases. For instance, in my use-case,
> I am trying to share a large (4k mode) framebuffer (FB) located in GPU's VRAM
> between two (p2p compatible) GPU devices. And, this would probably not work
> given that allocating a large contiguous FB (nr_ranges = 1) in VRAM may not be
> feasible when there is memory pressure.

Can you please help me and point to the place in the code where this can fail?
I'm probably missing something basic as there are no large allocations
in the current patchset.

> 
> Furthermore, since you are adding a new UAPI with this patch/feature, as you know,
> we cannot go back and tweak it (to add support for nr_ranges > 1) should there
> be a need in the future, but you can always use nr_ranges = 1 anytime. Therefore,
> I think it makes sense to be flexible in terms of the number of ranges to include
> while creating a dmabuf instead of restricting ourselves to one range.

I'm not a big fan of over-engineering. Let's first understand if this
case is needed.

Thanks

> 
> Thanks,
> Vivek
> 
> > + *
> > + * Return: The fd number on success, -1 and errno is set on failure.
> > + */
> > +#define VFIO_DEVICE_FEATURE_DMA_BUF 11
> > +
> > +struct vfio_device_feature_dma_buf {
> > +	__u32	region_index;
> > +	__u32	open_flags;
> > +	__u64	offset;
> > +	__u64	length;
> > +};
> > +
> >  /* -------- API for Type1 VFIO IOMMU -------- */
> > 
> >  /**
> > --
> > 2.50.1
> 

  reply	other threads:[~2025-07-24  5:44 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-23 13:00 [PATCH 00/10] vfio/pci: Allow MMIO regions to be exported through dma-buf Leon Romanovsky
2025-07-23 13:00 ` [PATCH 01/10] PCI/P2PDMA: Remove redundant bus_offset from map state Leon Romanovsky
2025-07-24  7:50   ` Christoph Hellwig
2025-07-23 13:00 ` [PATCH 02/10] PCI/P2PDMA: Introduce p2pdma_provider structure for cleaner abstraction Leon Romanovsky
2025-07-24  7:51   ` Christoph Hellwig
2025-07-24  7:55     ` Leon Romanovsky
2025-07-24  7:59       ` Christoph Hellwig
2025-07-24  8:07         ` Leon Romanovsky
2025-07-27 18:51         ` Jason Gunthorpe
2025-07-29  7:52           ` Christoph Hellwig
2025-07-29  8:53             ` Leon Romanovsky
2025-07-29 10:41               ` Christoph Hellwig
2025-07-29 11:39                 ` Leon Romanovsky
2025-07-29 13:15             ` Jason Gunthorpe
2025-07-29 16:12   ` Jason Gunthorpe
2025-07-23 13:00 ` [PATCH 03/10] PCI/P2PDMA: Simplify bus address mapping API Leon Romanovsky
2025-07-24  7:52   ` Christoph Hellwig
2025-07-23 13:00 ` [PATCH 04/10] PCI/P2PDMA: Refactor to separate core P2P functionality from memory allocation Leon Romanovsky
2025-07-23 13:00 ` [PATCH 05/10] PCI/P2PDMA: Export pci_p2pdma_map_type() function Leon Romanovsky
2025-07-24  8:03   ` Christoph Hellwig
2025-07-24  8:13     ` Leon Romanovsky
2025-07-25 16:30       ` Logan Gunthorpe
2025-07-25 18:54         ` Leon Romanovsky
2025-07-25 19:12           ` Logan Gunthorpe
2025-07-27  6:01             ` Leon Romanovsky
2025-07-27 19:05         ` Jason Gunthorpe
2025-07-28 16:12           ` Logan Gunthorpe
2025-07-28 16:41             ` Leon Romanovsky
2025-07-28 17:07               ` Logan Gunthorpe
2025-07-28 23:11                 ` Jason Gunthorpe
2025-07-29 20:54                   ` Logan Gunthorpe
2025-07-29 22:14                     ` Jason Gunthorpe
2025-07-30  8:03                     ` Leon Romanovsky
2025-07-29  7:52       ` Christoph Hellwig
2025-07-29  8:45         ` Leon Romanovsky
2025-07-27 19:02     ` Jason Gunthorpe
2025-07-23 13:00 ` [PATCH 06/10] types: move phys_vec definition to common header Leon Romanovsky
2025-07-23 13:00 ` [PATCH 07/10] vfio: Export vfio device get and put registration helpers Leon Romanovsky
2025-07-23 13:00 ` [PATCH 08/10] vfio/pci: Enable peer-to-peer DMA transactions by default Leon Romanovsky
2025-07-23 13:00 ` [PATCH 09/10] vfio/pci: Share the core device pointer while invoking feature functions Leon Romanovsky
2025-07-28 20:55   ` Alex Williamson
2025-07-29  8:39     ` Leon Romanovsky
2025-07-23 13:00 ` [PATCH 10/10] vfio/pci: Add dma-buf export support for MMIO regions Leon Romanovsky
2025-07-24  5:13   ` Kasireddy, Vivek
2025-07-24  5:44     ` Leon Romanovsky [this message]
2025-07-25  5:34       ` Kasireddy, Vivek
2025-07-27  6:16         ` Leon Romanovsky
2025-07-29 19:44   ` Robin Murphy
2025-07-29 20:13     ` Jason Gunthorpe
2025-07-30  9:32       ` Leon Romanovsky
2025-07-30 14:49       ` Robin Murphy
2025-07-30 16:01         ` Jason Gunthorpe
2025-07-30 19:58 ` [PATCH 00/10] vfio/pci: Allow MMIO regions to be exported through dma-buf Alex Williamson
2025-07-31  0:21   ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250724054443.GP402218@unreal \
    --to=leon@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=bhelgaas@google.com \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=jglisse@redhat.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=m.szyprowski@samsung.com \
    --cc=robin.murphy@arm.com \
    --cc=sumit.semwal@linaro.org \
    --cc=vivek.kasireddy@intel.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).