From: Alex Williamson <alex@shazbot.org>
To: Leon Romanovsky <leon@kernel.org>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
	"Logan Gunthorpe" <logang@deltatee.com>,
	"Jens Axboe" <axboe@kernel.dk>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Joerg Roedel" <joro@8bytes.org>, "Will Deacon" <will@kernel.org>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Kees Cook" <kees@kernel.org>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	"Ankit Agrawal" <ankita@nvidia.com>,
	"Yishai Hadas" <yishaih@nvidia.com>,
	"Shameer Kolothum" <skolothumtho@nvidia.com>,
	"Kevin Tian" <kevin.tian@intel.com>,
	"Krishnakant Jaju" <kjaju@nvidia.com>,
	"Matt Ochs" <mochs@nvidia.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, iommu@lists.linux.dev,
	linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org, kvm@vger.kernel.org,
	linux-hardening@vger.kernel.org, "Alex Mastro" <amastro@fb.com>,
	"Nicolin Chen" <nicolinc@nvidia.com>
Subject: Re: [PATCH v7 11/11] vfio/nvgrace: Support get_dmabuf_phys
Date: Mon, 10 Nov 2025 13:05:34 -0700
Message-ID: <20251110130534.4d4b17ad.alex@shazbot.org>
In-Reply-To: <20251106-dmabuf-vfio-v7-11-2503bf390699@nvidia.com>

On Thu,  6 Nov 2025 16:16:56 +0200
Leon Romanovsky <leon@kernel.org> wrote:

> From: Jason Gunthorpe <jgg@nvidia.com>
> 
> Call vfio_pci_core_fill_phys_vec() with the proper physical ranges for the
> synthetic BAR 2 and BAR 4 regions. Otherwise use the normal flow based on
> the PCI BAR.
> 
> This demonstrates a DMABUF that follows the region info report and only
> allows mapping the parts of the region that are mmapable. Since the BAR
> is power-of-two sized and the "CXL" region is just page aligned, there
> can be a padding region at the end that is not mmapped or passed into
> the DMABUF.
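
Just to make the padding concrete, a tiny sketch with made-up numbers
(roundup_pow_of_two() is the helper from linux/log2.h; the sizes are
hypothetical, not taken from this patch):

	u64 memlength = 96ULL << 30;			/* usable "CXL" memory: 96 GB */
	u64 barsize = roundup_pow_of_two(memlength);	/* synthetic BAR: 128 GB */
	u64 padding = barsize - memlength;		/* 32 GB tail */

That 32 GB tail is the part of the BAR that is neither mmapable nor
passed into the DMABUF.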
> 
> The "CXL" ranges that are remapped into the BAR 2 and BAR 4 areas are
> not PCI MMIO; they actually run over the CXL-like coherent interconnect
> and, for the purposes of DMA, behave identically to DRAM. For now we
> don't try to model in the p2p framework this distinction between true
> PCI BAR memory that takes a real PCI path and the "CXL" memory that
> takes a different path.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Tested-by: Alex Mastro <amastro@fb.com>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  drivers/vfio/pci/nvgrace-gpu/main.c | 56 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 56 insertions(+)
> 
> diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
> index e346392b72f6..7d7ab2c84018 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/main.c
> +++ b/drivers/vfio/pci/nvgrace-gpu/main.c
> @@ -7,6 +7,7 @@
>  #include <linux/vfio_pci_core.h>
>  #include <linux/delay.h>
>  #include <linux/jiffies.h>
> +#include <linux/pci-p2pdma.h>
>  
>  /*
>   * The device memory usable to the workloads running in the VM is cached
> @@ -683,6 +684,54 @@ nvgrace_gpu_write(struct vfio_device *core_vdev,
>  	return vfio_pci_core_write(core_vdev, buf, count, ppos);
>  }
>  
> +static int nvgrace_get_dmabuf_phys(struct vfio_pci_core_device *core_vdev,
> +				   struct p2pdma_provider **provider,
> +				   unsigned int region_index,
> +				   struct dma_buf_phys_vec *phys_vec,
> +				   struct vfio_region_dma_range *dma_ranges,
> +				   size_t nr_ranges)
> +{
> +	struct nvgrace_gpu_pci_core_device *nvdev = container_of(
> +		core_vdev, struct nvgrace_gpu_pci_core_device, core_device);
> +	struct pci_dev *pdev = core_vdev->pdev;
> +
> +	if (nvdev->resmem.memlength && region_index == RESMEM_REGION_INDEX) {
> +		/*
> +		 * The P2P properties of the non-BAR memory are the same as the
> +		 * BAR memory, so just use the provider for index 0. Someday
> +		 * when CXL gets P2P support we could create CXLish providers
> +		 * for the non-BAR memory.
> +		 */
> +		*provider = pcim_p2pdma_provider(pdev, 0);
> +		if (!*provider)
> +			return -EINVAL;
> +		return vfio_pci_core_fill_phys_vec(phys_vec, dma_ranges,
> +						   nr_ranges,
> +						   nvdev->resmem.memphys,
> +						   nvdev->resmem.memlength);
> +	} else if (region_index == USEMEM_REGION_INDEX) {
> +		/*
> +		 * This is actually cacheable memory and isn't treated as P2P in
> +		 * the chip. For now we have no way to push cacheable memory
> +		 * through everything and the Grace HW doesn't care what caching
> +		 * attribute is programmed into the SMMU. So use BAR 0.
> +		 */
> +		*provider = pcim_p2pdma_provider(pdev, 0);
> +		if (!*provider)
> +			return -EINVAL;
> +		return vfio_pci_core_fill_phys_vec(phys_vec, dma_ranges,
> +						   nr_ranges,
> +						   nvdev->usemem.memphys,
> +						   nvdev->usemem.memlength);
> +	}
> +	return vfio_pci_core_get_dmabuf_phys(core_vdev, provider, region_index,
> +					     phys_vec, dma_ranges, nr_ranges);
> +}


Unless my eyes deceive me, we could reduce the redundancy a bit:

	struct mem_region *mem_region = NULL;

	if (nvdev->resmem.memlength && region_index == RESMEM_REGION_INDEX) {
		/*
		 * The P2P properties of the non-BAR memory are the same as the
		 * BAR memory, so just use the provider for index 0. Someday
		 * when CXL gets P2P support we could create CXLish providers
		 * for the non-BAR memory.
		 */
		mem_region = &nvdev->resmem;
	} else if (region_index == USEMEM_REGION_INDEX) {
		/*
		 * This is actually cacheable memory and isn't treated as P2P in
		 * the chip. For now we have no way to push cacheable memory
		 * through everything and the Grace HW doesn't care what caching
		 * attribute is programmed into the SMMU. So use BAR 0.
		 */
		mem_region = &nvdev->usemem;
	}

	if (mem_region) {
		*provider = pcim_p2pdma_provider(pdev, 0);
		if (!*provider)
			return -EINVAL;
		return vfio_pci_core_fill_phys_vec(phys_vec, dma_ranges,
						   nr_ranges,
						   mem_region->memphys,
						   mem_region->memlength);
	}

	return vfio_pci_core_get_dmabuf_phys(core_vdev, provider, region_index,
					     phys_vec, dma_ranges, nr_ranges);
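
A side benefit is that the "use the provider for index 0" decision is
then made in exactly one place, so a future CXLish provider for the
non-BAR memory would have a single spot to plug into.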
		
Thanks,
Alex

Thread overview: 21+ messages
2025-11-06 14:16 [PATCH v7 00/11] vfio/pci: Allow MMIO regions to be exported through dma-buf Leon Romanovsky
2025-11-06 14:16 ` [PATCH v7 01/11] PCI/P2PDMA: Separate the mmap() support from the core logic Leon Romanovsky
2025-11-06 14:16 ` [PATCH v7 02/11] PCI/P2PDMA: Simplify bus address mapping API Leon Romanovsky
2025-11-06 14:16 ` [PATCH v7 03/11] PCI/P2PDMA: Refactor to separate core P2P functionality from memory allocation Leon Romanovsky
2025-11-06 14:16 ` [PATCH v7 04/11] PCI/P2PDMA: Provide an access to pci_p2pdma_map_type() function Leon Romanovsky
2025-11-06 14:16 ` [PATCH v7 05/11] PCI/P2PDMA: Document DMABUF model Leon Romanovsky
2025-11-07  6:15   ` Randy Dunlap
2025-11-07 16:01     ` Leon Romanovsky
2025-11-07 18:58       ` Randy Dunlap
2025-11-07 20:27         ` Leon Romanovsky
2025-11-06 14:16 ` [PATCH v7 06/11] dma-buf: provide phys_vec to scatter-gather mapping routine Leon Romanovsky
2025-11-06 14:16 ` [PATCH v7 07/11] vfio: Export vfio device get and put registration helpers Leon Romanovsky
2025-11-06 14:16 ` [PATCH v7 08/11] vfio/pci: Share the core device pointer while invoking feature functions Leon Romanovsky
2025-11-06 14:16 ` [PATCH v7 09/11] vfio/pci: Enable peer-to-peer DMA transactions by default Leon Romanovsky
2025-11-06 14:16 ` [PATCH v7 10/11] vfio/pci: Add dma-buf export support for MMIO regions Leon Romanovsky
2025-11-10 20:05   ` Alex Williamson
2025-11-06 14:16 ` [PATCH v7 11/11] vfio/nvgrace: Support get_dmabuf_phys Leon Romanovsky
2025-11-10 20:05   ` Alex Williamson [this message]
2025-11-10 20:28     ` Leon Romanovsky
2025-11-10 20:42 ` [PATCH v7 00/11] vfio/pci: Allow MMIO regions to be exported through dma-buf Alex Williamson
2025-11-11  8:54   ` Christian König
