Linux PCI subsystem development
From: Matthew Brost <matthew.brost@intel.com>
To: "Kasireddy, Vivek" <vivek.kasireddy@intel.com>
Cc: "Jason Gunthorpe" <jgg@nvidia.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Simona Vetter" <simona.vetter@ffwll.ch>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Logan Gunthorpe" <logang@deltatee.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Subject: Re: [PATCH v4 1/5] PCI/P2PDMA: Don't enforce ACS check for device functions of Intel GPUs
Date: Mon, 22 Sep 2025 23:25:47 -0700	[thread overview]
Message-ID: <aNI9a6o0RtQmDYPp@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <IA0PR11MB718580B723FA2BEDCFAB71E9F81DA@IA0PR11MB7185.namprd11.prod.outlook.com>

On Mon, Sep 22, 2025 at 11:53:06PM -0600, Kasireddy, Vivek wrote:
> Hi Jason,
> 
> > Subject: Re: [PATCH v4 1/5] PCI/P2PDMA: Don't enforce ACS check for device
> > functions of Intel GPUs
> > 
> > On Mon, Sep 22, 2025 at 01:22:49PM +0200, Christian König wrote:
> > 
> > > Well what exactly is happening here? You have a PF assigned to the
> > > host and a VF passed through to a guest, correct?
> > >
> > > And now the PF (from the host side) wants to access a BAR of the VF?
> > 
> > Not quite.
> > 
> > It is a GPU so it has a pool of VRAM. The PF can access all VRAM and
> > the VF can access some VRAM.
> > 
> > They want to get a DMABUF handle for a bit of the VF's reachable VRAM
> > that the PF can import and use through its own function.
> > 
> > The use of the VF's BAR in this series is an ugly hack.
> IIUC, it is a common practice among GPU drivers, including Xe and amdgpu,
> to never expose VRAM addresses and instead use BAR addresses as DMA
> addresses when exporting dmabufs to other devices. Here is the relevant code
> snippet in Xe:
>                 phys_addr_t phys = cursor.start + xe_vram_region_io_start(tile->mem.vram);
>                 size_t size = min_t(u64, cursor.size, SZ_2G);
>                 dma_addr_t addr;
>
>                 addr = dma_map_resource(dev, phys, size, dir,
>                                         DMA_ATTR_SKIP_CPU_SYNC);
> 
> And, here is the one in amdgpu:
>         for_each_sgtable_sg((*sgt), sg, i) {
>                 phys_addr_t phys = cursor.start + adev->gmc.aper_base;
>                 unsigned long size = min(cursor.size, AMDGPU_MAX_SG_SEGMENT_SIZE);
>                 dma_addr_t addr;
> 
>                 addr = dma_map_resource(dev, phys, size, dir,
>                                         DMA_ATTR_SKIP_CPU_SYNC);
> 

I've read through this thread—Jason, correct me if I'm wrong—but I
believe what you're suggesting is that instead of using PCIe P2P
(dma_map_resource) to communicate the VF's VRAM offset to the PF, we
should teach dma-buf to natively understand a VF's VRAM offset. I don't
think this is currently built into dma-buf, but it probably should be,
as it could benefit other use cases as well (e.g., UALink, NVLink,
etc.).

In both examples above, the PCIe P2P fabric is used for communication,
whereas in the VF→PF case, it's only using the PCIe P2P address to
extract the VF's VRAM offset, rather than serving as a communication
path. I believe that's Jason's objection. Again, Jason, correct me if
I'm misunderstanding here.

Assuming I'm understanding Jason's comments correctly, I tend to agree
with him.

> And, AFAICS, most of these drivers don't use the BAR addresses directly
> when they import a dmabuf that they exported earlier, and instead do this:
> 
>         if (dma_buf->ops == &xe_dmabuf_ops) {
>                 obj = dma_buf->priv;
>                 if (obj->dev == dev &&
>                     !XE_TEST_ONLY(test && test->force_different_devices)) {
>                         /*
>                          * Importing dmabuf exported from our own gem increases
>                          * refcount on gem itself instead of f_count of dmabuf.
>                          */
>                         drm_gem_object_get(obj);
>                         return obj;
>                 }
>         }

This code won't be triggered on the VF→PF path, as obj->dev == dev will
fail.

> 
> >The PF never actually uses the VF BAR
> That's because the PF can't use it directly, most likely due to hardware limitations.
> 
> >it just hackily converts the dma_addr_t back
> > to CPU physical and figures out where it is in the VRAM pool and then
> > uses a PF centric address for it.
> > 
> > All they want is either the actual VRAM address or the CPU physical.
> The problem here is that the CPU physical (aka BAR address) is only
> usable by the CPU. Since the GPU PF only understands VRAM addresses,
> the current exporter (vfio-pci) or any VF/VFIO variant driver cannot provide
> the VRAM addresses that the GPU PF can use directly because they do not
> have access to the provisioning data.
>

Right, we need to provide the offset within the VRAM provisioning, which
the PF can resolve to a physical address based on the provisioning data.
The series already does this—the problem is how the VF provides
this offset. It shouldn't be a P2P address, but rather a native
dma-buf-provided offset that everyone involved in the attachment
understands.
 
> However, it is possible that if vfio-pci or a VF/VFIO variant driver had access
> to the VF's provisioning data, then it might be able to create a dmabuf with
> VRAM addresses that the PF can use directly. But I am not sure if exposing
> provisioning data to VFIO drivers is ok from a security standpoint or not.
> 

I'd prefer to leave the provisioning data to the PF if possible. I
haven't fully wrapped my head around the flow yet, but it should be
feasible for the VF → VFIO → PF path to pass along the initial VF
scatter-gather (SG) list in the dma-buf, which includes VF-specific
PFNs. The PF can then use this, along with its provisioning information,
to resolve the physical address.

Matt

> Thanks,
> Vivek
> 
> > 
> > Jason

Thread overview: 46+ messages
     [not found] <20250915072428.1712837-1-vivek.kasireddy@intel.com>
2025-09-15  7:21 ` [PATCH v4 1/5] PCI/P2PDMA: Don't enforce ACS check for device functions of Intel GPUs Vivek Kasireddy
2025-09-15 15:33   ` Logan Gunthorpe
2025-09-16 17:34   ` Bjorn Helgaas
2025-09-16 17:59     ` Jason Gunthorpe
2025-09-16 17:57   ` Jason Gunthorpe
2025-09-18  6:16     ` Kasireddy, Vivek
2025-09-18 12:04       ` Jason Gunthorpe
2025-09-19  6:22         ` Kasireddy, Vivek
2025-09-19 12:29           ` Jason Gunthorpe
2025-09-22  6:59             ` Kasireddy, Vivek
2025-09-22 11:22               ` Christian König
2025-09-22 12:20                 ` Jason Gunthorpe
2025-09-22 12:25                   ` Christian König
2025-09-22 12:29                     ` Jason Gunthorpe
2025-09-22 13:20                       ` Christian König
2025-09-22 13:27                         ` Jason Gunthorpe
2025-09-22 13:57                           ` Christian König
2025-09-22 14:00                             ` Jason Gunthorpe
2025-09-23  5:53                   ` Kasireddy, Vivek
2025-09-23  6:25                     ` Matthew Brost [this message]
2025-09-23  6:44                       ` Matthew Brost
2025-09-23  7:52                         ` Christian König
2025-09-23 12:15                           ` Jason Gunthorpe
2025-09-23 12:45                             ` Christian König
2025-09-23 13:12                               ` Jason Gunthorpe
2025-09-23 13:28                                 ` Christian König
2025-09-23 13:38                                   ` Jason Gunthorpe
2025-09-23 13:48                                     ` Christian König
2025-09-23 23:02                                       ` Matthew Brost
2025-09-24  8:29                                         ` Christian König
2025-09-24  6:50                                       ` Kasireddy, Vivek
2025-09-24  7:21                                         ` Christian König
2025-09-25  3:56                                           ` Kasireddy, Vivek
2025-09-25 10:51                                             ` Thomas Hellström
2025-09-25 11:28                                               ` Christian König
2025-09-25 13:11                                                 ` Thomas Hellström
2025-09-25 13:33                                                   ` Jason Gunthorpe
2025-09-25 15:40                                                     ` Thomas Hellström
2025-09-25 15:55                                                       ` Jason Gunthorpe
2025-09-26  6:12                                                 ` Kasireddy, Vivek
2025-09-23 13:36                               ` Christoph Hellwig
2025-09-23  6:01                 ` Kasireddy, Vivek
2025-09-22 12:12               ` Jason Gunthorpe
2025-09-24 16:13           ` Simon Richter
2025-09-24 17:12             ` Jason Gunthorpe
2025-09-25  4:06             ` Kasireddy, Vivek
