The Linux Kernel Mailing List
From: Pranjal Shrivastava <praan@google.com>
To: Samiullah Khawaja <skhawaja@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
	Logan Gunthorpe <logang@deltatee.com>,
	Alex Williamson <alex@shazbot.org>,
	Kevin Tian <kevin.tian@intel.com>,
	Ankit Agrawal <ankita@nvidia.com>, Matt Evans <mattev@meta.com>,
	Vivek Kasireddy <vivek.kasireddy@intel.com>,
	Leon Romanovsky <leon@kernel.org>,
	Shivaji Kant <shivajikant@google.com>
Subject: Re: [RFC PATCH 0/5] vfio/pci: Support ZONE_DEVICE-backed P2P Registration
Date: Tue, 16 Jun 2026 06:38:15 +0000
Message-ID: <ajDvV4mOkaL9NMII@google.com>
In-Reply-To: <ajCaTmrYQvKDoP_I@google.com>

On Tue, Jun 16, 2026 at 12:42:19AM +0000, Samiullah Khawaja wrote:
> On Fri, Jun 12, 2026 at 02:50:18PM +0000, Pranjal Shrivastava wrote:
> > On Thu, Jun 11, 2026 at 07:14:47PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Jun 11, 2026 at 02:40:17PM +0000, Pranjal Shrivastava wrote:
> > > > On Wed, Jun 10, 2026 at 01:28:48PM -0300, Jason Gunthorpe wrote:
> > > > > On Wed, Jun 10, 2026 at 03:18:48PM +0000, Pranjal Shrivastava wrote:
> > > > >
> 
> [snip]
> > 
> > Yea, that's going to be tricky.. I'm thinking if we can have a zap model
> > there somehow? If the device is gone / going through a reset, we can
> > handle the refcounts accordingly?
> 
> IIUC zapping will only work if userspace is using these, but if you feed
> this memory into another device through NFS and the pages are pinned by
> gup (or that device) then the dmabuf move_notify/revoke logic on device
> reset will be tricky as now the pages for that device BAR are pinned.

Yes, it would be tricky. However, the zap is still needed, since userspace
is the entity issuing the file I/O that leads to those pins. The user would
mmap the BAR and pass the buffer into a POSIX read() / write(), where the
filesystem (like NFS) would extract the iovs and call GUP to pin them.

By zapping the userspace mappings first, we prevent any new
read()/write() calls and halt the creation of additional GUP pins.
(Note that if GUP doesn't see a PTE for the page, it invokes the page
fault handler and waits for the fault to be serviced; the fault would
then block on the vdev->memory_lock held by the reset thread.)

I agree it will be tricky, but we just need a multi-stage sequence. The
standard workflow is: userspace mmaps the BAR and passes the buffer to the
filesystem via the POSIX file API. The filesystem then pins the pages via
GUP for the duration of the synchronous DMA.

My plan for RFC v2 is as follows:

a) Zap the userspace mappings first to prevent new requests.
b) Wait for in-flight DMA: just as we currently use dma_resv_wait_timeout
   to wait for HW fences, we'll first wait for the page refcounts to drop.

   An important thing to note is that filesystems can't pin these pages
   long-term, i.e. FOLL_PCI_P2PDMA and FOLL_LONGTERM can't be requested
   together for a single pin, as mandated by gup [1]. Thus, filesystems
   using ZONE_DEVICE memory (via ITER_ALLOW_P2PDMA) simply hold the pins
   for the DMA duration.

c) Only once the refcounts hit zero do we proceed with move_notify and
   the hardware reset.
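
For illustration only, the ordering in (a)-(c) can be modelled in plain
userspace C. This is a toy sketch, not VFIO code: struct bar_model, the
model_*() helpers, the tick callback and the bounded wait with a -1 return
on timeout are all hypothetical names and assumptions made for the example.
What the real path should do when the pins fail to drain is still open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a ZONE_DEVICE-backed BAR; not a real VFIO struct. */
struct bar_model {
	bool mapped;      /* userspace PTEs are present */
	int  pins;        /* outstanding short-term GUP pins */
	bool revoked;     /* importers were told to move (move_notify) */
	bool reset_done;  /* hardware reset was performed */
};

/* Stand-in for one unit of elapsed time while waiting for DMA. */
typedef void (*tick_fn)(struct bar_model *bar);

/* Models a filesystem pinning the BAR for one synchronous DMA. */
static bool model_pin(struct bar_model *bar)
{
	if (!bar->mapped)
		return false; /* in the kernel, GUP would fault and block instead */
	bar->pins++;
	return true;
}

static void model_unpin(struct bar_model *bar)
{
	assert(bar->pins > 0);
	bar->pins--;
}

/* Returns 0 on success, -1 if the pins did not drain in max_ticks (assumed). */
static int model_reset(struct bar_model *bar, int max_ticks, tick_fn tick)
{
	/* (a) Zap the mappings so that no new pins can be created. */
	bar->mapped = false;

	/* (b) Wait, bounded, for the in-flight DMA to drop its pins. */
	for (int i = 0; bar->pins > 0; i++) {
		if (i == max_ticks)
			return -1; /* neither revoked nor reset */
		tick(bar);
	}

	/* (c) Only now revoke the importers and reset the hardware. */
	bar->revoked = true;
	bar->reset_done = true;
	return 0;
}

/* Two tick behaviours: a DMA completing per tick, and a stalled DMA. */
static void complete_one_dma(struct bar_model *bar) { model_unpin(bar); }
static void stalled_dma(struct bar_model *bar) { (void)bar; }
```

With two pins outstanding and complete_one_dma as the tick, model_reset()
drains both pins before it sets revoked and reset_done. With stalled_dma it
gives up after max_ticks and leaves the device neither revoked nor reset.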

This is going to be the first stab; I expect it will evolve further.
I'll try to implement this in RFC v2 and attempt to address these
concerns.

Thanks,
Praan

[1] https://elixir.bootlin.com/linux/v7.1-rc6/source/mm/gup.c#L2538


Thread overview: 14+ messages
2026-06-10 15:18 [RFC PATCH 0/5] vfio/pci: Support ZONE_DEVICE-backed P2P Registration Pranjal Shrivastava
2026-06-10 15:18 ` [RFC PATCH 1/5] vfio: Add UAPI for ZONE_DEVICE-backed P2P registration Pranjal Shrivastava
2026-06-10 15:18 ` [RFC PATCH 2/5] vfio/pci: Implement " Pranjal Shrivastava
2026-06-10 15:18 ` [RFC PATCH 3/5] vfio/pci: Block mmap & dmabuf export for ZONE_DEVICE-registered BARs Pranjal Shrivastava
2026-06-10 15:18 ` [RFC PATCH 4/5] vfio/pci: Block ZONE_DEVICE registration for BARs with active DMABUFs Pranjal Shrivastava
2026-06-10 15:18 ` [RFC PATCH 5/5] PCI/P2PDMA: Introduce a helper to release P2P resources Pranjal Shrivastava
2026-06-10 16:28 ` [RFC PATCH 0/5] vfio/pci: Support ZONE_DEVICE-backed P2P Registration Jason Gunthorpe
2026-06-10 18:32   ` Leon Romanovsky
2026-06-11 14:40   ` Pranjal Shrivastava
2026-06-11 14:43     ` Pranjal Shrivastava
2026-06-11 22:14     ` Jason Gunthorpe
2026-06-12 14:50       ` Pranjal Shrivastava
2026-06-16  0:42         ` Samiullah Khawaja
2026-06-16  6:38           ` Pranjal Shrivastava [this message]
