Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Pranjal Shrivastava" <praan@google.com>
Cc: kvm@vger.kernel.org, linux-pci@vger.kernel.org
Subject: Re: [RFC PATCH 5/5] PCI/P2PDMA: Introduce a helper to release P2P resources
Date: Wed, 10 Jun 2026 15:54:49 +0000	[thread overview]
Message-ID: <20260610155450.32E831F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260610151853.3608948-6-praan@google.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

New issues:
- [Critical] `pci_p2pdma_remove_resource()` destroys the P2P memory pool without RCU synchronization, causing Use-After-Free and NULL pointer dereferences.
- [High] `vfio_pci_core_feature_p2p_register()` modifies P2P resources without acquiring any serialization lock, causing data races with concurrent ioctls and `vfio_pci_zap_bars()`.
- [High] `devm_memunmap_pages()` blocks indefinitely in `vfio_pci_zap_bars()` if P2P memory pages are persistently pinned by userspace, causing a local Denial of Service.

Pre-existing issues:
- [High] Error path in `pci_p2pdma_add_resource()` leaves a dangling devres action, causing a Use-After-Free during driver unbind.
--

commit f498fe90fe30c1dceda58fbbdcd2de8314934e62
Author: Pranjal Shrivastava <praan@google.com>

PCI/P2PDMA: Introduce a helper to release P2P resources
    
Introduce pci_p2pdma_remove_resource() to allow manual teardown of a
device's P2P DMA pool. The new API enables exclusive owners of a device,
such as vfio-pci, to cleanly release P2P resources during session closure
or hardware reset.

> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index adb17a4f69396..2a48ffefa01c4 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
[ ... ]
> @@ -466,6 +469,41 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
>  }
>  EXPORT_SYMBOL_GPL(pci_p2pdma_add_resource);

[Severity: High]
This is a pre-existing issue, but does the error path in
pci_p2pdma_add_resource() leave a dangling devres action?

If gen_pool_add_owner() fails, the code jumps to pages_free:

pages_free:
	devm_memunmap_pages(&pdev->dev, pgmap);
pgmap_free:
	devm_kfree(&pdev->dev, p2p_pgmap);
	return error;

The devm_add_action_or_reset() called earlier registers
pci_p2pdma_unmap_mappings() with the p2p_pgmap pointer, but it doesn't
appear to be deregistered here. When the device driver later unbinds,
will the devres core execute pci_p2pdma_unmap_mappings() with the already
freed p2p_pgmap pointer, leading to a Use-After-Free of p2p_pgmap->mem->owner?

> +/**
> + * pci_p2pdma_remove_resource - remove all p2p memory for a device
> + * @pdev: the device to remove the memory from
> + *
> + * Tear down the entire p2p DMA pool for the device. Zap any existinga
> + * userspace mappings of the p2pmem/allocate file
> + */
> +void pci_p2pdma_remove_resource(struct pci_dev *pdev)
> +{
[ ... ]
> +	gen_pool_destroy(p2pdma->pool);
> +	p2pdma->pool = NULL;

[Severity: Critical]
Does destroying the pool and setting it to NULL here without RCU
synchronization cause Use-After-Free and NULL pointer dereferences?

Because pdev->p2pdma is still active and accessible to RCU readers, it seems
this can race with several concurrent operations:

Concurrent sysfs readers like size_show() might read p2pdma->pool and
trigger a Use-After-Free when executing gen_pool_size().

Concurrent callers of pci_alloc_p2pmem() read p2pdma under RCU, see it is
not NULL, and could unconditionally pass the now-NULL p2pdma->pool to
gen_pool_alloc_owner(), causing a NULL pointer dereference.

Additionally, the driver detach handler pci_p2pdma_release() checks
if (p2pdma->pool) synchronize_rcu();. Since the pool is now NULL, it skips
RCU synchronization and immediately destroys the xarray, leading to a UAF
of the p2pdma structure itself by any remaining RCU readers.

> +
> +	sysfs_remove_group(&pdev->dev.kobj, &p2pmem_group);
> +
> +	pci_info(pdev, "removed all peer-to-peer DMA memory\n");
> +}
> +EXPORT_SYMBOL_GPL(pci_p2pdma_remove_resource);
[ ... ]
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 7913b8916df91..7b58cb344408d 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1704,6 +1704,11 @@ static void vfio_pci_zap_bars(struct vfio_pci_core_device *vdev)
>  	loff_t len = end - start;
>  
>  	unmap_mapping_range(core_vdev->inode->i_mapping, start, len, true);
> +
> +	if (vdev->p2p_registered_bars) {
> +		pci_p2pdma_remove_resource(vdev->pdev);

[Severity: High]
Is it possible for devm_memunmap_pages() to block indefinitely here while
holding a critical device rwsem?

Inside pci_p2pdma_remove_resource(), devm_memunmap_pages() synchronously
waits via wait_for_completion(&pgmap->done) for all device page references
to be released. If userspace has persistently pinned these P2P pages (e.g.,
by registering them with RDMA or io_uring), the reference count might remain
elevated, causing the kernel thread to hang in uninterruptible sleep.

Because vfio_pci_zap_bars() is called with down_write(&vdev->memory_lock)
held, this hang could prevent any future operations on the device, including
administrative driver unbinding.

[Severity: High]
Can pci_p2pdma_remove_resource() run concurrently with the ioctl handler
vfio_pci_core_feature_p2p_register()?

It looks like vfio_pci_core_feature_p2p_register() updates
vdev->p2p_registered_bars and calls pci_p2pdma_add_resource() without taking
vdev->memory_lock or any other serialization lock.

Since vfio_pci_zap_bars() directly calls pci_p2pdma_remove_resource(),
concurrent execution could cause a TOCTOU race and corrupt the p2pdma pool
state and pagemaps, as the PCI P2P subsystem assumes exclusive access from
the caller for state modifications.

> +		vdev->p2p_registered_bars = 0;
> +	}
>  }
>  
>  void vfio_pci_zap_and_down_write_memory_lock(struct vfio_pci_core_device *vdev)

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260610151853.3608948-1-praan@google.com?part=5

  reply	other threads:[~2026-06-10 15:54 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-10 15:18 [RFC PATCH 0/5] vfio/pci: Support ZONE_DEVICE-backed P2P Registration Pranjal Shrivastava
2026-06-10 15:18 ` [RFC PATCH 1/5] vfio: Add UAPI for ZONE_DEVICE-backed P2P registration Pranjal Shrivastava
2026-06-10 15:31   ` sashiko-bot
2026-06-10 15:18 ` [RFC PATCH 2/5] vfio/pci: Implement " Pranjal Shrivastava
2026-06-10 15:35   ` sashiko-bot
2026-06-10 15:18 ` [RFC PATCH 3/5] vfio/pci: Block mmap & dmabuf export for ZONE_DEVICE-registered BARs Pranjal Shrivastava
2026-06-10 15:40   ` sashiko-bot
2026-06-10 15:18 ` [RFC PATCH 4/5] vfio/pci: Block ZONE_DEVICE registration for BARs with active DMABUFs Pranjal Shrivastava
2026-06-10 15:44   ` sashiko-bot
2026-06-10 15:18 ` [RFC PATCH 5/5] PCI/P2PDMA: Introduce a helper to release P2P resources Pranjal Shrivastava
2026-06-10 15:54   ` sashiko-bot [this message]
2026-06-10 16:28 ` [RFC PATCH 0/5] vfio/pci: Support ZONE_DEVICE-backed P2P Registration Jason Gunthorpe
2026-06-10 18:32   ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260610155450.32E831F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=praan@google.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox