From: sashiko-bot@kernel.org
To: "Ankit Agrawal" <ankita@nvidia.com>
Cc: linux-pci@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH v7 1/1] vfio/nvgrace-gpu: Add Blackwell-Next GPU readiness check via CXL DVSEC
Date: Thu, 28 May 2026 10:22:14 +0000
Message-ID: <20260528102214.D2D221F000E9@smtp.kernel.org>
In-Reply-To: <20260528093840.4018-1-ankita@nvidia.com>
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] TOCTOU race in lockless device readiness checks causes unexpected -EAGAIN or VM_FAULT_SIGBUS errors instead of waiting for the device.
- [High] The 256-second device readiness polling uses uninterruptible `msleep()` loops, which can trigger hung task panics and render user tasks unkillable.
--
commit a85819b2171b67b275d15e9b87eb7447b4dd63a4
Author: Ankit Agrawal <ankita@nvidia.com>
vfio/nvgrace-gpu: Add Blackwell-Next GPU readiness check via CXL DVSEC
Adds a CXL DVSEC-based readiness check for Blackwell-Next GPUs in addition
to the existing legacy BAR0 polling. The device readiness is determined by
polling the Memory_Active bit outside the memory lock for up to 256 seconds.
> diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
> index fa056b69f899a..04fcc0d088f54 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/main.c
> +++ b/drivers/vfio/pci/nvgrace-gpu/main.c
[ ... ]
> +static int nvgrace_gpu_wait_device_ready_cxl(struct nvgrace_gpu_pci_core_device *nvdev)
> +{
> + unsigned long deadline = jiffies + msecs_to_jiffies(POLL_QUANTUM_MS);
> + bool active_phase = false;
> + u32 status;
> + int ret;
> +
> + for (;;) {
[ ... ]
> + if (time_after(jiffies, deadline))
> + return -ETIME;
> +
> + msleep(POLL_QUANTUM_MS);
> + }
> +}
[Severity: High]
Will this uninterruptible sleep trigger hung task panics?
Since CXL device readiness can take up to 256 seconds and msleep() places
the task in the TASK_UNINTERRUPTIBLE state, processes accessing the device
during a reset will become unresponsive and immune to signals (including
SIGKILL) for up to 4 minutes 16 seconds.
With the default hung_task_timeout_secs of 120 seconds, blocking here for
that long could trip the hung task detector and, on systems with
hung_task_panic enabled, cause a kernel panic. Would it be better to use an
interruptible sleep and abort on pending fatal signals?
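The suggestion above can be sketched as a userspace model rather than as kernel code. In the driver, the sleep would presumably be something like msleep_interruptible() or schedule_timeout_killable(), and the signal check fatal_signal_pending(current). Here those are replaced by hypothetical stand-ins (`struct poll_ctx`, `model_sleep_killable()`, `model_fatal_signal_pending()`, `wait_device_ready_killable()`) so the loop can be compiled and exercised on its own. None of these names come from the patch, and the -EINTR return value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for device and task state; not part of the patch. */
struct poll_ctx {
	unsigned int now_ms;      /* simulated clock */
	unsigned int ready_at_ms; /* time at which the device reports ready */
	unsigned int kill_at_ms;  /* time a fatal signal arrives; 0 = never */
};

/* Models reading the readiness bit (Memory_Active in the patch). */
static bool model_device_ready(const struct poll_ctx *c)
{
	return c->now_ms >= c->ready_at_ms;
}

/* Models fatal_signal_pending(current). */
static bool model_fatal_signal_pending(const struct poll_ctx *c)
{
	return c->kill_at_ms && c->now_ms >= c->kill_at_ms;
}

/* Models a killable sleep: advances the clock, waking early on a fatal signal. */
static void model_sleep_killable(struct poll_ctx *c, unsigned int ms)
{
	unsigned int wake = c->now_ms + ms;

	if (c->kill_at_ms > c->now_ms && c->kill_at_ms < wake)
		wake = c->kill_at_ms;
	c->now_ms = wake;
}

/*
 * Poll until the device is ready, the timeout expires, or a fatal
 * signal is pending. Returns 0, -ETIME, or -EINTR respectively.
 */
static int wait_device_ready_killable(struct poll_ctx *c,
				      unsigned int quantum_ms,
				      unsigned int timeout_ms)
{
	unsigned int deadline;

	assert(quantum_ms > 0);
	deadline = c->now_ms + timeout_ms;

	for (;;) {
		if (model_device_ready(c))
			return 0;
		if (model_fatal_signal_pending(c))
			return -EINTR;	/* let the task die instead of waiting */
		if (c->now_ms >= deadline)
			return -ETIME;
		model_sleep_killable(c, quantum_ms);
	}
}
```

The point of the model is the ordering inside the loop: the fatal-signal check runs on every wakeup, so a killed task leaves within one poll quantum instead of sitting out the full timeout.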
[ ... ]
> @@ -313,6 +399,21 @@ static vm_fault_t nvgrace_gpu_vfio_pci_huge_fault(struct vm_fault *vmf,
> pfn = PHYS_PFN(memregion->memphys) + addr_to_pgoff(vma, addr);
>
> if (is_aligned_for_order(vma, addr, pfn, order)) {
> + /*
> + * Exit early under memory_lock to avoid a potentially lengthy
> + * device readiness wait on a runtime-suspended device. Any
> + * race after the lock is dropped is benign as the re-check
> + * inside the scoped guard below catches it.
> + */
> + scoped_guard(rwsem_read, &vdev->memory_lock) {
> + if (vdev->pm_runtime_engaged)
> + return VM_FAULT_SIGBUS;
> + }
> +
> + if (nvdev->cxl_dvsec && READ_ONCE(nvdev->reset_done) &&
> + nvgrace_gpu_wait_device_ready_cxl(nvdev))
> + return VM_FAULT_SIGBUS;
[Severity: High]
Can this lockless check race with a concurrent device reset and bypass the
intended polling mechanism?
If a concurrent device reset finishes and sets reset_done = true right after
this lockless check evaluates to false, the thread proceeds without polling
and blocks waiting for memory_lock.
Once the lock is acquired, the thread calls nvgrace_gpu_check_device_ready()
which observes reset_done == true and invokes the fallback
nvgrace_gpu_test_device_ready_cxl() under the lock. Because this CXL test is
non-blocking, it immediately returns -EAGAIN, causing
nvgrace_gpu_vfio_pci_huge_fault() to return VM_FAULT_SIGBUS, which delivers
SIGBUS to the user application instead of waiting for the device.
> +
> scoped_guard(rwsem_read, &vdev->memory_lock) {
> if (vdev->pm_runtime_engaged ||
> nvgrace_gpu_check_device_ready(nvdev))
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260528093840.4018-1-ankita@nvidia.com?part=1
Thread overview: 5+ messages
2026-05-28 9:38 [PATCH v7 1/1] vfio/nvgrace-gpu: Add Blackwell-Next GPU readiness check via CXL DVSEC Ankit Agrawal
2026-05-28 10:22 ` sashiko-bot [this message]
2026-05-28 17:56 ` Alex Williamson
2026-05-29 11:27 ` Ankit Agrawal
2026-06-08 15:47 ` Bjorn Helgaas