From: Alex Williamson <alex@shazbot.org>
To: <ankita@nvidia.com>
Cc: <jgg@ziepe.ca>, <yishaih@nvidia.com>, <skolothumtho@nvidia.com>,
<kevin.tian@intel.com>, <aniketa@nvidia.com>, <vsethi@nvidia.com>,
<mochs@nvidia.com>, <Yunxiang.Li@amd.com>, <yi.l.liu@intel.com>,
<zhangdongdong@eswincomputing.com>, <avihaih@nvidia.com>,
<bhelgaas@google.com>, <peterx@redhat.com>, <pstanner@redhat.com>,
<apopple@nvidia.com>, <kvm@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <cjia@nvidia.com>,
<kwankhede@nvidia.com>, <targupta@nvidia.com>, <zhiw@nvidia.com>,
<danw@nvidia.com>, <dnigam@nvidia.com>, <kjaju@nvidia.com>
Subject: Re: [PATCH v6 5/6] vfio/nvgrace-gpu: Inform devmem unmapped after reset
Date: Tue, 25 Nov 2025 13:52:47 -0700
Message-ID: <20251125135247.62878956.alex@shazbot.org>
In-Reply-To: <20251125173013.39511-6-ankita@nvidia.com>
On Tue, 25 Nov 2025 17:30:12 +0000
<ankita@nvidia.com> wrote:
> From: Ankit Agrawal <ankita@nvidia.com>
>
> Introduce a new flag reset_done to notify that the GPU has just
> been reset and the mapping to the GPU memory is zapped.
>
> Implement the reset_done handler to set this new variable. It
> will be used later in the patches to wait for the GPU memory
> to be ready before doing any mapping or access.
>
> cc: Jason Gunthorpe <jgg@ziepe.ca>
> Suggested-by: Alex Williamson <alex@shazbot.org>
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
> ---
> drivers/vfio/pci/nvgrace-gpu/main.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
> index 2b736cb82f38..7d5544280ed2 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/main.c
> +++ b/drivers/vfio/pci/nvgrace-gpu/main.c
> @@ -58,6 +58,8 @@ struct nvgrace_gpu_pci_core_device {
>  	/* Lock to control device memory kernel mapping */
>  	struct mutex remap_lock;
>  	bool has_mig_hw_bug;
> +	/* GPU has just been reset */
> +	bool reset_done;
>  };
>
> static void nvgrace_gpu_init_fake_bar_emu_regs(struct vfio_device *core_vdev)
> @@ -1047,12 +1049,27 @@ static const struct pci_device_id nvgrace_gpu_vfio_pci_table[] = {
>
> MODULE_DEVICE_TABLE(pci, nvgrace_gpu_vfio_pci_table);
>
/*
* Comment explaining why this can't use lockdep_assert_held_write but
* in vfio use cases relies on this for serialization against faults and
* read/write.
*/
Thanks,
Alex
> +static void nvgrace_gpu_vfio_pci_reset_done(struct pci_dev *pdev)
> +{
> +	struct vfio_pci_core_device *core_device = dev_get_drvdata(&pdev->dev);
> +	struct nvgrace_gpu_pci_core_device *nvdev =
> +		container_of(core_device, struct nvgrace_gpu_pci_core_device,
> +			     core_device);
> +
> +	nvdev->reset_done = true;
> +}
> +
> +static const struct pci_error_handlers nvgrace_gpu_vfio_pci_err_handlers = {
> +	.reset_done = nvgrace_gpu_vfio_pci_reset_done,
> +	.error_detected = vfio_pci_core_aer_err_detected,
> +};
> +
>  static struct pci_driver nvgrace_gpu_vfio_pci_driver = {
>  	.name = KBUILD_MODNAME,
>  	.id_table = nvgrace_gpu_vfio_pci_table,
>  	.probe = nvgrace_gpu_probe,
>  	.remove = nvgrace_gpu_remove,
> -	.err_handler = &vfio_pci_core_err_handlers,
> +	.err_handler = &nvgrace_gpu_vfio_pci_err_handlers,
>  	.driver_managed_dma = true,
>  };
>
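
For readers following the thread, here is a minimal userspace sketch of the
pattern the quoted patch describes: a reset_done callback that only records
the event, and an access path that consumes the flag once the memory is
ready. It is not the driver code. The names gpu_device, core_device,
err_handlers, wait_for_memory_ready() and gpu_access() are hypothetical
stand-ins, the container_of() macro is a simplified userspace version, and
the sketch ignores locking entirely, which is the serialization question the
review comment above is about.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct vfio_pci_core_device. */
struct core_device {
	int id;
};

/* Hypothetical stand-in for struct nvgrace_gpu_pci_core_device. */
struct gpu_device {
	struct core_device core_device;	/* embedded, as in the patch */
	bool reset_done;		/* GPU has just been reset */
	int ready_waits;		/* counts waits, for illustration only */
};

/* Models struct pci_error_handlers: a table of optional callbacks. */
struct err_handlers {
	void (*reset_done)(struct core_device *core);
};

/* Models the patch's reset_done handler: it only records the event. */
static void gpu_reset_done(struct core_device *core)
{
	struct gpu_device *gpu =
		container_of(core, struct gpu_device, core_device);

	gpu->reset_done = true;
}

static const struct err_handlers gpu_err_handlers = {
	.reset_done = gpu_reset_done,
};

/* Assumption: a real driver would poll hardware here; this just counts. */
static int wait_for_memory_ready(struct gpu_device *gpu)
{
	gpu->ready_waits++;
	return 0;
}

/* Hypothetical access path: wait once after a reset, then clear the flag. */
static int gpu_access(struct gpu_device *gpu)
{
	if (gpu->reset_done) {
		int ret = wait_for_memory_ready(gpu);

		if (ret)
			return ret;
		gpu->reset_done = false;
	}
	return 0;
}
```

Calling gpu_err_handlers.reset_done() on the embedded core_device and then
gpu_access() twice should trigger exactly one readiness wait in this model:
the first access waits and clears the flag, the second proceeds directly.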