From: Alex Williamson <alex@shazbot.org>
To: <ankita@nvidia.com>
Cc: <vsethi@nvidia.com>, <jgg@nvidia.com>, <mochs@nvidia.com>,
<jgg@ziepe.ca>, <skolothumtho@nvidia.com>, <cjia@nvidia.com>,
<zhiw@nvidia.com>, <kjaju@nvidia.com>, <yishaih@nvidia.com>,
<kevin.tian@intel.com>, <kvm@vger.kernel.org>,
<linux-kernel@vger.kernel.org>,
alex@shazbot.org
Subject: Re: [PATCH RFC v2 10/15] vfio/nvgrace-egm: Clear Memory before handing out to VM
Date: Wed, 4 Mar 2026 15:14:43 -0700
Message-ID: <20260304151443.26eb33f0@shazbot.org>
In-Reply-To: <20260223155514.152435-11-ankita@nvidia.com>
On Mon, 23 Feb 2026 15:55:09 +0000
<ankita@nvidia.com> wrote:
> From: Ankit Agrawal <ankita@nvidia.com>
>
> The EGM region is invisible to the host Linux kernel and it does not
> manage the region. The EGM module manages the EGM memory and thus is
> responsible to clear out the region before handing out to the VM.
>
> Clear EGM region on EGM chardev open. To avoid CPU lockup logs,
> zap the region in 1G chunks.
>
> Suggested-by: Vikram Sethi <vsethi@nvidia.com>
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
> ---
> drivers/vfio/pci/nvgrace-gpu/egm.c | 43 ++++++++++++++++++++++++++++++
> 1 file changed, 43 insertions(+)
>
> diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-gpu/egm.c
> index 5786ebe374a5..de7771a4145d 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/egm.c
> +++ b/drivers/vfio/pci/nvgrace-gpu/egm.c
> @@ -15,6 +15,7 @@ static DEFINE_XARRAY(egm_chardevs);
> struct chardev {
> struct device device;
> struct cdev cdev;
> + atomic_t open_count;
> };
>
> static struct nvgrace_egm_dev *
> @@ -30,6 +31,42 @@ static int nvgrace_egm_open(struct inode *inode, struct file *file)
> {
> struct chardev *egm_chardev =
> container_of(inode->i_cdev, struct chardev, cdev);
> + struct nvgrace_egm_dev *egm_dev =
> + egm_chardev_to_nvgrace_egm_dev(egm_chardev);
> + void *memaddr;
> +
> + if (atomic_cmpxchg(&egm_chardev->open_count, 0, 1) != 0)
> + return -EBUSY;
> +
> + /*
> + * nvgrace-egm module is responsible to manage the EGM memory as
> + * the host kernel has no knowledge of it. Clear the region before
> + * handing over to userspace.
> + */
> + memaddr = memremap(egm_dev->egmphys, egm_dev->egmlength, MEMREMAP_WB);
> + if (!memaddr) {
> + atomic_dec(&egm_chardev->open_count);
> + return -ENOMEM;
> + }
> +
> + /*
> + * Clear in chunks of 1G to avoid CPU lockup logs.
> + */
> + {
> + size_t remaining = egm_dev->egmlength;
> + u8 *chunk_addr = (u8 *)memaddr;
> + size_t chunk_size;

Declare these at the start of the function and remove this scope hack.

> +
> + while (remaining > 0) {
> + chunk_size = min(remaining, SZ_1G);

Use min_t(size_t, remaining, SZ_1G) here.

> + memset(chunk_addr, 0, chunk_size);
> + cond_resched();
> + chunk_addr += chunk_size;
> + remaining -= chunk_size;
> + }
> + }

Aren't we going to want to do this asynchronously or run multiple
threads to avoid stalling VM launch?

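To make the multi-threaded option concrete, here is a rough userspace
analogy using POSIX threads. It is only a sketch of the idea, not a
proposal for the driver: clear_parallel(), clear_worker(), struct
clear_job and MAX_WORKERS are hypothetical names, and an in-kernel
version would presumably use workqueues or kthreads instead of
pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORKERS 16	/* hypothetical upper bound on helper threads */

struct clear_job {
	unsigned char *start;
	size_t len;
};

/* Thread entry point: zero one contiguous slice of the region. */
static void *clear_worker(void *arg)
{
	struct clear_job *job = arg;

	memset(job->start, 0, job->len);
	return NULL;
}

/*
 * Split [addr, addr + len) into nworkers contiguous slices and zero them
 * concurrently. If a thread cannot be created, the caller zeroes that
 * slice itself, so the whole region is cleared whenever 0 is returned.
 * Returns -1 only for an unsupported worker count.
 */
static int clear_parallel(void *addr, size_t len, unsigned int nworkers)
{
	pthread_t tid[MAX_WORKERS];
	int running[MAX_WORKERS];
	struct clear_job job[MAX_WORKERS];
	size_t slice, off = 0;
	unsigned int i;

	if (nworkers == 0 || nworkers > MAX_WORKERS)
		return -1;

	slice = len / nworkers;

	for (i = 0; i < nworkers; i++) {
		job[i].start = (unsigned char *)addr + off;
		/* The last slice also takes the division remainder. */
		job[i].len = (i == nworkers - 1) ? len - off : slice;
		off += job[i].len;

		running[i] = !pthread_create(&tid[i], NULL, clear_worker,
					     &job[i]);
		if (!running[i])
			clear_worker(&job[i]);	/* fall back to inline clear */
	}

	/* Every byte must have been assigned to exactly one slice. */
	assert(off == len);

	for (i = 0; i < nworkers; i++)
		if (running[i])
			pthread_join(tid[i], NULL);

	return 0;
}
```

The slices are disjoint, so the workers need no locking; the only
synchronization point is the join before the region is handed out.
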
> +
> + memunmap(memaddr);
>
> file->private_data = egm_chardev;
>
> @@ -38,8 +75,13 @@ static int nvgrace_egm_open(struct inode *inode, struct file *file)
>
> static int nvgrace_egm_release(struct inode *inode, struct file *file)
> {
> + struct chardev *egm_chardev =
> + container_of(inode->i_cdev, struct chardev, cdev);
> +
> file->private_data = NULL;
>
> + atomic_dec(&egm_chardev->open_count);
> +
> return 0;
> }
>
> @@ -108,6 +150,7 @@ setup_egm_chardev(struct nvgrace_egm_dev *egm_dev)
> egm_chardev->device.parent = &egm_dev->aux_dev.dev;
> cdev_init(&egm_chardev->cdev, &file_ops);
> egm_chardev->cdev.owner = THIS_MODULE;
> + atomic_set(&egm_chardev->open_count, 0);

open_count is already zero from kzalloc, so this atomic_set() is
unnecessary. Thanks,

Alex

>
> ret = dev_set_name(&egm_chardev->device, "egm%lld", egm_dev->egmpxm);
> if (ret)
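
Pulling together the comments on the clearing loop, the shape being
asked for is roughly the following. This is a standalone userspace
sketch so it can be compiled and tested outside the kernel:
clear_in_chunks() and min_size() are hypothetical names, min_size()
stands in for min_t(size_t, ...), and the cond_resched() call the
kernel loop makes after each chunk is only noted in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's min_t(size_t, a, b). */
static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Zero len bytes at addr in pieces of at most chunk bytes and return
 * the number of pieces written. All locals are declared at the top of
 * the function rather than in a nested scope. The kernel version would
 * call cond_resched() after each memset() to avoid lockup warnings.
 */
static size_t clear_in_chunks(void *addr, size_t len, size_t chunk)
{
	unsigned char *p = addr;
	size_t remaining = len;
	size_t pieces = 0;
	size_t n;

	assert(chunk > 0);

	while (remaining > 0) {
		n = min_size(remaining, chunk);
		memset(p, 0, n);
		p += n;
		remaining -= n;
		pieces++;
	}

	return pieces;
}
```

With a 10-byte buffer and a 4-byte chunk, the loop writes pieces of 4,
4 and 2 bytes; the explicitly typed minimum is what keeps the final
short piece from overrunning the region.
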