From: Alex Williamson <alex@shazbot.org>
To: Ankit Agrawal <ankita@nvidia.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
Vikram Sethi <vsethi@nvidia.com>, Matt Ochs <mochs@nvidia.com>,
"jgg@ziepe.ca" <jgg@ziepe.ca>,
Shameer Kolothum Thodi <skolothumtho@nvidia.com>,
Neo Jia <cjia@nvidia.com>, Zhi Wang <zhiw@nvidia.com>,
Krishnakant Jaju <kjaju@nvidia.com>,
Yishai Hadas <yishaih@nvidia.com>,
"kevin.tian@intel.com" <kevin.tian@intel.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
alex@shazbot.org
Subject: Re: [PATCH RFC v2 00/15] Add virtualization support for EGM
Date: Thu, 12 Mar 2026 08:59:04 -0600
Message-ID: <20260312085904.42a98f16@shazbot.org>
In-Reply-To: <SA1PR12MB7199673D9DE79C11DB6B9E33B044A@SA1PR12MB7199.namprd12.prod.outlook.com>

On Thu, 12 Mar 2026 13:51:20 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> >> > nvgrace-gpu is manipulating sysfs
> >> > on devices owned by nvgrace-egm, we don't have mechanisms to manage the
> >> > aux device relative to the state of the GPU, we're trying to add a
> >> > driver that can bind to a device created by an out-of-tree driver, and
> >> > we're inventing new uAPIs on the chardev for things that already exist
> >> > for vfio regions.
> >>
> >> Sorry for the confusion. The nvgrace-egm would not bind to the device
> >> created by the out-of-tree driver. We would have a separate out-of-tree
> >> equivalent of nvgrace-egm to bind to the device created by the out-of-tree
> >> vfio driver. Maybe we can consider exposing register/unregister APIs from
> >> nvgrace-egm where a module (in-tree nvgrace / out-of-tree) can register
> >> a pdev that nvgrace-egm can use to fetch the region info.
> >
> > Ok, this wasn't clear to me, but does that also mean that if some GPUs
> > are managed by nvgrace-gpu and others by out-of-tree drivers that the
> > in-kernel and out-of-tree equivalent drivers are both installing
> > chardevs as /dev/egmXX? Playing in the same space is ugly, but what
> > happens when the 2 GPUs per socket are split between drivers and they
> > both try to add the same chardev?
>
> But that would be an unsupported configuration. It is expected that all the
> GPUs on the system and the EGM char devices are attached to the same
> VM for full functionality. So either all the devices (GPU and EGM chardev)
> would be bound to nvgrace or to the out-of-tree module. Please refer to sec 8.1:
> https://docs.nvidia.com/multi-node-nvlink-systems/partition-guide-v1-2.pdf
> Perhaps I should add this information to the commit message.

Just because it can be documented as a policy doesn't make it an
agreeable architecture.

> > However, I'd then ask the question why we're associating EGM to the GPU
> > PCI driver at all. For instance, why should nvgrace-gpu spawn aux
> > devices to feed into an nvgrace-egm driver, and duplicate that whole
> > thing in an out-of-tree driver, when we could just have one in-kernel
> > platform(?) driver walk ACPI, find these ranges, and expose them as
> > chardev entirely independent of the PCI driver bound to the GPU?
>
> So a new platform driver to walk through the ACPI tables and look for EGM
> properties and create EGM char devs?
>
> Maybe it is okay, but given that all four EGM properties are under the GPU's
> ACPI node and there being no independent ACPI _HID device identity, it sounds
> a bit off to me. Do we have a precedent like that?
>
> But as I mentioned above, the expectation is that the EGM devices and the GPU
> devices are assigned to the same VM. So would it not make sense that we
> keep the association between the EGM devices and the GPU devices?

You're telling me that the EGM access is 100% independent of any state
related to the GPU, so why would we tie the lifecycle of these aux
devices to any particular driver for the GPU or re-implement it across
multiple drivers? That doesn't make sense to me. Thanks,
Alex
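
[Editorial note: the standalone platform-driver alternative discussed above
could be sketched roughly as below. This is a hypothetical illustration, not
code from the series; every name here (struct egm_region, the "nvgrace-egm-plat"
driver name, the skipped ACPI walk) is invented for the sketch, and the fops,
remove path, and per-range iteration are elided.]

```c
/*
 * Minimal sketch: one platform driver exposes an EGM range as a chardev,
 * independent of whichever vfio driver is bound to the GPU.
 */
#include <linux/miscdevice.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/slab.h>

struct egm_region {
	phys_addr_t base;	/* filled from ACPI properties (elided) */
	size_t size;
	struct miscdevice mdev;
};

static int egm_probe(struct platform_device *pdev)
{
	struct egm_region *r;

	/*
	 * A real driver would iterate over every EGM range described
	 * under the GPUs' ACPI nodes; the sketch registers a single
	 * chardev and hard-codes the name.
	 */
	r = devm_kzalloc(&pdev->dev, sizeof(*r), GFP_KERNEL);
	if (!r)
		return -ENOMEM;

	r->mdev.minor = MISC_DYNAMIC_MINOR;
	r->mdev.name = "egm0";
	/* r->mdev.fops would implement mmap of the EGM range. */

	return misc_register(&r->mdev);	/* misc_deregister() in remove */
}

static struct platform_driver egm_driver = {
	.probe = egm_probe,
	.driver = { .name = "nvgrace-egm-plat" },
};
module_platform_driver(egm_driver);

MODULE_DESCRIPTION("Sketch: standalone EGM chardev platform driver");
MODULE_LICENSE("GPL");
```

The point of the sketch is lifecycle: the chardev exists as long as the
platform driver does, with no auxiliary-device plumbing back to a GPU driver.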
Thread overview: 42+ messages
2026-02-23 15:54 [PATCH RFC v2 00/15] Add virtualization support for EGM ankita
2026-02-23 15:55 ` [PATCH RFC v2 01/15] vfio/nvgrace-gpu: Expand module_pci_driver to allow custom module init ankita
2026-02-23 15:55 ` [PATCH RFC v2 02/15] vfio/nvgrace-gpu: Create auxiliary device for EGM ankita
2026-02-26 14:28 ` Shameer Kolothum Thodi
2026-03-04 0:13 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 03/15] vfio/nvgrace-gpu: track GPUs associated with the EGM regions ankita
2026-02-26 14:55 ` Shameer Kolothum Thodi
2026-03-04 17:14 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 04/15] vfio/nvgrace-gpu: Introduce functions to fetch and save EGM info ankita
2026-02-26 15:12 ` Shameer Kolothum Thodi
2026-03-04 17:37 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 05/15] vfio/nvgrace-egm: Introduce module to manage EGM ankita
2026-03-04 18:09 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 06/15] vfio/nvgrace-egm: Introduce egm class and register char device numbers ankita
2026-03-04 18:56 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 07/15] vfio/nvgrace-egm: Register auxiliary driver ops ankita
2026-03-04 19:06 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 08/15] vfio/nvgrace-egm: Expose EGM region as char device ankita
2026-02-26 17:08 ` Shameer Kolothum Thodi
2026-03-04 20:16 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 09/15] vfio/nvgrace-egm: Add chardev ops for EGM management ankita
2026-03-04 22:04 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 10/15] vfio/nvgrace-egm: Clear Memory before handing out to VM ankita
2026-02-26 18:15 ` Shameer Kolothum Thodi
2026-02-26 18:56 ` Jason Gunthorpe
2026-02-26 19:29 ` Shameer Kolothum Thodi
2026-03-04 22:14 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 11/15] vfio/nvgrace-egm: Fetch EGM region retired pages list ankita
2026-03-04 22:37 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 12/15] vfio/nvgrace-egm: Introduce ioctl to share retired pages ankita
2026-03-04 23:00 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 13/15] vfio/nvgrace-egm: expose the egm size through sysfs ankita
2026-03-04 23:22 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 14/15] vfio/nvgrace-gpu: Add link from pci to EGM ankita
2026-03-04 23:37 ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 15/15] vfio/nvgrace-egm: register EGM PFNMAP range with memory_failure ankita
2026-03-04 23:48 ` Alex Williamson
2026-03-05 17:33 ` [PATCH RFC v2 00/15] Add virtualization support for EGM Alex Williamson
2026-03-11 6:47 ` Ankit Agrawal
2026-03-11 20:37 ` Alex Williamson
2026-03-12 13:51 ` Ankit Agrawal
2026-03-12 14:59 ` Alex Williamson [this message]