From: Jason Gunthorpe <jgg@nvidia.com>
To: ankita@nvidia.com
Cc: alex.williamson@redhat.com, yishaih@nvidia.com,
skolothumtho@nvidia.com, kevin.tian@intel.com,
yi.l.liu@intel.com, zhiw@nvidia.com, aniketa@nvidia.com,
cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com,
vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
jhubbard@nvidia.com, danw@nvidia.com, anuaggarwal@nvidia.com,
mochs@nvidia.com, kjaju@nvidia.com, dnigam@nvidia.com,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC 14/14] vfio/nvgrace-gpu: Add link from pci to EGM
Date: Fri, 5 Sep 2025 10:42:19 -0300
Message-ID: <20250905134219.GH616306@nvidia.com>
In-Reply-To: <20250904040828.319452-15-ankita@nvidia.com>

On Thu, Sep 04, 2025 at 04:08:28AM +0000, ankita@nvidia.com wrote:
> From: Ankit Agrawal <ankita@nvidia.com>
>
> To replicate the host EGM topology in the VM in terms of
> GPU affinity, userspace needs to be aware of which
> GPUs belong to the same socket as the EGM region.
>
> Expose the list of GPUs associated with an EGM region
> through sysfs. The list can be queried from the auxiliary
> device path.
>
> On a 2-socket, 4 GPU Grace Blackwell setup, it shows up as the following:
> /sys/devices/pci0008:00/0008:00:00.0/0008:01:00.0/nvgrace_gpu_vfio_pci.egm.4
> /sys/devices/pci0009:00/0009:00:00.0/0009:01:00.0/nvgrace_gpu_vfio_pci.egm.4
> pointing to egm4.
>
> /sys/devices/pci0018:00/0018:00:00.0/0018:01:00.0/nvgrace_gpu_vfio_pci.egm.5
> /sys/devices/pci0019:00/0019:00:00.0/0019:01:00.0/nvgrace_gpu_vfio_pci.egm.5
> pointing to egm5.
>
> Moreover
> /sys/devices/pci0008:00/0008:00:00.0/0008:01:00.0/nvgrace_gpu_vfio_pci.egm.4
> /sys/devices/pci0009:00/0009:00:00.0/0009:01:00.0/nvgrace_gpu_vfio_pci.egm.4
> lists links to both the 0008:01:00.0 & 0009:01:00.0 GPU devices.
>
> and
> /sys/devices/pci0018:00/0018:00:00.0/0018:01:00.0/nvgrace_gpu_vfio_pci.egm.5
> /sys/devices/pci0019:00/0019:00:00.0/0019:01:00.0/nvgrace_gpu_vfio_pci.egm.5
> lists links to both the 0018:01:00.0 & 0019:01:00.0 GPU devices.

This seems backwards. I would rather the EGM chardev itself have a
directory of links to the PCI devices than have EGM manipulate
sysfs belonging to some other driver and subsystem.

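To illustrate the direction, here is a minimal userspace sketch in
Python of how the topology could be read if the links lived under the
EGM device instead. It assumes a hypothetical directory (called "gpus"
here, a name that is not part of this series) under the EGM device's
sysfs node, holding one symlink per associated PCI device. The function
name is likewise made up for illustration.

```python
import os


def gpus_for_egm(links_dir):
    """Return the PCI addresses linked from an EGM device's link directory.

    links_dir is an assumed location: a directory under the EGM device's
    sysfs node that contains one symlink per associated PCI device.
    """
    gpus = []
    for entry in os.scandir(links_dir):
        # Ignore regular attributes; only symlinks describe associations.
        if entry.is_symlink():
            # The link resolves to the PCI device's sysfs directory, whose
            # basename is the PCI address (for example 0008:01:00.0).
            target = os.path.realpath(entry.path)
            gpus.append(os.path.basename(target))
    return sorted(gpus)
```

With such a layout, a VMM could call gpus_for_egm() on the link
directory of egm4 and would be expected to find the 0008:01:00.0 and
0009:01:00.0 devices from the example above, without EGM adding
anything to the PCI devices' own sysfs directories.
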
Jason