public inbox for kvm@vger.kernel.org
From: <ankita@nvidia.com>
To: <ankita@nvidia.com>, <jgg@nvidia.com>,
	<alex.williamson@redhat.com>, <yishaih@nvidia.com>,
	<skolothumtho@nvidia.com>, <kevin.tian@intel.com>,
	<yi.l.liu@intel.com>, <zhiw@nvidia.com>
Cc: <aniketa@nvidia.com>, <cjia@nvidia.com>, <kwankhede@nvidia.com>,
	<targupta@nvidia.com>, <vsethi@nvidia.com>, <acurrid@nvidia.com>,
	<apopple@nvidia.com>, <jhubbard@nvidia.com>, <danw@nvidia.com>,
	<anuaggarwal@nvidia.com>, <mochs@nvidia.com>, <kjaju@nvidia.com>,
	<dnigam@nvidia.com>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: [RFC 14/14] vfio/nvgrace-gpu: Add link from pci to EGM
Date: Thu, 4 Sep 2025 04:08:28 +0000
Message-ID: <20250904040828.319452-15-ankita@nvidia.com>
In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com>

From: Ankit Agrawal <ankita@nvidia.com>

To replicate the host EGM topology, including GPU affinity,
in the VM, userspace needs to know which GPUs belong to the
same socket as each EGM region.

Expose the list of GPUs associated with an EGM region
through sysfs. The list can be queried from the auxiliary
device path.

On a 2-socket, 4-GPU Grace Blackwell setup, the following entries
show up under the GPU devices:
/sys/devices/pci0008:00/0008:00:00.0/0008:01:00.0/nvgrace_gpu_vfio_pci.egm.4
/sys/devices/pci0009:00/0009:00:00.0/0009:01:00.0/nvgrace_gpu_vfio_pci.egm.4
both pointing to egm4, and

/sys/devices/pci0018:00/0018:00:00.0/0018:01:00.0/nvgrace_gpu_vfio_pci.egm.5
/sys/devices/pci0019:00/0019:00:00.0/0019:01:00.0/nvgrace_gpu_vfio_pci.egm.5
both pointing to egm5.

Moreover, each of
/sys/devices/pci0008:00/0008:00:00.0/0008:01:00.0/nvgrace_gpu_vfio_pci.egm.4
/sys/devices/pci0009:00/0009:00:00.0/0009:01:00.0/nvgrace_gpu_vfio_pci.egm.4
lists links to both the 0008:01:00.0 and 0009:01:00.0 GPU devices,

and each of
/sys/devices/pci0018:00/0018:00:00.0/0018:01:00.0/nvgrace_gpu_vfio_pci.egm.5
/sys/devices/pci0019:00/0019:00:00.0/0019:01:00.0/nvgrace_gpu_vfio_pci.egm.5
lists links to both the 0018:01:00.0 and 0019:01:00.0 GPU devices.

Suggested-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
---
 drivers/vfio/pci/nvgrace-gpu/egm_dev.c | 42 +++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
index b8e143542bce..20e9213aa0ac 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
@@ -56,6 +56,36 @@ int nvgrace_gpu_fetch_egm_property(struct pci_dev *pdev, u64 *pegmphys,
 	return ret;
 }
 
+static int create_egm_symlinks(struct nvgrace_egm_dev *egm_dev,
+			       struct pci_dev *pdev)
+{
+	int ret_l1, ret_l2;
+
+	ret_l1 = sysfs_create_link_nowarn(&pdev->dev.kobj,
+					  &egm_dev->aux_dev.dev.kobj,
+					  dev_name(&egm_dev->aux_dev.dev));
+
+	/*
+	 * Tolerate -EEXIST: the entry already exists when the GPU is the
+	 * auxiliary device's parent. Return any other error.
+	 */
+	if (ret_l1 && ret_l1 != -EEXIST)
+		return ret_l1;
+
+	ret_l2 = sysfs_create_link(&egm_dev->aux_dev.dev.kobj,
+				   &pdev->dev.kobj,
+				   dev_name(&pdev->dev));
+
+	/*
+	 * Remove the aux dev link only if it wasn't already present.
+	 */
+	if (ret_l2 && !ret_l1)
+		sysfs_remove_link(&pdev->dev.kobj,
+				  dev_name(&egm_dev->aux_dev.dev));
+
+	return ret_l2;
+}
+
 int add_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
 {
 	struct gpu_node *node;
@@ -68,7 +98,16 @@ int add_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
 
 	list_add_tail(&node->list, &egm_dev->gpus);
 
-	return 0;
+	return create_egm_symlinks(egm_dev, pdev);
+}
+
+static void remove_egm_symlinks(struct nvgrace_egm_dev *egm_dev,
+				struct pci_dev *pdev)
+{
+	sysfs_remove_link(&pdev->dev.kobj,
+			  dev_name(&egm_dev->aux_dev.dev));
+	sysfs_remove_link(&egm_dev->aux_dev.dev.kobj,
+			  dev_name(&pdev->dev));
 }
 
 void remove_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
@@ -77,6 +116,7 @@ void remove_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
 
 	list_for_each_entry_safe(node, tmp, &egm_dev->gpus, list) {
 		if (node->pdev == pdev) {
+			remove_egm_symlinks(egm_dev, pdev);
 			list_del(&node->list);
 			kvfree(node);
 		}
-- 
2.34.1


