All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Williamson <alex@shazbot.org>
To: <ankita@nvidia.com>
Cc: <vsethi@nvidia.com>, <jgg@nvidia.com>, <mochs@nvidia.com>,
	<jgg@ziepe.ca>, <skolothumtho@nvidia.com>, <cjia@nvidia.com>,
	<zhiw@nvidia.com>, <kjaju@nvidia.com>, <yishaih@nvidia.com>,
	<kevin.tian@intel.com>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	alex@shazbot.org
Subject: Re: [PATCH RFC v2 04/15] vfio/nvgrace-gpu: Introduce functions to fetch and save EGM info
Date: Wed, 4 Mar 2026 10:37:07 -0700	[thread overview]
Message-ID: <20260304103707.02e05918@shazbot.org> (raw)
In-Reply-To: <20260223155514.152435-5-ankita@nvidia.com>

On Mon, 23 Feb 2026 15:55:03 +0000
<ankita@nvidia.com> wrote:

> From: Ankit Agrawal <ankita@nvidia.com>
> 
> The nvgrace-gpu module tracks the various EGM regions on the system.
> The EGM region information - Base SPA and size - are part of the ACPI
> tables. This can be fetched from the DSD table using the GPU handle.
> 
> When the GPUs are bound to the nvgrace-gpu module, it fetches the EGM
> region information from the ACPI table using the GPU's pci_dev. The
> EGM regions are tracked in a list and the information per region is
> maintained in the nvgrace_egm_dev.
> 
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
> ---
>  drivers/vfio/pci/nvgrace-gpu/egm_dev.c | 24 +++++++++++++++++++++++-
>  drivers/vfio/pci/nvgrace-gpu/egm_dev.h |  4 +++-
>  drivers/vfio/pci/nvgrace-gpu/main.c    |  8 ++++++--
>  include/linux/nvgrace-egm.h            |  2 ++
>  4 files changed, 34 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
> index 0bf95688a486..20291504aca8 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
> +++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
> @@ -17,6 +17,26 @@ int nvgrace_gpu_has_egm_property(struct pci_dev *pdev, u64 *pegmpxm)
>  					pegmpxm);
>  }
>  
> +int nvgrace_gpu_fetch_egm_property(struct pci_dev *pdev, u64 *pegmphys,
> +				   u64 *pegmlength)
> +{
> +	int ret;
> +
> +	/*
> +	 * The memory information is present in the system ACPI tables as DSD
> +	 * properties nvidia,egm-base-pa and nvidia,egm-size.
> +	 */
> +	ret = device_property_read_u64(&pdev->dev, "nvidia,egm-size",
> +				       pegmlength);
> +	if (ret)
> +		return ret;
> +
> +	ret = device_property_read_u64(&pdev->dev, "nvidia,egm-base-pa",
> +				       pegmphys);
> +
> +	return ret;
> +}

What guarantees that all GPUs in the same PXM have the same properties?
AIUI we only consume the resulting properties for the first GPU
associated to the egm_dev.  Why do we even bother to retrieve the
properties for subsequent GPUs?

Nit, it's a bit inconsistent to partially write caller data on error
versus read to local variables and set the caller data only on success.
Thanks,

Alex

> +
>  int add_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
>  {
>  	struct gpu_node *node;
> @@ -54,7 +74,7 @@ static void nvgrace_gpu_release_aux_device(struct device *device)
>  
>  struct nvgrace_egm_dev *
>  nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name,
> -			      u64 egmpxm)
> +			      u64 egmphys, u64 egmlength, u64 egmpxm)
>  {
>  	struct nvgrace_egm_dev *egm_dev;
>  	int ret;
> @@ -64,6 +84,8 @@ nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name,
>  		goto create_err;
>  
>  	egm_dev->egmpxm = egmpxm;
> +	egm_dev->egmphys = egmphys;
> +	egm_dev->egmlength = egmlength;
>  	INIT_LIST_HEAD(&egm_dev->gpus);
>  
>  	egm_dev->aux_dev.id = egmpxm;
> diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.h b/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
> index 1635753c9e50..2e1612445898 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
> +++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
> @@ -16,6 +16,8 @@ void remove_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev);
>  
>  struct nvgrace_egm_dev *
>  nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name,
> -			      u64 egmphys);
> +			      u64 egmphys, u64 egmlength, u64 egmpxm);
>  
> +int nvgrace_gpu_fetch_egm_property(struct pci_dev *pdev, u64 *pegmphys,
> +				   u64 *pegmlength);
>  #endif /* EGM_DEV_H */
> diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
> index 3dd0c57e5789..b356e941340a 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/main.c
> +++ b/drivers/vfio/pci/nvgrace-gpu/main.c
> @@ -78,7 +78,7 @@ static struct list_head egm_dev_list;
>  static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev)
>  {
>  	struct nvgrace_egm_dev_entry *egm_entry = NULL;
> -	u64 egmpxm;
> +	u64 egmphys, egmlength, egmpxm;
>  	int ret = 0;
>  	bool is_new_region = false;
>  
> @@ -91,6 +91,10 @@ static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev)
>  	if (nvgrace_gpu_has_egm_property(pdev, &egmpxm))
>  		goto exit;
>  
> +	ret = nvgrace_gpu_fetch_egm_property(pdev, &egmphys, &egmlength);
> +	if (ret)
> +		goto exit;
> +
>  	list_for_each_entry(egm_entry, &egm_dev_list, list) {
>  		/*
>  		 * A system could have multiple GPUs associated with an
> @@ -110,7 +114,7 @@ static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev)
>  
>  	egm_entry->egm_dev =
>  		nvgrace_gpu_create_aux_device(pdev, NVGRACE_EGM_DEV_NAME,
> -					      egmpxm);
> +					      egmphys, egmlength, egmpxm);
>  	if (!egm_entry->egm_dev) {
>  		ret = -EINVAL;
>  		goto free_egm_entry;
> diff --git a/include/linux/nvgrace-egm.h b/include/linux/nvgrace-egm.h
> index e42494a2b1a6..a66906753267 100644
> --- a/include/linux/nvgrace-egm.h
> +++ b/include/linux/nvgrace-egm.h
> @@ -17,6 +17,8 @@ struct gpu_node {
>  
>  struct nvgrace_egm_dev {
>  	struct auxiliary_device aux_dev;
> +	phys_addr_t egmphys;
> +	size_t egmlength;
>  	u64 egmpxm;
>  	struct list_head gpus;
>  };


  parent reply	other threads:[~2026-03-04 17:37 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-23 15:54 [PATCH RFC v2 00/15] Add virtualization support for EGM ankita
2026-02-23 15:55 ` [PATCH RFC v2 01/15] vfio/nvgrace-gpu: Expand module_pci_driver to allow custom module init ankita
2026-02-23 15:55 ` [PATCH RFC v2 02/15] vfio/nvgrace-gpu: Create auxiliary device for EGM ankita
2026-02-26 14:28   ` Shameer Kolothum Thodi
2026-03-04  0:13   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 03/15] vfio/nvgrace-gpu: track GPUs associated with the EGM regions ankita
2026-02-26 14:55   ` Shameer Kolothum Thodi
2026-03-04 17:14     ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 04/15] vfio/nvgrace-gpu: Introduce functions to fetch and save EGM info ankita
2026-02-26 15:12   ` Shameer Kolothum Thodi
2026-03-04 17:37   ` Alex Williamson [this message]
2026-02-23 15:55 ` [PATCH RFC v2 05/15] vfio/nvgrace-egm: Introduce module to manage EGM ankita
2026-03-04 18:09   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 06/15] vfio/nvgrace-egm: Introduce egm class and register char device numbers ankita
2026-03-04 18:56   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 07/15] vfio/nvgrace-egm: Register auxiliary driver ops ankita
2026-03-04 19:06   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 08/15] vfio/nvgrace-egm: Expose EGM region as char device ankita
2026-02-26 17:08   ` Shameer Kolothum Thodi
2026-03-04 20:16   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 09/15] vfio/nvgrace-egm: Add chardev ops for EGM management ankita
2026-03-04 22:04   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 10/15] vfio/nvgrace-egm: Clear Memory before handing out to VM ankita
2026-02-26 18:15   ` Shameer Kolothum Thodi
2026-02-26 18:56     ` Jason Gunthorpe
2026-02-26 19:29       ` Shameer Kolothum Thodi
2026-03-04 22:14   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 11/15] vfio/nvgrace-egm: Fetch EGM region retired pages list ankita
2026-03-04 22:37   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 12/15] vfio/nvgrace-egm: Introduce ioctl to share retired pages ankita
2026-03-04 23:00   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 13/15] vfio/nvgrace-egm: expose the egm size through sysfs ankita
2026-03-04 23:22   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 14/15] vfio/nvgrace-gpu: Add link from pci to EGM ankita
2026-03-04 23:37   ` Alex Williamson
2026-02-23 15:55 ` [PATCH RFC v2 15/15] vfio/nvgrace-egm: register EGM PFNMAP range with memory_failure ankita
2026-03-04 23:48   ` Alex Williamson
2026-03-05 17:33 ` [PATCH RFC v2 00/15] Add virtualization support for EGM Alex Williamson
2026-03-11  6:47   ` Ankit Agrawal
2026-03-11 20:37     ` Alex Williamson
2026-03-12 13:51       ` Ankit Agrawal
2026-03-12 14:59         ` Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260304103707.02e05918@shazbot.org \
    --to=alex@shazbot.org \
    --cc=ankita@nvidia.com \
    --cc=cjia@nvidia.com \
    --cc=jgg@nvidia.com \
    --cc=jgg@ziepe.ca \
    --cc=kevin.tian@intel.com \
    --cc=kjaju@nvidia.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mochs@nvidia.com \
    --cc=skolothumtho@nvidia.com \
    --cc=vsethi@nvidia.com \
    --cc=yishaih@nvidia.com \
    --cc=zhiw@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.