public inbox for kvm@vger.kernel.org
* [PATCH v2 1/1] vfio/nvgrace-gpu: Convey kvm to map device memory region as noncached
@ 2024-02-29 19:39 ankita
  2024-03-04 14:53 ` Jason Gunthorpe
  2024-03-05 23:12 ` Alex Williamson
  0 siblings, 2 replies; 4+ messages in thread
From: ankita @ 2024-02-29 19:39 UTC (permalink / raw)
  To: ankita, jgg, alex.williamson, yishaih, shameerali.kolothum.thodi,
	kevin.tian
  Cc: aniketa, cjia, kwankhede, targupta, vsethi, acurrid, apopple,
	jhubbard, danw, rrameshbabu, zhiw, anuaggarwal, mochs, kvm,
	linux-kernel

From: Ankit Agrawal <ankita@nvidia.com>

The NVIDIA Grace Hopper GPUs have device memory that is intended to be
used as regular RAM. It is accessible through the CPU-GPU chip-to-chip
cache-coherent interconnect and is present in the system physical
address space. The device memory is split into two regions, termed
usemem and resmem, with each region mapped and exposed to the VM as a
separate fake device BAR [1].

Owing to a hardware defect in the Multi-Instance GPU (MIG) feature [2],
the resmem BAR must, as a workaround, display uncached memory
characteristics. Based on [3], on a system with FWB enabled, such as
Grace Hopper, the requisite properties (uncached, unaligned access) can
be achieved through a VM mapping (S1) of NORMAL_NC and a host mapping
(S2) of MT_S2_FWB_NORMAL_NC.

KVM currently maps the MMIO region in S2 as MT_S2_FWB_DEVICE_nGnRE by
default. The fake device BARs thus display DEVICE_nGnRE behavior in the
VM.

The following table summarizes the behavior for the various S1 and S2
mapping combinations for systems with FWB enabled [3].
S1           |  S2           | Result
NORMAL_NC    |  NORMAL_NC    | NORMAL_NC
NORMAL_NC    |  DEVICE_nGnRE | DEVICE_nGnRE
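
As a rough illustration of the table above, the two listed combinations
can be modeled as "Device wins over Normal-NC". This is only a sketch:
the enum and the function name combine_s1_s2() are hypothetical, and it
covers just these two memory types with FWB enabled, not the full
combining rules in [3].

```c
/* Only the two memory types that appear in the table. */
enum memtype {
	MEMTYPE_DEVICE_nGnRE,
	MEMTYPE_NORMAL_NC,
};

/*
 * Effective memory type seen by the guest for a given S1/S2 pair.
 * Hypothetical helper for illustration: if either stage is Device,
 * the result is Device; otherwise it stays Normal-NC.
 */
static enum memtype combine_s1_s2(enum memtype s1, enum memtype s2)
{
	if (s1 == MEMTYPE_DEVICE_nGnRE || s2 == MEMTYPE_DEVICE_nGnRE)
		return MEMTYPE_DEVICE_nGnRE;

	return MEMTYPE_NORMAL_NC;
}
```

With S2 left at DEVICE_nGnRE, no choice of S1 in the guest yields
NORMAL_NC, which is why the S2 attribute has to change.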

A recent change modifies this default behavior and makes KVM map MMIO
as MT_S2_FWB_NORMAL_NC when the VMA flag VM_ALLOW_ANY_UNCACHED is
set [4]. Setting S2 to MT_S2_FWB_NORMAL_NC provides the desired behavior
(uncached, unaligned access) for resmem.
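
The selection described above can be sketched as a standalone model.
This is not the KVM code from [4]: pick_s2_memtype(), the enum, and the
EX_VM_ALLOW_ANY_UNCACHED bit value are placeholders chosen for
illustration only.

```c
/* Placeholder bit for illustration; NOT the kernel's flag value. */
#define EX_VM_ALLOW_ANY_UNCACHED (1UL << 0)

/* The two S2 attributes discussed here. */
enum s2_memtype {
	S2_DEVICE_nGnRE,	/* default for MMIO */
	S2_NORMAL_NC,		/* used only when the VMA opts in */
};

/*
 * Hypothetical helper: choose the S2 attribute for an MMIO mapping
 * from the flags of the VMA that backs it.
 */
static enum s2_memtype pick_s2_memtype(unsigned long vm_flags)
{
	if (vm_flags & EX_VM_ALLOW_ANY_UNCACHED)
		return S2_NORMAL_NC;

	return S2_DEVICE_nGnRE;
}
```

In this model the default stays DEVICE_nGnRE, so only a driver that sets
the flag on its VMA changes what the guest observes.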

To use the VM_ALLOW_ANY_UNCACHED flag, the platform must guarantee that
no action taken on the MMIO mapping can trigger an uncontained
failure. Grace Hopper satisfies this requirement, so set the
VM_ALLOW_ANY_UNCACHED flag in the VMA.

Applied over next-20240227.
base-commit: 22ba90670a51

Link: https://lore.kernel.org/all/20240220115055.23546-4-ankita@nvidia.com/ [1]
Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu/ [2]
Link: https://developer.arm.com/documentation/ddi0487/latest/ section D8.5.5 [3]
Link: https://lore.kernel.org/all/20240224150546.368-1-ankita@nvidia.com/ [4]

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Vikram Sethi <vsethi@nvidia.com>
Cc: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
---
 drivers/vfio/pci/nvgrace-gpu/main.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
index 25814006352d..a7fd018aa548 100644
--- a/drivers/vfio/pci/nvgrace-gpu/main.c
+++ b/drivers/vfio/pci/nvgrace-gpu/main.c
@@ -160,8 +160,17 @@ static int nvgrace_gpu_mmap(struct vfio_device *core_vdev,
 	 * The carved out region of the device memory needs the NORMAL_NC
 	 * property. Communicate as such to the hypervisor.
 	 */
-	if (index == RESMEM_REGION_INDEX)
+	if (index == RESMEM_REGION_INDEX) {
+		/*
+		 * NORMAL_NC accesses to this region cannot trigger an
+		 * uncontained failure on this platform. Set
+		 * VM_ALLOW_ANY_UNCACHED to tell KVM to map it as NORMAL_NC
+		 * at S2, which lets the guest use NORMAL_NC for this mapping.
+		 */
+		vm_flags_set(vma, VM_ALLOW_ANY_UNCACHED);
+
 		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+	}
 
 	/*
 	 * Perform a PFN map to the memory and back the device BAR by the
-- 
2.34.1



* Re: [PATCH v2 1/1] vfio/nvgrace-gpu: Convey kvm to map device memory region as noncached
  2024-02-29 19:39 [PATCH v2 1/1] vfio/nvgrace-gpu: Convey kvm to map device memory region as noncached ankita
@ 2024-03-04 14:53 ` Jason Gunthorpe
  2024-03-05 23:12 ` Alex Williamson
  1 sibling, 0 replies; 4+ messages in thread
From: Jason Gunthorpe @ 2024-03-04 14:53 UTC (permalink / raw)
  To: ankita
  Cc: alex.williamson, yishaih, shameerali.kolothum.thodi, kevin.tian,
	aniketa, cjia, kwankhede, targupta, vsethi, acurrid, apopple,
	jhubbard, danw, rrameshbabu, zhiw, anuaggarwal, mochs, kvm,
	linux-kernel

On Thu, Feb 29, 2024 at 07:39:34PM +0000, ankita@nvidia.com wrote:
> From: Ankit Agrawal <ankita@nvidia.com>
> 
> The NVIDIA Grace Hopper GPUs have device memory that is supposed to be
> used as a regular RAM. It is accessible through CPU-GPU chip-to-chip
> cache coherent interconnect and is present in the system physical
> address space. The device memory is split into two regions - termed
> as usemem and resmem - in the system physical address space,
> with each region mapped and exposed to the VM as a separate fake
> device BAR [1].
> 
> Owing to a hardware defect for Multi-Instance GPU (MIG) feature [2],
> there is a requirement - as a workaround - for the resmem BAR to
> display uncached memory characteristics. Based on [3], on system with
> FWB enabled such as Grace Hopper, the requisite properties
> (uncached, unaligned access) can be achieved through a VM mapping (S1)
> of NORMAL_NC and host mapping (S2) of MT_S2_FWB_NORMAL_NC.
> 
> KVM currently maps the MMIO region in S2 as MT_S2_FWB_DEVICE_nGnRE by
> default. The fake device BARs thus displays DEVICE_nGnRE behavior in the
> VM.
> 
> The following table summarizes the behavior for the various S1 and S2
> mapping combinations for systems with FWB enabled [3].
> S1           |  S2           | Result
> NORMAL_NC    |  NORMAL_NC    | NORMAL_NC
> NORMAL_NC    |  DEVICE_nGnRE | DEVICE_nGnRE
> 
> Recently a change was added that modifies this default behavior and
> make KVM map MMIO as MT_S2_FWB_NORMAL_NC when a VMA flag
> VM_ALLOW_ANY_UNCACHED is set [4]. Setting S2 as MT_S2_FWB_NORMAL_NC
> provides the desired behavior (uncached, unaligned access) for resmem.
> 
> To use VM_ALLOW_ANY_UNCACHED flag, the platform must guarantee that
> no action taken on the MMIO mapping can trigger an uncontained
> failure. The Grace Hopper satisfies this requirement. So set
> the VM_ALLOW_ANY_UNCACHED flag in the VMA.
> 
> Applied over next-20240227.
> base-commit: 22ba90670a51
> 
> Link: https://lore.kernel.org/all/20240220115055.23546-4-ankita@nvidia.com/ [1]
> Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu/ [2]
> Link: https://developer.arm.com/documentation/ddi0487/latest/ section D8.5.5 [3]
> Link: https://lore.kernel.org/all/20240224150546.368-1-ankita@nvidia.com/ [4]
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Vikram Sethi <vsethi@nvidia.com>
> Cc: Zhi Wang <zhiw@nvidia.com>
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
> ---
>  drivers/vfio/pci/nvgrace-gpu/main.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


* Re: [PATCH v2 1/1] vfio/nvgrace-gpu: Convey kvm to map device memory region as noncached
  2024-02-29 19:39 [PATCH v2 1/1] vfio/nvgrace-gpu: Convey kvm to map device memory region as noncached ankita
  2024-03-04 14:53 ` Jason Gunthorpe
@ 2024-03-05 23:12 ` Alex Williamson
  2024-03-06  1:46   ` Ankit Agrawal
  1 sibling, 1 reply; 4+ messages in thread
From: Alex Williamson @ 2024-03-05 23:12 UTC (permalink / raw)
  To: ankita, Oliver Upton
  Cc: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, aniketa,
	cjia, kwankhede, targupta, vsethi, acurrid, apopple, jhubbard,
	danw, rrameshbabu, zhiw, anuaggarwal, mochs, kvm, linux-kernel,
	kvmarm

On Thu, 29 Feb 2024 19:39:34 +0000
<ankita@nvidia.com> wrote:

> From: Ankit Agrawal <ankita@nvidia.com>
> 
> The NVIDIA Grace Hopper GPUs have device memory that is supposed to be
> used as a regular RAM. It is accessible through CPU-GPU chip-to-chip
> cache coherent interconnect and is present in the system physical
> address space. The device memory is split into two regions - termed
> as usemem and resmem - in the system physical address space,
> with each region mapped and exposed to the VM as a separate fake
> device BAR [1].
> 
> Owing to a hardware defect for Multi-Instance GPU (MIG) feature [2],
> there is a requirement - as a workaround - for the resmem BAR to
> display uncached memory characteristics. Based on [3], on system with
> FWB enabled such as Grace Hopper, the requisite properties
> (uncached, unaligned access) can be achieved through a VM mapping (S1)
> of NORMAL_NC and host mapping (S2) of MT_S2_FWB_NORMAL_NC.
> 
> KVM currently maps the MMIO region in S2 as MT_S2_FWB_DEVICE_nGnRE by
> default. The fake device BARs thus displays DEVICE_nGnRE behavior in the
> VM.
> 
> The following table summarizes the behavior for the various S1 and S2
> mapping combinations for systems with FWB enabled [3].
> S1           |  S2           | Result
> NORMAL_NC    |  NORMAL_NC    | NORMAL_NC
> NORMAL_NC    |  DEVICE_nGnRE | DEVICE_nGnRE
> 
> Recently a change was added that modifies this default behavior and
> make KVM map MMIO as MT_S2_FWB_NORMAL_NC when a VMA flag
> VM_ALLOW_ANY_UNCACHED is set [4]. Setting S2 as MT_S2_FWB_NORMAL_NC
> provides the desired behavior (uncached, unaligned access) for resmem.
> 
> To use VM_ALLOW_ANY_UNCACHED flag, the platform must guarantee that
> no action taken on the MMIO mapping can trigger an uncontained
> failure. The Grace Hopper satisfies this requirement. So set
> the VM_ALLOW_ANY_UNCACHED flag in the VMA.
> 
> Applied over next-20240227.
> base-commit: 22ba90670a51
> 
> Link: https://lore.kernel.org/all/20240220115055.23546-4-ankita@nvidia.com/ [1]
> Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu/ [2]
> Link: https://developer.arm.com/documentation/ddi0487/latest/ section D8.5.5 [3]
> Link: https://lore.kernel.org/all/20240224150546.368-1-ankita@nvidia.com/ [4]
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Vikram Sethi <vsethi@nvidia.com>
> Cc: Zhi Wang <zhiw@nvidia.com>
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
> ---
>  drivers/vfio/pci/nvgrace-gpu/main.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)

Applied to vfio next branch for v6.9.

Oliver, FYI I did merge the branch you provided in [1] for this, thanks
for the foresight in providing that.

Alex

[1]https://lore.kernel.org/all/170899100569.1405597.5047894183843333522.b4-ty@linux.dev/

> 
> diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
> index 25814006352d..a7fd018aa548 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/main.c
> +++ b/drivers/vfio/pci/nvgrace-gpu/main.c
> @@ -160,8 +160,17 @@ static int nvgrace_gpu_mmap(struct vfio_device *core_vdev,
>  	 * The carved out region of the device memory needs the NORMAL_NC
>  	 * property. Communicate as such to the hypervisor.
>  	 */
> -	if (index == RESMEM_REGION_INDEX)
> +	if (index == RESMEM_REGION_INDEX) {
> +		/*
> +		 * The nvgrace-gpu module has no issues with uncontained
> +		 * failures on NORMAL_NC accesses. VM_ALLOW_ANY_UNCACHED is
> +		 * set to communicate to the KVM to S2 map as NORMAL_NC.
> +		 * This opens up guest usage of NORMAL_NC for this mapping.
> +		 */
> +		vm_flags_set(vma, VM_ALLOW_ANY_UNCACHED);
> +
>  		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
> +	}
>  
>  	/*
>  	 * Perform a PFN map to the memory and back the device BAR by the



* Re: [PATCH v2 1/1] vfio/nvgrace-gpu: Convey kvm to map device memory region as noncached
  2024-03-05 23:12 ` Alex Williamson
@ 2024-03-06  1:46   ` Ankit Agrawal
  0 siblings, 0 replies; 4+ messages in thread
From: Ankit Agrawal @ 2024-03-06  1:46 UTC (permalink / raw)
  To: Alex Williamson, Oliver Upton
  Cc: Jason Gunthorpe, Yishai Hadas,
	shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Alistair Popple, John Hubbard,
	Dan Williams, Rahul Rameshbabu, Zhi Wang, Anuj Aggarwal (SW-GPU),
	Matt Ochs, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvmarm@lists.linux.dev

>> To use VM_ALLOW_ANY_UNCACHED flag, the platform must guarantee that
>> no action taken on the MMIO mapping can trigger an uncontained
>> failure. The Grace Hopper satisfies this requirement. So set
>> the VM_ALLOW_ANY_UNCACHED flag in the VMA.
>>
>> Applied over next-20240227.
>> base-commit: 22ba90670a51
>>
>> Link: https://lore.kernel.org/all/20240220115055.23546-4-ankita@nvidia.com/ [1]
>> Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu/ [2]
>> Link: https://developer.arm.com/documentation/ddi0487/latest/ section D8.5.5 [3]
>> Link: https://lore.kernel.org/all/20240224150546.368-1-ankita@nvidia.com/ [4]
>>
>> Cc: Alex Williamson <alex.williamson@redhat.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Cc: Jason Gunthorpe <jgg@nvidia.com>
>> Cc: Vikram Sethi <vsethi@nvidia.com>
>> Cc: Zhi Wang <zhiw@nvidia.com>
>> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
>> ---
>>  drivers/vfio/pci/nvgrace-gpu/main.c | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> Applied to vfio next branch for v6.9.
>
> Oliver, FYI I did merge the branch you provided in [1] for this, thanks
> for the foresight in providing that.
>
> Alex
>
> [1]https://lore.kernel.org/all/170899100569.1405597.5047894183843333522.b4-ty@linux.dev/

Many thanks Alex, appreciate your help with this!
