linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/1] KVM: arm64: Map GPU memory with no struct pages
@ 2024-11-18 13:19 ankita
  2024-11-18 13:19 ` [PATCH v2 1/1] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags ankita
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: ankita @ 2024-11-18 13:19 UTC (permalink / raw)
  To: ankita, jgg, maz, oliver.upton, joey.gouly, suzuki.poulose,
	yuzenghui, catalin.marinas, will, ryan.roberts, shahuang,
	lpieralisi
  Cc: aniketa, cjia, kwankhede, targupta, vsethi, acurrid, apopple,
	jhubbard, danw, zhiw, mochs, udhoke, dnigam, alex.williamson,
	sebastianene, coltonlewis, kevin.tian, yi.l.liu, ardb, akpm,
	gshan, linux-mm, kvmarm, kvm, linux-kernel, linux-arm-kernel

From: Ankit Agrawal <ankita@nvidia.com>

Grace-based platforms such as the Grace Hopper/Blackwell Superchips have
CPU-accessible, cache-coherent GPU memory. The current KVM code
prevents such memory from being mapped Normal cacheable, and this patch
aims to solve that use case.

Today KVM forces the memory to either NORMAL or DEVICE_nGnRE
based on pfn_is_map_memory() and ignores the per-VMA flags that
indicate the memory attributes. This means there is no way for
a VM to get cacheable IO memory (like from a CXL or pre-CXL device).
In both cases the memory will be forced to DEVICE_nGnRE and the
VM's memory attributes will be ignored.

pfn_is_map_memory() is thus restrictive: it allows only memory that
has been added to the kernel to be marked as cacheable. In most cases
the code needs to know whether there is a struct page, or whether the
memory is in the kernel map, and pfn_valid() is an appropriate API for
this. Switch to pfn_valid() so that memory with no struct pages can
also be considered for a cacheable stage 2 mapping. A !pfn_valid() pfn
implies the memory is unsafe to map as cacheable.

Also take care of the following two cases that are unsafe to be mapped
as cacheable:
1. The VMA pgprot may have VM_IO set along with MT_NORMAL or MT_NORMAL_TAGGED.
   Although unexpected and wrong, the presence of such a configuration cannot
   be ruled out.
2. Configurations where VM_MTE_ALLOWED is not set and KVM_CAP_ARM_MTE
   is enabled. Otherwise a malicious guest can enable MTE at stage 1
   without the hypervisor being able to tell. This could cause external
   aborts.
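The two checks above, together with the requirement that the VMA use a
Normal memory type, can be sketched as follows. This is an illustrative
user-space model of the decision logic described in this cover letter,
not the actual kernel code; the struct and function names are invented
for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the VMA state the checks depend on. */
struct vma_info {
	bool normal_memtype;	/* pgprot is MT_NORMAL or MT_NORMAL_TAGGED */
	bool vm_io;		/* VM_IO is set on the VMA */
	bool mte_allowed;	/* VM_MTE_ALLOWED is set on the VMA */
};

/* Returns true if the VMA may be mapped Normal cacheable at stage 2. */
static bool stage2_allow_cacheable(const struct vma_info *vma, bool kvm_has_mte)
{
	if (!vma->normal_memtype)
		return false;
	/* Case 1: VM_IO alongside a Normal memtype is unexpected; reject. */
	if (vma->vm_io)
		return false;
	/*
	 * Case 2: with KVM_CAP_ARM_MTE enabled, the VMA must have
	 * VM_MTE_ALLOWED, or a guest could enable MTE at stage 1 without
	 * the hypervisor being able to tell, risking external aborts.
	 */
	if (kvm_has_mte && !vma->mte_allowed)
		return false;
	return true;
}
```

A VMA that is Normal, not VM_IO, and MTE-allowed passes; either unsafe
configuration above is rejected.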

The GPU memory, such as on the Grace Hopper systems, is interchangeable
with DDR memory and retains its properties. Executable faults should
therefore be allowed on memory determined to be Normal cacheable.

Note that when FWB is not enabled, the kernel expects to trivially do
cache management by flushing the memory, linearly converting a
kvm_pte to a phys_addr and then to a KVA; see kvm_flush_dcache_to_poc().
This is only possible for struct-page-backed memory. Do not allow
non-struct-page memory to be cacheable without FWB.
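The FWB constraint can be reduced to a small predicate: memory without a
struct page may only be cacheable when FWB takes care of coherency,
since KVM has no linear-map KVA through which to issue cache
maintenance. A minimal sketch, with an invented function name:

```c
#include <stdbool.h>

/*
 * Illustrative model of the constraint above: non-struct-page memory
 * (!pfn_valid) may be mapped cacheable only when FWB is available,
 * because kvm_flush_dcache_to_poc() needs struct-page-backed memory.
 */
static bool stage2_cacheable_pfn_ok(bool pfn_is_valid, bool has_fwb)
{
	if (!pfn_is_valid)	/* no struct page: no KVA to flush */
		return has_fwb;	/* FWB makes explicit CMOs unnecessary */
	return true;		/* struct-page memory can always be flushed */
}
```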

The changes are heavily influenced by the insightful discussions between
Catalin Marinas and Jason Gunthorpe [1] on v1. Many thanks for their
valuable suggestions.

Applied over next-20241117 and tested on the Grace Hopper and
Grace Blackwell platforms by booting up a VM and running several CUDA
workloads. This has not been tested on MTE-enabled hardware; if
someone can give it a try, that would be very helpful.

v1 -> v2
1. Removed kvm_is_device_pfn() as the determiner of device-type memory;
   use pfn_valid() instead.
2. Added handling for MTE.
3. Minor cleanup.

Link: https://lore.kernel.org/lkml/20230907181459.18145-2-ankita@nvidia.com [1]

Ankit Agrawal (1):
  KVM: arm64: Allow cacheable stage 2 mapping using VMA flags

 arch/arm64/include/asm/kvm_pgtable.h |   8 +++
 arch/arm64/kvm/hyp/pgtable.c         |   2 +-
 arch/arm64/kvm/mmu.c                 | 101 +++++++++++++++++++++------
 3 files changed, 87 insertions(+), 24 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
2024-11-18 13:19 [PATCH v2 0/1] KVM: arm64: Map GPU memory with no struct pages ankita
2024-11-18 13:19 ` [PATCH v2 1/1] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags ankita
2024-12-10 14:13   ` Will Deacon
2024-12-11  2:58     ` Ankit Agrawal
2024-12-11 21:49       ` Will Deacon
2024-12-11 22:01   ` Catalin Marinas
2025-01-10 21:04     ` Ankit Agrawal
2024-12-20 15:42   ` David Hildenbrand
2025-01-06 16:51     ` Jason Gunthorpe
2025-01-08 16:09       ` David Hildenbrand
2025-01-10 21:15         ` Ankit Agrawal
2025-01-13 16:27           ` Jason Gunthorpe
2025-01-14 13:17             ` David Hildenbrand
2025-01-14 13:31               ` Jason Gunthorpe
2025-01-14 23:13                 ` Ankit Agrawal
2025-01-15 14:32                   ` Jason Gunthorpe
2025-01-16 22:28                     ` Catalin Marinas
2025-01-17 14:00                       ` Jason Gunthorpe
2025-01-17 16:58                         ` David Hildenbrand
2025-01-17 17:10                           ` Jason Gunthorpe
2025-01-17 18:52                         ` Catalin Marinas
2025-01-17 19:16                           ` Jason Gunthorpe
2024-11-26 17:10 ` [PATCH v2 0/1] KVM: arm64: Map GPU memory with no struct pages Donald Dutile
2024-12-02  4:51   ` Ankit Agrawal
2024-12-10 14:07 ` Will Deacon
2024-12-10 14:18   ` Jason Gunthorpe
2024-12-10 14:45     ` Catalin Marinas
2024-12-10 15:56     ` Donald Dutile
2024-12-10 16:08       ` Catalin Marinas
2024-12-11  3:05         ` Ankit Agrawal
