[PATCH v4 0/6] KVM: Guest Memory Pre-Population API

public inbox for kvm@vger.kernel.org

* [PATCH v4 0/6] KVM: Guest Memory Pre-Population API
@ 2024-04-19  8:59 Paolo Bonzini
  2024-04-19  8:59 ` [PATCH 1/6] KVM: Document KVM_PRE_FAULT_MEMORY ioctl Paolo Bonzini
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Paolo Bonzini @ 2024-04-19  8:59 UTC
  To: linux-kernel, kvm
  Cc: isaku.yamahata, xiaoyao.li, binbin.wu, seanjc, rick.p.edgecombe

Pre-population has been requested several times to mitigate KVM page faults
during guest boot or after live migration.  It is also required by TDX
before filling in the initial guest memory with measured contents.  I am
not yet sure that userspace will use this ioctl for TDX; if it does not,
the code will still be used by a TDX-specific ioctl to pre-populate the
SEPT before invoking TDH.MEM.PAGE.ADD or TDH.MR.EXTEND.

This patch series depends on the other pieces that have been applied
to the kvm-coco-queue branch (and is present on the branch).

Paolo

v3->v4:
- renamed everything to KVM_PRE_FAULT_MEMORY, KVM_CAP_PRE_FAULT_MEMORY,
  struct kvm_pre_fault_memory.
- renamed base_address field to gpa
- merged introduction of kvm_tdp_map_page() and kvm_arch_vcpu_map_memory()
  in a single patch, moving the latter to mmu.c; did *not* merge them
  in a single function though
- removed the EINVAL return code for RET_PF_RETRY; KVM now retries
  internally and exits on signal_pending()
- return ENOENT for RET_PF_EMULATE
- do not document the possibility that different VMs can have different
  results for KVM_CHECK_EXTENSION(KVM_CAP_PRE_FAULT_MEMORY)
- return long from kvm_arch_vcpu_map_memory(), update size and gpa in
  kvm_vcpu_map_memory()
- cover remaining range.size more thoroughly in the selftest

v2->v3:
- no vendor-specific hooks
- just fail if pre-population is invoked while nested virt is active
- just populate page tables for the SMM address space if invoked while
  SMM is active
- struct name changed to `kvm_map_memory`
- common code has support for KVM_CHECK_EXTENSION(KVM_MAP_MEMORY)
  on the VM file descriptor, which allows making this ioctl supported
  only on a subset of VM types
- if EINTR or EAGAIN happens on the first page, it is returned.  Otherwise,
  the ioctl *succeeds* but mapping->size is left nonzero.  While this
  drops the detail as to why the system call was interrupted, it is
  consistent with other interruptible system calls such as read().
- the test is not x86-specific anymore (though for now it is only compiled
  on x86 because no other architecture supports the feature)
- instead of using __weak symbols, the code is conditional on a new
  Kconfig CONFIG_KVM_GENERIC_MAP_MEMORY.
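
The partial-completion behavior described above (the ioctl succeeds once
at least one page is mapped and leaves the remaining range in the
structure, much like read()) implies a retry loop in the caller.  The
following is only a sketch of such a loop, not code from this series.
struct pre_fault_range is a local mirror of struct kvm_pre_fault_memory;
pre_fault_fn, pre_fault_all() and fake_pre_fault() are hypothetical names,
and the callback stands in for ioctl(vcpu_fd, KVM_PRE_FAULT_MEMORY, &range)
so that the loop can be exercised without /dev/kvm.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct kvm_pre_fault_memory as posted in this series. */
struct pre_fault_range {
	uint64_t gpa;
	uint64_t size;
	uint64_t flags;
	uint64_t padding[5];
};

/*
 * Hypothetical stand-in for ioctl(vcpu_fd, KVM_PRE_FAULT_MEMORY, range):
 * returns 0 on (possibly partial) success, or -1 with errno set.
 */
typedef int (*pre_fault_fn)(void *ctx, struct pre_fault_range *range);

/*
 * Keep calling until the whole range is mapped or a real error occurs.
 * Returns 0 on success or a negative errno; *left receives the number
 * of bytes that were not mapped.
 */
static int pre_fault_all(pre_fault_fn fn, void *ctx, uint64_t gpa,
			 uint64_t size, uint64_t *left)
{
	struct pre_fault_range range = { .gpa = gpa, .size = size };
	int ret = 0;

	assert(fn != NULL);

	while (range.size) {
		uint64_t prev = range.size;

		if (fn(ctx, &range) < 0) {
			if (errno == EINTR)
				continue;	/* nothing was mapped; retry */
			ret = -errno;
			break;
		}

		/* A successful call must make progress. */
		if (range.size >= prev) {
			ret = -EIO;
			break;
		}
	}

	*left = range.size;
	return ret;
}

/* Test double: maps one 4 KiB page per call, ENOENT at or above *ctx. */
static int fake_pre_fault(void *ctx, struct pre_fault_range *range)
{
	uint64_t limit = *(uint64_t *)ctx;

	if (range->gpa >= limit) {
		errno = ENOENT;
		return -1;
	}

	range->gpa += 4096;
	range->size -= 4096;
	return 0;
}
```

With a real vCPU file descriptor the callback would simply wrap the
ioctl; the selftest in patch 6 uses a loop of the same shape.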


Isaku Yamahata (6):
  KVM: Document KVM_PRE_FAULT_MEMORY ioctl
  KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  KVM: x86/mmu: Extract __kvm_mmu_do_page_fault()
  KVM: x86/mmu: Make __kvm_mmu_do_page_fault() return mapped level
  KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
  KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY

 Documentation/virt/kvm/api.rst                |  50 ++++++
 arch/x86/kvm/Kconfig                          |   1 +
 arch/x86/kvm/mmu/mmu.c                        |  72 +++++++++
 arch/x86/kvm/mmu/mmu_internal.h               |  42 +++--
 arch/x86/kvm/x86.c                            |   3 +
 include/linux/kvm_host.h                      |   5 +
 include/uapi/linux/kvm.h                      |  10 ++
 tools/include/uapi/linux/kvm.h                |   8 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/pre_fault_memory_test.c     | 146 ++++++++++++++++++
 virt/kvm/Kconfig                              |   3 +
 virt/kvm/kvm_main.c                           |  63 ++++++++
 12 files changed, 390 insertions(+), 14 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/pre_fault_memory_test.c

-- 
2.43.0



* [PATCH 1/6] KVM: Document KVM_PRE_FAULT_MEMORY ioctl
  2024-04-19  8:59 [PATCH v4 0/6] KVM: Guest Memory Pre-Population API Paolo Bonzini
@ 2024-04-19  8:59 ` Paolo Bonzini
  2024-04-22 17:55   ` Isaku Yamahata
  2024-04-19  8:59 ` [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory Paolo Bonzini
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2024-04-19  8:59 UTC
  To: linux-kernel, kvm
  Cc: isaku.yamahata, xiaoyao.li, binbin.wu, seanjc, rick.p.edgecombe

From: Isaku Yamahata <isaku.yamahata@intel.com>

Add documentation for the KVM_PRE_FAULT_MEMORY ioctl. [1]

It populates guest memory but does not perform any technology-specific
initialization of that memory [2].  For example, CoCo-related operations
won't be performed.  Concretely for TDX, this API won't invoke
TDH.MEM.PAGE.ADD() or TDH.MR.EXTEND().  Vendor-specific APIs are required
for such operations.

The key point is to adopt a vcpu ioctl instead of a VM ioctl.  First,
populating guest memory requires a vcpu; with a VM ioctl, KVM would need
to pick one vcpu somehow.  Second, a vcpu ioctl allows each vcpu to invoke
this ioctl in parallel, which helps it scale to large guest memory sizes,
e.g. hundreds of GB.

[1] https://lore.kernel.org/kvm/Zbrj5WKVgMsUFDtb@google.com/
[2] https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@google.com/

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <9a060293c9ad9a78f1d8994cfe1311e818e99257.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 Documentation/virt/kvm/api.rst | 50 ++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)
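
The commit message above argues that a vcpu ioctl lets each vCPU
pre-fault its own part of guest memory in parallel.  One way userspace
might divide a range among vCPU threads is sketched below.  This is an
illustration only: struct gpa_slice and slice_for_vcpu() are hypothetical
names, and the sketch assumes 4 KiB pages and a page-aligned input range.

```c
#include <assert.h>
#include <stdint.h>

#define SLICE_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages, as on x86 */

/* One vCPU's share of the range; size == 0 means nothing to do. */
struct gpa_slice {
	uint64_t gpa;
	uint64_t size;
};

/*
 * Split [gpa, gpa + size) into nr_vcpus page-aligned slices and return
 * slice number idx.  Pages are distributed as evenly as possible: the
 * first (npages % nr_vcpus) slices receive one extra page.
 */
static struct gpa_slice slice_for_vcpu(uint64_t gpa, uint64_t size,
				       unsigned int nr_vcpus, unsigned int idx)
{
	uint64_t npages = size / SLICE_PAGE_SIZE;
	uint64_t base, extra, first, count;
	struct gpa_slice s;

	assert(nr_vcpus > 0 && idx < nr_vcpus);

	base = npages / nr_vcpus;
	extra = npages % nr_vcpus;
	first = idx * base + (idx < extra ? idx : extra);
	count = base + (idx < extra ? 1 : 0);

	s.gpa = gpa + first * SLICE_PAGE_SIZE;
	s.size = count * SLICE_PAGE_SIZE;
	return s;
}
```

Each thread would then pass its slice to KVM_PRE_FAULT_MEMORY on its own
vCPU.  A real caller would probably prefer slice boundaries aligned to the
huge page size so that one huge mapping is not requested by two threads.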

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index f0b76ff5030d..bbcaa5d2b54b 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6352,6 +6352,56 @@ a single guest_memfd file, but the bound ranges must not overlap).
 
 See KVM_SET_USER_MEMORY_REGION2 for additional details.
 
+4.143 KVM_PRE_FAULT_MEMORY
+------------------------
+
+:Capability: KVM_CAP_PRE_FAULT_MEMORY
+:Architectures: none
+:Type: vcpu ioctl
+:Parameters: struct kvm_pre_fault_memory (in/out)
+:Returns: 0 on success, < 0 on error
+
+Errors:
+
+  ========== ===============================================================
+  EINVAL     The specified `gpa` and `size` were invalid (e.g. not
+             page aligned).
+  ENOENT     The specified `gpa` is outside defined memslots.
+  EINTR      An unmasked signal is pending and no page was processed.
+  EFAULT     The parameter address was invalid.
+  EOPNOTSUPP Mapping memory for a GPA is unsupported by the
+             hypervisor, and/or for the current vCPU state/mode.
+  ========== ===============================================================
+
+::
+
+  struct kvm_pre_fault_memory {
+	/* in/out */
+	__u64 gpa;
+	__u64 size;
+	/* in */
+	__u64 flags;
+	__u64 padding[5];
+  };
+
+KVM_PRE_FAULT_MEMORY populates KVM's stage-2 page tables used to map memory
+for the current vCPU state.  KVM maps memory as if the vCPU generated a
+stage-2 read page fault, e.g. faults in memory as needed, but doesn't break
+CoW.  However, KVM does not mark any newly created stage-2 PTE as Accessed.
+
+In some cases, multiple vCPUs might share the page tables.  In this
+case, the ioctl can be called in parallel.
+
+Shadow page tables cannot support this ioctl because they
+are indexed by virtual address or nested guest physical address.
+Calling this ioctl when the guest is using shadow page tables (for
+example because it is running a nested guest with nested page tables)
+will fail with `EOPNOTSUPP` even if `KVM_CHECK_EXTENSION` reports
+the capability to be present.
+
+`flags` must currently be zero.
+
+
 5. The kvm_run structure
 ========================
 
-- 
2.43.0




* [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  2024-04-19  8:59 [PATCH v4 0/6] KVM: Guest Memory Pre-Population API Paolo Bonzini
  2024-04-19  8:59 ` [PATCH 1/6] KVM: Document KVM_PRE_FAULT_MEMORY ioctl Paolo Bonzini
@ 2024-04-19  8:59 ` Paolo Bonzini
  2024-04-22  5:39   ` Binbin Wu
                     ` (2 more replies)
  2024-04-19  8:59 ` [PATCH 3/6] KVM: x86/mmu: Extract __kvm_mmu_do_page_fault() Paolo Bonzini
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 19+ messages in thread
From: Paolo Bonzini @ 2024-04-19  8:59 UTC
  To: linux-kernel, kvm
  Cc: isaku.yamahata, xiaoyao.li, binbin.wu, seanjc, rick.p.edgecombe

From: Isaku Yamahata <isaku.yamahata@intel.com>

Add a new ioctl KVM_PRE_FAULT_MEMORY in the KVM common code.  It iterates over
the memory range and calls the arch-specific function.  The code is built only
when the architecture selects CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Message-ID: <819322b8f25971f2b9933bfa4506e618508ad782.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/linux/kvm_host.h |  5 ++++
 include/uapi/linux/kvm.h | 10 +++++++
 virt/kvm/Kconfig         |  3 ++
 virt/kvm/kvm_main.c      | 63 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 81 insertions(+)
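
The argument checks at the top of kvm_vcpu_pre_fault_memory() in the diff
below can be modeled in userspace as follows.  This is a sketch for
reasoning about the checks, not kernel code: check_range() and the
MODEL_* macros are hypothetical names, and 4 KiB pages are assumed.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE		4096ULL	/* assumption: 4 KiB pages */
#define MODEL_PAGE_ALIGNED(x)	(((x) & (MODEL_PAGE_SIZE - 1)) == 0)

/*
 * Userspace model of the flags, alignment and wraparound checks.
 * Returns 0 if the range would be accepted, -EINVAL otherwise.
 */
static int check_range(uint64_t gpa, uint64_t size, uint64_t flags)
{
	/* No flags are defined yet, so any set bit is rejected. */
	if (flags)
		return -EINVAL;

	/*
	 * Unsigned addition wraps around, so gpa + size <= gpa is true
	 * both when the range overflows 64 bits and when size is zero.
	 */
	if (!MODEL_PAGE_ALIGNED(gpa) || !MODEL_PAGE_ALIGNED(size) ||
	    gpa + size <= gpa)
		return -EINVAL;

	return 0;
}
```

In this model a zero-size request is rejected with -EINVAL, which
suggests that the later "if (!range->size) return 0;" in the posted code
is never reached.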

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8dea11701ab2..9e9943e5e37c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2478,4 +2478,9 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages
 void kvm_arch_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end);
 #endif
 
+#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
+long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
+				    struct kvm_pre_fault_memory *range);
+#endif
+
 #endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2190adbe3002..917d2964947d 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -917,6 +917,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_MEMORY_ATTRIBUTES 233
 #define KVM_CAP_GUEST_MEMFD 234
 #define KVM_CAP_VM_TYPES 235
+#define KVM_CAP_PRE_FAULT_MEMORY 236
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
@@ -1548,4 +1549,13 @@ struct kvm_create_guest_memfd {
 	__u64 reserved[6];
 };
 
+#define KVM_PRE_FAULT_MEMORY	_IOWR(KVMIO, 0xd5, struct kvm_pre_fault_memory)
+
+struct kvm_pre_fault_memory {
+	__u64 gpa;
+	__u64 size;
+	__u64 flags;
+	__u64 padding[5];
+};
+
 #endif /* __LINUX_KVM_H */
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 754c6c923427..b14e14cdbfb9 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -67,6 +67,9 @@ config HAVE_KVM_INVALID_WAKEUPS
 config KVM_GENERIC_DIRTYLOG_READ_PROTECT
        bool
 
+config KVM_GENERIC_PRE_FAULT_MEMORY
+       bool
+
 config KVM_COMPAT
        def_bool y
        depends on KVM && COMPAT && !(S390 || ARM64 || RISCV)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 38b498669ef9..51d8dbe7e93b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4379,6 +4379,55 @@ static int kvm_vcpu_ioctl_get_stats_fd(struct kvm_vcpu *vcpu)
 	return fd;
 }
 
+#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
+static int kvm_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
+				     struct kvm_pre_fault_memory *range)
+{
+	int idx;
+	long r;
+	u64 full_size;
+
+	if (range->flags)
+		return -EINVAL;
+
+	if (!PAGE_ALIGNED(range->gpa) ||
+	    !PAGE_ALIGNED(range->size) ||
+	    range->gpa + range->size <= range->gpa)
+		return -EINVAL;
+
+	if (!range->size)
+		return 0;
+
+	vcpu_load(vcpu);
+	idx = srcu_read_lock(&vcpu->kvm->srcu);
+
+	full_size = range->size;
+	do {
+		if (signal_pending(current)) {
+			r = -EINTR;
+			break;
+		}
+
+		r = kvm_arch_vcpu_pre_fault_memory(vcpu, range);
+		if (r < 0)
+			break;
+
+		if (WARN_ON_ONCE(r == 0))
+			break;
+
+		range->size -= r;
+		range->gpa += r;
+		cond_resched();
+	} while (range->size);
+
+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
+	vcpu_put(vcpu);
+
+	/* Return success if at least one page was mapped successfully.  */
+	return full_size == range->size ? r : 0;
+}
+#endif
+
 static long kvm_vcpu_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -4580,6 +4629,20 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		r = kvm_vcpu_ioctl_get_stats_fd(vcpu);
 		break;
 	}
+#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
+	case KVM_PRE_FAULT_MEMORY: {
+		struct kvm_pre_fault_memory range;
+
+		r = -EFAULT;
+		if (copy_from_user(&range, argp, sizeof(range)))
+			break;
+		r = kvm_vcpu_pre_fault_memory(vcpu, &range);
+		/* Pass back leftover range. */
+		if (copy_to_user(argp, &range, sizeof(range)))
+			r = -EFAULT;
+		break;
+	}
+#endif
 	default:
 		r = kvm_arch_vcpu_ioctl(filp, ioctl, arg);
 	}
-- 
2.43.0




* [PATCH 3/6] KVM: x86/mmu: Extract __kvm_mmu_do_page_fault()
  2024-04-19  8:59 [PATCH v4 0/6] KVM: Guest Memory Pre-Population API Paolo Bonzini
  2024-04-19  8:59 ` [PATCH 1/6] KVM: Document KVM_PRE_FAULT_MEMORY ioctl Paolo Bonzini
  2024-04-19  8:59 ` [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory Paolo Bonzini
@ 2024-04-19  8:59 ` Paolo Bonzini
  2024-04-22  8:46   ` Xiaoyao Li
  2024-04-19  8:59 ` [PATCH 4/6] KVM: x86/mmu: Make __kvm_mmu_do_page_fault() return mapped level Paolo Bonzini
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2024-04-19  8:59 UTC
  To: linux-kernel, kvm
  Cc: isaku.yamahata, xiaoyao.li, binbin.wu, seanjc, rick.p.edgecombe

From: Isaku Yamahata <isaku.yamahata@intel.com>

Extract __kvm_mmu_do_page_fault() from kvm_mmu_do_page_fault().  The
inner function initializes struct kvm_page_fault and calls the fault
handler; the outer function updates stats and converts the return code.
KVM_PRE_FAULT_MEMORY will call the inner function directly.

As a side effect, emulation_type is now set regardless of the return
code.  kvm_mmu_page_fault() is the only caller of kvm_mmu_do_page_fault(),
and it references the value only when RET_PF_EMULATE is returned.  Therefore,
this adjustment doesn't affect functionality.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <ddf1d98420f562707b11e12c416cce8fdb986bb1.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu_internal.h | 38 +++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index e68a60974cf4..9baae6c223ee 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -287,8 +287,8 @@ static inline void kvm_mmu_prepare_memory_fault_exit(struct kvm_vcpu *vcpu,
 				      fault->is_private);
 }
 
-static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
-					u64 err, bool prefetch, int *emulation_type)
+static inline int __kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
+					  u64 err, bool prefetch, int *emulation_type)
 {
 	struct kvm_page_fault fault = {
 		.addr = cr2_or_gpa,
@@ -318,6 +318,27 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		fault.slot = kvm_vcpu_gfn_to_memslot(vcpu, fault.gfn);
 	}
 
+	if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) && fault.is_tdp)
+		r = kvm_tdp_page_fault(vcpu, &fault);
+	else
+		r = vcpu->arch.mmu->page_fault(vcpu, &fault);
+
+	if (r == RET_PF_EMULATE && fault.is_private) {
+		kvm_mmu_prepare_memory_fault_exit(vcpu, &fault);
+		r = -EFAULT;
+	}
+
+	if (fault.write_fault_to_shadow_pgtable && emulation_type)
+		*emulation_type |= EMULTYPE_WRITE_PF_TO_SP;
+
+	return r;
+}
+
+static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
+					u64 err, bool prefetch, int *emulation_type)
+{
+	int r;
+
 	/*
 	 * Async #PF "faults", a.k.a. prefetch faults, are not faults from the
 	 * guest perspective and have already been counted at the time of the
@@ -326,18 +347,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	if (!prefetch)
 		vcpu->stat.pf_taken++;
 
-	if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) && fault.is_tdp)
-		r = kvm_tdp_page_fault(vcpu, &fault);
-	else
-		r = vcpu->arch.mmu->page_fault(vcpu, &fault);
-
-	if (r == RET_PF_EMULATE && fault.is_private) {
-		kvm_mmu_prepare_memory_fault_exit(vcpu, &fault);
-		return -EFAULT;
-	}
-
-	if (fault.write_fault_to_shadow_pgtable && emulation_type)
-		*emulation_type |= EMULTYPE_WRITE_PF_TO_SP;
+	r = __kvm_mmu_do_page_fault(vcpu, cr2_or_gpa, err, prefetch, emulation_type);
 
 	/*
 	 * Similar to above, prefetch faults aren't truly spurious, and the
-- 
2.43.0




* [PATCH 4/6] KVM: x86/mmu: Make __kvm_mmu_do_page_fault() return mapped level
  2024-04-19  8:59 [PATCH v4 0/6] KVM: Guest Memory Pre-Population API Paolo Bonzini
                   ` (2 preceding siblings ...)
  2024-04-19  8:59 ` [PATCH 3/6] KVM: x86/mmu: Extract __kvm_mmu_do_page_fault() Paolo Bonzini
@ 2024-04-19  8:59 ` Paolo Bonzini
  2024-04-19  8:59 ` [PATCH 5/6] KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() Paolo Bonzini
  2024-04-19  8:59 ` [PATCH 6/6] KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY Paolo Bonzini
  5 siblings, 0 replies; 19+ messages in thread
From: Paolo Bonzini @ 2024-04-19  8:59 UTC
  To: linux-kernel, kvm
  Cc: isaku.yamahata, xiaoyao.li, binbin.wu, seanjc, rick.p.edgecombe

From: Isaku Yamahata <isaku.yamahata@intel.com>

The guest memory population logic will need to know what page size or level
(4K, 2M, ...) is mapped.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <eabc3f3e5eb03b370cadf6e1901ea34d7a020adc.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu_internal.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 9baae6c223ee..b0a10f5a40dd 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -288,7 +288,8 @@ static inline void kvm_mmu_prepare_memory_fault_exit(struct kvm_vcpu *vcpu,
 }
 
 static inline int __kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
-					  u64 err, bool prefetch, int *emulation_type)
+					  u64 err, bool prefetch,
+					  int *emulation_type, u8 *level)
 {
 	struct kvm_page_fault fault = {
 		.addr = cr2_or_gpa,
@@ -330,6 +331,8 @@ static inline int __kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gp
 
 	if (fault.write_fault_to_shadow_pgtable && emulation_type)
 		*emulation_type |= EMULTYPE_WRITE_PF_TO_SP;
+	if (level)
+		*level = fault.goal_level;
 
 	return r;
 }
@@ -347,7 +350,8 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	if (!prefetch)
 		vcpu->stat.pf_taken++;
 
-	r = __kvm_mmu_do_page_fault(vcpu, cr2_or_gpa, err, prefetch, emulation_type);
+	r = __kvm_mmu_do_page_fault(vcpu, cr2_or_gpa, err, prefetch,
+				    emulation_type, NULL);
 
 	/*
 	 * Similar to above, prefetch faults aren't truly spurious, and the
-- 
2.43.0




* [PATCH 5/6] KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
  2024-04-19  8:59 [PATCH v4 0/6] KVM: Guest Memory Pre-Population API Paolo Bonzini
                   ` (3 preceding siblings ...)
  2024-04-19  8:59 ` [PATCH 4/6] KVM: x86/mmu: Make __kvm_mmu_do_page_fault() return mapped level Paolo Bonzini
@ 2024-04-19  8:59 ` Paolo Bonzini
  2024-04-22 15:37   ` Xiaoyao Li
  2024-04-19  8:59 ` [PATCH 6/6] KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY Paolo Bonzini
  5 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2024-04-19  8:59 UTC
  To: linux-kernel, kvm
  Cc: isaku.yamahata, xiaoyao.li, binbin.wu, seanjc, rick.p.edgecombe

From: Isaku Yamahata <isaku.yamahata@intel.com>

Wire the KVM_PRE_FAULT_MEMORY ioctl to __kvm_mmu_do_page_fault() to populate
guest memory.  It can be called right after KVM_CREATE_VCPU creates a vCPU,
since at that point kvm_mmu_create() and kvm_init_mmu() have been called and
the vCPU is ready to invoke the KVM page fault handler.

The helper function kvm_tdp_map_page() takes care of the logic to
process RET_PF_* return values and convert them to success or errno.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <9b866a0ae7147f96571c439e75429a03dcb659b6.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/Kconfig   |  1 +
 arch/x86/kvm/mmu/mmu.c | 72 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c     |  3 ++
 3 files changed, 76 insertions(+)
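
The return value of kvm_arch_vcpu_pre_fault_memory() in the diff below is
the number of bytes of the request covered by the mapping that was just
installed, clamped to the requested size.  The arithmetic can be checked
in isolation with the following sketch.  The LVL_* and HPAGE_* macros are
local stand-ins for x86 KVM's PG_LEVEL_* and KVM_HPAGE_* definitions
(4 KiB base pages, 9 bits per level), and covered_bytes() is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for PG_LEVEL_4K/2M/1G and the KVM_HPAGE_* macros. */
#define LVL_4K 1
#define LVL_2M 2
#define LVL_1G 3
#define HPAGE_SHIFT(level)	(12 + ((level) - 1) * 9)
#define HPAGE_SIZE(level)	(1ULL << HPAGE_SHIFT(level))
#define HPAGE_MASK(level)	(~(HPAGE_SIZE(level) - 1))

/*
 * Bytes of [gpa, gpa + size) covered by the mapping of the given level
 * that contains gpa.  The mapping may start below gpa and may end after
 * gpa + size, hence the rounding and the clamp.
 */
static uint64_t covered_bytes(uint64_t gpa, uint64_t size, int level)
{
	uint64_t end, span;

	assert(level >= LVL_4K && level <= LVL_1G);

	/* End of the (huge) page that contains gpa. */
	end = (gpa & HPAGE_MASK(level)) + HPAGE_SIZE(level);
	span = end - gpa;

	return size < span ? size : span;
}
```

For example, a request that starts one 4 KiB page into a 2 MiB mapping
advances by 2 MiB minus 4 KiB, after which the common loop in
kvm_vcpu_pre_fault_memory() continues at the next 2 MiB boundary.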

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 7632fe6e4db9..54c155432793 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -44,6 +44,7 @@ config KVM
 	select KVM_VFIO
 	select HAVE_KVM_PM_NOTIFIER if PM
 	select KVM_GENERIC_HARDWARE_ENABLING
+	select KVM_GENERIC_PRE_FAULT_MEMORY
 	help
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions.  You will need a fairly recent
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 10e90788b263..a045b23964c0 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4647,6 +4647,78 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	return direct_page_fault(vcpu, fault);
 }
 
+static int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
+		     u8 *level)
+{
+	int r;
+
+	/* Restrict to TDP page fault. */
+	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
+		return -EOPNOTSUPP;
+
+retry:
+	r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, true, NULL, level);
+	if (r < 0)
+		return r;
+
+	switch (r) {
+	case RET_PF_RETRY:
+		if (signal_pending(current))
+			return -EINTR;
+		cond_resched();
+		goto retry;
+
+	case RET_PF_FIXED:
+	case RET_PF_SPURIOUS:
+		break;
+
+	case RET_PF_EMULATE:
+		return -ENOENT;
+
+	case RET_PF_CONTINUE:
+	case RET_PF_INVALID:
+	default:
+		WARN_ON_ONCE(r);
+		return -EIO;
+	}
+
+	return 0;
+}
+
+long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
+				    struct kvm_pre_fault_memory *range)
+{
+	u64 error_code = PFERR_GUEST_FINAL_MASK;
+	u8 level = PG_LEVEL_4K;
+	u64 end;
+	int r;
+
+	/*
+	 * reload is efficient when called repeatedly, so we can do it on
+	 * every iteration.
+	 */
+	kvm_mmu_reload(vcpu);
+
+	if (kvm_arch_has_private_mem(vcpu->kvm) &&
+	    kvm_mem_is_private(vcpu->kvm, gpa_to_gfn(range->gpa)))
+		error_code |= PFERR_PRIVATE_ACCESS;
+
+	/*
+	 * Shadow paging uses GVA for kvm page fault, so restrict to
+	 * two-dimensional paging.
+	 */
+	r = kvm_tdp_map_page(vcpu, range->gpa, error_code, &level);
+	if (r < 0)
+		return r;
+
+	/*
+	 * If the mapping that covers range->gpa can use a huge page, it
+	 * may start below it or end after range->gpa + range->size.
+	 */
+	end = (range->gpa & KVM_HPAGE_MASK(level)) + KVM_HPAGE_SIZE(level);
+	return min(range->size, end - range->gpa);
+}
+
 static void nonpaging_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = nonpaging_page_fault;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 83b8260443a3..619ad713254e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4715,6 +4715,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_MEMORY_FAULT_INFO:
 		r = 1;
 		break;
+	case KVM_CAP_PRE_FAULT_MEMORY:
+		r = tdp_enabled;
+		break;
 	case KVM_CAP_EXIT_HYPERCALL:
 		r = KVM_EXIT_HYPERCALL_VALID_MASK;
 		break;
-- 
2.43.0




* [PATCH 6/6] KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
  2024-04-19  8:59 [PATCH v4 0/6] KVM: Guest Memory Pre-Population API Paolo Bonzini
                   ` (4 preceding siblings ...)
  2024-04-19  8:59 ` [PATCH 5/6] KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() Paolo Bonzini
@ 2024-04-19  8:59 ` Paolo Bonzini
  2024-04-22 17:50   ` Isaku Yamahata
  2024-04-23 15:18   ` Xiaoyao Li
  5 siblings, 2 replies; 19+ messages in thread
From: Paolo Bonzini @ 2024-04-19  8:59 UTC
  To: linux-kernel, kvm
  Cc: isaku.yamahata, xiaoyao.li, binbin.wu, seanjc, rick.p.edgecombe

From: Isaku Yamahata <isaku.yamahata@intel.com>

Add a test case that exercises KVM_PRE_FAULT_MEMORY and then runs the guest to
access the pre-populated area.  It tests the KVM_PRE_FAULT_MEMORY ioctl for
KVM_X86_DEFAULT_VM and KVM_X86_SW_PROTECTED_VM.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <32427791ef42e5efaafb05d2ac37fa4372715f47.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tools/include/uapi/linux/kvm.h                |   8 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/pre_fault_memory_test.c     | 146 ++++++++++++++++++
 3 files changed, 155 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/pre_fault_memory_test.c

diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index c3308536482b..4d66d8afdcd1 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -2227,4 +2227,12 @@ struct kvm_create_guest_memfd {
 	__u64 reserved[6];
 };
 
+#define KVM_PRE_FAULT_MEMORY	_IOWR(KVMIO, 0xd5, struct kvm_pre_fault_memory)
+
+struct kvm_pre_fault_memory {
+	__u64 gpa;
+	__u64 size;
+	__u64 flags;
+};
+
 #endif /* __LINUX_KVM_H */
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 871e2de3eb05..61d581a4bab4 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -144,6 +144,7 @@ TEST_GEN_PROGS_x86_64 += set_memory_region_test
 TEST_GEN_PROGS_x86_64 += steal_time
 TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
 TEST_GEN_PROGS_x86_64 += system_counter_offset_test
+TEST_GEN_PROGS_x86_64 += pre_fault_memory_test
 
 # Compiled outputs used by test targets
 TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test
diff --git a/tools/testing/selftests/kvm/pre_fault_memory_test.c b/tools/testing/selftests/kvm/pre_fault_memory_test.c
new file mode 100644
index 000000000000..e56eed2c1f05
--- /dev/null
+++ b/tools/testing/selftests/kvm/pre_fault_memory_test.c
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2024, Intel, Inc
+ *
+ * Author:
+ * Isaku Yamahata <isaku.yamahata at gmail.com>
+ */
+#include <linux/sizes.h>
+
+#include <test_util.h>
+#include <kvm_util.h>
+#include <processor.h>
+
+/* Arbitrarily chosen values */
+#define TEST_SIZE		(SZ_2M + PAGE_SIZE)
+#define TEST_NPAGES		(TEST_SIZE / PAGE_SIZE)
+#define TEST_SLOT		10
+
+static void guest_code(uint64_t base_gpa)
+{
+	volatile uint64_t val __used;
+	int i;
+
+	for (i = 0; i < TEST_NPAGES; i++) {
+		uint64_t *src = (uint64_t *)(base_gpa + i * PAGE_SIZE);
+
+		val = *src;
+	}
+
+	GUEST_DONE();
+}
+
+static void pre_fault_memory(struct kvm_vcpu *vcpu, u64 gpa, u64 size,
+			     u64 left)
+{
+	struct kvm_pre_fault_memory range = {
+		.gpa = gpa,
+		.size = size,
+		.flags = 0,
+	};
+	u64 prev;
+	int ret, save_errno;
+
+	do {
+		prev = range.size;
+		ret = __vcpu_ioctl(vcpu, KVM_PRE_FAULT_MEMORY, &range);
+		save_errno = errno;
+		TEST_ASSERT((range.size < prev) ^ (ret < 0),
+			    "%sexpecting range.size to change on %s",
+			    ret < 0 ? "not " : "",
+			    ret < 0 ? "failure" : "success");
+	} while (ret >= 0 ? range.size : save_errno == EINTR);
+
+	TEST_ASSERT(range.size == left,
+		    "Completed with %lld bytes left, expected %" PRId64,
+		    range.size, left);
+
+	if (left == 0)
+		__TEST_ASSERT_VM_VCPU_IOCTL(!ret, "KVM_PRE_FAULT_MEMORY", ret, vcpu->vm);
+	else
+		/* No memory slot causes RET_PF_EMULATE, which results in -ENOENT. */
+		__TEST_ASSERT_VM_VCPU_IOCTL(ret && save_errno == ENOENT,
+					    "KVM_PRE_FAULT_MEMORY", ret, vcpu->vm);
+}
+
+static void __test_pre_fault_memory(unsigned long vm_type, bool private)
+{
+	const struct vm_shape shape = {
+		.mode = VM_MODE_DEFAULT,
+		.type = vm_type,
+	};
+	struct kvm_vcpu *vcpu;
+	struct kvm_run *run;
+	struct kvm_vm *vm;
+	struct ucall uc;
+
+	uint64_t guest_test_phys_mem;
+	uint64_t guest_test_virt_mem;
+	uint64_t alignment, guest_page_size;
+
+	vm = vm_create_shape_with_one_vcpu(shape, &vcpu, guest_code);
+
+	alignment = guest_page_size = vm_guest_mode_params[VM_MODE_DEFAULT].page_size;
+	guest_test_phys_mem = (vm->max_gfn - TEST_NPAGES) * guest_page_size;
+#ifdef __s390x__
+	alignment = max(0x100000UL, guest_page_size);
+#else
+	alignment = SZ_2M;
+#endif
+	guest_test_phys_mem = align_down(guest_test_phys_mem, alignment);
+	guest_test_virt_mem = guest_test_phys_mem;
+
+	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
+				    guest_test_phys_mem, TEST_SLOT, TEST_NPAGES,
+				    private ? KVM_MEM_GUEST_MEMFD : 0);
+	virt_map(vm, guest_test_virt_mem, guest_test_phys_mem, TEST_NPAGES);
+
+	if (private)
+		vm_mem_set_private(vm, guest_test_phys_mem, TEST_SIZE);
+	pre_fault_memory(vcpu, guest_test_phys_mem, SZ_2M, 0);
+	pre_fault_memory(vcpu, guest_test_phys_mem + SZ_2M, PAGE_SIZE * 2, PAGE_SIZE);
+	pre_fault_memory(vcpu, guest_test_phys_mem + TEST_SIZE, PAGE_SIZE, PAGE_SIZE);
+
+	vcpu_args_set(vcpu, 1, guest_test_virt_mem);
+	vcpu_run(vcpu);
+
+	run = vcpu->run;
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+		    "Wanted KVM_EXIT_IO, got exit reason: %u (%s)",
+		    run->exit_reason, exit_reason_str(run->exit_reason));
+
+	switch (get_ucall(vcpu, &uc)) {
+	case UCALL_ABORT:
+		REPORT_GUEST_ASSERT(uc);
+		break;
+	case UCALL_DONE:
+		break;
+	default:
+		TEST_FAIL("Unknown ucall 0x%lx.", uc.cmd);
+		break;
+	}
+
+	kvm_vm_free(vm);
+}
+
+static void test_pre_fault_memory(unsigned long vm_type, bool private)
+{
+	if (vm_type && !(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(vm_type))) {
+		pr_info("Skipping tests for vm_type 0x%lx\n", vm_type);
+		return;
+	}
+
+	__test_pre_fault_memory(vm_type, private);
+}
+
+int main(int argc, char *argv[])
+{
+	TEST_REQUIRE(kvm_check_cap(KVM_CAP_PRE_FAULT_MEMORY));
+
+	test_pre_fault_memory(0, false);
+#ifdef __x86_64__
+	test_pre_fault_memory(KVM_X86_SW_PROTECTED_VM, false);
+	test_pre_fault_memory(KVM_X86_SW_PROTECTED_VM, true);
+#endif
+	return 0;
+}
-- 
2.43.0



* Re: [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  2024-04-19  8:59 ` [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory Paolo Bonzini
@ 2024-04-22  5:39   ` Binbin Wu
  2024-04-24 16:05     ` Paolo Bonzini
  2024-04-22  7:19   ` Binbin Wu
  2024-04-22 18:00   ` Isaku Yamahata
  2 siblings, 1 reply; 19+ messages in thread
From: Binbin Wu @ 2024-04-22  5:39 UTC
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, isaku.yamahata, xiaoyao.li, seanjc,
	rick.p.edgecombe



On 4/19/2024 4:59 PM, Paolo Bonzini wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> Add a new ioctl KVM_PRE_FAULT_MEMORY in the KVM common code.  It iterates over
> the memory range and calls the arch-specific function.  The code is built only
> when the architecture selects CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY.
>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Message-ID: <819322b8f25971f2b9933bfa4506e618508ad782.1712785629.git.isaku.yamahata@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   include/linux/kvm_host.h |  5 ++++
>   include/uapi/linux/kvm.h | 10 +++++++
>   virt/kvm/Kconfig         |  3 ++
>   virt/kvm/kvm_main.c      | 63 ++++++++++++++++++++++++++++++++++++++++
>   4 files changed, 81 insertions(+)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 8dea11701ab2..9e9943e5e37c 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2478,4 +2478,9 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages
>   void kvm_arch_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end);
>   #endif
>   
> +#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
> +long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> +				    struct kvm_pre_fault_memory *range);
> +#endif
> +
>   #endif
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 2190adbe3002..917d2964947d 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -917,6 +917,7 @@ struct kvm_enable_cap {
>   #define KVM_CAP_MEMORY_ATTRIBUTES 233
>   #define KVM_CAP_GUEST_MEMFD 234
>   #define KVM_CAP_VM_TYPES 235
> +#define KVM_CAP_PRE_FAULT_MEMORY 236
>   
>   struct kvm_irq_routing_irqchip {
>   	__u32 irqchip;
> @@ -1548,4 +1549,13 @@ struct kvm_create_guest_memfd {
>   	__u64 reserved[6];
>   };
>   
> +#define KVM_PRE_FAULT_MEMORY	_IOWR(KVMIO, 0xd5, struct kvm_pre_fault_memory)
> +
> +struct kvm_pre_fault_memory {
> +	__u64 gpa;
> +	__u64 size;
> +	__u64 flags;
> +	__u64 padding[5];
> +};
> +
>   #endif /* __LINUX_KVM_H */
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index 754c6c923427..b14e14cdbfb9 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -67,6 +67,9 @@ config HAVE_KVM_INVALID_WAKEUPS
>   config KVM_GENERIC_DIRTYLOG_READ_PROTECT
>          bool
>   
> +config KVM_GENERIC_PRE_FAULT_MEMORY
> +       bool
> +
>   config KVM_COMPAT
>          def_bool y
>          depends on KVM && COMPAT && !(S390 || ARM64 || RISCV)
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 38b498669ef9..51d8dbe7e93b 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -4379,6 +4379,55 @@ static int kvm_vcpu_ioctl_get_stats_fd(struct kvm_vcpu *vcpu)
>   	return fd;
>   }
>   
> +#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
> +static int kvm_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> +				     struct kvm_pre_fault_memory *range)
> +{
> +	int idx;
> +	long r;
> +	u64 full_size;
> +
> +	if (range->flags)
> +		return -EINVAL;
> +
> +	if (!PAGE_ALIGNED(range->gpa) ||
> +	    !PAGE_ALIGNED(range->size) ||
> +	    range->gpa + range->size <= range->gpa)
> +		return -EINVAL;
> +
> +	if (!range->size)
> +		return 0;

The range->size == 0 case is already caught by the check above,
"range->gpa + range->size <= range->gpa", so this branch is unreachable.

If we want to return success when size is 0 (though I am not sure it's
needed), the check needs to be "range->gpa + range->size < range->gpa"
instead.
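
To make the difference concrete, here is a minimal userspace sketch of the
two forms of the check. It is not kernel code: the helper names and the
4 KiB page size are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

static bool sketch_page_aligned(uint64_t x)
{
	return (x & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/* "<=" form from the patch: wraparound and size == 0 are both rejected. */
static bool range_valid_le(uint64_t gpa, uint64_t size)
{
	return sketch_page_aligned(gpa) && sketch_page_aligned(size) &&
	       !(gpa + size <= gpa);
}

/* "<" form: only wraparound is rejected, so size == 0 passes through. */
static bool range_valid_lt(uint64_t gpa, uint64_t size)
{
	return sketch_page_aligned(gpa) && sketch_page_aligned(size) &&
	       !(gpa + size < gpa);
}

/* Self-check of the cases discussed in this review. */
static void range_check_examples(void)
{
	/* size == 0: rejected by "<=", accepted by "<" */
	assert(!range_valid_le(0x1000, 0));
	assert(range_valid_lt(0x1000, 0));

	/* ordinary range: accepted by both */
	assert(range_valid_le(0x1000, 0x2000));
	assert(range_valid_lt(0x1000, 0x2000));

	/* gpa + size wraps around 2^64: rejected by both */
	assert(!range_valid_le(UINT64_MAX - 0xfff, 0x2000));
	assert(!range_valid_lt(UINT64_MAX - 0xfff, 0x2000));
}
```

With the "<=" form, the later "if (!range->size) return 0;" can never be
reached; with the "<" form it can.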


> +
> +	vcpu_load(vcpu);
> +	idx = srcu_read_lock(&vcpu->kvm->srcu);
> +
> +	full_size = range->size;
> +	do {
> +		if (signal_pending(current)) {
> +			r = -EINTR;
> +			break;
> +		}
> +
> +		r = kvm_arch_vcpu_pre_fault_memory(vcpu, range);
> +		if (r < 0)
> +			break;
> +
> +		if (WARN_ON_ONCE(r == 0))
> +			break;
> +
> +		range->size -= r;
> +		range->gpa += r;
> +		cond_resched();
> +	} while (range->size);
> +
> +	srcu_read_unlock(&vcpu->kvm->srcu, idx);
> +	vcpu_put(vcpu);
> +
> +	/* Return success if at least one page was mapped successfully.  */
> +	return full_size == range->size ? r : 0;
> +}
> +#endif
> +
>   static long kvm_vcpu_ioctl(struct file *filp,
>   			   unsigned int ioctl, unsigned long arg)
>   {
> @@ -4580,6 +4629,20 @@ static long kvm_vcpu_ioctl(struct file *filp,
>   		r = kvm_vcpu_ioctl_get_stats_fd(vcpu);
>   		break;
>   	}
> +#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
> +	case KVM_PRE_FAULT_MEMORY: {
> +		struct kvm_pre_fault_memory range;
> +
> +		r = -EFAULT;
> +		if (copy_from_user(&range, argp, sizeof(range)))
> +			break;
> +		r = kvm_vcpu_pre_fault_memory(vcpu, &range);
> +		/* Pass back leftover range. */
> +		if (copy_to_user(argp, &range, sizeof(range)))
> +			r = -EFAULT;
> +		break;
> +	}
> +#endif
>   	default:
>   		r = kvm_arch_vcpu_ioctl(filp, ioctl, arg);
>   	}



* Re: [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  2024-04-19  8:59 ` [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory Paolo Bonzini
  2024-04-22  5:39   ` Binbin Wu
@ 2024-04-22  7:19   ` Binbin Wu
  2024-04-22 18:00   ` Isaku Yamahata
  2 siblings, 0 replies; 19+ messages in thread
From: Binbin Wu @ 2024-04-22  7:19 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, isaku.yamahata, xiaoyao.li, seanjc,
	rick.p.edgecombe



On 4/19/2024 4:59 PM, Paolo Bonzini wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> Add a new ioctl KVM_PRE_FAULT_MEMORY in the KVM common code. It iterates on the
> memory range and calls the arch-specific function.  Add stub arch function
> as a weak symbol.

The description is stale: the weak symbol has been gone since v3.



* Re: [PATCH 3/6] KVM: x86/mmu: Extract __kvm_mmu_do_page_fault()
  2024-04-19  8:59 ` [PATCH 3/6] KVM: x86/mmu: Extract __kvm_mmu_do_page_fault() Paolo Bonzini
@ 2024-04-22  8:46   ` Xiaoyao Li
  2024-06-12 20:47     ` Sean Christopherson
  0 siblings, 1 reply; 19+ messages in thread
From: Xiaoyao Li @ 2024-04-22  8:46 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: isaku.yamahata, binbin.wu, seanjc, rick.p.edgecombe

On 4/19/2024 4:59 PM, Paolo Bonzini wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Extract out __kvm_mmu_do_page_fault() from kvm_mmu_do_page_fault().  The
> inner function is to initialize struct kvm_page_fault and to call the fault
> handler, and the outer function handles updating stats and converting
> return code.  

I don't see how the outer function converts return code.

> KVM_PRE_FAULT_MEMORY will call the KVM page fault handler.

I assume it means the inner function will be used by KVM_PRE_FAULT_MEMORY.

> This patch makes the emulation_type always set irrelevant to the return
> code.  kvm_mmu_page_fault() is the only caller of kvm_mmu_do_page_fault(),
> and references the value only when PF_RET_EMULATE is returned.  Therefore,
> this adjustment doesn't affect functionality.

This paragraph needs to be removed, I think. It's not true.

> No functional change intended.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Message-ID: <ddf1d98420f562707b11e12c416cce8fdb986bb1.1712785629.git.isaku.yamahata@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   arch/x86/kvm/mmu/mmu_internal.h | 38 +++++++++++++++++++++------------
>   1 file changed, 24 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index e68a60974cf4..9baae6c223ee 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -287,8 +287,8 @@ static inline void kvm_mmu_prepare_memory_fault_exit(struct kvm_vcpu *vcpu,
>   				      fault->is_private);
>   }
>   
> -static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> -					u64 err, bool prefetch, int *emulation_type)
> +static inline int __kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> +					  u64 err, bool prefetch, int *emulation_type)
>   {
>   	struct kvm_page_fault fault = {
>   		.addr = cr2_or_gpa,
> @@ -318,6 +318,27 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>   		fault.slot = kvm_vcpu_gfn_to_memslot(vcpu, fault.gfn);
>   	}
>   
> +	if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) && fault.is_tdp)
> +		r = kvm_tdp_page_fault(vcpu, &fault);
> +	else
> +		r = vcpu->arch.mmu->page_fault(vcpu, &fault);
> +
> +	if (r == RET_PF_EMULATE && fault.is_private) {
> +		kvm_mmu_prepare_memory_fault_exit(vcpu, &fault);
> +		r = -EFAULT;
> +	}
> +
> +	if (fault.write_fault_to_shadow_pgtable && emulation_type)
> +		*emulation_type |= EMULTYPE_WRITE_PF_TO_SP;
> +
> +	return r;
> +}
> +
> +static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> +					u64 err, bool prefetch, int *emulation_type)
> +{
> +	int r;
> +
>   	/*
>   	 * Async #PF "faults", a.k.a. prefetch faults, are not faults from the
>   	 * guest perspective and have already been counted at the time of the
> @@ -326,18 +347,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>   	if (!prefetch)
>   		vcpu->stat.pf_taken++;
>   
> -	if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) && fault.is_tdp)
> -		r = kvm_tdp_page_fault(vcpu, &fault);
> -	else
> -		r = vcpu->arch.mmu->page_fault(vcpu, &fault);
> -
> -	if (r == RET_PF_EMULATE && fault.is_private) {
> -		kvm_mmu_prepare_memory_fault_exit(vcpu, &fault);
> -		return -EFAULT;
> -	}
> -
> -	if (fault.write_fault_to_shadow_pgtable && emulation_type)
> -		*emulation_type |= EMULTYPE_WRITE_PF_TO_SP;
> +	r = __kvm_mmu_do_page_fault(vcpu, cr2_or_gpa, err, prefetch, emulation_type);
>   
>   	/*
>   	 * Similar to above, prefetch faults aren't truly spurious, and the



* Re: [PATCH 5/6] KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
  2024-04-19  8:59 ` [PATCH 5/6] KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() Paolo Bonzini
@ 2024-04-22 15:37   ` Xiaoyao Li
  2024-06-12 21:02     ` Sean Christopherson
  0 siblings, 1 reply; 19+ messages in thread
From: Xiaoyao Li @ 2024-04-22 15:37 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: isaku.yamahata, binbin.wu, seanjc, rick.p.edgecombe

On 4/19/2024 4:59 PM, Paolo Bonzini wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Wire KVM_PRE_FAULT_MEMORY ioctl to __kvm_mmu_do_page_fault() to populate guest
> memory.  It can be called right after KVM_CREATE_VCPU creates a vCPU,
> since at that point kvm_mmu_create() and kvm_init_mmu() are called and
> the vCPU is ready to invoke the KVM page fault handler.
> 
> The helper function kvm_mmu_map_tdp_page take care of the logic to
> process RET_PF_* return values and convert them to success or errno.
> 
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Message-ID: <9b866a0ae7147f96571c439e75429a03dcb659b6.1712785629.git.isaku.yamahata@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   arch/x86/kvm/Kconfig   |  1 +
>   arch/x86/kvm/mmu/mmu.c | 72 ++++++++++++++++++++++++++++++++++++++++++
>   arch/x86/kvm/x86.c     |  3 ++
>   3 files changed, 76 insertions(+)
> 
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 7632fe6e4db9..54c155432793 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -44,6 +44,7 @@ config KVM
>   	select KVM_VFIO
>   	select HAVE_KVM_PM_NOTIFIER if PM
>   	select KVM_GENERIC_HARDWARE_ENABLING
> +	select KVM_GENERIC_PRE_FAULT_MEMORY
>   	help
>   	  Support hosting fully virtualized guest machines using hardware
>   	  virtualization extensions.  You will need a fairly recent
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 10e90788b263..a045b23964c0 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4647,6 +4647,78 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>   	return direct_page_fault(vcpu, fault);
>   }
>   
> +static int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> +		     u8 *level)
> +{
> +	int r;
> +
> +	/* Restrict to TDP page fault. */
> +	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
> +		return -EOPNOTSUPP;
> +
> +retry:
> +	r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, true, NULL, level);
> +	if (r < 0)
> +		return r;
> +
> +	switch (r) {
> +	case RET_PF_RETRY:
> +		if (signal_pending(current))
> +			return -EINTR;
> +		cond_resched();
> +		goto retry;
> +
> +	case RET_PF_FIXED:
> +	case RET_PF_SPURIOUS:
> +		break;
> +
> +	case RET_PF_EMULATE:
> +		return -ENOENT;
> +
> +	case RET_PF_CONTINUE:
> +	case RET_PF_INVALID:
> +	default:
> +		WARN_ON_ONCE(r);
> +		return -EIO;

Need to update patch 1 for -EIO

> +	}
> +
> +	return 0;
> +}
> +
> +long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> +				    struct kvm_pre_fault_memory *range)
> +{
> +	u64 error_code = PFERR_GUEST_FINAL_MASK;
> +	u8 level = PG_LEVEL_4K;
> +	u64 end;
> +	int r;
> +
> +	/*
> +	 * reload is efficient when called repeatedly, so we can do it on
> +	 * every iteration.
> +	 */
> +	kvm_mmu_reload(vcpu);
> +
> +	if (kvm_arch_has_private_mem(vcpu->kvm) &&
> +	    kvm_mem_is_private(vcpu->kvm, gpa_to_gfn(range->gpa)))
> +		error_code |= PFERR_PRIVATE_ACCESS;
> +
> +	/*
> +	 * Shadow paging uses GVA for kvm page fault, so restrict to
> +	 * two-dimensional paging.
> +	 */
> +	r = kvm_tdp_map_page(vcpu, range->gpa, error_code, &level);
> +	if (r < 0)
> +		return r;
> +
> +	/*
> +	 * If the mapping that covers range->gpa can use a huge page, it
> +	 * may start below it or end after range->gpa + range->size.
> +	 */
> +	end = (range->gpa & KVM_HPAGE_MASK(level)) + KVM_HPAGE_SIZE(level);
> +	return min(range->size, end - range->gpa);
> +}
> +
>   static void nonpaging_init_context(struct kvm_mmu *context)
>   {
>   	context->page_fault = nonpaging_page_fault;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 83b8260443a3..619ad713254e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4715,6 +4715,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>   	case KVM_CAP_MEMORY_FAULT_INFO:
>   		r = 1;
>   		break;
> +	case KVM_CAP_PRE_FAULT_MEMORY:
> +		r = tdp_enabled;
> +		break;
>   	case KVM_CAP_EXIT_HYPERCALL:
>   		r = KVM_EXIT_HYPERCALL_VALID_MASK;
>   		break;
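
For reference, the step-size arithmetic at the end of
kvm_arch_vcpu_pre_fault_memory() above can be sketched in userspace as
follows. The helpers are hypothetical stand-ins for KVM_HPAGE_SIZE() and
KVM_HPAGE_MASK(), assuming the usual x86 levels (1 = 4 KiB, 2 = 2 MiB,
3 = 1 GiB); this is an illustration, not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for KVM_HPAGE_SIZE(): level 1 = 4 KiB, 2 = 2 MiB, 3 = 1 GiB. */
static uint64_t sketch_hpage_size(unsigned int level)
{
	assert(level >= 1 && level <= 3);
	return 1ULL << (12 + 9 * (level - 1));
}

/*
 * Number of bytes of [gpa, gpa + size) covered by the single mapping of
 * the given level that contains gpa. The mapping may start below gpa and
 * may end beyond gpa + size, hence the clamp to size.
 */
static uint64_t pre_fault_step(uint64_t gpa, uint64_t size, unsigned int level)
{
	uint64_t hsize = sketch_hpage_size(level);
	uint64_t end = (gpa & ~(hsize - 1)) + hsize;
	uint64_t covered = end - gpa;

	return size < covered ? size : covered;
}
```

For example, a request starting 4 KiB into a 2 MiB mapping advances only to
the end of that mapping, so the generic loop in kvm_vcpu_pre_fault_memory()
needs a second iteration for the remainder.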



* Re: [PATCH 6/6] KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
  2024-04-19  8:59 ` [PATCH 6/6] KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY Paolo Bonzini
@ 2024-04-22 17:50   ` Isaku Yamahata
  2024-04-23 15:18   ` Xiaoyao Li
  1 sibling, 0 replies; 19+ messages in thread
From: Isaku Yamahata @ 2024-04-22 17:50 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, isaku.yamahata, xiaoyao.li, binbin.wu, seanjc,
	rick.p.edgecombe, isaku.yamahata

On Fri, Apr 19, 2024 at 04:59:27AM -0400,
Paolo Bonzini <pbonzini@redhat.com> wrote:

> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Add a test case to exercise KVM_PRE_FAULT_MEMORY and run the guest to access the
> pre-populated area.  It tests KVM_PRE_FAULT_MEMORY ioctl for KVM_X86_DEFAULT_VM
> and KVM_X86_SW_PROTECTED_VM.
> 
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Message-ID: <32427791ef42e5efaafb05d2ac37fa4372715f47.1712785629.git.isaku.yamahata@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  tools/include/uapi/linux/kvm.h                |   8 +
>  tools/testing/selftests/kvm/Makefile          |   1 +
>  .../selftests/kvm/pre_fault_memory_test.c     | 146 ++++++++++++++++++
>  3 files changed, 155 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/pre_fault_memory_test.c
> 
> diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
> index c3308536482b..4d66d8afdcd1 100644
> --- a/tools/include/uapi/linux/kvm.h
> +++ b/tools/include/uapi/linux/kvm.h
> @@ -2227,4 +2227,12 @@ struct kvm_create_guest_memfd {
>  	__u64 reserved[6];
>  };
>  
> +#define KVM_PRE_FAULT_MEMORY	_IOWR(KVMIO, 0xd5, struct kvm_pre_fault_memory)
> +
> +struct kvm_pre_fault_memory {
> +	__u64 gpa;
> +	__u64 size;
> +	__u64 flags;

nitpick: sync this copy with the struct update in patch 2, i.e. add:
+       __u64 padding[5];

> +};
> +
>  #endif /* __LINUX_KVM_H */
-- 
Isaku Yamahata <isaku.yamahata@intel.com>


* Re: [PATCH 1/6] KVM: Document KVM_PRE_FAULT_MEMORY ioctl
  2024-04-19  8:59 ` [PATCH 1/6] KVM: Document KVM_PRE_FAULT_MEMORY ioctl Paolo Bonzini
@ 2024-04-22 17:55   ` Isaku Yamahata
  0 siblings, 0 replies; 19+ messages in thread
From: Isaku Yamahata @ 2024-04-22 17:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, isaku.yamahata, xiaoyao.li, binbin.wu, seanjc,
	rick.p.edgecombe, isaku.yamahata

On Fri, Apr 19, 2024 at 04:59:22AM -0400,
Paolo Bonzini <pbonzini@redhat.com> wrote:

> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Adds documentation of KVM_PRE_FAULT_MEMORY ioctl. [1]
> 
> It populates guest memory.  It doesn't do extra operations on the
> underlying technology-specific initialization [2].  For example,
> CoCo-related operations won't be performed.  Concretely for TDX, this API
> won't invoke TDH.MEM.PAGE.ADD() or TDH.MR.EXTEND().  Vendor-specific APIs
> are required for such operations.
> 
> The key point is to adapt of vcpu ioctl instead of VM ioctl.  First,
> populating guest memory requires vcpu.  If it is VM ioctl, we need to pick
> one vcpu somehow.  Secondly, vcpu ioctl allows each vcpu to invoke this
> ioctl in parallel.  It helps to scale regarding guest memory size, e.g.,
> hundreds of GB.
> 
> [1] https://lore.kernel.org/kvm/Zbrj5WKVgMsUFDtb@google.com/
> [2] https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@google.com/
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Message-ID: <9a060293c9ad9a78f1d8994cfe1311e818e99257.1712785629.git.isaku.yamahata@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  Documentation/virt/kvm/api.rst | 50 ++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index f0b76ff5030d..bbcaa5d2b54b 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6352,6 +6352,56 @@ a single guest_memfd file, but the bound ranges must not overlap).
>  
>  See KVM_SET_USER_MEMORY_REGION2 for additional details.
>  
> +4.143 KVM_PRE_FAULT_MEMORY
> +------------------------
> +
> +:Capability: KVM_CAP_PRE_FAULT_MEMORY
> +:Architectures: none
> +:Type: vcpu ioctl
> +:Parameters: struct kvm_pre_fault_memory (in/out)
> +:Returns: 0 on success, < 0 on error
> +
> +Errors:
> +
> +  ========== ===============================================================
> +  EINVAL     The specified `gpa` and `size` were invalid (e.g. not
> +             page aligned).
> +  ENOENT     The specified `gpa` is outside defined memslots.
> +  EINTR      An unmasked signal is pending and no page was processed.
> +  EFAULT     The parameter address was invalid.
> +  EOPNOTSUPP Mapping memory for a GPA is unsupported by the
> +             hypervisor, and/or for the current vCPU state/mode.

     EIO        An unexpected error occurred.

> +  ========== ===============================================================
> +
> +::
> +
> +  struct kvm_pre_fault_memory {
> +	/* in/out */
> +	__u64 gpa;
> +	__u64 size;
> +	/* in */
> +	__u64 flags;
> +	__u64 padding[5];
> +  };
> +
> +KVM_PRE_FAULT_MEMORY populates KVM's stage-2 page tables used to map memory
> +for the current vCPU state.  KVM maps memory as if the vCPU generated a
> +stage-2 read page fault, e.g. faults in memory as needed, but doesn't break
> +CoW.  However, KVM does not mark any newly created stage-2 PTE as Accessed.
> +
> +In some cases, multiple vCPUs might share the page tables.  In this
> +case, the ioctl can be called in parallel.
> +
> +Shadow page tables cannot support this ioctl because they
> +are indexed by virtual address or nested guest physical address.
> +Calling this ioctl when the guest is using shadow page tables (for
> +example because it is running a nested guest with nested page tables)
> +will fail with `EOPNOTSUPP` even if `KVM_CHECK_EXTENSION` reports
> +the capability to be present.
> +
> +`flags` must currently be zero.

`flags` and `padding` must currently be zero.
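
Putting the documented semantics together, a userspace caller might look
roughly like the sketch below. To keep it self-contained it mirrors the
uAPI from patch 2 under local "sketch" names instead of including
<linux/kvm.h>; pre_fault_range() is a hypothetical helper, and retrying on
EINTR is just one possible policy.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of the struct proposed in patch 2 (illustration only). */
struct sketch_pre_fault_memory {
	uint64_t gpa;
	uint64_t size;
	uint64_t flags;
	uint64_t padding[5];
};
static_assert(sizeof(struct sketch_pre_fault_memory) == 64,
	      "uAPI struct is eight 64-bit fields");

#define SKETCH_KVMIO 0xAE	/* value of KVMIO in <linux/kvm.h> */
#define SKETCH_KVM_PRE_FAULT_MEMORY \
	_IOWR(SKETCH_KVMIO, 0xd5, struct sketch_pre_fault_memory)

/*
 * Pre-fault [gpa, gpa + size) on one vCPU.  Returns 0 on success or a
 * negative errno; *left receives the number of bytes not processed.
 * A zero size is treated as a no-op here and never reaches the kernel.
 */
static int pre_fault_range(int vcpu_fd, uint64_t gpa, uint64_t size,
			   uint64_t *left)
{
	struct sketch_pre_fault_memory range;
	int ret = 0;

	memset(&range, 0, sizeof(range));	/* flags and padding stay zero */
	range.gpa = gpa;
	range.size = size;

	/* On partial success the kernel writes back the leftover range. */
	while (range.size) {
		if (ioctl(vcpu_fd, SKETCH_KVM_PRE_FAULT_MEMORY, &range) < 0) {
			if (errno == EINTR)
				continue;
			ret = -errno;
			break;
		}
	}

	*left = range.size;
	return ret;
}
```

Because the ioctl is per-vCPU, several threads could each run such a loop
on their own vCPU fd over disjoint GPA ranges.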

> +
> +
>  5. The kvm_run structure
>  ========================
>  
> -- 
> 2.43.0
> 
> 
> 

-- 
Isaku Yamahata <isaku.yamahata@intel.com>


* Re: [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  2024-04-19  8:59 ` [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory Paolo Bonzini
  2024-04-22  5:39   ` Binbin Wu
  2024-04-22  7:19   ` Binbin Wu
@ 2024-04-22 18:00   ` Isaku Yamahata
  2 siblings, 0 replies; 19+ messages in thread
From: Isaku Yamahata @ 2024-04-22 18:00 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, isaku.yamahata, xiaoyao.li, binbin.wu, seanjc,
	rick.p.edgecombe, isaku.yamahata

On Fri, Apr 19, 2024 at 04:59:23AM -0400,
Paolo Bonzini <pbonzini@redhat.com> wrote:

> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Add a new ioctl KVM_PRE_FAULT_MEMORY in the KVM common code. It iterates on the
> memory range and calls the arch-specific function.  Add stub arch function
> as a weak symbol.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Message-ID: <819322b8f25971f2b9933bfa4506e618508ad782.1712785629.git.isaku.yamahata@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/linux/kvm_host.h |  5 ++++
>  include/uapi/linux/kvm.h | 10 +++++++
>  virt/kvm/Kconfig         |  3 ++
>  virt/kvm/kvm_main.c      | 63 ++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 81 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 8dea11701ab2..9e9943e5e37c 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2478,4 +2478,9 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages
>  void kvm_arch_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end);
>  #endif
>  
> +#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
> +long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> +				    struct kvm_pre_fault_memory *range);
> +#endif
> +
>  #endif
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 2190adbe3002..917d2964947d 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -917,6 +917,7 @@ struct kvm_enable_cap {
>  #define KVM_CAP_MEMORY_ATTRIBUTES 233
>  #define KVM_CAP_GUEST_MEMFD 234
>  #define KVM_CAP_VM_TYPES 235
> +#define KVM_CAP_PRE_FAULT_MEMORY 236
>  
>  struct kvm_irq_routing_irqchip {
>  	__u32 irqchip;
> @@ -1548,4 +1549,13 @@ struct kvm_create_guest_memfd {
>  	__u64 reserved[6];
>  };
>  
> +#define KVM_PRE_FAULT_MEMORY	_IOWR(KVMIO, 0xd5, struct kvm_pre_fault_memory)
> +
> +struct kvm_pre_fault_memory {
> +	__u64 gpa;
> +	__u64 size;
> +	__u64 flags;
> +	__u64 padding[5];
> +};
> +
>  #endif /* __LINUX_KVM_H */
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index 754c6c923427..b14e14cdbfb9 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -67,6 +67,9 @@ config HAVE_KVM_INVALID_WAKEUPS
>  config KVM_GENERIC_DIRTYLOG_READ_PROTECT
>         bool
>  
> +config KVM_GENERIC_PRE_FAULT_MEMORY
> +       bool
> +
>  config KVM_COMPAT
>         def_bool y
>         depends on KVM && COMPAT && !(S390 || ARM64 || RISCV)
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 38b498669ef9..51d8dbe7e93b 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -4379,6 +4379,55 @@ static int kvm_vcpu_ioctl_get_stats_fd(struct kvm_vcpu *vcpu)
>  	return fd;
>  }
>  
> +#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
> +static int kvm_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> +				     struct kvm_pre_fault_memory *range)
> +{
> +	int idx;
> +	long r;
> +	u64 full_size;
> +
> +	if (range->flags)
> +		return -EINVAL;

For future extensibility, check that the padding is zero as well.
Or will we rely on flags only?

        if (memchr_inv(range->padding, 0, sizeof(range->padding)))
                return -EINVAL;
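
As a rough userspace illustration of the intended behaviour (all names
below are made up for the sketch; sketch_memchr_inv() only imitates the
kernel's memchr_inv(), which returns NULL when every byte matches):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Imitation of memchr_inv(): first byte that differs from c, else NULL. */
static const void *sketch_memchr_inv(const void *start, int c, size_t bytes)
{
	const unsigned char *p = start;
	size_t i;

	for (i = 0; i < bytes; i++)
		if (p[i] != (unsigned char)c)
			return &p[i];
	return NULL;
}

/* Same layout as struct kvm_pre_fault_memory in this series. */
struct sketch_range {
	uint64_t gpa;
	uint64_t size;
	uint64_t flags;
	uint64_t padding[5];
};

/* True when the reserved fields are acceptable, i.e. all zero. */
static bool reserved_fields_ok(const struct sketch_range *range)
{
	return !range->flags &&
	       !sketch_memchr_inv(range->padding, 0, sizeof(range->padding));
}

static void reserved_fields_examples(void)
{
	struct sketch_range r = { .gpa = 0x1000, .size = 0x1000 };

	assert(reserved_fields_ok(&r));		/* zeroed padding passes */
	r.padding[4] = 1;
	assert(!reserved_fields_ok(&r));	/* any stray bit is rejected */
}
```

Rejecting non-zero padding now is what would let a later kernel assign
meaning to those fields without breaking existing callers.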
-- 
Isaku Yamahata <isaku.yamahata@intel.com>


* Re: [PATCH 6/6] KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
  2024-04-19  8:59 ` [PATCH 6/6] KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY Paolo Bonzini
  2024-04-22 17:50   ` Isaku Yamahata
@ 2024-04-23 15:18   ` Xiaoyao Li
  2024-04-24  1:59     ` Xiaoyao Li
  1 sibling, 1 reply; 19+ messages in thread
From: Xiaoyao Li @ 2024-04-23 15:18 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: isaku.yamahata, binbin.wu, seanjc, rick.p.edgecombe

On 4/19/2024 4:59 PM, Paolo Bonzini wrote:

...

> +static void __test_pre_fault_memory(unsigned long vm_type, bool private)
> +{
> +	const struct vm_shape shape = {
> +		.mode = VM_MODE_DEFAULT,
> +		.type = vm_type,
> +	};
> +	struct kvm_vcpu *vcpu;
> +	struct kvm_run *run;
> +	struct kvm_vm *vm;
> +	struct ucall uc;
> +
> +	uint64_t guest_test_phys_mem;
> +	uint64_t guest_test_virt_mem;
> +	uint64_t alignment, guest_page_size;
> +
> +	vm = vm_create_shape_with_one_vcpu(shape, &vcpu, guest_code);
> +
> +	alignment = guest_page_size = vm_guest_mode_params[VM_MODE_DEFAULT].page_size;
> +	guest_test_phys_mem = (vm->max_gfn - TEST_NPAGES) * guest_page_size;
> +#ifdef __s390x__
> +	alignment = max(0x100000UL, guest_page_size);
> +#else
> +	alignment = SZ_2M;
> +#endif
> +	guest_test_phys_mem = align_down(guest_test_phys_mem, alignment);
> +	guest_test_virt_mem = guest_test_phys_mem;

guest_test_virt_mem cannot simply be set to guest_test_phys_mem; this
makes the following virt_map() call fail with

==== Test Assertion Failure ====
   lib/x86_64/processor.c:197: sparsebit_is_set(vm->vpages_valid, (vaddr >> vm->page_shift))
   pid=4773 tid=4773 errno=0 - Success
      1	0x000000000040f55c: __virt_pg_map at processor.c:197
      2	0x000000000040605e: virt_pg_map at kvm_util_base.h:1065
      3	 (inlined by) virt_map at kvm_util.c:1571
      4	0x0000000000402b75: __test_pre_fault_memory at pre_fault_memory_test.c:96
      5	0x000000000040246e: test_pre_fault_memory at pre_fault_memory_test.c:133 (discriminator 3)
      6	 (inlined by) main at pre_fault_memory_test.c:140 (discriminator 3)
      7	0x00007fcb68429d8f: ?? ??:0
      8	0x00007fcb68429e3f: ?? ??:0
      9	0x00000000004024e4: _start at ??:?
   Invalid virtual address, vaddr: 0xfffffffc00000

> +
> +	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
> +				    guest_test_phys_mem, TEST_SLOT, TEST_NPAGES,
> +				    private ? KVM_MEM_GUEST_MEMFD : 0);
> +	virt_map(vm, guest_test_virt_mem, guest_test_phys_mem, TEST_NPAGES);






* Re: [PATCH 6/6] KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
  2024-04-23 15:18   ` Xiaoyao Li
@ 2024-04-24  1:59     ` Xiaoyao Li
  0 siblings, 0 replies; 19+ messages in thread
From: Xiaoyao Li @ 2024-04-24  1:59 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: isaku.yamahata, binbin.wu, seanjc, rick.p.edgecombe

On 4/23/2024 11:18 PM, Xiaoyao Li wrote:
> On 4/19/2024 4:59 PM, Paolo Bonzini wrote:
> 
> ...
> 
>> +static void __test_pre_fault_memory(unsigned long vm_type, bool private)
>> +{
>> +    const struct vm_shape shape = {
>> +        .mode = VM_MODE_DEFAULT,
>> +        .type = vm_type,
>> +    };
>> +    struct kvm_vcpu *vcpu;
>> +    struct kvm_run *run;
>> +    struct kvm_vm *vm;
>> +    struct ucall uc;
>> +
>> +    uint64_t guest_test_phys_mem;
>> +    uint64_t guest_test_virt_mem;
>> +    uint64_t alignment, guest_page_size;
>> +
>> +    vm = vm_create_shape_with_one_vcpu(shape, &vcpu, guest_code);
>> +
>> +    alignment = guest_page_size = 
>> vm_guest_mode_params[VM_MODE_DEFAULT].page_size;
>> +    guest_test_phys_mem = (vm->max_gfn - TEST_NPAGES) * guest_page_size;
>> +#ifdef __s390x__
>> +    alignment = max(0x100000UL, guest_page_size);
>> +#else
>> +    alignment = SZ_2M;
>> +#endif
>> +    guest_test_phys_mem = align_down(guest_test_phys_mem, alignment);
>> +    guest_test_virt_mem = guest_test_phys_mem;
> 
> guest_test_virt_mem cannot simply be set to guest_test_phys_mem; this
> makes the following virt_map() call fail with

The root cause is that vm->pa_bits is 52 while vm->va_bits is 48, so an
address derived from vm->max_gfn lies outside the valid VA space and
cannot be identity-mapped.
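
A quick way to see this is to test the failing address for canonicality.
The helper below is a hypothetical userspace sketch, not part of the
selftest library, and it assumes that the vpages_valid check boils down to
the usual x86 canonical-address rule for a va_bits-wide address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if vaddr is canonical for a va_bits-wide virtual address space,
 * i.e. bits 63..(va_bits - 1) are either all zero or all one.
 */
static bool vaddr_is_canonical(uint64_t vaddr, unsigned int va_bits)
{
	assert(va_bits >= 2 && va_bits <= 64);

	uint64_t upper = vaddr >> (va_bits - 1);
	uint64_t all_ones = (1ULL << (65 - va_bits)) - 1;

	return upper == 0 || upper == all_ones;
}

static void canonical_examples(void)
{
	/* The address from the assertion: bit 47 set, bits 63..52 clear. */
	assert(!vaddr_is_canonical(0xfffffffc00000ULL, 48));

	/* Top of the lower half and bottom of the upper half are fine. */
	assert(vaddr_is_canonical(0x00007fffffffffffULL, 48));
	assert(vaddr_is_canonical(0xffff800000000000ULL, 48));
}
```

So the test needs a virtual address that fits in 48 bits rather than one
equal to a GPA near the top of the 52-bit physical space.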

> ==== Test Assertion Failure ====
>    lib/x86_64/processor.c:197: sparsebit_is_set(vm->vpages_valid, (vaddr 
>  >> vm->page_shift))
>    pid=4773 tid=4773 errno=0 - Success
>       1    0x000000000040f55c: __virt_pg_map at processor.c:197
>       2    0x000000000040605e: virt_pg_map at kvm_util_base.h:1065
>       3     (inlined by) virt_map at kvm_util.c:1571
>       4    0x0000000000402b75: __test_pre_fault_memory at 
> pre_fault_memory_test.c:96
>       5    0x000000000040246e: test_pre_fault_memory at 
> pre_fault_memory_test.c:133 (discriminator 3)
>       6     (inlined by) main at pre_fault_memory_test.c:140 
> (discriminator 3)
>       7    0x00007fcb68429d8f: ?? ??:0
>       8    0x00007fcb68429e3f: ?? ??:0
>       9    0x00000000004024e4: _start at ??:?
>    Invalid virtual address, vaddr: 0xfffffffc00000
> 
>> +
>> +    vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
>> +                    guest_test_phys_mem, TEST_SLOT, TEST_NPAGES,
>> +                    private ? KVM_MEM_GUEST_MEMFD : 0);
>> +    virt_map(vm, guest_test_virt_mem, guest_test_phys_mem, TEST_NPAGES);
> 
> 
> 
> 
> 



* Re: [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  2024-04-22  5:39   ` Binbin Wu
@ 2024-04-24 16:05     ` Paolo Bonzini
  0 siblings, 0 replies; 19+ messages in thread
From: Paolo Bonzini @ 2024-04-24 16:05 UTC (permalink / raw)
  To: Binbin Wu
  Cc: linux-kernel, kvm, isaku.yamahata, xiaoyao.li, seanjc,
	rick.p.edgecombe

On Mon, Apr 22, 2024 at 7:39 AM Binbin Wu <binbin.wu@linux.intel.com> wrote:
> The range->size == 0 case is already caught by the check above,
> "range->gpa + range->size <= range->gpa", so this branch is unreachable.
>
> If we want to return success when size is 0 (though I am not sure it's
> needed), the check needs to be "range->gpa + range->size < range->gpa"
> instead.

I think it's not needed because it could cause an infinite loop in
(buggy) userspace. Better return -EINVAL.

Paolo

>
> > +
> > +     vcpu_load(vcpu);
> > +     idx = srcu_read_lock(&vcpu->kvm->srcu);
> > +
> > +     full_size = range->size;
> > +     do {
> > +             if (signal_pending(current)) {
> > +                     r = -EINTR;
> > +                     break;
> > +             }
> > +
> > +             r = kvm_arch_vcpu_pre_fault_memory(vcpu, range);
> > +             if (r < 0)
> > +                     break;
> > +
> > +             if (WARN_ON_ONCE(r == 0))
> > +                     break;
> > +
> > +             range->size -= r;
> > +             range->gpa += r;
> > +             cond_resched();
> > +     } while (range->size);
> > +
> > +     srcu_read_unlock(&vcpu->kvm->srcu, idx);
> > +     vcpu_put(vcpu);
> > +
> > +     /* Return success if at least one page was mapped successfully.  */
> > +     return full_size == range->size ? r : 0;
> > +}
> > +#endif
> > +
> >   static long kvm_vcpu_ioctl(struct file *filp,
> >                          unsigned int ioctl, unsigned long arg)
> >   {
> > @@ -4580,6 +4629,20 @@ static long kvm_vcpu_ioctl(struct file *filp,
> >               r = kvm_vcpu_ioctl_get_stats_fd(vcpu);
> >               break;
> >       }
> > +#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
> > +     case KVM_PRE_FAULT_MEMORY: {
> > +             struct kvm_pre_fault_memory range;
> > +
> > +             r = -EFAULT;
> > +             if (copy_from_user(&range, argp, sizeof(range)))
> > +                     break;
> > +             r = kvm_vcpu_pre_fault_memory(vcpu, &range);
> > +             /* Pass back leftover range. */
> > +             if (copy_to_user(argp, &range, sizeof(range)))
> > +                     r = -EFAULT;
> > +             break;
> > +     }
> > +#endif
> >       default:
> >               r = kvm_arch_vcpu_ioctl(filp, ioctl, arg);
> >       }
>
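
For readers following the return convention in the quoted
kvm_vcpu_pre_fault_memory(), the sketch below models it in plain userspace C.
It is only an illustration under stated assumptions: struct range_model,
map_fn, pre_fault_model() and the example hook are invented names, the hook
stands in for kvm_arch_vcpu_pre_fault_memory(), and the signal_pending(),
cond_resched() and locking steps are left out.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative stand-in for struct kvm_pre_fault_memory (gpa and size only). */
struct range_model {
	uint64_t gpa;
	uint64_t size;
};

/* Stand-in for the arch hook: returns bytes mapped (> 0) or a negative errno. */
typedef long (*map_fn)(struct range_model *range);

static long pre_fault_model(struct range_model *range, map_fn map)
{
	uint64_t full_size = range->size;
	long r;

	do {
		r = map(range);
		if (r <= 0)
			break;

		/* Advance past what was mapped; the caller sees the leftover. */
		range->size -= r;
		range->gpa += r;
	} while (range->size);

	/* Report the error only if no progress was made at all. */
	return full_size == range->size ? r : 0;
}

/* Example hook: maps at most 4 KiB per call and fails from GPA 0x3000 on. */
static long map_4k_fail_at_0x3000(struct range_model *range)
{
	if (range->gpa >= 0x3000)
		return -ENOENT;

	return range->size < 0x1000 ? (long)range->size : 0x1000;
}
```

With this model, a request that fails partway returns 0 and leaves the
unprocessed remainder in the range; calling again with that remainder makes no
progress and so returns the error itself.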



* Re: [PATCH 3/6] KVM: x86/mmu: Extract __kvm_mmu_do_page_fault()
  2024-04-22  8:46   ` Xiaoyao Li
@ 2024-06-12 20:47     ` Sean Christopherson
  0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2024-06-12 20:47 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, linux-kernel, kvm, isaku.yamahata, binbin.wu,
	rick.p.edgecombe

On Mon, Apr 22, 2024, Xiaoyao Li wrote:
> On 4/19/2024 4:59 PM, Paolo Bonzini wrote:
> > From: Isaku Yamahata <isaku.yamahata@intel.com>
> > 
> > Extract out __kvm_mmu_do_page_fault() from kvm_mmu_do_page_fault().  The
> > inner function is to initialize struct kvm_page_fault and to call the fault
> > handler, and the outer function handles updating stats and converting
> > return code.
> 
> I don't see how the outer function converts return code.
> 
> > KVM_PRE_FAULT_MEMORY will call the KVM page fault handler.
> 
> I assume it means the inner function will be used by KVM_PRE_FAULT_MEMORY.
> 
> > This patch makes the emulation_type always set irrelevant to the return
> > code.  kvm_mmu_page_fault() is the only caller of kvm_mmu_do_page_fault(),
> > and references the value only when RET_PF_EMULATE is returned.  Therefore,
> > this adjustment doesn't affect functionality.
> 
> This paragraph needs to be removed, I think. It's not true.

It's oddly worded, but I do think it's correct.  kvm_arch_async_page_ready()
doesn't pass emulation_type, and kvm_mmu_page_fault() bails early for all other
return values:

	if (r < 0)
		return r;
	if (r != RET_PF_EMULATE)
		return 1;

That said, this belongs in a separate patch (if it's actually necessary). 

And _that_ said, rather than add an inner version, what if we instead shuffle the
stats code?  pf_taken, pf_spurious, and pf_emulate should _only_ ever be bumped
by kvm_mmu_page_fault(), i.e. should only track page faults that actually happened
in the guest.  And so handling them in kvm_mmu_do_page_fault() doesn't make any
sense, because there should only ever be one caller that passes prefetch=false.

Compile tested only, and kvm_mmu_page_fault() is a bit ugly (but that's solvable),
but I think this would provide better separation of concerns.

--
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 12 Jun 2024 12:51:38 -0700
Subject: [PATCH 1/2] KVM: x86/mmu: Bump pf_taken stat only in the "real" page
 fault handler

Account stat.pf_taken in kvm_mmu_page_fault(), i.e. the actual page fault
handler, instead of conditionally bumping it in kvm_mmu_do_page_fault().
The "real" page fault handler is the only path that should ever increment
the number of taken page faults, as all other paths that "do page fault"
are by definition not handling faults that occurred in the guest.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 2 ++
 arch/x86/kvm/mmu/mmu_internal.h | 8 --------
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f2c9580d9588..3461b8c4aba2 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5928,6 +5928,8 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
 	}
 
 	if (r == RET_PF_INVALID) {
+		vcpu->stat.pf_taken++;
+
 		r = kvm_mmu_do_page_fault(vcpu, cr2_or_gpa, error_code, false,
 					  &emulation_type);
 		if (KVM_BUG_ON(r == RET_PF_INVALID, vcpu->kvm))
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index ce2fcd19ba6b..8efd31b3856b 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -318,14 +318,6 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		fault.slot = kvm_vcpu_gfn_to_memslot(vcpu, fault.gfn);
 	}
 
-	/*
-	 * Async #PF "faults", a.k.a. prefetch faults, are not faults from the
-	 * guest perspective and have already been counted at the time of the
-	 * original fault.
-	 */
-	if (!prefetch)
-		vcpu->stat.pf_taken++;
-
 	if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) && fault.is_tdp)
 		r = kvm_tdp_page_fault(vcpu, &fault);
 	else

base-commit: b7bc82a015e237862837bd1300d6ba1f5cd17466
-- 
2.45.2.505.gda0bf45e8d-goog

From 1dc69d38a8d51c9d8ad833475938cb925f7ea4cf Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 12 Jun 2024 12:59:06 -0700
Subject: [PATCH 2/2] KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in
 callers of "do page fault"

Move the accounting of the result of kvm_mmu_do_page_fault() to its
callers, as only pf_fixed is common to guest page faults and async #PFs,
and upcoming support for KVM_PRE_FAULT_MEMORY won't bump _any_ stats.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 19 ++++++++++++++++++-
 arch/x86/kvm/mmu/mmu_internal.h | 13 -------------
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3461b8c4aba2..56373577a197 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4291,7 +4291,16 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
 	      work->arch.cr3 != kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu))
 		return;
 
-	kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, work->arch.error_code, true, NULL);
+	r = kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, work->arch.error_code,
+				  true, NULL);
+
+	/*
+	 * Account fixed page faults, otherwise they'll never be counted, but
+	 * ignore stats for all other return types.  Page-ready "faults" aren't
+	 * truly spurious and never trigger emulation.
+	 */
+	if (r == RET_PF_FIXED)
+		vcpu->stat.pf_fixed++;
 }
 
 static inline u8 kvm_max_level_for_order(int order)
@@ -5938,6 +5947,14 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
 
 	if (r < 0)
 		return r;
+
+	if (r == RET_PF_FIXED)
+		vcpu->stat.pf_fixed++;
+	else if (r == RET_PF_EMULATE)
+		vcpu->stat.pf_emulate++;
+	else if (r == RET_PF_SPURIOUS)
+		vcpu->stat.pf_spurious++;
+
 	if (r != RET_PF_EMULATE)
 		return 1;
 
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 8efd31b3856b..444f55a5eed7 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -337,19 +337,6 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	if (fault.write_fault_to_shadow_pgtable && emulation_type)
 		*emulation_type |= EMULTYPE_WRITE_PF_TO_SP;
 
-	/*
-	 * Similar to above, prefetch faults aren't truly spurious, and the
-	 * async #PF path doesn't do emulation.  Do count faults that are fixed
-	 * by the async #PF handler though, otherwise they'll never be counted.
-	 */
-	if (r == RET_PF_FIXED)
-		vcpu->stat.pf_fixed++;
-	else if (prefetch)
-		;
-	else if (r == RET_PF_EMULATE)
-		vcpu->stat.pf_emulate++;
-	else if (r == RET_PF_SPURIOUS)
-		vcpu->stat.pf_spurious++;
 	return r;
 }
 
-- 


* Re: [PATCH 5/6] KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
  2024-04-22 15:37   ` Xiaoyao Li
@ 2024-06-12 21:02     ` Sean Christopherson
  0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2024-06-12 21:02 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, linux-kernel, kvm, isaku.yamahata, binbin.wu,
	rick.p.edgecombe

On Mon, Apr 22, 2024, Xiaoyao Li wrote:
> On 4/19/2024 4:59 PM, Paolo Bonzini wrote:
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 10e90788b263..a045b23964c0 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -4647,6 +4647,78 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >   	return direct_page_fault(vcpu, fault);
> >   }
> > +static int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> > +		     u8 *level)

Align parameters:

static int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
			    u8 *level)

> > +{
> > +	int r;
> > +
> > +	/* Restrict to TDP page fault. */

This is fairly obvious from the code; what might not be obvious is _why_.  I'm
also ok with dropping the comment entirely, but it's easy enough to provide a
hint to the reader.

> > +	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
> > +		return -EOPNOTSUPP;
> > +
> > +retry:
> > +	r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, true, NULL, level);
> > +	if (r < 0)
> > +		return r;
> > +
> > +	switch (r) {
> > +	case RET_PF_RETRY:
> > +		if (signal_pending(current))
> > +			return -EINTR;
> > +		cond_resched();
> > +		goto retry;

Rather than a goto+retry from inside a switch statement, what about:

	int r;

	/*
	 * Pre-faulting a GPA is supported only for non-nested TDP, as indirect
	 * MMUs map either GVAs or L2 GPAs, not L1 GPAs.
	 */
	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
		return -EOPNOTSUPP;

	do {
		if (signal_pending(current))
			return -EINTR;

		cond_resched();

		r = kvm_mmu_do_page_fault(vcpu, gpa, error_code, true, NULL, level);
	} while (r == RET_PF_RETRY);

	switch (r) {
	case RET_PF_FIXED:
	case RET_PF_SPURIOUS:
		break;

	case RET_PF_EMULATE:
		return -ENOENT;

	case RET_PF_CONTINUE:
	case RET_PF_INVALID:
	case RET_PF_RETRY:
	default:
		WARN_ON_ONCE(r >= 0);
		return -EIO;
	}

	return 0;
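
To make the control flow above easier to experiment with, here is a rough
userspace model. It is a sketch, not KVM code: the enum values, fault_fn,
map_page_model() and the example handlers are invented for illustration, and
the signal_pending()/cond_resched() steps are omitted. It also keeps the
"if (r < 0) return r;" check from the original patch ahead of the switch,
which is an assumption on top of the snippet above; as written there, a
negative r would reach the default case and be turned into -EIO.

```c
#include <errno.h>

/* Illustrative stand-ins for RET_PF_*; the numeric values are not KVM's. */
enum ret_pf_model {
	MODEL_PF_CONTINUE,
	MODEL_PF_RETRY,
	MODEL_PF_EMULATE,
	MODEL_PF_INVALID,
	MODEL_PF_FIXED,
	MODEL_PF_SPURIOUS,
};

/* Stand-in for the fault handler: a MODEL_PF_* value or a negative errno. */
typedef int (*fault_fn)(void *ctx);

static int map_page_model(fault_fn do_fault, void *ctx)
{
	int r;

	/* Keep faulting for as long as the handler asks for a retry. */
	do {
		r = do_fault(ctx);
	} while (r == MODEL_PF_RETRY);

	/* Assumption: propagate real errors unchanged, as the original patch did. */
	if (r < 0)
		return r;

	switch (r) {
	case MODEL_PF_FIXED:
	case MODEL_PF_SPURIOUS:
		return 0;
	case MODEL_PF_EMULATE:
		return -ENOENT;
	default:
		/* CONTINUE and INVALID are not expected to escape the handler. */
		return -EIO;
	}
}

/* Example handler: requests a retry *ctx times, then reports FIXED. */
static int retry_then_fixed(void *ctx)
{
	int *retries_left = ctx;

	if (*retries_left > 0) {
		(*retries_left)--;
		return MODEL_PF_RETRY;
	}
	return MODEL_PF_FIXED;
}

/* Example handler: always returns the value stored in *ctx. */
static int fixed_result(void *ctx)
{
	return *(const int *)ctx;
}
```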


Thread overview: 19+ messages
2024-04-19  8:59 [PATCH v4 0/6] KVM: Guest Memory Pre-Population API Paolo Bonzini
2024-04-19  8:59 ` [PATCH 1/6] KVM: Document KVM_PRE_FAULT_MEMORY ioctl Paolo Bonzini
2024-04-22 17:55   ` Isaku Yamahata
2024-04-19  8:59 ` [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory Paolo Bonzini
2024-04-22  5:39   ` Binbin Wu
2024-04-24 16:05     ` Paolo Bonzini
2024-04-22  7:19   ` Binbin Wu
2024-04-22 18:00   ` Isaku Yamahata
2024-04-19  8:59 ` [PATCH 3/6] KVM: x86/mmu: Extract __kvm_mmu_do_page_fault() Paolo Bonzini
2024-04-22  8:46   ` Xiaoyao Li
2024-06-12 20:47     ` Sean Christopherson
2024-04-19  8:59 ` [PATCH 4/6] KVM: x86/mmu: Make __kvm_mmu_do_page_fault() return mapped level Paolo Bonzini
2024-04-19  8:59 ` [PATCH 5/6] KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() Paolo Bonzini
2024-04-22 15:37   ` Xiaoyao Li
2024-06-12 21:02     ` Sean Christopherson
2024-04-19  8:59 ` [PATCH 6/6] KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY Paolo Bonzini
2024-04-22 17:50   ` Isaku Yamahata
2024-04-23 15:18   ` Xiaoyao Li
2024-04-24  1:59     ` Xiaoyao Li
