[PATCH v7 00/11] KVM: xen: update shared_info and vcpu

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling
@ 2023-10-02  9:57 Paul Durrant
  2023-10-02  9:57 ` [PATCH v7 01/11] KVM: pfncache: add a map helper function Paul Durrant
                   ` (11 more replies)
  0 siblings, 12 replies; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, H. Peter Anvin, Borislav Petkov, Dave Hansen,
	David Woodhouse, Ingo Molnar, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, x86

From: Paul Durrant <pdurrant@amazon.com>

The following text from the original cover letter still serves as an
introduction to the series:

"Currently we treat the shared_info page as guest memory and the VMM
informs KVM of its location using a GFN. However it is not guest memory as
such; it's an overlay page. So we pointlessly invalidate and re-cache a
mapping to the *same page* of memory every time the guest requests that
shared_info be mapped into its address space. Let's avoid doing that by
modifying the pfncache code to allow activation using a fixed userspace HVA
as well as a GPA."

This version of the series is functionally the same as version 6. I have
simply added David Woodhouse's R-b to patch 11 to indicate that he has
now fully reviewed the series.

Paul Durrant (11):
  KVM: pfncache: add a map helper function
  KVM: pfncache: add a mark-dirty helper
  KVM: pfncache: add a helper to get the gpa
  KVM: pfncache: base offset check on khva rather than gpa
  KVM: pfncache: allow a cache to be activated with a fixed (userspace)
    HVA
  KVM: xen: allow shared_info to be mapped by fixed HVA
  KVM: xen: allow vcpu_info to be mapped by fixed HVA
  KVM: selftests / xen: map shared_info using HVA rather than GFN
  KVM: selftests / xen: re-map vcpu_info using HVA rather than GPA
  KVM: xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA capability
  KVM: xen: allow vcpu_info content to be 'safely' copied

 Documentation/virt/kvm/api.rst                |  53 +++++--
 arch/x86/kvm/x86.c                            |   5 +-
 arch/x86/kvm/xen.c                            |  92 +++++++++----
 include/linux/kvm_host.h                      |  43 ++++++
 include/linux/kvm_types.h                     |   3 +-
 include/uapi/linux/kvm.h                      |   9 +-
 .../selftests/kvm/x86_64/xen_shinfo_test.c    |  59 ++++++--
 virt/kvm/pfncache.c                           | 129 +++++++++++++-----
 8 files changed, 302 insertions(+), 91 deletions(-)
---
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
-- 
2.39.2


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v7 01/11] KVM: pfncache: add a map helper function
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-31 23:20   ` Sean Christopherson
  2023-10-02  9:57 ` [PATCH v7 02/11] KVM: pfncache: add a mark-dirty helper Paul Durrant
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, Sean Christopherson,
	David Woodhouse, Paolo Bonzini

From: Paul Durrant <pdurrant@amazon.com>

We have an unmap helper but mapping is open-coded. Arguably this is fine
because mapping is done in only one place, hva_to_pfn_retry(), but adding
the helper does make that function more readable.

No functional change intended.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
---
 virt/kvm/pfncache.c | 43 +++++++++++++++++++++++++------------------
 1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 2d6aba677830..0f36acdf577f 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -96,17 +96,28 @@ bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, unsigned long len)
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_check);
 
-static void gpc_unmap_khva(kvm_pfn_t pfn, void *khva)
+static void *gpc_map(kvm_pfn_t pfn)
+{
+	if (pfn_valid(pfn))
+		return kmap(pfn_to_page(pfn));
+#ifdef CONFIG_HAS_IOMEM
+	else
+		return memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
+#endif
+}
+
+static void gpc_unmap(kvm_pfn_t pfn, void *khva)
 {
 	/* Unmap the old pfn/page if it was mapped before. */
-	if (!is_error_noslot_pfn(pfn) && khva) {
-		if (pfn_valid(pfn))
-			kunmap(pfn_to_page(pfn));
+	if (is_error_noslot_pfn(pfn) || !khva)
+		return;
+
+	if (pfn_valid(pfn))
+		kunmap(pfn_to_page(pfn));
 #ifdef CONFIG_HAS_IOMEM
-		else
-			memunmap(khva);
+	else
+		memunmap(khva);
 #endif
-	}
 }
 
 static inline bool mmu_notifier_retry_cache(struct kvm *kvm, unsigned long mmu_seq)
@@ -175,7 +186,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 			 * the existing mapping and didn't create a new one.
 			 */
 			if (new_khva != old_khva)
-				gpc_unmap_khva(new_pfn, new_khva);
+				gpc_unmap(new_pfn, new_khva);
 
 			kvm_release_pfn_clean(new_pfn);
 
@@ -193,15 +204,11 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 		 * too must be done outside of gpc->lock!
 		 */
 		if (gpc->usage & KVM_HOST_USES_PFN) {
-			if (new_pfn == gpc->pfn) {
+			if (new_pfn == gpc->pfn)
 				new_khva = old_khva;
-			} else if (pfn_valid(new_pfn)) {
-				new_khva = kmap(pfn_to_page(new_pfn));
-#ifdef CONFIG_HAS_IOMEM
-			} else {
-				new_khva = memremap(pfn_to_hpa(new_pfn), PAGE_SIZE, MEMREMAP_WB);
-#endif
-			}
+			else
+				new_khva = gpc_map(new_pfn);
+
 			if (!new_khva) {
 				kvm_release_pfn_clean(new_pfn);
 				goto out_error;
@@ -326,7 +333,7 @@ static int __kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 	mutex_unlock(&gpc->refresh_lock);
 
 	if (unmap_old)
-		gpc_unmap_khva(old_pfn, old_khva);
+		gpc_unmap(old_pfn, old_khva);
 
 	return ret;
 }
@@ -412,7 +419,7 @@ void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc)
 		list_del(&gpc->list);
 		spin_unlock(&kvm->gpc_lock);
 
-		gpc_unmap_khva(old_pfn, old_khva);
+		gpc_unmap(old_pfn, old_khva);
 	}
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_deactivate);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v7 02/11] KVM: pfncache: add a mark-dirty helper
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
  2023-10-02  9:57 ` [PATCH v7 01/11] KVM: pfncache: add a map helper function Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-31 23:28   ` Sean Christopherson
  2023-10-02  9:57 ` [PATCH v7 03/11] KVM: pfncache: add a helper to get the gpa Paul Durrant
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, David Woodhouse,
	Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86

From: Paul Durrant <pdurrant@amazon.com>

At the moment pages are marked dirty by open-coded calls to
mark_page_dirty_in_slot(), directly deferefencing the gpa and memslot
from the cache. After a subsequent patch these may not always be set
so add a helper now so that caller will protected from the need to know
about this detail.

NOTE: Pages are now marked dirty while the cache lock is held. This is
      to ensure that gpa and memslot are mutually consistent.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
---
 arch/x86/kvm/x86.c       |  2 +-
 arch/x86/kvm/xen.c       | 13 ++++++-------
 include/linux/kvm_host.h |  7 +++++++
 virt/kvm/pfncache.c      |  6 ++++++
 4 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9f18b06bbda6..eee252a0afef 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3137,7 +3137,7 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
 
 	guest_hv_clock->version = ++vcpu->hv_clock.version;
 
-	mark_page_dirty_in_slot(v->kvm, gpc->memslot, gpc->gpa >> PAGE_SHIFT);
+	kvm_gpc_mark_dirty(gpc);
 	read_unlock_irqrestore(&gpc->lock, flags);
 
 	trace_kvm_pvclock_update(v->vcpu_id, &vcpu->hv_clock);
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 40edf4d1974c..33fddd29824b 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -430,14 +430,13 @@ static void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, bool atomic)
 		smp_wmb();
 	}
 
-	if (user_len2)
+	if (user_len2) {
+		kvm_gpc_mark_dirty(gpc2);
 		read_unlock(&gpc2->lock);
+	}
 
+	kvm_gpc_mark_dirty(gpc1);
 	read_unlock_irqrestore(&gpc1->lock, flags);
-
-	mark_page_dirty_in_slot(v->kvm, gpc1->memslot, gpc1->gpa >> PAGE_SHIFT);
-	if (user_len2)
-		mark_page_dirty_in_slot(v->kvm, gpc2->memslot, gpc2->gpa >> PAGE_SHIFT);
 }
 
 void kvm_xen_update_runstate(struct kvm_vcpu *v, int state)
@@ -543,13 +542,13 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v)
 			     : "0" (evtchn_pending_sel32));
 		WRITE_ONCE(vi->evtchn_upcall_pending, 1);
 	}
+
+	kvm_gpc_mark_dirty(gpc);
 	read_unlock_irqrestore(&gpc->lock, flags);
 
 	/* For the per-vCPU lapic vector, deliver it as MSI. */
 	if (v->arch.xen.upcall_vector)
 		kvm_xen_inject_vcpu_vector(v);
-
-	mark_page_dirty_in_slot(v->kvm, gpc->memslot, gpc->gpa >> PAGE_SHIFT);
 }
 
 int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index fb6c6109fdca..c71e8fbccaaf 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1367,6 +1367,13 @@ int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, unsigned long len);
  */
 void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc);
 
+/**
+ * kvm_gpc_mark_dirty - mark a cached page as dirty.
+ *
+ * @gpc:	   struct gfn_to_pfn_cache object.
+ */
+void kvm_gpc_mark_dirty(struct gfn_to_pfn_cache *gpc);
+
 void kvm_sigset_activate(struct kvm_vcpu *vcpu);
 void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
 
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 0f36acdf577f..b68ed7fa56a2 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -386,6 +386,12 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
+void kvm_gpc_mark_dirty(struct gfn_to_pfn_cache *gpc)
+{
+	mark_page_dirty_in_slot(gpc->kvm, gpc->memslot, gpc->gpa >> PAGE_SHIFT);
+}
+EXPORT_SYMBOL_GPL(kvm_gpc_mark_dirty);
+
 void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc)
 {
 	struct kvm *kvm = gpc->kvm;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v7 03/11] KVM: pfncache: add a helper to get the gpa
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
  2023-10-02  9:57 ` [PATCH v7 01/11] KVM: pfncache: add a map helper function Paul Durrant
  2023-10-02  9:57 ` [PATCH v7 02/11] KVM: pfncache: add a mark-dirty helper Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-31 23:37   ` Sean Christopherson
  2023-10-02  9:57 ` [PATCH v7 04/11] KVM: pfncache: base offset check on khva rather than gpa Paul Durrant
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, David Woodhouse,
	Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86

From: Paul Durrant <pdurrant@amazon.com>

A subsequent patch will rename this field since it will become overloaded.
To avoid churn in places that currently retrieve the gpa, add a helper for
that purpose now.

No functional change intended.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
---
 arch/x86/kvm/xen.c       | 15 ++++++++-------
 include/linux/kvm_host.h |  7 +++++++
 virt/kvm/pfncache.c      |  6 ++++++
 3 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 33fddd29824b..8e6fdcd7bb6e 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -261,8 +261,8 @@ static void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, bool atomic)
 	 * alignment (and the 32-bit ABI doesn't align the 64-bit integers
 	 * anyway, even if the overall struct had been 64-bit aligned).
 	 */
-	if ((gpc1->gpa & ~PAGE_MASK) + user_len >= PAGE_SIZE) {
-		user_len1 = PAGE_SIZE - (gpc1->gpa & ~PAGE_MASK);
+	if ((kvm_gpc_gpa(gpc1) & ~PAGE_MASK) + user_len >= PAGE_SIZE) {
+		user_len1 = PAGE_SIZE - (kvm_gpc_gpa(gpc1) & ~PAGE_MASK);
 		user_len2 = user_len - user_len1;
 	} else {
 		user_len1 = user_len;
@@ -343,7 +343,7 @@ static void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, bool atomic)
 			 * to the second page now because the guest changed to
 			 * 64-bit mode, the second GPC won't have been set up.
 			 */
-			if (kvm_gpc_activate(gpc2, gpc1->gpa + user_len1,
+			if (kvm_gpc_activate(gpc2, kvm_gpc_gpa(gpc1) + user_len1,
 					     user_len2))
 				return;
 
@@ -677,7 +677,8 @@ int kvm_xen_hvm_get_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
 
 	case KVM_XEN_ATTR_TYPE_SHARED_INFO:
 		if (kvm->arch.xen.shinfo_cache.active)
-			data->u.shared_info.gfn = gpa_to_gfn(kvm->arch.xen.shinfo_cache.gpa);
+			data->u.shared_info.gfn =
+				gpa_to_gfn(kvm_gpc_gpa(&kvm->arch.xen.shinfo_cache));
 		else
 			data->u.shared_info.gfn = KVM_XEN_INVALID_GFN;
 		r = 0;
@@ -955,7 +956,7 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 	switch (data->type) {
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO:
 		if (vcpu->arch.xen.vcpu_info_cache.active)
-			data->u.gpa = vcpu->arch.xen.vcpu_info_cache.gpa;
+			data->u.gpa = kvm_gpc_gpa(&vcpu->arch.xen.vcpu_info_cache);
 		else
 			data->u.gpa = KVM_XEN_INVALID_GPA;
 		r = 0;
@@ -963,7 +964,7 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO:
 		if (vcpu->arch.xen.vcpu_time_info_cache.active)
-			data->u.gpa = vcpu->arch.xen.vcpu_time_info_cache.gpa;
+			data->u.gpa = kvm_gpc_gpa(&vcpu->arch.xen.vcpu_time_info_cache);
 		else
 			data->u.gpa = KVM_XEN_INVALID_GPA;
 		r = 0;
@@ -975,7 +976,7 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			break;
 		}
 		if (vcpu->arch.xen.runstate_cache.active) {
-			data->u.gpa = vcpu->arch.xen.runstate_cache.gpa;
+			data->u.gpa = kvm_gpc_gpa(&vcpu->arch.xen.runstate_cache);
 			r = 0;
 		}
 		break;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c71e8fbccaaf..4d8027fe9928 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1374,6 +1374,13 @@ void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc);
  */
 void kvm_gpc_mark_dirty(struct gfn_to_pfn_cache *gpc);
 
+/**
+ * kvm_gpc_gpa - retrieve the guest physical address of a cached mapping
+ *
+ * @gpc:	   struct gfn_to_pfn_cache object.
+ */
+gpa_t kvm_gpc_gpa(struct gfn_to_pfn_cache *gpc);
+
 void kvm_sigset_activate(struct kvm_vcpu *vcpu);
 void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
 
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index b68ed7fa56a2..17afbb464a70 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -386,6 +386,12 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
+gpa_t kvm_gpc_gpa(struct gfn_to_pfn_cache *gpc)
+{
+	return gpc->gpa;
+}
+EXPORT_SYMBOL_GPL(kvm_gpc_gpa);
+
 void kvm_gpc_mark_dirty(struct gfn_to_pfn_cache *gpc)
 {
 	mark_page_dirty_in_slot(gpc->kvm, gpc->memslot, gpc->gpa >> PAGE_SHIFT);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v7 04/11] KVM: pfncache: base offset check on khva rather than gpa
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
                   ` (2 preceding siblings ...)
  2023-10-02  9:57 ` [PATCH v7 03/11] KVM: pfncache: add a helper to get the gpa Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-31 23:40   ` Sean Christopherson
  2023-10-02  9:57 ` [PATCH v7 05/11] KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA Paul Durrant
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, Sean Christopherson,
	David Woodhouse, Paolo Bonzini

From: Paul Durrant <pdurrant@amazon.com>

After a subsequent patch, the gpa may not always be set whereas khva will
(as long as the cache valid flag is also set).

No functional change intended.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
---
 virt/kvm/pfncache.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 17afbb464a70..37bcb4399780 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -83,15 +83,18 @@ bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, unsigned long len)
 	if (!gpc->active)
 		return false;
 
-	if ((gpc->gpa & ~PAGE_MASK) + len > PAGE_SIZE)
+	if (gpc->generation != slots->generation)
 		return false;
 
-	if (gpc->generation != slots->generation || kvm_is_error_hva(gpc->uhva))
+	if (kvm_is_error_hva(gpc->uhva))
 		return false;
 
 	if (!gpc->valid)
 		return false;
 
+	if (offset_in_page(gpc->khva) + len > PAGE_SIZE)
+		return false;
+
 	return true;
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_check);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v7 05/11] KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
                   ` (3 preceding siblings ...)
  2023-10-02  9:57 ` [PATCH v7 04/11] KVM: pfncache: base offset check on khva rather than gpa Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-31 23:49   ` Sean Christopherson
  2023-10-02  9:57 ` [PATCH v7 06/11] KVM: xen: allow shared_info to be mapped by fixed HVA Paul Durrant
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, Sean Christopherson, Paolo Bonzini,
	David Woodhouse

From: Paul Durrant <pdurrant@amazon.com>

Some cached pages may actually be overlays on guest memory that have a
fixed HVA within the VMM. It's pointless to invalidate such cached
mappings if the overlay is moved so allow a cache to be activated directly
with the HVA to cater for such cases. A subsequent patch will make use
of this facility.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
---
 include/linux/kvm_host.h  | 29 ++++++++++++++++
 include/linux/kvm_types.h |  3 +-
 virt/kvm/pfncache.c       | 73 ++++++++++++++++++++++++++++-----------
 3 files changed, 84 insertions(+), 21 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 4d8027fe9928..6823bae5c66c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1321,6 +1321,22 @@ void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm,
  */
 int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len);
 
+/**
+ * kvm_gpc_activate_hva - prepare a cached kernel mapping and HPA for a given HVA.
+ *
+ * @gpc:	   struct gfn_to_pfn_cache object.
+ * @hva:	   userspace virtual address to map.
+ * @len:	   sanity check; the range being access must fit a single page.
+ *
+ * @return:	   0 for success.
+ *		   -EINVAL for a mapping which would cross a page boundary.
+ *		   -EFAULT for an untranslatable guest physical address.
+ *
+ * The semantics of this function are the same as those of kvm_gpc_activate(). It
+ * merely bypasses a layer of address translation.
+ */
+int kvm_gpc_activate_hva(struct gfn_to_pfn_cache *gpc, unsigned long hva, unsigned long len);
+
 /**
  * kvm_gpc_check - check validity of a gfn_to_pfn_cache.
  *
@@ -1378,9 +1394,22 @@ void kvm_gpc_mark_dirty(struct gfn_to_pfn_cache *gpc);
  * kvm_gpc_gpa - retrieve the guest physical address of a cached mapping
  *
  * @gpc:	   struct gfn_to_pfn_cache object.
+ *
+ * @return:	   If the cache was activated with a fixed HVA then INVALID_GPA
+ *		   will be returned.
  */
 gpa_t kvm_gpc_gpa(struct gfn_to_pfn_cache *gpc);
 
+/**
+ * kvm_gpc_hva - retrieve the fixed host physical address of a cached mapping
+ *
+ * @gpc:	   struct gfn_to_pfn_cache object.
+ *
+ * @return:	   If the cache was activated with a guest physical address then
+ *		   0 will be returned.
+ */
+unsigned long kvm_gpc_hva(struct gfn_to_pfn_cache *gpc);
+
 void kvm_sigset_activate(struct kvm_vcpu *vcpu);
 void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
 
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index 6f4737d5046a..d49946ee7ae3 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -64,7 +64,7 @@ struct gfn_to_hva_cache {
 
 struct gfn_to_pfn_cache {
 	u64 generation;
-	gpa_t gpa;
+	u64 addr;
 	unsigned long uhva;
 	struct kvm_memory_slot *memslot;
 	struct kvm *kvm;
@@ -77,6 +77,7 @@ struct gfn_to_pfn_cache {
 	enum pfn_cache_usage usage;
 	bool active;
 	bool valid;
+	bool addr_is_gpa;
 };
 
 #ifdef KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 37bcb4399780..b3e3f7e38410 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -83,7 +83,7 @@ bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, unsigned long len)
 	if (!gpc->active)
 		return false;
 
-	if (gpc->generation != slots->generation)
+	if (gpc->addr_is_gpa && gpc->generation != slots->generation)
 		return false;
 
 	if (kvm_is_error_hva(gpc->uhva))
@@ -229,7 +229,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 
 	gpc->valid = true;
 	gpc->pfn = new_pfn;
-	gpc->khva = new_khva + (gpc->gpa & ~PAGE_MASK);
+	gpc->khva = new_khva + (gpc->addr & ~PAGE_MASK);
 
 	/*
 	 * Put the reference to the _new_ pfn.  The pfn is now tracked by the
@@ -246,11 +246,11 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 	return -EFAULT;
 }
 
-static int __kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa,
-			     unsigned long len)
+static int __kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, u64 addr,
+			     unsigned long len, bool addr_is_gpa)
 {
 	struct kvm_memslots *slots = kvm_memslots(gpc->kvm);
-	unsigned long page_offset = gpa & ~PAGE_MASK;
+	unsigned long page_offset = addr & ~PAGE_MASK;
 	bool unmap_old = false;
 	unsigned long old_uhva;
 	kvm_pfn_t old_pfn;
@@ -282,22 +282,34 @@ static int __kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 	old_khva = gpc->khva - offset_in_page(gpc->khva);
 	old_uhva = gpc->uhva;
 
-	/* If the userspace HVA is invalid, refresh that first */
-	if (gpc->gpa != gpa || gpc->generation != slots->generation ||
+	/*
+	 * If the address has changed, switched from guest to host (or vice
+	 * versa), or it's a guest address and the memory slots have been
+	 * updated, we need to refresh the userspace HVA.
+	 */
+	if (gpc->addr != addr ||
+	    gpc->addr_is_gpa != addr_is_gpa ||
+	    (addr_is_gpa && gpc->generation != slots->generation) ||
 	    kvm_is_error_hva(gpc->uhva)) {
-		gfn_t gfn = gpa_to_gfn(gpa);
+		gpc->addr = addr;
+		gpc->addr_is_gpa = addr_is_gpa;
 
-		gpc->gpa = gpa;
-		gpc->generation = slots->generation;
-		gpc->memslot = __gfn_to_memslot(slots, gfn);
-		gpc->uhva = gfn_to_hva_memslot(gpc->memslot, gfn);
+		if (addr_is_gpa) {
+			gfn_t gfn = gpa_to_gfn(addr);
 
-		if (kvm_is_error_hva(gpc->uhva)) {
-			ret = -EFAULT;
-			goto out;
+			gpc->generation = slots->generation;
+			gpc->memslot = __gfn_to_memslot(slots, gfn);
+			gpc->uhva = gfn_to_hva_memslot(gpc->memslot, gfn);
+		} else {
+			gpc->uhva = addr & PAGE_MASK;
 		}
 	}
 
+	if (kvm_is_error_hva(gpc->uhva)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
 	/*
 	 * If the userspace HVA changed or the PFN was already invalid,
 	 * drop the lock and do the HVA to PFN lookup again.
@@ -343,7 +355,7 @@ static int __kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 
 int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, unsigned long len)
 {
-	return __kvm_gpc_refresh(gpc, gpc->gpa, len);
+	return __kvm_gpc_refresh(gpc, gpc->addr, len, gpc->addr_is_gpa);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_refresh);
 
@@ -364,7 +376,8 @@ void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm,
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_init);
 
-int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
+static int __kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, u64 addr, unsigned long len,
+			      bool addr_is_gpa)
 {
 	struct kvm *kvm = gpc->kvm;
 
@@ -385,19 +398,39 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
 		gpc->active = true;
 		write_unlock_irq(&gpc->lock);
 	}
-	return __kvm_gpc_refresh(gpc, gpa, len);
+	return __kvm_gpc_refresh(gpc, addr, len, addr_is_gpa);
+}
+
+int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
+{
+	return __kvm_gpc_activate(gpc, gpa, len, true);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
 gpa_t kvm_gpc_gpa(struct gfn_to_pfn_cache *gpc)
 {
-	return gpc->gpa;
+	return gpc->addr_is_gpa ? gpc->addr : INVALID_GPA;
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_gpa);
 
+int kvm_gpc_activate_hva(struct gfn_to_pfn_cache *gpc, unsigned long hva, unsigned long len)
+{
+	return __kvm_gpc_activate(gpc, hva, len, false);
+}
+EXPORT_SYMBOL_GPL(kvm_gpc_activate_hva);
+
+unsigned long kvm_gpc_hva(struct gfn_to_pfn_cache *gpc)
+{
+	return !gpc->addr_is_gpa ? gpc->addr : 0;
+}
+EXPORT_SYMBOL_GPL(kvm_gpc_hva);
+
 void kvm_gpc_mark_dirty(struct gfn_to_pfn_cache *gpc)
 {
-	mark_page_dirty_in_slot(gpc->kvm, gpc->memslot, gpc->gpa >> PAGE_SHIFT);
+	if (!gpc->addr_is_gpa)
+		return;
+
+	mark_page_dirty_in_slot(gpc->kvm, gpc->memslot, gpc->addr >> PAGE_SHIFT);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_mark_dirty);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v7 06/11] KVM: xen: allow shared_info to be mapped by fixed HVA
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
                   ` (4 preceding siblings ...)
  2023-10-02  9:57 ` [PATCH v7 05/11] KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-31 23:52   ` Sean Christopherson
  2023-10-02  9:57 ` [PATCH v7 07/11] KVM: xen: allow vcpu_info " Paul Durrant
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, David Woodhouse,
	Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86

From: Paul Durrant <pdurrant@amazon.com>

The shared_info page is not guest memory as such. It is a dedicated page
allocated by the VMM and overlaid onto guest memory in a GFN chosen by the
guest and specified in the XENMEM_add_to_physmap hypercall. The guest may
even request that shared_info be moved from one GFN to another by
re-issuing that hypercall, but the HVA is never going to change.

Because the shared_info page is an overlay we need to update the memory
slots in response to the hypercall. However, memory slot adjustment is not
atomic and, whilst all vCPUs are paused, there is still the possibility
that events may be delivered (which requires the shared_info page to be
updated) whilst the shared_info GPA is absent. The HVA is never absent
though, so it makes much more sense to use that as the basis for the
kernel's mapping.

Hence add a new KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA attribute type for this
purpose and a KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag to advertize its
availability. Don't actually advertize it yet though. That will be done in
a subsequent patch, which will also add tests for the new attribute type.

Also update the KVM API documentation with the new attribute and also fix
it up to consistently refer to 'shared_info' (with the underscore).

NOTE: The change of the kvm_xen_hvm_attr shared_info from struct to union
      is technically an ABI change but it's entirely compatible with
      existing users.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org

v2:
 - Define the new attribute and capability but don't advertize the
   capability yet.
 - Add API documentation.
---
 Documentation/virt/kvm/api.rst | 25 +++++++++++++++++++------
 arch/x86/kvm/xen.c             | 28 ++++++++++++++++++++++------
 include/uapi/linux/kvm.h       |  6 +++++-
 3 files changed, 46 insertions(+), 13 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 21a7578142a1..e9df4df6fe48 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -353,7 +353,7 @@ The bits in the dirty bitmap are cleared before the ioctl returns, unless
 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is enabled.  For more information,
 see the description of the capability.
 
-Note that the Xen shared info page, if configured, shall always be assumed
+Note that the Xen shared_info page, if configured, shall always be assumed
 to be dirty. KVM will not explicitly mark it such.
 
 
@@ -5408,8 +5408,9 @@ KVM_PV_ASYNC_CLEANUP_PERFORM
 		__u8 long_mode;
 		__u8 vector;
 		__u8 runstate_update_flag;
-		struct {
+		union {
 			__u64 gfn;
+			__u64 hva;
 		} shared_info;
 		struct {
 			__u32 send_port;
@@ -5437,10 +5438,10 @@ type values:
 
 KVM_XEN_ATTR_TYPE_LONG_MODE
   Sets the ABI mode of the VM to 32-bit or 64-bit (long mode). This
-  determines the layout of the shared info pages exposed to the VM.
+  determines the layout of the shared_info page exposed to the VM.
 
 KVM_XEN_ATTR_TYPE_SHARED_INFO
-  Sets the guest physical frame number at which the Xen "shared info"
+  Sets the guest physical frame number at which the Xen shared_info
   page resides. Note that although Xen places vcpu_info for the first
   32 vCPUs in the shared_info page, KVM does not automatically do so
   and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used
@@ -5449,7 +5450,7 @@ KVM_XEN_ATTR_TYPE_SHARED_INFO
   not be aware of the Xen CPU id which is used as the index into the
   vcpu_info[] array, so may know the correct default location.
 
-  Note that the shared info page may be constantly written to by KVM;
+  Note that the shared_info page may be constantly written to by KVM;
   it contains the event channel bitmap used to deliver interrupts to
   a Xen guest, amongst other things. It is exempt from dirty tracking
   mechanisms — KVM will not explicitly mark the page as dirty each
@@ -5458,9 +5459,21 @@ KVM_XEN_ATTR_TYPE_SHARED_INFO
   any vCPU has been running or any event channel interrupts can be
   routed to the guest.
 
-  Setting the gfn to KVM_XEN_INVALID_GFN will disable the shared info
+  Setting the gfn to KVM_XEN_INVALID_GFN will disable the shared_info
   page.
 
+KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA
+  If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
+  Xen capabilities, then this attribute may be used to set the
+  userspace address at which the shared_info page resides, which
+  will always be fixed in the VMM regardless of where it is mapped
+  in guest physical address space. This attribute should be used in
+  preference to KVM_XEN_ATTR_TYPE_SHARED_INFO as it avoids
+  unnecessary invalidation of an internal cache when the page is
+  re-mapped in guest physcial address space.
+
+  Setting the hva to zero will disable the shared_info page.
+
 KVM_XEN_ATTR_TYPE_UPCALL_VECTOR
   Sets the exception vector used to deliver Xen event channel upcalls.
   This is the HVM-wide vector injected directly by the hypervisor
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 8e6fdcd7bb6e..1abb4547642a 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -34,24 +34,27 @@ static bool kvm_xen_hcall_evtchn_send(struct kvm_vcpu *vcpu, u64 param, u64 *r);
 
 DEFINE_STATIC_KEY_DEFERRED_FALSE(kvm_xen_enabled, HZ);
 
-static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn)
+static int kvm_xen_shared_info_init(struct kvm *kvm, u64 addr, bool addr_is_gfn)
 {
 	struct gfn_to_pfn_cache *gpc = &kvm->arch.xen.shinfo_cache;
 	struct pvclock_wall_clock *wc;
-	gpa_t gpa = gfn_to_gpa(gfn);
 	u32 *wc_sec_hi;
 	u32 wc_version;
 	u64 wall_nsec;
 	int ret = 0;
 	int idx = srcu_read_lock(&kvm->srcu);
 
-	if (gfn == KVM_XEN_INVALID_GFN) {
+	if ((addr_is_gfn && addr == KVM_XEN_INVALID_GFN) ||
+	    (!addr_is_gfn && addr == 0)) {
 		kvm_gpc_deactivate(gpc);
 		goto out;
 	}
 
 	do {
-		ret = kvm_gpc_activate(gpc, gpa, PAGE_SIZE);
+		if (addr_is_gfn)
+			ret = kvm_gpc_activate(gpc, gfn_to_gpa(addr), PAGE_SIZE);
+		else
+			ret = kvm_gpc_activate_hva(gpc, addr, PAGE_SIZE);
 		if (ret)
 			goto out;
 
@@ -604,7 +607,6 @@ int kvm_xen_hvm_set_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
 {
 	int r = -ENOENT;
 
-
 	switch (data->type) {
 	case KVM_XEN_ATTR_TYPE_LONG_MODE:
 		if (!IS_ENABLED(CONFIG_64BIT) && data->u.long_mode) {
@@ -619,7 +621,13 @@ int kvm_xen_hvm_set_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
 
 	case KVM_XEN_ATTR_TYPE_SHARED_INFO:
 		mutex_lock(&kvm->arch.xen.xen_lock);
-		r = kvm_xen_shared_info_init(kvm, data->u.shared_info.gfn);
+		r = kvm_xen_shared_info_init(kvm, data->u.shared_info.gfn, true);
+		mutex_unlock(&kvm->arch.xen.xen_lock);
+		break;
+
+	case KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA:
+		mutex_lock(&kvm->arch.xen.xen_lock);
+		r = kvm_xen_shared_info_init(kvm, data->u.shared_info.hva, false);
 		mutex_unlock(&kvm->arch.xen.xen_lock);
 		break;
 
@@ -684,6 +692,14 @@ int kvm_xen_hvm_get_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
 		r = 0;
 		break;
 
+	case KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA:
+		if (kvm->arch.xen.shinfo_cache.active)
+			data->u.shared_info.hva = kvm_gpc_hva(&kvm->arch.xen.shinfo_cache);
+		else
+			data->u.shared_info.hva = 0;
+		r = 0;
+		break;
+
 	case KVM_XEN_ATTR_TYPE_UPCALL_VECTOR:
 		data->u.vector = kvm->arch.xen.upcall_vector;
 		r = 0;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 13065dd96132..062bfa14b4d9 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1282,6 +1282,7 @@ struct kvm_x86_mce {
 #define KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL	(1 << 4)
 #define KVM_XEN_HVM_CONFIG_EVTCHN_SEND		(1 << 5)
 #define KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG	(1 << 6)
+#define KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA	(1 << 7)
 
 struct kvm_xen_hvm_config {
 	__u32 flags;
@@ -1793,9 +1794,10 @@ struct kvm_xen_hvm_attr {
 		__u8 long_mode;
 		__u8 vector;
 		__u8 runstate_update_flag;
-		struct {
+		union {
 			__u64 gfn;
 #define KVM_XEN_INVALID_GFN ((__u64)-1)
+			__u64 hva;
 		} shared_info;
 		struct {
 			__u32 send_port;
@@ -1837,6 +1839,8 @@ struct kvm_xen_hvm_attr {
 #define KVM_XEN_ATTR_TYPE_XEN_VERSION		0x4
 /* Available with KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG */
 #define KVM_XEN_ATTR_TYPE_RUNSTATE_UPDATE_FLAG	0x5
+/* Available with KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA */
+#define KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA	0x6
 
 /* Per-vCPU Xen attributes */
 #define KVM_XEN_VCPU_GET_ATTR	_IOWR(KVMIO, 0xca, struct kvm_xen_vcpu_attr)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v7 07/11] KVM: xen: allow vcpu_info to be mapped by fixed HVA
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
                   ` (5 preceding siblings ...)
  2023-10-02  9:57 ` [PATCH v7 06/11] KVM: xen: allow shared_info to be mapped by fixed HVA Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-02  9:57 ` [PATCH v7 08/11] KVM: selftests / xen: map shared_info using HVA rather than GFN Paul Durrant
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, David Woodhouse,
	Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86

From: Paul Durrant <pdurrant@amazon.com>

If the guest does not explicitly set the GPA of vcpu_info structure in
memory then, for guests with 32 vCPUs or fewer, the vcpu_info embedded
in the shared_info page may be used. As described in a previous commit,
the shared_info page is an overlay at a fixed HVA within the VMM, so in
this case it also more optimal to activate the vcpu_info cache with a
fixed HVA to avoid unnecessary invalidation if the guest memory layout
is modified.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org

v5:
 - New in this version.
---
 Documentation/virt/kvm/api.rst | 26 +++++++++++++++++++++-----
 arch/x86/kvm/xen.c             | 33 +++++++++++++++++++++++++++------
 include/uapi/linux/kvm.h       |  3 +++
 3 files changed, 51 insertions(+), 11 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e9df4df6fe48..5adc6dfc8c6e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5444,11 +5444,12 @@ KVM_XEN_ATTR_TYPE_SHARED_INFO
   Sets the guest physical frame number at which the Xen shared_info
   page resides. Note that although Xen places vcpu_info for the first
   32 vCPUs in the shared_info page, KVM does not automatically do so
-  and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used
-  explicitly even when the vcpu_info for a given vCPU resides at the
-  "default" location in the shared_info page. This is because KVM may
-  not be aware of the Xen CPU id which is used as the index into the
-  vcpu_info[] array, so may know the correct default location.
+  and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO or
+  KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA be used explicitly even when
+  the vcpu_info for a given vCPU resides at the "default" location
+  in the shared_info page. This is because KVM may not be aware of
+  the Xen CPU id which is used as the index into the vcpu_info[]
+  array, so may know the correct default location.
 
   Note that the shared_info page may be constantly written to by KVM;
   it contains the event channel bitmap used to deliver interrupts to
@@ -5570,6 +5571,21 @@ KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
   on dirty logging. Setting the gpa to KVM_XEN_INVALID_GPA will disable
   the vcpu_info.
 
+KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA
+  If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
+  Xen capabilities, then this attribute may be used to set the
+  userspace address of the vcpu_info for a given vCPU. It should
+  only be used when the vcpu_info resides at the "default" location
+  in the shared_info page. In this case it is safe to assume the
+  userspace address will not change, because the shared_info page is
+  an overlay on guest memory and remains at a fixed host address
+  regardless of where it is mapped in guest physical address space
+  and hence unnecessary invalidation of an internal cache may be
+  avoided if the guest memory layout is modified.
+  If the vcpu_info does not reside at the "default" location then
+  it is not guaranteed to remain at the same host address and
+  hence the aforementioned cache invalidation is required.
+
 KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO
   Sets the guest physical address of an additional pvclock structure
   for a given vCPU. This is typically used for guest vsyscall support.
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 1abb4547642a..aafc794940e4 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -736,20 +736,33 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 
 	switch (data->type) {
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO:
+	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA:
 		/* No compat necessary here. */
 		BUILD_BUG_ON(sizeof(struct vcpu_info) !=
 			     sizeof(struct compat_vcpu_info));
 		BUILD_BUG_ON(offsetof(struct vcpu_info, time) !=
 			     offsetof(struct compat_vcpu_info, time));
 
-		if (data->u.gpa == KVM_XEN_INVALID_GPA) {
-			kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache);
-			r = 0;
-			break;
+		if (data->type == KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO) {
+			if (data->u.gpa == KVM_XEN_INVALID_GPA) {
+				kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache);
+				r = 0;
+				break;
+			}
+
+			r = kvm_gpc_activate(&vcpu->arch.xen.vcpu_info_cache,
+					     data->u.gpa, sizeof(struct vcpu_info));
+		} else {
+			if (data->u.hva == 0) {
+				kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache);
+				r = 0;
+				break;
+			}
+
+			r = kvm_gpc_activate_hva(&vcpu->arch.xen.vcpu_info_cache,
+						 data->u.hva, sizeof(struct vcpu_info));
 		}
 
-		r = kvm_gpc_activate(&vcpu->arch.xen.vcpu_info_cache,
-				     data->u.gpa, sizeof(struct vcpu_info));
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 
@@ -978,6 +991,14 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 		r = 0;
 		break;
 
+	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA:
+		if (vcpu->arch.xen.vcpu_info_cache.active)
+			data->u.hva = kvm_gpc_hva(&vcpu->arch.xen.vcpu_info_cache);
+		else
+			data->u.hva = 0;
+		r = 0;
+		break;
+
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO:
 		if (vcpu->arch.xen.vcpu_time_info_cache.active)
 			data->u.gpa = kvm_gpc_gpa(&vcpu->arch.xen.vcpu_time_info_cache);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 062bfa14b4d9..0267c2ef43de 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1858,6 +1858,7 @@ struct kvm_xen_vcpu_attr {
 	union {
 		__u64 gpa;
 #define KVM_XEN_INVALID_GPA ((__u64)-1)
+		__u64 hva;
 		__u64 pad[8];
 		struct {
 			__u64 state;
@@ -1888,6 +1889,8 @@ struct kvm_xen_vcpu_attr {
 #define KVM_XEN_VCPU_ATTR_TYPE_VCPU_ID		0x6
 #define KVM_XEN_VCPU_ATTR_TYPE_TIMER		0x7
 #define KVM_XEN_VCPU_ATTR_TYPE_UPCALL_VECTOR	0x8
+/* Available with KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA */
+#define KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA	0x9
 
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v7 08/11] KVM: selftests / xen: map shared_info using HVA rather than GFN
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
                   ` (6 preceding siblings ...)
  2023-10-02  9:57 ` [PATCH v7 07/11] KVM: xen: allow vcpu_info " Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-02  9:57 ` [PATCH v7 09/11] KVM: selftests / xen: re-map vcpu_info using HVA rather than GPA Paul Durrant
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, Sean Christopherson, Paolo Bonzini,
	David Woodhouse

From: Paul Durrant <pdurrant@amazon.com>

Using the HVA of the shared_info page is more efficient, so if the
capability (KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA) is present use that method
to do the mapping.

NOTE: Have the juggle_shinfo_state() thread map and unmap using both
      GFN and HVA, to make sure the older mechanism is not broken.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>

v3:
 - Re-work the juggle_shinfo_state() thread

v2:
 - New in this version.
---
 .../selftests/kvm/x86_64/xen_shinfo_test.c    | 44 +++++++++++++++----
 1 file changed, 35 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
index 05898ad9f4d9..e6672ae1d9de 100644
--- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
@@ -393,6 +393,7 @@ static int cmp_timespec(struct timespec *a, struct timespec *b)
 		return 0;
 }
 
+static struct shared_info *shinfo;
 static struct vcpu_info *vinfo;
 static struct kvm_vcpu *vcpu;
 
@@ -408,20 +409,38 @@ static void *juggle_shinfo_state(void *arg)
 {
 	struct kvm_vm *vm = (struct kvm_vm *)arg;
 
-	struct kvm_xen_hvm_attr cache_activate = {
+	struct kvm_xen_hvm_attr cache_activate_gfn = {
 		.type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
 		.u.shared_info.gfn = SHINFO_REGION_GPA / PAGE_SIZE
 	};
 
-	struct kvm_xen_hvm_attr cache_deactivate = {
+	struct kvm_xen_hvm_attr cache_deactivate_gfn = {
 		.type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
 		.u.shared_info.gfn = KVM_XEN_INVALID_GFN
 	};
 
+	struct kvm_xen_hvm_attr cache_activate_hva = {
+		.type = KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA,
+		.u.shared_info.hva = (unsigned long)shinfo
+	};
+
+	struct kvm_xen_hvm_attr cache_deactivate_hva = {
+		.type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
+		.u.shared_info.hva = 0
+	};
+
+	int xen_caps = kvm_check_cap(KVM_CAP_XEN_HVM);
+
 	for (;;) {
-		__vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &cache_activate);
-		__vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &cache_deactivate);
+		__vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &cache_activate_gfn);
 		pthread_testcancel();
+		__vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &cache_deactivate_gfn);
+
+		if (xen_caps & KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA) {
+			__vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &cache_activate_hva);
+			pthread_testcancel();
+			__vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &cache_deactivate_hva);
+		}
 	}
 
 	return NULL;
@@ -446,6 +465,7 @@ int main(int argc, char *argv[])
 	bool do_runstate_flag = !!(xen_caps & KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG);
 	bool do_eventfd_tests = !!(xen_caps & KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL);
 	bool do_evtchn_tests = do_eventfd_tests && !!(xen_caps & KVM_XEN_HVM_CONFIG_EVTCHN_SEND);
+	bool has_shinfo_hva = !!(xen_caps & KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA);
 
 	clock_gettime(CLOCK_REALTIME, &min_ts);
 
@@ -456,7 +476,7 @@ int main(int argc, char *argv[])
 				    SHINFO_REGION_GPA, SHINFO_REGION_SLOT, 3, 0);
 	virt_map(vm, SHINFO_REGION_GVA, SHINFO_REGION_GPA, 3);
 
-	struct shared_info *shinfo = addr_gpa2hva(vm, SHINFO_VADDR);
+	shinfo = addr_gpa2hva(vm, SHINFO_VADDR);
 
 	int zero_fd = open("/dev/zero", O_RDONLY);
 	TEST_ASSERT(zero_fd != -1, "Failed to open /dev/zero");
@@ -492,10 +512,16 @@ int main(int argc, char *argv[])
 			    "Failed to read back RUNSTATE_UPDATE_FLAG attr");
 	}
 
-	struct kvm_xen_hvm_attr ha = {
-		.type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
-		.u.shared_info.gfn = SHINFO_REGION_GPA / PAGE_SIZE,
-	};
+	struct kvm_xen_hvm_attr ha = {};
+
+	if (has_shinfo_hva) {
+		ha.type = KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA;
+		ha.u.shared_info.hva = (unsigned long)shinfo;
+	} else {
+		ha.type = KVM_XEN_ATTR_TYPE_SHARED_INFO;
+		ha.u.shared_info.gfn = SHINFO_ADDR / PAGE_SIZE;
+	}
+
 	vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &ha);
 
 	/*
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v7 09/11] KVM: selftests / xen: re-map vcpu_info using HVA rather than GPA
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
                   ` (7 preceding siblings ...)
  2023-10-02  9:57 ` [PATCH v7 08/11] KVM: selftests / xen: map shared_info using HVA rather than GFN Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-02  9:57 ` [PATCH v7 10/11] KVM: xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA capability Paul Durrant
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, Sean Christopherson, Paolo Bonzini,
	David Woodhouse

From: Paul Durrant <pdurrant@amazon.com>

If the relevant capability (KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA) is present
then re-map vcpu_info using the HVA part way through the tests to make sure
then there is no functional change.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>

v5:
 - New in this version.
---
 .../selftests/kvm/x86_64/xen_shinfo_test.c        | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
index e6672ae1d9de..a5d3aea8fd95 100644
--- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
@@ -66,6 +66,7 @@ enum {
 	TEST_POLL_TIMEOUT,
 	TEST_POLL_MASKED,
 	TEST_POLL_WAKE,
+	SET_VCPU_INFO,
 	TEST_TIMER_PAST,
 	TEST_LOCKING_SEND_RACE,
 	TEST_LOCKING_POLL_RACE,
@@ -325,6 +326,10 @@ static void guest_code(void)
 
 	GUEST_SYNC(TEST_POLL_WAKE);
 
+	/* Set the vcpu_info to point at exactly the place it already is to
+	 * make sure the attribute is functional. */
+	GUEST_SYNC(SET_VCPU_INFO);
+
 	/* A timer wake an *unmasked* port which should wake us with an
 	 * actual interrupt, while we're polling on a different port. */
 	ports[0]++;
@@ -892,6 +897,16 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
+			case SET_VCPU_INFO:
+				if (has_shinfo_hva) {
+					struct kvm_xen_vcpu_attr vih = {
+						.type = KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA,
+						.u.hva = (unsigned long)vinfo
+					};
+					vcpu_ioctl(vcpu, KVM_XEN_VCPU_SET_ATTR, &vih);
+				}
+				break;
+
 			case TEST_TIMER_PAST:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v7 10/11] KVM: xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA capability
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
                   ` (8 preceding siblings ...)
  2023-10-02  9:57 ` [PATCH v7 09/11] KVM: selftests / xen: re-map vcpu_info using HVA rather than GPA Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-02  9:57 ` [PATCH v7 11/11] KVM: xen: allow vcpu_info content to be 'safely' copied Paul Durrant
  2023-10-05  6:41 ` [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling David Woodhouse
  11 siblings, 0 replies; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, David Woodhouse, x86

From: Paul Durrant <pdurrant@amazon.com>

Now that all relevant kernel changes and selftests are in place, enable the
new capability.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: x86@kernel.org

v2:
 - New in this version.
---
 arch/x86/kvm/x86.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index eee252a0afef..1487b679ae45 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4531,7 +4531,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		    KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL |
 		    KVM_XEN_HVM_CONFIG_SHARED_INFO |
 		    KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL |
-		    KVM_XEN_HVM_CONFIG_EVTCHN_SEND;
+		    KVM_XEN_HVM_CONFIG_EVTCHN_SEND |
+		    KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA;
 		if (sched_info_on())
 			r |= KVM_XEN_HVM_CONFIG_RUNSTATE |
 			     KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v7 11/11] KVM: xen: allow vcpu_info content to be 'safely' copied
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
                   ` (9 preceding siblings ...)
  2023-10-02  9:57 ` [PATCH v7 10/11] KVM: xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA capability Paul Durrant
@ 2023-10-02  9:57 ` Paul Durrant
  2023-10-31 23:58   ` Sean Christopherson
  2023-10-05  6:41 ` [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling David Woodhouse
  11 siblings, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-10-02  9:57 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paul Durrant, David Woodhouse, David Woodhouse,
	Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86

From: Paul Durrant <pdurrant@amazon.com>

If the guest sets an explicit vcpu_info GPA then, for any of the first 32
vCPUs, the content of the default vcpu_info in the shared_info page must be
copied into the new location. Because this copy may race with event
delivery (which updates the 'evtchn_pending_sel' field in vcpu_info) we
need a way to defer that until the copy is complete.
Happily there is already a shadow of 'evtchn_pending_sel' in kvm_vcpu_xen
that is used in atomic context if the vcpu_info PFN cache has been
invalidated so that the update of vcpu_info can be deferred until the
cache can be refreshed (on vCPU thread's the way back into guest context).
So let's also use this shadow if the vcpu_info cache has been
*deactivated*, so that the VMM can safely copy the vcpu_info content and
then re-activate the cache with the new GPA. To do this, all we need to do
is stop considering an inactive vcpu_info cache as a hard error in
kvm_xen_set_evtchn_fast().

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org

v6:
 - New in this version.
---
 arch/x86/kvm/xen.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index aafc794940e4..e645066217bb 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1606,9 +1606,6 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
 	}

-	if (!vcpu->arch.xen.vcpu_info_cache.active)
-		return -EINVAL;
-
 	if (xe->port >= max_evtchn_port(kvm))
 		return -EINVAL;

-- 
2.39.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling
  2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
                   ` (10 preceding siblings ...)
  2023-10-02  9:57 ` [PATCH v7 11/11] KVM: xen: allow vcpu_info content to be 'safely' copied Paul Durrant
@ 2023-10-05  6:41 ` David Woodhouse
  2023-10-05  8:36   ` Paul Durrant
  2023-10-30 12:00   ` Paul Durrant
  11 siblings, 2 replies; 34+ messages in thread
From: David Woodhouse @ 2023-10-05  6:41 UTC (permalink / raw)
  To: Paul Durrant, kvm, linux-kernel
  Cc: Paul Durrant, H. Peter Anvin, Borislav Petkov, Dave Hansen,
	Ingo Molnar, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	x86

[-- Attachment #1: Type: text/plain, Size: 1106 bytes --]

On Mon, 2023-10-02 at 09:57 +0000, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> The following text from the original cover letter still serves as an
> introduction to the series:
> 
> "Currently we treat the shared_info page as guest memory and the VMM
> informs KVM of its location using a GFN. However it is not guest memory as
> such; it's an overlay page. So we pointlessly invalidate and re-cache a
> mapping to the *same page* of memory every time the guest requests that
> shared_info be mapped into its address space. Let's avoid doing that by
> modifying the pfncache code to allow activation using a fixed userspace HVA
> as well as a GPA."
> 
> This version of the series is functionally the same as version 6. I have
> simply added David Woodhouse's R-b to patch 11 to indicate that he has
> now fully reviewed the series.

Thanks. I believe Sean is probably waiting for us to stop going back
and forth, and for the dust to settle. So for the record: I think I'm
done heckling and this is ready to go in.

Are you doing the QEMU patches or am I?


[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling
  2023-10-05  6:41 ` [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling David Woodhouse
@ 2023-10-05  8:36   ` Paul Durrant
  2023-11-09 10:02     ` David Woodhouse
  2023-10-30 12:00   ` Paul Durrant
  1 sibling, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-10-05  8:36 UTC (permalink / raw)
  To: David Woodhouse, kvm, linux-kernel
  Cc: Paul Durrant, H. Peter Anvin, Borislav Petkov, Dave Hansen,
	Ingo Molnar, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	x86

On 05/10/2023 07:41, David Woodhouse wrote:
> On Mon, 2023-10-02 at 09:57 +0000, Paul Durrant wrote:
>> From: Paul Durrant <pdurrant@amazon.com>
>>
>> The following text from the original cover letter still serves as an
>> introduction to the series:
>>
>> "Currently we treat the shared_info page as guest memory and the VMM
>> informs KVM of its location using a GFN. However it is not guest memory as
>> such; it's an overlay page. So we pointlessly invalidate and re-cache a
>> mapping to the *same page* of memory every time the guest requests that
>> shared_info be mapped into its address space. Let's avoid doing that by
>> modifying the pfncache code to allow activation using a fixed userspace HVA
>> as well as a GPA."
>>
>> This version of the series is functionally the same as version 6. I have
>> simply added David Woodhouse's R-b to patch 11 to indicate that he has
>> now fully reviewed the series.
> 
> Thanks. I believe Sean is probably waiting for us to stop going back
> and forth, and for the dust to settle. So for the record: I think I'm
> done heckling and this is ready to go in.
> 
> Are you doing the QEMU patches or am I?
> 

I'll do the QEMU changes, once the patches hit kvm/next.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling
  2023-10-05  6:41 ` [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling David Woodhouse
  2023-10-05  8:36   ` Paul Durrant
@ 2023-10-30 12:00   ` Paul Durrant
  1 sibling, 0 replies; 34+ messages in thread
From: Paul Durrant @ 2023-10-30 12:00 UTC (permalink / raw)
  To: David Woodhouse, kvm, linux-kernel
  Cc: Paul Durrant, H. Peter Anvin, Borislav Petkov, Dave Hansen,
	Ingo Molnar, Paolo Bonzini, Thomas Gleixner, x86

On 05/10/2023 07:41, David Woodhouse wrote:
> On Mon, 2023-10-02 at 09:57 +0000, Paul Durrant wrote:
>> From: Paul Durrant <pdurrant@amazon.com>
>>
>> The following text from the original cover letter still serves as an
>> introduction to the series:
>>
>> "Currently we treat the shared_info page as guest memory and the VMM
>> informs KVM of its location using a GFN. However it is not guest memory as
>> such; it's an overlay page. So we pointlessly invalidate and re-cache a
>> mapping to the *same page* of memory every time the guest requests that
>> shared_info be mapped into its address space. Let's avoid doing that by
>> modifying the pfncache code to allow activation using a fixed userspace HVA
>> as well as a GPA."
>>
>> This version of the series is functionally the same as version 6. I have
>> simply added David Woodhouse's R-b to patch 11 to indicate that he has
>> now fully reviewed the series.
> 
> Thanks. I believe Sean is probably waiting for us to stop going back
> and forth, and for the dust to settle. So for the record: I think I'm
> done heckling and this is ready to go in.
> 

Nudge.

Sean, is there anything more I need to do on this series?

   Paul


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 01/11] KVM: pfncache: add a map helper function
  2023-10-02  9:57 ` [PATCH v7 01/11] KVM: pfncache: add a map helper function Paul Durrant
@ 2023-10-31 23:20   ` Sean Christopherson
  2023-11-02 17:09     ` Paul Durrant
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2023-10-31 23:20 UTC (permalink / raw)
  To: Paul Durrant
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini

On Mon, Oct 02, 2023, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>

Please make the changelog standalone, i.e. don't rely on the shortlog to provide
context.  Yeah, it can be silly and repetive sometimes, particularly when viewing
git commits where the shortlog+changelog are bundled fairly close together, but
when viewing patches in a mail client, e.g. when I'm doing initial review, the
shortlog is in the subject which may be far away or even completely hidden (as is
the case as I'm typing this).

I could have sworn I included this in Documentation/process/maintainer-kvm-x86.rst,
but I'm not finding it.

> We have an unmap helper but mapping is open-coded. Arguably this is fine

Pronouns.

> because mapping is done in only one place, hva_to_pfn_retry(), but adding
> the helper does make that function more readable.
> 
> No functional change intended.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  virt/kvm/pfncache.c | 43 +++++++++++++++++++++++++------------------
>  1 file changed, 25 insertions(+), 18 deletions(-)
> 
> diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
> index 2d6aba677830..0f36acdf577f 100644
> --- a/virt/kvm/pfncache.c
> +++ b/virt/kvm/pfncache.c
> @@ -96,17 +96,28 @@ bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, unsigned long len)
>  }
>  EXPORT_SYMBOL_GPL(kvm_gpc_check);
>  
> -static void gpc_unmap_khva(kvm_pfn_t pfn, void *khva)
> +static void *gpc_map(kvm_pfn_t pfn)
> +{
> +	if (pfn_valid(pfn))
> +		return kmap(pfn_to_page(pfn));
> +#ifdef CONFIG_HAS_IOMEM
> +	else

There's no need for the "else", the happy path is terminal.

> +		return memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
> +#endif

This needs a return for CONFIG_HAS_IOMEM=n.  I haven't tried to compile, but I'm
guessing s390 won't be happy.

This?

static void *gpc_map(kvm_pfn_t pfn)
{
	if (pfn_valid(pfn))
		return kmap(pfn_to_page(pfn));

#ifdef CONFIG_HAS_IOMEM
	return memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
#else
	return NULL;
#endif
}

> +}
> +
> +static void gpc_unmap(kvm_pfn_t pfn, void *khva)
>  {
>  	/* Unmap the old pfn/page if it was mapped before. */
> -	if (!is_error_noslot_pfn(pfn) && khva) {
> -		if (pfn_valid(pfn))
> -			kunmap(pfn_to_page(pfn));
> +	if (is_error_noslot_pfn(pfn) || !khva)
> +		return;
> +
> +	if (pfn_valid(pfn))
> +		kunmap(pfn_to_page(pfn));
>  #ifdef CONFIG_HAS_IOMEM
> -		else
> -			memunmap(khva);
> +	else
> +		memunmap(khva);
>  #endif

I don't mind the refactoring, but it needs to be at least mentioned in the
changelog.  And if we're going to bother, it probably makes sense to add a WARN
in the CONFIG_HAS_IOMEM=n path, e.g.

	/* Unmap the old pfn/page if it was mapped before. */
	if (is_error_noslot_pfn(pfn) || !khva)
		return;

	if (pfn_valid(pfn))
		kunmap(pfn_to_page(pfn));
	else
#ifdef CONFIG_HAS_IOMEM
		memunmap(khva);
#else
		WARN_ON_ONCE(1);
#endif


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 02/11] KVM: pfncache: add a mark-dirty helper
  2023-10-02  9:57 ` [PATCH v7 02/11] KVM: pfncache: add a mark-dirty helper Paul Durrant
@ 2023-10-31 23:28   ` Sean Christopherson
  2023-11-02 17:52     ` Paul Durrant
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2023-10-31 23:28 UTC (permalink / raw)
  To: Paul Durrant
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86

On Mon, Oct 02, 2023, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> At the moment pages are marked dirty by open-coded calls to
> mark_page_dirty_in_slot(), directly deferefencing the gpa and memslot
> from the cache. After a subsequent patch these may not always be set
> so add a helper now so that caller will protected from the need to know
> about this detail.
> 
> NOTE: Pages are now marked dirty while the cache lock is held. This is
>       to ensure that gpa and memslot are mutually consistent.

This absolutely belongs in a separate patch.  It sounds like a bug fix (haven't
spent the time to figure out if it actually is), and even if it doesn't fix
anything, burying something like this in a "add a helper" patch is just mean.

> diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
> index 0f36acdf577f..b68ed7fa56a2 100644
> --- a/virt/kvm/pfncache.c
> +++ b/virt/kvm/pfncache.c
> @@ -386,6 +386,12 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
>  }
>  EXPORT_SYMBOL_GPL(kvm_gpc_activate);
>  
> +void kvm_gpc_mark_dirty(struct gfn_to_pfn_cache *gpc)
> +{

If there's actually a reason to call mark_page_dirty_in_slot() while holding @gpc's
lock, then this should have a lockdep.  If there's no good reason, then don't move
the invocation.

> +	mark_page_dirty_in_slot(gpc->kvm, gpc->memslot, gpc->gpa >> PAGE_SHIFT);
> +}
> +EXPORT_SYMBOL_GPL(kvm_gpc_mark_dirty);

This doesn't need to be exported.  Hrm, none of the exports in this file are
necessary, they likely all got added when we were thinking this stuff would be
used for nVMX.  I think we should remove them, not because I'm worried about
sub-modules doing bad things, but just because we should avoid polluting exported
symbols as much as possible.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 03/11] KVM: pfncache: add a helper to get the gpa
  2023-10-02  9:57 ` [PATCH v7 03/11] KVM: pfncache: add a helper to get the gpa Paul Durrant
@ 2023-10-31 23:37   ` Sean Christopherson
  0 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2023-10-31 23:37 UTC (permalink / raw)
  To: Paul Durrant
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86

On Mon, Oct 02, 2023, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> A subsequent patch will rename this field since it will become overloaded.

Too. Many. Pronouns.

  Add a helper to get the gpa of a gpc cache, as a subsequent patch will
  rename "gpa" to "addr".

> To avoid churn in places that currently retrieve the gpa, add a helper for
> that purpose now.

This is silly.  If this series added any protection against incorrect usage then
I could understand the helper, but this just end up being

	return gpc->addr_is_gpa ? gpc->addr : INVALID_GPA;

which is nasty.  IIUC, there's no WARN because kvm_xen_vcpu_get_attr() doesn't
pre-check that the cache is in the correct mode.  That's a really silly reason
to not harden the rest of KVM.

> diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
> index b68ed7fa56a2..17afbb464a70 100644
> --- a/virt/kvm/pfncache.c
> +++ b/virt/kvm/pfncache.c
> @@ -386,6 +386,12 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
>  }
>  EXPORT_SYMBOL_GPL(kvm_gpc_activate);
>  
> +gpa_t kvm_gpc_gpa(struct gfn_to_pfn_cache *gpc)
> +{
> +	return gpc->gpa;
> +}
> +EXPORT_SYMBOL_GPL(kvm_gpc_gpa);

Any reason not to make this static inline?  Even in the final form, not making
this inlined seems silly.

Belatedly, same question for kvm_gpc_mark_dirty() I suppose.

>  void kvm_gpc_mark_dirty(struct gfn_to_pfn_cache *gpc)
>  {
>  	mark_page_dirty_in_slot(gpc->kvm, gpc->memslot, gpc->gpa >> PAGE_SHIFT);
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 04/11] KVM: pfncache: base offset check on khva rather than gpa
  2023-10-02  9:57 ` [PATCH v7 04/11] KVM: pfncache: base offset check on khva rather than gpa Paul Durrant
@ 2023-10-31 23:40   ` Sean Christopherson
  2023-11-02 18:11     ` Paul Durrant
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2023-10-31 23:40 UTC (permalink / raw)
  To: Paul Durrant
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini

On Mon, Oct 02, 2023, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> After a subsequent patch, the gpa may not always be set whereas khva will
> (as long as the cache valid flag is also set).

This holds true only because there are no users of KVM_GUEST_USES_PFN, and
because hva_to_pfn_retry() rather oddly adds the offset to a NULL khva.

I think it's time to admit using this to map PFNs into the guest is a bad idea
and rip out KVM_GUEST_USES_PFN before fully relying on khva.

https://lore.kernel.org/all/ZQiR8IpqOZrOpzHC@google.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 05/11] KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA
  2023-10-02  9:57 ` [PATCH v7 05/11] KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA Paul Durrant
@ 2023-10-31 23:49   ` Sean Christopherson
  2023-11-02 18:01     ` Paul Durrant
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2023-10-31 23:49 UTC (permalink / raw)
  To: Paul Durrant
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, Paolo Bonzini,
	David Woodhouse

On Mon, Oct 02, 2023, Paul Durrant wrote:
> diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
> index 6f4737d5046a..d49946ee7ae3 100644
> --- a/include/linux/kvm_types.h
> +++ b/include/linux/kvm_types.h
> @@ -64,7 +64,7 @@ struct gfn_to_hva_cache {
>  
>  struct gfn_to_pfn_cache {
>  	u64 generation;
> -	gpa_t gpa;
> +	u64 addr;

Holy moly, we have unions for exactly this reason.

	union {
		gpa_t gpa;
		unsigned long addr;
	};

But that's also weird and silly because it's basically the exact same thing as
"uhva".  If "uhva" stores the full address instead of the page-aligned address,
then I don't see a need for unionizing the gpa and uhva.

kvm_xen_vcpu_get_attr() should darn well explicitly check that the gpc stores
the correct type and not bleed ABI into the gfn_to_pfn_cache implementation.

If there's a true need for a union, the helpers should WARN.

> +unsigned long kvm_gpc_hva(struct gfn_to_pfn_cache *gpc)
> +{
> +	return !gpc->addr_is_gpa ? gpc->addr : 0;

'0' is a perfectly valid address.  Yeah, practically speaking '0' can't be used
these days, but we already have KVM_HVA_ERR_BAD.  If y'all want to use the for the
Xen ABI, then so be it.  But the common helpers need to use a sane value.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 06/11] KVM: xen: allow shared_info to be mapped by fixed HVA
  2023-10-02  9:57 ` [PATCH v7 06/11] KVM: xen: allow shared_info to be mapped by fixed HVA Paul Durrant
@ 2023-10-31 23:52   ` Sean Christopherson
  0 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2023-10-31 23:52 UTC (permalink / raw)
  To: Paul Durrant
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86

On Mon, Oct 02, 2023, Paul Durrant wrote:
> NOTE: The change of the kvm_xen_hvm_attr shared_info from struct to union
>       is technically an ABI change but it's entirely compatible with
>       existing users.

It's not an ABI change, is it?  Userspace that picks up the new header might generate
different code on a rebuild, but the actual ABI is unchanged, no?

> @@ -684,6 +692,14 @@ int kvm_xen_hvm_get_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
>  		r = 0;
>  		break;
>  
> +	case KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA:
> +		if (kvm->arch.xen.shinfo_cache.active)

As requested in previous patches, please explicitly check that the cache is in
the right "mode".

> +			data->u.shared_info.hva = kvm_gpc_hva(&kvm->arch.xen.shinfo_cache);
> +		else
> +			data->u.shared_info.hva = 0;
> +		r = 0;
> +		break;

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 11/11] KVM: xen: allow vcpu_info content to be 'safely' copied
  2023-10-02  9:57 ` [PATCH v7 11/11] KVM: xen: allow vcpu_info content to be 'safely' copied Paul Durrant
@ 2023-10-31 23:58   ` Sean Christopherson
  2023-11-08 15:15     ` David Woodhouse
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2023-10-31 23:58 UTC (permalink / raw)
  To: Paul Durrant
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86

On Mon, Oct 02, 2023, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> If the guest sets an explicit vcpu_info GPA then, for any of the first 32
> vCPUs, the content of the default vcpu_info in the shared_info page must be
> copied into the new location. Because this copy may race with event
> delivery (which updates the 'evtchn_pending_sel' field in vcpu_info) we
> need a way to defer that until the copy is complete.

Nit, add a blank link between paragraphs.

> Happily there is already a shadow of 'evtchn_pending_sel' in kvm_vcpu_xen
> that is used in atomic context if the vcpu_info PFN cache has been
> invalidated so that the update of vcpu_info can be deferred until the
> cache can be refreshed (on vCPU thread's the way back into guest context).
> So let's also use this shadow if the vcpu_info cache has been
> *deactivated*, so that the VMM can safely copy the vcpu_info content and
> then re-activate the cache with the new GPA. To do this, all we need to do
> is stop considering an inactive vcpu_info cache as a hard error in
> kvm_xen_set_evtchn_fast().

Please, please try to write changelogs that adhere to the preferred style.  I
get that the preferred style likely doesn't align with what you're used to, but
the preferred style really doesn't help me get through reviews quicker.

> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index aafc794940e4..e645066217bb 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -1606,9 +1606,6 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
>  		WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
>  	}
>  
> -	if (!vcpu->arch.xen.vcpu_info_cache.active)
> -		return -EINVAL;
> -

Hmm, maybe move this check after the "hard" error checks and explicitly do:

		return -EWOULDBLOCK

That way it's much more obvious that this patch is safe.  Alternatively, briefly
explain what happens if the cache is invalid in the changelog.


>  	if (xe->port >= max_evtchn_port(kvm))
>  		return -EINVAL;
>  
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 01/11] KVM: pfncache: add a map helper function
  2023-10-31 23:20   ` Sean Christopherson
@ 2023-11-02 17:09     ` Paul Durrant
  0 siblings, 0 replies; 34+ messages in thread
From: Paul Durrant @ 2023-11-02 17:09 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini

On 31/10/2023 23:20, Sean Christopherson wrote:
> On Mon, Oct 02, 2023, Paul Durrant wrote:
>> From: Paul Durrant <pdurrant@amazon.com>
> 
> Please make the changelog standalone, i.e. don't rely on the shortlog to provide
> context.  Yeah, it can be silly and repetive sometimes, particularly when viewing
> git commits where the shortlog+changelog are bundled fairly close together, but
> when viewing patches in a mail client, e.g. when I'm doing initial review, the
> shortlog is in the subject which may be far away or even completely hidden (as is
> the case as I'm typing this).
> 
> I could have sworn I included this in Documentation/process/maintainer-kvm-x86.rst,
> but I'm not finding it.
> 

OK, I'll add some more text.

>> We have an unmap helper but mapping is open-coded. Arguably this is fine
> 
> Pronouns.
> 

Sorry... didn't realize that was an issue.

>> because mapping is done in only one place, hva_to_pfn_retry(), but adding
>> the helper does make that function more readable.
>>
>> No functional change intended.
>>
>> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
>> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
>> ---
>> Cc: Sean Christopherson <seanjc@google.com>
>> Cc: David Woodhouse <dwmw2@infradead.org>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   virt/kvm/pfncache.c | 43 +++++++++++++++++++++++++------------------
>>   1 file changed, 25 insertions(+), 18 deletions(-)
>>
>> diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
>> index 2d6aba677830..0f36acdf577f 100644
>> --- a/virt/kvm/pfncache.c
>> +++ b/virt/kvm/pfncache.c
>> @@ -96,17 +96,28 @@ bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, unsigned long len)
>>   }
>>   EXPORT_SYMBOL_GPL(kvm_gpc_check);
>>   
>> -static void gpc_unmap_khva(kvm_pfn_t pfn, void *khva)
>> +static void *gpc_map(kvm_pfn_t pfn)
>> +{
>> +	if (pfn_valid(pfn))
>> +		return kmap(pfn_to_page(pfn));
>> +#ifdef CONFIG_HAS_IOMEM
>> +	else
> 
> There's no need for the "else", the happy path is terminal.
> 
>> +		return memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
>> +#endif
> 
> This needs a return for CONFIG_HAS_IOMEM=n.  I haven't tried to compile, but I'm
> guessing s390 won't be happy.
> 

Oops, yes, of course.

> This?
> 
> static void *gpc_map(kvm_pfn_t pfn)
> {
> 	if (pfn_valid(pfn))
> 		return kmap(pfn_to_page(pfn));
> 
> #ifdef CONFIG_HAS_IOMEM
> 	return memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
> #else
> 	return NULL;
> #endif
> }
> 

Looks good. Thanks,

   Paul

>> +}
>> +
>> +static void gpc_unmap(kvm_pfn_t pfn, void *khva)
>>   {
>>   	/* Unmap the old pfn/page if it was mapped before. */
>> -	if (!is_error_noslot_pfn(pfn) && khva) {
>> -		if (pfn_valid(pfn))
>> -			kunmap(pfn_to_page(pfn));
>> +	if (is_error_noslot_pfn(pfn) || !khva)
>> +		return;
>> +
>> +	if (pfn_valid(pfn))
>> +		kunmap(pfn_to_page(pfn));
>>   #ifdef CONFIG_HAS_IOMEM
>> -		else
>> -			memunmap(khva);
>> +	else
>> +		memunmap(khva);
>>   #endif
> 
> I don't mind the refactoring, but it needs to be at least mentioned in the
> changelog.  And if we're going to bother, it probably makes sense to add a WARN
> in the CONFIG_HAS_IOMEM=n path, e.g.
> 
> 	/* Unmap the old pfn/page if it was mapped before. */
> 	if (is_error_noslot_pfn(pfn) || !khva)
> 		return;
> 
> 	if (pfn_valid(pfn))
> 		kunmap(pfn_to_page(pfn));
> 	else
> #ifdef CONFIG_HAS_IOMEM
> 		memunmap(khva);
> #else
> 		WARN_ON_ONCE(1);
> #endif
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 02/11] KVM: pfncache: add a mark-dirty helper
  2023-10-31 23:28   ` Sean Christopherson
@ 2023-11-02 17:52     ` Paul Durrant
  2023-11-03 23:07       ` Sean Christopherson
  0 siblings, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-11-02 17:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86

On 31/10/2023 23:28, Sean Christopherson wrote:
> On Mon, Oct 02, 2023, Paul Durrant wrote:
>> From: Paul Durrant <pdurrant@amazon.com>
>>
>> At the moment pages are marked dirty by open-coded calls to
>> mark_page_dirty_in_slot(), directly deferefencing the gpa and memslot
>> from the cache. After a subsequent patch these may not always be set
>> so add a helper now so that caller will protected from the need to know
>> about this detail.
>>
>> NOTE: Pages are now marked dirty while the cache lock is held. This is
>>        to ensure that gpa and memslot are mutually consistent.
> 
> This absolutely belongs in a separate patch.  It sounds like a bug fix (haven't
> spent the time to figure out if it actually is), and even if it doesn't fix
> anything, burying something like this in a "add a helper" patch is just mean.
> 

Ok, I can split it out. It's a pretty minor fix so didn't seem worth it.

> 
>> diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
>> index 0f36acdf577f..b68ed7fa56a2 100644
>> --- a/virt/kvm/pfncache.c
>> +++ b/virt/kvm/pfncache.c
>> @@ -386,6 +386,12 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
>>   }
>>   EXPORT_SYMBOL_GPL(kvm_gpc_activate);
>>   
>> +void kvm_gpc_mark_dirty(struct gfn_to_pfn_cache *gpc)
>> +{
> 
> If there's actually a reason to call mark_page_dirty_in_slot() while holding @gpc's
> lock, then this should have a lockdep.  If there's no good reason, then don't move
> the invocation.
> 
>> +	mark_page_dirty_in_slot(gpc->kvm, gpc->memslot, gpc->gpa >> PAGE_SHIFT);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_gpc_mark_dirty);
> 
> This doesn't need to be exported.  Hrm, none of the exports in this file are
> necessary, they likely all got added when we were thinking this stuff would be
> used for nVMX.  I think we should remove them, not because I'm worried about
> sub-modules doing bad things, but just because we should avoid polluting exported
> symbols as much as possible.

That in a separate clean-up patch too, I assume?

   Paul

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 05/11] KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA
  2023-10-31 23:49   ` Sean Christopherson
@ 2023-11-02 18:01     ` Paul Durrant
  2023-11-03 23:12       ` Sean Christopherson
  0 siblings, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-11-02 18:01 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, Paolo Bonzini,
	David Woodhouse

On 31/10/2023 23:49, Sean Christopherson wrote:
> On Mon, Oct 02, 2023, Paul Durrant wrote:
>> diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
>> index 6f4737d5046a..d49946ee7ae3 100644
>> --- a/include/linux/kvm_types.h
>> +++ b/include/linux/kvm_types.h
>> @@ -64,7 +64,7 @@ struct gfn_to_hva_cache {
>>   
>>   struct gfn_to_pfn_cache {
>>   	u64 generation;
>> -	gpa_t gpa;
>> +	u64 addr;
> 
> Holy moly, we have unions for exactly this reason.
> 
> 	union {
> 		gpa_t gpa;
> 		unsigned long addr;
> 	};
> 
> But that's also weird and silly because it's basically the exact same thing as
> "uhva".  If "uhva" stores the full address instead of the page-aligned address,
> then I don't see a need for unionizing the gpa and uhva.
> 

Ok, I think that'll be more invasive but I'll see how it looks.

> kvm_xen_vcpu_get_attr() should darn well explicitly check that the gpc stores
> the correct type and not bleed ABI into the gfn_to_pfn_cache implementation.
> 

I guess if we leave gpa alone and make it INVALID_GPA for caches 
initialized using an HVA then that can be checked. Is that what you mean 
here?

> If there's a true need for a union, the helpers should WARN.
> 
>> +unsigned long kvm_gpc_hva(struct gfn_to_pfn_cache *gpc)
>> +{
>> +	return !gpc->addr_is_gpa ? gpc->addr : 0;
> 
> '0' is a perfectly valid address.  Yeah, practically speaking '0' can't be used
> these days, but we already have KVM_HVA_ERR_BAD.  If y'all want to use the for the
> Xen ABI, then so be it.  But the common helpers need to use a sane value.

Ok.

   Paul

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 04/11] KVM: pfncache: base offset check on khva rather than gpa
  2023-10-31 23:40   ` Sean Christopherson
@ 2023-11-02 18:11     ` Paul Durrant
  2023-11-03 23:10       ` Sean Christopherson
  0 siblings, 1 reply; 34+ messages in thread
From: Paul Durrant @ 2023-11-02 18:11 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini

On 31/10/2023 23:40, Sean Christopherson wrote:
> On Mon, Oct 02, 2023, Paul Durrant wrote:
>> From: Paul Durrant <pdurrant@amazon.com>
>>
>> After a subsequent patch, the gpa may not always be set whereas khva will
>> (as long as the cache valid flag is also set).
> 
> This holds true only because there are no users of KVM_GUEST_USES_PFN, and
> because hva_to_pfn_retry() rather oddly adds the offset to a NULL khva.
> 
> I think it's time to admit using this to map PFNs into the guest is a bad idea
> and rip out KVM_GUEST_USES_PFN before fully relying on khva.
> 
> https://lore.kernel.org/all/ZQiR8IpqOZrOpzHC@google.com

Is this something you want me to fix?

   Paul


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 02/11] KVM: pfncache: add a mark-dirty helper
  2023-11-02 17:52     ` Paul Durrant
@ 2023-11-03 23:07       ` Sean Christopherson
  0 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2023-11-03 23:07 UTC (permalink / raw)
  To: paul
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86

On Thu, Nov 02, 2023, Paul Durrant wrote:
> On 31/10/2023 23:28, Sean Christopherson wrote:
> > On Mon, Oct 02, 2023, Paul Durrant wrote:
> > > diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
> > > index 0f36acdf577f..b68ed7fa56a2 100644
> > > --- a/virt/kvm/pfncache.c
> > > +++ b/virt/kvm/pfncache.c
> > > @@ -386,6 +386,12 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
> > >   }
> > >   EXPORT_SYMBOL_GPL(kvm_gpc_activate);
> > > +void kvm_gpc_mark_dirty(struct gfn_to_pfn_cache *gpc)
> > > +{
> > 
> > If there's actually a reason to call mark_page_dirty_in_slot() while holding @gpc's
> > lock, then this should have a lockdep.  If there's no good reason, then don't move
> > the invocation.
> > 
> > > +	mark_page_dirty_in_slot(gpc->kvm, gpc->memslot, gpc->gpa >> PAGE_SHIFT);
> > > +}
> > > +EXPORT_SYMBOL_GPL(kvm_gpc_mark_dirty);
> > 
> > This doesn't need to be exported.  Hrm, none of the exports in this file are
> > necessary, they likely all got added when we were thinking this stuff would be
> > used for nVMX.  I think we should remove them, not because I'm worried about
> > sub-modules doing bad things, but just because we should avoid polluting exported
> > symbols as much as possible.
> 
> That in a separate clean-up patch too, I assume?

Yes, but feel free to punt that one or post it as a standalone patch.  For this
series, please just don't add more exports unless they're actually used in the
series.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 04/11] KVM: pfncache: base offset check on khva rather than gpa
  2023-11-02 18:11     ` Paul Durrant
@ 2023-11-03 23:10       ` Sean Christopherson
  0 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2023-11-03 23:10 UTC (permalink / raw)
  To: paul
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, David Woodhouse,
	Paolo Bonzini

On Thu, Nov 02, 2023, Paul Durrant wrote:
> On 31/10/2023 23:40, Sean Christopherson wrote:
> > On Mon, Oct 02, 2023, Paul Durrant wrote:
> > > From: Paul Durrant <pdurrant@amazon.com>
> > > 
> > > After a subsequent patch, the gpa may not always be set whereas khva will
> > > (as long as the cache valid flag is also set).
> > 
> > This holds true only because there are no users of KVM_GUEST_USES_PFN, and
> > because hva_to_pfn_retry() rather oddly adds the offset to a NULL khva.
> > 
> > I think it's time to admit using this to map PFNs into the guest is a bad idea
> > and rip out KVM_GUEST_USES_PFN before fully relying on khva.
> > 
> > https://lore.kernel.org/all/ZQiR8IpqOZrOpzHC@google.com
> 
> Is this something you want me to fix?

Yes?  I don't want to snowball your series, but I also really don't like the
confusion that is introduced by relying on khva while KVM_GUEST_USES_PFN is still
a thing.

Can you give it a shot, and then holler if it's a bigger mess than I'm anticipating?
I'm assuming/hoping it's a relatively small, one-off patch, but I haven't actually
dug through in-depth to figure out what all needs to change.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 05/11] KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA
  2023-11-02 18:01     ` Paul Durrant
@ 2023-11-03 23:12       ` Sean Christopherson
  0 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2023-11-03 23:12 UTC (permalink / raw)
  To: paul
  Cc: kvm, linux-kernel, Paul Durrant, David Woodhouse, Paolo Bonzini,
	David Woodhouse

On Thu, Nov 02, 2023, Paul Durrant wrote:
> On 31/10/2023 23:49, Sean Christopherson wrote:
> > On Mon, Oct 02, 2023, Paul Durrant wrote:
> > > diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
> > > index 6f4737d5046a..d49946ee7ae3 100644
> > > --- a/include/linux/kvm_types.h
> > > +++ b/include/linux/kvm_types.h
> > > @@ -64,7 +64,7 @@ struct gfn_to_hva_cache {
> > >   struct gfn_to_pfn_cache {
> > >   	u64 generation;
> > > -	gpa_t gpa;
> > > +	u64 addr;
> > 
> > Holy moly, we have unions for exactly this reason.
> > 
> > 	union {
> > 		gpa_t gpa;
> > 		unsigned long addr;
> > 	};
> > 
> > But that's also weird and silly because it's basically the exact same thing as
> > "uhva".  If "uhva" stores the full address instead of the page-aligned address,
> > then I don't see a need for unionizing the gpa and uhva.
> 
> Ok, I think that'll be more invasive but I'll see how it looks.

Invasive is fine.  Not ideal, but fine.  If the resulting code is a mess, then
that's a problem, but churn in and of itself isn't awful if the end result is a
net positive.

> > kvm_xen_vcpu_get_attr() should darn well explicitly check that the gpc stores
> > the correct type and not bleed ABI into the gfn_to_pfn_cache implementation.
> 
> I guess if we leave gpa alone and make it INVALID_GPA for caches initialized
> using an HVA then that can be checked. Is that what you mean here?

Yep, that should work.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 11/11] KVM: xen: allow vcpu_info content to be 'safely' copied
  2023-10-31 23:58   ` Sean Christopherson
@ 2023-11-08 15:15     ` David Woodhouse
  2023-11-08 20:55       ` Sean Christopherson
  0 siblings, 1 reply; 34+ messages in thread
From: David Woodhouse @ 2023-11-08 15:15 UTC (permalink / raw)
  To: Sean Christopherson, Paul Durrant, Allister, Jack
  Cc: kvm, linux-kernel, Paul Durrant, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86

[-- Attachment #1: Type: text/plain, Size: 2266 bytes --]

On Tue, 2023-10-31 at 16:58 -0700, Sean Christopherson wrote:
> 
> > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> > index aafc794940e4..e645066217bb 100644
> > --- a/arch/x86/kvm/xen.c
> > +++ b/arch/x86/kvm/xen.c
> > @@ -1606,9 +1606,6 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
> >                 WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
> >         }
> >   
> > -       if (!vcpu->arch.xen.vcpu_info_cache.active)
> > -               return -EINVAL;
> > -
> 
> Hmm, maybe move this check after the "hard" error checks and explicitly do:
> 
>                 return -EWOULDBLOCK
> 
> That way it's much more obvious that this patch is safe.  Alternatively, briefly
> explain what happens if the cache is invalid in the changelog.


Hm, I thought we dropped that? It doesn't want to return -EWOULDBLOCK.
That would cause a "fast" irqfd delivery to defer to the slow
(workqueue) path, but then the same thing would happen on that path
*too*.

I actually think this check needs to be dropped completely. The
evtchn_pending_sel bit will get set in the *shadow* copy in vcpu-
>arch.xen.evtchn_pending_sel, and will subsequently be set in the real
vcpu_info when that is configured again. We do already handle that case
correctly, I believe:

		if (!kvm_gpc_check(gpc, sizeof(struct vcpu_info))) {
			/*
			 * Could not access the vcpu_info. Set the bit in-kernel
			 * and prod the vCPU to deliver it for itself.
			 */
			if (!test_and_set_bit(port_word_bit, &vcpu->arch.xen.evtchn_pending_sel))
				kick_vcpu = true;
			goto out_rcu;
		}

If we leave this check in place as it is, I think we're going to lose
incoming events (and IPIs) directed at a vCPU while it has its
vcpu_info temporarily removed.

(We do want to "temporarily" remove the vcpu_info so that we can adjust
memslots, because there's no way to atomically break apart a large
memslot and overlay a single page in the middle of it. And also when a
guest is paused for migration, it's nice to be sure that no changes
will happen to the guest memory and trigger sanity checks that the
memory on the destination precisely matches the source.)

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 11/11] KVM: xen: allow vcpu_info content to be 'safely' copied
  2023-11-08 15:15     ` David Woodhouse
@ 2023-11-08 20:55       ` Sean Christopherson
  2023-11-08 21:56         ` David Woodhouse
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2023-11-08 20:55 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Paul Durrant, Jack Allister, kvm, linux-kernel, Paul Durrant,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86

On Wed, Nov 08, 2023, David Woodhouse wrote:
> On Tue, 2023-10-31 at 16:58 -0700, Sean Christopherson wrote:
> > 
> > > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> > > index aafc794940e4..e645066217bb 100644
> > > --- a/arch/x86/kvm/xen.c
> > > +++ b/arch/x86/kvm/xen.c
> > > @@ -1606,9 +1606,6 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
> > >                 WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
> > >         }
> > >   
> > > -       if (!vcpu->arch.xen.vcpu_info_cache.active)
> > > -               return -EINVAL;
> > > -
> > 
> > Hmm, maybe move this check after the "hard" error checks and explicitly do:
> > 
> >                 return -EWOULDBLOCK
> > 
> > That way it's much more obvious that this patch is safe.  Alternatively, briefly
> > explain what happens if the cache is invalid in the changelog.
> 
> 
> Hm, I thought we dropped that? It doesn't want to return -EWOULDBLOCK.
> That would cause a "fast" irqfd delivery to defer to the slow
> (workqueue) path, but then the same thing would happen on that path
> *too*.
> 
> I actually think this check needs to be dropped completely. The
> evtchn_pending_sel bit will get set in the *shadow* copy in vcpu-
> >arch.xen.evtchn_pending_sel, and will subsequently be set in the real
> vcpu_info when that is configured again. We do already handle that case
> correctly, I believe:
> 
> 		if (!kvm_gpc_check(gpc, sizeof(struct vcpu_info))) {
> 			/*
> 			 * Could not access the vcpu_info. Set the bit in-kernel
> 			 * and prod the vCPU to deliver it for itself.
> 			 */
> 			if (!test_and_set_bit(port_word_bit, &vcpu->arch.xen.evtchn_pending_sel))
> 				kick_vcpu = true;
> 			goto out_rcu;
> 		}
> 
> If we leave this check in place as it is, I think we're going to lose
> incoming events (and IPIs) directed at a vCPU while it has its
> vcpu_info temporarily removed.

Oh, dagnabbit, there are two different gfn_to_pfn_cache objects.  I assumed "gpc"
was the same thing as vcpu_info_cache.active, i.e. thought the explicit active
check here was just to pre-check the validity before acquiring the lock.

	if (!vcpu->arch.xen.vcpu_info_cache.active)
		return -EINVAL;

	rc = -EWOULDBLOCK;

	idx = srcu_read_lock(&kvm->srcu);

	read_lock_irqsave(&gpc->lock, flags);
	if (!kvm_gpc_check(gpc, PAGE_SIZE))  <== I thought this was the same check
		goto out_rcu;

That's why my comment was nonsensical, and also why I was super confused by your
response for a few minutes :-)

Good gravy, this even conditionally switches gpc and makes the out path drop
different locks.  That's just mean.

Any objection to splitting out the vcpu_info chunk to a separate function?  That'd
eliminate the subtle gpc+lock switch, and provide a convenient and noticeable
location to explain why it's ok for setting the vcpu_info to fail.

E.g.

---
 arch/x86/kvm/xen.c | 102 ++++++++++++++++++++++++++-------------------
 1 file changed, 58 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index d65e89e062e4..cd2021ba0ae3 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1621,6 +1621,57 @@ static void kvm_xen_check_poller(struct kvm_vcpu *vcpu, int port)
 	}
 }
 
+static int kvm_xen_needs_a_name(struct kvm_vcpu *vcpu, int port_word_bit)
+{
+	struct gfn_to_pfn_cache *gpc = &vcpu->arch.xen.vcpu_info_cache;
+	bool kick_vcpu = false;
+	unsigned long flags;
+	int r = 0;
+
+	read_lock_irqsave(&gpc->lock, flags);
+	if (!kvm_gpc_check(gpc, sizeof(struct vcpu_info))) {
+		/*
+		 * Could not access the vcpu_info. Set the bit in-kernel and
+		 * prod the vCPU to deliver it for itself.
+		 */
+		if (!test_and_set_bit(port_word_bit, &vcpu->arch.xen.evtchn_pending_sel))
+			kick_vcpu = true;
+
+		r = -EWOULDBLOCK;
+		goto out;
+	}
+
+	if (IS_ENABLED(CONFIG_64BIT) && vcpu->kvm->arch.xen.long_mode) {
+		struct vcpu_info *vcpu_info = gpc->khva;
+		if (!test_and_set_bit(port_word_bit, &vcpu_info->evtchn_pending_sel)) {
+			WRITE_ONCE(vcpu_info->evtchn_upcall_pending, 1);
+			kick_vcpu = true;
+		}
+	} else {
+		struct compat_vcpu_info *vcpu_info = gpc->khva;
+		if (!test_and_set_bit(port_word_bit,
+					(unsigned long *)&vcpu_info->evtchn_pending_sel)) {
+			WRITE_ONCE(vcpu_info->evtchn_upcall_pending, 1);
+			kick_vcpu = true;
+		}
+	}
+
+	/* For the per-vCPU lapic vector, deliver it as MSI. */
+	if (kick_vcpu && vcpu->arch.xen.upcall_vector) {
+		kvm_xen_inject_vcpu_vector(vcpu);
+		kick_vcpu = false;
+	}
+
+out:
+	read_unlock_irqrestore(&gpc->lock, flags);
+
+	if (kick_vcpu) {
+		kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
+		kvm_vcpu_kick(vcpu);
+	}
+
+	return r;
+}
 /*
  * The return value from this function is propagated to kvm_set_irq() API,
  * so it returns:
@@ -1638,7 +1689,6 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 	unsigned long *pending_bits, *mask_bits;
 	unsigned long flags;
 	int port_word_bit;
-	bool kick_vcpu = false;
 	int vcpu_idx, idx, rc;
 
 	vcpu_idx = READ_ONCE(xe->vcpu_idx);
@@ -1660,7 +1710,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 
 	read_lock_irqsave(&gpc->lock, flags);
 	if (!kvm_gpc_check(gpc, PAGE_SIZE))
-		goto out_rcu;
+		goto out_unlock;
 
 	if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
 		struct shared_info *shinfo = gpc->khva;
@@ -1688,52 +1738,16 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		kvm_xen_check_poller(vcpu, xe->port);
 	} else {
 		rc = 1; /* Delivered to the bitmap in shared_info. */
-		/* Now switch to the vCPU's vcpu_info to set the index and pending_sel */
-		read_unlock_irqrestore(&gpc->lock, flags);
-		gpc = &vcpu->arch.xen.vcpu_info_cache;
-
-		read_lock_irqsave(&gpc->lock, flags);
-		if (!kvm_gpc_check(gpc, sizeof(struct vcpu_info))) {
-			/*
-			 * Could not access the vcpu_info. Set the bit in-kernel
-			 * and prod the vCPU to deliver it for itself.
-			 */
-			if (!test_and_set_bit(port_word_bit, &vcpu->arch.xen.evtchn_pending_sel))
-				kick_vcpu = true;
-			goto out_rcu;
-		}
-
-		if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
-			struct vcpu_info *vcpu_info = gpc->khva;
-			if (!test_and_set_bit(port_word_bit, &vcpu_info->evtchn_pending_sel)) {
-				WRITE_ONCE(vcpu_info->evtchn_upcall_pending, 1);
-				kick_vcpu = true;
-			}
-		} else {
-			struct compat_vcpu_info *vcpu_info = gpc->khva;
-			if (!test_and_set_bit(port_word_bit,
-					      (unsigned long *)&vcpu_info->evtchn_pending_sel)) {
-				WRITE_ONCE(vcpu_info->evtchn_upcall_pending, 1);
-				kick_vcpu = true;
-			}
-		}
-
-		/* For the per-vCPU lapic vector, deliver it as MSI. */
-		if (kick_vcpu && vcpu->arch.xen.upcall_vector) {
-			kvm_xen_inject_vcpu_vector(vcpu);
-			kick_vcpu = false;
-		}
 	}
 
- out_rcu:
+ out_unlock:
 	read_unlock_irqrestore(&gpc->lock, flags);
+
+	/* Paul to add comment explaining why it's ok kvm_xen_needs_a_name() fails */
+	if (rc == 1)
+		rc = kvm_xen_needs_a_name(vcpu, port_word_bit);
+
 	srcu_read_unlock(&kvm->srcu, idx);
-
-	if (kick_vcpu) {
-		kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
-		kvm_vcpu_kick(vcpu);
-	}
-
 	return rc;
 }
 

base-commit: 0f34262cf6fab8163456c7740c7ee1888a8af93c
-- 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 11/11] KVM: xen: allow vcpu_info content to be 'safely' copied
  2023-11-08 20:55       ` Sean Christopherson
@ 2023-11-08 21:56         ` David Woodhouse
  0 siblings, 0 replies; 34+ messages in thread
From: David Woodhouse @ 2023-11-08 21:56 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paul Durrant, Jack Allister, kvm, linux-kernel, Paul Durrant,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86

[-- Attachment #1: Type: text/plain, Size: 2130 bytes --]

On Wed, 2023-11-08 at 12:55 -0800, Sean Christopherson wrote:
> 
> Any objection to splitting out the vcpu_info chunk to a separate function?  That'd
> eliminate the subtle gpc+lock switch, and provide a convenient and noticeable
> location to explain why it's ok for setting the vcpu_info to fail.

No objection, that looks entirely sane to me.

> E.g.
> 
> ---
>  arch/x86/kvm/xen.c | 102 ++++++++++++++++++++++++++-------------------
>  1 file changed, 58 insertions(+), 44 deletions(-)
> 
> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index d65e89e062e4..cd2021ba0ae3 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -1621,6 +1621,57 @@ static void kvm_xen_check_poller(struct kvm_vcpu *vcpu, int port)
>         }
>  }
>  
> +static int kvm_xen_needs_a_name(struct kvm_vcpu *vcpu, int port_word_bit)
> 

kvm_xen_deliver_vcpu_evtchn_pending_sel()  ?

...

> - out_rcu:
> + out_unlock:
>         read_unlock_irqrestore(&gpc->lock, flags);
> +
> +       /* Paul to add comment explaining why it's ok kvm_xen_needs_a_name() fails */

Basically the point here is that it *doesn't* fail. Either it delivers
directly to the vcpu_info at the appropriate GPA via the GPC, or it
will set the corresponding bit in the 'shadow'
vcpu->arch.xen.evtchn_pending_sel to be delivered later by
kvm_xen_inject_pending_events(). Either way, it's 'success'.

I do want to vet the code path where userspace forgets to reinstate the
vcpu_info before running the vCPU again. Looking at
kvm_xen_inject_pending_events(), I think the kvm_gpc_check() for the
vcpu_info will fail, so it'll just return, and KVM will enter the vCPU
with vcpu->arch.xen.evtchn_pending_sel still unchanged? The bits won't
get delivered until the next time the vCPU is entered after the
vcpu_info is set.

Which I *think* is fine. Userspace has to stop running the vCPU in
order to actually set the vcpu_info again, when it finally decides to
do so. And if it never does, that works out as well as can be
expected. 

Probably does want adding to the selftest.

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling
  2023-10-05  8:36   ` Paul Durrant
@ 2023-11-09 10:02     ` David Woodhouse
  2023-11-09 10:06       ` Paul Durrant
  0 siblings, 1 reply; 34+ messages in thread
From: David Woodhouse @ 2023-11-09 10:02 UTC (permalink / raw)
  To: paul, kvm, linux-kernel
  Cc: Paul Durrant, H. Peter Anvin, Borislav Petkov, Dave Hansen,
	Ingo Molnar, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	x86

[-- Attachment #1: Type: text/plain, Size: 2031 bytes --]

On Thu, 2023-10-05 at 09:36 +0100, Paul Durrant wrote:
> On 05/10/2023 07:41, David Woodhouse wrote:
> > On Mon, 2023-10-02 at 09:57 +0000, Paul Durrant wrote:
> > > From: Paul Durrant <pdurrant@amazon.com>
> > > 
> > > The following text from the original cover letter still serves as an
> > > introduction to the series:
> > > 
> > > "Currently we treat the shared_info page as guest memory and the VMM
> > > informs KVM of its location using a GFN. However it is not guest memory as
> > > such; it's an overlay page. So we pointlessly invalidate and re-cache a
> > > mapping to the *same page* of memory every time the guest requests that
> > > shared_info be mapped into its address space. Let's avoid doing that by
> > > modifying the pfncache code to allow activation using a fixed userspace HVA
> > > as well as a GPA."
> > > 
> > > This version of the series is functionally the same as version 6. I have
> > > simply added David Woodhouse's R-b to patch 11 to indicate that he has
> > > now fully reviewed the series.
> > 
> > Thanks. I believe Sean is probably waiting for us to stop going back
> > and forth, and for the dust to settle. So for the record: I think I'm
> > done heckling and this is ready to go in.
> > 
> > Are you doing the QEMU patches or am I?
> > 
> 
> I'll do the QEMU changes, once the patches hit kvm/next.

Note that I disabled migration support in QEMU for emulated Xen
guests. You might want that for testing, since the reason for this work
is to enable pause/serialize workflows.

Migration does work all the way up to XenStore itself, and
https://gitlab.com/qemu-project/qemu/-/commit/766804b101d *was* tested
with migration enabled. There are also unit tests for XenStore
serialize/deserialize.

I disabled it because the PV backends on the XenBus don't have
suspend/resume support. But a guest using other emulated net/disk
devices should still be able to suspend/resume OK if we just remove the
'unmigratable' flag from xen_xenstore, I believe. 

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling
  2023-11-09 10:02     ` David Woodhouse
@ 2023-11-09 10:06       ` Paul Durrant
  0 siblings, 0 replies; 34+ messages in thread
From: Paul Durrant @ 2023-11-09 10:06 UTC (permalink / raw)
  To: David Woodhouse, kvm, linux-kernel
  Cc: Paul Durrant, H. Peter Anvin, Borislav Petkov, Dave Hansen,
	Ingo Molnar, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	x86

On 09/11/2023 10:02, David Woodhouse wrote:
> On Thu, 2023-10-05 at 09:36 +0100, Paul Durrant wrote:
>> On 05/10/2023 07:41, David Woodhouse wrote:
>>> On Mon, 2023-10-02 at 09:57 +0000, Paul Durrant wrote:
>>>> From: Paul Durrant <pdurrant@amazon.com>
>>>>
>>>> The following text from the original cover letter still serves as an
>>>> introduction to the series:
>>>>
>>>> "Currently we treat the shared_info page as guest memory and the VMM
>>>> informs KVM of its location using a GFN. However it is not guest memory as
>>>> such; it's an overlay page. So we pointlessly invalidate and re-cache a
>>>> mapping to the *same page* of memory every time the guest requests that
>>>> shared_info be mapped into its address space. Let's avoid doing that by
>>>> modifying the pfncache code to allow activation using a fixed userspace HVA
>>>> as well as a GPA."
>>>>
>>>> This version of the series is functionally the same as version 6. I have
>>>> simply added David Woodhouse's R-b to patch 11 to indicate that he has
>>>> now fully reviewed the series.
>>>
>>> Thanks. I believe Sean is probably waiting for us to stop going back
>>> and forth, and for the dust to settle. So for the record: I think I'm
>>> done heckling and this is ready to go in.
>>>
>>> Are you doing the QEMU patches or am I?
>>>
>>
>> I'll do the QEMU changes, once the patches hit kvm/next.
> 
> Note that I disabled migration support in QEMU for emulated Xen
> guests. You might want that for testing, since the reason for this work
> is to enable pause/serialize workflows.
> 
> Migration does work all the way up to XenStore itself, and
> https://gitlab.com/qemu-project/qemu/-/commit/766804b101d *was* tested
> with migration enabled. There are also unit tests for XenStore
> serialize/deserialize.
> 
> I disabled it because the PV backends on the XenBus don't have
> suspend/resume support. But a guest using other emulated net/disk
> devices should still be able to suspend/resume OK if we just remove the
> 'unmigratable' flag from xen_xenstore, I believe.

Ok. Enabling suspend/resume for backends really ought not to be that 
hard. The main reason for this series was to enable pause 
for-for-memory-reconfiguration but I can look into 
suspend/resume/migrate once I've done the necessary re-work.


^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2023-11-09 10:07 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-02  9:57 [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling Paul Durrant
2023-10-02  9:57 ` [PATCH v7 01/11] KVM: pfncache: add a map helper function Paul Durrant
2023-10-31 23:20   ` Sean Christopherson
2023-11-02 17:09     ` Paul Durrant
2023-10-02  9:57 ` [PATCH v7 02/11] KVM: pfncache: add a mark-dirty helper Paul Durrant
2023-10-31 23:28   ` Sean Christopherson
2023-11-02 17:52     ` Paul Durrant
2023-11-03 23:07       ` Sean Christopherson
2023-10-02  9:57 ` [PATCH v7 03/11] KVM: pfncache: add a helper to get the gpa Paul Durrant
2023-10-31 23:37   ` Sean Christopherson
2023-10-02  9:57 ` [PATCH v7 04/11] KVM: pfncache: base offset check on khva rather than gpa Paul Durrant
2023-10-31 23:40   ` Sean Christopherson
2023-11-02 18:11     ` Paul Durrant
2023-11-03 23:10       ` Sean Christopherson
2023-10-02  9:57 ` [PATCH v7 05/11] KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA Paul Durrant
2023-10-31 23:49   ` Sean Christopherson
2023-11-02 18:01     ` Paul Durrant
2023-11-03 23:12       ` Sean Christopherson
2023-10-02  9:57 ` [PATCH v7 06/11] KVM: xen: allow shared_info to be mapped by fixed HVA Paul Durrant
2023-10-31 23:52   ` Sean Christopherson
2023-10-02  9:57 ` [PATCH v7 07/11] KVM: xen: allow vcpu_info " Paul Durrant
2023-10-02  9:57 ` [PATCH v7 08/11] KVM: selftests / xen: map shared_info using HVA rather than GFN Paul Durrant
2023-10-02  9:57 ` [PATCH v7 09/11] KVM: selftests / xen: re-map vcpu_info using HVA rather than GPA Paul Durrant
2023-10-02  9:57 ` [PATCH v7 10/11] KVM: xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA capability Paul Durrant
2023-10-02  9:57 ` [PATCH v7 11/11] KVM: xen: allow vcpu_info content to be 'safely' copied Paul Durrant
2023-10-31 23:58   ` Sean Christopherson
2023-11-08 15:15     ` David Woodhouse
2023-11-08 20:55       ` Sean Christopherson
2023-11-08 21:56         ` David Woodhouse
2023-10-05  6:41 ` [PATCH v7 00/11] KVM: xen: update shared_info and vcpu_info handling David Woodhouse
2023-10-05  8:36   ` Paul Durrant
2023-11-09 10:02     ` David Woodhouse
2023-11-09 10:06       ` Paul Durrant
2023-10-30 12:00   ` Paul Durrant

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox