public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] x86:kvm:hyperv: fix userspace interaction flaws
@ 2017-06-16 17:37 Roman Kagan
  2017-06-16 17:37 ` [PATCH v2 1/2] x86:kvm:hyperv: add KVM_CAP_HYPERV_SYNIC2 Roman Kagan
  2017-06-16 17:37 ` [PATCH v2 2/2] x86:kvm:hyperv: make VP_INDEX managed by userspace Roman Kagan
  0 siblings, 2 replies; 8+ messages in thread
From: Roman Kagan @ 2017-06-16 17:37 UTC (permalink / raw)
  To: kvm, Radim Krčmář, Paolo Bonzini
  Cc: Evgeny Yakovlev, Denis V . Lunev, Eduardo Habkost, Igor Mammedov

This patchset contains fixes for two hyperv implementation flaws in KVM
that surfaced when the userspace (QEMU) started to use it more actively.

In both cases more control is delegated from KVM to userspace.

Roman Kagan (2):
  x86:kvm:hyperv: add KVM_CAP_HYPERV_SYNIC2
  x86:kvm:hyperv: make VP_INDEX managed by userspace

---
v1 -> v2:
 - add patch 1
 - add capability in patch 2

 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/hyperv.h           |  2 +-
 include/uapi/linux/kvm.h        |  2 ++
 arch/x86/kvm/hyperv.c           | 62 ++++++++++++++++++++++++++---------------
 arch/x86/kvm/x86.c              |  6 +++-
 5 files changed, 49 insertions(+), 25 deletions(-)

-- 
2.9.4

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/2] x86:kvm:hyperv: add KVM_CAP_HYPERV_SYNIC2
  2017-06-16 17:37 [PATCH v2 0/2] x86:kvm:hyperv: fix userspace interaction flaws Roman Kagan
@ 2017-06-16 17:37 ` Roman Kagan
  2017-06-21 20:09   ` Radim Krčmář
  2017-06-16 17:37 ` [PATCH v2 2/2] x86:kvm:hyperv: make VP_INDEX managed by userspace Roman Kagan
  1 sibling, 1 reply; 8+ messages in thread
From: Roman Kagan @ 2017-06-16 17:37 UTC (permalink / raw)
  To: kvm, Radim Krčmář, Paolo Bonzini
  Cc: Evgeny Yakovlev, Denis V . Lunev, Eduardo Habkost, Igor Mammedov

There is a flaw in the Hyper-V SynIC implementation in KVM: when message
page or event flags page is enabled by setting the corresponding msr,
KVM zeroes it out.  This is problematic because on migration the
corresponding MSRs are loaded on the destination, so the content of
those pages is lost.

This went unnoticed so far because the only user of those pages was
in-KVM hyperv synic timers, which could continue working despite that
zeroing.

Newer QEMU uses those pages for Hyper-V VMBus implementation, and
zeroing them breaks the migration.

Besides, in newer QEMU the content of those pages is fully managed by
QEMU, so zeroing them is undesirable even when writing the MSRs from the
guest side.

To support this new scheme, introduce a new capability,
KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic
pages aren't zeroed out in KVM.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
v2: new patch

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/hyperv.h           |  2 +-
 include/uapi/linux/kvm.h        |  1 +
 arch/x86/kvm/hyperv.c           | 13 +++++++++----
 arch/x86/kvm/x86.c              |  5 ++++-
 5 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 695605e..10e70e7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -458,6 +458,7 @@ struct kvm_vcpu_hv_synic {
 	DECLARE_BITMAP(auto_eoi_bitmap, 256);
 	DECLARE_BITMAP(vec_bitmap, 256);
 	bool active;
+	bool dont_zero_synic_pages;
 };
 
 /* Hyper-V per vcpu emulation context */
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index cd11195..12f65fe 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -56,7 +56,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu);
 void kvm_hv_irq_routing_update(struct kvm *kvm);
 int kvm_hv_synic_set_irq(struct kvm *kvm, u32 vcpu_id, u32 sint);
 void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector);
-int kvm_hv_activate_synic(struct kvm_vcpu *vcpu);
+int kvm_hv_activate_synic(struct kvm_vcpu *vcpu, bool dont_zero_synic_pages);
 
 void kvm_hv_vcpu_init(struct kvm_vcpu *vcpu);
 void kvm_hv_vcpu_uninit(struct kvm_vcpu *vcpu);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 577429a..193fcf9 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -895,6 +895,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_SPAPR_TCE_VFIO 142
 #define KVM_CAP_X86_GUEST_MWAIT 143
 #define KVM_CAP_ARM_USER_IRQ 144
+#define KVM_CAP_HYPERV_SYNIC2 145
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index ebae57a..a808440 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -221,7 +221,8 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 		synic->version = data;
 		break;
 	case HV_X64_MSR_SIEFP:
-		if (data & HV_SYNIC_SIEFP_ENABLE)
+		if ((data & HV_SYNIC_SIEFP_ENABLE) && !host &&
+		    !synic->dont_zero_synic_pages)
 			if (kvm_clear_guest(vcpu->kvm,
 					    data & PAGE_MASK, PAGE_SIZE)) {
 				ret = 1;
@@ -232,7 +233,8 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 			synic_exit(synic, msr);
 		break;
 	case HV_X64_MSR_SIMP:
-		if (data & HV_SYNIC_SIMP_ENABLE)
+		if ((data & HV_SYNIC_SIMP_ENABLE) && !host &&
+		    !synic->dont_zero_synic_pages)
 			if (kvm_clear_guest(vcpu->kvm,
 					    data & PAGE_MASK, PAGE_SIZE)) {
 				ret = 1;
@@ -687,14 +689,17 @@ void kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
 		stimer_init(&hv_vcpu->stimer[i], i);
 }
 
-int kvm_hv_activate_synic(struct kvm_vcpu *vcpu)
+int kvm_hv_activate_synic(struct kvm_vcpu *vcpu, bool dont_zero_synic_pages)
 {
+	struct kvm_vcpu_hv_synic *synic = vcpu_to_synic(vcpu);
+
 	/*
 	 * Hyper-V SynIC auto EOI SINT's are
 	 * not compatible with APICV, so deactivate APICV
 	 */
 	kvm_vcpu_deactivate_apicv(vcpu);
-	vcpu_to_synic(vcpu)->active = true;
+	synic->active = true;
+	synic->dont_zero_synic_pages = dont_zero_synic_pages;
 	return 0;
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 87d3cb9..4bbe43c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2661,6 +2661,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_HYPERV_VAPIC:
 	case KVM_CAP_HYPERV_SPIN:
 	case KVM_CAP_HYPERV_SYNIC:
+	case KVM_CAP_HYPERV_SYNIC2:
 	case KVM_CAP_PCI_SEGMENT:
 	case KVM_CAP_DEBUGREGS:
 	case KVM_CAP_X86_ROBUST_SINGLESTEP:
@@ -3385,9 +3386,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 
 	switch (cap->cap) {
 	case KVM_CAP_HYPERV_SYNIC:
+	case KVM_CAP_HYPERV_SYNIC2:
 		if (!irqchip_in_kernel(vcpu->kvm))
 			return -EINVAL;
-		return kvm_hv_activate_synic(vcpu);
+		return kvm_hv_activate_synic(vcpu, cap->cap ==
+					     KVM_CAP_HYPERV_SYNIC2);
 	default:
 		return -EINVAL;
 	}
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] x86:kvm:hyperv: make VP_INDEX managed by userspace
  2017-06-16 17:37 [PATCH v2 0/2] x86:kvm:hyperv: fix userspace interaction flaws Roman Kagan
  2017-06-16 17:37 ` [PATCH v2 1/2] x86:kvm:hyperv: add KVM_CAP_HYPERV_SYNIC2 Roman Kagan
@ 2017-06-16 17:37 ` Roman Kagan
  2017-06-22 12:14   ` Paolo Bonzini
  1 sibling, 1 reply; 8+ messages in thread
From: Roman Kagan @ 2017-06-16 17:37 UTC (permalink / raw)
  To: kvm, Radim Krčmář, Paolo Bonzini
  Cc: Evgeny Yakovlev, Denis V . Lunev, Eduardo Habkost, Igor Mammedov

Hyper-V identifies vCPUs by Virtual Processor Index, which can be
queried via HV_X64_MSR_VP_INDEX msr.  It is defined by the spec as a
sequential number which can't exceed the maximum number of vCPUs per VM.
APIC ids can be sparse and thus aren't a valid replacement for VP
indices.

Current KVM uses its internal vcpu index as VP_INDEX.  However, to make
it predictable and persistent across VM migrations, the userspace has to
control the value of VP_INDEX.

This patch achieves that, by storing vp_index explicitly on vcpu, and
allowing HV_X64_MSR_VP_INDEX to be set from the host side.  For
compatibility it's initialized to KVM vcpu index.  Also a few variables
are renamed to make clear distinction betweed this Hyper-V vp_index and
KVM vcpu_id (== APIC id).  Besides, a new capability,
KVM_CAP_HYPERV_VP_INDEX, is added to allow the userspace to skip
attempting msr writes where unsupported, to avoid spamming error logs.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
v1 -> v2:
 - add capability

 arch/x86/include/asm/kvm_host.h |  1 +
 include/uapi/linux/kvm.h        |  1 +
 arch/x86/kvm/hyperv.c           | 49 +++++++++++++++++++++++++----------------
 arch/x86/kvm/x86.c              |  1 +
 4 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 10e70e7..3e654e4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -463,6 +463,7 @@ struct kvm_vcpu_hv_synic {
 
 /* Hyper-V per vcpu emulation context */
 struct kvm_vcpu_hv {
+	u32 vp_index;
 	u64 hv_vapic;
 	s64 runtime_offset;
 	struct kvm_vcpu_hv_synic synic;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 193fcf9..93a1385 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -896,6 +896,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_X86_GUEST_MWAIT 143
 #define KVM_CAP_ARM_USER_IRQ 144
 #define KVM_CAP_HYPERV_SYNIC2 145
+#define KVM_CAP_HYPERV_VP_INDEX 146
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index a808440..8671632 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -106,14 +106,27 @@ static int synic_set_sint(struct kvm_vcpu_hv_synic *synic, int sint,
 	return 0;
 }
 
-static struct kvm_vcpu_hv_synic *synic_get(struct kvm *kvm, u32 vcpu_id)
+static struct kvm_vcpu *get_vcpu_by_vpidx(struct kvm *kvm, u32 vpidx)
+{
+	struct kvm_vcpu *vcpu = NULL;
+	int i;
+
+	if (vpidx < KVM_MAX_VCPUS)
+		vcpu = kvm_get_vcpu(kvm, vpidx);
+	if (vcpu && vcpu_to_hv_vcpu(vcpu)->vp_index == vpidx)
+		return vcpu;
+	kvm_for_each_vcpu(i, vcpu, kvm)
+		if (vcpu_to_hv_vcpu(vcpu)->vp_index == vpidx)
+			return vcpu;
+	return NULL;
+}
+
+static struct kvm_vcpu_hv_synic *synic_get(struct kvm *kvm, u32 vpidx)
 {
 	struct kvm_vcpu *vcpu;
 	struct kvm_vcpu_hv_synic *synic;
 
-	if (vcpu_id >= atomic_read(&kvm->online_vcpus))
-		return NULL;
-	vcpu = kvm_get_vcpu(kvm, vcpu_id);
+	vcpu = get_vcpu_by_vpidx(kvm, vpidx);
 	if (!vcpu)
 		return NULL;
 	synic = vcpu_to_synic(vcpu);
@@ -320,11 +333,11 @@ static int synic_set_irq(struct kvm_vcpu_hv_synic *synic, u32 sint)
 	return ret;
 }
 
-int kvm_hv_synic_set_irq(struct kvm *kvm, u32 vcpu_id, u32 sint)
+int kvm_hv_synic_set_irq(struct kvm *kvm, u32 vpidx, u32 sint)
 {
 	struct kvm_vcpu_hv_synic *synic;
 
-	synic = synic_get(kvm, vcpu_id);
+	synic = synic_get(kvm, vpidx);
 	if (!synic)
 		return -EINVAL;
 
@@ -343,11 +356,11 @@ void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector)
 			kvm_hv_notify_acked_sint(vcpu, i);
 }
 
-static int kvm_hv_set_sint_gsi(struct kvm *kvm, u32 vcpu_id, u32 sint, int gsi)
+static int kvm_hv_set_sint_gsi(struct kvm *kvm, u32 vpidx, u32 sint, int gsi)
 {
 	struct kvm_vcpu_hv_synic *synic;
 
-	synic = synic_get(kvm, vcpu_id);
+	synic = synic_get(kvm, vpidx);
 	if (!synic)
 		return -EINVAL;
 
@@ -682,6 +695,8 @@ void kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
 	struct kvm_vcpu_hv *hv_vcpu = vcpu_to_hv_vcpu(vcpu);
 	int i;
 
+	hv_vcpu->vp_index = kvm_vcpu_get_idx(vcpu);
+
 	synic_init(&hv_vcpu->synic);
 
 	bitmap_zero(hv_vcpu->stimer_pending_bitmap, HV_SYNIC_STIMER_COUNT);
@@ -983,6 +998,11 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
 	struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
 
 	switch (msr) {
+	case HV_X64_MSR_VP_INDEX:
+		if (!host)
+			return 1;
+		hv->vp_index = (u32)data;
+		break;
 	case HV_X64_MSR_APIC_ASSIST_PAGE: {
 		u64 gfn;
 		unsigned long addr;
@@ -1094,18 +1114,9 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
 
 	switch (msr) {
-	case HV_X64_MSR_VP_INDEX: {
-		int r;
-		struct kvm_vcpu *v;
-
-		kvm_for_each_vcpu(r, v, vcpu->kvm) {
-			if (v == vcpu) {
-				data = r;
-				break;
-			}
-		}
+	case HV_X64_MSR_VP_INDEX:
+		data = hv->vp_index;
 		break;
-	}
 	case HV_X64_MSR_EOI:
 		return kvm_hv_vapic_msr_read(vcpu, APIC_EOI, pdata);
 	case HV_X64_MSR_ICR:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4bbe43c..a47665c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2662,6 +2662,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_HYPERV_SPIN:
 	case KVM_CAP_HYPERV_SYNIC:
 	case KVM_CAP_HYPERV_SYNIC2:
+	case KVM_CAP_HYPERV_VP_INDEX:
 	case KVM_CAP_PCI_SEGMENT:
 	case KVM_CAP_DEBUGREGS:
 	case KVM_CAP_X86_ROBUST_SINGLESTEP:
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] x86:kvm:hyperv: add KVM_CAP_HYPERV_SYNIC2
  2017-06-16 17:37 ` [PATCH v2 1/2] x86:kvm:hyperv: add KVM_CAP_HYPERV_SYNIC2 Roman Kagan
@ 2017-06-21 20:09   ` Radim Krčmář
  2017-06-22  5:14     ` Roman Kagan
  0 siblings, 1 reply; 8+ messages in thread
From: Radim Krčmář @ 2017-06-21 20:09 UTC (permalink / raw)
  To: Roman Kagan
  Cc: kvm, Paolo Bonzini, Evgeny Yakovlev, Denis V . Lunev,
	Eduardo Habkost, Igor Mammedov

2017-06-16 20:37+0300, Roman Kagan:
> There is a flaw in the Hyper-V SynIC implementation in KVM: when message
> page or event flags page is enabled by setting the corresponding msr,
> KVM zeroes it out.  This is problematic because on migration the
> corresponding MSRs are loaded on the destination, so the content of
> those pages is lost.
> 
> This went unnoticed so far because the only user of those pages was
> in-KVM hyperv synic timers, which could continue working despite that
> zeroing.
> 
> Newer QEMU uses those pages for Hyper-V VMBus implementation, and
> zeroing them breaks the migration.
> 
> Besides, in newer QEMU the content of those pages is fully managed by
> QEMU, so zeroing them is undesirable even when writing the MSRs from the
> guest side.
> 
> To support this new scheme, introduce a new capability,
> KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic
> pages aren't zeroed out in KVM.
> 
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> ---

I have changed the subject tags to more common "kvm: x86: hyperv:" and
added minimal documentation:

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 4029943887a3..3f4b1d5f0dce 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4157,3 +4157,10 @@ Currently the following bits are defined for the device_irq_level bitmap:
 Future versions of kvm may implement additional events. These will get
 indicated by returning a higher number from KVM_CHECK_EXTENSION and will be
 listed above.
+
+8.10 KVM_CAP_HYPERV_SYNIC2
+
+Architectures: x86
+
+The only difference from KVM_CAP_HYPERV_SYNIC is that KVM does not clear SynIC
+pages when the guest enables them.

Please let me know if you'd like to improve it.

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] x86:kvm:hyperv: add KVM_CAP_HYPERV_SYNIC2
  2017-06-21 20:09   ` Radim Krčmář
@ 2017-06-22  5:14     ` Roman Kagan
  2017-06-22 12:06       ` Paolo Bonzini
  0 siblings, 1 reply; 8+ messages in thread
From: Roman Kagan @ 2017-06-22  5:14 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: kvm, Paolo Bonzini, Evgeny Yakovlev, Denis V . Lunev,
	Eduardo Habkost, Igor Mammedov

On Wed, Jun 21, 2017 at 10:09:15PM +0200, Radim Krčmář wrote:
> 2017-06-16 20:37+0300, Roman Kagan:
> > There is a flaw in the Hyper-V SynIC implementation in KVM: when message
> > page or event flags page is enabled by setting the corresponding msr,
> > KVM zeroes it out.  This is problematic because on migration the
> > corresponding MSRs are loaded on the destination, so the content of
> > those pages is lost.
> > 
> > This went unnoticed so far because the only user of those pages was
> > in-KVM hyperv synic timers, which could continue working despite that
> > zeroing.
> > 
> > Newer QEMU uses those pages for Hyper-V VMBus implementation, and
> > zeroing them breaks the migration.
> > 
> > Besides, in newer QEMU the content of those pages is fully managed by
> > QEMU, so zeroing them is undesirable even when writing the MSRs from the
> > guest side.
> > 
> > To support this new scheme, introduce a new capability,
> > KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic
> > pages aren't zeroed out in KVM.
> > 
> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> > ---
> 
> I have changed the subject tags to more common "kvm: x86: hyperv:" and
> added minimal documentation:
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 4029943887a3..3f4b1d5f0dce 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -4157,3 +4157,10 @@ Currently the following bits are defined for the device_irq_level bitmap:
>  Future versions of kvm may implement additional events. These will get
>  indicated by returning a higher number from KVM_CHECK_EXTENSION and will be
>  listed above.
> +
> +8.10 KVM_CAP_HYPERV_SYNIC2
> +
> +Architectures: x86
> +
> +The only difference from KVM_CAP_HYPERV_SYNIC is that KVM does not clear SynIC
> +pages when the guest enables them.
> 
> Please let me know if you'd like to improve it.

Yes that's perfectly fine, thanks.  Sorry, I should have done it
myself...

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] x86:kvm:hyperv: add KVM_CAP_HYPERV_SYNIC2
  2017-06-22  5:14     ` Roman Kagan
@ 2017-06-22 12:06       ` Paolo Bonzini
  2017-06-22 12:13         ` Roman Kagan
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2017-06-22 12:06 UTC (permalink / raw)
  To: Roman Kagan, Radim Krčmář, kvm, Evgeny Yakovlev,
	Denis V . Lunev, Eduardo Habkost, Igor Mammedov



On 22/06/2017 07:14, Roman Kagan wrote:
> On Wed, Jun 21, 2017 at 10:09:15PM +0200, Radim Krčmář wrote:
>> 2017-06-16 20:37+0300, Roman Kagan:
>>> There is a flaw in the Hyper-V SynIC implementation in KVM: when message
>>> page or event flags page is enabled by setting the corresponding msr,
>>> KVM zeroes it out.  This is problematic because on migration the
>>> corresponding MSRs are loaded on the destination, so the content of
>>> those pages is lost.
>>>
>>> This went unnoticed so far because the only user of those pages was
>>> in-KVM hyperv synic timers, which could continue working despite that
>>> zeroing.
>>>
>>> Newer QEMU uses those pages for Hyper-V VMBus implementation, and
>>> zeroing them breaks the migration.
>>>
>>> Besides, in newer QEMU the content of those pages is fully managed by
>>> QEMU, so zeroing them is undesirable even when writing the MSRs from the
>>> guest side.
>>>
>>> To support this new scheme, introduce a new capability,
>>> KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic
>>> pages aren't zeroed out in KVM.
>>>
>>> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
>>> ---
>>
>> I have changed the subject tags to more common "kvm: x86: hyperv:" and
>> added minimal documentation:
>>
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index 4029943887a3..3f4b1d5f0dce 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -4157,3 +4157,10 @@ Currently the following bits are defined for the device_irq_level bitmap:
>>  Future versions of kvm may implement additional events. These will get
>>  indicated by returning a higher number from KVM_CHECK_EXTENSION and will be
>>  listed above.
>> +
>> +8.10 KVM_CAP_HYPERV_SYNIC2
>> +
>> +Architectures: x86
>> +
>> +The only difference from KVM_CAP_HYPERV_SYNIC is that KVM does not clear SynIC
>> +pages when the guest enables them.
>>
>> Please let me know if you'd like to improve it.
> 
> Yes that's perfectly fine, thanks.  Sorry, I should have done it
> myself...

The new capability should also check that args[0] is zero.  That was a
mistake done previously that we shouldn't repeat.

Paolo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] x86:kvm:hyperv: add KVM_CAP_HYPERV_SYNIC2
  2017-06-22 12:06       ` Paolo Bonzini
@ 2017-06-22 12:13         ` Roman Kagan
  0 siblings, 0 replies; 8+ messages in thread
From: Roman Kagan @ 2017-06-22 12:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Radim Krčmář, kvm, Evgeny Yakovlev,
	Denis V . Lunev, Eduardo Habkost, Igor Mammedov

On Thu, Jun 22, 2017 at 02:06:57PM +0200, Paolo Bonzini wrote:
> On 22/06/2017 07:14, Roman Kagan wrote:
> > On Wed, Jun 21, 2017 at 10:09:15PM +0200, Radim Krčmář wrote:
> >> 2017-06-16 20:37+0300, Roman Kagan:
> >>> There is a flaw in the Hyper-V SynIC implementation in KVM: when message
> >>> page or event flags page is enabled by setting the corresponding msr,
> >>> KVM zeroes it out.  This is problematic because on migration the
> >>> corresponding MSRs are loaded on the destination, so the content of
> >>> those pages is lost.
> >>>
> >>> This went unnoticed so far because the only user of those pages was
> >>> in-KVM hyperv synic timers, which could continue working despite that
> >>> zeroing.
> >>>
> >>> Newer QEMU uses those pages for Hyper-V VMBus implementation, and
> >>> zeroing them breaks the migration.
> >>>
> >>> Besides, in newer QEMU the content of those pages is fully managed by
> >>> QEMU, so zeroing them is undesirable even when writing the MSRs from the
> >>> guest side.
> >>>
> >>> To support this new scheme, introduce a new capability,
> >>> KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic
> >>> pages aren't zeroed out in KVM.
> >>>
> >>> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> >>> ---
> >>
> >> I have changed the subject tags to more common "kvm: x86: hyperv:" and
> >> added minimal documentation:
> >>
> >> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> >> index 4029943887a3..3f4b1d5f0dce 100644
> >> --- a/Documentation/virtual/kvm/api.txt
> >> +++ b/Documentation/virtual/kvm/api.txt
> >> @@ -4157,3 +4157,10 @@ Currently the following bits are defined for the device_irq_level bitmap:
> >>  Future versions of kvm may implement additional events. These will get
> >>  indicated by returning a higher number from KVM_CHECK_EXTENSION and will be
> >>  listed above.
> >> +
> >> +8.10 KVM_CAP_HYPERV_SYNIC2
> >> +
> >> +Architectures: x86
> >> +
> >> +The only difference from KVM_CAP_HYPERV_SYNIC is that KVM does not clear SynIC
> >> +pages when the guest enables them.
> >>
> >> Please let me know if you'd like to improve it.
> > 
> > Yes that's perfectly fine, thanks.  Sorry, I should have done it
> > myself...
> 
> The new capability should also check that args[0] is zero.  That was a
> mistake done previously that we shouldn't repeat.

Good point, so that we don't need to add another capability when we
figure out another design flaw ;)

I'll respin the patches then.

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] x86:kvm:hyperv: make VP_INDEX managed by userspace
  2017-06-16 17:37 ` [PATCH v2 2/2] x86:kvm:hyperv: make VP_INDEX managed by userspace Roman Kagan
@ 2017-06-22 12:14   ` Paolo Bonzini
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2017-06-22 12:14 UTC (permalink / raw)
  To: Roman Kagan, kvm, Radim Krčmář
  Cc: Evgeny Yakovlev, Denis V . Lunev, Eduardo Habkost, Igor Mammedov



On 16/06/2017 19:37, Roman Kagan wrote:
> control the value of VP_INDEX.
> 
> This patch achieves that, by storing vp_index explicitly on vcpu, and
> allowing HV_X64_MSR_VP_INDEX to be set from the host side.  For
> compatibility it's initialized to KVM vcpu index.  Also a few variables
> are renamed to make clear distinction betweed this Hyper-V vp_index and
> KVM vcpu_id (== APIC id).  Besides, a new capability,
> KVM_CAP_HYPERV_VP_INDEX, is added to allow the userspace to skip
> attempting msr writes where unsupported, to avoid spamming error logs.
> 
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>

The new capability needs documentation.

Paolo

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-06-22 12:14 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-16 17:37 [PATCH v2 0/2] x86:kvm:hyperv: fix userspace interaction flaws Roman Kagan
2017-06-16 17:37 ` [PATCH v2 1/2] x86:kvm:hyperv: add KVM_CAP_HYPERV_SYNIC2 Roman Kagan
2017-06-21 20:09   ` Radim Krčmář
2017-06-22  5:14     ` Roman Kagan
2017-06-22 12:06       ` Paolo Bonzini
2017-06-22 12:13         ` Roman Kagan
2017-06-16 17:37 ` [PATCH v2 2/2] x86:kvm:hyperv: make VP_INDEX managed by userspace Roman Kagan
2017-06-22 12:14   ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox