public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs
@ 2017-12-18  9:45 Liran Alon
  2017-12-18  9:45 ` [PATCH 1/7] KVM: x86: Add module parameter for supporting VMware backdoor Liran Alon
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Liran Alon @ 2017-12-18  9:45 UTC (permalink / raw)
  To: pbonzini, rkrcmar, kvm; +Cc: idan.brown

Hi,

This series aims to complete KVM support of VMware backdoor interface.
This interface is currently partially supported.

Before these commits, KVM forwaded accesses to vmport I/O ports to QEMU
where they were handled correctly.
However, it turns out VMware design allows these I/O ports to also be
accessible to guest when TSS I/O permissions bitmap disallows it.
The way VMware hypervisor does it is to intercept #GP and on #GP intercept
handler run the x86 emulator which was modified to specifically skip
checking TSS I/O permissions bitmap for these magic I/O ports.
This behavior was found to be a must for some VMware workloads.
For example, VMware Tools Windows installer refuse to install as it
cannot access vmport I/O ports from Ring3, therefore assuming it
is not running inside a VM.

Patches 1-6 mimics the above behavior in KVM. It installs a #GP
intercept on both VMX & SVM such that the #GP intercept handler
will call x86 emulator which was modified to always allow access
to these I/O ports.
Because this behavior is not a must for all workloads
and #GP intercept can introduce a slight performance overhead,
a module parameter was added to control whether KVM will do this or not.

Patch 7 finally completes VMware backdoor interface by adding support
for VMware Pseduo-PMCs. VMware defines a couple of PMCs which are made-up
by their hypervisor which returns various PV information
(Such as host's RDTSC). Similar to vmport I/O ports, these Pseduo-PMCs
can be accessed via RDPMC even if executed from Ring3 when CR4.PCE=0.
Therefore, the x86 emulator was modified to mimic this behavior.

Regards,
-Liran Alon

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/7] KVM: x86: Add module parameter for supporting VMware backdoor
  2017-12-18  9:45 [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Liran Alon
@ 2017-12-18  9:45 ` Liran Alon
  2017-12-18  9:45 ` [PATCH 2/7] KVM: x86: Always allow access to VMware backdoor I/O ports Liran Alon
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Liran Alon @ 2017-12-18  9:45 UTC (permalink / raw)
  To: pbonzini, rkrcmar, kvm; +Cc: idan.brown, Liran Alon, Krish Sadhukhan

Support access to VMware backdoor requires KVM to intercept #GP
exceptions from guest which introduce slight performance hit.
Therefore, control this support by module parameter.

Note that module parameter is exported as it should be consumed by
kvm_intel & kvm_amd to determine if they should intercept #GP or not.

This commit doesn't change semantics.
It is done as a preparation for future commits.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 3 +++
 arch/x86/kvm/x86.c                              | 4 ++++
 arch/x86/kvm/x86.h                              | 2 ++
 3 files changed, 9 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6571fbfdb2a1..05737a507ddc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1867,6 +1867,9 @@
 	kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
 			Default is 0 (don't ignore, but inject #GP)
 
+	kvm.enable_vmware_backdoor=[KVM] Support VMware backdoor PV interface.
+				   Default is false (don't support).
+
 	kvm.mmu_audit=	[KVM] This is a R/W parameter which allows audit
 			KVM MMU at runtime.
 			Default is 0 (off)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index faf843c9b916..5fef09674de1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -138,6 +138,10 @@
 static bool __read_mostly vector_hashing = true;
 module_param(vector_hashing, bool, S_IRUGO);
 
+bool __read_mostly enable_vmware_backdoor = false;
+module_param(enable_vmware_backdoor, bool, S_IRUGO);
+EXPORT_SYMBOL_GPL(enable_vmware_backdoor);
+
 #define KVM_NR_SHARED_MSRS 16
 
 struct kvm_shared_msrs_global {
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index d0b95b7a90b4..1909603cb816 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -241,6 +241,8 @@ bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
 
 extern unsigned int lapic_timer_advance_ns;
 
+extern bool enable_vmware_backdoor;
+
 extern struct static_key kvm_no_apic_vcpu;
 
 static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/7] KVM: x86: Always allow access to VMware backdoor I/O ports
  2017-12-18  9:45 [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Liran Alon
  2017-12-18  9:45 ` [PATCH 1/7] KVM: x86: Add module parameter for supporting VMware backdoor Liran Alon
@ 2017-12-18  9:45 ` Liran Alon
  2017-12-18  9:45 ` [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure Liran Alon
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Liran Alon @ 2017-12-18  9:45 UTC (permalink / raw)
  To: pbonzini, rkrcmar, kvm; +Cc: idan.brown, Liran Alon, Krish Sadhukhan

VMware allows access to these ports even if denied
by TSS I/O permission bitmap. Mimic behavior.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
---
 arch/x86/kvm/emulate.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index abe74f779f9d..8f4a75f4e8b9 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2868,6 +2868,9 @@ static bool emulator_bad_iopl(struct x86_emulate_ctxt *ctxt)
 	return ctxt->ops->cpl(ctxt) > iopl;
 }
 
+#define VMWARE_PORT_VMPORT	(0x5658)
+#define VMWARE_PORT_VMRPC	(0x5659)
+
 static bool emulator_io_port_access_allowed(struct x86_emulate_ctxt *ctxt,
 					    u16 port, u16 len)
 {
@@ -2879,6 +2882,14 @@ static bool emulator_io_port_access_allowed(struct x86_emulate_ctxt *ctxt,
 	unsigned mask = (1 << len) - 1;
 	unsigned long base;
 
+	/*
+	 * VMware allows access to these ports even if denied
+	 * by TSS I/O permission bitmap. Mimic behavior.
+	 */
+	if (enable_vmware_backdoor &&
+	    ((port == VMWARE_PORT_VMPORT) || (port == VMWARE_PORT_VMRPC)))
+		return true;
+
 	ops->get_segment(ctxt, &tr, &tr_seg, &base3, VCPU_SREG_TR);
 	if (!tr_seg.p)
 		return false;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure
  2017-12-18  9:45 [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Liran Alon
  2017-12-18  9:45 ` [PATCH 1/7] KVM: x86: Add module parameter for supporting VMware backdoor Liran Alon
  2017-12-18  9:45 ` [PATCH 2/7] KVM: x86: Always allow access to VMware backdoor I/O ports Liran Alon
@ 2017-12-18  9:45 ` Liran Alon
  2017-12-21 15:11   ` Paolo Bonzini
  2017-12-18  9:45 ` [PATCH 4/7] KVM: x86: Emulate only IN/OUT instructions when accessing VMware backdoor Liran Alon
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Liran Alon @ 2017-12-18  9:45 UTC (permalink / raw)
  To: pbonzini, rkrcmar, kvm; +Cc: idan.brown, Liran Alon, Krish Sadhukhan

Next commits are going introduce support for accessing VMware backdoor
ports even though guest's TSS I/O permissions bitmap doesn't allow
access. This mimic VMware hypervisor behavior.

In order to support this, next commits will change VMX/SVM to
intercept #GP which was raised by such access and handle it by calling
the x86 emulator to emulate instruction. Since commit "KVM: x86:
Always allow access to VMware backdoor I/O ports", the x86 emulator
handles access to these I/O ports by not checking these ports against
the TSS I/O permission bitmap.

It turns out that the x86 emulator is incomplete and therefore
certain instructions that can cause #GP cannot be emulated.
Such an example is "INT x" (opcode 0xcd) which reach emulate_int()
which can only emulate instruction if vCPU is in real-mode.

In those cases, we would like the #GP intercept to just forward #GP
as-is to guest as if there was no intercept to begin with.
However, current emulator code always queue #UD exception in case
emulator fails which is not what is wanted in this flow.

This commit address this issue by adding a new emulation_type flag
that will allow the #GP intercept handler to specify it wish to just
be aware of when instruction emulation fails and doesn't want #UD
exception to be queued.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 12 ++++++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 516798431328..2b7ea1ac4f86 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1168,6 +1168,7 @@ enum emulation_result {
 #define EMULTYPE_SKIP		    (1 << 2)
 #define EMULTYPE_RETRY		    (1 << 3)
 #define EMULTYPE_NO_REEXECUTE	    (1 << 4)
+#define EMULTYPE_NO_UD_ON_FAIL	    (1 << 5)
 int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
 			    int emulation_type, void *insn, int insn_len);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5fef09674de1..8fd2d3e1bcd4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5425,7 +5425,7 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
 }
 EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);
 
-static int handle_emulation_failure(struct kvm_vcpu *vcpu)
+static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
 {
 	int r = EMULATE_DONE;
 
@@ -5437,7 +5437,11 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu)
 		vcpu->run->internal.ndata = 0;
 		r = EMULATE_USER_EXIT;
 	}
-	kvm_queue_exception(vcpu, UD_VECTOR);
+
+	if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)
+		r = EMULATE_FAIL;
+	else
+		kvm_queue_exception(vcpu, UD_VECTOR);
 
 	return r;
 }
@@ -5731,7 +5735,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 				return EMULATE_DONE;
 			if (emulation_type & EMULTYPE_SKIP)
 				return EMULATE_FAIL;
-			return handle_emulation_failure(vcpu);
+			return handle_emulation_failure(vcpu, emulation_type);
 		}
 	}
 
@@ -5766,7 +5770,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 					emulation_type))
 			return EMULATE_DONE;
 
-		return handle_emulation_failure(vcpu);
+		return handle_emulation_failure(vcpu, emulation_type);
 	}
 
 	if (ctxt->have_exception) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/7] KVM: x86: Emulate only IN/OUT instructions when accessing VMware backdoor
  2017-12-18  9:45 [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Liran Alon
                   ` (2 preceding siblings ...)
  2017-12-18  9:45 ` [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure Liran Alon
@ 2017-12-18  9:45 ` Liran Alon
  2017-12-18  9:45 ` [PATCH 5/7] KVM: x86: VMX: Intercept #GP to support access to VMware backdoor ports Liran Alon
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Liran Alon @ 2017-12-18  9:45 UTC (permalink / raw)
  To: pbonzini, rkrcmar, kvm
  Cc: idan.brown, Liran Alon, Krish Sadhukhan, Konrad Rzeszutek Wilk

Access to VMware backdoor ports is done by one of the IN/OUT/INS/OUTS
instructions. These ports must be allowed access even if TSS I/O
permission bitmap don't allow it.

To handle this, VMX/SVM will be changed in future commits
to intercept #GP which was raised by such access and
handle it by calling x86 emulator to emulate instruction.
If it was one of these instructions, the x86 emulator already handles
it correctly (Since commit "KVM: x86: Always allow access to VMware
backdoor I/O ports") by not checking these ports against TSS I/O
permission bitmap.

One may wonder why checking for specific instructions is necessary
as we can just forward all #GPs to the x86 emulator.
However, it turns out that the x86 emulator is incomplete and therefore
certain instructions that can cause #GP cannot be emulated.
Such an example is "INT x" (opcode 0xcd) which reach emulate_int() which
can only emulate instruction if vCPU is in real-mode.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 28 ++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2b7ea1ac4f86..eff7ecadbc44 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1169,6 +1169,7 @@ enum emulation_result {
 #define EMULTYPE_RETRY		    (1 << 3)
 #define EMULTYPE_NO_REEXECUTE	    (1 << 4)
 #define EMULTYPE_NO_UD_ON_FAIL	    (1 << 5)
+#define EMULTYPE_VMWARE		    (1 << 6)
 int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
 			    int emulation_type, void *insn, int insn_len);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8fd2d3e1bcd4..c52a71c1be0d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5684,6 +5684,30 @@ static bool kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu, int *r)
 	return false;
 }
 
+static bool is_vmware_backdoor_opcode(struct x86_emulate_ctxt *ctxt)
+{
+	if (ctxt->opcode_len != 1)
+		return false;
+
+	switch (ctxt->b) {
+	case 0xe4:	/* IN */
+	case 0xe5:
+	case 0xec:
+	case 0xed:
+	case 0xe6:	/* OUT */
+	case 0xe7:
+	case 0xee:
+	case 0xef:
+	case 0x6c:	/* INS */
+	case 0x6d:
+	case 0x6e:	/* OUTS */
+	case 0x6f:
+		return true;
+	}
+
+	return false;
+}
+
 int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 			    unsigned long cr2,
 			    int emulation_type,
@@ -5739,6 +5763,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 		}
 	}
 
+	if ((emulation_type & EMULTYPE_VMWARE) &&
+	    !is_vmware_backdoor_opcode(ctxt))
+		return EMULATE_FAIL;
+
 	if (emulation_type & EMULTYPE_SKIP) {
 		kvm_rip_write(vcpu, ctxt->_eip);
 		if (ctxt->eflags & X86_EFLAGS_RF)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/7] KVM: x86: VMX: Intercept #GP to support access to VMware backdoor ports
  2017-12-18  9:45 [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Liran Alon
                   ` (3 preceding siblings ...)
  2017-12-18  9:45 ` [PATCH 4/7] KVM: x86: Emulate only IN/OUT instructions when accessing VMware backdoor Liran Alon
@ 2017-12-18  9:45 ` Liran Alon
  2017-12-18  9:45 ` [PATCH 6/7] KVM: x86: SVM: " Liran Alon
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Liran Alon @ 2017-12-18  9:45 UTC (permalink / raw)
  To: pbonzini, rkrcmar, kvm
  Cc: idan.brown, Liran Alon, Krish Sadhukhan, Konrad Rzeszutek Wilk

If KVM enable_vmware_backdoor module parameter is set,
the commit change VMX to now intercept #GP instead of being directly
deliviered from CPU to guest.
If vCPU runs in guest-mode and L1 don't intercept #GP, forward it
directly to L2. Otherwise, emulate instruction which caused #GP through
emulator.

It is done to support access to VMware backdoor I/O ports
even if TSS I/O permission denies it.
In that case:
1. A #GP will be raised and intercepted.
2. #GP intercept handler will simulate I/O port access instruction.
3. I/O port access instruction simulation will allow access to VMware
backdoor ports specifically even if TSS I/O permission bitmap denies it.

Note that the above change introduce slight performance hit as now #GPs
are now not deliviered directly from CPU to guest but instead
cause #VMExit and instruction emulation.
However, this behavior is introduced only when enable_vmware_backdoor
KVM module parameter is set.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/kvm/vmx.c | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 8eba631c4dbd..08110f39e657 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1063,6 +1063,11 @@ static inline bool is_invalid_opcode(u32 intr_info)
 	return is_exception_n(intr_info, UD_VECTOR);
 }
 
+static inline bool is_gp_fault(u32 intr_info)
+{
+	return is_exception_n(intr_info, GP_VECTOR);
+}
+
 static inline bool is_external_interrupt(u32 intr_info)
 {
 	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -1905,8 +1910,17 @@ static void update_exception_bitmap(struct kvm_vcpu *vcpu)
 	 */
 	if (is_guest_mode(vcpu))
 		eb |= get_vmcs12(vcpu)->exception_bitmap;
-	else
+	else {
 		eb |= 1u << UD_VECTOR;
+		/*
+		 * Guest access to VMware backdoor ports could legitimately
+		 * trigger #GP because of TSS I/O permission bitmap.
+		 * We intercept those #GP and allow access to them anyway
+		 * as VMware does.
+		 */
+		if (enable_vmware_backdoor)
+			eb |= (1u << GP_VECTOR);
+	}
 
 	vmcs_write32(EXCEPTION_BITMAP, eb);
 }
@@ -5930,6 +5944,18 @@ static int handle_exception(struct kvm_vcpu *vcpu)
 	if (intr_info & INTR_INFO_DELIVER_CODE_MASK)
 		error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE);
 
+	if (!vmx->rmode.vm86_active && is_gp_fault(intr_info)) {
+		WARN_ON_ONCE(!enable_vmware_backdoor);
+		WARN_ON_ONCE(is_guest_mode(vcpu));
+		er = emulate_instruction(vcpu,
+			EMULTYPE_VMWARE | EMULTYPE_NO_UD_ON_FAIL);
+		if (er == EMULATE_USER_EXIT)
+			return 0;
+		else if (er != EMULATE_DONE)
+			kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
+		return 1;
+	}
+
 	/*
 	 * The #PF with PFEC.RSVD = 1 indicates the guest is accessing
 	 * MMIO, it is better to report an internal error.
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 6/7] KVM: x86: SVM: Intercept #GP to support access to VMware backdoor ports
  2017-12-18  9:45 [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Liran Alon
                   ` (4 preceding siblings ...)
  2017-12-18  9:45 ` [PATCH 5/7] KVM: x86: VMX: Intercept #GP to support access to VMware backdoor ports Liran Alon
@ 2017-12-18  9:45 ` Liran Alon
  2017-12-18  9:45 ` [PATCH 7/7] KVM: x86: Add support for VMware backdoor Pseudo-PMCs Liran Alon
  2017-12-21 15:15 ` [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Paolo Bonzini
  7 siblings, 0 replies; 15+ messages in thread
From: Liran Alon @ 2017-12-18  9:45 UTC (permalink / raw)
  To: pbonzini, rkrcmar, kvm
  Cc: idan.brown, Liran Alon, Krish Sadhukhan, Konrad Rzeszutek Wilk

If KVM enable_vmware_backdoor module parameter is set,
the commit change VMX to now intercept #GP instead of being directly
deliviered from CPU to guest.
If vCPU runs in guest-mode and L1 don't intercept #GP, forward it
directly to L2. Otherwise, emulate instruction which caused #GP through
emulator.

It is done to support access to VMware Backdoor I/O ports
even if TSS I/O permission denies it.
In that case:
1. A #GP will be raised and intercepted.
2. #GP intercept handler will simulate I/O port access instruction.
3. I/O port access instruction simulation will allow access to VMware
backdoor ports specifically even if TSS I/O permission bitmap denies it.

Note that the above change introduce slight performance hit as now #GPs
are now not deliviered directly from CPU to guest but instead
cause #VMExit and instruction emulation.
However, this behavior is introduced only when enable_vmware_backdoor
KVM module parameter is set.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/kvm/svm.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index eb714f1cdf7e..31dc73b0b449 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -372,9 +372,10 @@ static void recalc_intercepts(struct vcpu_svm *svm)
 	h = &svm->nested.hsave->control;
 	g = &svm->nested;
 
-	/* No need to intercept #UD if L1 doesn't intercept it */
+	/* No need to intercept #UD/#GP if L1 doesn't intercept it */
 	h_intercept_exceptions =
-		h->intercept_exceptions & ~(1U << UD_VECTOR);
+		h->intercept_exceptions &
+		~((1U << UD_VECTOR) | (1U << GP_VECTOR));
 
 	c->intercept_cr = h->intercept_cr | g->intercept_cr;
 	c->intercept_dr = h->intercept_dr | g->intercept_dr;
@@ -1228,6 +1229,14 @@ static void init_vmcb(struct vcpu_svm *svm)
 	set_exception_intercept(svm, MC_VECTOR);
 	set_exception_intercept(svm, AC_VECTOR);
 	set_exception_intercept(svm, DB_VECTOR);
+	/*
+	 * Guest access to VMware backdoor ports could legitimately
+	 * trigger #GP because of TSS I/O permission bitmap.
+	 * We intercept those #GP and allow access to them anyway
+	 * as VMware does.
+	 */
+	if (enable_vmware_backdoor)
+		set_exception_intercept(svm, GP_VECTOR);
 
 	set_intercept(svm, INTERCEPT_INTR);
 	set_intercept(svm, INTERCEPT_NMI);
@@ -2217,6 +2226,24 @@ static int ac_interception(struct vcpu_svm *svm)
 	return 1;
 }
 
+static int gp_interception(struct vcpu_svm *svm)
+{
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	u32 error_code = svm->vmcb->control.exit_info_1;
+	int er;
+
+	WARN_ON_ONCE(!enable_vmware_backdoor);
+	WARN_ON_ONCE(is_guest_mode(&svm->vcpu));
+
+	er = emulate_instruction(vcpu,
+		EMULTYPE_VMWARE | EMULTYPE_NO_UD_ON_FAIL);
+	if (er == EMULATE_USER_EXIT)
+		return 0;
+	else if (er != EMULATE_DONE)
+		kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
+	return 1;
+}
+
 static bool is_erratum_383(void)
 {
 	int err, i;
@@ -4133,6 +4160,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
 	[SVM_EXIT_EXCP_BASE + PF_VECTOR]	= pf_interception,
 	[SVM_EXIT_EXCP_BASE + MC_VECTOR]	= mc_interception,
 	[SVM_EXIT_EXCP_BASE + AC_VECTOR]	= ac_interception,
+	[SVM_EXIT_EXCP_BASE + GP_VECTOR]	= gp_interception,
 	[SVM_EXIT_INTR]				= intr_interception,
 	[SVM_EXIT_NMI]				= nmi_interception,
 	[SVM_EXIT_SMI]				= nop_on_interception,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 7/7] KVM: x86: Add support for VMware backdoor Pseudo-PMCs
  2017-12-18  9:45 [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Liran Alon
                   ` (5 preceding siblings ...)
  2017-12-18  9:45 ` [PATCH 6/7] KVM: x86: SVM: " Liran Alon
@ 2017-12-18  9:45 ` Liran Alon
  2017-12-21 15:15 ` [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Paolo Bonzini
  7 siblings, 0 replies; 15+ messages in thread
From: Liran Alon @ 2017-12-18  9:45 UTC (permalink / raw)
  To: pbonzini, rkrcmar, kvm; +Cc: idan.brown, Arbel Moshe, Konrad Rzeszutek Wilk

From: Arbel Moshe <arbel.moshe@oracle.com>

VMware exposes the following Pseudo PMCs:
0x10000: Physical host TSC
0x10001: Elapsed real time in ns
0x10002: Elapsed apparent time in ns

For more info refer to:
https://www.vmware.com/files/pdf/techpaper/Timekeeping-In-VirtualMachines.pdf

VMware allows access to these Pseduo-PMCs even when read via RDPMC
in Ring3 and CR4.PCE=0. Therefore, commit modifies x86 emulator
to allow access to these PMCs in this situation. In addition,
emulation of these PMCs were added to kvm_pmu_rdpmc().

Signed-off-by: Arbel Moshe <arbel.moshe@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/kvm/emulate.c |  8 ++++++++
 arch/x86/kvm/pmu.c     | 37 +++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/pmu.h     |  6 ++++++
 arch/x86/kvm/x86.c     | 39 ++++++++++++++++++++++-----------------
 4 files changed, 73 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 8f4a75f4e8b9..ce5b2973fa18 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -29,6 +29,7 @@
 #include "x86.h"
 #include "tss.h"
 #include "mmu.h"
+#include "pmu.h"
 
 /*
  * Operand types
@@ -4236,6 +4237,13 @@ static int check_rdpmc(struct x86_emulate_ctxt *ctxt)
 	u64 cr4 = ctxt->ops->get_cr(ctxt, 4);
 	u64 rcx = reg_read(ctxt, VCPU_REGS_RCX);
 
+	/*
+	 * VMware allows access to these Pseduo-PMCs even when read via RDPMC
+	 * in Ring3 when CR4.PCE=0.
+	 */
+	if (enable_vmware_backdoor && is_vmware_backdoor_pmc(rcx))
+		return X86EMUL_CONTINUE;
+
 	if ((!(cr4 & X86_CR4_PCE) && ctxt->ops->cpl(ctxt)) ||
 	    ctxt->ops->check_pmc(ctxt, rcx))
 		return emulate_gp(ctxt, 0);
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 026db42a86c3..58ead7db71a3 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -244,12 +244,49 @@ int kvm_pmu_is_valid_msr_idx(struct kvm_vcpu *vcpu, unsigned idx)
 	return kvm_x86_ops->pmu_ops->is_valid_msr_idx(vcpu, idx);
 }
 
+bool is_vmware_backdoor_pmc(u32 pmc_idx)
+{
+	switch (pmc_idx) {
+	case VMWARE_BACKDOOR_PMC_HOST_TSC:
+	case VMWARE_BACKDOOR_PMC_REAL_TIME:
+	case VMWARE_BACKDOOR_PMC_APPARENT_TIME:
+		return true;
+	}
+	return false;
+}
+
+static int kvm_pmu_rdpmc_vmware(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
+{
+	u64 ctr_val;
+
+	switch (idx) {
+	case VMWARE_BACKDOOR_PMC_HOST_TSC:
+		ctr_val = rdtsc();
+		break;
+	case VMWARE_BACKDOOR_PMC_REAL_TIME:
+		ctr_val = ktime_get_boot_ns();
+		break;
+	case VMWARE_BACKDOOR_PMC_APPARENT_TIME:
+		ctr_val = ktime_get_boot_ns() +
+			vcpu->kvm->arch.kvmclock_offset;
+		break;
+	default:
+		return 1;
+	}
+
+	*data = ctr_val;
+	return 0;
+}
+
 int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 {
 	bool fast_mode = idx & (1u << 31);
 	struct kvm_pmc *pmc;
 	u64 ctr_val;
 
+	if (is_vmware_backdoor_pmc(idx))
+		return kvm_pmu_rdpmc_vmware(vcpu, idx, data);
+
 	pmc = kvm_x86_ops->pmu_ops->msr_idx_to_pmc(vcpu, idx);
 	if (!pmc)
 		return 1;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index a9a62b9a73e2..ba8898e1a854 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -9,6 +9,10 @@
 /* retrieve the 4 bits for EN and PMI out of IA32_FIXED_CTR_CTRL */
 #define fixed_ctrl_field(ctrl_reg, idx) (((ctrl_reg) >> ((idx)*4)) & 0xf)
 
+#define VMWARE_BACKDOOR_PMC_HOST_TSC		0x10000
+#define VMWARE_BACKDOOR_PMC_REAL_TIME		0x10001
+#define VMWARE_BACKDOOR_PMC_APPARENT_TIME	0x10002
+
 struct kvm_event_hw_type_mapping {
 	u8 eventsel;
 	u8 unit_mask;
@@ -114,6 +118,8 @@ static inline struct kvm_pmc *get_fixed_pmc(struct kvm_pmu *pmu, u32 msr)
 void kvm_pmu_init(struct kvm_vcpu *vcpu);
 void kvm_pmu_destroy(struct kvm_vcpu *vcpu);
 
+bool is_vmware_backdoor_pmc(u32 pmc_idx);
+
 extern struct kvm_pmu_ops intel_pmu_ops;
 extern struct kvm_pmu_ops amd_pmu_ops;
 #endif /* __KVM_X86_PMU_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c52a71c1be0d..da17e9424a42 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5686,23 +5686,28 @@ static bool kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu, int *r)
 
 static bool is_vmware_backdoor_opcode(struct x86_emulate_ctxt *ctxt)
 {
-	if (ctxt->opcode_len != 1)
-		return false;
-
-	switch (ctxt->b) {
-	case 0xe4:	/* IN */
-	case 0xe5:
-	case 0xec:
-	case 0xed:
-	case 0xe6:	/* OUT */
-	case 0xe7:
-	case 0xee:
-	case 0xef:
-	case 0x6c:	/* INS */
-	case 0x6d:
-	case 0x6e:	/* OUTS */
-	case 0x6f:
-		return true;
+	switch (ctxt->opcode_len) {
+	case 1:
+		switch (ctxt->b) {
+		case 0xe4:	/* IN */
+		case 0xe5:
+		case 0xec:
+		case 0xed:
+		case 0xe6:	/* OUT */
+		case 0xe7:
+		case 0xee:
+		case 0xef:
+		case 0x6c:	/* INS */
+		case 0x6d:
+		case 0x6e:	/* OUTS */
+		case 0x6f:
+			return true;
+		}
+	case 2:
+		switch (ctxt->b) {
+		case 0x33:	/* RDPMC */
+			return true;
+		}
 	}
 
 	return false;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure
  2017-12-18  9:45 ` [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure Liran Alon
@ 2017-12-21 15:11   ` Paolo Bonzini
  2017-12-22  1:11     ` Liran Alon
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2017-12-21 15:11 UTC (permalink / raw)
  To: Liran Alon, rkrcmar, kvm; +Cc: idan.brown, Krish Sadhukhan

On 18/12/2017 10:45, Liran Alon wrote:
> Next commits are going introduce support for accessing VMware backdoor
> ports even though guest's TSS I/O permissions bitmap doesn't allow
> access. This mimic VMware hypervisor behavior.
> 
> In order to support this, next commits will change VMX/SVM to
> intercept #GP which was raised by such access and handle it by calling
> the x86 emulator to emulate instruction. Since commit "KVM: x86:
> Always allow access to VMware backdoor I/O ports", the x86 emulator
> handles access to these I/O ports by not checking these ports against
> the TSS I/O permission bitmap.
> 
> It turns out that the x86 emulator is incomplete and therefore
> certain instructions that can cause #GP cannot be emulated.
> Such an example is "INT x" (opcode 0xcd) which reach emulate_int()
> which can only emulate instruction if vCPU is in real-mode.
> 
> In those cases, we would like the #GP intercept to just forward #GP
> as-is to guest as if there was no intercept to begin with.
> However, current emulator code always queue #UD exception in case
> emulator fails which is not what is wanted in this flow.
> 
> This commit address this issue by adding a new emulation_type flag
> that will allow the #GP intercept handler to specify it wish to just
> be aware of when instruction emulation fails and doesn't want #UD
> exception to be queued.
> 
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/x86.c              | 12 ++++++++----
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 516798431328..2b7ea1ac4f86 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1168,6 +1168,7 @@ enum emulation_result {
>  #define EMULTYPE_SKIP		    (1 << 2)
>  #define EMULTYPE_RETRY		    (1 << 3)
>  #define EMULTYPE_NO_REEXECUTE	    (1 << 4)
> +#define EMULTYPE_NO_UD_ON_FAIL	    (1 << 5)
>  int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
>  			    int emulation_type, void *insn, int insn_len);
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5fef09674de1..8fd2d3e1bcd4 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5425,7 +5425,7 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
>  }
>  EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);
>  
> -static int handle_emulation_failure(struct kvm_vcpu *vcpu)
> +static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
>  {
>  	int r = EMULATE_DONE;
>  
> @@ -5437,7 +5437,11 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu)
>  		vcpu->run->internal.ndata = 0;
>  		r = EMULATE_USER_EXIT;
>  	}
> -	kvm_queue_exception(vcpu, UD_VECTOR);
> +
> +	if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)
> +		r = EMULATE_FAIL;

There seems to be some overlap between EMULTYPE_VMWARE and
EMULTYPE_NO_UD_ON_FAIL.

Even if you don't specify EMULTYPE_VMWARE, the emulation could fail, but
there should have been no writeback and injecting the original #GP
exception should be safe.  Maybe should EMULTYPE_NO_UD_ON_FAIL return
straight away, skipping even the user-mode exit.

On the other hand, you may want to have EMULTYPE_VMWARE so as to reduce
the attack surface from the emulator (as the emulator would then be very
easy to trigger, just by executing an instruction that causes #GP).  In
that case, however, emulation of the {in,out}{s,} instructions shouldn't
fail and you shouldn't need EMULTYPE_NO_UD_ON_FAIL.

Paolo


> +	else
> +		kvm_queue_exception(vcpu, UD_VECTOR);
>  
>  	return r;
>  }
> @@ -5731,7 +5735,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>  				return EMULATE_DONE;
>  			if (emulation_type & EMULTYPE_SKIP)
>  				return EMULATE_FAIL;
> -			return handle_emulation_failure(vcpu);
> +			return handle_emulation_failure(vcpu, emulation_type);
>  		}
>  	}
>  
> @@ -5766,7 +5770,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>  					emulation_type))
>  			return EMULATE_DONE;
>  
> -		return handle_emulation_failure(vcpu);
> +		return handle_emulation_failure(vcpu, emulation_type);
>  	}
>  
>  	if (ctxt->have_exception) {
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs
  2017-12-18  9:45 [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Liran Alon
                   ` (6 preceding siblings ...)
  2017-12-18  9:45 ` [PATCH 7/7] KVM: x86: Add support for VMware backdoor Pseudo-PMCs Liran Alon
@ 2017-12-21 15:15 ` Paolo Bonzini
  2017-12-22  1:22   ` Liran Alon
  7 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2017-12-21 15:15 UTC (permalink / raw)
  To: Liran Alon, rkrcmar, kvm; +Cc: idan.brown

On 18/12/2017 10:45, Liran Alon wrote:
> Hi,
> 
> This series aims to complete KVM support of VMware backdoor interface.
> This interface is currently partially supported.
> 
> Before these commits, KVM forwaded accesses to vmport I/O ports to QEMU
> where they were handled correctly.
> However, it turns out VMware design allows these I/O ports to also be
> accessible to guest when TSS I/O permissions bitmap disallows it.
> The way VMware hypervisor does it is to intercept #GP and on #GP intercept
> handler run the x86 emulator which was modified to specifically skip
> checking TSS I/O permissions bitmap for these magic I/O ports.
> This behavior was found to be a must for some VMware workloads.
> For example, VMware Tools Windows installer refuse to install as it
> cannot access vmport I/O ports from Ring3, therefore assuming it
> is not running inside a VM.
> 
> Patches 1-6 mimics the above behavior in KVM. It installs a #GP
> intercept on both VMX & SVM such that the #GP intercept handler
> will call x86 emulator which was modified to always allow access
> to these I/O ports.
> Because this behavior is not a must for all workloads
> and #GP intercept can introduce a slight performance overhead,
> a module parameter was added to control whether KVM will do this or not.
> 
> Patch 7 finally completes VMware backdoor interface by adding support
> for VMware Pseduo-PMCs. VMware defines a couple of PMCs which are made-up
> by their hypervisor which returns various PV information
> (Such as host's RDTSC). Similar to vmport I/O ports, these Pseduo-PMCs
> can be accessed via RDPMC even if executed from Ring3 when CR4.PCE=0.
> Therefore, the x86 emulator was modified to mimic this behavior.

Looks good, I have only a question in patch 3.

Also, just for confirmation, in the nested case L1 can still trap #GP
using the exception bitmap, right?  Can you write a testcase for
kvm-unit-tests/vmx.c, for both RDPMC and the 0x5658 port?

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure
  2017-12-21 15:11   ` Paolo Bonzini
@ 2017-12-22  1:11     ` Liran Alon
  2017-12-22 15:16       ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Liran Alon @ 2017-12-22  1:11 UTC (permalink / raw)
  To: Paolo Bonzini, rkrcmar, kvm; +Cc: idan.brown, Krish Sadhukhan



On 21/12/17 17:11, Paolo Bonzini wrote:
> On 18/12/2017 10:45, Liran Alon wrote:
>> Next commits are going introduce support for accessing VMware backdoor
>> ports even though guest's TSS I/O permissions bitmap doesn't allow
>> access. This mimic VMware hypervisor behavior.
>>
>> In order to support this, next commits will change VMX/SVM to
>> intercept #GP which was raised by such access and handle it by calling
>> the x86 emulator to emulate instruction. Since commit "KVM: x86:
>> Always allow access to VMware backdoor I/O ports", the x86 emulator
>> handles access to these I/O ports by not checking these ports against
>> the TSS I/O permission bitmap.
>>
>> It turns out that the x86 emulator is incomplete and therefore
>> certain instructions that can cause #GP cannot be emulated.
>> Such an example is "INT x" (opcode 0xcd) which reach emulate_int()
>> which can only emulate instruction if vCPU is in real-mode.
>>
>> In those cases, we would like the #GP intercept to just forward #GP
>> as-is to guest as if there was no intercept to begin with.
>> However, current emulator code always queue #UD exception in case
>> emulator fails which is not what is wanted in this flow.
>>
>> This commit address this issue by adding a new emulation_type flag
>> that will allow the #GP intercept handler to specify it wish to just
>> be aware of when instruction emulation fails and doesn't want #UD
>> exception to be queued.
>>
>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
>> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
>> ---
>>   arch/x86/include/asm/kvm_host.h |  1 +
>>   arch/x86/kvm/x86.c              | 12 ++++++++----
>>   2 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 516798431328..2b7ea1ac4f86 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1168,6 +1168,7 @@ enum emulation_result {
>>   #define EMULTYPE_SKIP		    (1 << 2)
>>   #define EMULTYPE_RETRY		    (1 << 3)
>>   #define EMULTYPE_NO_REEXECUTE	    (1 << 4)
>> +#define EMULTYPE_NO_UD_ON_FAIL	    (1 << 5)
>>   int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
>>   			    int emulation_type, void *insn, int insn_len);
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 5fef09674de1..8fd2d3e1bcd4 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5425,7 +5425,7 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
>>   }
>>   EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);
>>
>> -static int handle_emulation_failure(struct kvm_vcpu *vcpu)
>> +static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
>>   {
>>   	int r = EMULATE_DONE;
>>
>> @@ -5437,7 +5437,11 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu)
>>   		vcpu->run->internal.ndata = 0;
>>   		r = EMULATE_USER_EXIT;
>>   	}
>> -	kvm_queue_exception(vcpu, UD_VECTOR);
>> +
>> +	if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)
>> +		r = EMULATE_FAIL;
>
> There seems to be some overlap between EMULTYPE_VMWARE and
> EMULTYPE_NO_UD_ON_FAIL.
>
> Even if you don't specify EMULTYPE_VMWARE, the emulation could fail, but
> there should have been no writeback and injecting the original #GP
> exception should be safe.

What you mention is true in case x86_emulate_instruction() fails on 
instruction emulation but it could also fail on instruction disassembly.
(See more details below).

> Maybe should EMULTYPE_NO_UD_ON_FAIL return
> straight away, skipping even the user-mode exit.

I agree it is possible to check
"if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)"
immediately after statistics & tracing, as if this flag is set, we will 
return EMULATE_FAIL anyway (maybe overwriting EMULATE_USER_EXIT).
If that's important, I can do such change in a v2 of this patch.

>
> On the other hand, you may want to have EMULTYPE_VMWARE so as to reduce
> the attack surface from the emulator (as the emulator would then be very
> easy to trigger, just by executing an instruction that causes #GP).  In
> that case, however, emulation of the {in,out}{s,} instructions shouldn't
> fail and you shouldn't need EMULTYPE_NO_UD_ON_FAIL.

Consider the case where the CPU raises a #GP on some instruction which 
is now intercepted by KVM. The #GP intercept will call 
x86_emulate_instruction(). If the x86 emulator disassembly engine is 
incomplete and therefore doesn't know how to parse the instruction which 
caused the #GP, x86_decode_insn() will fail which will also reach 
handle_emulation_failure(). If there is no EMULTYPE_NO_UD_ON_FAIL flag, 
this will cause a #UD exception to be queued which is not what we want.
(We would like to preserve behaviour of raising a #GP to guest)

Therefore we can summarize these flags usage as follows:
1. EMULTYPE_NO_UD_ON_FAIL is used to tell emulator "if you fail to 
disassemble the instruction, I just want you to return failure. Do not 
queue a #UD and let me decide what should be the proper response".
2. EMULTYPE_VMWARE is indeed used to avoid making all instructions that 
could raise #GP to reach instruction-emulation as the x86 emulator is 
incomplete anyway and it just, as you say, increase attack surface.

Having said that, I agree the commit messages of the 2 commits 
introducing these flags may not be indicative enough. If we agree on the 
written above, I can fix them in v2 of this series.

What do you think?

Regards,
-Liran

>
> Paolo
>
>
>> +	else
>> +		kvm_queue_exception(vcpu, UD_VECTOR);
>>
>>   	return r;
>>   }
>> @@ -5731,7 +5735,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>>   				return EMULATE_DONE;
>>   			if (emulation_type & EMULTYPE_SKIP)
>>   				return EMULATE_FAIL;
>> -			return handle_emulation_failure(vcpu);
>> +			return handle_emulation_failure(vcpu, emulation_type);
>>   		}
>>   	}
>>
>> @@ -5766,7 +5770,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>>   					emulation_type))
>>   			return EMULATE_DONE;
>>
>> -		return handle_emulation_failure(vcpu);
>> +		return handle_emulation_failure(vcpu, emulation_type);
>>   	}
>>
>>   	if (ctxt->have_exception) {
>>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs
  2017-12-21 15:15 ` [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Paolo Bonzini
@ 2017-12-22  1:22   ` Liran Alon
  0 siblings, 0 replies; 15+ messages in thread
From: Liran Alon @ 2017-12-22  1:22 UTC (permalink / raw)
  To: Paolo Bonzini, rkrcmar, kvm; +Cc: idan.brown



On 21/12/17 17:15, Paolo Bonzini wrote:
> On 18/12/2017 10:45, Liran Alon wrote:
>> Hi,
>>
>> This series aims to complete KVM support of VMware backdoor interface.
>> This interface is currently partially supported.
>>
>> Before these commits, KVM forwaded accesses to vmport I/O ports to QEMU
>> where they were handled correctly.
>> However, it turns out VMware design allows these I/O ports to also be
>> accessible to guest when TSS I/O permissions bitmap disallows it.
>> The way VMware hypervisor does it is to intercept #GP and on #GP intercept
>> handler run the x86 emulator which was modified to specifically skip
>> checking TSS I/O permissions bitmap for these magic I/O ports.
>> This behavior was found to be a must for some VMware workloads.
>> For example, VMware Tools Windows installer refuse to install as it
>> cannot access vmport I/O ports from Ring3, therefore assuming it
>> is not running inside a VM.
>>
>> Patches 1-6 mimics the above behavior in KVM. It installs a #GP
>> intercept on both VMX & SVM such that the #GP intercept handler
>> will call x86 emulator which was modified to always allow access
>> to these I/O ports.
>> Because this behavior is not a must for all workloads
>> and #GP intercept can introduce a slight performance overhead,
>> a module parameter was added to control whether KVM will do this or not.
>>
>> Patch 7 finally completes VMware backdoor interface by adding support
>> for VMware Pseduo-PMCs. VMware defines a couple of PMCs which are made-up
>> by their hypervisor which returns various PV information
>> (Such as host's RDTSC). Similar to vmport I/O ports, these Pseduo-PMCs
>> can be accessed via RDPMC even if executed from Ring3 when CR4.PCE=0.
>> Therefore, the x86 emulator was modified to mimic this behavior.
>
> Looks good, I have only a question in patch 3.

Replied on relevant discussion on patch 3.

>
> Also, just for confirmation, in the nested case L1 can still trap #GP
> using the exception bitmap, right?  Can you write a testcase for
> kvm-unit-tests/vmx.c, for both RDPMC and the 0x5658 port?

Yes.

vmx_handle_exit() makes sure to check if exit should be reflected to L1 
before it runs L0 exit handler.
See how nested_vmx_exit_reflected() & nested_vmx_reflect_vmexit() are 
called before one of the pointers at kvm_vmx_exit_handlers[].

handle_exception() is one of the handlers in kvm_vmx_exit_handlers[] and 
therefore the handling of #GP intercept there should run only if the #GP 
wasn't intercepted by L1.

In addition, in update_exception_bitmap() we make sure to not intercept 
#GP if is_guest_mode(vcpu)==true. Because if we are running L2, L1 
should be the one responsible for emulating #GP behaviour if any.
In addition, this is validated in handle_exception() #GP intercept 
handler by a warning.

We have already written kvm-unit-tests for checking the functionality 
introduced by this patch series that will be posted to upstream shortly.
I will find time to also add a test-case for vmx.c for the nested case.

Regards,
-Liran

>
> Thanks,
>
> Paolo
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure
  2017-12-22  1:11     ` Liran Alon
@ 2017-12-22 15:16       ` Paolo Bonzini
  2017-12-22 15:53         ` Liran Alon
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2017-12-22 15:16 UTC (permalink / raw)
  To: Liran Alon, rkrcmar, kvm; +Cc: idan.brown, Krish Sadhukhan

On 22/12/2017 02:11, Liran Alon wrote:
> Consider the case where the CPU raises a #GP on some instruction
> which is now intercepted by KVM. The #GP intercept will call
> x86_emulate_instruction(). If the x86 emulator disassembly engine is
> incomplete and therefore doesn't know how to parse the instruction
> which caused the #GP, x86_decode_insn() will fail which will also
> reach handle_emulation_failure(). If there is no
> EMULTYPE_NO_UD_ON_FAIL flag, this will cause a #UD exception to be
> queued which is not what we want.

Yup, however EMULTYPE_VMWARE has filtered the opcodes, hasn't it?  So in
this case you shouldn't fail the decoding.

> Therefore we can summarize these flags usage as follows: 1.
> EMULTYPE_NO_UD_ON_FAIL is used to tell emulator "if you fail to 
> disassemble the instruction, I just want you to return failure. Do
> not queue a #UD and let me decide what should be the proper
> response".
>
> 2. EMULTYPE_VMWARE is indeed used to avoid making all
> instructions that could raise #GP to reach instruction-emulation as
> the x86 emulator is incomplete anyway and it just, as you say,
> increase attack surface.
> 
> Having said that, I agree the commit messages of the 2 commits 
> introducing these flags may not be indicative enough. If we agree on
> the written above, I can fix them in v2 of this series.

Yeah, that's good.  In particular it's important to note that
EMULTYPE_VMWARE is not for correctness, only for hardening.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure
  2017-12-22 15:16       ` Paolo Bonzini
@ 2017-12-22 15:53         ` Liran Alon
  2017-12-22 15:59           ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Liran Alon @ 2017-12-22 15:53 UTC (permalink / raw)
  To: Paolo Bonzini, rkrcmar, kvm; +Cc: idan.brown, Krish Sadhukhan



On 22/12/17 17:16, Paolo Bonzini wrote:
> On 22/12/2017 02:11, Liran Alon wrote:
>> Consider the case where the CPU raises a #GP on some instruction
>> which is now intercepted by KVM. The #GP intercept will call
>> x86_emulate_instruction(). If the x86 emulator disassembly engine is
>> incomplete and therefore doesn't know how to parse the instruction
>> which caused the #GP, x86_decode_insn() will fail which will also
>> reach handle_emulation_failure(). If there is no
>> EMULTYPE_NO_UD_ON_FAIL flag, this will cause a #UD exception to be
>> queued which is not what we want.
>
> Yup, however EMULTYPE_VMWARE has filtered the opcodes, hasn't it?  So in
> this case you shouldn't fail the decoding.

In my current implementation EMULTYPE_VMWARE is considered only after 
the disassembly engine (x86_decode_insn()) has succeeded. It is true I 
could have filtered the opcodes before invoking the disassembly engine 
but that will make code a bit more complex. In addition, I didn't saw a 
lot of value in reducing the attack surface from the disassembly engine 
itself. Only from the emulation.

Therefore, I decided to make the EMULTYPE_NO_UD_ON_FAIL flag which may 
be also useful in the future for other use cases.

Regards,
-Liran

>
>> Therefore we can summarize these flags usage as follows: 1.
>> EMULTYPE_NO_UD_ON_FAIL is used to tell emulator "if you fail to
>> disassemble the instruction, I just want you to return failure. Do
>> not queue a #UD and let me decide what should be the proper
>> response".
>>
>> 2. EMULTYPE_VMWARE is indeed used to avoid making all
>> instructions that could raise #GP to reach instruction-emulation as
>> the x86 emulator is incomplete anyway and it just, as you say,
>> increase attack surface.
>>
>> Having said that, I agree the commit messages of the 2 commits
>> introducing these flags may not be indicative enough. If we agree on
>> the written above, I can fix them in v2 of this series.
>
> Yeah, that's good.  In particular it's important to note that
> EMULTYPE_VMWARE is not for correctness, only for hardening.
>
> Thanks,
>
> Paolo
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure
  2017-12-22 15:53         ` Liran Alon
@ 2017-12-22 15:59           ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2017-12-22 15:59 UTC (permalink / raw)
  To: Liran Alon, rkrcmar, kvm; +Cc: idan.brown, Krish Sadhukhan

On 22/12/2017 16:53, Liran Alon wrote:
>>
>> Yup, however EMULTYPE_VMWARE has filtered the opcodes, hasn't it?  So in
>> this case you shouldn't fail the decoding.
> 
> In my current implementation EMULTYPE_VMWARE is considered only after
> the disassembly engine (x86_decode_insn()) has succeeded. It is true I
> could have filtered the opcodes before invoking the disassembly engine
> but that will make code a bit more complex. In addition, I didn't saw a
> lot of value in reducing the attack surface from the disassembly engine
> itself. Only from the emulation.

We've had oopses from incorrect decoding, so it can be useful.  You're
right that the RDPMC case has a two-byte opcode and therefore it makes
decoding a bit more complex.

I'd say, let's keep it simple and start with EMULTYPE_NO_UD_ON_FAIL.

Thanks,

Paolo

> Therefore, I decided to make the EMULTYPE_NO_UD_ON_FAIL flag which may
> be also useful in the future for other use cases.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-12-22 15:59 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-18  9:45 [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Liran Alon
2017-12-18  9:45 ` [PATCH 1/7] KVM: x86: Add module parameter for supporting VMware backdoor Liran Alon
2017-12-18  9:45 ` [PATCH 2/7] KVM: x86: Always allow access to VMware backdoor I/O ports Liran Alon
2017-12-18  9:45 ` [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure Liran Alon
2017-12-21 15:11   ` Paolo Bonzini
2017-12-22  1:11     ` Liran Alon
2017-12-22 15:16       ` Paolo Bonzini
2017-12-22 15:53         ` Liran Alon
2017-12-22 15:59           ` Paolo Bonzini
2017-12-18  9:45 ` [PATCH 4/7] KVM: x86: Emulate only IN/OUT instructions when accessing VMware backdoor Liran Alon
2017-12-18  9:45 ` [PATCH 5/7] KVM: x86: VMX: Intercept #GP to support access to VMware backdoor ports Liran Alon
2017-12-18  9:45 ` [PATCH 6/7] KVM: x86: SVM: " Liran Alon
2017-12-18  9:45 ` [PATCH 7/7] KVM: x86: Add support for VMware backdoor Pseudo-PMCs Liran Alon
2017-12-21 15:15 ` [PATCH 0/7] KVM: x86: Add support for VMware backdoor I/O ports & Pseduo-PMCs Paolo Bonzini
2017-12-22  1:22   ` Liran Alon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox