* [PATCH v2 0/5] KVM: Improve VMware guest support
@ 2025-04-22 16:12 Zack Rusin
2025-04-22 16:12 ` [PATCH v2 2/5] KVM: x86: Allow enabling of the vmware backdoor via a cap Zack Rusin
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Zack Rusin @ 2025-04-22 16:12 UTC (permalink / raw)
To: linux-kernel
Cc: Zack Rusin, Paolo Bonzini, Jonathan Corbet, Sean Christopherson,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Doug Covelli, Shuah Khan, Namhyung Kim,
Arnaldo Carvalho de Melo, Michael Ellerman, Joel Stanley,
Isaku Yamahata, kvm, linux-doc, linux-kselftest
This is the second version of a series that lets us run VMware
Workstation on Linux on top of KVM.
The most significant change in this series is the introduction of
CONFIG_KVM_VMWARE (first patch) which is, in general, a nice cleanup
of various bits of VMware compatibility code that have been scattered
around KVM.
The rest of the series builds upon the VMware platform to implement
features that are needed to run VMware guests without any
modifications on top of KVM:
- ability to turn on the VMware backdoor at runtime on a per-VM basis
(previously it was only a kernel boot argument)
- support for VMware hypercalls - VMware products have a huge
collection of hypercalls, all of which are handled in userspace
- support for handling the legacy VMware backdoor in L0 in nested
configs - when Workstation runs a Windows VBS guest, L0 is KVM and
L1 is Hyper-V, so by default VMware Tools backdoor calls end up in
Hyper-V, which cannot handle them; a new cap lets L0 handle those
instead
The final change in the series is a kselftest of the VMware hypercall
functionality.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Zack Rusin <zack.rusin@broadcom.com>
Cc: Doug Covelli <doug.covelli@broadcom.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Isaku Yamahata <isaku.yamahata@intel.com>
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Zack Rusin (5):
KVM: x86: Centralize KVM's VMware code
KVM: x86: Allow enabling of the vmware backdoor via a cap
KVM: x86: Add support for VMware guest specific hypercalls
KVM: x86: Add support for legacy VMware backdoors in nested setups
KVM: selftests: x86: Add a test for KVM_CAP_X86_VMWARE_HYPERCALL
Documentation/virt/kvm/api.rst | 86 +++++++-
MAINTAINERS | 9 +
arch/x86/include/asm/kvm_host.h | 13 ++
arch/x86/kvm/Kconfig | 16 ++
arch/x86/kvm/Makefile | 1 +
arch/x86/kvm/emulate.c | 11 +-
arch/x86/kvm/kvm_vmware.c | 85 ++++++++
arch/x86/kvm/kvm_vmware.h | 189 ++++++++++++++++++
arch/x86/kvm/pmu.c | 39 +---
arch/x86/kvm/pmu.h | 4 -
arch/x86/kvm/svm/nested.c | 6 +
arch/x86/kvm/svm/svm.c | 10 +-
arch/x86/kvm/vmx/nested.c | 6 +
arch/x86/kvm/vmx/vmx.c | 5 +-
arch/x86/kvm/x86.c | 74 +++----
arch/x86/kvm/x86.h | 2 -
include/uapi/linux/kvm.h | 27 +++
tools/include/uapi/linux/kvm.h | 3 +
tools/testing/selftests/kvm/Makefile.kvm | 1 +
.../selftests/kvm/x86/vmware_hypercall_test.c | 121 +++++++++++
20 files changed, 614 insertions(+), 94 deletions(-)
create mode 100644 arch/x86/kvm/kvm_vmware.c
create mode 100644 arch/x86/kvm/kvm_vmware.h
create mode 100644 tools/testing/selftests/kvm/x86/vmware_hypercall_test.c
--
2.48.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v2 2/5] KVM: x86: Allow enabling of the vmware backdoor via a cap
2025-04-22 16:12 [PATCH v2 0/5] KVM: Improve VMware guest support Zack Rusin
@ 2025-04-22 16:12 ` Zack Rusin
2025-07-23 18:06 ` Sean Christopherson
2025-04-22 16:12 ` [PATCH v2 3/5] KVM: x86: Add support for VMware guest specific hypercalls Zack Rusin
2025-04-22 16:12 ` [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups Zack Rusin
2 siblings, 1 reply; 17+ messages in thread
From: Zack Rusin @ 2025-04-22 16:12 UTC (permalink / raw)
To: linux-kernel
Cc: Zack Rusin, Doug Covelli, Paolo Bonzini, Jonathan Corbet,
Sean Christopherson, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
Allow enabling of the VMware backdoor on a per-VM basis. Previously the
VMware backdoor could only be enabled system-wide via the
kvm.enable_vmware_backdoor kernel parameter, which required modifying
the kernel's boot parameters.
Add the KVM_CAP_X86_VMWARE_BACKDOOR cap that enables the backdoor at the
hypervisor level and allows setting it on a per-VM basis.
The default is whatever kvm.enable_vmware_backdoor was set to, which is
false by default.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: Doug Covelli <doug.covelli@broadcom.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Zack Rusin <zack.rusin@broadcom.com>
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Documentation/virt/kvm/api.rst | 15 +++++++++++++++
arch/x86/include/asm/kvm_host.h | 11 +++++++++++
arch/x86/kvm/Makefile | 1 +
arch/x86/kvm/kvm_vmware.c | 16 ++++++++++++++++
arch/x86/kvm/kvm_vmware.h | 10 +++++++---
arch/x86/kvm/x86.c | 21 +++++++++++++++++----
include/uapi/linux/kvm.h | 1 +
7 files changed, 68 insertions(+), 7 deletions(-)
create mode 100644 arch/x86/kvm/kvm_vmware.c
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 2b52eb77e29c..24bc80764fdc 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8258,6 +8258,21 @@ KVM exits with the register state of either the L1 or L2 guest
depending on which executed at the time of an exit. Userspace must
take care to differentiate between these cases.
+7.37 KVM_CAP_X86_VMWARE_BACKDOOR
+--------------------------------
+
+:Architectures: x86
+:Parameters: args[0] whether the feature should be enabled or not
+:Returns: 0 on success.
+
+The presence of this capability indicates that KVM supports
+enabling of the VMware backdoor via the enable cap interface.
+
+When enabled, KVM supports the VMware backdoor PV interface. The
+default value is set via the kvm.enable_vmware_backdoor kernel
+parameter (false when not set). The capability must be set before
+any VCPUs have been created.
+
8. Other capabilities.
======================
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 32ae3aa50c7e..5670d7d02d1b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1192,6 +1192,13 @@ struct kvm_xen {
};
#endif
+#ifdef CONFIG_KVM_VMWARE
+/* VMware emulation context */
+struct kvm_vmware {
+ bool backdoor_enabled;
+};
+#endif
+
enum kvm_irqchip_mode {
KVM_IRQCHIP_NONE,
KVM_IRQCHIP_KERNEL, /* created with KVM_CREATE_IRQCHIP */
@@ -1420,6 +1427,10 @@ struct kvm_arch {
struct kvm_hv hyperv;
#endif
+#ifdef CONFIG_KVM_VMWARE
+ struct kvm_vmware vmware;
+#endif
+
#ifdef CONFIG_KVM_XEN
struct kvm_xen xen;
#endif
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index f9dddb8cb466..addd6a1005ce 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -12,6 +12,7 @@ kvm-y += x86.o emulate.o i8259.o irq.o lapic.o \
kvm-$(CONFIG_X86_64) += mmu/tdp_iter.o mmu/tdp_mmu.o
kvm-$(CONFIG_KVM_HYPERV) += hyperv.o
+kvm-$(CONFIG_KVM_VMWARE) += kvm_vmware.o
kvm-$(CONFIG_KVM_XEN) += xen.o
kvm-$(CONFIG_KVM_SMM) += smm.o
diff --git a/arch/x86/kvm/kvm_vmware.c b/arch/x86/kvm/kvm_vmware.c
new file mode 100644
index 000000000000..b8ede454751f
--- /dev/null
+++ b/arch/x86/kvm/kvm_vmware.c
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (c) 2025 Broadcom. All Rights Reserved. The term
+ * “Broadcom” refers to Broadcom Inc. and/or its subsidiaries.
+ */
+
+ #include "kvm_vmware.h"
+
+bool __read_mostly enable_vmware_backdoor;
+EXPORT_SYMBOL_GPL(enable_vmware_backdoor);
+module_param(enable_vmware_backdoor, bool, 0444);
+
+void kvm_vmware_init_vm(struct kvm *kvm)
+{
+ kvm->arch.vmware.backdoor_enabled = enable_vmware_backdoor;
+}
diff --git a/arch/x86/kvm/kvm_vmware.h b/arch/x86/kvm/kvm_vmware.h
index 8f091687dcd9..de55c9ee7c0f 100644
--- a/arch/x86/kvm/kvm_vmware.h
+++ b/arch/x86/kvm/kvm_vmware.h
@@ -15,11 +15,9 @@
#define VMWARE_BACKDOOR_PMC_REAL_TIME 0x10001
#define VMWARE_BACKDOOR_PMC_APPARENT_TIME 0x10002
-extern bool enable_vmware_backdoor;
-
static inline bool kvm_vmware_backdoor_enabled(struct kvm_vcpu *vcpu)
{
- return enable_vmware_backdoor;
+ return vcpu->kvm->arch.vmware.backdoor_enabled;
}
static inline bool kvm_vmware_is_backdoor_pmc(u32 pmc_idx)
@@ -95,6 +93,8 @@ static inline bool kvm_vmware_is_backdoor_opcode(u8 opcode_len, u8 b)
return false;
}
+void kvm_vmware_init_vm(struct kvm *kvm);
+
#else /* !CONFIG_KVM_VMWARE */
static inline bool kvm_vmware_backdoor_enabled(struct kvm_vcpu *vcpu)
@@ -122,6 +122,10 @@ static inline bool kvm_vmware_is_backdoor_opcode(u8 opcode_len, u8 len)
return false;
}
+static inline void kvm_vmware_init_vm(struct kvm *kvm)
+{
+}
+
#endif /* CONFIG_KVM_VMWARE */
#endif /* __ARCH_X86_KVM_VMWARE_H__ */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1b0c6925d339..a0b0830e5ece 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -169,10 +169,6 @@ module_param(tsc_tolerance_ppm, uint, 0644);
static bool __read_mostly vector_hashing = true;
module_param(vector_hashing, bool, 0444);
-bool __read_mostly enable_vmware_backdoor = false;
-module_param(enable_vmware_backdoor, bool, 0444);
-EXPORT_SYMBOL_GPL(enable_vmware_backdoor);
-
/*
* Flags to manipulate forced emulation behavior (any non-zero value will
* enable forced emulation).
@@ -4654,6 +4650,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_IRQFD_RESAMPLE:
case KVM_CAP_MEMORY_FAULT_INFO:
case KVM_CAP_X86_GUEST_MODE:
+#ifdef CONFIG_KVM_VMWARE
+ case KVM_CAP_X86_VMWARE_BACKDOOR:
+#endif
r = 1;
break;
case KVM_CAP_PRE_FAULT_MEMORY:
@@ -6735,6 +6734,19 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
mutex_unlock(&kvm->lock);
break;
}
+#ifdef CONFIG_KVM_VMWARE
+ case KVM_CAP_X86_VMWARE_BACKDOOR:
+ r = -EINVAL;
+ if (cap->args[0] & ~1)
+ break;
+ mutex_lock(&kvm->lock);
+ if (!kvm->created_vcpus) {
+ kvm->arch.vmware.backdoor_enabled = cap->args[0];
+ r = 0;
+ }
+ mutex_unlock(&kvm->lock);
+ break;
+#endif
default:
r = -EINVAL;
break;
@@ -12707,6 +12719,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
kvm_apicv_init(kvm);
kvm_hv_init_vm(kvm);
+ kvm_vmware_init_vm(kvm);
kvm_xen_init_vm(kvm);
if (ignore_msrs && !report_ignored_msrs) {
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 45e6d8fca9b9..793d0cf7ae3c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -929,6 +929,7 @@ struct kvm_enable_cap {
#define KVM_CAP_PRE_FAULT_MEMORY 236
#define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
#define KVM_CAP_X86_GUEST_MODE 238
+#define KVM_CAP_X86_VMWARE_BACKDOOR 239
struct kvm_irq_routing_irqchip {
__u32 irqchip;
--
2.48.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 3/5] KVM: x86: Add support for VMware guest specific hypercalls
2025-04-22 16:12 [PATCH v2 0/5] KVM: Improve VMware guest support Zack Rusin
2025-04-22 16:12 ` [PATCH v2 2/5] KVM: x86: Allow enabling of the vmware backdoor via a cap Zack Rusin
@ 2025-04-22 16:12 ` Zack Rusin
2025-07-23 18:14 ` Sean Christopherson
2025-04-22 16:12 ` [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups Zack Rusin
2 siblings, 1 reply; 17+ messages in thread
From: Zack Rusin @ 2025-04-22 16:12 UTC (permalink / raw)
To: linux-kernel
Cc: Zack Rusin, Doug Covelli, Paolo Bonzini, Jonathan Corbet,
Sean Christopherson, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
VMware products handle hypercalls in userspace. Give KVM the ability
to run VMware guests unmodified by forwarding all hypercalls to
userspace.
Enabling the KVM_CAP_X86_VMWARE_HYPERCALL capability turns the
feature on - it's off by default. This allows VMX processes (VMware's
userspace VM monitors) built on top of KVM to support VMware-specific
hypercalls.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: Doug Covelli <doug.covelli@broadcom.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Zack Rusin <zack.rusin@broadcom.com>
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Documentation/virt/kvm/api.rst | 57 +++++++++++++++++++++++++--
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/Kconfig | 6 ++-
arch/x86/kvm/kvm_vmware.c | 69 +++++++++++++++++++++++++++++++++
arch/x86/kvm/kvm_vmware.h | 16 ++++++++
arch/x86/kvm/x86.c | 11 ++++++
include/uapi/linux/kvm.h | 25 ++++++++++++
7 files changed, 179 insertions(+), 6 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 24bc80764fdc..6d3d2a509848 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6624,10 +6624,11 @@ to the byte array.
.. note::
For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR, KVM_EXIT_XEN,
- KVM_EXIT_EPR, KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR the corresponding
- operations are complete (and guest state is consistent) only after userspace
- has re-entered the kernel with KVM_RUN. The kernel side will first finish
- incomplete operations and then check for pending signals.
+ KVM_EXIT_EPR, KVM_EXIT_VMWARE, KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR
+ the corresponding operations are complete (and guest state is consistent)
+ only after userspace has re-entered the kernel with KVM_RUN. The kernel
+ side will first finish incomplete operations and then check for pending
+ signals.
The pending state of the operation is not preserved in state which is
visible to userspace, thus userspace should ensure that the operation is
@@ -8273,6 +8274,54 @@ default value for it is set via the kvm.enable_vmware_backdoor
kernel parameter (false when not set). Must be set before any
VCPUs have been created.
+7.38 KVM_CAP_X86_VMWARE_HYPERCALL
+---------------------------------
+
+:Architectures: x86
+:Parameters: args[0] whether the feature should be enabled or not
+:Returns: 0 on success.
+
+This capability allows userspace to handle hypercalls. When enabled,
+whenever the vcpu executes a VMCALL (Intel) or VMMCALL (AMD)
+instruction, KVM will exit to userspace with KVM_EXIT_VMWARE.
+
+On exit the vmware structure of the kvm_run structure will
+look as follows:
+
+::
+
+ struct kvm_vmware_exit {
+ #define KVM_EXIT_VMWARE_HCALL 1
+ __u32 type;
+ __u32 pad1;
+ union {
+ struct {
__u32 longmode; /* true if in long/64-bit mode */
+ __u32 cpl;
+ __u64 rax, rbx, rcx, rdx, rsi, rdi, rbp;
+ __u64 result; /* will be written to eax on return */
+ struct {
+ __u32 inject;
+ __u32 pad2;
+ __u32 vector;
+ __u32 error_code;
+ __u64 address;
+ } exception;
+ } hcall;
+ };
+ };
+
+The exception structure of the kvm_vmware_exit.hcall member allows
+the monitor to inject an exception into the guest. On return, if
+exception.inject is set to true, the remaining members of the
+exception structure will be used to create and queue an exception
+for the guest.
+
+Except when running in compatibility mode with VMware hypervisors,
+userspace handling of hypercalls is discouraged. To implement
+such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO
+(all except s390).
+
8. Other capabilities.
======================
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5670d7d02d1b..86bacda2802e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1196,6 +1196,7 @@ struct kvm_xen {
/* VMware emulation context */
struct kvm_vmware {
bool backdoor_enabled;
+ bool hypercall_enabled;
};
#endif
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 9e3be87fc82b..f817601924bd 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -183,11 +183,13 @@ config KVM_VMWARE
depends on KVM
default y
help
- Provides KVM support for hosting VMware guests. Adds support for
- VMware legacy backdoor interface: VMware tools and various userspace
+ Provides KVM support for hosting VMware guests. KVM features that can
+ be turned on when this option is enabled include:
+ - VMware legacy backdoor interface: VMware tools and various userspace
utilities running in VMware guests sometimes utilize specially
formatted IN, OUT and RDPMC instructions which need to be
intercepted.
+ - VMware hypercall interface: VMware hypercalls exit to userspace
If unsure, say "Y".
diff --git a/arch/x86/kvm/kvm_vmware.c b/arch/x86/kvm/kvm_vmware.c
index b8ede454751f..096adb92ac60 100644
--- a/arch/x86/kvm/kvm_vmware.c
+++ b/arch/x86/kvm/kvm_vmware.c
@@ -6,6 +6,8 @@
#include "kvm_vmware.h"
+ #include "x86.h"
+
bool __read_mostly enable_vmware_backdoor;
EXPORT_SYMBOL_GPL(enable_vmware_backdoor);
module_param(enable_vmware_backdoor, bool, 0444);
@@ -14,3 +16,70 @@ void kvm_vmware_init_vm(struct kvm *kvm)
{
kvm->arch.vmware.backdoor_enabled = enable_vmware_backdoor;
}
+
+static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
+{
+ u64 ret = vcpu->run->vmware.hcall.result;
+
+ if (!is_64_bit_hypercall(vcpu))
+ ret = (u32)ret;
+ kvm_rax_write(vcpu, ret);
+
+ if (vcpu->run->vmware.hcall.exception.inject) {
+ u32 vector = vcpu->run->vmware.hcall.exception.vector;
+ u32 error_code = vcpu->run->vmware.hcall.exception.error_code;
+ u64 address = vcpu->run->vmware.hcall.exception.address;
+ bool has_error_code = x86_exception_has_error_code(vector);
+
+ if (vector == PF_VECTOR) {
+ struct x86_exception fault = {0};
+
+ fault.vector = PF_VECTOR;
+ fault.error_code_valid = true;
+ fault.error_code = error_code;
+ fault.address = address;
+
+ kvm_inject_page_fault(vcpu, &fault);
+ } else if (has_error_code) {
+ kvm_queue_exception_e(vcpu, vector, error_code);
+ } else {
+ kvm_queue_exception(vcpu, vector);
+ }
+
+ /*
+ * Don't skip the instruction to deliver the exception
+ * at the backdoor call
+ */
+ return 1;
+ }
+
+ return kvm_skip_emulated_instruction(vcpu);
+}
+
+int kvm_vmware_hypercall(struct kvm_vcpu *vcpu)
+{
+ struct kvm_run *run = vcpu->run;
+ bool is_64_bit = is_64_bit_hypercall(vcpu);
+ u64 mask = is_64_bit ? U64_MAX : U32_MAX;
+
+ run->exit_reason = KVM_EXIT_VMWARE;
+ run->vmware.type = KVM_EXIT_VMWARE_HCALL;
+ run->vmware.hcall.longmode = is_64_bit;
+ run->vmware.hcall.rax = kvm_rax_read(vcpu) & mask;
+ run->vmware.hcall.rbx = kvm_rbx_read(vcpu) & mask;
+ run->vmware.hcall.rcx = kvm_rcx_read(vcpu) & mask;
+ run->vmware.hcall.rdx = kvm_rdx_read(vcpu) & mask;
+ run->vmware.hcall.rsi = kvm_rsi_read(vcpu) & mask;
+ run->vmware.hcall.rdi = kvm_rdi_read(vcpu) & mask;
+ run->vmware.hcall.rbp = kvm_rbp_read(vcpu) & mask;
+ run->vmware.hcall.cpl = kvm_x86_call(get_cpl)(vcpu);
+ run->vmware.hcall.result = 0;
+ run->vmware.hcall.exception.inject = 0;
+ run->vmware.hcall.exception.vector = 0;
+ run->vmware.hcall.exception.error_code = 0;
+ run->vmware.hcall.exception.address = 0;
+
+ vcpu->arch.complete_userspace_io = complete_hypercall_exit;
+
+ return 0;
+}
diff --git a/arch/x86/kvm/kvm_vmware.h b/arch/x86/kvm/kvm_vmware.h
index de55c9ee7c0f..846b90091a2a 100644
--- a/arch/x86/kvm/kvm_vmware.h
+++ b/arch/x86/kvm/kvm_vmware.h
@@ -93,7 +93,13 @@ static inline bool kvm_vmware_is_backdoor_opcode(u8 opcode_len, u8 b)
return false;
}
+static inline bool kvm_vmware_hypercall_enabled(struct kvm *kvm)
+{
+ return kvm->arch.vmware.hypercall_enabled;
+}
+
void kvm_vmware_init_vm(struct kvm *kvm);
+int kvm_vmware_hypercall(struct kvm_vcpu *vcpu);
#else /* !CONFIG_KVM_VMWARE */
@@ -126,6 +132,16 @@ static inline void kvm_vmware_init_vm(struct kvm *kvm)
{
}
+static inline bool kvm_vmware_hypercall_enabled(struct kvm *kvm)
+{
+ return false;
+}
+
+static inline int kvm_vmware_hypercall(struct kvm_vcpu *vcpu)
+{
+ return 0;
+}
+
#endif /* CONFIG_KVM_VMWARE */
#endif /* __ARCH_X86_KVM_VMWARE_H__ */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a0b0830e5ece..300cef9a37e2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4652,6 +4652,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_X86_GUEST_MODE:
#ifdef CONFIG_KVM_VMWARE
case KVM_CAP_X86_VMWARE_BACKDOOR:
+ case KVM_CAP_X86_VMWARE_HYPERCALL:
#endif
r = 1;
break;
@@ -6746,6 +6747,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
}
mutex_unlock(&kvm->lock);
break;
+ case KVM_CAP_X86_VMWARE_HYPERCALL:
+ r = -EINVAL;
+ if (cap->args[0] & ~1)
+ break;
+ kvm->arch.vmware.hypercall_enabled = cap->args[0];
+ r = 0;
+ break;
#endif
default:
r = -EINVAL;
@@ -10085,6 +10093,9 @@ EXPORT_SYMBOL_GPL(____kvm_emulate_hypercall);
int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
{
+ if (kvm_vmware_hypercall_enabled(vcpu->kvm))
+ return kvm_vmware_hypercall(vcpu);
+
if (kvm_xen_hypercall_enabled(vcpu->kvm))
return kvm_xen_hypercall(vcpu);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 793d0cf7ae3c..adf1a1449c06 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -135,6 +135,27 @@ struct kvm_xen_exit {
} u;
};
+struct kvm_vmware_exit {
+#define KVM_EXIT_VMWARE_HCALL 1
+ __u32 type;
+ __u32 pad1;
+ union {
+ struct {
+ __u32 longmode;
+ __u32 cpl;
+ __u64 rax, rbx, rcx, rdx, rsi, rdi, rbp;
+ __u64 result;
+ struct {
+ __u32 inject;
+ __u32 pad2;
+ __u32 vector;
+ __u32 error_code;
+ __u64 address;
+ } exception;
+ } hcall;
+ };
+};
+
#define KVM_S390_GET_SKEYS_NONE 1
#define KVM_S390_SKEYS_MAX 1048576
@@ -178,6 +199,7 @@ struct kvm_xen_exit {
#define KVM_EXIT_NOTIFY 37
#define KVM_EXIT_LOONGARCH_IOCSR 38
#define KVM_EXIT_MEMORY_FAULT 39
+#define KVM_EXIT_VMWARE 40
/* For KVM_EXIT_INTERNAL_ERROR */
/* Emulate instruction failed. */
@@ -420,6 +442,8 @@ struct kvm_run {
} msr;
/* KVM_EXIT_XEN */
struct kvm_xen_exit xen;
+ /* KVM_EXIT_VMWARE */
+ struct kvm_vmware_exit vmware;
/* KVM_EXIT_RISCV_SBI */
struct {
unsigned long extension_id;
@@ -930,6 +954,7 @@ struct kvm_enable_cap {
#define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
#define KVM_CAP_X86_GUEST_MODE 238
#define KVM_CAP_X86_VMWARE_BACKDOOR 239
+#define KVM_CAP_X86_VMWARE_HYPERCALL 240
struct kvm_irq_routing_irqchip {
__u32 irqchip;
--
2.48.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-22 16:12 [PATCH v2 0/5] KVM: Improve VMware guest support Zack Rusin
2025-04-22 16:12 ` [PATCH v2 2/5] KVM: x86: Allow enabling of the vmware backdoor via a cap Zack Rusin
2025-04-22 16:12 ` [PATCH v2 3/5] KVM: x86: Add support for VMware guest specific hypercalls Zack Rusin
@ 2025-04-22 16:12 ` Zack Rusin
2025-04-23 7:56 ` Xin Li
2025-07-23 18:19 ` Sean Christopherson
2 siblings, 2 replies; 17+ messages in thread
From: Zack Rusin @ 2025-04-22 16:12 UTC (permalink / raw)
To: linux-kernel
Cc: Zack Rusin, Doug Covelli, Paolo Bonzini, Jonathan Corbet,
Sean Christopherson, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
Allow handling of VMware backdoors by the L0 monitor. This is required
on setups running Windows VBS, where L1 will be running Hyper-V, which
can't handle VMware backdoors. Thus on Windows VBS, legacy VMware
backdoor calls issued by userspace end up in Hyper-V (L1) and throw an
error.
Add a KVM cap that, in nested setups, allows the legacy VMware backdoor
to be handled by the L0 monitor. Thanks to this we can make sure the
VMware backdoor is always handled by the correct monitor.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: Doug Covelli <doug.covelli@broadcom.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Zack Rusin <zack.rusin@broadcom.com>
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Documentation/virt/kvm/api.rst | 14 +++++++++++
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/Kconfig | 1 +
arch/x86/kvm/kvm_vmware.h | 42 +++++++++++++++++++++++++++++++++
arch/x86/kvm/svm/nested.c | 6 +++++
arch/x86/kvm/svm/svm.c | 3 ++-
arch/x86/kvm/vmx/nested.c | 6 +++++
arch/x86/kvm/x86.c | 8 +++++++
include/uapi/linux/kvm.h | 1 +
9 files changed, 81 insertions(+), 1 deletion(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 6d3d2a509848..55bd464ebf95 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8322,6 +8322,20 @@ userspace handling of hypercalls is discouraged. To implement
such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO
(all except s390).
+7.39 KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0
+------------------------------------------
+
+:Architectures: x86
+:Parameters: args[0] whether the feature should be enabled or not
+:Returns: 0 on success.
+
+This capability allows VMware backdoors to be handled by L0 in nested
+configurations. This is required when, for example, running a Windows
+guest with Hyper-V VBS enabled - in that configuration the VMware
+backdoor calls issued by VMware Tools would end up in Hyper-V (L1),
+which doesn't handle the VMware backdoor. Enable this option to have
+the VMware backdoor sent to the L0 monitor.
+
8. Other capabilities.
======================
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 86bacda2802e..2a806aa93a9e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1197,6 +1197,7 @@ struct kvm_xen {
struct kvm_vmware {
bool backdoor_enabled;
bool hypercall_enabled;
+ bool nested_backdoor_l0_enabled;
};
#endif
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index f817601924bd..8fefde6f2e78 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -190,6 +190,7 @@ config KVM_VMWARE
formatted IN, OUT and RDPMC instructions which need to be
intercepted.
- VMware hypercall interface: VMware hypercalls exit to userspace
+ - VMware legacy backdoor handling in L0
If unsure, say "Y".
diff --git a/arch/x86/kvm/kvm_vmware.h b/arch/x86/kvm/kvm_vmware.h
index 846b90091a2a..d90bcf73bae4 100644
--- a/arch/x86/kvm/kvm_vmware.h
+++ b/arch/x86/kvm/kvm_vmware.h
@@ -9,6 +9,9 @@
#include <linux/kvm_host.h>
+#include "asm/vmware.h"
+#include "x86.h"
+
#ifdef CONFIG_KVM_VMWARE
#define VMWARE_BACKDOOR_PMC_HOST_TSC 0x10000
@@ -98,6 +101,35 @@ static inline bool kvm_vmware_hypercall_enabled(struct kvm *kvm)
return kvm->arch.vmware.hypercall_enabled;
}
+static inline bool kvm_vmware_nested_backdoor_l0_enabled(struct kvm *kvm)
+{
+ return kvm->arch.vmware.backdoor_enabled &&
+ kvm->arch.vmware.nested_backdoor_l0_enabled;
+}
+
+static inline bool kvm_vmware_wants_backdoor_to_l0(struct kvm_vcpu *vcpu, u32 cpl)
+{
+ /* We only care about the lower 32 bits */
+ const unsigned long mask = 0xffffffff;
+ const unsigned long port_mask = 0xffff;
+ unsigned long rax, rdx;
+
+ if (!kvm_vmware_nested_backdoor_l0_enabled(vcpu->kvm))
+ return false;
+
+ if (cpl != 3)
+ return false;
+
+ rax = kvm_rax_read(vcpu) & mask;
+ if (rax == VMWARE_HYPERVISOR_MAGIC) {
+ rdx = kvm_rdx_read(vcpu) & port_mask;
+ return (rdx == VMWARE_HYPERVISOR_PORT ||
+ rdx == VMWARE_HYPERVISOR_PORT_HB);
+ }
+
+ return false;
+}
+
void kvm_vmware_init_vm(struct kvm *kvm);
int kvm_vmware_hypercall(struct kvm_vcpu *vcpu);
@@ -142,6 +174,16 @@ static inline int kvm_vmware_hypercall(struct kvm_vcpu *vcpu)
return 0;
}
+static inline bool kvm_vmware_nested_backdoor_l0_enabled(struct kvm *kvm)
+{
+ return false;
+}
+
+static inline bool kvm_vmware_wants_backdoor_to_l0(struct kvm_vcpu *vcpu, u32 cpl)
+{
+ return false;
+}
+
#endif /* CONFIG_KVM_VMWARE */
#endif /* __ARCH_X86_KVM_VMWARE_H__ */
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 04c375bf1ac2..74c472e51479 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -22,6 +22,7 @@
#include <asm/debugreg.h>
#include "kvm_emulate.h"
+#include "kvm_vmware.h"
#include "trace.h"
#include "mmu.h"
#include "x86.h"
@@ -1517,6 +1518,11 @@ int nested_svm_exit_special(struct vcpu_svm *svm)
svm->vcpu.arch.apf.host_apf_flags)
/* Trap async PF even if not shadowing */
return NESTED_EXIT_HOST;
+#ifdef CONFIG_KVM_VMWARE
+ else if ((exit_code == (SVM_EXIT_EXCP_BASE + GP_VECTOR)) &&
+ kvm_vmware_wants_backdoor_to_l0(vcpu, to_svm(vcpu)->vmcb->save.cpl))
+ return NESTED_EXIT_HOST;
+#endif
break;
}
case SVM_EXIT_VMMCALL:
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index be106bd60553..96996e7f9de4 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2407,7 +2407,8 @@ static int gp_interception(struct kvm_vcpu *vcpu)
* VMware backdoor emulation on #GP interception only handles
* IN{S}, OUT{S}, and RDPMC.
*/
- if (!is_guest_mode(vcpu))
+ if (!is_guest_mode(vcpu) ||
+ kvm_vmware_wants_backdoor_to_l0(vcpu, svm_get_cpl(vcpu)))
return kvm_emulate_instruction(vcpu,
EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
} else {
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ed8a3cb53961..ff8a1dbbba01 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -10,6 +10,7 @@
#include "x86.h"
#include "cpuid.h"
#include "hyperv.h"
+#include "kvm_vmware.h"
#include "mmu.h"
#include "nested.h"
#include "pmu.h"
@@ -6386,6 +6387,11 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu,
return true;
else if (is_ve_fault(intr_info))
return true;
+#ifdef CONFIG_KVM_VMWARE
+ else if (is_gp_fault(intr_info) &&
+ kvm_vmware_wants_backdoor_to_l0(vcpu, vmx_get_cpl(vcpu)))
+ return true;
+#endif
return false;
case EXIT_REASON_EXTERNAL_INTERRUPT:
return true;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 300cef9a37e2..5dc57bc57851 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4653,6 +4653,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
#ifdef CONFIG_KVM_VMWARE
case KVM_CAP_X86_VMWARE_BACKDOOR:
case KVM_CAP_X86_VMWARE_HYPERCALL:
+ case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
#endif
r = 1;
break;
@@ -6754,6 +6755,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
kvm->arch.vmware.hypercall_enabled = cap->args[0];
r = 0;
break;
+ case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
+ r = -EINVAL;
+ if (cap->args[0] & ~1)
+ break;
+ kvm->arch.vmware.nested_backdoor_l0_enabled = cap->args[0];
+ r = 0;
+ break;
#endif
default:
r = -EINVAL;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index adf1a1449c06..f5d63c0c79f5 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -955,6 +955,7 @@ struct kvm_enable_cap {
#define KVM_CAP_X86_GUEST_MODE 238
#define KVM_CAP_X86_VMWARE_BACKDOOR 239
#define KVM_CAP_X86_VMWARE_HYPERCALL 240
+#define KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0 241
struct kvm_irq_routing_irqchip {
__u32 irqchip;
--
2.48.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-22 16:12 ` [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups Zack Rusin
@ 2025-04-23 7:56 ` Xin Li
2025-04-23 11:43 ` Zack Rusin
2025-07-23 18:19 ` Sean Christopherson
1 sibling, 1 reply; 17+ messages in thread
From: Xin Li @ 2025-04-23 7:56 UTC (permalink / raw)
To: Zack Rusin, linux-kernel
Cc: Doug Covelli, Paolo Bonzini, Jonathan Corbet, Sean Christopherson,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, kvm, linux-doc
On 4/22/2025 9:12 AM, Zack Rusin wrote:
> Allow handling VMware backdoors by the L0 monitor. This is required on
> setups running Windows VBS, where the L1 will be running Hyper-V which
> can't handle VMware backdoors. Thus on Windows VBS legacy VMware backdoor
> calls issued by userspace will end up in Hyper-V (L1) and end up
> throwing an error.
> Add a KVM cap that, in nested setups, allows the legacy VMware backdoor
> to be handled by the L0 monitor. Thanks to this we can make sure that
> VMware backdoor is always handled by the correct monitor.
>
> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
> Cc: Doug Covelli <doug.covelli@broadcom.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: x86@kernel.org
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Zack Rusin <zack.rusin@broadcom.com>
> Cc: kvm@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
> Documentation/virt/kvm/api.rst | 14 +++++++++++
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/kvm/Kconfig | 1 +
> arch/x86/kvm/kvm_vmware.h | 42 +++++++++++++++++++++++++++++++++
> arch/x86/kvm/svm/nested.c | 6 +++++
> arch/x86/kvm/svm/svm.c | 3 ++-
> arch/x86/kvm/vmx/nested.c | 6 +++++
> arch/x86/kvm/x86.c | 8 +++++++
> include/uapi/linux/kvm.h | 1 +
> 9 files changed, 81 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 6d3d2a509848..55bd464ebf95 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -8322,6 +8322,20 @@ userspace handling of hypercalls is discouraged. To implement
> such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO
> (all except s390).
>
> +7.39 KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0
> +------------------------------------------
> +
> +:Architectures: x86
> +:Parameters: args[0] whether the feature should be enabled or not
> +:Returns: 0 on success.
> +
> +Capability allows VMware backdoors to be handled by L0 when running
> +in nested configurations. This is required when, for example, running
> +a Windows guest with Hyper-V VBS enabled - in that configuration the
> +VMware backdoor calls issued by VMware Tools would end up in Hyper-V
> +(L1), which doesn't handle the VMware backdoor. Enable this option to
> +have the VMware backdoor sent to the L0 monitor.
> +
> 8. Other capabilities.
> ======================
>
You're not basing the patch set on v6.15-rcX?
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 300cef9a37e2..5dc57bc57851 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4653,6 +4653,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> #ifdef CONFIG_KVM_VMWARE
> case KVM_CAP_X86_VMWARE_BACKDOOR:
> case KVM_CAP_X86_VMWARE_HYPERCALL:
> + case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
> #endif
> r = 1;
> break;
> @@ -6754,6 +6755,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> kvm->arch.vmware.hypercall_enabled = cap->args[0];
> r = 0;
> break;
> + case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
> + r = -EINVAL;
> + if (cap->args[0] & ~1)
Replace ~1 with a macro for better readability please.
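For illustration only - a minimal, compilable sketch of the kind of define being requested here; the macro and helper names are hypothetical, not part of the series:

```c
#include <assert.h>

/* Hypothetical define: only bit 0 of cap->args[0] has a meaning today;
 * all other bits are reserved and must be zero. */
#define KVM_X86_VMWARE_NESTED_BACKDOOR_VALID_MASK 0x1ULL

/* Equivalent to the "cap->args[0] & ~1" check, spelled with a name. */
static int vmware_cap_arg_is_valid(unsigned long long arg0)
{
	return (arg0 & ~KVM_X86_VMWARE_NESTED_BACKDOOR_VALID_MASK) == 0;
}
```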
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-23 7:56 ` Xin Li
@ 2025-04-23 11:43 ` Zack Rusin
2025-04-23 13:31 ` Sean Christopherson
0 siblings, 1 reply; 17+ messages in thread
From: Zack Rusin @ 2025-04-23 11:43 UTC (permalink / raw)
To: Xin Li
Cc: linux-kernel, Doug Covelli, Paolo Bonzini, Jonathan Corbet,
Sean Christopherson, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
On Wed, Apr 23, 2025 at 3:56 AM Xin Li <xin@zytor.com> wrote:
>
> On 4/22/2025 9:12 AM, Zack Rusin wrote:
> > Allow handling VMware backdoors by the L0 monitor. This is required on
> > setups running Windows VBS, where the L1 will be running Hyper-V which
> > can't handle VMware backdoors. Thus on Windows VBS legacy VMware backdoor
> > calls issued by userspace will end up in Hyper-V (L1) and end up
> > throwing an error.
> > Add a KVM cap that, in nested setups, allows the legacy VMware backdoor
> > to be handled by the L0 monitor. Thanks to this we can make sure that
> > VMware backdoor is always handled by the correct monitor.
> >
> > Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
> > Cc: Doug Covelli <doug.covelli@broadcom.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Cc: Sean Christopherson <seanjc@google.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Borislav Petkov <bp@alien8.de>
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: x86@kernel.org
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: Zack Rusin <zack.rusin@broadcom.com>
> > Cc: kvm@vger.kernel.org
> > Cc: linux-doc@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > ---
> > Documentation/virt/kvm/api.rst | 14 +++++++++++
> > arch/x86/include/asm/kvm_host.h | 1 +
> > arch/x86/kvm/Kconfig | 1 +
> > arch/x86/kvm/kvm_vmware.h | 42 +++++++++++++++++++++++++++++++++
> > arch/x86/kvm/svm/nested.c | 6 +++++
> > arch/x86/kvm/svm/svm.c | 3 ++-
> > arch/x86/kvm/vmx/nested.c | 6 +++++
> > arch/x86/kvm/x86.c | 8 +++++++
> > include/uapi/linux/kvm.h | 1 +
> > 9 files changed, 81 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 6d3d2a509848..55bd464ebf95 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -8322,6 +8322,20 @@ userspace handling of hypercalls is discouraged. To implement
> > such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO
> > (all except s390).
> >
> > +7.39 KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0
> > +------------------------------------------
> > +
> > +:Architectures: x86
> > +:Parameters: args[0] whether the feature should be enabled or not
> > +:Returns: 0 on success.
> > +
> > +Capability allows VMware backdoors to be handled by L0 when running
> > +in nested configurations. This is required when, for example, running
> > +a Windows guest with Hyper-V VBS enabled - in that configuration the
> > +VMware backdoor calls issued by VMware Tools would end up in Hyper-V
> > +(L1), which doesn't handle the VMware backdoor. Enable this option to
> > +have the VMware backdoor sent to the L0 monitor.
> > +
> > 8. Other capabilities.
> > ======================
> >
>
> You're not basing the patch set on v6.15-rcX?
It was rebased on top of v6.14.
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 300cef9a37e2..5dc57bc57851 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -4653,6 +4653,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> > #ifdef CONFIG_KVM_VMWARE
> > case KVM_CAP_X86_VMWARE_BACKDOOR:
> > case KVM_CAP_X86_VMWARE_HYPERCALL:
> > + case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
> > #endif
> > r = 1;
> > break;
> > @@ -6754,6 +6755,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> > kvm->arch.vmware.hypercall_enabled = cap->args[0];
> > r = 0;
> > break;
> > + case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
> > + r = -EINVAL;
> > + if (cap->args[0] & ~1)
>
> Replace ~1 with a macro for better readability please.
Are you sure about that? This pattern is already used elsewhere in the
file (for KVM_CAP_EXIT_ON_EMULATION_FAILURE) so, ignoring the fact
that it's arguable whether IS_ZERO_OR_ONE is more readable than & ~1,
if we use a macro for the vmware caps and not for
KVM_CAP_EXIT_ON_EMULATION_FAILURE then the code would be inconsistent,
and that decreases readability.
Or are you saying that since I'm already there you'd like to see a
completely separate patch that defines some kind of IS_ZERO_OR_ONE
macro for KVM, use it for KVM_CAP_EXIT_ON_EMULATION_FAILURE and, once
that lands then I can make use of it in this series?
z
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-23 11:43 ` Zack Rusin
@ 2025-04-23 13:31 ` Sean Christopherson
2025-04-23 15:36 ` Zack Rusin
0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2025-04-23 13:31 UTC (permalink / raw)
To: Zack Rusin
Cc: Xin Li, linux-kernel, Doug Covelli, Paolo Bonzini,
Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
On Wed, Apr 23, 2025, Zack Rusin wrote:
> On Wed, Apr 23, 2025 at 3:56 AM Xin Li <xin@zytor.com> wrote:
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 300cef9a37e2..5dc57bc57851 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -4653,6 +4653,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> > > #ifdef CONFIG_KVM_VMWARE
> > > case KVM_CAP_X86_VMWARE_BACKDOOR:
> > > case KVM_CAP_X86_VMWARE_HYPERCALL:
> > > + case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
I would probably omit the L0, because KVM could be running as L1.
> > > #endif
> > > r = 1;
> > > break;
> > > @@ -6754,6 +6755,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> > > kvm->arch.vmware.hypercall_enabled = cap->args[0];
> > > r = 0;
> > > break;
> > > + case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
> > > + r = -EINVAL;
> > > + if (cap->args[0] & ~1)
> >
> > Replace ~1 with a macro for better readability please.
>
> Are you sure about that? This code is already used elsewhere in the
> file (for KVM_CAP_EXIT_ON_EMULATION_FAILURE) so, ignoring the fact
> that it's arguable whether IS_ZERO_OR_ONE is more readable than & ~1,
> if we use a macro for the vmware caps and not for
> KVM_CAP_EXIT_ON_EMULATION_FAILURE then the code would be inconsistent
> and that decreases the readability.
Heh, KVM_CAP_EXIT_ON_EMULATION_FAILURE is the odd one out. Even if that weren't
the case, this is one of the situations where diverging from the existing code is
desirable, because the existing code is garbage.
arch/x86/kvm/x86.c: if (cap->args[0] & ~kvm_caps.supported_quirks)
arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_X2APIC_API_VALID_FLAGS)
arch/x86/kvm/x86.c: if (cap->args[0] & ~kvm_get_allowed_disable_exits())
arch/x86/kvm/x86.c: (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_MSR_EXIT_REASON_VALID_MASK)
arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_BUS_LOCK_DETECTION_VALID_MODE)
arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_EXIT_HYPERCALL_VALID_MASK) {
arch/x86/kvm/x86.c: if (cap->args[0] & ~1)
arch/x86/kvm/x86.c: if (!enable_pmu || (cap->args[0] & ~KVM_CAP_PMU_VALID_MASK))
arch/x86/kvm/x86.c: if ((u32)cap->args[0] & ~KVM_X86_NOTIFY_VMEXIT_VALID_BITS)
virt/kvm/kvm_main.c: if (cap->flags || (cap->args[0] & ~allowed_options))
> Or are you saying that since I'm already there you'd like to see a
> completely separate patch that defines some kind of IS_ZERO_OR_ONE
> macro for KVM, use it for KVM_CAP_EXIT_ON_EMULATION_FAILURE and, once
> that lands then I can make use of it in this series?
Xin is suggesting that you add a macro in arch/x86/include/uapi/asm/kvm.h to
#define which bits are valid and which bits are reserved.
At a glance, you can kill multiple birds with one stone. Rather than add three
separate capabilities, add one capability and then a variety of flags. E.g.
#define KVM_X86_VMWARE_HYPERCALL _BITUL(0)
#define KVM_X86_VMWARE_BACKDOOR _BITUL(1)
#define KVM_X86_VMWARE_NESTED_BACKDOOR _BITUL(2)
#define KVM_X86_VMWARE_VALID_FLAGS (KVM_X86_VMWARE_HYPERCALL |
KVM_X86_VMWARE_BACKDOOR |
KVM_X86_VMWARE_NESTED_BACKDOOR)
case KVM_CAP_X86_VMWARE_EMULATION:
r = -EINVAL;
if (cap->args[0] & ~KVM_X86_VMWARE_VALID_FLAGS)
break;
mutex_lock(&kvm->lock);
if (!kvm->created_vcpus) {
if (cap->args[0] & KVM_X86_VMWARE_HYPERCALL)
kvm->arch.vmware.hypercall_enabled = true;
if (cap->args[0] & KVM_X86_VMWARE_BACKDOOR)
kvm->arch.vmware.backdoor_enabled = true;
if (cap->args[0] & KVM_X86_VMWARE_NESTED_BACKDOOR)
kvm->arch.vmware.nested_backdoor_enabled = true;
r = 0;
}
mutex_unlock(&kvm->lock);
break;
That approach wouldn't let userspace disable previously enabled VMware capabilities,
but unless there's a use case for doing so, that should be a non-issue.
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-23 13:31 ` Sean Christopherson
@ 2025-04-23 15:36 ` Zack Rusin
2025-04-23 15:54 ` Sean Christopherson
0 siblings, 1 reply; 17+ messages in thread
From: Zack Rusin @ 2025-04-23 15:36 UTC (permalink / raw)
To: Sean Christopherson
Cc: Xin Li, linux-kernel, Doug Covelli, Paolo Bonzini,
Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
On Wed, Apr 23, 2025 at 9:31 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Apr 23, 2025, Zack Rusin wrote:
> > On Wed, Apr 23, 2025 at 3:56 AM Xin Li <xin@zytor.com> wrote:
> > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > index 300cef9a37e2..5dc57bc57851 100644
> > > > --- a/arch/x86/kvm/x86.c
> > > > +++ b/arch/x86/kvm/x86.c
> > > > @@ -4653,6 +4653,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> > > > #ifdef CONFIG_KVM_VMWARE
> > > > case KVM_CAP_X86_VMWARE_BACKDOOR:
> > > > case KVM_CAP_X86_VMWARE_HYPERCALL:
> > > > + case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
>
> I would probably omit the L0, because KVM could be running as L1.
Yea, that sounds good to me.
> > > > #endif
> > > > r = 1;
> > > > break;
> > > > @@ -6754,6 +6755,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> > > > kvm->arch.vmware.hypercall_enabled = cap->args[0];
> > > > r = 0;
> > > > break;
> > > > + case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
> > > > + r = -EINVAL;
> > > > + if (cap->args[0] & ~1)
> > >
> > > Replace ~1 with a macro for better readability please.
> >
> > Are you sure about that? This code is already used elsewhere in the
> > file (for KVM_CAP_EXIT_ON_EMULATION_FAILURE) so, ignoring the fact
> > that it's arguable whether IS_ZERO_OR_ONE is more readable than & ~1,
> > if we use a macro for the vmware caps and not for
> > KVM_CAP_EXIT_ON_EMULATION_FAILURE then the code would be inconsistent
> > and that decreases the readability.
>
> Heh, KVM_CAP_EXIT_ON_EMULATION_FAILURE is the odd one out. Even if that weren't
> the case, this is one of the situations where diverging from the existing code is
> desirable, because the existing code is garbage.
>
> arch/x86/kvm/x86.c: if (cap->args[0] & ~kvm_caps.supported_quirks)
> arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_X2APIC_API_VALID_FLAGS)
> arch/x86/kvm/x86.c: if (cap->args[0] & ~kvm_get_allowed_disable_exits())
> arch/x86/kvm/x86.c: (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
> arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_MSR_EXIT_REASON_VALID_MASK)
> arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_BUS_LOCK_DETECTION_VALID_MODE)
> arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_EXIT_HYPERCALL_VALID_MASK) {
> arch/x86/kvm/x86.c: if (cap->args[0] & ~1)
> arch/x86/kvm/x86.c: if (!enable_pmu || (cap->args[0] & ~KVM_CAP_PMU_VALID_MASK))
> arch/x86/kvm/x86.c: if ((u32)cap->args[0] & ~KVM_X86_NOTIFY_VMEXIT_VALID_BITS)
> virt/kvm/kvm_main.c: if (cap->flags || (cap->args[0] & ~allowed_options))
That's because none of those other options are boolean, right? I
assumed that the options that have valid masks use defines but
booleans use ~1 because (val & ~1) makes it obvious to the reader that
the option is in fact a boolean in a way that (val &
~KVM_SOME_VALID_BITS) cannot.
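As a standalone illustration of the idiom under discussion (the helper name below is made up): (val & ~1) is zero exactly for the values 0 and 1, which is why it reads as a boolean check.

```c
#include <assert.h>

/* (val & ~1) == 0 holds exactly when val is 0 or 1, so the check
 * doubles as "this argument is a boolean". Any other set bit,
 * including reserved high bits, makes the expression nonzero. */
static int is_boolean_arg(unsigned long long val)
{
	return (val & ~1ULL) == 0;
}
```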
I don't want to be defending the code in there, especially to its
maintainers :) I'm very happy to change it in any way you feel is more
readable to you, but I don't think it's crap :)
> > Or are you saying that since I'm already there you'd like to see a
> > completely separate patch that defines some kind of IS_ZERO_OR_ONE
> > macro for KVM, use it for KVM_CAP_EXIT_ON_EMULATION_FAILURE and, once
> > that lands then I can make use of it in this series?
>
> Xin is suggesting that you add a macro in arch/x86/include/uapi/asm/kvm.h to
> #define which bits are valid and which bits are reserved.
>
> At a glance, you can kill multiple birds with one stone. Rather than add three
> separate capabilities, add one capability and then a variety of flags. E.g.
>
> #define KVM_X86_VMWARE_HYPERCALL _BITUL(0)
> #define KVM_X86_VMWARE_BACKDOOR _BITUL(1)
> #define KVM_X86_VMWARE_NESTED_BACKDOOR _BITUL(2)
> #define KVM_X86_VMWARE_VALID_FLAGS (KVM_X86_VMWARE_HYPERCALL |
> KVM_X86_VMWARE_BACKDOOR |
> KVM_X86_VMWARE_NESTED_BACKDOOR)
>
> case KVM_CAP_X86_VMWARE_EMULATION:
> r = -EINVAL;
> if (cap->args[0] & ~KVM_X86_VMWARE_VALID_FLAGS)
> break;
>
> mutex_lock(&kvm->lock);
> if (!kvm->created_vcpus) {
> if (cap->args[0] & KVM_X86_VMWARE_HYPERCALL)
> kvm->arch.vmware.hypercall_enabled = true;
> if (cap->args[0] & KVM_X86_VMWARE_BACKDOOR)
> kvm->arch.vmware.backdoor_enabled = true;
> if (cap->args[0] & KVM_X86_VMWARE_NESTED_BACKDOOR)
> kvm->arch.vmware.nested_backdoor_enabled = true;
> r = 0;
> }
> mutex_unlock(&kvm->lock);
> break;
>
> That approach wouldn't let userspace disable previously enabled VMware capabilities,
> but unless there's a use case for doing so, that should be a non-issue.
I'd say that if we desperately want to use a single cap for all of
these then I'd probably prefer a different approach, because this would
make vmware_backdoor_enabled behavior really wacky. It's the one that
currently can only be set via kernel boot flags, so on systems where
the boot flag is on, disabling it on a per-vm basis makes sense - and
that breaks with this approach. I'd probably still write the code to be
able to disable/enable all of them because it makes sense for
vmware_backdoor_enabled. Of course, we want it on all the time, so I
also don't necessarily want to be arguing about being able to disable
those options ;)
z
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-23 15:36 ` Zack Rusin
@ 2025-04-23 15:54 ` Sean Christopherson
2025-04-23 16:58 ` Zack Rusin
0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2025-04-23 15:54 UTC (permalink / raw)
To: Zack Rusin
Cc: Xin Li, linux-kernel, Doug Covelli, Paolo Bonzini,
Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
On Wed, Apr 23, 2025, Zack Rusin wrote:
> On Wed, Apr 23, 2025 at 9:31 AM Sean Christopherson <seanjc@google.com> wrote:
> > Heh, KVM_CAP_EXIT_ON_EMULATION_FAILURE is the odd one out. Even if that weren't
> > the case, this is one of the situations where diverging from the existing code is
> > desirable, because the existing code is garbage.
> >
> > arch/x86/kvm/x86.c: if (cap->args[0] & ~kvm_caps.supported_quirks)
> > arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_X2APIC_API_VALID_FLAGS)
> > arch/x86/kvm/x86.c: if (cap->args[0] & ~kvm_get_allowed_disable_exits())
> > arch/x86/kvm/x86.c: (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
> > arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_MSR_EXIT_REASON_VALID_MASK)
> > arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_BUS_LOCK_DETECTION_VALID_MODE)
> > arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_EXIT_HYPERCALL_VALID_MASK) {
> > arch/x86/kvm/x86.c: if (cap->args[0] & ~1)
> > arch/x86/kvm/x86.c: if (!enable_pmu || (cap->args[0] & ~KVM_CAP_PMU_VALID_MASK))
> > arch/x86/kvm/x86.c: if ((u32)cap->args[0] & ~KVM_X86_NOTIFY_VMEXIT_VALID_BITS)
> > virt/kvm/kvm_main.c: if (cap->flags || (cap->args[0] & ~allowed_options))
>
> That's because none of those other options are boolean, right? I
> assumed that the options that have valid masks use defines but
> booleans use ~1 because (val & ~1) makes it obvious to the reader that
> the option is in fact a boolean in a way that (val &
> ~KVM_SOME_VALID_BITS) can not.
The entire reason KVM checks and enforces cap->args[0] is so that KVM can
expand the capability's functionality in the future. Whether or not a capability
is *currently* a boolean, i.e. only has one supported flag, is completely irrelevant.
KVM has burned itself many times over by not performing these checks; that's how
we ended up with things like KVM_CAP_DISABLE_QUIRKS2.
> > > Or are you saying that since I'm already there you'd like to see a
> > > completely separate patch that defines some kind of IS_ZERO_OR_ONE
> > > macro for KVM, use it for KVM_CAP_EXIT_ON_EMULATION_FAILURE and, once
> > > that lands then I can make use of it in this series?
> >
> > Xin is suggesting that you add a macro in arch/x86/include/uapi/asm/kvm.h to
> > #define which bits are valid and which bits are reserved.
> >
> > At a glance, you can kill multiple birds with one stone. Rather than add three
> > separate capabilities, add one capability and then a variety of flags. E.g.
> >
> > #define KVM_X86_VMWARE_HYPERCALL _BITUL(0)
> > #define KVM_X86_VMWARE_BACKDOOR _BITUL(1)
> > #define KVM_X86_VMWARE_NESTED_BACKDOOR _BITUL(2)
> > #define KVM_X86_VMWARE_VALID_FLAGS (KVM_X86_VMWARE_HYPERCALL |
> > KVM_X86_VMWARE_BACKDOOR |
> > KVM_X86_VMWARE_NESTED_BACKDOOR)
> >
> > case KVM_CAP_X86_VMWARE_EMULATION:
> > r = -EINVAL;
> > if (cap->args[0] & ~KVM_X86_VMWARE_VALID_FLAGS)
> > break;
> >
> > mutex_lock(&kvm->lock);
> > if (!kvm->created_vcpus) {
> > if (cap->args[0] & KVM_X86_VMWARE_HYPERCALL)
> > kvm->arch.vmware.hypercall_enabled = true;
> > if (cap->args[0] & KVM_X86_VMWARE_BACKDOOR)
> > kvm->arch.vmware.backdoor_enabled;
> > if (cap->args[0] & KVM_X86_VMWARE_NESTED_BACKDOOR)
> > kvm->arch.vmware.nested_backdoor_enabled = true;
> > r = 0;
> > }
> > mutex_unlock(&kvm->lock);
> > break;
> >
> > That approach wouldn't let userspace disable previously enabled VMware capabilities,
> > but unless there's a use case for doing so, that should be a non-issue.
>
> I'd say that if we desperately want to use a single cap for all of
> these then I'd probably prefer a different approach because this would
> make vmware_backdoor_enabled behavior really wacky.
How so? If kvm.enable_vmware_backdoor is true, then the backdoor is enabled
for all VMs, else it's disabled by default but can be enabled on a per-VM basis
by the new capability.
> It's the one that currently can only be set via kernel boot flags, so having
> systems where the boot flag is on and disabling it on a per-vm basis makes
> sense and breaks with this.
We could go this route, e.g. KVM does something similar for PMU virtualization.
But the key difference is that enable_pmu is enabled by default, whereas
enable_vmware_backdoor is disabled by default. I.e. it makes far more sense for
the capability to let userspace opt-in, as opposed to opt-out.
> I'd probably still write the code to be able to disable/enable all of them
> because it makes sense for vmware_backdoor_enabled.
Again, that's not KVM's default, and it will never be KVM's default. Unless there
is a concrete use case for disabling features after *userspace* has opted-in,
just make them one-way 0=>1 transitions so as to keep KVM's implementation as
simple as possible.
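A self-contained distillation of the opt-in, one-way semantics described above. The flag names follow Sean's sketch, and the plain struct is a stand-in for kvm->arch.vmware; none of this is the actual series code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define KVM_X86_VMWARE_HYPERCALL       (1UL << 0)
#define KVM_X86_VMWARE_BACKDOOR        (1UL << 1)
#define KVM_X86_VMWARE_NESTED_BACKDOOR (1UL << 2)
#define KVM_X86_VMWARE_VALID_FLAGS (KVM_X86_VMWARE_HYPERCALL | \
				    KVM_X86_VMWARE_BACKDOOR | \
				    KVM_X86_VMWARE_NESTED_BACKDOOR)

struct vmware_state {
	bool hypercall_enabled;
	bool backdoor_enabled;
	bool nested_backdoor_enabled;
};

/* Flags only ever transition 0 => 1: bits already set stay set, and
 * unknown (reserved) bits are rejected so they can gain meaning later. */
static int vmware_enable_cap(struct vmware_state *s, unsigned long arg0)
{
	if (arg0 & ~KVM_X86_VMWARE_VALID_FLAGS)
		return -EINVAL;
	if (arg0 & KVM_X86_VMWARE_HYPERCALL)
		s->hypercall_enabled = true;
	if (arg0 & KVM_X86_VMWARE_BACKDOOR)
		s->backdoor_enabled = true;
	if (arg0 & KVM_X86_VMWARE_NESTED_BACKDOOR)
		s->nested_backdoor_enabled = true;
	return 0;
}
```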
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-23 15:54 ` Sean Christopherson
@ 2025-04-23 16:58 ` Zack Rusin
2025-04-23 17:16 ` Sean Christopherson
0 siblings, 1 reply; 17+ messages in thread
From: Zack Rusin @ 2025-04-23 16:58 UTC (permalink / raw)
To: Sean Christopherson
Cc: Xin Li, linux-kernel, Doug Covelli, Paolo Bonzini,
Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
On Wed, Apr 23, 2025 at 11:54 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Apr 23, 2025, Zack Rusin wrote:
> > On Wed, Apr 23, 2025 at 9:31 AM Sean Christopherson <seanjc@google.com> wrote:
> > > Heh, KVM_CAP_EXIT_ON_EMULATION_FAILURE is the odd one out. Even if that weren't
> > > the case, this is one of the situations where diverging from the existing code is
> > > desirable, because the existing code is garbage.
> > >
> > > arch/x86/kvm/x86.c: if (cap->args[0] & ~kvm_caps.supported_quirks)
> > > arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_X2APIC_API_VALID_FLAGS)
> > > arch/x86/kvm/x86.c: if (cap->args[0] & ~kvm_get_allowed_disable_exits())
> > > arch/x86/kvm/x86.c: (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
> > > arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_MSR_EXIT_REASON_VALID_MASK)
> > > arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_BUS_LOCK_DETECTION_VALID_MODE)
> > > arch/x86/kvm/x86.c: if (cap->args[0] & ~KVM_EXIT_HYPERCALL_VALID_MASK) {
> > > arch/x86/kvm/x86.c: if (cap->args[0] & ~1)
> > > arch/x86/kvm/x86.c: if (!enable_pmu || (cap->args[0] & ~KVM_CAP_PMU_VALID_MASK))
> > > arch/x86/kvm/x86.c: if ((u32)cap->args[0] & ~KVM_X86_NOTIFY_VMEXIT_VALID_BITS)
> > > virt/kvm/kvm_main.c: if (cap->flags || (cap->args[0] & ~allowed_options))
> >
> > That's because none of those other options are boolean, right? I
> > assumed that the options that have valid masks use defines but
> > booleans use ~1 because (val & ~1) makes it obvious to the reader that
> > the option is in fact a boolean in a way that (val &
> > ~KVM_SOME_VALID_BITS) can not.
>
> The entire reason when KVM checks and enforces cap->args[0] is so that KVM can
> expand the capability's functionality in the future. Whether or not a capability
> is *currently* a boolean, i.e. only has one supported flag, is completely irrelevant.
>
> KVM has burned itself many times over by not performing checks, e.g. is how we
> ended up with things like KVM_CAP_DISABLE_QUIRKS2.
>
> > > > Or are you saying that since I'm already there you'd like to see a
> > > > completely separate patch that defines some kind of IS_ZERO_OR_ONE
> > > > macro for KVM, use it for KVM_CAP_EXIT_ON_EMULATION_FAILURE and, once
> > > > that lands then I can make use of it in this series?
> > >
> > > Xin is suggesting that you add a macro in arch/x86/include/uapi/asm/kvm.h to
> > > #define which bits are valid and which bits are reserved.
> > >
> > > At a glance, you can kill multiple birds with one stone. Rather than add three
> > > separate capabilities, add one capability and then a variety of flags. E.g.
> > >
> > > #define KVM_X86_VMWARE_HYPERCALL _BITUL(0)
> > > #define KVM_X86_VMWARE_BACKDOOR _BITUL(1)
> > > #define KVM_X86_VMWARE_NESTED_BACKDOOR _BITUL(2)
> > > #define KVM_X86_VMWARE_VALID_FLAGS (KVM_X86_VMWARE_HYPERCALL |
> > > KVM_X86_VMWARE_BACKDOOR |
> > > KVM_X86_VMWARE_NESTED_BACKDOOR)
> > >
> > > case KVM_CAP_X86_VMWARE_EMULATION:
> > > r = -EINVAL;
> > > if (cap->args[0] & ~KVM_X86_VMWARE_VALID_FLAGS)
> > > break;
> > >
> > > mutex_lock(&kvm->lock);
> > > if (!kvm->created_vcpus) {
> > > if (cap->args[0] & KVM_X86_VMWARE_HYPERCALL)
> > > kvm->arch.vmware.hypercall_enabled = true;
> > > if (cap->args[0] & KVM_X86_VMWARE_BACKDOOR)
> > > kvm->arch.vmware.backdoor_enabled = true;
> > > if (cap->args[0] & KVM_X86_VMWARE_NESTED_BACKDOOR)
> > > kvm->arch.vmware.nested_backdoor_enabled = true;
> > > r = 0;
> > > }
> > > mutex_unlock(&kvm->lock);
> > > break;
> > >
> > > That approach wouldn't let userspace disable previously enabled VMware capabilities,
> > > but unless there's a use case for doing so, that should be a non-issue.
> >
> > I'd say that if we desperately want to use a single cap for all of
> > these then I'd probably prefer a different approach because this would
> > make vmware_backdoor_enabled behavior really wacky.
>
> How so? If kvm.enable_vmware_backdoor is true, then the backdoor is enabled
> for all VMs, else it's disabled by default but can be enabled on a per-VM basis
> by the new capability.
Like you said, if kvm.enable_vmware_backdoor is true, then it's
enabled for all VMs, so it'd make sense to allow disabling it on a
per-vm basis on those systems.
Just like when kvm.enable_vmware_backdoor is false, the cap can be
used to enable it on a per-vm basis.
> > It's the one that currently can only be set via kernel boot flags, so having
> > systems where the boot flag is on and disabling it on a per-vm basis makes
> > sense and breaks with this.
>
> We could go this route, e.g. KVM does something similar for PMU virtualization.
> But the key difference is that enable_pmu is enabled by default, whereas
> enable_vmware_backdoor is disabled by default. I.e. it makes far more sense for
> the capability to let userspace opt-in, as opposed to opt-out.
>
> > I'd probably still write the code to be able to disable/enable all of them
> > because it makes sense for vmware_backdoor_enabled.
>
> Again, that's not KVM's default, and it will never be KVM's default.
All I'm saying is that you can enable it on a whole system via the
boot flags and on the systems on which it has been turned on it'd make
sense to allow disabling it on a per-vm basis. Anyway, I'm sure I can
make it work correctly under any constraints, so let me try to
understand the issue because I'm not sure what we're solving here. Is
the problem the fact that we have three caps and instead want to
squeeze all of the functionality under one cap?
z
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-23 16:58 ` Zack Rusin
@ 2025-04-23 17:16 ` Sean Christopherson
2025-04-23 17:25 ` Zack Rusin
0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2025-04-23 17:16 UTC (permalink / raw)
To: Zack Rusin
Cc: Xin Li, linux-kernel, Doug Covelli, Paolo Bonzini,
Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
On Wed, Apr 23, 2025, Zack Rusin wrote:
> On Wed, Apr 23, 2025 at 11:54 AM Sean Christopherson <seanjc@google.com> wrote:
> > > I'd say that if we desperately want to use a single cap for all of
> > > these then I'd probably prefer a different approach because this would
> > > make vmware_backdoor_enabled behavior really wacky.
> >
> > How so? If kvm.enable_vmware_backdoor is true, then the backdoor is enabled
> > for all VMs, else it's disabled by default but can be enabled on a per-VM basis
> > by the new capability.
>
> Like you said if kvm.enable_vmware_backdoor is true, then it's
> enabled for all VMs, so it'd make sense to allow disabling it on a
> per-vm basis on those systems.
> Just like when the kvm.enable_vmware_backdoor is false, the cap can be
> used to enable it on a per-vm basis.
Why? What use case does that serve?
> > > It's the one that currently can only be set via kernel boot flags, so having
> > > systems where the boot flag is on and disabling it on a per-vm basis makes
> > > sense and breaks with this.
> >
> > We could go this route, e.g. KVM does something similar for PMU virtualization.
> > But the key difference is that enable_pmu is enabled by default, whereas
> > enable_vmware_backdoor is disabled by default. I.e. it makes far more sense for
> > the capability to let userspace opt-in, as opposed to opt-out.
> >
> > > I'd probably still write the code to be able to disable/enable all of them
> > > because it makes sense for vmware_backdoor_enabled.
> >
> > Again, that's not KVM's default, and it will never be KVM's default.
>
> All I'm saying is that you can enable it on a whole system via the
> boot flags and on the systems on which it has been turned on it'd make
> sense to allow disabling it on a per-vm basis.
Again, why would anyone do that? If you *know* you're going to run some VMs
with VMware emulation and some without, the sane approach is to not touch the
module param and rely entirely on the capability. Otherwise the VMM must be
able to opt-out, which means that running an older userspace that doesn't know
about the new capability *can't* opt-out.
The only reason to even keep the module param is to not break existing users,
e.g. to be able to run VMs that want VMware functionality using an existing VMM.
> Anyway, I'm sure I can make it work correctly under any constraints, so let
> me try to understand the issue because I'm not sure what we're solving here.
> Is the problem the fact that we have three caps and instead want to squeeze
> all of the functionality under one cap?
The "problem" is that I don't want to add complexity and create ABI for a use
case that doesn't exist.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-23 17:16 ` Sean Christopherson
@ 2025-04-23 17:25 ` Zack Rusin
2025-04-23 18:57 ` Sean Christopherson
0 siblings, 1 reply; 17+ messages in thread
From: Zack Rusin @ 2025-04-23 17:25 UTC (permalink / raw)
To: Sean Christopherson
Cc: Xin Li, linux-kernel, Doug Covelli, Paolo Bonzini,
Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
On Wed, Apr 23, 2025 at 1:16 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Apr 23, 2025, Zack Rusin wrote:
> > On Wed, Apr 23, 2025 at 11:54 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > I'd say that if we desperately want to use a single cap for all of
> > > > these then I'd probably prefer a different approach because this would
> > > > make vmware_backdoor_enabled behavior really wacky.
> > >
> > > How so? If kvm.enable_vmware_backdoor is true, then the backdoor is enabled
> > > for all VMs, else it's disabled by default but can be enabled on a per-VM basis
> > > by the new capability.
> >
> > Like you said if kvm.enable_vmware_backdoor is true, then it's
> > enabled for all VMs, so it'd make sense to allow disabling it on a
> > per-vm basis on those systems.
> > Just like when the kvm.enable_vmware_backdoor is false, the cap can be
> > used to enable it on a per-vm basis.
>
> Why? What use case does that serve?
Testing purposes?
> > > > It's the one that currently can only be set via kernel boot flags, so having
> > > > systems where the boot flag is on and disabling it on a per-vm basis makes
> > > > sense and breaks with this.
> > >
> > > We could go this route, e.g. KVM does something similar for PMU virtualization.
> > > But the key difference is that enable_pmu is enabled by default, whereas
> > > enable_vmware_backdoor is disabled by default. I.e. it makes far more sense for
> > > the capability to let userspace opt-in, as opposed to opt-out.
> > >
> > > > I'd probably still write the code to be able to disable/enable all of them
> > > > because it makes sense for vmware_backdoor_enabled.
> > >
> > > Again, that's not KVM's default, and it will never be KVM's default.
> >
> > All I'm saying is that you can enable it on a whole system via the
> > boot flags and on the systems on which it has been turned on it'd make
> > sense to allow disabling it on a per-vm basis.
>
> Again, why would anyone do that? If you *know* you're going to run some VMs
> with VMware emulation and some without, the sane approach is to not touch the
> module param and rely entirely on the capability. Otherwise the VMM must be
> able to opt-out, which means that running an older userspace that doesn't know
> about the new capability *can't* opt-out.
>
> The only reason to even keep the module param is to not break existing users,
> e.g. to be able to run VMs that want VMware functionality using an existing VMM.
>
> > Anyway, I'm sure I can make it work correctly under any constraints, so let
> > me try to understand the issue because I'm not sure what we're solving here.
> > Is the problem the fact that we have three caps and instead want to squeeze
> > all of the functionality under one cap?
>
> The "problem" is that I don't want to add complexity and create ABI for a use
> case that doesn't exist.
Would you like to see a v3 where I specifically do not allow disabling
those caps?
z
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-23 17:25 ` Zack Rusin
@ 2025-04-23 18:57 ` Sean Christopherson
2025-04-23 20:01 ` Zack Rusin
0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2025-04-23 18:57 UTC (permalink / raw)
To: Zack Rusin
Cc: Xin Li, linux-kernel, Doug Covelli, Paolo Bonzini,
Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
On Wed, Apr 23, 2025, Zack Rusin wrote:
> On Wed, Apr 23, 2025 at 1:16 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Apr 23, 2025, Zack Rusin wrote:
> > > On Wed, Apr 23, 2025 at 11:54 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > > I'd say that if we desperately want to use a single cap for all of
> > > > > these then I'd probably prefer a different approach because this would
> > > > > make vmware_backdoor_enabled behavior really wacky.
> > > >
> > > > How so? If kvm.enable_vmware_backdoor is true, then the backdoor is enabled
> > > > for all VMs, else it's disabled by default but can be enabled on a per-VM basis
> > > > by the new capability.
> > >
> > > Like you said if kvm.enable_vmware_backdoor is true, then it's
> > > enabled for all VMs, so it'd make sense to allow disabling it on a
> > > per-vm basis on those systems.
> > > Just like when the kvm.enable_vmware_backdoor is false, the cap can be
> > > used to enable it on a per-vm basis.
> >
> > Why? What use case does that serve?
>
> Testing purposes?
Heh, testing what? To have heterogeneous VMware emulation settings on a single
host, at least one of the VMMs needs to have been updated to utilize the new
capability. Updating the VMM that doesn't want VMware emulation makes zero sense,
because that would limit testing to only the non-nested backdoor.
> > > > > It's the one that currently can only be set via kernel boot flags, so having
> > > > > systems where the boot flag is on and disabling it on a per-vm basis makes
> > > > > sense and breaks with this.
> > > >
> > > > We could go this route, e.g. KVM does something similar for PMU virtualization.
> > > > But the key difference is that enable_pmu is enabled by default, whereas
> > > > enable_vmware_backdoor is disabled by default. I.e. it makes far more sense for
> > > > the capability to let userspace opt-in, as opposed to opt-out.
> > > >
> > > > > I'd probably still write the code to be able to disable/enable all of them
> > > > > because it makes sense for vmware_backdoor_enabled.
> > > >
> > > > Again, that's not KVM's default, and it will never be KVM's default.
> > >
> > > All I'm saying is that you can enable it on a whole system via the
> > > boot flags and on the systems on which it has been turned on it'd make
> > > sense to allow disabling it on a per-vm basis.
> >
> > Again, why would anyone do that? If you *know* you're going to run some VMs
> > with VMware emulation and some without, the sane approach is to not touch the
> > module param and rely entirely on the capability. Otherwise the VMM must be
> > able to opt-out, which means that running an older userspace that doesn't know
> > about the new capability *can't* opt-out.
> >
> > The only reason to even keep the module param is to not break existing users,
> > e.g. to be able to run VMs that want VMware functionality using an existing VMM.
> >
> > > Anyway, I'm sure I can make it work correctly under any constraints, so let
> > > me try to understand the issue because I'm not sure what we're solving here.
> > > Is the problem the fact that we have three caps and instead want to squeeze
> > > all of the functionality under one cap?
> >
> > The "problem" is that I don't want to add complexity and create ABI for a use
> > case that doesn't exist.
>
> Would you like to see a v3 where I specifically do not allow disabling
> those caps?
Yes. Though I recommend waiting to send a v3 until I (and others) have had a
chance to review the rest of the patches, e.g. to avoid wasting your time.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-23 18:57 ` Sean Christopherson
@ 2025-04-23 20:01 ` Zack Rusin
0 siblings, 0 replies; 17+ messages in thread
From: Zack Rusin @ 2025-04-23 20:01 UTC (permalink / raw)
To: Sean Christopherson
Cc: Xin Li, linux-kernel, Doug Covelli, Paolo Bonzini,
Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, kvm, linux-doc
On Wed, Apr 23, 2025 at 2:58 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Apr 23, 2025, Zack Rusin wrote:
> > On Wed, Apr 23, 2025 at 1:16 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Wed, Apr 23, 2025, Zack Rusin wrote:
> > > > On Wed, Apr 23, 2025 at 11:54 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > > > I'd say that if we desperately want to use a single cap for all of
> > > > > > these then I'd probably prefer a different approach because this would
> > > > > > make vmware_backdoor_enabled behavior really wacky.
> > > > >
> > > > > How so? If kvm.enable_vmware_backdoor is true, then the backdoor is enabled
> > > > > for all VMs, else it's disabled by default but can be enabled on a per-VM basis
> > > > > by the new capability.
> > > >
> > > > Like you said if kvm.enable_vmware_backdoor is true, then it's
> > > > enabled for all VMs, so it'd make sense to allow disabling it on a
> > > > per-vm basis on those systems.
> > > > Just like when the kvm.enable_vmware_backdoor is false, the cap can be
> > > > used to enable it on a per-vm basis.
> > >
> > > Why? What use case does that serve?
> >
> > Testing purposes?
>
> Heh, testing what?
Running VMware and non-VMware guests on the same system... I'm in a
weird spot here, defending code that isn't my own, so I'd prefer not to
have to. We don't use kvm.enable_vmware_backdoor as a boot flag and we
haven't written that code, so I don't want to argue on its behalf either
way. I was just trying to make sure it works nicely with the new caps.
In this case having it just work is actually less effort than making it
not work, so it seemed like the nice and proper thing to do.
> > > > > > It's the one that currently can only be set via kernel boot flags, so having
> > > > > > systems where the boot flag is on and disabling it on a per-vm basis makes
> > > > > > sense and breaks with this.
> > > > >
> > > > > We could go this route, e.g. KVM does something similar for PMU virtualization.
> > > > > But the key difference is that enable_pmu is enabled by default, whereas
> > > > > enable_vmware_backdoor is disabled by default. I.e. it makes far more sense for
> > > > > the capability to let userspace opt-in, as opposed to opt-out.
> > > > >
> > > > > > I'd probably still write the code to be able to disable/enable all of them
> > > > > > because it makes sense for vmware_backdoor_enabled.
> > > > >
> > > > > Again, that's not KVM's default, and it will never be KVM's default.
> > > >
> > > > All I'm saying is that you can enable it on a whole system via the
> > > > boot flags and on the systems on which it has been turned on it'd make
> > > > sense to allow disabling it on a per-vm basis.
> > >
> > > Again, why would anyone do that? If you *know* you're going to run some VMs
> > > with VMware emulation and some without, the sane approach is to not touch the
> > > module param and rely entirely on the capability. Otherwise the VMM must be
> > > able to opt-out, which means that running an older userspace that doesn't know
> > > about the new capability *can't* opt-out.
> > >
> > > The only reason to even keep the module param is to not break existing users,
> > > e.g. to be able to run VMs that want VMware functionality using an existing VMM.
> > >
> > > > Anyway, I'm sure I can make it work correctly under any constraints, so let
> > > > me try to understand the issue because I'm not sure what we're solving here.
> > > > Is the problem the fact that we have three caps and instead want to squeeze
> > > > all of the functionality under one cap?
> > >
> > > The "problem" is that I don't want to add complexity and create ABI for a use
> > > case that doesn't exist.
> >
> > Would you like to see a v3 where I specifically do not allow disabling
> > those caps?
>
> Yes. Though I recommend waiting to send a v3 until I (and others) have had a
> chance to review the rest of the patches, e.g. to avoid wasting your time.
Sounds good.
z
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 2/5] KVM: x86: Allow enabling of the vmware backdoor via a cap
2025-04-22 16:12 ` [PATCH v2 2/5] KVM: x86: Allow enabling of the vmware backdoor via a cap Zack Rusin
@ 2025-07-23 18:06 ` Sean Christopherson
0 siblings, 0 replies; 17+ messages in thread
From: Sean Christopherson @ 2025-07-23 18:06 UTC (permalink / raw)
To: Zack Rusin
Cc: linux-kernel, Doug Covelli, Paolo Bonzini, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, kvm, linux-doc
On Tue, Apr 22, 2025, Zack Rusin wrote:
> @@ -6735,6 +6734,19 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> mutex_unlock(&kvm->lock);
> break;
> }
> +#ifdef CONFIG_KVM_VMWARE
> + case KVM_CAP_X86_VMWARE_BACKDOOR:
I much prefer using a single KVM_CAP_X86_VMWARE capability. More below.
> + r = -EINVAL;
> + if (cap->args[0] & ~1)
Using bit 0 for "enable" needs to be #defined arch/x86/include/uapi/asm/kvm.h.
At that point, adding more capabilities for the other VMware functionality doesn't
make much sense, especially since the capabilities that are added in later patches
don't have the kvm->created_vcpus protection, i.e. are likely buggy.
E.g. with the below diff (completely untested, probably won't apply cleanly?)
spread across three-ish patches, the accessors can be:
static inline bool kvm_is_vmware_cap_enabled(struct kvm *kvm, u64 cap)
{
return kvm->arch.vmware.caps & cap;
}
static inline bool kvm_is_vmware_backdoor_enabled(struct kvm_vcpu *vcpu)
{
return kvm_is_vmware_cap_enabled(vcpu->kvm, KVM_VMWARE_ENABLE_BACKDOOR);
}
static inline bool kvm_is_vmware_hypercall_enabled(struct kvm *kvm)
{
return kvm_is_vmware_cap_enabled(kvm, KVM_VMWARE_ENABLE_HYPERCALL);
}
static inline bool kvm_vmware_nested_backdoor_l0_enabled(struct kvm *kvm)
{
return kvm_is_vmware_cap_enabled(kvm, KVM_VMWARE_ENABLE_BACKDOOR) &&
kvm_is_vmware_cap_enabled(kvm, KVM_VMWARE_ENABLE_NESTED_BACKDOOR);
}
---
arch/x86/include/asm/kvm_host.h | 4 +---
arch/x86/include/uapi/asm/kvm.h | 4 ++++
arch/x86/kvm/x86.c | 34 ++++++++++++---------------------
3 files changed, 17 insertions(+), 25 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 639c49db0106..1433cdd14675 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1219,9 +1219,7 @@ struct kvm_xen {
#ifdef CONFIG_KVM_VMWARE
/* VMware emulation context */
struct kvm_vmware {
- bool backdoor_enabled;
- bool hypercall_enabled;
- bool nested_backdoor_l0_enabled;
+ u64 caps;
};
#endif
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index e019111e2150..ae578422d6f4 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -1013,4 +1013,8 @@ struct kvm_tdx_init_mem_region {
__u64 nr_pages;
};
+#define KVM_VMWARE_ENABLE_BACKDOOR _BITULL(0)
+#define KVM_VMWARE_ENABLE_HYPERCALL _BITULL(1)
+#define KVM_VMWARE_ENABLE_NESTED_BACKDOOR _BITULL(2)
+
#endif /* _ASM_X86_KVM_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7234333a92d8..b9e2faf0ceb7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -126,6 +126,10 @@ static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE);
#define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \
KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)
+#define KVM_CAP_VMWARE_VALID_MASK (KVM_VMWARE_ENABLE_BACKDOOR | \
+ KVM_VMWARE_ENABLE_HYPERCALL | \
+ KVM_VMWARE_ENABLE_NESTED_BACKDOOR)
+
static void update_cr8_intercept(struct kvm_vcpu *vcpu);
static void process_nmi(struct kvm_vcpu *vcpu);
static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
@@ -4708,11 +4712,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_IRQFD_RESAMPLE:
case KVM_CAP_MEMORY_FAULT_INFO:
case KVM_CAP_X86_GUEST_MODE:
-#ifdef CONFIG_KVM_VMWARE
- case KVM_CAP_X86_VMWARE_BACKDOOR:
- case KVM_CAP_X86_VMWARE_HYPERCALL:
- case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
-#endif
r = 1;
break;
case KVM_CAP_PRE_FAULT_MEMORY:
@@ -4836,6 +4835,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_READONLY_MEM:
r = kvm ? kvm_arch_has_readonly_mem(kvm) : 1;
break;
+#ifdef CONFIG_KVM_VMWARE
+ case KVM_CAP_X86_VMWARE:
+ return KVM_CAP_VMWARE_VALID_MASK;
+#endif
default:
break;
}
@@ -6669,31 +6672,18 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
break;
}
#ifdef CONFIG_KVM_VMWARE
- case KVM_CAP_X86_VMWARE_BACKDOOR:
+ case KVM_CAP_X86_VMWARE:
r = -EINVAL;
- if (cap->args[0] & ~1)
+ if (cap->args[0] & ~KVM_CAP_VMWARE_VALID_MASK)
break;
+
mutex_lock(&kvm->lock);
if (!kvm->created_vcpus) {
- kvm->arch.vmware.backdoor_enabled = cap->args[0];
+ kvm->arch.vmware.caps = cap->args[0];
r = 0;
}
mutex_unlock(&kvm->lock);
break;
- case KVM_CAP_X86_VMWARE_HYPERCALL:
- r = -EINVAL;
- if (cap->args[0] & ~1)
- break;
- kvm->arch.vmware.hypercall_enabled = cap->args[0];
- r = 0;
- break;
- case KVM_CAP_X86_VMWARE_NESTED_BACKDOOR_L0:
- r = -EINVAL;
- if (cap->args[0] & ~1)
- break;
- kvm->arch.vmware.nested_backdoor_l0_enabled = cap->args[0];
- r = 0;
- break;
#endif
default:
r = -EINVAL;
base-commit: 77a53b6f5d1c2dabef34d890d212910ed1f43bcb
--
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v2 3/5] KVM: x86: Add support for VMware guest specific hypercalls
2025-04-22 16:12 ` [PATCH v2 3/5] KVM: x86: Add support for VMware guest specific hypercalls Zack Rusin
@ 2025-07-23 18:14 ` Sean Christopherson
0 siblings, 0 replies; 17+ messages in thread
From: Sean Christopherson @ 2025-07-23 18:14 UTC (permalink / raw)
To: Zack Rusin
Cc: linux-kernel, Doug Covelli, Paolo Bonzini, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, kvm, linux-doc
On Tue, Apr 22, 2025, Zack Rusin wrote:
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 9e3be87fc82b..f817601924bd 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -183,11 +183,13 @@ config KVM_VMWARE
> depends on KVM
> default y
> help
> - Provides KVM support for hosting VMware guests. Adds support for
> - VMware legacy backdoor interface: VMware tools and various userspace
> + Provides KVM support for hosting VMware guests. KVM features that can
> + be turned on when this option is enabled include:
> + - VMware legacy backdoor interface: VMware tools and various userspace
> utilities running in VMware guests sometimes utilize specially
> formatted IN, OUT and RDPMC instructions which need to be
> intercepted.
> + - VMware hypercall interface: VMware hypercalls exit to userspace
Eh, I wouldn't bother enumerating the full set of features in the Kconfig. Just
state that it guards VMware emulation, and let the documentation do the heavy
lifting.
> +static inline bool kvm_vmware_hypercall_enabled(struct kvm *kvm)
> +{
> + return false;
> +}
> +
> +static inline int kvm_vmware_hypercall(struct kvm_vcpu *vcpu)
> +{
> + return 0;
> +}
If we do this right, we shouldn't need a stub for kvm_vmware_hypercall(), and
can instead unconditionally _declare_ kvm_vmware_hypercall(), but only fully
define/implement it when CONFIG_KVM_VMWARE=y. The kvm_is_vmware_xxx() stubs will
probably need to be __always_inline, but otherwise things should Just Work.
So long as kvm_is_vmware_hypercall_enabled() can be resolved to a compile-time
constant (of %false), the compiler's dead-code optimization will drop the call
to kvm_vmware_hypercall() before linking. KVM (and the kernel at-large) already
heavily relies on dead-code optimization, e.g. we use this trick for sev.c APIs.
In addition to avoiding a stub, if we screw up, e.g. add an unguarded function
call, the bug will manifest as a link-time error, not a run-time error.
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 793d0cf7ae3c..adf1a1449c06 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -135,6 +135,27 @@ struct kvm_xen_exit {
> } u;
> };
>
> +struct kvm_vmware_exit {
> +#define KVM_EXIT_VMWARE_HCALL 1
> + __u32 type;
> + __u32 pad1;
> + union {
> + struct {
> + __u32 longmode;
> + __u32 cpl;
> + __u64 rax, rbx, rcx, rdx, rsi, rdi, rbp;
> + __u64 result;
> + struct {
> + __u32 inject;
> + __u32 pad2;
> + __u32 vector;
> + __u32 error_code;
> + __u64 address;
> + } exception;
> + } hcall;
> + };
> +};
Put this in the x86 header, arch/x86/include/uapi/asm/kvm.h. The capability
itself goes in the arch-neutral header so that KVM doesn't have to worry about
collisions between capability numbers, but any arch specific payloads should go
in the arch header(s).
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups
2025-04-22 16:12 ` [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups Zack Rusin
2025-04-23 7:56 ` Xin Li
@ 2025-07-23 18:19 ` Sean Christopherson
1 sibling, 0 replies; 17+ messages in thread
From: Sean Christopherson @ 2025-07-23 18:19 UTC (permalink / raw)
To: Zack Rusin
Cc: linux-kernel, Doug Covelli, Paolo Bonzini, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, kvm, linux-doc
On Tue, Apr 22, 2025, Zack Rusin wrote:
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 04c375bf1ac2..74c472e51479 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -22,6 +22,7 @@
> #include <asm/debugreg.h>
>
> #include "kvm_emulate.h"
> +#include "kvm_vmware.h"
> #include "trace.h"
> #include "mmu.h"
> #include "x86.h"
> @@ -1517,6 +1518,11 @@ int nested_svm_exit_special(struct vcpu_svm *svm)
> svm->vcpu.arch.apf.host_apf_flags)
> /* Trap async PF even if not shadowing */
> return NESTED_EXIT_HOST;
> +#ifdef CONFIG_KVM_VMWARE
> + else if ((exit_code == (SVM_EXIT_EXCP_BASE + GP_VECTOR)) &&
> + kvm_vmware_wants_backdoor_to_l0(vcpu, to_svm(vcpu)->vmcb->save.cpl))
> + return NESTED_EXIT_HOST;
> +#endif
Either provide a stub or do
else if (IS_ENABLED(CONFIG_KVM_VMWARE) && ...)
Don't do both. And definitely don't add a stub and #ifdef (some) callers. I'd
say just drop the #ifdef and rely on the kvm_vmware_wants_backdoor_to_l0() stub
to get the compiler to optimize out the entire elif.
> @@ -6386,6 +6387,11 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu,
> return true;
> else if (is_ve_fault(intr_info))
> return true;
> +#ifdef CONFIG_KVM_VMWARE
> + else if (is_gp_fault(intr_info) &&
> + kvm_vmware_wants_backdoor_to_l0(vcpu, vmx_get_cpl(vcpu)))
> + return true;
> +#endif
Same thing here.
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2025-07-23 18:19 UTC | newest]
Thread overview: 17+ messages -- links below jump to the message on this page --
2025-04-22 16:12 [PATCH v2 0/5] KVM: Improve VMware guest support Zack Rusin
2025-04-22 16:12 ` [PATCH v2 2/5] KVM: x86: Allow enabling of the vmware backdoor via a cap Zack Rusin
2025-07-23 18:06 ` Sean Christopherson
2025-04-22 16:12 ` [PATCH v2 3/5] KVM: x86: Add support for VMware guest specific hypercalls Zack Rusin
2025-07-23 18:14 ` Sean Christopherson
2025-04-22 16:12 ` [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups Zack Rusin
2025-04-23 7:56 ` Xin Li
2025-04-23 11:43 ` Zack Rusin
2025-04-23 13:31 ` Sean Christopherson
2025-04-23 15:36 ` Zack Rusin
2025-04-23 15:54 ` Sean Christopherson
2025-04-23 16:58 ` Zack Rusin
2025-04-23 17:16 ` Sean Christopherson
2025-04-23 17:25 ` Zack Rusin
2025-04-23 18:57 ` Sean Christopherson
2025-04-23 20:01 ` Zack Rusin
2025-07-23 18:19 ` Sean Christopherson