public inbox for kvm@vger.kernel.org
* [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching
@ 2024-11-28  1:33 Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 01/57] KVM: x86: Use feature_bit() to clear CONSTANT_TSC when emulating CPUID Sean Christopherson
                   ` (58 more replies)
  0 siblings, 59 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

The super short TL;DR: snapshot all X86_FEATURE_* flags that KVM cares
about so that all queries against guest capabilities are "fast", e.g. don't
require manual enabling or judgment calls as to where a feature needs to be
fast.

The guest_cpu_cap_* nomenclature follows the existing kvm_cpu_cap_*
except for a few (maybe just one?) cases where guest cpu_caps need APIs
that kvm_cpu_caps don't.  In theory, the similar names will make this
approach more intuitive.

This series also adds more hardening, e.g. to assert at compile-time if a
feature flag is passed to the wrong word.  It also sets the stage for even
more hardening in the future, as tracking all KVM-supported features allows
shoving known vs. used features into arrays at compile time, which can then
be checked for consistency irrespective of hardware support.  E.g. allows
detecting if KVM is checking a feature without advertising it to userspace.
This extra hardening is future work; I have it mostly working, but it's ugly
and requires a runtime check to process the generated arrays.
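
As an illustration only (this is not code from the series), a compile-time
"wrong word" check can be sketched in plain C.  Every toy_*/TOY_* name below is
hypothetical; the only thing borrowed from the kernel is the word * 32 + bit
encoding used by cpufeatures.h.

```c
#include <stdint.h>

/* Hypothetical word indices; the kernel's enum cpuid_leafs has many more. */
enum toy_word { TOY_WORD_1_ECX, TOY_WORD_7_EBX, TOY_NR_WORDS };

/* A feature is encoded as word * 32 + bit. */
#define TOY_FEATURE(word, bit)	((word) * 32 + (bit))
#define TOY_FEATURE_MOVBE	TOY_FEATURE(TOY_WORD_1_ECX, 22)
#define TOY_FEATURE_RTM		TOY_FEATURE(TOY_WORD_7_EBX, 11)

/*
 * Evaluates to 0 when cond is true.  When cond is false the array size is
 * negative, which is a compile error.
 */
#define TOY_BUILD_ASSERT_ZERO(cond)	(sizeof(char[(cond) ? 1 : -1]) - 1)

/* Mask for a feature, usable only while building the word it lives in. */
#define TOY_F(word, feature)						\
	((uint32_t)(1u << ((feature) % 32)) |				\
	 (uint32_t)TOY_BUILD_ASSERT_ZERO((feature) / 32 == (word)))

static inline uint32_t toy_init_word_7_ebx(void)
{
	/* TOY_F(TOY_WORD_7_EBX, TOY_FEATURE_MOVBE) here would break the build. */
	return TOY_F(TOY_WORD_7_EBX, TOY_FEATURE_RTM);
}
```

The check costs nothing at runtime because the assertion term folds to zero;
a mismatched word/feature pair simply fails to compile.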

There are *multiple* potentially breaking changes in this series (in for a
penny, in for a pound).  However, I don't expect any fallout for real world
VMMs because the ABI changes either disallow things that couldn't possibly
have worked in the first place, or are following in the footsteps of other
behaviors, e.g. KVM advertises x2APIC, which is 100% dependent on an in-kernel
local APIC.

 * Disallow stuffing CPUID-dependent guest CR4 features before setting guest
   CPUID.
 * Disallow KVM_CAP_X86_DISABLE_EXITS after vCPU creation.
 * Reject disabling of MWAIT/HLT interception when not allowed.
 * Advertise TSC_DEADLINE_TIMER in KVM_GET_SUPPORTED_CPUID.
 * Advertise HYPERVISOR in KVM_GET_SUPPORTED_CPUID.

Validated the flag rework by comparing the output of KVM_GET_SUPPORTED_CPUID
(and the emulated version) at the beginning and end of the series, on AMD
and Intel hosts that should support almost every feature known to KVM.

Maxim, I did my best to incorporate all of your feedback, and when we
disagreed, I tried to find an approach that we can hopefully both live
with, at least until someone comes up with a better idea.

I _think_ the only suggestion that I "rejected" entirely is the existence
of ALIASED_1_EDX_F.  I responded to the previous thread, definitely feel
free to continue the conversation there (or here).

If I missed something you care strongly about, please holler!

v3:
 - Collect more reviews.
 - Too many to list.
 
v2:
 - Collect a few reviews (though I dropped several due to the patches changing
   significantly).
 - Incorporate KVM's support into the vCPU's cpu_caps. [Maxim]
 - A massive pile of new patches.

Sean Christopherson (57):
  KVM: x86: Use feature_bit() to clear CONSTANT_TSC when emulating CPUID
  KVM: x86: Limit use of F() and SF() to
    kvm_cpu_cap_{mask,init_kvm_defined}()
  KVM: x86: Do all post-set CPUID processing during vCPU creation
  KVM: x86: Explicitly do runtime CPUID updates "after" initial setup
  KVM: x86: Account for KVM-reserved CR4 bits when passing through CR4
    on VMX
  KVM: selftests: Update x86's set_sregs_test to match KVM's CPUID
    enforcement
  KVM: selftests: Assert that vcpu->cpuid is non-NULL when getting CPUID
    entries
  KVM: selftests: Refresh vCPU CPUID cache in __vcpu_get_cpuid_entry()
  KVM: selftests: Verify KVM stuffs runtime CPUID OS bits on CR4 writes
  KVM: x86: Move __kvm_is_valid_cr4() definition to x86.h
  KVM: x86/pmu: Drop now-redundant refresh() during init()
  KVM: x86: Drop now-redundant MAXPHYADDR and GPA rsvd bits from vCPU
    creation
  KVM: x86: Disallow KVM_CAP_X86_DISABLE_EXITS after vCPU creation
  KVM: x86: Reject disabling of MWAIT/HLT interception when not allowed
  KVM: x86: Drop the now unused KVM_X86_DISABLE_VALID_EXITS
  KVM: selftests: Fix a bad TEST_REQUIRE() in x86's KVM PV test
  KVM: selftests: Update x86's KVM PV test to match KVM's disabling
    exits behavior
  KVM: x86: Zero out PV features cache when the CPUID leaf is not
    present
  KVM: x86: Don't update PV features caches when enabling enforcement
    capability
  KVM: x86: Do reverse CPUID sanity checks in __feature_leaf()
  KVM: x86: Account for max supported CPUID leaf when getting raw host
    CPUID
  KVM: x86: Unpack F() CPUID feature flag macros to one flag per line of
    code
  KVM: x86: Rename kvm_cpu_cap_mask() to kvm_cpu_cap_init()
  KVM: x86: Add a macro to init CPUID features that are 64-bit only
  KVM: x86: Add a macro to precisely handle aliased 0x1.EDX CPUID
    features
  KVM: x86: Handle kernel- and KVM-defined CPUID words in a single
    helper
  KVM: x86: #undef SPEC_CTRL_SSBD in cpuid.c to avoid macro collisions
  KVM: x86: Harden CPU capabilities processing against out-of-scope
    features
  KVM: x86: Add a macro to init CPUID features that ignore host kernel
    support
  KVM: x86: Add a macro to init CPUID features that KVM emulates in
    software
  KVM: x86: Swap incoming guest CPUID into vCPU before massaging in
    KVM_SET_CPUID2
  KVM: x86: Clear PV_UNHALT for !HLT-exiting only when userspace sets
    CPUID
  KVM: x86: Remove unnecessary caching of KVM's PV CPUID base
  KVM: x86: Always operate on kvm_vcpu data in cpuid_entry2_find()
  KVM: x86: Move kvm_find_cpuid_entry{,_index}() up near
    cpuid_entry2_find()
  KVM: x86: Remove all direct usage of cpuid_entry2_find()
  KVM: x86: Advertise TSC_DEADLINE_TIMER in KVM_GET_SUPPORTED_CPUID
  KVM: x86: Advertise HYPERVISOR in KVM_GET_SUPPORTED_CPUID
  KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap"
  KVM: x86: Replace guts of "governed" features with comprehensive
    cpu_caps
  KVM: x86: Initialize guest cpu_caps based on guest CPUID
  KVM: x86: Extract code for generating per-entry emulated CPUID
    information
  KVM: x86: Treat MONITOR/MWAIT as a "partially emulated" feature
  KVM: x86: Initialize guest cpu_caps based on KVM support
  KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime
  KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns
    right leaf
  KVM: x86: Update OS{XSAVE,PKE} bits in guest CPUID irrespective of
    host support
  KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based
    features
  KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has()
  KVM: x86: Replace (almost) all guest CPUID feature queries with
    cpu_caps
  KVM: x86: Drop superfluous host XSAVE check when adjusting guest
    XSAVES caps
  KVM: x86: Add a macro for features that are synthesized into
    boot_cpu_data
  KVM: x86: Pull CPUID capabilities from boot_cpu_data only as needed
  KVM: x86: Rename "SF" macro to "SCATTERED_F"
  KVM: x86: Explicitly track feature flags that require vendor enabling
  KVM: x86: Explicitly track feature flags that are enabled at runtime
  KVM: x86: Use only local variables (no bitmask) to init kvm_cpu_caps

 Documentation/virt/kvm/api.rst                |  10 +-
 arch/x86/include/asm/kvm_host.h               |  47 +-
 arch/x86/kvm/cpuid.c                          | 967 ++++++++++++------
 arch/x86/kvm/cpuid.h                          | 128 +--
 arch/x86/kvm/governed_features.h              |  22 -
 arch/x86/kvm/hyperv.c                         |   2 +-
 arch/x86/kvm/lapic.c                          |   4 +-
 arch/x86/kvm/mmu.h                            |   2 +-
 arch/x86/kvm/mmu/mmu.c                        |   4 +-
 arch/x86/kvm/pmu.c                            |   1 -
 arch/x86/kvm/reverse_cpuid.h                  |  23 +-
 arch/x86/kvm/smm.c                            |  10 +-
 arch/x86/kvm/svm/nested.c                     |  22 +-
 arch/x86/kvm/svm/pmu.c                        |   8 +-
 arch/x86/kvm/svm/sev.c                        |  21 +-
 arch/x86/kvm/svm/svm.c                        |  46 +-
 arch/x86/kvm/svm/svm.h                        |   4 +-
 arch/x86/kvm/vmx/hyperv.h                     |   2 +-
 arch/x86/kvm/vmx/nested.c                     |  18 +-
 arch/x86/kvm/vmx/pmu_intel.c                  |   4 +-
 arch/x86/kvm/vmx/sgx.c                        |  14 +-
 arch/x86/kvm/vmx/vmx.c                        |  61 +-
 arch/x86/kvm/x86.c                            | 153 ++-
 arch/x86/kvm/x86.h                            |   6 +-
 include/uapi/linux/kvm.h                      |   4 -
 .../selftests/kvm/include/x86_64/processor.h  |  18 +-
 .../selftests/kvm/x86_64/kvm_pv_test.c        |  38 +-
 .../selftests/kvm/x86_64/set_sregs_test.c     |  63 +-
 28 files changed, 1017 insertions(+), 685 deletions(-)
 delete mode 100644 arch/x86/kvm/governed_features.h


base-commit: 4d911c7abee56771b0219a9fbf0120d06bdc9c14
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 01/57] KVM: x86: Use feature_bit() to clear CONSTANT_TSC when emulating CPUID
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-12-13 10:53   ` Vitaly Kuznetsov
  2024-11-28  1:33 ` [PATCH v3 02/57] KVM: x86: Limit use of F() and SF() to kvm_cpu_cap_{mask,init_kvm_defined}() Sean Christopherson
                   ` (57 subsequent siblings)
  58 siblings, 1 reply; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

When clearing CONSTANT_TSC during CPUID emulation due to a Hyper-V quirk,
use feature_bit() instead of SF() to ensure the bit is actually cleared.
SF() evaluates to zero if the _host_ doesn't support the feature.  I.e.
KVM could keep the bit set if userspace advertised CONSTANT_TSC despite
it not being supported in hardware.

Note, translating from a scattered feature to the hardware version is
done by __feature_translate(), not SF().  The sole purpose of SF() is to
check kernel support for the scattered feature, *before* translation.
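
To make the failure mode concrete, here is a minimal standalone sketch (not
code from the patch).  toy_feature_bit() and toy_sf() are hypothetical
stand-ins for feature_bit() and SF(), and the host_has parameter stands in for
boot_cpu_has().  Bit 8 of CPUID.0x80000007:EDX is the Invariant TSC bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.0x80000007:EDX bit 8 (Invariant TSC). */
#define TOY_CONSTANT_TSC_BIT	8

/* Stand-in for feature_bit(): always yields the bit mask. */
static inline uint32_t toy_feature_bit(void)
{
	return 1u << TOY_CONSTANT_TSC_BIT;
}

/* Stand-in for SF(): yields the mask only if the host has the feature. */
static inline uint32_t toy_sf(bool host_has)
{
	return host_has ? toy_feature_bit() : 0;
}

/* Old behavior: with !host_has the mask is 0, so "&= ~0" clears nothing. */
static inline uint32_t toy_clear_with_sf(uint32_t edx, bool host_has)
{
	return edx & ~toy_sf(host_has);
}

/* New behavior: the bit is cleared regardless of host support. */
static inline uint32_t toy_clear_with_feature_bit(uint32_t edx)
{
	return edx & ~toy_feature_bit();
}
```

With edx = 0x100 and a host that lacks the feature, toy_clear_with_sf()
returns 0x100 (bit still set), whereas toy_clear_with_feature_bit() returns 0.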

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 097bdc022d0f..776f24408fa3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1630,7 +1630,7 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 				*ebx &= ~(F(RTM) | F(HLE));
 		} else if (function == 0x80000007) {
 			if (kvm_hv_invtsc_suppressed(vcpu))
-				*edx &= ~SF(CONSTANT_TSC);
+				*edx &= ~feature_bit(CONSTANT_TSC);
 		}
 	} else {
 		*eax = *ebx = *ecx = *edx = 0;
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 02/57] KVM: x86: Limit use of F() and SF() to kvm_cpu_cap_{mask,init_kvm_defined}()
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 01/57] KVM: x86: Use feature_bit() to clear CONSTANT_TSC when emulating CPUID Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 03/57] KVM: x86: Do all post-set CPUID processing during vCPU creation Sean Christopherson
                   ` (56 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Define and undefine the F() and SF() macros precisely around
kvm_set_cpu_caps() to make it all but impossible to use the macros outside
of kvm_cpu_cap_{mask,init_kvm_defined}().  Currently, F() is a simple
passthrough, but SF() is actively dangerous as it checks that the scattered
feature is supported by the host kernel.

Any usage outside of the aforementioned helpers will also run afoul of
future changes to harden KVM's CPUID management.
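
The scoping technique itself is plain C preprocessor usage and can be sketched
in isolation.  The names below (TOY_BIT, toy_init_caps, toy_runtime_mask) are
hypothetical and only mimic the shape of the change; they are not taken from
cpuid.c.

```c
#include <stdint.h>

#define TOY_BIT(n)	(1u << (n))

/* The short alias exists only while the init function is compiled. */
#define F TOY_BIT

static inline uint32_t toy_init_caps(void)
{
	return F(4) | F(11);
}

#undef F

/* Guard: fail the build if the alias ever leaks past the #undef. */
#ifdef F
#error "F must not be visible outside of toy_init_caps()"
#endif

/* Later code has to spell out the long name. */
static inline uint32_t toy_runtime_mask(void)
{
	return TOY_BIT(11);
}
```

Any later attempt to write F(11) fails at build time (as an undeclared
identifier or implicit declaration, depending on the compiler and language
standard) instead of silently picking up init-only semantics.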

Opportunistically switch to feature_bit() when stuffing LA57 based on raw
hardware support.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 776f24408fa3..eb4b32bcfa56 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -61,15 +61,6 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
 	return ret;
 }
 
-#define F feature_bit
-
-/* Scattered Flag - For features that are scattered by cpufeatures.h. */
-#define SF(name)						\
-({								\
-	BUILD_BUG_ON(X86_FEATURE_##name >= MAX_CPU_FEATURES);	\
-	(boot_cpu_has(X86_FEATURE_##name) ? F(name) : 0);	\
-})
-
 /*
  * Magic value used by KVM when querying userspace-provided CPUID entries and
  * doesn't care about the CPIUD index because the index of the function in
@@ -604,6 +595,15 @@ static __always_inline void kvm_cpu_cap_mask(enum cpuid_leafs leaf, u32 mask)
 	__kvm_cpu_cap_mask(leaf);
 }
 
+#define F feature_bit
+
+/* Scattered Flag - For features that are scattered by cpufeatures.h. */
+#define SF(name)						\
+({								\
+	BUILD_BUG_ON(X86_FEATURE_##name >= MAX_CPU_FEATURES);	\
+	(boot_cpu_has(X86_FEATURE_##name) ? F(name) : 0);	\
+})
+
 void kvm_set_cpu_caps(void)
 {
 #ifdef CONFIG_X86_64
@@ -668,7 +668,7 @@ void kvm_set_cpu_caps(void)
 		F(SGX_LC) | F(BUS_LOCK_DETECT)
 	);
 	/* Set LA57 based on hardware capability. */
-	if (cpuid_ecx(7) & F(LA57))
+	if (cpuid_ecx(7) & feature_bit(LA57))
 		kvm_cpu_cap_set(X86_FEATURE_LA57);
 
 	/*
@@ -850,6 +850,9 @@ void kvm_set_cpu_caps(void)
 }
 EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 
+#undef F
+#undef SF
+
 struct kvm_cpuid_array {
 	struct kvm_cpuid_entry2 *entries;
 	int maxnent;
@@ -925,14 +928,14 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
 		++array->nent;
 		break;
 	case 1:
-		entry->ecx = F(MOVBE);
+		entry->ecx = feature_bit(MOVBE);
 		++array->nent;
 		break;
 	case 7:
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 		entry->eax = 0;
 		if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
-			entry->ecx = F(RDPID);
+			entry->ecx = feature_bit(RDPID);
 		++array->nent;
 		break;
 	default:
@@ -1082,7 +1085,7 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 			goto out;
 
 		cpuid_entry_override(entry, CPUID_D_1_EAX);
-		if (entry->eax & (F(XSAVES)|F(XSAVEC)))
+		if (entry->eax & (feature_bit(XSAVES) | feature_bit(XSAVEC)))
 			entry->ebx = xstate_required_size(permitted_xcr0 | permitted_xss,
 							  true);
 		else {
@@ -1627,7 +1630,7 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 			u64 data;
 		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
 			    (data & TSX_CTRL_CPUID_CLEAR))
-				*ebx &= ~(F(RTM) | F(HLE));
+				*ebx &= ~(feature_bit(RTM) | feature_bit(HLE));
 		} else if (function == 0x80000007) {
 			if (kvm_hv_invtsc_suppressed(vcpu))
 				*edx &= ~feature_bit(CONSTANT_TSC);
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 03/57] KVM: x86: Do all post-set CPUID processing during vCPU creation
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 01/57] KVM: x86: Use feature_bit() to clear CONSTANT_TSC when emulating CPUID Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 02/57] KVM: x86: Limit use of F() and SF() to kvm_cpu_cap_{mask,init_kvm_defined}() Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 04/57] KVM: x86: Explicitly do runtime CPUID updates "after" initial setup Sean Christopherson
                   ` (55 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

During vCPU creation, process KVM's default, empty CPUID as if userspace
set an empty CPUID to ensure consistent and correct behavior with respect
to guest CPUID.  E.g. if userspace never sets guest CPUID, KVM will never
configure cr4_guest_rsvd_bits, and thus create divergent, incorrect, guest-
visible behavior due to letting the guest set any KVM-supported CR4 bits
despite the features not being allowed per guest CPUID.

Note!  This changes KVM's ABI, as lack of full CPUID processing allowed
userspace to stuff garbage vCPU state, e.g. userspace could set CR4 to a
guest-unsupported value via KVM_SET_SREGS.  But it's extremely unlikely
that this is a breaking change, as KVM already has many flows that require
userspace to set guest CPUID before loading vCPU state.  E.g. multiple MSR
flows consult guest CPUID on host writes, and KVM_SET_SREGS itself already
relies on guest CPUID being up-to-date, as KVM's validity check on CR3
consumes CPUID.0x7.1 (for LAM) and CPUID.0x80000008 (for MAXPHYADDR).

Furthermore, the plan is to commit to enforcing guest CPUID for userspace
writes to MSRs, at which point bypassing sregs CPUID checks is even more
nonsensical.
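
A toy model of the problem, for illustration only: struct toy_vcpu and the
toy_* helpers are hypothetical and track just two CR4 bits (UMIP is CR4 bit 11,
LA57 is CR4 bit 12).  The process_empty_cpuid flag selects between the old and
new creation behavior.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_CR4_UMIP	(1ull << 11)
#define TOY_CR4_LA57	(1ull << 12)

struct toy_vcpu {
	bool guest_has_umip;		/* from guest CPUID, false if never set */
	bool guest_has_la57;
	uint64_t cr4_guest_rsvd_bits;
};

/* Recompute reserved CR4 bits from the current (possibly empty) guest CPUID. */
static inline void toy_after_set_cpuid(struct toy_vcpu *v)
{
	uint64_t rsvd = 0;

	if (!v->guest_has_umip)
		rsvd |= TOY_CR4_UMIP;
	if (!v->guest_has_la57)
		rsvd |= TOY_CR4_LA57;
	v->cr4_guest_rsvd_bits = rsvd;
}

/* process_empty_cpuid == false models the behavior before this patch. */
static inline void toy_vcpu_create(struct toy_vcpu *v, bool process_empty_cpuid)
{
	memset(v, 0, sizeof(*v));
	if (process_empty_cpuid)
		toy_after_set_cpuid(v);
}

static inline bool toy_is_valid_cr4(const struct toy_vcpu *v, uint64_t cr4)
{
	return !(cr4 & v->cr4_guest_rsvd_bits);
}
```

Without the call at creation, cr4_guest_rsvd_bits stays zero and every CR4
value passes the check; with it, an empty CPUID correctly reserves both bits.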

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 arch/x86/kvm/cpuid.h | 1 +
 arch/x86/kvm/x86.c   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index eb4b32bcfa56..b9ad07e24160 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -338,7 +338,7 @@ static bool guest_cpuid_is_amd_or_hygon(struct kvm_vcpu *vcpu)
 	       is_guest_vendor_hygon(entry->ebx, entry->ecx, entry->edx);
 }
 
-static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
+void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	struct kvm_cpuid_entry2 *best;
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index c8dc66eddefd..e51b868e9d36 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -10,6 +10,7 @@
 extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
 void kvm_set_cpu_caps(void);
 
+void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu);
 void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu);
 void kvm_update_pv_runtime(struct kvm_vcpu *vcpu);
 struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2e713480933a..ca9b0a00cbcc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12301,6 +12301,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 
 	kvm_xen_init_vcpu(vcpu);
 	vcpu_load(vcpu);
+	kvm_vcpu_after_set_cpuid(vcpu);
 	kvm_set_tsc_khz(vcpu, vcpu->kvm->arch.default_tsc_khz);
 	kvm_vcpu_reset(vcpu, false);
 	kvm_init_mmu(vcpu);
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 04/57] KVM: x86: Explicitly do runtime CPUID updates "after" initial setup
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (2 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 03/57] KVM: x86: Do all post-set CPUID processing during vCPU creation Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 05/57] KVM: x86: Account for KVM-reserved CR4 bits when passing through CR4 on VMX Sean Christopherson
                   ` (54 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Explicitly perform runtime CPUID adjustments as part of the "after set
CPUID" flow to guard against bugs where KVM consumes stale vCPU/CPUID
state during kvm_update_cpuid_runtime().  E.g. see commit 4736d85f0d18
("KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT").

Whacking each mole individually is not sustainable or robust, e.g. while
the aforementioned commit fixed KVM's PV features, the same issue lurks for
Xen and Hyper-V features; Xen and Hyper-V simply don't have any runtime
features (though spoiler alert, neither should KVM).

Updating runtime features in the "full" path will also simplify adding a
snapshot of the guest's capabilities, i.e. of caching the intersection of
guest CPUID and kvm_cpu_caps (modulo a few edge cases).
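
The "intersection" idea can be sketched as a per-word bitwise AND followed by
a constant-time bit test.  This is a simplified illustration with hypothetical
names and only two words; it ignores the edge cases mentioned above and is not
the layout used by the series.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_WORDS	2

/* Snapshot: a feature is usable only if both the guest and KVM have it. */
static inline void toy_snapshot_caps(uint32_t caps[TOY_NR_WORDS],
				     const uint32_t guest_cpuid[TOY_NR_WORDS],
				     const uint32_t kvm_caps[TOY_NR_WORDS])
{
	for (int i = 0; i < TOY_NR_WORDS; i++)
		caps[i] = guest_cpuid[i] & kvm_caps[i];
}

/* Query: feature is encoded as word * 32 + bit and must be < 32 * TOY_NR_WORDS. */
static inline bool toy_guest_cap_has(const uint32_t caps[TOY_NR_WORDS],
				     unsigned int feature)
{
	return caps[feature / 32] & (1u << (feature % 32));
}
```

Once the snapshot exists, each query is an array index plus a mask, with no
search through the guest's CPUID entries.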

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b9ad07e24160..1944f9415672 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -157,6 +157,9 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu,
 	return fpu_enable_guest_xfd_features(&vcpu->arch.guest_fpu, xfeatures);
 }
 
+static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
+				       int nent);
+
 /* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */
 static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 				 int nent)
@@ -164,6 +167,17 @@ static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2
 	struct kvm_cpuid_entry2 *orig;
 	int i;
 
+	/*
+	 * Apply runtime CPUID updates to the incoming CPUID entries to avoid
+	 * false positives due to mismatches on KVM-owned feature flags.  Note,
+	 * runtime CPUID updates may consume other CPUID-driven vCPU state,
+	 * e.g. KVM or Xen CPUID bases.  Updating runtime state before full
+	 * CPUID processing is functionally correct only because any change in
+	 * CPUID is disallowed, i.e. using stale data is ok because the below
+	 * checks will reject the change.
+	 */
+	__kvm_update_cpuid_runtime(vcpu, e2, nent);
+
 	if (nent != vcpu->arch.cpuid_nent)
 		return -EINVAL;
 
@@ -348,6 +362,8 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	bitmap_zero(vcpu->arch.governed_features.enabled,
 		    KVM_MAX_NR_GOVERNED_FEATURES);
 
+	kvm_update_cpuid_runtime(vcpu);
+
 	/*
 	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
 	 * hardware.  The hardware page walker doesn't let KVM disable GBPAGES,
@@ -429,8 +445,6 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 {
 	int r;
 
-	__kvm_update_cpuid_runtime(vcpu, e2, nent);
-
 	/*
 	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
 	 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 05/57] KVM: x86: Account for KVM-reserved CR4 bits when passing through CR4 on VMX
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (3 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 04/57] KVM: x86: Explicitly do runtime CPUID updates "after" initial setup Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-12-13  1:30   ` Chao Gao
  2024-11-28  1:33 ` [PATCH v3 06/57] KVM: selftests: Update x86's set_sregs_test to match KVM's CPUID enforcement Sean Christopherson
                   ` (53 subsequent siblings)
  58 siblings, 1 reply; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Drop x86.c's local pre-computed cr4_reserved bits and instead fold KVM's
reserved bits into the guest's reserved bits.  This fixes a bug where VMX's
set_cr4_guest_host_mask() fails to account for KVM-reserved bits when
deciding which bits can be passed through to the guest.  In most cases,
letting the guest directly write reserved CR4 bits is ok, i.e. attempting
to set the bit(s) will still #GP, but not if a feature is available in
hardware but explicitly disabled by the host, e.g. if FSGSBASE support is
disabled via "nofsgsbase".

Note, the extra overhead of computing host reserved bits every time
userspace sets guest CPUID is negligible.  The feature bits that are
queried are packed nicely into a handful of words, and so checking and
setting each reserved bit costs in the neighborhood of ~5 cycles, i.e. the
total cost will be in the noise even if the number of checked CR4 bits
doubles over the next few years.  In other words, x86 will run out of CR4
bits long before the overhead becomes problematic.

Note #2, __cr4_reserved_bits() starts from CR4_RESERVED_BITS, which is
why the existing __kvm_cpu_cap_has() processing doesn't explicitly OR in
CR4_RESERVED_BITS (and why the new code doesn't do so either).
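
The effect of folding KVM's reserved bits into the guest's can be shown with a
small standalone model.  The toy_* helpers are hypothetical and cover only two
bits (UMIP is CR4 bit 11, FSGSBASE is CR4 bit 16); unlike the real
__cr4_reserved_bits(), the toy helper starts from zero rather than
CR4_RESERVED_BITS.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_CR4_UMIP		(1ull << 11)
#define TOY_CR4_FSGSBASE	(1ull << 16)

/* Reserved bits for one "view" of feature support (KVM's or the guest's). */
static inline uint64_t toy_cr4_reserved_bits(bool has_umip, bool has_fsgsbase)
{
	uint64_t rsvd = 0;

	if (!has_umip)
		rsvd |= TOY_CR4_UMIP;
	if (!has_fsgsbase)
		rsvd |= TOY_CR4_FSGSBASE;
	return rsvd;
}

/* A bit is reserved for the guest if either KVM or guest CPUID lacks it. */
static inline uint64_t toy_cr4_guest_rsvd_bits(bool kvm_umip, bool kvm_fsgsbase,
					       bool guest_umip, bool guest_fsgsbase)
{
	return toy_cr4_reserved_bits(kvm_umip, kvm_fsgsbase) |
	       toy_cr4_reserved_bits(guest_umip, guest_fsgsbase);
}
```

For example, if guest CPUID advertises FSGSBASE but KVM does not support it
(the "nofsgsbase" case above), the guest-only view reserves nothing while the
combined view reserves CR4.FSGSBASE, so the bit is not passed through.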

Fixes: 2ed41aa631fc ("KVM: VMX: Intercept guest reserved CR4 bits to inject #GP fault")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 7 +++++--
 arch/x86/kvm/x86.c   | 9 ---------
 2 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 1944f9415672..27919c8f438b 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -400,8 +400,11 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
 
 	kvm_pmu_refresh(vcpu);
-	vcpu->arch.cr4_guest_rsvd_bits =
-	    __cr4_reserved_bits(guest_cpuid_has, vcpu);
+
+#define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
+	vcpu->arch.cr4_guest_rsvd_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_) |
+					 __cr4_reserved_bits(guest_cpuid_has, vcpu);
+#undef __kvm_cpu_cap_has
 
 	kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu->arch.cpuid_entries,
 						    vcpu->arch.cpuid_nent));
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ca9b0a00cbcc..5288d53fef5c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -119,8 +119,6 @@ u64 __read_mostly efer_reserved_bits = ~((u64)(EFER_SCE | EFER_LME | EFER_LMA));
 static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE);
 #endif
 
-static u64 __read_mostly cr4_reserved_bits = CR4_RESERVED_BITS;
-
 #define KVM_EXIT_HYPERCALL_VALID_MASK (1 << KVM_HC_MAP_GPA_RANGE)
 
 #define KVM_CAP_PMU_VALID_MASK KVM_PMU_CAP_DISABLE
@@ -1285,9 +1283,6 @@ EXPORT_SYMBOL_GPL(kvm_emulate_xsetbv);
 
 bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
-	if (cr4 & cr4_reserved_bits)
-		return false;
-
 	if (cr4 & vcpu->arch.cr4_guest_rsvd_bits)
 		return false;
 
@@ -9773,10 +9768,6 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
 		kvm_caps.supported_xss = 0;
 
-#define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
-	cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
-#undef __kvm_cpu_cap_has
-
 	if (kvm_caps.has_tsc_control) {
 		/*
 		 * Make sure the user can only configure tsc_khz values that
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 06/57] KVM: selftests: Update x86's set_sregs_test to match KVM's CPUID enforcement
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (4 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 05/57] KVM: x86: Account for KVM-reserved CR4 bits when passing through CR4 on VMX Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 07/57] KVM: selftests: Assert that vcpu->cpuid is non-NULL when getting CPUID entries Sean Christopherson
                   ` (52 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Rework x86's set sregs test to verify that KVM enforces CPUID vs. CR4
features even if userspace hasn't explicitly set guest CPUID.  KVM used to
allow userspace to set any KVM-supported CR4 value prior to KVM_SET_CPUID2,
and the test verified that behavior.

However, the testcase was written purely to verify KVM's existing behavior,
i.e. was NOT written to match the needs of real world VMMs.

Opportunistically verify that KVM continues to reject unsupported features
after KVM_SET_CPUID2 (using KVM_GET_SUPPORTED_CPUID).

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 .../selftests/kvm/x86_64/set_sregs_test.c     | 53 +++++++++++--------
 1 file changed, 30 insertions(+), 23 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/set_sregs_test.c b/tools/testing/selftests/kvm/x86_64/set_sregs_test.c
index c021c0795a96..96fd690d479a 100644
--- a/tools/testing/selftests/kvm/x86_64/set_sregs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/set_sregs_test.c
@@ -41,13 +41,15 @@ do {										\
 	TEST_ASSERT(!memcmp(&new, &orig, sizeof(new)), "KVM modified sregs");	\
 } while (0)
 
+#define KVM_ALWAYS_ALLOWED_CR4 (X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD |	\
+				X86_CR4_DE | X86_CR4_PSE | X86_CR4_PAE |	\
+				X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE |	\
+				X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT)
+
 static uint64_t calc_supported_cr4_feature_bits(void)
 {
-	uint64_t cr4;
+	uint64_t cr4 = KVM_ALWAYS_ALLOWED_CR4;
 
-	cr4 = X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE |
-	      X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE | X86_CR4_PGE |
-	      X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT;
 	if (kvm_cpu_has(X86_FEATURE_UMIP))
 		cr4 |= X86_CR4_UMIP;
 	if (kvm_cpu_has(X86_FEATURE_LA57))
@@ -72,28 +74,14 @@ static uint64_t calc_supported_cr4_feature_bits(void)
 	return cr4;
 }
 
-int main(int argc, char *argv[])
+static void test_cr_bits(struct kvm_vcpu *vcpu, uint64_t cr4)
 {
 	struct kvm_sregs sregs;
-	struct kvm_vcpu *vcpu;
-	struct kvm_vm *vm;
-	uint64_t cr4;
 	int rc, i;
 
-	/*
-	 * Create a dummy VM, specifically to avoid doing KVM_SET_CPUID2, and
-	 * use it to verify all supported CR4 bits can be set prior to defining
-	 * the vCPU model, i.e. without doing KVM_SET_CPUID2.
-	 */
-	vm = vm_create_barebones();
-	vcpu = __vm_vcpu_add(vm, 0);
-
 	vcpu_sregs_get(vcpu, &sregs);
-
-	sregs.cr0 = 0;
-	sregs.cr4 |= calc_supported_cr4_feature_bits();
-	cr4 = sregs.cr4;
-
+	sregs.cr0 &= ~(X86_CR0_CD | X86_CR0_NW);
+	sregs.cr4 |= cr4;
 	rc = _vcpu_sregs_set(vcpu, &sregs);
 	TEST_ASSERT(!rc, "Failed to set supported CR4 bits (0x%lx)", cr4);
 
@@ -101,7 +89,6 @@ int main(int argc, char *argv[])
 	TEST_ASSERT(sregs.cr4 == cr4, "sregs.CR4 (0x%llx) != CR4 (0x%lx)",
 		    sregs.cr4, cr4);
 
-	/* Verify all unsupported features are rejected by KVM. */
 	TEST_INVALID_CR_BIT(vcpu, cr4, sregs, X86_CR4_UMIP);
 	TEST_INVALID_CR_BIT(vcpu, cr4, sregs, X86_CR4_LA57);
 	TEST_INVALID_CR_BIT(vcpu, cr4, sregs, X86_CR4_VMXE);
@@ -119,10 +106,28 @@ int main(int argc, char *argv[])
 	/* NW without CD is illegal, as is PG without PE. */
 	TEST_INVALID_CR_BIT(vcpu, cr0, sregs, X86_CR0_NW);
 	TEST_INVALID_CR_BIT(vcpu, cr0, sregs, X86_CR0_PG);
+}
 
+int main(int argc, char *argv[])
+{
+	struct kvm_sregs sregs;
+	struct kvm_vcpu *vcpu;
+	struct kvm_vm *vm;
+	int rc;
+
+	/*
+	 * Create a dummy VM, specifically to avoid doing KVM_SET_CPUID2, and
+	 * use it to verify KVM enforces guest CPUID even if *userspace* never
+	 * sets CPUID.
+	 */
+	vm = vm_create_barebones();
+	vcpu = __vm_vcpu_add(vm, 0);
+	test_cr_bits(vcpu, KVM_ALWAYS_ALLOWED_CR4);
 	kvm_vm_free(vm);
 
-	/* Create a "real" VM and verify APIC_BASE can be set. */
+	/* Create a "real" VM with a fully populated guest CPUID and verify
+	 * APIC_BASE and all supported CR4 can be set.
+	 */
 	vm = vm_create_with_one_vcpu(&vcpu, NULL);
 
 	vcpu_sregs_get(vcpu, &sregs);
@@ -135,6 +140,8 @@ int main(int argc, char *argv[])
 	TEST_ASSERT(!rc, "Couldn't set IA32_APIC_BASE to %llx (valid)",
 		    sregs.apic_base);
 
+	test_cr_bits(vcpu, calc_supported_cr4_feature_bits());
+
 	kvm_vm_free(vm);
 
 	return 0;
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 07/57] KVM: selftests: Assert that vcpu->cpuid is non-NULL when getting CPUID entries
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (5 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 06/57] KVM: selftests: Update x86's set_sregs_test to match KVM's CPUID enforcement Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 08/57] KVM: selftests: Refresh vCPU CPUID cache in __vcpu_get_cpuid_entry() Sean Christopherson
                   ` (51 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Add a sanity check in __vcpu_get_cpuid_entry() to provide a friendlier
error than a segfault when a test developer tries to use a vCPU CPUID
helper on a barebones vCPU.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 tools/testing/selftests/kvm/include/x86_64/processor.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 645200e95f89..bdc121ed4ce5 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -1016,6 +1016,8 @@ static inline struct kvm_cpuid_entry2 *__vcpu_get_cpuid_entry(struct kvm_vcpu *v
 							      uint32_t function,
 							      uint32_t index)
 {
+	TEST_ASSERT(vcpu->cpuid, "Must do vcpu_init_cpuid() first (or equivalent)");
+
 	return (struct kvm_cpuid_entry2 *)get_cpuid_entry(vcpu->cpuid,
 							  function, index);
 }
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 08/57] KVM: selftests: Refresh vCPU CPUID cache in __vcpu_get_cpuid_entry()
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (6 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 07/57] KVM: selftests: Assert that vcpu->cpuid is non-NULL when getting CPUID entries Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 09/57] KVM: selftests: Verify KVM stuffs runtime CPUID OS bits on CR4 writes Sean Christopherson
                   ` (50 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Refresh selftests' CPUID cache in the vCPU structure when querying a CPUID
entry so that tests don't consume stale data when KVM modifies CPUID as a
side effect of a completely unrelated change.  E.g. KVM adjusts OSXSAVE in
response to CR4.OSXSAVE changes.

Unnecessarily invoking KVM_GET_CPUID2 is suboptimal, but vcpu->cpuid exists
to simplify selftests development, not for performance reasons.  And,
unfortunately, trying to handle the side effects in tests or other flows
is unpleasant, e.g. selftests could manually refresh if KVM_SET_SREGS is
successful, but that would still leave a gap with respect to guest CR4
changes.
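
As a rough illustration of the staleness problem (not the selftests code
itself), here is a minimal standalone model.  Every name in it (fake_kvm,
fake_vcpu, FAKE_OSXSAVE, the helpers) is hypothetical; the only thing taken
from the patch is the idea of refreshing the cache on every query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_OSXSAVE	(1u << 27)

/* Hypothetical stand-in for KVM's authoritative CPUID state. */
struct fake_kvm {
	uint32_t cpuid_1_ecx;
	int nr_get_calls;
};

/* Hypothetical stand-in for the selftests' per-vCPU CPUID cache. */
struct fake_vcpu {
	struct fake_kvm *kvm;
	uint32_t cached_cpuid_1_ecx;
};

/* Models KVM_GET_CPUID2: copy KVM's current view into the cache. */
void fake_vcpu_get_cpuid(struct fake_vcpu *vcpu)
{
	vcpu->kvm->nr_get_calls++;
	vcpu->cached_cpuid_1_ecx = vcpu->kvm->cpuid_1_ecx;
}

/* Models a CR4 write that changes CPUID inside KVM as a side effect. */
void fake_set_cr4_osxsave(struct fake_kvm *kvm, bool on)
{
	if (on)
		kvm->cpuid_1_ecx |= FAKE_OSXSAVE;
	else
		kvm->cpuid_1_ecx &= ~FAKE_OSXSAVE;
}

/* Old behavior: trust the cache, which may be stale. */
bool fake_cpuid_has_osxsave_stale(struct fake_vcpu *vcpu)
{
	return vcpu->cached_cpuid_1_ecx & FAKE_OSXSAVE;
}

/* New behavior: refresh the cache on every query, then read it. */
bool fake_cpuid_has_osxsave(struct fake_vcpu *vcpu)
{
	assert(vcpu->kvm);
	fake_vcpu_get_cpuid(vcpu);
	return vcpu->cached_cpuid_1_ecx & FAKE_OSXSAVE;
}
```

After fake_set_cr4_osxsave(&kvm, true), the stale variant still reports
the bit as clear, while the refreshing variant reports it as set at the
cost of one extra "ioctl" per query.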

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 .../selftests/kvm/include/x86_64/processor.h     | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index bdc121ed4ce5..7d1ab2d2ddbb 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -1012,12 +1012,19 @@ static inline struct kvm_cpuid2 *allocate_kvm_cpuid2(int nr_entries)
 
 void vcpu_init_cpuid(struct kvm_vcpu *vcpu, const struct kvm_cpuid2 *cpuid);
 
+static inline void vcpu_get_cpuid(struct kvm_vcpu *vcpu)
+{
+	vcpu_ioctl(vcpu, KVM_GET_CPUID2, vcpu->cpuid);
+}
+
 static inline struct kvm_cpuid_entry2 *__vcpu_get_cpuid_entry(struct kvm_vcpu *vcpu,
 							      uint32_t function,
 							      uint32_t index)
 {
 	TEST_ASSERT(vcpu->cpuid, "Must do vcpu_init_cpuid() first (or equivalent)");
 
+	vcpu_get_cpuid(vcpu);
+
 	return (struct kvm_cpuid_entry2 *)get_cpuid_entry(vcpu->cpuid,
 							  function, index);
 }
@@ -1038,7 +1045,7 @@ static inline int __vcpu_set_cpuid(struct kvm_vcpu *vcpu)
 		return r;
 
 	/* On success, refresh the cache to pick up adjustments made by KVM. */
-	vcpu_ioctl(vcpu, KVM_GET_CPUID2, vcpu->cpuid);
+	vcpu_get_cpuid(vcpu);
 	return 0;
 }
 
@@ -1048,12 +1055,7 @@ static inline void vcpu_set_cpuid(struct kvm_vcpu *vcpu)
 	vcpu_ioctl(vcpu, KVM_SET_CPUID2, vcpu->cpuid);
 
 	/* Refresh the cache to pick up adjustments made by KVM. */
-	vcpu_ioctl(vcpu, KVM_GET_CPUID2, vcpu->cpuid);
-}
-
-static inline void vcpu_get_cpuid(struct kvm_vcpu *vcpu)
-{
-	vcpu_ioctl(vcpu, KVM_GET_CPUID2, vcpu->cpuid);
+	vcpu_get_cpuid(vcpu);
 }
 
 void vcpu_set_cpuid_property(struct kvm_vcpu *vcpu,
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 09/57] KVM: selftests: Verify KVM stuffs runtime CPUID OS bits on CR4 writes
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (7 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 08/57] KVM: selftests: Refresh vCPU CPUID cache in __vcpu_get_cpuid_entry() Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 10/57] KVM: x86: Move __kvm_is_valid_cr4() definition to x86.h Sean Christopherson
                   ` (49 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Extend x86's set sregs test to verify that KVM sets/clears OSXSAVE and
OSPKE according to CR4.OSXSAVE and CR4.PKE respectively.  For performance
reasons, KVM is responsible for emulating the architectural behavior of
the OS CPUID bits tracking CR4.
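
For readers unfamiliar with the "OS" CPUID bits, the architectural
relationship being tested can be sketched as below.  This is a standalone
model, not KVM code: the struct and function names are hypothetical, and
the bit positions (CR4.OSXSAVE = bit 18, CR4.PKE = bit 22,
CPUID.1:ECX.OSXSAVE = bit 27, CPUID.7.0:ECX.OSPKE = bit 4) are stated from
memory of the SDM and should be double-checked against it.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions assumed from the Intel SDM; verify before relying on them. */
#define CR4_OSXSAVE		(1ull << 18)
#define CR4_PKE			(1ull << 22)
#define CPUID_1_ECX_OSXSAVE	(1u << 27)
#define CPUID_7_0_ECX_OSPKE	(1u << 4)

/* Hypothetical container for the two CPUID registers that hold OS bits. */
struct os_cpuid {
	uint32_t leaf1_ecx;
	uint32_t leaf7_0_ecx;
};

/* Make the OS CPUID bits mirror the corresponding CR4 bits. */
void sync_os_cpuid_bits(struct os_cpuid *cpuid, uint64_t cr4)
{
	assert(cpuid);

	cpuid->leaf1_ecx &= ~CPUID_1_ECX_OSXSAVE;
	if (cr4 & CR4_OSXSAVE)
		cpuid->leaf1_ecx |= CPUID_1_ECX_OSXSAVE;

	cpuid->leaf7_0_ecx &= ~CPUID_7_0_ECX_OSPKE;
	if (cr4 & CR4_PKE)
		cpuid->leaf7_0_ecx |= CPUID_7_0_ECX_OSPKE;
}
```

The assertions added by the test amount to checking that this mirroring
holds after KVM_SET_SREGS; all unrelated CPUID bits are left untouched.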

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 tools/testing/selftests/kvm/x86_64/set_sregs_test.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/set_sregs_test.c b/tools/testing/selftests/kvm/x86_64/set_sregs_test.c
index 96fd690d479a..f4095a3d1278 100644
--- a/tools/testing/selftests/kvm/x86_64/set_sregs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/set_sregs_test.c
@@ -85,6 +85,16 @@ static void test_cr_bits(struct kvm_vcpu *vcpu, uint64_t cr4)
 	rc = _vcpu_sregs_set(vcpu, &sregs);
 	TEST_ASSERT(!rc, "Failed to set supported CR4 bits (0x%lx)", cr4);
 
+	TEST_ASSERT(!!(sregs.cr4 & X86_CR4_OSXSAVE) ==
+		    (vcpu->cpuid && vcpu_cpuid_has(vcpu, X86_FEATURE_OSXSAVE)),
+		    "KVM didn't %s OSXSAVE in CPUID as expected",
+		    (sregs.cr4 & X86_CR4_OSXSAVE) ? "set" : "clear");
+
+	TEST_ASSERT(!!(sregs.cr4 & X86_CR4_PKE) ==
+		    (vcpu->cpuid && vcpu_cpuid_has(vcpu, X86_FEATURE_OSPKE)),
+		    "KVM didn't %s OSPKE in CPUID as expected",
+		    (sregs.cr4 & X86_CR4_PKE) ? "set" : "clear");
+
 	vcpu_sregs_get(vcpu, &sregs);
 	TEST_ASSERT(sregs.cr4 == cr4, "sregs.CR4 (0x%llx) != CR4 (0x%lx)",
 		    sregs.cr4, cr4);
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 10/57] KVM: x86: Move __kvm_is_valid_cr4() definition to x86.h
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (8 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 09/57] KVM: selftests: Verify KVM stuffs runtime CPUID OS bits on CR4 writes Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 11/57] KVM: x86/pmu: Drop now-redundant refresh() during init() Sean Christopherson
                   ` (48 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Let vendor code inline __kvm_is_valid_cr4() now that x86.c's
cr4_reserved_bits no longer exists, as keeping cr4_reserved_bits local to
x86.c was the only reason for "hiding" the definition of
__kvm_is_valid_cr4().

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 9 ---------
 arch/x86/kvm/x86.h | 6 +++++-
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5288d53fef5c..5c6ade1f976e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1281,15 +1281,6 @@ int kvm_emulate_xsetbv(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_xsetbv);
 
-bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
-{
-	if (cr4 & vcpu->arch.cr4_guest_rsvd_bits)
-		return false;
-
-	return true;
-}
-EXPORT_SYMBOL_GPL(__kvm_is_valid_cr4);
-
 static bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	return __kvm_is_valid_cr4(vcpu, cr4) &&
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index ec623d23d13d..7a87c5fc57f1 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -550,7 +550,6 @@ static inline void kvm_machine_check(void)
 void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
 void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
 int kvm_spec_ctrl_test_value(u64 value);
-bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
 int kvm_handle_memory_failure(struct kvm_vcpu *vcpu, int r,
 			      struct x86_exception *e);
 int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva);
@@ -577,6 +576,11 @@ enum kvm_msr_access {
 #define  KVM_MSR_RET_UNSUPPORTED	2
 #define  KVM_MSR_RET_FILTERED		3
 
+static inline bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+{
+	return !(cr4 & vcpu->arch.cr4_guest_rsvd_bits);
+}
+
 #define __cr4_reserved_bits(__cpu_has, __c)             \
 ({                                                      \
 	u64 __reserved_bits = CR4_RESERVED_BITS;        \
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 11/57] KVM: x86/pmu: Drop now-redundant refresh() during init()
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (9 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 10/57] KVM: x86: Move __kvm_is_valid_cr4() definition to x86.h Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 12/57] KVM: x86: Drop now-redundant MAXPHYADDR and GPA rsvd bits from vCPU creation Sean Christopherson
                   ` (47 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Drop the manual kvm_pmu_refresh() from kvm_pmu_init() now that
kvm_arch_vcpu_create() performs the refresh via kvm_vcpu_after_set_cpuid().

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/pmu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 47a46283c866..75e9cfc689f8 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -797,7 +797,6 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
 
 	memset(pmu, 0, sizeof(*pmu));
 	kvm_pmu_call(init)(vcpu);
-	kvm_pmu_refresh(vcpu);
 }
 
 /* Release perf_events for vPMCs that have been unused for a full time slice.  */
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 12/57] KVM: x86: Drop now-redundant MAXPHYADDR and GPA rsvd bits from vCPU creation
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (10 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 11/57] KVM: x86/pmu: Drop now-redundant refresh() during init() Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 13/57] KVM: x86: Disallow KVM_CAP_X86_DISABLE_EXITS after " Sean Christopherson
                   ` (46 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Drop the manual initialization of maxphyaddr and reserved_gpa_bits during
vCPU creation now that kvm_arch_vcpu_create() unconditionally invokes
kvm_vcpu_after_set_cpuid(), which handles all such CPUID caching.

None of the helpers between the existing code in kvm_arch_vcpu_create()
and the call to kvm_vcpu_after_set_cpuid() consume maxphyaddr or
reserved_gpa_bits (though auditing vmx_vcpu_create() and svm_vcpu_create()
isn't exactly easy).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5c6ade1f976e..d6a182d94c6f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12258,9 +12258,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 		goto free_emulate_ctxt;
 	}
 
-	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
-	vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
-
 	kvm_async_pf_hash_reset(vcpu);
 
 	if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_STUFF_FEATURE_MSRS)) {
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 13/57] KVM: x86: Disallow KVM_CAP_X86_DISABLE_EXITS after vCPU creation
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (11 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 12/57] KVM: x86: Drop now-redundant MAXPHYADDR and GPA rsvd bits from vCPU creation Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 14/57] KVM: x86: Reject disabling of MWAIT/HLT interception when not allowed Sean Christopherson
                   ` (45 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Reject KVM_CAP_X86_DISABLE_EXITS if vCPUs have been created, as disabling
PAUSE/MWAIT/HLT exits after vCPUs have been created is broken and useless,
e.g. except for PAUSE on SVM, the relevant intercepts aren't updated after
vCPU creation.  vCPUs may also end up with an inconsistent configuration
if exits are disabled between creation of multiple vCPUs.

Cc: Hou Wenlong <houwenlong.hwl@antgroup.com>
Link: https://lore.kernel.org/all/9227068821b275ac547eb2ede09ec65d2281fe07.1680179693.git.houwenlong.hwl@antgroup.com
Link: https://lore.kernel.org/all/20230121020738.2973-2-kechenl@nvidia.com
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/api.rst | 1 +
 arch/x86/kvm/x86.c             | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 454c2aaa155e..bbe445e6c113 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7670,6 +7670,7 @@ branch to guests' 0x200 interrupt vector.
 :Architectures: x86
 :Parameters: args[0] defines which exits are disabled
 :Returns: 0 on success, -EINVAL when args[0] contains invalid exits
+          or if any vCPUs have already been created
 
 Valid bits in args[0] are::
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d6a182d94c6f..c517d26f2c5b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6531,6 +6531,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
 			break;
 
+		mutex_lock(&kvm->lock);
+		if (kvm->created_vcpus)
+			goto disable_exits_unlock;
+
 		if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
 			kvm->arch.pause_in_guest = true;
 
@@ -6552,6 +6556,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		}
 
 		r = 0;
+disable_exits_unlock:
+		mutex_unlock(&kvm->lock);
 		break;
 	case KVM_CAP_MSR_PLATFORM_INFO:
 		kvm->arch.guest_can_read_msr_platform_info = cap->args[0];
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 14/57] KVM: x86: Reject disabling of MWAIT/HLT interception when not allowed
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (12 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 13/57] KVM: x86: Disallow KVM_CAP_X86_DISABLE_EXITS after " Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 15/57] KVM: x86: Drop the now unused KVM_X86_DISABLE_VALID_EXITS Sean Christopherson
                   ` (44 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Reject KVM_CAP_X86_DISABLE_EXITS if userspace attempts to disable MWAIT or
HLT exits and KVM previously reported (via KVM_CHECK_EXTENSION) that
disabling the exit(s) is not allowed.  E.g. because MWAIT isn't supported
or the CPU doesn't have an always-running APIC timer, or because KVM is
configured to mitigate cross-thread vulnerabilities.
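
The resulting policy can be modeled in isolation roughly as follows.  This
is a sketch under stated assumptions, not the kernel code: fake_host,
fake_vm, and both functions are hypothetical, the DISABLE_EXITS_* values
mirror the uapi KVM_X86_DISABLE_EXITS_* bits, and the vcpus_created check
stands in for the kvm->created_vcpus check from the previous patch
(locking is omitted).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DISABLE_EXITS_MWAIT	(1u << 0)
#define DISABLE_EXITS_HLT	(1u << 1)
#define DISABLE_EXITS_PAUSE	(1u << 2)
#define DISABLE_EXITS_CSTATE	(1u << 3)

/* Hypothetical host properties that decide which exits may be disabled. */
struct fake_host {
	bool mitigate_smt_rsb;
	bool can_mwait_in_guest;
};

/* Hypothetical per-VM state. */
struct fake_vm {
	bool vcpus_created;
	uint64_t disabled_exits;
};

/* Single source of truth, used for both reporting and enforcement. */
uint64_t allowed_disable_exits(const struct fake_host *host)
{
	uint64_t r = DISABLE_EXITS_PAUSE;

	if (!host->mitigate_smt_rsb) {
		r |= DISABLE_EXITS_HLT | DISABLE_EXITS_CSTATE;
		if (host->can_mwait_in_guest)
			r |= DISABLE_EXITS_MWAIT;
	}
	return r;
}

/* Reject anything not reported as allowed, and anything after vCPU creation. */
int enable_disable_exits(const struct fake_host *host, struct fake_vm *vm,
			 uint64_t requested)
{
	assert(host && vm);

	if (requested & ~allowed_disable_exits(host))
		return -EINVAL;
	if (vm->vcpus_created)
		return -EINVAL;

	vm->disabled_exits |= requested;
	return 0;
}
```

The point of sharing allowed_disable_exits() between the "check" and
"enable" paths is that userspace can never enable something KVM told it
was unavailable.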

Cc: Kechen Lu <kechenl@nvidia.com>
Fixes: 4d5422cea3b6 ("KVM: X86: Provide a capability to disable MWAIT intercepts")
Fixes: 6f0f2d5ef895 ("KVM: x86: Mitigate the cross-thread return address predictions bug")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 54 ++++++++++++++++++++++++----------------------
 1 file changed, 28 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c517d26f2c5b..9b7f8047f896 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4531,6 +4531,20 @@ static inline bool kvm_can_mwait_in_guest(void)
 		boot_cpu_has(X86_FEATURE_ARAT);
 }
 
+static u64 kvm_get_allowed_disable_exits(void)
+{
+	u64 r = KVM_X86_DISABLE_EXITS_PAUSE;
+
+	if (!mitigate_smt_rsb) {
+		r |= KVM_X86_DISABLE_EXITS_HLT |
+			KVM_X86_DISABLE_EXITS_CSTATE;
+
+		if (kvm_can_mwait_in_guest())
+			r |= KVM_X86_DISABLE_EXITS_MWAIT;
+	}
+	return r;
+}
+
 #ifdef CONFIG_KVM_HYPERV
 static int kvm_ioctl_get_supported_hv_cpuid(struct kvm_vcpu *vcpu,
 					    struct kvm_cpuid2 __user *cpuid_arg)
@@ -4673,15 +4687,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = KVM_CLOCK_VALID_FLAGS;
 		break;
 	case KVM_CAP_X86_DISABLE_EXITS:
-		r = KVM_X86_DISABLE_EXITS_PAUSE;
-
-		if (!mitigate_smt_rsb) {
-			r |= KVM_X86_DISABLE_EXITS_HLT |
-			     KVM_X86_DISABLE_EXITS_CSTATE;
-
-			if (kvm_can_mwait_in_guest())
-				r |= KVM_X86_DISABLE_EXITS_MWAIT;
-		}
+		r = kvm_get_allowed_disable_exits();
 		break;
 	case KVM_CAP_X86_SMM:
 		if (!IS_ENABLED(CONFIG_KVM_SMM))
@@ -6528,33 +6534,29 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		break;
 	case KVM_CAP_X86_DISABLE_EXITS:
 		r = -EINVAL;
-		if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
+		if (cap->args[0] & ~kvm_get_allowed_disable_exits())
 			break;
 
 		mutex_lock(&kvm->lock);
 		if (kvm->created_vcpus)
 			goto disable_exits_unlock;
 
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
-			kvm->arch.pause_in_guest = true;
-
 #define SMT_RSB_MSG "This processor is affected by the Cross-Thread Return Predictions vulnerability. " \
 		    "KVM_CAP_X86_DISABLE_EXITS should only be used with SMT disabled or trusted guests."
 
-		if (!mitigate_smt_rsb) {
-			if (boot_cpu_has_bug(X86_BUG_SMT_RSB) && cpu_smt_possible() &&
-			    (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
-				pr_warn_once(SMT_RSB_MSG);
-
-			if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
-			    kvm_can_mwait_in_guest())
-				kvm->arch.mwait_in_guest = true;
-			if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
-				kvm->arch.hlt_in_guest = true;
-			if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
-				kvm->arch.cstate_in_guest = true;
-		}
+		if (!mitigate_smt_rsb && boot_cpu_has_bug(X86_BUG_SMT_RSB) &&
+		    cpu_smt_possible() &&
+		    (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
+			pr_warn_once(SMT_RSB_MSG);
 
+		if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
+			kvm->arch.pause_in_guest = true;
+		if (cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT)
+			kvm->arch.mwait_in_guest = true;
+		if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
+			kvm->arch.hlt_in_guest = true;
+		if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
+			kvm->arch.cstate_in_guest = true;
 		r = 0;
 disable_exits_unlock:
 		mutex_unlock(&kvm->lock);
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 15/57] KVM: x86: Drop the now unused KVM_X86_DISABLE_VALID_EXITS
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (13 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 14/57] KVM: x86: Reject disabling of MWAIT/HLT interception when not allowed Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 16/57] KVM: selftests: Fix a bad TEST_REQUIRE() in x86's KVM PV test Sean Christopherson
                   ` (43 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Drop the KVM_X86_DISABLE_VALID_EXITS definition, as it is misleading, and
unused in KVM *because* it is misleading.  The set of exits that can be
disabled is dynamic, i.e. userspace (and KVM) must check KVM's actual
capabilities.

Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 include/uapi/linux/kvm.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 502ea63b5d2e..206e3e6a78c6 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -617,10 +617,6 @@ struct kvm_ioeventfd {
 #define KVM_X86_DISABLE_EXITS_HLT            (1 << 1)
 #define KVM_X86_DISABLE_EXITS_PAUSE          (1 << 2)
 #define KVM_X86_DISABLE_EXITS_CSTATE         (1 << 3)
-#define KVM_X86_DISABLE_VALID_EXITS          (KVM_X86_DISABLE_EXITS_MWAIT | \
-                                              KVM_X86_DISABLE_EXITS_HLT | \
-                                              KVM_X86_DISABLE_EXITS_PAUSE | \
-                                              KVM_X86_DISABLE_EXITS_CSTATE)
 
 /* for KVM_ENABLE_CAP */
 struct kvm_enable_cap {
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 16/57] KVM: selftests: Fix a bad TEST_REQUIRE() in x86's KVM PV test
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (14 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 15/57] KVM: x86: Drop the now unused KVM_X86_DISABLE_VALID_EXITS Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 17/57] KVM: selftests: Update x86's KVM PV test to match KVM's disabling exits behavior Sean Christopherson
                   ` (42 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Actually check for KVM support for disabling HLT-exiting instead of
effectively checking that KVM_CAP_X86_DISABLE_EXITS is #defined to a
non-zero value, and convert the TEST_REQUIRE() to a simple return so
that only the sub-test is skipped if HLT-exiting is mandatory.

The goof has likely gone unnoticed because all x86 CPUs support disabling
HLT-exiting; only systems with the opt-in mitigate_smt_rsb KVM module
param disallow disabling HLT-exiting.
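
To make the nature of the goof concrete, here is a small standalone model.
All names are hypothetical, and FAKE_CAP_DISABLE_EXITS is just some
non-zero capability number chosen for illustration; fake_check_cap()
stands in for a query of what the host actually supports.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability ID; any non-zero value shows the same problem. */
#define FAKE_CAP_DISABLE_EXITS	143
#define FAKE_DISABLE_EXITS_HLT	(1u << 1)

/* Models a capability query: returns the host's mask for the known cap. */
unsigned int fake_check_cap(unsigned int cap, unsigned int host_mask)
{
	assert(cap != 0);
	return cap == FAKE_CAP_DISABLE_EXITS ? host_mask : 0;
}

/* Old check: tests the capability *number*, so it is always true. */
bool buggy_can_disable_hlt(void)
{
	return FAKE_CAP_DISABLE_EXITS != 0;
}

/* New check: asks what is supported and tests the HLT bit specifically. */
bool fixed_can_disable_hlt(unsigned int host_mask)
{
	return fake_check_cap(FAKE_CAP_DISABLE_EXITS, host_mask) &
	       FAKE_DISABLE_EXITS_HLT;
}
```

With a host mask that lacks the HLT bit, buggy_can_disable_hlt() still
returns true, whereas fixed_can_disable_hlt() returns false and the
sub-test would be skipped.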

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 tools/testing/selftests/kvm/x86_64/kvm_pv_test.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/kvm_pv_test.c b/tools/testing/selftests/kvm/x86_64/kvm_pv_test.c
index 78878b3a2725..2aee93108a54 100644
--- a/tools/testing/selftests/kvm/x86_64/kvm_pv_test.c
+++ b/tools/testing/selftests/kvm/x86_64/kvm_pv_test.c
@@ -140,10 +140,11 @@ static void test_pv_unhalt(void)
 	struct kvm_cpuid_entry2 *ent;
 	u32 kvm_sig_old;
 
+	if (!(kvm_check_cap(KVM_CAP_X86_DISABLE_EXITS) & KVM_X86_DISABLE_EXITS_HLT))
+		return;
+
 	pr_info("testing KVM_FEATURE_PV_UNHALT\n");
 
-	TEST_REQUIRE(KVM_CAP_X86_DISABLE_EXITS);
-
 	/* KVM_PV_UNHALT test */
 	vm = vm_create_with_one_vcpu(&vcpu, guest_main);
 	vcpu_set_cpuid_feature(vcpu, X86_FEATURE_KVM_PV_UNHALT);
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 17/57] KVM: selftests: Update x86's KVM PV test to match KVM's disabling exits behavior
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (15 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 16/57] KVM: selftests: Fix a bad TEST_REQUIRE() in x86's KVM PV test Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 18/57] KVM: x86: Zero out PV features cache when the CPUID leaf is not present Sean Christopherson
                   ` (41 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Rework x86's KVM PV features test to align with KVM's new, fixed behavior
of not allowing userspace to disable HLT-exiting after vCPUs have been
created.  Rework the core testcase to disable HLT-exiting before creating
a vCPU, and opportunistically keep the paired VM+vCPU creation to
verify that KVM rejects KVM_CAP_X86_DISABLE_EXITS as expected.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 .../selftests/kvm/x86_64/kvm_pv_test.c        | 33 +++++++++++++++++--
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/kvm_pv_test.c b/tools/testing/selftests/kvm/x86_64/kvm_pv_test.c
index 2aee93108a54..1b805cbdb47b 100644
--- a/tools/testing/selftests/kvm/x86_64/kvm_pv_test.c
+++ b/tools/testing/selftests/kvm/x86_64/kvm_pv_test.c
@@ -139,6 +139,7 @@ static void test_pv_unhalt(void)
 	struct kvm_vm *vm;
 	struct kvm_cpuid_entry2 *ent;
 	u32 kvm_sig_old;
+	int r;
 
 	if (!(kvm_check_cap(KVM_CAP_X86_DISABLE_EXITS) & KVM_X86_DISABLE_EXITS_HLT))
 		return;
@@ -152,19 +153,45 @@ static void test_pv_unhalt(void)
 	TEST_ASSERT(vcpu_cpuid_has(vcpu, X86_FEATURE_KVM_PV_UNHALT),
 		    "Enabling X86_FEATURE_KVM_PV_UNHALT had no effect");
 
-	/* Make sure KVM clears vcpu->arch.kvm_cpuid */
+	/* Verify KVM disallows disabling exits after vCPU creation. */
+	r = __vm_enable_cap(vm, KVM_CAP_X86_DISABLE_EXITS, KVM_X86_DISABLE_EXITS_HLT);
+	TEST_ASSERT(r && errno == EINVAL,
+		    "Disabling exits after vCPU creation didn't fail as expected");
+
+	kvm_vm_free(vm);
+
+	/* Verify that KVM clears PV_UNHALT from guest CPUID. */
+	vm = vm_create(1);
+	vm_enable_cap(vm, KVM_CAP_X86_DISABLE_EXITS, KVM_X86_DISABLE_EXITS_HLT);
+
+	vcpu = vm_vcpu_add(vm, 0, NULL);
+	TEST_ASSERT(!vcpu_cpuid_has(vcpu, X86_FEATURE_KVM_PV_UNHALT),
+		    "vCPU created with PV_UNHALT set by default");
+
+	vcpu_set_cpuid_feature(vcpu, X86_FEATURE_KVM_PV_UNHALT);
+	TEST_ASSERT(!vcpu_cpuid_has(vcpu, X86_FEATURE_KVM_PV_UNHALT),
+		    "PV_UNHALT set in guest CPUID when HLT-exiting is disabled");
+
+	/*
+	 * Clobber the KVM PV signature and verify KVM does NOT clear PV_UNHALT
+	 * when KVM PV is not present, and DOES clear PV_UNHALT when switching
+	 * back to the correct signature.
+	 */
 	ent = vcpu_get_cpuid_entry(vcpu, KVM_CPUID_SIGNATURE);
 	kvm_sig_old = ent->ebx;
 	ent->ebx = 0xdeadbeef;
 	vcpu_set_cpuid(vcpu);
 
-	vm_enable_cap(vm, KVM_CAP_X86_DISABLE_EXITS, KVM_X86_DISABLE_EXITS_HLT);
+	vcpu_set_cpuid_feature(vcpu, X86_FEATURE_KVM_PV_UNHALT);
+	TEST_ASSERT(vcpu_cpuid_has(vcpu, X86_FEATURE_KVM_PV_UNHALT),
+		    "PV_UNHALT cleared when using bogus KVM PV signature");
+
 	ent = vcpu_get_cpuid_entry(vcpu, KVM_CPUID_SIGNATURE);
 	ent->ebx = kvm_sig_old;
 	vcpu_set_cpuid(vcpu);
 
 	TEST_ASSERT(!vcpu_cpuid_has(vcpu, X86_FEATURE_KVM_PV_UNHALT),
-		    "KVM_FEATURE_PV_UNHALT is set with KVM_CAP_X86_DISABLE_EXITS");
+		    "PV_UNHALT set in guest CPUID when HLT-exiting is disabled");
 
 	/* FIXME: actually test KVM_FEATURE_PV_UNHALT feature */
 
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 18/57] KVM: x86: Zero out PV features cache when the CPUID leaf is not present
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (16 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 17/57] KVM: selftests: Update x86's KVM PV test to match KVM's disabling exits behavior Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 19/57] KVM: x86: Don't update PV features caches when enabling enforcement capability Sean Christopherson
                   ` (40 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Clear KVM's PV feature cache prior to processing a new guest CPUID so
that KVM doesn't keep a stale cache entry if userspace does KVM_SET_CPUID2
multiple times, once with a PV features entry, and a second time without.
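
The stale-cache scenario can be modeled in userspace as a minimal sketch.
Everything below is hypothetical (struct vcpu_model, struct pv_leaf, the
helper names, and the 0x80 feature value are invented for illustration);
only the shape of the logic, i.e. copy the leaf's EAX into the cache when
the leaf exists, mirrors kvm_update_pv_runtime().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for KVM's per-vCPU state; not the kernel's types. */
struct pv_leaf {
	uint32_t eax;
};

struct vcpu_model {
	const struct pv_leaf *pv_leaf;	/* NULL when userspace omits the leaf */
	uint32_t pv_features;		/* cached copy of the leaf's EAX */
};

/* Refresh without clearing: a missing leaf leaves the old value in place. */
static void update_pv_cache_stale(struct vcpu_model *v)
{
	if (v->pv_leaf)
		v->pv_features = v->pv_leaf->eax;
}

/* Refresh with an unconditional clear first, which is what the patch adds. */
static void update_pv_cache_fixed(struct vcpu_model *v)
{
	v->pv_features = 0;
	if (v->pv_leaf)
		v->pv_features = v->pv_leaf->eax;
}

/* Model two KVM_SET_CPUID2 calls: first with the PV leaf, then without. */
static uint32_t set_cpuid_twice(void (*update)(struct vcpu_model *))
{
	static const struct pv_leaf leaf = { .eax = 0x80 };
	struct vcpu_model v = { 0 };

	v.pv_leaf = &leaf;
	update(&v);

	v.pv_leaf = NULL;
	update(&v);

	return v.pv_features;
}
```

In this model, set_cpuid_twice(update_pv_cache_stale) still reports the
features from the first call, whereas set_cpuid_twice(update_pv_cache_fixed)
reports none.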

Fixes: 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID")
Cc: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 27919c8f438b..a94234637e09 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -251,6 +251,8 @@ void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best = kvm_find_kvm_cpuid_features(vcpu);
 
+	vcpu->arch.pv_cpuid.features = 0;
+
 	/*
 	 * save the feature bitmap to avoid cpuid lookup for every PV
 	 * operation
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 19/57] KVM: x86: Don't update PV features caches when enabling enforcement capability
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (17 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 18/57] KVM: x86: Zero out PV features cache when the CPUID leaf is not present Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 20/57] KVM: x86: Do reverse CPUID sanity checks in __feature_leaf() Sean Christopherson
                   ` (39 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Revert the chunk of commit 01b4f510b9f4 ("kvm: x86: ensure pv_cpuid.features
is initialized when enabling cap") that forced a PV features cache refresh
during KVM_CAP_ENFORCE_PV_FEATURE_CPUID, as whatever ioctl() ordering
issue it allegedly fixed never existed upstream, and likely never
existed in any kernel.

At the time of the commit, there was a tangentially related ioctl()
ordering issue, as toggling KVM_X86_DISABLE_EXITS_HLT after KVM_SET_CPUID2
would have resulted in KVM potentially leaving KVM_FEATURE_PV_UNHALT set.
But (a) that bug affected the entire guest CPUID, not just the cache, (b)
commit 01b4f510b9f4 didn't address that bug, it only refreshed the cache
(with the bad CPUID), and (c) setting KVM_X86_DISABLE_EXITS_HLT after vCPU
creation is completely broken as KVM configures HLT-exiting only during
vCPU creation, which is why KVM_CAP_X86_DISABLE_EXITS is now disallowed if
vCPUs have been created.

Another tangentially related bug was KVM's failure to clear the cache when
handling KVM_SET_CPUID2, but again commit 01b4f510b9f4 did nothing to fix
that bug.

The most plausible explanation for what commit 01b4f510b9f4 was trying
to fix is a bug that existed in Google's internal kernel that was the
source of commit 01b4f510b9f4.  At the time, Google's internal kernel had
not yet picked up commit 0d3b2ba16ba68 ("KVM: X86: Go on updating other
CPUID leaves when leaf 1 is absent"), i.e. KVM would not initialize the
PV features cache if KVM_SET_CPUID2 was called without a CPUID.0x1 entry.

Of course, no sane real world VMM would omit CPUID.0x1, including the KVM
selftest added by commit ac4a4d6de22e ("selftests: kvm: test enforcement
of paravirtual cpuid features").  And the test didn't actually try to
verify multiple orderings, nor did the selftest enter the guest without
doing KVM_SET_CPUID2, so who knows what motivated the change.

Regardless of why commit 01b4f510b9f4 ("kvm: x86: ensure pv_cpuid.features
is initialized when enabling cap") was added, refreshing the cache during
KVM_CAP_ENFORCE_PV_FEATURE_CPUID isn't necessary.

Cc: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 arch/x86/kvm/cpuid.h | 1 -
 arch/x86/kvm/x86.c   | 3 ---
 3 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index a94234637e09..bfb81e417bef 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -247,7 +247,7 @@ static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcp
 					     vcpu->arch.cpuid_nent, base);
 }
 
-void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
+static void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best = kvm_find_kvm_cpuid_features(vcpu);
 
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index e51b868e9d36..d4ece5db7b46 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -12,7 +12,6 @@ void kvm_set_cpu_caps(void);
 
 void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu);
 void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu);
-void kvm_update_pv_runtime(struct kvm_vcpu *vcpu);
 struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu,
 						    u32 function, u32 index);
 struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9b7f8047f896..9f0ffc3289d2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5814,9 +5814,6 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 
 	case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
 		vcpu->arch.pv_cpuid.enforce = cap->args[0];
-		if (vcpu->arch.pv_cpuid.enforce)
-			kvm_update_pv_runtime(vcpu);
-
 		return 0;
 	default:
 		return -EINVAL;
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 20/57] KVM: x86: Do reverse CPUID sanity checks in __feature_leaf()
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (18 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 19/57] KVM: x86: Don't update PV features caches when enabling enforcement capability Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 21/57] KVM: x86: Account for max supported CPUID leaf when getting raw host CPUID Sean Christopherson
                   ` (38 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Do the compile-time sanity checks on reverse_cpuid in __feature_leaf() so
that higher level APIs don't need to "manually" perform the sanity checks.
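
As a rough userspace analogy (not the kernel's implementation, which
relies on BUILD_BUG_ON() inside __always_inline helpers), the sketch below
folds a compile-time bounds check into the leaf computation so that the
accessors built on top of it need no checks of their own.  All DEMO_*
names and the three-word table are hypothetical, and the negative-array-size
trick is a swapped-in technique that only works for constant arguments.

```c
#include <stdint.h>

/* Hypothetical number of feature words tracked by this demo. */
#define DEMO_NR_LEAFS 3

static uint32_t demo_caps[DEMO_NR_LEAFS];

/*
 * Compute the word index for a feature and reject out-of-range features at
 * compile time: char[-1] is an invalid type, so the build breaks if the
 * index is not below DEMO_NR_LEAFS.
 */
#define DEMO_FEATURE_LEAF(feature)					\
	(sizeof(char[((feature) / 32) < DEMO_NR_LEAFS ? 1 : -1]) ?	\
	 (feature) / 32 : 0)

#define DEMO_FEATURE_BIT(feature)	(UINT32_C(1) << ((feature) % 32))

/* Accessors inherit the check through DEMO_FEATURE_LEAF(). */
#define DEMO_CAP_SET(feature)						\
	(demo_caps[DEMO_FEATURE_LEAF(feature)] |= DEMO_FEATURE_BIT(feature))

#define DEMO_CAP_GET(feature)						\
	(demo_caps[DEMO_FEATURE_LEAF(feature)] & DEMO_FEATURE_BIT(feature))
```

With this layout, DEMO_CAP_SET(70) sets bit 6 of word 2, while something
like DEMO_CAP_SET(96) would fail to build because word 3 does not exist.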

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.h         | 3 ---
 arch/x86/kvm/reverse_cpuid.h | 6 ++++--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index d4ece5db7b46..5d0fe3793d75 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -179,7 +179,6 @@ static __always_inline void kvm_cpu_cap_clear(unsigned int x86_feature)
 {
 	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
-	reverse_cpuid_check(x86_leaf);
 	kvm_cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
 }
 
@@ -187,7 +186,6 @@ static __always_inline void kvm_cpu_cap_set(unsigned int x86_feature)
 {
 	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
-	reverse_cpuid_check(x86_leaf);
 	kvm_cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
 }
 
@@ -195,7 +193,6 @@ static __always_inline u32 kvm_cpu_cap_get(unsigned int x86_feature)
 {
 	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
-	reverse_cpuid_check(x86_leaf);
 	return kvm_cpu_caps[x86_leaf] & __feature_bit(x86_feature);
 }
 
diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
index e46220ece83c..1d2db9d529ff 100644
--- a/arch/x86/kvm/reverse_cpuid.h
+++ b/arch/x86/kvm/reverse_cpuid.h
@@ -145,7 +145,10 @@ static __always_inline u32 __feature_translate(int x86_feature)
 
 static __always_inline u32 __feature_leaf(int x86_feature)
 {
-	return __feature_translate(x86_feature) / 32;
+	u32 x86_leaf = __feature_translate(x86_feature) / 32;
+
+	reverse_cpuid_check(x86_leaf);
+	return x86_leaf;
 }
 
 /*
@@ -168,7 +171,6 @@ static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned int x86_featu
 {
 	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
-	reverse_cpuid_check(x86_leaf);
 	return reverse_cpuid[x86_leaf];
 }
 
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 21/57] KVM: x86: Account for max supported CPUID leaf when getting raw host CPUID
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (19 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 20/57] KVM: x86: Do reverse CPUID sanity checks in __feature_leaf() Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 22/57] KVM: x86: Unpack F() CPUID feature flag macros to one flag per line of code Sean Christopherson
                   ` (37 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Explicitly zero out the feature word in kvm_cpu_caps if the word's
associated CPUID function is greater than the max leaf supported by the
CPU.  For such unsupported functions, Intel CPUs return the output from
the last supported leaf, not all zeros.

Practically speaking, this is likely a benign bug, as KVM uses the raw
host CPUID to mask the kernel's computed capabilities, and the kernel does
perform max leaf checks when populating boot_cpu_data.  The only way KVM's
goof could be problematic is if the kernel force-set a feature in a leaf
that is completely unsupported, _and_ the max supported leaf happened to
return a value with '1' in the same bit position.  Which is theoretically
possible, but extremely unlikely.  And even if that did happen, it's
entirely possible that KVM would still provide the correct functionality;
the kernel did set the capability after all.
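
The difference between the checked and unchecked reads can be illustrated
with a fake CPU, as a sketch only.  The fake CPU below is an assumption
made up for this example: it supports basic leaves 0x0 through 0x2, its
register values are invented, and it models only the basic range (no
0x80000000 or 0xc0000000 leaves).  It echoes the last supported leaf for
out-of-range requests, which is the Intel behavior described above.

```c
#include <stdint.h>

/* Hypothetical fake CPU: basic leaves 0x0..0x2 only, values are invented. */
#define FAKE_MAX_BASIC_LEAF 0x2u

static const uint32_t fake_leaf_eax[] = {
	FAKE_MAX_BASIC_LEAF,	/* leaf 0x0: EAX reports the max basic leaf */
	0x000306a9,		/* leaf 0x1 */
	0x76035a01,		/* leaf 0x2 */
};

/* Out-of-range requests return the last supported leaf, not zeros. */
static uint32_t fake_cpuid_eax(uint32_t function)
{
	if (function > FAKE_MAX_BASIC_LEAF)
		function = FAKE_MAX_BASIC_LEAF;

	return fake_leaf_eax[function];
}

/* Unchecked read: blindly trusts whatever the CPU returns. */
static uint32_t raw_eax_unchecked(uint32_t function)
{
	return fake_cpuid_eax(function);
}

/* Checked read: treat leaves beyond the reported max as all zeros. */
static uint32_t raw_eax_checked(uint32_t function)
{
	if (fake_cpuid_eax(0) < function)
		return 0;

	return fake_cpuid_eax(function);
}
```

Here, raw_eax_unchecked(7) yields leaf 0x2's value even though leaf 0x7
doesn't exist on the fake CPU, while raw_eax_checked(7) yields zero, so a
mask built from it can never leave a bogus bit set.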

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index bfb81e417bef..c7fb6b764075 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -579,18 +579,37 @@ int kvm_vcpu_ioctl_get_cpuid2(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+static __always_inline u32 raw_cpuid_get(struct cpuid_reg cpuid)
+{
+	struct kvm_cpuid_entry2 entry;
+	u32 base;
+
+	/*
+	 * KVM only supports features defined by Intel (0x0), AMD (0x80000000),
+	 * and Centaur (0xc0000000).  WARN if a feature for a new vendor base is
+	 * defined, as this and other code would need to be updated.
+	 */
+	base = cpuid.function & 0xffff0000;
+	if (WARN_ON_ONCE(base && base != 0x80000000 && base != 0xc0000000))
+		return 0;
+
+	if (cpuid_eax(base) < cpuid.function)
+		return 0;
+
+	cpuid_count(cpuid.function, cpuid.index,
+		    &entry.eax, &entry.ebx, &entry.ecx, &entry.edx);
+
+	return *__cpuid_entry_get_reg(&entry, cpuid.reg);
+}
+
 /* Mask kvm_cpu_caps for @leaf with the raw CPUID capabilities of this CPU. */
 static __always_inline void __kvm_cpu_cap_mask(unsigned int leaf)
 {
 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);
-	struct kvm_cpuid_entry2 entry;
 
 	reverse_cpuid_check(leaf);
 
-	cpuid_count(cpuid.function, cpuid.index,
-		    &entry.eax, &entry.ebx, &entry.ecx, &entry.edx);
-
-	kvm_cpu_caps[leaf] &= *__cpuid_entry_get_reg(&entry, cpuid.reg);
+	kvm_cpu_caps[leaf] &= raw_cpuid_get(cpuid);
 }
 
 static __always_inline
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 22/57] KVM: x86: Unpack F() CPUID feature flag macros to one flag per line of code
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (20 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 21/57] KVM: x86: Account for max supported CPUID leaf when getting raw host CPUID Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 23/57] KVM: x86: Rename kvm_cpu_cap_mask() to kvm_cpu_cap_init() Sean Christopherson
                   ` (36 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Refactor kvm_set_cpu_caps() to express each supported (or not) feature
flag on a separate line, modulo a handful of cases where KVM does not, and
likely will not, support a sequence of flags.  This will allow adding
fancier macros with longer, more descriptive names without resulting in
absurd line lengths and/or weird code.  Isolating each flag also makes it
far easier to review changes, reduces code conflicts, and generally makes
it easier to resolve conflicts.  Lastly, it allows co-locating comments
for notable flags, e.g. MONITOR, precisely with the relevant flag.

No functional change intended.

Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 295 +++++++++++++++++++++++++++++++++----------
 1 file changed, 231 insertions(+), 64 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c7fb6b764075..00b5b1a2a66f 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -662,48 +662,121 @@ void kvm_set_cpu_caps(void)
 	       sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)));
 
 	kvm_cpu_cap_mask(CPUID_1_ECX,
+		F(XMM3) |
+		F(PCLMULQDQ) |
+		0 /* DTES64 */ |
 		/*
 		 * NOTE: MONITOR (and MWAIT) are emulated as NOP, but *not*
 		 * advertised to guests via CPUID!
 		 */
-		F(XMM3) | F(PCLMULQDQ) | 0 /* DTES64, MONITOR */ |
+		0 /* MONITOR */ |
 		0 /* DS-CPL, VMX, SMX, EST */ |
-		0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ |
-		F(FMA) | F(CX16) | 0 /* xTPR Update */ | F(PDCM) |
-		F(PCID) | 0 /* Reserved, DCA */ | F(XMM4_1) |
-		F(XMM4_2) | F(X2APIC) | F(MOVBE) | F(POPCNT) |
-		0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
-		F(F16C) | F(RDRAND)
+		0 /* TM2 */ |
+		F(SSSE3) |
+		0 /* CNXT-ID */ |
+		0 /* Reserved */ |
+		F(FMA) |
+		F(CX16) |
+		0 /* xTPR Update */ |
+		F(PDCM) |
+		F(PCID) |
+		0 /* Reserved, DCA */ |
+		F(XMM4_1) |
+		F(XMM4_2) |
+		F(X2APIC) |
+		F(MOVBE) |
+		F(POPCNT) |
+		0 /* Reserved*/ |
+		F(AES) |
+		F(XSAVE) |
+		0 /* OSXSAVE */ |
+		F(AVX) |
+		F(F16C) |
+		F(RDRAND)
 	);
 	/* KVM emulates x2apic in software irrespective of host support. */
 	kvm_cpu_cap_set(X86_FEATURE_X2APIC);
 
 	kvm_cpu_cap_mask(CPUID_1_EDX,
-		F(FPU) | F(VME) | F(DE) | F(PSE) |
-		F(TSC) | F(MSR) | F(PAE) | F(MCE) |
-		F(CX8) | F(APIC) | 0 /* Reserved */ | F(SEP) |
-		F(MTRR) | F(PGE) | F(MCA) | F(CMOV) |
-		F(PAT) | F(PSE36) | 0 /* PSN */ | F(CLFLUSH) |
-		0 /* Reserved, DS, ACPI */ | F(MMX) |
-		F(FXSR) | F(XMM) | F(XMM2) | F(SELFSNOOP) |
+		F(FPU) |
+		F(VME) |
+		F(DE) |
+		F(PSE) |
+		F(TSC) |
+		F(MSR) |
+		F(PAE) |
+		F(MCE) |
+		F(CX8) |
+		F(APIC) |
+		0 /* Reserved */ |
+		F(SEP) |
+		F(MTRR) |
+		F(PGE) |
+		F(MCA) |
+		F(CMOV) |
+		F(PAT) |
+		F(PSE36) |
+		0 /* PSN */ |
+		F(CLFLUSH) |
+		0 /* Reserved, DS, ACPI */ |
+		F(MMX) |
+		F(FXSR) |
+		F(XMM) |
+		F(XMM2) |
+		F(SELFSNOOP) |
 		0 /* HTT, TM, Reserved, PBE */
 	);
 
 	kvm_cpu_cap_mask(CPUID_7_0_EBX,
-		F(FSGSBASE) | F(SGX) | F(BMI1) | F(HLE) | F(AVX2) |
-		F(FDP_EXCPTN_ONLY) | F(SMEP) | F(BMI2) | F(ERMS) | F(INVPCID) |
-		F(RTM) | F(ZERO_FCS_FDS) | 0 /*MPX*/ | F(AVX512F) |
-		F(AVX512DQ) | F(RDSEED) | F(ADX) | F(SMAP) | F(AVX512IFMA) |
-		F(CLFLUSHOPT) | F(CLWB) | 0 /*INTEL_PT*/ | F(AVX512PF) |
-		F(AVX512ER) | F(AVX512CD) | F(SHA_NI) | F(AVX512BW) |
+		F(FSGSBASE) |
+		F(SGX) |
+		F(BMI1) |
+		F(HLE) |
+		F(AVX2) |
+		F(FDP_EXCPTN_ONLY) |
+		F(SMEP) |
+		F(BMI2) |
+		F(ERMS) |
+		F(INVPCID) |
+		F(RTM) |
+		F(ZERO_FCS_FDS) |
+		0 /*MPX*/ |
+		F(AVX512F) |
+		F(AVX512DQ) |
+		F(RDSEED) |
+		F(ADX) |
+		F(SMAP) |
+		F(AVX512IFMA) |
+		F(CLFLUSHOPT) |
+		F(CLWB) |
+		0 /*INTEL_PT*/ |
+		F(AVX512PF) |
+		F(AVX512ER) |
+		F(AVX512CD) |
+		F(SHA_NI) |
+		F(AVX512BW) |
 		F(AVX512VL));
 
 	kvm_cpu_cap_mask(CPUID_7_ECX,
-		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
-		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
-		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
-		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
-		F(SGX_LC) | F(BUS_LOCK_DETECT)
+		F(AVX512VBMI) |
+		F(LA57) |
+		F(PKU) |
+		0 /*OSPKE*/ |
+		F(RDPID) |
+		F(AVX512_VPOPCNTDQ) |
+		F(UMIP) |
+		F(AVX512_VBMI2) |
+		F(GFNI) |
+		F(VAES) |
+		F(VPCLMULQDQ) |
+		F(AVX512_VNNI) |
+		F(AVX512_BITALG) |
+		F(CLDEMOTE) |
+		F(MOVDIRI) |
+		F(MOVDIR64B) |
+		0 /*WAITPKG*/ |
+		F(SGX_LC) |
+		F(BUS_LOCK_DETECT)
 	);
 	/* Set LA57 based on hardware capability. */
 	if (cpuid_ecx(7) & feature_bit(LA57))
@@ -717,11 +790,22 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_PKU);
 
 	kvm_cpu_cap_mask(CPUID_7_EDX,
-		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
-		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
-		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
-		F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) |
-		F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D)
+		F(AVX512_4VNNIW) |
+		F(AVX512_4FMAPS) |
+		F(SPEC_CTRL) |
+		F(SPEC_CTRL_SSBD) |
+		F(ARCH_CAPABILITIES) |
+		F(INTEL_STIBP) |
+		F(MD_CLEAR) |
+		F(AVX512_VP2INTERSECT) |
+		F(FSRM) |
+		F(SERIALIZE) |
+		F(TSXLDTRK) |
+		F(AVX512_FP16) |
+		F(AMX_TILE) |
+		F(AMX_INT8) |
+		F(AMX_BF16) |
+		F(FLUSH_L1D)
 	);
 
 	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
@@ -738,50 +822,110 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
 
 	kvm_cpu_cap_mask(CPUID_7_1_EAX,
-		F(SHA512) | F(SM3) | F(SM4) | F(AVX_VNNI) | F(AVX512_BF16) |
-		F(CMPCCXADD) | F(FZRM) | F(FSRS) | F(FSRC) | F(AMX_FP16) |
-		F(AVX_IFMA) | F(LAM)
+		F(SHA512) |
+		F(SM3) |
+		F(SM4) |
+		F(AVX_VNNI) |
+		F(AVX512_BF16) |
+		F(CMPCCXADD) |
+		F(FZRM) |
+		F(FSRS) |
+		F(FSRC) |
+		F(AMX_FP16) |
+		F(AVX_IFMA) |
+		F(LAM)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
-		F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(AMX_COMPLEX) |
-		F(AVX_VNNI_INT16) | F(PREFETCHITI) | F(AVX10)
+		F(AVX_VNNI_INT8) |
+		F(AVX_NE_CONVERT) |
+		F(AMX_COMPLEX) |
+		F(AVX_VNNI_INT16) |
+		F(PREFETCHITI) |
+		F(AVX10)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
-		F(INTEL_PSFD) | F(IPRED_CTRL) | F(RRSBA_CTRL) | F(DDPD_U) |
-		F(BHI_CTRL) | F(MCDT_NO)
+		F(INTEL_PSFD) |
+		F(IPRED_CTRL) |
+		F(RRSBA_CTRL) |
+		F(DDPD_U) |
+		F(BHI_CTRL) |
+		F(MCDT_NO)
 	);
 
 	kvm_cpu_cap_mask(CPUID_D_1_EAX,
-		F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1) | F(XSAVES) | f_xfd
+		F(XSAVEOPT) |
+		F(XSAVEC) |
+		F(XGETBV1) |
+		F(XSAVES) |
+		f_xfd
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_12_EAX,
-		SF(SGX1) | SF(SGX2) | SF(SGX_EDECCSSA)
+		SF(SGX1) |
+		SF(SGX2) |
+		SF(SGX_EDECCSSA)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_24_0_EBX,
-		F(AVX10_128) | F(AVX10_256) | F(AVX10_512)
+		F(AVX10_128) |
+		F(AVX10_256) |
+		F(AVX10_512)
 	);
 
 	kvm_cpu_cap_mask(CPUID_8000_0001_ECX,
-		F(LAHF_LM) | F(CMP_LEGACY) | 0 /*SVM*/ | 0 /* ExtApicSpace */ |
-		F(CR8_LEGACY) | F(ABM) | F(SSE4A) | F(MISALIGNSSE) |
-		F(3DNOWPREFETCH) | F(OSVW) | 0 /* IBS */ | F(XOP) |
-		0 /* SKINIT, WDT, LWP */ | F(FMA4) | F(TBM) |
-		F(TOPOEXT) | 0 /* PERFCTR_CORE */
+		F(LAHF_LM) |
+		F(CMP_LEGACY) |
+		0 /*SVM*/ |
+		0 /* ExtApicSpace */ |
+		F(CR8_LEGACY) |
+		F(ABM) |
+		F(SSE4A) |
+		F(MISALIGNSSE) |
+		F(3DNOWPREFETCH) |
+		F(OSVW) |
+		0 /* IBS */ |
+		F(XOP) |
+		0 /* SKINIT, WDT, LWP */ |
+		F(FMA4) |
+		F(TBM) |
+		F(TOPOEXT) |
+		0 /* PERFCTR_CORE */
 	);
 
 	kvm_cpu_cap_mask(CPUID_8000_0001_EDX,
-		F(FPU) | F(VME) | F(DE) | F(PSE) |
-		F(TSC) | F(MSR) | F(PAE) | F(MCE) |
-		F(CX8) | F(APIC) | 0 /* Reserved */ | F(SYSCALL) |
-		F(MTRR) | F(PGE) | F(MCA) | F(CMOV) |
-		F(PAT) | F(PSE36) | 0 /* Reserved */ |
-		F(NX) | 0 /* Reserved */ | F(MMXEXT) | F(MMX) |
-		F(FXSR) | F(FXSR_OPT) | f_gbpages | F(RDTSCP) |
-		0 /* Reserved */ | f_lm | F(3DNOWEXT) | F(3DNOW)
+		F(FPU) |
+		F(VME) |
+		F(DE) |
+		F(PSE) |
+		F(TSC) |
+		F(MSR) |
+		F(PAE) |
+		F(MCE) |
+		F(CX8) |
+		F(APIC) |
+		0 /* Reserved */ |
+		F(SYSCALL) |
+		F(MTRR) |
+		F(PGE) |
+		F(MCA) |
+		F(CMOV) |
+		F(PAT) |
+		F(PSE36) |
+		0 /* Reserved */ |
+		F(NX) |
+		0 /* Reserved */ |
+		F(MMXEXT) |
+		F(MMX) |
+		F(FXSR) |
+		F(FXSR_OPT) |
+		f_gbpages |
+		F(RDTSCP) |
+		0 /* Reserved */ |
+		f_lm |
+		F(3DNOWEXT) |
+		F(3DNOW)
 	);
 
 	if (!tdp_enabled && IS_ENABLED(CONFIG_X86_64))
@@ -792,10 +936,18 @@ void kvm_set_cpu_caps(void)
 	);
 
 	kvm_cpu_cap_mask(CPUID_8000_0008_EBX,
-		F(CLZERO) | F(XSAVEERPTR) |
-		F(WBNOINVD) | F(AMD_IBPB) | F(AMD_IBRS) | F(AMD_SSBD) | F(VIRT_SSBD) |
-		F(AMD_SSB_NO) | F(AMD_STIBP) | F(AMD_STIBP_ALWAYS_ON) |
-		F(AMD_PSFD) | F(AMD_IBPB_RET)
+		F(CLZERO) |
+		F(XSAVEERPTR) |
+		F(WBNOINVD) |
+		F(AMD_IBPB) |
+		F(AMD_IBRS) |
+		F(AMD_SSBD) |
+		F(VIRT_SSBD) |
+		F(AMD_SSB_NO) |
+		F(AMD_STIBP) |
+		F(AMD_STIBP_ALWAYS_ON) |
+		F(AMD_PSFD) |
+		F(AMD_IBPB_RET)
 	);
 
 	/*
@@ -832,12 +984,20 @@ void kvm_set_cpu_caps(void)
 	kvm_cpu_cap_mask(CPUID_8000_000A_EDX, 0);
 
 	kvm_cpu_cap_mask(CPUID_8000_001F_EAX,
-		0 /* SME */ | 0 /* SEV */ | 0 /* VM_PAGE_FLUSH */ | 0 /* SEV_ES */ |
-		F(SME_COHERENT));
+		0 /* SME */ |
+		0 /* SEV */ |
+		0 /* VM_PAGE_FLUSH */ |
+		0 /* SEV_ES */ |
+		F(SME_COHERENT)
+	);
 
 	kvm_cpu_cap_mask(CPUID_8000_0021_EAX,
-		F(NO_NESTED_DATA_BP) | F(LFENCE_RDTSC) | 0 /* SmmPgCfgLock */ |
-		F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */ |
+		F(NO_NESTED_DATA_BP) |
+		F(LFENCE_RDTSC) |
+		0 /* SmmPgCfgLock */ |
+		F(NULL_SEL_CLR_BASE) |
+		F(AUTOIBRS) |
+		0 /* PrefetchCtlMsr */ |
 		F(WRMSR_XX_BASE_NS)
 	);
 
@@ -866,9 +1026,16 @@ void kvm_set_cpu_caps(void)
 	kvm_cpu_cap_set(X86_FEATURE_NO_SMM_CTL_MSR);
 
 	kvm_cpu_cap_mask(CPUID_C000_0001_EDX,
-		F(XSTORE) | F(XSTORE_EN) | F(XCRYPT) | F(XCRYPT_EN) |
-		F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
-		F(PMM) | F(PMM_EN)
+		F(XSTORE) |
+		F(XSTORE_EN) |
+		F(XCRYPT) |
+		F(XCRYPT_EN) |
+		F(ACE2) |
+		F(ACE2_EN) |
+		F(PHE) |
+		F(PHE_EN) |
+		F(PMM) |
+		F(PMM_EN)
 	);
 
 	/*
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 23/57] KVM: x86: Rename kvm_cpu_cap_mask() to kvm_cpu_cap_init()
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (21 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 22/57] KVM: x86: Unpack F() CPUID feature flag macros to one flag per line of code Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 24/57] KVM: x86: Add a macro to init CPUID features that are 64-bit only Sean Christopherson
                   ` (35 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Rename kvm_cpu_cap_mask() to kvm_cpu_cap_init() in anticipation of merging
it with kvm_cpu_cap_init_kvm_defined(), and in anticipation of _setting_
bits in the helper (a future commit will play macro games to set emulated
feature flags via kvm_cpu_cap_init()).

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 00b5b1a2a66f..9bd8bac3cd52 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -615,7 +615,7 @@ static __always_inline void __kvm_cpu_cap_mask(unsigned int leaf)
 static __always_inline
 void kvm_cpu_cap_init_kvm_defined(enum kvm_only_cpuid_leafs leaf, u32 mask)
 {
-	/* Use kvm_cpu_cap_mask for leafs that aren't KVM-only. */
+	/* Use kvm_cpu_cap_init for leafs that aren't KVM-only. */
 	BUILD_BUG_ON(leaf < NCAPINTS);
 
 	kvm_cpu_caps[leaf] = mask;
@@ -623,7 +623,7 @@ void kvm_cpu_cap_init_kvm_defined(enum kvm_only_cpuid_leafs leaf, u32 mask)
 	__kvm_cpu_cap_mask(leaf);
 }
 
-static __always_inline void kvm_cpu_cap_mask(enum cpuid_leafs leaf, u32 mask)
+static __always_inline void kvm_cpu_cap_init(enum cpuid_leafs leaf, u32 mask)
 {
 	/* Use kvm_cpu_cap_init_kvm_defined for KVM-only leafs. */
 	BUILD_BUG_ON(leaf >= NCAPINTS);
@@ -661,7 +661,7 @@ void kvm_set_cpu_caps(void)
 	memcpy(&kvm_cpu_caps, &boot_cpu_data.x86_capability,
 	       sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)));
 
-	kvm_cpu_cap_mask(CPUID_1_ECX,
+	kvm_cpu_cap_init(CPUID_1_ECX,
 		F(XMM3) |
 		F(PCLMULQDQ) |
 		0 /* DTES64 */ |
@@ -697,7 +697,7 @@ void kvm_set_cpu_caps(void)
 	/* KVM emulates x2apic in software irrespective of host support. */
 	kvm_cpu_cap_set(X86_FEATURE_X2APIC);
 
-	kvm_cpu_cap_mask(CPUID_1_EDX,
+	kvm_cpu_cap_init(CPUID_1_EDX,
 		F(FPU) |
 		F(VME) |
 		F(DE) |
@@ -727,7 +727,7 @@ void kvm_set_cpu_caps(void)
 		0 /* HTT, TM, Reserved, PBE */
 	);
 
-	kvm_cpu_cap_mask(CPUID_7_0_EBX,
+	kvm_cpu_cap_init(CPUID_7_0_EBX,
 		F(FSGSBASE) |
 		F(SGX) |
 		F(BMI1) |
@@ -757,7 +757,7 @@ void kvm_set_cpu_caps(void)
 		F(AVX512BW) |
 		F(AVX512VL));
 
-	kvm_cpu_cap_mask(CPUID_7_ECX,
+	kvm_cpu_cap_init(CPUID_7_ECX,
 		F(AVX512VBMI) |
 		F(LA57) |
 		F(PKU) |
@@ -789,7 +789,7 @@ void kvm_set_cpu_caps(void)
 	if (!tdp_enabled || !boot_cpu_has(X86_FEATURE_OSPKE))
 		kvm_cpu_cap_clear(X86_FEATURE_PKU);
 
-	kvm_cpu_cap_mask(CPUID_7_EDX,
+	kvm_cpu_cap_init(CPUID_7_EDX,
 		F(AVX512_4VNNIW) |
 		F(AVX512_4FMAPS) |
 		F(SPEC_CTRL) |
@@ -821,7 +821,7 @@ void kvm_set_cpu_caps(void)
 	if (boot_cpu_has(X86_FEATURE_AMD_SSBD))
 		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
 
-	kvm_cpu_cap_mask(CPUID_7_1_EAX,
+	kvm_cpu_cap_init(CPUID_7_1_EAX,
 		F(SHA512) |
 		F(SM3) |
 		F(SM4) |
@@ -854,7 +854,7 @@ void kvm_set_cpu_caps(void)
 		F(MCDT_NO)
 	);
 
-	kvm_cpu_cap_mask(CPUID_D_1_EAX,
+	kvm_cpu_cap_init(CPUID_D_1_EAX,
 		F(XSAVEOPT) |
 		F(XSAVEC) |
 		F(XGETBV1) |
@@ -874,7 +874,7 @@ void kvm_set_cpu_caps(void)
 		F(AVX10_512)
 	);
 
-	kvm_cpu_cap_mask(CPUID_8000_0001_ECX,
+	kvm_cpu_cap_init(CPUID_8000_0001_ECX,
 		F(LAHF_LM) |
 		F(CMP_LEGACY) |
 		0 /*SVM*/ |
@@ -894,7 +894,7 @@ void kvm_set_cpu_caps(void)
 		0 /* PERFCTR_CORE */
 	);
 
-	kvm_cpu_cap_mask(CPUID_8000_0001_EDX,
+	kvm_cpu_cap_init(CPUID_8000_0001_EDX,
 		F(FPU) |
 		F(VME) |
 		F(DE) |
@@ -935,7 +935,7 @@ void kvm_set_cpu_caps(void)
 		SF(CONSTANT_TSC)
 	);
 
-	kvm_cpu_cap_mask(CPUID_8000_0008_EBX,
+	kvm_cpu_cap_init(CPUID_8000_0008_EBX,
 		F(CLZERO) |
 		F(XSAVEERPTR) |
 		F(WBNOINVD) |
@@ -981,9 +981,9 @@ void kvm_set_cpu_caps(void)
 	 * Hide all SVM features by default, SVM will set the cap bits for
 	 * features it emulates and/or exposes for L1.
 	 */
-	kvm_cpu_cap_mask(CPUID_8000_000A_EDX, 0);
+	kvm_cpu_cap_init(CPUID_8000_000A_EDX, 0);
 
-	kvm_cpu_cap_mask(CPUID_8000_001F_EAX,
+	kvm_cpu_cap_init(CPUID_8000_001F_EAX,
 		0 /* SME */ |
 		0 /* SEV */ |
 		0 /* VM_PAGE_FLUSH */ |
@@ -991,7 +991,7 @@ void kvm_set_cpu_caps(void)
 		F(SME_COHERENT)
 	);
 
-	kvm_cpu_cap_mask(CPUID_8000_0021_EAX,
+	kvm_cpu_cap_init(CPUID_8000_0021_EAX,
 		F(NO_NESTED_DATA_BP) |
 		F(LFENCE_RDTSC) |
 		0 /* SmmPgCfgLock */ |
@@ -1015,7 +1015,7 @@ void kvm_set_cpu_caps(void)
 	 * kernel.  LFENCE_RDTSC was a Linux-defined synthetic feature long
 	 * before AMD joined the bandwagon, e.g. LFENCE is serializing on most
 	 * CPUs that support SSE2.  On CPUs that don't support AMD's leaf,
-	 * kvm_cpu_cap_mask() will unfortunately drop the flag due to ANDing
+	 * kvm_cpu_cap_init() will unfortunately drop the flag due to ANDing
 	 * the mask with the raw host CPUID, and reporting support in AMD's
 	 * leaf can make it easier for userspace to detect the feature.
 	 */
@@ -1025,7 +1025,7 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_NULL_SEL_CLR_BASE);
 	kvm_cpu_cap_set(X86_FEATURE_NO_SMM_CTL_MSR);
 
-	kvm_cpu_cap_mask(CPUID_C000_0001_EDX,
+	kvm_cpu_cap_init(CPUID_C000_0001_EDX,
 		F(XSTORE) |
 		F(XSTORE_EN) |
 		F(XCRYPT) |
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 24/57] KVM: x86: Add a macro to init CPUID features that are 64-bit only
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (22 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 23/57] KVM: x86: Rename kvm_cpu_cap_mask() to kvm_cpu_cap_init() Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 25/57] KVM: x86: Add a macro to precisely handle aliased 0x1.EDX CPUID features Sean Christopherson
                   ` (34 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Add a macro to mask-in feature flags that are supported only on 64-bit
kernels/KVM.  In addition to reducing overall #ifdeffery, using a macro
will allow hardening the kvm_cpu_cap initialization sequences to assert
that the features being advertised are indeed included in the word being
initialized.  And arguably using *F() macros throughout is more readable.

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 9bd8bac3cd52..9219e164c810 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -642,17 +642,14 @@ static __always_inline void kvm_cpu_cap_init(enum cpuid_leafs leaf, u32 mask)
 	(boot_cpu_has(X86_FEATURE_##name) ? F(name) : 0);	\
 })
 
+/* Features that KVM supports only on 64-bit kernels. */
+#define X86_64_F(name)						\
+({								\
+	(IS_ENABLED(CONFIG_X86_64) ? F(name) : 0);		\
+})
+
 void kvm_set_cpu_caps(void)
 {
-#ifdef CONFIG_X86_64
-	unsigned int f_gbpages = F(GBPAGES);
-	unsigned int f_lm = F(LM);
-	unsigned int f_xfd = F(XFD);
-#else
-	unsigned int f_gbpages = 0;
-	unsigned int f_lm = 0;
-	unsigned int f_xfd = 0;
-#endif
 	memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
 
 	BUILD_BUG_ON(sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)) >
@@ -859,7 +856,7 @@ void kvm_set_cpu_caps(void)
 		F(XSAVEC) |
 		F(XGETBV1) |
 		F(XSAVES) |
-		f_xfd
+		X86_64_F(XFD)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_12_EAX,
@@ -920,10 +917,10 @@ void kvm_set_cpu_caps(void)
 		F(MMX) |
 		F(FXSR) |
 		F(FXSR_OPT) |
-		f_gbpages |
+		X86_64_F(GBPAGES) |
 		F(RDTSCP) |
 		0 /* Reserved */ |
-		f_lm |
+		X86_64_F(LM) |
 		F(3DNOWEXT) |
 		F(3DNOW)
 	);
@@ -1057,6 +1054,7 @@ EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 
 #undef F
 #undef SF
+#undef X86_64_F
 
 struct kvm_cpuid_array {
 	struct kvm_cpuid_entry2 *entries;
-- 
2.47.0.338.g60cca15819-goog
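
Editor's sketch of the idea in the patch above, in plain userspace C rather than kernel code: a config-conditional macro replaces a set of #ifdef'd local variables.  Every DEMO_*/demo_* name is hypothetical, and DEMO_CONFIG_X86_64 merely stands in for IS_ENABLED(CONFIG_X86_64); the bit positions are only loosely modelled on CPUID.0x8000_0001.EDX.

```c
#include <stdint.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_X86_64); 0 models a 32-bit build. */
#ifndef DEMO_CONFIG_X86_64
#define DEMO_CONFIG_X86_64 1
#endif

/* Hypothetical bit positions within a single 32-bit CPUID word. */
enum demo_feature { DEMO_NX = 20, DEMO_GBPAGES = 26, DEMO_LM = 29 };

/* Unconditional feature bit. */
#define DEMO_F(name)		(UINT32_C(1) << DEMO_##name)

/* Feature bit that is advertised only on 64-bit builds, else 0. */
#define DEMO_X86_64_F(name)	(DEMO_CONFIG_X86_64 ? DEMO_F(name) : UINT32_C(0))

/* The mask reads as one flat list, with no per-feature #ifdef'd locals. */
static uint32_t demo_ext_edx_mask(void)
{
	return DEMO_F(NX) |
	       DEMO_X86_64_F(GBPAGES) |
	       DEMO_X86_64_F(LM);
}
```

Because the condition is a compile-time constant, the compiler can fold the unused arm away, so the macro form should cost nothing relative to the #ifdef version.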



* [PATCH v3 25/57] KVM: x86: Add a macro to precisely handle aliased 0x1.EDX CPUID features
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (23 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 24/57] KVM: x86: Add a macro to init CPUID features that are 64-bit only Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 26/57] KVM: x86: Handle kernel- and KVM-defined CPUID words in a single helper Sean Christopherson
                   ` (33 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Add a macro to precisely handle CPUID features that AMD duplicated from
CPUID.0x1.EDX into CPUID.0x8000_0001.EDX.  This will allow adding an
assert that all features passed to kvm_cpu_cap_init() match the word being
processed, e.g. to prevent passing a feature from CPUID 0x7 to CPUID 0x1.

Because the kernel simply reuses the X86_FEATURE_* definitions from
CPUID.0x1.EDX, KVM's use of the aliased features would result in false
positives from such an assert.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 47 +++++++++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 9219e164c810..ddff0c7c78b9 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -648,6 +648,16 @@ static __always_inline void kvm_cpu_cap_init(enum cpuid_leafs leaf, u32 mask)
 	(IS_ENABLED(CONFIG_X86_64) ? F(name) : 0);		\
 })
 
+/*
+ * Aliased Features - For features in 0x8000_0001.EDX that are duplicates of
+ * identical 0x1.EDX features, and thus are aliased from 0x1 to 0x8000_0001.
+ */
+#define ALIASED_1_EDX_F(name)							\
+({										\
+	BUILD_BUG_ON(__feature_leaf(X86_FEATURE_##name) != CPUID_1_EDX);	\
+	feature_bit(name);							\
+})
+
 void kvm_set_cpu_caps(void)
 {
 	memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
@@ -892,30 +902,30 @@ void kvm_set_cpu_caps(void)
 	);
 
 	kvm_cpu_cap_init(CPUID_8000_0001_EDX,
-		F(FPU) |
-		F(VME) |
-		F(DE) |
-		F(PSE) |
-		F(TSC) |
-		F(MSR) |
-		F(PAE) |
-		F(MCE) |
-		F(CX8) |
-		F(APIC) |
+		ALIASED_1_EDX_F(FPU) |
+		ALIASED_1_EDX_F(VME) |
+		ALIASED_1_EDX_F(DE) |
+		ALIASED_1_EDX_F(PSE) |
+		ALIASED_1_EDX_F(TSC) |
+		ALIASED_1_EDX_F(MSR) |
+		ALIASED_1_EDX_F(PAE) |
+		ALIASED_1_EDX_F(MCE) |
+		ALIASED_1_EDX_F(CX8) |
+		ALIASED_1_EDX_F(APIC) |
 		0 /* Reserved */ |
 		F(SYSCALL) |
-		F(MTRR) |
-		F(PGE) |
-		F(MCA) |
-		F(CMOV) |
-		F(PAT) |
-		F(PSE36) |
+		ALIASED_1_EDX_F(MTRR) |
+		ALIASED_1_EDX_F(PGE) |
+		ALIASED_1_EDX_F(MCA) |
+		ALIASED_1_EDX_F(CMOV) |
+		ALIASED_1_EDX_F(PAT) |
+		ALIASED_1_EDX_F(PSE36) |
 		0 /* Reserved */ |
 		F(NX) |
 		0 /* Reserved */ |
 		F(MMXEXT) |
-		F(MMX) |
-		F(FXSR) |
+		ALIASED_1_EDX_F(MMX) |
+		ALIASED_1_EDX_F(FXSR) |
 		F(FXSR_OPT) |
 		X86_64_F(GBPAGES) |
 		F(RDTSCP) |
@@ -1055,6 +1065,7 @@ EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 #undef F
 #undef SF
 #undef X86_64_F
+#undef ALIASED_1_EDX_F
 
 struct kvm_cpuid_array {
 	struct kvm_cpuid_entry2 *entries;
-- 
2.47.0.338.g60cca15819-goog
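
Editor's sketch of why aliased features need their own macro, assuming features are encoded as word * 32 + bit in the style of X86_FEATURE_*.  All demo_*/DEMO_* names and the word indices are hypothetical.  A naive "is this feature in the word being initialized?" check rejects FPU when building the 0x8000_0001.EDX mask, even though the bit position is valid there; the aliased check instead asks whether the feature is defined in 0x1.EDX.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical word indices. */
#define DEMO_CPUID_1_EDX		0u
#define DEMO_CPUID_8000_0001_EDX	1u

/* Features encoded as word * 32 + bit. */
#define DEMO_FEATURE(word, bit)		((word) * 32u + (bit))
#define DEMO_FEATURE_FPU		DEMO_FEATURE(DEMO_CPUID_1_EDX, 0u)
#define DEMO_FEATURE_SYSCALL		DEMO_FEATURE(DEMO_CPUID_8000_0001_EDX, 11u)

static inline unsigned int demo_feature_word(unsigned int f)
{
	return f / 32u;
}

static inline uint32_t demo_feature_bit(unsigned int f)
{
	return UINT32_C(1) << (f % 32u);
}

/* Naive scoping check: is the feature defined in the word being initialized? */
static inline bool demo_in_word(unsigned int f, unsigned int word)
{
	return demo_feature_word(f) == word;
}

/* Aliased check: the feature must be defined in 0x1.EDX; its bit is reused as-is. */
static inline bool demo_is_aliased_1_edx(unsigned int f)
{
	return demo_feature_word(f) == DEMO_CPUID_1_EDX;
}
```

In this model, FPU fails demo_in_word() for the extended word but passes demo_is_aliased_1_edx(), which is the false positive the commit message describes.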



* [PATCH v3 26/57] KVM: x86: Handle kernel- and KVM-defined CPUID words in a single helper
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (24 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 25/57] KVM: x86: Add a macro to precisely handle aliased 0x1.EDX CPUID features Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 27/57] KVM: x86: #undef SPEC_CTRL_SSBD in cpuid.c to avoid macro collisions Sean Christopherson
                   ` (32 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Merge kvm_cpu_cap_init() and kvm_cpu_cap_init_kvm_defined() into a single
helper.  The only advantage of separating the two was to make it somewhat
obvious that KVM directly initializes the KVM-defined words, whereas using
a common helper will allow for hardening both kernel- and KVM-defined
CPUID words without needing copy+paste.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 46 +++++++++++++++-----------------------------
 1 file changed, 16 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index ddff0c7c78b9..73e756d097e4 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -602,37 +602,23 @@ static __always_inline u32 raw_cpuid_get(struct cpuid_reg cpuid)
 	return *__cpuid_entry_get_reg(&entry, cpuid.reg);
 }
 
-/* Mask kvm_cpu_caps for @leaf with the raw CPUID capabilities of this CPU. */
-static __always_inline void __kvm_cpu_cap_mask(unsigned int leaf)
+static __always_inline void kvm_cpu_cap_init(u32 leaf, u32 mask)
 {
 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);
 
-	reverse_cpuid_check(leaf);
+	/*
+	 * For kernel-defined leafs, mask the boot CPU's pre-populated value.
+	 * For KVM-defined leafs, explicitly set the leaf, as KVM is the one
+	 * and only authority.
+	 */
+	if (leaf < NCAPINTS)
+		kvm_cpu_caps[leaf] &= mask;
+	else
+		kvm_cpu_caps[leaf] = mask;
 
 	kvm_cpu_caps[leaf] &= raw_cpuid_get(cpuid);
 }
 
-static __always_inline
-void kvm_cpu_cap_init_kvm_defined(enum kvm_only_cpuid_leafs leaf, u32 mask)
-{
-	/* Use kvm_cpu_cap_init for leafs that aren't KVM-only. */
-	BUILD_BUG_ON(leaf < NCAPINTS);
-
-	kvm_cpu_caps[leaf] = mask;
-
-	__kvm_cpu_cap_mask(leaf);
-}
-
-static __always_inline void kvm_cpu_cap_init(enum cpuid_leafs leaf, u32 mask)
-{
-	/* Use kvm_cpu_cap_init_kvm_defined for KVM-only leafs. */
-	BUILD_BUG_ON(leaf >= NCAPINTS);
-
-	kvm_cpu_caps[leaf] &= mask;
-
-	__kvm_cpu_cap_mask(leaf);
-}
-
 #define F feature_bit
 
 /* Scattered Flag - For features that are scattered by cpufeatures.h. */
@@ -843,7 +829,7 @@ void kvm_set_cpu_caps(void)
 		F(LAM)
 	);
 
-	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
+	kvm_cpu_cap_init(CPUID_7_1_EDX,
 		F(AVX_VNNI_INT8) |
 		F(AVX_NE_CONVERT) |
 		F(AMX_COMPLEX) |
@@ -852,7 +838,7 @@ void kvm_set_cpu_caps(void)
 		F(AVX10)
 	);
 
-	kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
+	kvm_cpu_cap_init(CPUID_7_2_EDX,
 		F(INTEL_PSFD) |
 		F(IPRED_CTRL) |
 		F(RRSBA_CTRL) |
@@ -869,13 +855,13 @@ void kvm_set_cpu_caps(void)
 		X86_64_F(XFD)
 	);
 
-	kvm_cpu_cap_init_kvm_defined(CPUID_12_EAX,
+	kvm_cpu_cap_init(CPUID_12_EAX,
 		SF(SGX1) |
 		SF(SGX2) |
 		SF(SGX_EDECCSSA)
 	);
 
-	kvm_cpu_cap_init_kvm_defined(CPUID_24_0_EBX,
+	kvm_cpu_cap_init(CPUID_24_0_EBX,
 		F(AVX10_128) |
 		F(AVX10_256) |
 		F(AVX10_512)
@@ -938,7 +924,7 @@ void kvm_set_cpu_caps(void)
 	if (!tdp_enabled && IS_ENABLED(CONFIG_X86_64))
 		kvm_cpu_cap_set(X86_FEATURE_GBPAGES);
 
-	kvm_cpu_cap_init_kvm_defined(CPUID_8000_0007_EDX,
+	kvm_cpu_cap_init(CPUID_8000_0007_EDX,
 		SF(CONSTANT_TSC)
 	);
 
@@ -1012,7 +998,7 @@ void kvm_set_cpu_caps(void)
 	kvm_cpu_cap_check_and_set(X86_FEATURE_IBPB_BRTYPE);
 	kvm_cpu_cap_check_and_set(X86_FEATURE_SRSO_NO);
 
-	kvm_cpu_cap_init_kvm_defined(CPUID_8000_0022_EAX,
+	kvm_cpu_cap_init(CPUID_8000_0022_EAX,
 		F(PERFMON_V2)
 	);
 
-- 
2.47.0.338.g60cca15819-goog
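
Editor's sketch of the merged helper's logic as a standalone function.  The array sizes are hypothetical and the raw CPUID value is passed in as a parameter here, whereas the patch reads it via raw_cpuid_get().

```c
#include <stdint.h>

/* Hypothetical sizes: two kernel-defined words followed by one KVM-defined word. */
enum { DEMO_NCAPINTS = 2, DEMO_NKVMCAPINTS = 1 };

static uint32_t demo_caps[DEMO_NCAPINTS + DEMO_NKVMCAPINTS];

static void demo_cap_init(unsigned int leaf, uint32_t mask, uint32_t raw_cpuid)
{
	if (leaf < DEMO_NCAPINTS)
		demo_caps[leaf] &= mask;	/* kernel-defined: narrow the pre-populated value */
	else
		demo_caps[leaf] = mask;		/* KVM-defined: the mask is the only input */

	/* Either way, never advertise more than the hardware reports. */
	demo_caps[leaf] &= raw_cpuid;
}
```

The branch on the leaf index replaces the two BUILD_BUG_ON()-guarded helpers; the trade-off noted in the commit message is that the call site no longer says which kind of word it is initializing.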



* [PATCH v3 27/57] KVM: x86: #undef SPEC_CTRL_SSBD in cpuid.c to avoid macro collisions
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (25 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 26/57] KVM: x86: Handle kernel- and KVM-defined CPUID words in a single helper Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 28/57] KVM: x86: Harden CPU capabilities processing against out-of-scope features Sean Christopherson
                   ` (31 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Undefine SPEC_CTRL_SSBD, which is #defined by msr-index.h to represent the
enable flag in MSR_IA32_SPEC_CTRL, to avoid issues with the macro being
unpacked into its raw value when passed to KVM's F() macro.  This will
allow using multiple layers of macros in F() and friends, e.g. to harden
against incorrect usage of F().

No functional change intended (cpuid.c doesn't consume SPEC_CTRL_SSBD).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 73e756d097e4..efff83da3df3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -644,6 +644,12 @@ static __always_inline void kvm_cpu_cap_init(u32 leaf, u32 mask)
 	feature_bit(name);							\
 })
 
+/*
+ * Undefine the MSR bit macro to avoid token concatenation issues when
+ * processing X86_FEATURE_SPEC_CTRL_SSBD.
+ */
+#undef SPEC_CTRL_SSBD
+
 void kvm_set_cpu_caps(void)
 {
 	memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
-- 
2.47.0.338.g60cca15819-goog
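
Editor's sketch of the collision this patch avoids, using hypothetical DEMO_* macros.  A macro argument that is an operand of ## is pasted as written, but an argument that is merely forwarded to another macro is fully expanded first.  So once F() gains an extra layer, an unrelated object-like macro named SPEC_CTRL_SSBD would be expanded before the inner paste ever sees the name.

```c
/* Hypothetical feature number, for illustration only. */
#define DEMO_FEATURE_SPEC_CTRL_SSBD	31

/* Innermost layer: 'name' is an operand of ##, so it is pasted unexpanded. */
#define DEMO_BIT(name)	(1u << DEMO_FEATURE_##name)

/* Extra layer: 'name' is forwarded, so it is macro-expanded before DEMO_BIT sees it. */
#define DEMO_F(name)	DEMO_BIT(name)

/* Stand-in for the MSR bit definition that collides with the feature name. */
#define SPEC_CTRL_SSBD	(1u << 2)

/*
 * With the definition above still live, DEMO_F(SPEC_CTRL_SSBD) would try to
 * paste DEMO_FEATURE_ onto "(" and fail to compile.  Dropping the colliding
 * macro before the first use keeps the name intact through both layers.
 */
#undef SPEC_CTRL_SSBD

static unsigned int demo_ssbd_bit(void)
{
	return DEMO_F(SPEC_CTRL_SSBD);
}
```

Note that DEMO_BIT(SPEC_CTRL_SSBD) on its own would have worked even without the #undef, which is presumably why the collision only matters once F() is layered.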



* [PATCH v3 28/57] KVM: x86: Harden CPU capabilities processing against out-of-scope features
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (26 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 27/57] KVM: x86: #undef SPEC_CTRL_SSBD in cpuid.c to avoid macro collisions Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 29/57] KVM: x86: Add a macro to init CPUID features that ignore host kernel support Sean Christopherson
                   ` (30 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Add compile-time assertions to verify that usage of F() and friends in
kvm_set_cpu_caps() is scoped to the correct CPUID word, e.g. to detect
bugs where KVM passes a feature bit from word X into word Y.

Add a one-off assertion in the aliased feature macro to ensure that only
word 0x8000_0001.EDX aliases the features defined for 0x1.EDX.

To do so, convert kvm_cpu_cap_init() to a macro and have it define a
local variable to track which CPUID word is being initialized that is
then used to validate usage of F() (all of the inputs are compile-time
constants and thus can be fed into BUILD_BUG_ON()).

Redefine KVM_VALIDATE_CPU_CAP_USAGE after kvm_set_cpu_caps() to be a nop
so that F() can be used in other flows that aren't as easily hardened,
e.g. __do_cpuid_func_emulated() and __do_cpuid_func().

Invoke KVM_VALIDATE_CPU_CAP_USAGE() in SF() and X86_64_F() to ensure the
validation occurs, e.g. if the usage of F() is completely compiled out
(which shouldn't happen for boot_cpu_has(), but could happen in the future,
e.g. if KVM were to use cpu_feature_enabled()).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 51 ++++++++++++++++++++++++++++++--------------
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index efff83da3df3..c9a8513dbc30 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -602,35 +602,53 @@ static __always_inline u32 raw_cpuid_get(struct cpuid_reg cpuid)
 	return *__cpuid_entry_get_reg(&entry, cpuid.reg);
 }
 
-static __always_inline void kvm_cpu_cap_init(u32 leaf, u32 mask)
-{
-	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);
+/*
+ * For kernel-defined leafs, mask the boot CPU's pre-populated value.  For KVM-
+ * defined leafs, explicitly set the leaf, as KVM is the one and only authority.
+ */
+#define kvm_cpu_cap_init(leaf, mask)					\
+do {									\
+	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
+	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
+									\
+	if (leaf < NCAPINTS)						\
+		kvm_cpu_caps[leaf] &= (mask);				\
+	else								\
+		kvm_cpu_caps[leaf] = (mask);				\
+									\
+	kvm_cpu_caps[leaf] &= raw_cpuid_get(cpuid);			\
+} while (0)
 
-	/*
-	 * For kernel-defined leafs, mask the boot CPU's pre-populated value.
-	 * For KVM-defined leafs, explicitly set the leaf, as KVM is the one
-	 * and only authority.
-	 */
-	if (leaf < NCAPINTS)
-		kvm_cpu_caps[leaf] &= mask;
-	else
-		kvm_cpu_caps[leaf] = mask;
+/*
+ * Assert that the feature bit being declared, e.g. via F(), is in the CPUID
+ * word that's being initialized.  Exempt 0x8000_0001.EDX usage of 0x1.EDX
+ * features, as AMD duplicated many 0x1.EDX features into 0x8000_0001.EDX.
+ */
+#define KVM_VALIDATE_CPU_CAP_USAGE(name)				\
+do {									\
+	u32 __leaf = __feature_leaf(X86_FEATURE_##name);		\
+									\
+	BUILD_BUG_ON(__leaf != kvm_cpu_cap_init_in_progress);		\
+} while (0)
 
-	kvm_cpu_caps[leaf] &= raw_cpuid_get(cpuid);
-}
-
-#define F feature_bit
+#define F(name)							\
+({								\
+	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
+	feature_bit(name);					\
+})
 
 /* Scattered Flag - For features that are scattered by cpufeatures.h. */
 #define SF(name)						\
 ({								\
 	BUILD_BUG_ON(X86_FEATURE_##name >= MAX_CPU_FEATURES);	\
+	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
 	(boot_cpu_has(X86_FEATURE_##name) ? F(name) : 0);	\
 })
 
 /* Features that KVM supports only on 64-bit kernels. */
 #define X86_64_F(name)						\
 ({								\
+	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
 	(IS_ENABLED(CONFIG_X86_64) ? F(name) : 0);		\
 })
 
@@ -641,6 +659,7 @@ static __always_inline void kvm_cpu_cap_init(u32 leaf, u32 mask)
 #define ALIASED_1_EDX_F(name)							\
 ({										\
 	BUILD_BUG_ON(__feature_leaf(X86_FEATURE_##name) != CPUID_1_EDX);	\
+	BUILD_BUG_ON(kvm_cpu_cap_init_in_progress != CPUID_8000_0001_EDX);	\
 	feature_bit(name);							\
 })
 
-- 
2.47.0.338.g60cca15819-goog
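
Editor's sketch of a compile-time scoping check in portable C11.  This is not the kernel's mechanism: the patch tracks the word in a local variable and relies on BUILD_BUG_ON(), whereas this sketch passes the word explicitly and uses _Static_assert inside a sizeof'd struct.  All DEMO_* names and word indices are hypothetical.

```c
#include <stdint.h>

/* Hypothetical word indices; features are encoded as word * 32 + bit. */
enum { DEMO_CPUID_1_EDX = 0, DEMO_CPUID_7_ECX = 1 };

#define DEMO_FEATURE_FPU	(DEMO_CPUID_1_EDX * 32 + 0)
#define DEMO_FEATURE_LA57	(DEMO_CPUID_7_ECX * 32 + 16)

/* Evaluates to 0, but refuses to compile when cond is false. */
#define DEMO_BUILD_CHECK(cond, msg)					\
	(0 * sizeof(struct { _Static_assert(cond, msg); char unused; }))

/* Feature bit for 'name', valid only while initializing 'word'. */
#define DEMO_F(word, name)						\
	((uint32_t)(DEMO_BUILD_CHECK(DEMO_FEATURE_##name / 32 == (word),	\
				     "feature is not in this CPUID word") +	\
		    (UINT32_C(1) << (DEMO_FEATURE_##name % 32))))

static uint32_t demo_7_ecx_mask(void)
{
	/* DEMO_F(DEMO_CPUID_7_ECX, FPU) here would be rejected at build time. */
	return DEMO_F(DEMO_CPUID_7_ECX, LA57);
}
```

Passing the word at every use is noisier than the patch's approach, which is the motivation for turning kvm_cpu_cap_init() into a macro that can declare the word once per invocation.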



* [PATCH v3 29/57] KVM: x86: Add a macro to init CPUID features that ignore host kernel support
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (27 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 28/57] KVM: x86: Harden CPU capabilities processing against out-of-scope features Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 30/57] KVM: x86: Add a macro to init CPUID features that KVM emulates in software Sean Christopherson
                   ` (29 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Add a macro for use in kvm_set_cpu_caps() to automagically initialize
features that KVM wants to support based solely on the CPU's capabilities,
e.g. KVM advertises LA57 support if it's available in hardware, even if
the host kernel isn't utilizing 57-bit virtual addresses.

Track features that are passed through to userspace (from hardware) in
a local variable, and simply OR them in *after* adjusting the capabilities
that came from boot_cpu_data.

Note, eliminating the open-coded call to cpuid_ecx() also fixes a largely
benign bug where KVM could incorrectly report LA57 support on Intel CPUs
whose max supported CPUID is less than 7, i.e. if the max supported leaf
(<7) happened to have bit 16 set.  In practice, barring a funky virtual
machine setup, the bug is benign as all known CPUs that support VMX also
support leaf 7.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c9a8513dbc30..9bf324aa5fae 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -610,12 +610,14 @@ static __always_inline u32 raw_cpuid_get(struct cpuid_reg cpuid)
 do {									\
 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
 	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
+	u32 kvm_cpu_cap_passthrough = 0;				\
 									\
 	if (leaf < NCAPINTS)						\
 		kvm_cpu_caps[leaf] &= (mask);				\
 	else								\
 		kvm_cpu_caps[leaf] = (mask);				\
 									\
+	kvm_cpu_caps[leaf] |= kvm_cpu_cap_passthrough;			\
 	kvm_cpu_caps[leaf] &= raw_cpuid_get(cpuid);			\
 } while (0)
 
@@ -652,6 +654,18 @@ do {									\
 	(IS_ENABLED(CONFIG_X86_64) ? F(name) : 0);		\
 })
 
+/*
+ * Passthrough Feature - For features that KVM supports based purely on raw
+ * hardware CPUID, i.e. that KVM virtualizes even if the host kernel doesn't
+ * use the feature.  Simply force set the feature in KVM's capabilities, raw
+ * CPUID support will be factored in by kvm_cpu_cap_init().
+ */
+#define PASSTHROUGH_F(name)					\
+({								\
+	kvm_cpu_cap_passthrough |= F(name);			\
+	F(name);						\
+})
+
 /*
  * Aliased Features - For features in 0x8000_0001.EDX that are duplicates of
  * identical 0x1.EDX features, and thus are aliased from 0x1 to 0x8000_0001.
@@ -777,7 +791,7 @@ void kvm_set_cpu_caps(void)
 
 	kvm_cpu_cap_init(CPUID_7_ECX,
 		F(AVX512VBMI) |
-		F(LA57) |
+		PASSTHROUGH_F(LA57) |
 		F(PKU) |
 		0 /*OSPKE*/ |
 		F(RDPID) |
@@ -796,9 +810,6 @@ void kvm_set_cpu_caps(void)
 		F(SGX_LC) |
 		F(BUS_LOCK_DETECT)
 	);
-	/* Set LA57 based on hardware capability. */
-	if (cpuid_ecx(7) & feature_bit(LA57))
-		kvm_cpu_cap_set(X86_FEATURE_LA57);
 
 	/*
 	 * PKU not yet implemented for shadow paging and requires OSPKE
@@ -1076,6 +1087,7 @@ EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 #undef F
 #undef SF
 #undef X86_64_F
+#undef PASSTHROUGH_F
 #undef ALIASED_1_EDX_F
 
 struct kvm_cpuid_array {
-- 
2.47.0.338.g60cca15819-goog
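
Editor's sketch of the LA57 corner case from the commit message, under the assumption that querying a basic leaf above the CPU's maximum returns the data of the highest supported leaf (the behaviour the commit message implies for Intel CPUs).  The demo_cpu model and both lookup helpers are hypothetical; the checked variant only illustrates the kind of range check that avoids the false positive.

```c
#include <stdint.h>

/* Hypothetical model of a CPU: ECX output for basic leaves 0..7. */
struct demo_cpu {
	uint32_t max_basic_leaf;	/* must be <= 7 in this model */
	uint32_t ecx[8];
};

/* Unchecked lookup: out-of-range leaves alias the highest supported leaf. */
static uint32_t demo_cpuid_ecx(const struct demo_cpu *cpu, uint32_t leaf)
{
	if (leaf > cpu->max_basic_leaf)
		leaf = cpu->max_basic_leaf;
	return cpu->ecx[leaf];
}

/* Checked lookup: an unsupported leaf reports no features at all. */
static uint32_t demo_cpuid_ecx_checked(const struct demo_cpu *cpu, uint32_t leaf)
{
	if (leaf > cpu->max_basic_leaf)
		return 0;
	return cpu->ecx[leaf];
}
```

With a modelled CPU whose maximum leaf is 4 and whose leaf 4 happens to have ECX bit 16 set, the unchecked lookup of leaf 7 reports the LA57 bit while the checked lookup does not.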



* [PATCH v3 30/57] KVM: x86: Add a macro to init CPUID features that KVM emulates in software
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (28 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 29/57] KVM: x86: Add a macro to init CPUID features that ignore host kernel support Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 31/57] KVM: x86: Swap incoming guest CPUID into vCPU before massaging in KVM_SET_CPUID2 Sean Christopherson
                   ` (28 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Now that kvm_cpu_cap_init() is a macro with its own scope, add EMULATED_F()
to OR-in features that KVM emulates in software, i.e. that don't depend on
the feature being available in hardware.  The contained scope
of kvm_cpu_cap_init() allows using a local variable to track the set of
emulated features, which in addition to avoiding confusing and/or
unnecessary variables, helps prevent misuse of EMULATED_F().

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 9bf324aa5fae..83b29c5a0498 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -611,6 +611,7 @@ do {									\
 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
 	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
 	u32 kvm_cpu_cap_passthrough = 0;				\
+	u32 kvm_cpu_cap_emulated = 0;					\
 									\
 	if (leaf < NCAPINTS)						\
 		kvm_cpu_caps[leaf] &= (mask);				\
@@ -619,6 +620,7 @@ do {									\
 									\
 	kvm_cpu_caps[leaf] |= kvm_cpu_cap_passthrough;			\
 	kvm_cpu_caps[leaf] &= raw_cpuid_get(cpuid);			\
+	kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;			\
 } while (0)
 
 /*
@@ -654,6 +656,16 @@ do {									\
 	(IS_ENABLED(CONFIG_X86_64) ? F(name) : 0);		\
 })
 
+/*
+ * Emulated Feature - For features that KVM emulates in software irrespective
+ * of host CPU/kernel support.
+ */
+#define EMULATED_F(name)					\
+({								\
+	kvm_cpu_cap_emulated |= F(name);			\
+	F(name);						\
+})
+
 /*
  * Passthrough Feature - For features that KVM supports based purely on raw
  * hardware CPUID, i.e. that KVM virtualizes even if the host kernel doesn't
@@ -715,7 +727,7 @@ void kvm_set_cpu_caps(void)
 		0 /* Reserved, DCA */ |
 		F(XMM4_1) |
 		F(XMM4_2) |
-		F(X2APIC) |
+		EMULATED_F(X2APIC) |
 		F(MOVBE) |
 		F(POPCNT) |
 		0 /* Reserved*/ |
@@ -726,8 +738,6 @@ void kvm_set_cpu_caps(void)
 		F(F16C) |
 		F(RDRAND)
 	);
-	/* KVM emulates x2apic in software irrespective of host support. */
-	kvm_cpu_cap_set(X86_FEATURE_X2APIC);
 
 	kvm_cpu_cap_init(CPUID_1_EDX,
 		F(FPU) |
@@ -761,6 +771,7 @@ void kvm_set_cpu_caps(void)
 
 	kvm_cpu_cap_init(CPUID_7_0_EBX,
 		F(FSGSBASE) |
+		EMULATED_F(TSC_ADJUST) |
 		F(SGX) |
 		F(BMI1) |
 		F(HLE) |
@@ -823,7 +834,7 @@ void kvm_set_cpu_caps(void)
 		F(AVX512_4FMAPS) |
 		F(SPEC_CTRL) |
 		F(SPEC_CTRL_SSBD) |
-		F(ARCH_CAPABILITIES) |
+		EMULATED_F(ARCH_CAPABILITIES) |
 		F(INTEL_STIBP) |
 		F(MD_CLEAR) |
 		F(AVX512_VP2INTERSECT) |
@@ -837,10 +848,6 @@ void kvm_set_cpu_caps(void)
 		F(FLUSH_L1D)
 	);
 
-	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
-	kvm_cpu_cap_set(X86_FEATURE_TSC_ADJUST);
-	kvm_cpu_cap_set(X86_FEATURE_ARCH_CAPABILITIES);
-
 	if (boot_cpu_has(X86_FEATURE_AMD_IBPB_RET) &&
 	    boot_cpu_has(X86_FEATURE_AMD_IBPB) &&
 	    boot_cpu_has(X86_FEATURE_AMD_IBRS))
@@ -1026,6 +1033,7 @@ void kvm_set_cpu_caps(void)
 		0 /* SmmPgCfgLock */ |
 		F(NULL_SEL_CLR_BASE) |
 		F(AUTOIBRS) |
+		EMULATED_F(NO_SMM_CTL_MSR) |
 		0 /* PrefetchCtlMsr */ |
 		F(WRMSR_XX_BASE_NS)
 	);
@@ -1052,7 +1060,6 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_LFENCE_RDTSC);
 	if (!static_cpu_has_bug(X86_BUG_NULL_SEG))
 		kvm_cpu_cap_set(X86_FEATURE_NULL_SEL_CLR_BASE);
-	kvm_cpu_cap_set(X86_FEATURE_NO_SMM_CTL_MSR);
 
 	kvm_cpu_cap_init(CPUID_C000_0001_EDX,
 		F(XSTORE) |
@@ -1087,6 +1094,7 @@ EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 #undef F
 #undef SF
 #undef X86_64_F
+#undef EMULATED_F
 #undef PASSTHROUGH_F
 #undef ALIASED_1_EDX_F
 
-- 
2.47.0.338.g60cca15819-goog
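
Editor's sketch of the order of operations that kvm_cpu_cap_init() ends up with for a kernel-defined word once passthrough and emulated features are both in play.  The function and parameter names are hypothetical; the point is only that passthrough bits are OR'd in before the raw-CPUID AND, and emulated bits after it.

```c
#include <stdint.h>

static uint32_t demo_cap_word(uint32_t kernel_caps, uint32_t mask,
			      uint32_t passthrough, uint32_t emulated,
			      uint32_t raw_cpuid)
{
	uint32_t caps = kernel_caps & mask;	/* what the host kernel enabled */

	caps |= passthrough;	/* ignore host kernel support... */
	caps &= raw_cpuid;	/* ...but still require hardware support */
	caps |= emulated;	/* software-emulated: no hardware requirement */

	return caps;
}
```

Swapping the last two steps would silently drop emulated features on hardware that lacks them, which is exactly the case the macro exists for.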



* [PATCH v3 31/57] KVM: x86: Swap incoming guest CPUID into vCPU before massaging in KVM_SET_CPUID2
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (29 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 30/57] KVM: x86: Add a macro to init CPUID features that KVM emulates in software Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:33 ` [PATCH v3 32/57] KVM: x86: Clear PV_UNHALT for !HLT-exiting only when userspace sets CPUID Sean Christopherson
                   ` (27 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

When handling KVM_SET_CPUID{,2}, swap the old and new CPUID arrays and
lengths before processing the new CPUID, and simply undo the swap if
setting the new CPUID fails for whatever reason.

To keep the diff reasonable, continue passing the entry array and length
to most helpers, and defer the more complete cleanup to future commits.

For any sane VMM, setting "bad" CPUID state is not a hot path (or even
something that is survivable), and setting guest CPUID before it's known
good will allow removing all of KVM's infrastructure for processing CPUID
entries directly (as opposed to operating on vcpu->arch.cpuid_entries).

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 54 ++++++++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 83b29c5a0498..e8c30de2faa9 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -121,10 +121,10 @@ static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
 	return NULL;
 }
 
-static int kvm_check_cpuid(struct kvm_vcpu *vcpu,
-			   struct kvm_cpuid_entry2 *entries,
-			   int nent)
+static int kvm_check_cpuid(struct kvm_vcpu *vcpu)
 {
+	struct kvm_cpuid_entry2 *entries = vcpu->arch.cpuid_entries;
+	int nent = vcpu->arch.cpuid_nent;
 	struct kvm_cpuid_entry2 *best;
 	u64 xfeatures;
 
@@ -157,9 +157,6 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu,
 	return fpu_enable_guest_xfd_features(&vcpu->arch.guest_fpu, xfeatures);
 }
 
-static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
-				       int nent);
-
 /* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */
 static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 				 int nent)
@@ -175,8 +172,10 @@ static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2
 	 * CPUID processing is functionally correct only because any change in
 	 * CPUID is disallowed, i.e. using stale data is ok because the below
 	 * checks will reject the change.
+	 *
+	 * Note!  @e2 and @nent track the _old_ CPUID entries!
 	 */
-	__kvm_update_cpuid_runtime(vcpu, e2, nent);
+	kvm_update_cpuid_runtime(vcpu);
 
 	if (nent != vcpu->arch.cpuid_nent)
 		return -EINVAL;
@@ -329,9 +328,11 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_update_cpuid_runtime);
 
-static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent)
+static bool kvm_cpuid_has_hyperv(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_KVM_HYPERV
+	struct kvm_cpuid_entry2 *entries = vcpu->arch.cpuid_entries;
+	int nent = vcpu->arch.cpuid_nent;
 	struct kvm_cpuid_entry2 *entry;
 
 	entry = cpuid_entry2_find(entries, nent, HYPERV_CPUID_INTERFACE,
@@ -408,8 +409,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 					 __cr4_reserved_bits(guest_cpuid_has, vcpu);
 #undef __kvm_cpu_cap_has
 
-	kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu->arch.cpuid_entries,
-						    vcpu->arch.cpuid_nent));
+	kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu));
 
 	/* Invoke the vendor callback only after the above state is updated. */
 	kvm_x86_call(vcpu_after_set_cpuid)(vcpu);
@@ -450,6 +450,15 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 {
 	int r;
 
+	/*
+	 * Swap the existing (old) entries with the incoming (new) entries in
+	 * order to massage the new entries, e.g. to account for dynamic bits
+	 * that KVM controls, without clobbering the current guest CPUID, which
+	 * KVM needs to preserve in order to unwind on failure.
+	 */
+	swap(vcpu->arch.cpuid_entries, e2);
+	swap(vcpu->arch.cpuid_nent, nent);
+
 	/*
 	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
 	 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
@@ -464,27 +473,21 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 	if (kvm_vcpu_has_run(vcpu)) {
 		r = kvm_cpuid_check_equal(vcpu, e2, nent);
 		if (r)
-			return r;
-
-		kvfree(e2);
-		return 0;
+			goto err;
+		goto success;
 	}
 
 #ifdef CONFIG_KVM_HYPERV
-	if (kvm_cpuid_has_hyperv(e2, nent)) {
+	if (kvm_cpuid_has_hyperv(vcpu)) {
 		r = kvm_hv_vcpu_init(vcpu);
 		if (r)
-			return r;
+			goto err;
 	}
 #endif
 
-	r = kvm_check_cpuid(vcpu, e2, nent);
+	r = kvm_check_cpuid(vcpu);
 	if (r)
-		return r;
-
-	kvfree(vcpu->arch.cpuid_entries);
-	vcpu->arch.cpuid_entries = e2;
-	vcpu->arch.cpuid_nent = nent;
+		goto err;
 
 	vcpu->arch.kvm_cpuid = kvm_get_hypervisor_cpuid(vcpu, KVM_SIGNATURE);
 #ifdef CONFIG_KVM_XEN
@@ -492,7 +495,14 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 #endif
 	kvm_vcpu_after_set_cpuid(vcpu);
 
+success:
+	kvfree(e2);
 	return 0;
+
+err:
+	swap(vcpu->arch.cpuid_entries, e2);
+	swap(vcpu->arch.cpuid_nent, nent);
+	return r;
 }
 
 /* when an old userspace process fills a new kernel module */
-- 
2.47.0.338.g60cca15819-goog
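
Editor's sketch of the swap-then-unwind pattern in userspace C.  The demo_vcpu structure, the validation rule (reject negative entries), and the ownership convention (the caller keeps the new array on failure) are all assumptions made for illustration.

```c
#include <stdlib.h>

struct demo_vcpu {
	int *entries;
	int nent;
};

#define DEMO_SWAP(a, b, type) \
	do { type tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical validation that operates on the vCPU's installed entries. */
static int demo_check(const struct demo_vcpu *v)
{
	for (int i = 0; i < v->nent; i++) {
		if (v->entries[i] < 0)
			return -1;
	}
	return 0;
}

/* On success the old array is freed; on failure the caller still owns @e2. */
static int demo_set_entries(struct demo_vcpu *v, int *e2, int nent)
{
	int r;

	/* Install the new entries first; e2/nent now track the old ones. */
	DEMO_SWAP(v->entries, e2, int *);
	DEMO_SWAP(v->nent, nent, int);

	r = demo_check(v);
	if (r)
		goto err;

	free(e2);
	return 0;

err:
	/* Undo the swap so the vCPU keeps its previous, known-good state. */
	DEMO_SWAP(v->entries, e2, int *);
	DEMO_SWAP(v->nent, nent, int);
	return r;
}
```

The benefit mirrored here is that demo_check() needs only the vCPU pointer, since the data it validates is already installed; the cost is a second pair of swaps on the failure path.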



* [PATCH v3 32/57] KVM: x86: Clear PV_UNHALT for !HLT-exiting only when userspace sets CPUID
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (30 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 31/57] KVM: x86: Swap incoming guest CPUID into vCPU before massaging in KVM_SET_CPUID2 Sean Christopherson
@ 2024-11-28  1:33 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 33/57] KVM: x86: Remove unnecessary caching of KVM's PV CPUID base Sean Christopherson
                   ` (26 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:33 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Now that KVM disallows disabling HLT-exiting after vCPUs have been created,
i.e. now that it's impossible for kvm_hlt_in_guest() to change while vCPUs
are running, apply KVM's PV_UNHALT quirk only when userspace is setting
guest CPUID.

Opportunistically rename the helper to make it clear that KVM's behavior
is a quirk that should never have been added.  KVM's documentation
explicitly states that userspace should not advertise PV_UNHALT if
HLT-exiting is disabled, but for unknown reasons, commit caa057a2cad6
("KVM: X86: Provide a capability to disable HLT intercepts") didn't stop
at documenting the requirement and also massaged the incoming guest CPUID.

Unfortunately, it's quite likely that userspace has come to rely on KVM's
behavior, i.e. the code can't simply be deleted.  The only reason KVM
doesn't have an "official" quirk is that there is no known use case where
disabling the quirk would make sense, i.e. letting userspace disable the
quirk would further increase KVM's burden without any benefit.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 28 +++++++++++-----------------
 1 file changed, 11 insertions(+), 17 deletions(-)
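
Not part of the patch: a standalone sketch of the quirk described in the
changelog, for anyone who wants to poke at the bit math in userspace.  The
helper name and the "pointer to EAX or NULL" calling convention are
assumptions made for the demo; only the PV_UNHALT bit position (7) comes from
KVM's uapi.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit position of PV_UNHALT in EAX of KVM's PV features leaf. */
#define DEMO_FEATURE_PV_UNHALT 7

/*
 * Hypothetical helper: @features_eax points at EAX of the PV features leaf,
 * or is NULL if the guest CPUID has no such leaf.  Returns the (possibly
 * adjusted) feature bitmap, or 0 when the leaf is absent.
 */
static uint32_t demo_apply_pv_unhalt_quirk(uint32_t *features_eax,
					   bool hlt_in_guest)
{
	if (!features_eax)
		return 0;

	/* PV_UNHALT is cleared when HLT does not exit to the host. */
	if (hlt_in_guest)
		*features_eax &= ~(UINT32_C(1) << DEMO_FEATURE_PV_UNHALT);

	return *features_eax;
}
```

Returning the adjusted value lets a single call both massage the entry and
produce the value a caller may want to cache, which is the shape the patch
gives kvm_apply_cpuid_pv_features_quirk().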

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index e8c30de2faa9..3ba0e6a67823 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -157,6 +157,8 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu)
 	return fpu_enable_guest_xfd_features(&vcpu->arch.guest_fpu, xfeatures);
 }
 
+static u32 kvm_apply_cpuid_pv_features_quirk(struct kvm_vcpu *vcpu);
+
 /* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */
 static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 				 int nent)
@@ -176,6 +178,7 @@ static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2
 	 * Note!  @e2 and @nent track the _old_ CPUID entries!
 	 */
 	kvm_update_cpuid_runtime(vcpu);
+	kvm_apply_cpuid_pv_features_quirk(vcpu);
 
 	if (nent != vcpu->arch.cpuid_nent)
 		return -EINVAL;
@@ -246,18 +249,17 @@ static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcp
 					     vcpu->arch.cpuid_nent, base);
 }
 
-static void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
+static u32 kvm_apply_cpuid_pv_features_quirk(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best = kvm_find_kvm_cpuid_features(vcpu);
 
-	vcpu->arch.pv_cpuid.features = 0;
+	if (!best)
+		return 0;
 
-	/*
-	 * save the feature bitmap to avoid cpuid lookup for every PV
-	 * operation
-	 */
-	if (best)
-		vcpu->arch.pv_cpuid.features = best->eax;
+	if (kvm_hlt_in_guest(vcpu->kvm))
+		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
+
+	return best->eax;
 }
 
 /*
@@ -279,7 +281,6 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
 				       int nent)
 {
 	struct kvm_cpuid_entry2 *best;
-	struct kvm_hypervisor_cpuid kvm_cpuid;
 
 	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
 	if (best) {
@@ -306,13 +307,6 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
 		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
 
-	kvm_cpuid = __kvm_get_hypervisor_cpuid(entries, nent, KVM_SIGNATURE);
-	if (kvm_cpuid.base) {
-		best = __kvm_find_kvm_cpuid_features(entries, nent, kvm_cpuid.base);
-		if (kvm_hlt_in_guest(vcpu->kvm) && best)
-			best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
-	}
-
 	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
 		best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
 		if (best)
@@ -396,7 +390,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	vcpu->arch.guest_supported_xcr0 =
 		cpuid_get_supported_xcr0(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent);
 
-	kvm_update_pv_runtime(vcpu);
+	vcpu->arch.pv_cpuid.features = kvm_apply_cpuid_pv_features_quirk(vcpu);
 
 	vcpu->arch.is_amd_compatible = guest_cpuid_is_amd_or_hygon(vcpu);
 	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 33/57] KVM: x86: Remove unnecessary caching of KVM's PV CPUID base
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (31 preceding siblings ...)
  2024-11-28  1:33 ` [PATCH v3 32/57] KVM: x86: Clear PV_UNHALT for !HLT-exiting only when userspace sets CPUID Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 34/57] KVM: x86: Always operate on kvm_vcpu data in cpuid_entry2_find() Sean Christopherson
                   ` (25 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Now that KVM only searches for KVM's PV CPUID base when userspace sets
guest CPUID, drop the cache and simply do the search every time.

Practically speaking, this is a nop except for situations where userspace
sets CPUID _after_ running the vCPU, which is anything but a hot path,
e.g. QEMU does so only when hotplugging a vCPU.  And on the flip side,
caching guest CPUID information, especially information that is used to
query/modify _other_ CPUID state, is inherently dangerous as it's all too
easy to use stale information, i.e. KVM should only cache CPUID state when
the performance and/or programming benefits justify it.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/cpuid.c            | 34 ++++++++-------------------------
 2 files changed, 8 insertions(+), 27 deletions(-)
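
Not part of the patch: a rough userspace sketch of what "do the search every
time" amounts to.  struct demo_entry and the demo_* helpers are hypothetical;
the 0x40000000..0x40010000 scan range with a 0x100 stride is my understanding
of for_each_possible_hypervisor_cpuid_base() and should be treated as an
assumption, as should reusing the signature and features-leaf constants here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_KVM_SIGNATURE	"KVMKVMKVM\0\0\0"
#define DEMO_CPUID_FEATURES	0x40000001u

/* Hypothetical, trimmed-down CPUID entry. */
struct demo_entry {
	uint32_t function;
	uint32_t eax, ebx, ecx, edx;
};

static struct demo_entry *demo_find(struct demo_entry *e, int nent,
				    uint32_t function)
{
	for (int i = 0; i < nent; i++)
		if (e[i].function == function)
			return &e[i];
	return NULL;
}

/* Return the base leaf carrying the 12-byte signature @sig, or 0 if none. */
static uint32_t demo_find_base(struct demo_entry *e, int nent, const char *sig)
{
	for (uint32_t base = 0x40000000u; base < 0x40010000u; base += 0x100) {
		struct demo_entry *entry = demo_find(e, nent, base);
		uint32_t signature[3];

		if (!entry)
			continue;

		signature[0] = entry->ebx;
		signature[1] = entry->ecx;
		signature[2] = entry->edx;
		if (!memcmp(signature, sig, sizeof(signature)))
			return base;
	}
	return 0;
}

/* Uncached lookup: locate the base first, then the features leaf. */
static struct demo_entry *demo_find_kvm_features(struct demo_entry *e, int nent)
{
	uint32_t base = demo_find_base(e, nent, DEMO_KVM_SIGNATURE);

	if (!base)
		return NULL;

	return demo_find(e, nent, base | DEMO_CPUID_FEATURES);
}
```

The scan is a short linear walk, which is why dropping the cache costs little
outside of hot paths.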

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e159e44a6a1b..f076df9f18be 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -854,7 +854,6 @@ struct kvm_vcpu_arch {
 
 	int cpuid_nent;
 	struct kvm_cpuid_entry2 *cpuid_entries;
-	struct kvm_hypervisor_cpuid kvm_cpuid;
 	bool is_amd_compatible;
 
 	/*
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 3ba0e6a67823..b402b9f59cbb 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -168,12 +168,7 @@ static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2
 
 	/*
 	 * Apply runtime CPUID updates to the incoming CPUID entries to avoid
-	 * false positives due mismatches on KVM-owned feature flags.  Note,
-	 * runtime CPUID updates may consume other CPUID-driven vCPU state,
-	 * e.g. KVM or Xen CPUID bases.  Updating runtime state before full
-	 * CPUID processing is functionally correct only because any change in
-	 * CPUID is disallowed, i.e. using stale data is ok because the below
-	 * checks will reject the change.
+	 * false positives due to mismatches on KVM-owned feature flags.
 	 *
 	 * Note!  @e2 and @nent track the _old_ CPUID entries!
 	 */
@@ -231,28 +226,16 @@ static struct kvm_hypervisor_cpuid kvm_get_hypervisor_cpuid(struct kvm_vcpu *vcp
 					  vcpu->arch.cpuid_nent, sig);
 }
 
-static struct kvm_cpuid_entry2 *__kvm_find_kvm_cpuid_features(struct kvm_cpuid_entry2 *entries,
-							      int nent, u32 kvm_cpuid_base)
-{
-	return cpuid_entry2_find(entries, nent, kvm_cpuid_base | KVM_CPUID_FEATURES,
-				 KVM_CPUID_INDEX_NOT_SIGNIFICANT);
-}
-
-static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcpu)
-{
-	u32 base = vcpu->arch.kvm_cpuid.base;
-
-	if (!base)
-		return NULL;
-
-	return __kvm_find_kvm_cpuid_features(vcpu->arch.cpuid_entries,
-					     vcpu->arch.cpuid_nent, base);
-}
-
 static u32 kvm_apply_cpuid_pv_features_quirk(struct kvm_vcpu *vcpu)
 {
-	struct kvm_cpuid_entry2 *best = kvm_find_kvm_cpuid_features(vcpu);
+	struct kvm_hypervisor_cpuid kvm_cpuid;
+	struct kvm_cpuid_entry2 *best;
 
+	kvm_cpuid = kvm_get_hypervisor_cpuid(vcpu, KVM_SIGNATURE);
+	if (!kvm_cpuid.base)
+		return 0;
+
+	best = kvm_find_cpuid_entry(vcpu, kvm_cpuid.base | KVM_CPUID_FEATURES);
 	if (!best)
 		return 0;
 
@@ -483,7 +466,6 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 	if (r)
 		goto err;
 
-	vcpu->arch.kvm_cpuid = kvm_get_hypervisor_cpuid(vcpu, KVM_SIGNATURE);
 #ifdef CONFIG_KVM_XEN
 	vcpu->arch.xen.cpuid = kvm_get_hypervisor_cpuid(vcpu, XEN_SIGNATURE);
 #endif
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 34/57] KVM: x86: Always operate on kvm_vcpu data in cpuid_entry2_find()
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (32 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 33/57] KVM: x86: Remove unnecessary caching of KVM's PV CPUID base Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 35/57] KVM: x86: Move kvm_find_cpuid_entry{,_index}() up near cpuid_entry2_find() Sean Christopherson
                   ` (24 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Now that KVM sets vcpu->arch.cpuid_{entries,nent} before processing the
incoming CPUID entries during KVM_SET_CPUID{,2}, drop the @entries and
@nent params from cpuid_entry2_find() and unconditionally operate on the
vCPU state.

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 62 +++++++++++++++-----------------------------
 1 file changed, 21 insertions(+), 41 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b402b9f59cbb..af5c66408c78 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -70,8 +70,8 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
  */
 #define KVM_CPUID_INDEX_NOT_SIGNIFICANT -1ull
 
-static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
-	struct kvm_cpuid_entry2 *entries, int nent, u32 function, u64 index)
+static struct kvm_cpuid_entry2 *cpuid_entry2_find(struct kvm_vcpu *vcpu,
+						  u32 function, u64 index)
 {
 	struct kvm_cpuid_entry2 *e;
 	int i;
@@ -88,8 +88,8 @@ static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
 	 */
 	lockdep_assert_irqs_enabled();
 
-	for (i = 0; i < nent; i++) {
-		e = &entries[i];
+	for (i = 0; i < vcpu->arch.cpuid_nent; i++) {
+		e = &vcpu->arch.cpuid_entries[i];
 
 		if (e->function != function)
 			continue;
@@ -123,8 +123,6 @@ static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
 
 static int kvm_check_cpuid(struct kvm_vcpu *vcpu)
 {
-	struct kvm_cpuid_entry2 *entries = vcpu->arch.cpuid_entries;
-	int nent = vcpu->arch.cpuid_nent;
 	struct kvm_cpuid_entry2 *best;
 	u64 xfeatures;
 
@@ -132,7 +130,7 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu)
 	 * The existing code assumes virtual address is 48-bit or 57-bit in the
 	 * canonical address checks; exit if it is ever changed.
 	 */
-	best = cpuid_entry2_find(entries, nent, 0x80000008,
+	best = cpuid_entry2_find(vcpu, 0x80000008,
 				 KVM_CPUID_INDEX_NOT_SIGNIFICANT);
 	if (best) {
 		int vaddr_bits = (best->eax & 0xff00) >> 8;
@@ -145,7 +143,7 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu)
 	 * Exposing dynamic xfeatures to the guest requires additional
 	 * enabling in the FPU, e.g. to expand the guest XSAVE state size.
 	 */
-	best = cpuid_entry2_find(entries, nent, 0xd, 0);
+	best = cpuid_entry2_find(vcpu, 0xd, 0);
 	if (!best)
 		return 0;
 
@@ -191,15 +189,15 @@ static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2
 	return 0;
 }
 
-static struct kvm_hypervisor_cpuid __kvm_get_hypervisor_cpuid(struct kvm_cpuid_entry2 *entries,
-							      int nent, const char *sig)
+static struct kvm_hypervisor_cpuid kvm_get_hypervisor_cpuid(struct kvm_vcpu *vcpu,
+							    const char *sig)
 {
 	struct kvm_hypervisor_cpuid cpuid = {};
 	struct kvm_cpuid_entry2 *entry;
 	u32 base;
 
 	for_each_possible_hypervisor_cpuid_base(base) {
-		entry = cpuid_entry2_find(entries, nent, base, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+		entry = cpuid_entry2_find(vcpu, base, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
 
 		if (entry) {
 			u32 signature[3];
@@ -219,13 +217,6 @@ static struct kvm_hypervisor_cpuid __kvm_get_hypervisor_cpuid(struct kvm_cpuid_e
 	return cpuid;
 }
 
-static struct kvm_hypervisor_cpuid kvm_get_hypervisor_cpuid(struct kvm_vcpu *vcpu,
-							    const char *sig)
-{
-	return __kvm_get_hypervisor_cpuid(vcpu->arch.cpuid_entries,
-					  vcpu->arch.cpuid_nent, sig);
-}
-
 static u32 kvm_apply_cpuid_pv_features_quirk(struct kvm_vcpu *vcpu)
 {
 	struct kvm_hypervisor_cpuid kvm_cpuid;
@@ -249,23 +240,22 @@ static u32 kvm_apply_cpuid_pv_features_quirk(struct kvm_vcpu *vcpu)
  * Calculate guest's supported XCR0 taking into account guest CPUID data and
  * KVM's supported XCR0 (comprised of host's XCR0 and KVM_SUPPORTED_XCR0).
  */
-static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)
+static u64 cpuid_get_supported_xcr0(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
 
-	best = cpuid_entry2_find(entries, nent, 0xd, 0);
+	best = cpuid_entry2_find(vcpu, 0xd, 0);
 	if (!best)
 		return 0;
 
 	return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
 }
 
-static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
-				       int nent)
+void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
 
-	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+	best = cpuid_entry2_find(vcpu, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
 	if (best) {
 		/* Update OSXSAVE bit */
 		if (boot_cpu_has(X86_FEATURE_XSAVE))
@@ -276,43 +266,36 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
 			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
 	}
 
-	best = cpuid_entry2_find(entries, nent, 7, 0);
+	best = cpuid_entry2_find(vcpu, 7, 0);
 	if (best && boot_cpu_has(X86_FEATURE_PKU) && best->function == 0x7)
 		cpuid_entry_change(best, X86_FEATURE_OSPKE,
 				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
 
-	best = cpuid_entry2_find(entries, nent, 0xD, 0);
+	best = cpuid_entry2_find(vcpu, 0xD, 0);
 	if (best)
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, false);
 
-	best = cpuid_entry2_find(entries, nent, 0xD, 1);
+	best = cpuid_entry2_find(vcpu, 0xD, 1);
 	if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
 		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
 
 	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
-		best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+		best = cpuid_entry2_find(vcpu, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
 		if (best)
 			cpuid_entry_change(best, X86_FEATURE_MWAIT,
 					   vcpu->arch.ia32_misc_enable_msr &
 					   MSR_IA32_MISC_ENABLE_MWAIT);
 	}
 }
-
-void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
-{
-	__kvm_update_cpuid_runtime(vcpu, vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent);
-}
 EXPORT_SYMBOL_GPL(kvm_update_cpuid_runtime);
 
 static bool kvm_cpuid_has_hyperv(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_KVM_HYPERV
-	struct kvm_cpuid_entry2 *entries = vcpu->arch.cpuid_entries;
-	int nent = vcpu->arch.cpuid_nent;
 	struct kvm_cpuid_entry2 *entry;
 
-	entry = cpuid_entry2_find(entries, nent, HYPERV_CPUID_INTERFACE,
+	entry = cpuid_entry2_find(vcpu, HYPERV_CPUID_INTERFACE,
 				  KVM_CPUID_INDEX_NOT_SIGNIFICANT);
 	return entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX;
 #else
@@ -370,8 +353,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		kvm_apic_set_version(vcpu);
 	}
 
-	vcpu->arch.guest_supported_xcr0 =
-		cpuid_get_supported_xcr0(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent);
+	vcpu->arch.guest_supported_xcr0 = cpuid_get_supported_xcr0(vcpu);
 
 	vcpu->arch.pv_cpuid.features = kvm_apply_cpuid_pv_features_quirk(vcpu);
 
@@ -1756,16 +1738,14 @@ int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
 struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu,
 						    u32 function, u32 index)
 {
-	return cpuid_entry2_find(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent,
-				 function, index);
+	return cpuid_entry2_find(vcpu, function, index);
 }
 EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry_index);
 
 struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
 					      u32 function)
 {
-	return cpuid_entry2_find(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent,
-				 function, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+	return cpuid_entry2_find(vcpu, function, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
 }
 EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry);
 
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 35/57] KVM: x86: Move kvm_find_cpuid_entry{,_index}() up near cpuid_entry2_find()
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (33 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 34/57] KVM: x86: Always operate on kvm_vcpu data in cpuid_entry2_find() Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 36/57] KVM: x86: Remove all direct usage of cpuid_entry2_find() Sean Christopherson
                   ` (23 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Move kvm_find_cpuid_entry{,_index}() "up" in cpuid.c so that they are
colocated with cpuid_entry2_find(), e.g. to make it easier to see the
effective guts of the helpers without having to bounce around cpuid.c.

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index af5c66408c78..fb9c105714e9 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -121,6 +121,20 @@ static struct kvm_cpuid_entry2 *cpuid_entry2_find(struct kvm_vcpu *vcpu,
 	return NULL;
 }
 
+struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu,
+						    u32 function, u32 index)
+{
+	return cpuid_entry2_find(vcpu, function, index);
+}
+EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry_index);
+
+struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
+					      u32 function)
+{
+	return cpuid_entry2_find(vcpu, function, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+}
+EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry);
+
 static int kvm_check_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
@@ -1735,20 +1749,6 @@ int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
 	return r;
 }
 
-struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu,
-						    u32 function, u32 index)
-{
-	return cpuid_entry2_find(vcpu, function, index);
-}
-EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry_index);
-
-struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
-					      u32 function)
-{
-	return cpuid_entry2_find(vcpu, function, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
-}
-EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry);
-
 /*
  * Intel CPUID semantics treats any query for an out-of-range leaf as if the
  * highest basic leaf (i.e. CPUID.0H:EAX) were requested.  AMD CPUID semantics
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 36/57] KVM: x86: Remove all direct usage of cpuid_entry2_find()
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (34 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 35/57] KVM: x86: Move kvm_find_cpuid_entry{,_index}() up near cpuid_entry2_find() Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 37/57] KVM: x86: Advertise TSC_DEADLINE_TIMER in KVM_GET_SUPPORTED_CPUID Sean Christopherson
                   ` (22 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Convert all uses of cpuid_entry2_find() to kvm_find_cpuid_entry{,_index}()
now that cpuid_entry2_find() operates on the vCPU state, i.e. now that
there is no need to use cpuid_entry2_find() directly in order to pass in
non-vCPU state.

To help prevent unwanted usage of cpuid_entry2_find(), #undef
KVM_CPUID_INDEX_NOT_SIGNIFICANT, i.e. force KVM to use
kvm_find_cpuid_entry().

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)
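
Not part of the patch: a self-contained sketch of the "#undef to fence off an
internal sentinel" trick from the changelog.  All demo_* names are
hypothetical, and the matching rule is a simplification of what
cpuid_entry2_find() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down entry and vCPU types. */
struct demo_entry {
	uint32_t function;
	uint32_t index;
	int index_significant;
};

struct demo_vcpu {
	struct demo_entry *entries;
	int nent;
};

/* Sentinel that cannot collide with a real 32-bit index. */
#define DEMO_INDEX_NOT_SIGNIFICANT (~0ull)

static struct demo_entry *demo_entry_find(struct demo_vcpu *vcpu,
					  uint32_t function, uint64_t index)
{
	assert(vcpu);

	for (int i = 0; i < vcpu->nent; i++) {
		struct demo_entry *e = &vcpu->entries[i];

		if (e->function != function)
			continue;

		/* Match any index if the entry or the caller doesn't care. */
		if (!e->index_significant || e->index == index ||
		    index == DEMO_INDEX_NOT_SIGNIFICANT)
			return e;
	}
	return NULL;
}

static struct demo_entry *demo_find_entry_index(struct demo_vcpu *vcpu,
						uint32_t function,
						uint32_t index)
{
	return demo_entry_find(vcpu, function, index);
}

static struct demo_entry *demo_find_entry(struct demo_vcpu *vcpu,
					  uint32_t function)
{
	return demo_entry_find(vcpu, function, DEMO_INDEX_NOT_SIGNIFICANT);
}

/*
 * From here on the sentinel no longer exists, so any later code that tries
 * to pass it to demo_entry_find() directly fails to compile.
 */
#undef DEMO_INDEX_NOT_SIGNIFICANT
```

The low-level finder is still callable with an explicit index after the
#undef, so the guard only nudges new code toward the wrappers; it does not
make misuse impossible.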

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index fb9c105714e9..150d397345d5 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -135,6 +135,12 @@ struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
 }
 EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry);
 
+/*
+ * cpuid_entry2_find() and KVM_CPUID_INDEX_NOT_SIGNIFICANT should never be used
+ * directly outside of kvm_find_cpuid_entry() and kvm_find_cpuid_entry_index().
+ */
+#undef KVM_CPUID_INDEX_NOT_SIGNIFICANT
+
 static int kvm_check_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
@@ -144,8 +150,7 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu)
 	 * The existing code assumes virtual address is 48-bit or 57-bit in the
 	 * canonical address checks; exit if it is ever changed.
 	 */
-	best = cpuid_entry2_find(vcpu, 0x80000008,
-				 KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+	best = kvm_find_cpuid_entry(vcpu, 0x80000008);
 	if (best) {
 		int vaddr_bits = (best->eax & 0xff00) >> 8;
 
@@ -157,7 +162,7 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu)
 	 * Exposing dynamic xfeatures to the guest requires additional
 	 * enabling in the FPU, e.g. to expand the guest XSAVE state size.
 	 */
-	best = cpuid_entry2_find(vcpu, 0xd, 0);
+	best = kvm_find_cpuid_entry_index(vcpu, 0xd, 0);
 	if (!best)
 		return 0;
 
@@ -211,7 +216,7 @@ static struct kvm_hypervisor_cpuid kvm_get_hypervisor_cpuid(struct kvm_vcpu *vcp
 	u32 base;
 
 	for_each_possible_hypervisor_cpuid_base(base) {
-		entry = cpuid_entry2_find(vcpu, base, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+		entry = kvm_find_cpuid_entry(vcpu, base);
 
 		if (entry) {
 			u32 signature[3];
@@ -258,7 +263,7 @@ static u64 cpuid_get_supported_xcr0(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
 
-	best = cpuid_entry2_find(vcpu, 0xd, 0);
+	best = kvm_find_cpuid_entry_index(vcpu, 0xd, 0);
 	if (!best)
 		return 0;
 
@@ -269,7 +274,7 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
 
-	best = cpuid_entry2_find(vcpu, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+	best = kvm_find_cpuid_entry(vcpu, 1);
 	if (best) {
 		/* Update OSXSAVE bit */
 		if (boot_cpu_has(X86_FEATURE_XSAVE))
@@ -280,22 +285,22 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
 	}
 
-	best = cpuid_entry2_find(vcpu, 7, 0);
+	best = kvm_find_cpuid_entry_index(vcpu, 7, 0);
 	if (best && boot_cpu_has(X86_FEATURE_PKU) && best->function == 0x7)
 		cpuid_entry_change(best, X86_FEATURE_OSPKE,
 				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
 
-	best = cpuid_entry2_find(vcpu, 0xD, 0);
+	best = kvm_find_cpuid_entry_index(vcpu, 0xD, 0);
 	if (best)
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, false);
 
-	best = cpuid_entry2_find(vcpu, 0xD, 1);
+	best = kvm_find_cpuid_entry_index(vcpu, 0xD, 1);
 	if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
 		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
 
 	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
-		best = cpuid_entry2_find(vcpu, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+		best = kvm_find_cpuid_entry(vcpu, 0x1);
 		if (best)
 			cpuid_entry_change(best, X86_FEATURE_MWAIT,
 					   vcpu->arch.ia32_misc_enable_msr &
@@ -309,8 +314,7 @@ static bool kvm_cpuid_has_hyperv(struct kvm_vcpu *vcpu)
 #ifdef CONFIG_KVM_HYPERV
 	struct kvm_cpuid_entry2 *entry;
 
-	entry = cpuid_entry2_find(vcpu, HYPERV_CPUID_INTERFACE,
-				  KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_INTERFACE);
 	return entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX;
 #else
 	return false;
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 37/57] KVM: x86: Advertise TSC_DEADLINE_TIMER in KVM_GET_SUPPORTED_CPUID
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (35 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 36/57] KVM: x86: Remove all direct usage of cpuid_entry2_find() Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 38/57] KVM: x86: Advertise HYPERVISOR " Sean Christopherson
                   ` (21 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Unconditionally advertise TSC_DEADLINE_TIMER via KVM_GET_SUPPORTED_CPUID,
as KVM always emulates deadline mode, *if* the VM has an in-kernel local
APIC.  The odds of a VMM emulating the local APIC in userspace, not
emulating the TSC deadline timer, _and_ reflecting
KVM_GET_SUPPORTED_CPUID back into KVM_SET_CPUID2, i.e. the risk of
over-advertising and breaking any setups, is extremely low.

KVM has _unconditionally_ advertised X2APIC via CPUID since commit
0d1de2d901f4 ("KVM: Always report x2apic as supported feature"), and it
is completely impossible for userspace to emulate X2APIC as KVM doesn't
support forwarding the MSR accesses to userspace.  I.e. KVM has relied on
userspace VMMs to not misreport local APIC capabilities for nearly 13
years.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/api.rst | 9 ++++++---
 arch/x86/kvm/cpuid.c           | 2 +-
 2 files changed, 7 insertions(+), 4 deletions(-)
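
Not part of the patch: a sketch of how a VMM might post-process CPUID.1:ECX
from KVM_GET_SUPPORTED_CPUID under the rules described in the changelog and
the api.rst hunk.  The helper and its parameters are hypothetical; only the
bit positions (x2APIC = ecx[21], TSC deadline timer = ecx[24]) are taken from
the documentation text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CPUID_1_ECX_X2APIC		(UINT32_C(1) << 21)
#define DEMO_CPUID_1_ECX_TSC_DEADLINE	(UINT32_C(1) << 24)

/*
 * Hypothetical VMM-side filter, applied to leaf 1 ECX before the value is
 * handed to KVM_SET_CPUID2.
 *
 * @in_kernel_lapic:        the VMM used KVM_CREATE_IRQCHIP
 * @userspace_tsc_deadline: the VMM emulates the deadline timer itself
 */
static void demo_filter_leaf1_ecx(uint32_t *ecx, bool in_kernel_lapic,
				  bool userspace_tsc_deadline)
{
	assert(ecx);

	if (in_kernel_lapic)
		return;

	/* x2APIC cannot be emulated in userspace, so never advertise it. */
	*ecx &= ~DEMO_CPUID_1_ECX_X2APIC;

	/* The deadline timer is only valid if userspace emulates it. */
	if (!userspace_tsc_deadline)
		*ecx &= ~DEMO_CPUID_1_ECX_TSC_DEADLINE;
}
```

A VMM that always uses the in-kernel local APIC needs no such filter, which
is the common case the changelog is counting on.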

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index bbe445e6c113..61bf1f693e2d 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1825,15 +1825,18 @@ emulate them efficiently. The fields in each entry are defined as follows:
          the values returned by the cpuid instruction for
          this function/index combination
 
-The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
-as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
-support.  Instead it is reported via::
+x2APIC (CPUID leaf 1, ecx[21]) and TSC deadline timer (CPUID leaf 1, ecx[24])
+may be returned as true, but they depend on KVM_CREATE_IRQCHIP for in-kernel
+emulation of the local APIC.  TSC deadline timer support is also reported via::
 
   ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)
 
 if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
 feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
 
+Enabling x2APIC in KVM_SET_CPUID2 requires KVM_CREATE_IRQCHIP as KVM doesn't
+support forwarding x2APIC MSR accesses to userspace, i.e. KVM does not support
+emulating x2APIC in userspace.
 
 4.47 KVM_PPC_GET_PVINFO
 -----------------------
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 150d397345d5..51792cf48cd7 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -716,7 +716,7 @@ void kvm_set_cpu_caps(void)
 		EMULATED_F(X2APIC) |
 		F(MOVBE) |
 		F(POPCNT) |
-		0 /* Reserved*/ |
+		EMULATED_F(TSC_DEADLINE_TIMER) |
 		F(AES) |
 		F(XSAVE) |
 		0 /* OSXSAVE */ |
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 38/57] KVM: x86: Advertise HYPERVISOR in KVM_GET_SUPPORTED_CPUID
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (36 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 37/57] KVM: x86: Advertise TSC_DEADLINE_TIMER in KVM_GET_SUPPORTED_CPUID Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 39/57] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
                   ` (20 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Unconditionally advertise "support" for the HYPERVISOR feature in CPUID,
as the flag simply communicates to the guest that it's running under a
hypervisor.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 51792cf48cd7..a13bf0ab417d 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -722,7 +722,8 @@ void kvm_set_cpu_caps(void)
 		0 /* OSXSAVE */ |
 		F(AVX) |
 		F(F16C) |
-		F(RDRAND)
+		F(RDRAND) |
+		EMULATED_F(HYPERVISOR)
 	);
 
 	kvm_cpu_cap_init(CPUID_1_EDX,
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 39/57] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap"
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (37 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 38/57] KVM: x86: Advertise HYPERVISOR " Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 40/57] KVM: x86: Replace guts of "governed" features with comprehensive cpu_caps Sean Christopherson
                   ` (19 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

As the first step toward replacing KVM's so-called "governed features"
framework with a more comprehensive, less poorly named implementation,
replace the "kvm_governed_feature" function prefix with "guest_cpu_cap"
and rename guest_can_use() to guest_cpu_cap_has().

The "guest_cpu_cap" naming scheme mirrors that of "kvm_cpu_cap", and
provides a clearer distinction between guest capabilities, which are
KVM controlled (heh, or one might say "governed"), and guest CPUID, which
with few exceptions is fully userspace controlled.

Opportunistically rewrite the comment about XSS passthrough for SEV-ES
guests to avoid referencing so many functions, as such comments are prone
to becoming stale (case in point...).

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c      |  2 +-
 arch/x86/kvm/cpuid.h      | 16 ++++++++--------
 arch/x86/kvm/mmu.h        |  2 +-
 arch/x86/kvm/mmu/mmu.c    |  4 ++--
 arch/x86/kvm/svm/nested.c | 22 +++++++++++-----------
 arch/x86/kvm/svm/sev.c    | 17 ++++++++---------
 arch/x86/kvm/svm/svm.c    | 26 +++++++++++++-------------
 arch/x86/kvm/svm/svm.h    |  4 ++--
 arch/x86/kvm/vmx/nested.c |  6 +++---
 arch/x86/kvm/vmx/vmx.c    | 16 ++++++++--------
 arch/x86/kvm/x86.c        |  4 ++--
 11 files changed, 59 insertions(+), 60 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index a13bf0ab417d..7b2fbb148661 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -359,7 +359,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
 				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
 	if (allow_gbpages)
-		kvm_governed_feature_set(vcpu, X86_FEATURE_GBPAGES);
+		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
 
 	best = kvm_find_cpuid_entry(vcpu, 1);
 	if (best && apic) {
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 5d0fe3793d75..e1b05da23cf2 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -237,8 +237,8 @@ static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
 	return kvm_governed_feature_index(x86_feature) >= 0;
 }
 
-static __always_inline void kvm_governed_feature_set(struct kvm_vcpu *vcpu,
-						     unsigned int x86_feature)
+static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
+					      unsigned int x86_feature)
 {
 	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
 
@@ -246,15 +246,15 @@ static __always_inline void kvm_governed_feature_set(struct kvm_vcpu *vcpu,
 		  vcpu->arch.governed_features.enabled);
 }
 
-static __always_inline void kvm_governed_feature_check_and_set(struct kvm_vcpu *vcpu,
-							       unsigned int x86_feature)
+static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
+							unsigned int x86_feature)
 {
 	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
-		kvm_governed_feature_set(vcpu, x86_feature);
+		guest_cpu_cap_set(vcpu, x86_feature);
 }
 
-static __always_inline bool guest_can_use(struct kvm_vcpu *vcpu,
-					  unsigned int x86_feature)
+static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
+					      unsigned int x86_feature)
 {
 	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
 
@@ -264,7 +264,7 @@ static __always_inline bool guest_can_use(struct kvm_vcpu *vcpu,
 
 static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
-	if (guest_can_use(vcpu, X86_FEATURE_LAM))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LAM))
 		cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
 
 	return kvm_vcpu_is_legal_gpa(vcpu, cr3);
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index e9322358678b..caec3d11638d 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -126,7 +126,7 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
 
 static inline unsigned long kvm_get_active_cr3_lam_bits(struct kvm_vcpu *vcpu)
 {
-	if (!guest_can_use(vcpu, X86_FEATURE_LAM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_LAM))
 		return 0;
 
 	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 22e7ad235123..d138560a9320 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5034,7 +5034,7 @@ static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 	__reset_rsvds_bits_mask(&context->guest_rsvd_check,
 				vcpu->arch.reserved_gpa_bits,
 				context->cpu_role.base.level, is_efer_nx(context),
-				guest_can_use(vcpu, X86_FEATURE_GBPAGES),
+				guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES),
 				is_cr4_pse(context),
 				guest_cpuid_is_amd_compatible(vcpu));
 }
@@ -5111,7 +5111,7 @@ static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
 	__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
 				context->root_role.level,
 				context->root_role.efer_nx,
-				guest_can_use(vcpu, X86_FEATURE_GBPAGES),
+				guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES),
 				is_pse, is_amd);
 
 	if (!shadow_me_mask)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index b708bdf7eaff..d77b094d9a4d 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -111,7 +111,7 @@ static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu)
 
 static bool nested_vmcb_needs_vls_intercept(struct vcpu_svm *svm)
 {
-	if (!guest_can_use(&svm->vcpu, X86_FEATURE_V_VMSAVE_VMLOAD))
+	if (!guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_V_VMSAVE_VMLOAD))
 		return true;
 
 	if (!nested_npt_enabled(svm))
@@ -594,7 +594,7 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
 		vmcb_mark_dirty(vmcb02, VMCB_DR);
 	}
 
-	if (unlikely(guest_can_use(vcpu, X86_FEATURE_LBRV) &&
+	if (unlikely(guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
 		     (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
 		/*
 		 * Reserved bits of DEBUGCTL are ignored.  Be consistent with
@@ -651,7 +651,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 	 * exit_int_info, exit_int_info_err, next_rip, insn_len, insn_bytes.
 	 */
 
-	if (guest_can_use(vcpu, X86_FEATURE_VGIF) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VGIF) &&
 	    (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK))
 		int_ctl_vmcb12_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK);
 	else
@@ -689,7 +689,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 
 	vmcb02->control.tsc_offset = vcpu->arch.tsc_offset;
 
-	if (guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR) &&
 	    svm->tsc_ratio_msr != kvm_caps.default_tsc_scaling_ratio)
 		nested_svm_update_tsc_ratio_msr(vcpu);
 
@@ -710,7 +710,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 	 * what a nrips=0 CPU would do (L1 is responsible for advancing RIP
 	 * prior to injecting the event).
 	 */
-	if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
 		vmcb02->control.next_rip    = svm->nested.ctl.next_rip;
 	else if (boot_cpu_has(X86_FEATURE_NRIPS))
 		vmcb02->control.next_rip    = vmcb12_rip;
@@ -720,7 +720,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 		svm->soft_int_injected = true;
 		svm->soft_int_csbase = vmcb12_csbase;
 		svm->soft_int_old_rip = vmcb12_rip;
-		if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
 			svm->soft_int_next_rip = svm->nested.ctl.next_rip;
 		else
 			svm->soft_int_next_rip = vmcb12_rip;
@@ -728,18 +728,18 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 
 	vmcb02->control.virt_ext            = vmcb01->control.virt_ext &
 					      LBR_CTL_ENABLE_MASK;
-	if (guest_can_use(vcpu, X86_FEATURE_LBRV))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV))
 		vmcb02->control.virt_ext  |=
 			(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK);
 
 	if (!nested_vmcb_needs_vls_intercept(svm))
 		vmcb02->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
 
-	if (guest_can_use(vcpu, X86_FEATURE_PAUSEFILTER))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_PAUSEFILTER))
 		pause_count12 = svm->nested.ctl.pause_filter_count;
 	else
 		pause_count12 = 0;
-	if (guest_can_use(vcpu, X86_FEATURE_PFTHRESHOLD))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_PFTHRESHOLD))
 		pause_thresh12 = svm->nested.ctl.pause_filter_thresh;
 	else
 		pause_thresh12 = 0;
@@ -1026,7 +1026,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	if (vmcb12->control.exit_code != SVM_EXIT_ERR)
 		nested_save_pending_event_to_vmcb12(svm, vmcb12);
 
-	if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
 		vmcb12->control.next_rip  = vmcb02->control.next_rip;
 
 	vmcb12->control.int_ctl           = svm->nested.ctl.int_ctl;
@@ -1065,7 +1065,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	if (!nested_exit_on_intr(svm))
 		kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
 
-	if (unlikely(guest_can_use(vcpu, X86_FEATURE_LBRV) &&
+	if (unlikely(guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
 		     (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
 		svm_copy_lbrs(vmcb12, vmcb02);
 		svm_update_lbrv(vcpu);
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 72674b8825c4..4e5aba3f86cd 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4458,16 +4458,15 @@ static void sev_es_vcpu_after_set_cpuid(struct vcpu_svm *svm)
 	 * For SEV-ES, accesses to MSR_IA32_XSS should not be intercepted if
 	 * the host/guest supports its use.
 	 *
-	 * guest_can_use() checks a number of requirements on the host/guest to
-	 * ensure that MSR_IA32_XSS is available, but it might report true even
-	 * if X86_FEATURE_XSAVES isn't configured in the guest to ensure host
-	 * MSR_IA32_XSS is always properly restored. For SEV-ES, it is better
-	 * to further check that the guest CPUID actually supports
-	 * X86_FEATURE_XSAVES so that accesses to MSR_IA32_XSS by misbehaved
-	 * guests will still get intercepted and caught in the normal
-	 * kvm_emulate_rdmsr()/kvm_emulated_wrmsr() paths.
+	 * KVM treats the guest as being capable of using XSAVES even if XSAVES
+	 * isn't enabled in guest CPUID as there is no intercept for XSAVES,
+	 * i.e. the guest can use XSAVES/XRSTOR to read/write XSS if XSAVE is
+	 * exposed to the guest and XSAVES is supported in hardware.  Condition
+	 * full XSS passthrough on the guest being able to use XSAVES *and*
+	 * XSAVES being exposed to the guest so that KVM can at least honor
+	 * guest CPUID for RDMSR and WRMSR.
 	 */
-	if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES) &&
 	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
 		set_msr_interception(vcpu, svm->msrpm, MSR_IA32_XSS, 1, 1);
 	else
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index dd15cc635655..f96c62a9d2c2 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1049,7 +1049,7 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 	bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
 	bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
-			    (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
+			    (is_guest_mode(vcpu) && guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
 			    (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
 
 	if (enable_lbrv == current_enable_lbrv)
@@ -2864,7 +2864,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	switch (msr_info->index) {
 	case MSR_AMD64_TSC_RATIO:
 		if (!msr_info->host_initiated &&
-		    !guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR))
 			return 1;
 		msr_info->data = svm->tsc_ratio_msr;
 		break;
@@ -3024,7 +3024,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 	switch (ecx) {
 	case MSR_AMD64_TSC_RATIO:
 
-		if (!guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR)) {
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR)) {
 
 			if (!msr->host_initiated)
 				return 1;
@@ -3046,7 +3046,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 
 		svm->tsc_ratio_msr = data;
 
-		if (guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR) &&
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR) &&
 		    is_guest_mode(vcpu))
 			nested_svm_update_tsc_ratio_msr(vcpu);
 
@@ -4404,11 +4404,11 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
 	    boot_cpu_has(X86_FEATURE_XSAVES) &&
 	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
-		kvm_governed_feature_set(vcpu, X86_FEATURE_XSAVES);
+		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
 
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_NRIPS);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_LBRV);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
 
 	/*
 	 * Intercept VMLOAD if the vCPU model is Intel in order to emulate that
@@ -4416,12 +4416,12 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * SVM on Intel is bonkers and extremely unlikely to work).
 	 */
 	if (!guest_cpuid_is_intel_compatible(vcpu))
-		kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
+		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
 
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VGIF);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VNMI);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VGIF);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VNMI);
 
 	svm_recalc_instruction_intercepts(vcpu, svm);
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 43fa6a16eb19..6eff8c60d5eb 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -502,7 +502,7 @@ static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
 
 static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
 {
-	return guest_can_use(&svm->vcpu, X86_FEATURE_VGIF) &&
+	return guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_VGIF) &&
 	       (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
 }
 
@@ -554,7 +554,7 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
 
 static inline bool nested_vnmi_enabled(struct vcpu_svm *svm)
 {
-	return guest_can_use(&svm->vcpu, X86_FEATURE_VNMI) &&
+	return guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_VNMI) &&
 	       (svm->nested.ctl.int_ctl & V_NMI_ENABLE_MASK);
 }
 
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index aa78b6f38dfe..9aaa703f5f98 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6617,7 +6617,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
 	vmx = to_vmx(vcpu);
 	vmcs12 = get_vmcs12(vcpu);
 
-	if (guest_can_use(vcpu, X86_FEATURE_VMX) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VMX) &&
 	    (vmx->nested.vmxon || vmx->nested.smm.vmxon)) {
 		kvm_state.hdr.vmx.vmxon_pa = vmx->nested.vmxon_ptr;
 		kvm_state.hdr.vmx.vmcs12_pa = vmx->nested.current_vmptr;
@@ -6758,7 +6758,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 		if (kvm_state->flags & ~KVM_STATE_NESTED_EVMCS)
 			return -EINVAL;
 	} else {
-		if (!guest_can_use(vcpu, X86_FEATURE_VMX))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
 			return -EINVAL;
 
 		if (!page_address_valid(vcpu, kvm_state->hdr.vmx.vmxon_pa))
@@ -6792,7 +6792,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 		return -EINVAL;
 
 	if ((kvm_state->flags & KVM_STATE_NESTED_EVMCS) &&
-	    (!guest_can_use(vcpu, X86_FEATURE_VMX) ||
+	    (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX) ||
 	     !vmx->nested.enlightened_vmcs_enabled))
 			return -EINVAL;
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 893366e53732..ccba522246c3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2084,7 +2084,7 @@ int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
 		break;
 	case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
-		if (!guest_can_use(vcpu, X86_FEATURE_VMX))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
 			return 1;
 		if (vmx_get_vmx_msr(&vmx->nested.msrs, msr_info->index,
 				    &msr_info->data))
@@ -2394,7 +2394,7 @@ int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
 		if (!msr_info->host_initiated)
 			return 1; /* they are read-only */
-		if (!guest_can_use(vcpu, X86_FEATURE_VMX))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
 			return 1;
 		return vmx_set_vmx_msr(vcpu, msr_index, data);
 	case MSR_IA32_RTIT_CTL:
@@ -4591,7 +4591,7 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
 												\
 	if (cpu_has_vmx_##name()) {								\
 		if (kvm_is_governed_feature(X86_FEATURE_##feat_name))				\
-			__enabled = guest_can_use(__vcpu, X86_FEATURE_##feat_name);		\
+			__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);		\
 		else										\
 			__enabled = guest_cpuid_has(__vcpu, X86_FEATURE_##feat_name);		\
 		vmx_adjust_secondary_exec_control(vmx, exec_control, SECONDARY_EXEC_##ctrl_name,\
@@ -7830,10 +7830,10 @@ void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
 	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
-		kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_XSAVES);
+		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_XSAVES);
 
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VMX);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_LAM);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VMX);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LAM);
 
 	vmx_setup_uret_msrs(vmx);
 
@@ -7841,7 +7841,7 @@ void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		vmcs_set_secondary_exec_control(vmx,
 						vmx_secondary_exec_control(vmx));
 
-	if (guest_can_use(vcpu, X86_FEATURE_VMX))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
 		vmx->msr_ia32_feature_control_valid_bits |=
 			FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
 			FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
@@ -7850,7 +7850,7 @@ void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 			~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
 			  FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
 
-	if (guest_can_use(vcpu, X86_FEATURE_VMX))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
 		nested_vmx_cr_fixed1_bits_update(vcpu);
 
 	if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9f0ffc3289d2..1ee955cdb109 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1177,7 +1177,7 @@ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
 		if (vcpu->arch.xcr0 != kvm_host.xcr0)
 			xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
 
-		if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES) &&
 		    vcpu->arch.ia32_xss != kvm_host.xss)
 			wrmsrl(MSR_IA32_XSS, vcpu->arch.ia32_xss);
 	}
@@ -1208,7 +1208,7 @@ void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
 		if (vcpu->arch.xcr0 != kvm_host.xcr0)
 			xsetbv(XCR_XFEATURE_ENABLED_MASK, kvm_host.xcr0);
 
-		if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES) &&
 		    vcpu->arch.ia32_xss != kvm_host.xss)
 			wrmsrl(MSR_IA32_XSS, kvm_host.xss);
 	}
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 40/57] KVM: x86: Replace guts of "governed" features with comprehensive cpu_caps
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (38 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 39/57] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 41/57] KVM: x86: Initialize guest cpu_caps based on guest CPUID Sean Christopherson
                   ` (18 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Replace the internals of the governed features framework with a more
comprehensive "guest CPU capabilities" implementation, i.e. with a guest
version of kvm_cpu_caps.  Keep the skeleton of governed features around
for now as vmx_adjust_sec_exec_control() relies on detecting governed
features to do the right thing for XSAVES, and switching all guest feature
queries to guest_cpu_cap_has() requires subtle and non-trivial changes,
i.e. is best done as a standalone change.

Tracking *all* guest capabilities that KVM cares about will allow excising the
poorly named "governed features" framework, and effectively optimizes all
KVM queries of guest capabilities, i.e. doesn't require making a
subjective decision as to whether or not a feature is worth "governing",
and doesn't require adding the code to do so.
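
As a rough userspace model of the lookup this enables (a sketch, not the
kernel code): features are encoded as word * 32 + bit, so a query is a
single array index plus a mask.  All "model_" names and the word count
below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical word count; the real array is sized by NR_KVM_CPU_CAPS. */
#define MODEL_NR_CAP_WORDS 25

/* Features are encoded as (word * 32 + bit), as with X86_FEATURE_* values. */
#define MODEL_FEATURE(word, bit) ((word) * 32u + (bit))

struct model_vcpu {
	uint32_t cpu_caps[MODEL_NR_CAP_WORDS];
};

/* Mark a feature as usable by the guest. */
static void model_cpu_cap_set(struct model_vcpu *vcpu, unsigned int feature)
{
	vcpu->cpu_caps[feature / 32] |= UINT32_C(1) << (feature % 32);
}

/* Query a feature: one array load and one mask, no CPUID entry walk. */
static bool model_cpu_cap_has(const struct model_vcpu *vcpu,
			      unsigned int feature)
{
	return (vcpu->cpu_caps[feature / 32] &
		(UINT32_C(1) << (feature % 32))) != 0;
}
```

Every feature gets the same constant-time query, which is why no per-feature
decision about "governing" is needed.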

The cost of tracking all features is currently 92 bytes per vCPU on 64-bit
kernels: 100 bytes for cpu_caps versus 8 bytes for governed_features.
That cost is well worth paying even if the only benefit was eliminating
the "governed features" terminology.  And practically speaking, the real
cost is zero unless those 92 bytes push the size of vcpu_vmx or vcpu_svm
into a new order-N allocation, and if that happens there are better ways
to reduce the footprint of kvm_vcpu_arch, e.g. making the PMU and/or MTRR
state separate allocations.
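
The arithmetic behind those figures can be checked with a small sketch.  The
word count of 25 is an assumption inferred from the 100-byte figure above;
the actual value of NR_KVM_CPU_CAPS depends on NCAPINTS and the number of
KVM-only leafs in a given tree.  The type names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 25 words, inferred from the 100-byte figure quoted above. */
#define MODEL_NR_KVM_CPU_CAPS 25

/* New layout: one 32-bit word per tracked CPUID register. */
typedef uint32_t model_cpu_caps_t[MODEL_NR_KVM_CPU_CAPS];

/* Old layout on a 64-bit kernel: a bitmap of BITS_PER_LONG (64) bits. */
typedef uint64_t model_governed_features_t;

static_assert(sizeof(model_cpu_caps_t) == 100,
	      "cpu_caps: 25 words * 4 bytes");
static_assert(sizeof(model_governed_features_t) == 8,
	      "governed_features: one 64-bit long");
static_assert(sizeof(model_cpu_caps_t) -
	      sizeof(model_governed_features_t) == 92,
	      "net growth per vCPU");
```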

Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 46 +++++++++++++++++++++------------
 arch/x86/kvm/cpuid.c            | 14 +++++++---
 arch/x86/kvm/cpuid.h            | 10 +++----
 arch/x86/kvm/reverse_cpuid.h    | 17 ------------
 4 files changed, 45 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f076df9f18be..81ce8cd5814a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -739,6 +739,23 @@ struct kvm_queued_exception {
 	bool has_payload;
 };
 
+/*
+ * Hardware-defined CPUID leafs that are either scattered by the kernel or are
+ * unknown to the kernel, but need to be directly used by KVM.  Note, these
+ * word values conflict with the kernel's "bug" caps, but KVM doesn't use those.
+ */
+enum kvm_only_cpuid_leafs {
+	CPUID_12_EAX	 = NCAPINTS,
+	CPUID_7_1_EDX,
+	CPUID_8000_0007_EDX,
+	CPUID_8000_0022_EAX,
+	CPUID_7_2_EDX,
+	CPUID_24_0_EBX,
+	NR_KVM_CPU_CAPS,
+
+	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
+};
+
 struct kvm_vcpu_arch {
 	/*
 	 * rip and regs accesses must go through
@@ -857,23 +874,20 @@ struct kvm_vcpu_arch {
 	bool is_amd_compatible;
 
 	/*
-	 * FIXME: Drop this macro and use KVM_NR_GOVERNED_FEATURES directly
-	 * when "struct kvm_vcpu_arch" is no longer defined in an
-	 * arch/x86/include/asm header.  The max is mostly arbitrary, i.e.
-	 * can be increased as necessary.
+	 * cpu_caps holds the effective guest capabilities, i.e. the features
+	 * the vCPU is allowed to use.  Typically, but not always, features can
+	 * be used by the guest if and only if both KVM and userspace want to
+	 * expose the feature to the guest.
+	 *
+	 * A common exception is for virtualization holes, i.e. when KVM can't
+	 * prevent the guest from using a feature, in which case the vCPU "has"
+	 * the feature regardless of what KVM or userspace desires.
+	 *
+	 * Note, features that don't require KVM involvement in any way are
+	 * NOT enforced/sanitized by KVM, i.e. are taken verbatim from the
+	 * guest CPUID provided by userspace.
 	 */
-#define KVM_MAX_NR_GOVERNED_FEATURES BITS_PER_LONG
-
-	/*
-	 * Track whether or not the guest is allowed to use features that are
-	 * governed by KVM, where "governed" means KVM needs to manage state
-	 * and/or explicitly enable the feature in hardware.  Typically, but
-	 * not always, governed features can be used by the guest if and only
-	 * if both KVM and userspace want to expose the feature to the guest.
-	 */
-	struct {
-		DECLARE_BITMAP(enabled, KVM_MAX_NR_GOVERNED_FEATURES);
-	} governed_features;
+	u32 cpu_caps[NR_KVM_CPU_CAPS];
 
 	u64 reserved_gpa_bits;
 	int maxphyaddr;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7b2fbb148661..f0721ad84a18 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -339,9 +339,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	struct kvm_cpuid_entry2 *best;
 	bool allow_gbpages;
 
-	BUILD_BUG_ON(KVM_NR_GOVERNED_FEATURES > KVM_MAX_NR_GOVERNED_FEATURES);
-	bitmap_zero(vcpu->arch.governed_features.enabled,
-		    KVM_MAX_NR_GOVERNED_FEATURES);
+	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
 
 	kvm_update_cpuid_runtime(vcpu);
 
@@ -425,6 +423,7 @@ u64 kvm_vcpu_reserved_gpa_bits_raw(struct kvm_vcpu *vcpu)
 static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
                         int nent)
 {
+	u32 vcpu_caps[NR_KVM_CPU_CAPS];
 	int r;
 
 	/*
@@ -432,10 +431,18 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 	 * order to massage the new entries, e.g. to account for dynamic bits
 	 * that KVM controls, without clobbering the current guest CPUID, which
 	 * KVM needs to preserve in order to unwind on failure.
+	 *
+	 * Similarly, save the vCPU's current cpu_caps so that the capabilities
+	 * can be updated alongside the CPUID entries when performing runtime
+	 * updates.  Full initialization is done if and only if the vCPU hasn't
+	 * run, i.e. only if userspace is potentially changing CPUID features.
 	 */
 	swap(vcpu->arch.cpuid_entries, e2);
 	swap(vcpu->arch.cpuid_nent, nent);
 
+	memcpy(vcpu_caps, vcpu->arch.cpu_caps, sizeof(vcpu_caps));
+	BUILD_BUG_ON(sizeof(vcpu_caps) != sizeof(vcpu->arch.cpu_caps));
+
 	/*
 	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
 	 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
@@ -476,6 +483,7 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 	return 0;
 
 err:
+	memcpy(vcpu->arch.cpu_caps, vcpu_caps, sizeof(vcpu_caps));
 	swap(vcpu->arch.cpuid_entries, e2);
 	swap(vcpu->arch.cpuid_nent, nent);
 	return r;
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index e1b05da23cf2..0a9c3086539b 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -240,10 +240,9 @@ static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
 static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
 					      unsigned int x86_feature)
 {
-	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
+	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
-	__set_bit(kvm_governed_feature_index(x86_feature),
-		  vcpu->arch.governed_features.enabled);
+	vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
 }
 
 static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
@@ -256,10 +255,9 @@ static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
 static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
 					      unsigned int x86_feature)
 {
-	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
+	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
-	return test_bit(kvm_governed_feature_index(x86_feature),
-			vcpu->arch.governed_features.enabled);
+	return vcpu->arch.cpu_caps[x86_leaf] & __feature_bit(x86_feature);
 }
 
 static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
index 1d2db9d529ff..fde0ae986003 100644
--- a/arch/x86/kvm/reverse_cpuid.h
+++ b/arch/x86/kvm/reverse_cpuid.h
@@ -6,23 +6,6 @@
 #include <asm/cpufeature.h>
 #include <asm/cpufeatures.h>
 
-/*
- * Hardware-defined CPUID leafs that are either scattered by the kernel or are
- * unknown to the kernel, but need to be directly used by KVM.  Note, these
- * word values conflict with the kernel's "bug" caps, but KVM doesn't use those.
- */
-enum kvm_only_cpuid_leafs {
-	CPUID_12_EAX	 = NCAPINTS,
-	CPUID_7_1_EDX,
-	CPUID_8000_0007_EDX,
-	CPUID_8000_0022_EAX,
-	CPUID_7_2_EDX,
-	CPUID_24_0_EBX,
-	NR_KVM_CPU_CAPS,
-
-	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
-};
-
 /*
  * Define a KVM-only feature flag.
  *
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 41/57] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (39 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 40/57] KVM: x86: Replace guts of "governed" features with comprehensive cpu_caps Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 42/57] KVM: x86: Extract code for generating per-entry emulated CPUID information Sean Christopherson
                   ` (17 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Initialize a vCPU's capabilities based on the guest CPUID provided by
userspace instead of simply zeroing the entire array.  This is the first
step toward using cpu_caps to query *all* CPUID-based guest capabilities,
i.e. will allow converting all usage of guest_cpuid_has() to
guest_cpu_cap_has().

Zeroing the array was the logical choice when using cpu_caps was opt-in,
e.g. "unsupported" was generally a safer default, and the whole point of
governed features is that KVM would need to check host and guest support,
i.e. making everything unsupported by default didn't require more code.

But requiring KVM to manually "enable" every CPUID-based feature in
cpu_caps would require an absurd amount of boilerplate code.
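
A simplified userspace model of the table-driven initialization, to show why
no per-feature boilerplate is needed: each capability word is described by
the CPUID (function, index, register) it mirrors, and the word is copied
wholesale from the matching guest CPUID entry.  All "model_" names, and the
three-row table, are hypothetical stand-ins for KVM's reverse_cpuid table
and CPUID entry lookup.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum model_reg { MODEL_EAX, MODEL_EBX, MODEL_ECX, MODEL_EDX };

/* One guest CPUID entry as provided by userspace. */
struct model_cpuid_entry {
	uint32_t function;
	uint32_t index;
	uint32_t regs[4];	/* indexed by enum model_reg */
};

/* Which CPUID output register backs each capability word. */
struct model_cap_source {
	uint32_t function;
	uint32_t index;
	enum model_reg reg;
};

static const struct model_cap_source model_sources[] = {
	{ 0x1, 0, MODEL_EDX },
	{ 0x0, 0, MODEL_EAX },	/* function 0: word has no CPUID backing */
	{ 0x7, 0, MODEL_EBX },
};

#define MODEL_NR_WORDS (sizeof(model_sources) / sizeof(model_sources[0]))

static const struct model_cpuid_entry *
model_find_entry(const struct model_cpuid_entry *entries, size_t nent,
		 uint32_t function, uint32_t index)
{
	for (size_t i = 0; i < nent; i++) {
		if (entries[i].function == function &&
		    entries[i].index == index)
			return &entries[i];
	}
	return NULL;
}

/* Zero everything, then mirror guest CPUID into each backed word. */
static void model_init_caps(uint32_t *caps,
			    const struct model_cpuid_entry *entries,
			    size_t nent)
{
	memset(caps, 0, MODEL_NR_WORDS * sizeof(caps[0]));

	for (size_t i = 0; i < MODEL_NR_WORDS; i++) {
		const struct model_cap_source *src = &model_sources[i];
		const struct model_cpuid_entry *e;

		if (!src->function)
			continue;

		e = model_find_entry(entries, nent, src->function, src->index);
		if (!e)
			continue;

		caps[i] = e->regs[src->reg];
	}
}
```

Words whose leaf is absent from guest CPUID stay zero, so "unsupported"
remains the default for anything userspace did not describe.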

Follow existing CPUID/kvm_cpu_caps nomenclature where possible, e.g. for
the change() and clear() APIs.  Replace check_and_set() with constrain()
to try and capture that KVM is constraining userspace's desired guest
feature set based on KVM's capabilities.
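
To illustrate why the switch is expected to be a nop, the sketch below
compares the two schemes on a single capability word.  It assumes that
constrain() clears a guest bit when KVM lacks the feature; the helper names
and signatures are hypothetical and operate on plain masks rather than on a
vCPU.

```c
#include <stdint.h>

/*
 * Old scheme: caps start zeroed, and a bit is set only if both KVM and
 * guest CPUID have the feature.
 */
static uint32_t model_check_and_set(uint32_t caps, uint32_t kvm_caps,
				    uint32_t guest_cpuid, uint32_t mask)
{
	if ((kvm_caps & mask) && (guest_cpuid & mask))
		caps |= mask;
	return caps;
}

/*
 * New scheme (assumed): caps start as a copy of guest CPUID, and a bit is
 * cleared if KVM lacks the feature.
 */
static uint32_t model_constrain(uint32_t caps, uint32_t kvm_caps,
				uint32_t mask)
{
	if (!(kvm_caps & mask))
		caps &= ~mask;
	return caps;
}
```

For a given feature, both paths yield a set bit if and only if KVM and guest
CPUID both have it; only the starting value of the array differs.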

This is intended to be a gigantic nop, i.e. should not have any impact on
guest or KVM functionality.

This is also an intermediate step; a future commit will also incorporate
KVM support into the vCPU's cpu_caps before converting guest_cpuid_has()
to guest_cpu_cap_has().

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c   | 46 ++++++++++++++++++++++++++++++++++++++++--
 arch/x86/kvm/cpuid.h   | 24 +++++++++++++++++++---
 arch/x86/kvm/svm/svm.c | 28 +++++++++++++------------
 arch/x86/kvm/vmx/vmx.c |  8 +++++---
 4 files changed, 85 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index f0721ad84a18..803d89577e6f 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -333,13 +333,56 @@ static bool guest_cpuid_is_amd_or_hygon(struct kvm_vcpu *vcpu)
 	       is_guest_vendor_hygon(entry->ebx, entry->ecx, entry->edx);
 }
 
+/*
+ * This isn't truly "unsafe", but except for the cpu_caps initialization code,
+ * all register lookups should use __cpuid_entry_get_reg(), which provides
+ * compile-time validation of the input.
+ */
+static u32 cpuid_get_reg_unsafe(struct kvm_cpuid_entry2 *entry, u32 reg)
+{
+	switch (reg) {
+	case CPUID_EAX:
+		return entry->eax;
+	case CPUID_EBX:
+		return entry->ebx;
+	case CPUID_ECX:
+		return entry->ecx;
+	case CPUID_EDX:
+		return entry->edx;
+	default:
+		WARN_ON_ONCE(1);
+		return 0;
+	}
+}
+
 void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	struct kvm_cpuid_entry2 *best;
+	struct kvm_cpuid_entry2 *entry;
 	bool allow_gbpages;
+	int i;
 
 	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
+	BUILD_BUG_ON(ARRAY_SIZE(reverse_cpuid) != NR_KVM_CPU_CAPS);
+
+	/*
+	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
+	 * honor userspace's definition for features that don't require KVM or
+	 * hardware management/support (or that KVM simply doesn't care about).
+	 */
+	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
+		const struct cpuid_reg cpuid = reverse_cpuid[i];
+
+		if (!cpuid.function)
+			continue;
+
+		entry = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
+		if (!entry)
+			continue;
+
+		vcpu->arch.cpu_caps[i] = cpuid_get_reg_unsafe(entry, cpuid.reg);
+	}
 
 	kvm_update_cpuid_runtime(vcpu);
 
@@ -356,8 +399,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 */
 	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
 				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
-	if (allow_gbpages)
-		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
+	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
 
 	best = kvm_find_cpuid_entry(vcpu, 1);
 	if (best && apic) {
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 0a9c3086539b..8c9d6be8cb58 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -245,11 +245,29 @@ static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
 	vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
 }
 
-static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
-							unsigned int x86_feature)
+static __always_inline void guest_cpu_cap_clear(struct kvm_vcpu *vcpu,
+						unsigned int x86_feature)
 {
-	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
+	unsigned int x86_leaf = __feature_leaf(x86_feature);
+
+	vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
+}
+
+static __always_inline void guest_cpu_cap_change(struct kvm_vcpu *vcpu,
+						 unsigned int x86_feature,
+						 bool guest_has_cap)
+{
+	if (guest_has_cap)
 		guest_cpu_cap_set(vcpu, x86_feature);
+	else
+		guest_cpu_cap_clear(vcpu, x86_feature);
+}
+
+static __always_inline void guest_cpu_cap_constrain(struct kvm_vcpu *vcpu,
+						    unsigned int x86_feature)
+{
+	if (!kvm_cpu_cap_has(x86_feature))
+		guest_cpu_cap_clear(vcpu, x86_feature);
 }
 
 static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f96c62a9d2c2..3b94cb6c2b7a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4401,27 +4401,29 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
 	 * the guest read/write access to the host's XSS.
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
-	    boot_cpu_has(X86_FEATURE_XSAVES) &&
-	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
-		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
+	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
+			     boot_cpu_has(X86_FEATURE_XSAVE) &&
+			     boot_cpu_has(X86_FEATURE_XSAVES) &&
+			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
 
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
+	guest_cpu_cap_constrain(vcpu, X86_FEATURE_NRIPS);
+	guest_cpu_cap_constrain(vcpu, X86_FEATURE_TSCRATEMSR);
+	guest_cpu_cap_constrain(vcpu, X86_FEATURE_LBRV);
 
 	/*
 	 * Intercept VMLOAD if the vCPU model is Intel in order to emulate that
 	 * VMLOAD drops bits 63:32 of SYSENTER (ignoring the fact that exposing
 	 * SVM on Intel is bonkers and extremely unlikely to work).
 	 */
-	if (!guest_cpuid_is_intel_compatible(vcpu))
-		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
+	if (guest_cpuid_is_intel_compatible(vcpu))
+		guest_cpu_cap_clear(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
+	else
+		guest_cpu_cap_constrain(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
 
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VGIF);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VNMI);
+	guest_cpu_cap_constrain(vcpu, X86_FEATURE_PAUSEFILTER);
+	guest_cpu_cap_constrain(vcpu, X86_FEATURE_PFTHRESHOLD);
+	guest_cpu_cap_constrain(vcpu, X86_FEATURE_VGIF);
+	guest_cpu_cap_constrain(vcpu, X86_FEATURE_VNMI);
 
 	svm_recalc_instruction_intercepts(vcpu, svm);
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ccba522246c3..8b95ba323a17 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7830,10 +7830,12 @@ void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
 	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
-		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_XSAVES);
+		guest_cpu_cap_constrain(vcpu, X86_FEATURE_XSAVES);
+	else
+		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
 
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VMX);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LAM);
+	guest_cpu_cap_constrain(vcpu, X86_FEATURE_VMX);
+	guest_cpu_cap_constrain(vcpu, X86_FEATURE_LAM);
 
 	vmx_setup_uret_msrs(vmx);
 
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 42/57] KVM: x86: Extract code for generating per-entry emulated CPUID information
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (40 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 41/57] KVM: x86: Initialize guest cpu_caps based on guest CPUID Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 43/57] KVM: x86: Treat MONITOR/MWAIT as a "partially emulated" feature Sean Christopherson
                   ` (16 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Extract the meat of __do_cpuid_func_emulated() into a separate helper,
cpuid_func_emulated(), so that cpuid_func_emulated() can be used with a
single CPUID entry.  This will allow marking emulated features as fully
supported in the guest cpu_caps without needing to hardcode the set of
emulated features in multiple locations.
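
For illustration only (not part of this patch), the split can be sketched
in userspace C: the per-entry helper reports how many entries it filled
(0 or 1), and the array wrapper only does the bounds check and the
bookkeeping.  The toy_* names and the trimmed-down entry layout are
hypothetical:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_MOVBE (UINT32_C(1) << 22)	/* CPUID.1:ECX bit 22 */

/* Hypothetical, trimmed-down stand-ins for the KVM CPUID structures. */
struct toy_entry {
	uint32_t function;
	uint32_t eax;
	uint32_t ecx;
};

struct toy_array {
	struct toy_entry *entries;
	int nent;
	int maxnent;
};

/* Fill a single entry; return 1 if the leaf is emulated, 0 otherwise. */
static int toy_func_emulated(struct toy_entry *entry, uint32_t func)
{
	memset(entry, 0, sizeof(*entry));
	entry->function = func;

	switch (func) {
	case 0:
		entry->eax = 7;
		return 1;
	case 1:
		entry->ecx = TOY_MOVBE;
		return 1;
	default:
		return 0;
	}
}

/* Array wrapper: consume a slot only if the helper actually filled it. */
static int toy_do_func_emulated(struct toy_array *array, uint32_t func)
{
	if (array->nent >= array->maxnent)
		return -E2BIG;

	array->nent += toy_func_emulated(&array->entries[array->nent], func);
	return 0;
}
```

Because the helper takes a bare entry, a caller can also point it at an
on-stack entry without touching any array.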

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 803d89577e6f..153c4378b987 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1192,14 +1192,10 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
 	return entry;
 }
 
-static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
+static int cpuid_func_emulated(struct kvm_cpuid_entry2 *entry, u32 func)
 {
-	struct kvm_cpuid_entry2 *entry;
+	memset(entry, 0, sizeof(*entry));
 
-	if (array->nent >= array->maxnent)
-		return -E2BIG;
-
-	entry = &array->entries[array->nent];
 	entry->function = func;
 	entry->index = 0;
 	entry->flags = 0;
@@ -1207,23 +1203,27 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
 	switch (func) {
 	case 0:
 		entry->eax = 7;
-		++array->nent;
-		break;
+		return 1;
 	case 1:
 		entry->ecx = feature_bit(MOVBE);
-		++array->nent;
-		break;
+		return 1;
 	case 7:
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 		entry->eax = 0;
 		if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
 			entry->ecx = feature_bit(RDPID);
-		++array->nent;
-		break;
+		return 1;
 	default:
-		break;
+		return 0;
 	}
+}
 
+static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
+{
+	if (array->nent >= array->maxnent)
+		return -E2BIG;
+
+	array->nent += cpuid_func_emulated(&array->entries[array->nent], func);
 	return 0;
 }
 
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 43/57] KVM: x86: Treat MONITOR/MWAIT as a "partially emulated" feature
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (41 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 42/57] KVM: x86: Extract code for generating per-entry emulated CPUID information Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 44/57] KVM: x86: Initialize guest cpu_caps based on KVM support Sean Christopherson
                   ` (15 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Enumerate MWAIT in cpuid_func_emulated(), but only if the caller wants to
include "partially emulated" features, i.e. features that KVM kinda sorta
emulates, but with major caveats.  This will allow initializing the guest
cpu_caps based on the set of features that KVM virtualizes and/or emulates,
without needing to handle things like MONITOR/MWAIT as one-off exceptions.

Adding one-off handling for individual features is quite painful,
especially when considering future hardening.  It's very doable to verify,
at compile time, that every CPUID-based feature that KVM queries when
emulating guest behavior is actually known to KVM, e.g. to prevent KVM
bugs where KVM emulates some feature but fails to advertise support to
userspace.  In other words, any features that are special cased, i.e. not
handled generically in the CPUID framework, would also need to be special
cased for any hardening efforts that build on said framework.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 153c4378b987..0c63492f119d 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1192,7 +1192,8 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
 	return entry;
 }
 
-static int cpuid_func_emulated(struct kvm_cpuid_entry2 *entry, u32 func)
+static int cpuid_func_emulated(struct kvm_cpuid_entry2 *entry, u32 func,
+			       bool include_partially_emulated)
 {
 	memset(entry, 0, sizeof(*entry));
 
@@ -1206,6 +1207,16 @@ static int cpuid_func_emulated(struct kvm_cpuid_entry2 *entry, u32 func)
 		return 1;
 	case 1:
 		entry->ecx = feature_bit(MOVBE);
+		/*
+		 * KVM allows userspace to enumerate MONITOR+MWAIT support to
+		 * the guest, but the MWAIT feature flag is never advertised
+		 * to userspace because MONITOR+MWAIT aren't virtualized by
+		 * hardware, can't be faithfully emulated in software (KVM
+		 * emulates them as NOPs), and allowing the guest to execute
+		 * them natively requires enabling a per-VM capability.
+		 */
+		if (include_partially_emulated)
+			entry->ecx |= feature_bit(MWAIT);
 		return 1;
 	case 7:
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
@@ -1223,7 +1234,7 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
 	if (array->nent >= array->maxnent)
 		return -E2BIG;
 
-	array->nent += cpuid_func_emulated(&array->entries[array->nent], func);
+	array->nent += cpuid_func_emulated(&array->entries[array->nent], func, false);
 	return 0;
 }
 
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 44/57] KVM: x86: Initialize guest cpu_caps based on KVM support
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (42 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 43/57] KVM: x86: Treat MONITOR/MWAIT as a "partially emulated" feature Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 45/57] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime Sean Christopherson
                   ` (14 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Constrain all guest cpu_caps based on KVM support instead of constraining
only the few features that KVM _currently_ needs to verify are actually
supported by KVM.  The intent of cpu_caps is to track what the guest is
actually capable of using, not the raw, unfiltered CPUID values that the
guest sees.

I.e. KVM should always consult its own support when making decisions
based on guest CPUID, and the only reason KVM has historically made the
checks opt-in was due to lack of centralized tracking.
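
For illustration only (not part of this patch), the per-word computation
boils down to "(KVM-supported | emulated) & guest CPUID".  A minimal
userspace sketch for CPUID.1:ECX, where the toy_* names are hypothetical
and only the MOVBE/MWAIT bit positions are real:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MWAIT (UINT32_C(1) << 3)	/* CPUID.1:ECX bit 3 */
#define TOY_MOVBE (UINT32_C(1) << 22)	/* CPUID.1:ECX bit 22 */

/* Features emulated in software, optionally including partial emulation. */
static uint32_t toy_emulated_ecx(bool include_partially_emulated)
{
	uint32_t ecx = TOY_MOVBE;

	if (include_partially_emulated)
		ecx |= TOY_MWAIT;
	return ecx;
}

/*
 * A vCPU has a feature iff it is supported (virtualized or emulated,
 * even if not advertised to userspace) and is set in guest CPUID.
 */
static uint32_t toy_guest_caps(uint32_t kvm_caps, uint32_t guest_cpuid)
{
	uint32_t supported = kvm_caps | toy_emulated_ecx(true);

	return supported & guest_cpuid;
}
```

With this shape, a bit that userspace sets but KVM neither supports nor
emulates simply falls out of the result, so no per-feature constrain()
calls are needed.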

Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c   | 15 ++++++++++++++-
 arch/x86/kvm/cpuid.h   |  7 -------
 arch/x86/kvm/svm/svm.c | 11 -----------
 arch/x86/kvm/vmx/vmx.c |  9 ++-------
 4 files changed, 16 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0c63492f119d..8015d6b52a69 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -355,6 +355,9 @@ static u32 cpuid_get_reg_unsafe(struct kvm_cpuid_entry2 *entry, u32 reg)
 	}
 }
 
+static int cpuid_func_emulated(struct kvm_cpuid_entry2 *entry, u32 func,
+			       bool include_partially_emulated);
+
 void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
@@ -373,6 +376,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 */
 	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
 		const struct cpuid_reg cpuid = reverse_cpuid[i];
+		struct kvm_cpuid_entry2 emulated;
 
 		if (!cpuid.function)
 			continue;
@@ -381,7 +385,16 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		if (!entry)
 			continue;
 
-		vcpu->arch.cpu_caps[i] = cpuid_get_reg_unsafe(entry, cpuid.reg);
+		cpuid_func_emulated(&emulated, cpuid.function, true);
+
+		/*
+		 * A vCPU has a feature if it's supported by KVM and is enabled
+		 * in guest CPUID.  Note, this includes features that are
+		 * supported by KVM but aren't advertised to userspace!
+		 */
+		vcpu->arch.cpu_caps[i] = kvm_cpu_caps[i] |
+					 cpuid_get_reg_unsafe(&emulated, cpuid.reg);
+		vcpu->arch.cpu_caps[i] &= cpuid_get_reg_unsafe(entry, cpuid.reg);
 	}
 
 	kvm_update_cpuid_runtime(vcpu);
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 8c9d6be8cb58..27da0964355c 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -263,13 +263,6 @@ static __always_inline void guest_cpu_cap_change(struct kvm_vcpu *vcpu,
 		guest_cpu_cap_clear(vcpu, x86_feature);
 }
 
-static __always_inline void guest_cpu_cap_constrain(struct kvm_vcpu *vcpu,
-						    unsigned int x86_feature)
-{
-	if (!kvm_cpu_cap_has(x86_feature))
-		guest_cpu_cap_clear(vcpu, x86_feature);
-}
-
 static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
 					      unsigned int x86_feature)
 {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3b94cb6c2b7a..0045fe474023 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4406,10 +4406,6 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 			     boot_cpu_has(X86_FEATURE_XSAVES) &&
 			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
 
-	guest_cpu_cap_constrain(vcpu, X86_FEATURE_NRIPS);
-	guest_cpu_cap_constrain(vcpu, X86_FEATURE_TSCRATEMSR);
-	guest_cpu_cap_constrain(vcpu, X86_FEATURE_LBRV);
-
 	/*
 	 * Intercept VMLOAD if the vCPU model is Intel in order to emulate that
 	 * VMLOAD drops bits 63:32 of SYSENTER (ignoring the fact that exposing
@@ -4417,13 +4413,6 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 */
 	if (guest_cpuid_is_intel_compatible(vcpu))
 		guest_cpu_cap_clear(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
-	else
-		guest_cpu_cap_constrain(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
-
-	guest_cpu_cap_constrain(vcpu, X86_FEATURE_PAUSEFILTER);
-	guest_cpu_cap_constrain(vcpu, X86_FEATURE_PFTHRESHOLD);
-	guest_cpu_cap_constrain(vcpu, X86_FEATURE_VGIF);
-	guest_cpu_cap_constrain(vcpu, X86_FEATURE_VNMI);
 
 	svm_recalc_instruction_intercepts(vcpu, svm);
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8b95ba323a17..a7c2c36f2a4f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7828,15 +7828,10 @@ void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * to the guest.  XSAVES depends on CR4.OSXSAVE, and CR4.OSXSAVE can be
 	 * set if and only if XSAVE is supported.
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
-	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
-		guest_cpu_cap_constrain(vcpu, X86_FEATURE_XSAVES);
-	else
+	if (!boot_cpu_has(X86_FEATURE_XSAVE) ||
+	    !guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
 		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
 
-	guest_cpu_cap_constrain(vcpu, X86_FEATURE_VMX);
-	guest_cpu_cap_constrain(vcpu, X86_FEATURE_LAM);
-
 	vmx_setup_uret_msrs(vmx);
 
 	if (cpu_has_secondary_exec_ctrls())
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 45/57] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (43 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 44/57] KVM: x86: Initialize guest cpu_caps based on KVM support Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 46/57] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf Sean Christopherson
                   ` (13 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Move the handling of X86_FEATURE_MWAIT during CPUID runtime updates to
utilize the lookup done for other CPUID.0x1 features.

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 8015d6b52a69..16cfa839e734 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -283,6 +283,11 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 
 		cpuid_entry_change(best, X86_FEATURE_APIC,
 			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
+
+		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
+			cpuid_entry_change(best, X86_FEATURE_MWAIT,
+					   vcpu->arch.ia32_misc_enable_msr &
+					   MSR_IA32_MISC_ENABLE_MWAIT);
 	}
 
 	best = kvm_find_cpuid_entry_index(vcpu, 7, 0);
@@ -298,14 +303,6 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 	if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
 		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
-
-	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
-		best = kvm_find_cpuid_entry(vcpu, 0x1);
-		if (best)
-			cpuid_entry_change(best, X86_FEATURE_MWAIT,
-					   vcpu->arch.ia32_misc_enable_msr &
-					   MSR_IA32_MISC_ENABLE_MWAIT);
-	}
 }
 EXPORT_SYMBOL_GPL(kvm_update_cpuid_runtime);
 
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 46/57] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (44 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 45/57] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 47/57] KVM: x86: Update OS{XSAVE,PKE} bits in guest CPUID irrespective of host support Sean Christopherson
                   ` (12 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Drop an unnecessary check that kvm_find_cpuid_entry_index(), i.e.
cpuid_entry2_find(), returns the correct leaf when getting CPUID.0x7.0x0
to update X86_FEATURE_OSPKE.  cpuid_entry2_find() never returns an entry
for the wrong function.  And not that it matters, but cpuid_entry2_find()
will always return a precise match for CPUID.0x7.0x0 since the index is
significant.

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 16cfa839e734..7481926a0291 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -291,7 +291,7 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 	}
 
 	best = kvm_find_cpuid_entry_index(vcpu, 7, 0);
-	if (best && boot_cpu_has(X86_FEATURE_PKU) && best->function == 0x7)
+	if (best && boot_cpu_has(X86_FEATURE_PKU))
 		cpuid_entry_change(best, X86_FEATURE_OSPKE,
 				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
 
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 47/57] KVM: x86: Update OS{XSAVE,PKE} bits in guest CPUID irrespective of host support
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (45 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 46/57] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 48/57] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
                   ` (11 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

When making runtime CPUID updates, change OSXSAVE and OSPKE even if their
respective base features (XSAVE, PKU) are not supported by the host.  KVM
already incorporates host support in the vCPU's effective reserved CR4 bits.
I.e. OSXSAVE and OSPKE can be set if and only if the host supports them.

And conversely, since KVM's ABI is that KVM owns the dynamic OS feature
flags, clearing them when they obviously aren't supported and thus can't
be enabled is arguably a fix.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7481926a0291..be3357a408d4 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -276,10 +276,8 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 
 	best = kvm_find_cpuid_entry(vcpu, 1);
 	if (best) {
-		/* Update OSXSAVE bit */
-		if (boot_cpu_has(X86_FEATURE_XSAVE))
-			cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
-					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
+		cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
+				   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
 
 		cpuid_entry_change(best, X86_FEATURE_APIC,
 			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
@@ -291,7 +289,7 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 	}
 
 	best = kvm_find_cpuid_entry_index(vcpu, 7, 0);
-	if (best && boot_cpu_has(X86_FEATURE_PKU))
+	if (best)
 		cpuid_entry_change(best, X86_FEATURE_OSPKE,
 				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
 
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 48/57] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (46 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 47/57] KVM: x86: Update OS{XSAVE,PKE} bits in guest CPUID irrespective of host support Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 49/57] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has() Sean Christopherson
                   ` (10 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

When updating guest CPUID entries to emulate runtime behavior, e.g. when
the guest enables a CR4-based feature that is tied to a CPUID flag, also
update the vCPU's cpu_caps accordingly.  This will allow replacing all
usage of guest_cpuid_has() with guest_cpu_cap_has().

Note, this relies on kvm_set_cpuid() taking a snapshot of cpu_caps before
invoking kvm_update_cpuid_runtime(), i.e. when KVM is updating CPUID
entries that *may* become the vCPU's CPUID, so that unwinding to the old
cpu_caps is possible if userspace tries to set bogus CPUID information.

Note #2, none of the features in question use guest_cpu_cap_has() at this
time, i.e. aside from setting bits in cpu_caps, this is a glorified nop.
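
For illustration only (not part of this patch), the idea of keeping the
CPUID entry and cpu_caps in lockstep can be sketched in userspace C.  The
toy_* names and structures are hypothetical; only the CR4.OSXSAVE and
CPUID.1:ECX.OSXSAVE bit positions are real:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_CR4_OSXSAVE	(UINT64_C(1) << 18)	/* CR4 bit 18 */
#define TOY_OSXSAVE	(UINT32_C(1) << 27)	/* CPUID.1:ECX bit 27 */

/* Hypothetical vCPU: one CPUID.1 entry plus the matching cpu_caps word. */
struct toy_vcpu {
	uint32_t cpuid1_ecx;	/* what the guest reads via CPUID */
	uint32_t cpu_caps_ecx;	/* what "KVM" consults internally */
	uint64_t cr4;
};

/* Apply the same change to the guest-visible entry and to cpu_caps. */
static void toy_update_feature_runtime(struct toy_vcpu *vcpu, uint32_t bit,
				       bool has_feature)
{
	if (has_feature) {
		vcpu->cpuid1_ecx |= bit;
		vcpu->cpu_caps_ecx |= bit;
	} else {
		vcpu->cpuid1_ecx &= ~bit;
		vcpu->cpu_caps_ecx &= ~bit;
	}
}

/* Re-derive the dynamic OSXSAVE flag from the current CR4 value. */
static void toy_update_cpuid_runtime(struct toy_vcpu *vcpu)
{
	toy_update_feature_runtime(vcpu, TOY_OSXSAVE,
				   vcpu->cr4 & TOY_CR4_OSXSAVE);
}

static void toy_demo(void)
{
	struct toy_vcpu vcpu = { 0 };

	vcpu.cr4 |= TOY_CR4_OSXSAVE;
	toy_update_cpuid_runtime(&vcpu);
	assert(vcpu.cpuid1_ecx == vcpu.cpu_caps_ecx);
	assert(vcpu.cpu_caps_ecx & TOY_OSXSAVE);
}
```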

Cc: Yang Weijiang <weijiang.yang@intel.com>
Cc: Robert Hoo <robert.hoo.linux@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index be3357a408d4..d3c3e1327ca1 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -270,28 +270,38 @@ static u64 cpuid_get_supported_xcr0(struct kvm_vcpu *vcpu)
 	return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
 }
 
+static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
+						       struct kvm_cpuid_entry2 *entry,
+						       unsigned int x86_feature,
+						       bool has_feature)
+{
+	cpuid_entry_change(entry, x86_feature, has_feature);
+	guest_cpu_cap_change(vcpu, x86_feature, has_feature);
+}
+
 void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
 
 	best = kvm_find_cpuid_entry(vcpu, 1);
 	if (best) {
-		cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
-				   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
+		kvm_update_feature_runtime(vcpu, best, X86_FEATURE_OSXSAVE,
+					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
 
-		cpuid_entry_change(best, X86_FEATURE_APIC,
-			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
+		kvm_update_feature_runtime(vcpu, best, X86_FEATURE_APIC,
+					   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
 
 		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
-			cpuid_entry_change(best, X86_FEATURE_MWAIT,
-					   vcpu->arch.ia32_misc_enable_msr &
-					   MSR_IA32_MISC_ENABLE_MWAIT);
+			kvm_update_feature_runtime(vcpu, best, X86_FEATURE_MWAIT,
+						   vcpu->arch.ia32_misc_enable_msr &
+						   MSR_IA32_MISC_ENABLE_MWAIT);
 	}
 
 	best = kvm_find_cpuid_entry_index(vcpu, 7, 0);
 	if (best)
-		cpuid_entry_change(best, X86_FEATURE_OSPKE,
-				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
+		kvm_update_feature_runtime(vcpu, best, X86_FEATURE_OSPKE,
+					   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
+
 
 	best = kvm_find_cpuid_entry_index(vcpu, 0xD, 0);
 	if (best)
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 49/57] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has()
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (47 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 48/57] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 50/57] KVM: x86: Replace (almost) all guest CPUID feature queries with cpu_caps Sean Christopherson
                   ` (9 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Move the implementations of guest_has_{spec_ctrl,pred_cmd}_msr() down
below guest_cpu_cap_has() so that their use of guest_cpuid_has() can be
replaced with calls to guest_cpu_cap_has().

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.h | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 27da0964355c..4901145ba2dc 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -149,21 +149,6 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
 	return x86_stepping(best->eax);
 }
 
-static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
-{
-	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
-}
-
-static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu)
-{
-	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_SBPB));
-}
-
 static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT;
@@ -279,4 +264,19 @@ static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr
 	return kvm_vcpu_is_legal_gpa(vcpu, cr3);
 }
 
+static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
+{
+	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
+}
+
+static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu)
+{
+	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_SBPB));
+}
+
 #endif
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 50/57] KVM: x86: Replace (almost) all guest CPUID feature queries with cpu_caps
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (48 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 49/57] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has() Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-12-13  2:14   ` Chao Gao
  2024-11-28  1:34 ` [PATCH v3 51/57] KVM: x86: Drop superfluous host XSAVE check when adjusting guest XSAVES caps Sean Christopherson
                   ` (8 subsequent siblings)
  58 siblings, 1 reply; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Switch all queries (except XSAVES) of guest features from guest CPUID to
guest capabilities, i.e. replace all calls to guest_cpuid_has() with calls
to guest_cpu_cap_has().

Keep guest_cpuid_has() around for XSAVES, but subsume its helper
guest_cpuid_get_register() and add a compile-time assertion to prevent
using guest_cpuid_has() for any other feature.  Add yet another comment
for XSAVES to explain why KVM is allowed to query its raw guest CPUID.

Opportunistically drop the unused guest_cpuid_clear(), as there should be
no circumstance in which KVM needs to _clear_ a guest CPUID feature now
that everything is tracked via cpu_caps.  E.g. KVM may need to _change_
a feature to emulate dynamic CPUID flags, but KVM should never need to
clear a feature in guest CPUID to prevent it from being used by the guest.

Delete the last remnants of the governed features framework, as the lone
holdout was vmx_adjust_secondary_exec_control()'s divergent behavior for
governed vs. ungoverned features.

Note, replacing guest_cpuid_has() checks with guest_cpu_cap_has() when
computing reserved CR4 bits is a nop when viewed as a whole, as KVM's
capabilities are already incorporated into the calculation, i.e. if a
feature is present in guest CPUID but unsupported by KVM, its CR4 bit
was already being marked as reserved; checking guest_cpu_cap_has() simply
double-stamps that it's a reserved bit.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
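
Reader's note (not part of the commit): the claim that the CR4 change is a
nop can be checked with plain bit arithmetic.  The sketch below is a toy
userspace model, not KVM code.  It assumes guest capabilities are simply
guest CPUID masked by KVM's capabilities, which glosses over the features
KVM adjusts dynamically.  All TOY_* names, the toy_* functions and the bit
assignments are hypothetical.

```c
#include <stdint.h>

/* Hypothetical feature bits; each one gates one CR4 bit in this toy model. */
#define TOY_F_PKE	(1u << 0)
#define TOY_F_LA57	(1u << 1)
#define TOY_F_UMIP	(1u << 2)
#define TOY_F_ALL	(TOY_F_PKE | TOY_F_LA57 | TOY_F_UMIP)

/* A CR4 bit is reserved when the feature gating it is absent. */
static uint32_t toy_cr4_reserved(uint32_t features)
{
	return ~features & TOY_F_ALL;
}

/* Old computation: KVM's caps OR'd with raw guest CPUID. */
static uint32_t toy_rsvd_old(uint32_t kvm_caps, uint32_t guest_cpuid)
{
	return toy_cr4_reserved(kvm_caps) | toy_cr4_reserved(guest_cpuid);
}

/* New computation: guest caps, assumed to be guest CPUID masked by KVM's caps. */
static uint32_t toy_rsvd_new(uint32_t kvm_caps, uint32_t guest_cpuid)
{
	uint32_t guest_caps = guest_cpuid & kvm_caps;

	return toy_cr4_reserved(kvm_caps) | toy_cr4_reserved(guest_caps);
}
```

In this model a feature that is set in guest CPUID but missing from KVM's
caps is already reserved by the first term, so swapping the second term
from raw CPUID to guest caps cannot change the result.
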
 arch/x86/kvm/cpuid.c             |  4 +-
 arch/x86/kvm/cpuid.h             | 76 ++++++++++++--------------------
 arch/x86/kvm/governed_features.h | 22 ---------
 arch/x86/kvm/hyperv.c            |  2 +-
 arch/x86/kvm/lapic.c             |  4 +-
 arch/x86/kvm/smm.c               | 10 ++---
 arch/x86/kvm/svm/pmu.c           |  8 ++--
 arch/x86/kvm/svm/sev.c           |  4 +-
 arch/x86/kvm/svm/svm.c           | 20 ++++-----
 arch/x86/kvm/vmx/hyperv.h        |  2 +-
 arch/x86/kvm/vmx/nested.c        | 12 ++---
 arch/x86/kvm/vmx/pmu_intel.c     |  4 +-
 arch/x86/kvm/vmx/sgx.c           | 14 +++---
 arch/x86/kvm/vmx/vmx.c           | 47 +++++++++-----------
 arch/x86/kvm/x86.c               | 66 +++++++++++++--------------
 15 files changed, 124 insertions(+), 171 deletions(-)
 delete mode 100644 arch/x86/kvm/governed_features.h
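
Reader's note (not part of the commit): the XSAVES carve-out combines two
ideas, namely that KVM's view of guest capabilities can legitimately differ
from raw guest CPUID, and that raw CPUID lookups are fenced off at compile
time for every other feature.  The sketch below is a simplified userspace
model of both, not the kernel implementation; struct toy_vcpu, the TOY_*
macros and the toy_* helpers are hypothetical, and TOY_BUILD_BUG_ON() uses
the negative-array-size trick rather than the kernel's BUILD_BUG_ON().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature IDs; not the real X86_FEATURE_* encodings. */
#define TOY_XSAVE	0
#define TOY_XSAVES	1

struct toy_vcpu {
	uint32_t raw_cpuid;	/* what userspace set via CPUID */
	uint32_t cpu_caps;	/* KVM's view of what the guest can use */
};

/* Fails to compile if cond is true: the array size becomes negative. */
#define TOY_BUILD_BUG_ON(cond)	((void)sizeof(char[1 - 2 * !!(cond)]))

static inline bool toy_cpu_cap_has(const struct toy_vcpu *vcpu, unsigned int f)
{
	return vcpu->cpu_caps & (1u << f);
}

/* Raw CPUID lookups are restricted to XSAVES; any other constant breaks the build. */
#define toy_raw_cpuid_has(vcpu, f)					\
	(TOY_BUILD_BUG_ON((f) != TOY_XSAVES),				\
	 (bool)((vcpu)->raw_cpuid & (1u << (f))))

/* SVM-like adjustment: XSAVES is usable if the host has it and XSAVE is exposed. */
static inline void toy_after_set_cpuid(struct toy_vcpu *vcpu, bool host_has_xsaves)
{
	vcpu->cpu_caps = vcpu->raw_cpuid;
	if (host_has_xsaves && toy_cpu_cap_has(vcpu, TOY_XSAVE))
		vcpu->cpu_caps |= 1u << TOY_XSAVES;
	else
		vcpu->cpu_caps &= ~(1u << TOY_XSAVES);
}
```

With only XSAVE set in raw CPUID on a host that supports XSAVES, the
capability query reports XSAVES while the raw query does not, which is the
gap the direct XSS access checks rely on.  Calling
toy_raw_cpuid_has(vcpu, TOY_XSAVE) would be rejected at build time.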

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index d3c3e1327ca1..8d088a888a0d 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -416,7 +416,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * and can install smaller shadow pages if the host lacks 1GiB support.
 	 */
 	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
-				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
+				      guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES);
 	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
 
 	best = kvm_find_cpuid_entry(vcpu, 1);
@@ -441,7 +441,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
 	vcpu->arch.cr4_guest_rsvd_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_) |
-					 __cr4_reserved_bits(guest_cpuid_has, vcpu);
+					 __cr4_reserved_bits(guest_cpu_cap_has, vcpu);
 #undef __kvm_cpu_cap_has
 
 	kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu));
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 4901145ba2dc..3d69a0ef8268 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -66,41 +66,40 @@ static __always_inline void cpuid_entry_override(struct kvm_cpuid_entry2 *entry,
 	*reg = kvm_cpu_caps[leaf];
 }
 
-static __always_inline u32 *guest_cpuid_get_register(struct kvm_vcpu *vcpu,
-						     unsigned int x86_feature)
+static __always_inline bool guest_cpuid_has(struct kvm_vcpu *vcpu,
+					    unsigned int x86_feature)
 {
 	const struct cpuid_reg cpuid = x86_feature_cpuid(x86_feature);
 	struct kvm_cpuid_entry2 *entry;
+	u32 *reg;
+
+	/*
+	 * XSAVES is a special snowflake.  Due to lack of a dedicated intercept
+	 * on SVM, KVM must assume that XSAVES (and thus XRSTORS) is usable by
+	 * the guest if the host supports XSAVES and *XSAVE* is exposed to the
+	 * guest.  Because the guest can execute XSAVES and XRSTORS, i.e. can
+	 * indirectly consume XSS, KVM must ensure XSS is zeroed when running
+	 * the guest, i.e. must set XSAVES in vCPU capabilities.  But to reject
+	 * direct XSS reads and writes (to minimize the virtualization hole and
+	 * honor userspace's CPUID), KVM needs to check the raw guest CPUID,
+	 * not KVM's view of guest capabilities.
+	 *
+	 * For all other features, guest capabilities are accurate.  Expand
+	 * this allowlist with extreme vigilance.
+	 */
+	BUILD_BUG_ON(x86_feature != X86_FEATURE_XSAVES);
 
 	entry = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
 	if (!entry)
 		return NULL;
 
-	return __cpuid_entry_get_reg(entry, cpuid.reg);
-}
-
-static __always_inline bool guest_cpuid_has(struct kvm_vcpu *vcpu,
-					    unsigned int x86_feature)
-{
-	u32 *reg;
-
-	reg = guest_cpuid_get_register(vcpu, x86_feature);
+	reg = __cpuid_entry_get_reg(entry, cpuid.reg);
 	if (!reg)
 		return false;
 
 	return *reg & __feature_bit(x86_feature);
 }
 
-static __always_inline void guest_cpuid_clear(struct kvm_vcpu *vcpu,
-					      unsigned int x86_feature)
-{
-	u32 *reg;
-
-	reg = guest_cpuid_get_register(vcpu, x86_feature);
-	if (reg)
-		*reg &= ~__feature_bit(x86_feature);
-}
-
 static inline bool guest_cpuid_is_amd_compatible(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.is_amd_compatible;
@@ -201,27 +200,6 @@ static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
 	return vcpu->arch.pv_cpuid.features & (1u << kvm_feature);
 }
 
-enum kvm_governed_features {
-#define KVM_GOVERNED_FEATURE(x) KVM_GOVERNED_##x,
-#include "governed_features.h"
-	KVM_NR_GOVERNED_FEATURES
-};
-
-static __always_inline int kvm_governed_feature_index(unsigned int x86_feature)
-{
-	switch (x86_feature) {
-#define KVM_GOVERNED_FEATURE(x) case x: return KVM_GOVERNED_##x;
-#include "governed_features.h"
-	default:
-		return -1;
-	}
-}
-
-static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
-{
-	return kvm_governed_feature_index(x86_feature) >= 0;
-}
-
 static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
 					      unsigned int x86_feature)
 {
@@ -266,17 +244,17 @@ static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr
 
 static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
 {
-	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
+	return (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_STIBP) ||
+		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBRS) ||
+		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_SSBD));
 }
 
 static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu)
 {
-	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_SBPB));
+	return (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBPB) ||
+		guest_cpu_cap_has(vcpu, X86_FEATURE_SBPB));
 }
 
 #endif
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
deleted file mode 100644
index ad463b1ed4e4..000000000000
--- a/arch/x86/kvm/governed_features.h
+++ /dev/null
@@ -1,22 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#if !defined(KVM_GOVERNED_FEATURE) || defined(KVM_GOVERNED_X86_FEATURE)
-BUILD_BUG()
-#endif
-
-#define KVM_GOVERNED_X86_FEATURE(x) KVM_GOVERNED_FEATURE(X86_FEATURE_##x)
-
-KVM_GOVERNED_X86_FEATURE(GBPAGES)
-KVM_GOVERNED_X86_FEATURE(XSAVES)
-KVM_GOVERNED_X86_FEATURE(VMX)
-KVM_GOVERNED_X86_FEATURE(NRIPS)
-KVM_GOVERNED_X86_FEATURE(TSCRATEMSR)
-KVM_GOVERNED_X86_FEATURE(V_VMSAVE_VMLOAD)
-KVM_GOVERNED_X86_FEATURE(LBRV)
-KVM_GOVERNED_X86_FEATURE(PAUSEFILTER)
-KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD)
-KVM_GOVERNED_X86_FEATURE(VGIF)
-KVM_GOVERNED_X86_FEATURE(VNMI)
-KVM_GOVERNED_X86_FEATURE(LAM)
-
-#undef KVM_GOVERNED_X86_FEATURE
-#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 4f0a94346d00..6a6dd5a84f22 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1352,7 +1352,7 @@ static void __kvm_hv_xsaves_xsavec_maybe_warn(struct kvm_vcpu *vcpu)
 		return;
 
 	if (guest_cpuid_has(vcpu, X86_FEATURE_XSAVES) ||
-	    !guest_cpuid_has(vcpu, X86_FEATURE_XSAVEC))
+	    !guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVEC))
 		return;
 
 	pr_notice_ratelimited("Booting SMP Windows KVM VM with !XSAVES && XSAVEC. "
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3c83951c619e..ae81ae27d534 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -598,7 +598,7 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu)
 	 * version first and level-triggered interrupts never get EOIed in
 	 * IOAPIC.
 	 */
-	if (guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_X2APIC) &&
 	    !ioapic_in_kernel(vcpu->kvm))
 		v |= APIC_LVR_DIRECTED_EOI;
 	kvm_lapic_set_reg(apic, APIC_LVR, v);
@@ -2634,7 +2634,7 @@ int kvm_apic_set_base(struct kvm_vcpu *vcpu, u64 value, bool host_initiated)
 		return 0;
 
 	u64 reserved_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu) | 0x2ff |
-		(guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) ? 0 : X2APIC_ENABLE);
+		(guest_cpu_cap_has(vcpu, X86_FEATURE_X2APIC) ? 0 : X2APIC_ENABLE);
 
 	if ((value & reserved_bits) != 0 || new_mode == LAPIC_MODE_INVALID)
 		return 1;
diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c
index 85241c0c7f56..e0ab7df27b66 100644
--- a/arch/x86/kvm/smm.c
+++ b/arch/x86/kvm/smm.c
@@ -283,7 +283,7 @@ void enter_smm(struct kvm_vcpu *vcpu)
 	memset(smram.bytes, 0, sizeof(smram.bytes));
 
 #ifdef CONFIG_X86_64
-	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		enter_smm_save_state_64(vcpu, &smram.smram64);
 	else
 #endif
@@ -353,7 +353,7 @@ void enter_smm(struct kvm_vcpu *vcpu)
 	kvm_set_segment(vcpu, &ds, VCPU_SREG_SS);
 
 #ifdef CONFIG_X86_64
-	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		if (kvm_x86_call(set_efer)(vcpu, 0))
 			goto error;
 #endif
@@ -586,7 +586,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
 	 * supports long mode.
 	 */
 #ifdef CONFIG_X86_64
-	if (guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM)) {
 		struct kvm_segment cs_desc;
 		unsigned long cr4;
 
@@ -609,7 +609,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
 		kvm_set_cr0(vcpu, cr0 & ~(X86_CR0_PG | X86_CR0_PE));
 
 #ifdef CONFIG_X86_64
-	if (guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM)) {
 		unsigned long cr4, efer;
 
 		/* Clear CR4.PAE before clearing EFER.LME. */
@@ -634,7 +634,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
 		return X86EMUL_UNHANDLEABLE;
 
 #ifdef CONFIG_X86_64
-	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		ret = rsm_load_state_64(ctxt, &smram.smram64);
 	else
 #endif
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 22d5a65b410c..288f7f2a46f2 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -46,7 +46,7 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
 
 	switch (msr) {
 	case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_PERFCTR_CORE))
 			return NULL;
 		/*
 		 * Each PMU counter has a pair of CTL and CTR MSRs. CTLn
@@ -109,7 +109,7 @@ static bool amd_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	case MSR_K7_EVNTSEL0 ... MSR_K7_PERFCTR3:
 		return pmu->version > 0;
 	case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
-		return guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE);
+		return guest_cpu_cap_has(vcpu, X86_FEATURE_PERFCTR_CORE);
 	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
 	case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
 	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
@@ -179,7 +179,7 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
 	union cpuid_0x80000022_ebx ebx;
 
 	pmu->version = 1;
-	if (guest_cpuid_has(vcpu, X86_FEATURE_PERFMON_V2)) {
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_PERFMON_V2)) {
 		pmu->version = 2;
 		/*
 		 * Note, PERFMON_V2 is also in 0x80000022.0x0, i.e. the guest
@@ -189,7 +189,7 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
 			     x86_feature_cpuid(X86_FEATURE_PERFMON_V2).index);
 		ebx.full = kvm_find_cpuid_entry_index(vcpu, 0x80000022, 0)->ebx;
 		pmu->nr_arch_gp_counters = ebx.split.num_core_pmc;
-	} else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE)) {
+	} else if (guest_cpu_cap_has(vcpu, X86_FEATURE_PERFCTR_CORE)) {
 		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
 	} else {
 		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 4e5aba3f86cd..09be12a44288 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4448,8 +4448,8 @@ static void sev_es_vcpu_after_set_cpuid(struct vcpu_svm *svm)
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 
 	if (boot_cpu_has(X86_FEATURE_V_TSC_AUX)) {
-		bool v_tsc_aux = guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
-				 guest_cpuid_has(vcpu, X86_FEATURE_RDPID);
+		bool v_tsc_aux = guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) ||
+				 guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID);
 
 		set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, v_tsc_aux, v_tsc_aux);
 	}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 0045fe474023..734b3ca40311 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1187,14 +1187,14 @@ static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu,
 	 */
 	if (kvm_cpu_cap_has(X86_FEATURE_INVPCID)) {
 		if (!npt_enabled ||
-		    !guest_cpuid_has(&svm->vcpu, X86_FEATURE_INVPCID))
+		    !guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_INVPCID))
 			svm_set_intercept(svm, INTERCEPT_INVPCID);
 		else
 			svm_clr_intercept(svm, INTERCEPT_INVPCID);
 	}
 
 	if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP)) {
-		if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP))
 			svm_clr_intercept(svm, INTERCEPT_RDTSCP);
 		else
 			svm_set_intercept(svm, INTERCEPT_RDTSCP);
@@ -2940,7 +2940,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_AMD64_VIRT_SPEC_CTRL:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_VIRT_SSBD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_VIRT_SSBD))
 			return 1;
 
 		msr_info->data = svm->virt_spec_ctrl;
@@ -3091,7 +3091,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		break;
 	case MSR_AMD64_VIRT_SPEC_CTRL:
 		if (!msr->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_VIRT_SSBD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_VIRT_SSBD))
 			return 1;
 
 		if (data & ~SPEC_CTRL_SSBD)
@@ -3272,7 +3272,7 @@ static int invpcid_interception(struct kvm_vcpu *vcpu)
 	unsigned long type;
 	gva_t gva;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_INVPCID)) {
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_INVPCID)) {
 		kvm_queue_exception(vcpu, UD_VECTOR);
 		return 1;
 	}
@@ -4404,7 +4404,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
 			     boot_cpu_has(X86_FEATURE_XSAVE) &&
 			     boot_cpu_has(X86_FEATURE_XSAVES) &&
-			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
+			     guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE));
 
 	/*
 	 * Intercept VMLOAD if the vCPU model is Intel in order to emulate that
@@ -4422,7 +4422,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 	if (boot_cpu_has(X86_FEATURE_FLUSH_L1D))
 		set_msr_interception(vcpu, svm->msrpm, MSR_IA32_FLUSH_CMD, 0,
-				     !!guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D));
+				     !!guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
 
 	if (sev_guest(vcpu->kvm))
 		sev_vcpu_after_set_cpuid(svm);
@@ -4673,7 +4673,7 @@ static int svm_enter_smm(struct kvm_vcpu *vcpu, union kvm_smram *smram)
 	 * responsible for ensuring nested SVM and SMIs are mutually exclusive.
 	 */
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		return 1;
 
 	smram->smram64.svm_guest_flag = 1;
@@ -4720,14 +4720,14 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const union kvm_smram *smram)
 
 	const struct kvm_smram_state_64 *smram64 = &smram->smram64;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		return 0;
 
 	/* Non-zero if SMI arrived while vCPU was in guest mode. */
 	if (!smram64->svm_guest_flag)
 		return 0;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_SVM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SVM))
 		return 1;
 
 	if (!(smram64->efer & EFER_SVME))
diff --git a/arch/x86/kvm/vmx/hyperv.h b/arch/x86/kvm/vmx/hyperv.h
index a87407412615..11a339009781 100644
--- a/arch/x86/kvm/vmx/hyperv.h
+++ b/arch/x86/kvm/vmx/hyperv.h
@@ -42,7 +42,7 @@ static inline struct hv_enlightened_vmcs *nested_vmx_evmcs(struct vcpu_vmx *vmx)
 	return vmx->nested.hv_evmcs;
 }
 
-static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
+static inline bool guest_cpu_cap_has_evmcs(struct kvm_vcpu *vcpu)
 {
 	/*
 	 * eVMCS is exposed to the guest if Hyper-V is enabled in CPUID and
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9aaa703f5f98..af2a8b021d0f 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -257,7 +257,7 @@ static bool nested_evmcs_handle_vmclear(struct kvm_vcpu *vcpu, gpa_t vmptr)
 	 * state. It is possible that the area will stay mapped as
 	 * vmx->nested.hv_evmcs but this shouldn't be a problem.
 	 */
-	if (!guest_cpuid_has_evmcs(vcpu) ||
+	if (!guest_cpu_cap_has_evmcs(vcpu) ||
 	    !evmptr_is_valid(nested_get_evmptr(vcpu)))
 		return false;
 
@@ -2089,7 +2089,7 @@ static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
 	bool evmcs_gpa_changed = false;
 	u64 evmcs_gpa;
 
-	if (likely(!guest_cpuid_has_evmcs(vcpu)))
+	if (likely(!guest_cpu_cap_has_evmcs(vcpu)))
 		return EVMPTRLD_DISABLED;
 
 	evmcs_gpa = nested_get_evmptr(vcpu);
@@ -2992,7 +2992,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
 		return -EINVAL;
 
 #ifdef CONFIG_KVM_HYPERV
-	if (guest_cpuid_has_evmcs(vcpu))
+	if (guest_cpu_cap_has_evmcs(vcpu))
 		return nested_evmcs_check_controls(vmcs12);
 #endif
 
@@ -3287,7 +3287,7 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)
 	 * L2 was running), map it here to make sure vmcs12 changes are
 	 * properly reflected.
 	 */
-	if (guest_cpuid_has_evmcs(vcpu) &&
+	if (guest_cpu_cap_has_evmcs(vcpu) &&
 	    vmx->nested.hv_evmcs_vmptr == EVMPTR_MAP_PENDING) {
 		enum nested_evmptrld_status evmptrld_status =
 			nested_vmx_handle_enlightened_vmptrld(vcpu, false);
@@ -5015,7 +5015,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
 	 * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
 	 * separate modes for L2 vs L1.
 	 */
-	if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL))
 		indirect_branch_prediction_barrier();
 
 	/* Update any VMCS fields that might have changed while L2 ran */
@@ -6279,7 +6279,7 @@ static bool nested_vmx_exit_handled_encls(struct kvm_vcpu *vcpu,
 {
 	u32 encls_leaf;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_SGX) ||
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SGX) ||
 	    !nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENCLS_EXITING))
 		return false;
 
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 9c9d4a336166..77012b2eca0e 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -110,7 +110,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
 
 static inline u64 vcpu_get_perf_capabilities(struct kvm_vcpu *vcpu)
 {
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_PDCM))
 		return 0;
 
 	return vcpu->arch.perf_capabilities;
@@ -160,7 +160,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 		ret = vcpu_get_perf_capabilities(vcpu) & PERF_CAP_PEBS_FORMAT;
 		break;
 	case MSR_IA32_DS_AREA:
-		ret = guest_cpuid_has(vcpu, X86_FEATURE_DS);
+		ret = guest_cpu_cap_has(vcpu, X86_FEATURE_DS);
 		break;
 	case MSR_PEBS_DATA_CFG:
 		perf_capabilities = vcpu_get_perf_capabilities(vcpu);
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index b352a3ba7354..9961e07cf071 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -122,7 +122,7 @@ static int sgx_inject_fault(struct kvm_vcpu *vcpu, gva_t gva, int trapnr)
 	 * likely than a bad userspace address.
 	 */
 	if ((trapnr == PF_VECTOR || !boot_cpu_has(X86_FEATURE_SGX2)) &&
-	    guest_cpuid_has(vcpu, X86_FEATURE_SGX2)) {
+	    guest_cpu_cap_has(vcpu, X86_FEATURE_SGX2)) {
 		memset(&ex, 0, sizeof(ex));
 		ex.vector = PF_VECTOR;
 		ex.error_code = PFERR_PRESENT_MASK | PFERR_WRITE_MASK |
@@ -365,7 +365,7 @@ static inline bool encls_leaf_enabled_in_guest(struct kvm_vcpu *vcpu, u32 leaf)
 		return true;
 
 	if (leaf >= EAUG && leaf <= EMODT)
-		return guest_cpuid_has(vcpu, X86_FEATURE_SGX2);
+		return guest_cpu_cap_has(vcpu, X86_FEATURE_SGX2);
 
 	return false;
 }
@@ -381,8 +381,8 @@ int handle_encls(struct kvm_vcpu *vcpu)
 {
 	u32 leaf = (u32)kvm_rax_read(vcpu);
 
-	if (!enable_sgx || !guest_cpuid_has(vcpu, X86_FEATURE_SGX) ||
-	    !guest_cpuid_has(vcpu, X86_FEATURE_SGX1)) {
+	if (!enable_sgx || !guest_cpu_cap_has(vcpu, X86_FEATURE_SGX) ||
+	    !guest_cpu_cap_has(vcpu, X86_FEATURE_SGX1)) {
 		kvm_queue_exception(vcpu, UD_VECTOR);
 	} else if (!encls_leaf_enabled_in_guest(vcpu, leaf) ||
 		   !sgx_enabled_in_guest_bios(vcpu) || !is_paging(vcpu)) {
@@ -479,15 +479,15 @@ void vmx_write_encls_bitmap(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	if (!cpu_has_vmx_encls_vmexit())
 		return;
 
-	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX) &&
 	    sgx_enabled_in_guest_bios(vcpu)) {
-		if (guest_cpuid_has(vcpu, X86_FEATURE_SGX1)) {
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX1)) {
 			bitmap &= ~GENMASK_ULL(ETRACK, ECREATE);
 			if (sgx_intercept_encls_ecreate(vcpu))
 				bitmap |= (1 << ECREATE);
 		}
 
-		if (guest_cpuid_has(vcpu, X86_FEATURE_SGX2))
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX2))
 			bitmap &= ~GENMASK_ULL(EMODT, EAUG);
 
 		/*
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a7c2c36f2a4f..6e5edaa2ba3a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1908,8 +1908,8 @@ static void vmx_setup_uret_msrs(struct vcpu_vmx *vmx)
 	vmx_setup_uret_msr(vmx, MSR_EFER, update_transition_efer(vmx));
 
 	vmx_setup_uret_msr(vmx, MSR_TSC_AUX,
-			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
-			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID));
+			   guest_cpu_cap_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
+			   guest_cpu_cap_has(&vmx->vcpu, X86_FEATURE_RDPID));
 
 	/*
 	 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations of new
@@ -2062,7 +2062,7 @@ int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_BNDCFGS:
 		if (!kvm_mpx_supported() ||
 		    (!msr_info->host_initiated &&
-		     !guest_cpuid_has(vcpu, X86_FEATURE_MPX)))
+		     !guest_cpu_cap_has(vcpu, X86_FEATURE_MPX)))
 			return 1;
 		msr_info->data = vmcs_read64(GUEST_BNDCFGS);
 		break;
@@ -2078,7 +2078,7 @@ int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_SGX_LC))
 			return 1;
 		msr_info->data = to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash
 			[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
@@ -2097,7 +2097,7 @@ int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * sanity checking and refuse to boot. Filter all unsupported
 		 * features out.
 		 */
-		if (!msr_info->host_initiated && guest_cpuid_has_evmcs(vcpu))
+		if (!msr_info->host_initiated && guest_cpu_cap_has_evmcs(vcpu))
 			nested_evmcs_filter_control_msr(vcpu, msr_info->index,
 							&msr_info->data);
 #endif
@@ -2167,7 +2167,7 @@ static u64 nested_vmx_truncate_sysenter_addr(struct kvm_vcpu *vcpu,
 						    u64 data)
 {
 #ifdef CONFIG_X86_64
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		return (u32)data;
 #endif
 	return (unsigned long)data;
@@ -2178,7 +2178,7 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated
 	u64 debugctl = 0;
 
 	if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT) &&
-	    (host_initiated || guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT)))
+	    (host_initiated || guest_cpu_cap_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT)))
 		debugctl |= DEBUGCTLMSR_BUS_LOCK_DETECT;
 
 	if ((kvm_caps.supported_perf_cap & PMU_CAP_LBR_FMT) &&
@@ -2282,7 +2282,7 @@ int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_BNDCFGS:
 		if (!kvm_mpx_supported() ||
 		    (!msr_info->host_initiated &&
-		     !guest_cpuid_has(vcpu, X86_FEATURE_MPX)))
+		     !guest_cpu_cap_has(vcpu, X86_FEATURE_MPX)))
 			return 1;
 		if (is_noncanonical_msr_address(data & PAGE_MASK, vcpu) ||
 		    (data & MSR_IA32_BNDCFGS_RSVD))
@@ -2384,7 +2384,7 @@ int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * behavior, but it's close enough.
 		 */
 		if (!msr_info->host_initiated &&
-		    (!guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC) ||
+		    (!guest_cpu_cap_has(vcpu, X86_FEATURE_SGX_LC) ||
 		    ((vmx->msr_ia32_feature_control & FEAT_CTL_LOCKED) &&
 		    !(vmx->msr_ia32_feature_control & FEAT_CTL_SGX_LC_ENABLED))))
 			return 1;
@@ -2468,9 +2468,9 @@ int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			if ((data & PERF_CAP_PEBS_MASK) !=
 			    (kvm_caps.supported_perf_cap & PERF_CAP_PEBS_MASK))
 				return 1;
-			if (!guest_cpuid_has(vcpu, X86_FEATURE_DS))
+			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_DS))
 				return 1;
-			if (!guest_cpuid_has(vcpu, X86_FEATURE_DTES64))
+			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_DTES64))
 				return 1;
 			if (!cpuid_model_is_consistent(vcpu))
 				return 1;
@@ -4590,10 +4590,7 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
 	bool __enabled;										\
 												\
 	if (cpu_has_vmx_##name()) {								\
-		if (kvm_is_governed_feature(X86_FEATURE_##feat_name))				\
-			__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);		\
-		else										\
-			__enabled = guest_cpuid_has(__vcpu, X86_FEATURE_##feat_name);		\
+		__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);			\
 		vmx_adjust_secondary_exec_control(vmx, exec_control, SECONDARY_EXEC_##ctrl_name,\
 						  __enabled, exiting);				\
 	}											\
@@ -4669,8 +4666,8 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
 	 */
 	if (cpu_has_vmx_rdtscp()) {
 		bool rdpid_or_rdtscp_enabled =
-			guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
-			guest_cpuid_has(vcpu, X86_FEATURE_RDPID);
+			guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) ||
+			guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID);
 
 		vmx_adjust_secondary_exec_control(vmx, &exec_control,
 						  SECONDARY_EXEC_ENABLE_RDTSCP,
@@ -5959,7 +5956,7 @@ static int handle_invpcid(struct kvm_vcpu *vcpu)
 	} operand;
 	int gpr_index;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_INVPCID)) {
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_INVPCID)) {
 		kvm_queue_exception(vcpu, UD_VECTOR);
 		return 1;
 	}
@@ -7829,7 +7826,7 @@ void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * set if and only if XSAVE is supported.
 	 */
 	if (!boot_cpu_has(X86_FEATURE_XSAVE) ||
-	    !guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
+	    !guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE))
 		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
 
 	vmx_setup_uret_msrs(vmx);
@@ -7851,21 +7848,21 @@ void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		nested_vmx_cr_fixed1_bits_update(vcpu);
 
 	if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
-			guest_cpuid_has(vcpu, X86_FEATURE_INTEL_PT))
+			guest_cpu_cap_has(vcpu, X86_FEATURE_INTEL_PT))
 		update_intel_pt_cfg(vcpu);
 
 	if (boot_cpu_has(X86_FEATURE_RTM)) {
 		struct vmx_uret_msr *msr;
 		msr = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
 		if (msr) {
-			bool enabled = guest_cpuid_has(vcpu, X86_FEATURE_RTM);
+			bool enabled = guest_cpu_cap_has(vcpu, X86_FEATURE_RTM);
 			vmx_set_guest_uret_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE);
 		}
 	}
 
 	if (kvm_cpu_cap_has(X86_FEATURE_XFD))
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_XFD_ERR, MSR_TYPE_R,
-					  !guest_cpuid_has(vcpu, X86_FEATURE_XFD));
+					  !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD));
 
 	if (boot_cpu_has(X86_FEATURE_IBPB))
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PRED_CMD, MSR_TYPE_W,
@@ -7873,17 +7870,17 @@ void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 	if (boot_cpu_has(X86_FEATURE_FLUSH_L1D))
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W,
-					  !guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D));
+					  !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
 
 	set_cr4_guest_host_mask(vmx);
 
 	vmx_write_encls_bitmap(vcpu, NULL);
-	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX))
 		vmx->msr_ia32_feature_control_valid_bits |= FEAT_CTL_SGX_ENABLED;
 	else
 		vmx->msr_ia32_feature_control_valid_bits &= ~FEAT_CTL_SGX_ENABLED;
 
-	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX_LC))
 		vmx->msr_ia32_feature_control_valid_bits |=
 			FEAT_CTL_SGX_LC_ENABLED;
 	else
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1ee955cdb109..cc4563fb07d1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1502,10 +1502,10 @@ static u64 kvm_dr6_fixed(struct kvm_vcpu *vcpu)
 {
 	u64 fixed = DR6_FIXED_1;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_RTM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_RTM))
 		fixed |= DR6_RTM;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
 		fixed |= DR6_BUS_LOCK;
 	return fixed;
 }
@@ -1681,20 +1681,20 @@ static int do_get_feature_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 
 static bool __kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
-	if (efer & EFER_AUTOIBRS && !guest_cpuid_has(vcpu, X86_FEATURE_AUTOIBRS))
+	if (efer & EFER_AUTOIBRS && !guest_cpu_cap_has(vcpu, X86_FEATURE_AUTOIBRS))
 		return false;
 
-	if (efer & EFER_FFXSR && !guest_cpuid_has(vcpu, X86_FEATURE_FXSR_OPT))
+	if (efer & EFER_FFXSR && !guest_cpu_cap_has(vcpu, X86_FEATURE_FXSR_OPT))
 		return false;
 
-	if (efer & EFER_SVME && !guest_cpuid_has(vcpu, X86_FEATURE_SVM))
+	if (efer & EFER_SVME && !guest_cpu_cap_has(vcpu, X86_FEATURE_SVM))
 		return false;
 
 	if (efer & (EFER_LME | EFER_LMA) &&
-	    !guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	    !guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		return false;
 
-	if (efer & EFER_NX && !guest_cpuid_has(vcpu, X86_FEATURE_NX))
+	if (efer & EFER_NX && !guest_cpu_cap_has(vcpu, X86_FEATURE_NX))
 		return false;
 
 	return true;
@@ -1836,8 +1836,8 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
 			return 1;
 
 		if (!host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) &&
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID))
 			return 1;
 
 		/*
@@ -1894,8 +1894,8 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
 			return 1;
 
 		if (!host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) &&
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID))
 			return 1;
 		break;
 	}
@@ -2081,7 +2081,7 @@ EXPORT_SYMBOL_GPL(kvm_handle_invalid_op);
 static int kvm_emulate_monitor_mwait(struct kvm_vcpu *vcpu, const char *insn)
 {
 	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS) &&
-	    !guest_cpuid_has(vcpu, X86_FEATURE_MWAIT))
+	    !guest_cpu_cap_has(vcpu, X86_FEATURE_MWAIT))
 		return kvm_handle_invalid_op(vcpu);
 
 	pr_warn_once("%s instruction emulated as NOP!\n", insn);
@@ -3753,13 +3753,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_ARCH_CAPABILITIES:
 		if (!msr_info->host_initiated ||
-		    !guest_cpuid_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
 			return KVM_MSR_RET_UNSUPPORTED;
 		vcpu->arch.arch_capabilities = data;
 		break;
 	case MSR_IA32_PERF_CAPABILITIES:
 		if (!msr_info->host_initiated ||
-		    !guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_PDCM))
 			return KVM_MSR_RET_UNSUPPORTED;
 
 		if (data & ~kvm_caps.supported_perf_cap)
@@ -3783,11 +3783,11 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			if ((!guest_has_pred_cmd_msr(vcpu)))
 				return 1;
 
-			if (!guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
-			    !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB))
+			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
+			    !guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBPB))
 				reserved_bits |= PRED_CMD_IBPB;
 
-			if (!guest_cpuid_has(vcpu, X86_FEATURE_SBPB))
+			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SBPB))
 				reserved_bits |= PRED_CMD_SBPB;
 		}
 
@@ -3808,7 +3808,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	}
 	case MSR_IA32_FLUSH_CMD:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D))
 			return 1;
 
 		if (!boot_cpu_has(X86_FEATURE_FLUSH_L1D) || (data & ~L1D_FLUSH))
@@ -3859,7 +3859,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		kvm_set_lapic_tscdeadline_msr(vcpu, data);
 		break;
 	case MSR_IA32_TSC_ADJUST:
-		if (guest_cpuid_has(vcpu, X86_FEATURE_TSC_ADJUST)) {
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_TSC_ADJUST)) {
 			if (!msr_info->host_initiated) {
 				s64 adj = data - vcpu->arch.ia32_tsc_adjust_msr;
 				adjust_tsc_offset_guest(vcpu, adj);
@@ -3886,7 +3886,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 
 		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
 		    ((old_val ^ data)  & MSR_IA32_MISC_ENABLE_MWAIT)) {
-			if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
+			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_XMM3))
 				return 1;
 			vcpu->arch.ia32_misc_enable_msr = data;
 			kvm_update_cpuid_runtime(vcpu);
@@ -4063,12 +4063,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		kvm_pr_unimpl_wrmsr(vcpu, msr, data);
 		break;
 	case MSR_AMD64_OSVW_ID_LENGTH:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
 			return 1;
 		vcpu->arch.osvw.length = data;
 		break;
 	case MSR_AMD64_OSVW_STATUS:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
 			return 1;
 		vcpu->arch.osvw.status = data;
 		break;
@@ -4087,7 +4087,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 #ifdef CONFIG_X86_64
 	case MSR_IA32_XFD:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
 			return 1;
 
 		if (data & ~kvm_guest_supported_xfd(vcpu))
@@ -4097,7 +4097,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_XFD_ERR:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
 			return 1;
 
 		if (data & ~kvm_guest_supported_xfd(vcpu))
@@ -4212,12 +4212,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		msr_info->data = vcpu->arch.microcode_version;
 		break;
 	case MSR_IA32_ARCH_CAPABILITIES:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
 			return KVM_MSR_RET_UNSUPPORTED;
 		msr_info->data = vcpu->arch.arch_capabilities;
 		break;
 	case MSR_IA32_PERF_CAPABILITIES:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_PDCM))
 			return KVM_MSR_RET_UNSUPPORTED;
 		msr_info->data = vcpu->arch.perf_capabilities;
 		break;
@@ -4418,12 +4418,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		msr_info->data = 0xbe702111;
 		break;
 	case MSR_AMD64_OSVW_ID_LENGTH:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
 			return 1;
 		msr_info->data = vcpu->arch.osvw.length;
 		break;
 	case MSR_AMD64_OSVW_STATUS:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
 			return 1;
 		msr_info->data = vcpu->arch.osvw.status;
 		break;
@@ -4442,14 +4442,14 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 #ifdef CONFIG_X86_64
 	case MSR_IA32_XFD:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
 			return 1;
 
 		msr_info->data = vcpu->arch.guest_fpu.fpstate->xfd;
 		break;
 	case MSR_IA32_XFD_ERR:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
 			return 1;
 
 		msr_info->data = vcpu->arch.guest_fpu.xfd_err;
@@ -8502,17 +8502,17 @@ static bool emulator_get_cpuid(struct x86_emulate_ctxt *ctxt,
 
 static bool emulator_guest_has_movbe(struct x86_emulate_ctxt *ctxt)
 {
-	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_MOVBE);
+	return guest_cpu_cap_has(emul_to_vcpu(ctxt), X86_FEATURE_MOVBE);
 }
 
 static bool emulator_guest_has_fxsr(struct x86_emulate_ctxt *ctxt)
 {
-	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_FXSR);
+	return guest_cpu_cap_has(emul_to_vcpu(ctxt), X86_FEATURE_FXSR);
 }
 
 static bool emulator_guest_has_rdpid(struct x86_emulate_ctxt *ctxt)
 {
-	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_RDPID);
+	return guest_cpu_cap_has(emul_to_vcpu(ctxt), X86_FEATURE_RDPID);
 }
 
 static bool emulator_guest_cpuid_is_intel_compatible(struct x86_emulate_ctxt *ctxt)
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 51/57] KVM: x86: Drop superfluous host XSAVE check when adjusting guest XSAVES caps
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (49 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 50/57] KVM: x86: Replace (almost) all guest CPUID feature queries with cpu_caps Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 52/57] KVM: x86: Add a macro for features that are synthesized into boot_cpu_data Sean Christopherson
                   ` (7 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Drop the manual boot_cpu_has() checks on XSAVE when adjusting the guest's
XSAVES capabilities now that guest cpu_caps incorporates KVM's support.
The guest's cpu_caps are initialized from kvm_cpu_caps, which are in turn
initialized from boot_cpu_data, i.e. checking guest_cpu_cap_has() also
checks host/KVM capabilities (which is the entire point of cpu_caps).

Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 1 -
 arch/x86/kvm/vmx/vmx.c | 3 +--
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 734b3ca40311..07911ddf1efe 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4402,7 +4402,6 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * the guest read/write access to the host's XSS.
 	 */
 	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
-			     boot_cpu_has(X86_FEATURE_XSAVE) &&
 			     boot_cpu_has(X86_FEATURE_XSAVES) &&
 			     guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE));
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6e5edaa2ba3a..cf872d8691b5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7825,8 +7825,7 @@ void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * to the guest.  XSAVES depends on CR4.OSXSAVE, and CR4.OSXSAVE can be
 	 * set if and only if XSAVE is supported.
 	 */
-	if (!boot_cpu_has(X86_FEATURE_XSAVE) ||
-	    !guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE))
 		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
 
 	vmx_setup_uret_msrs(vmx);
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 52/57] KVM: x86: Add a macro for features that are synthesized into boot_cpu_data
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (50 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 51/57] KVM: x86: Drop superfluous host XSAVE check when adjusting guest XSAVES caps Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 53/57] KVM: x86: Pull CPUID capabilities from boot_cpu_data only as needed Sean Christopherson
                   ` (6 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Add yet another CPUID macro, this time for features that the host kernel
synthesizes into boot_cpu_data, i.e. that the kernel force sets even in
situations where the feature isn't reported by CPUID.  Thanks to the
macro shenanigans of kvm_cpu_cap_init(), such features can now be handled
in the core CPUID framework, i.e. don't need to be handled out-of-band and
thus without as many guardrails.

Adding a dedicated macro also helps document what's going on, e.g. the
calls to kvm_cpu_cap_check_and_set() are very confusing unless the reader
knows exactly how kvm_cpu_cap_init() generates kvm_cpu_caps (and even
then, it's far from obvious).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 49 +++++++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 8d088a888a0d..2b05a7e61994 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -665,6 +665,7 @@ do {									\
 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
 	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
 	u32 kvm_cpu_cap_passthrough = 0;				\
+	u32 kvm_cpu_cap_synthesized = 0;				\
 	u32 kvm_cpu_cap_emulated = 0;					\
 									\
 	if (leaf < NCAPINTS)						\
@@ -673,7 +674,8 @@ do {									\
 		kvm_cpu_caps[leaf] = (mask);				\
 									\
 	kvm_cpu_caps[leaf] |= kvm_cpu_cap_passthrough;			\
-	kvm_cpu_caps[leaf] &= raw_cpuid_get(cpuid);			\
+	kvm_cpu_caps[leaf] &= (raw_cpuid_get(cpuid) |			\
+			       kvm_cpu_cap_synthesized);		\
 	kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;			\
 } while (0)
 
@@ -720,6 +722,17 @@ do {									\
 	F(name);						\
 })
 
+/*
+ * Synthesized Feature - For features that are synthesized into boot_cpu_data,
+ * i.e. may not be present in the raw CPUID, but can still be advertised to
+ * userspace.  Primarily used for mitigation related feature flags.
+ */
+#define SYNTHESIZED_F(name)					\
+({								\
+	kvm_cpu_cap_synthesized |= F(name);			\
+	F(name);						\
+})
+
 /*
  * Passthrough Feature - For features that KVM supports based purely on raw
  * hardware CPUID, i.e. that KVM virtualizes even if the host kernel doesn't
@@ -1084,35 +1097,32 @@ void kvm_set_cpu_caps(void)
 
 	kvm_cpu_cap_init(CPUID_8000_0021_EAX,
 		F(NO_NESTED_DATA_BP) |
-		F(LFENCE_RDTSC) |
+		/*
+		 * Synthesize "LFENCE is serializing" into the AMD-defined entry
+		 * in KVM's supported CPUID, i.e. if the feature is reported as
+		 * supported by the kernel.  LFENCE_RDTSC was a Linux-defined
+		 * synthetic feature long before AMD joined the bandwagon, e.g.
+		 * LFENCE is serializing on most CPUs that support SSE2.  On
+		 * CPUs that don't support AMD's leaf, ANDing with the raw host
+		 * CPUID will drop the flags, and reporting support in AMD's
+		 * leaf can make it easier for userspace to detect the feature.
+		 */
+		SYNTHESIZED_F(LFENCE_RDTSC) |
 		0 /* SmmPgCfgLock */ |
 		F(NULL_SEL_CLR_BASE) |
 		F(AUTOIBRS) |
 		EMULATED_F(NO_SMM_CTL_MSR) |
 		0 /* PrefetchCtlMsr */ |
-		F(WRMSR_XX_BASE_NS)
+		F(WRMSR_XX_BASE_NS) |
+		SYNTHESIZED_F(SBPB) |
+		SYNTHESIZED_F(IBPB_BRTYPE) |
+		SYNTHESIZED_F(SRSO_NO)
 	);
 
-	kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
-	kvm_cpu_cap_check_and_set(X86_FEATURE_IBPB_BRTYPE);
-	kvm_cpu_cap_check_and_set(X86_FEATURE_SRSO_NO);
-
 	kvm_cpu_cap_init(CPUID_8000_0022_EAX,
 		F(PERFMON_V2)
 	);
 
-	/*
-	 * Synthesize "LFENCE is serializing" into the AMD-defined entry in
-	 * KVM's supported CPUID if the feature is reported as supported by the
-	 * kernel.  LFENCE_RDTSC was a Linux-defined synthetic feature long
-	 * before AMD joined the bandwagon, e.g. LFENCE is serializing on most
-	 * CPUs that support SSE2.  On CPUs that don't support AMD's leaf,
-	 * kvm_cpu_cap_init() will unfortunately drop the flag due to ANDing
-	 * the mask with the raw host CPUID, and reporting support in AMD's
-	 * leaf can make it easier for userspace to detect the feature.
-	 */
-	if (cpu_feature_enabled(X86_FEATURE_LFENCE_RDTSC))
-		kvm_cpu_cap_set(X86_FEATURE_LFENCE_RDTSC);
 	if (!static_cpu_has_bug(X86_BUG_NULL_SEG))
 		kvm_cpu_cap_set(X86_FEATURE_NULL_SEL_CLR_BASE);
 
@@ -1150,6 +1160,7 @@ EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 #undef SF
 #undef X86_64_F
 #undef EMULATED_F
+#undef SYNTHESIZED_F
 #undef PASSTHROUGH_F
 #undef ALIASED_1_EDX_F
 
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 53/57] KVM: x86: Pull CPUID capabilities from boot_cpu_data only as needed
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (51 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 52/57] KVM: x86: Add a macro for features that are synthesized into boot_cpu_data Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 54/57] KVM: x86: Rename "SF" macro to "SCATTERED_F" Sean Christopherson
                   ` (5 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Don't memcpy() all of boot_cpu_data.x86_capability, and instead explicitly
fill each kvm_cpu_caps leaf during kvm_cpu_cap_init().  While clever,
copying all kernel capabilities risks over-reporting KVM capabilities,
e.g. if KVM added support in __do_cpuid_func(), but neglected to init the
supported set of capabilities.

Note, explicitly grabbing leafs deliberately keeps Linux-defined leafs as
0!  KVM should never advertise Linux-defined leafs; any relevant features
that are "real", but scattered, must be gathered in their correct hardware-
defined leaf.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 2b05a7e61994..3b8ec5e7e39a 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -657,21 +657,23 @@ static __always_inline u32 raw_cpuid_get(struct cpuid_reg cpuid)
 }
 
 /*
- * For kernel-defined leafs, mask the boot CPU's pre-populated value.  For KVM-
- * defined leafs, explicitly set the leaf, as KVM is the one and only authority.
+ * For kernel-defined leafs, mask KVM's supported feature set with the kernel's
+ * capabilities as well as raw CPUID.  For KVM-defined leafs, consult only raw
+ * CPUID, as KVM is the one and only authority (in the kernel).
  */
 #define kvm_cpu_cap_init(leaf, mask)					\
 do {									\
 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
 	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
+	const u32 *kernel_cpu_caps = boot_cpu_data.x86_capability;	\
 	u32 kvm_cpu_cap_passthrough = 0;				\
 	u32 kvm_cpu_cap_synthesized = 0;				\
 	u32 kvm_cpu_cap_emulated = 0;					\
 									\
+	kvm_cpu_caps[leaf] = (mask);					\
+									\
 	if (leaf < NCAPINTS)						\
-		kvm_cpu_caps[leaf] &= (mask);				\
-	else								\
-		kvm_cpu_caps[leaf] = (mask);				\
+		kvm_cpu_caps[leaf] &= kernel_cpu_caps[leaf];		\
 									\
 	kvm_cpu_caps[leaf] |= kvm_cpu_cap_passthrough;			\
 	kvm_cpu_caps[leaf] &= (raw_cpuid_get(cpuid) |			\
@@ -769,9 +771,6 @@ void kvm_set_cpu_caps(void)
 	BUILD_BUG_ON(sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)) >
 		     sizeof(boot_cpu_data.x86_capability));
 
-	memcpy(&kvm_cpu_caps, &boot_cpu_data.x86_capability,
-	       sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)));
-
 	kvm_cpu_cap_init(CPUID_1_ECX,
 		F(XMM3) |
 		F(PCLMULQDQ) |
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 54/57] KVM: x86: Rename "SF" macro to "SCATTERED_F"
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (52 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 53/57] KVM: x86: Pull CPUID capabilities from boot_cpu_data only as needed Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 55/57] KVM: x86: Explicitly track feature flags that require vendor enabling Sean Christopherson
                   ` (4 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Now that each feature flag is on its own line, i.e. brevity isn't a major
concern, drop the "SF" acronym and use the (almost) full name, SCATTERED_F.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 3b8ec5e7e39a..a1a80f1f10ec 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -700,7 +700,7 @@ do {									\
 })
 
 /* Scattered Flag - For features that are scattered by cpufeatures.h. */
-#define SF(name)						\
+#define SCATTERED_F(name)					\
 ({								\
 	BUILD_BUG_ON(X86_FEATURE_##name >= MAX_CPU_FEATURES);	\
 	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
@@ -966,9 +966,9 @@ void kvm_set_cpu_caps(void)
 	);
 
 	kvm_cpu_cap_init(CPUID_12_EAX,
-		SF(SGX1) |
-		SF(SGX2) |
-		SF(SGX_EDECCSSA)
+		SCATTERED_F(SGX1) |
+		SCATTERED_F(SGX2) |
+		SCATTERED_F(SGX_EDECCSSA)
 	);
 
 	kvm_cpu_cap_init(CPUID_24_0_EBX,
@@ -1035,7 +1035,7 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_GBPAGES);
 
 	kvm_cpu_cap_init(CPUID_8000_0007_EDX,
-		SF(CONSTANT_TSC)
+		SCATTERED_F(CONSTANT_TSC)
 	);
 
 	kvm_cpu_cap_init(CPUID_8000_0008_EBX,
@@ -1156,7 +1156,7 @@ void kvm_set_cpu_caps(void)
 EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 
 #undef F
-#undef SF
+#undef SCATTERED_F
 #undef X86_64_F
 #undef EMULATED_F
 #undef SYNTHESIZED_F
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 55/57] KVM: x86: Explicitly track feature flags that require vendor enabling
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (53 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 54/57] KVM: x86: Rename "SF" macro to "SCATTERED_F" Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 56/57] KVM: x86: Explicitly track feature flags that are enabled at runtime Sean Christopherson
                   ` (3 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Add another CPUID feature macro, VENDOR_F(), and use it to track features
that KVM supports, but that need additional vendor support and so are
conditionally enabled in vendor code.

Currently, VENDOR_F() is mostly just documentation, but tracking all
KVM-supported features will allow for asserting, at build time, that all
features that are set, cleared, *or* checked by KVM are known to
kvm_set_cpu_caps().

To fudge around a macro collision on 32-bit kernels, #undef DS to be able
to get at X86_FEATURE_DS.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 59 ++++++++++++++++++++++++++++++++------------
 1 file changed, 43 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index a1a80f1f10ec..5ac5fe2febf7 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -758,12 +758,25 @@ do {									\
 	feature_bit(name);							\
 })
 
+/*
+ * Vendor Features - For features that KVM supports, but are added in later
+ * because they require additional vendor enabling.
+ */
+#define VENDOR_F(name)						\
+({								\
+	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
+	0;							\
+})
+
 /*
  * Undefine the MSR bit macro to avoid token concatenation issues when
  * processing X86_FEATURE_SPEC_CTRL_SSBD.
  */
 #undef SPEC_CTRL_SSBD
 
+/* DS is defined by ptrace-abi.h on 32-bit builds. */
+#undef DS
+
 void kvm_set_cpu_caps(void)
 {
 	memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
@@ -774,13 +787,14 @@ void kvm_set_cpu_caps(void)
 	kvm_cpu_cap_init(CPUID_1_ECX,
 		F(XMM3) |
 		F(PCLMULQDQ) |
-		0 /* DTES64 */ |
+		VENDOR_F(DTES64) |
 		/*
 		 * NOTE: MONITOR (and MWAIT) are emulated as NOP, but *not*
 		 * advertised to guests via CPUID!
 		 */
 		0 /* MONITOR */ |
-		0 /* DS-CPL, VMX, SMX, EST */ |
+		VENDOR_F(VMX) |
+		0 /* DS-CPL, SMX, EST */ |
 		0 /* TM2 */ |
 		F(SSSE3) |
 		0 /* CNXT-ID */ |
@@ -827,7 +841,9 @@ void kvm_set_cpu_caps(void)
 		F(PSE36) |
 		0 /* PSN */ |
 		F(CLFLUSH) |
-		0 /* Reserved, DS, ACPI */ |
+		0 /* Reserved */ |
+		VENDOR_F(DS) |
+		0 /* ACPI */ |
 		F(MMX) |
 		F(FXSR) |
 		F(XMM) |
@@ -850,7 +866,7 @@ void kvm_set_cpu_caps(void)
 		F(INVPCID) |
 		F(RTM) |
 		F(ZERO_FCS_FDS) |
-		0 /*MPX*/ |
+		VENDOR_F(MPX) |
 		F(AVX512F) |
 		F(AVX512DQ) |
 		F(RDSEED) |
@@ -859,7 +875,7 @@ void kvm_set_cpu_caps(void)
 		F(AVX512IFMA) |
 		F(CLFLUSHOPT) |
 		F(CLWB) |
-		0 /*INTEL_PT*/ |
+		VENDOR_F(INTEL_PT) |
 		F(AVX512PF) |
 		F(AVX512ER) |
 		F(AVX512CD) |
@@ -884,7 +900,7 @@ void kvm_set_cpu_caps(void)
 		F(CLDEMOTE) |
 		F(MOVDIRI) |
 		F(MOVDIR64B) |
-		0 /*WAITPKG*/ |
+		VENDOR_F(WAITPKG) |
 		F(SGX_LC) |
 		F(BUS_LOCK_DETECT)
 	);
@@ -980,7 +996,7 @@ void kvm_set_cpu_caps(void)
 	kvm_cpu_cap_init(CPUID_8000_0001_ECX,
 		F(LAHF_LM) |
 		F(CMP_LEGACY) |
-		0 /*SVM*/ |
+		VENDOR_F(SVM) |
 		0 /* ExtApicSpace */ |
 		F(CR8_LEGACY) |
 		F(ABM) |
@@ -994,7 +1010,7 @@ void kvm_set_cpu_caps(void)
 		F(FMA4) |
 		F(TBM) |
 		F(TOPOEXT) |
-		0 /* PERFCTR_CORE */
+		VENDOR_F(PERFCTR_CORE)
 	);
 
 	kvm_cpu_cap_init(CPUID_8000_0001_EDX,
@@ -1080,17 +1096,27 @@ void kvm_set_cpu_caps(void)
 	    !boot_cpu_has(X86_FEATURE_AMD_SSBD))
 		kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
 
-	/*
-	 * Hide all SVM features by default, SVM will set the cap bits for
-	 * features it emulates and/or exposes for L1.
-	 */
-	kvm_cpu_cap_init(CPUID_8000_000A_EDX, 0);
+	/* All SVM features require additional vendor module enabling. */
+	kvm_cpu_cap_init(CPUID_8000_000A_EDX,
+		VENDOR_F(NPT) |
+		VENDOR_F(VMCBCLEAN) |
+		VENDOR_F(FLUSHBYASID) |
+		VENDOR_F(NRIPS) |
+		VENDOR_F(TSCRATEMSR) |
+		VENDOR_F(V_VMSAVE_VMLOAD) |
+		VENDOR_F(LBRV) |
+		VENDOR_F(PAUSEFILTER) |
+		VENDOR_F(PFTHRESHOLD) |
+		VENDOR_F(VGIF) |
+		VENDOR_F(VNMI) |
+		VENDOR_F(SVME_ADDR_CHK)
+	);
 
 	kvm_cpu_cap_init(CPUID_8000_001F_EAX,
-		0 /* SME */ |
-		0 /* SEV */ |
+		VENDOR_F(SME) |
+		VENDOR_F(SEV) |
 		0 /* VM_PAGE_FLUSH */ |
-		0 /* SEV_ES */ |
+		VENDOR_F(SEV_ES) |
 		F(SME_COHERENT)
 	);
 
@@ -1162,6 +1188,7 @@ EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 #undef SYNTHESIZED_F
 #undef PASSTHROUGH_F
 #undef ALIASED_1_EDX_F
+#undef VENDOR_F
 
 struct kvm_cpuid_array {
 	struct kvm_cpuid_entry2 *entries;
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 56/57] KVM: x86: Explicitly track feature flags that are enabled at runtime
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (54 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 55/57] KVM: x86: Explicitly track feature flags that require vendor enabling Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-11-28  1:34 ` [PATCH v3 57/57] KVM: x86: Use only local variables (no bitmask) to init kvm_cpu_caps Sean Christopherson
                   ` (2 subsequent siblings)
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Add one last (hopefully) CPUID feature macro, RUNTIME_F(), and use it
to track features that KVM supports, but that are only set at runtime
(in response to other state), and aren't advertised to userspace via
KVM_GET_SUPPORTED_CPUID.

Currently, RUNTIME_F() is mostly just documentation, but tracking all
KVM-supported features will allow for asserting, at build time, that all
features that are set, cleared, *or* checked by KVM are known to
kvm_set_cpu_caps().

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 5ac5fe2febf7..e03154b9833f 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -768,6 +768,16 @@ do {									\
 	0;							\
 })
 
+/*
+ * Runtime Features - For features that KVM dynamically sets/clears at runtime,
+ * e.g. when CR4 changes, but which are never advertised to userspace.
+ */
+#define RUNTIME_F(name)						\
+({								\
+	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
+	0;							\
+})
+
 /*
  * Undefine the MSR bit macro to avoid token concatenation issues when
  * processing X86_FEATURE_SPEC_CTRL_SSBD.
@@ -790,9 +800,11 @@ void kvm_set_cpu_caps(void)
 		VENDOR_F(DTES64) |
 		/*
 		 * NOTE: MONITOR (and MWAIT) are emulated as NOP, but *not*
-		 * advertised to guests via CPUID!
+		 * advertised to guests via CPUID!  MWAIT is also technically a
+		 * runtime flag thanks to IA32_MISC_ENABLES; mark it as such so
+		 * that KVM is aware that it's a known, unadvertised flag.
 		 */
-		0 /* MONITOR */ |
+		RUNTIME_F(MWAIT) |
 		VENDOR_F(VMX) |
 		0 /* DS-CPL, SMX, EST */ |
 		0 /* TM2 */ |
@@ -813,7 +825,7 @@ void kvm_set_cpu_caps(void)
 		EMULATED_F(TSC_DEADLINE_TIMER) |
 		F(AES) |
 		F(XSAVE) |
-		0 /* OSXSAVE */ |
+		RUNTIME_F(OSXSAVE) |
 		F(AVX) |
 		F(F16C) |
 		F(RDRAND) |
@@ -887,7 +899,7 @@ void kvm_set_cpu_caps(void)
 		F(AVX512VBMI) |
 		PASSTHROUGH_F(LA57) |
 		F(PKU) |
-		0 /*OSPKE*/ |
+		RUNTIME_F(OSPKE) |
 		F(RDPID) |
 		F(AVX512_VPOPCNTDQ) |
 		F(UMIP) |
@@ -1189,6 +1201,7 @@ EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 #undef PASSTHROUGH_F
 #undef ALIASED_1_EDX_F
 #undef VENDOR_F
+#undef RUNTIME_F
 
 struct kvm_cpuid_array {
 	struct kvm_cpuid_entry2 *entries;
-- 
2.47.0.338.g60cca15819-goog



* [PATCH v3 57/57] KVM: x86: Use only local variables (no bitmask) to init kvm_cpu_caps
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (55 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 56/57] KVM: x86: Explicitly track feature flags that are enabled at runtime Sean Christopherson
@ 2024-11-28  1:34 ` Sean Christopherson
  2024-12-18  1:15   ` Maxim Levitsky
  2024-12-18  1:13 ` [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Maxim Levitsky
  2024-12-19  2:40 ` Sean Christopherson
  58 siblings, 1 reply; 65+ messages in thread
From: Sean Christopherson @ 2024-11-28  1:34 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Refactor the kvm_cpu_cap_init() macro magic to collect supported features
in a local variable instead of passing them to the macro as a "mask".  As
pointed out by Maxim, relying on macros to "return" a value and set local
variables is surprising, as the bitwise-OR logic suggests the macros are
pure, i.e. have no side effects.

Ideally, the feature initializers would have zero side effects, e.g. would
take local variables as params, but there isn't a sane way to do so
without either sacrificing the various compile-time assertions (basically
a non-starter), or passing at least one variable, e.g. a struct, to each
macro usage (adds a lot of noise and boilerplate code).

Opportunistically force callers to emit a trailing comma by intentionally
omitting a semicolon after invoking the feature initializers.  Forcing a
trailing comma isolates future changes to a single line, i.e. doesn't
cause churn for unrelated features/lines when adding/removing/modifying a
feature.
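
For illustration only, the pattern can be sketched in standalone GNU C
(statement expressions and named variadic macros are GCC/Clang extensions).
Every name below (demo_cap_init, DEMO_FEAT_*, struct demo_caps) is made up
for this sketch and is not KVM's, the bit positions are arbitrary, and the
compile-time assertions are omitted.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
enum { DEMO_FEAT_A = 0, DEMO_FEAT_B = 3, DEMO_FEAT_C = 5 };

#define demo_feature_bit(name)	(UINT32_C(1) << DEMO_FEAT_##name)

/* Each initializer ORs into a local declared by demo_cap_init(). */
#define F(name)							\
({								\
	demo_features |= demo_feature_bit(name);		\
})

/* Also record the bit in a second local, then fall through to F(). */
#define EMULATED_F(name)					\
({								\
	demo_emulated |= demo_feature_bit(name);		\
	F(name);						\
})

/* Known to the table, but contributes no bits. */
#define VENDOR_F(name)						\
({								\
	(void)demo_feature_bit(name);				\
})

/*
 * No semicolon follows feature_initializers, so the caller's trailing comma
 * chains the initializers and the first assignment into one comma expression.
 */
#define demo_cap_init(features, emulated, feature_initializers...)	\
do {									\
	uint32_t demo_features = 0;					\
	uint32_t demo_emulated = 0;					\
									\
	feature_initializers						\
									\
	(features) = demo_features;					\
	(emulated) = demo_emulated;					\
} while (0)

struct demo_caps {
	uint32_t features;
	uint32_t emulated;
};

struct demo_caps demo_build_caps(void)
{
	struct demo_caps caps;

	demo_cap_init(caps.features, caps.emulated,
		F(A),
		VENDOR_F(B),
		EMULATED_F(C),
	);
	return caps;
}
```

In this sketch, adding or removing a feature touches exactly one line of the
demo_cap_init() invocation, and dropping the trailing comma fails to compile.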

No functional change intended.

Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 541 ++++++++++++++++++++++---------------------
 1 file changed, 273 insertions(+), 268 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index e03154b9833f..572dfa7e206e 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -661,7 +661,7 @@ static __always_inline u32 raw_cpuid_get(struct cpuid_reg cpuid)
  * capabilities as well as raw CPUID.  For KVM-defined leafs, consult only raw
  * CPUID, as KVM is the one and only authority (in the kernel).
  */
-#define kvm_cpu_cap_init(leaf, mask)					\
+#define kvm_cpu_cap_init(leaf, feature_initializers...)			\
 do {									\
 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
 	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
@@ -669,8 +669,11 @@ do {									\
 	u32 kvm_cpu_cap_passthrough = 0;				\
 	u32 kvm_cpu_cap_synthesized = 0;				\
 	u32 kvm_cpu_cap_emulated = 0;					\
+	u32 kvm_cpu_cap_features = 0;					\
 									\
-	kvm_cpu_caps[leaf] = (mask);					\
+	feature_initializers						\
+									\
+	kvm_cpu_caps[leaf] = kvm_cpu_cap_features;			\
 									\
 	if (leaf < NCAPINTS)						\
 		kvm_cpu_caps[leaf] &= kernel_cpu_caps[leaf];		\
@@ -696,7 +699,7 @@ do {									\
 #define F(name)							\
 ({								\
 	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
-	feature_bit(name);					\
+	kvm_cpu_cap_features |= feature_bit(name);		\
 })
 
 /* Scattered Flag - For features that are scattered by cpufeatures.h. */
@@ -704,14 +707,16 @@ do {									\
 ({								\
 	BUILD_BUG_ON(X86_FEATURE_##name >= MAX_CPU_FEATURES);	\
 	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
-	(boot_cpu_has(X86_FEATURE_##name) ? F(name) : 0);	\
+	if (boot_cpu_has(X86_FEATURE_##name))			\
+		F(name);					\
 })
 
 /* Features that KVM supports only on 64-bit kernels. */
 #define X86_64_F(name)						\
 ({								\
 	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
-	(IS_ENABLED(CONFIG_X86_64) ? F(name) : 0);		\
+	if (IS_ENABLED(CONFIG_X86_64))				\
+		F(name);					\
 })
 
 /*
@@ -720,7 +725,7 @@ do {									\
  */
 #define EMULATED_F(name)					\
 ({								\
-	kvm_cpu_cap_emulated |= F(name);			\
+	kvm_cpu_cap_emulated |= feature_bit(name);		\
 	F(name);						\
 })
 
@@ -731,7 +736,7 @@ do {									\
  */
 #define SYNTHESIZED_F(name)					\
 ({								\
-	kvm_cpu_cap_synthesized |= F(name);			\
+	kvm_cpu_cap_synthesized |= feature_bit(name);		\
 	F(name);						\
 })
 
@@ -743,7 +748,7 @@ do {									\
  */
 #define PASSTHROUGH_F(name)					\
 ({								\
-	kvm_cpu_cap_passthrough |= F(name);			\
+	kvm_cpu_cap_passthrough |= feature_bit(name);		\
 	F(name);						\
 })
 
@@ -755,7 +760,7 @@ do {									\
 ({										\
 	BUILD_BUG_ON(__feature_leaf(X86_FEATURE_##name) != CPUID_1_EDX);	\
 	BUILD_BUG_ON(kvm_cpu_cap_init_in_progress != CPUID_8000_0001_EDX);	\
-	feature_bit(name);							\
+	kvm_cpu_cap_features |= feature_bit(name);				\
 })
 
 /*
@@ -765,7 +770,6 @@ do {									\
 #define VENDOR_F(name)						\
 ({								\
 	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
-	0;							\
 })
 
 /*
@@ -775,7 +779,6 @@ do {									\
 #define RUNTIME_F(name)						\
 ({								\
 	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
-	0;							\
 })
 
 /*
@@ -795,126 +798,128 @@ void kvm_set_cpu_caps(void)
 		     sizeof(boot_cpu_data.x86_capability));
 
 	kvm_cpu_cap_init(CPUID_1_ECX,
-		F(XMM3) |
-		F(PCLMULQDQ) |
-		VENDOR_F(DTES64) |
+		F(XMM3),
+		F(PCLMULQDQ),
+		VENDOR_F(DTES64),
 		/*
 		 * NOTE: MONITOR (and MWAIT) are emulated as NOP, but *not*
 		 * advertised to guests via CPUID!  MWAIT is also technically a
 		 * runtime flag thanks to IA32_MISC_ENABLES; mark it as such so
 		 * that KVM is aware that it's a known, unadvertised flag.
 		 */
-		RUNTIME_F(MWAIT) |
-		VENDOR_F(VMX) |
-		0 /* DS-CPL, SMX, EST */ |
-		0 /* TM2 */ |
-		F(SSSE3) |
-		0 /* CNXT-ID */ |
-		0 /* Reserved */ |
-		F(FMA) |
-		F(CX16) |
-		0 /* xTPR Update */ |
-		F(PDCM) |
-		F(PCID) |
-		0 /* Reserved, DCA */ |
-		F(XMM4_1) |
-		F(XMM4_2) |
-		EMULATED_F(X2APIC) |
-		F(MOVBE) |
-		F(POPCNT) |
-		EMULATED_F(TSC_DEADLINE_TIMER) |
-		F(AES) |
-		F(XSAVE) |
-		RUNTIME_F(OSXSAVE) |
-		F(AVX) |
-		F(F16C) |
-		F(RDRAND) |
-		EMULATED_F(HYPERVISOR)
+		RUNTIME_F(MWAIT),
+		/* DS-CPL */
+		VENDOR_F(VMX),
+		/* SMX, EST */
+		/* TM2 */
+		F(SSSE3),
+		/* CNXT-ID */
+		/* Reserved */
+		F(FMA),
+		F(CX16),
+		/* xTPR Update */
+		F(PDCM),
+		F(PCID),
+		/* Reserved, DCA */
+		F(XMM4_1),
+		F(XMM4_2),
+		EMULATED_F(X2APIC),
+		F(MOVBE),
+		F(POPCNT),
+		EMULATED_F(TSC_DEADLINE_TIMER),
+		F(AES),
+		F(XSAVE),
+		RUNTIME_F(OSXSAVE),
+		F(AVX),
+		F(F16C),
+		F(RDRAND),
+		EMULATED_F(HYPERVISOR),
 	);
 
 	kvm_cpu_cap_init(CPUID_1_EDX,
-		F(FPU) |
-		F(VME) |
-		F(DE) |
-		F(PSE) |
-		F(TSC) |
-		F(MSR) |
-		F(PAE) |
-		F(MCE) |
-		F(CX8) |
-		F(APIC) |
-		0 /* Reserved */ |
-		F(SEP) |
-		F(MTRR) |
-		F(PGE) |
-		F(MCA) |
-		F(CMOV) |
-		F(PAT) |
-		F(PSE36) |
-		0 /* PSN */ |
-		F(CLFLUSH) |
-		0 /* Reserved */ |
-		VENDOR_F(DS) |
-		0 /* ACPI */ |
-		F(MMX) |
-		F(FXSR) |
-		F(XMM) |
-		F(XMM2) |
-		F(SELFSNOOP) |
-		0 /* HTT, TM, Reserved, PBE */
+		F(FPU),
+		F(VME),
+		F(DE),
+		F(PSE),
+		F(TSC),
+		F(MSR),
+		F(PAE),
+		F(MCE),
+		F(CX8),
+		F(APIC),
+		/* Reserved */
+		F(SEP),
+		F(MTRR),
+		F(PGE),
+		F(MCA),
+		F(CMOV),
+		F(PAT),
+		F(PSE36),
+		/* PSN */
+		F(CLFLUSH),
+		/* Reserved */
+		VENDOR_F(DS),
+		/* ACPI */
+		F(MMX),
+		F(FXSR),
+		F(XMM),
+		F(XMM2),
+		F(SELFSNOOP),
+		/* HTT, TM, Reserved, PBE */
 	);
 
 	kvm_cpu_cap_init(CPUID_7_0_EBX,
-		F(FSGSBASE) |
-		EMULATED_F(TSC_ADJUST) |
-		F(SGX) |
-		F(BMI1) |
-		F(HLE) |
-		F(AVX2) |
-		F(FDP_EXCPTN_ONLY) |
-		F(SMEP) |
-		F(BMI2) |
-		F(ERMS) |
-		F(INVPCID) |
-		F(RTM) |
-		F(ZERO_FCS_FDS) |
-		VENDOR_F(MPX) |
-		F(AVX512F) |
-		F(AVX512DQ) |
-		F(RDSEED) |
-		F(ADX) |
-		F(SMAP) |
-		F(AVX512IFMA) |
-		F(CLFLUSHOPT) |
-		F(CLWB) |
-		VENDOR_F(INTEL_PT) |
-		F(AVX512PF) |
-		F(AVX512ER) |
-		F(AVX512CD) |
-		F(SHA_NI) |
-		F(AVX512BW) |
-		F(AVX512VL));
+		F(FSGSBASE),
+		EMULATED_F(TSC_ADJUST),
+		F(SGX),
+		F(BMI1),
+		F(HLE),
+		F(AVX2),
+		F(FDP_EXCPTN_ONLY),
+		F(SMEP),
+		F(BMI2),
+		F(ERMS),
+		F(INVPCID),
+		F(RTM),
+		F(ZERO_FCS_FDS),
+		VENDOR_F(MPX),
+		F(AVX512F),
+		F(AVX512DQ),
+		F(RDSEED),
+		F(ADX),
+		F(SMAP),
+		F(AVX512IFMA),
+		F(CLFLUSHOPT),
+		F(CLWB),
+		VENDOR_F(INTEL_PT),
+		F(AVX512PF),
+		F(AVX512ER),
+		F(AVX512CD),
+		F(SHA_NI),
+		F(AVX512BW),
+		F(AVX512VL),
+	);
 
 	kvm_cpu_cap_init(CPUID_7_ECX,
-		F(AVX512VBMI) |
-		PASSTHROUGH_F(LA57) |
-		F(PKU) |
-		RUNTIME_F(OSPKE) |
-		F(RDPID) |
-		F(AVX512_VPOPCNTDQ) |
-		F(UMIP) |
-		F(AVX512_VBMI2) |
-		F(GFNI) |
-		F(VAES) |
-		F(VPCLMULQDQ) |
-		F(AVX512_VNNI) |
-		F(AVX512_BITALG) |
-		F(CLDEMOTE) |
-		F(MOVDIRI) |
-		F(MOVDIR64B) |
-		VENDOR_F(WAITPKG) |
-		F(SGX_LC) |
-		F(BUS_LOCK_DETECT)
+		F(AVX512VBMI),
+		PASSTHROUGH_F(LA57),
+		F(PKU),
+		RUNTIME_F(OSPKE),
+		F(RDPID),
+		F(AVX512_VPOPCNTDQ),
+		F(UMIP),
+		F(AVX512_VBMI2),
+		F(GFNI),
+		F(VAES),
+		F(VPCLMULQDQ),
+		F(AVX512_VNNI),
+		F(AVX512_BITALG),
+		F(CLDEMOTE),
+		F(MOVDIRI),
+		F(MOVDIR64B),
+		VENDOR_F(WAITPKG),
+		F(SGX_LC),
+		F(BUS_LOCK_DETECT),
 	);
 
 	/*
@@ -925,22 +930,22 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_PKU);
 
 	kvm_cpu_cap_init(CPUID_7_EDX,
-		F(AVX512_4VNNIW) |
-		F(AVX512_4FMAPS) |
-		F(SPEC_CTRL) |
-		F(SPEC_CTRL_SSBD) |
-		EMULATED_F(ARCH_CAPABILITIES) |
-		F(INTEL_STIBP) |
-		F(MD_CLEAR) |
-		F(AVX512_VP2INTERSECT) |
-		F(FSRM) |
-		F(SERIALIZE) |
-		F(TSXLDTRK) |
-		F(AVX512_FP16) |
-		F(AMX_TILE) |
-		F(AMX_INT8) |
-		F(AMX_BF16) |
-		F(FLUSH_L1D)
+		F(AVX512_4VNNIW),
+		F(AVX512_4FMAPS),
+		F(SPEC_CTRL),
+		F(SPEC_CTRL_SSBD),
+		EMULATED_F(ARCH_CAPABILITIES),
+		F(INTEL_STIBP),
+		F(MD_CLEAR),
+		F(AVX512_VP2INTERSECT),
+		F(FSRM),
+		F(SERIALIZE),
+		F(TSXLDTRK),
+		F(AVX512_FP16),
+		F(AMX_TILE),
+		F(AMX_INT8),
+		F(AMX_BF16),
+		F(FLUSH_L1D),
 	);
 
 	if (boot_cpu_has(X86_FEATURE_AMD_IBPB_RET) &&
@@ -953,132 +958,132 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
 
 	kvm_cpu_cap_init(CPUID_7_1_EAX,
-		F(SHA512) |
-		F(SM3) |
-		F(SM4) |
-		F(AVX_VNNI) |
-		F(AVX512_BF16) |
-		F(CMPCCXADD) |
-		F(FZRM) |
-		F(FSRS) |
-		F(FSRC) |
-		F(AMX_FP16) |
-		F(AVX_IFMA) |
-		F(LAM)
+		F(SHA512),
+		F(SM3),
+		F(SM4),
+		F(AVX_VNNI),
+		F(AVX512_BF16),
+		F(CMPCCXADD),
+		F(FZRM),
+		F(FSRS),
+		F(FSRC),
+		F(AMX_FP16),
+		F(AVX_IFMA),
+		F(LAM),
 	);
 
 	kvm_cpu_cap_init(CPUID_7_1_EDX,
-		F(AVX_VNNI_INT8) |
-		F(AVX_NE_CONVERT) |
-		F(AMX_COMPLEX) |
-		F(AVX_VNNI_INT16) |
-		F(PREFETCHITI) |
-		F(AVX10)
+		F(AVX_VNNI_INT8),
+		F(AVX_NE_CONVERT),
+		F(AMX_COMPLEX),
+		F(AVX_VNNI_INT16),
+		F(PREFETCHITI),
+		F(AVX10),
 	);
 
 	kvm_cpu_cap_init(CPUID_7_2_EDX,
-		F(INTEL_PSFD) |
-		F(IPRED_CTRL) |
-		F(RRSBA_CTRL) |
-		F(DDPD_U) |
-		F(BHI_CTRL) |
-		F(MCDT_NO)
+		F(INTEL_PSFD),
+		F(IPRED_CTRL),
+		F(RRSBA_CTRL),
+		F(DDPD_U),
+		F(BHI_CTRL),
+		F(MCDT_NO),
 	);
 
 	kvm_cpu_cap_init(CPUID_D_1_EAX,
-		F(XSAVEOPT) |
-		F(XSAVEC) |
-		F(XGETBV1) |
-		F(XSAVES) |
-		X86_64_F(XFD)
+		F(XSAVEOPT),
+		F(XSAVEC),
+		F(XGETBV1),
+		F(XSAVES),
+		X86_64_F(XFD),
 	);
 
 	kvm_cpu_cap_init(CPUID_12_EAX,
-		SCATTERED_F(SGX1) |
-		SCATTERED_F(SGX2) |
-		SCATTERED_F(SGX_EDECCSSA)
+		SCATTERED_F(SGX1),
+		SCATTERED_F(SGX2),
+		SCATTERED_F(SGX_EDECCSSA),
 	);
 
 	kvm_cpu_cap_init(CPUID_24_0_EBX,
-		F(AVX10_128) |
-		F(AVX10_256) |
-		F(AVX10_512)
+		F(AVX10_128),
+		F(AVX10_256),
+		F(AVX10_512),
 	);
 
 	kvm_cpu_cap_init(CPUID_8000_0001_ECX,
-		F(LAHF_LM) |
-		F(CMP_LEGACY) |
-		VENDOR_F(SVM) |
-		0 /* ExtApicSpace */ |
-		F(CR8_LEGACY) |
-		F(ABM) |
-		F(SSE4A) |
-		F(MISALIGNSSE) |
-		F(3DNOWPREFETCH) |
-		F(OSVW) |
-		0 /* IBS */ |
-		F(XOP) |
-		0 /* SKINIT, WDT, LWP */ |
-		F(FMA4) |
-		F(TBM) |
-		F(TOPOEXT) |
-		VENDOR_F(PERFCTR_CORE)
+		F(LAHF_LM),
+		F(CMP_LEGACY),
+		VENDOR_F(SVM),
+		/* ExtApicSpace */
+		F(CR8_LEGACY),
+		F(ABM),
+		F(SSE4A),
+		F(MISALIGNSSE),
+		F(3DNOWPREFETCH),
+		F(OSVW),
+		/* IBS */
+		F(XOP),
+		/* SKINIT, WDT, LWP */
+		F(FMA4),
+		F(TBM),
+		F(TOPOEXT),
+		VENDOR_F(PERFCTR_CORE),
 	);
 
 	kvm_cpu_cap_init(CPUID_8000_0001_EDX,
-		ALIASED_1_EDX_F(FPU) |
-		ALIASED_1_EDX_F(VME) |
-		ALIASED_1_EDX_F(DE) |
-		ALIASED_1_EDX_F(PSE) |
-		ALIASED_1_EDX_F(TSC) |
-		ALIASED_1_EDX_F(MSR) |
-		ALIASED_1_EDX_F(PAE) |
-		ALIASED_1_EDX_F(MCE) |
-		ALIASED_1_EDX_F(CX8) |
-		ALIASED_1_EDX_F(APIC) |
-		0 /* Reserved */ |
-		F(SYSCALL) |
-		ALIASED_1_EDX_F(MTRR) |
-		ALIASED_1_EDX_F(PGE) |
-		ALIASED_1_EDX_F(MCA) |
-		ALIASED_1_EDX_F(CMOV) |
-		ALIASED_1_EDX_F(PAT) |
-		ALIASED_1_EDX_F(PSE36) |
-		0 /* Reserved */ |
-		F(NX) |
-		0 /* Reserved */ |
-		F(MMXEXT) |
-		ALIASED_1_EDX_F(MMX) |
-		ALIASED_1_EDX_F(FXSR) |
-		F(FXSR_OPT) |
-		X86_64_F(GBPAGES) |
-		F(RDTSCP) |
-		0 /* Reserved */ |
-		X86_64_F(LM) |
-		F(3DNOWEXT) |
-		F(3DNOW)
+		ALIASED_1_EDX_F(FPU),
+		ALIASED_1_EDX_F(VME),
+		ALIASED_1_EDX_F(DE),
+		ALIASED_1_EDX_F(PSE),
+		ALIASED_1_EDX_F(TSC),
+		ALIASED_1_EDX_F(MSR),
+		ALIASED_1_EDX_F(PAE),
+		ALIASED_1_EDX_F(MCE),
+		ALIASED_1_EDX_F(CX8),
+		ALIASED_1_EDX_F(APIC),
+		/* Reserved */
+		F(SYSCALL),
+		ALIASED_1_EDX_F(MTRR),
+		ALIASED_1_EDX_F(PGE),
+		ALIASED_1_EDX_F(MCA),
+		ALIASED_1_EDX_F(CMOV),
+		ALIASED_1_EDX_F(PAT),
+		ALIASED_1_EDX_F(PSE36),
+		/* Reserved */
+		F(NX),
+		/* Reserved */
+		F(MMXEXT),
+		ALIASED_1_EDX_F(MMX),
+		ALIASED_1_EDX_F(FXSR),
+		F(FXSR_OPT),
+		X86_64_F(GBPAGES),
+		F(RDTSCP),
+		/* Reserved */
+		X86_64_F(LM),
+		F(3DNOWEXT),
+		F(3DNOW),
 	);
 
 	if (!tdp_enabled && IS_ENABLED(CONFIG_X86_64))
 		kvm_cpu_cap_set(X86_FEATURE_GBPAGES);
 
 	kvm_cpu_cap_init(CPUID_8000_0007_EDX,
-		SCATTERED_F(CONSTANT_TSC)
+		SCATTERED_F(CONSTANT_TSC),
 	);
 
 	kvm_cpu_cap_init(CPUID_8000_0008_EBX,
-		F(CLZERO) |
-		F(XSAVEERPTR) |
-		F(WBNOINVD) |
-		F(AMD_IBPB) |
-		F(AMD_IBRS) |
-		F(AMD_SSBD) |
-		F(VIRT_SSBD) |
-		F(AMD_SSB_NO) |
-		F(AMD_STIBP) |
-		F(AMD_STIBP_ALWAYS_ON) |
-		F(AMD_PSFD) |
-		F(AMD_IBPB_RET)
+		F(CLZERO),
+		F(XSAVEERPTR),
+		F(WBNOINVD),
+		F(AMD_IBPB),
+		F(AMD_IBRS),
+		F(AMD_SSBD),
+		F(VIRT_SSBD),
+		F(AMD_SSB_NO),
+		F(AMD_STIBP),
+		F(AMD_STIBP_ALWAYS_ON),
+		F(AMD_PSFD),
+		F(AMD_IBPB_RET),
 	);
 
 	/*
@@ -1110,30 +1115,30 @@ void kvm_set_cpu_caps(void)
 
 	/* All SVM features required additional vendor module enabling. */
 	kvm_cpu_cap_init(CPUID_8000_000A_EDX,
-		VENDOR_F(NPT) |
-		VENDOR_F(VMCBCLEAN) |
-		VENDOR_F(FLUSHBYASID) |
-		VENDOR_F(NRIPS) |
-		VENDOR_F(TSCRATEMSR) |
-		VENDOR_F(V_VMSAVE_VMLOAD) |
-		VENDOR_F(LBRV) |
-		VENDOR_F(PAUSEFILTER) |
-		VENDOR_F(PFTHRESHOLD) |
-		VENDOR_F(VGIF) |
-		VENDOR_F(VNMI) |
-		VENDOR_F(SVME_ADDR_CHK)
+		VENDOR_F(NPT),
+		VENDOR_F(VMCBCLEAN),
+		VENDOR_F(FLUSHBYASID),
+		VENDOR_F(NRIPS),
+		VENDOR_F(TSCRATEMSR),
+		VENDOR_F(V_VMSAVE_VMLOAD),
+		VENDOR_F(LBRV),
+		VENDOR_F(PAUSEFILTER),
+		VENDOR_F(PFTHRESHOLD),
+		VENDOR_F(VGIF),
+		VENDOR_F(VNMI),
+		VENDOR_F(SVME_ADDR_CHK),
 	);
 
 	kvm_cpu_cap_init(CPUID_8000_001F_EAX,
-		VENDOR_F(SME) |
-		VENDOR_F(SEV) |
-		0 /* VM_PAGE_FLUSH */ |
-		VENDOR_F(SEV_ES) |
-		F(SME_COHERENT)
+		VENDOR_F(SME),
+		VENDOR_F(SEV),
+		/* VM_PAGE_FLUSH */
+		VENDOR_F(SEV_ES),
+		F(SME_COHERENT),
 	);
 
 	kvm_cpu_cap_init(CPUID_8000_0021_EAX,
-		F(NO_NESTED_DATA_BP) |
+		F(NO_NESTED_DATA_BP),
 		/*
 		 * Synthesize "LFENCE is serializing" into the AMD-defined entry
 		 * in KVM's supported CPUID, i.e. if the feature is reported as
@@ -1144,36 +1149,36 @@ void kvm_set_cpu_caps(void)
 		 * CPUID will drop the flags, and reporting support in AMD's
 		 * leaf can make it easier for userspace to detect the feature.
 		 */
-		SYNTHESIZED_F(LFENCE_RDTSC) |
-		0 /* SmmPgCfgLock */ |
-		F(NULL_SEL_CLR_BASE) |
-		F(AUTOIBRS) |
-		EMULATED_F(NO_SMM_CTL_MSR) |
-		0 /* PrefetchCtlMsr */ |
-		F(WRMSR_XX_BASE_NS) |
-		SYNTHESIZED_F(SBPB) |
-		SYNTHESIZED_F(IBPB_BRTYPE) |
-		SYNTHESIZED_F(SRSO_NO)
+		SYNTHESIZED_F(LFENCE_RDTSC),
+		/* SmmPgCfgLock */
+		F(NULL_SEL_CLR_BASE),
+		F(AUTOIBRS),
+		EMULATED_F(NO_SMM_CTL_MSR),
+		/* PrefetchCtlMsr */
+		F(WRMSR_XX_BASE_NS),
+		SYNTHESIZED_F(SBPB),
+		SYNTHESIZED_F(IBPB_BRTYPE),
+		SYNTHESIZED_F(SRSO_NO),
 	);
 
 	kvm_cpu_cap_init(CPUID_8000_0022_EAX,
-		F(PERFMON_V2)
+		F(PERFMON_V2),
 	);
 
 	if (!static_cpu_has_bug(X86_BUG_NULL_SEG))
 		kvm_cpu_cap_set(X86_FEATURE_NULL_SEL_CLR_BASE);
 
 	kvm_cpu_cap_init(CPUID_C000_0001_EDX,
-		F(XSTORE) |
-		F(XSTORE_EN) |
-		F(XCRYPT) |
-		F(XCRYPT_EN) |
-		F(ACE2) |
-		F(ACE2_EN) |
-		F(PHE) |
-		F(PHE_EN) |
-		F(PMM) |
-		F(PMM_EN)
+		F(XSTORE),
+		F(XSTORE_EN),
+		F(XCRYPT),
+		F(XCRYPT_EN),
+		F(ACE2),
+		F(ACE2_EN),
+		F(PHE),
+		F(PHE_EN),
+		F(PMM),
+		F(PMM_EN),
 	);
 
 	/*
-- 
2.47.0.338.g60cca15819-goog



* Re: [PATCH v3 05/57] KVM: x86: Account for KVM-reserved CR4 bits when passing through CR4 on VMX
  2024-11-28  1:33 ` [PATCH v3 05/57] KVM: x86: Account for KVM-reserved CR4 bits when passing through CR4 on VMX Sean Christopherson
@ 2024-12-13  1:30   ` Chao Gao
  0 siblings, 0 replies; 65+ messages in thread
From: Chao Gao @ 2024-12-13  1:30 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Jarkko Sakkinen, kvm, linux-sgx,
	linux-kernel, Maxim Levitsky, Hou Wenlong, Xiaoyao Li, Kechen Lu,
	Oliver Upton, Binbin Wu, Yang Weijiang, Robert Hoo

On Wed, Nov 27, 2024 at 05:33:32PM -0800, Sean Christopherson wrote:
>Drop x86.c's local pre-computed cr4_reserved bits and instead fold KVM's
>reserved bits into the guest's reserved bits.  This fixes a bug where VMX's
>set_cr4_guest_host_mask() fails to account for KVM-reserved bits when
>deciding which bits can be passed through to the guest.  In most cases,
>letting the guest directly write reserved CR4 bits is ok, i.e. attempting
>to set the bit(s) will still #GP, but not if a feature is available in
>hardware but explicitly disabled by the host, e.g. if FSGSBASE support is
>disabled via "nofsgsbase".
>
>Note, the extra overhead of computing host reserved bits every time
>userspace sets guest CPUID is negligible.  The feature bits that are
>queried are packed nicely into a handful of words, and so checking and
>setting each reserved bit costs in the neighborhood of ~5 cycles, i.e. the
>total cost will be in the noise even if the number of checked CR4 bits
>doubles over the next few years.  In other words, x86 will run out of CR4
>bits long before the overhead becomes problematic.
>
>Note #2, __cr4_reserved_bits() starts from CR4_RESERVED_BITS, which is
>why the existing __kvm_cpu_cap_has() processing doesn't explicitly OR in
>CR4_RESERVED_BITS (and why the new code doesn't do so either).
>
>Fixes: 2ed41aa631fc ("KVM: VMX: Intercept guest reserved CR4 bits to inject #GP fault")
>Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
>Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Chao Gao <chao.gao@intel.com>


* Re: [PATCH v3 50/57] KVM: x86: Replace (almost) all guest CPUID feature queries with cpu_caps
  2024-11-28  1:34 ` [PATCH v3 50/57] KVM: x86: Replace (almost) all guest CPUID feature queries with cpu_caps Sean Christopherson
@ 2024-12-13  2:14   ` Chao Gao
  2024-12-17  0:05     ` Sean Christopherson
  0 siblings, 1 reply; 65+ messages in thread
From: Chao Gao @ 2024-12-13  2:14 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Jarkko Sakkinen, kvm, linux-sgx,
	linux-kernel, Maxim Levitsky, Hou Wenlong, Xiaoyao Li, Kechen Lu,
	Oliver Upton, Binbin Wu, Yang Weijiang, Robert Hoo

On Wed, Nov 27, 2024 at 05:34:17PM -0800, Sean Christopherson wrote:
>Switch all queries (except XSAVES) of guest features from guest CPUID to
>guest capabilities, i.e. replace all calls to guest_cpuid_has() with calls
>to guest_cpu_cap_has().
>
>Keep guest_cpuid_has() around for XSAVES, but subsume its helper
>guest_cpuid_get_register() and add a compile-time assertion to prevent
>using guest_cpuid_has() for any other feature.  Add yet another comment
>for XSAVE to explain why KVM is allowed to query its raw guest CPUID.
>
>Opportunistically drop the unused guest_cpuid_clear(), as there should be
>no circumstance in which KVM needs to _clear_ a guest CPUID feature now
>that everything is tracked via cpu_caps.  E.g. KVM may need to _change_
>a feature to emulate dynamic CPUID flags, but KVM should never need to
>clear a feature in guest CPUID to prevent it from being used by the guest.
>
>Delete the last remnants of the governed features framework, as the lone
>holdout was vmx_adjust_secondary_exec_control()'s divergent behavior for
>governed vs. ungoverned features.
>
>Note, replacing guest_cpuid_has() checks with guest_cpu_cap_has() when
>computing reserved CR4 bits is a nop when viewed as a whole, as KVM's
>capabilities are already incorporated into the calculation, i.e. if a
>feature is present in guest CPUID but unsupported by KVM, its CR4 bit
>was already being marked as reserved, checking guest_cpu_cap_has() simply
>double-stamps that it's a reserved bit.

...

>
>Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
>Signed-off-by: Sean Christopherson <seanjc@google.com>
>---
> arch/x86/kvm/cpuid.c             |  4 +-
> arch/x86/kvm/cpuid.h             | 76 ++++++++++++--------------------
> arch/x86/kvm/governed_features.h | 22 ---------
> arch/x86/kvm/hyperv.c            |  2 +-
> arch/x86/kvm/lapic.c             |  4 +-
> arch/x86/kvm/smm.c               | 10 ++---
> arch/x86/kvm/svm/pmu.c           |  8 ++--
> arch/x86/kvm/svm/sev.c           |  4 +-
> arch/x86/kvm/svm/svm.c           | 20 ++++-----
> arch/x86/kvm/vmx/hyperv.h        |  2 +-
> arch/x86/kvm/vmx/nested.c        | 12 ++---
> arch/x86/kvm/vmx/pmu_intel.c     |  4 +-
> arch/x86/kvm/vmx/sgx.c           | 14 +++---
> arch/x86/kvm/vmx/vmx.c           | 47 +++++++++-----------
> arch/x86/kvm/x86.c               | 66 +++++++++++++--------------
> 15 files changed, 124 insertions(+), 171 deletions(-)
> delete mode 100644 arch/x86/kvm/governed_features.h
>
>diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>index d3c3e1327ca1..8d088a888a0d 100644
>--- a/arch/x86/kvm/cpuid.c
>+++ b/arch/x86/kvm/cpuid.c
>@@ -416,7 +416,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> 	 * and can install smaller shadow pages if the host lacks 1GiB support.
> 	 */
> 	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
>-				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
>+				      guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES);
> 	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
> 
> 	best = kvm_find_cpuid_entry(vcpu, 1);
>@@ -441,7 +441,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> 
> #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
> 	vcpu->arch.cr4_guest_rsvd_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_) |
>-					 __cr4_reserved_bits(guest_cpuid_has, vcpu);
>+					 __cr4_reserved_bits(guest_cpu_cap_has, vcpu);

So, actually, __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_) can be dropped.
Is there any reason to keep it? It makes perfect sense to just look up the
guest cpu_caps given it already takes KVM caps into consideration.

> #undef __kvm_cpu_cap_has
> 
> 	kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu));


* Re: [PATCH v3 01/57] KVM: x86: Use feature_bit() to clear CONSTANT_TSC when emulating CPUID
  2024-11-28  1:33 ` [PATCH v3 01/57] KVM: x86: Use feature_bit() to clear CONSTANT_TSC when emulating CPUID Sean Christopherson
@ 2024-12-13 10:53   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 65+ messages in thread
From: Vitaly Kuznetsov @ 2024-12-13 10:53 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Sean Christopherson,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

Sean Christopherson <seanjc@google.com> writes:

> When clearing CONSTANT_TSC during CPUID emulation due to a Hyper-V quirk,
> use feature_bit() instead of SF() to ensure the bit is actually cleared.
> SF() evaluates to zero if the _host_ doesn't support the feature.  I.e.
> KVM could keep the bit set if userspace advertised CONSTANT_TSC despite
> it not being supported in hardware.

FWIW, I would strongly discourage such setups, all sorts of weird hangs
will likely be observed with Windows guests if TSC rate actually
changes.

>
> Note, translating from a scattered feature to the hardware version is
> done by __feature_translate(), not SF().  The sole purpose of SF() is to
> check kernel support for the scattered feature, *before* translation.
>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 097bdc022d0f..776f24408fa3 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1630,7 +1630,7 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>  				*ebx &= ~(F(RTM) | F(HLE));
>  		} else if (function == 0x80000007) {
>  			if (kvm_hv_invtsc_suppressed(vcpu))
> -				*edx &= ~SF(CONSTANT_TSC);
> +				*edx &= ~feature_bit(CONSTANT_TSC);
>  		}
>  	} else {
>  		*eax = *ebx = *ecx = *edx = 0;
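
As a rough illustration of why the SF()-based clear can be a no-op, here is a
minimal standalone sketch.  The helpers are hypothetical stand-ins, not KVM's
macros: demo_sf() only mimics the "evaluates to zero if the host lacks the
feature" behavior described above, and the bit position is used purely for
demonstration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit position for the flag being cleared. */
#define DEMO_CONSTANT_TSC	(UINT32_C(1) << 8)

/* Stand-in for SF(): yields the bit only if the host has the feature. */
static uint32_t demo_sf(bool host_has_feature)
{
	return host_has_feature ? DEMO_CONSTANT_TSC : 0;
}

/* Old behavior: ~0 is all ones, so nothing is cleared on a host without it. */
uint32_t demo_clear_with_sf(uint32_t edx, bool host_has_feature)
{
	return edx & ~demo_sf(host_has_feature);
}

/* New behavior: the bit is cleared regardless of host support. */
uint32_t demo_clear_with_feature_bit(uint32_t edx)
{
	return edx & ~DEMO_CONSTANT_TSC;
}
```

With the bit set by userspace and a host that lacks the feature, the first
helper returns the input unchanged, whereas the second always drops the bit.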

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly



* Re: [PATCH v3 50/57] KVM: x86: Replace (almost) all guest CPUID feature queries with cpu_caps
  2024-12-13  2:14   ` Chao Gao
@ 2024-12-17  0:05     ` Sean Christopherson
  0 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-12-17  0:05 UTC (permalink / raw)
  To: Chao Gao
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Jarkko Sakkinen, kvm, linux-sgx,
	linux-kernel, Maxim Levitsky, Hou Wenlong, Xiaoyao Li, Kechen Lu,
	Oliver Upton, Binbin Wu, Yang Weijiang, Robert Hoo

On Fri, Dec 13, 2024, Chao Gao wrote:
> On Wed, Nov 27, 2024 at 05:34:17PM -0800, Sean Christopherson wrote:
> >diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> >index d3c3e1327ca1..8d088a888a0d 100644
> >--- a/arch/x86/kvm/cpuid.c
> >+++ b/arch/x86/kvm/cpuid.c
> >@@ -416,7 +416,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > 	 * and can install smaller shadow pages if the host lacks 1GiB support.
> > 	 */
> > 	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
> >-				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
> >+				      guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES);
> > 	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
> > 
> > 	best = kvm_find_cpuid_entry(vcpu, 1);
> >@@ -441,7 +441,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > 
> > #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
> > 	vcpu->arch.cr4_guest_rsvd_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_) |
> >-					 __cr4_reserved_bits(guest_cpuid_has, vcpu);
> >+					 __cr4_reserved_bits(guest_cpu_cap_has, vcpu);
> 
> So, actually, __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_) can be dropped.
> Is there any reason to keep it? It makes perfect sense to just look up the
> guest cpu_caps given it already takes KVM caps into consideration.

Hmm, good point.  I agree that keeping the __kvm_cpu_cap_has() checks is
unnecessary, though I'm tempted to turn it into a WARN.  E.g. to guard against
stuffing a feature into cpu_caps without thinking through the implications.

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index edef30359c19..3cbf384aeb7a 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -460,9 +460,16 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
        kvm_pmu_refresh(vcpu);
 
+       vcpu->arch.cr4_guest_rsvd_bits = __cr4_reserved_bits(guest_cpu_cap_has, vcpu);
+       /*
+        * KVM's capabilities are incorporated into the vCPU's capabilities,
+        * and letting the guest use a CR4-based feature that KVM doesn't
+        * support isn't allowed as KVM either needs to explicitly emulate the
+        * feature or set the CR4 bit in hardware.
+        */
 #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
-       vcpu->arch.cr4_guest_rsvd_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_) |
-                                        __cr4_reserved_bits(guest_cpu_cap_has, vcpu);
+       WARN_ON_ONCE(~vcpu->arch.cr4_guest_rsvd_bits &
+                    __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_));
 #undef __kvm_cpu_cap_has
 
        kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu));
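
To make the invariant behind the proposed WARN concrete, here is a toy model
in plain C.  It is a sketch under stated assumptions, not KVM code: the
capability and CR4 bit values are hypothetical, and the real
__cr4_reserved_bits() starts from CR4_RESERVED_BITS, which is ignored here.

```c
#include <stdint.h>

/* Hypothetical capability bits and the CR4 bits they gate. */
#define DEMO_CAP_X	(UINT32_C(1) << 0)
#define DEMO_CAP_Y	(UINT32_C(1) << 1)
#define DEMO_CR4_X	(UINT32_C(1) << 16)
#define DEMO_CR4_Y	(UINT32_C(1) << 20)

/* A CR4 bit is reserved when the capability that gates it is absent. */
static uint32_t demo_cr4_reserved_bits(uint32_t caps)
{
	uint32_t rsvd = 0;

	if (!(caps & DEMO_CAP_X))
		rsvd |= DEMO_CR4_X;
	if (!(caps & DEMO_CAP_Y))
		rsvd |= DEMO_CR4_Y;
	return rsvd;
}

/*
 * Returns the CR4 bits that KVM's capabilities reserve but the guest's do
 * not.  A non-zero value is the condition the WARN is meant to flag.
 */
uint32_t demo_unreserved_mismatch(uint32_t guest_caps, uint32_t kvm_caps)
{
	return ~demo_cr4_reserved_bits(guest_caps) &
	       demo_cr4_reserved_bits(kvm_caps);
}
```

As long as the guest's caps are a subset of KVM's, the mismatch is zero; it
only becomes non-zero if a feature is stuffed into the guest's caps without
KVM supporting it.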


* Re: [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (56 preceding siblings ...)
  2024-11-28  1:34 ` [PATCH v3 57/57] KVM: x86: Use only local variables (no bitmask) to init kvm_cpu_caps Sean Christopherson
@ 2024-12-18  1:13 ` Maxim Levitsky
  2024-12-19  2:40 ` Sean Christopherson
  58 siblings, 0 replies; 65+ messages in thread
From: Maxim Levitsky @ 2024-12-18  1:13 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Hou Wenlong, Xiaoyao Li, Kechen Lu,
	Oliver Upton, Binbin Wu, Yang Weijiang, Robert Hoo

On Wed, 2024-11-27 at 17:33 -0800, Sean Christopherson wrote:
> The super short TL;DR: snapshot all X86_FEATURE_* flags that KVM cares
> about so that all queries against guest capabilities are "fast", e.g. don't
> require manual enabling or judgment calls as to where a feature needs to be
> fast.
> 
> The guest_cpu_cap_* nomenclature follows the existing kvm_cpu_cap_*
> except for a few (maybe just one?) cases where guest cpu_caps need APIs
> that kvm_cpu_caps don't.  In theory, the similar names will make this
> approach more intuitive.
> 
> This series also adds more hardening, e.g. to assert at compile-time if a
> feature flag is passed to the wrong word.  It also sets the stage for even
> more hardening in the future, as tracking all KVM-supported features allows
> shoving known vs. used features into arrays at compile time, which can then
> be checked for consistency irrespective of hardware support.  E.g. allows
> detecting if KVM is checking a feature without advertising it to userspace.
> This extra hardening is future work; I have it mostly working, but it's ugly
> and requires a runtime check to process the generated arrays.
> 
> There are *multiple* potentially breaking changes in this series (in for a
> penny, in for a pound).  However, I don't expect any fallout for real world
> VMMs because the ABI changes either disallow things that couldn't possibly
> have worked in the first place, or are following in the footsteps of other
> behaviors, e.g. KVM advertises x2APIC, which is 100% dependent on an in-kernel
> local APIC.
> 
>  * Disallow stuffing CPUID-dependent guest CR4 features before setting guest
>    CPUID.
>  * Disallow KVM_CAP_X86_DISABLE_EXITS after vCPU creation
>  * Reject disabling of MWAIT/HLT interception when not allowed
>  * Advertise TSC_DEADLINE_TIMER in KVM_GET_SUPPORTED_CPUID.
>  * Advertise HYPERVISOR in KVM_GET_SUPPORTED_CPUID
> 
> Validated the flag rework by comparing the output of KVM_GET_SUPPORTED_CPUID
> (and the emulated version) at the beginning and end of the series, on AMD
> and Intel hosts that should support almost every feature known to KVM.
> 
> Maxim, I did my best to incorporate all of your feedback, and when we
> disagreed, I tried to find an approach that we can hopefully both live
> with, at least until someone comes up with a better idea.
> 
> I _think_ the only suggestion that I "rejected" entirely is the existence
> of ALIASED_1_EDX_F.  I responded to the previous thread, definitely feel
> free to continue the conversation there (or here).
> 
> If I missed something you care strongly about, please holler!

Hi,

I did go over this patch series, and I don't think I have anything to add.
There are still things I disagree with, especially the F* macros; IMHO they
make the code less readable.

So if you want to merge this, I won't object.

Thanks,
Best regards,
	Maxim Levitsky

> 
> v3:
>  - Collect more reviews.
>  - Too many to list.
>  
> v2:
>  - Collect a few reviews (though I dropped several due to the patches changing
>    significantly).
>  - Incorporate KVM's support into the vCPU's cpu_caps. [Maxim]
>  - A massive pile of new patches.
> 
> Sean Christopherson (57):
>   KVM: x86: Use feature_bit() to clear CONSTANT_TSC when emulating CPUID
>   KVM: x86: Limit use of F() and SF() to
>     kvm_cpu_cap_{mask,init_kvm_defined}()
>   KVM: x86: Do all post-set CPUID processing during vCPU creation
>   KVM: x86: Explicitly do runtime CPUID updates "after" initial setup
>   KVM: x86: Account for KVM-reserved CR4 bits when passing through CR4
>     on VMX
>   KVM: selftests: Update x86's set_sregs_test to match KVM's CPUID
>     enforcement
>   KVM: selftests: Assert that vcpu->cpuid is non-NULL when getting CPUID
>     entries
>   KVM: selftests: Refresh vCPU CPUID cache in __vcpu_get_cpuid_entry()
>   KVM: selftests: Verify KVM stuffs runtime CPUID OS bits on CR4 writes
>   KVM: x86: Move __kvm_is_valid_cr4() definition to x86.h
>   KVM: x86/pmu: Drop now-redundant refresh() during init()
>   KVM: x86: Drop now-redundant MAXPHYADDR and GPA rsvd bits from vCPU
>     creation
>   KVM: x86: Disallow KVM_CAP_X86_DISABLE_EXITS after vCPU creation
>   KVM: x86: Reject disabling of MWAIT/HLT interception when not allowed
>   KVM: x86: Drop the now unused KVM_X86_DISABLE_VALID_EXITS
>   KVM: selftests: Fix a bad TEST_REQUIRE() in x86's KVM PV test
>   KVM: selftests: Update x86's KVM PV test to match KVM's disabling
>     exits behavior
>   KVM: x86: Zero out PV features cache when the CPUID leaf is not
>     present
>   KVM: x86: Don't update PV features caches when enabling enforcement
>     capability
>   KVM: x86: Do reverse CPUID sanity checks in __feature_leaf()
>   KVM: x86: Account for max supported CPUID leaf when getting raw host
>     CPUID
>   KVM: x86: Unpack F() CPUID feature flag macros to one flag per line of
>     code
>   KVM: x86: Rename kvm_cpu_cap_mask() to kvm_cpu_cap_init()
>   KVM: x86: Add a macro to init CPUID features that are 64-bit only
>   KVM: x86: Add a macro to precisely handle aliased 0x1.EDX CPUID
>     features
>   KVM: x86: Handle kernel- and KVM-defined CPUID words in a single
>     helper
>   KVM: x86: #undef SPEC_CTRL_SSBD in cpuid.c to avoid macro collisions
>   KVM: x86: Harden CPU capabilities processing against out-of-scope
>     features
>   KVM: x86: Add a macro to init CPUID features that ignore host kernel
>     support
>   KVM: x86: Add a macro to init CPUID features that KVM emulates in
>     software
>   KVM: x86: Swap incoming guest CPUID into vCPU before massaging in
>     KVM_SET_CPUID2
>   KVM: x86: Clear PV_UNHALT for !HLT-exiting only when userspace sets
>     CPUID
>   KVM: x86: Remove unnecessary caching of KVM's PV CPUID base
>   KVM: x86: Always operate on kvm_vcpu data in cpuid_entry2_find()
>   KVM: x86: Move kvm_find_cpuid_entry{,_index}() up near
>     cpuid_entry2_find()
>   KVM: x86: Remove all direct usage of cpuid_entry2_find()
>   KVM: x86: Advertise TSC_DEADLINE_TIMER in KVM_GET_SUPPORTED_CPUID
>   KVM: x86: Advertise HYPERVISOR in KVM_GET_SUPPORTED_CPUID
>   KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap"
>   KVM: x86: Replace guts of "governed" features with comprehensive
>     cpu_caps
>   KVM: x86: Initialize guest cpu_caps based on guest CPUID
>   KVM: x86: Extract code for generating per-entry emulated CPUID
>     information
>   KVM: x86: Treat MONTIOR/MWAIT as a "partially emulated" feature
>   KVM: x86: Initialize guest cpu_caps based on KVM support
>   KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime
>   KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns
>     right leaf
>   KVM: x86: Update OS{XSAVE,PKE} bits in guest CPUID irrespective of
>     host support
>   KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based
>     features
>   KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has()
>   KVM: x86: Replace (almost) all guest CPUID feature queries with
>     cpu_caps
>   KVM: x86: Drop superfluous host XSAVE check when adjusting guest
>     XSAVES caps
>   KVM: x86: Add a macro for features that are synthesized into
>     boot_cpu_data
>   KVM: x86: Pull CPUID capabilities from boot_cpu_data only as needed
>   KVM: x86: Rename "SF" macro to "SCATTERED_F"
>   KVM: x86: Explicitly track feature flags that require vendor enabling
>   KVM: x86: Explicitly track feature flags that are enabled at runtime
>   KVM: x86: Use only local variables (no bitmask) to init kvm_cpu_caps
> 
>  Documentation/virt/kvm/api.rst                |  10 +-
>  arch/x86/include/asm/kvm_host.h               |  47 +-
>  arch/x86/kvm/cpuid.c                          | 967 ++++++++++++------
>  arch/x86/kvm/cpuid.h                          | 128 +--
>  arch/x86/kvm/governed_features.h              |  22 -
>  arch/x86/kvm/hyperv.c                         |   2 +-
>  arch/x86/kvm/lapic.c                          |   4 +-
>  arch/x86/kvm/mmu.h                            |   2 +-
>  arch/x86/kvm/mmu/mmu.c                        |   4 +-
>  arch/x86/kvm/pmu.c                            |   1 -
>  arch/x86/kvm/reverse_cpuid.h                  |  23 +-
>  arch/x86/kvm/smm.c                            |  10 +-
>  arch/x86/kvm/svm/nested.c                     |  22 +-
>  arch/x86/kvm/svm/pmu.c                        |   8 +-
>  arch/x86/kvm/svm/sev.c                        |  21 +-
>  arch/x86/kvm/svm/svm.c                        |  46 +-
>  arch/x86/kvm/svm/svm.h                        |   4 +-
>  arch/x86/kvm/vmx/hyperv.h                     |   2 +-
>  arch/x86/kvm/vmx/nested.c                     |  18 +-
>  arch/x86/kvm/vmx/pmu_intel.c                  |   4 +-
>  arch/x86/kvm/vmx/sgx.c                        |  14 +-
>  arch/x86/kvm/vmx/vmx.c                        |  61 +-
>  arch/x86/kvm/x86.c                            | 153 ++-
>  arch/x86/kvm/x86.h                            |   6 +-
>  include/uapi/linux/kvm.h                      |   4 -
>  .../selftests/kvm/include/x86_64/processor.h  |  18 +-
>  .../selftests/kvm/x86_64/kvm_pv_test.c        |  38 +-
>  .../selftests/kvm/x86_64/set_sregs_test.c     |  63 +-
>  28 files changed, 1017 insertions(+), 685 deletions(-)
>  delete mode 100644 arch/x86/kvm/governed_features.h
> 
> 
> base-commit: 4d911c7abee56771b0219a9fbf0120d06bdc9c14




* Re: [PATCH v3 57/57] KVM: x86: Use only local variables (no bitmask) to init kvm_cpu_caps
  2024-11-28  1:34 ` [PATCH v3 57/57] KVM: x86: Use only local variables (no bitmask) to init kvm_cpu_caps Sean Christopherson
@ 2024-12-18  1:15   ` Maxim Levitsky
  0 siblings, 0 replies; 65+ messages in thread
From: Maxim Levitsky @ 2024-12-18  1:15 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Hou Wenlong, Xiaoyao Li, Kechen Lu,
	Oliver Upton, Binbin Wu, Yang Weijiang, Robert Hoo

On Wed, 2024-11-27 at 17:34 -0800, Sean Christopherson wrote:
> Refactor the kvm_cpu_cap_init() macro magic to collect supported features
> in a local variable instead of passing them to the macro as a "mask".  As
> pointed out by Maxim, relying on macros to "return" a value and set local
> variables is surprising, as the bitwise-OR logic suggests the macros are
> pure, i.e. have no side effects.
> 
> Ideally, the feature initializers would have zero side effects, e.g. would
> take local variables as params, but there isn't a sane way to do so
> without either sacrificing the various compile-time assertions (basically
> a non-starter), or passing at least one variable, e.g. a struct, to each
> macro usage (adds a lot of noise and boilerplate code).
> 
> Opportunistically force callers to emit a trailing comma by intentionally
> omitting a semicolon after invoking the feature initializers.  Forcing a
> trailing comma isolates future changes to a single line, i.e. doesn't
> cause churn for unrelated features/lines when adding/removing/modifying a
> feature.
> 
> No functional change intended.
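
For readers following along, the mechanism described above can be sketched
in a few lines of standalone GNU C.  This is only an illustration, not the
KVM code: every demo_*/DEMO_* name and every bit position below is
hypothetical, and it assumes a compiler that supports statement expressions
and named variadic macros (gcc or clang).  Because the init macro omits the
semicolon after the initializer list, the list and the final assignment form
one comma expression, so every entry, including the last, must end in a comma.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only (not a real CPUID leaf). */
enum demo_feature {
	DEMO_FEAT_A = 0,
	DEMO_FEAT_B = 3,
	DEMO_FEAT_C = 5,
	DEMO_FEAT_D = 7,
};

#define demo_feature_bit(name)	(UINT32_C(1) << DEMO_FEAT_##name)

/* Plain feature: OR the bit into the accumulator owned by demo_cap_init(). */
#define DEMO_F(name)						\
({								\
	demo_features |= demo_feature_bit(name);		\
})

/* Emulated feature: record it in a side mask, then advertise it as usual. */
#define DEMO_EMULATED_F(name)					\
({								\
	demo_emulated |= demo_feature_bit(name);		\
	DEMO_F(name);						\
})

/* Known-but-unadvertised feature: reference the bit, set nothing. */
#define DEMO_VENDOR_F(name)					\
({								\
	(void)demo_feature_bit(name);				\
})

/*
 * No semicolon after feature_initializers: the caller's trailing comma
 * chains the last initializer to the assignment that follows.
 */
#define demo_cap_init(out, emu_out, feature_initializers...)	\
do {								\
	uint32_t demo_features = 0;				\
	uint32_t demo_emulated = 0;				\
								\
	feature_initializers					\
								\
	(out) = demo_features;					\
	(emu_out) = demo_emulated;				\
} while (0)

/* Returns the advertised mask; stores the emulated subset in *emulated. */
uint32_t demo_init_caps(uint32_t *emulated)
{
	uint32_t caps;

	demo_cap_init(caps, *emulated,
		DEMO_F(A),
		DEMO_VENDOR_F(B),
		DEMO_EMULATED_F(C),
		DEMO_F(D),
	);
	return caps;
}
```

In this sketch, adding or removing a feature touches exactly one line, and
dropping the trailing comma after the last entry fails to compile instead of
silently changing the mask.
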
> 
> Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 541 ++++++++++++++++++++++---------------------
>  1 file changed, 273 insertions(+), 268 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index e03154b9833f..572dfa7e206e 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -661,7 +661,7 @@ static __always_inline u32 raw_cpuid_get(struct cpuid_reg cpuid)
>   * capabilities as well as raw CPUID.  For KVM-defined leafs, consult only raw
>   * CPUID, as KVM is the one and only authority (in the kernel).
>   */
> -#define kvm_cpu_cap_init(leaf, mask)					\
> +#define kvm_cpu_cap_init(leaf, feature_initializers...)			\
>  do {									\
>  	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
>  	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
> @@ -669,8 +669,11 @@ do {									\
>  	u32 kvm_cpu_cap_passthrough = 0;				\
>  	u32 kvm_cpu_cap_synthesized = 0;				\
>  	u32 kvm_cpu_cap_emulated = 0;					\
> +	u32 kvm_cpu_cap_features = 0;					\
>  									\
> -	kvm_cpu_caps[leaf] = (mask);					\
> +	feature_initializers						\
> +									\
> +	kvm_cpu_caps[leaf] = kvm_cpu_cap_features;			\
>  									\
>  	if (leaf < NCAPINTS)						\
>  		kvm_cpu_caps[leaf] &= kernel_cpu_caps[leaf];		\
> @@ -696,7 +699,7 @@ do {									\
>  #define F(name)							\
>  ({								\
>  	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
> -	feature_bit(name);					\
> +	kvm_cpu_cap_features |= feature_bit(name);		\
>  })
>  
>  /* Scattered Flag - For features that are scattered by cpufeatures.h. */
> @@ -704,14 +707,16 @@ do {									\
>  ({								\
>  	BUILD_BUG_ON(X86_FEATURE_##name >= MAX_CPU_FEATURES);	\
>  	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
> -	(boot_cpu_has(X86_FEATURE_##name) ? F(name) : 0);	\
> +	if (boot_cpu_has(X86_FEATURE_##name))			\
> +		F(name);					\
>  })
>  
>  /* Features that KVM supports only on 64-bit kernels. */
>  #define X86_64_F(name)						\
>  ({								\
>  	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
> -	(IS_ENABLED(CONFIG_X86_64) ? F(name) : 0);		\
> +	if (IS_ENABLED(CONFIG_X86_64))				\
> +		F(name);					\
>  })
>  
>  /*
> @@ -720,7 +725,7 @@ do {									\
>   */
>  #define EMULATED_F(name)					\
>  ({								\
> -	kvm_cpu_cap_emulated |= F(name);			\
> +	kvm_cpu_cap_emulated |= feature_bit(name);		\
>  	F(name);						\
>  })
>  
> @@ -731,7 +736,7 @@ do {									\
>   */
>  #define SYNTHESIZED_F(name)					\
>  ({								\
> -	kvm_cpu_cap_synthesized |= F(name);			\
> +	kvm_cpu_cap_synthesized |= feature_bit(name);		\
>  	F(name);						\
>  })
>  
> @@ -743,7 +748,7 @@ do {									\
>   */
>  #define PASSTHROUGH_F(name)					\
>  ({								\
> -	kvm_cpu_cap_passthrough |= F(name);			\
> +	kvm_cpu_cap_passthrough |= feature_bit(name);		\
>  	F(name);						\
>  })
>  
> @@ -755,7 +760,7 @@ do {									\
>  ({										\
>  	BUILD_BUG_ON(__feature_leaf(X86_FEATURE_##name) != CPUID_1_EDX);	\
>  	BUILD_BUG_ON(kvm_cpu_cap_init_in_progress != CPUID_8000_0001_EDX);	\
> -	feature_bit(name);							\
> +	kvm_cpu_cap_features |= feature_bit(name);				\
>  })
>  
>  /*
> @@ -765,7 +770,6 @@ do {									\
>  #define VENDOR_F(name)						\
>  ({								\
>  	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
> -	0;							\
>  })
>  
>  /*
> @@ -775,7 +779,6 @@ do {									\
>  #define RUNTIME_F(name)						\
>  ({								\
>  	KVM_VALIDATE_CPU_CAP_USAGE(name);			\
> -	0;							\
>  })
>  
>  /*
> @@ -795,126 +798,128 @@ void kvm_set_cpu_caps(void)
>  		     sizeof(boot_cpu_data.x86_capability));
>  
>  	kvm_cpu_cap_init(CPUID_1_ECX,
> -		F(XMM3) |
> -		F(PCLMULQDQ) |
> -		VENDOR_F(DTES64) |
> +		F(XMM3),
> +		F(PCLMULQDQ),
> +		VENDOR_F(DTES64),
>  		/*
>  		 * NOTE: MONITOR (and MWAIT) are emulated as NOP, but *not*
>  		 * advertised to guests via CPUID!  MWAIT is also technically a
>  		 * runtime flag thanks to IA32_MISC_ENABLES; mark it as such so
>  		 * that KVM is aware that it's a known, unadvertised flag.
>  		 */
> -		RUNTIME_F(MWAIT) |
> -		VENDOR_F(VMX) |
> -		0 /* DS-CPL, SMX, EST */ |
> -		0 /* TM2 */ |
> -		F(SSSE3) |
> -		0 /* CNXT-ID */ |
> -		0 /* Reserved */ |
> -		F(FMA) |
> -		F(CX16) |
> -		0 /* xTPR Update */ |
> -		F(PDCM) |
> -		F(PCID) |
> -		0 /* Reserved, DCA */ |
> -		F(XMM4_1) |
> -		F(XMM4_2) |
> -		EMULATED_F(X2APIC) |
> -		F(MOVBE) |
> -		F(POPCNT) |
> -		EMULATED_F(TSC_DEADLINE_TIMER) |
> -		F(AES) |
> -		F(XSAVE) |
> -		RUNTIME_F(OSXSAVE) |
> -		F(AVX) |
> -		F(F16C) |
> -		F(RDRAND) |
> -		EMULATED_F(HYPERVISOR)
> +		RUNTIME_F(MWAIT),
> +		/* DS-CPL */
> +		VENDOR_F(VMX),
> +		/* SMX, EST */
> +		/* TM2 */
> +		F(SSSE3),
> +		/* CNXT-ID */
> +		/* Reserved */
> +		F(FMA),
> +		F(CX16),
> +		/* xTPR Update */
> +		F(PDCM),
> +		F(PCID),
> +		/* Reserved, DCA */
> +		F(XMM4_1),
> +		F(XMM4_2),
> +		EMULATED_F(X2APIC),
> +		F(MOVBE),
> +		F(POPCNT),
> +		EMULATED_F(TSC_DEADLINE_TIMER),
> +		F(AES),
> +		F(XSAVE),
> +		RUNTIME_F(OSXSAVE),
> +		F(AVX),
> +		F(F16C),
> +		F(RDRAND),
> +		EMULATED_F(HYPERVISOR),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_1_EDX,
> -		F(FPU) |
> -		F(VME) |
> -		F(DE) |
> -		F(PSE) |
> -		F(TSC) |
> -		F(MSR) |
> -		F(PAE) |
> -		F(MCE) |
> -		F(CX8) |
> -		F(APIC) |
> -		0 /* Reserved */ |
> -		F(SEP) |
> -		F(MTRR) |
> -		F(PGE) |
> -		F(MCA) |
> -		F(CMOV) |
> -		F(PAT) |
> -		F(PSE36) |
> -		0 /* PSN */ |
> -		F(CLFLUSH) |
> -		0 /* Reserved */ |
> -		VENDOR_F(DS) |
> -		0 /* ACPI */ |
> -		F(MMX) |
> -		F(FXSR) |
> -		F(XMM) |
> -		F(XMM2) |
> -		F(SELFSNOOP) |
> -		0 /* HTT, TM, Reserved, PBE */
> +		F(FPU),
> +		F(VME),
> +		F(DE),
> +		F(PSE),
> +		F(TSC),
> +		F(MSR),
> +		F(PAE),
> +		F(MCE),
> +		F(CX8),
> +		F(APIC),
> +		/* Reserved */
> +		F(SEP),
> +		F(MTRR),
> +		F(PGE),
> +		F(MCA),
> +		F(CMOV),
> +		F(PAT),
> +		F(PSE36),
> +		/* PSN */
> +		F(CLFLUSH),
> +		/* Reserved */
> +		VENDOR_F(DS),
> +		/* ACPI */
> +		F(MMX),
> +		F(FXSR),
> +		F(XMM),
> +		F(XMM2),
> +		F(SELFSNOOP),
> +		/* HTT, TM, Reserved, PBE */
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_7_0_EBX,
> -		F(FSGSBASE) |
> -		EMULATED_F(TSC_ADJUST) |
> -		F(SGX) |
> -		F(BMI1) |
> -		F(HLE) |
> -		F(AVX2) |
> -		F(FDP_EXCPTN_ONLY) |
> -		F(SMEP) |
> -		F(BMI2) |
> -		F(ERMS) |
> -		F(INVPCID) |
> -		F(RTM) |
> -		F(ZERO_FCS_FDS) |
> -		VENDOR_F(MPX) |
> -		F(AVX512F) |
> -		F(AVX512DQ) |
> -		F(RDSEED) |
> -		F(ADX) |
> -		F(SMAP) |
> -		F(AVX512IFMA) |
> -		F(CLFLUSHOPT) |
> -		F(CLWB) |
> -		VENDOR_F(INTEL_PT) |
> -		F(AVX512PF) |
> -		F(AVX512ER) |
> -		F(AVX512CD) |
> -		F(SHA_NI) |
> -		F(AVX512BW) |
> -		F(AVX512VL));
> +		F(FSGSBASE),
> +		EMULATED_F(TSC_ADJUST),
> +		F(SGX),
> +		F(BMI1),
> +		F(HLE),
> +		F(AVX2),
> +		F(FDP_EXCPTN_ONLY),
> +		F(SMEP),
> +		F(BMI2),
> +		F(ERMS),
> +		F(INVPCID),
> +		F(RTM),
> +		F(ZERO_FCS_FDS),
> +		VENDOR_F(MPX),
> +		F(AVX512F),
> +		F(AVX512DQ),
> +		F(RDSEED),
> +		F(ADX),
> +		F(SMAP),
> +		F(AVX512IFMA),
> +		F(CLFLUSHOPT),
> +		F(CLWB),
> +		VENDOR_F(INTEL_PT),
> +		F(AVX512PF),
> +		F(AVX512ER),
> +		F(AVX512CD),
> +		F(SHA_NI),
> +		F(AVX512BW),
> +		F(AVX512VL),
> +	);
>  
>  	kvm_cpu_cap_init(CPUID_7_ECX,
> -		F(AVX512VBMI) |
> -		PASSTHROUGH_F(LA57) |
> -		F(PKU) |
> -		RUNTIME_F(OSPKE) |
> -		F(RDPID) |
> -		F(AVX512_VPOPCNTDQ) |
> -		F(UMIP) |
> -		F(AVX512_VBMI2) |
> -		F(GFNI) |
> -		F(VAES) |
> -		F(VPCLMULQDQ) |
> -		F(AVX512_VNNI) |
> -		F(AVX512_BITALG) |
> -		F(CLDEMOTE) |
> -		F(MOVDIRI) |
> -		F(MOVDIR64B) |
> -		VENDOR_F(WAITPKG) |
> -		F(SGX_LC) |
> -		F(BUS_LOCK_DETECT)
> +		F(AVX512VBMI),
> +		PASSTHROUGH_F(LA57),
> +		F(PKU),
> +		RUNTIME_F(OSPKE),
> +		F(RDPID),
> +		F(AVX512_VPOPCNTDQ),
> +		F(UMIP),
> +		F(AVX512_VBMI2),
> +		F(GFNI),
> +		F(VAES),
> +		F(VPCLMULQDQ),
> +		F(AVX512_VNNI),
> +		F(AVX512_BITALG),
> +		F(CLDEMOTE),
> +		F(MOVDIRI),
> +		F(MOVDIR64B),
> +		VENDOR_F(WAITPKG),
> +		F(SGX_LC),
> +		F(BUS_LOCK_DETECT),
>  	);
>  
>  	/*
> @@ -925,22 +930,22 @@ void kvm_set_cpu_caps(void)
>  		kvm_cpu_cap_clear(X86_FEATURE_PKU);
>  
>  	kvm_cpu_cap_init(CPUID_7_EDX,
> -		F(AVX512_4VNNIW) |
> -		F(AVX512_4FMAPS) |
> -		F(SPEC_CTRL) |
> -		F(SPEC_CTRL_SSBD) |
> -		EMULATED_F(ARCH_CAPABILITIES) |
> -		F(INTEL_STIBP) |
> -		F(MD_CLEAR) |
> -		F(AVX512_VP2INTERSECT) |
> -		F(FSRM) |
> -		F(SERIALIZE) |
> -		F(TSXLDTRK) |
> -		F(AVX512_FP16) |
> -		F(AMX_TILE) |
> -		F(AMX_INT8) |
> -		F(AMX_BF16) |
> -		F(FLUSH_L1D)
> +		F(AVX512_4VNNIW),
> +		F(AVX512_4FMAPS),
> +		F(SPEC_CTRL),
> +		F(SPEC_CTRL_SSBD),
> +		EMULATED_F(ARCH_CAPABILITIES),
> +		F(INTEL_STIBP),
> +		F(MD_CLEAR),
> +		F(AVX512_VP2INTERSECT),
> +		F(FSRM),
> +		F(SERIALIZE),
> +		F(TSXLDTRK),
> +		F(AVX512_FP16),
> +		F(AMX_TILE),
> +		F(AMX_INT8),
> +		F(AMX_BF16),
> +		F(FLUSH_L1D),
>  	);
>  
>  	if (boot_cpu_has(X86_FEATURE_AMD_IBPB_RET) &&
> @@ -953,132 +958,132 @@ void kvm_set_cpu_caps(void)
>  		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
>  
>  	kvm_cpu_cap_init(CPUID_7_1_EAX,
> -		F(SHA512) |
> -		F(SM3) |
> -		F(SM4) |
> -		F(AVX_VNNI) |
> -		F(AVX512_BF16) |
> -		F(CMPCCXADD) |
> -		F(FZRM) |
> -		F(FSRS) |
> -		F(FSRC) |
> -		F(AMX_FP16) |
> -		F(AVX_IFMA) |
> -		F(LAM)
> +		F(SHA512),
> +		F(SM3),
> +		F(SM4),
> +		F(AVX_VNNI),
> +		F(AVX512_BF16),
> +		F(CMPCCXADD),
> +		F(FZRM),
> +		F(FSRS),
> +		F(FSRC),
> +		F(AMX_FP16),
> +		F(AVX_IFMA),
> +		F(LAM),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_7_1_EDX,
> -		F(AVX_VNNI_INT8) |
> -		F(AVX_NE_CONVERT) |
> -		F(AMX_COMPLEX) |
> -		F(AVX_VNNI_INT16) |
> -		F(PREFETCHITI) |
> -		F(AVX10)
> +		F(AVX_VNNI_INT8),
> +		F(AVX_NE_CONVERT),
> +		F(AMX_COMPLEX),
> +		F(AVX_VNNI_INT16),
> +		F(PREFETCHITI),
> +		F(AVX10),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_7_2_EDX,
> -		F(INTEL_PSFD) |
> -		F(IPRED_CTRL) |
> -		F(RRSBA_CTRL) |
> -		F(DDPD_U) |
> -		F(BHI_CTRL) |
> -		F(MCDT_NO)
> +		F(INTEL_PSFD),
> +		F(IPRED_CTRL),
> +		F(RRSBA_CTRL),
> +		F(DDPD_U),
> +		F(BHI_CTRL),
> +		F(MCDT_NO),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_D_1_EAX,
> -		F(XSAVEOPT) |
> -		F(XSAVEC) |
> -		F(XGETBV1) |
> -		F(XSAVES) |
> -		X86_64_F(XFD)
> +		F(XSAVEOPT),
> +		F(XSAVEC),
> +		F(XGETBV1),
> +		F(XSAVES),
> +		X86_64_F(XFD),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_12_EAX,
> -		SCATTERED_F(SGX1) |
> -		SCATTERED_F(SGX2) |
> -		SCATTERED_F(SGX_EDECCSSA)
> +		SCATTERED_F(SGX1),
> +		SCATTERED_F(SGX2),
> +		SCATTERED_F(SGX_EDECCSSA),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_24_0_EBX,
> -		F(AVX10_128) |
> -		F(AVX10_256) |
> -		F(AVX10_512)
> +		F(AVX10_128),
> +		F(AVX10_256),
> +		F(AVX10_512),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_8000_0001_ECX,
> -		F(LAHF_LM) |
> -		F(CMP_LEGACY) |
> -		VENDOR_F(SVM) |
> -		0 /* ExtApicSpace */ |
> -		F(CR8_LEGACY) |
> -		F(ABM) |
> -		F(SSE4A) |
> -		F(MISALIGNSSE) |
> -		F(3DNOWPREFETCH) |
> -		F(OSVW) |
> -		0 /* IBS */ |
> -		F(XOP) |
> -		0 /* SKINIT, WDT, LWP */ |
> -		F(FMA4) |
> -		F(TBM) |
> -		F(TOPOEXT) |
> -		VENDOR_F(PERFCTR_CORE)
> +		F(LAHF_LM),
> +		F(CMP_LEGACY),
> +		VENDOR_F(SVM),
> +		/* ExtApicSpace */
> +		F(CR8_LEGACY),
> +		F(ABM),
> +		F(SSE4A),
> +		F(MISALIGNSSE),
> +		F(3DNOWPREFETCH),
> +		F(OSVW),
> +		/* IBS */
> +		F(XOP),
> +		/* SKINIT, WDT, LWP */
> +		F(FMA4),
> +		F(TBM),
> +		F(TOPOEXT),
> +		VENDOR_F(PERFCTR_CORE),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_8000_0001_EDX,
> -		ALIASED_1_EDX_F(FPU) |
> -		ALIASED_1_EDX_F(VME) |
> -		ALIASED_1_EDX_F(DE) |
> -		ALIASED_1_EDX_F(PSE) |
> -		ALIASED_1_EDX_F(TSC) |
> -		ALIASED_1_EDX_F(MSR) |
> -		ALIASED_1_EDX_F(PAE) |
> -		ALIASED_1_EDX_F(MCE) |
> -		ALIASED_1_EDX_F(CX8) |
> -		ALIASED_1_EDX_F(APIC) |
> -		0 /* Reserved */ |
> -		F(SYSCALL) |
> -		ALIASED_1_EDX_F(MTRR) |
> -		ALIASED_1_EDX_F(PGE) |
> -		ALIASED_1_EDX_F(MCA) |
> -		ALIASED_1_EDX_F(CMOV) |
> -		ALIASED_1_EDX_F(PAT) |
> -		ALIASED_1_EDX_F(PSE36) |
> -		0 /* Reserved */ |
> -		F(NX) |
> -		0 /* Reserved */ |
> -		F(MMXEXT) |
> -		ALIASED_1_EDX_F(MMX) |
> -		ALIASED_1_EDX_F(FXSR) |
> -		F(FXSR_OPT) |
> -		X86_64_F(GBPAGES) |
> -		F(RDTSCP) |
> -		0 /* Reserved */ |
> -		X86_64_F(LM) |
> -		F(3DNOWEXT) |
> -		F(3DNOW)
> +		ALIASED_1_EDX_F(FPU),
> +		ALIASED_1_EDX_F(VME),
> +		ALIASED_1_EDX_F(DE),
> +		ALIASED_1_EDX_F(PSE),
> +		ALIASED_1_EDX_F(TSC),
> +		ALIASED_1_EDX_F(MSR),
> +		ALIASED_1_EDX_F(PAE),
> +		ALIASED_1_EDX_F(MCE),
> +		ALIASED_1_EDX_F(CX8),
> +		ALIASED_1_EDX_F(APIC),
> +		/* Reserved */
> +		F(SYSCALL),
> +		ALIASED_1_EDX_F(MTRR),
> +		ALIASED_1_EDX_F(PGE),
> +		ALIASED_1_EDX_F(MCA),
> +		ALIASED_1_EDX_F(CMOV),
> +		ALIASED_1_EDX_F(PAT),
> +		ALIASED_1_EDX_F(PSE36),
> +		/* Reserved */
> +		F(NX),
> +		/* Reserved */
> +		F(MMXEXT),
> +		ALIASED_1_EDX_F(MMX),
> +		ALIASED_1_EDX_F(FXSR),
> +		F(FXSR_OPT),
> +		X86_64_F(GBPAGES),
> +		F(RDTSCP),
> +		/* Reserved */
> +		X86_64_F(LM),
> +		F(3DNOWEXT),
> +		F(3DNOW),
>  	);
>  
>  	if (!tdp_enabled && IS_ENABLED(CONFIG_X86_64))
>  		kvm_cpu_cap_set(X86_FEATURE_GBPAGES);
>  
>  	kvm_cpu_cap_init(CPUID_8000_0007_EDX,
> -		SCATTERED_F(CONSTANT_TSC)
> +		SCATTERED_F(CONSTANT_TSC),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_8000_0008_EBX,
> -		F(CLZERO) |
> -		F(XSAVEERPTR) |
> -		F(WBNOINVD) |
> -		F(AMD_IBPB) |
> -		F(AMD_IBRS) |
> -		F(AMD_SSBD) |
> -		F(VIRT_SSBD) |
> -		F(AMD_SSB_NO) |
> -		F(AMD_STIBP) |
> -		F(AMD_STIBP_ALWAYS_ON) |
> -		F(AMD_PSFD) |
> -		F(AMD_IBPB_RET)
> +		F(CLZERO),
> +		F(XSAVEERPTR),
> +		F(WBNOINVD),
> +		F(AMD_IBPB),
> +		F(AMD_IBRS),
> +		F(AMD_SSBD),
> +		F(VIRT_SSBD),
> +		F(AMD_SSB_NO),
> +		F(AMD_STIBP),
> +		F(AMD_STIBP_ALWAYS_ON),
> +		F(AMD_PSFD),
> +		F(AMD_IBPB_RET),
>  	);
>  
>  	/*
> @@ -1110,30 +1115,30 @@ void kvm_set_cpu_caps(void)
>  
>  	/* All SVM features required additional vendor module enabling. */
>  	kvm_cpu_cap_init(CPUID_8000_000A_EDX,
> -		VENDOR_F(NPT) |
> -		VENDOR_F(VMCBCLEAN) |
> -		VENDOR_F(FLUSHBYASID) |
> -		VENDOR_F(NRIPS) |
> -		VENDOR_F(TSCRATEMSR) |
> -		VENDOR_F(V_VMSAVE_VMLOAD) |
> -		VENDOR_F(LBRV) |
> -		VENDOR_F(PAUSEFILTER) |
> -		VENDOR_F(PFTHRESHOLD) |
> -		VENDOR_F(VGIF) |
> -		VENDOR_F(VNMI) |
> -		VENDOR_F(SVME_ADDR_CHK)
> +		VENDOR_F(NPT),
> +		VENDOR_F(VMCBCLEAN),
> +		VENDOR_F(FLUSHBYASID),
> +		VENDOR_F(NRIPS),
> +		VENDOR_F(TSCRATEMSR),
> +		VENDOR_F(V_VMSAVE_VMLOAD),
> +		VENDOR_F(LBRV),
> +		VENDOR_F(PAUSEFILTER),
> +		VENDOR_F(PFTHRESHOLD),
> +		VENDOR_F(VGIF),
> +		VENDOR_F(VNMI),
> +		VENDOR_F(SVME_ADDR_CHK),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_8000_001F_EAX,
> -		VENDOR_F(SME) |
> -		VENDOR_F(SEV) |
> -		0 /* VM_PAGE_FLUSH */ |
> -		VENDOR_F(SEV_ES) |
> -		F(SME_COHERENT)
> +		VENDOR_F(SME),
> +		VENDOR_F(SEV),
> +		/* VM_PAGE_FLUSH */
> +		VENDOR_F(SEV_ES),
> +		F(SME_COHERENT),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_8000_0021_EAX,
> -		F(NO_NESTED_DATA_BP) |
> +		F(NO_NESTED_DATA_BP),
>  		/*
>  		 * Synthesize "LFENCE is serializing" into the AMD-defined entry
>  		 * in KVM's supported CPUID, i.e. if the feature is reported as
> @@ -1144,36 +1149,36 @@ void kvm_set_cpu_caps(void)
>  		 * CPUID will drop the flags, and reporting support in AMD's
>  		 * leaf can make it easier for userspace to detect the feature.
>  		 */
> -		SYNTHESIZED_F(LFENCE_RDTSC) |
> -		0 /* SmmPgCfgLock */ |
> -		F(NULL_SEL_CLR_BASE) |
> -		F(AUTOIBRS) |
> -		EMULATED_F(NO_SMM_CTL_MSR) |
> -		0 /* PrefetchCtlMsr */ |
> -		F(WRMSR_XX_BASE_NS) |
> -		SYNTHESIZED_F(SBPB) |
> -		SYNTHESIZED_F(IBPB_BRTYPE) |
> -		SYNTHESIZED_F(SRSO_NO)
> +		SYNTHESIZED_F(LFENCE_RDTSC),
> +		/* SmmPgCfgLock */
> +		F(NULL_SEL_CLR_BASE),
> +		F(AUTOIBRS),
> +		EMULATED_F(NO_SMM_CTL_MSR),
> +		/* PrefetchCtlMsr */
> +		F(WRMSR_XX_BASE_NS),
> +		SYNTHESIZED_F(SBPB),
> +		SYNTHESIZED_F(IBPB_BRTYPE),
> +		SYNTHESIZED_F(SRSO_NO),
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_8000_0022_EAX,
> -		F(PERFMON_V2)
> +		F(PERFMON_V2),
>  	);
>  
>  	if (!static_cpu_has_bug(X86_BUG_NULL_SEG))
>  		kvm_cpu_cap_set(X86_FEATURE_NULL_SEL_CLR_BASE);
>  
>  	kvm_cpu_cap_init(CPUID_C000_0001_EDX,
> -		F(XSTORE) |
> -		F(XSTORE_EN) |
> -		F(XCRYPT) |
> -		F(XCRYPT_EN) |
> -		F(ACE2) |
> -		F(ACE2_EN) |
> -		F(PHE) |
> -		F(PHE_EN) |
> -		F(PMM) |
> -		F(PMM_EN)
> +		F(XSTORE),
> +		F(XSTORE_EN),
> +		F(XCRYPT),
> +		F(XCRYPT_EN),
> +		F(ACE2),
> +		F(ACE2_EN),
> +		F(PHE),
> +		F(PHE_EN),
> +		F(PMM),
> +		F(PMM_EN),
>  	);
>  
>  	/*


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>





* Re: [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching
  2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
                   ` (57 preceding siblings ...)
  2024-12-18  1:13 ` [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Maxim Levitsky
@ 2024-12-19  2:40 ` Sean Christopherson
  58 siblings, 0 replies; 65+ messages in thread
From: Sean Christopherson @ 2024-12-19  2:40 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov,
	Jarkko Sakkinen
  Cc: kvm, linux-sgx, linux-kernel, Maxim Levitsky, Hou Wenlong,
	Xiaoyao Li, Kechen Lu, Oliver Upton, Binbin Wu, Yang Weijiang,
	Robert Hoo

On Wed, 27 Nov 2024 17:33:27 -0800, Sean Christopherson wrote:
> The super short TL;DR: snapshot all X86_FEATURE_* flags that KVM cares
> about so that all queries against guest capabilities are "fast", e.g. don't
> require manual enabling or judgment calls as to where a feature needs to be
> fast.
> 
> The guest_cpu_cap_* nomenclature follows the existing kvm_cpu_cap_*
> except for a few (maybe just one?) cases where guest cpu_caps need APIs
> that kvm_cpu_caps don't.  In theory, the similar names will make this
> approach more intuitive.
> 
> [...]

Applied to kvm-x86 misc, thanks!

[01/57] KVM: x86: Use feature_bit() to clear CONSTANT_TSC when emulating CPUID
        https://github.com/kvm-x86/linux/commit/ccf4c1d15d5a
[02/57] KVM: x86: Limit use of F() and SF() to kvm_cpu_cap_{mask,init_kvm_defined}()
        https://github.com/kvm-x86/linux/commit/4b027f5af907
[03/57] KVM: x86: Do all post-set CPUID processing during vCPU creation
        https://github.com/kvm-x86/linux/commit/85e5ba83c016
[04/57] KVM: x86: Explicitly do runtime CPUID updates "after" initial setup
        https://github.com/kvm-x86/linux/commit/ec3d4440b2c8
[05/57] KVM: x86: Account for KVM-reserved CR4 bits when passing through CR4 on VMX
        https://github.com/kvm-x86/linux/commit/7520a53b8e0a
[06/57] KVM: selftests: Update x86's set_sregs_test to match KVM's CPUID enforcement
        https://github.com/kvm-x86/linux/commit/bf4dfc3aa875
[07/57] KVM: selftests: Assert that vcpu->cpuid is non-NULL when getting CPUID entries
        https://github.com/kvm-x86/linux/commit/08833719e770
[08/57] KVM: selftests: Refresh vCPU CPUID cache in __vcpu_get_cpuid_entry()
        https://github.com/kvm-x86/linux/commit/a2a791e82086
[09/57] KVM: selftests: Verify KVM stuffs runtime CPUID OS bits on CR4 writes
        https://github.com/kvm-x86/linux/commit/01bcd829c63f
[10/57] KVM: x86: Move __kvm_is_valid_cr4() definition to x86.h
        https://github.com/kvm-x86/linux/commit/b0c3d6871778
[11/57] KVM: x86/pmu: Drop now-redundant refresh() during init()
        https://github.com/kvm-x86/linux/commit/ac32cbd4dfc6
[12/57] KVM: x86: Drop now-redundant MAXPHYADDR and GPA rsvd bits from vCPU creation
        https://github.com/kvm-x86/linux/commit/21d7f06d1a83
[13/57] KVM: x86: Disallow KVM_CAP_X86_DISABLE_EXITS after vCPU creation
        https://github.com/kvm-x86/linux/commit/04cd8f8628d8
[14/57] KVM: x86: Reject disabling of MWAIT/HLT interception when not allowed
        https://github.com/kvm-x86/linux/commit/c829ccd4d9dc
[15/57] KVM: x86: Drop the now unused KVM_X86_DISABLE_VALID_EXITS
        https://github.com/kvm-x86/linux/commit/af5366bea2cb
[16/57] KVM: selftests: Fix a bad TEST_REQUIRE() in x86's KVM PV test
        https://github.com/kvm-x86/linux/commit/7b2658cb33c7
[17/57] KVM: selftests: Update x86's KVM PV test to match KVM's disabling exits behavior
        https://github.com/kvm-x86/linux/commit/59cb3acdb316
[18/57] KVM: x86: Zero out PV features cache when the CPUID leaf is not present
        https://github.com/kvm-x86/linux/commit/01d1059d635a
[19/57] KVM: x86: Don't update PV features caches when enabling enforcement capability
        https://github.com/kvm-x86/linux/commit/f21958e328a9
[20/57] KVM: x86: Do reverse CPUID sanity checks in __feature_leaf()
        https://github.com/kvm-x86/linux/commit/6416b0fb1660
[21/57] KVM: x86: Account for max supported CPUID leaf when getting raw host CPUID
        https://github.com/kvm-x86/linux/commit/96cbc766baf0
[22/57] KVM: x86: Unpack F() CPUID feature flag macros to one flag per line of code
        https://github.com/kvm-x86/linux/commit/ccf93de484a3
[23/57] KVM: x86: Rename kvm_cpu_cap_mask() to kvm_cpu_cap_init()
        https://github.com/kvm-x86/linux/commit/3cc359ca29ad
[24/57] KVM: x86: Add a macro to init CPUID features that are 64-bit only
        https://github.com/kvm-x86/linux/commit/6eac4d99a967
[25/57] KVM: x86: Add a macro to precisely handle aliased 0x1.EDX CPUID features
        https://github.com/kvm-x86/linux/commit/264969b48a29
[26/57] KVM: x86: Handle kernel- and KVM-defined CPUID words in a single helper
        https://github.com/kvm-x86/linux/commit/46505c0f69f9
[27/57] KVM: x86: #undef SPEC_CTRL_SSBD in cpuid.c to avoid macro collisions
        https://github.com/kvm-x86/linux/commit/8d862c270bf1
[28/57] KVM: x86: Harden CPU capabilities processing against out-of-scope features
        https://github.com/kvm-x86/linux/commit/3d142340d717
[29/57] KVM: x86: Add a macro to init CPUID features that ignore host kernel support
        https://github.com/kvm-x86/linux/commit/5c8de4b3a5bc
[30/57] KVM: x86: Add a macro to init CPUID features that KVM emulates in software
        https://github.com/kvm-x86/linux/commit/6174004ebd25
[31/57] KVM: x86: Swap incoming guest CPUID into vCPU before massaging in KVM_SET_CPUID2
        https://github.com/kvm-x86/linux/commit/8c01290bda1a
[32/57] KVM: x86: Clear PV_UNHALT for !HLT-exiting only when userspace sets CPUID
        https://github.com/kvm-x86/linux/commit/63d8c702c2d4
[33/57] KVM: x86: Remove unnecessary caching of KVM's PV CPUID base
        https://github.com/kvm-x86/linux/commit/a5b32718081e
[34/57] KVM: x86: Always operate on kvm_vcpu data in cpuid_entry2_find()
        https://github.com/kvm-x86/linux/commit/285185f8e479
[35/57] KVM: x86: Move kvm_find_cpuid_entry{,_index}() up near cpuid_entry2_find()
        https://github.com/kvm-x86/linux/commit/8b30cb367c46
[36/57] KVM: x86: Remove all direct usage of cpuid_entry2_find()
        https://github.com/kvm-x86/linux/commit/136d605b4365
[37/57] KVM: x86: Advertise TSC_DEADLINE_TIMER in KVM_GET_SUPPORTED_CPUID
        https://github.com/kvm-x86/linux/commit/9be4ec35d668
[38/57] KVM: x86: Advertise HYPERVISOR in KVM_GET_SUPPORTED_CPUID
        https://github.com/kvm-x86/linux/commit/9aa470f5ddb2
[39/57] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap"
        https://github.com/kvm-x86/linux/commit/2c5e168e5ce1
[40/57] KVM: x86: Replace guts of "governed" features with comprehensive cpu_caps
        https://github.com/kvm-x86/linux/commit/7ea34578aea7
[41/57] KVM: x86: Initialize guest cpu_caps based on guest CPUID
        https://github.com/kvm-x86/linux/commit/a7a308f863a1
[42/57] KVM: x86: Extract code for generating per-entry emulated CPUID information
        https://github.com/kvm-x86/linux/commit/ff402f56e8eb
[43/57] KVM: x86: Treat MONTIOR/MWAIT as a "partially emulated" feature
        https://github.com/kvm-x86/linux/commit/d4b9ff3d55de
[44/57] KVM: x86: Initialize guest cpu_caps based on KVM support
        https://github.com/kvm-x86/linux/commit/e592ec657d84
[45/57] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime
        https://github.com/kvm-x86/linux/commit/963180ae0637
[46/57] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf
        https://github.com/kvm-x86/linux/commit/cfd157452609
[47/57] KVM: x86: Update OS{XSAVE,PKE} bits in guest CPUID irrespective of host support
        https://github.com/kvm-x86/linux/commit/1f66590d7ff0
[48/57] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
        https://github.com/kvm-x86/linux/commit/75d4642fce01
[49/57] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has()
        https://github.com/kvm-x86/linux/commit/820545bdfeb0
[50/57] KVM: x86: Replace (almost) all guest CPUID feature queries with cpu_caps
        https://github.com/kvm-x86/linux/commit/8f2a27752e80
[51/57] KVM: x86: Drop superfluous host XSAVE check when adjusting guest XSAVES caps
        https://github.com/kvm-x86/linux/commit/cbdeea032bfe
[52/57] KVM: x86: Add a macro for features that are synthesized into boot_cpu_data
        https://github.com/kvm-x86/linux/commit/75c489e12d4b
[53/57] KVM: x86: Pull CPUID capabilities from boot_cpu_data only as needed
        https://github.com/kvm-x86/linux/commit/3fd55b522795
[54/57] KVM: x86: Rename "SF" macro to "SCATTERED_F"
        https://github.com/kvm-x86/linux/commit/9b2776c7cf2b
[55/57] KVM: x86: Explicitly track feature flags that require vendor enabling
        https://github.com/kvm-x86/linux/commit/0fea7aa2dc6a
[56/57] KVM: x86: Explicitly track feature flags that are enabled at runtime
        https://github.com/kvm-x86/linux/commit/ac9d1b7591a2
[57/57] KVM: x86: Use only local variables (no bitmask) to init kvm_cpu_caps
        https://github.com/kvm-x86/linux/commit/871ac338ef55

--
https://github.com/kvm-x86/linux/tree/next


end of thread, other threads:[~2024-12-19  2:42 UTC | newest]

Thread overview: 65+ messages
2024-11-28  1:33 [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 01/57] KVM: x86: Use feature_bit() to clear CONSTANT_TSC when emulating CPUID Sean Christopherson
2024-12-13 10:53   ` Vitaly Kuznetsov
2024-11-28  1:33 ` [PATCH v3 02/57] KVM: x86: Limit use of F() and SF() to kvm_cpu_cap_{mask,init_kvm_defined}() Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 03/57] KVM: x86: Do all post-set CPUID processing during vCPU creation Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 04/57] KVM: x86: Explicitly do runtime CPUID updates "after" initial setup Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 05/57] KVM: x86: Account for KVM-reserved CR4 bits when passing through CR4 on VMX Sean Christopherson
2024-12-13  1:30   ` Chao Gao
2024-11-28  1:33 ` [PATCH v3 06/57] KVM: selftests: Update x86's set_sregs_test to match KVM's CPUID enforcement Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 07/57] KVM: selftests: Assert that vcpu->cpuid is non-NULL when getting CPUID entries Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 08/57] KVM: selftests: Refresh vCPU CPUID cache in __vcpu_get_cpuid_entry() Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 09/57] KVM: selftests: Verify KVM stuffs runtime CPUID OS bits on CR4 writes Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 10/57] KVM: x86: Move __kvm_is_valid_cr4() definition to x86.h Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 11/57] KVM: x86/pmu: Drop now-redundant refresh() during init() Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 12/57] KVM: x86: Drop now-redundant MAXPHYADDR and GPA rsvd bits from vCPU creation Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 13/57] KVM: x86: Disallow KVM_CAP_X86_DISABLE_EXITS after vCPU creation Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 14/57] KVM: x86: Reject disabling of MWAIT/HLT interception when not allowed Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 15/57] KVM: x86: Drop the now unused KVM_X86_DISABLE_VALID_EXITS Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 16/57] KVM: selftests: Fix a bad TEST_REQUIRE() in x86's KVM PV test Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 17/57] KVM: selftests: Update x86's KVM PV test to match KVM's disabling exits behavior Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 18/57] KVM: x86: Zero out PV features cache when the CPUID leaf is not present Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 19/57] KVM: x86: Don't update PV features caches when enabling enforcement capability Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 20/57] KVM: x86: Do reverse CPUID sanity checks in __feature_leaf() Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 21/57] KVM: x86: Account for max supported CPUID leaf when getting raw host CPUID Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 22/57] KVM: x86: Unpack F() CPUID feature flag macros to one flag per line of code Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 23/57] KVM: x86: Rename kvm_cpu_cap_mask() to kvm_cpu_cap_init() Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 24/57] KVM: x86: Add a macro to init CPUID features that are 64-bit only Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 25/57] KVM: x86: Add a macro to precisely handle aliased 0x1.EDX CPUID features Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 26/57] KVM: x86: Handle kernel- and KVM-defined CPUID words in a single helper Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 27/57] KVM: x86: #undef SPEC_CTRL_SSBD in cpuid.c to avoid macro collisions Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 28/57] KVM: x86: Harden CPU capabilities processing against out-of-scope features Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 29/57] KVM: x86: Add a macro to init CPUID features that ignore host kernel support Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 30/57] KVM: x86: Add a macro to init CPUID features that KVM emulates in software Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 31/57] KVM: x86: Swap incoming guest CPUID into vCPU before massaging in KVM_SET_CPUID2 Sean Christopherson
2024-11-28  1:33 ` [PATCH v3 32/57] KVM: x86: Clear PV_UNHALT for !HLT-exiting only when userspace sets CPUID Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 33/57] KVM: x86: Remove unnecessary caching of KVM's PV CPUID base Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 34/57] KVM: x86: Always operate on kvm_vcpu data in cpuid_entry2_find() Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 35/57] KVM: x86: Move kvm_find_cpuid_entry{,_index}() up near cpuid_entry2_find() Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 36/57] KVM: x86: Remove all direct usage of cpuid_entry2_find() Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 37/57] KVM: x86: Advertise TSC_DEADLINE_TIMER in KVM_GET_SUPPORTED_CPUID Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 38/57] KVM: x86: Advertise HYPERVISOR in KVM_GET_SUPPORTED_CPUID Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 39/57] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 40/57] KVM: x86: Replace guts of "governed" features with comprehensive cpu_caps Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 41/57] KVM: x86: Initialize guest cpu_caps based on guest CPUID Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 42/57] KVM: x86: Extract code for generating per-entry emulated CPUID information Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 43/57] KVM: x86: Treat MONTIOR/MWAIT as a "partially emulated" feature Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 44/57] KVM: x86: Initialize guest cpu_caps based on KVM support Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 45/57] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 46/57] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 47/57] KVM: x86: Update OS{XSAVE,PKE} bits in guest CPUID irrespective of host support Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 48/57] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 49/57] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has() Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 50/57] KVM: x86: Replace (almost) all guest CPUID feature queries with cpu_caps Sean Christopherson
2024-12-13  2:14   ` Chao Gao
2024-12-17  0:05     ` Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 51/57] KVM: x86: Drop superfluous host XSAVE check when adjusting guest XSAVES caps Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 52/57] KVM: x86: Add a macro for features that are synthesized into boot_cpu_data Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 53/57] KVM: x86: Pull CPUID capabilities from boot_cpu_data only as needed Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 54/57] KVM: x86: Rename "SF" macro to "SCATTERED_F" Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 55/57] KVM: x86: Explicitly track feature flags that require vendor enabling Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 56/57] KVM: x86: Explicitly track feature flags that are enabled at runtime Sean Christopherson
2024-11-28  1:34 ` [PATCH v3 57/57] KVM: x86: Use only local variables (no bitmask) to init kvm_cpu_caps Sean Christopherson
2024-12-18  1:15   ` Maxim Levitsky
2024-12-18  1:13 ` [PATCH v3 00/57] KVM: x86: CPUID overhaul, fixes, and caching Maxim Levitsky
2024-12-19  2:40 ` Sean Christopherson
