kvm.vger.kernel.org archive mirror
* [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps
@ 2023-11-10 23:55 Sean Christopherson
  2023-11-10 23:55 ` [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
                   ` (8 more replies)
  0 siblings, 9 replies; 40+ messages in thread
From: Sean Christopherson @ 2023-11-10 23:55 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

Replace the poorly named, confusing, and high-maintenance "governed
features" framework with a comprehensive guest cpu_caps implementation.

In short, snapshot all X86_FEATURE_* flags that KVM cares about so that
all queries against guest capabilities are "fast", i.e. don't require
manually enabling per-feature tracking or making judgment calls as to
whether a given feature needs to be fast.

The guest_cpu_cap_* nomenclature follows the existing kvm_cpu_cap_*
except for a few (maybe just one?) cases where guest cpu_caps need APIs
that kvm_cpu_caps don't.  In theory, the similar names will make this
approach more intuitive.
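
As a rough sketch (hypothetical helpers, not part of the series), the
intended parallel reads like so:

	/* KVM-wide: can KVM virtualize XSAVES on this host? */
	static bool kvm_supports_xsaves(void)
	{
		return kvm_cpu_cap_has(X86_FEATURE_XSAVES);
	}

	/* Per-vCPU: is this guest actually allowed to use XSAVES? */
	static bool vcpu_may_use_xsaves(struct kvm_vcpu *vcpu)
	{
		return guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES);
	}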

Sean Christopherson (9):
  KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap"
  KVM: x86: Replace guts of "goverened" features with comprehensive
    cpu_caps
  KVM: x86: Initialize guest cpu_caps based on guest CPUID
  KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime
  KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns
    right leaf
  KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based
    features
  KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has()
  KVM: x86: Replace all guest CPUID feature queries with cpu_caps check
  KVM: x86: Restrict XSAVE in cpu_caps based on KVM capabilities

 arch/x86/include/asm/kvm_host.h  |  40 ++++++----
 arch/x86/kvm/cpuid.c             | 102 +++++++++++++++++++-------
 arch/x86/kvm/cpuid.h             | 121 +++++++++++--------------------
 arch/x86/kvm/governed_features.h |  21 ------
 arch/x86/kvm/lapic.c             |   2 +-
 arch/x86/kvm/mmu/mmu.c           |   4 +-
 arch/x86/kvm/mtrr.c              |   2 +-
 arch/x86/kvm/reverse_cpuid.h     |  15 ----
 arch/x86/kvm/smm.c               |  10 +--
 arch/x86/kvm/svm/nested.c        |  22 +++---
 arch/x86/kvm/svm/pmu.c           |   8 +-
 arch/x86/kvm/svm/sev.c           |   4 +-
 arch/x86/kvm/svm/svm.c           |  50 ++++++-------
 arch/x86/kvm/svm/svm.h           |   4 +-
 arch/x86/kvm/vmx/nested.c        |  18 ++---
 arch/x86/kvm/vmx/pmu_intel.c     |   4 +-
 arch/x86/kvm/vmx/sgx.c           |  14 ++--
 arch/x86/kvm/vmx/vmx.c           |  63 ++++++++--------
 arch/x86/kvm/vmx/vmx.h           |   2 +-
 arch/x86/kvm/x86.c               |  72 +++++++++---------
 20 files changed, 282 insertions(+), 296 deletions(-)
 delete mode 100644 arch/x86/kvm/governed_features.h


base-commit: 45b890f7689eb0aba454fc5831d2d79763781677
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap"
  2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
@ 2023-11-10 23:55 ` Sean Christopherson
  2023-11-19 17:08   ` Maxim Levitsky
  2023-11-21  3:20   ` Chao Gao
  2023-11-10 23:55 ` [PATCH 2/9] KVM: x86: Replace guts of "governed" features with comprehensive cpu_caps Sean Christopherson
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Sean Christopherson @ 2023-11-10 23:55 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

As the first step toward replacing KVM's so-called "governed features"
framework with a more comprehensive, less poorly named implementation,
replace the "kvm_governed_feature" function prefix with "guest_cpu_cap"
and rename guest_can_use() to guest_cpu_cap_has().

The "guest_cpu_cap" naming scheme mirrors that of "kvm_cpu_cap", and
provides a more clear distinction between guest capabilities, which are
KVM controlled (heh, or one might say "governed"), and guest CPUID, which
with few exceptions is fully userspace controlled.
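
E.g. a query that previously read (illustrative only; the diff below
is exhaustive):

	if (guest_can_use(vcpu, X86_FEATURE_NRIPS))

now reads:

	if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))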

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c      |  2 +-
 arch/x86/kvm/cpuid.h      | 12 ++++++------
 arch/x86/kvm/mmu/mmu.c    |  4 ++--
 arch/x86/kvm/svm/nested.c | 22 +++++++++++-----------
 arch/x86/kvm/svm/svm.c    | 26 +++++++++++++-------------
 arch/x86/kvm/svm/svm.h    |  4 ++--
 arch/x86/kvm/vmx/nested.c |  6 +++---
 arch/x86/kvm/vmx/vmx.c    | 14 +++++++-------
 arch/x86/kvm/x86.c        |  4 ++--
 9 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index dda6fc4cfae8..4f464187b063 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -345,7 +345,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
 				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
 	if (allow_gbpages)
-		kvm_governed_feature_set(vcpu, X86_FEATURE_GBPAGES);
+		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
 
 	best = kvm_find_cpuid_entry(vcpu, 1);
 	if (best && apic) {
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 0b90532b6e26..245416ffa34c 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -254,7 +254,7 @@ static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
 	return kvm_governed_feature_index(x86_feature) >= 0;
 }
 
-static __always_inline void kvm_governed_feature_set(struct kvm_vcpu *vcpu,
+static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
 						     unsigned int x86_feature)
 {
 	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
@@ -263,15 +263,15 @@ static __always_inline void kvm_governed_feature_set(struct kvm_vcpu *vcpu,
 		  vcpu->arch.governed_features.enabled);
 }
 
-static __always_inline void kvm_governed_feature_check_and_set(struct kvm_vcpu *vcpu,
-							       unsigned int x86_feature)
+static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
+							unsigned int x86_feature)
 {
 	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
-		kvm_governed_feature_set(vcpu, x86_feature);
+		guest_cpu_cap_set(vcpu, x86_feature);
 }
 
-static __always_inline bool guest_can_use(struct kvm_vcpu *vcpu,
-					  unsigned int x86_feature)
+static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
+					      unsigned int x86_feature)
 {
 	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b0f01d605617..cfed824587b9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4801,7 +4801,7 @@ static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 	__reset_rsvds_bits_mask(&context->guest_rsvd_check,
 				vcpu->arch.reserved_gpa_bits,
 				context->cpu_role.base.level, is_efer_nx(context),
-				guest_can_use(vcpu, X86_FEATURE_GBPAGES),
+				guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES),
 				is_cr4_pse(context),
 				guest_cpuid_is_amd_or_hygon(vcpu));
 }
@@ -4878,7 +4878,7 @@ static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
 	__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
 				context->root_role.level,
 				context->root_role.efer_nx,
-				guest_can_use(vcpu, X86_FEATURE_GBPAGES),
+				guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES),
 				is_pse, is_amd);
 
 	if (!shadow_me_mask)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 3fea8c47679e..ea0895262b12 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -107,7 +107,7 @@ static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu)
 
 static bool nested_vmcb_needs_vls_intercept(struct vcpu_svm *svm)
 {
-	if (!guest_can_use(&svm->vcpu, X86_FEATURE_V_VMSAVE_VMLOAD))
+	if (!guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_V_VMSAVE_VMLOAD))
 		return true;
 
 	if (!nested_npt_enabled(svm))
@@ -603,7 +603,7 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
 		vmcb_mark_dirty(vmcb02, VMCB_DR);
 	}
 
-	if (unlikely(guest_can_use(vcpu, X86_FEATURE_LBRV) &&
+	if (unlikely(guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
 		     (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
 		/*
 		 * Reserved bits of DEBUGCTL are ignored.  Be consistent with
@@ -660,7 +660,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 	 * exit_int_info, exit_int_info_err, next_rip, insn_len, insn_bytes.
 	 */
 
-	if (guest_can_use(vcpu, X86_FEATURE_VGIF) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VGIF) &&
 	    (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK))
 		int_ctl_vmcb12_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK);
 	else
@@ -698,7 +698,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 
 	vmcb02->control.tsc_offset = vcpu->arch.tsc_offset;
 
-	if (guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR) &&
 	    svm->tsc_ratio_msr != kvm_caps.default_tsc_scaling_ratio)
 		nested_svm_update_tsc_ratio_msr(vcpu);
 
@@ -719,7 +719,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 	 * what a nrips=0 CPU would do (L1 is responsible for advancing RIP
 	 * prior to injecting the event).
 	 */
-	if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
 		vmcb02->control.next_rip    = svm->nested.ctl.next_rip;
 	else if (boot_cpu_has(X86_FEATURE_NRIPS))
 		vmcb02->control.next_rip    = vmcb12_rip;
@@ -729,7 +729,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 		svm->soft_int_injected = true;
 		svm->soft_int_csbase = vmcb12_csbase;
 		svm->soft_int_old_rip = vmcb12_rip;
-		if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
 			svm->soft_int_next_rip = svm->nested.ctl.next_rip;
 		else
 			svm->soft_int_next_rip = vmcb12_rip;
@@ -737,18 +737,18 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 
 	vmcb02->control.virt_ext            = vmcb01->control.virt_ext &
 					      LBR_CTL_ENABLE_MASK;
-	if (guest_can_use(vcpu, X86_FEATURE_LBRV))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV))
 		vmcb02->control.virt_ext  |=
 			(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK);
 
 	if (!nested_vmcb_needs_vls_intercept(svm))
 		vmcb02->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
 
-	if (guest_can_use(vcpu, X86_FEATURE_PAUSEFILTER))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_PAUSEFILTER))
 		pause_count12 = svm->nested.ctl.pause_filter_count;
 	else
 		pause_count12 = 0;
-	if (guest_can_use(vcpu, X86_FEATURE_PFTHRESHOLD))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_PFTHRESHOLD))
 		pause_thresh12 = svm->nested.ctl.pause_filter_thresh;
 	else
 		pause_thresh12 = 0;
@@ -1035,7 +1035,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	if (vmcb12->control.exit_code != SVM_EXIT_ERR)
 		nested_save_pending_event_to_vmcb12(svm, vmcb12);
 
-	if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
 		vmcb12->control.next_rip  = vmcb02->control.next_rip;
 
 	vmcb12->control.int_ctl           = svm->nested.ctl.int_ctl;
@@ -1074,7 +1074,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	if (!nested_exit_on_intr(svm))
 		kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
 
-	if (unlikely(guest_can_use(vcpu, X86_FEATURE_LBRV) &&
+	if (unlikely(guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
 		     (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
 		svm_copy_lbrs(vmcb12, vmcb02);
 		svm_update_lbrv(vcpu);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1855a6d7c976..8a99a73b6ee5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1046,7 +1046,7 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 	bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
 	bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
-			    (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
+			    (is_guest_mode(vcpu) && guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
 			    (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
 
 	if (enable_lbrv == current_enable_lbrv)
@@ -2835,7 +2835,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	switch (msr_info->index) {
 	case MSR_AMD64_TSC_RATIO:
 		if (!msr_info->host_initiated &&
-		    !guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR))
 			return 1;
 		msr_info->data = svm->tsc_ratio_msr;
 		break;
@@ -2985,7 +2985,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 	switch (ecx) {
 	case MSR_AMD64_TSC_RATIO:
 
-		if (!guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR)) {
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR)) {
 
 			if (!msr->host_initiated)
 				return 1;
@@ -3007,7 +3007,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 
 		svm->tsc_ratio_msr = data;
 
-		if (guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR) &&
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR) &&
 		    is_guest_mode(vcpu))
 			nested_svm_update_tsc_ratio_msr(vcpu);
 
@@ -4318,11 +4318,11 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
 	    boot_cpu_has(X86_FEATURE_XSAVES) &&
 	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
-		kvm_governed_feature_set(vcpu, X86_FEATURE_XSAVES);
+		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
 
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_NRIPS);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_LBRV);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
 
 	/*
 	 * Intercept VMLOAD if the vCPU mode is Intel in order to emulate that
@@ -4330,12 +4330,12 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * SVM on Intel is bonkers and extremely unlikely to work).
 	 */
 	if (!guest_cpuid_is_intel(vcpu))
-		kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
+		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
 
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VGIF);
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VNMI);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VGIF);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VNMI);
 
 	svm_recalc_instruction_intercepts(vcpu, svm);
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index be67ab7fdd10..e49af42b4a33 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -443,7 +443,7 @@ static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
 
 static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
 {
-	return guest_can_use(&svm->vcpu, X86_FEATURE_VGIF) &&
+	return guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_VGIF) &&
 	       (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
 }
 
@@ -495,7 +495,7 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
 
 static inline bool nested_vnmi_enabled(struct vcpu_svm *svm)
 {
-	return guest_can_use(&svm->vcpu, X86_FEATURE_VNMI) &&
+	return guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_VNMI) &&
 	       (svm->nested.ctl.int_ctl & V_NMI_ENABLE_MASK);
 }
 
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index c5ec0ef51ff7..4750d1696d58 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6426,7 +6426,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
 	vmx = to_vmx(vcpu);
 	vmcs12 = get_vmcs12(vcpu);
 
-	if (guest_can_use(vcpu, X86_FEATURE_VMX) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VMX) &&
 	    (vmx->nested.vmxon || vmx->nested.smm.vmxon)) {
 		kvm_state.hdr.vmx.vmxon_pa = vmx->nested.vmxon_ptr;
 		kvm_state.hdr.vmx.vmcs12_pa = vmx->nested.current_vmptr;
@@ -6567,7 +6567,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 		if (kvm_state->flags & ~KVM_STATE_NESTED_EVMCS)
 			return -EINVAL;
 	} else {
-		if (!guest_can_use(vcpu, X86_FEATURE_VMX))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
 			return -EINVAL;
 
 		if (!page_address_valid(vcpu, kvm_state->hdr.vmx.vmxon_pa))
@@ -6601,7 +6601,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 		return -EINVAL;
 
 	if ((kvm_state->flags & KVM_STATE_NESTED_EVMCS) &&
-	    (!guest_can_use(vcpu, X86_FEATURE_VMX) ||
+	    (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX) ||
 	     !vmx->nested.enlightened_vmcs_enabled))
 			return -EINVAL;
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index be20a60047b1..6328f0d47c64 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2050,7 +2050,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
 		break;
 	case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
-		if (!guest_can_use(vcpu, X86_FEATURE_VMX))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
 			return 1;
 		if (vmx_get_vmx_msr(&vmx->nested.msrs, msr_info->index,
 				    &msr_info->data))
@@ -2358,7 +2358,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
 		if (!msr_info->host_initiated)
 			return 1; /* they are read-only */
-		if (!guest_can_use(vcpu, X86_FEATURE_VMX))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
 			return 1;
 		return vmx_set_vmx_msr(vcpu, msr_index, data);
 	case MSR_IA32_RTIT_CTL:
@@ -4567,7 +4567,7 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
 												\
 	if (cpu_has_vmx_##name()) {								\
 		if (kvm_is_governed_feature(X86_FEATURE_##feat_name))				\
-			__enabled = guest_can_use(__vcpu, X86_FEATURE_##feat_name);		\
+			__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);		\
 		else										\
 			__enabled = guest_cpuid_has(__vcpu, X86_FEATURE_##feat_name);		\
 		vmx_adjust_secondary_exec_control(vmx, exec_control, SECONDARY_EXEC_##ctrl_name,\
@@ -7757,9 +7757,9 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
 	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
-		kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_XSAVES);
+		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_XSAVES);
 
-	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VMX);
+	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VMX);
 
 	vmx_setup_uret_msrs(vmx);
 
@@ -7767,7 +7767,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		vmcs_set_secondary_exec_control(vmx,
 						vmx_secondary_exec_control(vmx));
 
-	if (guest_can_use(vcpu, X86_FEATURE_VMX))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
 		vmx->msr_ia32_feature_control_valid_bits |=
 			FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
 			FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
@@ -7776,7 +7776,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 			~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
 			  FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
 
-	if (guest_can_use(vcpu, X86_FEATURE_VMX))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
 		nested_vmx_cr_fixed1_bits_update(vcpu);
 
 	if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2c924075f6f1..04a77b764a36 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1025,7 +1025,7 @@ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
 		if (vcpu->arch.xcr0 != host_xcr0)
 			xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
 
-		if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES) &&
 		    vcpu->arch.ia32_xss != host_xss)
 			wrmsrl(MSR_IA32_XSS, vcpu->arch.ia32_xss);
 	}
@@ -1056,7 +1056,7 @@ void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
 		if (vcpu->arch.xcr0 != host_xcr0)
 			xsetbv(XCR_XFEATURE_ENABLED_MASK, host_xcr0);
 
-		if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES) &&
 		    vcpu->arch.ia32_xss != host_xss)
 			wrmsrl(MSR_IA32_XSS, host_xss);
 	}
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 2/9] KVM: x86: Replace guts of "governed" features with comprehensive cpu_caps
  2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
  2023-11-10 23:55 ` [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
@ 2023-11-10 23:55 ` Sean Christopherson
  2023-11-14  9:12   ` Binbin Wu
  2023-11-19 17:22   ` Maxim Levitsky
  2023-11-10 23:55 ` [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID Sean Christopherson
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Sean Christopherson @ 2023-11-10 23:55 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

Replace the internals of the governed features framework with a more
comprehensive "guest CPU capabilities" implementation, i.e. with a guest
version of kvm_cpu_caps.  Keep the skeleton of governed features around
for now as vmx_adjust_sec_exec_control() relies on detecting governed
features to do the right thing for XSAVES, and switching all guest feature
queries to guest_cpu_cap_has() requires subtle and non-trivial changes,
i.e. is best done as a standalone change.

Tracking *all* guest capabilities that KVM cares about will allow excising the
poorly named "governed features" framework, and effectively optimizes all
KVM queries of guest capabilities, i.e. doesn't require making a
subjective decision as to whether or not a feature is worth "governing",
and doesn't require adding the code to do so.

The cost of tracking all features is currently 92 bytes per vCPU on 64-bit
kernels: 100 bytes for cpu_caps versus 8 bytes for governed_features.
That cost is well worth paying even if the only benefit was eliminating
the "governed features" terminology.  And practically speaking, the real
cost is zero unless those 92 bytes push the size of vcpu_vmx or vcpu_svm
into a new order-N allocation, and if that happens there are better ways
to reduce the footprint of kvm_vcpu_arch, e.g. making the PMU and/or MTRR
state separate allocations.
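
For the record, the quoted numbers work out as follows, assuming
NCAPINTS == 21 as in kernels of this vintage (a different NCAPINTS
shifts the totals but not the conclusion):

	NR_KVM_CPU_CAPS = NCAPINTS + 4 KVM-only leafs  = 25 words
	u32 cpu_caps[25]                               = 100 bytes
	DECLARE_BITMAP(enabled, BITS_PER_LONG)         = 8 bytes
	net cost per vCPU                              = 100 - 8 = 92 bytes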

Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 40 ++++++++++++++++++++-------------
 arch/x86/kvm/cpuid.c            |  4 +---
 arch/x86/kvm/cpuid.h            | 14 ++++++------
 arch/x86/kvm/reverse_cpuid.h    | 15 -------------
 4 files changed, 32 insertions(+), 41 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d7036982332e..1d43dd5fdea7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -722,6 +722,22 @@ struct kvm_queued_exception {
 	bool has_payload;
 };
 
+/*
+ * Hardware-defined CPUID leafs that are either scattered by the kernel or are
+ * unknown to the kernel, but need to be directly used by KVM.  Note, these
+ * word values conflict with the kernel's "bug" caps, but KVM doesn't use those.
+ */
+enum kvm_only_cpuid_leafs {
+	CPUID_12_EAX	 = NCAPINTS,
+	CPUID_7_1_EDX,
+	CPUID_8000_0007_EDX,
+	CPUID_8000_0022_EAX,
+	NR_KVM_CPU_CAPS,
+
+	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
+};
+
+
 struct kvm_vcpu_arch {
 	/*
 	 * rip and regs accesses must go through
@@ -840,23 +856,15 @@ struct kvm_vcpu_arch {
 	struct kvm_hypervisor_cpuid kvm_cpuid;
 
 	/*
-	 * FIXME: Drop this macro and use KVM_NR_GOVERNED_FEATURES directly
-	 * when "struct kvm_vcpu_arch" is no longer defined in an
-	 * arch/x86/include/asm header.  The max is mostly arbitrary, i.e.
-	 * can be increased as necessary.
+	 * Track the effective guest capabilities, i.e. the features the vCPU
+	 * is allowed to use.  Typically, but not always, features can be used
+	 * by the guest if and only if both KVM and userspace want to expose
+	 * the feature to the guest.  A common exception is for virtualization
+	 * holes, i.e. when KVM can't prevent the guest from using a feature,
+	 * in which case the vCPU "has" the feature regardless of what KVM or
+	 * userspace desires.
 	 */
-#define KVM_MAX_NR_GOVERNED_FEATURES BITS_PER_LONG
-
-	/*
-	 * Track whether or not the guest is allowed to use features that are
-	 * governed by KVM, where "governed" means KVM needs to manage state
-	 * and/or explicitly enable the feature in hardware.  Typically, but
-	 * not always, governed features can be used by the guest if and only
-	 * if both KVM and userspace want to expose the feature to the guest.
-	 */
-	struct {
-		DECLARE_BITMAP(enabled, KVM_MAX_NR_GOVERNED_FEATURES);
-	} governed_features;
+	u32 cpu_caps[NR_KVM_CPU_CAPS];
 
 	u64 reserved_gpa_bits;
 	int maxphyaddr;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 4f464187b063..4bf3c2d4dc7c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -327,9 +327,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	struct kvm_cpuid_entry2 *best;
 	bool allow_gbpages;
 
-	BUILD_BUG_ON(KVM_NR_GOVERNED_FEATURES > KVM_MAX_NR_GOVERNED_FEATURES);
-	bitmap_zero(vcpu->arch.governed_features.enabled,
-		    KVM_MAX_NR_GOVERNED_FEATURES);
+	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
 
 	/*
 	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 245416ffa34c..9f18c4395b71 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -255,12 +255,12 @@ static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
 }
 
 static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
-						     unsigned int x86_feature)
+					      unsigned int x86_feature)
 {
-	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
+	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
-	__set_bit(kvm_governed_feature_index(x86_feature),
-		  vcpu->arch.governed_features.enabled);
+	reverse_cpuid_check(x86_leaf);
+	vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
 }
 
 static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
@@ -273,10 +273,10 @@ static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
 static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
 					      unsigned int x86_feature)
 {
-	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
+	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
-	return test_bit(kvm_governed_feature_index(x86_feature),
-			vcpu->arch.governed_features.enabled);
+	reverse_cpuid_check(x86_leaf);
+	return vcpu->arch.cpu_caps[x86_leaf] & __feature_bit(x86_feature);
 }
 
 #endif
diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
index b81650678375..4b658491e8f8 100644
--- a/arch/x86/kvm/reverse_cpuid.h
+++ b/arch/x86/kvm/reverse_cpuid.h
@@ -6,21 +6,6 @@
 #include <asm/cpufeature.h>
 #include <asm/cpufeatures.h>
 
-/*
- * Hardware-defined CPUID leafs that are either scattered by the kernel or are
- * unknown to the kernel, but need to be directly used by KVM.  Note, these
- * word values conflict with the kernel's "bug" caps, but KVM doesn't use those.
- */
-enum kvm_only_cpuid_leafs {
-	CPUID_12_EAX	 = NCAPINTS,
-	CPUID_7_1_EDX,
-	CPUID_8000_0007_EDX,
-	CPUID_8000_0022_EAX,
-	NR_KVM_CPU_CAPS,
-
-	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
-};
-
 /*
  * Define a KVM-only feature flag.
  *
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
  2023-11-10 23:55 ` [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
  2023-11-10 23:55 ` [PATCH 2/9] KVM: x86: Replace guts of "governed" features with comprehensive cpu_caps Sean Christopherson
@ 2023-11-10 23:55 ` Sean Christopherson
  2023-11-16  3:16   ` Yang, Weijiang
  2023-11-19 17:32   ` Maxim Levitsky
  2023-11-10 23:55 ` [PATCH 4/9] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime Sean Christopherson
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Sean Christopherson @ 2023-11-10 23:55 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

Initialize a vCPU's capabilities based on the guest CPUID provided by
userspace instead of simply zeroing the entire array.  This will allow
using cpu_caps to query *all* CPUID-based guest capabilities, i.e. will
allow converting all usage of guest_cpuid_has() to guest_cpu_cap_has().

Zeroing the array was the logical choice when using cpu_caps was opt-in,
e.g. "unsupported" was generally a safer default, and the whole point of
governed features is that KVM would need to check host and guest support,
i.e. making everything unsupported by default didn't require more code.

But requiring KVM to manually "enable" every CPUID-based feature in
cpu_caps would require an absurd amount of boilerplate code.

Follow existing CPUID/kvm_cpu_caps nomenclature where possible, e.g. for
the change() and clear() APIs.  Replace check_and_set() with restrict() to
try and capture that KVM is restricting userspace's desired guest feature
set based on KVM's capabilities.

This is intended to be a gigantic nop, i.e. should not have any impact on
guest or KVM functionality.
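
The semantics of the new helpers, as a usage sketch (the real call
sites are in the diff below):

	/* Set or clear the cap based on a KVM-computed condition. */
	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);

	/* Unconditionally disallow the feature. */
	guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);

	/* Keep the cap only if KVM itself supports the feature. */
	guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);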

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c   | 43 +++++++++++++++++++++++++++++++++++++++---
 arch/x86/kvm/cpuid.h   | 25 +++++++++++++++++++++---
 arch/x86/kvm/svm/svm.c | 24 +++++++++++------------
 arch/x86/kvm/vmx/vmx.c |  6 ++++--
 4 files changed, 78 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 4bf3c2d4dc7c..5cf3d697ecb3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -321,13 +321,51 @@ static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent)
 	return entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX;
 }
 
+/*
+ * This isn't truly "unsafe", but all callers except kvm_vcpu_after_set_cpuid()
+ * should use __cpuid_entry_get_reg(), which provides compile-time validation
+ * of the input.
+ */
+static u32 cpuid_get_reg_unsafe(struct kvm_cpuid_entry2 *entry, u32 reg)
+{
+	switch (reg) {
+	case CPUID_EAX:
+		return entry->eax;
+	case CPUID_EBX:
+		return entry->ebx;
+	case CPUID_ECX:
+		return entry->ecx;
+	case CPUID_EDX:
+		return entry->edx;
+	default:
+		WARN_ON_ONCE(1);
+		return 0;
+	}
+}
+
 static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	struct kvm_cpuid_entry2 *best;
 	bool allow_gbpages;
+	int i;
 
-	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
+	BUILD_BUG_ON(ARRAY_SIZE(reverse_cpuid) != NR_KVM_CPU_CAPS);
+
+	/*
+	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
+	 * honor userspace's definition for features that don't require KVM or
+	 * hardware management/support (or that KVM simply doesn't care about).
+	 */
+	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
+		const struct cpuid_reg cpuid = reverse_cpuid[i];
+
+		best = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
+		if (best)
+			vcpu->arch.cpu_caps[i] = cpuid_get_reg_unsafe(best, cpuid.reg);
+		else
+			vcpu->arch.cpu_caps[i] = 0;
+	}
 
 	/*
 	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
@@ -342,8 +380,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 */
 	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
 				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
-	if (allow_gbpages)
-		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
+	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
 
 	best = kvm_find_cpuid_entry(vcpu, 1);
 	if (best && apic) {
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 9f18c4395b71..1707ef10b269 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -263,11 +263,30 @@ static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
 	vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
 }
 
-static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
-							unsigned int x86_feature)
+static __always_inline void guest_cpu_cap_clear(struct kvm_vcpu *vcpu,
+						unsigned int x86_feature)
 {
-	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
+	unsigned int x86_leaf = __feature_leaf(x86_feature);
+
+	reverse_cpuid_check(x86_leaf);
+	vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
+}
+
+static __always_inline void guest_cpu_cap_change(struct kvm_vcpu *vcpu,
+						 unsigned int x86_feature,
+						 bool guest_has_cap)
+{
+	if (guest_has_cap)
 		guest_cpu_cap_set(vcpu, x86_feature);
+	else
+		guest_cpu_cap_clear(vcpu, x86_feature);
+}
+
+static __always_inline void guest_cpu_cap_restrict(struct kvm_vcpu *vcpu,
+						   unsigned int x86_feature)
+{
+	if (!kvm_cpu_cap_has(x86_feature))
+		guest_cpu_cap_clear(vcpu, x86_feature);
 }
 
 static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 8a99a73b6ee5..5827328e30f1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4315,14 +4315,14 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
 	 * the guest read/write access to the host's XSS.
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
-	    boot_cpu_has(X86_FEATURE_XSAVES) &&
-	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
-		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
+	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
+			     boot_cpu_has(X86_FEATURE_XSAVE) &&
+			     boot_cpu_has(X86_FEATURE_XSAVES) &&
+			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
 
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
+	guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);
+	guest_cpu_cap_restrict(vcpu, X86_FEATURE_TSCRATEMSR);
+	guest_cpu_cap_restrict(vcpu, X86_FEATURE_LBRV);
 
 	/*
 	 * Intercept VMLOAD if the vCPU mode is Intel in order to emulate that
@@ -4330,12 +4330,12 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * SVM on Intel is bonkers and extremely unlikely to work).
 	 */
 	if (!guest_cpuid_is_intel(vcpu))
-		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
+		guest_cpu_cap_restrict(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
 
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VGIF);
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VNMI);
+	guest_cpu_cap_restrict(vcpu, X86_FEATURE_PAUSEFILTER);
+	guest_cpu_cap_restrict(vcpu, X86_FEATURE_PFTHRESHOLD);
+	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VGIF);
+	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VNMI);
 
 	svm_recalc_instruction_intercepts(vcpu, svm);
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6328f0d47c64..5a056ad1ae55 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7757,9 +7757,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
 	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
-		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_XSAVES);
+		guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVES);
+	else
+		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
 
-	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VMX);
+	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VMX);
 
 	vmx_setup_uret_msrs(vmx);
 
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 4/9] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime
  2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
                   ` (2 preceding siblings ...)
  2023-11-10 23:55 ` [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID Sean Christopherson
@ 2023-11-10 23:55 ` Sean Christopherson
  2023-11-19 17:33   ` Maxim Levitsky
  2023-11-10 23:55 ` [PATCH 5/9] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf Sean Christopherson
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2023-11-10 23:55 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

Move the handling of X86_FEATURE_MWAIT during CPUID runtime updates to
utilize the lookup done for other CPUID.0x1 features.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 5cf3d697ecb3..6777780be6ae 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -276,6 +276,11 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
 
 		cpuid_entry_change(best, X86_FEATURE_APIC,
 			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
+
+		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
+			cpuid_entry_change(best, X86_FEATURE_MWAIT,
+					   vcpu->arch.ia32_misc_enable_msr &
+					   MSR_IA32_MISC_ENABLE_MWAIT);
 	}
 
 	best = cpuid_entry2_find(entries, nent, 7, 0);
@@ -296,14 +301,6 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
 	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
 		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
 		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
-
-	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
-		best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
-		if (best)
-			cpuid_entry_change(best, X86_FEATURE_MWAIT,
-					   vcpu->arch.ia32_misc_enable_msr &
-					   MSR_IA32_MISC_ENABLE_MWAIT);
-	}
 }
 
 void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 5/9] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf
  2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
                   ` (3 preceding siblings ...)
  2023-11-10 23:55 ` [PATCH 4/9] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime Sean Christopherson
@ 2023-11-10 23:55 ` Sean Christopherson
  2023-11-19 17:33   ` Maxim Levitsky
  2023-11-10 23:55 ` [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2023-11-10 23:55 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

Drop an unnecessary check that cpuid_entry2_find() returns the correct
leaf when getting CPUID.0x7.0x0 to update X86_FEATURE_OSPKE, as
cpuid_entry2_find() never returns an entry for the wrong function.  And
not that it matters, but cpuid_entry2_find() will always return a precise
match for CPUID.0x7.0x0 since the index is significant.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6777780be6ae..36bd04030989 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -284,7 +284,7 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
 	}
 
 	best = cpuid_entry2_find(entries, nent, 7, 0);
-	if (best && boot_cpu_has(X86_FEATURE_PKU) && best->function == 0x7)
+	if (best && boot_cpu_has(X86_FEATURE_PKU))
 		cpuid_entry_change(best, X86_FEATURE_OSPKE,
 				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
 
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
                   ` (4 preceding siblings ...)
  2023-11-10 23:55 ` [PATCH 5/9] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf Sean Christopherson
@ 2023-11-10 23:55 ` Sean Christopherson
  2023-11-13  8:03   ` Robert Hoo
                     ` (2 more replies)
  2023-11-10 23:55 ` [PATCH 7/9] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has() Sean Christopherson
                   ` (2 subsequent siblings)
  8 siblings, 3 replies; 40+ messages in thread
From: Sean Christopherson @ 2023-11-10 23:55 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

When updating guest CPUID entries to emulate runtime behavior, e.g. when
the guest enables a CR4-based feature that is tied to a CPUID flag, also
update the vCPU's cpu_caps accordingly.  This will allow replacing all
usage of guest_cpuid_has() with guest_cpu_cap_has().

Take care not to update guest capabilities when KVM is updating CPUID
entries that *may* become the vCPU's CPUID, e.g. if userspace tries to set
bogus CPUID information.  No extra call to update cpu_caps is needed as
the cpu_caps are initialized from the incoming guest CPUID, i.e. will
automatically get the updated values.

Note, none of the features in question use guest_cpu_cap_has() at this
time, i.e. aside from setting bits in cpu_caps, this is a glorified nop.
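
Note the guard in the diff below: __kvm_update_cpuid_runtime() only
passes a non-NULL vCPU to kvm_update_feature_runtime() when the entries
being updated are the vCPU's own, which is equivalent to:

	/* Only touch cpu_caps when updating the vCPU's current CPUID. */
	struct kvm_vcpu *caps = (entries == vcpu->arch.cpuid_entries) ? vcpu : NULL;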

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 48 +++++++++++++++++++++++++++++++-------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 36bd04030989..37a991439fe6 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -262,31 +262,48 @@ static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)
 	return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
 }
 
+static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
+						       struct kvm_cpuid_entry2 *entry,
+						       unsigned int x86_feature,
+						       bool has_feature)
+{
+	if (entry)
+		cpuid_entry_change(entry, x86_feature, has_feature);
+
+	if (vcpu)
+		guest_cpu_cap_change(vcpu, x86_feature, has_feature);
+}
+
 static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
 				       int nent)
 {
 	struct kvm_cpuid_entry2 *best;
+	struct kvm_vcpu *caps = vcpu;
+
+	/*
+	 * Don't update vCPU capabilities if KVM is updating CPUID entries that
+	 * are coming in from userspace!
+	 */
+	if (entries != vcpu->arch.cpuid_entries)
+		caps = NULL;
 
 	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
-	if (best) {
-		/* Update OSXSAVE bit */
-		if (boot_cpu_has(X86_FEATURE_XSAVE))
-			cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
+
+	if (boot_cpu_has(X86_FEATURE_XSAVE))
+		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSXSAVE,
 					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
 
-		cpuid_entry_change(best, X86_FEATURE_APIC,
-			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
+	kvm_update_feature_runtime(caps, best, X86_FEATURE_APIC,
+				   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
 
-		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
-			cpuid_entry_change(best, X86_FEATURE_MWAIT,
-					   vcpu->arch.ia32_misc_enable_msr &
-					   MSR_IA32_MISC_ENABLE_MWAIT);
-	}
+	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
+		kvm_update_feature_runtime(caps, best, X86_FEATURE_MWAIT,
+					   vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT);
 
 	best = cpuid_entry2_find(entries, nent, 7, 0);
-	if (best && boot_cpu_has(X86_FEATURE_PKU))
-		cpuid_entry_change(best, X86_FEATURE_OSPKE,
-				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
+	if (boot_cpu_has(X86_FEATURE_PKU))
+		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSPKE,
+					   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
 
 	best = cpuid_entry2_find(entries, nent, 0xD, 0);
 	if (best)
@@ -353,6 +370,9 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
 	 * honor userspace's definition for features that don't require KVM or
 	 * hardware management/support (or that KVM simply doesn't care about).
+	 *
+	 * Note, KVM has already done runtime updates on guest CPUID, i.e. this
+	 * will also correctly set runtime features in guest CPU capabilities.
 	 */
 	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
 		const struct cpuid_reg cpuid = reverse_cpuid[i];
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 7/9] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has()
  2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
                   ` (5 preceding siblings ...)
  2023-11-10 23:55 ` [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
@ 2023-11-10 23:55 ` Sean Christopherson
  2023-11-19 17:35   ` Maxim Levitsky
  2023-11-10 23:55 ` [PATCH 8/9] KVM: x86: Replace all guest CPUID feature queries with cpu_caps check Sean Christopherson
  2023-11-10 23:55 ` [PATCH 9/9] KVM: x86: Restrict XSAVE in cpu_caps based on KVM capabilities Sean Christopherson
  8 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2023-11-10 23:55 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

Move the implementations of guest_has_{spec_ctrl,pred_cmd}_msr() down
below guest_cpu_cap_has() so that their use of guest_cpuid_has() can be
replaced with calls to guest_cpu_cap_has().

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.h | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 1707ef10b269..bebf94a69630 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -163,21 +163,6 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
 	return x86_stepping(best->eax);
 }
 
-static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
-{
-	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
-}
-
-static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu)
-{
-	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_SBPB));
-}
-
 static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT;
@@ -298,4 +283,19 @@ static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
 	return vcpu->arch.cpu_caps[x86_leaf] & __feature_bit(x86_feature);
 }
 
+static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
+{
+	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
+}
+
+static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu)
+{
+	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_SBPB));
+}
+
 #endif
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 8/9] KVM: x86: Replace all guest CPUID feature queries with cpu_caps check
  2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
                   ` (6 preceding siblings ...)
  2023-11-10 23:55 ` [PATCH 7/9] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has() Sean Christopherson
@ 2023-11-10 23:55 ` Sean Christopherson
  2023-11-19 17:35   ` Maxim Levitsky
  2023-11-10 23:55 ` [PATCH 9/9] KVM: x86: Restrict XSAVE in cpu_caps based on KVM capabilities Sean Christopherson
  8 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2023-11-10 23:55 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

Switch all queries of guest features from guest CPUID to guest
capabilities, i.e. replace all calls to guest_cpuid_has() with calls to
guest_cpu_cap_has(), and drop guest_cpuid_has() and its helper
guest_cpuid_get_register().

Opportunistically drop the unused guest_cpuid_clear(), as there should be
no circumstance in which KVM needs to _clear_ a guest CPUID feature now
that everything is tracked via cpu_caps.  E.g. KVM may need to _change_
a feature to emulate dynamic CPUID flags, but KVM should never need to
clear a feature in guest CPUID to prevent it from being used by the guest.

Delete the last remnants of the governed features framework, as the lone
holdout was vmx_adjust_secondary_exec_control()'s divergent behavior for
governed vs. ungoverned features.
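
The vmx.c hunk falls outside this excerpt, but per the above, the
if/else in the vmx_adjust_sec_exec_control() macro (introduced in
patch 1) presumably collapses from:

	if (kvm_is_governed_feature(X86_FEATURE_##feat_name))
		__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);
	else
		__enabled = guest_cpuid_has(__vcpu, X86_FEATURE_##feat_name);

to a single:

	__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);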

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c             |  4 +-
 arch/x86/kvm/cpuid.h             | 70 ++++----------------------------
 arch/x86/kvm/governed_features.h | 21 ----------
 arch/x86/kvm/lapic.c             |  2 +-
 arch/x86/kvm/mtrr.c              |  2 +-
 arch/x86/kvm/smm.c               | 10 ++---
 arch/x86/kvm/svm/pmu.c           |  8 ++--
 arch/x86/kvm/svm/sev.c           |  4 +-
 arch/x86/kvm/svm/svm.c           | 20 ++++-----
 arch/x86/kvm/vmx/nested.c        | 12 +++---
 arch/x86/kvm/vmx/pmu_intel.c     |  4 +-
 arch/x86/kvm/vmx/sgx.c           | 14 +++----
 arch/x86/kvm/vmx/vmx.c           | 47 ++++++++++-----------
 arch/x86/kvm/vmx/vmx.h           |  2 +-
 arch/x86/kvm/x86.c               | 68 +++++++++++++++----------------
 15 files changed, 104 insertions(+), 184 deletions(-)
 delete mode 100644 arch/x86/kvm/governed_features.h

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 37a991439fe6..6407e5c45f20 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -396,7 +396,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * and can install smaller shadow pages if the host lacks 1GiB support.
 	 */
 	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
-				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
+				      guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES);
 	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
 
 	best = kvm_find_cpuid_entry(vcpu, 1);
@@ -419,7 +419,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 	kvm_pmu_refresh(vcpu);
 	vcpu->arch.cr4_guest_rsvd_bits =
-	    __cr4_reserved_bits(guest_cpuid_has, vcpu);
+	    __cr4_reserved_bits(guest_cpu_cap_has, vcpu);
 
 	kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu->arch.cpuid_entries,
 						    vcpu->arch.cpuid_nent));
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index bebf94a69630..98694dfe062e 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -72,41 +72,6 @@ static __always_inline void cpuid_entry_override(struct kvm_cpuid_entry2 *entry,
 	*reg = kvm_cpu_caps[leaf];
 }
 
-static __always_inline u32 *guest_cpuid_get_register(struct kvm_vcpu *vcpu,
-						     unsigned int x86_feature)
-{
-	const struct cpuid_reg cpuid = x86_feature_cpuid(x86_feature);
-	struct kvm_cpuid_entry2 *entry;
-
-	entry = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
-	if (!entry)
-		return NULL;
-
-	return __cpuid_entry_get_reg(entry, cpuid.reg);
-}
-
-static __always_inline bool guest_cpuid_has(struct kvm_vcpu *vcpu,
-					    unsigned int x86_feature)
-{
-	u32 *reg;
-
-	reg = guest_cpuid_get_register(vcpu, x86_feature);
-	if (!reg)
-		return false;
-
-	return *reg & __feature_bit(x86_feature);
-}
-
-static __always_inline void guest_cpuid_clear(struct kvm_vcpu *vcpu,
-					      unsigned int x86_feature)
-{
-	u32 *reg;
-
-	reg = guest_cpuid_get_register(vcpu, x86_feature);
-	if (reg)
-		*reg &= ~__feature_bit(x86_feature);
-}
-
 static inline bool guest_cpuid_is_amd_or_hygon(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
@@ -218,27 +183,6 @@ static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
 	return vcpu->arch.pv_cpuid.features & (1u << kvm_feature);
 }
 
-enum kvm_governed_features {
-#define KVM_GOVERNED_FEATURE(x) KVM_GOVERNED_##x,
-#include "governed_features.h"
-	KVM_NR_GOVERNED_FEATURES
-};
-
-static __always_inline int kvm_governed_feature_index(unsigned int x86_feature)
-{
-	switch (x86_feature) {
-#define KVM_GOVERNED_FEATURE(x) case x: return KVM_GOVERNED_##x;
-#include "governed_features.h"
-	default:
-		return -1;
-	}
-}
-
-static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
-{
-	return kvm_governed_feature_index(x86_feature) >= 0;
-}
-
 static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
 					      unsigned int x86_feature)
 {
@@ -285,17 +229,17 @@ static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
 
 static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
 {
-	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
+	return (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_STIBP) ||
+		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBRS) ||
+		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_SSBD));
 }
 
 static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu)
 {
-	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB) ||
-		guest_cpuid_has(vcpu, X86_FEATURE_SBPB));
+	return (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBPB) ||
+		guest_cpu_cap_has(vcpu, X86_FEATURE_SBPB));
 }
 
 #endif
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
deleted file mode 100644
index 423a73395c10..000000000000
--- a/arch/x86/kvm/governed_features.h
+++ /dev/null
@@ -1,21 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#if !defined(KVM_GOVERNED_FEATURE) || defined(KVM_GOVERNED_X86_FEATURE)
-BUILD_BUG()
-#endif
-
-#define KVM_GOVERNED_X86_FEATURE(x) KVM_GOVERNED_FEATURE(X86_FEATURE_##x)
-
-KVM_GOVERNED_X86_FEATURE(GBPAGES)
-KVM_GOVERNED_X86_FEATURE(XSAVES)
-KVM_GOVERNED_X86_FEATURE(VMX)
-KVM_GOVERNED_X86_FEATURE(NRIPS)
-KVM_GOVERNED_X86_FEATURE(TSCRATEMSR)
-KVM_GOVERNED_X86_FEATURE(V_VMSAVE_VMLOAD)
-KVM_GOVERNED_X86_FEATURE(LBRV)
-KVM_GOVERNED_X86_FEATURE(PAUSEFILTER)
-KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD)
-KVM_GOVERNED_X86_FEATURE(VGIF)
-KVM_GOVERNED_X86_FEATURE(VNMI)
-
-#undef KVM_GOVERNED_X86_FEATURE
-#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 245b20973cae..f5fab29c827f 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -584,7 +584,7 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu)
 	 * version first and level-triggered interrupts never get EOIed in
 	 * IOAPIC.
 	 */
-	if (guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_X2APIC) &&
 	    !ioapic_in_kernel(vcpu->kvm))
 		v |= APIC_LVR_DIRECTED_EOI;
 	kvm_lapic_set_reg(apic, APIC_LVR, v);
diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index a67c28a56417..9e8cb38ae1db 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -128,7 +128,7 @@ static u8 mtrr_disabled_type(struct kvm_vcpu *vcpu)
 	 * enable MTRRs and it is obviously undesirable to run the
 	 * guest entirely with UC memory and we use WB.
 	 */
-	if (guest_cpuid_has(vcpu, X86_FEATURE_MTRR))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_MTRR))
 		return MTRR_TYPE_UNCACHABLE;
 	else
 		return MTRR_TYPE_WRBACK;
diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c
index dc3d95fdca7d..3ca4154d9fa0 100644
--- a/arch/x86/kvm/smm.c
+++ b/arch/x86/kvm/smm.c
@@ -290,7 +290,7 @@ void enter_smm(struct kvm_vcpu *vcpu)
 	memset(smram.bytes, 0, sizeof(smram.bytes));
 
 #ifdef CONFIG_X86_64
-	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		enter_smm_save_state_64(vcpu, &smram.smram64);
 	else
 #endif
@@ -360,7 +360,7 @@ void enter_smm(struct kvm_vcpu *vcpu)
 	kvm_set_segment(vcpu, &ds, VCPU_SREG_SS);
 
 #ifdef CONFIG_X86_64
-	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		if (static_call(kvm_x86_set_efer)(vcpu, 0))
 			goto error;
 #endif
@@ -593,7 +593,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
 	 * supports long mode.
 	 */
 #ifdef CONFIG_X86_64
-	if (guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM)) {
 		struct kvm_segment cs_desc;
 		unsigned long cr4;
 
@@ -616,7 +616,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
 		kvm_set_cr0(vcpu, cr0 & ~(X86_CR0_PG | X86_CR0_PE));
 
 #ifdef CONFIG_X86_64
-	if (guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM)) {
 		unsigned long cr4, efer;
 
 		/* Clear CR4.PAE before clearing EFER.LME. */
@@ -639,7 +639,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
 		return X86EMUL_UNHANDLEABLE;
 
 #ifdef CONFIG_X86_64
-	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		return rsm_load_state_64(ctxt, &smram.smram64);
 	else
 #endif
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 373ff6a6687b..16d396a31c16 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -46,7 +46,7 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
 
 	switch (msr) {
 	case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_PERFCTR_CORE))
 			return NULL;
 		/*
 		 * Each PMU counter has a pair of CTL and CTR MSRs. CTLn
@@ -113,7 +113,7 @@ static bool amd_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	case MSR_K7_EVNTSEL0 ... MSR_K7_PERFCTR3:
 		return pmu->version > 0;
 	case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
-		return guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE);
+		return guest_cpu_cap_has(vcpu, X86_FEATURE_PERFCTR_CORE);
 	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
 	case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
 	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
@@ -184,7 +184,7 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
 	union cpuid_0x80000022_ebx ebx;
 
 	pmu->version = 1;
-	if (guest_cpuid_has(vcpu, X86_FEATURE_PERFMON_V2)) {
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_PERFMON_V2)) {
 		pmu->version = 2;
 		/*
 		 * Note, PERFMON_V2 is also in 0x80000022.0x0, i.e. the guest
@@ -194,7 +194,7 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
 			     x86_feature_cpuid(X86_FEATURE_PERFMON_V2).index);
 		ebx.full = kvm_find_cpuid_entry_index(vcpu, 0x80000022, 0)->ebx;
 		pmu->nr_arch_gp_counters = ebx.split.num_core_pmc;
-	} else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE)) {
+	} else if (guest_cpu_cap_has(vcpu, X86_FEATURE_PERFCTR_CORE)) {
 		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
 	} else {
 		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 4900c078045a..05008d33ae63 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2967,8 +2967,8 @@ static void sev_es_vcpu_after_set_cpuid(struct vcpu_svm *svm)
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 
 	if (boot_cpu_has(X86_FEATURE_V_TSC_AUX)) {
-		bool v_tsc_aux = guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
-				 guest_cpuid_has(vcpu, X86_FEATURE_RDPID);
+		bool v_tsc_aux = guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) ||
+				 guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID);
 
 		set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, v_tsc_aux, v_tsc_aux);
 	}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 5827328e30f1..9e3a9191dac1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1185,14 +1185,14 @@ static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu,
 	 */
 	if (kvm_cpu_cap_has(X86_FEATURE_INVPCID)) {
 		if (!npt_enabled ||
-		    !guest_cpuid_has(&svm->vcpu, X86_FEATURE_INVPCID))
+		    !guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_INVPCID))
 			svm_set_intercept(svm, INTERCEPT_INVPCID);
 		else
 			svm_clr_intercept(svm, INTERCEPT_INVPCID);
 	}
 
 	if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP)) {
-		if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP))
 			svm_clr_intercept(svm, INTERCEPT_RDTSCP);
 		else
 			svm_set_intercept(svm, INTERCEPT_RDTSCP);
@@ -2905,7 +2905,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_AMD64_VIRT_SPEC_CTRL:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_VIRT_SSBD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_VIRT_SSBD))
 			return 1;
 
 		msr_info->data = svm->virt_spec_ctrl;
@@ -3052,7 +3052,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		break;
 	case MSR_AMD64_VIRT_SPEC_CTRL:
 		if (!msr->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_VIRT_SSBD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_VIRT_SSBD))
 			return 1;
 
 		if (data & ~SPEC_CTRL_SSBD)
@@ -3224,7 +3224,7 @@ static int invpcid_interception(struct kvm_vcpu *vcpu)
 	unsigned long type;
 	gva_t gva;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_INVPCID)) {
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_INVPCID)) {
 		kvm_queue_exception(vcpu, UD_VECTOR);
 		return 1;
 	}
@@ -4318,7 +4318,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
 			     boot_cpu_has(X86_FEATURE_XSAVE) &&
 			     boot_cpu_has(X86_FEATURE_XSAVES) &&
-			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
+			     guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE));
 
 	guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);
 	guest_cpu_cap_restrict(vcpu, X86_FEATURE_TSCRATEMSR);
@@ -4345,7 +4345,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 	if (boot_cpu_has(X86_FEATURE_FLUSH_L1D))
 		set_msr_interception(vcpu, svm->msrpm, MSR_IA32_FLUSH_CMD, 0,
-				     !!guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D));
+				     !!guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
 
 	if (sev_guest(vcpu->kvm))
 		sev_vcpu_after_set_cpuid(svm);
@@ -4602,7 +4602,7 @@ static int svm_enter_smm(struct kvm_vcpu *vcpu, union kvm_smram *smram)
 	 * responsible for ensuring nested SVM and SMIs are mutually exclusive.
 	 */
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		return 1;
 
 	smram->smram64.svm_guest_flag = 1;
@@ -4649,14 +4649,14 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const union kvm_smram *smram)
 
 	const struct kvm_smram_state_64 *smram64 = &smram->smram64;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		return 0;
 
 	/* Non-zero if SMI arrived while vCPU was in guest mode. */
 	if (!smram64->svm_guest_flag)
 		return 0;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_SVM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SVM))
 		return 1;
 
 	if (!(smram64->efer & EFER_SVME))
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4750d1696d58..f046813e34c1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2005,7 +2005,7 @@ static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
 	bool evmcs_gpa_changed = false;
 	u64 evmcs_gpa;
 
-	if (likely(!guest_cpuid_has_evmcs(vcpu)))
+	if (likely(!guest_cpu_cap_has_evmcs(vcpu)))
 		return EVMPTRLD_DISABLED;
 
 	evmcs_gpa = nested_get_evmptr(vcpu);
@@ -2888,7 +2888,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
 	    nested_check_vm_entry_controls(vcpu, vmcs12))
 		return -EINVAL;
 
-	if (guest_cpuid_has_evmcs(vcpu))
+	if (guest_cpu_cap_has_evmcs(vcpu))
 		return nested_evmcs_check_controls(vmcs12);
 
 	return 0;
@@ -3170,7 +3170,7 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)
 	 * L2 was running), map it here to make sure vmcs12 changes are
 	 * properly reflected.
 	 */
-	if (guest_cpuid_has_evmcs(vcpu) &&
+	if (guest_cpu_cap_has_evmcs(vcpu) &&
 	    vmx->nested.hv_evmcs_vmptr == EVMPTR_MAP_PENDING) {
 		enum nested_evmptrld_status evmptrld_status =
 			nested_vmx_handle_enlightened_vmptrld(vcpu, false);
@@ -4814,7 +4814,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
 	 * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
 	 * separate modes for L2 vs L1.
 	 */
-	if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL))
 		indirect_branch_prediction_barrier();
 
 	/* Update any VMCS fields that might have changed while L2 ran */
@@ -5302,7 +5302,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
 	 * state. It is possible that the area will stay mapped as
 	 * vmx->nested.hv_evmcs but this shouldn't be a problem.
 	 */
-	if (likely(!guest_cpuid_has_evmcs(vcpu) ||
+	if (likely(!guest_cpu_cap_has_evmcs(vcpu) ||
 		   !evmptr_is_valid(nested_get_evmptr(vcpu)))) {
 		if (vmptr == vmx->nested.current_vmptr)
 			nested_release_vmcs12(vcpu);
@@ -6092,7 +6092,7 @@ static bool nested_vmx_exit_handled_encls(struct kvm_vcpu *vcpu,
 {
 	u32 encls_leaf;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_SGX) ||
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SGX) ||
 	    !nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENCLS_EXITING))
 		return false;
 
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 820d3e1f6b4f..98d579c0ce28 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -160,7 +160,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
 
 static inline u64 vcpu_get_perf_capabilities(struct kvm_vcpu *vcpu)
 {
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_PDCM))
 		return 0;
 
 	return vcpu->arch.perf_capabilities;
@@ -210,7 +210,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 		ret = vcpu_get_perf_capabilities(vcpu) & PERF_CAP_PEBS_FORMAT;
 		break;
 	case MSR_IA32_DS_AREA:
-		ret = guest_cpuid_has(vcpu, X86_FEATURE_DS);
+		ret = guest_cpu_cap_has(vcpu, X86_FEATURE_DS);
 		break;
 	case MSR_PEBS_DATA_CFG:
 		perf_capabilities = vcpu_get_perf_capabilities(vcpu);
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 3e822e582497..9616b4ac0662 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -122,7 +122,7 @@ static int sgx_inject_fault(struct kvm_vcpu *vcpu, gva_t gva, int trapnr)
 	 * likely than a bad userspace address.
 	 */
 	if ((trapnr == PF_VECTOR || !boot_cpu_has(X86_FEATURE_SGX2)) &&
-	    guest_cpuid_has(vcpu, X86_FEATURE_SGX2)) {
+	    guest_cpu_cap_has(vcpu, X86_FEATURE_SGX2)) {
 		memset(&ex, 0, sizeof(ex));
 		ex.vector = PF_VECTOR;
 		ex.error_code = PFERR_PRESENT_MASK | PFERR_WRITE_MASK |
@@ -365,7 +365,7 @@ static inline bool encls_leaf_enabled_in_guest(struct kvm_vcpu *vcpu, u32 leaf)
 		return true;
 
 	if (leaf >= EAUG && leaf <= EMODT)
-		return guest_cpuid_has(vcpu, X86_FEATURE_SGX2);
+		return guest_cpu_cap_has(vcpu, X86_FEATURE_SGX2);
 
 	return false;
 }
@@ -381,8 +381,8 @@ int handle_encls(struct kvm_vcpu *vcpu)
 {
 	u32 leaf = (u32)kvm_rax_read(vcpu);
 
-	if (!enable_sgx || !guest_cpuid_has(vcpu, X86_FEATURE_SGX) ||
-	    !guest_cpuid_has(vcpu, X86_FEATURE_SGX1)) {
+	if (!enable_sgx || !guest_cpu_cap_has(vcpu, X86_FEATURE_SGX) ||
+	    !guest_cpu_cap_has(vcpu, X86_FEATURE_SGX1)) {
 		kvm_queue_exception(vcpu, UD_VECTOR);
 	} else if (!encls_leaf_enabled_in_guest(vcpu, leaf) ||
 		   !sgx_enabled_in_guest_bios(vcpu) || !is_paging(vcpu)) {
@@ -479,15 +479,15 @@ void vmx_write_encls_bitmap(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	if (!cpu_has_vmx_encls_vmexit())
 		return;
 
-	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX) &&
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX) &&
 	    sgx_enabled_in_guest_bios(vcpu)) {
-		if (guest_cpuid_has(vcpu, X86_FEATURE_SGX1)) {
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX1)) {
 			bitmap &= ~GENMASK_ULL(ETRACK, ECREATE);
 			if (sgx_intercept_encls_ecreate(vcpu))
 				bitmap |= (1 << ECREATE);
 		}
 
-		if (guest_cpuid_has(vcpu, X86_FEATURE_SGX2))
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX2))
 			bitmap &= ~GENMASK_ULL(EMODT, EAUG);
 
 		/*
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5a056ad1ae55..815692dc0aff 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1874,8 +1874,8 @@ static void vmx_setup_uret_msrs(struct vcpu_vmx *vmx)
 	vmx_setup_uret_msr(vmx, MSR_EFER, update_transition_efer(vmx));
 
 	vmx_setup_uret_msr(vmx, MSR_TSC_AUX,
-			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
-			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID));
+			   guest_cpu_cap_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
+			   guest_cpu_cap_has(&vmx->vcpu, X86_FEATURE_RDPID));
 
 	/*
 	 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations of new
@@ -2028,7 +2028,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_BNDCFGS:
 		if (!kvm_mpx_supported() ||
 		    (!msr_info->host_initiated &&
-		     !guest_cpuid_has(vcpu, X86_FEATURE_MPX)))
+		     !guest_cpu_cap_has(vcpu, X86_FEATURE_MPX)))
 			return 1;
 		msr_info->data = vmcs_read64(GUEST_BNDCFGS);
 		break;
@@ -2044,7 +2044,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_SGX_LC))
 			return 1;
 		msr_info->data = to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash
 			[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
@@ -2062,7 +2062,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * sanity checking and refuse to boot. Filter all unsupported
 		 * features out.
 		 */
-		if (!msr_info->host_initiated && guest_cpuid_has_evmcs(vcpu))
+		if (!msr_info->host_initiated && guest_cpu_cap_has_evmcs(vcpu))
 			nested_evmcs_filter_control_msr(vcpu, msr_info->index,
 							&msr_info->data);
 		break;
@@ -2131,7 +2131,7 @@ static u64 nested_vmx_truncate_sysenter_addr(struct kvm_vcpu *vcpu,
 						    u64 data)
 {
 #ifdef CONFIG_X86_64
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		return (u32)data;
 #endif
 	return (unsigned long)data;
@@ -2142,7 +2142,7 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated
 	u64 debugctl = 0;
 
 	if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT) &&
-	    (host_initiated || guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT)))
+	    (host_initiated || guest_cpu_cap_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT)))
 		debugctl |= DEBUGCTLMSR_BUS_LOCK_DETECT;
 
 	if ((kvm_caps.supported_perf_cap & PMU_CAP_LBR_FMT) &&
@@ -2246,7 +2246,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_BNDCFGS:
 		if (!kvm_mpx_supported() ||
 		    (!msr_info->host_initiated &&
-		     !guest_cpuid_has(vcpu, X86_FEATURE_MPX)))
+		     !guest_cpu_cap_has(vcpu, X86_FEATURE_MPX)))
 			return 1;
 		if (is_noncanonical_address(data & PAGE_MASK, vcpu) ||
 		    (data & MSR_IA32_BNDCFGS_RSVD))
@@ -2348,7 +2348,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * behavior, but it's close enough.
 		 */
 		if (!msr_info->host_initiated &&
-		    (!guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC) ||
+		    (!guest_cpu_cap_has(vcpu, X86_FEATURE_SGX_LC) ||
 		    ((vmx->msr_ia32_feature_control & FEAT_CTL_LOCKED) &&
 		    !(vmx->msr_ia32_feature_control & FEAT_CTL_SGX_LC_ENABLED))))
 			return 1;
@@ -2434,9 +2434,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			if ((data & PERF_CAP_PEBS_MASK) !=
 			    (kvm_caps.supported_perf_cap & PERF_CAP_PEBS_MASK))
 				return 1;
-			if (!guest_cpuid_has(vcpu, X86_FEATURE_DS))
+			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_DS))
 				return 1;
-			if (!guest_cpuid_has(vcpu, X86_FEATURE_DTES64))
+			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_DTES64))
 				return 1;
 			if (!cpuid_model_is_consistent(vcpu))
 				return 1;
@@ -4566,10 +4566,7 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
 	bool __enabled;										\
 												\
 	if (cpu_has_vmx_##name()) {								\
-		if (kvm_is_governed_feature(X86_FEATURE_##feat_name))				\
-			__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);		\
-		else										\
-			__enabled = guest_cpuid_has(__vcpu, X86_FEATURE_##feat_name);		\
+		__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);			\
 		vmx_adjust_secondary_exec_control(vmx, exec_control, SECONDARY_EXEC_##ctrl_name,\
 						  __enabled, exiting);				\
 	}											\
@@ -4644,8 +4641,8 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
 	 */
 	if (cpu_has_vmx_rdtscp()) {
 		bool rdpid_or_rdtscp_enabled =
-			guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
-			guest_cpuid_has(vcpu, X86_FEATURE_RDPID);
+			guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) ||
+			guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID);
 
 		vmx_adjust_secondary_exec_control(vmx, &exec_control,
 						  SECONDARY_EXEC_ENABLE_RDTSCP,
@@ -5947,7 +5944,7 @@ static int handle_invpcid(struct kvm_vcpu *vcpu)
 	} operand;
 	int gpr_index;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_INVPCID)) {
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_INVPCID)) {
 		kvm_queue_exception(vcpu, UD_VECTOR);
 		return 1;
 	}
@@ -7756,7 +7753,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * set if and only if XSAVE is supported.
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
-	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
+	    guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE))
 		guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVES);
 	else
 		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
@@ -7782,21 +7779,21 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		nested_vmx_cr_fixed1_bits_update(vcpu);
 
 	if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
-			guest_cpuid_has(vcpu, X86_FEATURE_INTEL_PT))
+			guest_cpu_cap_has(vcpu, X86_FEATURE_INTEL_PT))
 		update_intel_pt_cfg(vcpu);
 
 	if (boot_cpu_has(X86_FEATURE_RTM)) {
 		struct vmx_uret_msr *msr;
 		msr = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
 		if (msr) {
-			bool enabled = guest_cpuid_has(vcpu, X86_FEATURE_RTM);
+			bool enabled = guest_cpu_cap_has(vcpu, X86_FEATURE_RTM);
 			vmx_set_guest_uret_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE);
 		}
 	}
 
 	if (kvm_cpu_cap_has(X86_FEATURE_XFD))
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_XFD_ERR, MSR_TYPE_R,
-					  !guest_cpuid_has(vcpu, X86_FEATURE_XFD));
+					  !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD));
 
 	if (boot_cpu_has(X86_FEATURE_IBPB))
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PRED_CMD, MSR_TYPE_W,
@@ -7804,17 +7801,17 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 	if (boot_cpu_has(X86_FEATURE_FLUSH_L1D))
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W,
-					  !guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D));
+					  !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
 
 	set_cr4_guest_host_mask(vmx);
 
 	vmx_write_encls_bitmap(vcpu, NULL);
-	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX))
 		vmx->msr_ia32_feature_control_valid_bits |= FEAT_CTL_SGX_ENABLED;
 	else
 		vmx->msr_ia32_feature_control_valid_bits &= ~FEAT_CTL_SGX_ENABLED;
 
-	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX_LC))
 		vmx->msr_ia32_feature_control_valid_bits |=
 			FEAT_CTL_SGX_LC_ENABLED;
 	else
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c2130d2c8e24..edca0a4276fb 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -745,7 +745,7 @@ static inline bool vmx_can_use_ipiv(struct kvm_vcpu *vcpu)
 	return  lapic_in_kernel(vcpu) && enable_ipiv;
 }
 
-static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
+static inline bool guest_cpu_cap_has_evmcs(struct kvm_vcpu *vcpu)
 {
 	/*
 	 * eVMCS is exposed to the guest if Hyper-V is enabled in CPUID and
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 04a77b764a36..a6b8f844a5bc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -487,7 +487,7 @@ int kvm_set_apic_base(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	enum lapic_mode old_mode = kvm_get_apic_mode(vcpu);
 	enum lapic_mode new_mode = kvm_apic_mode(msr_info->data);
 	u64 reserved_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu) | 0x2ff |
-		(guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) ? 0 : X2APIC_ENABLE);
+		(guest_cpu_cap_has(vcpu, X86_FEATURE_X2APIC) ? 0 : X2APIC_ENABLE);
 
 	if ((msr_info->data & reserved_bits) != 0 || new_mode == LAPIC_MODE_INVALID)
 		return 1;
@@ -1362,10 +1362,10 @@ static u64 kvm_dr6_fixed(struct kvm_vcpu *vcpu)
 {
 	u64 fixed = DR6_FIXED_1;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_RTM))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_RTM))
 		fixed |= DR6_RTM;
 
-	if (!guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
 		fixed |= DR6_BUS_LOCK;
 	return fixed;
 }
@@ -1721,20 +1721,20 @@ static int do_get_msr_feature(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 
 static bool __kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
-	if (efer & EFER_AUTOIBRS && !guest_cpuid_has(vcpu, X86_FEATURE_AUTOIBRS))
+	if (efer & EFER_AUTOIBRS && !guest_cpu_cap_has(vcpu, X86_FEATURE_AUTOIBRS))
 		return false;
 
-	if (efer & EFER_FFXSR && !guest_cpuid_has(vcpu, X86_FEATURE_FXSR_OPT))
+	if (efer & EFER_FFXSR && !guest_cpu_cap_has(vcpu, X86_FEATURE_FXSR_OPT))
 		return false;
 
-	if (efer & EFER_SVME && !guest_cpuid_has(vcpu, X86_FEATURE_SVM))
+	if (efer & EFER_SVME && !guest_cpu_cap_has(vcpu, X86_FEATURE_SVM))
 		return false;
 
 	if (efer & (EFER_LME | EFER_LMA) &&
-	    !guest_cpuid_has(vcpu, X86_FEATURE_LM))
+	    !guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
 		return false;
 
-	if (efer & EFER_NX && !guest_cpuid_has(vcpu, X86_FEATURE_NX))
+	if (efer & EFER_NX && !guest_cpu_cap_has(vcpu, X86_FEATURE_NX))
 		return false;
 
 	return true;
@@ -1872,8 +1872,8 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
 			return 1;
 
 		if (!host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) &&
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID))
 			return 1;
 
 		/*
@@ -1929,8 +1929,8 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
 			return 1;
 
 		if (!host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) &&
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID))
 			return 1;
 		break;
 	}
@@ -2122,7 +2122,7 @@ EXPORT_SYMBOL_GPL(kvm_handle_invalid_op);
 static int kvm_emulate_monitor_mwait(struct kvm_vcpu *vcpu, const char *insn)
 {
 	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS) &&
-	    !guest_cpuid_has(vcpu, X86_FEATURE_MWAIT))
+	    !guest_cpu_cap_has(vcpu, X86_FEATURE_MWAIT))
 		return kvm_handle_invalid_op(vcpu);
 
 	pr_warn_once("%s instruction emulated as NOP!\n", insn);
@@ -3761,11 +3761,11 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			if ((!guest_has_pred_cmd_msr(vcpu)))
 				return 1;
 
-			if (!guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
-			    !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB))
+			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
+			    !guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBPB))
 				reserved_bits |= PRED_CMD_IBPB;
 
-			if (!guest_cpuid_has(vcpu, X86_FEATURE_SBPB))
+			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SBPB))
 				reserved_bits |= PRED_CMD_SBPB;
 		}
 
@@ -3786,7 +3786,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	}
 	case MSR_IA32_FLUSH_CMD:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D))
 			return 1;
 
 		if (!boot_cpu_has(X86_FEATURE_FLUSH_L1D) || (data & ~L1D_FLUSH))
@@ -3837,7 +3837,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		kvm_set_lapic_tscdeadline_msr(vcpu, data);
 		break;
 	case MSR_IA32_TSC_ADJUST:
-		if (guest_cpuid_has(vcpu, X86_FEATURE_TSC_ADJUST)) {
+		if (guest_cpu_cap_has(vcpu, X86_FEATURE_TSC_ADJUST)) {
 			if (!msr_info->host_initiated) {
 				s64 adj = data - vcpu->arch.ia32_tsc_adjust_msr;
 				adjust_tsc_offset_guest(vcpu, adj);
@@ -3864,7 +3864,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 
 		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
 		    ((old_val ^ data)  & MSR_IA32_MISC_ENABLE_MWAIT)) {
-			if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
+			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_XMM3))
 				return 1;
 			vcpu->arch.ia32_misc_enable_msr = data;
 			kvm_update_cpuid_runtime(vcpu);
@@ -3892,7 +3892,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_XSS:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES))
 			return 1;
 		/*
 		 * KVM supports exposing PT to the guest, but does not support
@@ -4039,12 +4039,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		kvm_pr_unimpl_wrmsr(vcpu, msr, data);
 		break;
 	case MSR_AMD64_OSVW_ID_LENGTH:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
 			return 1;
 		vcpu->arch.osvw.length = data;
 		break;
 	case MSR_AMD64_OSVW_STATUS:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
 			return 1;
 		vcpu->arch.osvw.status = data;
 		break;
@@ -4065,7 +4065,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 #ifdef CONFIG_X86_64
 	case MSR_IA32_XFD:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
 			return 1;
 
 		if (data & ~kvm_guest_supported_xfd(vcpu))
@@ -4075,7 +4075,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_XFD_ERR:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
 			return 1;
 
 		if (data & ~kvm_guest_supported_xfd(vcpu))
@@ -4199,13 +4199,13 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_ARCH_CAPABILITIES:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
 			return 1;
 		msr_info->data = vcpu->arch.arch_capabilities;
 		break;
 	case MSR_IA32_PERF_CAPABILITIES:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_PDCM))
 			return 1;
 		msr_info->data = vcpu->arch.perf_capabilities;
 		break;
@@ -4361,7 +4361,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 				   msr_info->host_initiated);
 	case MSR_IA32_XSS:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES))
 			return 1;
 		msr_info->data = vcpu->arch.ia32_xss;
 		break;
@@ -4404,12 +4404,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		msr_info->data = 0xbe702111;
 		break;
 	case MSR_AMD64_OSVW_ID_LENGTH:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
 			return 1;
 		msr_info->data = vcpu->arch.osvw.length;
 		break;
 	case MSR_AMD64_OSVW_STATUS:
-		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
+		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
 			return 1;
 		msr_info->data = vcpu->arch.osvw.status;
 		break;
@@ -4428,14 +4428,14 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 #ifdef CONFIG_X86_64
 	case MSR_IA32_XFD:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
 			return 1;
 
 		msr_info->data = vcpu->arch.guest_fpu.fpstate->xfd;
 		break;
 	case MSR_IA32_XFD_ERR:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
 			return 1;
 
 		msr_info->data = vcpu->arch.guest_fpu.xfd_err;
@@ -8368,17 +8368,17 @@ static bool emulator_get_cpuid(struct x86_emulate_ctxt *ctxt,
 
 static bool emulator_guest_has_movbe(struct x86_emulate_ctxt *ctxt)
 {
-	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_MOVBE);
+	return guest_cpu_cap_has(emul_to_vcpu(ctxt), X86_FEATURE_MOVBE);
 }
 
 static bool emulator_guest_has_fxsr(struct x86_emulate_ctxt *ctxt)
 {
-	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_FXSR);
+	return guest_cpu_cap_has(emul_to_vcpu(ctxt), X86_FEATURE_FXSR);
 }
 
 static bool emulator_guest_has_rdpid(struct x86_emulate_ctxt *ctxt)
 {
-	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_RDPID);
+	return guest_cpu_cap_has(emul_to_vcpu(ctxt), X86_FEATURE_RDPID);
 }
 
 static ulong emulator_read_gpr(struct x86_emulate_ctxt *ctxt, unsigned reg)
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 9/9] KVM: x86: Restrict XSAVE in cpu_caps based on KVM capabilities
  2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
                   ` (7 preceding siblings ...)
  2023-11-10 23:55 ` [PATCH 8/9] KVM: x86: Replace all guest CPUID feature queries with cpu_caps check Sean Christopherson
@ 2023-11-10 23:55 ` Sean Christopherson
  2023-11-19 17:36   ` Maxim Levitsky
  8 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2023-11-10 23:55 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

Restrict XSAVE in guest cpu_caps so that XSAVES dependencies on XSAVE are
automatically handled instead of manually checking for host and guest
XSAVE support.  Aside from modifying XSAVE in cpu_caps, this should be a
glorified nop as KVM doesn't query guest XSAVE support (which is also why
it wasn't/isn't a bug to leave XSAVE set in guest CPUID).
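
For reference, the automatic dependency handling comes from
guest_cpu_cap_restrict(), introduced earlier in the series (its definition is
also visible in the follow-up diffs later in this thread); its gist is simply:

	static __always_inline void guest_cpu_cap_restrict(struct kvm_vcpu *vcpu,
							   unsigned int x86_feature)
	{
		/* A vCPU can never "have" a feature that KVM itself lacks. */
		if (!kvm_cpu_cap_has(x86_feature))
			guest_cpu_cap_clear(vcpu, x86_feature);
	}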

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 2 +-
 arch/x86/kvm/vmx/vmx.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9e3a9191dac1..6fe2d7bf4959 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4315,8 +4315,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
 	 * the guest read/write access to the host's XSS.
 	 */
+	guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVE);
 	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
-			     boot_cpu_has(X86_FEATURE_XSAVE) &&
 			     boot_cpu_has(X86_FEATURE_XSAVES) &&
 			     guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE));
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 815692dc0aff..7645945af5c5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7752,8 +7752,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * to the guest.  XSAVES depends on CR4.OSXSAVE, and CR4.OSXSAVE can be
 	 * set if and only if XSAVE is supported.
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
-	    guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE))
+	guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVE);
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE))
 		guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVES);
 	else
 		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-10 23:55 ` [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
@ 2023-11-13  8:03   ` Robert Hoo
  2023-11-14 13:48     ` Sean Christopherson
  2023-11-16  2:24   ` Yang, Weijiang
  2023-11-19 17:35   ` Maxim Levitsky
  2 siblings, 1 reply; 40+ messages in thread
From: Robert Hoo @ 2023-11-13  8:03 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Maxim Levitsky

On 11/11/2023 7:55 AM, Sean Christopherson wrote:
> When updating guest CPUID entries to emulate runtime behavior, e.g. when
> the guest enables a CR4-based feature that is tied to a CPUID flag, also
> update the vCPU's cpu_caps accordingly.  This will allow replacing all
> usage of guest_cpuid_has() with guest_cpu_cap_has().
> 
> Take care not to update guest capabilities when KVM is updating CPUID
> entries that *may* become the vCPU's CPUID, e.g. if userspace tries to set
> bogus CPUID information.  No extra call to update cpu_caps is needed as
> the cpu_caps are initialized from the incoming guest CPUID, i.e. will
> automatically get the updated values.
> 
> Note, none of the features in question use guest_cpu_cap_has() at this
> time, i.e. aside from setting bits in cpu_caps, this is a glorified nop.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/cpuid.c | 48 +++++++++++++++++++++++++++++++-------------
>   1 file changed, 34 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 36bd04030989..37a991439fe6 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -262,31 +262,48 @@ static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)
>   	return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
>   }
>   
> +static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
> +						       struct kvm_cpuid_entry2 *entry,
> +						       unsigned int x86_feature,
> +						       bool has_feature)
> +{
> +	if (entry)
> +		cpuid_entry_change(entry, x86_feature, has_feature);
> +
> +	if (vcpu)
> +		guest_cpu_cap_change(vcpu, x86_feature, has_feature);
> +}
> +
>   static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
>   				       int nent)
>   {
>   	struct kvm_cpuid_entry2 *best;
> +	struct kvm_vcpu *caps = vcpu;

u32 *caps  = vcpu->arch.cpu_caps;
and update guest_cpu_cap_set(), guest_cpu_cap_clear(), guest_cpu_cap_change() 
and guest_cpu_cap_restrict() to pass in vcpu->arch.cpu_caps instead of vcpu, 
since all of them merely refer to the vCPU's caps rather than the whole vCPU info.

Or, as a simpler change, rename the variable "caps" --> "vcpu" here, to reduce 
reading confusion.

> +
> +	/*
> +	 * Don't update vCPU capabilities if KVM is updating CPUID entries that
> +	 * are coming in from userspace!
> +	 */
> +	if (entries != vcpu->arch.cpuid_entries)
> +		caps = NULL;
>   
>   	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> -	if (best) {
> -		/* Update OSXSAVE bit */
> -		if (boot_cpu_has(X86_FEATURE_XSAVE))
> -			cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
> +
> +	if (boot_cpu_has(X86_FEATURE_XSAVE))
> +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSXSAVE,
>   					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
>   
> -		cpuid_entry_change(best, X86_FEATURE_APIC,
> -			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> +	kvm_update_feature_runtime(caps, best, X86_FEATURE_APIC,
> +				   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
>   
> -		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> -			cpuid_entry_change(best, X86_FEATURE_MWAIT,
> -					   vcpu->arch.ia32_misc_enable_msr &
> -					   MSR_IA32_MISC_ENABLE_MWAIT);
> -	}
> +	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> +		kvm_update_feature_runtime(caps, best, X86_FEATURE_MWAIT,
> +					   vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT);
>   
>   	best = cpuid_entry2_find(entries, nent, 7, 0);
> -	if (best && boot_cpu_has(X86_FEATURE_PKU))
> -		cpuid_entry_change(best, X86_FEATURE_OSPKE,
> -				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
> +	if (boot_cpu_has(X86_FEATURE_PKU))
> +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSPKE,
> +					   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
>   
>   	best = cpuid_entry2_find(entries, nent, 0xD, 0);
>   	if (best)
> @@ -353,6 +370,9 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
>   	 * honor userspace's definition for features that don't require KVM or
>   	 * hardware management/support (or that KVM simply doesn't care about).
> +	 *
> +	 * Note, KVM has already done runtime updates on guest CPUID, i.e. this
> +	 * will also correctly set runtime features in guest CPU capabilities.
>   	 */
>   	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
>   		const struct cpuid_reg cpuid = reverse_cpuid[i];


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/9] KVM: x86: Replace guts of "goverened" features with comprehensive cpu_caps
  2023-11-10 23:55 ` [PATCH 2/9] KVM: x86: Replace guts of "goverened" features with comprehensive cpu_caps Sean Christopherson
@ 2023-11-14  9:12   ` Binbin Wu
  2023-11-19 17:22   ` Maxim Levitsky
  1 sibling, 0 replies; 40+ messages in thread
From: Binbin Wu @ 2023-11-14  9:12 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Maxim Levitsky



On 11/11/2023 7:55 AM, Sean Christopherson wrote:
> Replace the internals of the governed features framework with a more
> comprehensive "guest CPU capabilities" implementation, i.e. with a guest
> version of kvm_cpu_caps.  Keep the skeleton of governed features around
> for now as vmx_adjust_sec_exec_control() relies on detecting governed
> features to do the right thing for XSAVES, and switching all guest feature
> queries to guest_cpu_cap_has() requires subtle and non-trivial changes,
> i.e. is best done as a standalone change.
>
> Tracking *all* guest capabilities that KVM cares about will allow excising the
> poorly named "governed features" framework, and effectively optimizes all
> KVM queries of guest capabilities, i.e. doesn't require making a
> subjective decision as to whether or not a feature is worth "governing",
> and doesn't require adding the code to do so.
>
> The cost of tracking all features is currently 92 bytes per vCPU on 64-bit
> kernels: 100 bytes for cpu_caps versus 8 bytes for governed_features.
> That cost is well worth paying even if the only benefit was eliminating
> the "governed features" terminology.  And practically speaking, the real
> cost is zero unless those 92 bytes push the size of vcpu_vmx or vcpu_svm
> into a new order-N allocation, and if that happens there are better ways
> to reduce the footprint of kvm_vcpu_arch, e.g. making the PMU and/or MTRR
> state separate allocations.
>
> Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Nit: one typo in the short log, "goverened" -> "governed"

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

> ---
>   arch/x86/include/asm/kvm_host.h | 40 ++++++++++++++++++++-------------
>   arch/x86/kvm/cpuid.c            |  4 +---
>   arch/x86/kvm/cpuid.h            | 14 ++++++------
>   arch/x86/kvm/reverse_cpuid.h    | 15 -------------
>   4 files changed, 32 insertions(+), 41 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d7036982332e..1d43dd5fdea7 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -722,6 +722,22 @@ struct kvm_queued_exception {
>   	bool has_payload;
>   };
>   
> +/*
> + * Hardware-defined CPUID leafs that are either scattered by the kernel or are
> + * unknown to the kernel, but need to be directly used by KVM.  Note, these
> + * word values conflict with the kernel's "bug" caps, but KVM doesn't use those.
> + */
> +enum kvm_only_cpuid_leafs {
> +	CPUID_12_EAX	 = NCAPINTS,
> +	CPUID_7_1_EDX,
> +	CPUID_8000_0007_EDX,
> +	CPUID_8000_0022_EAX,
> +	NR_KVM_CPU_CAPS,
> +
> +	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
> +};
> +
> +
>   struct kvm_vcpu_arch {
>   	/*
>   	 * rip and regs accesses must go through
> @@ -840,23 +856,15 @@ struct kvm_vcpu_arch {
>   	struct kvm_hypervisor_cpuid kvm_cpuid;
>   
>   	/*
> -	 * FIXME: Drop this macro and use KVM_NR_GOVERNED_FEATURES directly
> -	 * when "struct kvm_vcpu_arch" is no longer defined in an
> -	 * arch/x86/include/asm header.  The max is mostly arbitrary, i.e.
> -	 * can be increased as necessary.
> +	 * Track the effective guest capabilities, i.e. the features the vCPU
> +	 * is allowed to use.  Typically, but not always, features can be used
> +	 * by the guest if and only if both KVM and userspace want to expose
> +	 * the feature to the guest.  A common exception is for virtualization
> +	 * holes, i.e. when KVM can't prevent the guest from using a feature,
> +	 * in which case the vCPU "has" the feature regardless of what KVM or
> +	 * userspace desires.
>   	 */
> -#define KVM_MAX_NR_GOVERNED_FEATURES BITS_PER_LONG
> -
> -	/*
> -	 * Track whether or not the guest is allowed to use features that are
> -	 * governed by KVM, where "governed" means KVM needs to manage state
> -	 * and/or explicitly enable the feature in hardware.  Typically, but
> -	 * not always, governed features can be used by the guest if and only
> -	 * if both KVM and userspace want to expose the feature to the guest.
> -	 */
> -	struct {
> -		DECLARE_BITMAP(enabled, KVM_MAX_NR_GOVERNED_FEATURES);
> -	} governed_features;
> +	u32 cpu_caps[NR_KVM_CPU_CAPS];
>   
>   	u64 reserved_gpa_bits;
>   	int maxphyaddr;
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 4f464187b063..4bf3c2d4dc7c 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -327,9 +327,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   	struct kvm_cpuid_entry2 *best;
>   	bool allow_gbpages;
>   
> -	BUILD_BUG_ON(KVM_NR_GOVERNED_FEATURES > KVM_MAX_NR_GOVERNED_FEATURES);
> -	bitmap_zero(vcpu->arch.governed_features.enabled,
> -		    KVM_MAX_NR_GOVERNED_FEATURES);
> +	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
>   
>   	/*
>   	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index 245416ffa34c..9f18c4395b71 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -255,12 +255,12 @@ static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
>   }
>   
>   static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
> -						     unsigned int x86_feature)
> +					      unsigned int x86_feature)
>   {
> -	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
> +	unsigned int x86_leaf = __feature_leaf(x86_feature);
>   
> -	__set_bit(kvm_governed_feature_index(x86_feature),
> -		  vcpu->arch.governed_features.enabled);
> +	reverse_cpuid_check(x86_leaf);
> +	vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
>   }
>   
>   static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
> @@ -273,10 +273,10 @@ static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
>   static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
>   					      unsigned int x86_feature)
>   {
> -	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
> +	unsigned int x86_leaf = __feature_leaf(x86_feature);
>   
> -	return test_bit(kvm_governed_feature_index(x86_feature),
> -			vcpu->arch.governed_features.enabled);
> +	reverse_cpuid_check(x86_leaf);
> +	return vcpu->arch.cpu_caps[x86_leaf] & __feature_bit(x86_feature);
>   }
>   
>   #endif
> diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
> index b81650678375..4b658491e8f8 100644
> --- a/arch/x86/kvm/reverse_cpuid.h
> +++ b/arch/x86/kvm/reverse_cpuid.h
> @@ -6,21 +6,6 @@
>   #include <asm/cpufeature.h>
>   #include <asm/cpufeatures.h>
>   
> -/*
> - * Hardware-defined CPUID leafs that are either scattered by the kernel or are
> - * unknown to the kernel, but need to be directly used by KVM.  Note, these
> - * word values conflict with the kernel's "bug" caps, but KVM doesn't use those.
> - */
> -enum kvm_only_cpuid_leafs {
> -	CPUID_12_EAX	 = NCAPINTS,
> -	CPUID_7_1_EDX,
> -	CPUID_8000_0007_EDX,
> -	CPUID_8000_0022_EAX,
> -	NR_KVM_CPU_CAPS,
> -
> -	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
> -};
> -
>   /*
>    * Define a KVM-only feature flag.
>    *


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-13  8:03   ` Robert Hoo
@ 2023-11-14 13:48     ` Sean Christopherson
  2023-11-15  1:59       ` Robert Hoo
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2023-11-14 13:48 UTC (permalink / raw)
  To: Robert Hoo; +Cc: Paolo Bonzini, kvm, linux-kernel, Maxim Levitsky

On Mon, Nov 13, 2023, Robert Hoo wrote:
> On 11/11/2023 7:55 AM, Sean Christopherson wrote:
> > When updating guest CPUID entries to emulate runtime behavior, e.g. when
> > the guest enables a CR4-based feature that is tied to a CPUID flag, also
> > update the vCPU's cpu_caps accordingly.  This will allow replacing all
> > usage of guest_cpuid_has() with guest_cpu_cap_has().
> > 
> > Take care not to update guest capabilities when KVM is updating CPUID
> > entries that *may* become the vCPU's CPUID, e.g. if userspace tries to set
> > bogus CPUID information.  No extra call to update cpu_caps is needed as
> > the cpu_caps are initialized from the incoming guest CPUID, i.e. will
> > automatically get the updated values.
> > 
> > Note, none of the features in question use guest_cpu_cap_has() at this
> > time, i.e. aside from setting bits in cpu_caps, this is a glorified nop.
> > 
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >   arch/x86/kvm/cpuid.c | 48 +++++++++++++++++++++++++++++++-------------
> >   1 file changed, 34 insertions(+), 14 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 36bd04030989..37a991439fe6 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -262,31 +262,48 @@ static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)
> >   	return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
> >   }
> > +static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
> > +						       struct kvm_cpuid_entry2 *entry,
> > +						       unsigned int x86_feature,
> > +						       bool has_feature)
> > +{
> > +	if (entry)
> > +		cpuid_entry_change(entry, x86_feature, has_feature);
> > +
> > +	if (vcpu)
> > +		guest_cpu_cap_change(vcpu, x86_feature, has_feature);
> > +}
> > +
> >   static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> >   				       int nent)
> >   {
> >   	struct kvm_cpuid_entry2 *best;
> > +	struct kvm_vcpu *caps = vcpu;
> 
> u32 *caps  = vcpu->arch.cpu_caps;
> and update guest_cpu_cap_set(), guest_cpu_cap_clear(),
> guest_cpu_cap_change() and guest_cpu_cap_restrict() to pass in
> vcpu->arch.cpu_caps instead of vcpu, since all of them merely refer to the
> vCPU's caps rather than the whole vCPU info.

No, because then every caller would need extra code to pass vcpu->cpu_caps, and
passing 'u32 *' provides less type safety than 'struct kvm_vcpu *'.  That tradeoff
isn't worth making this one path slightly easier to read.
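
A hypothetical sketch of the type-safety point (illustration only, not code
from this series, assuming the two guest_cpu_cap_set() signatures discussed in
this thread):

	/* Hypothetical illustration. */
	u32 scratch[NR_KVM_CPU_CAPS];

	/*
	 * With a 'u32 *caps' parameter, the compiler accepts *any* u32 array,
	 * not just a vCPU's cpu_caps, so this compiles even though it is
	 * almost certainly a bug:
	 */
	guest_cpu_cap_set(scratch, X86_FEATURE_XSAVE);

	/*
	 * With the existing 'struct kvm_vcpu *vcpu' parameter, the same call
	 * is rejected at compile time as an incompatible pointer type.
	 */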

> Or, as a simpler change, rename the variable "caps" --> "vcpu" here, to reduce
> reading confusion.

@vcpu is already defined and needs to be used in this function.  See the comment
below.

I'm definitely open to a better name, though I would like to keep the name
relatively short so that the line lengths of the callers are reasonable, e.g. would
prefer not to do vcpu_caps.

> > +	/*
> > +	 * Don't update vCPU capabilities if KVM is updating CPUID entries that
> > +	 * are coming in from userspace!
> > +	 */
> > +	if (entries != vcpu->arch.cpuid_entries)
> > +		caps = NULL;
> >   	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> > -	if (best) {
> > -		/* Update OSXSAVE bit */
> > -		if (boot_cpu_has(X86_FEATURE_XSAVE))
> > -			cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
> > +
> > +	if (boot_cpu_has(X86_FEATURE_XSAVE))
> > +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSXSAVE,
> >   					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
> > -		cpuid_entry_change(best, X86_FEATURE_APIC,
> > -			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> > +	kvm_update_feature_runtime(caps, best, X86_FEATURE_APIC,
> > +				   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-14 13:48     ` Sean Christopherson
@ 2023-11-15  1:59       ` Robert Hoo
  2023-11-15 15:09         ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Robert Hoo @ 2023-11-15  1:59 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Maxim Levitsky

On 11/14/2023 9:48 PM, Sean Christopherson wrote:
> On Mon, Nov 13, 2023, Robert Hoo wrote:
...
>> u32 *caps  = vcpu->arch.cpu_caps;
>> and update guest_cpu_cap_set(), guest_cpu_cap_clear(),
>> guest_cpu_cap_change() and guest_cpu_cap_restrict() to pass in
>> vcpu->arch.cpu_caps instead of vcpu, since all of them merely refer to vcpu
>> cap, rather than whole vcpu info.
> 
> No, because then every caller would need extra code to pass vcpu->cpu_caps, 

Emm, I don't understand this. I tried modifying and compiling; all that's 
needed is to simply substitute "vcpu" with "vcpu->arch.cpu_caps" at the call 
sites. (At the end is my diff based on this patch set.)

> and
> passing 'u32 *' provides less type safety than 'struct kvm_vcpu *'.  That tradeoff
> isn't worth making this one path slightly easier to read.

My point is also about long-term robustness: as a principle, we'd better pass 
a function only the params/info it needs, e.g. cpuid_entry2_find() (see the line 
quoted below). Anyway, this is a less important point and shouldn't distract 
your focus.
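
As quoted from the patch context earlier in this thread, cpuid_entry2_find()
already follows that principle, taking just the entries array and count rather
than the whole vCPU:

	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);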

The whole idea of this patch set is good; I also felt confused when initially 
looking into the vCPU CPUID code and its complicated dependencies on each other 
and on KVM caps (or "governed" features, in your words), and even on 
kernel-governed features and HW caps(?). With this guest cpu_caps[], the layered 
relationship can be much clearer, along with fast guest cap queries.
> 
>> Or, for simple change, here rename variable name "caps" --> "vcpu", to less
>> reading confusion.
> 
> @vcpu is already defined and needs to be used in this function.  See the comment
> below.
> 
> I'm definitely open to a better name, though I would like to keep the name
> relative short so that the line lengths of the callers is reasonable, e.g. would
> prefer not to do vcpu_caps.
> 
>>> +	/*
>>> +	 * Don't update vCPU capabilities if KVM is updating CPUID entries that
>>> +	 * are coming in from userspace!
>>> +	 */
>>> +	if (entries != vcpu->arch.cpuid_entries)
>>> +		caps = NULL;
>>>    	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
>>> -	if (best) {
>>> -		/* Update OSXSAVE bit */
>>> -		if (boot_cpu_has(X86_FEATURE_XSAVE))
>>> -			cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
>>> +
>>> +	if (boot_cpu_has(X86_FEATURE_XSAVE))
>>> +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSXSAVE,
>>>    					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
>>> -		cpuid_entry_change(best, X86_FEATURE_APIC,
>>> -			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
>>> +	kvm_update_feature_runtime(caps, best, X86_FEATURE_APIC,
>>> +				   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6407e5c45f20..3e8976705342 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -262,7 +262,7 @@ static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)
         return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
  }

-static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
+static __always_inline void kvm_update_feature_runtime(u32 *guest_caps,
                                                        struct kvm_cpuid_entry2 
*entry,
                                                        unsigned int x86_feature,
                                                        bool has_feature)
@@ -270,15 +270,15 @@ static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
         if (entry)
                 cpuid_entry_change(entry, x86_feature, has_feature);

-       if (vcpu)
-               guest_cpu_cap_change(vcpu, x86_feature, has_feature);
+       if (guest_caps)
+               guest_cpu_cap_change(guest_caps, x86_feature, has_feature);
  }

  static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
                                        int nent)
  {
         struct kvm_cpuid_entry2 *best;
-       struct kvm_vcpu *caps = vcpu;
+       u32 *caps = vcpu->arch.cpu_caps;

         /*
          * Don't update vCPU capabilities if KVM is updating CPUID entries that
@@ -397,7 +397,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
          */
         allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
                                       guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES);
-       guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
+       guest_cpu_cap_change(vcpu->arch.cpu_caps, X86_FEATURE_GBPAGES, allow_gbpages);

         best = kvm_find_cpuid_entry(vcpu, 1);
         if (best && apic) {
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 98694dfe062e..a3a0482fc514 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -183,39 +183,39 @@ static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
         return vcpu->arch.pv_cpuid.features & (1u << kvm_feature);
  }

-static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
+static __always_inline void guest_cpu_cap_set(u32 *caps,
                                               unsigned int x86_feature)
  {
         unsigned int x86_leaf = __feature_leaf(x86_feature);

         reverse_cpuid_check(x86_leaf);
-       vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
+       caps[x86_leaf] |= __feature_bit(x86_feature);
  }

-static __always_inline void guest_cpu_cap_clear(struct kvm_vcpu *vcpu,
+static __always_inline void guest_cpu_cap_clear(u32 *caps,
                                                 unsigned int x86_feature)
  {
         unsigned int x86_leaf = __feature_leaf(x86_feature);

         reverse_cpuid_check(x86_leaf);
-       vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
+       caps[x86_leaf] &= ~__feature_bit(x86_feature);
  }

-static __always_inline void guest_cpu_cap_change(struct kvm_vcpu *vcpu,
+static __always_inline void guest_cpu_cap_change(u32 *caps,
                                                  unsigned int x86_feature,
                                                  bool guest_has_cap)
  {
         if (guest_has_cap)
-               guest_cpu_cap_set(vcpu, x86_feature);
+               guest_cpu_cap_set(caps, x86_feature);
         else
-               guest_cpu_cap_clear(vcpu, x86_feature);
+               guest_cpu_cap_clear(caps, x86_feature);
  }

-static __always_inline void guest_cpu_cap_restrict(struct kvm_vcpu *vcpu,
+static __always_inline void guest_cpu_cap_restrict(u32 *caps,
                                                    unsigned int x86_feature)
  {
         if (!kvm_cpu_cap_has(x86_feature))
-               guest_cpu_cap_clear(vcpu, x86_feature);
+               guest_cpu_cap_clear(caps, x86_feature);
  }

  static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 6fe2d7bf4959..dd4ca07c3cd0 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4315,14 +4315,14 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
          * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
          * the guest read/write access to the host's XSS.
          */
-       guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVE);
-       guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
+       guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_XSAVE);
+       guest_cpu_cap_change(vcpu->arch.cpu_caps, X86_FEATURE_XSAVES,
                              boot_cpu_has(X86_FEATURE_XSAVES) &&
                              guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE));

-       guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);
-       guest_cpu_cap_restrict(vcpu, X86_FEATURE_TSCRATEMSR);
-       guest_cpu_cap_restrict(vcpu, X86_FEATURE_LBRV);
+       guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_NRIPS);
+       guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_TSCRATEMSR);
+       guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_LBRV);

         /*
          * Intercept VMLOAD if the vCPU mode is Intel in order to emulate that
@@ -4330,12 +4330,12 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
          * SVM on Intel is bonkers and extremely unlikely to work).
          */
         if (!guest_cpuid_is_intel(vcpu))
-               guest_cpu_cap_restrict(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
-               guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_V_VMSAVE_VMLOAD);

-       guest_cpu_cap_restrict(vcpu, X86_FEATURE_PAUSEFILTER);
-       guest_cpu_cap_restrict(vcpu, X86_FEATURE_PFTHRESHOLD);
-       guest_cpu_cap_restrict(vcpu, X86_FEATURE_VGIF);
-       guest_cpu_cap_restrict(vcpu, X86_FEATURE_VNMI);
+       guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_PAUSEFILTER);
+       guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_PFTHRESHOLD);
+       guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_VGIF);
+       guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_VNMI);

         svm_recalc_instruction_intercepts(vcpu, svm);

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7645945af5c5..c23c96dc24cf 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7752,13 +7752,13 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
          * to the guest.  XSAVES depends on CR4.OSXSAVE, and CR4.OSXSAVE can be
          * set if and only if XSAVE is supported.
          */
-       guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVE);
+       guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_XSAVE);
         if (guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE))
-               guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVES);
+               guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_XSAVES);
         else
-               guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
+               guest_cpu_cap_clear(vcpu->arch.cpu_caps, X86_FEATURE_XSAVES);

-       guest_cpu_cap_restrict(vcpu, X86_FEATURE_VMX);
+       guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_VMX);

         vmx_setup_uret_msrs(vmx);


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-15  1:59       ` Robert Hoo
@ 2023-11-15 15:09         ` Sean Christopherson
  2023-11-17  1:28           ` Robert Hoo
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2023-11-15 15:09 UTC (permalink / raw)
  To: Robert Hoo; +Cc: Paolo Bonzini, kvm, linux-kernel, Maxim Levitsky

On Wed, Nov 15, 2023, Robert Hoo wrote:
> On 11/14/2023 9:48 PM, Sean Christopherson wrote:
> > On Mon, Nov 13, 2023, Robert Hoo wrote:
> ...
> > > u32 *caps  = vcpu->arch.cpu_caps;
> > > and update guest_cpu_cap_set(), guest_cpu_cap_clear(),
> > > guest_cpu_cap_change() and guest_cpu_cap_restrict() to pass in
> > > vcpu->arch.cpu_caps instead of vcpu, since all of them merely refer to the
> > > vCPU caps, rather than the whole vcpu info.
> > 
> > No, because then every caller would need extra code to pass
> > vcpu->cpu_caps,
> 
> Emm, I don't understand this. I tried modifying and compiling it; all that
> needs to be done is to simply substitute "vcpu" with "vcpu->arch.cpu_caps" at
> the call sites. (at the end is my diff based on this patch set)

Yes, and I'm saying that

	guest_cpu_cap_restrict(vcpu, X86_FEATURE_PAUSEFILTER);
	guest_cpu_cap_restrict(vcpu, X86_FEATURE_PFTHRESHOLD);
	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VGIF);
	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VNMI);

is easier to read and write than this

	guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_PAUSEFILTER);
	guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_PFTHRESHOLD);
	guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_VGIF);
	guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_VNMI);

a one-time search-replace is easy, but the extra boilerplate has a non-zero cost
for every future developer/reader.

> > and passing 'u32 *' provides less type safety than 'struct kvm_vcpu *'.
> > That tradeoff isn't worth making this one path slightly easier to read.
> 
> My point is also from a vulnerability perspective, long term, since as a
> principle, we'd better pass in to a function only the params/info it needs.

Attempting to apply the principle of least privilege to low level C helpers is
nonsensical.  E.g. the helper can trivially get at the owning vcpu via container_of()
(well, if not for typeof assertions not playing nice with arrays, but open coding
container_of() is also trivial and illustrates the point).

	struct kvm_vcpu_arch *arch = (void *)caps - offsetof(struct kvm_vcpu_arch, cpu_caps);
	struct kvm_vcpu *vcpu = container_of(arch, struct kvm_vcpu, arch);

	if (!kvm_cpu_cap_has(x86_feature))
		guest_cpu_cap_clear(vcpu, x86_feature);

And the intent behind that principle is to improve security/robustness; what I'm
saying is that passing in a 'u32 *' makes the overall implementation _less_ robust,
as it opens up the possibility of passing in an unsafe/incorrect pointer.  E.g.
a well-intentioned, not _that_ obviously broken example is:

	guest_cpu_cap_restrict(&vcpu->arch.cpu_caps[CPUID_1_ECX], X86_FEATURE_XSAVE);
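
To spell out the breakage: the helpers recompute the leaf index from the
feature, i.e. the CPUID_1_ECX offset gets applied twice.  A sketch of the
guest_cpu_cap_clear() that restrict() would do with that @caps:

	unsigned int x86_leaf = __feature_leaf(X86_FEATURE_XSAVE); /* CPUID_1_ECX */

	caps[x86_leaf] &= ~__feature_bit(X86_FEATURE_XSAVE);
	/* with caps == &vcpu->arch.cpu_caps[CPUID_1_ECX], this writes
	 * cpu_caps[CPUID_1_ECX + CPUID_1_ECX], i.e. clobbers an unrelated
	 * word instead of clearing XSAVE. */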

> e.g. cpuid_entry2_find().

The main reason cpuid_entry2_find() exists is because KVM checks the incoming
array provided by KVM_SET_CPUID2, which is also the reason why
__kvm_update_cpuid_runtime() takes an @entries array instead of just @vcpu.
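
Roughly, the flow is (a simplified sketch, not the exact code):

	static int kvm_set_cpuid(struct kvm_vcpu *vcpu,
				 struct kvm_cpuid_entry2 *e2, int nent)
	{
		int r;

		/* Runtime-sanitize the incoming array first... */
		__kvm_update_cpuid_runtime(vcpu, e2, nent);

		/* ...reject it outright if it's unusable... */
		r = kvm_check_cpuid(vcpu, e2, nent);
		if (r)
			return r;

		/* ...and only then let it become the vCPU's CPUID. */
		kvfree(vcpu->arch.cpuid_entries);
		vcpu->arch.cpuid_entries = e2;
		vcpu->arch.cpuid_nent = nent;
		return 0;
	}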

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-10 23:55 ` [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
  2023-11-13  8:03   ` Robert Hoo
@ 2023-11-16  2:24   ` Yang, Weijiang
  2023-11-16 22:19     ` Sean Christopherson
  2023-11-19 17:35   ` Maxim Levitsky
  2 siblings, 1 reply; 40+ messages in thread
From: Yang, Weijiang @ 2023-11-16  2:24 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Maxim Levitsky

On 11/11/2023 7:55 AM, Sean Christopherson wrote:
> When updating guest CPUID entries to emulate runtime behavior, e.g. when
> the guest enables a CR4-based feature that is tied to a CPUID flag, also
> update the vCPU's cpu_caps accordingly.  This will allow replacing all
> usage of guest_cpuid_has() with guest_cpu_cap_has().
>
> Take care not to update guest capabilities when KVM is updating CPUID
> entries that *may* become the vCPU's CPUID, e.g. if userspace tries to set
> bogus CPUID information.  No extra call to update cpu_caps is needed as
> the cpu_caps are initialized from the incoming guest CPUID, i.e. will
> automatically get the updated values.
>
> Note, none of the features in question use guest_cpu_cap_has() at this
> time, i.e. aside from setting bits in cpu_caps, this is a glorified nop.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/cpuid.c | 48 +++++++++++++++++++++++++++++++-------------
>   1 file changed, 34 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 36bd04030989..37a991439fe6 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -262,31 +262,48 @@ static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)
>   	return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
>   }
>   
> +static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
> +						       struct kvm_cpuid_entry2 *entry,
> +						       unsigned int x86_feature,
> +						       bool has_feature)
> +{
> +	if (entry)
> +		cpuid_entry_change(entry, x86_feature, has_feature);
> +
> +	if (vcpu)
> +		guest_cpu_cap_change(vcpu, x86_feature, has_feature);
> +}
> +
>   static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
>   				       int nent)
>   {
>   	struct kvm_cpuid_entry2 *best;
> +	struct kvm_vcpu *caps = vcpu;
> +
> +	/*
> +	 * Don't update vCPU capabilities if KVM is updating CPUID entries that
> +	 * are coming in from userspace!
> +	 */
> +	if (entries != vcpu->arch.cpuid_entries)
> +		caps = NULL;

Nit, why do we use caps here instead of vcpu? Looks a bit weird.

>   
>   	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> -	if (best) {
> -		/* Update OSXSAVE bit */
> -		if (boot_cpu_has(X86_FEATURE_XSAVE))
> -			cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
> +
> +	if (boot_cpu_has(X86_FEATURE_XSAVE))
> +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSXSAVE,
>   					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
>   
> -		cpuid_entry_change(best, X86_FEATURE_APIC,
> -			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> +	kvm_update_feature_runtime(caps, best, X86_FEATURE_APIC,
> +				   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
>   
> -		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> -			cpuid_entry_change(best, X86_FEATURE_MWAIT,
> -					   vcpu->arch.ia32_misc_enable_msr &
> -					   MSR_IA32_MISC_ENABLE_MWAIT);
> -	}
> +	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> +		kvm_update_feature_runtime(caps, best, X86_FEATURE_MWAIT,
> +					   vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT);

 > 80 characters?

>   
>   	best = cpuid_entry2_find(entries, nent, 7, 0);
> -	if (best && boot_cpu_has(X86_FEATURE_PKU))
> -		cpuid_entry_change(best, X86_FEATURE_OSPKE,
> -				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
> +	if (boot_cpu_has(X86_FEATURE_PKU))
> +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSPKE,
> +					   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
>   
>   	best = cpuid_entry2_find(entries, nent, 0xD, 0);
>   	if (best)
> @@ -353,6 +370,9 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
>   	 * honor userspace's definition for features that don't require KVM or
>   	 * hardware management/support (or that KVM simply doesn't care about).
> +	 *
> +	 * Note, KVM has already done runtime updates on guest CPUID, i.e. this
> +	 * will also correctly set runtime features in guest CPU capabilities.
>   	 */
>   	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
>   		const struct cpuid_reg cpuid = reverse_cpuid[i];
Reviewed-by: Yang Weijiang <weijiang.yang@intel.com>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2023-11-10 23:55 ` [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID Sean Christopherson
@ 2023-11-16  3:16   ` Yang, Weijiang
  2023-11-16 22:29     ` Sean Christopherson
  2023-11-19 17:32   ` Maxim Levitsky
  1 sibling, 1 reply; 40+ messages in thread
From: Yang, Weijiang @ 2023-11-16  3:16 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Maxim Levitsky

On 11/11/2023 7:55 AM, Sean Christopherson wrote:

[...]

> -static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
> -							unsigned int x86_feature)
> +static __always_inline void guest_cpu_cap_clear(struct kvm_vcpu *vcpu,
> +						unsigned int x86_feature)
>   {
> -	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
> +	unsigned int x86_leaf = __feature_leaf(x86_feature);
> +
> +	reverse_cpuid_check(x86_leaf);
> +	vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
> +}
> +
> +static __always_inline void guest_cpu_cap_change(struct kvm_vcpu *vcpu,
> +						 unsigned int x86_feature,
> +						 bool guest_has_cap)
> +{
> +	if (guest_has_cap)
>   		guest_cpu_cap_set(vcpu, x86_feature);
> +	else
> +		guest_cpu_cap_clear(vcpu, x86_feature);
> +}

I don't see any necessity to add 3 functions, i.e., guest_cpu_cap_{set, clear, change}, for
guest_cpu_cap update. IMHO one function is enough, e.g.:

static __always_inline void guest_cpu_cap_update(struct kvm_vcpu *vcpu,
                                                 unsigned int x86_feature,
                                                 bool guest_has_cap)
{
        unsigned int x86_leaf = __feature_leaf(x86_feature);

        reverse_cpuid_check(x86_leaf);
        if (guest_has_cap)
                vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
        else
                vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
}

> +
> +static __always_inline void guest_cpu_cap_restrict(struct kvm_vcpu *vcpu,
> +						   unsigned int x86_feature)
> +{
> +	if (!kvm_cpu_cap_has(x86_feature))
> +		guest_cpu_cap_clear(vcpu, x86_feature);
>   }

_restrict is not clear to me as to what the function actually does -- it
conditionally clears a guest cap depending on KVM's support for the feature.

How about renaming it to guest_cpu_cap_sync()?

>   
>   static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 8a99a73b6ee5..5827328e30f1 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4315,14 +4315,14 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   	 * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
>   	 * the guest read/write access to the host's XSS.
>   	 */
> -	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
> -	    boot_cpu_has(X86_FEATURE_XSAVES) &&
> -	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> -		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
> +	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
> +			     boot_cpu_has(X86_FEATURE_XSAVE) &&
> +			     boot_cpu_has(X86_FEATURE_XSAVES) &&
> +			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
>   
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_TSCRATEMSR);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_LBRV);
>   
>   	/*
>   	 * Intercept VMLOAD if the vCPU mode is Intel in order to emulate that
> @@ -4330,12 +4330,12 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   	 * SVM on Intel is bonkers and extremely unlikely to work).
>   	 */
>   	if (!guest_cpuid_is_intel(vcpu))
> -		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
> +		guest_cpu_cap_restrict(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
>   
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VGIF);
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VNMI);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_PAUSEFILTER);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_PFTHRESHOLD);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VGIF);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VNMI);
>   
>   	svm_recalc_instruction_intercepts(vcpu, svm);
>   
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 6328f0d47c64..5a056ad1ae55 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7757,9 +7757,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   	 */
>   	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
>   	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> -		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_XSAVES);
> +		guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVES);
> +	else
> +		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
>   
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VMX);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VMX);
>   
>   	vmx_setup_uret_msrs(vmx);
>   


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-16  2:24   ` Yang, Weijiang
@ 2023-11-16 22:19     ` Sean Christopherson
  0 siblings, 0 replies; 40+ messages in thread
From: Sean Christopherson @ 2023-11-16 22:19 UTC (permalink / raw)
  To: Weijiang Yang
  Cc: Paolo Bonzini, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Maxim Levitsky

On Thu, Nov 16, 2023, Weijiang Yang wrote:
> On 11/11/2023 7:55 AM, Sean Christopherson wrote:
> >   static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> >   				       int nent)
> >   {
> >   	struct kvm_cpuid_entry2 *best;
> > +	struct kvm_vcpu *caps = vcpu;
> > +
> > +	/*
> > +	 * Don't update vCPU capabilities if KVM is updating CPUID entries that
> > +	 * are coming in from userspace!
> > +	 */
> > +	if (entries != vcpu->arch.cpuid_entries)
> > +		caps = NULL;
> 
> Nit, why do we use caps here instead of vcpu? Looks a bit weird.

See my response to Robert: https://lore.kernel.org/all/9395d416-cc5c-536d-641e-ffd971b682d1@gmail.com

> >   	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> > -	if (best) {
> > -		/* Update OSXSAVE bit */
> > -		if (boot_cpu_has(X86_FEATURE_XSAVE))
> > -			cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
> > +
> > +	if (boot_cpu_has(X86_FEATURE_XSAVE))
> > +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSXSAVE,
> >   					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
> > -		cpuid_entry_change(best, X86_FEATURE_APIC,
> > -			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> > +	kvm_update_feature_runtime(caps, best, X86_FEATURE_APIC,
> > +				   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> > -		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> > -			cpuid_entry_change(best, X86_FEATURE_MWAIT,
> > -					   vcpu->arch.ia32_misc_enable_msr &
> > -					   MSR_IA32_MISC_ENABLE_MWAIT);
> > -	}
> > +	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> > +		kvm_update_feature_runtime(caps, best, X86_FEATURE_MWAIT,
> > +					   vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT);
> 
> > 80 characters?

Hmm, yeah, I suppose.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2023-11-16  3:16   ` Yang, Weijiang
@ 2023-11-16 22:29     ` Sean Christopherson
  2023-11-17  8:33       ` Yang, Weijiang
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2023-11-16 22:29 UTC (permalink / raw)
  To: Weijiang Yang
  Cc: Paolo Bonzini, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Maxim Levitsky

On Thu, Nov 16, 2023, Weijiang Yang wrote:
> On 11/11/2023 7:55 AM, Sean Christopherson wrote:
> 
> [...]
> 
> > -static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
> > -							unsigned int x86_feature)
> > +static __always_inline void guest_cpu_cap_clear(struct kvm_vcpu *vcpu,
> > +						unsigned int x86_feature)
> >   {
> > -	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
> > +	unsigned int x86_leaf = __feature_leaf(x86_feature);
> > +
> > +	reverse_cpuid_check(x86_leaf);
> > +	vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
> > +}
> > +
> > +static __always_inline void guest_cpu_cap_change(struct kvm_vcpu *vcpu,
> > +						 unsigned int x86_feature,
> > +						 bool guest_has_cap)
> > +{
> > +	if (guest_has_cap)
> >   		guest_cpu_cap_set(vcpu, x86_feature);
> > +	else
> > +		guest_cpu_cap_clear(vcpu, x86_feature);
> > +}
> 
> I don't see any necessity to add 3 functions, i.e., guest_cpu_cap_{set, clear, change}, for

I want to have equivalents to the cpuid_entry_*() APIs so that we don't end up
with two different sets of names.  And the clear() API already has a second user.

> guest_cpu_cap update. IMHO one function is enough, e.g,:

Hrm, I open coded the OR/AND logic in cpuid_entry_change() to try to force CMOV
instead of Jcc.  That honestly seems like a pointless optimization.  I would
rather use the helpers, which is less code.
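
For reference, the open coded logic in question looks something like this
(paraphrasing cpuid.h):

	static __always_inline void cpuid_entry_change(struct kvm_cpuid_entry2 *entry,
						       unsigned int x86_feature,
						       bool set)
	{
		u32 *reg = cpuid_entry_get_reg(entry, x86_feature);

		/*
		 * Open coded instead of using cpuid_entry_{clear,set}() to
		 * coerce the compiler into using CMOV instead of Jcc when
		 * possible.
		 */
		if (set)
			*reg |= __feature_bit(x86_feature);
		else
			*reg &= ~__feature_bit(x86_feature);
	}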

> static __always_inline void guest_cpu_cap_update(struct kvm_vcpu *vcpu,
>                                                  unsigned int x86_feature,
>                                                  bool guest_has_cap)
> {
>         unsigned int x86_leaf = __feature_leaf(x86_feature);
> 
>         reverse_cpuid_check(x86_leaf);
>         if (guest_has_cap)
>                 vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
>         else
>                 vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
> }
> 
> > +
> > +static __always_inline void guest_cpu_cap_restrict(struct kvm_vcpu *vcpu,
> > +						   unsigned int x86_feature)
> > +{
> > +	if (!kvm_cpu_cap_has(x86_feature))
> > +		guest_cpu_cap_clear(vcpu, x86_feature);
> >   }
> 
> _restrict is not clear to me as to what the function actually does -- it
> conditionally clears a guest cap depending on KVM's support for the feature.
> 
> How about renaming it to guest_cpu_cap_sync()?

"sync" isn't correct because it's not synchronizing with KVM's capabilitiy, e.g.
the guest capability will remaing unset if the guest CPUID bit is clear but the
KVM capability is available.
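
To illustrate the difference (rough sketch, not a proposal):

	/* restrict() only ever clears the cap: */
	if (!kvm_cpu_cap_has(x86_feature))
		guest_cpu_cap_clear(vcpu, x86_feature);

	/* a true sync() would also _set_ the cap whenever KVM supports the
	 * feature, regardless of guest CPUID: */
	guest_cpu_cap_change(vcpu, x86_feature, kvm_cpu_cap_has(x86_feature));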

How about constrain()?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-15 15:09         ` Sean Christopherson
@ 2023-11-17  1:28           ` Robert Hoo
  0 siblings, 0 replies; 40+ messages in thread
From: Robert Hoo @ 2023-11-17  1:28 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Maxim Levitsky

On 11/15/2023 11:09 PM, Sean Christopherson wrote:
...
>>> No, because then every caller would need extra code to pass
>>> vcpu->cpu_caps,
>>
>> Emm, I don't understand this. I tried modifying and compiling it; all that
>> needs to be done is to simply substitute "vcpu" with "vcpu->arch.cpu_caps" at
>> the call sites. (at the end is my diff based on this patch set)
> 
> Yes, and I'm saying that
> 
> 	guest_cpu_cap_restrict(vcpu, X86_FEATURE_PAUSEFILTER);
> 	guest_cpu_cap_restrict(vcpu, X86_FEATURE_PFTHRESHOLD);
> 	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VGIF);
> 	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VNMI);
> 
> is easier to read and write than this
> 
> 	guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_PAUSEFILTER);
> 	guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_PFTHRESHOLD);
> 	guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_VGIF);
> 	guest_cpu_cap_restrict(vcpu->arch.cpu_caps, X86_FEATURE_VNMI);
> 
> a one-time search-replace is easy, but the extra boilerplate has a non-zero cost
> for every future developer/reader.

Hmm, I think this is trivial, and it can be solved/eased by other means, e.g. a
macro, rather than at the cost of letting the function's internals (easily)
access info they shouldn't.
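
Something like this (hypothetical name, just to illustrate) would hide the
boilerplate at the call sites:

	#define vcpu_cap_restrict(vcpu, feat) \
		guest_cpu_cap_restrict((vcpu)->arch.cpu_caps, feat)
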
> 
>>> and passing 'u32 *' provides less type safety than 'struct kvm_vcpu *'.
>>> That tradeoff isn't worth making this one path slightly easier to read.
>>
>> My point is also from a vulnerability perspective, long term, since as a
>> principle, we'd better pass in to a function only the params/info it needs.
> 
> Attempting to apply the principle of least privilege to low level C helpers is
> nonsensical.  E.g. the helper can trivially get at the owning vcpu via container_of()
> (well, if not for typeof assertions not playing nice with arrays, but open coding
> container_of() is also trivial and illustrates the point).
> 
> 	struct kvm_vcpu_arch *arch = (void *)caps - offsetof(struct kvm_vcpu_arch, cpu_caps);
> 	struct kvm_vcpu *vcpu = container_of(arch, struct kvm_vcpu, arch);
> 
> 	if (!kvm_cpu_cap_has(x86_feature))
> 		guest_cpu_cap_clear(vcpu, x86_feature);
> 
> And the intent behind that principle is to improve security/robustness; what I'm
> saying is that passing in a 'u32 *' makes the overall implementation _less_ robust,
> as it opens up the possibility of passing in an unsafe/incorrect pointer.  E.g.
> a well-intentioned, not _that_ obviously broken example is:
> 
> 	guest_cpu_cap_restrict(&vcpu->arch.cpu_caps[CPUID_1_ECX], X86_FEATURE_XSAVE);
> 
>> e.g. cpuid_entry2_find().
> 
> The main reason cpuid_entry2_find() exists is because KVM checks the incoming
> array provided by KVM_SET_CPUID2, which is also the reason why
> __kvm_update_cpuid_runtime() takes an @entries array instead of just @vcpu.

Thanks for the detailed explanation, I understand your points more deeply now,
though I would still prefer to honor the principle if it were me writing the
function. The concerns above can/should be addressed by other means. (If some
really cannot be solved in C, i.e. more stringent type checking, that's C to
blame ;) but on the other side it offers flexibility that other languages
cannot, doesn't it?)
Another pro of the principle is that it's also a fence, preventing (or at
least raising the bar for) people in the future from doing something that
shouldn't be done in the function, e.g. for their convenience when quickly
fixing a bug, etc.

Anyway, it's a dilemma, and as I said, it's a less important point compared to
this great progress on the guest cpu_caps implementation, thanks.

Reviewed-by: Robert Hoo <robert.hoo.linux@gmail.com>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2023-11-16 22:29     ` Sean Christopherson
@ 2023-11-17  8:33       ` Yang, Weijiang
  2023-11-21  3:10         ` Yuan Yao
  0 siblings, 1 reply; 40+ messages in thread
From: Yang, Weijiang @ 2023-11-17  8:33 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Maxim Levitsky

On 11/17/2023 6:29 AM, Sean Christopherson wrote:
> On Thu, Nov 16, 2023, Weijiang Yang wrote:
>> On 11/11/2023 7:55 AM, Sean Christopherson wrote:
>>
>> [...]
>>
>>> -static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
>>> -							unsigned int x86_feature)
>>> +static __always_inline void guest_cpu_cap_clear(struct kvm_vcpu *vcpu,
>>> +						unsigned int x86_feature)
>>>    {
>>> -	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
>>> +	unsigned int x86_leaf = __feature_leaf(x86_feature);
>>> +
>>> +	reverse_cpuid_check(x86_leaf);
>>> +	vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
>>> +}
>>> +
>>> +static __always_inline void guest_cpu_cap_change(struct kvm_vcpu *vcpu,
>>> +						 unsigned int x86_feature,
>>> +						 bool guest_has_cap)
>>> +{
>>> +	if (guest_has_cap)
>>>    		guest_cpu_cap_set(vcpu, x86_feature);
>>> +	else
>>> +		guest_cpu_cap_clear(vcpu, x86_feature);
>>> +}
>> I don't see any necessity to add 3 functions, i.e., guest_cpu_cap_{set, clear, change}, for
> I want to have equivalents to the cpuid_entry_*() APIs so that we don't end up
> with two different sets of names.  And the clear() API already has a second user.
>
>> guest_cpu_cap update. IMHO one function is enough, e.g.:
> Hrm, I open coded the OR/AND logic in cpuid_entry_change() to try to force CMOV
> instead of Jcc.  That honestly seems like a pointless optimization.  I would
> rather use the helpers, which is less code.
>
>> static __always_inline void guest_cpu_cap_update(struct kvm_vcpu *vcpu,
>>                                                  unsigned int x86_feature,
>>                                                  bool guest_has_cap)
>> {
>>         unsigned int x86_leaf = __feature_leaf(x86_feature);
>>
>>         reverse_cpuid_check(x86_leaf);
>>         if (guest_has_cap)
>>                 vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
>>         else
>>                 vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
>> }
>>
>>> +
>>> +static __always_inline void guest_cpu_cap_restrict(struct kvm_vcpu *vcpu,
>>> +						   unsigned int x86_feature)
>>> +{
>>> +	if (!kvm_cpu_cap_has(x86_feature))
>>> +		guest_cpu_cap_clear(vcpu, x86_feature);
>>>    }
>> _restrict is not clear to me as to what the function actually does -- it
>> conditionally clears a guest cap depending on KVM's support for the feature.
>>
>> How about renaming it to guest_cpu_cap_sync()?
> "sync" isn't correct because it's not synchronizing with KVM's capabilitiy, e.g.
> the guest capability will remaing unset if the guest CPUID bit is clear but the
> KVM capability is available.
>
> How about constrain()?
I don't know, I just feel that since we already have guest_cpu_cap_{set, clear,
change}, no name here can exactly match the behavior of the function; maybe
guest_cpu_cap_filter()? But just ignore the nit, it's up to you to decide the
name :-)


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap"
  2023-11-10 23:55 ` [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
@ 2023-11-19 17:08   ` Maxim Levitsky
  2023-11-21  3:20   ` Chao Gao
  1 sibling, 0 replies; 40+ messages in thread
From: Maxim Levitsky @ 2023-11-19 17:08 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel

On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> As the first step toward replacing KVM's so-called "governed features"
> framework with a more comprehensive, less poorly named implementation,
> replace the "kvm_governed_feature" function prefix with "guest_cpu_cap"
> and rename guest_can_use() to guest_cpu_cap_has().
> 
> The "guest_cpu_cap" naming scheme mirrors that of "kvm_cpu_cap", and
> provides a more clear distinction between guest capabilities, which are
> KVM controlled (heh, or one might say "governed"), and guest CPUID, which
> with few exceptions is fully userspace controlled.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c      |  2 +-
>  arch/x86/kvm/cpuid.h      | 12 ++++++------
>  arch/x86/kvm/mmu/mmu.c    |  4 ++--
>  arch/x86/kvm/svm/nested.c | 22 +++++++++++-----------
>  arch/x86/kvm/svm/svm.c    | 26 +++++++++++++-------------
>  arch/x86/kvm/svm/svm.h    |  4 ++--
>  arch/x86/kvm/vmx/nested.c |  6 +++---
>  arch/x86/kvm/vmx/vmx.c    | 14 +++++++-------
>  arch/x86/kvm/x86.c        |  4 ++--
>  9 files changed, 47 insertions(+), 47 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index dda6fc4cfae8..4f464187b063 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -345,7 +345,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
>  				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
>  	if (allow_gbpages)
> -		kvm_governed_feature_set(vcpu, X86_FEATURE_GBPAGES);
> +		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
>  
>  	best = kvm_find_cpuid_entry(vcpu, 1);
>  	if (best && apic) {
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index 0b90532b6e26..245416ffa34c 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -254,7 +254,7 @@ static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
>  	return kvm_governed_feature_index(x86_feature) >= 0;
>  }
>  
> -static __always_inline void kvm_governed_feature_set(struct kvm_vcpu *vcpu,
> +static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
>  						     unsigned int x86_feature)
>  {
>  	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
> @@ -263,15 +263,15 @@ static __always_inline void kvm_governed_feature_set(struct kvm_vcpu *vcpu,
>  		  vcpu->arch.governed_features.enabled);
>  }
>  
> -static __always_inline void kvm_governed_feature_check_and_set(struct kvm_vcpu *vcpu,
> -							       unsigned int x86_feature)
> +static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
> +							unsigned int x86_feature)
>  {
>  	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
> -		kvm_governed_feature_set(vcpu, x86_feature);
> +		guest_cpu_cap_set(vcpu, x86_feature);
>  }
>  
> -static __always_inline bool guest_can_use(struct kvm_vcpu *vcpu,
> -					  unsigned int x86_feature)
> +static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
> +					      unsigned int x86_feature)
>  {
>  	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
>  
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index b0f01d605617..cfed824587b9 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4801,7 +4801,7 @@ static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
>  	__reset_rsvds_bits_mask(&context->guest_rsvd_check,
>  				vcpu->arch.reserved_gpa_bits,
>  				context->cpu_role.base.level, is_efer_nx(context),
> -				guest_can_use(vcpu, X86_FEATURE_GBPAGES),
> +				guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES),
>  				is_cr4_pse(context),
>  				guest_cpuid_is_amd_or_hygon(vcpu));
>  }
> @@ -4878,7 +4878,7 @@ static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
>  	__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
>  				context->root_role.level,
>  				context->root_role.efer_nx,
> -				guest_can_use(vcpu, X86_FEATURE_GBPAGES),
> +				guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES),
>  				is_pse, is_amd);
>  
>  	if (!shadow_me_mask)
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 3fea8c47679e..ea0895262b12 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -107,7 +107,7 @@ static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu)
>  
>  static bool nested_vmcb_needs_vls_intercept(struct vcpu_svm *svm)
>  {
> -	if (!guest_can_use(&svm->vcpu, X86_FEATURE_V_VMSAVE_VMLOAD))
> +	if (!guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_V_VMSAVE_VMLOAD))
>  		return true;
>  
>  	if (!nested_npt_enabled(svm))
> @@ -603,7 +603,7 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
>  		vmcb_mark_dirty(vmcb02, VMCB_DR);
>  	}
>  
> -	if (unlikely(guest_can_use(vcpu, X86_FEATURE_LBRV) &&
> +	if (unlikely(guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
>  		     (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
>  		/*
>  		 * Reserved bits of DEBUGCTL are ignored.  Be consistent with
> @@ -660,7 +660,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
>  	 * exit_int_info, exit_int_info_err, next_rip, insn_len, insn_bytes.
>  	 */
>  
> -	if (guest_can_use(vcpu, X86_FEATURE_VGIF) &&
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VGIF) &&
>  	    (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK))
>  		int_ctl_vmcb12_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK);
>  	else
> @@ -698,7 +698,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
>  
>  	vmcb02->control.tsc_offset = vcpu->arch.tsc_offset;
>  
> -	if (guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR) &&
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR) &&
>  	    svm->tsc_ratio_msr != kvm_caps.default_tsc_scaling_ratio)
>  		nested_svm_update_tsc_ratio_msr(vcpu);
>  
> @@ -719,7 +719,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
>  	 * what a nrips=0 CPU would do (L1 is responsible for advancing RIP
>  	 * prior to injecting the event).
>  	 */
> -	if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
>  		vmcb02->control.next_rip    = svm->nested.ctl.next_rip;
>  	else if (boot_cpu_has(X86_FEATURE_NRIPS))
>  		vmcb02->control.next_rip    = vmcb12_rip;
> @@ -729,7 +729,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
>  		svm->soft_int_injected = true;
>  		svm->soft_int_csbase = vmcb12_csbase;
>  		svm->soft_int_old_rip = vmcb12_rip;
> -		if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
> +		if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
>  			svm->soft_int_next_rip = svm->nested.ctl.next_rip;
>  		else
>  			svm->soft_int_next_rip = vmcb12_rip;
> @@ -737,18 +737,18 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
>  
>  	vmcb02->control.virt_ext            = vmcb01->control.virt_ext &
>  					      LBR_CTL_ENABLE_MASK;
> -	if (guest_can_use(vcpu, X86_FEATURE_LBRV))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV))
>  		vmcb02->control.virt_ext  |=
>  			(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK);
>  
>  	if (!nested_vmcb_needs_vls_intercept(svm))
>  		vmcb02->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
>  
> -	if (guest_can_use(vcpu, X86_FEATURE_PAUSEFILTER))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_PAUSEFILTER))
>  		pause_count12 = svm->nested.ctl.pause_filter_count;
>  	else
>  		pause_count12 = 0;
> -	if (guest_can_use(vcpu, X86_FEATURE_PFTHRESHOLD))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_PFTHRESHOLD))
>  		pause_thresh12 = svm->nested.ctl.pause_filter_thresh;
>  	else
>  		pause_thresh12 = 0;
> @@ -1035,7 +1035,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>  	if (vmcb12->control.exit_code != SVM_EXIT_ERR)
>  		nested_save_pending_event_to_vmcb12(svm, vmcb12);
>  
> -	if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
>  		vmcb12->control.next_rip  = vmcb02->control.next_rip;
>  
>  	vmcb12->control.int_ctl           = svm->nested.ctl.int_ctl;
> @@ -1074,7 +1074,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>  	if (!nested_exit_on_intr(svm))
>  		kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
>  
> -	if (unlikely(guest_can_use(vcpu, X86_FEATURE_LBRV) &&
> +	if (unlikely(guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
>  		     (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
>  		svm_copy_lbrs(vmcb12, vmcb02);
>  		svm_update_lbrv(vcpu);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 1855a6d7c976..8a99a73b6ee5 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1046,7 +1046,7 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  	bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
>  	bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
> -			    (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
> +			    (is_guest_mode(vcpu) && guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
>  			    (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
>  
>  	if (enable_lbrv == current_enable_lbrv)
> @@ -2835,7 +2835,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	switch (msr_info->index) {
>  	case MSR_AMD64_TSC_RATIO:
>  		if (!msr_info->host_initiated &&
> -		    !guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR))
>  			return 1;
>  		msr_info->data = svm->tsc_ratio_msr;
>  		break;
> @@ -2985,7 +2985,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  	switch (ecx) {
>  	case MSR_AMD64_TSC_RATIO:
>  
> -		if (!guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR)) {
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR)) {
>  
>  			if (!msr->host_initiated)
>  				return 1;
> @@ -3007,7 +3007,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  
>  		svm->tsc_ratio_msr = data;
>  
> -		if (guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR) &&
> +		if (guest_cpu_cap_has(vcpu, X86_FEATURE_TSCRATEMSR) &&
>  		    is_guest_mode(vcpu))
>  			nested_svm_update_tsc_ratio_msr(vcpu);
>  
> @@ -4318,11 +4318,11 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
>  	    boot_cpu_has(X86_FEATURE_XSAVES) &&
>  	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> -		kvm_governed_feature_set(vcpu, X86_FEATURE_XSAVES);
> +		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
>  
> -	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_NRIPS);
> -	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
> -	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_LBRV);
> +	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
> +	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
> +	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
>  
>  	/*
>  	 * Intercept VMLOAD if the vCPU mode is Intel in order to emulate that
> @@ -4330,12 +4330,12 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * SVM on Intel is bonkers and extremely unlikely to work).
>  	 */
>  	if (!guest_cpuid_is_intel(vcpu))
> -		kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
> +		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
>  
> -	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
> -	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
> -	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VGIF);
> -	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VNMI);
> +	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
> +	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
> +	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VGIF);
> +	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VNMI);
>  
>  	svm_recalc_instruction_intercepts(vcpu, svm);
>  
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index be67ab7fdd10..e49af42b4a33 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -443,7 +443,7 @@ static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
>  
>  static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
>  {
> -	return guest_can_use(&svm->vcpu, X86_FEATURE_VGIF) &&
> +	return guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_VGIF) &&
>  	       (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
>  }
>  
> @@ -495,7 +495,7 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
>  
>  static inline bool nested_vnmi_enabled(struct vcpu_svm *svm)
>  {
> -	return guest_can_use(&svm->vcpu, X86_FEATURE_VNMI) &&
> +	return guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_VNMI) &&
>  	       (svm->nested.ctl.int_ctl & V_NMI_ENABLE_MASK);
>  }
>  
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index c5ec0ef51ff7..4750d1696d58 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6426,7 +6426,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
>  	vmx = to_vmx(vcpu);
>  	vmcs12 = get_vmcs12(vcpu);
>  
> -	if (guest_can_use(vcpu, X86_FEATURE_VMX) &&
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VMX) &&
>  	    (vmx->nested.vmxon || vmx->nested.smm.vmxon)) {
>  		kvm_state.hdr.vmx.vmxon_pa = vmx->nested.vmxon_ptr;
>  		kvm_state.hdr.vmx.vmcs12_pa = vmx->nested.current_vmptr;
> @@ -6567,7 +6567,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
>  		if (kvm_state->flags & ~KVM_STATE_NESTED_EVMCS)
>  			return -EINVAL;
>  	} else {
> -		if (!guest_can_use(vcpu, X86_FEATURE_VMX))
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
>  			return -EINVAL;
>  
>  		if (!page_address_valid(vcpu, kvm_state->hdr.vmx.vmxon_pa))
> @@ -6601,7 +6601,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
>  		return -EINVAL;
>  
>  	if ((kvm_state->flags & KVM_STATE_NESTED_EVMCS) &&
> -	    (!guest_can_use(vcpu, X86_FEATURE_VMX) ||
> +	    (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX) ||
>  	     !vmx->nested.enlightened_vmcs_enabled))
>  			return -EINVAL;
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index be20a60047b1..6328f0d47c64 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2050,7 +2050,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  			[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
>  		break;
>  	case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
> -		if (!guest_can_use(vcpu, X86_FEATURE_VMX))
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
>  			return 1;
>  		if (vmx_get_vmx_msr(&vmx->nested.msrs, msr_info->index,
>  				    &msr_info->data))
> @@ -2358,7 +2358,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
>  		if (!msr_info->host_initiated)
>  			return 1; /* they are read-only */
> -		if (!guest_can_use(vcpu, X86_FEATURE_VMX))
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
>  			return 1;
>  		return vmx_set_vmx_msr(vcpu, msr_index, data);
>  	case MSR_IA32_RTIT_CTL:
> @@ -4567,7 +4567,7 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
>  												\
>  	if (cpu_has_vmx_##name()) {								\
>  		if (kvm_is_governed_feature(X86_FEATURE_##feat_name))				\
> -			__enabled = guest_can_use(__vcpu, X86_FEATURE_##feat_name);		\
> +			__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);		\
>  		else										\
>  			__enabled = guest_cpuid_has(__vcpu, X86_FEATURE_##feat_name);		\
>  		vmx_adjust_secondary_exec_control(vmx, exec_control, SECONDARY_EXEC_##ctrl_name,\
> @@ -7757,9 +7757,9 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 */
>  	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
>  	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> -		kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_XSAVES);
> +		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_XSAVES);
>  
> -	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VMX);
> +	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VMX);
>  
>  	vmx_setup_uret_msrs(vmx);
>  
> @@ -7767,7 +7767,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  		vmcs_set_secondary_exec_control(vmx,
>  						vmx_secondary_exec_control(vmx));
>  
> -	if (guest_can_use(vcpu, X86_FEATURE_VMX))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
>  		vmx->msr_ia32_feature_control_valid_bits |=
>  			FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
>  			FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
> @@ -7776,7 +7776,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  			~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
>  			  FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
>  
> -	if (guest_can_use(vcpu, X86_FEATURE_VMX))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_VMX))
>  		nested_vmx_cr_fixed1_bits_update(vcpu);
>  
>  	if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2c924075f6f1..04a77b764a36 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1025,7 +1025,7 @@ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
>  		if (vcpu->arch.xcr0 != host_xcr0)
>  			xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
>  
> -		if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
> +		if (guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES) &&
>  		    vcpu->arch.ia32_xss != host_xss)
>  			wrmsrl(MSR_IA32_XSS, vcpu->arch.ia32_xss);
>  	}
> @@ -1056,7 +1056,7 @@ void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
>  		if (vcpu->arch.xcr0 != host_xcr0)
>  			xsetbv(XCR_XFEATURE_ENABLED_MASK, host_xcr0);
>  
> -		if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
> +		if (guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES) &&
>  		    vcpu->arch.ia32_xss != host_xss)
>  			wrmsrl(MSR_IA32_XSS, host_xss);
>  	}


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/9] KVM: x86: Replace guts of "goverened" features with comprehensive cpu_caps
  2023-11-10 23:55 ` [PATCH 2/9] KVM: x86: Replace guts of "goverened" features with comprehensive cpu_caps Sean Christopherson
  2023-11-14  9:12   ` Binbin Wu
@ 2023-11-19 17:22   ` Maxim Levitsky
  2023-11-28  1:24     ` Sean Christopherson
  1 sibling, 1 reply; 40+ messages in thread
From: Maxim Levitsky @ 2023-11-19 17:22 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel

On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> Replace the internals of the governed features framework with a more
> comprehensive "guest CPU capabilities" implementation, i.e. with a guest
> version of kvm_cpu_caps.  Keep the skeleton of governed features around
> for now as vmx_adjust_sec_exec_control() relies on detecting governed
> features to do the right thing for XSAVES, and switching all guest feature
> queries to guest_cpu_cap_has() requires subtle and non-trivial changes,
> i.e. is best done as a standalone change.
> 
> Tracking *all* guest capabilities that KVM cares about will allow excising the
> poorly named "governed features" framework, and effectively optimizes all
> KVM queries of guest capabilities, i.e. doesn't require making a
> subjective decision as to whether or not a feature is worth "governing",
> and doesn't require adding the code to do so.
> 
> The cost of tracking all features is currently 92 bytes per vCPU on 64-bit
> kernels: 100 bytes for cpu_caps versus 8 bytes for governed_features.
> That cost is well worth paying even if the only benefit was eliminating
> the "governed features" terminology.  And practically speaking, the real
> cost is zero unless those 92 bytes pushes the size of vcpu_vmx or vcpu_svm
> into a new order-N allocation, and if that happens there are better ways
> to reduce the footprint of kvm_vcpu_arch, e.g. making the PMU and/or MTRR
> state separate allocations.
> 
> Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 40 ++++++++++++++++++++-------------
>  arch/x86/kvm/cpuid.c            |  4 +---
>  arch/x86/kvm/cpuid.h            | 14 ++++++------
>  arch/x86/kvm/reverse_cpuid.h    | 15 -------------
>  4 files changed, 32 insertions(+), 41 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d7036982332e..1d43dd5fdea7 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -722,6 +722,22 @@ struct kvm_queued_exception {
>  	bool has_payload;
>  };
>  
> +/*
> + * Hardware-defined CPUID leafs that are either scattered by the kernel or are
> + * unknown to the kernel, but need to be directly used by KVM.  Note, these
> + * word values conflict with the kernel's "bug" caps, but KVM doesn't use those.
> + */
> +enum kvm_only_cpuid_leafs {
> +	CPUID_12_EAX	 = NCAPINTS,
> +	CPUID_7_1_EDX,
> +	CPUID_8000_0007_EDX,
> +	CPUID_8000_0022_EAX,
> +	NR_KVM_CPU_CAPS,
> +
> +	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
> +};
> +
> +
>  struct kvm_vcpu_arch {
>  	/*
>  	 * rip and regs accesses must go through
> @@ -840,23 +856,15 @@ struct kvm_vcpu_arch {
>  	struct kvm_hypervisor_cpuid kvm_cpuid;
>  
>  	/*
> -	 * FIXME: Drop this macro and use KVM_NR_GOVERNED_FEATURES directly
> -	 * when "struct kvm_vcpu_arch" is no longer defined in an
> -	 * arch/x86/include/asm header.  The max is mostly arbitrary, i.e.
> -	 * can be increased as necessary.
> +	 * Track the effective guest capabilities, i.e. the features the vCPU
> +	 * is allowed to use.  Typically, but not always, features can be used
> +	 * by the guest if and only if both KVM and userspace want to expose
> +	 * the feature to the guest.  A common exception is for virtualization
> +	 * holes, i.e. when KVM can't prevent the guest from using a feature,
> +	 * in which case the vCPU "has" the feature regardless of what KVM or
> +	 * userspace desires.
>  	 */
> -#define KVM_MAX_NR_GOVERNED_FEATURES BITS_PER_LONG
> -
> -	/*
> -	 * Track whether or not the guest is allowed to use features that are
> -	 * governed by KVM, where "governed" means KVM needs to manage state
> -	 * and/or explicitly enable the feature in hardware.  Typically, but
> -	 * not always, governed features can be used by the guest if and only
> -	 * if both KVM and userspace want to expose the feature to the guest.
> -	 */
> -	struct {
> -		DECLARE_BITMAP(enabled, KVM_MAX_NR_GOVERNED_FEATURES);
> -	} governed_features;
> +	u32 cpu_caps[NR_KVM_CPU_CAPS];

Won't it be better to call this 'effective_cpu_caps' or something like that,
to put emphasis on the fact that these are not exactly the cpu caps that userspace wants?
Although probably any name will still be somewhat confusing.

>  
>  	u64 reserved_gpa_bits;
>  	int maxphyaddr;
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 4f464187b063..4bf3c2d4dc7c 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -327,9 +327,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	struct kvm_cpuid_entry2 *best;
>  	bool allow_gbpages;
>  
> -	BUILD_BUG_ON(KVM_NR_GOVERNED_FEATURES > KVM_MAX_NR_GOVERNED_FEATURES);
> -	bitmap_zero(vcpu->arch.governed_features.enabled,
> -		    KVM_MAX_NR_GOVERNED_FEATURES);
> +	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
>  
>  	/*
>  	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index 245416ffa34c..9f18c4395b71 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -255,12 +255,12 @@ static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
>  }
>  
>  static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
> -						     unsigned int x86_feature)
> +					      unsigned int x86_feature)
>  {
> -	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
> +	unsigned int x86_leaf = __feature_leaf(x86_feature);
>  
> -	__set_bit(kvm_governed_feature_index(x86_feature),
> -		  vcpu->arch.governed_features.enabled);
> +	reverse_cpuid_check(x86_leaf);
> +	vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
>  }
>  
>  static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
> @@ -273,10 +273,10 @@ static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
>  static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
>  					      unsigned int x86_feature)
>  {
> -	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
> +	unsigned int x86_leaf = __feature_leaf(x86_feature);
>  
> -	return test_bit(kvm_governed_feature_index(x86_feature),
> -			vcpu->arch.governed_features.enabled);
> +	reverse_cpuid_check(x86_leaf);
> +	return vcpu->arch.cpu_caps[x86_leaf] & __feature_bit(x86_feature);
>  }

It might make sense to think about extracting the common code between
kvm_cpu_cap* and guest_cpu_cap*.

The whole notion of reverse CPUID, KVM-only leafs, and the other nice things
it has is already very confusing, but as I understand it there is
no better way of doing it.
But there must be a way to at least avoid duplicating this logic.
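
E.g. (a rough, untested sketch) both families could bottom out in a single
helper that operates on a raw cap array:

	static __always_inline void __cpu_cap_set(u32 *caps,
						  unsigned int x86_feature)
	{
		unsigned int x86_leaf = __feature_leaf(x86_feature);

		reverse_cpuid_check(x86_leaf);
		caps[x86_leaf] |= __feature_bit(x86_feature);
	}

with kvm_cpu_cap_set() passing kvm_cpu_caps and guest_cpu_cap_set() passing
vcpu->arch.cpu_caps.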

Also speaking of this logic, it would be nice to document it.
E.g. for 'kvm_only_cpuid_leafs' it would be nice to have an explanation
for each entry on why it is needed.


Just curious: I wonder why Intel called them leaves?
CPUID leaves are just table entries, I don't see any tree there.

Finally, isn't the plural of "leaf" "leaves"?

>  
>  #endif
> diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
> index b81650678375..4b658491e8f8 100644
> --- a/arch/x86/kvm/reverse_cpuid.h
> +++ b/arch/x86/kvm/reverse_cpuid.h
> @@ -6,21 +6,6 @@
>  #include <asm/cpufeature.h>
>  #include <asm/cpufeatures.h>
>  
> -/*
> - * Hardware-defined CPUID leafs that are either scattered by the kernel or are
> - * unknown to the kernel, but need to be directly used by KVM.  Note, these
> - * word values conflict with the kernel's "bug" caps, but KVM doesn't use those.
> - */
> -enum kvm_only_cpuid_leafs {
> -	CPUID_12_EAX	 = NCAPINTS,
> -	CPUID_7_1_EDX,
> -	CPUID_8000_0007_EDX,
> -	CPUID_8000_0022_EAX,
> -	NR_KVM_CPU_CAPS,
> -
> -	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
> -};
> -
>  /*
>   * Define a KVM-only feature flag.
>   *

Best regards,
	Maxim Levitsky




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2023-11-10 23:55 ` [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID Sean Christopherson
  2023-11-16  3:16   ` Yang, Weijiang
@ 2023-11-19 17:32   ` Maxim Levitsky
  2023-12-01  1:51     ` Sean Christopherson
  1 sibling, 1 reply; 40+ messages in thread
From: Maxim Levitsky @ 2023-11-19 17:32 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel

On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> Initialize a vCPU's capabilities based on the guest CPUID provided by
> userspace instead of simply zeroing the entire array.  This will allow
> using cpu_caps to query *all* CPUID-based guest capabilities, i.e. will
> allow converting all usage of guest_cpuid_has() to guest_cpu_cap_has().
> 
> Zeroing the array was the logical choice when using cpu_caps was opt-in,
> e.g. "unsupported" was generally a safer default, and the whole point of
> governed features is that KVM would need to check host and guest support,
> i.e. making everything unsupported by default didn't require more code.
> 
> But requiring KVM to manually "enable" every CPUID-based feature in
> cpu_caps would require an absurd amount of boilerplate code.
> 
> Follow existing CPUID/kvm_cpu_caps nomenclature where possible, e.g. for
> the change() and clear() APIs.  Replace check_and_set() with restrict() to
> try and capture that KVM is restricting userspace's desired guest feature
> set based on KVM's capabilities.
> 
> This is intended to be a gigantic nop, i.e. should not have any impact on
> guest or KVM functionality.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c   | 43 +++++++++++++++++++++++++++++++++++++++---
>  arch/x86/kvm/cpuid.h   | 25 +++++++++++++++++++++---
>  arch/x86/kvm/svm/svm.c | 24 +++++++++++------------
>  arch/x86/kvm/vmx/vmx.c |  6 ++++--
>  4 files changed, 78 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 4bf3c2d4dc7c..5cf3d697ecb3 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -321,13 +321,51 @@ static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent)
>  	return entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX;
>  }
>  
> +/*
> + * This isn't truly "unsafe", but all callers except kvm_cpu_after_set_cpuid()
> + * should use __cpuid_entry_get_reg(), which provides compile-time validation
> + * of the input.
> + */
> +static u32 cpuid_get_reg_unsafe(struct kvm_cpuid_entry2 *entry, u32 reg)
> +{
> +	switch (reg) {
> +	case CPUID_EAX:
> +		return entry->eax;
> +	case CPUID_EBX:
> +		return entry->ebx;
> +	case CPUID_ECX:
> +		return entry->ecx;
> +	case CPUID_EDX:
> +		return entry->edx;
> +	default:
> +		WARN_ON_ONCE(1);
> +		return 0;
> +	}
> +}



> +
>  static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_lapic *apic = vcpu->arch.apic;
>  	struct kvm_cpuid_entry2 *best;
>  	bool allow_gbpages;
> +	int i;
>  
> -	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
> +	BUILD_BUG_ON(ARRAY_SIZE(reverse_cpuid) != NR_KVM_CPU_CAPS);
> +
> +	/*
> +	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
> +	 * honor userspace's definition for features that don't require KVM or
> +	 * hardware management/support (or that KVM simply doesn't care about).
> +	 */
> +	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
> +		const struct cpuid_reg cpuid = reverse_cpuid[i];
> +
> +		best = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
> +		if (best)
> +			vcpu->arch.cpu_caps[i] = cpuid_get_reg_unsafe(best, cpuid.reg);

Why not just use __cpuid_entry_get_reg? 

cpuid.reg comes from the read-only 'reverse_cpuid' anyway, and in fact it
seems that all callers of __cpuid_entry_get_reg() take the reg value from
x86_feature_cpuid(), which also takes it from 'reverse_cpuid'.

So if the compiler is smart enough not to complain in those cases, I don't
see why this case is different.


Also, why not initialize guest_caps = host_caps & userspace_cpuid?

If this were the default, we wouldn't need guest_cpu_cap_restrict() and such;
instead it would just work.

Special code would only be needed in a few more complex cases, like forced
exposure of a feature to a guest due to a virtualization hole (see the sketch
below).
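
As a rough untested sketch of the idea, in kvm_vcpu_after_set_cpuid():

	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
		const struct cpuid_reg cpuid = reverse_cpuid[i];

		best = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
		vcpu->arch.cpu_caps[i] = kvm_cpu_caps[i] &
					 (best ? cpuid_get_reg_unsafe(best, cpuid.reg) : 0);
	}

with the few "virtualization hole" features explicitly OR'ed back in afterwards.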


> +		else
> +			vcpu->arch.cpu_caps[i] = 0;
> +	}
>  
>  	/*
>  	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
> @@ -342,8 +380,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 */
>  	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
>  				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
> -	if (allow_gbpages)
> -		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
> +	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);

IMHO the original code was more readable; now I need to look up
guest_cpu_cap_change() to understand what is going on.
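
I.e. I'd rather keep the original form that this hunk deletes:

	if (allow_gbpages)
		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);

which reads naturally without having to know what the new helper does.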

>  
>  	best = kvm_find_cpuid_entry(vcpu, 1);
>  	if (best && apic) {
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index 9f18c4395b71..1707ef10b269 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -263,11 +263,30 @@ static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
>  	vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
>  }
>  
> -static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
> -							unsigned int x86_feature)
> +static __always_inline void guest_cpu_cap_clear(struct kvm_vcpu *vcpu,
> +						unsigned int x86_feature)
>  {
> -	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
> +	unsigned int x86_leaf = __feature_leaf(x86_feature);
> +
> +	reverse_cpuid_check(x86_leaf);
> +	vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
> +}
> +
> +static __always_inline void guest_cpu_cap_change(struct kvm_vcpu *vcpu,
> +						 unsigned int x86_feature,
> +						 bool guest_has_cap)
> +{
> +	if (guest_has_cap)
>  		guest_cpu_cap_set(vcpu, x86_feature);
> +	else
> +		guest_cpu_cap_clear(vcpu, x86_feature);
> +}

Let's not have this function; it's just not worth it, IMHO.

> +
> +static __always_inline void guest_cpu_cap_restrict(struct kvm_vcpu *vcpu,
> +						   unsigned int x86_feature)
> +{
> +	if (!kvm_cpu_cap_has(x86_feature))
> +		guest_cpu_cap_clear(vcpu, x86_feature);
>  }

The purpose of this function is also very hard to decipher.

If we initialized guest_caps = host_caps & guest_cpuid, then we wouldn't
need this function (see the sketch earlier in this reply).

>  
>  static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 8a99a73b6ee5..5827328e30f1 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4315,14 +4315,14 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
>  	 * the guest read/write access to the host's XSS.
>  	 */
> -	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
> -	    boot_cpu_has(X86_FEATURE_XSAVES) &&
> -	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> -		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
> +	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
> +			     boot_cpu_has(X86_FEATURE_XSAVE) &&
> +			     boot_cpu_has(X86_FEATURE_XSAVES) &&
> +			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));

In theory this change does change behavior: now X86_FEATURE_XSAVES will be
set iff the condition is true, whereas before it was only set *if* the
condition was true, i.e. it was never cleared when the condition was false.

>  
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_TSCRATEMSR);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_LBRV);

One of the main reasons I don't like governed features is this manual list.
I want to reach the point where nothing needs to be added manually unless
there is a good reason to do so, and there are only a few exceptions where
the guest cap is set while the host's isn't.

>  
>  	/*
>  	 * Intercept VMLOAD if the vCPU mode is Intel in order to emulate that
> @@ -4330,12 +4330,12 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * SVM on Intel is bonkers and extremely unlikely to work).
>  	 */
>  	if (!guest_cpuid_is_intel(vcpu))
> -		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
> +		guest_cpu_cap_restrict(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
>  
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VGIF);
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VNMI);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_PAUSEFILTER);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_PFTHRESHOLD);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VGIF);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VNMI);
>  
>  	svm_recalc_instruction_intercepts(vcpu, svm);
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 6328f0d47c64..5a056ad1ae55 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7757,9 +7757,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 */
>  	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
>  	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> -		guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_XSAVES);
> +		guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVES);
> +	else
> +		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
>  
> -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_VMX);
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_VMX);
>  
>  	vmx_setup_uret_msrs(vmx);
>  


Best regards,
	Maxim Levitsky




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 4/9] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime
  2023-11-10 23:55 ` [PATCH 4/9] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime Sean Christopherson
@ 2023-11-19 17:33   ` Maxim Levitsky
  0 siblings, 0 replies; 40+ messages in thread
From: Maxim Levitsky @ 2023-11-19 17:33 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel

On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> Move the handling of X86_FEATURE_MWAIT during CPUID runtime updates to
> utilize the lookup done for other CPUID.0x1 features.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 13 +++++--------
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 5cf3d697ecb3..6777780be6ae 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -276,6 +276,11 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
>  
>  		cpuid_entry_change(best, X86_FEATURE_APIC,
>  			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> +
> +		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> +			cpuid_entry_change(best, X86_FEATURE_MWAIT,
> +					   vcpu->arch.ia32_misc_enable_msr &
> +					   MSR_IA32_MISC_ENABLE_MWAIT);
>  	}
>  
>  	best = cpuid_entry2_find(entries, nent, 7, 0);
> @@ -296,14 +301,6 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
>  	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
>  		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
>  		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
> -
> -	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
> -		best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> -		if (best)
> -			cpuid_entry_change(best, X86_FEATURE_MWAIT,
> -					   vcpu->arch.ia32_misc_enable_msr &
> -					   MSR_IA32_MISC_ENABLE_MWAIT);
> -	}
>  }
>  
>  void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 5/9] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf
  2023-11-10 23:55 ` [PATCH 5/9] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf Sean Christopherson
@ 2023-11-19 17:33   ` Maxim Levitsky
  0 siblings, 0 replies; 40+ messages in thread
From: Maxim Levitsky @ 2023-11-19 17:33 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel

On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> Drop an unnecessary check that cpuid_entry2_find() returns the correct
> leaf when getting CPUID.0x7.0x0 to update X86_FEATURE_OSPKE, as
> cpuid_entry2_find() never returns an entry for the wrong function.  And
> not that it matters, but cpuid_entry2_find() will always return a precise
> match for CPUID.0x7.0x0 since the index is significant.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 6777780be6ae..36bd04030989 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -284,7 +284,7 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
>  	}
>  
>  	best = cpuid_entry2_find(entries, nent, 7, 0);
> -	if (best && boot_cpu_has(X86_FEATURE_PKU) && best->function == 0x7)
> +	if (best && boot_cpu_has(X86_FEATURE_PKU))
>  		cpuid_entry_change(best, X86_FEATURE_OSPKE,
>  				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
>  

Makes sense.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-10 23:55 ` [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
  2023-11-13  8:03   ` Robert Hoo
  2023-11-16  2:24   ` Yang, Weijiang
@ 2023-11-19 17:35   ` Maxim Levitsky
  2023-11-24  6:33     ` Xu Yilun
  2 siblings, 1 reply; 40+ messages in thread
From: Maxim Levitsky @ 2023-11-19 17:35 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel

On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> When updating guest CPUID entries to emulate runtime behavior, e.g. when
> the guest enables a CR4-based feature that is tied to a CPUID flag, also
> update the vCPU's cpu_caps accordingly.  This will allow replacing all
> usage of guest_cpuid_has() with guest_cpu_cap_has().
> 
> Take care not to update guest capabilities when KVM is updating CPUID
> entries that *may* become the vCPU's CPUID, e.g. if userspace tries to set
> bogus CPUID information.  No extra call to update cpu_caps is needed as
> the cpu_caps are initialized from the incoming guest CPUID, i.e. will
> automatically get the updated values.
> 
> Note, none of the features in question use guest_cpu_cap_has() at this
> time, i.e. aside from setting bits in cpu_caps, this is a glorified nop.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 48 +++++++++++++++++++++++++++++++-------------
>  1 file changed, 34 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 36bd04030989..37a991439fe6 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -262,31 +262,48 @@ static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)
>  	return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
>  }
>  
> +static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
> +						       struct kvm_cpuid_entry2 *entry,
> +						       unsigned int x86_feature,
> +						       bool has_feature)
> +{
> +	if (entry)
> +		cpuid_entry_change(entry, x86_feature, has_feature);
> +
> +	if (vcpu)
> +		guest_cpu_cap_change(vcpu, x86_feature, has_feature);
> +}
> +
>  static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
>  				       int nent)
>  {
>  	struct kvm_cpuid_entry2 *best;
> +	struct kvm_vcpu *caps = vcpu;
> +
> +	/*
> +	 * Don't update vCPU capabilities if KVM is updating CPUID entries that
> +	 * are coming in from userspace!
> +	 */
> +	if (entries != vcpu->arch.cpuid_entries)
> +		caps = NULL;

I think that this should be decided by the caller. Just a boolean will suffice.

Or even better: since the userspace CPUID update really isn't performance
critical, why special-case it at all?

Even if these guest caps are later overwritten, I don't see why we need to
avoid updating them; in fact, skipping the update introduces a small risk of
them becoming inconsistent with the other cpu caps.

With this we could also avoid having the 'caps' variable, which is *very*
confusing as well.
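
Roughly (untested), just pass the vCPU through unconditionally:

	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);

	if (boot_cpu_has(X86_FEATURE_XSAVE))
		kvm_update_feature_runtime(vcpu, best, X86_FEATURE_OSXSAVE,
					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));

i.e. always mirror the runtime bits into vcpu->arch.cpu_caps; worst case they
are recomputed anyway when a CPUID update from userspace is accepted.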


>  
>  	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> -	if (best) {
> -		/* Update OSXSAVE bit */
> -		if (boot_cpu_has(X86_FEATURE_XSAVE))
> -			cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
> +
> +	if (boot_cpu_has(X86_FEATURE_XSAVE))
> +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSXSAVE,
>  					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
>  
> -		cpuid_entry_change(best, X86_FEATURE_APIC,
> -			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> +	kvm_update_feature_runtime(caps, best, X86_FEATURE_APIC,
> +				   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
>  
> -		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> -			cpuid_entry_change(best, X86_FEATURE_MWAIT,
> -					   vcpu->arch.ia32_misc_enable_msr &
> -					   MSR_IA32_MISC_ENABLE_MWAIT);
> -	}
> +	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> +		kvm_update_feature_runtime(caps, best, X86_FEATURE_MWAIT,
> +					   vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT);
>  
>  	best = cpuid_entry2_find(entries, nent, 7, 0);
> -	if (best && boot_cpu_has(X86_FEATURE_PKU))
> -		cpuid_entry_change(best, X86_FEATURE_OSPKE,
> -				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
> +	if (boot_cpu_has(X86_FEATURE_PKU))
> +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSPKE,
> +					   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
>  
>  	best = cpuid_entry2_find(entries, nent, 0xD, 0);
>  	if (best)
> @@ -353,6 +370,9 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
>  	 * honor userspace's definition for features that don't require KVM or
>  	 * hardware management/support (or that KVM simply doesn't care about).
> +	 *
> +	 * Note, KVM has already done runtime updates on guest CPUID, i.e. this
> +	 * will also correctly set runtime features in guest CPU capabilities.
>  	 */
>  	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
>  		const struct cpuid_reg cpuid = reverse_cpuid[i];


Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 7/9] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has()
  2023-11-10 23:55 ` [PATCH 7/9] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has() Sean Christopherson
@ 2023-11-19 17:35   ` Maxim Levitsky
  0 siblings, 0 replies; 40+ messages in thread
From: Maxim Levitsky @ 2023-11-19 17:35 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel

On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> Move the implementations of guest_has_{spec_ctrl,pred_cmd}_msr() down
> below guest_cpu_cap_has() so that their use of guest_cpuid_has() can be
> replaced with calls to guest_cpu_cap_has().
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.h | 30 +++++++++++++++---------------
>  1 file changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index 1707ef10b269..bebf94a69630 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -163,21 +163,6 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
>  	return x86_stepping(best->eax);
>  }
>  
> -static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
> -{
> -	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> -		guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
> -		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
> -		guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
> -}
> -
> -static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu)
> -{
> -	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> -		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB) ||
> -		guest_cpuid_has(vcpu, X86_FEATURE_SBPB));
> -}
> -
>  static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu)
>  {
>  	return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT;
> @@ -298,4 +283,19 @@ static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
>  	return vcpu->arch.cpu_caps[x86_leaf] & __feature_bit(x86_feature);
>  }
>  
> +static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
> +{
> +	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> +		guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
> +		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
> +		guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
> +}
> +
> +static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu)
> +{
> +	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> +		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB) ||
> +		guest_cpuid_has(vcpu, X86_FEATURE_SBPB));
> +}
> +
>  #endif
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 8/9] KVM: x86: Replace all guest CPUID feature queries with cpu_caps check
  2023-11-10 23:55 ` [PATCH 8/9] KVM: x86: Replace all guest CPUID feature queries with cpu_caps check Sean Christopherson
@ 2023-11-19 17:35   ` Maxim Levitsky
  0 siblings, 0 replies; 40+ messages in thread
From: Maxim Levitsky @ 2023-11-19 17:35 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel

On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> Switch all queries of guest features from guest CPUID to guest
> capabilities, i.e. replace all calls to guest_cpuid_has() with calls to
> guest_cpu_cap_has(), and drop guest_cpuid_has() and its helper
> guest_cpuid_get_register().
> 
> Opportunistically drop the unused guest_cpuid_clear(), as there should be
> no circumstance in which KVM needs to _clear_ a guest CPUID feature now
> that everything is tracked via cpu_caps.  E.g. KVM may need to _change_
> a feature to emulate dynamic CPUID flags, but KVM should never need to
> clear a feature in guest CPUID to prevent it from being used by the guest.
> 
> Delete the last remnants of the governed features framework, as the lone
> holdout was vmx_adjust_secondary_exec_control()'s divergent behavior for
> governed vs. ungoverned features.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c             |  4 +-
>  arch/x86/kvm/cpuid.h             | 70 ++++----------------------------
>  arch/x86/kvm/governed_features.h | 21 ----------
>  arch/x86/kvm/lapic.c             |  2 +-
>  arch/x86/kvm/mtrr.c              |  2 +-
>  arch/x86/kvm/smm.c               | 10 ++---
>  arch/x86/kvm/svm/pmu.c           |  8 ++--
>  arch/x86/kvm/svm/sev.c           |  4 +-
>  arch/x86/kvm/svm/svm.c           | 20 ++++-----
>  arch/x86/kvm/vmx/nested.c        | 12 +++---
>  arch/x86/kvm/vmx/pmu_intel.c     |  4 +-
>  arch/x86/kvm/vmx/sgx.c           | 14 +++----
>  arch/x86/kvm/vmx/vmx.c           | 47 ++++++++++-----------
>  arch/x86/kvm/vmx/vmx.h           |  2 +-
>  arch/x86/kvm/x86.c               | 68 +++++++++++++++----------------
>  15 files changed, 104 insertions(+), 184 deletions(-)
>  delete mode 100644 arch/x86/kvm/governed_features.h
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 37a991439fe6..6407e5c45f20 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -396,7 +396,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * and can install smaller shadow pages if the host lacks 1GiB support.
>  	 */
>  	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
> -				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
> +				      guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES);
>  	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
>  
>  	best = kvm_find_cpuid_entry(vcpu, 1);
> @@ -419,7 +419,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  
>  	kvm_pmu_refresh(vcpu);
>  	vcpu->arch.cr4_guest_rsvd_bits =
> -	    __cr4_reserved_bits(guest_cpuid_has, vcpu);
> +	    __cr4_reserved_bits(guest_cpu_cap_has, vcpu);
>  
>  	kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu->arch.cpuid_entries,
>  						    vcpu->arch.cpuid_nent));
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index bebf94a69630..98694dfe062e 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -72,41 +72,6 @@ static __always_inline void cpuid_entry_override(struct kvm_cpuid_entry2 *entry,
>  	*reg = kvm_cpu_caps[leaf];
>  }
>  
> -static __always_inline u32 *guest_cpuid_get_register(struct kvm_vcpu *vcpu,
> -						     unsigned int x86_feature)
> -{
> -	const struct cpuid_reg cpuid = x86_feature_cpuid(x86_feature);
> -	struct kvm_cpuid_entry2 *entry;
> -
> -	entry = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
> -	if (!entry)
> -		return NULL;
> -
> -	return __cpuid_entry_get_reg(entry, cpuid.reg);
> -}
> -
> -static __always_inline bool guest_cpuid_has(struct kvm_vcpu *vcpu,
> -					    unsigned int x86_feature)
> -{
> -	u32 *reg;
> -
> -	reg = guest_cpuid_get_register(vcpu, x86_feature);
> -	if (!reg)
> -		return false;
> -
> -	return *reg & __feature_bit(x86_feature);
> -}
> -
> -static __always_inline void guest_cpuid_clear(struct kvm_vcpu *vcpu,
> -					      unsigned int x86_feature)
> -{
> -	u32 *reg;
> -
> -	reg = guest_cpuid_get_register(vcpu, x86_feature);
> -	if (reg)
> -		*reg &= ~__feature_bit(x86_feature);
> -}
> -
>  static inline bool guest_cpuid_is_amd_or_hygon(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_cpuid_entry2 *best;
> @@ -218,27 +183,6 @@ static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
>  	return vcpu->arch.pv_cpuid.features & (1u << kvm_feature);
>  }
>  
> -enum kvm_governed_features {
> -#define KVM_GOVERNED_FEATURE(x) KVM_GOVERNED_##x,
> -#include "governed_features.h"
> -	KVM_NR_GOVERNED_FEATURES
> -};
> -
> -static __always_inline int kvm_governed_feature_index(unsigned int x86_feature)
> -{
> -	switch (x86_feature) {
> -#define KVM_GOVERNED_FEATURE(x) case x: return KVM_GOVERNED_##x;
> -#include "governed_features.h"
> -	default:
> -		return -1;
> -	}
> -}
> -
> -static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
> -{
> -	return kvm_governed_feature_index(x86_feature) >= 0;
> -}
> -
>  static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
>  					      unsigned int x86_feature)
>  {
> @@ -285,17 +229,17 @@ static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
>  
>  static inline bool guest_has_spec_ctrl_msr(struct kvm_vcpu *vcpu)
>  {
> -	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> -		guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP) ||
> -		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
> -		guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD));
> +	return (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> +		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_STIBP) ||
> +		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBRS) ||
> +		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_SSBD));
>  }
>  
>  static inline bool guest_has_pred_cmd_msr(struct kvm_vcpu *vcpu)
>  {
> -	return (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> -		guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB) ||
> -		guest_cpuid_has(vcpu, X86_FEATURE_SBPB));
> +	return (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> +		guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBPB) ||
> +		guest_cpu_cap_has(vcpu, X86_FEATURE_SBPB));
>  }
>  
>  #endif
> diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
> deleted file mode 100644
> index 423a73395c10..000000000000
> --- a/arch/x86/kvm/governed_features.h
> +++ /dev/null
> @@ -1,21 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#if !defined(KVM_GOVERNED_FEATURE) || defined(KVM_GOVERNED_X86_FEATURE)
> -BUILD_BUG()
> -#endif
> -
> -#define KVM_GOVERNED_X86_FEATURE(x) KVM_GOVERNED_FEATURE(X86_FEATURE_##x)
> -
> -KVM_GOVERNED_X86_FEATURE(GBPAGES)
> -KVM_GOVERNED_X86_FEATURE(XSAVES)
> -KVM_GOVERNED_X86_FEATURE(VMX)
> -KVM_GOVERNED_X86_FEATURE(NRIPS)
> -KVM_GOVERNED_X86_FEATURE(TSCRATEMSR)
> -KVM_GOVERNED_X86_FEATURE(V_VMSAVE_VMLOAD)
> -KVM_GOVERNED_X86_FEATURE(LBRV)
> -KVM_GOVERNED_X86_FEATURE(PAUSEFILTER)
> -KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD)
> -KVM_GOVERNED_X86_FEATURE(VGIF)
> -KVM_GOVERNED_X86_FEATURE(VNMI)
> -
> -#undef KVM_GOVERNED_X86_FEATURE
> -#undef KVM_GOVERNED_FEATURE
> 



> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 245b20973cae..f5fab29c827f 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -584,7 +584,7 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu)
>  	 * version first and level-triggered interrupts never get EOIed in
>  	 * IOAPIC.
>  	 */
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) &&
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_X2APIC) &&
>  	    !ioapic_in_kernel(vcpu->kvm))
>  		v |= APIC_LVR_DIRECTED_EOI;
>  	kvm_lapic_set_reg(apic, APIC_LVR, v);
> diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
> index a67c28a56417..9e8cb38ae1db 100644
> --- a/arch/x86/kvm/mtrr.c
> +++ b/arch/x86/kvm/mtrr.c
> @@ -128,7 +128,7 @@ static u8 mtrr_disabled_type(struct kvm_vcpu *vcpu)
>  	 * enable MTRRs and it is obviously undesirable to run the
>  	 * guest entirely with UC memory and we use WB.
>  	 */
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_MTRR))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_MTRR))
>  		return MTRR_TYPE_UNCACHABLE;
>  	else
>  		return MTRR_TYPE_WRBACK;
> diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c
> index dc3d95fdca7d..3ca4154d9fa0 100644
> --- a/arch/x86/kvm/smm.c
> +++ b/arch/x86/kvm/smm.c
> @@ -290,7 +290,7 @@ void enter_smm(struct kvm_vcpu *vcpu)
>  	memset(smram.bytes, 0, sizeof(smram.bytes));
>  
>  #ifdef CONFIG_X86_64
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
>  		enter_smm_save_state_64(vcpu, &smram.smram64);
>  	else
>  #endif
> @@ -360,7 +360,7 @@ void enter_smm(struct kvm_vcpu *vcpu)
>  	kvm_set_segment(vcpu, &ds, VCPU_SREG_SS);
>  
>  #ifdef CONFIG_X86_64
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
>  		if (static_call(kvm_x86_set_efer)(vcpu, 0))
>  			goto error;
>  #endif
> @@ -593,7 +593,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
>  	 * supports long mode.
>  	 */
>  #ifdef CONFIG_X86_64
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM)) {
>  		struct kvm_segment cs_desc;
>  		unsigned long cr4;
>  
> @@ -616,7 +616,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
>  		kvm_set_cr0(vcpu, cr0 & ~(X86_CR0_PG | X86_CR0_PE));
>  
>  #ifdef CONFIG_X86_64
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM)) {
>  		unsigned long cr4, efer;
>  
>  		/* Clear CR4.PAE before clearing EFER.LME. */
> @@ -639,7 +639,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
>  		return X86EMUL_UNHANDLEABLE;
>  
>  #ifdef CONFIG_X86_64
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
>  		return rsm_load_state_64(ctxt, &smram.smram64);
>  	else
>  #endif
> diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
> index 373ff6a6687b..16d396a31c16 100644
> --- a/arch/x86/kvm/svm/pmu.c
> +++ b/arch/x86/kvm/svm/pmu.c
> @@ -46,7 +46,7 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
>  
>  	switch (msr) {
>  	case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
> -		if (!guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_PERFCTR_CORE))
>  			return NULL;
>  		/*
>  		 * Each PMU counter has a pair of CTL and CTR MSRs. CTLn
> @@ -113,7 +113,7 @@ static bool amd_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
>  	case MSR_K7_EVNTSEL0 ... MSR_K7_PERFCTR3:
>  		return pmu->version > 0;
>  	case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
> -		return guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE);
> +		return guest_cpu_cap_has(vcpu, X86_FEATURE_PERFCTR_CORE);
>  	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
>  	case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
>  	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
> @@ -184,7 +184,7 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
>  	union cpuid_0x80000022_ebx ebx;
>  
>  	pmu->version = 1;
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_PERFMON_V2)) {
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_PERFMON_V2)) {
>  		pmu->version = 2;
>  		/*
>  		 * Note, PERFMON_V2 is also in 0x80000022.0x0, i.e. the guest
> @@ -194,7 +194,7 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
>  			     x86_feature_cpuid(X86_FEATURE_PERFMON_V2).index);
>  		ebx.full = kvm_find_cpuid_entry_index(vcpu, 0x80000022, 0)->ebx;
>  		pmu->nr_arch_gp_counters = ebx.split.num_core_pmc;
> -	} else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE)) {
> +	} else if (guest_cpu_cap_has(vcpu, X86_FEATURE_PERFCTR_CORE)) {
>  		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
>  	} else {
>  		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 4900c078045a..05008d33ae63 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2967,8 +2967,8 @@ static void sev_es_vcpu_after_set_cpuid(struct vcpu_svm *svm)
>  	struct kvm_vcpu *vcpu = &svm->vcpu;
>  
>  	if (boot_cpu_has(X86_FEATURE_V_TSC_AUX)) {
> -		bool v_tsc_aux = guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
> -				 guest_cpuid_has(vcpu, X86_FEATURE_RDPID);
> +		bool v_tsc_aux = guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) ||
> +				 guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID);
>  
>  		set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, v_tsc_aux, v_tsc_aux);
>  	}
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 5827328e30f1..9e3a9191dac1 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1185,14 +1185,14 @@ static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu,
>  	 */
>  	if (kvm_cpu_cap_has(X86_FEATURE_INVPCID)) {
>  		if (!npt_enabled ||
> -		    !guest_cpuid_has(&svm->vcpu, X86_FEATURE_INVPCID))
> +		    !guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_INVPCID))
>  			svm_set_intercept(svm, INTERCEPT_INVPCID);
>  		else
>  			svm_clr_intercept(svm, INTERCEPT_INVPCID);
>  	}
>  
>  	if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP)) {
> -		if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> +		if (guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP))
>  			svm_clr_intercept(svm, INTERCEPT_RDTSCP);
>  		else
>  			svm_set_intercept(svm, INTERCEPT_RDTSCP);
> @@ -2905,7 +2905,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		break;
>  	case MSR_AMD64_VIRT_SPEC_CTRL:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_VIRT_SSBD))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_VIRT_SSBD))
>  			return 1;
>  
>  		msr_info->data = svm->virt_spec_ctrl;
> @@ -3052,7 +3052,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  		break;
>  	case MSR_AMD64_VIRT_SPEC_CTRL:
>  		if (!msr->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_VIRT_SSBD))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_VIRT_SSBD))
>  			return 1;
>  
>  		if (data & ~SPEC_CTRL_SSBD)
> @@ -3224,7 +3224,7 @@ static int invpcid_interception(struct kvm_vcpu *vcpu)
>  	unsigned long type;
>  	gva_t gva;
>  
> -	if (!guest_cpuid_has(vcpu, X86_FEATURE_INVPCID)) {
> +	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_INVPCID)) {
>  		kvm_queue_exception(vcpu, UD_VECTOR);
>  		return 1;
>  	}
> @@ -4318,7 +4318,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
>  			     boot_cpu_has(X86_FEATURE_XSAVE) &&
>  			     boot_cpu_has(X86_FEATURE_XSAVES) &&
> -			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
> +			     guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE));
>  
>  	guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);
>  	guest_cpu_cap_restrict(vcpu, X86_FEATURE_TSCRATEMSR);
> @@ -4345,7 +4345,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  
>  	if (boot_cpu_has(X86_FEATURE_FLUSH_L1D))
>  		set_msr_interception(vcpu, svm->msrpm, MSR_IA32_FLUSH_CMD, 0,
> -				     !!guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D));
> +				     !!guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
>  
>  	if (sev_guest(vcpu->kvm))
>  		sev_vcpu_after_set_cpuid(svm);
> @@ -4602,7 +4602,7 @@ static int svm_enter_smm(struct kvm_vcpu *vcpu, union kvm_smram *smram)
>  	 * responsible for ensuring nested SVM and SMIs are mutually exclusive.
>  	 */
>  
> -	if (!guest_cpuid_has(vcpu, X86_FEATURE_LM))
> +	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
>  		return 1;
>  
>  	smram->smram64.svm_guest_flag = 1;
> @@ -4649,14 +4649,14 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const union kvm_smram *smram)
>  
>  	const struct kvm_smram_state_64 *smram64 = &smram->smram64;
>  
> -	if (!guest_cpuid_has(vcpu, X86_FEATURE_LM))
> +	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
>  		return 0;
>  
>  	/* Non-zero if SMI arrived while vCPU was in guest mode. */
>  	if (!smram64->svm_guest_flag)
>  		return 0;
>  
> -	if (!guest_cpuid_has(vcpu, X86_FEATURE_SVM))
> +	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SVM))
>  		return 1;
>  
>  	if (!(smram64->efer & EFER_SVME))
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 4750d1696d58..f046813e34c1 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2005,7 +2005,7 @@ static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
>  	bool evmcs_gpa_changed = false;
>  	u64 evmcs_gpa;
>  
> -	if (likely(!guest_cpuid_has_evmcs(vcpu)))
> +	if (likely(!guest_cpu_cap_has_evmcs(vcpu)))
>  		return EVMPTRLD_DISABLED;
>  
>  	evmcs_gpa = nested_get_evmptr(vcpu);
> @@ -2888,7 +2888,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
>  	    nested_check_vm_entry_controls(vcpu, vmcs12))
>  		return -EINVAL;
>  
> -	if (guest_cpuid_has_evmcs(vcpu))
> +	if (guest_cpu_cap_has_evmcs(vcpu))
>  		return nested_evmcs_check_controls(vmcs12);
>  
>  	return 0;
> @@ -3170,7 +3170,7 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)
>  	 * L2 was running), map it here to make sure vmcs12 changes are
>  	 * properly reflected.
>  	 */
> -	if (guest_cpuid_has_evmcs(vcpu) &&
> +	if (guest_cpu_cap_has_evmcs(vcpu) &&
>  	    vmx->nested.hv_evmcs_vmptr == EVMPTR_MAP_PENDING) {
>  		enum nested_evmptrld_status evmptrld_status =
>  			nested_vmx_handle_enlightened_vmptrld(vcpu, false);
> @@ -4814,7 +4814,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
>  	 * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
>  	 * separate modes for L2 vs L1.
>  	 */
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL))
>  		indirect_branch_prediction_barrier();
>  
>  	/* Update any VMCS fields that might have changed while L2 ran */
> @@ -5302,7 +5302,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
>  	 * state. It is possible that the area will stay mapped as
>  	 * vmx->nested.hv_evmcs but this shouldn't be a problem.
>  	 */
> -	if (likely(!guest_cpuid_has_evmcs(vcpu) ||
> +	if (likely(!guest_cpu_cap_has_evmcs(vcpu) ||
>  		   !evmptr_is_valid(nested_get_evmptr(vcpu)))) {
>  		if (vmptr == vmx->nested.current_vmptr)
>  			nested_release_vmcs12(vcpu);
> @@ -6092,7 +6092,7 @@ static bool nested_vmx_exit_handled_encls(struct kvm_vcpu *vcpu,
>  {
>  	u32 encls_leaf;
>  
> -	if (!guest_cpuid_has(vcpu, X86_FEATURE_SGX) ||
> +	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SGX) ||
>  	    !nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENCLS_EXITING))
>  		return false;
>  
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index 820d3e1f6b4f..98d579c0ce28 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -160,7 +160,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
>  
>  static inline u64 vcpu_get_perf_capabilities(struct kvm_vcpu *vcpu)
>  {
> -	if (!guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
> +	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_PDCM))
>  		return 0;
>  
>  	return vcpu->arch.perf_capabilities;
> @@ -210,7 +210,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
>  		ret = vcpu_get_perf_capabilities(vcpu) & PERF_CAP_PEBS_FORMAT;
>  		break;
>  	case MSR_IA32_DS_AREA:
> -		ret = guest_cpuid_has(vcpu, X86_FEATURE_DS);
> +		ret = guest_cpu_cap_has(vcpu, X86_FEATURE_DS);
>  		break;
>  	case MSR_PEBS_DATA_CFG:
>  		perf_capabilities = vcpu_get_perf_capabilities(vcpu);
> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
> index 3e822e582497..9616b4ac0662 100644
> --- a/arch/x86/kvm/vmx/sgx.c
> +++ b/arch/x86/kvm/vmx/sgx.c
> @@ -122,7 +122,7 @@ static int sgx_inject_fault(struct kvm_vcpu *vcpu, gva_t gva, int trapnr)
>  	 * likely than a bad userspace address.
>  	 */
>  	if ((trapnr == PF_VECTOR || !boot_cpu_has(X86_FEATURE_SGX2)) &&
> -	    guest_cpuid_has(vcpu, X86_FEATURE_SGX2)) {
> +	    guest_cpu_cap_has(vcpu, X86_FEATURE_SGX2)) {
>  		memset(&ex, 0, sizeof(ex));
>  		ex.vector = PF_VECTOR;
>  		ex.error_code = PFERR_PRESENT_MASK | PFERR_WRITE_MASK |
> @@ -365,7 +365,7 @@ static inline bool encls_leaf_enabled_in_guest(struct kvm_vcpu *vcpu, u32 leaf)
>  		return true;
>  
>  	if (leaf >= EAUG && leaf <= EMODT)
> -		return guest_cpuid_has(vcpu, X86_FEATURE_SGX2);
> +		return guest_cpu_cap_has(vcpu, X86_FEATURE_SGX2);
>  
>  	return false;
>  }
> @@ -381,8 +381,8 @@ int handle_encls(struct kvm_vcpu *vcpu)
>  {
>  	u32 leaf = (u32)kvm_rax_read(vcpu);
>  
> -	if (!enable_sgx || !guest_cpuid_has(vcpu, X86_FEATURE_SGX) ||
> -	    !guest_cpuid_has(vcpu, X86_FEATURE_SGX1)) {
> +	if (!enable_sgx || !guest_cpu_cap_has(vcpu, X86_FEATURE_SGX) ||
> +	    !guest_cpu_cap_has(vcpu, X86_FEATURE_SGX1)) {
>  		kvm_queue_exception(vcpu, UD_VECTOR);
>  	} else if (!encls_leaf_enabled_in_guest(vcpu, leaf) ||
>  		   !sgx_enabled_in_guest_bios(vcpu) || !is_paging(vcpu)) {
> @@ -479,15 +479,15 @@ void vmx_write_encls_bitmap(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  	if (!cpu_has_vmx_encls_vmexit())
>  		return;
>  
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX) &&
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX) &&
>  	    sgx_enabled_in_guest_bios(vcpu)) {
> -		if (guest_cpuid_has(vcpu, X86_FEATURE_SGX1)) {
> +		if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX1)) {
>  			bitmap &= ~GENMASK_ULL(ETRACK, ECREATE);
>  			if (sgx_intercept_encls_ecreate(vcpu))
>  				bitmap |= (1 << ECREATE);
>  		}
>  
> -		if (guest_cpuid_has(vcpu, X86_FEATURE_SGX2))
> +		if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX2))
>  			bitmap &= ~GENMASK_ULL(EMODT, EAUG);
>  
>  		/*
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 5a056ad1ae55..815692dc0aff 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1874,8 +1874,8 @@ static void vmx_setup_uret_msrs(struct vcpu_vmx *vmx)
>  	vmx_setup_uret_msr(vmx, MSR_EFER, update_transition_efer(vmx));
>  
>  	vmx_setup_uret_msr(vmx, MSR_TSC_AUX,
> -			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
> -			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID));
> +			   guest_cpu_cap_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
> +			   guest_cpu_cap_has(&vmx->vcpu, X86_FEATURE_RDPID));
>  
>  	/*
>  	 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations of new
> @@ -2028,7 +2028,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_BNDCFGS:
>  		if (!kvm_mpx_supported() ||
>  		    (!msr_info->host_initiated &&
> -		     !guest_cpuid_has(vcpu, X86_FEATURE_MPX)))
> +		     !guest_cpu_cap_has(vcpu, X86_FEATURE_MPX)))
>  			return 1;
>  		msr_info->data = vmcs_read64(GUEST_BNDCFGS);
>  		break;
> @@ -2044,7 +2044,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		break;
>  	case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_SGX_LC))
>  			return 1;
>  		msr_info->data = to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash
>  			[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
> @@ -2062,7 +2062,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		 * sanity checking and refuse to boot. Filter all unsupported
>  		 * features out.
>  		 */
> -		if (!msr_info->host_initiated && guest_cpuid_has_evmcs(vcpu))
> +		if (!msr_info->host_initiated && guest_cpu_cap_has_evmcs(vcpu))
>  			nested_evmcs_filter_control_msr(vcpu, msr_info->index,
>  							&msr_info->data);
>  		break;
> @@ -2131,7 +2131,7 @@ static u64 nested_vmx_truncate_sysenter_addr(struct kvm_vcpu *vcpu,
>  						    u64 data)
>  {
>  #ifdef CONFIG_X86_64
> -	if (!guest_cpuid_has(vcpu, X86_FEATURE_LM))
> +	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
>  		return (u32)data;
>  #endif
>  	return (unsigned long)data;
> @@ -2142,7 +2142,7 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated
>  	u64 debugctl = 0;
>  
>  	if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT) &&
> -	    (host_initiated || guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT)))
> +	    (host_initiated || guest_cpu_cap_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT)))
>  		debugctl |= DEBUGCTLMSR_BUS_LOCK_DETECT;
>  
>  	if ((kvm_caps.supported_perf_cap & PMU_CAP_LBR_FMT) &&
> @@ -2246,7 +2246,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_BNDCFGS:
>  		if (!kvm_mpx_supported() ||
>  		    (!msr_info->host_initiated &&
> -		     !guest_cpuid_has(vcpu, X86_FEATURE_MPX)))
> +		     !guest_cpu_cap_has(vcpu, X86_FEATURE_MPX)))
>  			return 1;
>  		if (is_noncanonical_address(data & PAGE_MASK, vcpu) ||
>  		    (data & MSR_IA32_BNDCFGS_RSVD))
> @@ -2348,7 +2348,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		 * behavior, but it's close enough.
>  		 */
>  		if (!msr_info->host_initiated &&
> -		    (!guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC) ||
> +		    (!guest_cpu_cap_has(vcpu, X86_FEATURE_SGX_LC) ||
>  		    ((vmx->msr_ia32_feature_control & FEAT_CTL_LOCKED) &&
>  		    !(vmx->msr_ia32_feature_control & FEAT_CTL_SGX_LC_ENABLED))))
>  			return 1;
> @@ -2434,9 +2434,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  			if ((data & PERF_CAP_PEBS_MASK) !=
>  			    (kvm_caps.supported_perf_cap & PERF_CAP_PEBS_MASK))
>  				return 1;
> -			if (!guest_cpuid_has(vcpu, X86_FEATURE_DS))
> +			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_DS))
>  				return 1;
> -			if (!guest_cpuid_has(vcpu, X86_FEATURE_DTES64))
> +			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_DTES64))
>  				return 1;
>  			if (!cpuid_model_is_consistent(vcpu))
>  				return 1;
> @@ -4566,10 +4566,7 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
>  	bool __enabled;										\
>  												\
>  	if (cpu_has_vmx_##name()) {								\
> -		if (kvm_is_governed_feature(X86_FEATURE_##feat_name))				\
> -			__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);		\
> -		else										\
> -			__enabled = guest_cpuid_has(__vcpu, X86_FEATURE_##feat_name);		\
> +		__enabled = guest_cpu_cap_has(__vcpu, X86_FEATURE_##feat_name);			\
>  		vmx_adjust_secondary_exec_control(vmx, exec_control, SECONDARY_EXEC_##ctrl_name,\
>  						  __enabled, exiting);				\
>  	}											\
> @@ -4644,8 +4641,8 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
>  	 */
>  	if (cpu_has_vmx_rdtscp()) {
>  		bool rdpid_or_rdtscp_enabled =
> -			guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
> -			guest_cpuid_has(vcpu, X86_FEATURE_RDPID);
> +			guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) ||
> +			guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID);
>  
>  		vmx_adjust_secondary_exec_control(vmx, &exec_control,
>  						  SECONDARY_EXEC_ENABLE_RDTSCP,
> @@ -5947,7 +5944,7 @@ static int handle_invpcid(struct kvm_vcpu *vcpu)
>  	} operand;
>  	int gpr_index;
>  
> -	if (!guest_cpuid_has(vcpu, X86_FEATURE_INVPCID)) {
> +	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_INVPCID)) {
>  		kvm_queue_exception(vcpu, UD_VECTOR);
>  		return 1;
>  	}
> @@ -7756,7 +7753,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * set if and only if XSAVE is supported.
>  	 */
>  	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
> -	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> +	    guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE))
>  		guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVES);
>  	else
>  		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);
> @@ -7782,21 +7779,21 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  		nested_vmx_cr_fixed1_bits_update(vcpu);
>  
>  	if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
> -			guest_cpuid_has(vcpu, X86_FEATURE_INTEL_PT))
> +			guest_cpu_cap_has(vcpu, X86_FEATURE_INTEL_PT))
>  		update_intel_pt_cfg(vcpu);
>  
>  	if (boot_cpu_has(X86_FEATURE_RTM)) {
>  		struct vmx_uret_msr *msr;
>  		msr = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
>  		if (msr) {
> -			bool enabled = guest_cpuid_has(vcpu, X86_FEATURE_RTM);
> +			bool enabled = guest_cpu_cap_has(vcpu, X86_FEATURE_RTM);
>  			vmx_set_guest_uret_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE);
>  		}
>  	}
>  
>  	if (kvm_cpu_cap_has(X86_FEATURE_XFD))
>  		vmx_set_intercept_for_msr(vcpu, MSR_IA32_XFD_ERR, MSR_TYPE_R,
> -					  !guest_cpuid_has(vcpu, X86_FEATURE_XFD));
> +					  !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD));
>  
>  	if (boot_cpu_has(X86_FEATURE_IBPB))
>  		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PRED_CMD, MSR_TYPE_W,
> @@ -7804,17 +7801,17 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  
>  	if (boot_cpu_has(X86_FEATURE_FLUSH_L1D))
>  		vmx_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W,
> -					  !guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D));
> +					  !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
>  
>  	set_cr4_guest_host_mask(vmx);
>  
>  	vmx_write_encls_bitmap(vcpu, NULL);
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX))
>  		vmx->msr_ia32_feature_control_valid_bits |= FEAT_CTL_SGX_ENABLED;
>  	else
>  		vmx->msr_ia32_feature_control_valid_bits &= ~FEAT_CTL_SGX_ENABLED;
>  
> -	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SGX_LC))
>  		vmx->msr_ia32_feature_control_valid_bits |=
>  			FEAT_CTL_SGX_LC_ENABLED;
>  	else
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index c2130d2c8e24..edca0a4276fb 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -745,7 +745,7 @@ static inline bool vmx_can_use_ipiv(struct kvm_vcpu *vcpu)
>  	return  lapic_in_kernel(vcpu) && enable_ipiv;
>  }
>  
> -static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
> +static inline bool guest_cpu_cap_has_evmcs(struct kvm_vcpu *vcpu)
>  {
>  	/*
>  	 * eVMCS is exposed to the guest if Hyper-V is enabled in CPUID and
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 04a77b764a36..a6b8f844a5bc 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -487,7 +487,7 @@ int kvm_set_apic_base(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	enum lapic_mode old_mode = kvm_get_apic_mode(vcpu);
>  	enum lapic_mode new_mode = kvm_apic_mode(msr_info->data);
>  	u64 reserved_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu) | 0x2ff |
> -		(guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) ? 0 : X2APIC_ENABLE);
> +		(guest_cpu_cap_has(vcpu, X86_FEATURE_X2APIC) ? 0 : X2APIC_ENABLE);
>  
>  	if ((msr_info->data & reserved_bits) != 0 || new_mode == LAPIC_MODE_INVALID)
>  		return 1;
> @@ -1362,10 +1362,10 @@ static u64 kvm_dr6_fixed(struct kvm_vcpu *vcpu)
>  {
>  	u64 fixed = DR6_FIXED_1;
>  
> -	if (!guest_cpuid_has(vcpu, X86_FEATURE_RTM))
> +	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_RTM))
>  		fixed |= DR6_RTM;
>  
> -	if (!guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
> +	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
>  		fixed |= DR6_BUS_LOCK;
>  	return fixed;
>  }
> @@ -1721,20 +1721,20 @@ static int do_get_msr_feature(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>  
>  static bool __kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer)
>  {
> -	if (efer & EFER_AUTOIBRS && !guest_cpuid_has(vcpu, X86_FEATURE_AUTOIBRS))
> +	if (efer & EFER_AUTOIBRS && !guest_cpu_cap_has(vcpu, X86_FEATURE_AUTOIBRS))
>  		return false;
>  
> -	if (efer & EFER_FFXSR && !guest_cpuid_has(vcpu, X86_FEATURE_FXSR_OPT))
> +	if (efer & EFER_FFXSR && !guest_cpu_cap_has(vcpu, X86_FEATURE_FXSR_OPT))
>  		return false;
>  
> -	if (efer & EFER_SVME && !guest_cpuid_has(vcpu, X86_FEATURE_SVM))
> +	if (efer & EFER_SVME && !guest_cpu_cap_has(vcpu, X86_FEATURE_SVM))
>  		return false;
>  
>  	if (efer & (EFER_LME | EFER_LMA) &&
> -	    !guest_cpuid_has(vcpu, X86_FEATURE_LM))
> +	    !guest_cpu_cap_has(vcpu, X86_FEATURE_LM))
>  		return false;
>  
> -	if (efer & EFER_NX && !guest_cpuid_has(vcpu, X86_FEATURE_NX))
> +	if (efer & EFER_NX && !guest_cpu_cap_has(vcpu, X86_FEATURE_NX))
>  		return false;
>  
>  	return true;
> @@ -1872,8 +1872,8 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
>  			return 1;
>  
>  		if (!host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) &&
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID))
>  			return 1;
>  
>  		/*
> @@ -1929,8 +1929,8 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
>  			return 1;
>  
>  		if (!host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDTSCP) &&
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID))
>  			return 1;
>  		break;
>  	}
> @@ -2122,7 +2122,7 @@ EXPORT_SYMBOL_GPL(kvm_handle_invalid_op);
>  static int kvm_emulate_monitor_mwait(struct kvm_vcpu *vcpu, const char *insn)
>  {
>  	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS) &&
> -	    !guest_cpuid_has(vcpu, X86_FEATURE_MWAIT))
> +	    !guest_cpu_cap_has(vcpu, X86_FEATURE_MWAIT))
>  		return kvm_handle_invalid_op(vcpu);
>  
>  	pr_warn_once("%s instruction emulated as NOP!\n", insn);
> @@ -3761,11 +3761,11 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  			if ((!guest_has_pred_cmd_msr(vcpu)))
>  				return 1;
>  
> -			if (!guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
> -			    !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBPB))
> +			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
> +			    !guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBPB))
>  				reserved_bits |= PRED_CMD_IBPB;
>  
> -			if (!guest_cpuid_has(vcpu, X86_FEATURE_SBPB))
> +			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SBPB))
>  				reserved_bits |= PRED_CMD_SBPB;
>  		}
>  
> @@ -3786,7 +3786,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	}
>  	case MSR_IA32_FLUSH_CMD:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D))
>  			return 1;
>  
>  		if (!boot_cpu_has(X86_FEATURE_FLUSH_L1D) || (data & ~L1D_FLUSH))
> @@ -3837,7 +3837,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		kvm_set_lapic_tscdeadline_msr(vcpu, data);
>  		break;
>  	case MSR_IA32_TSC_ADJUST:
> -		if (guest_cpuid_has(vcpu, X86_FEATURE_TSC_ADJUST)) {
> +		if (guest_cpu_cap_has(vcpu, X86_FEATURE_TSC_ADJUST)) {
>  			if (!msr_info->host_initiated) {
>  				s64 adj = data - vcpu->arch.ia32_tsc_adjust_msr;
>  				adjust_tsc_offset_guest(vcpu, adj);
> @@ -3864,7 +3864,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  
>  		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
>  		    ((old_val ^ data)  & MSR_IA32_MISC_ENABLE_MWAIT)) {
> -			if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
> +			if (!guest_cpu_cap_has(vcpu, X86_FEATURE_XMM3))
>  				return 1;
>  			vcpu->arch.ia32_misc_enable_msr = data;
>  			kvm_update_cpuid_runtime(vcpu);
> @@ -3892,7 +3892,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		break;
>  	case MSR_IA32_XSS:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES))
>  			return 1;
>  		/*
>  		 * KVM supports exposing PT to the guest, but does not support
> @@ -4039,12 +4039,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		kvm_pr_unimpl_wrmsr(vcpu, msr, data);
>  		break;
>  	case MSR_AMD64_OSVW_ID_LENGTH:
> -		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
>  			return 1;
>  		vcpu->arch.osvw.length = data;
>  		break;
>  	case MSR_AMD64_OSVW_STATUS:
> -		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
>  			return 1;
>  		vcpu->arch.osvw.status = data;
>  		break;
> @@ -4065,7 +4065,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  #ifdef CONFIG_X86_64
>  	case MSR_IA32_XFD:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
>  			return 1;
>  
>  		if (data & ~kvm_guest_supported_xfd(vcpu))
> @@ -4075,7 +4075,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		break;
>  	case MSR_IA32_XFD_ERR:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
>  			return 1;
>  
>  		if (data & ~kvm_guest_supported_xfd(vcpu))
> @@ -4199,13 +4199,13 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		break;
>  	case MSR_IA32_ARCH_CAPABILITIES:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
>  			return 1;
>  		msr_info->data = vcpu->arch.arch_capabilities;
>  		break;
>  	case MSR_IA32_PERF_CAPABILITIES:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_PDCM))
>  			return 1;
>  		msr_info->data = vcpu->arch.perf_capabilities;
>  		break;
> @@ -4361,7 +4361,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  				   msr_info->host_initiated);
>  	case MSR_IA32_XSS:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES))
>  			return 1;
>  		msr_info->data = vcpu->arch.ia32_xss;
>  		break;
> @@ -4404,12 +4404,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		msr_info->data = 0xbe702111;
>  		break;
>  	case MSR_AMD64_OSVW_ID_LENGTH:
> -		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
>  			return 1;
>  		msr_info->data = vcpu->arch.osvw.length;
>  		break;
>  	case MSR_AMD64_OSVW_STATUS:
> -		if (!guest_cpuid_has(vcpu, X86_FEATURE_OSVW))
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_OSVW))
>  			return 1;
>  		msr_info->data = vcpu->arch.osvw.status;
>  		break;
> @@ -4428,14 +4428,14 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  #ifdef CONFIG_X86_64
>  	case MSR_IA32_XFD:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
>  			return 1;
>  
>  		msr_info->data = vcpu->arch.guest_fpu.fpstate->xfd;
>  		break;
>  	case MSR_IA32_XFD_ERR:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_XFD))
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_XFD))
>  			return 1;
>  
>  		msr_info->data = vcpu->arch.guest_fpu.xfd_err;
> @@ -8368,17 +8368,17 @@ static bool emulator_get_cpuid(struct x86_emulate_ctxt *ctxt,
>  
>  static bool emulator_guest_has_movbe(struct x86_emulate_ctxt *ctxt)
>  {
> -	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_MOVBE);
> +	return guest_cpu_cap_has(emul_to_vcpu(ctxt), X86_FEATURE_MOVBE);
>  }
>  
>  static bool emulator_guest_has_fxsr(struct x86_emulate_ctxt *ctxt)
>  {
> -	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_FXSR);
> +	return guest_cpu_cap_has(emul_to_vcpu(ctxt), X86_FEATURE_FXSR);
>  }
>  
>  static bool emulator_guest_has_rdpid(struct x86_emulate_ctxt *ctxt)
>  {
> -	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_RDPID);
> +	return guest_cpu_cap_has(emul_to_vcpu(ctxt), X86_FEATURE_RDPID);
>  }
>  
>  static ulong emulator_read_gpr(struct x86_emulate_ctxt *ctxt, unsigned reg)


Looks OK, but the patch is long and I might have missed something.

Best regards,
	Maxim Levitsky




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 9/9] KVM: x86: Restrict XSAVE in cpu_caps based on KVM capabilities
  2023-11-10 23:55 ` [PATCH 9/9] KVM: x86: Restrict XSAVE in cpu_caps based on KVM capabilities Sean Christopherson
@ 2023-11-19 17:36   ` Maxim Levitsky
  0 siblings, 0 replies; 40+ messages in thread
From: Maxim Levitsky @ 2023-11-19 17:36 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel

On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> Restrict XSAVE in guest cpu_caps so that XSAVES dependencies on XSAVE are
> automatically handled instead of manually checking for host and guest
> XSAVE support.  Aside from modifying XSAVE in cpu_caps, this should be a
> glorified nop as KVM doesn't query guest XSAVE support (which is also why
> it wasn't/isn't a bug to leave XSAVE set in guest CPUID).
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/svm.c | 2 +-
>  arch/x86/kvm/vmx/vmx.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 9e3a9191dac1..6fe2d7bf4959 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4315,8 +4315,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
>  	 * the guest read/write access to the host's XSS.
>  	 */
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVE);
>  	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
> -			     boot_cpu_has(X86_FEATURE_XSAVE) &&
>  			     boot_cpu_has(X86_FEATURE_XSAVES) &&
>  			     guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE));
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 815692dc0aff..7645945af5c5 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7752,8 +7752,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * to the guest.  XSAVES depends on CR4.OSXSAVE, and CR4.OSXSAVE can be
>  	 * set if and only if XSAVE is supported.
>  	 */
> -	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
> -	    guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE))
> +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVE);
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVE))
>  		guest_cpu_cap_restrict(vcpu, X86_FEATURE_XSAVES);
>  	else
>  		guest_cpu_cap_clear(vcpu, X86_FEATURE_XSAVES);

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky





^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2023-11-17  8:33       ` Yang, Weijiang
@ 2023-11-21  3:10         ` Yuan Yao
  0 siblings, 0 replies; 40+ messages in thread
From: Yuan Yao @ 2023-11-21  3:10 UTC (permalink / raw)
  To: Yang, Weijiang
  Cc: Sean Christopherson, Paolo Bonzini, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, Maxim Levitsky

On Fri, Nov 17, 2023 at 04:33:27PM +0800, Yang, Weijiang wrote:
> On 11/17/2023 6:29 AM, Sean Christopherson wrote:
> > On Thu, Nov 16, 2023, Weijiang Yang wrote:
> > > On 11/11/2023 7:55 AM, Sean Christopherson wrote:
> > >
> > > [...]
> > >
> > > > -static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
> > > > -							unsigned int x86_feature)
> > > > +static __always_inline void guest_cpu_cap_clear(struct kvm_vcpu *vcpu,
> > > > +						unsigned int x86_feature)
> > > >    {
> > > > -	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
> > > > +	unsigned int x86_leaf = __feature_leaf(x86_feature);
> > > > +
> > > > +	reverse_cpuid_check(x86_leaf);
> > > > +	vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
> > > > +}
> > > > +
> > > > +static __always_inline void guest_cpu_cap_change(struct kvm_vcpu *vcpu,
> > > > +						 unsigned int x86_feature,
> > > > +						 bool guest_has_cap)
> > > > +{
> > > > +	if (guest_has_cap)
> > > >    		guest_cpu_cap_set(vcpu, x86_feature);
> > > > +	else
> > > > +		guest_cpu_cap_clear(vcpu, x86_feature);
> > > > +}
> > > I don't see any necessity to add 3 functions, i.e., guest_cpu_cap_{set, clear, change}, for
> > I want to have equivalents to the cpuid_entry_*() APIs so that we don't end up
> > with two different sets of names.  And the clear() API already has a second user.
> >
> > > guest_cpu_cap update. IMHO one function is enough, e.g.:
> > Hrm, I open coded the OR/AND logic in cpuid_entry_change() to try to force CMOV
> > instead of Jcc.  That honestly seems like a pointless optimization.  I would
> > rather use the helpers, which is less code.
> >
> > > static __always_inline void guest_cpu_cap_update(struct kvm_vcpu *vcpu,
> > >                                                   unsigned int x86_feature,
> > >                                                   bool guest_has_cap)
> > > {
> > >          unsigned int x86_leaf = __feature_leaf(x86_feature);
> > >
> > >          reverse_cpuid_check(x86_leaf);
> > >          if (guest_has_cap)
> > >                  vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
> > >          else
> > >                  vcpu->arch.cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
> > > }
> > >
> > > > +
> > > > +static __always_inline void guest_cpu_cap_restrict(struct kvm_vcpu *vcpu,
> > > > +						   unsigned int x86_feature)
> > > > +{
> > > > +	if (!kvm_cpu_cap_has(x86_feature))
> > > > +		guest_cpu_cap_clear(vcpu, x86_feature);
> > > >    }
> > > The name _restrict doesn't make clear to me what the function actually does --
> > > it conditionally clears the guest cap depending on KVM's support for the feature.
> > >
> > > How about renaming it to guest_cpu_cap_sync()?
> > "sync" isn't correct because it's not synchronizing with KVM's capabilitiy, e.g.
> > the guest capability will remaing unset if the guest CPUID bit is clear but the
> > KVM capability is available.
> >
> > How about constrain()?
> I don't know; it's just that we already have guest_cpu_cap_{set, clear, change},
> and here the name can't exactly match the behavior of the function. Maybe
> guest_cpu_cap_filter()? But just ignore the nit, up to you to decide the name :-)

How about guest_cpu_cap_kvm_restrict or guest_cpu_cap_kvm_constrain ?

>
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap"
  2023-11-10 23:55 ` [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
  2023-11-19 17:08   ` Maxim Levitsky
@ 2023-11-21  3:20   ` Chao Gao
  1 sibling, 0 replies; 40+ messages in thread
From: Chao Gao @ 2023-11-21  3:20 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Maxim Levitsky

On Fri, Nov 10, 2023 at 03:55:20PM -0800, Sean Christopherson wrote:
>As the first step toward replacing KVM's so-called "governed features"
>framework with a more comprehensive, less poorly named implementation,
>replace the "kvm_governed_feature" function prefix with "guest_cpu_cap"
>and rename guest_can_use() to guest_cpu_cap_has().
>
>The "guest_cpu_cap" naming scheme mirrors that of "kvm_cpu_cap", and
>provides a more clear distinction between guest capabilities, which are
>KVM controlled (heh, or one might say "governed"), and guest CPUID, which
>with few exceptions is fully userspace controlled.
>
>No functional change intended.
>
>Signed-off-by: Sean Christopherson <seanjc@google.com>
>---
> arch/x86/kvm/cpuid.c      |  2 +-
> arch/x86/kvm/cpuid.h      | 12 ++++++------
> arch/x86/kvm/mmu/mmu.c    |  4 ++--
> arch/x86/kvm/svm/nested.c | 22 +++++++++++-----------
> arch/x86/kvm/svm/svm.c    | 26 +++++++++++++-------------
> arch/x86/kvm/svm/svm.h    |  4 ++--
> arch/x86/kvm/vmx/nested.c |  6 +++---
> arch/x86/kvm/vmx/vmx.c    | 14 +++++++-------
> arch/x86/kvm/x86.c        |  4 ++--
> 9 files changed, 47 insertions(+), 47 deletions(-)
>
>diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>index dda6fc4cfae8..4f464187b063 100644
>--- a/arch/x86/kvm/cpuid.c
>+++ b/arch/x86/kvm/cpuid.c
>@@ -345,7 +345,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> 	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
> 				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
> 	if (allow_gbpages)
>-		kvm_governed_feature_set(vcpu, X86_FEATURE_GBPAGES);
>+		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
> 
> 	best = kvm_find_cpuid_entry(vcpu, 1);
> 	if (best && apic) {
>diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
>index 0b90532b6e26..245416ffa34c 100644
>--- a/arch/x86/kvm/cpuid.h
>+++ b/arch/x86/kvm/cpuid.h
>@@ -254,7 +254,7 @@ static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
> 	return kvm_governed_feature_index(x86_feature) >= 0;
> }
> 
>-static __always_inline void kvm_governed_feature_set(struct kvm_vcpu *vcpu,
>+static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
> 						     unsigned int x86_feature)

nit: wrong indentation.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-19 17:35   ` Maxim Levitsky
@ 2023-11-24  6:33     ` Xu Yilun
  2023-11-28  0:43       ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Xu Yilun @ 2023-11-24  6:33 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel

On Sun, Nov 19, 2023 at 07:35:30PM +0200, Maxim Levitsky wrote:
> On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > When updating guest CPUID entries to emulate runtime behavior, e.g. when
> > the guest enables a CR4-based feature that is tied to a CPUID flag, also
> > update the vCPU's cpu_caps accordingly.  This will allow replacing all
> > usage of guest_cpuid_has() with guest_cpu_cap_has().
> > 
> > Take care not to update guest capabilities when KVM is updating CPUID
> > entries that *may* become the vCPU's CPUID, e.g. if userspace tries to set
> > bogus CPUID information.  No extra call to update cpu_caps is needed as
> > the cpu_caps are initialized from the incoming guest CPUID, i.e. will
> > automatically get the updated values.
> > 
> > Note, none of the features in question use guest_cpu_cap_has() at this
> > time, i.e. aside from setting bits in cpu_caps, this is a glorified nop.
> > 
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/cpuid.c | 48 +++++++++++++++++++++++++++++++-------------
> >  1 file changed, 34 insertions(+), 14 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 36bd04030989..37a991439fe6 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -262,31 +262,48 @@ static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)
> >  	return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
> >  }
> >  
> > +static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
> > +						       struct kvm_cpuid_entry2 *entry,
> > +						       unsigned int x86_feature,
> > +						       bool has_feature)
> > +{
> > +	if (entry)
> > +		cpuid_entry_change(entry, x86_feature, has_feature);
> > +
> > +	if (vcpu)
> > +		guest_cpu_cap_change(vcpu, x86_feature, has_feature);
> > +}
> > +
> >  static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> >  				       int nent)
> >  {
> >  	struct kvm_cpuid_entry2 *best;
> > +	struct kvm_vcpu *caps = vcpu;
> > +
> > +	/*
> > +	 * Don't update vCPU capabilities if KVM is updating CPUID entries that
> > +	 * are coming in from userspace!
> > +	 */
> > +	if (entries != vcpu->arch.cpuid_entries)
> > +		caps = NULL;
> 
> I think that this should be decided by the caller. Just a boolean will suffice.

kvm_set_cpuid() calls this function only to validate/adjust the temporary
"entries" variable, while kvm_update_cpuid_runtime() calls it to make
system-level changes.

So I kind of agree with making the caller fully aware; how about adding a
newly named wrapper for kvm_set_cpuid(), like:


  static void kvm_adjust_cpuid_entry(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
				     int nent)

  {
	WARN_ON(entries == vcpu->arch.cpuid_entries);
	__kvm_update_cpuid_runtime(vcpu, entries, nent);
  }

> 
> Or even better: since the userspace CPUID update is really not important in terms of performance,
> why to special case it? 
> 
> Even if these guest caps are later overwritten, I don't see why we
> need to avoid updating them, and in fact introduce a small risk of them not being consistent

IIUC, in the kvm_set_cpuid() case, KVM may still fail the userspace CPUID setting,
so we can't change guest caps at this stage.

Thanks,
Yilun

> with the other cpu caps.
> 
> With this we can avoid having the 'cap' variable which is *very* confusing as well.
> 
> 
> >  
> >  	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> > -	if (best) {
> > -		/* Update OSXSAVE bit */
> > -		if (boot_cpu_has(X86_FEATURE_XSAVE))
> > -			cpuid_entry_change(best, X86_FEATURE_OSXSAVE,
> > +
> > +	if (boot_cpu_has(X86_FEATURE_XSAVE))
> > +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSXSAVE,
> >  					   kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE));
> >  
> > -		cpuid_entry_change(best, X86_FEATURE_APIC,
> > -			   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> > +	kvm_update_feature_runtime(caps, best, X86_FEATURE_APIC,
> > +				   vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE);
> >  
> > -		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> > -			cpuid_entry_change(best, X86_FEATURE_MWAIT,
> > -					   vcpu->arch.ia32_misc_enable_msr &
> > -					   MSR_IA32_MISC_ENABLE_MWAIT);
> > -	}
> > +	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT))
> > +		kvm_update_feature_runtime(caps, best, X86_FEATURE_MWAIT,
> > +					   vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT);
> >  
> >  	best = cpuid_entry2_find(entries, nent, 7, 0);
> > -	if (best && boot_cpu_has(X86_FEATURE_PKU))
> > -		cpuid_entry_change(best, X86_FEATURE_OSPKE,
> > -				   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
> > +	if (boot_cpu_has(X86_FEATURE_PKU))
> > +		kvm_update_feature_runtime(caps, best, X86_FEATURE_OSPKE,
> > +					   kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
> >  
> >  	best = cpuid_entry2_find(entries, nent, 0xD, 0);
> >  	if (best)
> > @@ -353,6 +370,9 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
> >  	 * honor userspace's definition for features that don't require KVM or
> >  	 * hardware management/support (or that KVM simply doesn't care about).
> > +	 *
> > +	 * Note, KVM has already done runtime updates on guest CPUID, i.e. this
> > +	 * will also correctly set runtime features in guest CPU capabilities.
> >  	 */
> >  	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
> >  		const struct cpuid_reg cpuid = reverse_cpuid[i];
> 
> 
> Best regards,
> 	Maxim Levitsky
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-24  6:33     ` Xu Yilun
@ 2023-11-28  0:43       ` Sean Christopherson
  2023-11-28  5:13         ` Xu Yilun
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2023-11-28  0:43 UTC (permalink / raw)
  To: Xu Yilun; +Cc: Maxim Levitsky, Paolo Bonzini, kvm, linux-kernel

On Fri, Nov 24, 2023, Xu Yilun wrote:
> On Sun, Nov 19, 2023 at 07:35:30PM +0200, Maxim Levitsky wrote:
> > On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > >  static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> > >  				       int nent)
> > >  {
> > >  	struct kvm_cpuid_entry2 *best;
> > > +	struct kvm_vcpu *caps = vcpu;
> > > +
> > > +	/*
> > > +	 * Don't update vCPU capabilities if KVM is updating CPUID entries that
> > > +	 * are coming in from userspace!
> > > +	 */
> > > +	if (entries != vcpu->arch.cpuid_entries)
> > > +		caps = NULL;
> > 
> > I think that this should be decided by the caller. Just a boolean will suffice.

I strongly disagree.  The _only_ time the caps should be updated is if
entries == vcpu->arch.cpuid_entries, and if entries == cpuid_entries then the caps
should _always_ be updated.

> kvm_set_cpuid() calls this function only to validate/adjust the temporary
> "entries" variable, while kvm_update_cpuid_runtime() calls it to make
> system-level changes.
> 
> So I kind of agree with making the caller fully aware; how about adding a
> newly named wrapper for kvm_set_cpuid(), like:
> 
> 
>   static void kvm_adjust_cpuid_entry(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> 				     int nent)
> 
>   {
> 	WARN_ON(entries == vcpu->arch.cpuid_entries);
> 	__kvm_update_cpuid_runtime(vcpu, entries, nent);

But taking it a step further, we end up with

	WARN_ON_ONCE(update_caps != (entries == vcpu->arch.cpuid_entries));

which is silly since any bugs that would result in the WARN firing can be avoided
by doing:

	update_caps = entries == vcpu->arch.cpuid_entries;

which eventually distils down to the code I posted.

> > Or even better: since the userspace CPUID update is really not important in
> > terms of performance, why special-case it?
> > 
> > Even if these guest caps are later overwritten, I don't see why we need to
> > avoid updating them, and in fact introduce a small risk of them not being
> > consistent
> 
> IIUC, in the kvm_set_cpuid() case, KVM may still fail the userspace CPUID setting,
> so we can't change guest caps at this stage.

> Or even better: since the userspace CPUID update is really not important in
> terms of performance, why special-case it?

Yep, and sadly __kvm_update_cpuid_runtime() *must* be invoked before kvm_set_cpuid()
is guaranteed to succeed because the whole point is to massage guest CPUID before
checking for divergences.
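
Roughly, the ordering is (just a sketch, not the exact code):

	static int kvm_set_cpuid(struct kvm_vcpu *vcpu,
				 struct kvm_cpuid_entry2 *e2, int nent)
	{
		/* Massage the incoming entries, e.g. OSXSAVE/OSPKE/MWAIT. */
		__kvm_update_cpuid_runtime(vcpu, e2, nent);

		/*
		 * Validation that can still fail, e.g. checking for
		 * divergences if the vCPU has already run.
		 */

		/* Only on success does e2 become vcpu->arch.cpuid_entries. */
	}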

> > With this we can avoid having the 'cap' variable which is *very* confusing as well.

I agree the "caps" variable is confusing, but it's the least awful option I see.
The alternatives I can think of are:

  1. Update a dummy caps array
  2. Take a snapshot of the caps and restore them
  3. Have separate paths for updated guest CPUID versus guest caps

#1 would require passing a "u32 *" to guest_cpu_cap_change() (or an equivalent),
which I really, really don't want to do.  It's also a waste of cycles, and I'm
skeptical that it would be any less confusing than the proposed code.

#2 increases the complexity of kvm_set_cpuid() by introducing recovery paths, i.e.
adds more things that can break, and again is wasteful (though copying ~100 bytes
or so in a slow path isn't a big deal).

#3 would create unnecessary maintenance burden as we'd have to ensure any changes
hit both paths.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/9] KVM: x86: Replace guts of "goverened" features with comprehensive cpu_caps
  2023-11-19 17:22   ` Maxim Levitsky
@ 2023-11-28  1:24     ` Sean Christopherson
  0 siblings, 0 replies; 40+ messages in thread
From: Sean Christopherson @ 2023-11-28  1:24 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: Paolo Bonzini, kvm, linux-kernel

On Sun, Nov 19, 2023, Maxim Levitsky wrote:
> On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > @@ -840,23 +856,15 @@ struct kvm_vcpu_arch {
> >  	struct kvm_hypervisor_cpuid kvm_cpuid;
> >  
> >  	/*
> > -	 * FIXME: Drop this macro and use KVM_NR_GOVERNED_FEATURES directly
> > -	 * when "struct kvm_vcpu_arch" is no longer defined in an
> > -	 * arch/x86/include/asm header.  The max is mostly arbitrary, i.e.
> > -	 * can be increased as necessary.
> > +	 * Track the effective guest capabilities, i.e. the features the vCPU
> > +	 * is allowed to use.  Typically, but not always, features can be used
> > +	 * by the guest if and only if both KVM and userspace want to expose
> > +	 * the feature to the guest.  A common exception is for virtualization
> > +	 * holes, i.e. when KVM can't prevent the guest from using a feature,
> > +	 * in which case the vCPU "has" the feature regardless of what KVM or
> > +	 * userspace desires.
> >  	 */
> > -#define KVM_MAX_NR_GOVERNED_FEATURES BITS_PER_LONG
> > -
> > -	/*
> > -	 * Track whether or not the guest is allowed to use features that are
> > -	 * governed by KVM, where "governed" means KVM needs to manage state
> > -	 * and/or explicitly enable the feature in hardware.  Typically, but
> > -	 * not always, governed features can be used by the guest if and only
> > -	 * if both KVM and userspace want to expose the feature to the guest.
> > -	 */
> > -	struct {
> > -		DECLARE_BITMAP(enabled, KVM_MAX_NR_GOVERNED_FEATURES);
> > -	} governed_features;
> > +	u32 cpu_caps[NR_KVM_CPU_CAPS];
> 
> Won't it be better to call this 'effective_cpu_caps' or something like that,
> to put emphasis on the fact that these are not exactly the cpu caps that
> userspace wants.  Although probably any name will still be somewhat
> confusing.

I'd prefer to keep 'cpu_caps' so that the name is aligned with the APIs, e.g. I
think having "effective" in the field but not e.g. guest_cpu_cap_set() would be
even more confusing.

Also, looking at this again, "effective" isn't strictly correct (my comment above
is also wrong), as virtualization holes that neither userspace nor KVM cares about
aren't reflected in CPUID caps.  E.g. things like MOVBE and POPCNT have a CPUID
flag but no way to prevent the guest from using them.

So a truly accurate name would have to be something like
effective_cpu_caps_that_kvm_cares_about.  :-)

> >  	u64 reserved_gpa_bits;
> >  	int maxphyaddr;
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 4f464187b063..4bf3c2d4dc7c 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -327,9 +327,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  	struct kvm_cpuid_entry2 *best;
> >  	bool allow_gbpages;
> >  
> > -	BUILD_BUG_ON(KVM_NR_GOVERNED_FEATURES > KVM_MAX_NR_GOVERNED_FEATURES);
> > -	bitmap_zero(vcpu->arch.governed_features.enabled,
> > -		    KVM_MAX_NR_GOVERNED_FEATURES);
> > +	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
> >  
> >  	/*
> >  	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
> > diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> > index 245416ffa34c..9f18c4395b71 100644
> > --- a/arch/x86/kvm/cpuid.h
> > +++ b/arch/x86/kvm/cpuid.h
> > @@ -255,12 +255,12 @@ static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
> >  }
> >  
> >  static __always_inline void guest_cpu_cap_set(struct kvm_vcpu *vcpu,
> > -						     unsigned int x86_feature)
> > +					      unsigned int x86_feature)
> >  {
> > -	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
> > +	unsigned int x86_leaf = __feature_leaf(x86_feature);
> >  
> > -	__set_bit(kvm_governed_feature_index(x86_feature),
> > -		  vcpu->arch.governed_features.enabled);
> > +	reverse_cpuid_check(x86_leaf);
> > +	vcpu->arch.cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
> >  }
> >  
> >  static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
> > @@ -273,10 +273,10 @@ static __always_inline void guest_cpu_cap_check_and_set(struct kvm_vcpu *vcpu,
> >  static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
> >  					      unsigned int x86_feature)
> >  {
> > -	BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
> > +	unsigned int x86_leaf = __feature_leaf(x86_feature);
> >  
> > -	return test_bit(kvm_governed_feature_index(x86_feature),
> > -			vcpu->arch.governed_features.enabled);
> > +	reverse_cpuid_check(x86_leaf);
> > +	return vcpu->arch.cpu_caps[x86_leaf] & __feature_bit(x86_feature);
> >  }
> 
> It might make sense to think about extracting the common code between
> kvm_cpu_cap* and guest_cpu_cap*.
> 
> The whole notion of reverse cpuid, KVM-only leaves, and other nice things
> that it has is already very confusing, but as I understand it, there is
> no better way of doing it.
> But there must be a way to avoid at least duplicating this logic.

Yeah, that thought crossed my mind too, but de-duplicating the interesting bits
would require macros, which I think would be a net negative for this code.  I
could definitely be convinced otherwise though, I do love me some macros :-)

Something easy I can, should, and will do regardless is to move reverse_cpuid_check()
into __feature_leaf().  It's already in __feature_bit(), and that would cut down
the copy+paste at least a little bit even if we do/don't use macros.
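
E.g. something like this (untested, just to illustrate):

	static __always_inline u32 __feature_leaf(int x86_feature)
	{
		u32 x86_leaf = x86_feature / 32;

		reverse_cpuid_check(x86_leaf);
		return x86_leaf;
	}

at which point the guest_cpu_cap helpers wouldn't need to call
reverse_cpuid_check() manually.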

> Also speaking of this logic, it would be nice to document it.
> E.g. for 'kvm_only_cpuid_leafs' it would be nice to have an explanation
> for each entry on why it is needed.

As in, which bits KVM cares about?  That's guaranteed to become stale, and the
high level answer is always "because KVM needs it, but the kernel does not".  The
actual bits are kinda sorta documented by KVM_X86_FEATURE() usage, and any
conflicts with the kernel's cpufeatures.h should result in a build error due to
KVM trying to redefine a macro.

> Just curious: I wonder why Intel called them leaves?
> CPUID leaves are just table entries, I don't see any tree there.

LOL, I suppose a tree without branches can still have leaves?
 
> Finally, isn't the plural of "leaf" "leaves"?

Heh, yes, "leaves" is the more standard plural form of "leaf".  And looking at the
SDM, "leafs" is used mostly for GETSEC and ENCL{S,U} "leafs".  I spent a lot of
my formative x86 years doing SMX stuff, and then way too much time on SGX, so I
guess "leafs" just stuck with me.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features
  2023-11-28  0:43       ` Sean Christopherson
@ 2023-11-28  5:13         ` Xu Yilun
  0 siblings, 0 replies; 40+ messages in thread
From: Xu Yilun @ 2023-11-28  5:13 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Maxim Levitsky, Paolo Bonzini, kvm, linux-kernel

On Mon, Nov 27, 2023 at 04:43:45PM -0800, Sean Christopherson wrote:
> On Fri, Nov 24, 2023, Xu Yilun wrote:
> > On Sun, Nov 19, 2023 at 07:35:30PM +0200, Maxim Levitsky wrote:
> > > On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > > >  static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> > > >  				       int nent)
> > > >  {
> > > >  	struct kvm_cpuid_entry2 *best;
> > > > +	struct kvm_vcpu *caps = vcpu;
> > > > +
> > > > +	/*
> > > > +	 * Don't update vCPU capabilities if KVM is updating CPUID entries that
> > > > +	 * are coming in from userspace!
> > > > +	 */
> > > > +	if (entries != vcpu->arch.cpuid_entries)
> > > > +		caps = NULL;
> > > 
> > > I think that this should be decided by the caller. Just a boolean will suffice.
> 
> I strongly disagree.  The _only_ time the caps should be updated is if
> entries == vcpu->arch.cpuid_entries, and if entries == cpuid_entries then the caps
> should _always_ be updated.
> 
> > kvm_set_cpuid() calls this function only to validate/adjust the temporary
> > "entries" variable, while kvm_update_cpuid_runtime() calls it to make
> > system-level changes.
> > 
> > So I kind of agree with making the caller fully aware; how about adding a
> > newly named wrapper for kvm_set_cpuid(), like:
> > 
> > 
> >   static void kvm_adjust_cpuid_entry(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> > 				     int nent)
> > 
> >   {
> > 	WARN_ON(entries == vcpu->arch.cpuid_entries);
> > 	__kvm_update_cpuid_runtime(vcpu, entries, nent);
> 
> But taking it a step further, we end up with
> 
> 	WARN_ON_ONCE(update_caps != (entries == vcpu->arch.cpuid_entries));
> 
> which is silly since any bugs that would result in the WARN firing can be avoided
> by doing:
> 
> 	update_caps = entries == vcpu->arch.cpuid_entries;
> 
> which eventually distils down to the code I posted.

OK, I agree with you.

My initial idea was to make it easier for developers to recognize what is
happening from the name, without looking into __kvm_update_cpuid_runtime().
But it seems to cause other subtle confusion and waste cycles. Maybe the
comment is already good enough for developers.

Thanks,
Yilun

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2023-11-19 17:32   ` Maxim Levitsky
@ 2023-12-01  1:51     ` Sean Christopherson
  2023-12-21 16:59       ` Maxim Levitsky
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2023-12-01  1:51 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: Paolo Bonzini, kvm, linux-kernel

On Sun, Nov 19, 2023, Maxim Levitsky wrote:
> On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > +/*
> > + * This isn't truly "unsafe", but all callers except kvm_cpu_after_set_cpuid()
> > + * should use __cpuid_entry_get_reg(), which provides compile-time validation
> > + * of the input.
> > + */
> > +static u32 cpuid_get_reg_unsafe(struct kvm_cpuid_entry2 *entry, u32 reg)
> > +{
> > +	switch (reg) {
> > +	case CPUID_EAX:
> > +		return entry->eax;
> > +	case CPUID_EBX:
> > +		return entry->ebx;
> > +	case CPUID_ECX:
> > +		return entry->ecx;
> > +	case CPUID_EDX:
> > +		return entry->edx;
> > +	default:
> > +		WARN_ON_ONCE(1);
> > +		return 0;
> > +	}
> > +}

...

> >  static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  {
> >  	struct kvm_lapic *apic = vcpu->arch.apic;
> >  	struct kvm_cpuid_entry2 *best;
> >  	bool allow_gbpages;
> > +	int i;
> >  
> > -	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
> > +	BUILD_BUG_ON(ARRAY_SIZE(reverse_cpuid) != NR_KVM_CPU_CAPS);
> > +
> > +	/*
> > +	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
> > +	 * honor userspace's definition for features that don't require KVM or
> > +	 * hardware management/support (or that KVM simply doesn't care about).
> > +	 */
> > +	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
> > +		const struct cpuid_reg cpuid = reverse_cpuid[i];
> > +
> > +		best = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
> > +		if (best)
> > +			vcpu->arch.cpu_caps[i] = cpuid_get_reg_unsafe(best, cpuid.reg);
> 
> Why not just use __cpuid_entry_get_reg? 
> 
> cpuid.reg comes from read-only 'reverse_cpuid' anyway, and in fact
> it seems that all callers of __cpuid_entry_get_reg take the reg value from
> x86_feature_cpuid() which also takes it from 'reverse_cpuid'.
> 
> So if the compiler is smart enough to not complain in these cases, I don't
> see why this case is different.

It's because the input isn't a compile-time constant, and so the BUILD_BUG() in
the default path will fire.  All of the compile-time assertions in reverse_cpuid.h
rely on the feature being a constant value, which allows the compiler to optimize
away the dead paths, i.e. turn __cpuid_entry_get_reg()'s switch statement into
simple pointer arithmetic and thus omit the BUILD_BUG() code.
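
For reference, the default path in question looks roughly like this
(trimmed):

	static __always_inline u32 *__cpuid_entry_get_reg(struct kvm_cpuid_entry2 *entry,
							  u32 reg)
	{
		switch (reg) {
		case CPUID_EAX:
			return &entry->eax;
		...
		default:
			BUILD_BUG();
			return NULL;
		}
	}

With a non-constant 'reg', the compiler can't prove the default case is
dead, the BUILD_BUG() reference survives, and the build fails.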

> Also, why not initialize guest_caps = host_caps & userspace_cpuid?
>
> If this were the default we wouldn't need any guest_cpu_cap_restrict and such;
> instead it would just work.

Hrm, I definitely like the idea.  Unfortunately, unless we do an audit of all
~120 uses of guest_cpuid_has(), restricting those based on kvm_cpu_caps might
break userspace.

Aside from purging the governed feature nomenclature, the main goal of this series
is to provide a way to do fast lookups of all known guest CPUID bits without needing to
opt-in on a feature-by-feature basis, including for features that are fully
controlled by userspace.

It's definitely doable, but I'm not all that confident that the end result would
be a net positive, e.g. I believe we would need to special case things like the
feature bits that gate MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD.  MOVBE and RDPID
are other features that come to mind, where KVM emulates the feature in software
but it won't be set in kvm_cpu_caps.

Oof, and MONITOR and MWAIT too, as KVM deliberately doesn't advertise those to
userspace.

So yeah, I'm not opposed to trying that route at some point, but I really don't
want to do that in this series as the risk of subtly breaking something is super
high.

> Special code will only be needed in a few more complex cases, like forced exposure
> of a feature to a guest due to a virtualization hole.
> 
> 
> > +		else
> > +			vcpu->arch.cpu_caps[i] = 0;
> > +	}
> >  
> >  	/*
> >  	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
> > @@ -342,8 +380,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  	 */
> >  	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
> >  				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
> > -	if (allow_gbpages)
> > -		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
> > +	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
> 
> IMHO the original code was more readable; now I need to look up
> 'guest_cpu_cap_change()' to understand what is going on.

The change is "necessary".  The issue is that with the caps 0-initialied, the
!allow_gbpages could simply do nothing.  Now, KVM needs to explicitly clear the
flag, i.e. would need to do:

	if (allow_gbpages)
		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
	else
		guest_cpu_cap_clear(vcpu, X86_FEATURE_GBPAGES);

I don't much love the name either, but it pairs with cpuid_entry_change() and I
want to keep the kvm_cpu_cap, cpuid_entry, and guest_cpu_cap APIs in sync as far
as the APIs go.  The only reason kvm_cpu_cap_change() doesn't exist is because
there aren't any flows that need to toggle a bit.

> >  static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 8a99a73b6ee5..5827328e30f1 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -4315,14 +4315,14 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  	 * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
> >  	 * the guest read/write access to the host's XSS.
> >  	 */
> > -	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
> > -	    boot_cpu_has(X86_FEATURE_XSAVES) &&
> > -	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> > -		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
> > +	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
> > +			     boot_cpu_has(X86_FEATURE_XSAVE) &&
> > +			     boot_cpu_has(X86_FEATURE_XSAVES) &&
> > +			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
> 
> In theory this change does change behavior, now the X86_FEATURE_XSAVE will
> be set iff the condition is true, but before it was set *if* the condition was true.

No, before it was set if and only if the condition was true, because in that case
caps were 0-initialized, i.e. this was/is the only way for XSAVE to be set.

> > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
> > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
> > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
> > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);
> > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_TSCRATEMSR);
> > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_LBRV);
> 
> One of the main reasons I don't like governed features is this manual list.

To be fair, the manual lists predate the governed features.

> I want to reach the point that one won't need to add anything manually,
> unless there is a good reason to do so, and there are only a few exceptions
> when the guest cap is set, while the host's isn't.

Yeah, agreed.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2023-12-01  1:51     ` Sean Christopherson
@ 2023-12-21 16:59       ` Maxim Levitsky
  2024-01-05  2:13         ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Maxim Levitsky @ 2023-12-21 16:59 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel

On Thu, 2023-11-30 at 17:51 -0800, Sean Christopherson wrote:
> On Sun, Nov 19, 2023, Maxim Levitsky wrote:
> > On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > > +/*
> > > + * This isn't truly "unsafe", but all callers except kvm_cpu_after_set_cpuid()
> > > + * should use __cpuid_entry_get_reg(), which provides compile-time validation
> > > + * of the input.
> > > + */
> > > +static u32 cpuid_get_reg_unsafe(struct kvm_cpuid_entry2 *entry, u32 reg)
> > > +{
> > > +	switch (reg) {
> > > +	case CPUID_EAX:
> > > +		return entry->eax;
> > > +	case CPUID_EBX:
> > > +		return entry->ebx;
> > > +	case CPUID_ECX:
> > > +		return entry->ecx;
> > > +	case CPUID_EDX:
> > > +		return entry->edx;
> > > +	default:
> > > +		WARN_ON_ONCE(1);
> > > +		return 0;
> > > +	}
> > > +}
> 
> ...
> 
> > >  static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > >  {
> > >  	struct kvm_lapic *apic = vcpu->arch.apic;
> > >  	struct kvm_cpuid_entry2 *best;
> > >  	bool allow_gbpages;
> > > +	int i;
> > >  
> > > -	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
> > > +	BUILD_BUG_ON(ARRAY_SIZE(reverse_cpuid) != NR_KVM_CPU_CAPS);
> > > +
> > > +	/*
> > > +	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
> > > +	 * honor userspace's definition for features that don't require KVM or
> > > +	 * hardware management/support (or that KVM simply doesn't care about).
> > > +	 */
> > > +	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
> > > +		const struct cpuid_reg cpuid = reverse_cpuid[i];
> > > +
> > > +		best = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
> > > +		if (best)
> > > +			vcpu->arch.cpu_caps[i] = cpuid_get_reg_unsafe(best, cpuid.reg);
> > 
> > Why not just use __cpuid_entry_get_reg? 
> > 
> > cpuid.reg comes from read-only 'reverse_cpuid' anyway, and in fact
> > it seems that all callers of __cpuid_entry_get_reg take the reg value from
> > x86_feature_cpuid() which also takes it from 'reverse_cpuid'.
> > 
> > So if the compiler is smart enough to not complain in these cases, I don't
> > see why this case is different.
> 
> It's because the input isn't a compile-time constant, and so the BUILD_BUG() in
> the default path will fire. 
>  All of the compile-time assertions in reverse_cpuid.h
> rely on the feature being a constant value, which allows the compiler to optimize
> away the dead paths, i.e. turn __cpuid_entry_get_reg()'s switch statement into
> simple pointer arithmetic and thus omit the BUILD_BUG() code.

In the above code, assuming that the compiler really treats the reverse_cpuid as const
(if that assumption is not true, then all uses of __cpuid_entry_get_reg are also not compile
time constant either),
then the 'reg' value depends only on 'i', and therefore for each iteration of the loop,
the compiler does know the compile time value of the 'reg',
and so it can easily prove that 'default' case in __cpuid_entry_get_reg can't be reached,
and thus eliminate that BUILD_BUG().


> 
> > Also, why not initialize guest_caps = host_caps & userspace_cpuid?
> > 
> > If this were the default we wouldn't need any guest_cpu_cap_restrict and such;
> > instead it would just work.
> 
> Hrm, I definitely like the idea.  Unfortunately, unless we do an audit of all
> ~120 uses of guest_cpuid_has(), restricting those based on kvm_cpu_caps might
> break userspace.

120 uses is not that bad, IMHO it is worth it - we won't need to deal with that
in the future.

How about a compromise: you change the patches so that it will be possible to remove
these cases one by one, and so that all new cases will be fully automatic?


> 
> Aside from purging the governed feature nomenclature, the main goal of this series
> is to provide a way to do fast lookups of all known guest CPUID bits without needing to
> opt-in on a feature-by-feature basis, including for features that are fully
> controlled by userspace.

I'll note that this makes sense.
> 
> It's definitely doable, but I'm not all that confident that the end result would
> be a net positive, e.g. I believe we would need to special case things like the
> feature bits that gate MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD.  MOVBE and RDPID
> are other features that come to mind, where KVM emulates the feature in software
> but it won't be set in kvm_cpu_caps.

> 
> Oof, and MONITOR and MWAIT too, as KVM deliberately doesn't advertise those to
> userspace.
> 
> So yeah, I'm not opposed to trying that route at some point, but I really don't
> want to do that in this series as the risk of subtly breaking something is super
> high.
> 
> > Special code will only be needed in a few more complex cases, like forced exposure
> > of a feature to a guest due to a virtualization hole.
> > 
> > 
> > > +		else
> > > +			vcpu->arch.cpu_caps[i] = 0;
> > > +	}
> > >  
> > >  	/*
> > >  	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
> > > @@ -342,8 +380,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > >  	 */
> > >  	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
> > >  				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
> > > -	if (allow_gbpages)
> > > -		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
> > > +	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
> > 
> IMHO the original code was more readable; now I need to look up
> 'guest_cpu_cap_change()' to understand what is going on.
> 
> The change is "necessary".  The issue is that with the caps 0-initialized, the
> !allow_gbpages case could simply do nothing.  Now, KVM needs to explicitly clear the
> flag, i.e. would need to do:
> 
> 	if (allow_gbpages)
> 		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
> 	else
> 		guest_cpu_cap_clear(vcpu, X86_FEATURE_GBPAGES);

I understand now, but my complaint is more that I like the explicit longer
version better than calling guest_cpu_cap_change(), because it's not obvious
what guest_cpu_cap_change() really does. I am not going to fight over this
though, just saying.

> 
> I don't much love the name either, but it pairs with cpuid_entry_change() and I
> want to keep the kvm_cpu_cap, cpuid_entry, and guest_cpu_cap APIs in sync as far
> as the APIs go.  The only reason kvm_cpu_cap_change() doesn't exist is because
> there aren't any flows that need to toggle a bit.
> 
> > >  static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
> > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > > index 8a99a73b6ee5..5827328e30f1 100644
> > > --- a/arch/x86/kvm/svm/svm.c
> > > +++ b/arch/x86/kvm/svm/svm.c
> > > @@ -4315,14 +4315,14 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > >  	 * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
> > >  	 * the guest read/write access to the host's XSS.
> > >  	 */
> > > -	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
> > > -	    boot_cpu_has(X86_FEATURE_XSAVES) &&
> > > -	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> > > -		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
> > > +	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
> > > +			     boot_cpu_has(X86_FEATURE_XSAVE) &&
> > > +			     boot_cpu_has(X86_FEATURE_XSAVES) &&
> > > +			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
> > 
> > In theory this change does change behavior, now the X86_FEATURE_XSAVE will
> > be set iff the condition is true, but before it was set *if* the condition was true.
> 
> No, before it was set if and only if the condition was true, because in that case
> caps were 0-initialized, i.e. this was/is the only way for XSAVE to be set.
> 
> > > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
> > > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
> > > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
> > > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);
> > > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_TSCRATEMSR);
> > > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_LBRV);
> > 
> > One of the main reasons I don't like governed features is this manual list.
> 
> To be fair, the manual lists predate the governed features.

100% agree; however, the point of governed features was to simplify this list,
and the point of this patch set is to simplify these lists, yet they still remain
more or less untouched, and we will still need to maintain them.

Again I do think that governed features and/or this patchset are better than
the mess that was there before, but a part of me wants to fully get rid of this mess instead
of just making it a bit more beautiful. 

> 
> > I want to reach the point that one won't need to add anything manually,
> > unless there is a good reason to do so, and there are only a few exceptions
> > when the guest cap is set, while the host's isn't.
> 
> Yeah, agreed.

I am glad that we are on the same page here.

Best regards,
	Maxim Levitsky

> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2023-12-21 16:59       ` Maxim Levitsky
@ 2024-01-05  2:13         ` Sean Christopherson
  2024-01-12  0:44           ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2024-01-05  2:13 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: Paolo Bonzini, kvm, linux-kernel

On Thu, Dec 21, 2023, Maxim Levitsky wrote:
> On Thu, 2023-11-30 at 17:51 -0800, Sean Christopherson wrote:
> > On Sun, Nov 19, 2023, Maxim Levitsky wrote:
> > > On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > > > +/*
> > > > + * This isn't truly "unsafe", but all callers except kvm_cpu_after_set_cpuid()
> > > > + * should use __cpuid_entry_get_reg(), which provides compile-time validation
> > > > + * of the input.
> > > > + */
> > > > +static u32 cpuid_get_reg_unsafe(struct kvm_cpuid_entry2 *entry, u32 reg)
> > > > +{
> > > > +	switch (reg) {
> > > > +	case CPUID_EAX:
> > > > +		return entry->eax;
> > > > +	case CPUID_EBX:
> > > > +		return entry->ebx;
> > > > +	case CPUID_ECX:
> > > > +		return entry->ecx;
> > > > +	case CPUID_EDX:
> > > > +		return entry->edx;
> > > > +	default:
> > > > +		WARN_ON_ONCE(1);
> > > > +		return 0;
> > > > +	}
> > > > +}
> > 
> > ...
> > 
> > > >  static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > > >  {
> > > >  	struct kvm_lapic *apic = vcpu->arch.apic;
> > > >  	struct kvm_cpuid_entry2 *best;
> > > >  	bool allow_gbpages;
> > > > +	int i;
> > > >  
> > > > -	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
> > > > +	BUILD_BUG_ON(ARRAY_SIZE(reverse_cpuid) != NR_KVM_CPU_CAPS);
> > > > +
> > > > +	/*
> > > > +	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
> > > > +	 * honor userspace's definition for features that don't require KVM or
> > > > +	 * hardware management/support (or that KVM simply doesn't care about).
> > > > +	 */
> > > > +	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
> > > > +		const struct cpuid_reg cpuid = reverse_cpuid[i];
> > > > +
> > > > +		best = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
> > > > +		if (best)
> > > > +			vcpu->arch.cpu_caps[i] = cpuid_get_reg_unsafe(best, cpuid.reg);
> > > 
> > > Why not just use __cpuid_entry_get_reg? 
> > > 
> > > cpuid.reg comes from read-only 'reverse_cpuid' anyway, and in fact
> > > it seems that all callers of __cpuid_entry_get_reg take the reg value from
> > > x86_feature_cpuid() which also takes it from 'reverse_cpuid'.
> > > 
> > > So if the compiler is smart enough to not complain in these cases, I don't
> > > see why this case is different.
> > 
> > It's because the input isn't a compile-time constant, and so the BUILD_BUG() in
> > the default path will fire. 
> >  All of the compile-time assertions in reverse_cpuid.h
> > rely on the feature being a constant value, which allows the compiler to optimize
> > away the dead paths, i.e. turn __cpuid_entry_get_reg()'s switch statement into
> > simple pointer arithmetic and thus omit the BUILD_BUG() code.
> 
> In the above code, assuming that the compiler really treats the reverse_cpuid
> as const (if that assumption is not true, then all uses of __cpuid_entry_get_reg
> are also not compile time constant either),

It's not so much the compiler treating something as const as it is the compiler
generating code that resolves the relevant inputs to compile-time constants.

> then the 'reg' value depends only on 'i', and therefore for each iteration of
> the loop, the compiler does know the compile time value of the 'reg', and so
> it can easily prove that 'default' case in __cpuid_entry_get_reg can't be
> reached, and thus eliminate that BUILD_BUG().

A compiler _could_ know, but as above what truly matters is what code the compiler
actually generates.  E.g. all helpers are tagged __always_inline to prevent the
compiler from uninlining the functions (thanks, KASAN), at which point the code
of the non-inline function is no longer dealing with compile-time constants and
so the BUILD_BUG_ON() doesn't get eliminated.

For the loop, while all values are indeed constant, the compiler may choose to
generate a loop instead of unrolling everything.  A sufficiently clever compiler
could still detect that nothing in the loop can hit the "default" case, but in
practice neither clang nor gcc does that level of optimization, at least not yet.
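
I.e., purely illustrative:

	/*
	 * Constant feature: the switch collapses to pointer arithmetic and
	 * the default case, along with its BUILD_BUG(), is provably dead.
	 */
	u32 *reg = __cpuid_entry_get_reg(entry, x86_feature_cpuid(X86_FEATURE_XSAVE).reg);

	/*
	 * Loop: unless the compiler fully unrolls it, reverse_cpuid[i].reg
	 * is a runtime load, the default case survives, and BUILD_BUG()
	 * trips, hence cpuid_get_reg_unsafe().
	 */
	for (i = 0; i < NR_KVM_CPU_CAPS; i++)
		reg = __cpuid_entry_get_reg(entry, reverse_cpuid[i].reg);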

> > > Also, why not initialize guest_caps = host_caps & userspace_cpuid?
> > > 
> > > If this were the default we wouldn't need any guest_cpu_cap_restrict and such;
> > > instead it would just work.
> > 
> > Hrm, I definitely like the idea.  Unfortunately, unless we do an audit of all
> > ~120 uses of guest_cpuid_has(), restricting those based on kvm_cpu_caps might
> > break userspace.
> 
> 120 uses is not that bad, IMHO it is worth it - we won't need to deal with that
> in the future.
> 
> How about a compromise: you change the patches so that it will be possible
> to remove these cases one by one, and so that all new cases will be fully
> automatic?

Hrm, I'm not necessarily opposed to that, but I don't think we should go partway
unless we are 100% confident that changing the default to "use guest CPUID ANDed
with KVM capabilities" is the best end state, *and* that someone will actually
have the bandwidth to do the work soon-ish so that KVM isn't in a half-baked
state for months on end.  Even then, my preference would definitely be to switch
everything in one go.

And automatically handling new features would only be feasible for entirely new
leafs.  E.g. X86_FEATURE_RDPID is buried in CPUID.0x7.0x0.ECX, so to automatically
handle new features KVM would need to set the default guest_caps for all bits
*except* RDPID, at which point we're again building a set of features that need
to opt-out.
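
I.e. the derivation would end up looking something like this (entirely
hypothetical, the names are made up):

	/*
	 * Hypothetical opt-out mask: bits KVM emulates in software and thus
	 * must not be masked off by kvm_cpu_caps, e.g. RDPID.
	 */
	guest_caps = guest_cpuid_ecx & (kvm_cpu_caps[CPUID_7_ECX] | F(RDPID));

which is just an opt-out list by another name.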

> > To be fair, the manual lists predate the governed features.
> 
> 100% agree; however, the point of governed features was to simplify this list,
> and the point of this patch set is to simplify these lists, yet they still remain
> more or less untouched, and we will still need to maintain them.
> 
> Again I do think that governed features and/or this patchset are better than
> the mess that was there before, but a part of me wants to fully get rid of
> this mess instead of just making it a bit more beautiful. 

Oh, I would love to get rid of the mess too, but _completely_ getting rid of the
mess isn't realistic.  There are guaranteed to be exceptions to the rule, whether
the rule is "use guest CPUID by default" or "use guest CPUID constrained by KVM
capabilities by default".

I.e. there will always be some amount of manual messiness, the question is which
default behavior would yield the smallest mess.  My gut agrees with you, that
defaulting to "guest & KVM" would yield the fewest exceptions.  But as above,
I think we're better off doing the switch as an all-or-nothing thing (where "all"
means within a single series, not within a single patch).

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
  2024-01-05  2:13         ` Sean Christopherson
@ 2024-01-12  0:44           ` Sean Christopherson
  0 siblings, 0 replies; 40+ messages in thread
From: Sean Christopherson @ 2024-01-12  0:44 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: Paolo Bonzini, kvm, linux-kernel

On Thu, Jan 04, 2024, Sean Christopherson wrote:
> On Thu, Dec 21, 2023, Maxim Levitsky wrote:
> > On Thu, 2023-11-30 at 17:51 -0800, Sean Christopherson wrote:
> > > On Sun, Nov 19, 2023, Maxim Levitsky wrote:
> > > > On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
 
> > > > Also, why not initialize guest_caps = host_caps & userspace_cpuid?
> > > > 
> > > > If this were the default, we wouldn't need guest_cpu_cap_restrict and such;
> > > > instead it would just work.
> > > 
> > > Hrm, I definitely like the idea.  Unfortunately, unless we do an audit of all
> > > ~120 uses of guest_cpuid_has(), restricting those based on kvm_cpu_caps might
> > > break userspace.
> > 
> > 120 uses is not that bad; IMHO it is worth it - we won't need to deal with this
> > in the future.
> > 
> > How about a compromise: change the patches so that these cases can be removed
> > one by one, and so that all new cases are handled fully automatically?
> 
> Hrm, I'm not necessarily opposed to that, but I don't think we should go partway
> unless we are 100% confident that changing the default to "use guest CPUID ANDed
> with KVM capabilities" is the best end state, *and* that someone will actually
> have the bandwidth to do the work soon-ish so that KVM isn't in a half-baked
> state for months on end.  Even then, my preference would definitely be to switch
> everything in one go.
> 
> And automatically handling new features would only be feasible for entirely new
> leafs.  E.g. X86_FEATURE_RDPID is buried in CPUID.0x7.0x0.ECX, so to automatically
> handle new features KVM would need to set the default guest_caps for all bits
> *except* RDPID, at which point we're again building a set of features that need
> to opt out.
> 
> > > To be fair, the manual lists predate the governed features.
> > 
> > 100% agree; however, the point of governed features was to simplify this list,
> > and the point of this patch set is to simplify these lists, yet they still
> > remain more or less untouched, and we will still need to maintain them.
> > 
> > Again, I do think that governed features and/or this patchset are better than
> > the mess that was there before, but a part of me wants to fully get rid of
> > this mess instead of just making it a bit more beautiful.
> 
> Oh, I would love to get rid of the mess too, but _completely_ getting rid of the
> mess isn't realistic.  There are guaranteed to be exceptions to the rule, whether
> the rule is "use guest CPUID by default" or "use guest CPUID constrained by KVM
> capabilities by default".
> 
> I.e. there will always be some amount of manual messiness; the question is which
> default behavior would yield the smallest mess.  My gut agrees with you, that
> defaulting to "guest & KVM" would yield the fewest exceptions.  But as above,
> I think we're better off doing the switch as an all-or-nothing thing (where "all"
> means within a single series, not within a single patch).

Ok, the idea of having vcpu->arch.cpu_caps default to KVM & GUEST is growing on
me.  There's a lurking bug in KVM that in some ways is due to the lack of a
per-vCPU, KVM-enforced set of features.  The bug is relatively benign (VMX passes
through
CR4.FSGSBASE when it's not supported in the host), and easy to fix (incorporate
KVM-reserved CR4 bits into vcpu->arch.cr4_guest_rsvd_bits), but it really is
something that just shouldn't happen.  E.g. KVM's handling of EFER has a similar
lurking problem where __kvm_valid_efer() is unsafe to use without also consulting
efer_reserved_bits.
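
For illustration, a sketch of the "easy fix" using fields and helpers that exist
today, kvm_cpu_cap_has() and vcpu->arch.cr4_guest_rsvd_bits, though the exact
placement and form here are assumptions:

  /*
   * Treat CR4 bits for features KVM itself doesn't support as guest-reserved,
   * so a stale guest CPUID can't result in e.g. CR4.FSGSBASE being passed
   * through when the host doesn't support FSGSBASE.
   */
  if (!kvm_cpu_cap_has(X86_FEATURE_FSGSBASE))
          vcpu->arch.cr4_guest_rsvd_bits |= X86_CR4_FSGSBASE;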

And after digging a bit more, I think I'm just being overly paranoid.  I'm fairly
certain the only exceptions are literally the few that I've called out (RDPID,
MOVBE, and MWAIT (which is only a problem because of a stupid quirk)).  I don't
yet have a firm plan on how to deal with the exceptions in a clean way, e.g. I'd
like to somehow have the "opt-out" code share the set of emulated features with
__do_cpuid_func_emulated().  One thought would be to add kvm_emulated_cpu_caps,
which would be *comically* wasteful, but might be worth the 90 bytes.
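
Sketching that thought, with the obvious caveat that kvm_emulated_cpu_caps and
the init helper below are hypothetical (F() as used in cpuid.c; feature list
taken from the exceptions called out above):

  /*
   * Mask of features KVM emulates in software, ideally derived from the same
   * list as __do_cpuid_func_emulated() so the two can never drift apart.
   * NR_KVM_CPU_CAPS u32 entries, i.e. the ~90 bytes in question.
   */
  static u32 kvm_emulated_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;

  static void kvm_init_emulated_cpu_caps(void)
  {
          kvm_emulated_cpu_caps[CPUID_1_ECX] = F(MOVBE) | F(MWAIT);
          kvm_emulated_cpu_caps[CPUID_7_ECX] = F(RDPID);
  }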

For v2, what if I post this more or less as-is, with a "convert to KVM & GUEST"
patch thrown in at the end as an RFC?  I want to do a lot more testing (and staring)
before committing to the conversion, and sadly I don't have anywhere near enough
cycles to do that right now.

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread

Thread overview: 40+ messages
2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
2023-11-10 23:55 ` [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
2023-11-19 17:08   ` Maxim Levitsky
2023-11-21  3:20   ` Chao Gao
2023-11-10 23:55 ` [PATCH 2/9] KVM: x86: Replace guts of "goverened" features with comprehensive cpu_caps Sean Christopherson
2023-11-14  9:12   ` Binbin Wu
2023-11-19 17:22   ` Maxim Levitsky
2023-11-28  1:24     ` Sean Christopherson
2023-11-10 23:55 ` [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID Sean Christopherson
2023-11-16  3:16   ` Yang, Weijiang
2023-11-16 22:29     ` Sean Christopherson
2023-11-17  8:33       ` Yang, Weijiang
2023-11-21  3:10         ` Yuan Yao
2023-11-19 17:32   ` Maxim Levitsky
2023-12-01  1:51     ` Sean Christopherson
2023-12-21 16:59       ` Maxim Levitsky
2024-01-05  2:13         ` Sean Christopherson
2024-01-12  0:44           ` Sean Christopherson
2023-11-10 23:55 ` [PATCH 4/9] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime Sean Christopherson
2023-11-19 17:33   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 5/9] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf Sean Christopherson
2023-11-19 17:33   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
2023-11-13  8:03   ` Robert Hoo
2023-11-14 13:48     ` Sean Christopherson
2023-11-15  1:59       ` Robert Hoo
2023-11-15 15:09         ` Sean Christopherson
2023-11-17  1:28           ` Robert Hoo
2023-11-16  2:24   ` Yang, Weijiang
2023-11-16 22:19     ` Sean Christopherson
2023-11-19 17:35   ` Maxim Levitsky
2023-11-24  6:33     ` Xu Yilun
2023-11-28  0:43       ` Sean Christopherson
2023-11-28  5:13         ` Xu Yilun
2023-11-10 23:55 ` [PATCH 7/9] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has() Sean Christopherson
2023-11-19 17:35   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 8/9] KVM: x86: Replace all guest CPUID feature queries with cpu_caps check Sean Christopherson
2023-11-19 17:35   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 9/9] KVM: x86: Restrict XSAVE in cpu_caps based on KVM capabilities Sean Christopherson
2023-11-19 17:36   ` Maxim Levitsky
