* [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Add a framework to manage and cache KVM-governed features, i.e. CPUID-based
features that require explicit KVM enabling and/or need to be
queried semi-frequently by KVM. The idea originally came up in the
context of the architectural LBRs series as a way to avoid querying
guest CPUID in hot paths without needing a dedicated flag, but as
evidenced by the shortlog, the most common usage is to handle the ever-
growing list of SVM features that are exposed to L1.
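To illustrate the intended usage (a sketch distilled from the helpers
added in patch 07 and their use in patch 12, not new code): vendor code
makes the enabling decision once, when guest CPUID is set, and hot paths
then query a cached per-vCPU bit instead of doing a CPUID lookup:

    /* at KVM_SET_CPUID{,2} time, e.g. in .vcpu_after_set_cpuid() */
    kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_XSAVES);

    /* in a hot path, e.g. when loading guest XSAVE state */
    if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
        vcpu->arch.ia32_xss != host_xss)
        wrmsrl(MSR_IA32_XSS, vcpu->arch.ia32_xss);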
The first six patches are fixes and cleanups related to TSC scaling.
nSVM has WARN_ONs that can be triggered by userspace at will, and the
code is a bit crusty. They aren't directly related to the governed
stuff; I stumbled upon the issues they fix while staring at the patch to
convert "TSC scaling enabled". I included them here mainly to avoid
code conflicts. I'm hoping all of this can go into 6.6, e.g. so that CET
support can build on guest_can_use(), but I can always grab the TSC
patches for 6.6 if the governed stuff needs more time.
Note, I still don't like the name "governed", but no one has suggested
anything else, let alone anything better :-)
v2:
- Add patches to clean up TSC scaling.
- Add a comment explaining the virtual VMLOAD/VMSAVE vs. SYSENTER on
Intel madness.
- Use a governed feature for X86_FEATURE_VMX.
- Incorporate KVM capabilities into the main check-and-set helper. [Chao]
v1: https://lore.kernel.org/all/20230217231022.816138-1-seanjc@google.com
Sean Christopherson (21):
KVM: nSVM: Check instead of asserting on nested TSC scaling support
KVM: nSVM: Load L1's TSC multiplier based on L1 state, not L2 state
KVM: nSVM: Use the "outer" helper for writing multiplier to
MSR_AMD64_TSC_RATIO
KVM: SVM: Clean up preemption toggling related to MSR_AMD64_TSC_RATIO
KVM: x86: Always write vCPU's current TSC offset/ratio in vendor hooks
KVM: nSVM: Skip writes to MSR_AMD64_TSC_RATIO if guest state isn't
loaded
KVM: x86: Add a framework for enabling KVM-governed x86 features
KVM: x86/mmu: Use KVM-governed feature framework to track "GBPAGES
enabled"
KVM: VMX: Recompute "XSAVES enabled" only after CPUID update
KVM: VMX: Check KVM CPU caps, not just VMX MSR support, for XSAVE
enabling
KVM: VMX: Rename XSAVES control to follow KVM's preferred "ENABLE_XYZ"
KVM: x86: Use KVM-governed feature framework to track "XSAVES enabled"
KVM: nVMX: Use KVM-governed feature framework to track "nested VMX
enabled"
KVM: nSVM: Use KVM-governed feature framework to track "NRIPS enabled"
KVM: nSVM: Use KVM-governed feature framework to track "TSC scaling
enabled"
KVM: nSVM: Use KVM-governed feature framework to track "vVM{SAVE,LOAD}
enabled"
KVM: nSVM: Use KVM-governed feature framework to track "LBRv enabled"
KVM: nSVM: Use KVM-governed feature framework to track "Pause Filter
enabled"
KVM: nSVM: Use KVM-governed feature framework to track "vGIF enabled"
KVM: nSVM: Use KVM-governed feature framework to track "vNMI enabled"
KVM: x86: Disallow guest CPUID lookups when IRQs are disabled
arch/x86/include/asm/kvm_host.h | 23 ++++++++-
arch/x86/include/asm/vmx.h | 2 +-
arch/x86/kvm/cpuid.c | 34 ++++++++++++++
arch/x86/kvm/cpuid.h | 46 ++++++++++++++++++
arch/x86/kvm/governed_features.h | 21 +++++++++
arch/x86/kvm/mmu/mmu.c | 20 ++------
arch/x86/kvm/svm/nested.c | 57 ++++++++++++----------
arch/x86/kvm/svm/svm.c | 81 ++++++++++++++++++--------------
arch/x86/kvm/svm/svm.h | 18 ++-----
arch/x86/kvm/vmx/capabilities.h | 2 +-
arch/x86/kvm/vmx/hyperv.c | 2 +-
arch/x86/kvm/vmx/nested.c | 13 ++---
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 81 +++++++++++++++-----------------
arch/x86/kvm/vmx/vmx.h | 3 +-
arch/x86/kvm/x86.c | 9 ++--
16 files changed, 261 insertions(+), 153 deletions(-)
create mode 100644 arch/x86/kvm/governed_features.h
base-commit: fdf0eaf11452d72945af31804e2a1048ee1b574c
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 01/21] KVM: nSVM: Check instead of asserting on nested TSC scaling support
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Check for nested TSC scaling support on nested SVM VMRUN instead of
asserting that TSC scaling is exposed to L1 if L1's MSR_AMD64_TSC_RATIO
has diverged from KVM's default. Userspace can trigger the WARN at will
by writing the MSR and then updating guest CPUID to hide the feature
(modifying guest CPUID is allowed anytime before KVM_RUN). E.g. hacking
KVM's state_test selftest to do
vcpu_set_msr(vcpu, MSR_AMD64_TSC_RATIO, 0);
vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_TSCRATEMSR);
after restoring state in a new VM+vCPU yields an endless supply of:
------------[ cut here ]------------
WARNING: CPU: 164 PID: 62565 at arch/x86/kvm/svm/nested.c:699
nested_vmcb02_prepare_control+0x3d6/0x3f0 [kvm_amd]
Call Trace:
<TASK>
enter_svm_guest_mode+0x114/0x560 [kvm_amd]
nested_svm_vmrun+0x260/0x330 [kvm_amd]
vmrun_interception+0x29/0x30 [kvm_amd]
svm_invoke_exit_handler+0x35/0x100 [kvm_amd]
svm_handle_exit+0xe7/0x180 [kvm_amd]
kvm_arch_vcpu_ioctl_run+0x1eab/0x2570 [kvm]
kvm_vcpu_ioctl+0x4c9/0x5b0 [kvm]
__se_sys_ioctl+0x7a/0xc0
__x64_sys_ioctl+0x21/0x30
do_syscall_64+0x41/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x45ca1b
Note, the nested #VMEXIT path has the same flaw, but needs a different
fix and will be handled separately.
Fixes: 5228eb96a487 ("KVM: x86: nSVM: implement nested TSC scaling")
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/nested.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 96936ddf1b3c..0b90f5cf9df3 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -695,10 +695,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
vmcb02->control.tsc_offset = vcpu->arch.tsc_offset;
- if (svm->tsc_ratio_msr != kvm_caps.default_tsc_scaling_ratio) {
- WARN_ON(!svm->tsc_scaling_enabled);
+ if (svm->tsc_scaling_enabled &&
+ svm->tsc_ratio_msr != kvm_caps.default_tsc_scaling_ratio)
nested_svm_update_tsc_ratio_msr(vcpu);
- }
vmcb02->control.int_ctl =
(svm->nested.ctl.int_ctl & int_ctl_vmcb12_bits) |
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 02/21] KVM: nSVM: Load L1's TSC multiplier based on L1 state, not L2 state
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
When emulating nested VM-Exit, load L1's TSC multiplier if L1's desired
ratio doesn't match the current ratio, not if the ratio L1 is using for
L2 diverges from the default. Functionally, the end result is the same,
as KVM will run L2 with L1's multiplier if L2's multiplier is the default,
i.e. checking that L1's multiplier is loaded is equivalent to checking if
L2 has a non-default multiplier.
However, the assertion that TSC scaling is exposed to L1 is flawed, as
userspace can trigger the WARN at will by writing the MSR and then
updating guest CPUID to hide the feature (modifying guest CPUID is
allowed anytime before KVM_RUN). E.g. hacking KVM's state_test
selftest to do
vcpu_set_msr(vcpu, MSR_AMD64_TSC_RATIO, 0);
vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_TSCRATEMSR);
after restoring state in a new VM+vCPU yields an endless supply of:
------------[ cut here ]------------
WARNING: CPU: 10 PID: 206939 at arch/x86/kvm/svm/nested.c:1105
nested_svm_vmexit+0x6af/0x720 [kvm_amd]
Call Trace:
nested_svm_exit_handled+0x102/0x1f0 [kvm_amd]
svm_handle_exit+0xb9/0x180 [kvm_amd]
kvm_arch_vcpu_ioctl_run+0x1eab/0x2570 [kvm]
kvm_vcpu_ioctl+0x4c9/0x5b0 [kvm]
? trace_hardirqs_off+0x4d/0xa0
__se_sys_ioctl+0x7a/0xc0
__x64_sys_ioctl+0x21/0x30
do_syscall_64+0x41/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Unlike the nested VMRUN path, hoisting the svm->tsc_scaling_enabled check
into the if-statement is wrong as KVM needs to ensure L1's multiplier is
loaded in the above scenario. Alternatively, the WARN_ON() could simply
be deleted, but that would make KVM's behavior even more subtle, e.g. it's
not immediately obvious why it's safe to write MSR_AMD64_TSC_RATIO when
checking only tsc_ratio_msr.
Fixes: 5228eb96a487 ("KVM: x86: nSVM: implement nested TSC scaling")
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/nested.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 0b90f5cf9df3..c66c823ae222 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1100,8 +1100,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
vmcb_mark_dirty(vmcb01, VMCB_INTERCEPTS);
}
- if (svm->tsc_ratio_msr != kvm_caps.default_tsc_scaling_ratio) {
- WARN_ON(!svm->tsc_scaling_enabled);
+ if (kvm_caps.has_tsc_control &&
+ vcpu->arch.tsc_scaling_ratio != vcpu->arch.l1_tsc_scaling_ratio) {
vcpu->arch.tsc_scaling_ratio = vcpu->arch.l1_tsc_scaling_ratio;
__svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
}
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 03/21] KVM: nSVM: Use the "outer" helper for writing multiplier to MSR_AMD64_TSC_RATIO
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
When emulating nested SVM transitions, use the outer helper for writing
the TSC multiplier for L2. Using the inner helper only for one-off cases,
i.e. for paths where KVM is NOT emulating or modifying vCPU state, will
allow for multiple cleanups:
- Explicitly disabling preemption only in the outer helper
- Getting the multiplier from the vCPU field in the outer helper
- Skipping the WRMSR in the outer helper if guest state isn't loaded
Opportunistically delete an extra newline.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/nested.c | 4 ++--
arch/x86/kvm/svm/svm.c | 5 ++---
arch/x86/kvm/svm/svm.h | 2 +-
3 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index c66c823ae222..5d5a1d7832fb 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1103,7 +1103,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
if (kvm_caps.has_tsc_control &&
vcpu->arch.tsc_scaling_ratio != vcpu->arch.l1_tsc_scaling_ratio) {
vcpu->arch.tsc_scaling_ratio = vcpu->arch.l1_tsc_scaling_ratio;
- __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
+ svm_write_tsc_multiplier(vcpu, vcpu->arch.tsc_scaling_ratio);
}
svm->nested.ctl.nested_cr3 = 0;
@@ -1536,7 +1536,7 @@ void nested_svm_update_tsc_ratio_msr(struct kvm_vcpu *vcpu)
vcpu->arch.tsc_scaling_ratio =
kvm_calc_nested_tsc_multiplier(vcpu->arch.l1_tsc_scaling_ratio,
svm->tsc_ratio_msr);
- __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
+ svm_write_tsc_multiplier(vcpu, vcpu->arch.tsc_scaling_ratio);
}
/* Inverse operation of nested_copy_vmcb_control_to_cache(). asid is copied too. */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d381ad424554..13f316375b14 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -550,7 +550,7 @@ static int svm_check_processor_compat(void)
return 0;
}
-void __svm_write_tsc_multiplier(u64 multiplier)
+static void __svm_write_tsc_multiplier(u64 multiplier)
{
preempt_disable();
@@ -1110,12 +1110,11 @@ static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
vmcb_mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
}
-static void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
+void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
{
__svm_write_tsc_multiplier(multiplier);
}
-
/* Evaluate instruction intercepts that depend on guest CPUID features. */
static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu,
struct vcpu_svm *svm)
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 18af7e712a5a..7132c0a04817 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -658,7 +658,7 @@ int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
bool has_error_code, u32 error_code);
int nested_svm_exit_special(struct vcpu_svm *svm);
void nested_svm_update_tsc_ratio_msr(struct kvm_vcpu *vcpu);
-void __svm_write_tsc_multiplier(u64 multiplier);
+void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier);
void nested_copy_vmcb_control_to_cache(struct vcpu_svm *svm,
struct vmcb_control_area *control);
void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 04/21] KVM: SVM: Clean up preemption toggling related to MSR_AMD64_TSC_RATIO
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Explicitly disable preemption when writing MSR_AMD64_TSC_RATIO only in the
"outer" helper, as all direct callers of the "inner" helper now run with
preemption already disabled. And that isn't a coincidence, as the outer
helper requires a vCPU and is intended to be used when modifying guest
state and/or emulating guest instructions, which are typically done with
preemption enabled.
Direct use of the inner helper should be extremely limited, as the only
time KVM should modify MSR_AMD64_TSC_RATIO without a vCPU is when
sanitizing the MSR for a specific pCPU (currently done when {en,dis}abling
disabling SVM). The other direct caller is svm_prepare_switch_to_guest(),
which does have a vCPU, but is a one-off special case: KVM is about to
enter the guest on a specific pCPU and thus must have preemption disabled.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/svm.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 13f316375b14..9fc5e402636a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -552,15 +552,11 @@ static int svm_check_processor_compat(void)
static void __svm_write_tsc_multiplier(u64 multiplier)
{
- preempt_disable();
-
if (multiplier == __this_cpu_read(current_tsc_ratio))
- goto out;
+ return;
wrmsrl(MSR_AMD64_TSC_RATIO, multiplier);
__this_cpu_write(current_tsc_ratio, multiplier);
-out:
- preempt_enable();
}
static void svm_hardware_disable(void)
@@ -1112,7 +1108,9 @@ static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
{
+ preempt_disable();
__svm_write_tsc_multiplier(multiplier);
+ preempt_enable();
}
/* Evaluate instruction intercepts that depend on guest CPUID features. */
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 05/21] KVM: x86: Always write vCPU's current TSC offset/ratio in vendor hooks
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Drop the @offset and @multiplier params from the kvm_x86_ops hooks for
propagating TSC offsets/multipliers into hardware, and instead have the
vendor implementations pull the information directly from the vCPU
structure. The respective vCPU fields _must_ be written at the same
time in order to maintain consistent state, i.e. it's not random luck
that the value passed in by all callers is grabbed from the vCPU.
Explicitly grabbing the value from the vCPU field in SVM's implementation
in particular will allow for additional cleanup without introducing even
more subtle dependencies. Specifically, SVM can skip the WRMSR if guest
state isn't loaded, i.e. svm_prepare_switch_to_guest() will load the
correct value for the vCPU prior to entering the guest.
This also reconciles KVM's handling of related values that are stored in
the vCPU, as svm_write_tsc_offset() already assumes/requires the caller
to have updated l1_tsc_offset.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/kvm_host.h | 4 ++--
arch/x86/kvm/svm/nested.c | 4 ++--
arch/x86/kvm/svm/svm.c | 8 ++++----
arch/x86/kvm/svm/svm.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 8 ++++----
arch/x86/kvm/x86.c | 5 ++---
6 files changed, 15 insertions(+), 16 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 28bd38303d70..dad9331c5270 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1654,8 +1654,8 @@ struct kvm_x86_ops {
u64 (*get_l2_tsc_offset)(struct kvm_vcpu *vcpu);
u64 (*get_l2_tsc_multiplier)(struct kvm_vcpu *vcpu);
- void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
- void (*write_tsc_multiplier)(struct kvm_vcpu *vcpu, u64 multiplier);
+ void (*write_tsc_offset)(struct kvm_vcpu *vcpu);
+ void (*write_tsc_multiplier)(struct kvm_vcpu *vcpu);
/*
* Retrieve somewhat arbitrary exit information. Intended to
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 5d5a1d7832fb..3342cc4a5189 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1103,7 +1103,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
if (kvm_caps.has_tsc_control &&
vcpu->arch.tsc_scaling_ratio != vcpu->arch.l1_tsc_scaling_ratio) {
vcpu->arch.tsc_scaling_ratio = vcpu->arch.l1_tsc_scaling_ratio;
- svm_write_tsc_multiplier(vcpu, vcpu->arch.tsc_scaling_ratio);
+ svm_write_tsc_multiplier(vcpu);
}
svm->nested.ctl.nested_cr3 = 0;
@@ -1536,7 +1536,7 @@ void nested_svm_update_tsc_ratio_msr(struct kvm_vcpu *vcpu)
vcpu->arch.tsc_scaling_ratio =
kvm_calc_nested_tsc_multiplier(vcpu->arch.l1_tsc_scaling_ratio,
svm->tsc_ratio_msr);
- svm_write_tsc_multiplier(vcpu, vcpu->arch.tsc_scaling_ratio);
+ svm_write_tsc_multiplier(vcpu);
}
/* Inverse operation of nested_copy_vmcb_control_to_cache(). asid is copied too. */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9fc5e402636a..c786c8e9108f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1097,19 +1097,19 @@ static u64 svm_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu)
return svm->tsc_ratio_msr;
}
-static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
+static void svm_write_tsc_offset(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
svm->vmcb01.ptr->control.tsc_offset = vcpu->arch.l1_tsc_offset;
- svm->vmcb->control.tsc_offset = offset;
+ svm->vmcb->control.tsc_offset = vcpu->arch.tsc_offset;
vmcb_mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
}
-void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
+void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu)
{
preempt_disable();
- __svm_write_tsc_multiplier(multiplier);
+ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
preempt_enable();
}
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7132c0a04817..5829a1801862 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -658,7 +658,7 @@ int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
bool has_error_code, u32 error_code);
int nested_svm_exit_special(struct vcpu_svm *svm);
void nested_svm_update_tsc_ratio_msr(struct kvm_vcpu *vcpu);
-void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier);
+void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu);
void nested_copy_vmcb_control_to_cache(struct vcpu_svm *svm,
struct vmcb_control_area *control);
void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0ecf4be2c6af..ca6194b0e35e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1884,14 +1884,14 @@ u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu)
return kvm_caps.default_tsc_scaling_ratio;
}
-static void vmx_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
+static void vmx_write_tsc_offset(struct kvm_vcpu *vcpu)
{
- vmcs_write64(TSC_OFFSET, offset);
+ vmcs_write64(TSC_OFFSET, vcpu->arch.tsc_offset);
}
-static void vmx_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
+static void vmx_write_tsc_multiplier(struct kvm_vcpu *vcpu)
{
- vmcs_write64(TSC_MULTIPLIER, multiplier);
+ vmcs_write64(TSC_MULTIPLIER, vcpu->arch.tsc_scaling_ratio);
}
/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a6b9bea62fb8..5a14378ed4e1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2615,7 +2615,7 @@ static void kvm_vcpu_write_tsc_offset(struct kvm_vcpu *vcpu, u64 l1_offset)
else
vcpu->arch.tsc_offset = l1_offset;
- static_call(kvm_x86_write_tsc_offset)(vcpu, vcpu->arch.tsc_offset);
+ static_call(kvm_x86_write_tsc_offset)(vcpu);
}
static void kvm_vcpu_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 l1_multiplier)
@@ -2631,8 +2631,7 @@ static void kvm_vcpu_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 l1_multipli
vcpu->arch.tsc_scaling_ratio = l1_multiplier;
if (kvm_caps.has_tsc_control)
- static_call(kvm_x86_write_tsc_multiplier)(
- vcpu, vcpu->arch.tsc_scaling_ratio);
+ static_call(kvm_x86_write_tsc_multiplier)(vcpu);
}
static inline bool kvm_check_tsc_unstable(void)
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 06/21] KVM: nSVM: Skip writes to MSR_AMD64_TSC_RATIO if guest state isn't loaded
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Skip writes to MSR_AMD64_TSC_RATIO that are done in the context of a vCPU
if guest state isn't loaded, i.e. if KVM will update MSR_AMD64_TSC_RATIO
during svm_prepare_switch_to_guest() before entering the guest. Checking
guest_state_loaded may or may not be a net positive for performance as
the current_tsc_ratio cache will optimize away duplicate WRMSRs in the
vast majority of scenarios. However, the cost of the check is negligible,
and the real motivation is to document that KVM needs to load the vCPU's
value only when running the vCPU.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/svm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c786c8e9108f..64092df06f94 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1109,7 +1109,8 @@ static void svm_write_tsc_offset(struct kvm_vcpu *vcpu)
void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu)
{
preempt_disable();
- __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
+ if (to_svm(vcpu)->guest_state_loaded)
+ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
preempt_enable();
}
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 07/21] KVM: x86: Add a framework for enabling KVM-governed x86 features
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Introduce yet another X86_FEATURE flag framework to manage and cache KVM
governed features (for lack of a better name). "Governed" in this case
means that KVM has some level of involvement and/or vested interest in
whether or not an X86_FEATURE can be used by the guest. The intent of the
framework is twofold: to simplify caching of guest CPUID flags that KVM
needs to frequently query, and to add clarity to such caching, e.g. it
isn't immediately obvious that SVM's bundle of flags for "optional nested
SVM features" track whether or not a flag is exposed to L1.
Begrudgingly define KVM_MAX_NR_GOVERNED_FEATURES for the size of the
bitmap to avoid exposing governed_features.h in arch/x86/include/asm/, but
add a FIXME to call out that it can and should be cleaned up once
"struct kvm_vcpu_arch" is no longer expose to the kernel at large.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/kvm_host.h | 19 +++++++++++++
arch/x86/kvm/cpuid.c | 4 +++
arch/x86/kvm/cpuid.h | 46 ++++++++++++++++++++++++++++++++
arch/x86/kvm/governed_features.h | 9 +++++++
4 files changed, 78 insertions(+)
create mode 100644 arch/x86/kvm/governed_features.h
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index dad9331c5270..007fa8bfd634 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -831,6 +831,25 @@ struct kvm_vcpu_arch {
struct kvm_cpuid_entry2 *cpuid_entries;
struct kvm_hypervisor_cpuid kvm_cpuid;
+ /*
+ * FIXME: Drop this macro and use KVM_NR_GOVERNED_FEATURES directly
+ * when "struct kvm_vcpu_arch" is no longer defined in an
+ * arch/x86/include/asm header. The max is mostly arbitrary, i.e.
+ * can be increased as necessary.
+ */
+#define KVM_MAX_NR_GOVERNED_FEATURES BITS_PER_LONG
+
+ /*
+ * Track whether or not the guest is allowed to use features that are
+ * governed by KVM, where "governed" means KVM needs to manage state
+ * and/or explicitly enable the feature in hardware. Typically, but
+ * not always, governed features can be used by the guest if and only
+ * if both KVM and userspace want to expose the feature to the guest.
+ */
+ struct {
+ DECLARE_BITMAP(enabled, KVM_MAX_NR_GOVERNED_FEATURES);
+ } governed_features;
+
u64 reserved_gpa_bits;
int maxphyaddr;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7f4d13383cf2..ef826568c222 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -313,6 +313,10 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
struct kvm_lapic *apic = vcpu->arch.apic;
struct kvm_cpuid_entry2 *best;
+ BUILD_BUG_ON(KVM_NR_GOVERNED_FEATURES > KVM_MAX_NR_GOVERNED_FEATURES);
+ bitmap_zero(vcpu->arch.governed_features.enabled,
+ KVM_MAX_NR_GOVERNED_FEATURES);
+
best = kvm_find_cpuid_entry(vcpu, 1);
if (best && apic) {
if (cpuid_entry_has(best, X86_FEATURE_TSC_DEADLINE_TIMER))
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index b1658c0de847..3000fbe97678 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -232,4 +232,50 @@ static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
return vcpu->arch.pv_cpuid.features & (1u << kvm_feature);
}
+enum kvm_governed_features {
+#define KVM_GOVERNED_FEATURE(x) KVM_GOVERNED_##x,
+#include "governed_features.h"
+ KVM_NR_GOVERNED_FEATURES
+};
+
+static __always_inline int kvm_governed_feature_index(unsigned int x86_feature)
+{
+ switch (x86_feature) {
+#define KVM_GOVERNED_FEATURE(x) case x: return KVM_GOVERNED_##x;
+#include "governed_features.h"
+ default:
+ return -1;
+ }
+}
+
+static __always_inline int kvm_is_governed_feature(unsigned int x86_feature)
+{
+ return kvm_governed_feature_index(x86_feature) >= 0;
+}
+
+static __always_inline void kvm_governed_feature_set(struct kvm_vcpu *vcpu,
+ unsigned int x86_feature)
+{
+ BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
+
+ __set_bit(kvm_governed_feature_index(x86_feature),
+ vcpu->arch.governed_features.enabled);
+}
+
+static __always_inline void kvm_governed_feature_check_and_set(struct kvm_vcpu *vcpu,
+ unsigned int x86_feature)
+{
+ if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
+ kvm_governed_feature_set(vcpu, x86_feature);
+}
+
+static __always_inline bool guest_can_use(struct kvm_vcpu *vcpu,
+ unsigned int x86_feature)
+{
+ BUILD_BUG_ON(!kvm_is_governed_feature(x86_feature));
+
+ return test_bit(kvm_governed_feature_index(x86_feature),
+ vcpu->arch.governed_features.enabled);
+}
+
#endif
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
new file mode 100644
index 000000000000..40ce8e6608cd
--- /dev/null
+++ b/arch/x86/kvm/governed_features.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#if !defined(KVM_GOVERNED_FEATURE) || defined(KVM_GOVERNED_X86_FEATURE)
+BUILD_BUG()
+#endif
+
+#define KVM_GOVERNED_X86_FEATURE(x) KVM_GOVERNED_FEATURE(X86_FEATURE_##x)
+
+#undef KVM_GOVERNED_X86_FEATURE
+#undef KVM_GOVERNED_FEATURE
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 08/21] KVM: x86/mmu: Use KVM-governed feature framework to track "GBPAGES enabled"
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Use the governed feature framework to track whether or not the guest can
use 1GiB pages, and drop the one-off helper that wraps the surprisingly
non-trivial logic surrounding 1GiB page usage in the guest.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/cpuid.c | 17 +++++++++++++++++
arch/x86/kvm/governed_features.h | 2 ++
arch/x86/kvm/mmu/mmu.c | 20 +++-----------------
3 files changed, 22 insertions(+), 17 deletions(-)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index ef826568c222..f74d6c404551 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -312,11 +312,28 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
{
struct kvm_lapic *apic = vcpu->arch.apic;
struct kvm_cpuid_entry2 *best;
+ bool allow_gbpages;
BUILD_BUG_ON(KVM_NR_GOVERNED_FEATURES > KVM_MAX_NR_GOVERNED_FEATURES);
bitmap_zero(vcpu->arch.governed_features.enabled,
KVM_MAX_NR_GOVERNED_FEATURES);
+ /*
+ * If TDP is enabled, let the guest use GBPAGES if they're supported in
+ * hardware. The hardware page walker doesn't let KVM disable GBPAGES,
+ * i.e. won't treat them as reserved, and KVM doesn't redo the GVA->GPA
+ * walk for performance and complexity reasons. Not to mention KVM
+ * _can't_ solve the problem because GVA->GPA walks aren't visible to
+ * KVM once a TDP translation is installed. Mimic hardware behavior so
+ * that KVM's is at least consistent, i.e. doesn't randomly inject #PF.
+ * If TDP is disabled, honor *only* guest CPUID as KVM has full control
+ * and can install smaller shadow pages if the host lacks 1GiB support.
+ */
+ allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
+ guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
+ if (allow_gbpages)
+ kvm_governed_feature_set(vcpu, X86_FEATURE_GBPAGES);
+
best = kvm_find_cpuid_entry(vcpu, 1);
if (best && apic) {
if (cpuid_entry_has(best, X86_FEATURE_TSC_DEADLINE_TIMER))
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index 40ce8e6608cd..b29c15d5e038 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -5,5 +5,7 @@ BUILD_BUG()
#define KVM_GOVERNED_X86_FEATURE(x) KVM_GOVERNED_FEATURE(X86_FEATURE_##x)
+KVM_GOVERNED_X86_FEATURE(GBPAGES)
+
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index ec169f5c7dce..7b9104b054bc 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4808,28 +4808,13 @@ static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
}
}
-static bool guest_can_use_gbpages(struct kvm_vcpu *vcpu)
-{
- /*
- * If TDP is enabled, let the guest use GBPAGES if they're supported in
- * hardware. The hardware page walker doesn't let KVM disable GBPAGES,
- * i.e. won't treat them as reserved, and KVM doesn't redo the GVA->GPA
- * walk for performance and complexity reasons. Not to mention KVM
- * _can't_ solve the problem because GVA->GPA walks aren't visible to
- * KVM once a TDP translation is installed. Mimic hardware behavior so
- * that KVM's is at least consistent, i.e. doesn't randomly inject #PF.
- */
- return tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
- guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
-}
-
static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
struct kvm_mmu *context)
{
__reset_rsvds_bits_mask(&context->guest_rsvd_check,
vcpu->arch.reserved_gpa_bits,
context->cpu_role.base.level, is_efer_nx(context),
- guest_can_use_gbpages(vcpu),
+ guest_can_use(vcpu, X86_FEATURE_GBPAGES),
is_cr4_pse(context),
guest_cpuid_is_amd_or_hygon(vcpu));
}
@@ -4906,7 +4891,8 @@ static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
context->root_role.level,
context->root_role.efer_nx,
- guest_can_use_gbpages(vcpu), is_pse, is_amd);
+ guest_can_use(vcpu, X86_FEATURE_GBPAGES),
+ is_pse, is_amd);
if (!shadow_me_mask)
return;
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 09/21] KVM: VMX: Recompute "XSAVES enabled" only after CPUID update
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Recompute whether or not XSAVES is enabled for the guest only if the
guest's CPUID model changes instead of redoing the computation every time
KVM generates vmcs01's secondary execution controls. The boot_cpu_has()
and cpu_has_vmx_xsaves() checks should never change after KVM is loaded,
and if they do the kernel/KVM is hosed.
Opportunistically add a comment explaining _why_ XSAVES is effectively
exposed to the guest if and only if XSAVE is also exposed to the guest.
Practically speaking, no functional change intended (KVM will do fewer
computations, but should still see the same xsaves_enabled value
whenever KVM looks at it).
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/vmx.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ca6194b0e35e..307d73749185 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4587,19 +4587,10 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
if (!enable_pml || !atomic_read(&vcpu->kvm->nr_memslots_dirty_logging))
exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
- if (cpu_has_vmx_xsaves()) {
- /* Exposing XSAVES only when XSAVE is exposed */
- bool xsaves_enabled =
- boot_cpu_has(X86_FEATURE_XSAVE) &&
- guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) &&
- guest_cpuid_has(vcpu, X86_FEATURE_XSAVES);
-
- vcpu->arch.xsaves_enabled = xsaves_enabled;
-
+ if (cpu_has_vmx_xsaves())
vmx_adjust_secondary_exec_control(vmx, &exec_control,
SECONDARY_EXEC_XSAVES,
- xsaves_enabled, false);
- }
+ vcpu->arch.xsaves_enabled, false);
/*
* RDPID is also gated by ENABLE_RDTSCP, turn on the control if either
@@ -7722,8 +7713,15 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
- /* xsaves_enabled is recomputed in vmx_compute_secondary_exec_control(). */
- vcpu->arch.xsaves_enabled = false;
+ /*
+ * XSAVES is effectively enabled if and only if XSAVE is also exposed
+ * to the guest. XSAVES depends on CR4.OSXSAVE, and CR4.OSXSAVE can be
+ * set if and only if XSAVE is supported.
+ */
+ vcpu->arch.xsaves_enabled = cpu_has_vmx_xsaves() &&
+ boot_cpu_has(X86_FEATURE_XSAVE) &&
+ guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) &&
+ guest_cpuid_has(vcpu, X86_FEATURE_XSAVES);
vmx_setup_uret_msrs(vmx);
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 10/21] KVM: VMX: Check KVM CPU caps, not just VMX MSR support, for XSAVE enabling
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Check KVM CPU capabilities instead of raw VMX support for XSAVES when
determining whether or not XSAVES can/should be exposed to the guest.
Practically speaking, it's nonsensical/impossible for a CPU to support
"enable XSAVES" without XSAVES being supported natively. The real
motivation for switching to kvm_cpu_cap_has() is to allow using the
governed feature framework's standard check-and-set logic.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/vmx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 307d73749185..e358e3fa1ced 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7718,7 +7718,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
* to the guest. XSAVES depends on CR4.OSXSAVE, and CR4.OSXSAVE can be
* set if and only if XSAVE is supported.
*/
- vcpu->arch.xsaves_enabled = cpu_has_vmx_xsaves() &&
+ vcpu->arch.xsaves_enabled = kvm_cpu_cap_has(X86_FEATURE_XSAVES) &&
boot_cpu_has(X86_FEATURE_XSAVE) &&
guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) &&
guest_cpuid_has(vcpu, X86_FEATURE_XSAVES);
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 11/21] KVM: VMX: Rename XSAVES control to follow KVM's preferred "ENABLE_XYZ"
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Rename the XSAVES secondary execution control to follow KVM's preferred
style so that XSAVES related logic can use common macros that depend on
KVM's preferred style.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/vmx.h | 2 +-
arch/x86/kvm/vmx/capabilities.h | 2 +-
arch/x86/kvm/vmx/hyperv.c | 2 +-
arch/x86/kvm/vmx/nested.c | 6 +++---
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 2 +-
arch/x86/kvm/vmx/vmx.h | 2 +-
7 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 0d02c4aafa6f..0e73616b82f3 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -71,7 +71,7 @@
#define SECONDARY_EXEC_RDSEED_EXITING VMCS_CONTROL_BIT(RDSEED_EXITING)
#define SECONDARY_EXEC_ENABLE_PML VMCS_CONTROL_BIT(PAGE_MOD_LOGGING)
#define SECONDARY_EXEC_PT_CONCEAL_VMX VMCS_CONTROL_BIT(PT_CONCEAL_VMX)
-#define SECONDARY_EXEC_XSAVES VMCS_CONTROL_BIT(XSAVES)
+#define SECONDARY_EXEC_ENABLE_XSAVES VMCS_CONTROL_BIT(XSAVES)
#define SECONDARY_EXEC_MODE_BASED_EPT_EXEC VMCS_CONTROL_BIT(MODE_BASED_EPT_EXEC)
#define SECONDARY_EXEC_PT_USE_GPA VMCS_CONTROL_BIT(PT_USE_GPA)
#define SECONDARY_EXEC_TSC_SCALING VMCS_CONTROL_BIT(TSC_SCALING)
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index d0abee35d7ba..41a4533f9989 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -252,7 +252,7 @@ static inline bool cpu_has_vmx_pml(void)
static inline bool cpu_has_vmx_xsaves(void)
{
return vmcs_config.cpu_based_2nd_exec_ctrl &
- SECONDARY_EXEC_XSAVES;
+ SECONDARY_EXEC_ENABLE_XSAVES;
}
static inline bool cpu_has_vmx_waitpkg(void)
diff --git a/arch/x86/kvm/vmx/hyperv.c b/arch/x86/kvm/vmx/hyperv.c
index 79450e1ed7cf..313b8bb5b8a7 100644
--- a/arch/x86/kvm/vmx/hyperv.c
+++ b/arch/x86/kvm/vmx/hyperv.c
@@ -78,7 +78,7 @@
SECONDARY_EXEC_DESC | \
SECONDARY_EXEC_ENABLE_RDTSCP | \
SECONDARY_EXEC_ENABLE_INVPCID | \
- SECONDARY_EXEC_XSAVES | \
+ SECONDARY_EXEC_ENABLE_XSAVES | \
SECONDARY_EXEC_RDSEED_EXITING | \
SECONDARY_EXEC_RDRAND_EXITING | \
SECONDARY_EXEC_TSC_SCALING | \
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 516391cc0d64..22e08d30baef 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2307,7 +2307,7 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
SECONDARY_EXEC_ENABLE_INVPCID |
SECONDARY_EXEC_ENABLE_RDTSCP |
- SECONDARY_EXEC_XSAVES |
+ SECONDARY_EXEC_ENABLE_XSAVES |
SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
SECONDARY_EXEC_APIC_REGISTER_VIRT |
@@ -6331,7 +6331,7 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu,
* If if it were, XSS would have to be checked against
* the XSS exit bitmap in vmcs12.
*/
- return nested_cpu_has2(vmcs12, SECONDARY_EXEC_XSAVES);
+ return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_XSAVES);
case EXIT_REASON_UMWAIT:
case EXIT_REASON_TPAUSE:
return nested_cpu_has2(vmcs12,
@@ -6874,7 +6874,7 @@ static void nested_vmx_setup_secondary_ctls(u32 ept_caps,
SECONDARY_EXEC_ENABLE_INVPCID |
SECONDARY_EXEC_ENABLE_VMFUNC |
SECONDARY_EXEC_RDSEED_EXITING |
- SECONDARY_EXEC_XSAVES |
+ SECONDARY_EXEC_ENABLE_XSAVES |
SECONDARY_EXEC_TSC_SCALING |
SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index 96952263b029..b4b9d51438c6 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -168,7 +168,7 @@ static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
static inline bool nested_cpu_has_xsaves(struct vmcs12 *vmcs12)
{
- return nested_cpu_has2(vmcs12, SECONDARY_EXEC_XSAVES);
+ return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_XSAVES);
}
static inline bool nested_cpu_has_pml(struct vmcs12 *vmcs12)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e358e3fa1ced..a0a47be2feed 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4589,7 +4589,7 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
if (cpu_has_vmx_xsaves())
vmx_adjust_secondary_exec_control(vmx, &exec_control,
- SECONDARY_EXEC_XSAVES,
+ SECONDARY_EXEC_ENABLE_XSAVES,
vcpu->arch.xsaves_enabled, false);
/*
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 32384ba38499..cde902b44d97 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -562,7 +562,7 @@ static inline u8 vmx_get_rvi(void)
SECONDARY_EXEC_APIC_REGISTER_VIRT | \
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY | \
SECONDARY_EXEC_SHADOW_VMCS | \
- SECONDARY_EXEC_XSAVES | \
+ SECONDARY_EXEC_ENABLE_XSAVES | \
SECONDARY_EXEC_RDSEED_EXITING | \
SECONDARY_EXEC_RDRAND_EXITING | \
SECONDARY_EXEC_ENABLE_PML | \
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 12/21] KVM: x86: Use KVM-governed feature framework to track "XSAVES enabled"
From: Sean Christopherson @ 2023-07-29 1:15 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Use the governed feature framework to track if XSAVES is "enabled", i.e.
if XSAVES can be used by the guest. Add a comment in the SVM code to
explain the very unintuitive logic of deliberately NOT checking if XSAVES
is enumerated in the guest CPUID model.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/svm/svm.c | 17 ++++++++++++++---
arch/x86/kvm/vmx/vmx.c | 32 ++++++++++++++++++--------------
arch/x86/kvm/x86.c | 4 ++--
4 files changed, 35 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index b29c15d5e038..b896a64e4ac3 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -6,6 +6,7 @@ BUILD_BUG()
#define KVM_GOVERNED_X86_FEATURE(x) KVM_GOVERNED_FEATURE(X86_FEATURE_##x)
KVM_GOVERNED_X86_FEATURE(GBPAGES)
+KVM_GOVERNED_X86_FEATURE(XSAVES)
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 64092df06f94..d5f8cb402eb7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4204,9 +4204,20 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
struct kvm_cpuid_entry2 *best;
- vcpu->arch.xsaves_enabled = guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) &&
- boot_cpu_has(X86_FEATURE_XSAVE) &&
- boot_cpu_has(X86_FEATURE_XSAVES);
+ /*
+ * SVM doesn't provide a way to disable just XSAVES in the guest, KVM
+ * can only disable all variants of XSAVE by disallowing CR4.OSXSAVE from
+ * being set. As a result, if the host has XSAVE and XSAVES, and the
+ * guest has XSAVE enabled, the guest can execute XSAVES without
+ * faulting. Treat XSAVES as enabled in this case regardless of
+ * whether it's advertised to the guest so that KVM context switches
+ * XSS on VM-Enter/VM-Exit. Failure to do so would effectively give
+ * the guest read/write access to the host's XSS.
+ */
+ if (boot_cpu_has(X86_FEATURE_XSAVE) &&
+ boot_cpu_has(X86_FEATURE_XSAVES) &&
+ guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
+ kvm_governed_feature_set(vcpu, X86_FEATURE_XSAVES);
/* Update nrips enabled cache */
svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a0a47be2feed..3100ed62615c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4518,16 +4518,19 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
* based on a single guest CPUID bit, with a dedicated feature bit. This also
* verifies that the control is actually supported by KVM and hardware.
*/
-#define vmx_adjust_sec_exec_control(vmx, exec_control, name, feat_name, ctrl_name, exiting) \
-({ \
- bool __enabled; \
- \
- if (cpu_has_vmx_##name()) { \
- __enabled = guest_cpuid_has(&(vmx)->vcpu, \
- X86_FEATURE_##feat_name); \
- vmx_adjust_secondary_exec_control(vmx, exec_control, \
- SECONDARY_EXEC_##ctrl_name, __enabled, exiting); \
- } \
+#define vmx_adjust_sec_exec_control(vmx, exec_control, name, feat_name, ctrl_name, exiting) \
+({ \
+ struct kvm_vcpu *__vcpu = &(vmx)->vcpu; \
+ bool __enabled; \
+ \
+ if (cpu_has_vmx_##name()) { \
+ if (kvm_is_governed_feature(X86_FEATURE_##feat_name)) \
+ __enabled = guest_can_use(__vcpu, X86_FEATURE_##feat_name); \
+ else \
+ __enabled = guest_cpuid_has(__vcpu, X86_FEATURE_##feat_name); \
+ vmx_adjust_secondary_exec_control(vmx, exec_control, SECONDARY_EXEC_##ctrl_name,\
+ __enabled, exiting); \
+ } \
})
/* More macro magic for ENABLE_/opt-in versus _EXITING/opt-out controls. */
@@ -4587,10 +4590,7 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
if (!enable_pml || !atomic_read(&vcpu->kvm->nr_memslots_dirty_logging))
exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
- if (cpu_has_vmx_xsaves())
- vmx_adjust_secondary_exec_control(vmx, &exec_control,
- SECONDARY_EXEC_ENABLE_XSAVES,
- vcpu->arch.xsaves_enabled, false);
+ vmx_adjust_sec_exec_feature(vmx, &exec_control, xsaves, XSAVES);
/*
* RDPID is also gated by ENABLE_RDTSCP, turn on the control if either
@@ -4609,6 +4609,7 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
SECONDARY_EXEC_ENABLE_RDTSCP,
rdpid_or_rdtscp_enabled, false);
}
+
vmx_adjust_sec_exec_feature(vmx, &exec_control, invpcid, INVPCID);
vmx_adjust_sec_exec_exiting(vmx, &exec_control, rdrand, RDRAND);
@@ -7722,6 +7723,9 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
boot_cpu_has(X86_FEATURE_XSAVE) &&
guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) &&
guest_cpuid_has(vcpu, X86_FEATURE_XSAVES);
+ if (boot_cpu_has(X86_FEATURE_XSAVE) &&
+ guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_XSAVES);
vmx_setup_uret_msrs(vmx);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5a14378ed4e1..201fa957ce9a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1012,7 +1012,7 @@ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
if (vcpu->arch.xcr0 != host_xcr0)
xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
- if (vcpu->arch.xsaves_enabled &&
+ if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
vcpu->arch.ia32_xss != host_xss)
wrmsrl(MSR_IA32_XSS, vcpu->arch.ia32_xss);
}
@@ -1043,7 +1043,7 @@ void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
if (vcpu->arch.xcr0 != host_xcr0)
xsetbv(XCR_XFEATURE_ENABLED_MASK, host_xcr0);
- if (vcpu->arch.xsaves_enabled &&
+ if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
vcpu->arch.ia32_xss != host_xss)
wrmsrl(MSR_IA32_XSS, host_xss);
}
--
2.41.0.487.g6d72f3e995-goog
* [PATCH v2 13/21] KVM: nVMX: Use KVM-governed feature framework to track "nested VMX enabled"
From: Sean Christopherson @ 2023-07-29 1:16 UTC
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Track "VMX exposed to L1" via a governed feature flag instead of using a
dedicated helper to provide the same functionality. The main goal is to
drive convergence between VMX and SVM with respect to querying features
that are controllable via module param (SVM likes to cache nested
features), avoiding the guest CPUID lookups at runtime is just a bonus
and unlikely to provide any meaningful performance benefits.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/vmx/nested.c | 7 ++++---
arch/x86/kvm/vmx/vmx.c | 21 ++++++---------------
arch/x86/kvm/vmx/vmx.h | 1 -
4 files changed, 11 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index b896a64e4ac3..22446614bf49 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -7,6 +7,7 @@ BUILD_BUG()
KVM_GOVERNED_X86_FEATURE(GBPAGES)
KVM_GOVERNED_X86_FEATURE(XSAVES)
+KVM_GOVERNED_X86_FEATURE(VMX)
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 22e08d30baef..c5ec0ef51ff7 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6426,7 +6426,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
vmx = to_vmx(vcpu);
vmcs12 = get_vmcs12(vcpu);
- if (nested_vmx_allowed(vcpu) &&
+ if (guest_can_use(vcpu, X86_FEATURE_VMX) &&
(vmx->nested.vmxon || vmx->nested.smm.vmxon)) {
kvm_state.hdr.vmx.vmxon_pa = vmx->nested.vmxon_ptr;
kvm_state.hdr.vmx.vmcs12_pa = vmx->nested.current_vmptr;
@@ -6567,7 +6567,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
if (kvm_state->flags & ~KVM_STATE_NESTED_EVMCS)
return -EINVAL;
} else {
- if (!nested_vmx_allowed(vcpu))
+ if (!guest_can_use(vcpu, X86_FEATURE_VMX))
return -EINVAL;
if (!page_address_valid(vcpu, kvm_state->hdr.vmx.vmxon_pa))
@@ -6601,7 +6601,8 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
return -EINVAL;
if ((kvm_state->flags & KVM_STATE_NESTED_EVMCS) &&
- (!nested_vmx_allowed(vcpu) || !vmx->nested.enlightened_vmcs_enabled))
+ (!guest_can_use(vcpu, X86_FEATURE_VMX) ||
+ !vmx->nested.enlightened_vmcs_enabled))
return -EINVAL;
vmx_leave_nested(vcpu);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3100ed62615c..fdf932cfc64d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1894,17 +1894,6 @@ static void vmx_write_tsc_multiplier(struct kvm_vcpu *vcpu)
vmcs_write64(TSC_MULTIPLIER, vcpu->arch.tsc_scaling_ratio);
}
-/*
- * nested_vmx_allowed() checks whether a guest should be allowed to use VMX
- * instructions and MSRs (i.e., nested VMX). Nested VMX is disabled for
- * all guests if the "nested" module option is off, and can also be disabled
- * for a single guest by disabling its VMX cpuid bit.
- */
-bool nested_vmx_allowed(struct kvm_vcpu *vcpu)
-{
- return nested && guest_cpuid_has(vcpu, X86_FEATURE_VMX);
-}
-
/*
* Userspace is allowed to set any supported IA32_FEATURE_CONTROL regardless of
* guest CPUID. Note, KVM allows userspace to set "VMX in SMX" to maintain
@@ -2032,7 +2021,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
break;
case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
- if (!nested_vmx_allowed(vcpu))
+ if (!guest_can_use(vcpu, X86_FEATURE_VMX))
return 1;
if (vmx_get_vmx_msr(&vmx->nested.msrs, msr_info->index,
&msr_info->data))
@@ -2340,7 +2329,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
if (!msr_info->host_initiated)
return 1; /* they are read-only */
- if (!nested_vmx_allowed(vcpu))
+ if (!guest_can_use(vcpu, X86_FEATURE_VMX))
return 1;
return vmx_set_vmx_msr(vcpu, msr_index, data);
case MSR_IA32_RTIT_CTL:
@@ -7727,13 +7716,15 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_XSAVES);
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VMX);
+
vmx_setup_uret_msrs(vmx);
if (cpu_has_secondary_exec_ctrls())
vmcs_set_secondary_exec_control(vmx,
vmx_secondary_exec_control(vmx));
- if (nested_vmx_allowed(vcpu))
+ if (guest_can_use(vcpu, X86_FEATURE_VMX))
vmx->msr_ia32_feature_control_valid_bits |=
FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
@@ -7742,7 +7733,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
- if (nested_vmx_allowed(vcpu))
+ if (guest_can_use(vcpu, X86_FEATURE_VMX))
nested_vmx_cr_fixed1_bits_update(vcpu);
if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index cde902b44d97..c2130d2c8e24 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -374,7 +374,6 @@ struct kvm_vmx {
u64 *pid_table;
};
-bool nested_vmx_allowed(struct kvm_vcpu *vcpu);
void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
struct loaded_vmcs *buddy);
int allocate_vpid(void);
--
2.41.0.487.g6d72f3e995-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 14/21] KVM: nSVM: Use KVM-governed feature framework to track "NRIPS enabled"
2023-07-29 1:15 [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
` (12 preceding siblings ...)
2023-07-29 1:16 ` [PATCH v2 13/21] KVM: nVMX: Use KVM-governed feature framework to track "nested VMX enabled" Sean Christopherson
@ 2023-07-29 1:16 ` Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 15/21] KVM: nSVM: Use KVM-governed feature framework to track "TSC scaling enabled" Sean Christopherson
` (7 subsequent siblings)
21 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2023-07-29 1:16 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Track "NRIPS exposed to L1" via a governed feature flag instead of using
a dedicated bit/flag in vcpu_svm.
No functional change intended.
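The single kvm_governed_feature_check_and_set() call below replaces the
open-coded kvm_cpu_cap_has() && guest_cpuid_has() pair; roughly (a sketch
based on the patch 07 helpers, not a verbatim copy):

static __always_inline void
kvm_governed_feature_check_and_set(struct kvm_vcpu *vcpu,
				   unsigned int x86_feature)
{
	/* Set the governed flag iff both KVM and guest CPUID expose it. */
	if (kvm_cpu_cap_has(x86_feature) && guest_cpuid_has(vcpu, x86_feature))
		kvm_governed_feature_set(vcpu, x86_feature);
}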
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/svm/nested.c | 6 +++---
arch/x86/kvm/svm/svm.c | 4 +---
arch/x86/kvm/svm/svm.h | 1 -
4 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index 22446614bf49..722b66af412c 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -8,6 +8,7 @@ BUILD_BUG()
KVM_GOVERNED_X86_FEATURE(GBPAGES)
KVM_GOVERNED_X86_FEATURE(XSAVES)
KVM_GOVERNED_X86_FEATURE(VMX)
+KVM_GOVERNED_X86_FEATURE(NRIPS)
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 3342cc4a5189..9092f3f8dccf 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -716,7 +716,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
* what a nrips=0 CPU would do (L1 is responsible for advancing RIP
* prior to injecting the event).
*/
- if (svm->nrips_enabled)
+ if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
vmcb02->control.next_rip = svm->nested.ctl.next_rip;
else if (boot_cpu_has(X86_FEATURE_NRIPS))
vmcb02->control.next_rip = vmcb12_rip;
@@ -726,7 +726,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
svm->soft_int_injected = true;
svm->soft_int_csbase = vmcb12_csbase;
svm->soft_int_old_rip = vmcb12_rip;
- if (svm->nrips_enabled)
+ if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
svm->soft_int_next_rip = svm->nested.ctl.next_rip;
else
svm->soft_int_next_rip = vmcb12_rip;
@@ -1026,7 +1026,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
if (vmcb12->control.exit_code != SVM_EXIT_ERR)
nested_save_pending_event_to_vmcb12(svm, vmcb12);
- if (svm->nrips_enabled)
+ if (guest_can_use(vcpu, X86_FEATURE_NRIPS))
vmcb12->control.next_rip = vmcb02->control.next_rip;
vmcb12->control.int_ctl = svm->nested.ctl.int_ctl;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d5f8cb402eb7..7c1aa532f767 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4219,9 +4219,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
kvm_governed_feature_set(vcpu, X86_FEATURE_XSAVES);
- /* Update nrips enabled cache */
- svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
- guest_cpuid_has(vcpu, X86_FEATURE_NRIPS);
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_NRIPS);
svm->tsc_scaling_enabled = tsc_scaling && guest_cpuid_has(vcpu, X86_FEATURE_TSCRATEMSR);
svm->lbrv_enabled = lbrv && guest_cpuid_has(vcpu, X86_FEATURE_LBRV);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 5829a1801862..c06de55425d4 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -259,7 +259,6 @@ struct vcpu_svm {
bool soft_int_injected;
/* optional nested SVM features that are enabled for this guest */
- bool nrips_enabled : 1;
bool tsc_scaling_enabled : 1;
bool v_vmload_vmsave_enabled : 1;
bool lbrv_enabled : 1;
--
2.41.0.487.g6d72f3e995-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 15/21] KVM: nSVM: Use KVM-governed feature framework to track "TSC scaling enabled"
2023-07-29 1:15 [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
` (13 preceding siblings ...)
2023-07-29 1:16 ` [PATCH v2 14/21] KVM: nSVM: Use KVM-governed feature framework to track "NRIPS enabled" Sean Christopherson
@ 2023-07-29 1:16 ` Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 16/21] KVM: nSVM: Use KVM-governed feature framework to track "vVM{SAVE,LOAD} enabled" Sean Christopherson
` (6 subsequent siblings)
21 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2023-07-29 1:16 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Track "TSC scaling exposed to L1" via a governed feature flag instead of
using a dedicated bit/flag in vcpu_svm.
Note, this fixes a benign bug where KVM would mark TSC scaling as exposed
to L1 even if nested SVM support is disabled overall, i.e. KVM would let
L1 write MSR_AMD64_TSC_RATIO even when KVM didn't advertise TSCRATEMSR
support to userspace.
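The behavioral delta in condensed form (assuming, per svm_set_cpu_caps(),
that kvm_cpu_cap_has(X86_FEATURE_TSCRATEMSR) is true only when both the
"nested" and "tsc_scaling" module params are enabled):

	/* Before: consults only the module param, even if nested SVM is off. */
	svm->tsc_scaling_enabled = tsc_scaling &&
				   guest_cpuid_has(vcpu, X86_FEATURE_TSCRATEMSR);

	/* After: KVM's caps are clear when nested SVM is off, closing the hole. */
	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);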
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/svm/nested.c | 2 +-
arch/x86/kvm/svm/svm.c | 10 ++++++----
arch/x86/kvm/svm/svm.h | 1 -
4 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index 722b66af412c..32c0469cf952 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -9,6 +9,7 @@ KVM_GOVERNED_X86_FEATURE(GBPAGES)
KVM_GOVERNED_X86_FEATURE(XSAVES)
KVM_GOVERNED_X86_FEATURE(VMX)
KVM_GOVERNED_X86_FEATURE(NRIPS)
+KVM_GOVERNED_X86_FEATURE(TSCRATEMSR)
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 9092f3f8dccf..da65948064dc 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -695,7 +695,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
vmcb02->control.tsc_offset = vcpu->arch.tsc_offset;
- if (svm->tsc_scaling_enabled &&
+ if (guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR) &&
svm->tsc_ratio_msr != kvm_caps.default_tsc_scaling_ratio)
nested_svm_update_tsc_ratio_msr(vcpu);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7c1aa532f767..2f7f7df5a591 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2755,7 +2755,8 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
switch (msr_info->index) {
case MSR_AMD64_TSC_RATIO:
- if (!msr_info->host_initiated && !svm->tsc_scaling_enabled)
+ if (!msr_info->host_initiated &&
+ !guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR))
return 1;
msr_info->data = svm->tsc_ratio_msr;
break;
@@ -2897,7 +2898,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
switch (ecx) {
case MSR_AMD64_TSC_RATIO:
- if (!svm->tsc_scaling_enabled) {
+ if (!guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR)) {
if (!msr->host_initiated)
return 1;
@@ -2919,7 +2920,8 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
svm->tsc_ratio_msr = data;
- if (svm->tsc_scaling_enabled && is_guest_mode(vcpu))
+ if (guest_can_use(vcpu, X86_FEATURE_TSCRATEMSR) &&
+ is_guest_mode(vcpu))
nested_svm_update_tsc_ratio_msr(vcpu);
break;
@@ -4220,8 +4222,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
kvm_governed_feature_set(vcpu, X86_FEATURE_XSAVES);
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_NRIPS);
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
- svm->tsc_scaling_enabled = tsc_scaling && guest_cpuid_has(vcpu, X86_FEATURE_TSCRATEMSR);
svm->lbrv_enabled = lbrv && guest_cpuid_has(vcpu, X86_FEATURE_LBRV);
svm->v_vmload_vmsave_enabled = vls && guest_cpuid_has(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index c06de55425d4..4e7332f77702 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -259,7 +259,6 @@ struct vcpu_svm {
bool soft_int_injected;
/* optional nested SVM features that are enabled for this guest */
- bool tsc_scaling_enabled : 1;
bool v_vmload_vmsave_enabled : 1;
bool lbrv_enabled : 1;
bool pause_filter_enabled : 1;
--
2.41.0.487.g6d72f3e995-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 16/21] KVM: nSVM: Use KVM-governed feature framework to track "vVM{SAVE,LOAD} enabled"
2023-07-29 1:15 [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
` (14 preceding siblings ...)
2023-07-29 1:16 ` [PATCH v2 15/21] KVM: nSVM: Use KVM-governed feature framework to track "TSC scaling enabled" Sean Christopherson
@ 2023-07-29 1:16 ` Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 17/21] KVM: nSVM: Use KVM-governed feature framework to track "LBRv enabled" Sean Christopherson
` (5 subsequent siblings)
21 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2023-07-29 1:16 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Track "virtual VMSAVE/VMLOAD exposed to L1" via a governed feature flag
instead of using a dedicated bit/flag in vcpu_svm.
Opportunistically add a comment explaining why KVM disallows virtual
VMLOAD/VMSAVE when the vCPU model is Intel.
No functional change intended.
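The "vCPU model is Intel" check relies on guest_cpuid_is_intel(); its
gist, paraphrased from KVM's cpuid.h (details from memory):

static inline bool guest_cpuid_is_intel(struct kvm_vcpu *vcpu)
{
	struct kvm_cpuid_entry2 *best;

	/* CPUID leaf 0 holds the vendor string, e.g. "GenuineIntel". */
	best = kvm_find_cpuid_entry(vcpu, 0);
	return best && is_guest_vendor_intel(best->ebx, best->ecx, best->edx);
}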
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/svm/nested.c | 2 +-
arch/x86/kvm/svm/svm.c | 10 +++++++---
arch/x86/kvm/svm/svm.h | 1 -
4 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index 32c0469cf952..f01a95fd0071 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -10,6 +10,7 @@ KVM_GOVERNED_X86_FEATURE(XSAVES)
KVM_GOVERNED_X86_FEATURE(VMX)
KVM_GOVERNED_X86_FEATURE(NRIPS)
KVM_GOVERNED_X86_FEATURE(TSCRATEMSR)
+KVM_GOVERNED_X86_FEATURE(V_VMSAVE_VMLOAD)
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index da65948064dc..24d47ebeb0e0 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -107,7 +107,7 @@ static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu)
static bool nested_vmcb_needs_vls_intercept(struct vcpu_svm *svm)
{
- if (!svm->v_vmload_vmsave_enabled)
+ if (!guest_can_use(&svm->vcpu, X86_FEATURE_V_VMSAVE_VMLOAD))
return true;
if (!nested_npt_enabled(svm))
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2f7f7df5a591..39a69c1786ea 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1154,8 +1154,6 @@ static inline void init_vmcb_after_set_cpuid(struct kvm_vcpu *vcpu)
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_SYSENTER_EIP, 0, 0);
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_SYSENTER_ESP, 0, 0);
-
- svm->v_vmload_vmsave_enabled = false;
} else {
/*
* If hardware supports Virtual VMLOAD VMSAVE then enable it
@@ -4226,7 +4224,13 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
svm->lbrv_enabled = lbrv && guest_cpuid_has(vcpu, X86_FEATURE_LBRV);
- svm->v_vmload_vmsave_enabled = vls && guest_cpuid_has(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
+ /*
+ * Intercept VMLOAD if the vCPU mode is Intel in order to emulate that
+ * VMLOAD drops bits 63:32 of SYSENTER (ignoring the fact that exposing
+ * SVM on Intel is bonkers and extremely unlikely to work).
+ */
+ if (!guest_cpuid_is_intel(vcpu))
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
svm->pause_filter_enabled = kvm_cpu_cap_has(X86_FEATURE_PAUSEFILTER) &&
guest_cpuid_has(vcpu, X86_FEATURE_PAUSEFILTER);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 4e7332f77702..b475241df6dc 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -259,7 +259,6 @@ struct vcpu_svm {
bool soft_int_injected;
/* optional nested SVM features that are enabled for this guest */
- bool v_vmload_vmsave_enabled : 1;
bool lbrv_enabled : 1;
bool pause_filter_enabled : 1;
bool pause_threshold_enabled : 1;
--
2.41.0.487.g6d72f3e995-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 17/21] KVM: nSVM: Use KVM-governed feature framework to track "LBRv enabled"
2023-07-29 1:15 [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
` (15 preceding siblings ...)
2023-07-29 1:16 ` [PATCH v2 16/21] KVM: nSVM: Use KVM-governed feature framework to track "vVM{SAVE,LOAD} enabled" Sean Christopherson
@ 2023-07-29 1:16 ` Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 18/21] KVM: nSVM: Use KVM-governed feature framework to track "Pause Filter enabled" Sean Christopherson
` (4 subsequent siblings)
21 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2023-07-29 1:16 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Track "LBR virtualization exposed to L1" via a governed feature flag
instead of using a dedicated bit/flag in vcpu_svm.
Note, checking KVM's capabilities instead of the "lbrv" param means that
the code isn't strictly equivalent, as lbrv_enabled could have been set
if nested=false, whereas the governed feature cannot. But that's a
glorified nop as the feature/flag is consumed only by paths that are
gated by nSVM being enabled.
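For context on the caps-vs-param distinction, the relevant setup lives in
svm_set_cpu_caps(), condensed here from memory:

	/* The LBRV cap, and thus the governed flag, requires nested SVM. */
	if (nested) {
		kvm_cpu_cap_set(X86_FEATURE_SVM);
		if (lbrv)
			kvm_cpu_cap_set(X86_FEATURE_LBRV);
	}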
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/svm/nested.c | 23 +++++++++++++----------
arch/x86/kvm/svm/svm.c | 7 ++++---
arch/x86/kvm/svm/svm.h | 1 -
4 files changed, 18 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index f01a95fd0071..3a4c0e40e1e0 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -11,6 +11,7 @@ KVM_GOVERNED_X86_FEATURE(VMX)
KVM_GOVERNED_X86_FEATURE(NRIPS)
KVM_GOVERNED_X86_FEATURE(TSCRATEMSR)
KVM_GOVERNED_X86_FEATURE(V_VMSAVE_VMLOAD)
+KVM_GOVERNED_X86_FEATURE(LBRV)
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 24d47ebeb0e0..f50f74b1a04e 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -552,6 +552,7 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
bool new_vmcb12 = false;
struct vmcb *vmcb01 = svm->vmcb01.ptr;
struct vmcb *vmcb02 = svm->nested.vmcb02.ptr;
+ struct kvm_vcpu *vcpu = &svm->vcpu;
nested_vmcb02_compute_g_pat(svm);
@@ -577,18 +578,18 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
vmcb_mark_dirty(vmcb02, VMCB_DT);
}
- kvm_set_rflags(&svm->vcpu, vmcb12->save.rflags | X86_EFLAGS_FIXED);
+ kvm_set_rflags(vcpu, vmcb12->save.rflags | X86_EFLAGS_FIXED);
- svm_set_efer(&svm->vcpu, svm->nested.save.efer);
+ svm_set_efer(vcpu, svm->nested.save.efer);
- svm_set_cr0(&svm->vcpu, svm->nested.save.cr0);
- svm_set_cr4(&svm->vcpu, svm->nested.save.cr4);
+ svm_set_cr0(vcpu, svm->nested.save.cr0);
+ svm_set_cr4(vcpu, svm->nested.save.cr4);
svm->vcpu.arch.cr2 = vmcb12->save.cr2;
- kvm_rax_write(&svm->vcpu, vmcb12->save.rax);
- kvm_rsp_write(&svm->vcpu, vmcb12->save.rsp);
- kvm_rip_write(&svm->vcpu, vmcb12->save.rip);
+ kvm_rax_write(vcpu, vmcb12->save.rax);
+ kvm_rsp_write(vcpu, vmcb12->save.rsp);
+ kvm_rip_write(vcpu, vmcb12->save.rip);
/* In case we don't even reach vcpu_run, the fields are not updated */
vmcb02->save.rax = vmcb12->save.rax;
@@ -602,7 +603,8 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
vmcb_mark_dirty(vmcb02, VMCB_DR);
}
- if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
+ if (unlikely(guest_can_use(vcpu, X86_FEATURE_LBRV) &&
+ (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
/*
* Reserved bits of DEBUGCTL are ignored. Be consistent with
* svm_set_msr's definition of reserved bits.
@@ -734,7 +736,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
vmcb02->control.virt_ext = vmcb01->control.virt_ext &
LBR_CTL_ENABLE_MASK;
- if (svm->lbrv_enabled)
+ if (guest_can_use(vcpu, X86_FEATURE_LBRV))
vmcb02->control.virt_ext |=
(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK);
@@ -1065,7 +1067,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
if (!nested_exit_on_intr(svm))
kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
- if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
+ if (unlikely(guest_can_use(vcpu, X86_FEATURE_LBRV) &&
+ (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
svm_copy_lbrs(vmcb12, vmcb02);
svm_update_lbrv(vcpu);
} else if (unlikely(vmcb01->control.virt_ext & LBR_CTL_ENABLE_MASK)) {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 39a69c1786ea..a83fa6df7c04 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -984,9 +984,11 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
bool current_enable_lbrv = !!(svm->vmcb->control.virt_ext &
LBR_CTL_ENABLE_MASK);
- if (unlikely(is_guest_mode(vcpu) && svm->lbrv_enabled))
+ if (unlikely(is_guest_mode(vcpu) &&
+ guest_can_use(vcpu, X86_FEATURE_LBRV))) {
if (unlikely(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))
enable_lbrv = true;
+ }
if (enable_lbrv == current_enable_lbrv)
return;
@@ -4221,8 +4223,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_NRIPS);
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
-
- svm->lbrv_enabled = lbrv && guest_cpuid_has(vcpu, X86_FEATURE_LBRV);
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_LBRV);
/*
* Intercept VMLOAD if the vCPU mode is Intel in order to emulate that
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index b475241df6dc..0e21823e8a19 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -259,7 +259,6 @@ struct vcpu_svm {
bool soft_int_injected;
/* optional nested SVM features that are enabled for this guest */
- bool lbrv_enabled : 1;
bool pause_filter_enabled : 1;
bool pause_threshold_enabled : 1;
bool vgif_enabled : 1;
--
2.41.0.487.g6d72f3e995-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 18/21] KVM: nSVM: Use KVM-governed feature framework to track "Pause Filter enabled"
2023-07-29 1:15 [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
` (16 preceding siblings ...)
2023-07-29 1:16 ` [PATCH v2 17/21] KVM: nSVM: Use KVM-governed feature framework to track "LBRv enabled" Sean Christopherson
@ 2023-07-29 1:16 ` Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 19/21] KVM: nSVM: Use KVM-governed feature framework to track "vGIF enabled" Sean Christopherson
` (3 subsequent siblings)
21 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2023-07-29 1:16 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Track "Pause Filtering is exposed to L1" via governed feature flags
instead of using dedicated bits/flags in vcpu_svm.
No functional change intended.
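Note, the if/else blocks in the hunk below are semantically identical to
the old ternaries, e.g. the same logic could also be written as:

	pause_count12 = guest_can_use(vcpu, X86_FEATURE_PAUSEFILTER) ?
			svm->nested.ctl.pause_filter_count : 0;
	pause_thresh12 = guest_can_use(vcpu, X86_FEATURE_PFTHRESHOLD) ?
			 svm->nested.ctl.pause_filter_thresh : 0;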
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/governed_features.h | 2 ++
arch/x86/kvm/svm/nested.c | 10 ++++++++--
arch/x86/kvm/svm/svm.c | 7 ++-----
arch/x86/kvm/svm/svm.h | 2 --
4 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index 3a4c0e40e1e0..9afd34f30599 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -12,6 +12,8 @@ KVM_GOVERNED_X86_FEATURE(NRIPS)
KVM_GOVERNED_X86_FEATURE(TSCRATEMSR)
KVM_GOVERNED_X86_FEATURE(V_VMSAVE_VMLOAD)
KVM_GOVERNED_X86_FEATURE(LBRV)
+KVM_GOVERNED_X86_FEATURE(PAUSEFILTER)
+KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD)
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index f50f74b1a04e..ac03b2bc5b2c 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -743,8 +743,14 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
if (!nested_vmcb_needs_vls_intercept(svm))
vmcb02->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
- pause_count12 = svm->pause_filter_enabled ? svm->nested.ctl.pause_filter_count : 0;
- pause_thresh12 = svm->pause_threshold_enabled ? svm->nested.ctl.pause_filter_thresh : 0;
+ if (guest_can_use(vcpu, X86_FEATURE_PAUSEFILTER))
+ pause_count12 = svm->nested.ctl.pause_filter_count;
+ else
+ pause_count12 = 0;
+ if (guest_can_use(vcpu, X86_FEATURE_PFTHRESHOLD))
+ pause_thresh12 = svm->nested.ctl.pause_filter_thresh;
+ else
+ pause_thresh12 = 0;
if (kvm_pause_in_guest(svm->vcpu.kvm)) {
/* use guest values since host doesn't intercept PAUSE */
vmcb02->control.pause_filter_count = pause_count12;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a83fa6df7c04..be3a11f00f4e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4233,11 +4233,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
if (!guest_cpuid_is_intel(vcpu))
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
- svm->pause_filter_enabled = kvm_cpu_cap_has(X86_FEATURE_PAUSEFILTER) &&
- guest_cpuid_has(vcpu, X86_FEATURE_PAUSEFILTER);
-
- svm->pause_threshold_enabled = kvm_cpu_cap_has(X86_FEATURE_PFTHRESHOLD) &&
- guest_cpuid_has(vcpu, X86_FEATURE_PFTHRESHOLD);
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
svm->vgif_enabled = vgif && guest_cpuid_has(vcpu, X86_FEATURE_VGIF);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 0e21823e8a19..fb438439b61e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -259,8 +259,6 @@ struct vcpu_svm {
bool soft_int_injected;
/* optional nested SVM features that are enabled for this guest */
- bool pause_filter_enabled : 1;
- bool pause_threshold_enabled : 1;
bool vgif_enabled : 1;
bool vnmi_enabled : 1;
--
2.41.0.487.g6d72f3e995-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 19/21] KVM: nSVM: Use KVM-governed feature framework to track "vGIF enabled"
2023-07-29 1:15 [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
` (17 preceding siblings ...)
2023-07-29 1:16 ` [PATCH v2 18/21] KVM: nSVM: Use KVM-governed feature framework to track "Pause Filter enabled" Sean Christopherson
@ 2023-07-29 1:16 ` Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 20/21] KVM: nSVM: Use KVM-governed feature framework to track "vNMI enabled" Sean Christopherson
` (2 subsequent siblings)
21 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2023-07-29 1:16 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Track "virtual GIF exposed to L1" via a governed feature flag instead of
using a dedicated bit/flag in vcpu_svm.
Note, checking KVM's capabilities instead of the "vgif" param means that
the code isn't strictly equivalent, as vgif_enabled could have been set
if nested=false, whereas the governed feature cannot. But that's a
glorified nop as the feature/flag is consumed only by paths that are
gated by nSVM being enabled.
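For reference, the governed flag ultimately steers which VMCB tracks GIF;
a condensed view of one consumer (sketched from memory, not part of this
patch's diff):

static inline struct vmcb *get_vgif_vmcb(struct vcpu_svm *svm)
{
	if (!vgif)
		return NULL;

	/* L1's GIF lives in vmcb01 when L2 isn't using vGIF itself. */
	if (is_guest_mode(&svm->vcpu) && !nested_vgif_enabled(svm))
		return svm->vmcb01.ptr;
	else
		return svm->vmcb;
}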
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/svm/nested.c | 3 ++-
arch/x86/kvm/svm/svm.c | 3 +--
arch/x86/kvm/svm/svm.h | 5 +++--
4 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index 9afd34f30599..368696c2e96b 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -14,6 +14,7 @@ KVM_GOVERNED_X86_FEATURE(V_VMSAVE_VMLOAD)
KVM_GOVERNED_X86_FEATURE(LBRV)
KVM_GOVERNED_X86_FEATURE(PAUSEFILTER)
KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD)
+KVM_GOVERNED_X86_FEATURE(VGIF)
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index ac03b2bc5b2c..dd496c9e5f91 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -660,7 +660,8 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
* exit_int_info, exit_int_info_err, next_rip, insn_len, insn_bytes.
*/
- if (svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK))
+ if (guest_can_use(vcpu, X86_FEATURE_VGIF) &&
+ (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK))
int_ctl_vmcb12_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK);
else
int_ctl_vmcb01_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index be3a11f00f4e..6d9bb4453f2d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4235,8 +4235,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
-
- svm->vgif_enabled = vgif && guest_cpuid_has(vcpu, X86_FEATURE_VGIF);
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VGIF);
svm->vnmi_enabled = vnmi && guest_cpuid_has(vcpu, X86_FEATURE_VNMI);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index fb438439b61e..6eb5877cc6c3 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -22,6 +22,7 @@
#include <asm/svm.h>
#include <asm/sev-common.h>
+#include "cpuid.h"
#include "kvm_cache_regs.h"
#define __sme_page_pa(x) __sme_set(page_to_pfn(x) << PAGE_SHIFT)
@@ -259,7 +260,6 @@ struct vcpu_svm {
bool soft_int_injected;
/* optional nested SVM features that are enabled for this guest */
- bool vgif_enabled : 1;
bool vnmi_enabled : 1;
u32 ldr_reg;
@@ -485,7 +485,8 @@ static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
{
- return svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
+ return guest_can_use(&svm->vcpu, X86_FEATURE_VGIF) &&
+ (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
}
static inline struct vmcb *get_vgif_vmcb(struct vcpu_svm *svm)
--
2.41.0.487.g6d72f3e995-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 20/21] KVM: nSVM: Use KVM-governed feature framework to track "vNMI enabled"
2023-07-29 1:15 [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
` (18 preceding siblings ...)
2023-07-29 1:16 ` [PATCH v2 19/21] KVM: nSVM: Use KVM-governed feature framework to track "vGIF enabled" Sean Christopherson
@ 2023-07-29 1:16 ` Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 21/21] KVM: x86: Disallow guest CPUID lookups when IRQs are disabled Sean Christopherson
2023-08-04 0:40 ` [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
21 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2023-07-29 1:16 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Track "virtual NMI exposed to L1" via a governed feature flag instead of
using a dedicated bit/flag in vcpu_svm.
Note, checking KVM's capabilities instead of the "vnmi" param means that
the code isn't strictly equivalent, as vnmi_enabled could have been set
if nested=false, whereas the governed feature cannot. But that's a
glorified nop as the feature/flag is consumed only by paths that are
gated by nSVM being enabled.
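With this patch applied, the full set of governed features, accumulated
from the governed_features.h hunks in this series, is:

KVM_GOVERNED_X86_FEATURE(GBPAGES)
KVM_GOVERNED_X86_FEATURE(XSAVES)
KVM_GOVERNED_X86_FEATURE(VMX)
KVM_GOVERNED_X86_FEATURE(NRIPS)
KVM_GOVERNED_X86_FEATURE(TSCRATEMSR)
KVM_GOVERNED_X86_FEATURE(V_VMSAVE_VMLOAD)
KVM_GOVERNED_X86_FEATURE(LBRV)
KVM_GOVERNED_X86_FEATURE(PAUSEFILTER)
KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD)
KVM_GOVERNED_X86_FEATURE(VGIF)
KVM_GOVERNED_X86_FEATURE(VNMI)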
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/svm/svm.c | 3 +--
arch/x86/kvm/svm/svm.h | 5 +----
3 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index 368696c2e96b..423a73395c10 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -15,6 +15,7 @@ KVM_GOVERNED_X86_FEATURE(LBRV)
KVM_GOVERNED_X86_FEATURE(PAUSEFILTER)
KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD)
KVM_GOVERNED_X86_FEATURE(VGIF)
+KVM_GOVERNED_X86_FEATURE(VNMI)
#undef KVM_GOVERNED_X86_FEATURE
#undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 6d9bb4453f2d..89cc9f4f3ddc 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4236,8 +4236,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PAUSEFILTER);
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VGIF);
-
- svm->vnmi_enabled = vnmi && guest_cpuid_has(vcpu, X86_FEATURE_VNMI);
+ kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VNMI);
svm_recalc_instruction_intercepts(vcpu, svm);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6eb5877cc6c3..06400cfe2244 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -259,9 +259,6 @@ struct vcpu_svm {
unsigned long soft_int_next_rip;
bool soft_int_injected;
- /* optional nested SVM features that are enabled for this guest */
- bool vnmi_enabled : 1;
-
u32 ldr_reg;
u32 dfr_reg;
struct page *avic_backing_page;
@@ -537,7 +534,7 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
static inline bool nested_vnmi_enabled(struct vcpu_svm *svm)
{
- return svm->vnmi_enabled &&
+ return guest_can_use(&svm->vcpu, X86_FEATURE_VNMI) &&
(svm->nested.ctl.int_ctl & V_NMI_ENABLE_MASK);
}
--
2.41.0.487.g6d72f3e995-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 21/21] KVM: x86: Disallow guest CPUID lookups when IRQs are disabled
2023-07-29 1:15 [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
` (19 preceding siblings ...)
2023-07-29 1:16 ` [PATCH v2 20/21] KVM: nSVM: Use KVM-governed feature framework to track "vNMI enabled" Sean Christopherson
@ 2023-07-29 1:16 ` Sean Christopherson
2023-08-04 0:40 ` [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
21 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2023-07-29 1:16 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
Now that KVM has a framework for caching guest CPUID feature flags, add
a "rule" that IRQs must be enabled when doing guest CPUID lookups, and
enforce the rule via a lockdep assertion. CPUID lookups are slow, and
within KVM, IRQs are only ever disabled in hot paths, e.g. the core run
loop, fast page fault handling, etc. I.e. querying guest CPUID with IRQs
disabled, especially in the run loop, should be avoided.
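For reference, the assertion boils down to a one-time warning when IRQs
are off; its definition, sketched from include/linux/lockdep.h (from
memory, and a no-op when lockdep is disabled):

#define lockdep_assert_irqs_enabled()					\
do {									\
	WARN_ON_ONCE(__lockdep_enabled && !this_cpu_read(hardirqs_enabled)); \
} while (0)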
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/cpuid.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index f74d6c404551..4b14bd9c5637 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -11,6 +11,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/kvm_host.h>
+#include "linux/lockdep.h"
#include <linux/export.h>
#include <linux/vmalloc.h>
#include <linux/uaccess.h>
@@ -84,6 +85,18 @@ static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
struct kvm_cpuid_entry2 *e;
int i;
+ /*
+ * KVM has a semi-arbitrary rule that querying the guest's CPUID model
+ * with IRQs disabled is disallowed. The CPUID model can legitimately
+ * have over one hundred entries, i.e. the lookup is slow, and IRQs are
+ * typically disabled in KVM only when KVM is in a performance critical
+ * path, e.g. the core VM-Enter/VM-Exit run loop. Nothing will break
+ * if this rule is violated, this assertion is purely to flag potential
+ * performance issues. If this fires, consider moving the lookup out
+ * of the hotpath, e.g. by caching information during CPUID updates.
+ */
+ lockdep_assert_irqs_enabled();
+
for (i = 0; i < nent; i++) {
e = &entries[i];
--
2.41.0.487.g6d72f3e995-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework
2023-07-29 1:15 [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
` (20 preceding siblings ...)
2023-07-29 1:16 ` [PATCH v2 21/21] KVM: x86: Disallow guest CPUID lookups when IRQs are disabled Sean Christopherson
@ 2023-08-04 0:40 ` Sean Christopherson
21 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2023-08-04 0:40 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm, linux-kernel, Maxim Levitsky
On Fri, 28 Jul 2023 18:15:47 -0700, Sean Christopherson wrote:
> Add a framework to manage and cache KVM-governed features, i.e. CPUID
> based features that require explicit KVM enabling and/or need to be
> queried semi-frequently by KVM. The idea originally came up in the
> context of the architectural LBRs series as a way to avoid querying
> guest CPUID in hot paths without needing a dedicated flag, but as
> evidenced by the shortlog, the most common usage is to handle the ever-
> growing list of SVM features that are exposed to L1.
>
> [...]
Applied to kvm-x86 misc. I'm still hoping for reviews before sending this on
to Paolo, but I also want to get this into -next sooner than later. And I
tried to grab everything else for "misc" ahead of time, so hopefully there won't
be much, if any, collateral damage if I need to unwind.
[01/21] KVM: nSVM: Check instead of asserting on nested TSC scaling support
https://github.com/kvm-x86/linux/commit/06b2c8f7d994
[02/21] KVM: nSVM: Load L1's TSC multiplier based on L1 state, not L2 state
https://github.com/kvm-x86/linux/commit/3b5ebb78fdc4
[03/21] KVM: nSVM: Use the "outer" helper for writing multiplier to MSR_AMD64_TSC_RATIO
https://github.com/kvm-x86/linux/commit/b5b10353a0c6
[04/21] KVM: SVM: Clean up preemption toggling related to MSR_AMD64_TSC_RATIO
https://github.com/kvm-x86/linux/commit/3e6e20a49690
[05/21] KVM: x86: Always write vCPU's current TSC offset/ratio in vendor hooks
https://github.com/kvm-x86/linux/commit/c1a3910b30f6
[06/21] KVM: nSVM: Skip writes to MSR_AMD64_TSC_RATIO if guest state isn't loaded
https://github.com/kvm-x86/linux/commit/923428d7f741
[07/21] KVM: x86: Add a framework for enabling KVM-governed x86 features
https://github.com/kvm-x86/linux/commit/ed425263d977
[08/21] KVM: x86/mmu: Use KVM-governed feature framework to track "GBPAGES enabled"
https://github.com/kvm-x86/linux/commit/f274417a0920
[09/21] KVM: VMX: Recompute "XSAVES enabled" only after CPUID update
https://github.com/kvm-x86/linux/commit/73db29bfc576
[10/21] KVM: VMX: Check KVM CPU caps, not just VMX MSR support, for XSAVE enabling
https://github.com/kvm-x86/linux/commit/82816e485b02
[11/21] KVM: VMX: Rename XSAVES control to follow KVM's preferred "ENABLE_XYZ"
https://github.com/kvm-x86/linux/commit/080b5890f88e
[12/21] KVM: x86: Use KVM-governed feature framework to track "XSAVES enabled"
https://github.com/kvm-x86/linux/commit/ae1f7788d4a3
[13/21] KVM: nVMX: Use KVM-governed feature framework to track "nested VMX enabled"
https://github.com/kvm-x86/linux/commit/d60c21f44fe2
[14/21] KVM: nSVM: Use KVM-governed feature framework to track "NRIPS enabled"
https://github.com/kvm-x86/linux/commit/f32e9ba4ff57
[15/21] KVM: nSVM: Use KVM-governed feature framework to track "TSC scaling enabled"
https://github.com/kvm-x86/linux/commit/417293d8eebc
[16/21] KVM: nSVM: Use KVM-governed feature framework to track "vVM{SAVE,LOAD} enabled"
https://github.com/kvm-x86/linux/commit/a179919b6b08
[17/21] KVM: nSVM: Use KVM-governed feature framework to track "LBRv enabled"
https://github.com/kvm-x86/linux/commit/2a688180ae59
[18/21] KVM: nSVM: Use KVM-governed feature framework to track "Pause Filter enabled"
https://github.com/kvm-x86/linux/commit/2b4a6cce48b6
[19/21] KVM: nSVM: Use KVM-governed feature framework to track "vGIF enabled"
https://github.com/kvm-x86/linux/commit/994e74975227
[20/21] KVM: nSVM: Use KVM-governed feature framework to track "vNMI enabled"
https://github.com/kvm-x86/linux/commit/17e2299ac6a6
[21/21] KVM: x86: Disallow guest CPUID lookups when IRQs are disabled
https://github.com/kvm-x86/linux/commit/20b18b1dea4c
--
https://github.com/kvm-x86/linux/tree/next
https://github.com/kvm-x86/linux/tree/fixes
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 07/21] KVM: x86: Add a framework for enabling KVM-governed x86 features
2023-07-29 1:15 ` [PATCH v2 07/21] KVM: x86: Add a framework for enabling KVM-governed x86 features Sean Christopherson
@ 2023-08-14 4:43 ` Zeng Guang
2023-08-14 17:20 ` Sean Christopherson
0 siblings, 1 reply; 27+ messages in thread
From: Zeng Guang @ 2023-08-14 4:43 UTC (permalink / raw)
To: Christopherson, Sean, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Maxim Levitsky
On 7/29/2023 9:15 AM, Sean Christopherson wrote:
> Introduce yet another X86_FEATURE flag framework to manage and cache KVM
> governed features (for lack of a better name). "Governed" in this case
> means that KVM has some level of involvement and/or vested interest in
> whether or not an X86_FEATURE can be used by the guest. The intent of the
> framework is twofold: to simplify caching of guest CPUID flags that KVM
> needs to frequently query, and to add clarity to such caching, e.g. it
> isn't immediately obvious that SVM's bundle of flags for "optional nested
> SVM features" track whether or not a flag is exposed to L1.
>
> Begrudgingly define KVM_MAX_NR_GOVERNED_FEATURES for the size of the
> bitmap to avoid exposing governed_features.h in arch/x86/include/asm/, but
> add a FIXME to call out that it can and should be cleaned up once
> "struct kvm_vcpu_arch" is no longer expose to the kernel at large.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/include/asm/kvm_host.h | 19 +++++++++++++
> arch/x86/kvm/cpuid.c | 4 +++
> arch/x86/kvm/cpuid.h | 46 ++++++++++++++++++++++++++++++++
> arch/x86/kvm/governed_features.h | 9 +++++++
> 4 files changed, 78 insertions(+)
> create mode 100644 arch/x86/kvm/governed_features.h
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index dad9331c5270..007fa8bfd634 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -831,6 +831,25 @@ struct kvm_vcpu_arch {
> struct kvm_cpuid_entry2 *cpuid_entries;
> struct kvm_hypervisor_cpuid kvm_cpuid;
>
> + /*
> + * FIXME: Drop this macro and use KVM_NR_GOVERNED_FEATURES directly
> + * when "struct kvm_vcpu_arch" is no longer defined in an
> + * arch/x86/include/asm header. The max is mostly arbitrary, i.e.
> + * can be increased as necessary.
> + */
> +#define KVM_MAX_NR_GOVERNED_FEATURES BITS_PER_LONG
> +
> + /*
> + * Track whether or not the guest is allowed to use features that are
> + * governed by KVM, where "governed" means KVM needs to manage state
> + * and/or explicitly enable the feature in hardware. Typically, but
> + * not always, governed features can be used by the guest if and only
> + * if both KVM and userspace want to expose the feature to the guest.
> + */
> + struct {
> + DECLARE_BITMAP(enabled, KVM_MAX_NR_GOVERNED_FEATURES);
> + } governed_features;
> +
> u64 reserved_gpa_bits;
> int maxphyaddr;
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 7f4d13383cf2..ef826568c222 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -313,6 +313,10 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> struct kvm_lapic *apic = vcpu->arch.apic;
> struct kvm_cpuid_entry2 *best;
>
> + BUILD_BUG_ON(KVM_NR_GOVERNED_FEATURES > KVM_MAX_NR_GOVERNED_FEATURES);
> + bitmap_zero(vcpu->arch.governed_features.enabled,
> + KVM_MAX_NR_GOVERNED_FEATURES);
> +
> best = kvm_find_cpuid_entry(vcpu, 1);
> if (best && apic) {
> if (cpuid_entry_has(best, X86_FEATURE_TSC_DEADLINE_TIMER))
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index b1658c0de847..3000fbe97678 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -232,4 +232,50 @@ static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
> return vcpu->arch.pv_cpuid.features & (1u << kvm_feature);
> }
>
> +enum kvm_governed_features {
> +#define KVM_GOVERNED_FEATURE(x) KVM_GOVERNED_##x,
> +#include "governed_features.h"
> + KVM_NR_GOVERNED_FEATURES
> +};
> +
> +static __always_inline int kvm_governed_feature_index(unsigned int x86_feature)
> +{
> + switch (x86_feature) {
> +#define KVM_GOVERNED_FEATURE(x) case x: return KVM_GOVERNED_##x;
> +#include "governed_features.h"
> + default:
> + return -1;
> + }
> +}
> +
> +static __always_inline int kvm_is_governed_feature(unsigned int x86_feature)
> +{
> + return kvm_governed_feature_index(x86_feature) >= 0;
> +}
> +
I think it would be proper to return a bool, not an "int".
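E.g., the suggested change, as I read it:

static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
{
	return kvm_governed_feature_index(x86_feature) >= 0;
}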
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 12/21] KVM: x86: Use KVM-governed feature framework to track "XSAVES enabled"
2023-07-29 1:15 ` [PATCH v2 12/21] KVM: x86: Use KVM-governed feature framework to track "XSAVES enabled" Sean Christopherson
@ 2023-08-14 6:28 ` Zeng Guang
0 siblings, 0 replies; 27+ messages in thread
From: Zeng Guang @ 2023-08-14 6:28 UTC (permalink / raw)
To: Christopherson, Sean, Paolo Bonzini, Vitaly Kuznetsov
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Maxim Levitsky
On 7/29/2023 9:15 AM, Sean Christopherson wrote:
> Use the governed feature framework to track if XSAVES is "enabled", i.e.
> if XSAVES can be used by the guest. Add a comment in the SVM code to
> explain the very unintuitive logic of deliberately NOT checking if XSAVES
> is enumerated in the guest CPUID model.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/governed_features.h | 1 +
> arch/x86/kvm/svm/svm.c | 17 ++++++++++++++---
> arch/x86/kvm/vmx/vmx.c | 32 ++++++++++++++++++--------------
> arch/x86/kvm/x86.c | 4 ++--
> 4 files changed, 35 insertions(+), 19 deletions(-)
>
> diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
> index b29c15d5e038..b896a64e4ac3 100644
> --- a/arch/x86/kvm/governed_features.h
> +++ b/arch/x86/kvm/governed_features.h
> @@ -6,6 +6,7 @@ BUILD_BUG()
> #define KVM_GOVERNED_X86_FEATURE(x) KVM_GOVERNED_FEATURE(X86_FEATURE_##x)
>
> KVM_GOVERNED_X86_FEATURE(GBPAGES)
> +KVM_GOVERNED_X86_FEATURE(XSAVES)
>
> #undef KVM_GOVERNED_X86_FEATURE
> #undef KVM_GOVERNED_FEATURE
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 64092df06f94..d5f8cb402eb7 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4204,9 +4204,20 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> struct vcpu_svm *svm = to_svm(vcpu);
> struct kvm_cpuid_entry2 *best;
>
> - vcpu->arch.xsaves_enabled = guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) &&
> - boot_cpu_has(X86_FEATURE_XSAVE) &&
> - boot_cpu_has(X86_FEATURE_XSAVES);
> + /*
> + * SVM doesn't provide a way to disable just XSAVES in the guest, KVM
> + * can only disable all variants of XSAVE by disallowing CR4.OSXSAVE from
> + * being set. As a result, if the host has XSAVE and XSAVES, and the
> + * guest has XSAVE enabled, the guest can execute XSAVES without
> + * faulting. Treat XSAVES as enabled in this case regardless of
> + * whether it's advertised to the guest so that KVM context switches
> + * XSS on VM-Enter/VM-Exit. Failure to do so would effectively give
> + * the guest read/write access to the host's XSS.
> + */
> + if (boot_cpu_has(X86_FEATURE_XSAVE) &&
> + boot_cpu_has(X86_FEATURE_XSAVES) &&
> + guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> + kvm_governed_feature_set(vcpu, X86_FEATURE_XSAVES);
>
> /* Update nrips enabled cache */
> svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a0a47be2feed..3100ed62615c 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4518,16 +4518,19 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
> * based on a single guest CPUID bit, with a dedicated feature bit. This also
> * verifies that the control is actually supported by KVM and hardware.
> */
> -#define vmx_adjust_sec_exec_control(vmx, exec_control, name, feat_name, ctrl_name, exiting) \
> -({ \
> - bool __enabled; \
> - \
> - if (cpu_has_vmx_##name()) { \
> - __enabled = guest_cpuid_has(&(vmx)->vcpu, \
> - X86_FEATURE_##feat_name); \
> - vmx_adjust_secondary_exec_control(vmx, exec_control, \
> - SECONDARY_EXEC_##ctrl_name, __enabled, exiting); \
> - } \
> +#define vmx_adjust_sec_exec_control(vmx, exec_control, name, feat_name, ctrl_name, exiting) \
> +({ \
> + struct kvm_vcpu *__vcpu = &(vmx)->vcpu; \
> + bool __enabled; \
> + \
> + if (cpu_has_vmx_##name()) { \
> + if (kvm_is_governed_feature(X86_FEATURE_##feat_name)) \
> + __enabled = guest_can_use(__vcpu, X86_FEATURE_##feat_name); \
> + else \
> + __enabled = guest_cpuid_has(__vcpu, X86_FEATURE_##feat_name); \
> + vmx_adjust_secondary_exec_control(vmx, exec_control, SECONDARY_EXEC_##ctrl_name,\
> + __enabled, exiting); \
> + } \
> })
>
> /* More macro magic for ENABLE_/opt-in versus _EXITING/opt-out controls. */
> @@ -4587,10 +4590,7 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
> if (!enable_pml || !atomic_read(&vcpu->kvm->nr_memslots_dirty_logging))
> exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
>
> - if (cpu_has_vmx_xsaves())
> - vmx_adjust_secondary_exec_control(vmx, &exec_control,
> - SECONDARY_EXEC_ENABLE_XSAVES,
> - vcpu->arch.xsaves_enabled, false);
> + vmx_adjust_sec_exec_feature(vmx, &exec_control, xsaves, XSAVES);
>
> /*
> * RDPID is also gated by ENABLE_RDTSCP, turn on the control if either
> @@ -4609,6 +4609,7 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
> SECONDARY_EXEC_ENABLE_RDTSCP,
> rdpid_or_rdtscp_enabled, false);
> }
> +
> vmx_adjust_sec_exec_feature(vmx, &exec_control, invpcid, INVPCID);
>
> vmx_adjust_sec_exec_exiting(vmx, &exec_control, rdrand, RDRAND);
> @@ -7722,6 +7723,9 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> boot_cpu_has(X86_FEATURE_XSAVE) &&
> guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) &&
> guest_cpuid_has(vcpu, X86_FEATURE_XSAVES);
> + if (boot_cpu_has(X86_FEATURE_XSAVE) &&
> + guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> + kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_XSAVES);
>
> vmx_setup_uret_msrs(vmx);
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5a14378ed4e1..201fa957ce9a 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1012,7 +1012,7 @@ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
> if (vcpu->arch.xcr0 != host_xcr0)
> xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
>
> - if (vcpu->arch.xsaves_enabled &&
> + if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
> vcpu->arch.ia32_xss != host_xss)
> wrmsrl(MSR_IA32_XSS, vcpu->arch.ia32_xss);
> }
> @@ -1043,7 +1043,7 @@ void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
> if (vcpu->arch.xcr0 != host_xcr0)
> xsetbv(XCR_XFEATURE_ENABLED_MASK, host_xcr0);
>
> - if (vcpu->arch.xsaves_enabled &&
> + if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
> vcpu->arch.ia32_xss != host_xss)
> wrmsrl(MSR_IA32_XSS, host_xss);
> }
"xsaves_enabled" can be removed from struct kvm_vcpu_arch as VMX/SVM doesn't reference it anymore.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 13/21] KVM: nVMX: Use KVM-governed feature framework to track "nested VMX enabled"
2023-07-29 1:16 ` [PATCH v2 13/21] KVM: nVMX: Use KVM-governed feature framework to track "nested VMX enabled" Sean Christopherson
@ 2023-08-14 8:11 ` Yuan Yao
0 siblings, 0 replies; 27+ messages in thread
From: Yuan Yao @ 2023-08-14 8:11 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Vitaly Kuznetsov, kvm, linux-kernel,
Maxim Levitsky
On Fri, Jul 28, 2023 at 06:16:00PM -0700, Sean Christopherson wrote:
> Track "VMX exposed to L1" via a governed feature flag instead of using a
> dedicated helper to provide the same functionality. The main goal is to
> drive convergence between VMX and SVM with respect to querying features
> that are controllable via module param (SVM likes to cache nested
> features), avoiding the guest CPUID lookups at runtime is just a bonus
> and unlikely to provide any meaningful performance benefits.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/governed_features.h | 1 +
> arch/x86/kvm/vmx/nested.c | 7 ++++---
> arch/x86/kvm/vmx/vmx.c | 21 ++++++---------------
> arch/x86/kvm/vmx/vmx.h | 1 -
> 4 files changed, 11 insertions(+), 19 deletions(-)
>
> diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
> index b896a64e4ac3..22446614bf49 100644
> --- a/arch/x86/kvm/governed_features.h
> +++ b/arch/x86/kvm/governed_features.h
> @@ -7,6 +7,7 @@ BUILD_BUG()
>
> KVM_GOVERNED_X86_FEATURE(GBPAGES)
> KVM_GOVERNED_X86_FEATURE(XSAVES)
> +KVM_GOVERNED_X86_FEATURE(VMX)
>
> #undef KVM_GOVERNED_X86_FEATURE
> #undef KVM_GOVERNED_FEATURE
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 22e08d30baef..c5ec0ef51ff7 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6426,7 +6426,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
> vmx = to_vmx(vcpu);
> vmcs12 = get_vmcs12(vcpu);
>
> - if (nested_vmx_allowed(vcpu) &&
> + if (guest_can_use(vcpu, X86_FEATURE_VMX) &&
> (vmx->nested.vmxon || vmx->nested.smm.vmxon)) {
> kvm_state.hdr.vmx.vmxon_pa = vmx->nested.vmxon_ptr;
> kvm_state.hdr.vmx.vmcs12_pa = vmx->nested.current_vmptr;
> @@ -6567,7 +6567,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
> if (kvm_state->flags & ~KVM_STATE_NESTED_EVMCS)
> return -EINVAL;
> } else {
> - if (!nested_vmx_allowed(vcpu))
> + if (!guest_can_use(vcpu, X86_FEATURE_VMX))
> return -EINVAL;
>
> if (!page_address_valid(vcpu, kvm_state->hdr.vmx.vmxon_pa))
> @@ -6601,7 +6601,8 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
> return -EINVAL;
>
> if ((kvm_state->flags & KVM_STATE_NESTED_EVMCS) &&
> - (!nested_vmx_allowed(vcpu) || !vmx->nested.enlightened_vmcs_enabled))
> + (!guest_can_use(vcpu, X86_FEATURE_VMX) ||
> + !vmx->nested.enlightened_vmcs_enabled))
> return -EINVAL;
>
> vmx_leave_nested(vcpu);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 3100ed62615c..fdf932cfc64d 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1894,17 +1894,6 @@ static void vmx_write_tsc_multiplier(struct kvm_vcpu *vcpu)
> vmcs_write64(TSC_MULTIPLIER, vcpu->arch.tsc_scaling_ratio);
> }
>
> -/*
> - * nested_vmx_allowed() checks whether a guest should be allowed to use VMX
> - * instructions and MSRs (i.e., nested VMX). Nested VMX is disabled for
> - * all guests if the "nested" module option is off, and can also be disabled
> - * for a single guest by disabling its VMX cpuid bit.
> - */
> -bool nested_vmx_allowed(struct kvm_vcpu *vcpu)
> -{
> - return nested && guest_cpuid_has(vcpu, X86_FEATURE_VMX);
The removed "nested" check here is already covered by
kvm_cpu_cap_set(X86_FEATURE_VMX) from vmx_set_cpu_caps().
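I.e. the governed flag can only be set when the cap is present, and the
cap setup already folds in the module param; condensed from
vmx_set_cpu_caps() (from memory):

	/* The VMX cap, and thus guest_can_use(..., VMX), requires "nested". */
	if (nested)
		kvm_cpu_cap_set(X86_FEATURE_VMX);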
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
> -}
> -
> /*
> * Userspace is allowed to set any supported IA32_FEATURE_CONTROL regardless of
> * guest CPUID. Note, KVM allows userspace to set "VMX in SMX" to maintain
> @@ -2032,7 +2021,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> [msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
> break;
> case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
> - if (!nested_vmx_allowed(vcpu))
> + if (!guest_can_use(vcpu, X86_FEATURE_VMX))
> return 1;
> if (vmx_get_vmx_msr(&vmx->nested.msrs, msr_info->index,
> &msr_info->data))
> @@ -2340,7 +2329,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
> if (!msr_info->host_initiated)
> return 1; /* they are read-only */
> - if (!nested_vmx_allowed(vcpu))
> + if (!guest_can_use(vcpu, X86_FEATURE_VMX))
> return 1;
> return vmx_set_vmx_msr(vcpu, msr_index, data);
> case MSR_IA32_RTIT_CTL:
> @@ -7727,13 +7716,15 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_XSAVES);
>
> + kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VMX);
> +
> vmx_setup_uret_msrs(vmx);
>
> if (cpu_has_secondary_exec_ctrls())
> vmcs_set_secondary_exec_control(vmx,
> vmx_secondary_exec_control(vmx));
>
> - if (nested_vmx_allowed(vcpu))
> + if (guest_can_use(vcpu, X86_FEATURE_VMX))
> vmx->msr_ia32_feature_control_valid_bits |=
> FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
> FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
> @@ -7742,7 +7733,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> ~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
> FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
>
> - if (nested_vmx_allowed(vcpu))
> + if (guest_can_use(vcpu, X86_FEATURE_VMX))
> nested_vmx_cr_fixed1_bits_update(vcpu);
>
> if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index cde902b44d97..c2130d2c8e24 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -374,7 +374,6 @@ struct kvm_vmx {
> u64 *pid_table;
> };
>
> -bool nested_vmx_allowed(struct kvm_vcpu *vcpu);
> void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
> struct loaded_vmcs *buddy);
> int allocate_vpid(void);
> --
> 2.41.0.487.g6d72f3e995-goog
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 07/21] KVM: x86: Add a framework for enabling KVM-governed x86 features
2023-08-14 4:43 ` Zeng Guang
@ 2023-08-14 17:20 ` Sean Christopherson
0 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2023-08-14 17:20 UTC (permalink / raw)
To: Zeng Guang
Cc: Paolo Bonzini, Vitaly Kuznetsov, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, Maxim Levitsky
On Mon, Aug 14, 2023, Zeng Guang wrote:
>
> On 7/29/2023 9:15 AM, Sean Christopherson wrote:
> > +static __always_inline int kvm_governed_feature_index(unsigned int x86_feature)
> > +{
> > + switch (x86_feature) {
> > +#define KVM_GOVERNED_FEATURE(x) case x: return KVM_GOVERNED_##x;
> > +#include "governed_features.h"
> > + default:
> > + return -1;
> > + }
> > +}
> > +
> > +static __always_inline int kvm_is_governed_feature(unsigned int x86_feature)
> > +{
> > + return kvm_governed_feature_index(x86_feature) >= 0;
> > +}
> > +
>
> I think it would be proper to return a bool here instead of an "int".
Yeah. I'm pretty sure someone brought this up in v1 too, and I spaced on it.
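(For the record, the fix is presumably just the return type; a minimal
sketch of the corrected helper:

	static __always_inline bool kvm_is_governed_feature(unsigned int x86_feature)
	{
		return kvm_governed_feature_index(x86_feature) >= 0;
	}

The ">= 0" comparison already yields a boolean, so no callers should need
to change.)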
Thread overview: 27+ messages
2023-07-29 1:15 [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 01/21] KVM: nSVM: Check instead of asserting on nested TSC scaling support Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 02/21] KVM: nSVM: Load L1's TSC multiplier based on L1 state, not L2 state Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 03/21] KVM: nSVM: Use the "outer" helper for writing multiplier to MSR_AMD64_TSC_RATIO Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 04/21] KVM: SVM: Clean up preemption toggling related " Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 05/21] KVM: x86: Always write vCPU's current TSC offset/ratio in vendor hooks Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 06/21] KVM: nSVM: Skip writes to MSR_AMD64_TSC_RATIO if guest state isn't loaded Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 07/21] KVM: x86: Add a framework for enabling KVM-governed x86 features Sean Christopherson
2023-08-14 4:43 ` Zeng Guang
2023-08-14 17:20 ` Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 08/21] KVM: x86/mmu: Use KVM-governed feature framework to track "GBPAGES enabled" Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 09/21] KVM: VMX: Recompute "XSAVES enabled" only after CPUID update Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 10/21] KVM: VMX: Check KVM CPU caps, not just VMX MSR support, for XSAVE enabling Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 11/21] KVM: VMX: Rename XSAVES control to follow KVM's preferred "ENABLE_XYZ" Sean Christopherson
2023-07-29 1:15 ` [PATCH v2 12/21] KVM: x86: Use KVM-governed feature framework to track "XSAVES enabled" Sean Christopherson
2023-08-14 6:28 ` Zeng Guang
2023-07-29 1:16 ` [PATCH v2 13/21] KVM: nVMX: Use KVM-governed feature framework to track "nested VMX enabled" Sean Christopherson
2023-08-14 8:11 ` Yuan Yao
2023-07-29 1:16 ` [PATCH v2 14/21] KVM: nSVM: Use KVM-governed feature framework to track "NRIPS enabled" Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 15/21] KVM: nSVM: Use KVM-governed feature framework to track "TSC scaling enabled" Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 16/21] KVM: nSVM: Use KVM-governed feature framework to track "vVM{SAVE,LOAD} enabled" Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 17/21] KVM: nSVM: Use KVM-governed feature framework to track "LBRv enabled" Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 18/21] KVM: nSVM: Use KVM-governed feature framework to track "Pause Filter enabled" Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 19/21] KVM: nSVM: Use KVM-governed feature framework to track "vGIF enabled" Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 20/21] KVM: nSVM: Use KVM-governed feature framework to track "vNMI enabled" Sean Christopherson
2023-07-29 1:16 ` [PATCH v2 21/21] KVM: x86: Disallow guest CPUID lookups when IRQs are disabled Sean Christopherson
2023-08-04 0:40 ` [PATCH v2 00/21] KVM: x86: Add "governed" X86_FEATURE framework Sean Christopherson