From: Sean Christopherson <seanjc@google.com>
To: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Eric Li <ercli@ucdavis.edu>, David Matlack <dmatlack@google.com>,
Oliver Upton <oupton@google.com>
Subject: Re: [PATCH v5 05/15] KVM: nVMX: Let userspace set nVMX MSR to any _host_ supported value
Date: Tue, 1 Nov 2022 17:58:21 +0000 [thread overview]
Message-ID: <Y2FePYteNrEfZ7D5@google.com> (raw)
In-Reply-To: <20221101101801.zxcjswoesg2gltri@linux.intel.com>
On Tue, Nov 01, 2022, Yu Zhang wrote:
> On Mon, Oct 31, 2022 at 05:11:10PM +0000, Sean Christopherson wrote:
> > vmcs_config.nested never becomes out-of-date, it's read-only after __init (not
> > currently marked as such, that will be remedied soon).
> >
> > The auditing performed by KVM is purely to guard against userspace enabling
> > features that KVM doesn't support. KVM is not responsible for ensuring that the
> > vCPU's CPUID model match the VMX MSR model.
>
> Do you mean the VMX MSR model shall not be changed after the cpuid updates?
No, I mean that the virtual CPU model (CPUID + VMX MSRs) that is presented to the
guest is the responsibility of host userspace. KVM only cares about not enabling
bits/features that KVM doesn't supported.
> And for VMX MSR model, do you mean the vmx->nested.msrs or the ones in
> vmcs_config->nested?
vmx->nested.msrs. vmcs_config->nested is effectively the VMX equivalent of
KVM_GET_SUPPORTED_CPUID.
> What I observed is that vmx->nested.msrs.secondary_ctls_high will be changed
> in vmx_adjust_secondary_exec_control(), which can be triggered after cpuid is
> set.
Ugh, that path got overlooked when we yanked out KVM's manipulaton of VMX MSRs
in response to guest CPUID changes. I wonder if we can get away with changing
KVM's behavior to only ensure a feature isn't exposed to L2 when it's not exposed
to L1.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6b4266e949a3..cfc35d559d91 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4523,8 +4523,8 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
* Update the nested MSR settings so that a nested VMM can/can't set
* controls for features that are/aren't exposed to the guest.
*/
- if (nested) {
- if (enabled)
+ if (nested && !enabled)
+ if (exiting)
vmx->nested.msrs.secondary_ctls_high |= control;
else
vmx->nested.msrs.secondary_ctls_high &= ~control;
> Since KVM's config(vmcs_config->nested.secondary_ctls_high) is done during init
> by nested_vmx_setup_ctls_msrs(), which only kept a subset of the flags from the
> vmcs_confg->cpu_based_2nd_exec_ctrl, the vmx_restore_control_msr() could fail
> later, when userspace VMM tries to enable a feature(the only one I witnessed is
> SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE) by setting MSR_IA32_VMX_PROCBASED_CTLS2.
> Because the vmx->nested.msrs.secondary_ctls_high is updated by cpuid, but this
> bit is not taken from vmcs_conf->cpu_based_2nd_exec_ctrl by nested_vmx_setup_ctls_msrs()
> for vmcs_config->nested.secondary_ctls_high.
>
> The failure can be fixed, simply by adding SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE in
> nested_vmx_setup_ctls_msrs(), e.g.
Assuming KVM actually supports user wait/pause in L2, this is an orthogonal bug
to the CPUID-based manipulation above. KVM simply neglects to advertise to userspace
that ENABLE_USR_WAIT_PAUSE is supported for nested virtualization.
If KVM doesn't correctly support virtualizing user wait/pause for L2, then the
correct location to fix this is in vmx_secondary_exec_control().
> > > Another question is about the setting of secondary_ctls_high in
> > > nested_vmx_setup_ctls_msrs(). I saw there's a comment saying:
> > > "Do not include those that depend on CPUID bits, they are
> > > added later by vmx_vcpu_after_set_cpuid.".
> >
> > That's a stale comment, see the very next commit, 8805875aa473 ("Revert "KVM: nVMX:
> > Do not expose MPX VMX controls when guest MPX disabled""), as well as the slightly
> > later commit 9389d5774aca ("Revert "KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
> > VM-{Entry,Exit} control"").
> >
>
> So the comment can be and shall be removed, right?
Yep.
> > > if (cpu_has_vmx_vmfunc()) {
> > > msrs->secondary_ctls_high |=
> > > SECONDARY_EXEC_ENABLE_VMFUNC;
> >
> > This one is still required. KVM never enables VMFUNC for itself, i.e. it won't
> > be set in KVM's VMCS configuration.
> >
>
> My understanding is that, for VMFUNC, altough KVM does not support it,
> SECONDARY_EXEC_ENABLE_VMFUNC is still set in the secondary proc-based
> vm execution ctrol. KVM just uses different handlers for VMFUNC exits
> from L1(to inject the #UD) and L2(to emulate the eptp switching). So
> it doesn't matter if we do not clear this bit for vmcs_config->nested.
> procbased_ctls_high.
Ah, you're right, I didn't realize KVM enables VMFUNC in L1. Enabling VMFUNC for
L1 is silly though, it's trivial to clear the feature in vmx_secondary_exec_control().
That said, enabling VMFUNC in vmcs01 is an orthogonal topic, and it _is_ indeed
easier to keep the feature in the reference config. Now that the nested config
is derived from the non-nested config, nested_vmx_setup_ctls_msrs() can do:
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 61a2e551640a..751cc67aa1a9 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6811,7 +6811,8 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
SECONDARY_EXEC_ENABLE_INVPCID |
SECONDARY_EXEC_RDSEED_EXITING |
SECONDARY_EXEC_XSAVES |
- SECONDARY_EXEC_TSC_SCALING;
+ SECONDARY_EXEC_TSC_SCALING |
+ SECONDARY_EXEC_ENABLE_VMFUNC;
/*
* We can emulate "VMCS shadowing," even if the hardware
@@ -6842,17 +6843,12 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
}
}
- if (cpu_has_vmx_vmfunc()) {
- msrs->secondary_ctls_high |=
- SECONDARY_EXEC_ENABLE_VMFUNC;
- /*
- * Advertise EPTP switching unconditionally
- * since we emulate it
- */
- if (enable_ept)
- msrs->vmfunc_controls =
- VMX_VMFUNC_EPTP_SWITCHING;
- }
+ /*
+ * Advertise EPTP switching if VMFUNC and EPT are supported, KVM
+ * emulates the actual EPTP switch in software.
+ */
+ if (cpu_has_vmx_vmfunc() && enable_ept)
+ msrs->vmfunc_controls = VMX_VMFUNC_EPTP_SWITCHING;
/*
* Old versions of KVM use the single-context version without
next prev parent reply other threads:[~2022-11-01 17:58 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-07 21:35 [PATCH v5 00/15] KVM: nVMX: VMX MSR quirk+fixes, CR4 fixes Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 01/15] KVM: x86: Split kvm_is_valid_cr4() and export only the non-vendor bits Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 02/15] KVM: nVMX: Account for KVM reserved CR4 bits in consistency checks Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 03/15] KVM: nVMX: Inject #UD if VMXON is attempted with incompatible CR0/CR4 Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 04/15] KVM: nVMX: Rename handle_vm{on,off}() to handle_vmx{on,off}() Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 05/15] KVM: nVMX: Let userspace set nVMX MSR to any _host_ supported value Sean Christopherson
2022-10-31 16:39 ` Yu Zhang
2022-10-31 17:11 ` Sean Christopherson
2022-11-01 10:18 ` Yu Zhang
2022-11-01 17:58 ` Sean Christopherson [this message]
2022-11-02 8:54 ` Yu Zhang
2022-11-03 16:53 ` Sean Christopherson
2022-11-07 8:28 ` Yu Zhang
2022-11-07 15:06 ` Sean Christopherson
2022-11-08 10:21 ` Yu Zhang
2022-11-08 18:35 ` Sean Christopherson
2022-11-10 8:44 ` Yu Zhang
2022-11-10 16:08 ` Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 06/15] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Sean Christopherson
2022-07-22 9:06 ` Paolo Bonzini
2022-06-07 21:35 ` [PATCH v5 07/15] KVM: VMX: Add helper to check if the guest PMU has PERF_GLOBAL_CTRL Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 08/15] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL ctrl bits across MSR write Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 09/15] KVM: nVMX: Drop nested_vmx_pmu_refresh() Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 10/15] KVM: nVMX: Add a quirk for KVM tweaks to VMX MSRs Sean Christopherson
2022-06-07 21:36 ` [PATCH v5 11/15] KVM: nVMX: Set UMIP bit CR4_FIXED1 MSR when emulating UMIP Sean Christopherson
2022-07-22 9:49 ` Paolo Bonzini
2022-06-07 21:36 ` [PATCH v5 12/15] KVM: nVMX: Extend VMX MSRs quirk to CR0/4 fixed1 bits Sean Christopherson
2022-07-22 9:50 ` Paolo Bonzini
2022-06-07 21:36 ` [PATCH v5 13/15] KVM: selftests: Add test to verify KVM's VMX MSRs quirk for controls Sean Christopherson
2022-06-07 21:36 ` [PATCH v5 14/15] KVM: selftests: Extend VMX MSRs test to cover CR4_FIXED1 (and its quirks) Sean Christopherson
2022-06-07 21:36 ` [PATCH v5 15/15] KVM: selftests: Verify VMX MSRs can be restored to KVM-supported values Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y2FePYteNrEfZ7D5@google.com \
--to=seanjc@google.com \
--cc=dmatlack@google.com \
--cc=ercli@ucdavis.edu \
--cc=jmattson@google.com \
--cc=joro@8bytes.org \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=oupton@google.com \
--cc=pbonzini@redhat.com \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
--cc=yu.c.zhang@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox