From: Sean Christopherson <seanjc@google.com>
To: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Eric Li <ercli@ucdavis.edu>, David Matlack <dmatlack@google.com>,
Oliver Upton <oupton@google.com>
Subject: Re: [PATCH v5 05/15] KVM: nVMX: Let userspace set nVMX MSR to any _host_ supported value
Date: Tue, 1 Nov 2022 17:58:21 +0000 [thread overview]
Message-ID: <Y2FePYteNrEfZ7D5@google.com> (raw)
In-Reply-To: <20221101101801.zxcjswoesg2gltri@linux.intel.com>
On Tue, Nov 01, 2022, Yu Zhang wrote:
> On Mon, Oct 31, 2022 at 05:11:10PM +0000, Sean Christopherson wrote:
> > vmcs_config.nested never becomes out-of-date, it's read-only after __init (not
> > currently marked as such, that will be remedied soon).
> >
> > The auditing performed by KVM is purely to guard against userspace enabling
> > features that KVM doesn't support. KVM is not responsible for ensuring that the
> > vCPU's CPUID model match the VMX MSR model.
>
> Do you mean the VMX MSR model shall not be changed after the cpuid updates?
No, I mean that the virtual CPU model (CPUID + VMX MSRs) that is presented to the
guest is the responsibility of host userspace. KVM only cares about not enabling
bits/features that KVM doesn't supported.
> And for VMX MSR model, do you mean the vmx->nested.msrs or the ones in
> vmcs_config->nested?
vmx->nested.msrs. vmcs_config->nested is effectively the VMX equivalent of
KVM_GET_SUPPORTED_CPUID.
> What I observed is that vmx->nested.msrs.secondary_ctls_high will be changed
> in vmx_adjust_secondary_exec_control(), which can be triggered after cpuid is
> set.
Ugh, that path got overlooked when we yanked out KVM's manipulaton of VMX MSRs
in response to guest CPUID changes. I wonder if we can get away with changing
KVM's behavior to only ensure a feature isn't exposed to L2 when it's not exposed
to L1.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6b4266e949a3..cfc35d559d91 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4523,8 +4523,8 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
* Update the nested MSR settings so that a nested VMM can/can't set
* controls for features that are/aren't exposed to the guest.
*/
- if (nested) {
- if (enabled)
+ if (nested && !enabled)
+ if (exiting)
vmx->nested.msrs.secondary_ctls_high |= control;
else
vmx->nested.msrs.secondary_ctls_high &= ~control;
> Since KVM's config(vmcs_config->nested.secondary_ctls_high) is done during init
> by nested_vmx_setup_ctls_msrs(), which only kept a subset of the flags from the
> vmcs_confg->cpu_based_2nd_exec_ctrl, the vmx_restore_control_msr() could fail
> later, when userspace VMM tries to enable a feature(the only one I witnessed is
> SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE) by setting MSR_IA32_VMX_PROCBASED_CTLS2.
> Because the vmx->nested.msrs.secondary_ctls_high is updated by cpuid, but this
> bit is not taken from vmcs_conf->cpu_based_2nd_exec_ctrl by nested_vmx_setup_ctls_msrs()
> for vmcs_config->nested.secondary_ctls_high.
>
> The failure can be fixed, simply by adding SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE in
> nested_vmx_setup_ctls_msrs(), e.g.
Assuming KVM actually supports user wait/pause in L2, this is an orthogonal bug
to the CPUID-based manipulation above. KVM simply neglects to advertise to userspace
that ENABLE_USR_WAIT_PAUSE is supported for nested virtualization.
If KVM doesn't correctly support virtualizing user wait/pause for L2, then the
correct location to fix this is in vmx_secondary_exec_control().
> > > Another question is about the setting of secondary_ctls_high in
> > > nested_vmx_setup_ctls_msrs(). I saw there's a comment saying:
> > > "Do not include those that depend on CPUID bits, they are
> > > added later by vmx_vcpu_after_set_cpuid.".
> >
> > That's a stale comment, see the very next commit, 8805875aa473 ("Revert "KVM: nVMX:
> > Do not expose MPX VMX controls when guest MPX disabled""), as well as the slightly
> > later commit 9389d5774aca ("Revert "KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
> > VM-{Entry,Exit} control"").
> >
>
> So the comment can be and shall be removed, right?
Yep.
> > > if (cpu_has_vmx_vmfunc()) {
> > > msrs->secondary_ctls_high |=
> > > SECONDARY_EXEC_ENABLE_VMFUNC;
> >
> > This one is still required. KVM never enables VMFUNC for itself, i.e. it won't
> > be set in KVM's VMCS configuration.
> >
>
> My understanding is that, for VMFUNC, altough KVM does not support it,
> SECONDARY_EXEC_ENABLE_VMFUNC is still set in the secondary proc-based
> vm execution ctrol. KVM just uses different handlers for VMFUNC exits
> from L1(to inject the #UD) and L2(to emulate the eptp switching). So
> it doesn't matter if we do not clear this bit for vmcs_config->nested.
> procbased_ctls_high.
Ah, you're right, I didn't realize KVM enables VMFUNC in L1. Enabling VMFUNC for
L1 is silly though, it's trivial to clear the feature in vmx_secondary_exec_control().
That said, enabling VMFUNC in vmcs01 is an orthogonal topic, and it _is_ indeed
easier to keep the feature in the reference config. Now that the nested config
is derived from the non-nested config, nested_vmx_setup_ctls_msrs() can do:
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 61a2e551640a..751cc67aa1a9 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6811,7 +6811,8 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
SECONDARY_EXEC_ENABLE_INVPCID |
SECONDARY_EXEC_RDSEED_EXITING |
SECONDARY_EXEC_XSAVES |
- SECONDARY_EXEC_TSC_SCALING;
+ SECONDARY_EXEC_TSC_SCALING |
+ SECONDARY_EXEC_ENABLE_VMFUNC;
/*
* We can emulate "VMCS shadowing," even if the hardware
@@ -6842,17 +6843,12 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
}
}
- if (cpu_has_vmx_vmfunc()) {
- msrs->secondary_ctls_high |=
- SECONDARY_EXEC_ENABLE_VMFUNC;
- /*
- * Advertise EPTP switching unconditionally
- * since we emulate it
- */
- if (enable_ept)
- msrs->vmfunc_controls =
- VMX_VMFUNC_EPTP_SWITCHING;
- }
+ /*
+ * Advertise EPTP switching if VMFUNC and EPT are supported, KVM
+ * emulates the actual EPTP switch in software.
+ */
+ if (cpu_has_vmx_vmfunc() && enable_ept)
+ msrs->vmfunc_controls = VMX_VMFUNC_EPTP_SWITCHING;
/*
* Old versions of KVM use the single-context version without
next prev parent reply other threads:[~2022-11-01 17:58 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-07 21:35 [PATCH v5 00/15] KVM: nVMX: VMX MSR quirk+fixes, CR4 fixes Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 01/15] KVM: x86: Split kvm_is_valid_cr4() and export only the non-vendor bits Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 02/15] KVM: nVMX: Account for KVM reserved CR4 bits in consistency checks Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 03/15] KVM: nVMX: Inject #UD if VMXON is attempted with incompatible CR0/CR4 Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 04/15] KVM: nVMX: Rename handle_vm{on,off}() to handle_vmx{on,off}() Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 05/15] KVM: nVMX: Let userspace set nVMX MSR to any _host_ supported value Sean Christopherson
2022-10-31 16:39 ` Yu Zhang
2022-10-31 17:11 ` Sean Christopherson
2022-11-01 10:18 ` Yu Zhang
2022-11-01 17:58 ` Sean Christopherson [this message]
2022-11-02 8:54 ` Yu Zhang
2022-11-03 16:53 ` Sean Christopherson
2022-11-07 8:28 ` Yu Zhang
2022-11-07 15:06 ` Sean Christopherson
2022-11-08 10:21 ` Yu Zhang
2022-11-08 18:35 ` Sean Christopherson
2022-11-10 8:44 ` Yu Zhang
2022-11-10 16:08 ` Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 06/15] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Sean Christopherson
2022-07-22 9:06 ` Paolo Bonzini
2022-06-07 21:35 ` [PATCH v5 07/15] KVM: VMX: Add helper to check if the guest PMU has PERF_GLOBAL_CTRL Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 08/15] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL ctrl bits across MSR write Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 09/15] KVM: nVMX: Drop nested_vmx_pmu_refresh() Sean Christopherson
2022-06-07 21:35 ` [PATCH v5 10/15] KVM: nVMX: Add a quirk for KVM tweaks to VMX MSRs Sean Christopherson
2022-06-07 21:36 ` [PATCH v5 11/15] KVM: nVMX: Set UMIP bit CR4_FIXED1 MSR when emulating UMIP Sean Christopherson
2022-07-22 9:49 ` Paolo Bonzini
2022-06-07 21:36 ` [PATCH v5 12/15] KVM: nVMX: Extend VMX MSRs quirk to CR0/4 fixed1 bits Sean Christopherson
2022-07-22 9:50 ` Paolo Bonzini
2022-06-07 21:36 ` [PATCH v5 13/15] KVM: selftests: Add test to verify KVM's VMX MSRs quirk for controls Sean Christopherson
2022-06-07 21:36 ` [PATCH v5 14/15] KVM: selftests: Extend VMX MSRs test to cover CR4_FIXED1 (and its quirks) Sean Christopherson
2022-06-07 21:36 ` [PATCH v5 15/15] KVM: selftests: Verify VMX MSRs can be restored to KVM-supported values Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y2FePYteNrEfZ7D5@google.com \
--to=seanjc@google.com \
--cc=dmatlack@google.com \
--cc=ercli@ucdavis.edu \
--cc=jmattson@google.com \
--cc=joro@8bytes.org \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=oupton@google.com \
--cc=pbonzini@redhat.com \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
--cc=yu.c.zhang@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.