public inbox for linux-kernel@vger.kernel.org
From: Yang Weijiang <weijiang.yang@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Yang Weijiang <weijiang.yang@intel.com>,
	pbonzini@redhat.com, vkuznets@redhat.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/3] KVM: nVMX: Sync L2 guest CET states between L1/L2
Date: Wed, 24 Mar 2021 21:51:15 +0800
Message-ID: <20210324135115.GA11269@local-michael-cet-test.sh.intel.com>
In-Reply-To: <YFoPro1bw07YEaXe@google.com>

On Tue, Mar 23, 2021 at 03:56:30PM +0000, Sean Christopherson wrote:
> On Tue, Mar 23, 2021, Yang Weijiang wrote:
> > On Tue, Mar 16, 2021 at 05:03:47PM +0800, Yang Weijiang wrote:
> > 
> > Hi, Sean,
> > Could you respond to my reply below? I'm not sure how to proceed, thanks!
> > 
> > > On Mon, Mar 15, 2021 at 09:45:11AM -0700, Sean Christopherson wrote:
> > > > On Mon, Mar 15, 2021, Yang Weijiang wrote:
> 
> ...
> 
> > > > > @@ -2556,6 +2563,15 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> > > > >  	if (kvm_mpx_supported() && (!vmx->nested.nested_run_pending ||
> > > > >  	    !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)))
> > > > >  		vmcs_write64(GUEST_BNDCFGS, vmx->nested.vmcs01_guest_bndcfgs);
> > > > > +
> > > > > +	if (kvm_cet_supported() && (!vmx->nested.nested_run_pending ||
> > > > > +	    !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE))) {
> > > > 
> > > > Not your code per se, since this pattern comes from BNDCFGS and DEBUGCTL, but I
> > > > don't see how loading vmcs01 state in this combo is correct:
> > > > 
> > > >     a. kvm_xxx_supported()              == 1
> > > >     b. nested_run_pending               == false
> > > >     c. vm_entry_controls.load_xxx_state == true
> > > > 
> > > > nested_vmx_enter_non_root_mode() only snapshots vmcs01 if 
> > > > vm_entry_controls.load_xxx_state == false, which means the above combo is
> > > > loading stale values (or more likely, zeros).
> > > > 
> > > > I _think_ nested_vmx_enter_non_root_mode() just needs to snapshot vmcs01 if
> > > > nested_run_pending=false.  For migration, if userspace restores MSRs after
> > > > KVM_SET_NESTED_STATE, then what's done here is likely irrelevant.  If userspace
> > > > restores MSRs before nested state, then vmcs01 will hold the desired value since
> > > > setting MSRs would have written the value into vmcs01.
> > > 
> > > Then the code in nested_vmx_enter_non_root_mode() would look like:
> > > 
> > > if (kvm_cet_supported() && !vmx->nested.nested_run_pending &&
> > >     !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE)) {
> > > 	...
> > > }
> > > 
> > > I have another concern now: if vm_entry_controls.load_cet_state == false and L1
> > > has updated the vmcs fields, the latest state is in vmcs12, but it cannot
> > > be synced to vmcs02 because of this check in prepare_vmcs02_rare():
> > > 
> > > if (kvm_cet_supported() && vmx->nested.nested_run_pending &&
> > >     (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE)) {
> > > 	...
> > > }
> > > 
> > > so L2 gets stale state. IMO, an L1 guest setting vm_entry_controls.load_cet_state == false
> > > should be a rare case. We can even ignore this case :-)
> 
> Yes, that's an L1 bug if it expects L2 state to come from vmcs12 in that case.
> Architecturally, the vmcs12 value won't be visible to L2 until L1 enables the
> VM-Entry control, at which point KVM would detect the refreshed vmcs12 and sync
> the "rare" fields.

Thanks, Sean!
So I'll change the code as below:

if (kvm_cet_supported() && !vmx->nested.nested_run_pending &&
    !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE)) {
	...
}
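
For reference, here is a minimal sketch (a toy model, not kernel code) that
compares the three forms of the snapshot condition appearing in this thread
against the condition under which prepare_vmcs02() consumes the vmcs01
snapshot. All function names below are hypothetical; "pending" stands in for
vmx->nested.nested_run_pending and "load" for the vmcs12 VM-entry control bit.
It assumes kvm_cet_supported() is true and ignores everything else the real
functions do.

```c
#include <assert.h>
#include <stdbool.h>

/* prepare_vmcs02() writes the vmcs01 snapshot into vmcs02 when this holds. */
bool consumes_snapshot(bool pending, bool load)
{
	return !pending || !load;
}

/* Snapshot condition before the quoted diff: only the control bit is tested. */
bool snapshot_orig(bool pending, bool load)
{
	(void)pending;
	return !load;
}

/* Snapshot condition with the two tests joined by &&. */
bool snapshot_and(bool pending, bool load)
{
	return !pending && !load;
}

/* Snapshot condition with the two tests joined by ||, as in the quoted diff. */
bool snapshot_or(bool pending, bool load)
{
	return !pending || !load;
}

typedef bool (*snapshot_fn)(bool pending, bool load);

/* Count input combinations where the snapshot is consumed but was never taken. */
int stale_cases(snapshot_fn takes_snapshot)
{
	int stale = 0;

	for (int pending = 0; pending <= 1; pending++) {
		for (int load = 0; load <= 1; load++) {
			if (consumes_snapshot(pending, load) &&
			    !takes_snapshot(pending, load))
				stale++;
		}
	}
	return stale;
}

/* Sanity check: the || form never leaves a consumed snapshot untaken. */
void check_model(void)
{
	assert(stale_cases(snapshot_or) == 0);
}
```

In this model the original condition leaves one stale case (pending == false,
load == true, i.e. the combo Sean described), the && form leaves two, and the
|| form leaves none. Whether the real code paths match this simplification is
an assumption that would need checking against nested.c.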
>
> > > > I suspect no one has reported this issue because guests simply don't use MPX,
> > > > and up until the recent LBR stuff, KVM effectively zeroed out DEBUGCTL for the
> > > > guest.
> > > > 
> > > > So for MPX and DEBUGCTL, is it worth a separate fix patch?
> 
> Yes, assuming my analysis is correct.  That doesn't necessarily need to be your
> responsibility, though patches are of course welcome :-)
> 
> Jim, Paolo, any thoughts?
> 
OK, let me wait for Jim and Paolo's comments on this...

> > > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > > > index 45622e9c4449..4184ff601120 100644
> > > > --- a/arch/x86/kvm/vmx/nested.c
> > > > +++ b/arch/x86/kvm/vmx/nested.c
> > > > @@ -3298,10 +3298,11 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
> > > >         if (likely(!evaluate_pending_interrupts) && kvm_vcpu_apicv_active(vcpu))
> > > >                 evaluate_pending_interrupts |= vmx_has_apicv_interrupt(vcpu);
> > > > 
> > > > -       if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
> > > > +       if (!vmx->nested.nested_run_pending ||
> > > > +           !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
> > > >                 vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
> > > > -       if (kvm_mpx_supported() &&
> > > > -               !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
> > > > +       if (kvm_mpx_supported() && (!vmx->nested.nested_run_pending ||
> > > > +           !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)))
> > > >                 vmx->nested.vmcs01_guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
> > > > 
> > > >         /*
> > > > 
> > > > 
> > > > Side topic, all of this code is broken for SMM emulation.  SMI+RSM don't do a
> > > > full VM-Exit -> VM-Entry; the CPU forcefully exits non-root, but most state that
> > > > is loaded from the VMCS is left untouched.  It's the SMI handler's responsibility
> > > > to not enable features, e.g. to not set CR4.CET.  For sane use cases, this
> > > > probably doesn't matter as vmcs12 will be configured to context switch state,
> > > > but if L1 is doing anything out of the ordinary, SMI+RSM will corrupt state.
> > > > 
> > > > E.g. if L1 enables MPX in the guest, does not intercept L2 writes to BNDCFGS,
> > > > and does not load BNDCFGS on VM-Entry, then SMI+RSM would corrupt BNDCFGS since
> > > > the SMI "exit" would clear BNDCFGS, and the RSM "entry" would load zero.  This
> > > > is 100% contrived, and probably doesn't impact real world use cases, but it
> > > > still bugs me :-)
> > > 
> > > Exactly. Should it be fixed by a separate patch, or left as is?
> 
> Definitely leave it for now, properly fixing the SMI+RSM code goes far beyond
> basic CET support.

Sure.



Thread overview: 9+ messages
2021-03-15  7:18 [PATCH v4 0/3] CET fix patches for nested guest Yang Weijiang
2021-03-15  7:18 ` [PATCH v4 1/3] KVM: nVMX: Sync L2 guest CET states between L1/L2 Yang Weijiang
2021-03-15 16:45   ` Sean Christopherson
2021-03-16  9:03     ` Yang Weijiang
2021-03-23  0:43       ` Yang Weijiang
2021-03-23 15:56         ` Sean Christopherson
2021-03-24 13:51           ` Yang Weijiang [this message]
2021-03-15  7:18 ` [PATCH v4 2/3] KVM: nVMX: Set X86_CR4_CET in cr4_fixed1_bits if CET IBT is enabled Yang Weijiang
2021-03-15  7:18 ` [PATCH v4 3/3] KVM: nVMX: Add CET entry/exit load bits to evmcs unsupported list Yang Weijiang