All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Subject: Re: [GIT PULL] KVM: x86: SVM changes for 6.4
Date: Wed, 26 Apr 2023 22:40:02 +0000	[thread overview]
Message-ID: <ZEmoQslJ3gPqI8jG@google.com> (raw)
In-Reply-To: <CABgObfYkaxMYoGazW+rC=zUF2g6j1AfRkOTDyCBSE2GTm9GNhQ@mail.gmail.com>

On Wed, Apr 26, 2023, Paolo Bonzini wrote:
> On Wed, Apr 26, 2023 at 10:02 PM Sean Christopherson <seanjc@google.com> wrote:
> > > This is probably the sub-PR for which I'm more interested in giving
> > > the code a closer look, but this is more about understanding the
> > > changes than it is about expecting something bad in it.
> >
> > 100% agree.  If you were to scrutinize only one thing for 6.4, the vNMI changes
> > are definitely my choice for extra eyeballs.
> 
> Interesting read. The split commits left me wondering _why_ patches
> 1-7 were needed for vNMI, but that's a known limitation of losing the
> cover letter, and the Link or Message-Id trailers try to amend for
> that.
> 
> I have a few comments indeed, most of which are absolutely nits and
> can be ignored or fixed as follow-ups. It's my turn to send a "belated
> review" patch series, which I'll do for -rc2, but please check if
> there are any disagreements.
> 
> First of all, this comment caught my attention:
> 
> +    /*
> +     * Rules for synchronizing int_ctl bits from vmcb02 to vmcb01:
> +     *
> +     * V_IRQ, V_IRQ_VECTOR, V_INTR_PRIO_MASK, V_IGN_TPR:  If L1 doesn't
> +     * intercept interrupts, then KVM will use vmcb02's V_IRQ (and related
> +     * flags) to detect interrupt windows for L1 IRQs (even if L1 uses
> +     * virtual interrupt masking).  Raise KVM_REQ_EVENT to ensure that
> +     * KVM re-requests an interrupt window if necessary, which implicitly
> +     * copies this bits from vmcb02 to vmcb01.
> +     *
> +     * V_TPR: If L1 doesn't use virtual interrupt masking, then L1's vTPR
> +     * is stored in vmcb02, but its value doesn't need to be copied from/to
> +     * vmcb01 because it is copied from/to the virtual APIC's TPR register
> +     * on each VM entry/exit.
> +     *
> +     * V_GIF: If nested vGIF is not used, KVM uses vmcb02's V_GIF for L1's
> +     * V_GIF.  However, GIF is architecturally clear on each VM exit, thus
> +     * there is no need to copy V_GIF from vmcb02 to vmcb01.
> +     */
> 
> "Rules for synchronizing int_ctl bits from vmcb02 to vmcb01" suggests
> that this is work done here, and it also misled me into looking at
> nested_sync_control_from_vmcb02 (which is instead about vmcb02 ->
> vmcb12).

+1.  I had a similar reaction when I first saw the code, but learned to live with
it after a few versions :-)

> So what about replacing it with
> 
>     * int_ctl bits from vmcb02 have to be synchronized to both vmcb12
> and vmcb01.
>     * The former is in nested_sync_control_from_vmcb02, invoked on every vmexit,
>     * while the latter is scattered all over the place:
> 
> and perhaps also call out nested_svm_virtualize_tpr(), sync_lapic_to_cr8() and
> sync_cr8_to_lapic in the V_TPR part?
> 
> Another super small thing which is not worth a respin (would have been):
> 
>  Subject: [PATCH 05/13] KVM: x86: Raise an event request when processing NMIs
> - if an NMI is pending
> + iff an NMI is pending
> 
> The "if" suggests that we were missing an event request, while "iff"
> suggests that we were doing them unnecessarily.

Gah, suprised I didn't catch that, I do love me some "iff".

> As an aside, I like the "coding style violation" of commit
> 400fee8c9b2d. Because the "limit = 2" case is anti-architectural, it
> makes more sense to have it as an "else" rather than as the default.
> An alternative could have been:
> 
>   unsigned limit = 1;
>   if (!static_call(kvm_x86_get_nmi_mask)(vcpu) && !vcpu->arch.nmi_injected)
>       limit = 2;
> 
> but the ugly condition makes this solution worse.
> 
> Next on, commit ab2ee212a57b ("KVM: x86: Save/restore all NMIs when
> multiple NMIs are pending"). Here, "allows userspace to restore 255
> pending NMIs" in the commit message is kinda scary, and I thought
> about following up with a fixlet to KVM_SET_VCPU_EVENTS:
> 
> +        events->nmi.pending = min(vcpu->arch.nmi_pending, 2);
>          vcpu->arch.nmi_pending = 0;
>          atomic_set(&vcpu->arch.nmi_queued, events->nmi.pending);
>          kvm_make_request(KVM_REQ_NMI, vcpu);
> 
> but really this isn't needed because process_nmi() does have
> 
>      vcpu->arch.nmi_pending = min(vcpu->arch.nmi_pending, limit);
> 
> So in the end this is also fine, just a remark on the commit message.

Even restoring 255 NMIs would be fine from KVM's perspective.  The guest might
not be happy, but that's likely true if there are _any_ spurious NMIs.  IIRC, I
didn't call out the process_nmi() behavior because having 255 pending virtual
NMIs doesn't put KVM at risk anymore than does having 2 pending virtual NMIs.

> May be worth an additional comment instead here in
> KVM_SET_VCPU_EVENTS, before the atomic_set().
> 
> On to the actual vNMI patch:
> 
> +    /*
> +     * KVM should never request an NMI window when vNMI is enabled, as KVM
> +     * allows at most one to-be-injected NMI and one pending NMI, i.e. if
> +     * two NMIs arrive simultaneously, KVM will inject one and set
> +     * V_NMI_PENDING for the other.  WARN, but continue with the standard
> +     * single-step approach to try and salvage the pending NMI.
> +     */
> +    WARN_ON_ONCE(is_vnmi_enabled(svm));
> 
> Understandable, but also scary. :) I am not sure losing a pending NMI
> is a big deal. IIRC the "limit = 2" case only matters because Windows
> uses an NMI shootdown when rebooting the system and in some cases it
> would hang; but in this case we're in a buggy situation. And it means
> having to think about how the IRET+single-step method interacts with
> vNMI, and what is the meaning of !svm->awaiting_iret_completion
> (tested right below) in this buggy case. I'd just "return" here.

Heh, Santosh originally had it return and I had the opposite reaction: why bail
and *guarantee* problems for the guest, instead of continuing on and *maybe*
causing problems for the guest.

 : Last thought, unless there's something that will obviously break, it's probably
 : better to WARN and continue than to bail.  I.e. do the single-step and hope for
 : the best.  Bailing at this point doesn't seem like it would help.

I don't have a super strong preference.  As you said, KVM is already in a buggy
scenario, though my vote is still to carry on.

https://lkml.kernel.org/r/Y9mwz%2FG6%2BG8NSX3%2B%40google.com

> And another small nit to conclude - kvm_get_nr_pending_nmis() could be static.
> 
> The only thing that leaves me a bit puzzled is the naming and
> rationale of get_vnmi_vmcb_l1(). I'll let you or Santosh clarify that.

Ah, I think what happened is that I complained about is_vnmi_enabled() being
misleading (https://lore.kernel.org/all/Y9m0q31NBmsnhVGD@google.com), but instead
of renaming the top-level helper, Santosh added an inner helper and here we are.

Re-reading what I wrote, and looking at the code with fresh eyes, I don't think
I agree with past me regarding the name of is_vnmi_enabled().  My biggest
objection/confusion with the original code was the comment saying vNMI was
"inhibited".  Appending "for_l1()" makes the usage in the callers quite confusing.

So my vote is to do:

static inline bool is_vnmi_enabled(struct vcpu_svm *svm)
{
	if (!vnmi)
		return false;

	if (is_guest_mode(&svm->vcpu))
		return false;

	return !!(svm->vmcb01.ptr->control.int_ctl & V_NMI_ENABLE_MASK);
}

---
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index f44751dd8d5d..5b604565d4b3 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -556,25 +556,15 @@ static inline bool is_x2apic_msrpm_offset(u32 offset)
               (msr < (APIC_BASE_MSR + 0x100));
 }
 
-static inline struct vmcb *get_vnmi_vmcb_l1(struct vcpu_svm *svm)
+static inline bool is_vnmi_enabled(struct vcpu_svm *svm)
 {
        if (!vnmi)
-               return NULL;
+               return false;
 
        if (is_guest_mode(&svm->vcpu))
-               return NULL;
-       else
-               return svm->vmcb01.ptr;
-}
-
-static inline bool is_vnmi_enabled(struct vcpu_svm *svm)
-{
-       struct vmcb *vmcb = get_vnmi_vmcb_l1(svm);
-
-       if (vmcb)
-               return !!(vmcb->control.int_ctl & V_NMI_ENABLE_MASK);
-       else
                return false;
+
+       return !!(svm->vmcb01.ptr->control.int_ctl & V_NMI_ENABLE_MASK);
 }
 
 /* svm.c */

  reply	other threads:[~2023-04-26 22:41 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-24 17:35 [GIT PULL] KVM: Non-x86 changes for 6.4 Sean Christopherson
2023-04-24 17:35 ` [GIT PULL] KVM: x86: Misc " Sean Christopherson
2023-04-26 19:49   ` Paolo Bonzini
2023-04-24 17:35 ` [GIT PULL] KVM: x86: MMU " Sean Christopherson
2023-04-26 19:52   ` Paolo Bonzini
2023-04-24 17:35 ` [GIT PULL] KVM: x86: PMU " Sean Christopherson
2023-04-26 19:55   ` Paolo Bonzini
2023-04-24 17:35 ` [GIT PULL] KVM: x86: Selftests " Sean Christopherson
2023-04-26 19:56   ` Paolo Bonzini
2023-04-24 17:35 ` [GIT PULL] KVM: x86: SVM " Sean Christopherson
2023-04-26 19:58   ` Paolo Bonzini
2023-04-26 20:02     ` Sean Christopherson
2023-04-26 20:48       ` Paolo Bonzini
2023-04-26 22:40         ` Sean Christopherson [this message]
2023-04-24 17:35 ` [GIT PULL] KVM: x86: VMX " Sean Christopherson
2023-04-26 19:57   ` Paolo Bonzini
2023-04-26 19:49 ` [GIT PULL] KVM: Non-x86 " Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZEmoQslJ3gPqI8jG@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.