Re: [PATCH v3 11/11] KVM: nVMX: Wake L2 from HLT when nested posted-interrupt pending

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Oliver Upton <oupton@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	idan.brown@oracle.com, Jim Mattson <jmattson@google.com>,
	kvm list <kvm@vger.kernel.org>,
	liam.merwick@oracle.com, wanpeng.li@hotmail.com
Subject: Re: [PATCH v3 11/11] KVM: nVMX: Wake L2 from HLT when nested posted-interrupt pending
Date: Tue, 24 Nov 2020 03:19:09 +0000	[thread overview]
Message-ID: <20201124031909.GA103400@google.com> (raw)
In-Reply-To: <20201124015515.GA75780@google.com>

On Tue, Nov 24, 2020 at 01:55:15AM +0000, Sean Christopherson wrote:
> On Mon, Nov 23, 2020 at 04:13:49PM -0800, Oliver Upton wrote:
> > On Mon, Nov 23, 2020 at 4:10 PM Oliver Upton <oupton@google.com> wrote:
> > >
> > > On Mon, Nov 23, 2020 at 2:42 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > >
> > > > On 23/11/20 20:22, Oliver Upton wrote:
> > > > > The pi_pending bit works rather well as it is only a hint to KVM that it
> > > > > may owe the guest a posted-interrupt completion. However, if we were to
> > > > > set the guest's nested PINV as pending in the L1 IRR it'd be challenging
> > > > > to infer whether or not it should actually be injected in L1 or result
> > > > > in posted-interrupt processing for L2.
> > > >
> > > > Stupid question: why does it matter?  The behavior when the PINV is
> > > > delivered does not depend on the time it enters the IRR, only on the
> > > > time that it enters ISR.  If that happens while the vCPU while in L2, it
> > > > would trigger posted interrupt processing; if PINV moves to ISR while in
> > > > L1, it would be delivered normally as an interrupt.
> > > >
> > > > There are various special cases but they should fall in place.  For
> > > > example, if PINV is delivered during L1 vmentry (with IF=0), it would be
> > > > delivered at the next inject_pending_event when the VMRUN vmexit is
> > > > processed and interrupts are unmasked.
> > > >
> > > > The tricky case is when L0 tries to deliver the PINV to L1 as a posted
> > > > interrupt, i.e. in vmx_deliver_nested_posted_interrupt.  Then the
> > > >
> > > >                  if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true))
> > > >                          kvm_vcpu_kick(vcpu);
> > > >
> > > > needs a tweak to fall back to setting the PINV in L1's IRR:
> > > >
> > > >                  if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true)) {
> > > >                          /* set PINV in L1's IRR */
> > > >                         kvm_vcpu_kick(vcpu);
> > > >                 }
> > >
> > > Yeah, I think that's fair. Regardless, the pi_pending bit should've
> > > only been set if the IPI was actually sent. Though I suppose
> > 
> > Didn't finish my thought :-/
> > 
> > Though I suppose pi_pending was set unconditionally (and skipped the
> > IRR) since until recently KVM completely bungled handling the PINV
> > correctly when in the L1 IRR.
> > 
> > >
> > > > but you also have to do the same *in the PINV handler*
> > > > sysvec_kvm_posted_intr_nested_ipi too, to handle the case where the
> > > > L2->L0 vmexit races against sending the IPI.
> > >
> > > Indeed, there is a race but are we assured that the target vCPU thread
> > > is scheduled on the target CPU when that IPI arrives?
> > >
> > > I believe there is a 1-to-many relationship here, which is why I said
> > > each CPU would need to maintain a linked list of possible vCPUs to
> > > iterate and find the intended recipient.
> 
> Ya, the concern is that it's theoretically possible for the PINV to arrive in L0
> after a different vCPU has been loaded (or even multiple different vCPUs).      
> E.g. if the sending pCPU is hit with an NMI after checking vcpu->mode, and the  
> NMI runs for some absurd amount of time.  If that happens, the PINV handler     
> won't know which vCPU(s) should get an IRQ injected into L1 without additional  
> tracking.  KVM would need to set something like nested.pi_pending before doing  
> kvm_vcpu_trigger_posted_interrupt(), i.e. nothing really changes, it just gets  
> more complex.

If we do want to do something fancy with the L0 PINV handler, I think we could
avoid a list by having the dest vCPU busy wait before exiting guest_mode if
there is one or more PINV IPIs that are in the process of being sent, and then
wait again in vcpu_put() for the PINV to be handled.  The latter in particular
would rarely come into play, i.e. shouldn't impact performance.  That would
prevent having a somewhat unbounded et of vCPUs that may or may not have an
oustanding PINV.

next prev parent reply	other threads:[~2020-11-24  3:19 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-24 16:12 [PATCH v3 00/11] KVM: nVMX: Fix multiple issues in nested posted-interrupts Liran Alon
2017-12-24 16:12 ` [PATCH v3 01/11] KVM: x86: Optimization: Create SVM stubs for sync_pir_to_irr() Liran Alon
2017-12-27  9:56   ` Paolo Bonzini
2017-12-27 10:01     ` Liran Alon
2017-12-24 16:12 ` [PATCH v3 02/11] KVM: x86: Change __kvm_apic_update_irr() to also return if max IRR updated Liran Alon
2018-01-02  1:51   ` Quan Xu
2017-12-24 16:12 ` [PATCH v3 03/11] KVM: nVMX: Re-evaluate L1 pending events when running L2 and L1 got posted-interrupt Liran Alon
2018-01-02  2:45   ` Quan Xu
2018-01-02  9:57     ` Liran Alon
2018-01-02 11:21       ` Quan Xu
2018-01-02 11:52         ` Quan Xu
2018-01-02 12:20           ` Liran Alon
2018-01-03  5:32             ` Quan Xu
2018-01-03  5:35   ` Quan Xu
2017-12-24 16:12 ` [PATCH v3 04/11] KVM: nVMX: Fix injection to L2 when L1 don't intercept external-interrupts Liran Alon
2017-12-24 16:12 ` [PATCH v3 05/11] KVM: x86: Rename functions which saves vCPU in per-cpu var Liran Alon
2017-12-24 16:12 ` [PATCH v3 06/11] KVM: x86: Set current_vcpu per-cpu var before enabling interrupts at host Liran Alon
2017-12-27 10:06   ` Paolo Bonzini
2017-12-27 10:44     ` Liran Alon
2017-12-24 16:12 ` [PATCH v3 07/11] KVM: x86: Add util for getting current vCPU running on CPU Liran Alon
2017-12-24 16:13 ` [PATCH v3 08/11] KVM: x86: Register empty handler for POSTED_INTR_NESTED_VECTOR IPI Liran Alon
2017-12-24 16:13 ` [PATCH v3 09/11] KVM: nVMX: Deliver missed nested-PI notification-vector via self-IPI while interrupts disabled Liran Alon
2017-12-24 16:13 ` [PATCH v3 10/11] KVM: nVMX: Wake halted L2 on nested posted-interrupt Liran Alon
2017-12-27 11:31   ` Paolo Bonzini
2017-12-27 12:01     ` Liran Alon
2017-12-27 12:27       ` Paolo Bonzini
2017-12-27 12:52         ` Liran Alon
2017-12-27 13:05           ` Paolo Bonzini
2017-12-27 15:33             ` Liran Alon
2017-12-27 15:54               ` Paolo Bonzini
2018-01-01 21:32                 ` Paolo Bonzini
2018-01-01 22:37                   ` Liran Alon
2018-01-02  7:25                     ` Paolo Bonzini
2017-12-24 16:13 ` [PATCH v3 11/11] KVM: nVMX: Wake L2 from HLT when nested posted-interrupt pending Liran Alon
2017-12-27 10:15   ` Paolo Bonzini
2017-12-27 10:51     ` Liran Alon
2017-12-27 12:55       ` Paolo Bonzini
2017-12-27 15:15         ` Liran Alon
2017-12-27 15:55           ` Paolo Bonzini
2020-11-23 19:22             ` Oliver Upton
2020-11-23 22:42               ` Paolo Bonzini
2020-11-24  0:10                 ` Oliver Upton
2020-11-24  0:13                   ` Oliver Upton
2020-11-24  1:55                     ` Sean Christopherson
2020-11-24  3:19                       ` Sean Christopherson [this message]
2020-11-24 11:39                       ` Paolo Bonzini
2020-11-24 21:22                         ` Sean Christopherson
2020-11-25  0:10                           ` Paolo Bonzini
2020-11-25  1:14                             ` Sean Christopherson
2020-11-25 17:00                               ` Paolo Bonzini
2020-11-25 18:32                                 ` Sean Christopherson
2020-11-26 13:13                                   ` Paolo Bonzini
2020-11-30 19:14                                     ` Sean Christopherson
2020-11-30 19:36                                       ` Paolo Bonzini
2020-12-03 22:07                         ` Jim Mattson
2020-11-24 11:09                   ` Paolo Bonzini
2020-12-03 21:45                     ` Jim Mattson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201124031909.GA103400@google.com \
    --to=seanjc@google.com \
    --cc=idan.brown@oracle.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=liam.merwick@oracle.com \
    --cc=oupton@google.com \
    --cc=pbonzini@redhat.com \
    --cc=wanpeng.li@hotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.