Re: [PATCH v3 11/11] KVM: nVMX: Wake L2 from HLT when nested posted-interrupt pending

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Oliver Upton <oupton@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	idan.brown@oracle.com, Jim Mattson <jmattson@google.com>,
	kvm list <kvm@vger.kernel.org>,
	liam.merwick@oracle.com, wanpeng.li@hotmail.com
Subject: Re: [PATCH v3 11/11] KVM: nVMX: Wake L2 from HLT when nested posted-interrupt pending
Date: Tue, 24 Nov 2020 01:55:15 +0000	[thread overview]
Message-ID: <20201124015515.GA75780@google.com> (raw)
In-Reply-To: <CAOQ_QshMoc9W9g6XRuGM4hCtMdvUxSDpGAhp3vNxhxhWTK-5CQ@mail.gmail.com>

On Mon, Nov 23, 2020 at 04:13:49PM -0800, Oliver Upton wrote:
> On Mon, Nov 23, 2020 at 4:10 PM Oliver Upton <oupton@google.com> wrote:
> >
> > On Mon, Nov 23, 2020 at 2:42 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > >
> > > On 23/11/20 20:22, Oliver Upton wrote:
> > > > The pi_pending bit works rather well as it is only a hint to KVM that it
> > > > may owe the guest a posted-interrupt completion. However, if we were to
> > > > set the guest's nested PINV as pending in the L1 IRR it'd be challenging
> > > > to infer whether or not it should actually be injected in L1 or result
> > > > in posted-interrupt processing for L2.
> > >
> > > Stupid question: why does it matter?  The behavior when the PINV is
> > > delivered does not depend on the time it enters the IRR, only on the
> > > time that it enters ISR.  If that happens while the vCPU while in L2, it
> > > would trigger posted interrupt processing; if PINV moves to ISR while in
> > > L1, it would be delivered normally as an interrupt.
> > >
> > > There are various special cases but they should fall in place.  For
> > > example, if PINV is delivered during L1 vmentry (with IF=0), it would be
> > > delivered at the next inject_pending_event when the VMRUN vmexit is
> > > processed and interrupts are unmasked.
> > >
> > > The tricky case is when L0 tries to deliver the PINV to L1 as a posted
> > > interrupt, i.e. in vmx_deliver_nested_posted_interrupt.  Then the
> > >
> > >                  if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true))
> > >                          kvm_vcpu_kick(vcpu);
> > >
> > > needs a tweak to fall back to setting the PINV in L1's IRR:
> > >
> > >                  if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true)) {
> > >                          /* set PINV in L1's IRR */
> > >                         kvm_vcpu_kick(vcpu);
> > >                 }
> >
> > Yeah, I think that's fair. Regardless, the pi_pending bit should've
> > only been set if the IPI was actually sent. Though I suppose
> 
> Didn't finish my thought :-/
> 
> Though I suppose pi_pending was set unconditionally (and skipped the
> IRR) since until recently KVM completely bungled handling the PINV
> correctly when in the L1 IRR.
> 
> >
> > > but you also have to do the same *in the PINV handler*
> > > sysvec_kvm_posted_intr_nested_ipi too, to handle the case where the
> > > L2->L0 vmexit races against sending the IPI.
> >
> > Indeed, there is a race but are we assured that the target vCPU thread
> > is scheduled on the target CPU when that IPI arrives?
> >
> > I believe there is a 1-to-many relationship here, which is why I said
> > each CPU would need to maintain a linked list of possible vCPUs to
> > iterate and find the intended recipient.

Ya, the concern is that it's theoretically possible for the PINV to arrive in L0
after a different vCPU has been loaded (or even multiple different vCPUs).      
E.g. if the sending pCPU is hit with an NMI after checking vcpu->mode, and the  
NMI runs for some absurd amount of time.  If that happens, the PINV handler     
won't know which vCPU(s) should get an IRQ injected into L1 without additional  
tracking.  KVM would need to set something like nested.pi_pending before doing  
kvm_vcpu_trigger_posted_interrupt(), i.e. nothing really changes, it just gets  
more complex.

> > The process of removing vCPUs from the list where we caught the IPI
> > in L0 is quite clear, but it doesn't seem like we could ever know to
> > remove vCPUs from the list when hardware catches that IPI.

It's probably possible by shadowing the PI descriptor, but doing so would likely
wipe out the perf benefits of nested PI.

That being said, if we don't care about strictly adhering to the spec (sorry in
advance Jim), I think we could avoid nested.pi_pending if we're ok with KVM
processing virtual interrupts that technically shouldn't happen, yet.  E.g. if
the L0 PINV handler consumes vIRR bits that were set after the last PI from L1.
KVM already does that with nested.pi_pending, so it might not be the worst idea
in the world since the existing nested PI implementation mostly works.

next prev parent reply	other threads:[~2020-11-24  1:55 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-24 16:12 [PATCH v3 00/11] KVM: nVMX: Fix multiple issues in nested posted-interrupts Liran Alon
2017-12-24 16:12 ` [PATCH v3 01/11] KVM: x86: Optimization: Create SVM stubs for sync_pir_to_irr() Liran Alon
2017-12-27  9:56   ` Paolo Bonzini
2017-12-27 10:01     ` Liran Alon
2017-12-24 16:12 ` [PATCH v3 02/11] KVM: x86: Change __kvm_apic_update_irr() to also return if max IRR updated Liran Alon
2018-01-02  1:51   ` Quan Xu
2017-12-24 16:12 ` [PATCH v3 03/11] KVM: nVMX: Re-evaluate L1 pending events when running L2 and L1 got posted-interrupt Liran Alon
2018-01-02  2:45   ` Quan Xu
2018-01-02  9:57     ` Liran Alon
2018-01-02 11:21       ` Quan Xu
2018-01-02 11:52         ` Quan Xu
2018-01-02 12:20           ` Liran Alon
2018-01-03  5:32             ` Quan Xu
2018-01-03  5:35   ` Quan Xu
2017-12-24 16:12 ` [PATCH v3 04/11] KVM: nVMX: Fix injection to L2 when L1 don't intercept external-interrupts Liran Alon
2017-12-24 16:12 ` [PATCH v3 05/11] KVM: x86: Rename functions which saves vCPU in per-cpu var Liran Alon
2017-12-24 16:12 ` [PATCH v3 06/11] KVM: x86: Set current_vcpu per-cpu var before enabling interrupts at host Liran Alon
2017-12-27 10:06   ` Paolo Bonzini
2017-12-27 10:44     ` Liran Alon
2017-12-24 16:12 ` [PATCH v3 07/11] KVM: x86: Add util for getting current vCPU running on CPU Liran Alon
2017-12-24 16:13 ` [PATCH v3 08/11] KVM: x86: Register empty handler for POSTED_INTR_NESTED_VECTOR IPI Liran Alon
2017-12-24 16:13 ` [PATCH v3 09/11] KVM: nVMX: Deliver missed nested-PI notification-vector via self-IPI while interrupts disabled Liran Alon
2017-12-24 16:13 ` [PATCH v3 10/11] KVM: nVMX: Wake halted L2 on nested posted-interrupt Liran Alon
2017-12-27 11:31   ` Paolo Bonzini
2017-12-27 12:01     ` Liran Alon
2017-12-27 12:27       ` Paolo Bonzini
2017-12-27 12:52         ` Liran Alon
2017-12-27 13:05           ` Paolo Bonzini
2017-12-27 15:33             ` Liran Alon
2017-12-27 15:54               ` Paolo Bonzini
2018-01-01 21:32                 ` Paolo Bonzini
2018-01-01 22:37                   ` Liran Alon
2018-01-02  7:25                     ` Paolo Bonzini
2017-12-24 16:13 ` [PATCH v3 11/11] KVM: nVMX: Wake L2 from HLT when nested posted-interrupt pending Liran Alon
2017-12-27 10:15   ` Paolo Bonzini
2017-12-27 10:51     ` Liran Alon
2017-12-27 12:55       ` Paolo Bonzini
2017-12-27 15:15         ` Liran Alon
2017-12-27 15:55           ` Paolo Bonzini
2020-11-23 19:22             ` Oliver Upton
2020-11-23 22:42               ` Paolo Bonzini
2020-11-24  0:10                 ` Oliver Upton
2020-11-24  0:13                   ` Oliver Upton
2020-11-24  1:55                     ` Sean Christopherson [this message]
2020-11-24  3:19                       ` Sean Christopherson
2020-11-24 11:39                       ` Paolo Bonzini
2020-11-24 21:22                         ` Sean Christopherson
2020-11-25  0:10                           ` Paolo Bonzini
2020-11-25  1:14                             ` Sean Christopherson
2020-11-25 17:00                               ` Paolo Bonzini
2020-11-25 18:32                                 ` Sean Christopherson
2020-11-26 13:13                                   ` Paolo Bonzini
2020-11-30 19:14                                     ` Sean Christopherson
2020-11-30 19:36                                       ` Paolo Bonzini
2020-12-03 22:07                         ` Jim Mattson
2020-11-24 11:09                   ` Paolo Bonzini
2020-12-03 21:45                     ` Jim Mattson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201124015515.GA75780@google.com \
    --to=seanjc@google.com \
    --cc=idan.brown@oracle.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=liam.merwick@oracle.com \
    --cc=oupton@google.com \
    --cc=pbonzini@redhat.com \
    --cc=wanpeng.li@hotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.