All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oliver Upton <oliver.upton@linux.dev>
To: Mingwei Zhang <mizhang@google.com>
Cc: Maxim Levitsky <mlevitsk@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] KVM: x86: Get vmcs12 pages before checking pending interrupts
Date: Wed, 3 Aug 2022 18:36:24 +0000	[thread overview]
Message-ID: <YurAKLuw2RTtMJVT@google.com> (raw)
In-Reply-To: <Yuqpqr/aE6KN5MLv@google.com>

On Wed, Aug 03, 2022 at 05:00:26PM +0000, Mingwei Zhang wrote:
> On Wed, Aug 03, 2022, Mingwei Zhang wrote:
> > On Wed, Aug 03, 2022, Maxim Levitsky wrote:
> > > On Tue, 2022-08-02 at 23:07 +0000, Mingwei Zhang wrote:
> > > > From: Oliver Upton <oupton@google.com>
> > > > 
> > > > vmx_guest_apic_has_interrupts implicitly depends on the virtual APIC
> > > > page being present + mapped into the kernel address space. However, with
> > > > demand paging we break this dependency, as the KVM_REQ_GET_VMCS12_PAGES
> > > > event isn't assessed before entering vcpu_block.
> > > > 
> > > > Fix this by getting vmcs12 pages before inspecting the guest's APIC
> > > > page. Note that upstream does not have this issue, as they will directly
> > > > get the vmcs12 pages on vmlaunch/vmresume instead of relying on the
> > > > event request mechanism. However, the upstream approach is problematic,
> > > > as the vmcs12 pages will not be present if a live migration occurred
> > > > before checking the virtual APIC page.
> > > 
> > > Since this patch is intended for upstream, I don't fully understand
> > > the meaning of the above paragraph.
> > 
> > My apology. Some of the statement needs to be updated, which I should do
> > before sending. But I think the point here is that there is a missing
> > get_nested_state_pages() call here within vcpu_block() when there is the
> > request of KVM_REQ_GET_NESTED_STATE_PAGES.

This was my poorly written changelog, sorry about that Mingwei :)

> > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > index 5366f884e9a7..1d3d8127aaea 100644
> > > > --- a/arch/x86/kvm/x86.c
> > > > +++ b/arch/x86/kvm/x86.c
> > > > @@ -10599,6 +10599,23 @@ static inline int vcpu_block(struct kvm_vcpu *vcpu)
> > > >  {
> > > >  	bool hv_timer;
> > > >  
> > > > +	/*
> > > > +	 * We must first get the vmcs12 pages before checking for interrupts
> > > > +	 * that might unblock the guest if L1 is using virtual-interrupt
> > > > +	 * delivery.
> > > > +	 */
> > > > +	if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
> > > > +		/*
> > > > +		 * If we have to ask user-space to post-copy a page,
> > > > +		 * then we have to keep trying to get all of the
> > > > +		 * VMCS12 pages until we succeed.
> > > > +		 */
> > > > +		if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
> > > > +			kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
> > > > +			return 0;
> > > > +		}
> > > > +	}
> > > > +
> > > >  	if (!kvm_arch_vcpu_runnable(vcpu)) {
> > > >  		/*
> > > >  		 * Switch to the software timer before halt-polling/blocking as
> > > 
> > > 
> > > If I understand correctly, you are saying that if apic backing page is migrated in post copy
> > > then 'get_nested_state_pages' will return false and thus fail?
> > 
> > What I mean is that when the vCPU was halted and then migrated in this
> > case, KVM did not call get_nested_state_pages() before getting into
> > kvm_arch_vcpu_runnable(). This function checks the apic backing page and
> > fails on that check and triggered the warning.
> > > 
> > > AFAIK both SVM and VMX versions of 'get_nested_state_pages' assume that this is not the case
> > > for many things like MSR bitmaps and such - they always uses non atomic versions
> > > of guest memory access like 'kvm_vcpu_read_guest' and 'kvm_vcpu_map' which
> > > supposed to block if they attempt to access HVA which is not present, and then
> > > userfaultd should take over and wake them up.
> > 
> > You are right here.
> > > 
> > > If that still fails, nested VM entry is usually failed, and/or the whole VM
> > > is crashed with 'KVM_EXIT_INTERNAL_ERROR'.
> > > 
> 
> Ah, I think I understand what you are saying. hmm, so basically the
> patch here is to continuously request vmcs12 pages if failed. But what
> you are saying is that we just need to call 'get_nested_state_pages'
> once. If it fails, then the VM fails to work. Let me double check and
> get back.

IIRC the reason we reset the request on a nonzero return was because our
local implementation of the VMX hook was non-blocking and would bail on
the first page that needed to be demanded from the source. So, you
effectively keep hitting the request until all the pages are pulled in.

--
Thanks,
Oliver



  reply	other threads:[~2022-08-03 18:36 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-02 23:07 [PATCH 0/5] Fix a race between posted interrupt delivery and migration in a nested VM Mingwei Zhang
2022-08-02 23:07 ` [PATCH 1/5] KVM: x86: Get vmcs12 pages before checking pending interrupts Mingwei Zhang
2022-08-03 10:08   ` Maxim Levitsky
2022-08-03 16:45     ` Mingwei Zhang
2022-08-03 17:00       ` Mingwei Zhang
2022-08-03 18:36         ` Oliver Upton [this message]
2022-08-03 17:18   ` Paolo Bonzini
2022-08-03 17:51     ` Mingwei Zhang
2022-08-03 19:34       ` Maxim Levitsky
2022-08-25  0:11         ` Sean Christopherson
2022-08-25  2:51           ` Jim Mattson
2022-08-25 14:40             ` Sean Christopherson
2022-08-25 17:16               ` Mingwei Zhang
2022-08-25 17:53                 ` Sean Christopherson
2022-08-25 20:35                   ` Mingwei Zhang
2022-08-25 20:37                     ` Mingwei Zhang
2022-08-25 23:21                     ` Jim Mattson
2022-08-02 23:07 ` [PATCH 2/5] selftests: KVM/x86: Fix vcpu_{save,load}_state() by adding APIC state into kvm_x86_state Mingwei Zhang
2022-08-03 18:44   ` Oliver Upton
2022-08-03 19:21     ` Sean Christopherson
2022-08-03 23:55       ` Oliver Upton
2022-08-02 23:07 ` [PATCH 3/5] selftests: KVM: Introduce vcpu_run_interruptable() Mingwei Zhang
2022-08-25  0:15   ` Sean Christopherson
2022-08-28 19:21     ` Mingwei Zhang
2022-08-02 23:07 ` [PATCH 4/5] selftests: KVM: Add support for posted interrupt handling in L2 Mingwei Zhang
2022-08-25  0:16   ` Sean Christopherson
2022-08-28 19:29     ` Mingwei Zhang
2022-08-02 23:07 ` [PATCH 5/5] selftests: KVM: Test if posted interrupt delivery race with migration Mingwei Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YurAKLuw2RTtMJVT@google.com \
    --to=oliver.upton@linux.dev \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mizhang@google.com \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.