From: Mingwei Zhang <mizhang@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Maxim Levitsky <mlevitsk@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Oliver Upton <oupton@google.com>,
Jim Mattson <jmattson@google.com>
Subject: Re: [PATCH v2 1/4] KVM: x86: move the event handling of KVM_REQ_GET_VMCS12_PAGES into a common function
Date: Wed, 7 Sep 2022 00:01:02 +0000 [thread overview]
Message-ID: <YxffPlIL/17kZY0k@google.com> (raw)
In-Reply-To: <YwzkvfT0AiwaojTx@google.com>
On Mon, Aug 29, 2022, Sean Christopherson wrote:
> On Sun, Aug 28, 2022, Mingwei Zhang wrote:
> > Create a common function to handle kvm request in the vcpu_run loop. KVM
> > implicitly assumes the virtual APIC page being present + mapped into the
> > kernel address space when executing vmx_guest_apic_has_interrupts().
> > However, with demand paging KVM breaks the assumption, as the
> > KVM_REQ_GET_VMCS12_PAGES event isn't assessed before entering vcpu_block.
>
> KVM_REQ_GET_VMCS12_PAGES doesn't exist upstream.
ack.
>
> > Fix this by getting vmcs12 pages before inspecting the guest's APIC page.
> > Because of this fix, the event handling code of
> > KVM_REQ_GET_NESTED_STATE_PAGES becomes a common code path for both
> > vcpu_enter_guest() and vcpu_block(). Thus, put this code snippet into a
> > common helper function to avoid code duplication.
> >
> > Cc: Maxim Levitsky <mlevitsk@redhat.com>
> > Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> > Originally-by: Oliver Upton <oupton@google.com>
> > Signed-off-by: Oliver Upton <oupton@google.com>
>
> If you drop someone as author, then their SOB also needs to be jettisoned.
>
ack.
> > Signed-off-by: Mingwei Zhang <mizhang@google.com>
> > ---
> > arch/x86/kvm/x86.c | 29 +++++++++++++++++++++++------
> > 1 file changed, 23 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index d7374d768296..3dcaac8f0584 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -10261,12 +10261,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > r = -EIO;
> > goto out;
> > }
> > - if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
> > - if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
> > - r = 0;
> > - goto out;
> > - }
> > - }
> > if (kvm_check_request(KVM_REQ_MMU_FREE_OBSOLETE_ROOTS, vcpu))
> > kvm_mmu_free_obsolete_roots(vcpu);
> > if (kvm_check_request(KVM_REQ_MIGRATE_TIMER, vcpu))
> > @@ -10666,6 +10660,23 @@ static inline bool kvm_vcpu_running(struct kvm_vcpu *vcpu)
> > !vcpu->arch.apf.halted);
> > }
> >
> > +static int kvm_vcpu_handle_common_requests(struct kvm_vcpu *vcpu)
> > +{
> > + if (kvm_request_pending(vcpu)) {
>
> Probably going to be a moot point, but write this as
>
> if (!kvm_request_pending(vcpu))
> return 1;
>
> to reduce indentation.
>
> > + /*
> > + * Get the vmcs12 pages before checking for interrupts that
> > + * might unblock the guest if L1 is using virtual-interrupt
> > + * delivery.
> > + */
> > + if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
> > + if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu)))
>
> Similarly
>
> if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu) &&
> unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu)))
> return 0;
>
> though I can see the argument for fully isolating each request. But again, likely
> a moot point.
>
> > + return 0;
> > + }
> > + }
> > +
> > + return 1;
> > +}
> > +
> > /* Called within kvm->srcu read side. */
> > static int vcpu_run(struct kvm_vcpu *vcpu)
> > {
> > @@ -10681,6 +10692,12 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
> > * this point can start executing an instruction.
> > */
> > vcpu->arch.at_instruction_boundary = false;
> > +
> > + /* Process common request regardless of vcpu state. */
> > + r = kvm_vcpu_handle_common_requests(vcpu);
>
> IMO this is subtly a dangerous hook. It implies that both vcpu_enter_guest()
> and vcpu_block() correctly handle requests becoming pending after the "common"
> check, but that's not actually the case. If a request _needs_ to be handled
> before vcpu_block(), then ideally it should be explicitly queried in
> kvm_vcpu_check_block(). KVM_REQ_GET_NESTED_STATE_PAGES doesn't have issues because
> it's only ever set from the vCPU itself.
>
> Following that train of thought, KVM_REQ_GET_NESTED_STATE_PAGES really shouldn't
> even be a request. Aha! And we can do that in a way that would magically fix this
> bug, and would ensure we don't leave a trap for future us.
>
> KVM already provides KVM_REQ_UNBLOCK to prevent blocking the vCPU without actaully
> waking the vCPU, i.e. to kick the vCPU back into the vcpu_run() loop. The request
> is provided specifically for scenarios like this where KVM needs to do work before
> blocking.
>
hmm. I think this won't work. The warning is happening at this trace
(although the dynamic trace does not show the full stack trace in source
code):
WARN_ON_ONCE(!vmx->nested.virtual_apic_map.gfn))
vmx_guest_apic_has_interrupt()
kvm_guest_apic_has_interrupt()
kvm_vcpu_has_events()
kvm_arch_vcpu_runnablea()
kvm_vcpu_check_block()
If you go to kvm_vcpu_check_block(), the check of KVM_REQ_UNBLOCK is
behind check of kvm_arch_vcpu_runnable(). So, with the diff you pointed
out, we will still see the warning.
Maybe what we can do is to re-order the
kvm_check_request(KVM_REQ_UNBLOCK, vcpu) to the beginning of the
kvm_vcpu_check_block()? But I am not sure.
Thanks.
-Mingwei
> Normally I'd say we should do this over multiple patches so that the "blocking"
> bug is fixed before doing the rework/cleanup, but I'm ok if we want to skip straight
> to the rework since we're obviously carrying an internal patch and no one else is
> likely to need the fix. But I also wouldn't object to including an intermediate
> patch to fix the bug so that there's a better paper trail.
>
> E.g. as a very partial conversion:
>
> ---
> arch/x86/include/asm/kvm_host.h | 2 ++
> arch/x86/kvm/vmx/nested.c | 2 +-
> arch/x86/kvm/x86.c | 12 ++++++++++++
> arch/x86/kvm/x86.h | 10 ++++++++++
> 4 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 9345303c8c6d..bfca37419783 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -939,6 +939,8 @@ struct kvm_vcpu_arch {
> */
> bool pdptrs_from_userspace;
>
> + bool nested_get_pages_pending;
> +
> #if IS_ENABLED(CONFIG_HYPERV)
> hpa_t hv_root_tdp;
> #endif
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index ddd4367d4826..e83b145c3a35 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3446,7 +3446,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
> * to nested_get_vmcs12_pages before the next VM-entry. The MSRs
> * have already been set at vmentry time and should not be reset.
> */
> - kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
> + kvm_nested_get_pages_set_pending(vcpu);
> }
>
> /*
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c0e3e7915a3a..0a7601ebffc6 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9650,6 +9650,12 @@ int kvm_check_nested_events(struct kvm_vcpu *vcpu)
> return kvm_x86_ops.nested_ops->check_events(vcpu);
> }
>
> +static int kvm_get_nested_state_pages(struct kvm_vcpu *vcpu)
> +{
> + vcpu->arch.nested_get_pages_pending = false;
> + return kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu);
> +}
> +
> static void kvm_inject_exception(struct kvm_vcpu *vcpu)
> {
> trace_kvm_inj_exception(vcpu->arch.exception.nr,
> @@ -10700,6 +10706,12 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
> if (kvm_cpu_has_pending_timer(vcpu))
> kvm_inject_pending_timer_irqs(vcpu);
>
> + if (vcpu->arch.nested_get_pages_pending) {
> + r = kvm_get_nested_state_pages(vcpu);
> + if (r <= 0)
> + break;
> + }
> +
> if (dm_request_for_irq_injection(vcpu) &&
> kvm_vcpu_ready_for_interrupt_injection(vcpu)) {
> r = 0;
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 1926d2cb8e79..e35aac39dc73 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -481,4 +481,14 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
> unsigned int port, void *data, unsigned int count,
> int in);
>
> +static inline void kvm_nested_get_pages_set_pending(struct kvm_vcpu *vcpu)
> +{
> + /*
> + * Here is a comment explaining why KVM needs to prevent the vCPU from
> + * blocking until the vCPU's nested pages have been loaded.
> + */
> + vcpu->arch.nested_get_pages_pending = true;
> + kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
> +}
> +
> #endif
>
> base-commit: 14a47a98151834c5bd2f6d8d592b01108a3f882a
> --
next prev parent reply other threads:[~2022-09-07 0:01 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-28 22:25 [PATCH v2 0/4] Fix a race between posted interrupt delivery and migration in a nested VM Mingwei Zhang
2022-08-28 22:25 ` [PATCH v2 1/4] KVM: x86: move the event handling of KVM_REQ_GET_VMCS12_PAGES into a common function Mingwei Zhang
2022-08-29 16:09 ` Sean Christopherson
2022-09-07 0:01 ` Mingwei Zhang [this message]
2022-09-07 2:50 ` Yuan Yao
2022-09-07 4:26 ` Mingwei Zhang
2022-09-07 5:35 ` Yuan Yao
2022-09-07 15:48 ` Sean Christopherson
2022-09-07 15:56 ` Sean Christopherson
2022-08-28 22:25 ` [PATCH v2 2/4] KVM: selftests: Save/restore vAPIC state in migration tests Mingwei Zhang
2022-08-28 22:25 ` [PATCH v2 3/4] KVM: selftests: Add support for posted interrupt handling in L2 Mingwei Zhang
2022-08-29 16:19 ` Sean Christopherson
2022-09-07 0:02 ` Mingwei Zhang
2022-08-28 22:25 ` [PATCH v2 4/4] KVM: selftests: Test if posted interrupt delivery race with migration Mingwei Zhang
2022-08-29 17:21 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YxffPlIL/17kZY0k@google.com \
--to=mizhang@google.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mlevitsk@redhat.com \
--cc=oupton@google.com \
--cc=pbonzini@redhat.com \
--cc=seanjc@google.com \
--cc=vkuznets@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).