kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Radim Krčmář" <rkrcmar@redhat.com>
To: Liran Alon <liran.alon@oracle.com>
Cc: pbonzini@redhat.com, kvm@vger.kernel.org, jmattson@google.com,
	wanpeng.li@hotmail.com, idan.brown@oracle.com,
	Krish Sadhukhan <krish.sadhukhan@oracle.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: [PATCH v2 4/5] KVM: nVMX: Deliver missed nested-PI notification-vector via self-IPI while interrupts disabled
Date: Wed, 6 Dec 2017 21:45:52 +0100	[thread overview]
Message-ID: <20171206204552.GD3644@flask> (raw)
In-Reply-To: <1512461786-6465-5-git-send-email-liran.alon@oracle.com>

2017-12-05 10:16+0200, Liran Alon:
> When L1 wants to send a posted-interrupt to another L1 CPU running L2,
> it sets the relevant bit in vmx->nested.pi_desc->pir and ON bit in
> vmx->nested.pi_desc->control. Then it attempts to send a
> notification-vector IPI to dest L1 CPU.
> This attempt to send IPI will exit to L0 which will reach
> vmx_deliver_nested_posted_interrupt() which does the following:
> 1. If dest L0 CPU is currently running guest (vcpu->mode ==
> IN_GUEST_MODE), it sends a physical IPI of PI nested-notification-vector.
> 2. It sets KVM_REQ_EVENT in dest vCPU. This is done such that if dest
> L0 CPU exits from guest to host and therefore doesn't recognize physical
> IPI (or it wasn't sent), then KVM_REQ_EVENT will be consumed on next
> vmentry which will call vmx_check_nested_events() which should call
> (in theory) vmx_complete_nested_posted_interrupt(). That function
> should see vmx->nested.pi_desc->control ON bit is set and therefore
> "emulate" posted-interrupt delivery for L1 (sync PIR to IRR in L1
> virtual-apic-page & update vmcs02 RVI).
> 
> The above logic regarding nested-posted-interrupts contains multiple
> issues:
> 
> A) Race-condition:
> On step (1) it is possible sender will see vcpu->mode == IN_GUEST_MODE
> but before sending physical IPI, the dest CPU will exit to host.
> Therefore, physical IPI could be received at host which it's hanlder
> does nothing. In addition, assume that dest CPU passes the checks
> for pending kvm requests before sender sets KVM_REQ_EVENT. Therefore,
> dest CPU will resume L2 without evaluating nested-posted-interrupts.
> 
> B) vmx_complete_nested_posted_interrupt() is not always called
> when needed. Even if dest CPU consumed KVM_REQ_EVENT, there is a bug
> that vmx_check_nested_events() could exit from L2 to L1 before calling
> vmx_complete_nested_posted_interrupt(). Therefore, on next resume of
> L1 into L2, nested-posted-interrupts won't be evaluated even though L0
> resume L2 (We may resume L2 without having KVM_REQ_EVENT set).
> 
> This commit removes nested-posted-interrupts processing from
> check_nested_events() and instead makes sure to process
> nested-posted-interrupts on vmentry after interrupts
> disabled. Processing of nested-posted-interrupts is delegated
> to hardware by issueing a self-IPI of relevant notification-vector
> which will be delivered to CPU when CPU is in guest.
> 
> * Bug (A) is solved by the fact that processing of
> nested-posted-interrupts is not dependent on KVM_REQ_EVENT and
> happens before every vmentry to L2.
> * Bug (B) is now trivially solved by processing
> nested-posted-interrupts before each vmentry to L2 guest.
> 
> An alternative could have been to just call
> vmx_complete_nested_posted_interrupt() at this call-site. However, we
> have decided to go with this approach because:
> 1. It would require modifying vmx_complete_nested_posted_interrupt()
> to be able to work with interrupts disabled (not-trivial).
> 2. We preffer to avoid software-emulation of hardware behavior if it
> is possible.

Nice solution!  Self-IPI was slower on non-nested, but even if it is
slower here, the code saving is definitely worth it.

> Fixes: 705699a13994 ("KVM: nVMX: Enable nested posted interrupt processing")
> 
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> Co-authored-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> @@ -5156,6 +5114,17 @@ static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>  		kvm_vcpu_kick(vcpu);
>  }
>  
> +static void vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
> +	WARN_ON(!vcpu->arch.apicv_active);
> +
> +	if (is_guest_mode(vcpu) && vmx->nested.pi_desc &&
> +	    pi_test_on(vmx->nested.pi_desc))
> +		kvm_vcpu_trigger_posted_interrupt(vcpu, true);

We're always sending to self, so the function would be better with
apic->send_IPI_self() as it might be accelerated by hardware.

> +}
> +
>  /*
>   * Set up the vmcs's constant host-state fields, i.e., host-state fields that
>   * will not change in the lifetime of the guest.
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> @@ -6962,12 +6962,18 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  	smp_mb__after_srcu_read_unlock();
>  
>  	/*
> -	 * This handles the case where a posted interrupt was
> -	 * notified with kvm_vcpu_kick.
> +	 * In case guest got the posted-interrupt notification
> +	 * vector while running in host, we need to make sure
> +	 * it arrives to guest.
> +	 * For L1 posted-interrupts, we manually sync PIR to IRR.
> +	 * For L2 posted-interrupts, we send notification-vector
> +	 * again by self IPI such that it will now be received in guest.
>  	 */
> -	if (kvm_lapic_enabled(vcpu)) {
> -		if (kvm_x86_ops->sync_pir_to_irr && vcpu->arch.apicv_active)
> +	if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active) {
> +		if (kvm_x86_ops->sync_pir_to_irr)
>  			kvm_x86_ops->sync_pir_to_irr(vcpu);
> +		if (kvm_x86_ops->complete_nested_posted_interrupt)
> +			kvm_x86_ops->complete_nested_posted_interrupt(vcpu);

I think that vcpu->arch.apicv_active is enough to guarantee presence of
both sync_pir_to_irr and complete_nested_posted_interrupt,

and considering this is a hot path, I'd check is_guest_mode(vcpu) before
calling the second function.

(Saving the function call with a new x86_op for those two seems viable
 too, but would need deeper analysis and benchmarking as it is putting
 more code into hot cache ...)

Thanks.

  parent reply	other threads:[~2017-12-06 20:45 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-05  8:16 [PATCH v2 0/5]: KVM: nVMX: Fix multiple issues with nested-posted-interrupts Liran Alon
2017-12-05  8:16 ` [PATCH v2 1/5] KVM: nVMX: Remove pi_pending as signal to process nested posted-interrupts Liran Alon
2017-12-05  8:16 ` [PATCH v2 2/5] KVM: nVMX: Re-evaluate L1 pending events when running L2 and L1 got posted-interrupt Liran Alon
2017-12-06 18:52   ` Radim Krčmář
2017-12-07  2:29     ` Liran Alon
2017-12-11 22:53     ` Paolo Bonzini
2017-12-05  8:16 ` [PATCH v2 3/5] KVM: nVMX: Fix injection to L2 when L1 don't intercept external-interrupts Liran Alon
2017-12-06 20:20   ` Radim Krčmář
2017-12-07 11:19     ` Liran Alon
2017-12-07 16:41       ` Radim Krčmář
2017-12-05  8:16 ` [PATCH v2 4/5] KVM: nVMX: Deliver missed nested-PI notification-vector via self-IPI while interrupts disabled Liran Alon
2017-12-05  8:36   ` Wincy Van
2017-12-06 20:45   ` Radim Krčmář [this message]
2017-12-07 11:33     ` Liran Alon
2017-12-07 16:26       ` Radim Krčmář
2017-12-11 22:57         ` Paolo Bonzini
2017-12-05  8:16 ` [PATCH v2 5/5] KVM: nVMX: Wake halted L2 on nested posted-interrupt Liran Alon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171206204552.GD3644@flask \
    --to=rkrcmar@redhat.com \
    --cc=idan.brown@oracle.com \
    --cc=jmattson@google.com \
    --cc=konrad.wilk@oracle.com \
    --cc=krish.sadhukhan@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=liran.alon@oracle.com \
    --cc=pbonzini@redhat.com \
    --cc=wanpeng.li@hotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).