All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Maxim Levitsky <mlevitsk@redhat.com>
Cc: kvm@vger.kernel.org, Jim Mattson <jmattson@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Joerg Roedel <joro@8bytes.org>,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
	<x86@kernel.org>, Vitaly Kuznetsov <vkuznets@redhat.com>,
	Borislav Petkov <bp@alien8.de>,
	linux-kernel@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH v2 3/5] KVM: SVM: fix race between interrupt delivery and AVIC inhibition
Date: Tue, 4 Jan 2022 22:52:45 +0000	[thread overview]
Message-ID: <YdTPvdY6ysjXMpAU@google.com> (raw)
In-Reply-To: <20211213104634.199141-4-mlevitsk@redhat.com>

On Mon, Dec 13, 2021, Maxim Levitsky wrote:
> If svm_deliver_avic_intr is called just after the target vcpu's AVIC got
> inhibited, it might read a stale value of vcpu->arch.apicv_active
> which can lead to the target vCPU not noticing the interrupt.
> 
> To fix this use load-acquire/store-release so that, if the target vCPU
> is IN_GUEST_MODE, we're guaranteed to see a previous disabling of the
> AVIC.  If AVIC has been disabled in the meanwhile, proceed with the
> KVM_REQ_EVENT-based delivery.
> 
> All this complicated logic is actually exactly how we can handle an
> incomplete IPI vmexit; the only difference lies in who sets IRR, whether
> KVM or the processor.
> 
> Also incomplete IPI vmexit, has the same races as svm_deliver_avic_intr.
> therefore just reuse the avic_kick_target_vcpu for it as well.
> 
> Reported-by: Maxim Levitsky <mlevitsk@redhat.com>

Heh, probably don't need a Reported-by for a patch you wrote :-)

> Co-developed-with: Paolo Bonzini <pbonzini@redhat.com>

Co-developed-by: is preferred, and should be accompanied by Paolo's SoB.

> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/svm/avic.c | 85 +++++++++++++++++++++++++----------------
>  arch/x86/kvm/x86.c      |  4 +-
>  2 files changed, 55 insertions(+), 34 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 90364d02f22aa..34f62da2fbadd 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -289,6 +289,47 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> +static void avic_kick_target_vcpu(struct kvm_vcpu *vcpu)
> +{
> +	bool in_guest_mode;
> +
> +	/*
> +	 * vcpu->arch.apicv_active is read after vcpu->mode.  Pairs

This should say "must be read", not "is read".  It's obvious from the code that
apicv_active is read second, the comment is there to say that it _must_ be read
after vcpu->mode.

> +	 * with smp_store_release in vcpu_enter_guest.
> +	 */
> +	in_guest_mode = (smp_load_acquire(&vcpu->mode) == IN_GUEST_MODE);

IMO, it's marginally clear to initialize the bool.

	bool in_guest_mode = (smp_load_acquire(&vcpu->mode) == IN_GUEST_MODE);

> +	if (READ_ONCE(vcpu->arch.apicv_active)) {
> +		if (in_guest_mode) {
> +			/*
> +			 * Signal the doorbell to tell hardware to inject the IRQ if the vCPU
> +			 * is in the guest.  If the vCPU is not in the guest, hardware will
> +			 * automatically process AVIC interrupts at VMRUN.

Might as well wrap these comments at 80 chars since they're being moved.  Or
maybe even better....

	/* blah blah blah */
	if (!READ_ONCE(vcpu->arch.apicv_active)) {
		kvm_make_request(KVM_REQ_EVENT, vcpu);
		kvm_vcpu_kick(vcpu);
		return;
	}

	if (in_guest_mode) {
		...
	} else {
		....
	}

...so that the existing comments can be preserved as is.

> +			 *
> +			 * Note, the vCPU could get migrated to a different pCPU at any
> +			 * point, which could result in signalling the wrong/previous
> +			 * pCPU.  But if that happens the vCPU is guaranteed to do a
> +			 * VMRUN (after being migrated) and thus will process pending
> +			 * interrupts, i.e. a doorbell is not needed (and the spurious
> +			 * one is harmless).
> +			 */
> +			int cpu = READ_ONCE(vcpu->cpu);
> +			if (cpu != get_cpu())
> +				wrmsrl(SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
> +			put_cpu();
> +		} else {
> +			/*
> +			 * Wake the vCPU if it was blocking.  KVM will then detect the
> +			 * pending IRQ when checking if the vCPU has a wake event.
> +			 */
> +			kvm_vcpu_wake_up(vcpu);
> +		}
> +	} else {
> +		/* Compare this case with __apic_accept_irq.  */

Honestly, this comment isn't very helpful.  It only takes a few lines to say:

		/*
		 * Manually signal the event, the __apic_accept_irq() fallback
		 * path can't be used if AVIC is disabled after the vector is
		 * already queued in the vIRR.
		 */

(incorporating more feedback below)

> +		kvm_make_request(KVM_REQ_EVENT, vcpu);
> +		kvm_vcpu_kick(vcpu);
> +	}
> +}
> +
>  static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
>  				   u32 icrl, u32 icrh)
>  {
> @@ -304,8 +345,10 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>  		if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
>  					GET_APIC_DEST_FIELD(icrh),
> -					icrl & APIC_DEST_MASK))
> -			kvm_vcpu_wake_up(vcpu);
> +					icrl & APIC_DEST_MASK)) {
> +			vcpu->arch.apic->irr_pending = true;
> +			avic_kick_target_vcpu(vcpu);
> +		}
>  	}
>  }
>  
> @@ -671,9 +714,12 @@ void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
>  
>  int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
>  {
> -	if (!vcpu->arch.apicv_active)
> -		return -1;
> -
> +	/*
> +	 * Below, we have to handle anyway the case of AVIC being disabled
> +	 * in the middle of this function, and there is hardly any overhead
> +	 * if AVIC is disabled.  So, we do not bother returning -1 and handle
> +	 * the kick ourselves for disabled APICv.

Hmm, my preference would be to keep the "return -1" even though apicv_active must
be rechecked.  That would help highlight that returning "failure" after this point
is not an option as it would result in kvm_lapic_set_irr() being called twice.

> +	 */
>  	kvm_lapic_set_irr(vec, vcpu->arch.apic);
>  
>  	/*
> @@ -684,34 +730,7 @@ int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
>  	 * the doorbell if the vCPU is already running in the guest.
>  	 */
>  	smp_mb__after_atomic();
> -
> -	/*
> -	 * Signal the doorbell to tell hardware to inject the IRQ if the vCPU
> -	 * is in the guest.  If the vCPU is not in the guest, hardware will
> -	 * automatically process AVIC interrupts at VMRUN.
> -	 */
> -	if (vcpu->mode == IN_GUEST_MODE) {
> -		int cpu = READ_ONCE(vcpu->cpu);
> -
> -		/*
> -		 * Note, the vCPU could get migrated to a different pCPU at any
> -		 * point, which could result in signalling the wrong/previous
> -		 * pCPU.  But if that happens the vCPU is guaranteed to do a
> -		 * VMRUN (after being migrated) and thus will process pending
> -		 * interrupts, i.e. a doorbell is not needed (and the spurious
> -		 * one is harmless).
> -		 */
> -		if (cpu != get_cpu())
> -			wrmsrl(SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
> -		put_cpu();
> -	} else {
> -		/*
> -		 * Wake the vCPU if it was blocking.  KVM will then detect the
> -		 * pending IRQ when checking if the vCPU has a wake event.
> -		 */
> -		kvm_vcpu_wake_up(vcpu);
> -	}
> -
> +	avic_kick_target_vcpu(vcpu);
>  	return 0;
>  }
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 85127b3e3690b..81a74d86ee5eb 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9869,7 +9869,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  	 * result in virtual interrupt delivery.
>  	 */
>  	local_irq_disable();
> -	vcpu->mode = IN_GUEST_MODE;
> +
> +	/* Store vcpu->apicv_active before vcpu->mode.  */
> +	smp_store_release(&vcpu->mode, IN_GUEST_MODE);
>  
>  	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
>  
> -- 
> 2.26.3
> 

  reply	other threads:[~2022-01-04 22:52 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-13 10:46 [PATCH v2 0/5] RFC: KVM: SVM: Allow L1's AVIC to co-exist with nesting Maxim Levitsky
2021-12-13 10:46 ` [PATCH v2 1/5] KVM: nSVM: deal with L1 hypervisor that intercepts interrupts but lets L2 control EFLAGS.IF Maxim Levitsky
2021-12-13 11:34   ` Paolo Bonzini
2021-12-13 13:07     ` Maxim Levitsky
2021-12-13 13:15       ` Paolo Bonzini
2021-12-13 13:29         ` Maxim Levitsky
2021-12-13 10:46 ` [PATCH v2 2/5] KVM: SVM: allow to force AVIC to be enabled Maxim Levitsky
2022-01-04 22:25   ` Sean Christopherson
2022-01-05 10:54     ` Maxim Levitsky
2022-01-05 17:46       ` Sean Christopherson
2021-12-13 10:46 ` [PATCH v2 3/5] KVM: SVM: fix race between interrupt delivery and AVIC inhibition Maxim Levitsky
2022-01-04 22:52   ` Sean Christopherson [this message]
2022-01-05 11:03     ` Maxim Levitsky
2022-01-05 11:54       ` Paolo Bonzini
2022-01-05 12:15         ` Maxim Levitsky
2022-01-07 15:32       ` Paolo Bonzini
2022-01-07 23:42         ` Sean Christopherson
2022-01-10 15:10           ` Paolo Bonzini
2021-12-13 10:46 ` [PATCH v2 4/5] KVM: x86: don't touch irr_pending in kvm_apic_update_apicv when inhibiting it Maxim Levitsky
2022-01-04 22:57   ` Sean Christopherson
2022-01-05 10:56     ` Maxim Levitsky
2021-12-13 10:46 ` [PATCH v2 5/5] KVM: SVM: allow AVIC to co-exist with a nested guest running Maxim Levitsky
2022-01-05 21:56   ` Sean Christopherson
2022-01-06  8:44     ` Maxim Levitsky
2022-01-06 17:41       ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YdTPvdY6ysjXMpAU@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.