kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv
@ 2024-10-23 12:45 Yong He
  2024-10-30 18:31 ` Sean Christopherson
  0 siblings, 1 reply; 5+ messages in thread
From: Yong He @ 2024-10-23 12:45 UTC (permalink / raw)
  To: pbonzini, seanjc, kvm; +Cc: wanpengli, alexyonghe

From: Yong He <alexyonghe@tencent.com>

Try to enable irr_pending when set APIC state, if there is
pending interrupt in IRR with disabled APICv.

In save/restore VM scenery with disabled APICv. Qemu/CloudHypervisor
always send signals to stop running vcpu threads, then save
entire VM state, including APIC state. There may be a pending
timer interrupt in the saved APIC IRR that is injected before
vcpu_run return. But when restoring the VM, since APICv is
disabled, irr_pending is disabled by default, so this may cause
the timer interrupt in the IRR to be suspended for a long time,
until the next interrupt comes.

Signed-off-by: Yong He <alexyonghe@tencent.com>
---
 arch/x86/kvm/lapic.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 2098dc689088..7373f649958b 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -3099,6 +3099,10 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
 						apic_find_highest_irr(apic));
 		kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
 	}
+
+	/* Search the IRR and enable irr_pending state with disabled APICv*/
+	if (!enable_apicv && apic_search_irr(apic) != -1)
+		apic->irr_pending = true;
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
 	if (ioapic_in_kernel(vcpu->kvm))
 		kvm_rtc_eoi_tracking_restore_one(vcpu);
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv
  2024-10-23 12:45 [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv Yong He
@ 2024-10-30 18:31 ` Sean Christopherson
  2024-10-31  3:39   ` zhuangel570
  0 siblings, 1 reply; 5+ messages in thread
From: Sean Christopherson @ 2024-10-30 18:31 UTC (permalink / raw)
  To: Yong He; +Cc: pbonzini, kvm, wanpengli, alexyonghe

On Wed, Oct 23, 2024, Yong He wrote:
> From: Yong He <alexyonghe@tencent.com>
> 
> Try to enable irr_pending when set APIC state, if there is
> pending interrupt in IRR with disabled APICv.
> 
> In save/restore VM scenery with disabled APICv. Qemu/CloudHypervisor
> always send signals to stop running vcpu threads, then save
> entire VM state, including APIC state. There may be a pending
> timer interrupt in the saved APIC IRR that is injected before
> vcpu_run return. But when restoring the VM, since APICv is
> disabled, irr_pending is disabled by default, so this may cause
> the timer interrupt in the IRR to be suspended for a long time,
> until the next interrupt comes.
> 
> Signed-off-by: Yong He <alexyonghe@tencent.com>
> ---
>  arch/x86/kvm/lapic.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 2098dc689088..7373f649958b 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -3099,6 +3099,10 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
>  						apic_find_highest_irr(apic));
>  		kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
>  	}
> +
> +	/* Search the IRR and enable irr_pending state with disabled APICv*/
> +	if (!enable_apicv && apic_search_irr(apic) != -1)

This can/should be an "else" from the above "if (apic->apicv_active)".  I also
think KVM can safely clear irr_pending in this case, which is also why irr_pending
isn't handling in kvm_apic_update_apicv().  When APICv is disabled (inhibited) at
runtime, an IRQ may be in-flight, i.e. apic_search_irr() can get a false negative.

But when stuffing APIC state, I don't see how that can happen.  So this?

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 65412640cfc7..deb73aea2c06 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -3086,6 +3086,15 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
                kvm_x86_call(hwapic_irr_update)(vcpu,
                                                apic_find_highest_irr(apic));
                kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
+       } else {
+               /*
+                * Note, kvm_apic_update_apicv() is responsible for updating
+                * isr_count and highest_isr_cache.  irr_pending is somewhat
+                * special because it mustn't be cleared when APICv is disabled
+                * at runtime, and only state restore can cause an IRR bit to
+                * be set without also refreshing irr_pending.
+                */
+               apic->irr_pending = apic_search_irr(apic) != -1;
        }
        kvm_make_request(KVM_REQ_EVENT, vcpu);
        if (ioapic_in_kernel(vcpu->kvm))

> +		apic->irr_pending = true;
>  	kvm_make_request(KVM_REQ_EVENT, vcpu);
>  	if (ioapic_in_kernel(vcpu->kvm))
>  		kvm_rtc_eoi_tracking_restore_one(vcpu);
> -- 
> 2.43.5
> 

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv
  2024-10-30 18:31 ` Sean Christopherson
@ 2024-10-31  3:39   ` zhuangel570
  2024-10-31 15:16     ` Sean Christopherson
  0 siblings, 1 reply; 5+ messages in thread
From: zhuangel570 @ 2024-10-31  3:39 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: pbonzini, kvm, wanpengli, alexyonghe

On Thu, Oct 31, 2024 at 2:31 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Oct 23, 2024, Yong He wrote:
> > From: Yong He <alexyonghe@tencent.com>
> >
> > Try to enable irr_pending when set APIC state, if there is
> > pending interrupt in IRR with disabled APICv.
> >
> > In save/restore VM scenery with disabled APICv. Qemu/CloudHypervisor
> > always send signals to stop running vcpu threads, then save
> > entire VM state, including APIC state. There may be a pending
> > timer interrupt in the saved APIC IRR that is injected before
> > vcpu_run return. But when restoring the VM, since APICv is
> > disabled, irr_pending is disabled by default, so this may cause
> > the timer interrupt in the IRR to be suspended for a long time,
> > until the next interrupt comes.
> >
> > Signed-off-by: Yong He <alexyonghe@tencent.com>
> > ---
> >  arch/x86/kvm/lapic.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 2098dc689088..7373f649958b 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -3099,6 +3099,10 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> >                                               apic_find_highest_irr(apic));
> >               kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
> >       }
> > +
> > +     /* Search the IRR and enable irr_pending state with disabled APICv*/
> > +     if (!enable_apicv && apic_search_irr(apic) != -1)
>
> This can/should be an "else" from the above "if (apic->apicv_active)".  I also
> think KVM can safely clear irr_pending in this case, which is also why irr_pending
> isn't handling in kvm_apic_update_apicv().  When APICv is disabled (inhibited) at
> runtime, an IRQ may be in-flight, i.e. apic_search_irr() can get a false negative.

Thank you for your review and suggestions.

>
> But when stuffing APIC state, I don't see how that can happen.  So this?

Here is our case.

APICv is disabled by set enable_apicv to 0. Create VM snapshot, then
start/restore
new VM base the snapshot. We occasionally encountered issues with VMs hanging
for long periods of time after restore. Investigation show that there
is a timer IRQ
pending in IRR, but the newly restored VM could not detect it, because
irr_pending
is not set when restoring the APIC state by kvm_apic_set_state().

Further investigation show when creating VM snapshot, VMM pause VCPUs by signal,
an in-flight timer pending in IRR, and the tscdeadline is 0 in saved
APIC state. All these
contexts in saved APIC state prove that kvm_inject_pending_timer_irqs
had just injected
a timer (will also set the tscdeadline to 0) before the VCPU handle the signal.

Maybe this patch is a fix for 755c2bf87860 (KVM: x86: lapic: don't
touch irr_pending in
kvm_apic_update_apicv when inhibiting it), the irr_pending enable
check is missed in
kvm_apic_set_state() after that.

>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 65412640cfc7..deb73aea2c06 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -3086,6 +3086,15 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
>                 kvm_x86_call(hwapic_irr_update)(vcpu,
>                                                 apic_find_highest_irr(apic));
>                 kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
> +       } else {
> +               /*
> +                * Note, kvm_apic_update_apicv() is responsible for updating
> +                * isr_count and highest_isr_cache.  irr_pending is somewhat
> +                * special because it mustn't be cleared when APICv is disabled
> +                * at runtime, and only state restore can cause an IRR bit to
> +                * be set without also refreshing irr_pending.
> +                */
> +               apic->irr_pending = apic_search_irr(apic) != -1;
>         }
>         kvm_make_request(KVM_REQ_EVENT, vcpu);
>         if (ioapic_in_kernel(vcpu->kvm))
>
> > +             apic->irr_pending = true;
> >       kvm_make_request(KVM_REQ_EVENT, vcpu);
> >       if (ioapic_in_kernel(vcpu->kvm))
> >               kvm_rtc_eoi_tracking_restore_one(vcpu);
> > --
> > 2.43.5
> >

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv
  2024-10-31  3:39   ` zhuangel570
@ 2024-10-31 15:16     ` Sean Christopherson
  2024-11-01  7:42       ` zhuangel570
  0 siblings, 1 reply; 5+ messages in thread
From: Sean Christopherson @ 2024-10-31 15:16 UTC (permalink / raw)
  To: zhuangel570; +Cc: pbonzini, kvm, wanpengli, alexyonghe

On Thu, Oct 31, 2024, zhuangel570 wrote:
> On Thu, Oct 31, 2024 at 2:31 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Oct 23, 2024, Yong He wrote:
> > > From: Yong He <alexyonghe@tencent.com>
> > >
> > > Try to enable irr_pending when set APIC state, if there is
> > > pending interrupt in IRR with disabled APICv.
> > >
> > > In save/restore VM scenery with disabled APICv. Qemu/CloudHypervisor
> > > always send signals to stop running vcpu threads, then save
> > > entire VM state, including APIC state. There may be a pending
> > > timer interrupt in the saved APIC IRR that is injected before
> > > vcpu_run return. But when restoring the VM, since APICv is
> > > disabled, irr_pending is disabled by default, so this may cause
> > > the timer interrupt in the IRR to be suspended for a long time,
> > > until the next interrupt comes.
> > >
> > > Signed-off-by: Yong He <alexyonghe@tencent.com>
> > > ---
> > >  arch/x86/kvm/lapic.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > index 2098dc689088..7373f649958b 100644
> > > --- a/arch/x86/kvm/lapic.c
> > > +++ b/arch/x86/kvm/lapic.c
> > > @@ -3099,6 +3099,10 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> > >                                               apic_find_highest_irr(apic));
> > >               kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
> > >       }
> > > +
> > > +     /* Search the IRR and enable irr_pending state with disabled APICv*/
> > > +     if (!enable_apicv && apic_search_irr(apic) != -1)
> >
> > This can/should be an "else" from the above "if (apic->apicv_active)".  I also
> > think KVM can safely clear irr_pending in this case, which is also why irr_pending
> > isn't handling in kvm_apic_update_apicv().  When APICv is disabled (inhibited) at
> > runtime, an IRQ may be in-flight, i.e. apic_search_irr() can get a false negative.
> 
> Thank you for your review and suggestions.
> 
> >
> > But when stuffing APIC state, I don't see how that can happen.  So this?
> 
> Here is our case.

Sorry for not being clear.  I wasn't saying that the bug you encountered can't
happen.  I 100% agree it's a real bug.  What I was saying can't happen is an
in-flight virtual IRQ from another vCPU or device racing with setting APIC state,
and the VM (or VMM) having any expectation that everything would work correctly.

And so, I think it's save for KVM to explicitly set irr_pending, i.e. to potentially
_clear_ irr_pending.

If you are able to verify the psuedo-patch I posted fixes the bug you encountered,
that is how I would like to fix the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv
  2024-10-31 15:16     ` Sean Christopherson
@ 2024-11-01  7:42       ` zhuangel570
  0 siblings, 0 replies; 5+ messages in thread
From: zhuangel570 @ 2024-11-01  7:42 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: pbonzini, kvm, wanpengli, alexyonghe

On Thu, Oct 31, 2024 at 11:16 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Oct 31, 2024, zhuangel570 wrote:
> > On Thu, Oct 31, 2024 at 2:31 AM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Wed, Oct 23, 2024, Yong He wrote:
> > > > From: Yong He <alexyonghe@tencent.com>
> > > >
> > > > Try to enable irr_pending when set APIC state, if there is
> > > > pending interrupt in IRR with disabled APICv.
> > > >
> > > > In save/restore VM scenery with disabled APICv. Qemu/CloudHypervisor
> > > > always send signals to stop running vcpu threads, then save
> > > > entire VM state, including APIC state. There may be a pending
> > > > timer interrupt in the saved APIC IRR that is injected before
> > > > vcpu_run return. But when restoring the VM, since APICv is
> > > > disabled, irr_pending is disabled by default, so this may cause
> > > > the timer interrupt in the IRR to be suspended for a long time,
> > > > until the next interrupt comes.
> > > >
> > > > Signed-off-by: Yong He <alexyonghe@tencent.com>
> > > > ---
> > > >  arch/x86/kvm/lapic.c | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > > index 2098dc689088..7373f649958b 100644
> > > > --- a/arch/x86/kvm/lapic.c
> > > > +++ b/arch/x86/kvm/lapic.c
> > > > @@ -3099,6 +3099,10 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> > > >                                               apic_find_highest_irr(apic));
> > > >               kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
> > > >       }
> > > > +
> > > > +     /* Search the IRR and enable irr_pending state with disabled APICv*/
> > > > +     if (!enable_apicv && apic_search_irr(apic) != -1)
> > >
> > > This can/should be an "else" from the above "if (apic->apicv_active)".  I also
> > > think KVM can safely clear irr_pending in this case, which is also why irr_pending
> > > isn't handling in kvm_apic_update_apicv().  When APICv is disabled (inhibited) at
> > > runtime, an IRQ may be in-flight, i.e. apic_search_irr() can get a false negative.
> >
> > Thank you for your review and suggestions.
> >
> > >
> > > But when stuffing APIC state, I don't see how that can happen.  So this?
> >
> > Here is our case.
>
> Sorry for not being clear.  I wasn't saying that the bug you encountered can't
> happen.  I 100% agree it's a real bug.  What I was saying can't happen is an
> in-flight virtual IRQ from another vCPU or device racing with setting APIC state,
> and the VM (or VMM) having any expectation that everything would work correctly.
>
> And so, I think it's save for KVM to explicitly set irr_pending, i.e. to potentially
> _clear_ irr_pending.

Got it, thanks.

>
> If you are able to verify the psuedo-patch I posted fixes the bug you encountered,
> that is how I would like to fix the bug.

Yes, your patch may fix the issue better, and I have checked on my test machine
and it is fixed.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-11-01  7:43 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-23 12:45 [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv Yong He
2024-10-30 18:31 ` Sean Christopherson
2024-10-31  3:39   ` zhuangel570
2024-10-31 15:16     ` Sean Christopherson
2024-11-01  7:42       ` zhuangel570

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).