* [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv
@ 2024-10-23 12:45 Yong He
2024-10-30 18:31 ` Sean Christopherson
0 siblings, 1 reply; 5+ messages in thread
From: Yong He @ 2024-10-23 12:45 UTC (permalink / raw)
To: pbonzini, seanjc, kvm; +Cc: wanpengli, alexyonghe
From: Yong He <alexyonghe@tencent.com>
Try to enable irr_pending when set APIC state, if there is
pending interrupt in IRR with disabled APICv.
In save/restore VM scenery with disabled APICv. Qemu/CloudHypervisor
always send signals to stop running vcpu threads, then save
entire VM state, including APIC state. There may be a pending
timer interrupt in the saved APIC IRR that is injected before
vcpu_run return. But when restoring the VM, since APICv is
disabled, irr_pending is disabled by default, so this may cause
the timer interrupt in the IRR to be suspended for a long time,
until the next interrupt comes.
Signed-off-by: Yong He <alexyonghe@tencent.com>
---
arch/x86/kvm/lapic.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 2098dc689088..7373f649958b 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -3099,6 +3099,10 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
apic_find_highest_irr(apic));
kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
}
+
+ /* Search the IRR and enable irr_pending state with disabled APICv*/
+ if (!enable_apicv && apic_search_irr(apic) != -1)
+ apic->irr_pending = true;
kvm_make_request(KVM_REQ_EVENT, vcpu);
if (ioapic_in_kernel(vcpu->kvm))
kvm_rtc_eoi_tracking_restore_one(vcpu);
--
2.43.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv
2024-10-23 12:45 [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv Yong He
@ 2024-10-30 18:31 ` Sean Christopherson
2024-10-31 3:39 ` zhuangel570
0 siblings, 1 reply; 5+ messages in thread
From: Sean Christopherson @ 2024-10-30 18:31 UTC (permalink / raw)
To: Yong He; +Cc: pbonzini, kvm, wanpengli, alexyonghe
On Wed, Oct 23, 2024, Yong He wrote:
> From: Yong He <alexyonghe@tencent.com>
>
> Try to enable irr_pending when set APIC state, if there is
> pending interrupt in IRR with disabled APICv.
>
> In save/restore VM scenery with disabled APICv. Qemu/CloudHypervisor
> always send signals to stop running vcpu threads, then save
> entire VM state, including APIC state. There may be a pending
> timer interrupt in the saved APIC IRR that is injected before
> vcpu_run return. But when restoring the VM, since APICv is
> disabled, irr_pending is disabled by default, so this may cause
> the timer interrupt in the IRR to be suspended for a long time,
> until the next interrupt comes.
>
> Signed-off-by: Yong He <alexyonghe@tencent.com>
> ---
> arch/x86/kvm/lapic.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 2098dc689088..7373f649958b 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -3099,6 +3099,10 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> apic_find_highest_irr(apic));
> kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
> }
> +
> + /* Search the IRR and enable irr_pending state with disabled APICv*/
> + if (!enable_apicv && apic_search_irr(apic) != -1)
This can/should be an "else" from the above "if (apic->apicv_active)". I also
think KVM can safely clear irr_pending in this case, which is also why irr_pending
isn't handling in kvm_apic_update_apicv(). When APICv is disabled (inhibited) at
runtime, an IRQ may be in-flight, i.e. apic_search_irr() can get a false negative.
But when stuffing APIC state, I don't see how that can happen. So this?
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 65412640cfc7..deb73aea2c06 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -3086,6 +3086,15 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
kvm_x86_call(hwapic_irr_update)(vcpu,
apic_find_highest_irr(apic));
kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
+ } else {
+ /*
+ * Note, kvm_apic_update_apicv() is responsible for updating
+ * isr_count and highest_isr_cache. irr_pending is somewhat
+ * special because it mustn't be cleared when APICv is disabled
+ * at runtime, and only state restore can cause an IRR bit to
+ * be set without also refreshing irr_pending.
+ */
+ apic->irr_pending = apic_search_irr(apic) != -1;
}
kvm_make_request(KVM_REQ_EVENT, vcpu);
if (ioapic_in_kernel(vcpu->kvm))
> + apic->irr_pending = true;
> kvm_make_request(KVM_REQ_EVENT, vcpu);
> if (ioapic_in_kernel(vcpu->kvm))
> kvm_rtc_eoi_tracking_restore_one(vcpu);
> --
> 2.43.5
>
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv
2024-10-30 18:31 ` Sean Christopherson
@ 2024-10-31 3:39 ` zhuangel570
2024-10-31 15:16 ` Sean Christopherson
0 siblings, 1 reply; 5+ messages in thread
From: zhuangel570 @ 2024-10-31 3:39 UTC (permalink / raw)
To: Sean Christopherson; +Cc: pbonzini, kvm, wanpengli, alexyonghe
On Thu, Oct 31, 2024 at 2:31 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Oct 23, 2024, Yong He wrote:
> > From: Yong He <alexyonghe@tencent.com>
> >
> > Try to enable irr_pending when set APIC state, if there is
> > pending interrupt in IRR with disabled APICv.
> >
> > In save/restore VM scenery with disabled APICv. Qemu/CloudHypervisor
> > always send signals to stop running vcpu threads, then save
> > entire VM state, including APIC state. There may be a pending
> > timer interrupt in the saved APIC IRR that is injected before
> > vcpu_run return. But when restoring the VM, since APICv is
> > disabled, irr_pending is disabled by default, so this may cause
> > the timer interrupt in the IRR to be suspended for a long time,
> > until the next interrupt comes.
> >
> > Signed-off-by: Yong He <alexyonghe@tencent.com>
> > ---
> > arch/x86/kvm/lapic.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 2098dc689088..7373f649958b 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -3099,6 +3099,10 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> > apic_find_highest_irr(apic));
> > kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
> > }
> > +
> > + /* Search the IRR and enable irr_pending state with disabled APICv*/
> > + if (!enable_apicv && apic_search_irr(apic) != -1)
>
> This can/should be an "else" from the above "if (apic->apicv_active)". I also
> think KVM can safely clear irr_pending in this case, which is also why irr_pending
> isn't handling in kvm_apic_update_apicv(). When APICv is disabled (inhibited) at
> runtime, an IRQ may be in-flight, i.e. apic_search_irr() can get a false negative.
Thank you for your review and suggestions.
>
> But when stuffing APIC state, I don't see how that can happen. So this?
Here is our case.
APICv is disabled by set enable_apicv to 0. Create VM snapshot, then
start/restore
new VM base the snapshot. We occasionally encountered issues with VMs hanging
for long periods of time after restore. Investigation show that there
is a timer IRQ
pending in IRR, but the newly restored VM could not detect it, because
irr_pending
is not set when restoring the APIC state by kvm_apic_set_state().
Further investigation show when creating VM snapshot, VMM pause VCPUs by signal,
an in-flight timer pending in IRR, and the tscdeadline is 0 in saved
APIC state. All these
contexts in saved APIC state prove that kvm_inject_pending_timer_irqs
had just injected
a timer (will also set the tscdeadline to 0) before the VCPU handle the signal.
Maybe this patch is a fix for 755c2bf87860 (KVM: x86: lapic: don't
touch irr_pending in
kvm_apic_update_apicv when inhibiting it), the irr_pending enable
check is missed in
kvm_apic_set_state() after that.
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 65412640cfc7..deb73aea2c06 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -3086,6 +3086,15 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> kvm_x86_call(hwapic_irr_update)(vcpu,
> apic_find_highest_irr(apic));
> kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
> + } else {
> + /*
> + * Note, kvm_apic_update_apicv() is responsible for updating
> + * isr_count and highest_isr_cache. irr_pending is somewhat
> + * special because it mustn't be cleared when APICv is disabled
> + * at runtime, and only state restore can cause an IRR bit to
> + * be set without also refreshing irr_pending.
> + */
> + apic->irr_pending = apic_search_irr(apic) != -1;
> }
> kvm_make_request(KVM_REQ_EVENT, vcpu);
> if (ioapic_in_kernel(vcpu->kvm))
>
> > + apic->irr_pending = true;
> > kvm_make_request(KVM_REQ_EVENT, vcpu);
> > if (ioapic_in_kernel(vcpu->kvm))
> > kvm_rtc_eoi_tracking_restore_one(vcpu);
> > --
> > 2.43.5
> >
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv
2024-10-31 3:39 ` zhuangel570
@ 2024-10-31 15:16 ` Sean Christopherson
2024-11-01 7:42 ` zhuangel570
0 siblings, 1 reply; 5+ messages in thread
From: Sean Christopherson @ 2024-10-31 15:16 UTC (permalink / raw)
To: zhuangel570; +Cc: pbonzini, kvm, wanpengli, alexyonghe
On Thu, Oct 31, 2024, zhuangel570 wrote:
> On Thu, Oct 31, 2024 at 2:31 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Oct 23, 2024, Yong He wrote:
> > > From: Yong He <alexyonghe@tencent.com>
> > >
> > > Try to enable irr_pending when set APIC state, if there is
> > > pending interrupt in IRR with disabled APICv.
> > >
> > > In save/restore VM scenery with disabled APICv. Qemu/CloudHypervisor
> > > always send signals to stop running vcpu threads, then save
> > > entire VM state, including APIC state. There may be a pending
> > > timer interrupt in the saved APIC IRR that is injected before
> > > vcpu_run return. But when restoring the VM, since APICv is
> > > disabled, irr_pending is disabled by default, so this may cause
> > > the timer interrupt in the IRR to be suspended for a long time,
> > > until the next interrupt comes.
> > >
> > > Signed-off-by: Yong He <alexyonghe@tencent.com>
> > > ---
> > > arch/x86/kvm/lapic.c | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > index 2098dc689088..7373f649958b 100644
> > > --- a/arch/x86/kvm/lapic.c
> > > +++ b/arch/x86/kvm/lapic.c
> > > @@ -3099,6 +3099,10 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> > > apic_find_highest_irr(apic));
> > > kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
> > > }
> > > +
> > > + /* Search the IRR and enable irr_pending state with disabled APICv*/
> > > + if (!enable_apicv && apic_search_irr(apic) != -1)
> >
> > This can/should be an "else" from the above "if (apic->apicv_active)". I also
> > think KVM can safely clear irr_pending in this case, which is also why irr_pending
> > isn't handling in kvm_apic_update_apicv(). When APICv is disabled (inhibited) at
> > runtime, an IRQ may be in-flight, i.e. apic_search_irr() can get a false negative.
>
> Thank you for your review and suggestions.
>
> >
> > But when stuffing APIC state, I don't see how that can happen. So this?
>
> Here is our case.
Sorry for not being clear. I wasn't saying that the bug you encountered can't
happen. I 100% agree it's a real bug. What I was saying can't happen is an
in-flight virtual IRQ from another vCPU or device racing with setting APIC state,
and the VM (or VMM) having any expectation that everything would work correctly.
And so, I think it's save for KVM to explicitly set irr_pending, i.e. to potentially
_clear_ irr_pending.
If you are able to verify the psuedo-patch I posted fixes the bug you encountered,
that is how I would like to fix the bug.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv
2024-10-31 15:16 ` Sean Christopherson
@ 2024-11-01 7:42 ` zhuangel570
0 siblings, 0 replies; 5+ messages in thread
From: zhuangel570 @ 2024-11-01 7:42 UTC (permalink / raw)
To: Sean Christopherson; +Cc: pbonzini, kvm, wanpengli, alexyonghe
On Thu, Oct 31, 2024 at 11:16 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Oct 31, 2024, zhuangel570 wrote:
> > On Thu, Oct 31, 2024 at 2:31 AM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Wed, Oct 23, 2024, Yong He wrote:
> > > > From: Yong He <alexyonghe@tencent.com>
> > > >
> > > > Try to enable irr_pending when set APIC state, if there is
> > > > pending interrupt in IRR with disabled APICv.
> > > >
> > > > In save/restore VM scenery with disabled APICv. Qemu/CloudHypervisor
> > > > always send signals to stop running vcpu threads, then save
> > > > entire VM state, including APIC state. There may be a pending
> > > > timer interrupt in the saved APIC IRR that is injected before
> > > > vcpu_run return. But when restoring the VM, since APICv is
> > > > disabled, irr_pending is disabled by default, so this may cause
> > > > the timer interrupt in the IRR to be suspended for a long time,
> > > > until the next interrupt comes.
> > > >
> > > > Signed-off-by: Yong He <alexyonghe@tencent.com>
> > > > ---
> > > > arch/x86/kvm/lapic.c | 4 ++++
> > > > 1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > > index 2098dc689088..7373f649958b 100644
> > > > --- a/arch/x86/kvm/lapic.c
> > > > +++ b/arch/x86/kvm/lapic.c
> > > > @@ -3099,6 +3099,10 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> > > > apic_find_highest_irr(apic));
> > > > kvm_x86_call(hwapic_isr_update)(apic_find_highest_isr(apic));
> > > > }
> > > > +
> > > > + /* Search the IRR and enable irr_pending state with disabled APICv*/
> > > > + if (!enable_apicv && apic_search_irr(apic) != -1)
> > >
> > > This can/should be an "else" from the above "if (apic->apicv_active)". I also
> > > think KVM can safely clear irr_pending in this case, which is also why irr_pending
> > > isn't handling in kvm_apic_update_apicv(). When APICv is disabled (inhibited) at
> > > runtime, an IRQ may be in-flight, i.e. apic_search_irr() can get a false negative.
> >
> > Thank you for your review and suggestions.
> >
> > >
> > > But when stuffing APIC state, I don't see how that can happen. So this?
> >
> > Here is our case.
>
> Sorry for not being clear. I wasn't saying that the bug you encountered can't
> happen. I 100% agree it's a real bug. What I was saying can't happen is an
> in-flight virtual IRQ from another vCPU or device racing with setting APIC state,
> and the VM (or VMM) having any expectation that everything would work correctly.
>
> And so, I think it's save for KVM to explicitly set irr_pending, i.e. to potentially
> _clear_ irr_pending.
Got it, thanks.
>
> If you are able to verify the psuedo-patch I posted fixes the bug you encountered,
> that is how I would like to fix the bug.
Yes, your patch may fix the issue better, and I have checked on my test machine
and it is fixed.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-11-01 7:43 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-23 12:45 [PATCH] KVM: x86: Try to enable irr_pending state with disabled APICv Yong He
2024-10-30 18:31 ` Sean Christopherson
2024-10-31 3:39 ` zhuangel570
2024-10-31 15:16 ` Sean Christopherson
2024-11-01 7:42 ` zhuangel570
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).