* [PATCH 6.1.y] KVM: x86: fire timer when it is migrated and expired, and in oneshot mode
@ 2024-08-20 5:32 David Hunter
2024-08-20 6:18 ` Reply: [External Mail] " Li,Rongqing
0 siblings, 1 reply; 8+ messages in thread
From: David Hunter @ 2024-08-20 5:32 UTC
To: stable
Cc: seanjc, pbonzini, dave.hansen, x86, hpa, kvm, linux-kernel,
javier.carrasco.cruz, shuah, David Hunter, Peter Shier,
Jim Mattson, Wanpeng Li, Li RongQing
From: Li RongQing <lirongqing@baidu.com>
[ Upstream Commit 8e6ed96cdd5001c55fccc80a17f651741c1ca7d2]
when the vCPU was migrated, if its timer is expired, KVM _should_ fire
the timer ASAP, zeroing the deadline here will cause the timer to
immediately fire on the destination
Cc: Sean Christopherson <seanjc@google.com>
Cc: Peter Shier <pshier@google.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Link: https://lore.kernel.org/r/20230106040625.8404-1-lirongqing@baidu.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
(cherry picked from commit 8e6ed96cdd5001c55fccc80a17f651741c1ca7d2)
The code was able to compile without errors or warnings.
Signed-off-by: David Hunter <david.hunter.linux@gmail.com>
---
arch/x86/kvm/lapic.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index c90fef0258c5..3cd590ace95a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1843,8 +1843,12 @@ static bool set_target_expiration(struct kvm_lapic *apic, u32 count_reg)
 	if (unlikely(count_reg != APIC_TMICT)) {
 		deadline = tmict_to_ns(apic,
 				     kvm_lapic_get_reg(apic, count_reg));
-		if (unlikely(deadline <= 0))
-			deadline = apic->lapic_timer.period;
+		if (unlikely(deadline <= 0)) {
+			if (apic_lvtt_period(apic))
+				deadline = apic->lapic_timer.period;
+			else
+				deadline = 0;
+		}
 		else if (unlikely(deadline > apic->lapic_timer.period)) {
 			pr_info_ratelimited(
 			    "kvm: vcpu %i: requested lapic timer restore with "
--
2.43.0
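
The branch changed by the hunk above can be restated as a stand-alone function. This is a minimal sketch, not the kernel code: `restore_deadline`, `remaining_ns`, `period_ns`, and `periodic` are hypothetical names chosen for illustration, and only the "already expired" branch is modeled (the `deadline > period` branch of the hunk is left out and the value is passed through).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the deadline chosen when a timer is restored
 * on the destination after migration.
 *
 * remaining_ns: time left on the timer (<= 0 means it already expired)
 * period_ns:    timer period
 * periodic:     true for periodic mode, false for oneshot mode
 */
static int64_t restore_deadline(int64_t remaining_ns, int64_t period_ns,
				bool periodic)
{
	assert(period_ns >= 0);

	if (remaining_ns <= 0) {
		/*
		 * Before the patch both modes restarted a full period.
		 * After the patch a oneshot timer gets a zero deadline,
		 * so it fires as soon as possible.
		 */
		return periodic ? period_ns : 0;
	}

	/* Not expired: keep the remaining time (other branches not modeled). */
	return remaining_ns;
}
```

Under these assumptions, an expired oneshot timer yields a deadline of 0, an expired periodic timer yields one full period, and a timer that has not expired keeps its remaining time.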
* Reply: [External Mail] [PATCH 6.1.y] KVM: x86: fire timer when it is migrated and expired, and in oneshot mode
2024-08-20 5:32 [PATCH 6.1.y] KVM: x86: fire timer when it is migrated and expired, and in oneshot mode David Hunter
@ 2024-08-20 6:18 ` Li,Rongqing
2024-08-20 14:07 ` Sean Christopherson
0 siblings, 1 reply; 8+ messages in thread
From: Li,Rongqing @ 2024-08-20 6:18 UTC
To: David Hunter, stable@vger.kernel.org
Cc: seanjc@google.com, pbonzini@redhat.com,
dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
javier.carrasco.cruz@gmail.com, shuah@kernel.org, Peter Shier,
Jim Mattson, Wanpeng Li
>
> From: Li RongQing <lirongqing@baidu.com>
>
> [ Upstream Commit 8e6ed96cdd5001c55fccc80a17f651741c1ca7d2]
>
> when the vCPU was migrated, if its timer is expired, KVM _should_ fire the
> timer ASAP, zeroing the deadline here will cause the timer to immediately fire
> on the destination
>
This patch increases the reproduction rate of the lost lapic timer interrupt issue, which is fixed by the following patch;
so I think this patch should not be merged into 6.1
commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2
Author: Haitao Shan <hshan@google.com>
Date: Tue Sep 12 16:55:45 2023 -0700
KVM: x86: Fix lapic timer interrupt lost after loading a snapshot.
When running android emulator (which is based on QEMU 2.12) on
certain Intel hosts with kernel version 6.3-rc1 or above, guest
will freeze after loading a snapshot. This is almost 100%
reproducible. By default, the android emulator will use snapshot
to speed up the next launching of the same android guest. So
this breaks the android emulator badly.
I tested QEMU 8.0.4 from Debian 12 with an Ubuntu 22.04 guest by
running command "loadvm" after "savevm". The same issue is
observed. At the same time, none of our AMD platforms is impacted.
More experiments show that loading the KVM module with
"enable_apicv=false" can workaround it.
The issue started to show up after commit 8e6ed96cdd50 ("KVM: x86:
fire timer when it is migrated and expired, and in oneshot mode").
However, as is pointed out by Sean Christopherson, it is introduced
by commit 967235d32032 ("KVM: vmx: clear pending interrupts on
KVM_SET_LAPIC"). commit 8e6ed96cdd50 ("KVM: x86: fire timer when
it is migrated and expired, and in oneshot mode") just makes it
easier to hit the issue.
Having both commits, the oneshot lapic timer gets fired immediately
inside the KVM_SET_LAPIC call when loading the snapshot. On Intel
platforms with APIC virtualization and posted interrupt processing,
this eventually leads to setting the corresponding PIR bit. However,
the whole PIR bits get cleared later in the same KVM_SET_LAPIC call
by apicv_post_state_restore. This leads to timer interrupt lost.
The fix is to move vmx_apicv_post_state_restore to the beginning of
the KVM_SET_LAPIC call and rename to vmx_apicv_pre_state_restore.
What vmx_apicv_post_state_restore does is actually clearing any
former apicv state and this behavior is more suitable to carry out
in the beginning.
Fixes: 967235d32032 ("KVM: vmx: clear pending interrupts on KVM_SET_LAPIC")
Cc: stable@vger.kernel.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Haitao Shan <hshan@google.com>
Link: https://lore.kernel.org/r/20230913000215.478387-1-hshan@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Peter Shier <pshier@google.com>
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Wanpeng Li <wanpengli@tencent.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> Link:
> https://lore.kernel.org/r/20230106040625.8404-1-lirongqing@baidu.com
> Signed-off-by: Sean Christopherson <seanjc@google.com>
>
> (cherry picked from commit 8e6ed96cdd5001c55fccc80a17f651741c1ca7d2)
> The code was able to compile without errors or warnings.
> Signed-off-by: David Hunter <david.hunter.linux@gmail.com>
> ---
> arch/x86/kvm/lapic.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index
> c90fef0258c5..3cd590ace95a 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1843,8 +1843,12 @@ static bool set_target_expiration(struct kvm_lapic
> *apic, u32 count_reg)
> if (unlikely(count_reg != APIC_TMICT)) {
> deadline = tmict_to_ns(apic,
> kvm_lapic_get_reg(apic, count_reg));
> - if (unlikely(deadline <= 0))
> - deadline = apic->lapic_timer.period;
> + if (unlikely(deadline <= 0)) {
> + if (apic_lvtt_period(apic))
> + deadline = apic->lapic_timer.period;
> + else
> + deadline = 0;
> + }
> else if (unlikely(deadline > apic->lapic_timer.period)) {
> pr_info_ratelimited(
> "kvm: vcpu %i: requested lapic timer restore with "
> --
> 2.43.0
* Re: Reply: [External Mail] [PATCH 6.1.y] KVM: x86: fire timer when it is migrated and expired, and in oneshot mode
2024-08-20 6:18 ` Reply: [External Mail] " Li,Rongqing
@ 2024-08-20 14:07 ` Sean Christopherson
2024-08-26 22:13 ` [PATCH 6.1.y 0/2 V2] KVM: x86: fire timer when it is migrated David Hunter
0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2024-08-20 14:07 UTC
To: Li Rongqing
Cc: David Hunter, stable@vger.kernel.org, pbonzini@redhat.com,
dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
javier.carrasco.cruz@gmail.com, shuah@kernel.org, Peter Shier,
Jim Mattson
On Tue, Aug 20, 2024, Li,Rongqing wrote:
> >
> > From: Li RongQing <lirongqing@baidu.com>
> >
> > [ Upstream Commit 8e6ed96cdd5001c55fccc80a17f651741c1ca7d2]
> >
> > when the vCPU was migrated, if its timer is expired, KVM _should_ fire the
> > timer ASAP, zeroing the deadline here will cause the timer to immediately fire
> > on the destination
> >
>
> This patch increased the reproduce ratio of lapic timer interrupt losing,
Yep, this caused a painful amount of fallout in our environment.
> which has been fixed by the following patch; so I think patch should not
> merge it into 6.1
David, can you prep a small series with both this patch and the fix below?
Thanks!
> commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2
> Author: Haitao Shan <hshan@google.com>
> Date: Tue Sep 12 16:55:45 2023 -0700
* [PATCH 6.1.y 0/2 V2] KVM: x86: fire timer when it is migrated
2024-08-20 14:07 ` Sean Christopherson
@ 2024-08-26 22:13 ` David Hunter
2024-08-26 22:13 ` [PATCH 6.1.y 1/2 V2] KVM: x86: fire timer when it is migrated and expired, and in oneshot mode David Hunter
2024-08-26 22:13 ` [PATCH 6.1.y 2/2 V2] KVM: x86: Fix lapic timer interrupt lost after loading a snapshot David Hunter
0 siblings, 2 replies; 8+ messages in thread
From: David Hunter @ 2024-08-26 22:13 UTC
To: seanjc
Cc: dave.hansen, david.hunter.linux, hpa, javier.carrasco.cruz,
jmattson, kvm, linux-kernel, lirongqing, pbonzini, pshier, shuah,
stable, x86
Hello,
I'm sending you two this first because this will be my first time
sending a patch series. Is this okay?
Backport for 6.1.y. These two commits should be backported together to
fix an issue that arises from commit 967235d320329e4a7a2bd1a36b04293063e985ae
-Subject: 'KVM: vmx: clear pending interrupts on KVM_SET_LAPIC'
[ Upstream Commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2 ]
Haitao Shan (1):
KVM: x86: Fix lapic timer interrupt lost after loading a snapshot.
[ Upstream Commit 8e6ed96cdd5001c55fccc80a17f651741c1ca7d2 ]
Li RongQing (1):
KVM: x86: fire timer when it is migrated and expired, and in oneshot
mode
---
V1 --> V2
- changed to a patch series to fix an issue with the patch
arch/x86/kvm/lapic.c | 8 ++++++--
arch/x86/kvm/vmx/vmx.c | 1 +
2 files changed, 7 insertions(+), 2 deletions(-)
--
2.43.0
* [PATCH 6.1.y 1/2 V2] KVM: x86: fire timer when it is migrated and expired, and in oneshot mode
2024-08-26 22:13 ` [PATCH 6.1.y 0/2 V2] KVM: x86: fire timer when it is migrated David Hunter
@ 2024-08-26 22:13 ` David Hunter
2024-08-26 22:13 ` [PATCH 6.1.y 2/2 V2] KVM: x86: Fix lapic timer interrupt lost after loading a snapshot David Hunter
1 sibling, 0 replies; 8+ messages in thread
From: David Hunter @ 2024-08-26 22:13 UTC
To: seanjc
Cc: dave.hansen, david.hunter.linux, hpa, javier.carrasco.cruz,
jmattson, kvm, linux-kernel, lirongqing, pbonzini, pshier, shuah,
stable, x86, Wanpeng Li
[ Upstream Commit 8e6ed96cdd5001c55fccc80a17f651741c1ca7d2 ]
From: Li RongQing <lirongqing@baidu.com>
when the vCPU was migrated, if its timer is expired, KVM _should_ fire
the timer ASAP, zeroing the deadline here will cause the timer to
immediately fire on the destination
Cc: Sean Christopherson <seanjc@google.com>
Cc: Peter Shier <pshier@google.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Link: https://lore.kernel.org/r/20230106040625.8404-1-lirongqing@baidu.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
(cherry picked from commit 8e6ed96cdd5001c55fccc80a17f651741c1ca7d2)
Signed-off-by: David Hunter <david.hunter.linux@gmail.com>
---
arch/x86/kvm/lapic.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index c90fef0258c5..3cd590ace95a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1843,8 +1843,12 @@ static bool set_target_expiration(struct kvm_lapic *apic, u32 count_reg)
 	if (unlikely(count_reg != APIC_TMICT)) {
 		deadline = tmict_to_ns(apic,
 				     kvm_lapic_get_reg(apic, count_reg));
-		if (unlikely(deadline <= 0))
-			deadline = apic->lapic_timer.period;
+		if (unlikely(deadline <= 0)) {
+			if (apic_lvtt_period(apic))
+				deadline = apic->lapic_timer.period;
+			else
+				deadline = 0;
+		}
 		else if (unlikely(deadline > apic->lapic_timer.period)) {
 			pr_info_ratelimited(
 			    "kvm: vcpu %i: requested lapic timer restore with "
--
2.43.0
* [PATCH 6.1.y 2/2 V2] KVM: x86: Fix lapic timer interrupt lost after loading a snapshot.
2024-08-26 22:13 ` [PATCH 6.1.y 0/2 V2] KVM: x86: fire timer when it is migrated David Hunter
2024-08-26 22:13 ` [PATCH 6.1.y 1/2 V2] KVM: x86: fire timer when it is migrated and expired, and in oneshot mode David Hunter
@ 2024-08-26 22:13 ` David Hunter
2024-08-27 13:10 ` Greg KH
1 sibling, 1 reply; 8+ messages in thread
From: David Hunter @ 2024-08-26 22:13 UTC
To: seanjc
Cc: dave.hansen, david.hunter.linux, hpa, javier.carrasco.cruz,
jmattson, kvm, linux-kernel, lirongqing, pbonzini, pshier, shuah,
stable, x86, Haitao Shan
[ Upstream Commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2]
From: Haitao Shan <hshan@google.com>
Date: Tue Sep 12 16:55:45 2023 -0700
When running android emulator (which is based on QEMU 2.12) on
certain Intel hosts with kernel version 6.3-rc1 or above, guest
will freeze after loading a snapshot. This is almost 100%
reproducible. By default, the android emulator will use snapshot
to speed up the next launching of the same android guest. So
this breaks the android emulator badly.
I tested QEMU 8.0.4 from Debian 12 with an Ubuntu 22.04 guest by
running command "loadvm" after "savevm". The same issue is
observed. At the same time, none of our AMD platforms is impacted.
More experiments show that loading the KVM module with
"enable_apicv=false" can workaround it.
The issue started to show up after commit 8e6ed96cdd50 ("KVM: x86:
fire timer when it is migrated and expired, and in oneshot mode").
However, as is pointed out by Sean Christopherson, it is introduced
by commit 967235d32032 ("KVM: vmx: clear pending interrupts on
KVM_SET_LAPIC"). commit 8e6ed96cdd50 ("KVM: x86: fire timer when
it is migrated and expired, and in oneshot mode") just makes it
easier to hit the issue.
Having both commits, the oneshot lapic timer gets fired immediately
inside the KVM_SET_LAPIC call when loading the snapshot. On Intel
platforms with APIC virtualization and posted interrupt processing,
this eventually leads to setting the corresponding PIR bit. However,
the whole PIR bits get cleared later in the same KVM_SET_LAPIC call
by apicv_post_state_restore. This leads to timer interrupt lost.
The fix is to move vmx_apicv_post_state_restore to the beginning of
the KVM_SET_LAPIC call and rename to vmx_apicv_pre_state_restore.
What vmx_apicv_post_state_restore does is actually clearing any
former apicv state and this behavior is more suitable to carry out
in the beginning.
Fixes: 967235d32032 ("KVM: vmx: clear pending interrupts on KVM_SET_LAPIC")
Cc: stable@vger.kernel.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Haitao Shan <hshan@google.com>
Link: https://lore.kernel.org/r/20230913000215.478387-1-hshan@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
(Cherry-Picked from commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2)
Signed-off-by: David Hunter <david.hunter.linux@gmail.com>
---
arch/x86/kvm/vmx/vmx.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 87abf4eebf8a..4040075bbd5a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8203,6 +8203,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
 	.apicv_pre_state_restore = vmx_apicv_pre_state_restore,
 	.check_apicv_inhibit_reasons = vmx_check_apicv_inhibit_reasons,
+	.required_apicv_inhibits = VMX_REQUIRED_APICV_INHIBITS,
 	.hwapic_irr_update = vmx_hwapic_irr_update,
 	.hwapic_isr_update = vmx_hwapic_isr_update,
 	.guest_apic_has_interrupt = vmx_guest_apic_has_interrupt,
--
2.43.0
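
The ordering problem described in the commit message above (the timer fires inside KVM_SET_LAPIC and sets a PIR bit, then the PIR is cleared later in the same call) can be illustrated with a toy model. This is only a sketch under stated assumptions: `struct toy_vcpu` and every `toy_*` function are hypothetical, a single 64-bit word stands in for the PIR, and none of this is the actual KVM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a vCPU: one word models the posted-interrupt request bits. */
struct toy_vcpu {
	uint64_t pir;
};

/* Models "clear any former APICv state". */
static void toy_clear_apicv_state(struct toy_vcpu *v)
{
	v->pir = 0;
}

/* Models an expired oneshot timer firing during state restore. */
static void toy_fire_timer(struct toy_vcpu *v, unsigned int vector)
{
	v->pir |= 1ULL << (vector & 63);
}

static bool toy_pending(const struct toy_vcpu *v, unsigned int vector)
{
	return (v->pir & (1ULL << (vector & 63))) != 0;
}

/* Problematic order: the clear runs after the timer has already fired. */
static bool toy_set_lapic_clear_last(struct toy_vcpu *v, unsigned int vector)
{
	toy_fire_timer(v, vector);
	toy_clear_apicv_state(v);	/* wipes the bit that was just set */
	return toy_pending(v, vector);
}

/* Order described as the fix: clear first, then restore and fire. */
static bool toy_set_lapic_clear_first(struct toy_vcpu *v, unsigned int vector)
{
	toy_clear_apicv_state(v);
	toy_fire_timer(v, vector);
	return toy_pending(v, vector);
}
```

In this model the timer interrupt is still pending only when the clear happens first, which mirrors the reasoning given for moving the hook to the beginning of the call.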
* Re: [PATCH 6.1.y 2/2 V2] KVM: x86: Fix lapic timer interrupt lost after loading a snapshot.
2024-08-26 22:13 ` [PATCH 6.1.y 2/2 V2] KVM: x86: Fix lapic timer interrupt lost after loading a snapshot David Hunter
@ 2024-08-27 13:10 ` Greg KH
2024-08-27 18:52 ` Sean Christopherson
0 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2024-08-27 13:10 UTC
To: David Hunter
Cc: seanjc, dave.hansen, hpa, javier.carrasco.cruz, jmattson, kvm,
linux-kernel, lirongqing, pbonzini, pshier, shuah, stable, x86,
Haitao Shan
On Mon, Aug 26, 2024 at 06:13:36PM -0400, David Hunter wrote:
>
> [ Upstream Commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2]
This is already in the 6.1.66 release, so do you want it applied again?
> From: Haitao Shan <hshan@google.com>
> Date: Tue Sep 12 16:55:45 2023 -0700
>
> When running android emulator (which is based on QEMU 2.12) on
> certain Intel hosts with kernel version 6.3-rc1 or above, guest
> will freeze after loading a snapshot. This is almost 100%
> reproducible. By default, the android emulator will use snapshot
> to speed up the next launching of the same android guest. So
> this breaks the android emulator badly.
>
> I tested QEMU 8.0.4 from Debian 12 with an Ubuntu 22.04 guest by
> running command "loadvm" after "savevm". The same issue is
> observed. At the same time, none of our AMD platforms is impacted.
> More experiments show that loading the KVM module with
> "enable_apicv=false" can workaround it.
>
> The issue started to show up after commit 8e6ed96cdd50 ("KVM: x86:
> fire timer when it is migrated and expired, and in oneshot mode").
> However, as is pointed out by Sean Christopherson, it is introduced
> by commit 967235d32032 ("KVM: vmx: clear pending interrupts on
> KVM_SET_LAPIC"). commit 8e6ed96cdd50 ("KVM: x86: fire timer when
> it is migrated and expired, and in oneshot mode") just makes it
> easier to hit the issue.
>
> Having both commits, the oneshot lapic timer gets fired immediately
> inside the KVM_SET_LAPIC call when loading the snapshot. On Intel
> platforms with APIC virtualization and posted interrupt processing,
> this eventually leads to setting the corresponding PIR bit. However,
> the whole PIR bits get cleared later in the same KVM_SET_LAPIC call
> by apicv_post_state_restore. This leads to timer interrupt lost.
>
> The fix is to move vmx_apicv_post_state_restore to the beginning of
> the KVM_SET_LAPIC call and rename to vmx_apicv_pre_state_restore.
> What vmx_apicv_post_state_restore does is actually clearing any
> former apicv state and this behavior is more suitable to carry out
> in the beginning.
>
> Fixes: 967235d32032 ("KVM: vmx: clear pending interrupts on KVM_SET_LAPIC")
> Cc: stable@vger.kernel.org
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Haitao Shan <hshan@google.com>
> Link: https://lore.kernel.org/r/20230913000215.478387-1-hshan@google.com
> Signed-off-by: Sean Christopherson <seanjc@google.com>
>
> (Cherry-Picked from commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2)
> Signed-off-by: David Hunter <david.hunter.linux@gmail.com>
> ---
> arch/x86/kvm/vmx/vmx.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 87abf4eebf8a..4040075bbd5a 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8203,6 +8203,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
> .load_eoi_exitmap = vmx_load_eoi_exitmap,
> .apicv_pre_state_restore = vmx_apicv_pre_state_restore,
> .check_apicv_inhibit_reasons = vmx_check_apicv_inhibit_reasons,
> + .required_apicv_inhibits = VMX_REQUIRED_APICV_INHIBITS,
> .hwapic_irr_update = vmx_hwapic_irr_update,
> .hwapic_isr_update = vmx_hwapic_isr_update,
> .guest_apic_has_interrupt = vmx_guest_apic_has_interrupt,
Wait, this is just one hunk? This feels wrong, you didn't say why you
modified this from the original commit, or backport, what was wrong with
that?
thanks,
greg k-h
* Re: [PATCH 6.1.y 2/2 V2] KVM: x86: Fix lapic timer interrupt lost after loading a snapshot.
2024-08-27 13:10 ` Greg KH
@ 2024-08-27 18:52 ` Sean Christopherson
0 siblings, 0 replies; 8+ messages in thread
From: Sean Christopherson @ 2024-08-27 18:52 UTC
To: Greg KH
Cc: David Hunter, dave.hansen, hpa, javier.carrasco.cruz, jmattson,
kvm, linux-kernel, lirongqing, pbonzini, pshier, shuah, stable,
x86, Haitao Shan
On Tue, Aug 27, 2024, Greg KH wrote:
> On Mon, Aug 26, 2024 at 06:13:36PM -0400, David Hunter wrote:
> >
> > [ Upstream Commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2]
>
> This is already in the 6.1.66 release, so do you want it applied again?
>
> > From: Haitao Shan <hshan@google.com>
> > Date: Tue Sep 12 16:55:45 2023 -0700
> >
> > When running android emulator (which is based on QEMU 2.12) on
> > certain Intel hosts with kernel version 6.3-rc1 or above, guest
> > will freeze after loading a snapshot. This is almost 100%
> > reproducible. By default, the android emulator will use snapshot
> > to speed up the next launching of the same android guest. So
> > this breaks the android emulator badly.
> >
> > I tested QEMU 8.0.4 from Debian 12 with an Ubuntu 22.04 guest by
> > running command "loadvm" after "savevm". The same issue is
> > observed. At the same time, none of our AMD platforms is impacted.
> > More experiments show that loading the KVM module with
> > "enable_apicv=false" can workaround it.
> >
> > The issue started to show up after commit 8e6ed96cdd50 ("KVM: x86:
> > fire timer when it is migrated and expired, and in oneshot mode").
> > However, as is pointed out by Sean Christopherson, it is introduced
> > by commit 967235d32032 ("KVM: vmx: clear pending interrupts on
> > KVM_SET_LAPIC"). commit 8e6ed96cdd50 ("KVM: x86: fire timer when
> > it is migrated and expired, and in oneshot mode") just makes it
> > easier to hit the issue.
> >
> > Having both commits, the oneshot lapic timer gets fired immediately
> > inside the KVM_SET_LAPIC call when loading the snapshot. On Intel
> > platforms with APIC virtualization and posted interrupt processing,
> > this eventually leads to setting the corresponding PIR bit. However,
> > the whole PIR bits get cleared later in the same KVM_SET_LAPIC call
> > by apicv_post_state_restore. This leads to timer interrupt lost.
> >
> > The fix is to move vmx_apicv_post_state_restore to the beginning of
> > the KVM_SET_LAPIC call and rename to vmx_apicv_pre_state_restore.
> > What vmx_apicv_post_state_restore does is actually clearing any
> > former apicv state and this behavior is more suitable to carry out
> > in the beginning.
> >
> > Fixes: 967235d32032 ("KVM: vmx: clear pending interrupts on KVM_SET_LAPIC")
> > Cc: stable@vger.kernel.org
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Haitao Shan <hshan@google.com>
> > Link: https://lore.kernel.org/r/20230913000215.478387-1-hshan@google.com
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> >
> > (Cherry-Picked from commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2)
> > Signed-off-by: David Hunter <david.hunter.linux@gmail.com>
> > ---
> > arch/x86/kvm/vmx/vmx.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 87abf4eebf8a..4040075bbd5a 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -8203,6 +8203,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
> > .load_eoi_exitmap = vmx_load_eoi_exitmap,
> > .apicv_pre_state_restore = vmx_apicv_pre_state_restore,
> > .check_apicv_inhibit_reasons = vmx_check_apicv_inhibit_reasons,
> > + .required_apicv_inhibits = VMX_REQUIRED_APICV_INHIBITS,
> > .hwapic_irr_update = vmx_hwapic_irr_update,
> > .hwapic_isr_update = vmx_hwapic_isr_update,
> > .guest_apic_has_interrupt = vmx_guest_apic_has_interrupt,
>
> Wait, this is just one hunk? This feels wrong, you didn't say why you
> modified this from the original commit, or backport, what was wrong with
> that?
Gah, my bad. I told David[*] that this needed to be paired with patch 1 to avoid
creating a regression in 6.1.y, without realizing this commit had already landed
in 6.1.y.
So yeah, please ignore this patch.
[*] https://lore.kernel.org/all/ZsSiQkQVSz0DarYC@google.com