* [PATCH] arm64: KVM: Initialize vGIC before preempt-disabled section in kvm_reset_vcpu()
@ 2026-04-12 8:04 Deepanshu Kartikey
2026-04-16 14:20 ` Marc Zyngier
0 siblings, 1 reply; 4+ messages in thread
From: Deepanshu Kartikey @ 2026-04-12 8:04 UTC (permalink / raw)
To: maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will
Cc: drjones, christoffer.dall, linux-arm-kernel, kvmarm, linux-kernel,
Deepanshu Kartikey, syzbot+12b178b7c756664d2518
kvm_reset_vcpu() calls kvm_timer_vcpu_reset() inside a preempt-disabled
section to avoid races with preempt notifiers that also call vcpu put/load.
However, kvm_timer_vcpu_reset() eventually calls kvm_vgic_inject_irq()
which triggers vgic_lazy_init() if the vGIC has not been initialized yet.
vgic_lazy_init() acquires a mutex and calls vgic_init() which invokes
synchronize_srcu_expedited() -- both of which may sleep. Sleeping inside
a preempt-disabled section is illegal and causes:
BUG: scheduling while atomic: syz.1.49/3699/0x00000002
Fix this by calling vgic_lazy_init() before preempt_disable(). On the
second call inside kvm_vgic_inject_irq(), vgic_initialized() will return
true and vgic_lazy_init() will return immediately without sleeping.
Fixes: e761a927bc9a ("KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded")
Reported-by: syzbot+12b178b7c756664d2518@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=12b178b7c756664d2518
Tested-by: syzbot+12b178b7c756664d2518@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
arch/arm64/kvm/reset.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index b963fd975aac..4ee16b4a37b5 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -25,6 +25,7 @@
#include <asm/ptrace.h>
#include <asm/kvm_arm.h>
#include <asm/kvm_asm.h>
+#include "vgic/vgic.h"
#include <asm/kvm_emulate.h>
#include <asm/kvm_mmu.h>
#include <asm/kvm_nested.h>
@@ -198,6 +199,14 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
vcpu->arch.reset_state.reset = false;
spin_unlock(&vcpu->arch.mp_state_lock);
+
+ /*
+ * Initialize vGIC before entering preempt-disabled section.
+ * vgic_lazy_init() may sleep via mutex_lock, which is illegal
+ * inside preempt_disable(). Second call inside kvm_vgic_inject_irq
+ * will find vGIC already initialized and return immediately.
+ */
+ vgic_lazy_init(vcpu->kvm);
preempt_disable();
loaded = (vcpu->cpu != -1);
if (loaded)
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] arm64: KVM: Initialize vGIC before preempt-disabled section in kvm_reset_vcpu() 2026-04-12 8:04 [PATCH] arm64: KVM: Initialize vGIC before preempt-disabled section in kvm_reset_vcpu() Deepanshu Kartikey @ 2026-04-16 14:20 ` Marc Zyngier 2026-04-21 1:48 ` Deepanshu Kartikey 0 siblings, 1 reply; 4+ messages in thread From: Marc Zyngier @ 2026-04-16 14:20 UTC (permalink / raw) To: Deepanshu Kartikey Cc: oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas, will, drjones, christoffer.dall, linux-arm-kernel, kvmarm, linux-kernel, syzbot+12b178b7c756664d2518 On Sun, 12 Apr 2026 09:04:37 +0100, Deepanshu Kartikey <kartikey406@gmail.com> wrote: > > kvm_reset_vcpu() calls kvm_timer_vcpu_reset() inside a preempt-disabled > section to avoid races with preempt notifiers that also call vcpu put/load. > > However, kvm_timer_vcpu_reset() eventually calls kvm_vgic_inject_irq() > which triggers vgic_lazy_init() if the vGIC has not been initialized yet. > vgic_lazy_init() acquires a mutex and calls vgic_init() which invokes > synchronize_srcu_expedited() -- both of which may sleep. Sleeping inside > a preempt-disabled section is illegal and causes: > > BUG: scheduling while atomic: syz.1.49/3699/0x00000002 > > Fix this by calling vgic_lazy_init() before preempt_disable(). On the > second call inside kvm_vgic_inject_irq(), vgic_initialized() will return > true and vgic_lazy_init() will return immediately without sleeping. > I think this really goes in the wrong direction. Forcing the vgic (a global resource) to initialise when the vcpu's timer (a local resource) is reset feels at best bizarre. Now you are promoting it to be forced at vcpu reset. This makes things worse. You probably want to take a step back and look at *why* we end-up here. The core reason seems to be that the timer emulation caches the level in a per-timer structure, and tries hard not call into the vgic unless the level changes. Which means that unless the vgic is initialised and is able to latch that state, the initial pending state will not be propagated to the guest. But do we need this optimisation? I don't think so. Other emulated devices don't require it. We can let the vgic know the state of the timer at every vcpu entry, just like we do for other virtual interrupts that the kernel injects (PMU, vgic MI). Once you remove the this cache and the need for the vgic to buffer things outside of normal execution, you can also drop the magic init from the interrupt injection path, because the injection will happen on the run path, just like any other PPI. That'd be a much better approach IMO. Thanks, M. -- Without deviation from the norm, progress is not possible. ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] arm64: KVM: Initialize vGIC before preempt-disabled section in kvm_reset_vcpu() 2026-04-16 14:20 ` Marc Zyngier @ 2026-04-21 1:48 ` Deepanshu Kartikey 2026-04-21 7:42 ` Marc Zyngier 0 siblings, 1 reply; 4+ messages in thread From: Deepanshu Kartikey @ 2026-04-21 1:48 UTC (permalink / raw) To: Marc Zyngier Cc: oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas, will, drjones, christoffer.dall, linux-arm-kernel, kvmarm, linux-kernel, syzbot+12b178b7c756664d2518 On Thu, Apr 16, 2026 at 7:50 PM Marc Zyngier <maz@kernel.org> wrote: > > On Sun, 12 Apr 2026 09:04:37 +0100, > Deepanshu Kartikey <kartikey406@gmail.com> wrote: > > > > kvm_reset_vcpu() calls kvm_timer_vcpu_reset() inside a preempt-disabled > > section to avoid races with preempt notifiers that also call vcpu put/load. > > > > However, kvm_timer_vcpu_reset() eventually calls kvm_vgic_inject_irq() > > which triggers vgic_lazy_init() if the vGIC has not been initialized yet. > > vgic_lazy_init() acquires a mutex and calls vgic_init() which invokes > > synchronize_srcu_expedited() -- both of which may sleep. Sleeping inside > > a preempt-disabled section is illegal and causes: > > > > BUG: scheduling while atomic: syz.1.49/3699/0x00000002 > > > > Fix this by calling vgic_lazy_init() before preempt_disable(). On the > > second call inside kvm_vgic_inject_irq(), vgic_initialized() will return > > true and vgic_lazy_init() will return immediately without sleeping. > > > > I think this really goes in the wrong direction. Forcing the vgic (a > global resource) to initialise when the vcpu's timer (a local > resource) is reset feels at best bizarre. Now you are promoting it to > be forced at vcpu reset. This makes things worse. > > You probably want to take a step back and look at *why* we end-up > here. The core reason seems to be that the timer emulation caches the > level in a per-timer structure, and tries hard not call into the vgic > unless the level changes. Which means that unless the vgic is > initialised and is able to latch that state, the initial pending state > will not be propagated to the guest. > > But do we need this optimisation? I don't think so. Other emulated > devices don't require it. We can let the vgic know the state of the > timer at every vcpu entry, just like we do for other virtual > interrupts that the kernel injects (PMU, vgic MI). > > Once you remove the this cache and the need for the vgic to buffer > things outside of normal execution, you can also drop the magic init > from the interrupt injection path, because the injection will happen > on the run path, just like any other PPI. > > That'd be a much better approach IMO. > > Thanks, > > M. > > -- > Without deviation from the norm, progress is not possible. Hi Marc, Thank you for the detailed feedback! I apologize for the delayed response — I was away on holiday. I understand your point. My fix addresses the symptom rather than the root cause. Forcing vGIC (a global resource) to initialize during timer (a local resource) reset is not the right approach. I will take your suggestion and work on: I will send a v2 once I have something ready. Thanks again for the guidance! Best regards, Deeanshu Kartikey ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] arm64: KVM: Initialize vGIC before preempt-disabled section in kvm_reset_vcpu() 2026-04-21 1:48 ` Deepanshu Kartikey @ 2026-04-21 7:42 ` Marc Zyngier 0 siblings, 0 replies; 4+ messages in thread From: Marc Zyngier @ 2026-04-21 7:42 UTC (permalink / raw) To: Deepanshu Kartikey Cc: oupton, joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas, will, drjones, christoffer.dall, linux-arm-kernel, kvmarm, linux-kernel, syzbot+12b178b7c756664d2518 On Tue, 21 Apr 2026 02:48:02 +0100, Deepanshu Kartikey <kartikey406@gmail.com> wrote: > > On Thu, Apr 16, 2026 at 7:50 PM Marc Zyngier <maz@kernel.org> wrote: > > > > On Sun, 12 Apr 2026 09:04:37 +0100, > > Deepanshu Kartikey <kartikey406@gmail.com> wrote: > > > > > > kvm_reset_vcpu() calls kvm_timer_vcpu_reset() inside a preempt-disabled > > > section to avoid races with preempt notifiers that also call vcpu put/load. > > > > > > However, kvm_timer_vcpu_reset() eventually calls kvm_vgic_inject_irq() > > > which triggers vgic_lazy_init() if the vGIC has not been initialized yet. > > > vgic_lazy_init() acquires a mutex and calls vgic_init() which invokes > > > synchronize_srcu_expedited() -- both of which may sleep. Sleeping inside > > > a preempt-disabled section is illegal and causes: > > > > > > BUG: scheduling while atomic: syz.1.49/3699/0x00000002 > > > > > > Fix this by calling vgic_lazy_init() before preempt_disable(). On the > > > second call inside kvm_vgic_inject_irq(), vgic_initialized() will return > > > true and vgic_lazy_init() will return immediately without sleeping. > > > > > > > I think this really goes in the wrong direction. Forcing the vgic (a > > global resource) to initialise when the vcpu's timer (a local > > resource) is reset feels at best bizarre. Now you are promoting it to > > be forced at vcpu reset. This makes things worse. > > > > You probably want to take a step back and look at *why* we end-up > > here. The core reason seems to be that the timer emulation caches the > > level in a per-timer structure, and tries hard not call into the vgic > > unless the level changes. Which means that unless the vgic is > > initialised and is able to latch that state, the initial pending state > > will not be propagated to the guest. > > > > But do we need this optimisation? I don't think so. Other emulated > > devices don't require it. We can let the vgic know the state of the > > timer at every vcpu entry, just like we do for other virtual > > interrupts that the kernel injects (PMU, vgic MI). > > > > Once you remove the this cache and the need for the vgic to buffer > > things outside of normal execution, you can also drop the magic init > > from the interrupt injection path, because the injection will happen > > on the run path, just like any other PPI. > > > > That'd be a much better approach IMO. > > > > Thanks, > > > > M. > > > > -- > > Without deviation from the norm, progress is not possible. > > > Hi Marc, > > Thank you for the detailed feedback! I apologize for the delayed > response — I was away on holiday. No need to apologise. I hadn't spotted this report (syzkaller badly triaged it), so thanks for bring it to my attention. > > I understand your point. My fix addresses the symptom rather than the > root cause. Forcing vGIC (a global resource) to initialize during > timer (a local resource) reset is not the right approach. > > I will take your suggestion and work on: > > I will send a v2 once I have something ready. I've posted my own take on this at [1], for which you are on Cc. Feel free to review it if you have the time (I'll post a v2 anyway, as I have accumulated a couple of additional fixes). Cheers, M. [1] https://lore.kernel.org/r/20260417124612.2770268-1-maz@kernel.org -- Without deviation from the norm, progress is not possible. ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-04-21 7:43 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-04-12 8:04 [PATCH] arm64: KVM: Initialize vGIC before preempt-disabled section in kvm_reset_vcpu() Deepanshu Kartikey 2026-04-16 14:20 ` Marc Zyngier 2026-04-21 1:48 ` Deepanshu Kartikey 2026-04-21 7:42 ` Marc Zyngier
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox