* CPU Lockups in KVM with deferred hrtimer rearming
@ 2026-04-16 20:50 Verma, Vishal L
From: Verma, Vishal L @ 2026-04-16 20:50 UTC (permalink / raw)
To: peterz@infradead.org, tglx@kernel.org
Cc: kvm@vger.kernel.org, Edgecombe, Rick P, Wu, Binbin,
x86@kernel.org
Hi Peter,
We noticed a KVM unit test, 'x2apic' (APIC LVT timer one shot),
failing, and some TDX-specific tests driving multiple CPUs into hard
lockups on a 192-CPU Emerald Rapids system. We traced both to the
hrtimers deferred-rearming merge: making CONFIG_HRTIMER_REARM_DEFERRED
default to n in Kconfig made both tests pass.
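For reference, the workaround amounted to flipping the Kconfig default
for the new symbol. A sketch of the local change (the actual Kconfig
file location, prompt text, and any dependencies are whatever the
merge introduced, not shown here):

```
config HRTIMER_REARM_DEFERRED
	bool "Defer hrtimer rearming"
	default n
```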
This is the hard lockup splat:
watchdog: CPU98: Watchdog detected hard LOCKUP on cpu 98
Modules linked in: openvswitch nsh tls ipt_REJECT iptable_mangle iptable_nat iptable_filter ip_tables bridge stp llc kvm_intel kvm irqbypass sunrpc
irq event stamp: 34998
hardirqs last enabled at (34997): [<ffffffffc090ce6d>] tdx_vcpu_run+0x5d/0x350 [kvm_intel]
hardirqs last disabled at (34998): [<ffffffffb9add6df>] exc_nmi+0xaf/0x1a0
softirqs last enabled at (34404): [<ffffffffb83fdd93>] __irq_exit_rcu+0xe3/0x160
softirqs last disabled at (34395): [<ffffffffb83fdd93>] __irq_exit_rcu+0xe3/0x160
CPU: 98 UID: 0 PID: 54785 Comm: qemu-system-x86 Not tainted 7.0.0-g10324ed6a556 #1 PREEMPT(full)
Hardware name: HPE ProLiant DL380 Gen11/ProLiant DL380 Gen11, BIOS 2.48 03/11/2025
RIP: 0010:vmx_do_nmi_irqoff+0x13/0x20 [kvm_intel]
Code: ff ff 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 48 83 e4 f0 6a 18 55 9c 6a 10 e8 3d db 6e f7 <c9> c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90
RSP: 0018:ff8d3a069bdf3af0 EFLAGS: 00000086
RAX: ff3cc96963d68000 RBX: ff3cc96963d68000 RCX: 4000000200000000
RDX: 0000000080000200 RSI: ff3cc96963d699d0 RDI: ff3cc96963d68000
RBP: ff8d3a069bdf3af0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ff3cc968d03d0000 R14: ff3cc968d03d0000 R15: 0000000000000000
FS: 00007f26ab7fe6c0(0000) GS:ff3cc98782d76000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001544af004 CR4: 0000000000f73ef0
PKRU: 00000000
Call Trace:
<TASK>
vmx_handle_nmi+0xdf/0x140 [kvm_intel]
tdx_vcpu_enter_exit+0xd5/0x300 [kvm_intel]
tdx_vcpu_run+0x5d/0x350 [kvm_intel]
vcpu_run+0xd4a/0x1800 [kvm]
? __local_bh_enable_ip+0x7b/0xf0
? kvm_arch_vcpu_ioctl_run+0x38b/0x5f0 [kvm]
? kvm_arch_vcpu_ioctl_run+0xb9/0x5f0 [kvm]
kvm_arch_vcpu_ioctl_run+0x38b/0x5f0 [kvm]
kvm_vcpu_ioctl+0x2ef/0xb00 [kvm]
? __fget_files+0x2b/0x190
? find_held_lock+0x2b/0x80
__x64_sys_ioctl+0x97/0xe0
do_syscall_64+0xf4/0x1540
? __x64_sys_ioctl+0xb1/0xe0
? trace_hardirqs_on_prepare+0xd2/0xf0
? do_syscall_64+0x225/0x1540
? trace_hardirqs_on+0x18/0x100
? __local_bh_enable_ip+0x7b/0xf0
? arch_do_signal_or_restart+0x155/0x250
? trace_hardirqs_off+0x4e/0xf0
? exit_to_user_mode_loop+0x150/0x4e0
? trace_hardirqs_on_prepare+0xd2/0xf0
? do_syscall_64+0x225/0x1540
? do_user_addr_fault+0x36c/0x6b0
? lockdep_hardirqs_on_prepare+0xdb/0x190
? trace_hardirqs_on+0x18/0x100
? do_syscall_64+0xab/0x1540
? exc_page_fault+0x12c/0x2b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f45f7ae00ed
Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
RSP: 002b:00007f26ab7f3e70 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f26ab7fe6c0 RCX: 00007f45f7ae00ed
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000099
RBP: 00007f26ab7f3ec0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f26ab7fe6c0
R13: 00007ffdc7adecd0 R14: 00007f26ab7fecdc R15: 00007ffdc7adedd7
</TASK>
I tried out an AI-assisted patch (below) which does happen to solve
it. It rearms any deferred hrtimers in xfer_to_guest_mode_prepare(),
before the vCPU re-enters the guest with interrupts disabled, and
exports __hrtimer_rearm_deferred() so the inline helper is usable from
the kvm_intel module. But I'm not familiar with this area, and not
sure whether this is the right fix.
---
diff --git a/include/linux/entry-virt.h b/include/linux/entry-virt.h
index bfa767702d9a..c4856c252412 100644
--- a/include/linux/entry-virt.h
+++ b/include/linux/entry-virt.h
@@ -4,6 +4,7 @@
#include <linux/static_call_types.h>
#include <linux/resume_user_mode.h>
+#include <linux/hrtimer_rearm.h>
#include <linux/syscalls.h>
#include <linux/seccomp.h>
#include <linux/sched.h>
@@ -58,6 +59,7 @@ int xfer_to_guest_mode_handle_work(void);
static inline void xfer_to_guest_mode_prepare(void)
{
lockdep_assert_irqs_disabled();
+ hrtimer_rearm_deferred();
tick_nohz_user_enter_prepare();
}
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 5bd6efe598f0..f3bd084d9a72 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -2058,6 +2058,7 @@ void __hrtimer_rearm_deferred(void)
}
hrtimer_rearm(cpu_base, expires_next, true);
}
+EXPORT_SYMBOL_GPL(__hrtimer_rearm_deferred);
static __always_inline void
hrtimer_interrupt_rearm(struct hrtimer_cpu_base *cpu_base, ktime_t expires_next)