public inbox for kvm@vger.kernel.org
* CPU Lockups in KVM with deferred hrtimer rearming
From: Verma, Vishal L @ 2026-04-16 20:50 UTC (permalink / raw)
  To: peterz@infradead.org, tglx@kernel.org
  Cc: kvm@vger.kernel.org, Edgecombe, Rick P, Wu, Binbin,
	x86@kernel.org

Hi Peter,

We noticed the KVM unit test 'x2apic' (APIC LVT timer one-shot)
failing, and some TDX-specific tests driving multiple CPUs into hard
lockups on a 192-CPU Emerald Rapids system. We traced both to the
hrtimers deferred rearming merge.

Making CONFIG_HRTIMER_REARM_DEFERRED default to n in Kconfig made both
pass.
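
In case it helps anyone reproduce this, the workaround amounts to the
.config fragment below (the kvm-unit-tests invocation in the comment is
the standard one from that repo; adjust paths and options for your
setup):

```
# Reproduced with kvm-unit-tests, roughly:
#   ./configure && make && ./run_tests.sh x2apic
# on a 192-CPU Emerald Rapids host.
#
# Workaround that made both the x2apic test and the TDX tests pass:
# CONFIG_HRTIMER_REARM_DEFERRED is not set
```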

This is the hard lockup splat:

   watchdog: CPU98: Watchdog detected hard LOCKUP on cpu 98
   Modules linked in: openvswitch nsh tls ipt_REJECT iptable_mangle iptable_nat iptable_filter ip_tables bridge stp llc kvm_intel kvm irqbypass sunrpc
   irq event stamp: 34998
   hardirqs last  enabled at (34997): [<ffffffffc090ce6d>] tdx_vcpu_run+0x5d/0x350 [kvm_intel]
   hardirqs last disabled at (34998): [<ffffffffb9add6df>] exc_nmi+0xaf/0x1a0
   softirqs last  enabled at (34404): [<ffffffffb83fdd93>] __irq_exit_rcu+0xe3/0x160
   softirqs last disabled at (34395): [<ffffffffb83fdd93>] __irq_exit_rcu+0xe3/0x160
   CPU: 98 UID: 0 PID: 54785 Comm: qemu-system-x86 Not tainted 7.0.0-g10324ed6a556 #1 PREEMPT(full) 
   Hardware name: HPE ProLiant DL380 Gen11/ProLiant DL380 Gen11, BIOS 2.48 03/11/2025
   RIP: 0010:vmx_do_nmi_irqoff+0x13/0x20 [kvm_intel]
   Code: ff ff 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 48 83 e4 f0 6a 18 55 9c 6a 10 e8 3d db 6e f7 <c9> c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90
   RSP: 0018:ff8d3a069bdf3af0 EFLAGS: 00000086
   RAX: ff3cc96963d68000 RBX: ff3cc96963d68000 RCX: 4000000200000000
   RDX: 0000000080000200 RSI: ff3cc96963d699d0 RDI: ff3cc96963d68000
   RBP: ff8d3a069bdf3af0 R08: 0000000000000000 R09: 0000000000000000
   R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
   R13: ff3cc968d03d0000 R14: ff3cc968d03d0000 R15: 0000000000000000
   FS:  00007f26ab7fe6c0(0000) GS:ff3cc98782d76000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 0000000000000000 CR3: 00000001544af004 CR4: 0000000000f73ef0
   PKRU: 00000000
   Call Trace:
    <TASK>
    vmx_handle_nmi+0xdf/0x140 [kvm_intel]
    tdx_vcpu_enter_exit+0xd5/0x300 [kvm_intel]
    tdx_vcpu_run+0x5d/0x350 [kvm_intel]
    vcpu_run+0xd4a/0x1800 [kvm]
    ? __local_bh_enable_ip+0x7b/0xf0
    ? kvm_arch_vcpu_ioctl_run+0x38b/0x5f0 [kvm]
    ? kvm_arch_vcpu_ioctl_run+0xb9/0x5f0 [kvm]
    kvm_arch_vcpu_ioctl_run+0x38b/0x5f0 [kvm]
    kvm_vcpu_ioctl+0x2ef/0xb00 [kvm]
    ? __fget_files+0x2b/0x190
    ? find_held_lock+0x2b/0x80
    __x64_sys_ioctl+0x97/0xe0
    do_syscall_64+0xf4/0x1540
    ? __x64_sys_ioctl+0xb1/0xe0
    ? trace_hardirqs_on_prepare+0xd2/0xf0
    ? do_syscall_64+0x225/0x1540
    ? trace_hardirqs_on+0x18/0x100
    ? __local_bh_enable_ip+0x7b/0xf0
    ? arch_do_signal_or_restart+0x155/0x250
    ? trace_hardirqs_off+0x4e/0xf0
    ? exit_to_user_mode_loop+0x150/0x4e0
    ? trace_hardirqs_on_prepare+0xd2/0xf0
    ? do_syscall_64+0x225/0x1540
    ? do_user_addr_fault+0x36c/0x6b0
    ? lockdep_hardirqs_on_prepare+0xdb/0x190
    ? trace_hardirqs_on+0x18/0x100
    ? do_syscall_64+0xab/0x1540
    ? exc_page_fault+0x12c/0x2b0
    entry_SYSCALL_64_after_hwframe+0x76/0x7e
   RIP: 0033:0x7f45f7ae00ed
   Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
   RSP: 002b:00007f26ab7f3e70 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
   RAX: ffffffffffffffda RBX: 00007f26ab7fe6c0 RCX: 00007f45f7ae00ed
   RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000099
   RBP: 00007f26ab7f3ec0 R08: 0000000000000000 R09: 0000000000000000
   R10: 0000000000000000 R11: 0000000000000246 R12: 00007f26ab7fe6c0
   R13: 00007ffdc7adecd0 R14: 00007f26ab7fecdc R15: 00007ffdc7adedd7
    </TASK>

I tried out an AI-assisted patch (below), which does happen to solve
it, but I'm not familiar with this area and am not sure it is the
right fix.

---

diff --git a/include/linux/entry-virt.h b/include/linux/entry-virt.h
index bfa767702d9a..c4856c252412 100644
--- a/include/linux/entry-virt.h
+++ b/include/linux/entry-virt.h
@@ -4,6 +4,7 @@
 
 #include <linux/static_call_types.h>
 #include <linux/resume_user_mode.h>
+#include <linux/hrtimer_rearm.h>
 #include <linux/syscalls.h>
 #include <linux/seccomp.h>
 #include <linux/sched.h>
@@ -58,6 +59,7 @@ int xfer_to_guest_mode_handle_work(void);
 static inline void xfer_to_guest_mode_prepare(void)
 {
        lockdep_assert_irqs_disabled();
+       hrtimer_rearm_deferred();
        tick_nohz_user_enter_prepare();
 }
 
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 5bd6efe598f0..f3bd084d9a72 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -2058,6 +2058,7 @@ void __hrtimer_rearm_deferred(void)
        }
        hrtimer_rearm(cpu_base, expires_next, true);
 }
+EXPORT_SYMBOL_GPL(__hrtimer_rearm_deferred);
 
 static __always_inline void
 hrtimer_interrupt_rearm(struct hrtimer_cpu_base *cpu_base, ktime_t expires_next)
