From: Jiri Slaby <jirislaby@kernel.org>
To: Matthieu Baerts <matttbe@kernel.org>,
Stefan Hajnoczi <stefanha@redhat.com>,
Stefano Garzarella <sgarzare@redhat.com>
Cc: kvm@vger.kernel.org, virtualization@lists.linux.dev,
Netdev <netdev@vger.kernel.org>,
rcu@vger.kernel.org, "MPTCP Linux" <mptcp@lists.linux.dev>,
"Linux Kernel" <linux-kernel@vger.kernel.org>,
"Peter Zijlstra" <peterz@infradead.org>,
"Thomas Gleixner" <tglx@kernel.org>,
"Shinichiro Kawasaki" <shinichiro.kawasaki@wdc.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"luto@kernel.org" <luto@kernel.org>,
"Michal Koutný" <MKoutny@suse.com>
Subject: Re: Stalls when starting a VSOCK listening socket: soft lockups, RCU stalls, timeout
Date: Mon, 2 Mar 2026 06:28:38 +0100 [thread overview]
Message-ID: <863a5291-a636-47d0-891c-bb0524d2e134@kernel.org> (raw)
In-Reply-To: <7f3e74d7-67dc-48d7-99d2-0b87f671651b@kernel.org>
On 26. 02. 26, 11:37, Jiri Slaby wrote:
> On 06. 02. 26, 12:54, Matthieu Baerts wrote:
>> Our CI for the MPTCP subsystem is now regularly hitting various stalls
>> before even starting the MPTCP test suite. These issues are visible on
>> top of the latest net and net-next trees, which had been synced with
>> Linus' tree yesterday. All these issues have been seen on a "public CI"
>> using GitHub-hosted runners with KVM support, where the tested kernel is
>> launched in a nested (I suppose) VM. I can see the issue with or without
>> debug.config. According to the logs, it might have started around
>> v6.19-rc0, but I was unavailable for a few weeks, and I couldn't react
>> quicker, sorry for that. Unfortunately, I cannot reproduce this locally,
>> and the CI doesn't currently have the ability to execute bisections.
>
> Hmm, after the switch of the qemu guest kernels to 6.19, our (opensuse)
> build service is stalling in smp_call_function_many_cond() randomly too:
> https://bugzilla.suse.com/show_bug.cgi?id=1258936
>
> The attachment from there contains sysrq-t logs too:
> https://bugzilla.suse.com/attachment.cgi?id=888612
A small update, just in case this rings a bell somewhere.
We now have a qemu memory dump from the affected kernel. It shows that
both CPU0 and CPU1 are waiting for CPU2's rq lock, while CPU2 itself is
running in userspace.
crash> bt -xsc 0
PID: 6483 TASK: ffff8d1759c20000 CPU: 0 COMMAND: "compile"
[exception RIP: native_halt+14]
RIP: ffffffffb9d1124e RSP: ffffcead0696f9a0 RFLAGS: 00000046
RAX: 0000000000000003 RBX: 0000000000040000 RCX: 00000000fffffff8
RDX: ffff8d1a7ffc5140 RSI: 0000000000000003 RDI: ffff8d1a6fd35dc0
RBP: ffff8d1a6fd35dc0 R8: ffff8d1a6fc36dc0 R9: fffffffffffffff8
R10: 0000000000000000 R11: 0000000000000004 R12: ffff8d1a6fc36dc0
R13: 0000000000000000 R14: ffff8d1a7ffc5140 R15: ffffcead0696fad0
CS: 0010 SS: 0018
#0 [ffffcead0696f9a0] kvm_wait+0x44 at ffffffffb9d0fe54
#1 [ffffcead0696f9a8] __pv_queued_spin_lock_slowpath+0x247 at ffffffffbaafb507
#2 [ffffcead0696f9d8] _raw_spin_lock+0x29 at ffffffffbaafadf9
#3 [ffffcead0696f9e0] raw_spin_rq_lock_nested+0x1c at ffffffffb9d8c12c
#4 [ffffcead0696f9f8] _raw_spin_rq_lock_irqsave+0x17 at ffffffffb9d96ca7
#5 [ffffcead0696fa08] sched_balance_rq+0x56d at ffffffffb9da718d
#6 [ffffcead0696fb18] pick_next_task_fair+0x240 at ffffffffb9da7e00
#7 [ffffcead0696fb88] __schedule+0x19e at ffffffffbaaf00de
#8 [ffffcead0696fc40] schedule+0x27 at ffffffffbaaf1697
#9 [ffffcead0696fc50] futex_do_wait+0x4a at ffffffffb9e61c5a
#10 [ffffcead0696fc68] __futex_wait+0x8e at ffffffffb9e6241e
#11 [ffffcead0696fd30] futex_wait+0x6b at ffffffffb9e624fb
#12 [ffffcead0696fdc0] do_futex+0xc5 at ffffffffb9e5e305
#13 [ffffcead0696fdc8] __x64_sys_futex+0x112 at ffffffffb9e5e932
#14 [ffffcead0696fe38] do_syscall_64+0x81 at ffffffffbaae2a61
#15 [ffffcead0696ff40] entry_SYSCALL_64_after_hwframe+0x76 at ffffffffb9a0012f
RIP: 0000000000495303 RSP: 000000c000073c98 RFLAGS: 00000286
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000495303
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000c000058958
RBP: 000000c000073ce0 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000286 R12: 0000000000000024
R13: 0000000000000001 R14: 000000c000002c40 R15: 0000000000000001
ORIG_RAX: 00000000000000ca CS: 0033 SS: 002b
crash> bt -xsc 1
PID: 6481 TASK: ffff8d1759c8b680 CPU: 1 COMMAND: "compile"
[exception RIP: __pv_queued_spin_lock_slowpath+190]
RIP: ffffffffbaafb37e RSP: ffffcead000f8b38 RFLAGS: 00000046
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001
RDX: 0000000000040003 RSI: 0000000000040003 RDI: ffff8d1a6fd35dc0
RBP: ffff8d1a6fd35dc0 R8: 0000000000000000 R9: 00000001000c3f60
R10: ffffffffbbc75960 R11: ffffcead000f8a48 R12: ffff8d1a6fcb6dc0
R13: 0000000000000001 R14: 0000000000000000 R15: ffffffffbbe65940
CS: 0010 SS: 0000
#0 [ffffcead000f8b60] _raw_spin_lock+0x29 at ffffffffbaafadf9
#1 [ffffcead000f8b68] raw_spin_rq_lock_nested+0x1c at ffffffffb9d8c12c
#2 [ffffcead000f8b80] _raw_spin_rq_lock_irqsave+0x17 at ffffffffb9dc9cc7
#3 [ffffcead000f8b90] print_cfs_rq+0xce at ffffffffb9dd0d8e
#4 [ffffcead000f8c98] print_cfs_stats+0x62 at ffffffffb9da9ee2
#5 [ffffcead000f8cc8] print_cpu+0x243 at ffffffffb9dcbe73
#6 [ffffcead000f8d00] sysrq_sched_debug_show+0x2e at ffffffffb9dd1b7e
#7 [ffffcead000f8d18] show_state_filter+0xcd at ffffffffb9d91f4d
#8 [ffffcead000f8d40] sysrq_handle_showstate+0x10 at ffffffffba60b750
#9 [ffffcead000f8d48] __handle_sysrq.cold+0x9b at ffffffffb9c4f486
#10 [ffffcead000f8d70] sysrq_filter+0xd7 at ffffffffba60c237
#11 [ffffcead000f8d98] input_handle_events_filter+0x45 at ffffffffba766c05
#12 [ffffcead000f8dd0] input_pass_values+0x134 at ffffffffba766ec4
#13 [ffffcead000f8e00] input_event_dispose+0x156 at ffffffffba767046
#14 [ffffcead000f8e20] input_event+0x58 at ffffffffba76ac18
#15 [ffffcead000f8e50] atkbd_receive_byte+0x64d at ffffffffba772e6d
#16 [ffffcead000f8ea8] ps2_interrupt+0x9d at ffffffffba7665ed
#17 [ffffcead000f8ed0] serio_interrupt+0x4f at ffffffffba761e0f
#18 [ffffcead000f8f00] i8042_handle_data+0x11c at ffffffffba76316c
#19 [ffffcead000f8f40] i8042_interrupt+0x11 at ffffffffba763581
#20 [ffffcead000f8f50] __handle_irq_event_percpu+0x55 at ffffffffb9df1e15
#21 [ffffcead000f8f90] handle_irq_event+0x38 at ffffffffb9df2058
#22 [ffffcead000f8fb0] handle_edge_irq+0xc5 at ffffffffb9df7b95
#23 [ffffcead000f8fd0] __common_interrupt+0x44 at ffffffffb9cc2354
#24 [ffffcead000f8ff0] common_interrupt+0x80 at ffffffffbaae6090
--- <IRQ stack> ---
#25 [ffffcead06bcfb98] asm_common_interrupt+0x26 at ffffffffb9a01566
[exception RIP: smp_call_function_many_cond+304]
RIP: ffffffffb9e63080 RSP: ffffcead06bcfc40 RFLAGS: 00000202
RAX: 0000000000000011 RBX: 0000000000000202 RCX: ffff8d1a6fc3f800
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000001 R8: ffff8d174009cc30 R9: 0000000000000000
R10: ffff8d174009c0d8 R11: 0000000000000000 R12: 0000000000000001
R13: 0000000000000003 R14: ffff8d1a6fcb7280 R15: 0000000000000001
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#26 [ffffcead06bcfcb0] on_each_cpu_cond_mask+0x24 at ffffffffb9e634f4
#27 [ffffcead06bcfcb8] flush_tlb_mm_range+0x1b1 at ffffffffb9d225d1
#28 [ffffcead06bcfd08] ptep_clear_flush+0x93 at ffffffffba066e13
#29 [ffffcead06bcfd30] do_wp_page+0x6a2 at ffffffffba04c692
#30 [ffffcead06bcfdb8] __handle_mm_fault+0xa49 at ffffffffba055c79
#31 [ffffcead06bcfe98] handle_mm_fault+0xe7 at ffffffffba056297
#32 [ffffcead06bcfed8] do_user_addr_fault+0x21a at ffffffffb9d1db6a
#33 [ffffcead06bcff18] exc_page_fault+0x69 at ffffffffbaae99c9
#34 [ffffcead06bcff40] asm_exc_page_fault+0x26 at ffffffffb9a012a6
RIP: 000000000042351c RSP: 000000c0013aafd0 RFLAGS: 00010246
RAX: 0000000000000002 RBX: 00000000017584c0 RCX: 0000000000000000
RDX: 0000000000000005 RSI: 000000000163edc0 RDI: 0000000000000003
RBP: 000000c0013ab080 R8: 0000000000000001 R9: 00007f0d9853f800
R10: 00007f0d98334e00 R11: 00007f0d98afa020 R12: 00007f0d98afa020
R13: 0000000000000050 R14: 000000c000002380 R15: 0000000000000001
ORIG_RAX: ffffffffffffffff CS: 0033 SS: 002b
crash> bt -xsc 2
PID: 6540 TASK: ffff8d1773ae3680 CPU: 2 COMMAND: "compile"
RIP: 0000000000495372 RSP: 000000c00003e000 RFLAGS: 00000206
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000495372
RDX: 0000000000000000 RSI: 000000c00003e000 RDI: 00000000000d0f00
RBP: 00007ffcf8a71aa8 R8: 000000c00005a090 R9: 000000c000002700
R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000491580
R13: 000000c00005a008 R14: 00000000017222e0 R15: ffffffffffffffff
ORIG_RAX: 0000000000000038 CS: 0033 SS: 002b
The state of the lock:
crash> struct rq.__lock -x ffff8d1a6fd35dc0
  __lock = {
    raw_lock = {
      {
        val = {
          counter = 0x40003
        },
        {
          locked = 0x3,
          pending = 0x0
        },
        {
          locked_pending = 0x3,
          tail = 0x4
        }
      }
    }
  },
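For reference, that raw value can be decoded by hand. Below is a quick
userspace sketch, assuming the usual NR_CPUS < 16K field layout from
include/asm-generic/qspinlock_types.h (locked: bits 0-7, pending: bits
8-15, tail idx: bits 16-17, tail cpu: bits 18-31) and _Q_SLOW_VAL == 3
from kernel/locking/qspinlock_paravirt.h:

#include <stdio.h>

int main(void)
{
	unsigned int val = 0x40003;              /* rq->__lock.raw_lock.val.counter */
	unsigned int locked   = val & 0xff;      /* lock byte */
	unsigned int pending  = (val >> 8) & 0xff;
	unsigned int tail     = val >> 16;       /* encoded MCS tail (upper 16 bits) */
	unsigned int tail_idx = tail & 0x3;      /* nesting index of the tail node */
	unsigned int tail_cpu = (tail >> 2) - 1; /* tail encodes cpu + 1 */

	printf("locked=%#x pending=%#x tail_idx=%u tail_cpu=%u\n",
	       locked, pending, tail_idx, tail_cpu);
	return 0;
}

That prints locked=0x3 pending=0 tail_idx=0 tail_cpu=0, i.e. the lock
byte is _Q_SLOW_VAL (the pv slowpath has hashed the lock and the queue
head is presumably sitting in pv_wait), and the MCS tail points at
CPU0, node index 0 -- which matches CPU0 halting in kvm_wait above.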
thanks,
--
js
suse labs
Thread overview: 45+ messages
2026-02-06 11:54 Stalls when starting a VSOCK listening socket: soft lockups, RCU stalls, timeout Matthieu Baerts
2026-02-06 16:38 ` Stefano Garzarella
2026-02-06 17:13 ` Matthieu Baerts
2026-02-26 10:37 ` Jiri Slaby
2026-03-02 5:28 ` Jiri Slaby [this message]
2026-03-02 11:46 ` Peter Zijlstra
2026-03-02 14:30 ` Waiman Long
2026-03-05 7:00 ` Jiri Slaby
2026-03-05 11:53 ` Jiri Slaby
2026-03-05 12:20 ` Jiri Slaby
2026-03-05 16:16 ` Thomas Gleixner
2026-03-05 17:33 ` Jiri Slaby
2026-03-05 19:25 ` Thomas Gleixner
2026-03-06 5:48 ` Jiri Slaby
2026-03-06 9:57 ` Thomas Gleixner
2026-03-06 10:16 ` Jiri Slaby
2026-03-06 16:28 ` Thomas Gleixner
2026-03-06 11:06 ` Matthieu Baerts
2026-03-06 16:57 ` Matthieu Baerts
2026-03-06 18:31 ` Jiri Slaby
2026-03-06 18:44 ` Matthieu Baerts
2026-03-06 21:40 ` Matthieu Baerts
2026-03-06 15:24 ` Peter Zijlstra
2026-03-07 9:01 ` Thomas Gleixner
2026-03-07 22:29 ` Thomas Gleixner
2026-03-08 9:15 ` Thomas Gleixner
2026-03-08 16:55 ` Jiri Slaby
2026-03-08 16:58 ` Thomas Gleixner
2026-03-08 17:23 ` Matthieu Baerts
2026-03-09 8:43 ` Thomas Gleixner
2026-03-09 12:23 ` Matthieu Baerts
2026-03-10 8:09 ` Thomas Gleixner
2026-03-10 8:20 ` Thomas Gleixner
2026-03-10 8:56 ` Jiri Slaby
2026-03-10 9:00 ` Jiri Slaby
2026-03-10 10:03 ` Thomas Gleixner
2026-03-10 10:06 ` Thomas Gleixner
2026-03-10 11:24 ` Matthieu Baerts
2026-03-10 11:54 ` Peter Zijlstra
2026-03-10 12:28 ` Thomas Gleixner
2026-03-10 13:40 ` Matthieu Baerts
2026-03-10 13:47 ` Thomas Gleixner
2026-03-10 15:51 ` Matthieu Baerts
2026-03-03 13:23 ` Matthieu Baerts
2026-03-05 6:46 ` Jiri Slaby