From: Jiri Slaby <jirislaby@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>, Thomas Gleixner <tglx@kernel.org>
Cc: "Matthieu Baerts" <matttbe@kernel.org>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Stefano Garzarella" <sgarzare@redhat.com>,
kvm@vger.kernel.org, virtualization@lists.linux.dev,
Netdev <netdev@vger.kernel.org>,
rcu@vger.kernel.org, "MPTCP Linux" <mptcp@lists.linux.dev>,
"Linux Kernel" <linux-kernel@vger.kernel.org>,
"Shinichiro Kawasaki" <shinichiro.kawasaki@wdc.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"luto@kernel.org" <luto@kernel.org>,
"Michal Koutný" <MKoutny@suse.com>,
"Waiman Long" <longman@redhat.com>
Subject: Re: Stalls when starting a VSOCK listening socket: soft lockups, RCU stalls, timeout
Date: Thu, 5 Mar 2026 08:00:15 +0100 [thread overview]
Message-ID: <717310d8-6274-4b7f-8a19-561c45f5f565@kernel.org> (raw)
In-Reply-To: <20260302114636.GL606826@noisy.programming.kicks-ass.net>
On 02. 03. 26, 12:46, Peter Zijlstra wrote:
> On Mon, Mar 02, 2026 at 06:28:38AM +0100, Jiri Slaby wrote:
>
>> The state of the lock:
>>
>> crash> struct rq.__lock -x ffff8d1a6fd35dc0
>> __lock = {
>> raw_lock = {
>> {
>> val = {
>> counter = 0x40003
>> },
>> {
>> locked = 0x3,
>> pending = 0x0
>> },
>> {
>> locked_pending = 0x3,
>> tail = 0x4
>> }
>> }
>> }
>> },
>>
>
>
> That had me remember the below patch that never quite made it. I've
> rebased it to something more recent so it applies.
>
> If you stick that in, we might get a clue as to who is owning that lock.
> Provided it all wants to reproduce well enough.
Thanks, I applied it, but to date it is still not accepted yet:
https://build.opensuse.org/requests/1335893
In the meantime, me and Michal K. did some digging into qemu dumps.
Details at (and a couple previous comments):
https://bugzilla.suse.com/show_bug.cgi?id=1258936#c17
tl;dr:
In one of the dumps, one process sits in
context_switch
-> mm_get_cid (before switch_to())
> 65 kworker/1:1 SP= 0xffffcf82c022fd98 -> __schedule+0x16ee
(ffffffff820f162e) -> call mm_get_cid
Michal extracted the vCPU's RIP and it turned out:
> Hm, I'd say the CPU could be spinning in mm_get_cid() waiting for a
free CID.
> ...
> ffff8a88458137c0: 000000000000000f 000000000000000f
> ^
> Hm, so indeed CIDs for all four CPUs are occupied.
To me (I don't know what CID is either), this might point as a possible
culprit to Thomas' "sched/mmcid: Cure mode transition woes" [1].
Funnily enough, 47ee94efccf6 ("sched/mmcid: Protect transition on weakly
ordered systems") spells:
> As a consequence the task will
> not drop the CID when scheduling out before the fixup is
completed, which
> means the CID space can be exhausted and the next task scheduling
in will
> loop in mm_get_cid() and the fixup thread can livelock on the
held runqueue
> lock as above.
Which sounds like what exactly happens here. Except the patch is from
the series above, so is already in 6.19 obviously.
I noticed there is also a 7.0-rc1 fix:
1e83ccd5921a sched/mmcid: Don't assume CID is CPU owned on mode switch
But that got into 6.19.1 already (we are at 6.19.3). So does not improve
the situation.
Any ideas?
[1] https://lore.kernel.org/all/20260201192234.380608594@kernel.org/
thanks,
--
js
suse labs
next prev parent reply other threads:[~2026-03-05 7:00 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-06 11:54 Stalls when starting a VSOCK listening socket: soft lockups, RCU stalls, timeout Matthieu Baerts
2026-02-06 16:38 ` Stefano Garzarella
2026-02-06 17:13 ` Matthieu Baerts
2026-02-26 10:37 ` Jiri Slaby
2026-03-02 5:28 ` Jiri Slaby
2026-03-02 11:46 ` Peter Zijlstra
2026-03-02 14:30 ` Waiman Long
2026-03-05 7:00 ` Jiri Slaby [this message]
2026-03-05 11:53 ` Jiri Slaby
2026-03-05 12:20 ` Jiri Slaby
2026-03-05 16:16 ` Thomas Gleixner
2026-03-05 17:33 ` Jiri Slaby
2026-03-05 19:25 ` Thomas Gleixner
2026-03-06 5:48 ` Jiri Slaby
2026-03-06 9:57 ` Thomas Gleixner
2026-03-06 10:16 ` Jiri Slaby
2026-03-06 16:28 ` Thomas Gleixner
2026-03-06 11:06 ` Matthieu Baerts
2026-03-06 16:57 ` Matthieu Baerts
2026-03-06 18:31 ` Jiri Slaby
2026-03-06 18:44 ` Matthieu Baerts
2026-03-06 21:40 ` Matthieu Baerts
2026-03-06 15:24 ` Peter Zijlstra
2026-03-07 9:01 ` Thomas Gleixner
2026-03-07 22:29 ` Thomas Gleixner
2026-03-08 9:15 ` Thomas Gleixner
2026-03-08 16:55 ` Jiri Slaby
2026-03-08 16:58 ` Thomas Gleixner
2026-03-08 17:23 ` Matthieu Baerts
2026-03-09 8:43 ` Thomas Gleixner
2026-03-09 12:23 ` Matthieu Baerts
2026-03-10 8:09 ` Thomas Gleixner
2026-03-10 8:20 ` Thomas Gleixner
2026-03-10 8:56 ` Jiri Slaby
2026-03-10 9:00 ` Jiri Slaby
2026-03-10 10:03 ` Thomas Gleixner
2026-03-10 10:06 ` Thomas Gleixner
2026-03-10 11:24 ` Matthieu Baerts
2026-03-10 11:54 ` Peter Zijlstra
2026-03-10 12:28 ` Thomas Gleixner
2026-03-10 13:40 ` Matthieu Baerts
2026-03-10 13:47 ` Thomas Gleixner
2026-03-10 15:51 ` Matthieu Baerts
2026-03-03 13:23 ` Matthieu Baerts
2026-03-05 6:46 ` Jiri Slaby
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=717310d8-6274-4b7f-8a19-561c45f5f565@kernel.org \
--to=jirislaby@kernel.org \
--cc=MKoutny@suse.com \
--cc=dave.hansen@linux.intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=luto@kernel.org \
--cc=matttbe@kernel.org \
--cc=mptcp@lists.linux.dev \
--cc=netdev@vger.kernel.org \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=rcu@vger.kernel.org \
--cc=sgarzare@redhat.com \
--cc=shinichiro.kawasaki@wdc.com \
--cc=stefanha@redhat.com \
--cc=tglx@kernel.org \
--cc=virtualization@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox