From: Waiman Long <longman@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>, Jiri Slaby <jirislaby@kernel.org>
Cc: "Matthieu Baerts" <matttbe@kernel.org>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Stefano Garzarella" <sgarzare@redhat.com>,
kvm@vger.kernel.org, virtualization@lists.linux.dev,
Netdev <netdev@vger.kernel.org>,
rcu@vger.kernel.org, "MPTCP Linux" <mptcp@lists.linux.dev>,
"Linux Kernel" <linux-kernel@vger.kernel.org>,
"Thomas Gleixner" <tglx@kernel.org>,
"Shinichiro Kawasaki" <shinichiro.kawasaki@wdc.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"luto@kernel.org" <luto@kernel.org>,
"Michal Koutný" <MKoutny@suse.com>
Subject: Re: Stalls when starting a VSOCK listening socket: soft lockups, RCU stalls, timeout
Date: Mon, 2 Mar 2026 09:30:18 -0500 [thread overview]
Message-ID: <d828fd84-55f5-4392-8afb-d5b1c539a2ef@redhat.com> (raw)
In-Reply-To: <20260302114636.GL606826@noisy.programming.kicks-ass.net>
On 3/2/26 6:46 AM, Peter Zijlstra wrote:
> On Mon, Mar 02, 2026 at 06:28:38AM +0100, Jiri Slaby wrote:
>
>> The state of the lock:
>>
>> crash> struct rq.__lock -x ffff8d1a6fd35dc0
>> __lock = {
>> raw_lock = {
>> {
>> val = {
>> counter = 0x40003
>> },
>> {
>> locked = 0x3,
>> pending = 0x0
>> },
>> {
>> locked_pending = 0x3,
>> tail = 0x4
>> }
>> }
>> }
>> },
>>
>
> That reminded me of the patch below, which never quite made it. I've
> rebased it onto something more recent so it applies.
>
> If you stick that in, we might get a clue as to who is holding that
> lock. Provided it all reproduces reliably enough.
>
> ---
> Subject: locking/qspinlock: Save previous node & owner CPU into mcs_spinlock
> From: Waiman Long <longman@redhat.com>
> Date: Fri, 3 May 2024 22:41:06 -0400
Oh, I forgot about that patch. I should have followed up at that time.
BTW, a lock value of 3 means that it is running the paravirtual
qspinlock. It also means that we may not know exactly which CPU owns
the lock if it was acquired by lock stealing.
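For reference, the fields in the dump above can be unpacked like this. This is only a sketch: the bit layout is assumed from the comments in kernel/locking/qspinlock.c for the common NR_CPUS < 16k case, and the helper name is hypothetical.

```python
# Sketch: decode a qspinlock "val" word as seen in a crash dump.
# Assumed layout (NR_CPUS < 16k): locked byte = bits 0-7,
# pending byte = bits 8-15, tail qnode index = bits 16-17,
# tail CPU + 1 = bits 18-31.

_Q_LOCKED_MASK = 0x000000ff
_Q_PENDING_SHIFT = 8
_Q_TAIL_IDX_OFFSET = 16
_Q_TAIL_CPU_OFFSET = 18
_Q_SLOW_VAL = 3  # locked value set by the paravirt slowpath

def decode_qspinlock_val(val: int) -> dict:
    tail = val >> _Q_TAIL_IDX_OFFSET
    return {
        "locked": val & _Q_LOCKED_MASK,
        "pending": (val >> _Q_PENDING_SHIFT) & 0xff,
        "paravirt_slowpath": (val & _Q_LOCKED_MASK) == _Q_SLOW_VAL,
        # tail == 0 means no MCS queue; otherwise decode CPU and qnode index
        "tail_cpu": (val >> _Q_TAIL_CPU_OFFSET) - 1 if tail else None,
        "tail_idx": (tail & 0x3) if tail else None,
    }

# The dump above: counter = 0x40003
# -> locked = 3 (paravirt slow val), pending = 0, tail is qnode 0 of CPU 0
print(decode_qspinlock_val(0x40003))
```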
Cheers,
Longman
>
> From: Waiman Long <longman@redhat.com>
>
> When examining a contended spinlock in a crash dump, we can only find
> out the tail CPU in the MCS wait queue. There is no simple way to find
> out what other CPUs are waiting for the spinlock and which CPU is the
> lock owner.
>
> Make it easier to figure out this information by saving previous node
> data into the mcs_spinlock structure. This will allow us to reconstruct
> the MCS wait queue from tail to head. In order not to expand the size
> of mcs_spinlock, the original count field is split into two 16-bit
> chunks. The first chunk is for count and the second one is the new
> prev_node value.
>
> bits 0-1 : qnode index
> bits 2-15: CPU number + 1
>
> This prev_node value may be truncated if there are 16k or more CPUs in
> the system.
>
> The locked value in the queue head is also repurposed to hold an encoded
> qspinlock owner CPU number when acquiring the lock in the qspinlock
> slowpath of a contended lock.
>
> This lock owner information will not be available when the lock is
> acquired directly in the fast path or in the pending code path. There
> is no easy way around that.
>
> These changes should make analysis of a contended spinlock in a crash
> dump easier.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: https://patch.msgid.link/20240504024106.654319-1-longman@redhat.com
> ---
> include/asm-generic/mcs_spinlock.h | 5 +++--
> kernel/locking/mcs_spinlock.h | 8 +++++++-
> kernel/locking/qspinlock.c | 8 ++++++++
> 3 files changed, 18 insertions(+), 3 deletions(-)
>
> --- a/include/asm-generic/mcs_spinlock.h
> +++ b/include/asm-generic/mcs_spinlock.h
> @@ -3,8 +3,9 @@
>
> struct mcs_spinlock {
> struct mcs_spinlock *next;
> - int locked; /* 1 if lock acquired */
> - int count; /* nesting count, see qspinlock.c */
> + int locked; /* non-zero if lock acquired */
> + short count; /* nesting count, see qspinlock.c */
> + short prev_node; /* encoded previous node value */
> };
>
> /*
> --- a/kernel/locking/mcs_spinlock.h
> +++ b/kernel/locking/mcs_spinlock.h
> @@ -13,6 +13,12 @@
> #ifndef __LINUX_MCS_SPINLOCK_H
> #define __LINUX_MCS_SPINLOCK_H
>
> +/*
> + * Save an encoded version of the current MCS lock owner CPU to the
> + * mcs_spinlock structure of the next lock owner.
> + */
> +#define MCS_LOCKED (smp_processor_id() + 1)
> +
> #include <asm/mcs_spinlock.h>
>
> #ifndef arch_mcs_spin_lock_contended
> @@ -34,7 +40,7 @@
> * unlocking.
> */
> #define arch_mcs_spin_unlock_contended(l) \
> - smp_store_release((l), 1)
> + smp_store_release((l), MCS_LOCKED)
> #endif
>
> /*
> --- a/kernel/locking/qspinlock.c
> +++ b/kernel/locking/qspinlock.c
> @@ -250,6 +250,7 @@ void __lockfunc queued_spin_lock_slowpat
>
> node->locked = 0;
> node->next = NULL;
> + node->prev_node = 0;
> pv_init_node(node);
>
> /*
> @@ -278,6 +279,13 @@ void __lockfunc queued_spin_lock_slowpat
> next = NULL;
>
> /*
> + * The prev_node value is saved for crash dump analysis purpose only,
> + * it is not used within the qspinlock code. The encoded node value
> + * may be truncated if there are 16k or more CPUs in the system.
> + */
> + node->prev_node = old >> _Q_TAIL_IDX_OFFSET;
> +
> + /*
> * if there was a previous node; link it and wait until reaching the
> * head of the waitqueue.
> */
>
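With the patch applied, the two new pieces of state can be unpacked from a dump along these lines. Hypothetical helpers, following only the encodings stated in the commit message: prev_node holds the qnode index in bits 0-1 and CPU + 1 in bits 2-15, and a queue head's locked field is set to MCS_LOCKED, i.e. the handing-over CPU + 1.

```python
def decode_prev_node(prev_node: int):
    """Decode mcs_spinlock.prev_node: bits 0-1 = qnode index,
    bits 2-15 = previous CPU + 1; 0 means no previous node."""
    if prev_node == 0:
        return None
    return ((prev_node >> 2) - 1, prev_node & 0x3)

def decode_mcs_locked(locked: int):
    """locked is MCS_LOCKED (= smp_processor_id() + 1) of the CPU
    that handed over the lock; 0 means this node is still waiting."""
    return locked - 1 if locked else None
```

Walking prev_node from the tail node backwards should then reconstruct the whole MCS wait queue, truncated CPU numbers aside.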