public inbox for virtualization@lists.linux-foundation.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Jiri Slaby <jirislaby@kernel.org>
Cc: "Matthieu Baerts" <matttbe@kernel.org>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Stefano Garzarella" <sgarzare@redhat.com>,
	kvm@vger.kernel.org, virtualization@lists.linux.dev,
	Netdev <netdev@vger.kernel.org>,
	rcu@vger.kernel.org, "MPTCP Linux" <mptcp@lists.linux.dev>,
	"Linux Kernel" <linux-kernel@vger.kernel.org>,
	"Thomas Gleixner" <tglx@kernel.org>,
	"Shinichiro Kawasaki" <shinichiro.kawasaki@wdc.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"luto@kernel.org" <luto@kernel.org>,
	"Michal Koutný" <MKoutny@suse.com>,
	"Waiman Long" <longman@redhat.com>
Subject: Re: Stalls when starting a VSOCK listening socket: soft lockups, RCU stalls, timeout
Date: Mon, 2 Mar 2026 12:46:36 +0100	[thread overview]
Message-ID: <20260302114636.GL606826@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <863a5291-a636-47d0-891c-bb0524d2e134@kernel.org>

On Mon, Mar 02, 2026 at 06:28:38AM +0100, Jiri Slaby wrote:

> The state of the lock:
> 
> crash> struct rq.__lock -x ffff8d1a6fd35dc0
>   __lock = {
>     raw_lock = {
>       {
>         val = {
>           counter = 0x40003
>         },
>         {
>           locked = 0x3,
>           pending = 0x0
>         },
>         {
>           locked_pending = 0x3,
>           tail = 0x4
>         }
>       }
>     }
>   },
> 


That had me remember the below patch that never quite made it. I've
rebased it to something more recent so it applies.

If you stick that in, we might get a clue as to who is owning that lock.
Provided it all wants to reproduce well enough.

---
Subject: locking/qspinlock: Save previous node & owner CPU into mcs_spinlock
From: Waiman Long <longman@redhat.com>
Date: Fri, 3 May 2024 22:41:06 -0400

From: Waiman Long <longman@redhat.com>

When examining a contended spinlock in a crash dump, we can only find
out the tail CPU in the MCS wait queue. There is no simple way to find
out what other CPUs are waiting for the spinlock and which CPU is the
lock owner.

Make it easier to figure out these information by saving previous node
data into the mcs_spinlock structure. This will allow us to reconstruct
the MCS wait queue from tail to head. In order not to expand the size
of mcs_spinlock, the original count field is split into two 16-bit
chunks. The first chunk is for count and the second one is the new
prev_node value.

  bits 0-1 : qnode index
  bits 2-15: CPU number + 1

This prev_node value may be truncated if there are 16k or more CPUs in
the system.

The locked value in the queue head is also repurposed to hold an encoded
qspinlock owner CPU number when acquiring the lock in the qspinlock
slowpath of an contended lock.

This lock owner information will not be available when the lock is
acquired directly in the fast path or in the pending code path. There
is no easy way around that.

These changes should make analysis of a contended spinlock in a crash
dump easier.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20240504024106.654319-1-longman@redhat.com
---
 include/asm-generic/mcs_spinlock.h |    5 +++--
 kernel/locking/mcs_spinlock.h      |    8 +++++++-
 kernel/locking/qspinlock.c         |    8 ++++++++
 3 files changed, 18 insertions(+), 3 deletions(-)

--- a/include/asm-generic/mcs_spinlock.h
+++ b/include/asm-generic/mcs_spinlock.h
@@ -3,8 +3,9 @@
 
 struct mcs_spinlock {
 	struct mcs_spinlock *next;
-	int locked; /* 1 if lock acquired */
-	int count;  /* nesting count, see qspinlock.c */
+	int locked;	 /* non-zero if lock acquired */
+	short count;	 /* nesting count, see qspinlock.c */
+	short prev_node; /* encoded previous node value */
 };
 
 /*
--- a/kernel/locking/mcs_spinlock.h
+++ b/kernel/locking/mcs_spinlock.h
@@ -13,6 +13,12 @@
 #ifndef __LINUX_MCS_SPINLOCK_H
 #define __LINUX_MCS_SPINLOCK_H
 
+/*
+ * Save an encoded version of the current MCS lock owner CPU to the
+ * mcs_spinlock structure of the next lock owner.
+ */
+#define MCS_LOCKED	(smp_processor_id() + 1)
+
 #include <asm/mcs_spinlock.h>
 
 #ifndef arch_mcs_spin_lock_contended
@@ -34,7 +40,7 @@
  * unlocking.
  */
 #define arch_mcs_spin_unlock_contended(l)				\
-	smp_store_release((l), 1)
+	smp_store_release((l), MCS_LOCKED)
 #endif
 
 /*
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -250,6 +250,7 @@ void __lockfunc queued_spin_lock_slowpat
 
 	node->locked = 0;
 	node->next = NULL;
+	node->prev_node = 0;
 	pv_init_node(node);
 
 	/*
@@ -278,6 +279,13 @@ void __lockfunc queued_spin_lock_slowpat
 	next = NULL;
 
 	/*
+	 * The prev_node value is saved for crash dump analysis purpose only,
+	 * it is not used within the qspinlock code. The encoded node value
+	 * may be truncated if there are 16k or more CPUs in the system.
+	 */
+	node->prev_node = old >> _Q_TAIL_IDX_OFFSET;
+
+	/*
 	 * if there was a previous node; link it and wait until reaching the
 	 * head of the waitqueue.
 	 */

  reply	other threads:[~2026-03-02 11:46 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-06 11:54 Stalls when starting a VSOCK listening socket: soft lockups, RCU stalls, timeout Matthieu Baerts
2026-02-06 16:38 ` Stefano Garzarella
2026-02-06 17:13   ` Matthieu Baerts
2026-02-26 10:37 ` Jiri Slaby
2026-03-02  5:28   ` Jiri Slaby
2026-03-02 11:46     ` Peter Zijlstra [this message]
2026-03-02 14:30       ` Waiman Long
2026-03-05  7:00       ` Jiri Slaby
2026-03-05 11:53         ` Jiri Slaby
2026-03-05 12:20           ` Jiri Slaby
2026-03-05 16:16             ` Thomas Gleixner
2026-03-05 17:33               ` Jiri Slaby
2026-03-05 19:25                 ` Thomas Gleixner
2026-03-06  5:48                   ` Jiri Slaby
2026-03-06  9:57                     ` Thomas Gleixner
2026-03-06 10:16                       ` Jiri Slaby
2026-03-06 16:28                         ` Thomas Gleixner
2026-03-06 11:06                       ` Matthieu Baerts
2026-03-06 16:57                         ` Matthieu Baerts
2026-03-06 18:31                           ` Jiri Slaby
2026-03-06 18:44                             ` Matthieu Baerts
2026-03-06 21:40                           ` Matthieu Baerts
2026-03-06 15:24                       ` Peter Zijlstra
2026-03-07  9:01                         ` Thomas Gleixner
2026-03-07 22:29                           ` Thomas Gleixner
2026-03-08  9:15                             ` Thomas Gleixner
2026-03-08 16:55                               ` Jiri Slaby
2026-03-08 16:58                               ` Thomas Gleixner
2026-03-08 17:23                                 ` Matthieu Baerts
2026-03-09  8:43                                   ` Thomas Gleixner
2026-03-09 12:23                                     ` Matthieu Baerts
2026-03-10  8:09                                       ` Thomas Gleixner
2026-03-10  8:20                                         ` Thomas Gleixner
2026-03-10  8:56                                         ` Jiri Slaby
2026-03-10  9:00                                           ` Jiri Slaby
2026-03-10 10:03                                             ` Thomas Gleixner
2026-03-10 10:06                                               ` Thomas Gleixner
2026-03-10 11:24                                                 ` Matthieu Baerts
2026-03-10 11:54                                                   ` Peter Zijlstra
2026-03-10 12:28                                                     ` Thomas Gleixner
2026-03-10 13:40                                                       ` Matthieu Baerts
2026-03-10 13:47                                                         ` Thomas Gleixner
2026-03-10 15:51                                                           ` Matthieu Baerts
2026-03-03 13:23   ` Matthieu Baerts
2026-03-05  6:46     ` Jiri Slaby

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260302114636.GL606826@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=MKoutny@suse.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=jirislaby@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=luto@kernel.org \
    --cc=matttbe@kernel.org \
    --cc=mptcp@lists.linux.dev \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=sgarzare@redhat.com \
    --cc=shinichiro.kawasaki@wdc.com \
    --cc=stefanha@redhat.com \
    --cc=tglx@kernel.org \
    --cc=virtualization@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox