From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Barret Rhoden <brho@google.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Will Deacon <will@kernel.org>, Waiman Long <llong@redhat.com>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
Tejun Heo <tj@kernel.org>, Josh Don <joshdon@google.com>,
Dohyun Kim <dohyunkim@google.com>,
linux-arm-kernel@lists.infradead.org, kkd@meta.com,
kernel-team@meta.com
Subject: [PATCH bpf-next v4 09/25] rqspinlock: Protect pending bit owners from stalls
Date: Sat, 15 Mar 2025 21:05:25 -0700
Message-ID: <20250316040541.108729-10-memxor@gmail.com>
In-Reply-To: <20250316040541.108729-1-memxor@gmail.com>
The pending bit is used to avoid queueing when the lock is
uncontended, and has demonstrated benefits for the two-contender
scenario, especially on x86. If we acquire the pending bit and then
wait for the locked bit to clear, we may get stuck because the lock
owner is not making progress. Hence, this waiting loop must be
protected with a timeout check.
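
For reference, the bounded wait as implemented in the diff below: the
plain smp_cond_load_acquire() is replaced with a variant whose
condition also consults the deadline, using the RES_* timeout macros
introduced earlier in this series:

	if (val & _Q_LOCKED_MASK) {
		RES_RESET_TIMEOUT(ts, RES_DEF_TIMEOUT);
		/* ret becomes -ETIMEDOUT if the owner stalls past the deadline */
		res_smp_cond_load_acquire(&lock->locked,
					  !VAL || RES_CHECK_TIMEOUT(ts, ret));
	}
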
To recover gracefully once we decide to abort the lock acquisition
attempt in this case, we must unset the pending bit, since we own it.
If every waiter undoes its changes and exits gracefully, the lock word
is restored to the unlocked state once all participants (owner,
waiters) have been recovered, and the lock remains usable. Hence, set
the pending bit back to zero before returning to the caller.
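
For illustration, a hypothetical caller sketch (not part of this
patch; the actual lock/unlock entry points arrive later in the
series) of how the new return value is meant to be consumed:

	/*
	 * Hypothetical caller, for illustration only: on timeout the
	 * slowpath has already cleared the pending bit it owned, so the
	 * caller can propagate the error instead of spinning forever.
	 */
	ret = resilient_queued_spin_lock_slowpath(lock, val);
	if (ret)
		return ret;	/* -ETIMEDOUT */
	/* lock acquired; critical section follows */
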
Introduce a lockevent (rqspinlock_lock_timeout) to capture timeout
event statistics.
Reviewed-by: Barret Rhoden <brho@google.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/asm-generic/rqspinlock.h  |  2 +-
 kernel/bpf/rqspinlock.c           | 32 ++++++++++++++++++++++++++-----
 kernel/locking/lock_events_list.h |  5 +++++
 3 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
index 5dd4dd8aee69..9bd11cb7acd6 100644
--- a/include/asm-generic/rqspinlock.h
+++ b/include/asm-generic/rqspinlock.h
@@ -15,7 +15,7 @@
struct qspinlock;
typedef struct qspinlock rqspinlock_t;
-extern void resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val);
+extern int resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val);
/*
* Default timeout for waiting loops is 0.25 seconds
diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
index d429b923b58f..262294cfd36f 100644
--- a/kernel/bpf/rqspinlock.c
+++ b/kernel/bpf/rqspinlock.c
@@ -138,6 +138,10 @@ static DEFINE_PER_CPU_ALIGNED(struct qnode, rqnodes[_Q_MAX_NODES]);
* @lock: Pointer to queued spinlock structure
* @val: Current value of the queued spinlock 32-bit word
*
+ * Return:
+ * * 0 - Lock was acquired successfully.
+ * * -ETIMEDOUT - Lock acquisition failed because of timeout.
+ *
* (queue tail, pending bit, lock value)
*
* fast : slow : unlock
@@ -154,12 +158,12 @@ static DEFINE_PER_CPU_ALIGNED(struct qnode, rqnodes[_Q_MAX_NODES]);
* contended : (*,x,y) +--> (*,0,0) ---> (*,0,1) -' :
* queue : ^--' :
*/
-void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
+int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
{
struct mcs_spinlock *prev, *next, *node;
struct rqspinlock_timeout ts;
+ int idx, ret = 0;
u32 old, tail;
- int idx;
BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));
@@ -217,8 +221,25 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
* clear_pending_set_locked() implementations imply full
* barriers.
*/
- if (val & _Q_LOCKED_MASK)
- smp_cond_load_acquire(&lock->locked, !VAL);
+ if (val & _Q_LOCKED_MASK) {
+ RES_RESET_TIMEOUT(ts, RES_DEF_TIMEOUT);
+ res_smp_cond_load_acquire(&lock->locked, !VAL || RES_CHECK_TIMEOUT(ts, ret));
+ }
+
+ if (ret) {
+ /*
+ * We waited for the locked bit to go back to 0, as the pending
+ * waiter, but timed out. We need to clear the pending bit since
+ * we own it. Once a stuck owner has been recovered, the lock
+ * must be restored to a valid state, hence removing the pending
+ * bit is necessary.
+ *
+ * *,1,* -> *,0,*
+ */
+ clear_pending(lock);
+ lockevent_inc(rqspinlock_lock_timeout);
+ return ret;
+ }
/*
* take ownership and clear the pending bit.
@@ -227,7 +248,7 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
*/
clear_pending_set_locked(lock);
lockevent_inc(lock_pending);
- return;
+ return 0;
/*
* End of pending bit optimistic spinning and beginning of MCS
@@ -378,5 +399,6 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
* release the node
*/
__this_cpu_dec(rqnodes[0].mcs.count);
+ return 0;
}
EXPORT_SYMBOL_GPL(resilient_queued_spin_lock_slowpath);
diff --git a/kernel/locking/lock_events_list.h b/kernel/locking/lock_events_list.h
index 97fb6f3f840a..c5286249994d 100644
--- a/kernel/locking/lock_events_list.h
+++ b/kernel/locking/lock_events_list.h
@@ -49,6 +49,11 @@ LOCK_EVENT(lock_use_node4) /* # of locking ops that use 4th percpu node */
LOCK_EVENT(lock_no_node) /* # of locking ops w/o using percpu node */
#endif /* CONFIG_QUEUED_SPINLOCKS */
+/*
+ * Locking events for Resilient Queued Spin Lock
+ */
+LOCK_EVENT(rqspinlock_lock_timeout) /* # of locking ops that timeout */
+
/*
* Locking events for rwsem
*/
--
2.47.1