From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
Ritesh Oedayrajsingh Varma <ritesh@superluminal.eu>,
Jelle van der Beek <jelle@superluminal.eu>,
kkd@meta.com, kernel-team@meta.com
Subject: [PATCH bpf-next v2 4/6] rqspinlock: Disable spinning for trylock fallback
Date: Fri, 28 Nov 2025 23:28:00 +0000 [thread overview]
Message-ID: <20251128232802.1031906-5-memxor@gmail.com> (raw)
In-Reply-To: <20251128232802.1031906-1-memxor@gmail.com>
The original trylock fallback was inherited from qspinlock, and then
reused for the reentrant NMIs while the slow path is active. However,
under contention, it is very unlikely for the trylock to succeed in
taking the lock. In addition, a trylock also has no fairness guarantees,
and thus is prone to starvation issues under extreme scenarios.
The original qspinlock had no choice in terms of returning an error the
caller; if the node count was breached, it had to fall back to trylock
to attempt to take the lock. In case of rqspinlock, we do have the
option of returning to the user. Thus, simply attempt the trylock once,
and instead of spinning, return an error in case the lock cannot be
taken.
This ends up significantly reducing the time spent in the trylock
fallback, since we no longer wait for the timeout duration trying to
aimlessly acquire the lock when there's a high-probability that under
contention, it won't be available to us anyway.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
kernel/bpf/rqspinlock.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
index e602cbbbd029..e35b06fcf9ee 100644
--- a/kernel/bpf/rqspinlock.c
+++ b/kernel/bpf/rqspinlock.c
@@ -450,19 +450,17 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
* not be nested NMIs taking spinlocks. That may not be true in
* some architectures even though the chance of needing more than
* 4 nodes will still be extremely unlikely. When that happens,
- * we fall back to spinning on the lock directly without using
- * any MCS node. This is not the most elegant solution, but is
- * simple enough.
+ * we fall back to attempting a trylock operation without using
+ * any MCS node. Unlike qspinlock which cannot fail, we have the
+ * option of failing the slow path, and under contention, such a
+ * trylock spinning will likely be treated unfairly due to lack of
+ * queueing, hence do not spin.
*/
if (unlikely(idx >= _Q_MAX_NODES || (in_nmi() && idx > 0))) {
lockevent_inc(lock_no_node);
- RES_RESET_TIMEOUT(ts, RES_DEF_TIMEOUT);
- while (!queued_spin_trylock(lock)) {
- if (RES_CHECK_TIMEOUT(ts, ret, ~0u)) {
- lockevent_inc(rqspinlock_lock_timeout);
- goto err_release_node;
- }
- cpu_relax();
+ if (!queued_spin_trylock(lock)) {
+ ret = -EDEADLK;
+ goto err_release_node;
}
goto release;
}
--
2.51.0
next prev parent reply other threads:[~2025-11-28 23:28 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-28 23:27 [PATCH bpf-next v2 0/6] Limited queueing in NMI for rqspinlock Kumar Kartikeya Dwivedi
2025-11-28 23:27 ` [PATCH bpf-next v2 1/6] rqspinlock: Enclose lock/unlock within lock entry acquisitions Kumar Kartikeya Dwivedi
2025-11-28 23:27 ` [PATCH bpf-next v2 2/6] rqspinlock: Perform AA checks immediately Kumar Kartikeya Dwivedi
2025-11-28 23:27 ` [PATCH bpf-next v2 3/6] rqspinlock: Use trylock fallback when per-CPU rqnode is busy Kumar Kartikeya Dwivedi
2025-11-29 1:12 ` bot+bpf-ci
2025-11-29 18:13 ` Kumar Kartikeya Dwivedi
2025-11-28 23:28 ` Kumar Kartikeya Dwivedi [this message]
2025-11-28 23:28 ` [PATCH bpf-next v2 5/6] rqspinlock: Precede non-head waiter queueing with AA check Kumar Kartikeya Dwivedi
2025-11-28 23:28 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add success stats to rqspinlock stress test Kumar Kartikeya Dwivedi
2025-11-29 17:40 ` [PATCH bpf-next v2 0/6] Limited queueing in NMI for rqspinlock patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251128232802.1031906-5-memxor@gmail.com \
--to=memxor@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=jelle@superluminal.eu \
--cc=kernel-team@meta.com \
--cc=kkd@meta.com \
--cc=martin.lau@kernel.org \
--cc=ritesh@superluminal.eu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox