From: Waiman Long <llong@redhat.com>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Waiman Long <llong@redhat.com>
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
Tejun Heo <tj@kernel.org>, Barret Rhoden <brho@google.com>,
Josh Don <joshdon@google.com>, Dohyun Kim <dohyunkim@google.com>,
kernel-team@meta.com
Subject: Re: [PATCH bpf-next v1 14/22] rqspinlock: Add macros for rqspinlock usage
Date: Wed, 8 Jan 2025 20:11:43 -0500 [thread overview]
Message-ID: <2ff3a68c-1328-4b47-a4aa-0365b3f1809b@redhat.com> (raw)
In-Reply-To: <CAP01T77QD_pYBVS4PfG3jDeXObKHZJkV2nQX+0njv11oKTEqRA@mail.gmail.com>
On 1/8/25 3:41 PM, Kumar Kartikeya Dwivedi wrote:
> On Wed, 8 Jan 2025 at 22:26, Waiman Long <llong@redhat.com> wrote:
>> On 1/7/25 8:59 AM, Kumar Kartikeya Dwivedi wrote:
>>> Introduce helper macros that wrap around the rqspinlock slow path and
>>> provide an interface analogous to the raw_spin_lock API. Note that
>>> in case of error conditions, preemption and IRQ disabling is
>>> automatically unrolled before returning the error back to the caller.
>>>
>>> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>>> ---
>>> include/asm-generic/rqspinlock.h | 58 ++++++++++++++++++++++++++++++++
>>> 1 file changed, 58 insertions(+)
>>>
>>> diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
>>> index dc436ab01471..53be8426373c 100644
>>> --- a/include/asm-generic/rqspinlock.h
>>> +++ b/include/asm-generic/rqspinlock.h
>>> @@ -12,8 +12,10 @@
>>> #include <linux/types.h>
>>> #include <vdso/time64.h>
>>> #include <linux/percpu.h>
>>> +#include <asm/qspinlock.h>
>>>
>>> struct qspinlock;
>>> +typedef struct qspinlock rqspinlock_t;
>>>
>>> extern int resilient_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val, u64 timeout);
>>>
>>> @@ -82,4 +84,60 @@ static __always_inline void release_held_lock_entry(void)
>>> this_cpu_dec(rqspinlock_held_locks.cnt);
>>> }
>>>
>>> +/**
>>> + * res_spin_lock - acquire a queued spinlock
>>> + * @lock: Pointer to queued spinlock structure
>>> + */
>>> +static __always_inline int res_spin_lock(rqspinlock_t *lock)
>>> +{
>>> + int val = 0;
>>> +
>>> + if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL))) {
>>> + grab_held_lock_entry(lock);
>>> + return 0;
>>> + }
>>> + return resilient_queued_spin_lock_slowpath(lock, val, RES_DEF_TIMEOUT);
>>> +}
>>> +
>>> +static __always_inline void res_spin_unlock(rqspinlock_t *lock)
>>> +{
>>> + struct rqspinlock_held *rqh = this_cpu_ptr(&rqspinlock_held_locks);
>>> +
>>> + if (unlikely(rqh->cnt > RES_NR_HELD))
>>> + goto unlock;
>>> + WRITE_ONCE(rqh->locks[rqh->cnt - 1], NULL);
>>> + /*
>>> + * Release barrier, ensuring ordering. See release_held_lock_entry.
>>> + */
>>> +unlock:
>>> + queued_spin_unlock(lock);
>>> + this_cpu_dec(rqspinlock_held_locks.cnt);
>>> +}
>>> +
>>> +#define raw_res_spin_lock_init(lock) ({ *(lock) = (struct qspinlock)__ARCH_SPIN_LOCK_UNLOCKED; })
>>> +
>>> +#define raw_res_spin_lock(lock) \
>>> + ({ \
>>> + int __ret; \
>>> + preempt_disable(); \
>>> + __ret = res_spin_lock(lock); \
>>> + if (__ret) \
>>> + preempt_enable(); \
>>> + __ret; \
>>> + })
>>> +
>>> +#define raw_res_spin_unlock(lock) ({ res_spin_unlock(lock); preempt_enable(); })
>>> +
>>> +#define raw_res_spin_lock_irqsave(lock, flags) \
>>> + ({ \
>>> + int __ret; \
>>> + local_irq_save(flags); \
>>> + __ret = raw_res_spin_lock(lock); \
>>> + if (__ret) \
>>> + local_irq_restore(flags); \
>>> + __ret; \
>>> + })
>>> +
>>> +#define raw_res_spin_unlock_irqrestore(lock, flags) ({ raw_res_spin_unlock(lock); local_irq_restore(flags); })
>>> +
>>> #endif /* __ASM_GENERIC_RQSPINLOCK_H */
>> Lockdep calls aren't included in the helper functions. That means all
>> the *res_spin_lock*() calls will be outside the purview of lockdep. That
>> also means a multi-CPU circular locking dependency involving a mixture
>> of qspinlocks and rqspinlocks may not be detectable.
> Yes, this is true, but I am not sure whether lockdep fits well in this
> case, or how to map its semantics.
> Some BPF users (e.g. in patch 17) expect and rely on rqspinlock to
> return errors on AA deadlocks, as nesting is possible, so we'll get
> false alarms with it. Lockdep also needs to treat rqspinlock as a
> trylock, since it's essentially fallible, and IIUC it skips diagnosing
> in those cases.
Yes, we can certainly treat rqspinlock as a trylock.
> Most of the users use rqspinlock because it is expected a deadlock may
> be constructed at runtime (either due to BPF programs or by attaching
> programs to the kernel), so lockdep splats will not be helpful on
> debug kernels.
In most cases, lockdep will report a circular locking dependency (a
potential deadlock) before a real deadlock happens, since an actual
deadlock requires the right combination of events in a specific
sequence. So lockdep can report a deadlock that the runtime check in
rqspinlock never sees, because no lock stall occurs. Also, rqspinlock
will not see the other locks held in the current context.
> Say if a mix of both qspinlock and rqspinlock were involved in an ABBA
> situation, as long as rqspinlock is being acquired on one of the
> threads, it will still timeout even if check_deadlock fails to
> establish presence of a deadlock. This will mean the qspinlock call on
> the other side will make progress as long as the kernel unwinds locks
> correctly on failures (by handling rqspinlock errors and releasing
> held locks on the way out).
That is true only if the last lock to be acquired is an rqspinlock. If
all the rqspinlocks in the circular path have already been acquired, no
unwinding is possible.
That is probably not an issue with the limited rqspinlock conversion in
this patch series. In the future, as more and more locks are converted
to use rqspinlock, this scenario may arise.
Cheers,
Longman