From: Peter Zijlstra <peterz@infradead.org>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
lkmm@lists.linux.dev, Ingo Molnar <mingo@kernel.org>,
Will Deacon <will@kernel.org>, Waiman Long <longman@redhat.com>,
Davidlohr Bueso <dave@stgolabs.net>,
"Paul E. McKenney" <paulmck@kernel.org>,
Josh Triplett <josh@joshtriplett.org>,
Frederic Weisbecker <frederic@kernel.org>,
Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
Joel Fernandes <joelagnelf@nvidia.com>,
Uladzislau Rezki <urezki@gmail.com>,
Steven Rostedt <rostedt@goodmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Zqiang <qiang.zhang@linux.dev>, Breno Leitao <leitao@debian.org>,
aeh@meta.com, netdev@vger.kernel.org, edumazet@google.com,
jhs@mojatatu.com, kernel-team@meta.com,
Erik Lundgren <elundgren@meta.com>
Subject: Re: [PATCH 1/8] Introduce simple hazard pointers
Date: Wed, 25 Jun 2025 12:00:32 +0200
Message-ID: <20250625100032.GA1613376@noisy.programming.kicks-ass.net>
In-Reply-To: <20250625031101.12555-2-boqun.feng@gmail.com>
On Tue, Jun 24, 2025 at 08:10:54PM -0700, Boqun Feng wrote:
> As its name suggests, simple hazard pointers (shazptr) is a
> simplification of hazard pointers [1]: it has only one hazard pointer
> slot per-CPU and is targeted for simple use cases where the read-side
> already has preemption disabled. It's a trade-off between full features
> of a normal hazard pointer implementation (multiple slots, dynamic slot
> allocation, etc.) and the simple use scenario.
>
> Since there's only one slot per-CPU, shazptr read-side critical
> section nesting is a problem that needs to be resolved, because at the
> very least, interrupts and NMIs can introduce nested shazptr read-side
> critical sections. A SHAZPTR_WILDCARD is introduced to resolve this:
> SHAZPTR_WILDCARD is a special address value that blocks *all* shazptr
> waiters. In an interrupt-causing shazptr read-side critical section
> nesting case (i.e. an interrupt happens while the per-CPU hazard pointer
> slot is being used and tries to acquire a hazard pointer itself), the
> inner critical section will switch the value of the hazard pointer slot
> to SHAZPTR_WILDCARD, and let the outer critical section eventually zero
> the slot. The SHAZPTR_WILDCARD still provides the correct protection
> because it blocks all the waiters.
Don't we typically name such a thing a tombstone?
> It's true that once the wildcard mechanism is activated, the shazptr
> mechanism may be downgraded to something similar to RCU (and probably
> with a worse implementation), which generally has longer wait time and
> larger memory footprint compared to a typical hazard pointer
> implementation. However, that can only happen with a lot of users using
> hazard pointers, and then it's reasonable to introduce the
> fully-featured hazard pointer implementation [2] and switch users to it.
>
> Note that shazptr_protect() may be added later; the current potential
> usage doesn't require it, and a shazptr_acquire(), which installs the
> protected value to the hazard pointer slot and provides the smp_mb(), is
> enough for now.
>
> [1]: M. M. Michael, "Hazard pointers: safe memory reclamation for
> lock-free objects," in IEEE Transactions on Parallel and
> Distributed Systems, vol. 15, no. 6, pp. 491-504, June 2004
>
> Link: https://lore.kernel.org/lkml/20240917143402.930114-1-boqun.feng@gmail.com/ [2]
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
> include/linux/shazptr.h | 73 ++++++++++++++++++++++++++++++++++++++++
> kernel/locking/Makefile | 2 +-
> kernel/locking/shazptr.c | 29 ++++++++++++++++
> 3 files changed, 103 insertions(+), 1 deletion(-)
> create mode 100644 include/linux/shazptr.h
> create mode 100644 kernel/locking/shazptr.c
>
> diff --git a/include/linux/shazptr.h b/include/linux/shazptr.h
> new file mode 100644
> index 000000000000..287cd04b4be9
> --- /dev/null
> +++ b/include/linux/shazptr.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Simple hazard pointers
> + *
> + * Copyright (c) 2025, Microsoft Corporation.
> + *
> + * Author: Boqun Feng <boqun.feng@gmail.com>
> + *
> + * A simple variant of hazard pointers, the users must ensure the preemption
> + * is already disabled when calling a shazptr_acquire() to protect an address.
> + * If one shazptr_acquire() is called after another shazptr_acquire() has been
> + * called without the corresponding shazptr_clear() has been called, the later
> + * shazptr_acquire() must be cleared first.
> + *
> + * The most suitable usage is when only one address need to be protected in a
> + * preemption disabled critical section.
It might be useful to include some example code here to illustrate how
this is supposed to be used.
> + */
> +
> +#ifndef _LINUX_SHAZPTR_H
> +#define _LINUX_SHAZPTR_H
> +
> +#include <linux/cleanup.h>
> +#include <linux/percpu.h>
> +
> +/* Make ULONG_MAX the wildcard value */
> +#define SHAZPTR_WILDCARD ((void *)(ULONG_MAX))
Right, I typically write that like: ((void *)-1L) or ((void *)~0UL)
> +
> +DECLARE_PER_CPU_SHARED_ALIGNED(void *, shazptr_slots);
> +
> +/* Represent a held hazard pointer slot */
> +struct shazptr_guard {
> + void **slot;
> + bool use_wildcard;
> +};
Natural alignment ensures the LSB of that pointer is 0, which is enough
space to stick that bool in, no?
> +
> +/*
> + * Acquire a hazptr slot and begin the hazard pointer critical section.
> + *
> + * Must be called with preemption disabled, and preemption must remain disabled
> + * until shazptr_clear().
> + */
> +static inline struct shazptr_guard shazptr_acquire(void *ptr)
> +{
> + struct shazptr_guard guard = {
> + /* Preemption is disabled. */
> + .slot = this_cpu_ptr(&shazptr_slots),
What you're trying to say with that comment is that this_cpu_ptr()
will complain if preemption is not already disabled, and as such this
verifies the assumption?

You can also add:

	lockdep_assert_preemption_disabled();

at the start of this function and then all these comments can go in the
bin, no?
> + .use_wildcard = false,
> + };
> +
> + if (likely(!READ_ONCE(*guard.slot))) {
> + WRITE_ONCE(*guard.slot, ptr);
> + } else {
> + guard.use_wildcard = true;
> + WRITE_ONCE(*guard.slot, SHAZPTR_WILDCARD);
> + }
> +
> + smp_mb(); /* Synchronize with smp_mb() at synchronize_shazptr(). */
> +
> + return guard;
> +}
> +
> +static inline void shazptr_clear(struct shazptr_guard guard)
> +{
> + /* Only clear the slot when the outermost guard is released */
> + if (likely(!guard.use_wildcard))
> + smp_store_release(guard.slot, NULL); /* Pair with ACQUIRE at synchronize_shazptr() */
> +}
> +
> +void synchronize_shazptr(void *ptr);
> +
> +DEFINE_CLASS(shazptr, struct shazptr_guard, shazptr_clear(_T),
> + shazptr_acquire(ptr), void *ptr);
> +#endif
> diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
> index a114949eeed5..1517076c98ec 100644
> --- a/kernel/locking/Makefile
> +++ b/kernel/locking/Makefile
> @@ -3,7 +3,7 @@
> # and is generally not a function of system call inputs.
> KCOV_INSTRUMENT := n
>
> -obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
> +obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o shazptr.o
>
> # Avoid recursion lockdep -> sanitizer -> ... -> lockdep & improve performance.
> KASAN_SANITIZE_lockdep.o := n
> diff --git a/kernel/locking/shazptr.c b/kernel/locking/shazptr.c
> new file mode 100644
> index 000000000000..991fd1a05cfd
> --- /dev/null
> +++ b/kernel/locking/shazptr.c
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Simple hazard pointers
> + *
> + * Copyright (c) 2025, Microsoft Corporation.
> + *
> + * Author: Boqun Feng <boqun.feng@gmail.com>
> + */
> +
> +#include <linux/atomic.h>
> +#include <linux/cpumask.h>
> +#include <linux/shazptr.h>
> +
> +DEFINE_PER_CPU_SHARED_ALIGNED(void *, shazptr_slots);
> +EXPORT_PER_CPU_SYMBOL_GPL(shazptr_slots);
> +
> +void synchronize_shazptr(void *ptr)
> +{
> + int cpu;
lockdep_assert_preemption_enabled();
> +
> + smp_mb(); /* Synchronize with the smp_mb() in shazptr_acquire(). */
> + for_each_possible_cpu(cpu) {
> + void **slot = per_cpu_ptr(&shazptr_slots, cpu);
> + /* Pair with smp_store_release() in shazptr_clear(). */
> + smp_cond_load_acquire(slot,
> + VAL != ptr && VAL != SHAZPTR_WILDCARD);
> + }
> +}
> +EXPORT_SYMBOL_GPL(synchronize_shazptr);
> --
> 2.39.5 (Apple Git-154)
>