From: Boqun Feng <boqun.feng@gmail.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Joel Fernandes <joel@joelfernandes.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
linux-kernel@vger.kernel.org, Nicholas Piggin <npiggin@gmail.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Will Deacon <will@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Alan Stern <stern@rowland.harvard.edu>,
John Stultz <jstultz@google.com>,
Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Frederic Weisbecker <frederic@kernel.org>,
Josh Triplett <josh@joshtriplett.org>,
Uladzislau Rezki <urezki@gmail.com>,
Steven Rostedt <rostedt@goodmis.org>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Zqiang <qiang.zhang1211@gmail.com>,
Ingo Molnar <mingo@redhat.com>, Waiman Long <longman@redhat.com>,
Mark Rutland <mark.rutland@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Vlastimil Babka <vbabka@suse.cz>,
maged.michael@gmail.com, Mateusz Guzik <mjguzik@gmail.com>,
Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>,
rcu@vger.kernel.org, linux-mm@kvack.org, lkmm@lists.linux.dev
Subject: Re: [RFC PATCH v4 4/4] hazptr: Migrate per-CPU slots to backup slot on context switch
Date: Fri, 19 Dec 2025 07:16:37 +0900
Message-ID: <aUR9RfVChdcDncwX@tardis-2.local>
In-Reply-To: <20251218014531.3793471-5-mathieu.desnoyers@efficios.com>
On Wed, Dec 17, 2025 at 08:45:31PM -0500, Mathieu Desnoyers wrote:
> Integrate with the scheduler to migrate per-CPU slots to the backup slot
> on context switch. This ensures that the per-CPU slots won't be used by
> blocked or preempted tasks holding on hazard pointers for a long time.
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Alan Stern <stern@rowland.harvard.edu>
> Cc: John Stultz <jstultz@google.com>
> Cc: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Zqiang <qiang.zhang1211@gmail.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Waiman Long <longman@redhat.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: maged.michael@gmail.com
> Cc: Mateusz Guzik <mjguzik@gmail.com>
> Cc: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>
> Cc: rcu@vger.kernel.org
> Cc: linux-mm@kvack.org
> Cc: lkmm@lists.linux.dev
> ---
> include/linux/hazptr.h | 63 ++++++++++++++++++++++++++++++++++++++++--
> include/linux/sched.h | 4 +++
> init/init_task.c | 3 ++
> kernel/Kconfig.preempt | 10 +++++++
> kernel/fork.c | 3 ++
> kernel/sched/core.c | 2 ++
> 6 files changed, 83 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/hazptr.h b/include/linux/hazptr.h
> index 70c066ddb0f5..10ac53a42a7a 100644
> --- a/include/linux/hazptr.h
> +++ b/include/linux/hazptr.h
> @@ -24,6 +24,7 @@
> #include <linux/percpu.h>
> #include <linux/types.h>
> #include <linux/cleanup.h>
> +#include <linux/sched.h>
>
> /* 8 slots (each sizeof(void *)) fit in a single cache line. */
> #define NR_HAZPTR_PERCPU_SLOTS 8
> @@ -46,6 +47,9 @@ struct hazptr_ctx {
> struct hazptr_slot *slot;
> /* Backup slot in case all per-CPU slots are used. */
> struct hazptr_backup_slot backup_slot;
> +#ifdef CONFIG_PREEMPT_HAZPTR
I would suggest we make CONFIG_PREEMPT_HAZPTR always enabled, so there is
no need for a config option. Do we have a measurement of the additional cost?
> + struct list_head preempt_node;
> +#endif
> };
>
> struct hazptr_percpu_slots {
> @@ -98,6 +102,50 @@ bool hazptr_slot_is_backup(struct hazptr_ctx *ctx, struct hazptr_slot *slot)
> return slot == &ctx->backup_slot.slot;
> }
>
> +#ifdef CONFIG_PREEMPT_HAZPTR
> +static inline
> +void hazptr_chain_task_ctx(struct hazptr_ctx *ctx)
> +{
> + list_add(&ctx->preempt_node, &current->hazptr_ctx_list);
> +}
> +
> +static inline
> +void hazptr_unchain_task_ctx(struct hazptr_ctx *ctx)
> +{
> + list_del(&ctx->preempt_node);
> +}
> +
I think you need to disable interrupts in chain/unchain because of the
potential readers in interrupt context; with that, I think you can drop
the preempt disabling in hazptr_release(). Let's aim to support readers
in interrupt handlers, because at least lockdep needs that.
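
To illustrate the shape of that suggestion, here is a minimal userspace
sketch, not kernel code. It assumes that blocking signals with
sigprocmask() is an acceptable stand-in for local_irq_save() and
local_irq_restore(), with a signal handler playing the role of an
interrupt-context reader. The names irq_save(), irq_restore(),
chain_ctx(), unchain_ctx(), ctx_list and struct node are hypothetical and
do not appear in the patch. In the kernel, a task running with local
interrupts disabled also cannot be preempted on that CPU, which is
presumably why the preempt guard in hazptr_release() would become
unnecessary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct node {
	struct node *prev, *next;
};

/* Stand-in for current->hazptr_ctx_list: an empty circular list. */
static struct node ctx_list = { &ctx_list, &ctx_list };

/* Analog of local_irq_save(): block all signals, remember the old mask. */
static void irq_save(sigset_t *flags)
{
	sigset_t all;

	sigfillset(&all);
	sigprocmask(SIG_BLOCK, &all, flags);
}

/* Analog of local_irq_restore(): put the saved mask back. */
static void irq_restore(const sigset_t *flags)
{
	sigprocmask(SIG_SETMASK, flags, NULL);
}

/* Add @n at the head of the list; no handler can see a half-linked node. */
static void chain_ctx(struct node *n)
{
	sigset_t flags;

	assert(n != &ctx_list);
	irq_save(&flags);
	n->next = ctx_list.next;
	n->prev = &ctx_list;
	ctx_list.next->prev = n;
	ctx_list.next = n;
	irq_restore(&flags);
}

/* Remove @n from the list under the same protection. */
static void unchain_ctx(struct node *n)
{
	sigset_t flags;

	irq_save(&flags);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;
	irq_restore(&flags);
}

/* Count the chained nodes. */
static int ctx_list_len(void)
{
	int len = 0;

	for (struct node *p = ctx_list.next; p != &ctx_list; p = p->next)
		len++;
	return len;
}
```

The save/restore pair (rather than an unconditional enable) keeps the
helpers usable from a context that already has "interrupts" blocked.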
Regards,
Boqun
> +static inline
> +void hazptr_note_context_switch(void)
> +{
> + struct hazptr_ctx *ctx;
> +
> + list_for_each_entry(ctx, &current->hazptr_ctx_list, preempt_node) {
> + struct hazptr_slot *slot;
> +
> + if (hazptr_slot_is_backup(ctx, ctx->slot))
> + continue;
> + slot = hazptr_chain_backup_slot(ctx);
> + /*
> + * Move hazard pointer from per-CPU slot to backup slot.
> + * This requires hazard pointer synchronize to iterate
> + * on per-CPU slots with load-acquire before iterating
> + * on the overflow list.
> + */
> + WRITE_ONCE(slot->addr, ctx->slot->addr);
> + /*
> + * store-release orders store to backup slot addr before
> + * store to per-CPU slot addr.
> + */
> + smp_store_release(&ctx->slot->addr, NULL);
> + }
> +}
> +#else
> +static inline void hazptr_chain_task_ctx(struct hazptr_ctx *ctx) { }
> +static inline void hazptr_unchain_task_ctx(struct hazptr_ctx *ctx) { }
> +static inline void hazptr_note_context_switch(void) { }
> +#endif
> +
> /*
> * hazptr_acquire: Load pointer at address and protect with hazard pointer.
> *
> @@ -114,6 +162,7 @@ void *hazptr_acquire(struct hazptr_ctx *ctx, void * const * addr_p)
> struct hazptr_slot *slot = NULL;
> void *addr, *addr2;
>
> + ctx->slot = NULL;
> /*
> * Load @addr_p to know which address should be protected.
> */
> @@ -121,7 +170,9 @@ void *hazptr_acquire(struct hazptr_ctx *ctx, void * const * addr_p)
> for (;;) {
> if (!addr)
> return NULL;
> +
> guard(preempt)();
> + hazptr_chain_task_ctx(ctx);
> if (likely(!hazptr_slot_is_backup(ctx, slot))) {
> slot = hazptr_get_free_percpu_slot();
> /*
> @@ -140,8 +191,11 @@ void *hazptr_acquire(struct hazptr_ctx *ctx, void * const * addr_p)
> * Re-load @addr_p after storing it to the hazard pointer slot.
> */
> addr2 = READ_ONCE(*addr_p); /* Load A */
> - if (likely(ptr_eq(addr2, addr)))
> + if (likely(ptr_eq(addr2, addr))) {
> + ctx->slot = slot;
> + /* Success. Break loop, enable preemption and return. */
> break;
> + }
> /*
> * If @addr_p content has changed since the first load,
> * release the hazard pointer and try again.
> @@ -150,11 +204,14 @@ void *hazptr_acquire(struct hazptr_ctx *ctx, void * const * addr_p)
> if (!addr2) {
> if (hazptr_slot_is_backup(ctx, slot))
> hazptr_unchain_backup_slot(ctx);
> + hazptr_unchain_task_ctx(ctx);
> + /* Loaded NULL. Enable preemption and return NULL. */
> return NULL;
> }
> addr = addr2;
> + hazptr_unchain_task_ctx(ctx);
> + /* Enable preemption and retry. */
> }
> - ctx->slot = slot;
> /*
> * Use addr2 loaded from the second READ_ONCE() to preserve
> * address dependency ordering.
> @@ -170,11 +227,13 @@ void hazptr_release(struct hazptr_ctx *ctx, void *addr)
>
> if (!addr)
> return;
> + guard(preempt)();
> slot = ctx->slot;
> WARN_ON_ONCE(slot->addr != addr);
> smp_store_release(&slot->addr, NULL);
> if (unlikely(hazptr_slot_is_backup(ctx, slot)))
> hazptr_unchain_backup_slot(ctx);
> + hazptr_unchain_task_ctx(ctx);
> }
[...]
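
The ordering described in the comments of hazptr_note_context_switch()
can be sketched in userspace with C11 atomics. This is only an
illustration under assumptions: struct slot, migrate() and is_protected()
are hypothetical names, the relaxed store stands in for WRITE_ONCE(), the
release store for smp_store_release(), and the scanner is reduced to a
single per-CPU slot and a single backup slot instead of the overflow list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct hazptr_slot. */
struct slot {
	_Atomic(void *) addr;
};

/* Context-switch side: move one hazard pointer to the backup slot. */
static void migrate(struct slot *percpu, struct slot *backup)
{
	void *addr;

	assert(percpu != backup);
	addr = atomic_load_explicit(&percpu->addr, memory_order_relaxed);
	/* Publish the address in the backup slot first. */
	atomic_store_explicit(&backup->addr, addr, memory_order_relaxed);
	/* Release: the backup store is ordered before the per-CPU clear. */
	atomic_store_explicit(&percpu->addr, NULL, memory_order_release);
}

/*
 * Scanner side: read the per-CPU slot with acquire before the backup
 * slot. If the acquire load observes the NULL written by migrate(),
 * the later backup load is guaranteed to observe the migrated address.
 */
static int is_protected(struct slot *percpu, struct slot *backup, void *addr)
{
	if (atomic_load_explicit(&percpu->addr, memory_order_acquire) == addr)
		return 1;
	return atomic_load_explicit(&backup->addr, memory_order_relaxed) == addr;
}
```

Scanning in the opposite order (backup first, per-CPU second) could miss
the pointer in both places if the migration happens between the two
loads, which is why the scan order matters here.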