All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boqun Feng <boqun.feng@gmail.com>
To: Waiman Long <llong@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Breno Leitao <leitao@debian.org>, Ingo Molnar <mingo@redhat.com>,
	Will Deacon <will@kernel.org>,
	aeh@meta.com, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, jhs@mojatatu.com, kernel-team@meta.com,
	Erik Lundgren <elundgren@meta.com>,
	"Paul E. McKenney" <paulmck@kernel.org>
Subject: Re: [PATCH] lockdep: Speed up lockdep_unregister_key() with expedited RCU synchronization
Date: Wed, 26 Mar 2025 09:47:59 -0700	[thread overview]
Message-ID: <Z-QvvzFORBDESCgP@Mac.home> (raw)
In-Reply-To: <37bbf28f-911a-4fea-b531-b43cdee72915@redhat.com>

On Wed, Mar 26, 2025 at 12:40:59PM -0400, Waiman Long wrote:
> On 3/26/25 11:39 AM, Waiman Long wrote:
> > On 3/26/25 1:25 AM, Boqun Feng wrote:
> > > > It looks like you are trying hard to find a use case for hazard pointer in
> > > > the kernel 🙂
> > > > 
> > > Well, if it does the job, why not use it 😉 Also this shows how
> > > flexible hazard pointers can be.
> > > 
> > > At least when using hazard pointers, the reader side of the hash list
> > > iteration is still lockless. Plus, since the synchronization part
> > > doesn't need to wait for the RCU readers in the whole system, it will be
> > > faster (I tried with the protecting-the-whole-hash-list approach as
> > > well, it's the same result on the tc command). This is why I choose to
> > > look into hazard pointers. Another mechanism can achieve the similar
> > > behavior is SRCU, but SRCU is slightly heavier compared to hazard
> > > pointers in this case (of course SRCU has more functionalities).
> > > 
> > > We can provide a lockdep_unregister_key_nosync() without the
> > > synchronize_rcu() in it and let users do the synchronization, but it's
> > > going to be hard to enforce and review, especially when someone
> > > refactors the code and move the free code to somewhere else.
> > Providing a second API and ask callers to do the right thing is probably
> > not a good idea and mistake is going to be made sooner or later.
> > > > Anyway, that may work. The only problem that I see is the issue of nesting
> > > > of an interrupt context on top of a task context. It is possible that the
> > > > first use of a raw_spinlock may happen in an interrupt context. If the
> > > > interrupt happens when the task has set the hazard pointer and iterating the
> > > > hash list, the value of the hazard pointer may be overwritten. Alternatively
> > > > we could have multiple slots for the hazard pointer, but that will make the
> > > > code more complicated. Or we could disable interrupt before setting the
> > > > hazard pointer.
> > > Or we can use lockdep_recursion:
> > > 
> > > 	preempt_disable();
> > > 	lockdep_recursion_inc();
> > > 	barrier();
> > > 
> > > 	WRITE_ONCE(*hazptr, ...);
> > > 
> > > , it should prevent the re-entrant of lockdep in irq.
> > That will probably work. Or we can disable irq. I am fine with both.
> > > > The solution that I am thinking about is to have a simple unfair rwlock to
> > > > protect just the hash list iteration. lockdep_unregister_key() and
> > > > lockdep_register_key() take the write lock with interrupt disabled. While
> > > > is_dynamic_key() takes the read lock. Nesting in this case isn't a problem
> > > > and we don't need RCU to protect the iteration process and so the last
> > > > synchronize_rcu() call isn't needed. The level of contention should be low
> > > > enough that live lock isn't an issue.
> > > > 
> > > This could work, one thing though is that locks don't compose. Using a
> > > hash write_lock in lockdep_unregister_key() will create a lockdep_lock()
> > > -> "hash write_lock" dependency, and that means you cannot
> > > lockdep_lock() while you're holding a hash read_lock, although it's
> > > not the case today, but it certainly complicates the locking design
> > > inside lockdep where there's no lockdep to help 😉
> > 
> > Thinking about it more, doing it in a lockless way is probably a good
> > idea.
> > 
> If we are using hazard pointer for synchronization, should we also take off
> "_rcu" from the list iteration/insertion/deletion macros to avoid the
> confusion that RCU is being used?
> 

We can, but we probably want to introduce a new set of API with suffix
"_lockless" or something because they will still need a lockless fashion
similar to RCU list iteration/insertion/deletion.

Regards,
Boqun

> Cheers,
> Longman
> 

  reply	other threads:[~2025-03-26 16:48 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-21  9:30 [PATCH] lockdep: Speed up lockdep_unregister_key() with expedited RCU synchronization Breno Leitao
2025-03-21 10:37 ` Eric Dumazet
2025-03-21 14:22   ` Breno Leitao
2025-03-24 12:12 ` Peter Zijlstra
2025-03-24 12:23   ` Eric Dumazet
2025-03-24 12:24     ` Eric Dumazet
2025-03-24 19:21     ` Boqun Feng
2025-03-24 19:30       ` Boqun Feng
2025-03-25  0:47         ` Boqun Feng
2025-03-25  1:56           ` Waiman Long
2025-03-25  3:41             ` Boqun Feng
     [not found]               ` <934d794b-7ebc-422c-b4fe-3e658a2e5e7a@redhat.com>
2025-03-25 14:57                 ` Waiman Long
2025-03-25 18:45                 ` Boqun Feng
2025-03-25 19:23                   ` Waiman Long
2025-03-25 19:42                     ` Boqun Feng
2025-03-25 23:20                       ` Waiman Long
2025-03-26  5:25                         ` Boqun Feng
     [not found]                           ` <df237702-55c3-466b-b51e-f3fe46ae03ba@redhat.com>
2025-03-26 16:40                             ` Waiman Long
2025-03-26 16:47                               ` Boqun Feng [this message]
2025-03-26 17:02                                 ` Waiman Long
2025-03-26 17:10                                   ` Paul E. McKenney
2025-03-26 18:42                                     ` Boqun Feng
2025-03-26 21:37                                       ` Paul E. McKenney
2025-03-31 16:48                                       ` Breno Leitao
2025-03-31 17:34                                         ` Boqun Feng
2025-03-31 17:26                             ` Boqun Feng
2025-03-31 17:33                               ` Waiman Long
2025-03-31 18:33                                 ` Paul E. McKenney
2025-03-31 18:57                                   ` Waiman Long
2025-03-31 21:21                                     ` Boqun Feng
2025-03-31 21:47                                       ` Waiman Long
2025-03-31 17:42                               ` Eric Dumazet
2025-07-09 10:00 ` Breno Leitao
2025-07-09 13:57   ` Waiman Long
2025-07-09 14:57     ` Boqun Feng
2025-07-19 17:40 ` [tip: locking/core] " tip-bot2 for Breno Leitao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z-QvvzFORBDESCgP@Mac.home \
    --to=boqun.feng@gmail.com \
    --cc=aeh@meta.com \
    --cc=edumazet@google.com \
    --cc=elundgren@meta.com \
    --cc=jhs@mojatatu.com \
    --cc=kernel-team@meta.com \
    --cc=leitao@debian.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=llong@redhat.com \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.