All of lore.kernel.org
 help / color / mirror / Atom feed
From: Guo Ren <guoren@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: lockless case of retain_dentry() (was Re: [PATCH 09/15] fold the call of retain_dentry() into fast_dput())
Date: Wed, 22 Nov 2023 02:19:16 -0500	[thread overview]
Message-ID: <ZV2rdE1XQWwJ7s75@gmail.com> (raw)
In-Reply-To: <CAHk-=wgaLBRwPE0_VfxOrCzFsHgV-pR35=7V3K=EHOJV36vaPQ@mail.gmail.com>

On Thu, Nov 09, 2023 at 09:57:39PM -0800, Linus Torvalds wrote:
> On Thu, 9 Nov 2023 at 20:20, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> >         FWIW, on top of current #work.dcache2 the following delta might be worth
> > looking into.  Not sure if it's less confusing that way, though - I'd been staring
> > at that place for too long.  Code generation is slightly suboptimal with recent
> > gcc, but only marginally so.
> 
> I doubt the pure ALU ops and a couple of extra conditional branches
> (that _probably_ predict well) matter at all.
> 
> Especially since this is all after lockref_put_return() has done that
> locked cmpxchg, which *is* expensive.
> 
> My main reaction is that we use hlist_bl_unhashed() for d_unhashed(),
> and we *intentionally* make it separate from the actual unhasing:
> 
>  - ___d_drop() does the __hlist_bl_del()
> 
>  - but d_unhashed() does hlist_bl_unhashed(), which checks
> d_hash.pprev == NULL, and that's done by __d_drop
> 
> We even have a comment about this:
> 
>  * ___d_drop doesn't mark dentry as "unhashed"
>  * (dentry->d_hash.pprev will be LIST_POISON2, not NULL).
> 
> and we depend on this in __d_move(), which will unhash things
> temporarily, but not mark things unhashed, because they get re-hashed
> again. Same goes for __d_add().
> 
> Anyway, what I'm actually getting at in a roundabout way is that maybe
> we should make D_UNHASHED be another flag in d_flags, and *not* use
> that d_hash.pprev field, and that would allow us to combine even more
> of these tests in dput(), because now pretty much *all* of those
> "retain_dentry()" checks would be about d_flags bits.
> 
> Hmm? As it is, it has that odd combination of d_flags and that
> d_unhashed() test, so it's testing two different fields.
> 
> Anyway, I really don't think it matters much, but since you brought up
> the whole suboptimal code generation..
> 
> I tried to look at dput() code generation, and it doesn't look
> horrendous as-is in your dcache2 branch.
> 
> If anything, the thing that hirs is the lockref_put_return() being
> out-of-line even though this is basically the only caller, plus people
> have pessimized the arch_spin_value_unlocked() implementation *again*,
> so that it uses a volatile read, when the *WHOLE*POINT* of that
> "VALUE" part of "value_unlocked()" is that we've already read the
> value, and we should *not* re-read it.
> 
> Damn.
> 
> The bug seems to affect both the generic qspinlock code, and the
> ticket-based one.
> 
> For the ticket based ones, it's PeterZ and commit 1bce11126d57
> ("asm-generic: ticket-lock: New generic ticket-based spinlock"), which
> does
> 
>   static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>   {
>         return !arch_spin_is_locked(&lock);
>   }
> 
> where we've got that "lock" value, but then it takes the address of
> it, and uses arch_spin_is_locked() on it, so now it will force a flush
> to memory, and then an READ_ONCE() on it.
> 
> And for the qspinlock code, we had a similar

We discussed x86 qspinlock code generation. It looked not too bad as I
thought because qspinlock_spin_value_unlocked is much cheaper than the
ticket-lock. But the riscv ticket-lock code generation is terrible
because of the shift left & right 16-bit.
https://lore.kernel.org/all/ZNG2tHFOABSXGCVi@gmail.com

> 
>   static __always_inline int queued_spin_value_unlocked(struct qspinlock lock)
>   {
>         return !atomic_read(&lock.val);
>   }
> 
> thing, where it does 'atomic_read()' on the value it was passed as an argument.
> 
> Stupid, stupid. It's literally forcing a re-read of a value that is
> guaranteed to be on stack.
> 
> I know this worked at some point, but that may have been many years
> ago, since I haven't looked at this part of lockref code generation in
> ages.
> 
> Anway, as a result now all the lockref functions will do silly "store
> the old lockref value to memory, in order to read it again" dances in
> that CMPXCHG_LOOP() loop.
> 
> It literally makes that whole "is this an unlocked value" function
> completely pointless. The *whole* and only point was "look, I already
> loaded the value from memory, is this *VALUE* unlocked.
> 
> Compared to that complete braindamage in the fast-path loop, the small
> extra ALU ops in fast_dput() are nothing.
> 
> Peter - those functions are done exactly the wrong way around.
> arch_spin_is_locked() should be implemented using
> arch_spin_value_unlocked(), not this way around.
> 
> And the queued spinlocks should not do an atomic_read()of the argument
> they get, they should just do "!lock.val.counter"
> 
> So something like this should fix lockref. ENTIRELY UNTESTED, except
> now the code generation of lockref_put_return() looks much better,
> without a pointless flush to the stack, and now it has no pointless
> stack frame as a result.
> 
> Of course, it should probably be inlined, since it has only one user
> (ok, two, since fast_dput() gets used twice), and that should make the
> return value testing much better.
> 
>                Linus

>  include/asm-generic/qspinlock.h |  2 +-
>  include/asm-generic/spinlock.h  | 17 +++++++++--------
>  2 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
> index 995513fa2690..0655aa5b57b2 100644
> --- a/include/asm-generic/qspinlock.h
> +++ b/include/asm-generic/qspinlock.h
> @@ -70,7 +70,7 @@ static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
>   */
>  static __always_inline int queued_spin_value_unlocked(struct qspinlock lock)
>  {
> -	return !atomic_read(&lock.val);
> +	return !lock.val.counter;
>  }
>  
>  /**
> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> index fdfebcb050f4..a35eda0ec2a2 100644
> --- a/include/asm-generic/spinlock.h
> +++ b/include/asm-generic/spinlock.h
> @@ -68,11 +68,17 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
>  	smp_store_release(ptr, (u16)val + 1);
>  }
>  
> +static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> +{
> +	u32 val = lock.counter;
> +	return ((val >> 16) == (val & 0xffff));
> +}
> +
>  static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
>  {
> -	u32 val = atomic_read(lock);
> -
> -	return ((val >> 16) != (val & 0xffff));
> +	arch_spinlock_t val;
> +	val.counter = atomic_read(lock);
> +	return !arch_spin_value_unlocked(val);
>  }
>  
>  static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> @@ -82,11 +88,6 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
>  	return (s16)((val >> 16) - (val & 0xffff)) > 1;
>  }
>  
> -static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> -{
> -	return !arch_spin_is_locked(&lock);
> -}
> -
>  #include <asm/qrwlock.h>
>  
>  #endif /* __ASM_GENERIC_SPINLOCK_H */


  parent reply	other threads:[~2023-11-22  7:19 UTC|newest]

Thread overview: 119+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-30  0:37 [RFC] simplifying fast_dput(), dentry_kill() et.al Al Viro
2023-10-30 21:53 ` Al Viro
2023-10-30 22:18   ` Linus Torvalds
2023-10-31  0:18     ` Al Viro
2023-10-31  1:53       ` Al Viro
2023-10-31  6:12         ` Al Viro
2023-11-01  6:18           ` Al Viro
2023-11-01  6:20           ` [PATCH 01/15] fast_dput(): having ->d_delete() is not reason to delay refcount decrement Al Viro
2023-11-01  6:20             ` [PATCH 02/15] fast_dput(): handle underflows gracefully Al Viro
2023-11-01  6:20             ` [PATCH 03/15] fast_dput(): new rules for refcount Al Viro
2023-11-01  6:20             ` [PATCH 04/15] __dput_to_list(): do decrement of refcount in the caller Al Viro
2023-11-01  6:20             ` [PATCH 05/15] retain_dentry(): lift decrement of ->d_count into callers Al Viro
2023-11-01  6:20             ` [PATCH 06/15] __dentry_kill(): get consistent rules for ->d_count Al Viro
2023-11-01  6:20             ` [PATCH 07/15] dentry_kill(): don't bother with retain_dentry() on slow path Al Viro
2023-11-01  6:20             ` [PATCH 08/15] Call retain_dentry() with refcount 0 Al Viro
2023-11-01  6:20             ` [PATCH 09/15] fold the call of retain_dentry() into fast_dput() Al Viro
2023-11-01  8:45               ` Al Viro
2023-11-01 17:30                 ` Linus Torvalds
2023-11-01 18:19                   ` Al Viro
2023-11-10  4:20                     ` lockless case of retain_dentry() (was Re: [PATCH 09/15] fold the call of retain_dentry() into fast_dput()) Al Viro
2023-11-10  5:57                       ` Linus Torvalds
2023-11-10  6:22                         ` Linus Torvalds
2023-11-22  6:29                           ` Guo Ren
2023-11-10  8:19                         ` Al Viro
2023-11-22  7:19                         ` Guo Ren [this message]
2023-11-22 17:20                           ` Linus Torvalds
2023-11-22 17:52                             ` Linus Torvalds
2023-11-22 18:05                               ` Linus Torvalds
2023-11-22 19:11                               ` Linus Torvalds
2023-11-29  7:14                                 ` Guo Ren
2023-11-29 12:25                                 ` Guo Ren
2023-11-29 14:42                                   ` Linus Torvalds
2023-11-26 16:39                             ` Guo Ren
2023-11-26 16:51                               ` Linus Torvalds
2023-11-30 10:00                                 ` Guo Ren
2023-12-01  1:09                                   ` Linus Torvalds
2023-12-01  3:36                                     ` Guo Ren
2023-12-01  5:15                                       ` Linus Torvalds
2023-12-01  7:31                                         ` Guo Ren
2023-11-26 16:51                               ` Guo Ren
2023-11-26 17:06                               ` Linus Torvalds
2023-11-26 17:59                                 ` Linus Torvalds
2023-11-29  9:52                                 ` Guo Ren
2023-11-01  6:20             ` [PATCH 10/15] don't try to cut corners in shrink_lock_dentry() Al Viro
2023-11-01  6:21             ` [PATCH 11/15] fold dentry_kill() into dput() Al Viro
2023-11-01  6:21             ` [PATCH 12/15] get rid of __dget() Al Viro
2023-11-01  6:21             ` [PATCH 13/15] shrink_dentry_list(): no need to check that dentry refcount is marked dead Al Viro
2023-11-01  6:21             ` [PATCH 14/15] to_shrink_list(): call only if refcount is 0 Al Viro
2023-11-01  6:21             ` [PATCH 15/15] switch select_collect{,2}() to use of to_shrink_list() Al Viro
2023-11-01  2:22       ` [RFC] simplifying fast_dput(), dentry_kill() et.al Al Viro
2023-11-01 14:29         ` Benjamin Coddington
2023-11-05 19:54       ` Al Viro
2023-11-05 21:59         ` Al Viro
2023-11-06  5:53         ` Al Viro
2023-11-07  2:08           ` Al Viro
2023-11-09  6:19             ` [RFC][PATCHSET v2] " Al Viro
2023-11-09  6:20               ` [PATCH 01/22] struct dentry: get rid of randomize_layout idiocy Al Viro
2023-11-09  6:20                 ` [PATCH 02/22] switch nfsd_client_rmdir() to use of simple_recursive_removal() Al Viro
2023-11-09 13:42                   ` Christian Brauner
2023-11-09 14:01                   ` Chuck Lever
2023-11-09 18:47                     ` Al Viro
2023-11-09 18:50                       ` Chuck Lever III
2023-11-09  6:20                 ` [PATCH 03/22] coda_flag_children(): cope with dentries turning negative Al Viro
2023-11-09 13:43                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 04/22] dentry: switch the lists of children to hlist Al Viro
2023-11-09 13:48                   ` Christian Brauner
2023-11-09 19:32                     ` Al Viro
2023-11-09  6:20                 ` [PATCH 05/22] centralize killing dentry from shrink list Al Viro
2023-11-09 13:49                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 06/22] get rid of __dget() Al Viro
2023-11-09 13:50                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 07/22] shrink_dentry_list(): no need to check that dentry refcount is marked dead Al Viro
2023-11-09 13:53                   ` Christian Brauner
2023-11-09 20:28                     ` Al Viro
2023-11-09  6:20                 ` [PATCH 08/22] fast_dput(): having ->d_delete() is not reason to delay refcount decrement Al Viro
2023-11-09 13:58                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 09/22] fast_dput(): handle underflows gracefully Al Viro
2023-11-09 14:46                   ` Christian Brauner
2023-11-09 20:39                     ` Al Viro
2023-11-09  6:20                 ` [PATCH 10/22] fast_dput(): new rules for refcount Al Viro
2023-11-09 14:54                   ` Christian Brauner
2023-11-09 20:52                     ` Al Viro
2023-11-09  6:20                 ` [PATCH 11/22] __dput_to_list(): do decrement of refcount in the callers Al Viro
2023-11-09 15:21                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 12/22] Make retain_dentry() neutral with respect to refcounting Al Viro
2023-11-09 15:22                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 13/22] __dentry_kill(): get consistent rules for victim's refcount Al Viro
2023-11-09 15:27                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 14/22] dentry_kill(): don't bother with retain_dentry() on slow path Al Viro
2023-11-09 15:53                   ` Christian Brauner
2023-11-09 21:29                     ` Al Viro
2023-11-09  6:20                 ` [PATCH 15/22] Call retain_dentry() with refcount 0 Al Viro
2023-11-09 16:09                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 16/22] fold the call of retain_dentry() into fast_dput() Al Viro
2023-11-09 16:17                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 17/22] don't try to cut corners in shrink_lock_dentry() Al Viro
2023-11-09 17:20                   ` Christian Brauner
2023-11-09 21:45                     ` Al Viro
2023-11-10  9:07                       ` Christian Brauner
2023-11-09 17:39                   ` Linus Torvalds
2023-11-09 18:11                     ` Linus Torvalds
2023-11-09 18:20                     ` Al Viro
2023-11-09  6:20                 ` [PATCH 18/22] fold dentry_kill() into dput() Al Viro
2023-11-09 17:22                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 19/22] to_shrink_list(): call only if refcount is 0 Al Viro
2023-11-09 17:29                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 20/22] switch select_collect{,2}() to use of to_shrink_list() Al Viro
2023-11-09 17:31                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 21/22] d_prune_aliases(): use a shrink list Al Viro
2023-11-09 17:33                   ` Christian Brauner
2023-11-09  6:20                 ` [PATCH 22/22] __dentry_kill(): new locking scheme Al Viro
2023-11-10 13:34                   ` Christian Brauner
2023-11-09 13:33                 ` [PATCH 01/22] struct dentry: get rid of randomize_layout idiocy Christian Brauner
2023-10-31  2:25     ` [RFC] simplifying fast_dput(), dentry_kill() et.al Gao Xiang
2023-10-31  2:29       ` Gao Xiang
2023-10-31  3:02       ` Linus Torvalds
2023-10-31  3:13         ` Gao Xiang
2023-10-31  3:26         ` Al Viro
2023-10-31  3:41           ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZV2rdE1XQWwJ7s75@gmail.com \
    --to=guoren@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.