From: Guo Ren <guoren@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
Peter Zijlstra <peterz@infradead.org>,
linux-fsdevel@vger.kernel.org
Subject: Re: lockless case of retain_dentry() (was Re: [PATCH 09/15] fold the call of retain_dentry() into fast_dput())
Date: Wed, 22 Nov 2023 02:19:16 -0500 [thread overview]
Message-ID: <ZV2rdE1XQWwJ7s75@gmail.com> (raw)
In-Reply-To: <CAHk-=wgaLBRwPE0_VfxOrCzFsHgV-pR35=7V3K=EHOJV36vaPQ@mail.gmail.com>
On Thu, Nov 09, 2023 at 09:57:39PM -0800, Linus Torvalds wrote:
> On Thu, 9 Nov 2023 at 20:20, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > FWIW, on top of current #work.dcache2 the following delta might be worth
> > looking into. Not sure if it's less confusing that way, though - I'd been staring
> > at that place for too long. Code generation is slightly suboptimal with recent
> > gcc, but only marginally so.
>
> I doubt the pure ALU ops and a couple of extra conditional branches
> (that _probably_ predict well) matter at all.
>
> Especially since this is all after lockref_put_return() has done that
> locked cmpxchg, which *is* expensive.
>
> My main reaction is that we use hlist_bl_unhashed() for d_unhashed(),
> and we *intentionally* make it separate from the actual unhasing:
>
> - ___d_drop() does the __hlist_bl_del()
>
> - but d_unhashed() does hlist_bl_unhashed(), which checks
> d_hash.pprev == NULL, and that's done by __d_drop
>
> We even have a comment about this:
>
> * ___d_drop doesn't mark dentry as "unhashed"
> * (dentry->d_hash.pprev will be LIST_POISON2, not NULL).
>
> and we depend on this in __d_move(), which will unhash things
> temporarily, but not mark things unhashed, because they get re-hashed
> again. Same goes for __d_add().
>
> Anyway, what I'm actually getting at in a roundabout way is that maybe
> we should make D_UNHASHED be another flag in d_flags, and *not* use
> that d_hash.pprev field, and that would allow us to combine even more
> of these tests in dput(), because now pretty much *all* of those
> "retain_dentry()" checks would be about d_flags bits.
>
> Hmm? As it is, it has that odd combination of d_flags and that
> d_unhashed() test, so it's testing two different fields.
>
> Anyway, I really don't think it matters much, but since you brought up
> the whole suboptimal code generation..
>
> I tried to look at dput() code generation, and it doesn't look
> horrendous as-is in your dcache2 branch.
>
> If anything, the thing that hirs is the lockref_put_return() being
> out-of-line even though this is basically the only caller, plus people
> have pessimized the arch_spin_value_unlocked() implementation *again*,
> so that it uses a volatile read, when the *WHOLE*POINT* of that
> "VALUE" part of "value_unlocked()" is that we've already read the
> value, and we should *not* re-read it.
>
> Damn.
>
> The bug seems to affect both the generic qspinlock code, and the
> ticket-based one.
>
> For the ticket based ones, it's PeterZ and commit 1bce11126d57
> ("asm-generic: ticket-lock: New generic ticket-based spinlock"), which
> does
>
> static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> {
> return !arch_spin_is_locked(&lock);
> }
>
> where we've got that "lock" value, but then it takes the address of
> it, and uses arch_spin_is_locked() on it, so now it will force a flush
> to memory, and then an READ_ONCE() on it.
>
> And for the qspinlock code, we had a similar
We discussed x86 qspinlock code generation. It looked not too bad as I
thought because qspinlock_spin_value_unlocked is much cheaper than the
ticket-lock. But the riscv ticket-lock code generation is terrible
because of the shift left & right 16-bit.
https://lore.kernel.org/all/ZNG2tHFOABSXGCVi@gmail.com
>
> static __always_inline int queued_spin_value_unlocked(struct qspinlock lock)
> {
> return !atomic_read(&lock.val);
> }
>
> thing, where it does 'atomic_read()' on the value it was passed as an argument.
>
> Stupid, stupid. It's literally forcing a re-read of a value that is
> guaranteed to be on stack.
>
> I know this worked at some point, but that may have been many years
> ago, since I haven't looked at this part of lockref code generation in
> ages.
>
> Anway, as a result now all the lockref functions will do silly "store
> the old lockref value to memory, in order to read it again" dances in
> that CMPXCHG_LOOP() loop.
>
> It literally makes that whole "is this an unlocked value" function
> completely pointless. The *whole* and only point was "look, I already
> loaded the value from memory, is this *VALUE* unlocked.
>
> Compared to that complete braindamage in the fast-path loop, the small
> extra ALU ops in fast_dput() are nothing.
>
> Peter - those functions are done exactly the wrong way around.
> arch_spin_is_locked() should be implemented using
> arch_spin_value_unlocked(), not this way around.
>
> And the queued spinlocks should not do an atomic_read()of the argument
> they get, they should just do "!lock.val.counter"
>
> So something like this should fix lockref. ENTIRELY UNTESTED, except
> now the code generation of lockref_put_return() looks much better,
> without a pointless flush to the stack, and now it has no pointless
> stack frame as a result.
>
> Of course, it should probably be inlined, since it has only one user
> (ok, two, since fast_dput() gets used twice), and that should make the
> return value testing much better.
>
> Linus
> include/asm-generic/qspinlock.h | 2 +-
> include/asm-generic/spinlock.h | 17 +++++++++--------
> 2 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
> index 995513fa2690..0655aa5b57b2 100644
> --- a/include/asm-generic/qspinlock.h
> +++ b/include/asm-generic/qspinlock.h
> @@ -70,7 +70,7 @@ static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
> */
> static __always_inline int queued_spin_value_unlocked(struct qspinlock lock)
> {
> - return !atomic_read(&lock.val);
> + return !lock.val.counter;
> }
>
> /**
> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> index fdfebcb050f4..a35eda0ec2a2 100644
> --- a/include/asm-generic/spinlock.h
> +++ b/include/asm-generic/spinlock.h
> @@ -68,11 +68,17 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
> smp_store_release(ptr, (u16)val + 1);
> }
>
> +static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> +{
> + u32 val = lock.counter;
> + return ((val >> 16) == (val & 0xffff));
> +}
> +
> static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
> {
> - u32 val = atomic_read(lock);
> -
> - return ((val >> 16) != (val & 0xffff));
> + arch_spinlock_t val;
> + val.counter = atomic_read(lock);
> + return !arch_spin_value_unlocked(val);
> }
>
> static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> @@ -82,11 +88,6 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> return (s16)((val >> 16) - (val & 0xffff)) > 1;
> }
>
> -static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> -{
> - return !arch_spin_is_locked(&lock);
> -}
> -
> #include <asm/qrwlock.h>
>
> #endif /* __ASM_GENERIC_SPINLOCK_H */
next prev parent reply other threads:[~2023-11-22 7:19 UTC|newest]
Thread overview: 119+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-30 0:37 [RFC] simplifying fast_dput(), dentry_kill() et.al Al Viro
2023-10-30 21:53 ` Al Viro
2023-10-30 22:18 ` Linus Torvalds
2023-10-31 0:18 ` Al Viro
2023-10-31 1:53 ` Al Viro
2023-10-31 6:12 ` Al Viro
2023-11-01 6:18 ` Al Viro
2023-11-01 6:20 ` [PATCH 01/15] fast_dput(): having ->d_delete() is not reason to delay refcount decrement Al Viro
2023-11-01 6:20 ` [PATCH 02/15] fast_dput(): handle underflows gracefully Al Viro
2023-11-01 6:20 ` [PATCH 03/15] fast_dput(): new rules for refcount Al Viro
2023-11-01 6:20 ` [PATCH 04/15] __dput_to_list(): do decrement of refcount in the caller Al Viro
2023-11-01 6:20 ` [PATCH 05/15] retain_dentry(): lift decrement of ->d_count into callers Al Viro
2023-11-01 6:20 ` [PATCH 06/15] __dentry_kill(): get consistent rules for ->d_count Al Viro
2023-11-01 6:20 ` [PATCH 07/15] dentry_kill(): don't bother with retain_dentry() on slow path Al Viro
2023-11-01 6:20 ` [PATCH 08/15] Call retain_dentry() with refcount 0 Al Viro
2023-11-01 6:20 ` [PATCH 09/15] fold the call of retain_dentry() into fast_dput() Al Viro
2023-11-01 8:45 ` Al Viro
2023-11-01 17:30 ` Linus Torvalds
2023-11-01 18:19 ` Al Viro
2023-11-10 4:20 ` lockless case of retain_dentry() (was Re: [PATCH 09/15] fold the call of retain_dentry() into fast_dput()) Al Viro
2023-11-10 5:57 ` Linus Torvalds
2023-11-10 6:22 ` Linus Torvalds
2023-11-22 6:29 ` Guo Ren
2023-11-10 8:19 ` Al Viro
2023-11-22 7:19 ` Guo Ren [this message]
2023-11-22 17:20 ` Linus Torvalds
2023-11-22 17:52 ` Linus Torvalds
2023-11-22 18:05 ` Linus Torvalds
2023-11-22 19:11 ` Linus Torvalds
2023-11-29 7:14 ` Guo Ren
2023-11-29 12:25 ` Guo Ren
2023-11-29 14:42 ` Linus Torvalds
2023-11-26 16:39 ` Guo Ren
2023-11-26 16:51 ` Linus Torvalds
2023-11-30 10:00 ` Guo Ren
2023-12-01 1:09 ` Linus Torvalds
2023-12-01 3:36 ` Guo Ren
2023-12-01 5:15 ` Linus Torvalds
2023-12-01 7:31 ` Guo Ren
2023-11-26 16:51 ` Guo Ren
2023-11-26 17:06 ` Linus Torvalds
2023-11-26 17:59 ` Linus Torvalds
2023-11-29 9:52 ` Guo Ren
2023-11-01 6:20 ` [PATCH 10/15] don't try to cut corners in shrink_lock_dentry() Al Viro
2023-11-01 6:21 ` [PATCH 11/15] fold dentry_kill() into dput() Al Viro
2023-11-01 6:21 ` [PATCH 12/15] get rid of __dget() Al Viro
2023-11-01 6:21 ` [PATCH 13/15] shrink_dentry_list(): no need to check that dentry refcount is marked dead Al Viro
2023-11-01 6:21 ` [PATCH 14/15] to_shrink_list(): call only if refcount is 0 Al Viro
2023-11-01 6:21 ` [PATCH 15/15] switch select_collect{,2}() to use of to_shrink_list() Al Viro
2023-11-01 2:22 ` [RFC] simplifying fast_dput(), dentry_kill() et.al Al Viro
2023-11-01 14:29 ` Benjamin Coddington
2023-11-05 19:54 ` Al Viro
2023-11-05 21:59 ` Al Viro
2023-11-06 5:53 ` Al Viro
2023-11-07 2:08 ` Al Viro
2023-11-09 6:19 ` [RFC][PATCHSET v2] " Al Viro
2023-11-09 6:20 ` [PATCH 01/22] struct dentry: get rid of randomize_layout idiocy Al Viro
2023-11-09 6:20 ` [PATCH 02/22] switch nfsd_client_rmdir() to use of simple_recursive_removal() Al Viro
2023-11-09 13:42 ` Christian Brauner
2023-11-09 14:01 ` Chuck Lever
2023-11-09 18:47 ` Al Viro
2023-11-09 18:50 ` Chuck Lever III
2023-11-09 6:20 ` [PATCH 03/22] coda_flag_children(): cope with dentries turning negative Al Viro
2023-11-09 13:43 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 04/22] dentry: switch the lists of children to hlist Al Viro
2023-11-09 13:48 ` Christian Brauner
2023-11-09 19:32 ` Al Viro
2023-11-09 6:20 ` [PATCH 05/22] centralize killing dentry from shrink list Al Viro
2023-11-09 13:49 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 06/22] get rid of __dget() Al Viro
2023-11-09 13:50 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 07/22] shrink_dentry_list(): no need to check that dentry refcount is marked dead Al Viro
2023-11-09 13:53 ` Christian Brauner
2023-11-09 20:28 ` Al Viro
2023-11-09 6:20 ` [PATCH 08/22] fast_dput(): having ->d_delete() is not reason to delay refcount decrement Al Viro
2023-11-09 13:58 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 09/22] fast_dput(): handle underflows gracefully Al Viro
2023-11-09 14:46 ` Christian Brauner
2023-11-09 20:39 ` Al Viro
2023-11-09 6:20 ` [PATCH 10/22] fast_dput(): new rules for refcount Al Viro
2023-11-09 14:54 ` Christian Brauner
2023-11-09 20:52 ` Al Viro
2023-11-09 6:20 ` [PATCH 11/22] __dput_to_list(): do decrement of refcount in the callers Al Viro
2023-11-09 15:21 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 12/22] Make retain_dentry() neutral with respect to refcounting Al Viro
2023-11-09 15:22 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 13/22] __dentry_kill(): get consistent rules for victim's refcount Al Viro
2023-11-09 15:27 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 14/22] dentry_kill(): don't bother with retain_dentry() on slow path Al Viro
2023-11-09 15:53 ` Christian Brauner
2023-11-09 21:29 ` Al Viro
2023-11-09 6:20 ` [PATCH 15/22] Call retain_dentry() with refcount 0 Al Viro
2023-11-09 16:09 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 16/22] fold the call of retain_dentry() into fast_dput() Al Viro
2023-11-09 16:17 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 17/22] don't try to cut corners in shrink_lock_dentry() Al Viro
2023-11-09 17:20 ` Christian Brauner
2023-11-09 21:45 ` Al Viro
2023-11-10 9:07 ` Christian Brauner
2023-11-09 17:39 ` Linus Torvalds
2023-11-09 18:11 ` Linus Torvalds
2023-11-09 18:20 ` Al Viro
2023-11-09 6:20 ` [PATCH 18/22] fold dentry_kill() into dput() Al Viro
2023-11-09 17:22 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 19/22] to_shrink_list(): call only if refcount is 0 Al Viro
2023-11-09 17:29 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 20/22] switch select_collect{,2}() to use of to_shrink_list() Al Viro
2023-11-09 17:31 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 21/22] d_prune_aliases(): use a shrink list Al Viro
2023-11-09 17:33 ` Christian Brauner
2023-11-09 6:20 ` [PATCH 22/22] __dentry_kill(): new locking scheme Al Viro
2023-11-10 13:34 ` Christian Brauner
2023-11-09 13:33 ` [PATCH 01/22] struct dentry: get rid of randomize_layout idiocy Christian Brauner
2023-10-31 2:25 ` [RFC] simplifying fast_dput(), dentry_kill() et.al Gao Xiang
2023-10-31 2:29 ` Gao Xiang
2023-10-31 3:02 ` Linus Torvalds
2023-10-31 3:13 ` Gao Xiang
2023-10-31 3:26 ` Al Viro
2023-10-31 3:41 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZV2rdE1XQWwJ7s75@gmail.com \
--to=guoren@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).