All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Boqun Feng <boqun@kernel.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Jeff Layton <jlayton@kernel.org>,
	linux-fsdevel@vger.kernel.org,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Nikolay Borisov <nik.borisov@suse.com>,
	Max Kellermann <max.kellermann@ionos.com>,
	Eric Sandeen <sandeen@redhat.com>,
	Paulo Alcantara <pc@manguebit.org>
Subject: Re: [RFC][PATCH] make sure that lock_for_kill() callers drop the locks in safe order
Date: Fri, 10 Apr 2026 21:24:04 +0100	[thread overview]
Message-ID: <20260410202404.GW3836593@ZenIV> (raw)
In-Reply-To: <CAHk-=wiiJ4uO6C3U-B_SVuyjzQpNd3yMjWck8kJ7tZ4oPwhGWA@mail.gmail.com>

On Fri, Apr 10, 2026 at 12:30:13PM -0700, Linus Torvalds wrote:

> The reason it exists is because lock_for_kill() can drop d_lock(), but
> that's in the unlikely case that we cn't just immediately get the
> inode lock.
> 
> So honestly, I think that rcu_read_lock() should be inside
> lock_for_kill(), rather than in the caller as a "just in case things
> go down".

Yup, in the cascade of followups I've mentioned...

> IOW, that code basically now not only does that typically unnecessary
> rcu locking (and then unlocking), it does so *because* it knows about
> the subtle internal behavior of lock_for_kill().
> 
> In contrast, if we put it inside lock_for_kill(), all we need is a
> fairly straightforward comment along the lines of
> 
>   "we're dropping the spinlock in order to take the inode lock, but
> the caller may be depending on RCU behavior so we need to take the rcu
> read lock first".
> 
>  that would turn strange illogical code that also generates worse code
> into straightforwardly explained code that also performs better.
> 
> Ok, so "performs better" is kind of exaggerated, in that obviously the
> extra rcu_read_lock/unlock sequences aren't exactly expensive, but
> still - I feel it's a win-win to just do this differently.
> 
> Or am I missing somethign else?

Mostly that fast_dput() calling conventions would need to change a bit
as well.

With the above as step 1,

Step 2: fold lock_for_kill() into shrink_kill().  Result: 2 call sites of
lock_for_kill() left, shrink_kill() becomes
static inline void shrink_kill(struct dentry *victim)
{
	while (lock_for_kill(victim)) {   
		rcu_read_unlock();
		victim = __dentry_kill(victim);
		if (!victim)
			return;
		rcu_read_lock();
	}
	spin_unlock(&victim->d_lock);
	rcu_read_unlock();
}
shrink_dentry_list() and shrink_dentry_tree() calling shrink_kill()
without bothering with lock_for_kill().

Step 3: note that both call sites of lock_for_kill() are followed with the
same sequence - drop ->d_lock + rcu_read_unlock on failure,
rcu_read_unlock + __dentry_kill() on success.  And these are the
only callers of __dentry_kill().  Moreover, NULL from __dentry_kill()
means "bugger off" in both callers.  So add

struct dentry *dentry_kill(dentry)
{
	if (!lock_for_kill(dentry)) {
		spin_unlock(&dentry->d_lock);
		rcu_read_unlock();
		return NULL;
	}
	rcu_read_unlock();
	return __dentry_kill(dentry);
}
and convert both callers to that:
static inline void shrink_kill(struct dentry *victim)
{
	while ((victim = dentry_kill(victim)) != NULL)
		rcu_read_lock();
}
and
static void finish_dput(struct dentry *dentry)
{
	while ((dentry = dentry_kill(dentry)) != NULL) {
		if (retain_dentry(dentry, true)) {
			spin_unlock(&dentry->d_lock);
			return;
		}
		rcu_read_lock();
	}
}

Step 4: note that callers of shrink_kill() and finish_dput()
have dentry->d_lock held.  We can (assuming there's no RCU
bugs) drop the lock before calling them and grab it the
first thing in.  Moreover, we can shift grabbing it into
lock_for_kill(), leaving these
static inline void shrink_kill(struct dentry *victim)
{
	while ((victim = dentry_kill(victim)) != NULL)
		;
}
and
static void finish_dput(struct dentry *dentry)
{
	while ((dentry = dentry_kill(dentry)) != NULL) {
		if (retain_dentry(dentry, true)) {
			spin_unlock(&dentry->d_lock);
			return;
		}
	}
}
resp., with the callers becoming
void dput(struct dentry *dentry)
{
	if (!dentry)
		return;
	might_sleep();
	rcu_read_lock();
	if (likely(fast_dput(dentry))) {
		rcu_read_unlock();
		return;
	}
	rcu_read_unlock();
	finish_dput(dentry);
}

void d_make_discardable(struct dentry *dentry)
{
        spin_lock(&dentry->d_lock);
        WARN_ON(!(dentry->d_flags & DCACHE_PERSISTENT));
        dentry->d_flags &= ~DCACHE_PERSISTENT;
        dentry->d_lockref.count--;
        finish_dput(dentry);
}

void shrink_dentry_list(struct list_head *list)
{
        while (!list_empty(list)) {
                struct dentry *dentry;
 
                dentry = list_entry(list->prev, struct dentry, d_lru);
                spin_lock(&dentry->d_lock);
                d_shrink_del(dentry);
		if (unlikely(dentry->d_flags & DCACHE_DENTRY_KILLED) {
                        spin_unlock(&dentry->d_lock);
			dentry_free(dentry);
			continue;
		}
                shrink_kill(dentry);
	}
}

and, in shrink_dentry_list(),
                        spin_lock(&v->d_lock);
			rcu_read_unlock();	// now held by ->d_lock
                        if (v->d_lockref.count < 0 &&
                            !(v->d_flags & DCACHE_DENTRY_KILLED)) {
                                struct completion_list wait;
                                // It's busy dying; have it notify us once
                                // it becomes invisible to d_walk().
                                d_add_waiter(v, &wait);
                                spin_unlock(&v->d_lock);
                                if (!list_empty(&data.dispose))
                                        shrink_dentry_list(&data.dispose);
                                wait_for_completion(&wait.completion);
                                continue;
                        }
			shrink_kill(v);

lock_for_kill() has grown a call of rcu_read_lock() in the very beginning,
will be gone shortly.

Step 5: pull the followup rcu_read_unlock() (and dropping ->d_lock on
failure) into lock_for_kill(); again, that assumes no RCU bugs there:
static bool lock_for_kill(struct dentry *dentry)
{
        struct inode *inode = dentry->d_inode;
 
        if (unlikely(dentry->d_lockref.count)) {
		spin_unlock(&dentry->d_lock);
                return false;
	}
 
        if (!inode || likely(spin_trylock(&inode->i_lock)))
                return true;
 
	rcu_read_lock();
        do {   
                spin_unlock(&dentry->d_lock);
                spin_lock(&inode->i_lock);
                spin_lock(&dentry->d_lock);
                if (likely(inode == dentry->d_inode))
                        break;
                spin_unlock(&inode->i_lock);
                inode = dentry->d_inode;
        } while (inode);
	rcu_read_unlock();
        if (likely(!dentry->d_lockref.count))
                return true;
        if (inode)
                spin_unlock(&inode->i_lock);
	spin_unlock(&dentry->d_lock);
        return false;
}

and in dentry_kill()
{
	if (!lock_for_kill(dentry))
		return NULL;
	return __dentry_kill(dentry);
}

Step 6: fold __dentry_kill() into dentry_kill().

Step 7: both callers of fast_dput() are identical:
        rcu_read_lock();
        if (likely(fast_dput(dentry))) {
                rcu_read_unlock();
                return;
        }
	rcu_read_unlock();
(in dput() and dput_to_list()).  Pull rcu_read_lock() and rcu_read_unlock()
into fast_dput(); ends up with 5 lines added there.  Result (with comments
skipped) is
static inline bool fast_dput(struct dentry *dentry)
{
        int ret;
 
	rcu_read_lock();		// added
        ret = lockref_put_return(&dentry->d_lockref);
 
        if (unlikely(ret < 0)) {
                spin_lock(&dentry->d_lock);
		rcu_read_unlock();	// added
                if (WARN_ON_ONCE(dentry->d_lockref.count <= 0)) {
                        spin_unlock(&dentry->d_lock);
                        return true;
                }
                dentry->d_lockref.count--;
                goto locked;
        }
        if (ret) {
		rcu_read_unlock();	// added
                return true;
	}
        if (retain_dentry(dentry, false)) {
		rcu_read_unlock();	// added
                return true;
	}
        spin_lock(&dentry->d_lock);
	rcu_read_unlock();		// added
locked:
        if (dentry->d_lockref.count || retain_dentry(dentry, true)) {
                spin_unlock(&dentry->d_lock);
                return true;
        }
        return false;
}

Callers become
void dput(struct dentry *dentry)
{
	if (!dentry)
		return;
	might_sleep();
	if (likely(fast_dput(dentry)))
		return;
	finish_dput(dentry);
}
and
void dput_to_list(struct dentry *dentry, struct list_head *list)
{
	if (likely(fast_dput(dentry)))
		return;
	to_shrink_list(dentry, list);
	spin_unlock(&dentry->d_lock);
}
resp.

  reply	other threads:[~2026-04-10 20:20 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-22 20:20 [PATCH][RFC] get rid of busy-wait in shrink_dcache_tree() Al Viro
2026-01-23  0:19 ` Linus Torvalds
2026-01-23  0:36   ` Al Viro
2026-01-24  4:36     ` Al Viro
2026-01-24  4:46       ` Linus Torvalds
2026-01-24  5:36         ` Al Viro
2026-01-24 17:45           ` Linus Torvalds
2026-01-24 18:43             ` Al Viro
2026-01-24 19:32               ` Linus Torvalds
2026-01-24 20:28                 ` Al Viro
2026-04-02 18:08 ` [RFC PATCH v2 0/4] getting rid of busy-wait in shrink_dcache_parent() Al Viro
2026-04-02 18:08   ` [RFC PATCH v2 1/4] for_each_alias(): helper macro for iterating through dentries of given inode Al Viro
2026-04-02 18:08   ` [RFC PATCH v2 2/4] struct dentry: make ->d_u anonymous Al Viro
2026-04-02 18:08   ` [RFC PATCH v2 3/4] dcache.c: more idiomatic "positives are not allowed" sanity checks Al Viro
2026-04-02 18:08   ` [RFC PATCH v2 4/4] get rid of busy-waiting in shrink_dcache_tree() Al Viro
2026-04-02 19:52     ` Linus Torvalds
2026-04-02 22:44       ` Al Viro
2026-04-02 22:49         ` Linus Torvalds
2026-04-02 23:16           ` Al Viro
2026-04-03  0:29             ` Linus Torvalds
2026-04-03  2:15               ` Al Viro
2026-04-04  0:02                 ` Al Viro
2026-04-04  0:04                   ` Linus Torvalds
2026-04-04 18:54                     ` Al Viro
2026-04-04 19:04                       ` Linus Torvalds
2026-04-05  0:04                         ` Al Viro
2026-04-02 20:28   ` [RFC PATCH v2 0/4] getting rid of busy-wait in shrink_dcache_parent() Paulo Alcantara
2026-04-03  4:46     ` Al Viro
2026-04-04  8:07 ` [RFC PATCH v3 " Al Viro
2026-04-04  8:07   ` [RFC PATCH v3 1/4] for_each_alias(): helper macro for iterating through dentries of given inode Al Viro
2026-04-04  8:07   ` [RFC PATCH v3 2/4] struct dentry: make ->d_u anonymous Al Viro
2026-04-04  8:07   ` [RFC PATCH v3 3/4] dcache.c: more idiomatic "positives are not allowed" sanity checks Al Viro
2026-04-04  8:07   ` [RFC PATCH v3 4/4] get rid of busy-waiting in shrink_dcache_tree() Al Viro
2026-04-09 16:51   ` [RFC PATCH v3 0/4] getting rid of busy-wait in shrink_dcache_parent() Jeff Layton
2026-04-09 19:02     ` Al Viro
2026-04-09 20:10       ` Jeff Layton
2026-04-09 21:57         ` Al Viro
2026-04-09 22:38           ` Jeff Layton
2026-04-10  8:48           ` [RFC][PATCH] make sure that lock_for_kill() callers drop the locks in safe order Al Viro
2026-04-10 11:18             ` Jeff Layton
2026-04-10 11:56               ` Jeff Layton
2026-04-10 15:25             ` Linus Torvalds
2026-04-10 15:57               ` Al Viro
2026-04-10 16:27               ` Boqun Feng
2026-04-10 17:31                 ` Linus Torvalds
2026-04-10 18:11                   ` Paul E. McKenney
2026-04-10 18:21                   ` Jeff Layton
2026-04-10 19:19                     ` Al Viro
2026-04-10 19:32                       ` Jeff Layton
2026-04-10 21:13                         ` Calvin Owens
2026-04-10 21:24                           ` Al Viro
2026-04-10 22:15                             ` Calvin Owens
2026-04-10 23:05                               ` Al Viro
2026-04-10 23:30                                 ` Calvin Owens
2026-04-11  0:51                                   ` Al Viro
2026-04-11 12:07                                     ` Calvin Owens
2026-04-10 17:32               ` Paul E. McKenney
2026-04-10 18:26                 ` Jeff Layton
2026-04-10 18:36                   ` Paul E. McKenney
2026-04-10 18:52               ` Al Viro
2026-04-10 19:21                 ` Paul E. McKenney
2026-04-10 19:30                 ` Linus Torvalds
2026-04-10 20:24                   ` Al Viro [this message]
2026-04-10 20:48                     ` Al Viro
2026-04-11 21:33                     ` Al Viro
2026-04-11 21:34                       ` [RFC PATCH 01/11] shrink_dentry_list(): start with removing from shrink list Al Viro
2026-04-11 21:34                         ` [RFC PATCH 02/11] fold lock_for_kill() into shrink_kill() Al Viro
2026-04-11 21:34                         ` [RFC PATCH 03/11] fold lock_for_kill() and __dentry_kill() into common helper Al Viro
2026-04-11 21:34                         ` [RFC PATCH 04/11] reducing rcu_read_lock() scopes in dput and friends, step 1 Al Viro
2026-04-11 21:34                         ` [RFC PATCH 05/11] reducing rcu_read_lock() scopes in dput and friends, step 2 Al Viro
2026-04-11 21:34                         ` [RFC PATCH 06/11] reducing rcu_read_lock() scopes in dput and friends, step 3 Al Viro
2026-04-11 21:34                         ` [RFC PATCH 07/11] reducing rcu_read_lock() scopes in dput and friends, step 4 Al Viro
2026-04-11 21:34                         ` [RFC PATCH 08/11] reducing rcu_read_lock() scopes in dput and friends, step 5 Al Viro
2026-04-11 21:34                         ` [RFC PATCH 09/11] reducing rcu_read_lock() scopes in dput and friends, step 6 Al Viro
2026-04-11 21:34                         ` [RFC PATCH 10/11] adjust calling conventions of lock_for_kill(), fold __dentry_kill() into dentry_kill() Al Viro
2026-04-11 21:34                         ` [RFC PATCH 11/11] document dentry_kill() Al Viro
2026-04-12 19:03                         ` [RFC PATCH 01/11] shrink_dentry_list(): start with removing from shrink list Al Viro
2026-04-21  9:28                         ` [git pull] dcache busy-wait fixes Al Viro
2026-04-21 16:25                           ` pr-tracker-bot
2026-04-12 13:15                       ` [RFC][PATCH] make sure that lock_for_kill() callers drop the locks in safe order Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260410202404.GW3836593@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=boqun@kernel.org \
    --cc=brauner@kernel.org \
    --cc=frederic@kernel.org \
    --cc=jack@suse.cz \
    --cc=jlayton@kernel.org \
    --cc=joelagnelf@nvidia.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=max.kellermann@ionos.com \
    --cc=neeraj.upadhyay@kernel.org \
    --cc=nik.borisov@suse.com \
    --cc=paulmck@kernel.org \
    --cc=pc@manguebit.org \
    --cc=sandeen@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.