public inbox for linux-fsdevel@vger.kernel.org
From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Boqun Feng <boqun@kernel.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Jeff Layton <jlayton@kernel.org>,
	linux-fsdevel@vger.kernel.org,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Nikolay Borisov <nik.borisov@suse.com>,
	Max Kellermann <max.kellermann@ionos.com>,
	Eric Sandeen <sandeen@redhat.com>,
	Paulo Alcantara <pc@manguebit.org>
Subject: Re: [RFC][PATCH] make sure that lock_for_kill() callers drop the locks in safe order
Date: Fri, 10 Apr 2026 19:52:43 +0100
Message-ID: <20260410185243.GU3836593@ZenIV>
In-Reply-To: <CAHk-=wjRgHLvSnEY3P45hSQ0ycKxdz-xqnccAMPuGRrwsvWdig@mail.gmail.com>

On Fri, Apr 10, 2026 at 08:25:27AM -0700, Linus Torvalds wrote:
> [ Adding RCU maintainers to the participants: see
> 
>   https://lore.kernel.org/all/20260410084839.GA1310153@ZenIV/
> 
>  and the fairly long thread associated with it for context ]
> 
> On Fri, 10 Apr 2026 at 01:44, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > [this may or may not be the source of UAFs caught by Jeff and by Helge]
> 
> Hmm.
> 
> I think this patch may indeed fix the problem, and I don't mind how it looks.
> 
> But while I think the patch looks fine, I am still quite unhappy about
> it if it matters - because we have very much documented that spinlocks
> in themselves are also RCU read locks:
> 
>         Note that anything that disables bottom halves, preemption,
>         or interrupts also enters an RCU read-side critical section.
>         Acquiring a spinlock also enters an RCU read-side critical
>         sections, even for spinlocks that do not disable preemption,
>         as is the case in kernels built with CONFIG_PREEMPT_RT=y.
>         Sleeplocks do *not* enter RCU read-side critical sections.
> 
> so *if* this makes a difference, I think it's actually implying that
> there's something wrong with our "rcu_read_unlock()" implementation,
> and that it doesn't nest properly.
> 
> Because our documentation also makes it very clear that this should
> all work as-is, and your patch should be a complete no-op. Just a few
> lines later in that core RCU doc, we have
> 
>         Note that RCU read-side critical sections may be nested and/or
>         overlapping.
> 
> so the order of the spin_unlock() and the rcu_read_unlock() *should*
> be entirely immaterial, and the order of unlocking simply shouldn't
> matter.
> 
> So it may indeed be that relying on the ordering of RCU read unlock
> matters for the case of explicit unlocks and the implicit unlocks by a
> spinlock, but that's not great.
> 
> Does the rcu_read_unlock() code perhaps only check the RCU count, not
> the atomicity count? That would explain it, and looking at
> __rcu_read_unlock() in kernel/rcu/tree_plugin.h that may indeed be the
> case (rcu_preempt_read_exit() seems to only check
> rcu_read_lock_nesting).
> 
> This is most likely dependent on kernel config options, and it may all
> be unavoidable because RCU might not even have a way to tell that it's
> still in a critical region without preemption counts etc.
> 
> But if it's unavoidable, we need to update the docs about this gotcha,
> and we need to have some tooling scan our existing code for this
> documentation change.
> 
> RCU people - what do you think?
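
To make the question above concrete, here is a toy userspace model. It is
only a sketch: every name in it (struct toy_task, the toy_* helpers) is
hypothetical, none of it is kernel code, and it assumes a spinlock that
raises a preemption count while held. It compares a "reader finished?" check
that looks only at the explicit rcu_read_lock() nesting with one that also
looks at the preemption count, at the point where rcu_read_unlock() has been
called but the spinlock is still held.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-task state; field names are illustrative, not the kernel's. */
struct toy_task {
	int rcu_nesting;	/* explicit rcu_read_lock() depth */
	int preempt_count;	/* raised while a (non-RT) spinlock is held */
};

static void toy_rcu_read_lock(struct toy_task *t)
{
	t->rcu_nesting++;
}

static void toy_rcu_read_unlock(struct toy_task *t)
{
	assert(t->rcu_nesting > 0);
	t->rcu_nesting--;
}

static void toy_spin_lock(struct toy_task *t)
{
	t->preempt_count++;
}

static void toy_spin_unlock(struct toy_task *t)
{
	assert(t->preempt_count > 0);
	t->preempt_count--;
}

/* Check that consults only the explicit nesting count. */
static bool toy_done_nesting_only(const struct toy_task *t)
{
	return t->rcu_nesting == 0;
}

/* Check that also treats a held spinlock as a read-side section. */
static bool toy_done_with_preempt(const struct toy_task *t)
{
	return t->rcu_nesting == 0 && t->preempt_count == 0;
}

/*
 * Take the spinlock, enter an explicit RCU section, then leave the RCU
 * section first.  Record what each check says while the spinlock is
 * still held, then drop the spinlock.
 */
static void toy_unlock_rcu_first(struct toy_task *t,
				 bool *nesting_only, bool *with_preempt)
{
	toy_spin_lock(t);
	toy_rcu_read_lock(t);
	toy_rcu_read_unlock(t);
	*nesting_only = toy_done_nesting_only(t);
	*with_preempt = toy_done_with_preempt(t);
	toy_spin_unlock(t);
}
```

In this model the nesting-only check reports the reader as finished while
the spinlock is still held, and the combined check does not. Whether the
real rcu_read_unlock() path behaves like either of these depends on the
kernel configuration and is exactly what is being asked of the RCU folks.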

FWIW, I wonder if we would be better off with the following:

void shrink_dentry_list(struct list_head *list)
{
	while (!list_empty(list)) {
		struct dentry *dentry;

		dentry = list_entry(list->prev, struct dentry, d_lru);
		spin_lock(&dentry->d_lock);
		d_shrink_del(dentry);
		if (unlikely(dentry->d_flags & DCACHE_DENTRY_KILLED)) {
			// killed while on shrink list, freeing left to us
			spin_unlock(&dentry->d_lock);
			dentry_free(dentry);
			continue;
		}
		rcu_read_lock();
		if (!lock_for_kill(dentry)) {
			spin_unlock(&dentry->d_lock);
			rcu_read_unlock();
			continue;
		}
		shrink_kill(dentry);
	}
}

because in that case both callers of shrink_kill() have the same form - the
other one is
			if (!lock_for_kill(v)) {
				spin_unlock(&v->d_lock);
				rcu_read_unlock();
			} else {
				shrink_kill(v);
			}
and we could fold lock_for_kill() into shrink_kill(), with a nice series of
cleanups becoming possible after that, especially if we can rely upon a
spinlock acting as an RCU read-side critical section in all cases...

Interesting...
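
A rough userspace sketch of what that folding might look like, with the
caveat that everything here is a stand-in: struct toy_dentry, the toy_*
helpers and toy_shrink_kill_folded() are hypothetical names, and the "kill"
step is reduced to setting a flag. It models only the locking shape, namely
that the caller takes ->d_lock and rcu_read_lock() and makes one call, and
that on return neither is held whichever way lock_for_kill() went.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry {
	bool d_locked;	/* models ->d_lock being held */
	bool killable;	/* what the stand-in lock_for_kill() reports */
	bool killed;	/* set once the stand-in kill path has run */
};

static int toy_rcu_depth;	/* models explicit rcu_read_lock() nesting */

static void toy_rcu_read_lock(void)
{
	toy_rcu_depth++;
}

static void toy_rcu_read_unlock(void)
{
	assert(toy_rcu_depth > 0);
	toy_rcu_depth--;
}

static void toy_spin_lock(struct toy_dentry *d)
{
	assert(!d->d_locked);
	d->d_locked = true;
}

static void toy_spin_unlock(struct toy_dentry *d)
{
	assert(d->d_locked);
	d->d_locked = false;
}

/* Stand-in for lock_for_kill(): needs d_lock held and an RCU section. */
static bool toy_lock_for_kill(struct toy_dentry *d)
{
	assert(d->d_locked && toy_rcu_depth > 0);
	return d->killable;
}

/* Hypothetical folded form: the failure path lives inside the callee. */
static bool toy_shrink_kill_folded(struct toy_dentry *d)
{
	if (!toy_lock_for_kill(d)) {
		/* same unlock order as in the loop above */
		toy_spin_unlock(d);
		toy_rcu_read_unlock();
		return false;
	}
	d->killed = true;	/* placeholder for the real kill path */
	toy_spin_unlock(d);
	toy_rcu_read_unlock();
	return true;
}

/* Both callers would then reduce to this shape. */
static bool toy_shrink_one(struct toy_dentry *d)
{
	toy_spin_lock(d);
	toy_rcu_read_lock();
	return toy_shrink_kill_folded(d);
}
```

The return value exists only so the model can be checked; it is not meant
to suggest that the real function would need one.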
