From: Al Viro <viro@zeniv.linux.org.uk>
To: NeilBrown <neilb@suse.de>
Cc: Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jeff Layton <jlayton@kernel.org>,
	Dave Chinner <david@fromorbit.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 11/19] VFS: Add ability to exclusively lock a dentry and use for create/remove operations.
Date: Sun, 9 Feb 2025 06:40:27 +0000
Message-ID: <20250209064027.GV1977892@ZenIV>
In-Reply-To: <20250206054504.2950516-12-neilb@suse.de>

On Thu, Feb 06, 2025 at 04:42:48PM +1100, NeilBrown wrote:

> +bool d_update_lock(struct dentry *dentry,
> +		   struct dentry *base, const struct qstr *last,
> +		   unsigned int subclass)
> +{
> +	lock_acquire_exclusive(&dentry->d_update_map, subclass, 0, NULL, _THIS_IP_);
> +again:
> +	spin_lock(&dentry->d_lock);
> +	wait_var_event_spinlock(&dentry->d_flags,
> +				!check_dentry_locked(dentry),
> +				&dentry->d_lock);
> +	if (d_is_positive(dentry)) {
> +		rcu_read_lock(); /* needed for d_same_name() */

It isn't.  You are holding ->d_lock there.

> +		if (
> +			/* Was unlinked while we waited ?*/
> +			d_unhashed(dentry) ||
> +			/* Or was dentry renamed ?? */
> +			dentry->d_parent != base ||
> +			dentry->d_name.hash != last->hash ||
> +			!d_same_name(dentry, base, last)

Negatives can't be moved, but they bloody well can be unhashed.  So skipping
the d_unhashed() part for negatives is wrong.
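A minimal userspace sketch of the check ordering this implies: test for unhashing regardless of positivity, and only apply the rename checks to positive dentries.  The struct and helper names below are hypothetical stand-ins, not kernel types, and a plain strcmp() replaces d_same_name().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for the few dentry fields the check uses. */
struct model_dentry {
	const struct model_dentry *parent;
	const char *name;
	unsigned int hash;
	bool positive;
	bool unhashed;
};

/* Returns true if the dentry still names (base, name) after we waited. */
static bool model_still_valid(const struct model_dentry *d,
			      const struct model_dentry *base,
			      const char *name, unsigned int hash)
{
	assert(d && base && name);

	/* Unhashing can happen to negatives as well, so test it first. */
	if (d->unhashed)
		return false;

	/* Negatives cannot be moved, so the rename checks are skipped. */
	if (!d->positive)
		return true;

	/* Positive dentry: make sure it was not renamed while we waited. */
	return d->parent == base && d->hash == hash &&
	       strcmp(d->name, name) == 0;
}
```

In this model an unhashed negative is rejected, which is the case the quoted code lets through.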

> +		) {
> +			rcu_read_unlock();
> +			spin_unlock(&dentry->d_lock);
> +			lock_map_release(&dentry->d_update_map);
> +			return false;
> +		}
> +		rcu_read_unlock();
> +	}
> +	/* Must ensure DCACHE_PAR_UPDATE in child is visible before reading
> +	 * from parent
> +	 */
> +	smp_store_mb(dentry->d_flags, dentry->d_flags | DCACHE_PAR_UPDATE);

... paired with?
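For context on why the pairing matters: a store followed by a full barrier and a load only gives a guarantee if the other side does the mirror image (store its own flag, full barrier, load ours).  A minimal userspace model of that pattern, with C11 seq_cst atomics standing in for smp_store_mb(); the names are hypothetical and this says nothing about what the patch actually does on the parent side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_PAR_UPDATE 0x1u

/* Hypothetical stand-ins for the child's and the parent's d_flags. */
static atomic_uint model_child_flags;
static atomic_uint model_parent_flags;

/*
 * Set our flag, then look at the other side's flag.  Both operations are
 * seq_cst, so if two threads run this with the arguments swapped, at
 * least one of them observes the other's flag and backs off.
 */
static bool model_try_lock(atomic_uint *mine, atomic_uint *other)
{
	atomic_fetch_or(mine, MODEL_PAR_UPDATE);
	if (atomic_load(other) & MODEL_PAR_UPDATE) {
		atomic_fetch_and(mine, ~MODEL_PAR_UPDATE);
		return false;
	}
	return true;
}

static void model_unlock(atomic_uint *mine)
{
	assert(atomic_load(mine) & MODEL_PAR_UPDATE);
	atomic_fetch_and(mine, ~MODEL_PAR_UPDATE);
}
```

Both sides may back off at once in this model; what the pairing rules out is both succeeding.  If only one side has the barrier, even that is not guaranteed.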

> +	if (base->d_flags & DCACHE_PAR_UPDATE) {
> +		/* We cannot grant DCACHE_PAR_UPDATE on a dentry while
> +		 * it is held on the parent
> +		 */
> +		dentry->d_flags &= ~DCACHE_PAR_UPDATE;
> +		spin_unlock(&dentry->d_lock);
> +		spin_lock(&base->d_lock);
> +		wait_var_event_spinlock(&base->d_flags,
> +					!check_dentry_locked(base),
> +					&base->d_lock);

Oh?  So you might also be waiting on the parent?  That's deadlock fodder right
there - the caller might be holding ->i_rwsem on the same parent, so you have
waiting on _->d_flags nested both outside and inside _->d_inode->i_rwsem.

Just in case anyone goes "->i_rwsem will only be held shared" - that wouldn't help.
Throw fchmod() into the mix and enjoy your deadlock -
	A: holds ->i_rwsem shared, waits for C to clear DCACHE_PAR_UPDATE.
	B: blocked trying to grab ->i_rwsem exclusive
	C: has DCACHE_PAR_UPDATE set, is blocked trying to grab ->i_rwsem shared
and there you go...
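The three-way cycle above can be checked mechanically with a wait-for graph.  A small userspace sketch, assuming (as the scenario does) that a new shared request queues behind an already waiting exclusive one; the task names and helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { TASK_A, TASK_B, TASK_C, NTASKS };

/*
 * waits_for[x] is the task that x is blocked on, or -1 if x is runnable.
 * Follow the chain from each task; coming back to the start is a cycle.
 */
static bool has_deadlock(const int waits_for[], int n)
{
	assert(n > 0);
	for (int start = 0; start < n; start++) {
		int cur = start;

		for (int steps = 0; steps < n && cur >= 0; steps++) {
			cur = waits_for[cur];
			if (cur == start)
				return true;
		}
	}
	return false;
}

static void build_fchmod_scenario(int waits_for[NTASKS])
{
	/* A holds ->i_rwsem shared, waits for C to clear DCACHE_PAR_UPDATE. */
	waits_for[TASK_A] = TASK_C;
	/* B (fchmod) wants ->i_rwsem exclusive, blocked by A's shared hold. */
	waits_for[TASK_B] = TASK_A;
	/* C wants ->i_rwsem shared, queued behind the waiting writer B. */
	waits_for[TASK_C] = TASK_B;
}
```

Dropping any one edge (say, C not needing ->i_rwsem at all) breaks the cycle, which is why the nesting order is the thing to fix.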

> +		spin_unlock(&base->d_lock);
> +		goto again;
> +	}
> +	spin_unlock(&dentry->d_lock);
> +	return true;
> +}

The entire thing is refcount-neutral for both dentry and base.  Which makes this

> @@ -1759,8 +1863,9 @@ static struct dentry *lookup_and_lock_nested(const struct qstr *last,
>  
>  	if (!(lookup_flags & LOOKUP_PARENT_LOCKED))
>  		inode_lock_nested(base->d_inode, subclass);
> -
> -	dentry = lookup_one_qstr(last, base, lookup_flags);
> +	do {
> +		dentry = lookup_one_qstr(last, base, lookup_flags);
> +	} while (!IS_ERR(dentry) && !d_update_lock(dentry, base, last, subclass));

... a refcount leak waiting to happen.
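A userspace model of the leak, assuming the lookup returns a referenced dentry and the lock helper is refcount-neutral as described above.  All names are hypothetical, and dropping the reference before retrying is just one possible fix, not necessarily the preferred one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in: a refcount plus a number of lock failures to fake. */
struct model_dentry {
	int refcount;
	int fail_count;
};

/* Like a lookup: returns the dentry with an extra reference. */
static struct model_dentry *model_lookup(struct model_dentry *d)
{
	d->refcount++;
	return d;
}

static void model_dput(struct model_dentry *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* Refcount-neutral, like d_update_lock(); fails fail_count times. */
static bool model_update_lock(struct model_dentry *d)
{
	if (d->fail_count > 0) {
		d->fail_count--;
		return false;
	}
	return true;
}

/* Shape of the quoted loop: every failed attempt leaks one reference. */
static struct model_dentry *lookup_and_lock_leaky(struct model_dentry *d)
{
	struct model_dentry *res;

	do {
		res = model_lookup(d);
	} while (!model_update_lock(res));
	return res;
}

/* One possible fix: drop the reference before looking up again. */
static struct model_dentry *lookup_and_lock_fixed(struct model_dentry *d)
{
	struct model_dentry *res;

	for (;;) {
		res = model_lookup(d);
		if (model_update_lock(res))
			return res;
		model_dput(res);
	}
}
```

With two failed attempts the leaky variant ends up holding three references where the caller expects one; the fixed variant ends with exactly one.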
