From: Al Viro <viro@zeniv.linux.org.uk>
To: NeilBrown <neil@brown.name>
Cc: Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 5/8] Introduce S_DYING which warns that S_DEAD might follow.
Date: Tue, 10 Jun 2025 21:57:32 +0100 [thread overview]
Message-ID: <20250610205732.GG299672@ZenIV> (raw)
In-Reply-To: <20250609075950.159417-6-neil@brown.name>
On Mon, Jun 09, 2025 at 05:34:10PM +1000, NeilBrown wrote:
> Once we support directory operations (e.g. create) without requiring the
> parent to be locked, the current practice locking a directory while
> processing rmdir() or similar will not be sufficient to wait for
> operations to complete and to block further operations.
>
> This patch introduced a new inode flag S_DYING. It indicates that
> a rmdir or similar is being processed and new directory operations must
> not commence in the directory. They should not abort either as the
> rmdir might fail - instead they should block. They can do this by
> waiting for a lock on the inode.
>
> A new interface rmdir_lock() locks the inode, sets this flag, and waits
> for any children with DCACHE_LOCK set to complete their operation, and
> for any d_in_lookup() children to complete the lookup. It should be
> called before attempted to delete the directory or set S_DEAD. Matching
> rmdir_unlock() clears the flag and unlocks the inode.
>
> dentry_lock() and d_alloc_parallel() are changed to block while this
> flag it set and to fail if the parent IS_DEADDIR(), though dentry_lock()
> doesn't block for d_in_lookup() dentries.
> diff --git a/fs/namei.c b/fs/namei.c
> index 4ad76df21677..c590f25d0d49 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1770,8 +1770,11 @@ static bool __dentry_lock(struct dentry *dentry,
> struct dentry *base, const struct qstr *last,
> unsigned int subclass, int state)
> {
> + struct dentry *parent;
> + struct inode *dir;
> int err;
>
> +retry:
> lock_acquire_exclusive(&dentry->dentry_map, subclass, 0, NULL, _THIS_IP_);
> spin_lock(&dentry->d_lock);
> err = wait_var_event_any_lock(&dentry->d_flags,
> @@ -1782,10 +1785,43 @@ static bool __dentry_lock(struct dentry *dentry,
> spin_unlock(&dentry->d_lock);
> return false;
> }
> -
> - dentry->d_flags |= DCACHE_LOCK;
> + parent = dentry->d_parent;
Why will it stay the parent? Matter of fact, why will it stay positive?
> + dir = igrab(parent->d_inode);
... and not oops right here?
> + lock_map_release(&dentry->dentry_map);
> spin_unlock(&dentry->d_lock);
> - return true;
> +
> + if (state == TASK_KILLABLE) {
> + err = down_write_killable(&dir->i_rwsem);
> + if (err) {
> + iput(dir);
> + return false;
> + }
> + } else
> + inode_lock(dir);
> + /* S_DYING much be clear now */
> + inode_unlock(dir);
> + iput(dir);
> + goto retry;
OK, I'm really confused now. Is it allowed to call dentry_lock() while holding
->i_rwsem of the parent?
Where does your dentry lock nest wrt ->i_rwsem? As a bonus (well, malus, I guess)
question, where does it nest wrt parent *and* child inodes' ->i_rwsem for rmdir
and rename?
Tangentially connected question: which locks are held for ->unlink() in your
scheme? You do need *something* on the victim inode to protect ->i_nlink
modifications, and anything on dentries of victim or their parent directories
is not going to give that.
next prev parent reply other threads:[~2025-06-10 20:57 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-09 7:34 [PATCH 0/8 preview] demonstrate proposed new locking strategy for directories NeilBrown
2025-06-09 7:34 ` [PATCH 1/8] VFS: use global wait-queue table for d_alloc_parallel() NeilBrown
2025-06-09 7:34 ` [PATCH 2/8] VFS: use d_alloc_parallel() in lookup_one_qstr_excl() NeilBrown
2025-06-09 7:34 ` [PATCH 3/8] fs/proc: take rcu_read_lock() in proc_sys_compare() NeilBrown
2025-06-09 7:34 ` [PATCH 4/8] VFS: Add ability to exclusively lock a dentry and use for open/create NeilBrown
2025-06-09 7:34 ` [PATCH 5/8] Introduce S_DYING which warns that S_DEAD might follow NeilBrown
2025-06-10 20:57 ` Al Viro [this message]
2025-06-11 1:00 ` NeilBrown
2025-06-11 1:13 ` Al Viro
2025-06-11 2:49 ` NeilBrown
2025-06-09 7:34 ` [PATCH 6/8] VFS: provide alternative to s_vfs_rename_mutex NeilBrown
2025-06-09 7:34 ` [PATCH 7/8] VFS: use new dentry locking for create/remove/rename NeilBrown
2025-06-10 20:36 ` Al Viro
2025-06-11 0:34 ` NeilBrown
2025-06-09 7:34 ` [PATCH 8/8] VFS: allow a filesystem to opt out of directory locking NeilBrown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250610205732.GG299672@ZenIV \
--to=viro@zeniv.linux.org.uk \
--cc=brauner@kernel.org \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=neil@brown.name \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.