linux-unionfs.vger.kernel.org archive mirror
* [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem
@ 2025-07-10 23:03 NeilBrown
  2025-07-10 23:03 ` [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir() NeilBrown
                   ` (20 more replies)
  0 siblings, 21 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

This is a revised set of patches following helpful feedback.  There are
now more patches, but they should be a lot easier to review.

These patches are all in a git tree at
   https://github.com/neilbrown/linux/commits/pdirops
though there are a lot more patches there too, demonstrating what is to come.
0eaa1c629788 ovl: rename ovl_cleanup_unlocked() to ovl_cleanup()
is the last in the series posted here.

I welcome further review.

Original description:

This series of patches for overlayfs is primarily focussed on preparing
for some proposed changes to directory locking.  In the new scheme we
will lock individual dentries in a directory rather than the whole
directory.

ovl currently will sometimes lock a directory on the upper filesystem
and do a few different things while holding the lock.  This is
incompatible with the new scheme.

This series narrows the region of code protected by the directory lock,
taking it multiple times when necessary.  This theoretically opens up the
possibility of other changes happening on the upper filesystem between the
unlock and the lock.  To some extent the patches guard against that by
checking the dentries still have the expected parent after retaking the
lock.  In general, I think ovl would have trouble if upperfs were being
changed independently, and I don't think the changes here increase the
problem in any important way.
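
As a rough illustration of the pattern described above, here is a small
single-threaded userspace model.  It is only a sketch under stated
assumptions: model_dir, model_entry, model_create_temp() and
model_install() are hypothetical names, the "lock" is a plain flag
rather than i_rw_sem, and none of this is overlayfs code.  It shows the
lock being taken twice, with the parent rechecked after the second
acquisition.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a directory and a dentry; not kernel types. */
struct model_dir {
	bool locked;			/* models the directory lock as a flag */
};

struct model_entry {
	struct model_dir *parent;	/* models dentry->d_parent */
	bool installed;
};

void model_lock(struct model_dir *dir)
{
	assert(!dir->locked);		/* catch double locking in the model */
	dir->locked = true;
}

void model_unlock(struct model_dir *dir)
{
	assert(dir->locked);
	dir->locked = false;
}

/* Step 1: create a temporary entry, holding the lock only for the create. */
void model_create_temp(struct model_dir *dir, struct model_entry *e)
{
	model_lock(dir);
	e->parent = dir;
	e->installed = false;
	model_unlock(dir);
}

/*
 * Step 2: retake the lock.  The entry may have been moved while the lock
 * was dropped, so recheck its parent before acting on it.
 */
int model_install(struct model_dir *dir, struct model_entry *e)
{
	model_lock(dir);
	if (e->parent != dir) {
		model_unlock(dir);
		return -EINVAL;
	}
	e->installed = true;
	model_unlock(dir);
	return 0;
}
```

In this model, anything that changes e->parent between the two steps
makes model_install() fail without touching the entry, which is the
kind of guard the patches add after retaking the lock.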

I have tested this with fstests, both generic and unionfs tests.  I
wouldn't be surprised if I missed something though, so please review
carefully.

After this series (with any needed changes) lands I will resubmit my
change to vfs_rmdir() behaviour to have it drop the lock on error.  ovl
will be much better positioned to handle that change.  It will come with
the new "lookup_and_lock" API that I am proposing.

Thanks,
NeilBrown


 [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir()
 [PATCH 02/20] ovl: change ovl_create_index() to take write and dir
 [PATCH 03/20] ovl: Call ovl_create_temp() without lock held.
 [PATCH 04/20] ovl: narrow the locked region in ovl_copy_up_workdir()
 [PATCH 05/20] ovl: narrow locking in ovl_create_upper()
 [PATCH 06/20] ovl: narrow locking in ovl_clear_empty()
 [PATCH 07/20] ovl: narrow locking in ovl_create_over_whiteout()
 [PATCH 08/20] ovl: narrow locking in ovl_rename()
 [PATCH 09/20] ovl: narrow locking in ovl_cleanup_whiteouts()
 [PATCH 10/20] ovl: narrow locking in ovl_cleanup_index()
 [PATCH 11/20] ovl: narrow locking in ovl_workdir_create()
 [PATCH 12/20] ovl: narrow locking in ovl_indexdir_cleanup()
 [PATCH 13/20] ovl: narrow locking in ovl_workdir_cleanup_recurse()
 [PATCH 14/20] ovl: change ovl_workdir_cleanup() to take dir lock as
 [PATCH 15/20] ovl: narrow locking on ovl_remove_and_whiteout()
 [PATCH 16/20] ovl: change ovl_cleanup_and_whiteout() to take rename
 [PATCH 17/20] ovl: narrow locking in ovl_whiteout()
 [PATCH 18/20] ovl: narrow locking in ovl_check_rename_whiteout()
 [PATCH 19/20] ovl: change ovl_create_real() to receive dentry parent
 [PATCH 20/20] ovl: rename ovl_cleanup_unlocked() to ovl_cleanup()


* [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11  8:25   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 02/20] ovl: change ovl_create_index() to take write and dir locks NeilBrown
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

If ovl_copy_up_data() fails the error is not immediately handled but the
code continues on to call ovl_start_write() and lock_rename(),
presumably because both of these locks are needed for the cleanup.
Only then (if the lock was successful) is the error checked.

This makes the code a little hard to follow and could be fragile.

This patch changes the code to handle the error immediately.  A new
ovl_cleanup_unlocked() is created which takes the required directory
lock (though it doesn't take the write lock on the filesystem).  This
will be used extensively in later patches.

In general we need to check the parent is still correct after taking the
lock (as ovl_copy_up_workdir() does after a successful lock_rename()) so
that is included in ovl_cleanup_unlocked() using new parent_lock() and
parent_unlock() calls (it is planned to move this API into VFS code
eventually, though in a slightly different form).
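
For readers who want to experiment outside the kernel, the
lock-then-verify step and the cleanup built on it can be modelled
roughly as follows.  This is a single-threaded sketch, not the patch
itself: the model_* names and types are hypothetical, the lock is a
plain flag, and "removal" is just a boolean.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not kernel types. */
struct model_dir {
	bool locked;			/* models the parent's inode lock */
};

struct model_entry {
	struct model_dir *parent;	/* models dentry->d_parent */
	bool removed;
};

/* Lock the parent, then confirm the child (if any) still belongs to it. */
int model_parent_lock(struct model_dir *parent, struct model_entry *child)
{
	assert(!parent->locked);
	parent->locked = true;
	if (!child || child->parent == parent)
		return 0;

	/* Child was moved while unlocked: back out and report it. */
	parent->locked = false;
	return -EINVAL;
}

void model_parent_unlock(struct model_dir *parent)
{
	assert(parent->locked);
	parent->locked = false;
}

/* Cleanup that takes the lock itself instead of relying on the caller. */
int model_cleanup_unlocked(struct model_dir *workdir, struct model_entry *e)
{
	int err;

	err = model_parent_lock(workdir, e);
	if (err)
		return err;

	e->removed = true;		/* stands in for the real removal */
	model_parent_unlock(workdir);
	return 0;
}
```

The point of the model is that a failed check leaves the directory
unlocked and the entry untouched, so the caller never has to undo a
partially taken lock.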

A fresh cleanup block is added which doesn't share code with other
cleanup blocks.  It will get a new user in the next patch.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/copy_up.c   | 12 ++++++++++--
 fs/overlayfs/dir.c       | 15 +++++++++++++++
 fs/overlayfs/overlayfs.h |  6 ++++++
 fs/overlayfs/util.c      | 10 ++++++++++
 4 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 8a3c0d18ec2e..5d21b8d94a0a 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -794,6 +794,9 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 	 */
 	path.dentry = temp;
 	err = ovl_copy_up_data(c, &path);
+	if (err)
+		goto cleanup_need_write;
+
 	/*
 	 * We cannot hold lock_rename() throughout this helper, because of
 	 * lock ordering with sb_writers, which shouldn't be held when calling
@@ -809,8 +812,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 		if (IS_ERR(trap))
 			goto out;
 		goto unlock;
-	} else if (err) {
-		goto cleanup;
 	}
 
 	err = ovl_copy_up_metadata(c, temp);
@@ -857,6 +858,13 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 	ovl_cleanup(ofs, wdir, temp);
 	dput(temp);
 	goto unlock;
+
+cleanup_need_write:
+	ovl_start_write(c->dentry);
+	ovl_cleanup_unlocked(ofs, c->workdir, temp);
+	ovl_end_write(c->dentry);
+	dput(temp);
+	return err;
 }
 
 /* Copyup using O_TMPFILE which does not require cross dir locking */
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 4fc221ea6480..cee35d69e0e6 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -43,6 +43,21 @@ int ovl_cleanup(struct ovl_fs *ofs, struct inode *wdir, struct dentry *wdentry)
 	return err;
 }
 
+int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir,
+			 struct dentry *wdentry)
+{
+	int err;
+
+	err = parent_lock(workdir, wdentry);
+	if (err)
+		return err;
+
+	ovl_cleanup(ofs, workdir->d_inode, wdentry);
+	parent_unlock(workdir);
+
+	return err;
+}
+
 struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir)
 {
 	struct dentry *temp;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 42228d10f6b9..68dc78c712a8 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -416,6 +416,11 @@ static inline bool ovl_open_flags_need_copy_up(int flags)
 }
 
 /* util.c */
+int parent_lock(struct dentry *parent, struct dentry *child);
+static inline void parent_unlock(struct dentry *parent)
+{
+	inode_unlock(parent->d_inode);
+}
 int ovl_get_write_access(struct dentry *dentry);
 void ovl_put_write_access(struct dentry *dentry);
 void ovl_start_write(struct dentry *dentry);
@@ -843,6 +848,7 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs,
 			       struct inode *dir, struct dentry *newdentry,
 			       struct ovl_cattr *attr);
 int ovl_cleanup(struct ovl_fs *ofs, struct inode *dir, struct dentry *dentry);
+int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir, struct dentry *dentry);
 struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir);
 struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
 			       struct ovl_cattr *attr);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 2b4754c645ee..a5105d68f6b4 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -1544,3 +1544,13 @@ void ovl_copyattr(struct inode *inode)
 	i_size_write(inode, i_size_read(realinode));
 	spin_unlock(&inode->i_lock);
 }
+
+int parent_lock(struct dentry *parent, struct dentry *child)
+{
+	inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
+	if (!child || child->d_parent == parent)
+		return 0;
+
+	inode_unlock(parent->d_inode);
+	return -EINVAL;
+}
-- 
2.49.0



* [PATCH 02/20] ovl: change ovl_create_index() to take write and dir locks
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
  2025-07-10 23:03 ` [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 10:41   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 03/20] ovl: Call ovl_create_temp() without lock held NeilBrown
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

ovl_copy_up_workdir() currently takes a rename lock on two directories,
then uses the lock to create a file in one directory, perform a
rename, and possibly unlink the file for cleanup.  This is incompatible
with proposed changes which will lock just the dentry of objects being
acted on.

This patch moves the call to ovl_create_index() earlier in
ovl_copy_up_workdir() to before the lock is taken, and also before write
access to the filesystem is gained (this last is not strictly necessary
but seems cleaner).

ovl_create_index() then takes the required locks and drops them before
returning.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/copy_up.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 5d21b8d94a0a..25be0b80a40b 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -517,8 +517,6 @@ static int ovl_set_upper_fh(struct ovl_fs *ofs, struct dentry *upper,
 
 /*
  * Create and install index entry.
- *
- * Caller must hold i_mutex on indexdir.
  */
 static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
 			    struct dentry *upper)
@@ -550,7 +548,10 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
 	if (err)
 		return err;
 
+	ovl_start_write(dentry);
+	inode_lock(dir);
 	temp = ovl_create_temp(ofs, indexdir, OVL_CATTR(S_IFDIR | 0));
+	inode_unlock(dir);
 	err = PTR_ERR(temp);
 	if (IS_ERR(temp))
 		goto free_name;
@@ -559,6 +560,9 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
 	if (err)
 		goto out;
 
+	err = parent_lock(indexdir, temp);
+	if (err)
+		goto out;
 	index = ovl_lookup_upper(ofs, name.name, indexdir, name.len);
 	if (IS_ERR(index)) {
 		err = PTR_ERR(index);
@@ -566,9 +570,11 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
 		err = ovl_do_rename(ofs, indexdir, temp, indexdir, index, 0);
 		dput(index);
 	}
+	parent_unlock(indexdir);
 out:
 	if (err)
-		ovl_cleanup(ofs, dir, temp);
+		ovl_cleanup_unlocked(ofs, indexdir, temp);
+	ovl_end_write(dentry);
 	dput(temp);
 free_name:
 	kfree(name.name);
@@ -797,6 +803,12 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 	if (err)
 		goto cleanup_need_write;
 
+	if (S_ISDIR(c->stat.mode) && c->indexed) {
+		err = ovl_create_index(c->dentry, c->origin_fh, temp);
+		if (err)
+			goto cleanup_need_write;
+	}
+
 	/*
 	 * We cannot hold lock_rename() throughout this helper, because of
 	 * lock ordering with sb_writers, which shouldn't be held when calling
@@ -818,12 +830,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 	if (err)
 		goto cleanup;
 
-	if (S_ISDIR(c->stat.mode) && c->indexed) {
-		err = ovl_create_index(c->dentry, c->origin_fh, temp);
-		if (err)
-			goto cleanup;
-	}
-
 	upper = ovl_lookup_upper(ofs, c->destname.name, c->destdir,
 				 c->destname.len);
 	err = PTR_ERR(upper);
-- 
2.49.0



* [PATCH 03/20] ovl: Call ovl_create_temp() without lock held.
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
  2025-07-10 23:03 ` [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir() NeilBrown
  2025-07-10 23:03 ` [PATCH 02/20] ovl: change ovl_create_index() to take write and dir locks NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 11:10   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 04/20] ovl: narrow the locked region in ovl_copy_up_workdir() NeilBrown
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

ovl currently locks a directory or two and then performs multiple actions
in one or both directories.  This is incompatible with proposed changes
which will lock just the dentry of objects being acted on.

This patch moves calls to ovl_create_temp() out of the locked regions and
has it take and release the relevant lock itself.

The lock that was taken before this function was called is now taken
after.  This means that any code between where the lock was taken and
ovl_create_temp() is now unlocked.  This necessitates the use of
ovl_cleanup_unlocked() and the creation of ovl_lookup_upper_unlocked().
These will be used more widely in future patches.

Now that the file is created before the lock is taken for rename, we
need to ensure the parent wasn't changed before the lock was gained.
ovl_lock_rename_workdir() is changed to optionally receive the dentries
that will be involved in the rename.  If either is present but has the
wrong parent, an error is returned.
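
The optional parent checks can be modelled in userspace roughly like
this.  It is a simplified, single-threaded sketch: the model_* names
are hypothetical, locks are plain flags, and the lock ordering and
ancestor ("trap") checks that the real lock_rename() performs are left
out.  Returning -EIO on a mismatch is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not kernel types. */
struct model_dir {
	bool locked;
};

struct model_entry {
	struct model_dir *parent;	/* models dentry->d_parent */
};

void model_unlock_both(struct model_dir *a, struct model_dir *b)
{
	assert(a->locked && b->locked);
	a->locked = false;
	b->locked = false;
}

/*
 * Lock both directories, then verify each supplied child.  A NULL child
 * means "nothing to check" for that directory.
 */
int model_lock_rename_workdir(struct model_dir *workdir,
			      struct model_entry *work,
			      struct model_dir *upperdir,
			      struct model_entry *upper)
{
	/* The two directories must be distinct. */
	if (workdir == upperdir)
		return -EIO;

	assert(!workdir->locked && !upperdir->locked);
	workdir->locked = true;
	upperdir->locked = true;

	if (work && work->parent != workdir)
		goto err_unlock;
	if (upper && upper->parent != upperdir)
		goto err_unlock;
	return 0;

err_unlock:
	model_unlock_both(workdir, upperdir);
	return -EIO;
}
```

Callers that have not created anything yet pass NULL for both
children, which in this model gives the same behaviour as before the
extra arguments existed.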

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/copy_up.c   |  5 ---
 fs/overlayfs/dir.c       | 67 ++++++++++++++++++++--------------------
 fs/overlayfs/overlayfs.h | 12 ++++++-
 fs/overlayfs/super.c     | 11 ++++---
 fs/overlayfs/util.c      |  7 ++++-
 5 files changed, 58 insertions(+), 44 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 25be0b80a40b..eafb46686854 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -523,7 +523,6 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
 {
 	struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
 	struct dentry *indexdir = ovl_indexdir(dentry->d_sb);
-	struct inode *dir = d_inode(indexdir);
 	struct dentry *index = NULL;
 	struct dentry *temp = NULL;
 	struct qstr name = { };
@@ -549,9 +548,7 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
 		return err;
 
 	ovl_start_write(dentry);
-	inode_lock(dir);
 	temp = ovl_create_temp(ofs, indexdir, OVL_CATTR(S_IFDIR | 0));
-	inode_unlock(dir);
 	err = PTR_ERR(temp);
 	if (IS_ERR(temp))
 		goto free_name;
@@ -785,9 +782,7 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 		return err;
 
 	ovl_start_write(c->dentry);
-	inode_lock(wdir);
 	temp = ovl_create_temp(ofs, c->workdir, &cattr);
-	inode_unlock(wdir);
 	ovl_end_write(c->dentry);
 	ovl_revert_cu_creds(&cc);
 
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index cee35d69e0e6..144e1753d0c9 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -214,8 +214,12 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
 			       struct ovl_cattr *attr)
 {
-	return ovl_create_real(ofs, d_inode(workdir),
-			       ovl_lookup_temp(ofs, workdir), attr);
+	struct dentry *ret;
+	inode_lock(workdir->d_inode);
+	ret = ovl_create_real(ofs, d_inode(workdir),
+			      ovl_lookup_temp(ofs, workdir), attr);
+	inode_unlock(workdir->d_inode);
+	return ret;
 }
 
 static int ovl_set_opaque_xerr(struct dentry *dentry, struct dentry *upper,
@@ -353,7 +357,6 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 	struct dentry *workdir = ovl_workdir(dentry);
 	struct inode *wdir = workdir->d_inode;
 	struct dentry *upperdir = ovl_dentry_upper(dentry->d_parent);
-	struct inode *udir = upperdir->d_inode;
 	struct path upperpath;
 	struct dentry *upper;
 	struct dentry *opaquedir;
@@ -363,28 +366,25 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 	if (WARN_ON(!workdir))
 		return ERR_PTR(-EROFS);
 
-	err = ovl_lock_rename_workdir(workdir, upperdir);
-	if (err)
-		goto out;
-
 	ovl_path_upper(dentry, &upperpath);
 	err = vfs_getattr(&upperpath, &stat,
 			  STATX_BASIC_STATS, AT_STATX_SYNC_AS_STAT);
 	if (err)
-		goto out_unlock;
+		goto out;
 
 	err = -ESTALE;
 	if (!S_ISDIR(stat.mode))
-		goto out_unlock;
+		goto out;
 	upper = upperpath.dentry;
-	if (upper->d_parent->d_inode != udir)
-		goto out_unlock;
 
 	opaquedir = ovl_create_temp(ofs, workdir, OVL_CATTR(stat.mode));
 	err = PTR_ERR(opaquedir);
 	if (IS_ERR(opaquedir))
-		goto out_unlock;
-
+		/* workdir was unlocked, no upperdir */
+		goto out;
+	err = ovl_lock_rename_workdir(workdir, opaquedir, upperdir, upper);
+	if (err)
+		goto out_cleanup_unlocked;
 	err = ovl_copy_xattr(dentry->d_sb, &upperpath, opaquedir);
 	if (err)
 		goto out_cleanup;
@@ -413,10 +413,10 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 	return opaquedir;
 
 out_cleanup:
-	ovl_cleanup(ofs, wdir, opaquedir);
-	dput(opaquedir);
-out_unlock:
 	unlock_rename(workdir, upperdir);
+out_cleanup_unlocked:
+	ovl_cleanup_unlocked(ofs, workdir, opaquedir);
+	dput(opaquedir);
 out:
 	return ERR_PTR(err);
 }
@@ -454,15 +454,11 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 			return err;
 	}
 
-	err = ovl_lock_rename_workdir(workdir, upperdir);
-	if (err)
-		goto out;
-
-	upper = ovl_lookup_upper(ofs, dentry->d_name.name, upperdir,
-				 dentry->d_name.len);
+	upper = ovl_lookup_upper_unlocked(ofs, dentry->d_name.name, upperdir,
+					  dentry->d_name.len);
 	err = PTR_ERR(upper);
 	if (IS_ERR(upper))
-		goto out_unlock;
+		goto out;
 
 	err = -ESTALE;
 	if (d_is_negative(upper) || !ovl_upper_is_whiteout(ofs, upper))
@@ -473,6 +469,10 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 	if (IS_ERR(newdentry))
 		goto out_dput;
 
+	err = ovl_lock_rename_workdir(workdir, newdentry, upperdir, upper);
+	if (err)
+		goto out_cleanup;
+
 	/*
 	 * mode could have been mutilated due to umask (e.g. sgid directory)
 	 */
@@ -487,35 +487,35 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 		err = ovl_do_notify_change(ofs, newdentry, &attr);
 		inode_unlock(newdentry->d_inode);
 		if (err)
-			goto out_cleanup;
+			goto out_cleanup_locked;
 	}
 	if (!hardlink) {
 		err = ovl_set_upper_acl(ofs, newdentry,
 					XATTR_NAME_POSIX_ACL_ACCESS, acl);
 		if (err)
-			goto out_cleanup;
+			goto out_cleanup_locked;
 
 		err = ovl_set_upper_acl(ofs, newdentry,
 					XATTR_NAME_POSIX_ACL_DEFAULT, default_acl);
 		if (err)
-			goto out_cleanup;
+			goto out_cleanup_locked;
 	}
 
 	if (!hardlink && S_ISDIR(cattr->mode)) {
 		err = ovl_set_opaque(dentry, newdentry);
 		if (err)
-			goto out_cleanup;
+			goto out_cleanup_locked;
 
 		err = ovl_do_rename(ofs, workdir, newdentry, upperdir, upper,
 				    RENAME_EXCHANGE);
 		if (err)
-			goto out_cleanup;
+			goto out_cleanup_locked;
 
 		ovl_cleanup(ofs, wdir, upper);
 	} else {
 		err = ovl_do_rename(ofs, workdir, newdentry, upperdir, upper, 0);
 		if (err)
-			goto out_cleanup;
+			goto out_cleanup_locked;
 	}
 	ovl_dir_modified(dentry->d_parent, false);
 	err = ovl_instantiate(dentry, inode, newdentry, hardlink, NULL);
@@ -523,10 +523,9 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 		ovl_cleanup(ofs, udir, newdentry);
 		dput(newdentry);
 	}
+	unlock_rename(workdir, upperdir);
 out_dput:
 	dput(upper);
-out_unlock:
-	unlock_rename(workdir, upperdir);
 out:
 	if (!hardlink) {
 		posix_acl_release(acl);
@@ -534,8 +533,10 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 	}
 	return err;
 
+out_cleanup_locked:
+	unlock_rename(workdir, upperdir);
 out_cleanup:
-	ovl_cleanup(ofs, wdir, newdentry);
+	ovl_cleanup_unlocked(ofs, workdir, newdentry);
 	dput(newdentry);
 	goto out_dput;
 }
@@ -772,7 +773,7 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
 			goto out;
 	}
 
-	err = ovl_lock_rename_workdir(workdir, upperdir);
+	err = ovl_lock_rename_workdir(workdir, NULL, upperdir, NULL);
 	if (err)
 		goto out_dput;
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 68dc78c712a8..ec804d6bb2ef 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -407,6 +407,15 @@ static inline struct dentry *ovl_lookup_upper(struct ovl_fs *ofs,
 	return lookup_one(ovl_upper_mnt_idmap(ofs), &QSTR_LEN(name, len), base);
 }
 
+static inline struct dentry *ovl_lookup_upper_unlocked(struct ovl_fs *ofs,
+						       const char *name,
+						       struct dentry *base,
+						       int len)
+{
+	return lookup_one_unlocked(ovl_upper_mnt_idmap(ofs),
+				   &QSTR_LEN(name, len), base);
+}
+
 static inline bool ovl_open_flags_need_copy_up(int flags)
 {
 	if (!flags)
@@ -540,7 +549,8 @@ bool ovl_is_inuse(struct dentry *dentry);
 bool ovl_need_index(struct dentry *dentry);
 int ovl_nlink_start(struct dentry *dentry);
 void ovl_nlink_end(struct dentry *dentry);
-int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
+int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *work,
+			    struct dentry *upperdir, struct dentry *upper);
 int ovl_check_metacopy_xattr(struct ovl_fs *ofs, const struct path *path,
 			     struct ovl_metacopy *data);
 int ovl_set_metacopy_xattr(struct ovl_fs *ofs, struct dentry *d,
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index cf99b276fdfb..9cce3251dd83 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -564,13 +564,16 @@ static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
 	struct name_snapshot name;
 	int err;
 
-	inode_lock_nested(dir, I_MUTEX_PARENT);
-
 	temp = ovl_create_temp(ofs, workdir, OVL_CATTR(S_IFREG | 0));
 	err = PTR_ERR(temp);
 	if (IS_ERR(temp))
-		goto out_unlock;
+		return err;
 
+	err = parent_lock(workdir, temp);
+	if (err) {
+		dput(temp);
+		return err;
+	}
 	dest = ovl_lookup_temp(ofs, workdir);
 	err = PTR_ERR(dest);
 	if (IS_ERR(dest)) {
@@ -606,7 +609,7 @@ static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
 	dput(dest);
 
 out_unlock:
-	inode_unlock(dir);
+	parent_unlock(workdir);
 
 	return err;
 }
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index a5105d68f6b4..9ce9fe62ef28 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -1220,7 +1220,8 @@ void ovl_nlink_end(struct dentry *dentry)
 	ovl_inode_unlock(inode);
 }
 
-int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir)
+int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *work,
+			    struct dentry *upperdir, struct dentry *upper)
 {
 	struct dentry *trap;
 
@@ -1234,6 +1235,10 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir)
 		goto err;
 	if (trap)
 		goto err_unlock;
+	if (work && work->d_parent != workdir)
+		goto err_unlock;
+	if (upper && upper->d_parent != upperdir)
+		goto err_unlock;
 
 	return 0;
 
-- 
2.49.0



* [PATCH 04/20] ovl: narrow the locked region in ovl_copy_up_workdir()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (2 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 03/20] ovl: Call ovl_create_temp() without lock held NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 12:03   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 05/20] ovl: narrow locking in ovl_create_upper() NeilBrown
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

In ovl_copy_up_workdir() unlock immediately after the rename, and then
use ovl_cleanup_unlocked() with separate locking rather than using the
same lock to protect both.

This makes way for future changes where locks are taken on individual
dentries rather than the whole directory.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/copy_up.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index eafb46686854..7b84a39c081f 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -765,7 +765,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 {
 	struct ovl_fs *ofs = OVL_FS(c->dentry->d_sb);
 	struct inode *inode;
-	struct inode *wdir = d_inode(c->workdir);
 	struct path path = { .mnt = ovl_upper_mnt(ofs) };
 	struct dentry *temp, *upper, *trap;
 	struct ovl_cu_creds cc;
@@ -816,9 +815,9 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 		/* temp or workdir moved underneath us? abort without cleanup */
 		dput(temp);
 		err = -EIO;
-		if (IS_ERR(trap))
-			goto out;
-		goto unlock;
+		if (!IS_ERR(trap))
+			unlock_rename(c->workdir, c->destdir);
+		goto out;
 	}
 
 	err = ovl_copy_up_metadata(c, temp);
@@ -832,9 +831,10 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 		goto cleanup;
 
 	err = ovl_do_rename(ofs, c->workdir, temp, c->destdir, upper, 0);
+	unlock_rename(c->workdir, c->destdir);
 	dput(upper);
 	if (err)
-		goto cleanup;
+		goto cleanup_unlocked;
 
 	inode = d_inode(c->dentry);
 	if (c->metacopy_digest)
@@ -848,17 +848,17 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 	ovl_inode_update(inode, temp);
 	if (S_ISDIR(inode->i_mode))
 		ovl_set_flag(OVL_WHITEOUTS, inode);
-unlock:
-	unlock_rename(c->workdir, c->destdir);
 out:
 	ovl_end_write(c->dentry);
 
 	return err;
 
 cleanup:
-	ovl_cleanup(ofs, wdir, temp);
+	unlock_rename(c->workdir, c->destdir);
+cleanup_unlocked:
+	ovl_cleanup_unlocked(ofs, c->workdir, temp);
 	dput(temp);
-	goto unlock;
+	goto out;
 
 cleanup_need_write:
 	ovl_start_write(c->dentry);
-- 
2.49.0



* [PATCH 05/20] ovl: narrow locking in ovl_create_upper()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (3 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 04/20] ovl: narrow the locked region in ovl_copy_up_workdir() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 12:09   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 06/20] ovl: narrow locking in ovl_clear_empty() NeilBrown
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Drop the directory lock immediately after the ovl_create_real() call and
take a separate lock later for cleanup in ovl_cleanup_unlocked() - if
needed.

This makes way for future changes where locks are taken on individual
dentries rather than the whole directory.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/dir.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 144e1753d0c9..fa438e13e8b1 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -326,9 +326,9 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 				    ovl_lookup_upper(ofs, dentry->d_name.name,
 						     upperdir, dentry->d_name.len),
 				    attr);
-	err = PTR_ERR(newdentry);
+	inode_unlock(udir);
 	if (IS_ERR(newdentry))
-		goto out_unlock;
+		return PTR_ERR(newdentry);
 
 	if (ovl_type_merge(dentry->d_parent) && d_is_dir(newdentry) &&
 	    !ovl_allow_offline_changes(ofs)) {
@@ -340,14 +340,12 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 	err = ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink, NULL);
 	if (err)
 		goto out_cleanup;
-out_unlock:
-	inode_unlock(udir);
-	return err;
+	return 0;
 
 out_cleanup:
-	ovl_cleanup(ofs, udir, newdentry);
+	ovl_cleanup_unlocked(ofs, upperdir, newdentry);
 	dput(newdentry);
-	goto out_unlock;
+	return err;
 }
 
 static struct dentry *ovl_clear_empty(struct dentry *dentry,
-- 
2.49.0



* [PATCH 06/20] ovl: narrow locking in ovl_clear_empty()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (4 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 05/20] ovl: narrow locking in ovl_create_upper() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 12:27   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 07/20] ovl: narrow locking in ovl_create_over_whiteout() NeilBrown
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Drop the locks immediately after rename, and use a separate lock for
cleanup.

This makes way for future changes where locks are taken on individual
dentries rather than the whole directory.

Note that ovl_cleanup_whiteouts() operates on "upper", a child of
"upperdir" and does not require upperdir or workdir to be locked.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/dir.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index fa438e13e8b1..b3d858654f23 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -353,7 +353,6 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 {
 	struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
 	struct dentry *workdir = ovl_workdir(dentry);
-	struct inode *wdir = workdir->d_inode;
 	struct dentry *upperdir = ovl_dentry_upper(dentry->d_parent);
 	struct path upperpath;
 	struct dentry *upper;
@@ -400,10 +399,10 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 	err = ovl_do_rename(ofs, workdir, opaquedir, upperdir, upper, RENAME_EXCHANGE);
 	if (err)
 		goto out_cleanup;
+	unlock_rename(workdir, upperdir);
 
 	ovl_cleanup_whiteouts(ofs, upper, list);
-	ovl_cleanup(ofs, wdir, upper);
-	unlock_rename(workdir, upperdir);
+	ovl_cleanup_unlocked(ofs, workdir, upper);
 
 	/* dentry's upper doesn't match now, get rid of it */
 	d_drop(dentry);
-- 
2.49.0



* [PATCH 07/20] ovl: narrow locking in ovl_create_over_whiteout()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (5 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 06/20] ovl: narrow locking in ovl_clear_empty() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 12:42   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 08/20] ovl: narrow locking in ovl_rename() NeilBrown
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Unlock the parents immediately after the rename, and use
ovl_cleanup_unlocked() for cleanup, which takes a separate lock.

This makes way for future changes where locks are taken on individual
dentries rather than the whole directory.

Signed-off-by: NeilBrown <neil@brown.name>
---
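Illustration only, not part of the patch: a minimal userspace sketch of
the pattern described above, assuming a pthread mutex as a stand-in for
the directory lock.  struct dir, create_over() and cleanup_unlocked() are
hypothetical names; the real code works on dentries with unlock_rename()
and ovl_cleanup_unlocked().

```c
#include <pthread.h>

/* Hypothetical stand-in for a directory protected by a single lock. */
struct dir {
	pthread_mutex_t lock;
	int entries;
	int lock_count;		/* how many times the lock was taken */
};

static void dir_lock(struct dir *d)
{
	pthread_mutex_lock(&d->lock);
	d->lock_count++;
}

static void dir_unlock(struct dir *d)
{
	pthread_mutex_unlock(&d->lock);
}

/* Like an *_unlocked() helper: the caller must NOT hold d->lock. */
static void cleanup_unlocked(struct dir *d)
{
	dir_lock(d);
	d->entries--;
	dir_unlock(d);
}

/* The lock covers only the "rename" step; cleanup locks separately. */
static int create_over(struct dir *d, int fail)
{
	int err;

	dir_lock(d);
	err = fail ? -1 : 0;
	if (!err)
		d->entries++;	/* stands in for the rename */
	dir_unlock(d);		/* dropped before the error check */
	if (err)
		return err;

	cleanup_unlocked(d);
	return 0;
}
```

On success the lock is taken twice, once for the rename and once inside
the cleanup helper; on failure it is taken once and is never held while
the error path runs.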
 fs/overlayfs/dir.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index b3d858654f23..687d5e12289c 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -432,9 +432,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 {
 	struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
 	struct dentry *workdir = ovl_workdir(dentry);
-	struct inode *wdir = workdir->d_inode;
 	struct dentry *upperdir = ovl_dentry_upper(dentry->d_parent);
-	struct inode *udir = upperdir->d_inode;
 	struct dentry *upper;
 	struct dentry *newdentry;
 	int err;
@@ -505,22 +503,23 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 
 		err = ovl_do_rename(ofs, workdir, newdentry, upperdir, upper,
 				    RENAME_EXCHANGE);
+		unlock_rename(workdir, upperdir);
 		if (err)
-			goto out_cleanup_locked;
+			goto out_cleanup;
 
-		ovl_cleanup(ofs, wdir, upper);
+		ovl_cleanup_unlocked(ofs, workdir, upper);
 	} else {
 		err = ovl_do_rename(ofs, workdir, newdentry, upperdir, upper, 0);
+		unlock_rename(workdir, upperdir);
 		if (err)
-			goto out_cleanup_locked;
+			goto out_cleanup;
 	}
 	ovl_dir_modified(dentry->d_parent, false);
 	err = ovl_instantiate(dentry, inode, newdentry, hardlink, NULL);
 	if (err) {
-		ovl_cleanup(ofs, udir, newdentry);
+		ovl_cleanup_unlocked(ofs, upperdir, newdentry);
 		dput(newdentry);
 	}
-	unlock_rename(workdir, upperdir);
 out_dput:
 	dput(upper);
 out:
-- 
2.49.0



* [PATCH 08/20] ovl: narrow locking in ovl_rename()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (6 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 07/20] ovl: narrow locking in ovl_create_over_whiteout() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 13:03   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 09/20] ovl: narrow locking in ovl_cleanup_whiteouts() NeilBrown
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Drop the rename lock immediately after the rename, and use
ovl_cleanup_unlocked() for cleanup.

This makes way for future changes where locks are taken on individual
dentries rather than the whole directory.

Signed-off-by: NeilBrown <neil@brown.name>
---
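Illustration only, not part of the patch: the diff below moves the
error labels that still need to unlock to after the normal "return", and
has them jump back to the shared exit path.  A minimal userspace sketch
of that control flow, assuming a pthread mutex in place of the rename
lock; do_rename(), trace and the DID_* flags are hypothetical.

```c
#include <pthread.h>

static pthread_mutex_t rename_lock = PTHREAD_MUTEX_INITIALIZER;
static int trace;	/* bitmask recording which steps ran */

enum { DID_UNLOCK = 1, DID_CLEANUP = 2, DID_REVERT = 4 };

static int do_rename(int fail)
{
	int err;

	trace = 0;
	pthread_mutex_lock(&rename_lock);
	err = fail ? -5 : 0;
	if (err)
		goto out_unlock;	/* still locked: use the slow exit */
	pthread_mutex_unlock(&rename_lock);
	trace |= DID_UNLOCK;

	trace |= DID_CLEANUP;		/* cleanup runs without the lock */

out_revert:
	trace |= DID_REVERT;		/* shared by success and failure */
	return err;

out_unlock:
	/* Only reachable by goto, while the lock is still held. */
	pthread_mutex_unlock(&rename_lock);
	trace |= DID_UNLOCK;
	goto out_revert;
}
```

The success path falls straight through to the shared exit and never
passes an unlock it no longer needs, while failures before the unlock
still release the lock exactly once.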
 fs/overlayfs/dir.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 687d5e12289c..d01e83f9d800 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -1262,9 +1262,10 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 			    new_upperdir, newdentry, flags);
 	if (err)
 		goto out_dput;
+	unlock_rename(new_upperdir, old_upperdir);
 
 	if (cleanup_whiteout)
-		ovl_cleanup(ofs, old_upperdir->d_inode, newdentry);
+		ovl_cleanup_unlocked(ofs, old_upperdir, newdentry);
 
 	if (overwrite && d_inode(new)) {
 		if (new_is_dir)
@@ -1283,12 +1284,8 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 	if (d_inode(new) && ovl_dentry_upper(new))
 		ovl_copyattr(d_inode(new));
 
-out_dput:
 	dput(newdentry);
-out_dput_old:
 	dput(olddentry);
-out_unlock:
-	unlock_rename(new_upperdir, old_upperdir);
 out_revert_creds:
 	ovl_revert_creds(old_cred);
 	if (update_nlink)
@@ -1299,6 +1296,14 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 	dput(opaquedir);
 	ovl_cache_free(&list);
 	return err;
+
+out_dput:
+	dput(newdentry);
+out_dput_old:
+	dput(olddentry);
+out_unlock:
+	unlock_rename(new_upperdir, old_upperdir);
+	goto out_revert_creds;
 }
 
 static int ovl_create_tmpfile(struct file *file, struct dentry *dentry,
-- 
2.49.0



* [PATCH 09/20] ovl: narrow locking in ovl_cleanup_whiteouts()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (7 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 08/20] ovl: narrow locking in ovl_rename() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-10 23:03 ` [PATCH 10/20] ovl: narrow locking in ovl_cleanup_index() NeilBrown
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Rather than lock the directory for the whole operation, use
ovl_lookup_upper_unlocked() and ovl_cleanup_unlocked() to take the lock
only when needed.

This makes way for future changes where locks are taken on individual
dentries rather than the whole directory.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/readdir.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 68cca52ae2ac..2a222b8185a3 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -1034,14 +1034,13 @@ void ovl_cleanup_whiteouts(struct ovl_fs *ofs, struct dentry *upper,
 {
 	struct ovl_cache_entry *p;
 
-	inode_lock_nested(upper->d_inode, I_MUTEX_CHILD);
 	list_for_each_entry(p, list, l_node) {
 		struct dentry *dentry;
 
 		if (WARN_ON(!p->is_whiteout || !p->is_upper))
 			continue;
 
-		dentry = ovl_lookup_upper(ofs, p->name, upper, p->len);
+		dentry = ovl_lookup_upper_unlocked(ofs, p->name, upper, p->len);
 		if (IS_ERR(dentry)) {
 			pr_err("lookup '%s/%.*s' failed (%i)\n",
 			       upper->d_name.name, p->len, p->name,
@@ -1049,10 +1048,9 @@ void ovl_cleanup_whiteouts(struct ovl_fs *ofs, struct dentry *upper,
 			continue;
 		}
 		if (dentry->d_inode)
-			ovl_cleanup(ofs, upper->d_inode, dentry);
+			ovl_cleanup_unlocked(ofs, upper, dentry);
 		dput(dentry);
 	}
-	inode_unlock(upper->d_inode);
 }
 
 static bool ovl_check_d_type(struct dir_context *ctx, const char *name,
-- 
2.49.0



* [PATCH 10/20] ovl: narrow locking in ovl_cleanup_index()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (8 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 09/20] ovl: narrow locking in ovl_cleanup_whiteouts() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 13:12   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 11/20] ovl: narrow locking in ovl_workdir_create() NeilBrown
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

ovl_cleanup_index() takes a lock on the directory and then does a lookup
and possibly one of two different cleanups.  This patch narrows the
locking: it uses the _unlocked() versions of the lookup and of
ovl_cleanup(), and takes the lock explicitly only around
ovl_cleanup_and_whiteout().

A subsequent patch will move that lock into ovl_cleanup_and_whiteout().

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/util.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 9ce9fe62ef28..7369193b11ec 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -1107,21 +1107,20 @@ static void ovl_cleanup_index(struct dentry *dentry)
 		goto out;
 	}
 
-	inode_lock_nested(dir, I_MUTEX_PARENT);
-	index = ovl_lookup_upper(ofs, name.name, indexdir, name.len);
+	index = ovl_lookup_upper_unlocked(ofs, name.name, indexdir, name.len);
 	err = PTR_ERR(index);
 	if (IS_ERR(index)) {
 		index = NULL;
 	} else if (ovl_index_all(dentry->d_sb)) {
 		/* Whiteout orphan index to block future open by handle */
+		inode_lock_nested(dir, I_MUTEX_PARENT);
 		err = ovl_cleanup_and_whiteout(OVL_FS(dentry->d_sb),
 					       indexdir, index);
+		inode_unlock(dir);
 	} else {
 		/* Cleanup orphan index entries */
-		err = ovl_cleanup(ofs, dir, index);
+		err = ovl_cleanup_unlocked(ofs, indexdir, index);
 	}
-
-	inode_unlock(dir);
 	if (err)
 		goto fail;
 
-- 
2.49.0



* [PATCH 11/20] ovl: narrow locking in ovl_workdir_create()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (9 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 10/20] ovl: narrow locking in ovl_cleanup_index() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 13:32   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 12/20] ovl: narrow locking in ovl_indexdir_cleanup() NeilBrown
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

In ovl_workdir_create() don't hold the dir lock for the whole function,
but only take it when needed.

The lock is now taken separately around ovl_workdir_cleanup().  A
subsequent patch will move the locking into that function.

Signed-off-by: NeilBrown <neil@brown.name>
---
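Illustration only, not part of the patch: a minimal userspace sketch of
a retry loop that takes the lock at the top of each attempt and drops it
before cleaning up, assuming a pthread mutex as the directory lock.
struct workbase, cleanup_stale() and workdir_create() are hypothetical
names.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the work base directory. */
struct workbase {
	pthread_mutex_t lock;
	bool exists;	/* is a stale "work" entry present? */
	int lookups;	/* number of locked lookup attempts */
};

/* Takes the lock itself, so the caller must not hold it. */
static int cleanup_stale(struct workbase *b)
{
	pthread_mutex_lock(&b->lock);
	b->exists = false;
	pthread_mutex_unlock(&b->lock);
	return 0;
}

static int workdir_create(struct workbase *b)
{
	bool retried = false;
	bool stale;

retry:
	/* Lock only around the lookup and the create. */
	pthread_mutex_lock(&b->lock);
	b->lookups++;
	stale = b->exists;
	if (!stale)
		b->exists = true;	/* stands in for mkdir */
	pthread_mutex_unlock(&b->lock);

	if (!stale)
		return 0;
	if (retried)
		return -EEXIST;

	retried = true;
	cleanup_stale(b);		/* locks separately */
	goto retry;
}
```

Because the lock is released between the lookup and the cleanup, the
second attempt has to repeat the lookup rather than reuse the first
result.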
 fs/overlayfs/super.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 9cce3251dd83..239ae1946edf 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -299,8 +299,8 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 	int err;
 	bool retried = false;
 
-	inode_lock_nested(dir, I_MUTEX_PARENT);
 retry:
+	inode_lock_nested(dir, I_MUTEX_PARENT);
 	work = ovl_lookup_upper(ofs, name, ofs->workbasedir, strlen(name));
 
 	if (!IS_ERR(work)) {
@@ -311,23 +311,27 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 
 		if (work->d_inode) {
 			err = -EEXIST;
+			inode_unlock(dir);
 			if (retried)
 				goto out_dput;
 
 			if (persist)
-				goto out_unlock;
+				goto out;
 
 			retried = true;
+			inode_lock_nested(dir, I_MUTEX_PARENT);
 			err = ovl_workdir_cleanup(ofs, dir, mnt, work, 0);
+			inode_unlock(dir);
 			dput(work);
 			if (err == -EINVAL) {
 				work = ERR_PTR(err);
-				goto out_unlock;
+				goto out;
 			}
 			goto retry;
 		}
 
 		work = ovl_do_mkdir(ofs, dir, work, attr.ia_mode);
+		inode_unlock(dir);
 		err = PTR_ERR(work);
 		if (IS_ERR(work))
 			goto out_err;
@@ -365,11 +369,11 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 		if (err)
 			goto out_dput;
 	} else {
+		inode_unlock(dir);
 		err = PTR_ERR(work);
 		goto out_err;
 	}
-out_unlock:
-	inode_unlock(dir);
+out:
 	return work;
 
 out_dput:
@@ -378,7 +382,7 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 	pr_warn("failed to create directory %s/%s (errno: %i); mounting read-only\n",
 		ofs->config.workdir, name, -err);
 	work = NULL;
-	goto out_unlock;
+	goto out;
 }
 
 static int ovl_check_namelen(const struct path *path, struct ovl_fs *ofs,
-- 
2.49.0



* [PATCH 12/20] ovl: narrow locking in ovl_indexdir_cleanup()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (10 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 11/20] ovl: narrow locking in ovl_workdir_create() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 13:33   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 13/20] ovl: narrow locking in ovl_workdir_cleanup_recurse() NeilBrown
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Instead of taking the directory lock for the whole cleanup, only take it
when needed.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/readdir.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 2a222b8185a3..3a4bbc178203 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -1194,7 +1194,6 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 	if (err)
 		goto out;
 
-	inode_lock_nested(dir, I_MUTEX_PARENT);
 	list_for_each_entry(p, &list, l_node) {
 		if (p->name[0] == '.') {
 			if (p->len == 1)
@@ -1202,7 +1201,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 			if (p->len == 2 && p->name[1] == '.')
 				continue;
 		}
-		index = ovl_lookup_upper(ofs, p->name, indexdir, p->len);
+		index = ovl_lookup_upper_unlocked(ofs, p->name, indexdir, p->len);
 		if (IS_ERR(index)) {
 			err = PTR_ERR(index);
 			index = NULL;
@@ -1210,7 +1209,9 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 		}
 		/* Cleanup leftover from index create/cleanup attempt */
 		if (index->d_name.name[0] == '#') {
+			inode_lock_nested(dir, I_MUTEX_PARENT);
 			err = ovl_workdir_cleanup(ofs, dir, path.mnt, index, 1);
+			inode_unlock(dir);
 			if (err)
 				break;
 			goto next;
@@ -1220,7 +1221,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 			goto next;
 		} else if (err == -ESTALE) {
 			/* Cleanup stale index entries */
-			err = ovl_cleanup(ofs, dir, index);
+			err = ovl_cleanup_unlocked(ofs, indexdir, index);
 		} else if (err != -ENOENT) {
 			/*
 			 * Abort mount to avoid corrupting the index if
@@ -1233,10 +1234,12 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 			 * Whiteout orphan index to block future open by
 			 * handle after overlay nlink dropped to zero.
 			 */
+			inode_lock_nested(dir, I_MUTEX_PARENT);
 			err = ovl_cleanup_and_whiteout(ofs, indexdir, index);
+			inode_unlock(dir);
 		} else {
 			/* Cleanup orphan index entries */
-			err = ovl_cleanup(ofs, dir, index);
+			err = ovl_cleanup_unlocked(ofs, indexdir, index);
 		}
 
 		if (err)
@@ -1247,7 +1250,6 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 		index = NULL;
 	}
 	dput(index);
-	inode_unlock(dir);
 out:
 	ovl_cache_free(&list);
 	if (err)
-- 
2.49.0



* [PATCH 13/20] ovl: narrow locking in ovl_workdir_cleanup_recurse()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (11 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 12/20] ovl: narrow locking in ovl_indexdir_cleanup() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 13:35   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 14/20] ovl: change ovl_workdir_cleanup() to take dir lock as needed NeilBrown
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Only take the dir lock when needed, rather than for the whole loop.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/readdir.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 3a4bbc178203..b3d44bf56c78 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -1122,7 +1122,6 @@ static int ovl_workdir_cleanup_recurse(struct ovl_fs *ofs, const struct path *pa
 	if (err)
 		goto out;
 
-	inode_lock_nested(dir, I_MUTEX_PARENT);
 	list_for_each_entry(p, &list, l_node) {
 		struct dentry *dentry;
 
@@ -1137,16 +1136,18 @@ static int ovl_workdir_cleanup_recurse(struct ovl_fs *ofs, const struct path *pa
 			err = -EINVAL;
 			break;
 		}
-		dentry = ovl_lookup_upper(ofs, p->name, path->dentry, p->len);
+		dentry = ovl_lookup_upper_unlocked(ofs, p->name, path->dentry, p->len);
 		if (IS_ERR(dentry))
 			continue;
-		if (dentry->d_inode)
+		if (dentry->d_inode) {
+			inode_lock_nested(dir, I_MUTEX_PARENT);
 			err = ovl_workdir_cleanup(ofs, dir, path->mnt, dentry, level);
+			inode_unlock(dir);
+		}
 		dput(dentry);
 		if (err)
 			break;
 	}
-	inode_unlock(dir);
 out:
 	ovl_cache_free(&list);
 	return err;
-- 
2.49.0



* [PATCH 14/20] ovl: change ovl_workdir_cleanup() to take dir lock as needed
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (12 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 13/20] ovl: narrow locking in ovl_workdir_cleanup_recurse() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 13:28   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 15/20] ovl: narrow locking on ovl_remove_and_whiteout() NeilBrown
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Rather than calling ovl_workdir_cleanup() with the dir already locked,
change it to take the dir lock only when needed.

Signed-off-by: NeilBrown <neil@brown.name>
---
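Illustration only, not part of the patch: the diff below calls
parent_lock(), which is defined elsewhere in the series.  The sketch
assumes it locks the parent and then verifies that the child still
belongs to it, failing otherwise; the -EINVAL return value and the names
struct node, parent_lock_checked() and remove_child() are assumptions
made for this example.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry with a per-directory lock. */
struct node {
	pthread_mutex_t lock;
	struct node *parent;
	int nchildren;
};

/*
 * Lock the parent, then re-check the relationship, which may have
 * changed while no lock was held.  Returns with the lock held on
 * success and released on failure.
 */
static int parent_lock_checked(struct node *parent, struct node *child)
{
	pthread_mutex_lock(&parent->lock);
	if (child->parent == parent)
		return 0;
	pthread_mutex_unlock(&parent->lock);
	return -EINVAL;
}

/* The callee takes the lock itself; callers no longer hold it. */
static int remove_child(struct node *parent, struct node *child)
{
	int err = parent_lock_checked(parent, child);

	if (err)
		return err;
	child->parent = NULL;
	parent->nchildren--;
	pthread_mutex_unlock(&parent->lock);
	return 0;
}
```

Moving the locking into the callee means every caller gets the
revalidation step without having to repeat it.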
 fs/overlayfs/overlayfs.h |  2 +-
 fs/overlayfs/readdir.c   | 30 +++++++++++++-----------------
 fs/overlayfs/super.c     |  4 +---
 3 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index ec804d6bb2ef..ca74be44dddd 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -738,7 +738,7 @@ void ovl_cleanup_whiteouts(struct ovl_fs *ofs, struct dentry *upper,
 void ovl_cache_free(struct list_head *list);
 void ovl_dir_cache_free(struct inode *inode);
 int ovl_check_d_type_supported(const struct path *realpath);
-int ovl_workdir_cleanup(struct ovl_fs *ofs, struct inode *dir,
+int ovl_workdir_cleanup(struct ovl_fs *ofs, struct dentry *parent,
 			struct vfsmount *mnt, struct dentry *dentry, int level);
 int ovl_indexdir_cleanup(struct ovl_fs *ofs);
 
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index b3d44bf56c78..6cc5f885e036 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -1096,7 +1096,6 @@ static int ovl_workdir_cleanup_recurse(struct ovl_fs *ofs, const struct path *pa
 				       int level)
 {
 	int err;
-	struct inode *dir = path->dentry->d_inode;
 	LIST_HEAD(list);
 	struct ovl_cache_entry *p;
 	struct ovl_readdir_data rdd = {
@@ -1139,11 +1138,9 @@ static int ovl_workdir_cleanup_recurse(struct ovl_fs *ofs, const struct path *pa
 		dentry = ovl_lookup_upper_unlocked(ofs, p->name, path->dentry, p->len);
 		if (IS_ERR(dentry))
 			continue;
-		if (dentry->d_inode) {
-			inode_lock_nested(dir, I_MUTEX_PARENT);
-			err = ovl_workdir_cleanup(ofs, dir, path->mnt, dentry, level);
-			inode_unlock(dir);
-		}
+		if (dentry->d_inode)
+			err = ovl_workdir_cleanup(ofs, path->dentry, path->mnt,
+						  dentry, level);
 		dput(dentry);
 		if (err)
 			break;
@@ -1153,24 +1150,25 @@ static int ovl_workdir_cleanup_recurse(struct ovl_fs *ofs, const struct path *pa
 	return err;
 }
 
-int ovl_workdir_cleanup(struct ovl_fs *ofs, struct inode *dir,
+int ovl_workdir_cleanup(struct ovl_fs *ofs, struct dentry *parent,
 			struct vfsmount *mnt, struct dentry *dentry, int level)
 {
 	int err;
 
-	if (!d_is_dir(dentry) || level > 1) {
-		return ovl_cleanup(ofs, dir, dentry);
-	}
+	if (!d_is_dir(dentry) || level > 1)
+		return ovl_cleanup_unlocked(ofs, parent, dentry);
 
-	err = ovl_do_rmdir(ofs, dir, dentry);
+	err = parent_lock(parent, dentry);
+	if (err)
+		return err;
+	err = ovl_do_rmdir(ofs, parent->d_inode, dentry);
+	parent_unlock(parent);
 	if (err) {
 		struct path path = { .mnt = mnt, .dentry = dentry };
 
-		inode_unlock(dir);
 		err = ovl_workdir_cleanup_recurse(ofs, &path, level + 1);
-		inode_lock_nested(dir, I_MUTEX_PARENT);
 		if (!err)
-			err = ovl_cleanup(ofs, dir, dentry);
+			err = ovl_cleanup_unlocked(ofs, parent, dentry);
 	}
 
 	return err;
@@ -1210,9 +1208,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 		}
 		/* Cleanup leftover from index create/cleanup attempt */
 		if (index->d_name.name[0] == '#') {
-			inode_lock_nested(dir, I_MUTEX_PARENT);
-			err = ovl_workdir_cleanup(ofs, dir, path.mnt, index, 1);
-			inode_unlock(dir);
+			err = ovl_workdir_cleanup(ofs, indexdir, path.mnt, index, 1);
 			if (err)
 				break;
 			goto next;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 239ae1946edf..23f43f8131dd 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -319,9 +319,7 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 				goto out;
 
 			retried = true;
-			inode_lock_nested(dir, I_MUTEX_PARENT);
-			err = ovl_workdir_cleanup(ofs, dir, mnt, work, 0);
-			inode_unlock(dir);
+			err = ovl_workdir_cleanup(ofs, ofs->workbasedir, mnt, work, 0);
 			dput(work);
 			if (err == -EINVAL) {
 				work = ERR_PTR(err);
-- 
2.49.0



* [PATCH 15/20] ovl: narrow locking on ovl_remove_and_whiteout()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (13 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 14/20] ovl: change ovl_workdir_cleanup() to take dir lock as needed NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 13:42   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 16/20] ovl: change ovl_cleanup_and_whiteout() to take rename lock as needed NeilBrown
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Normally it is OK to perform a lookup and the subsequent operation on
the result under a single lock.  However, in this case
ovl_cleanup_and_whiteout() (potentially) also creates a whiteout inode,
so the lookup needs separate locking.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/dir.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index d01e83f9d800..8580cd5c61e4 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -769,15 +769,11 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
 			goto out;
 	}
 
-	err = ovl_lock_rename_workdir(workdir, NULL, upperdir, NULL);
-	if (err)
-		goto out_dput;
-
-	upper = ovl_lookup_upper(ofs, dentry->d_name.name, upperdir,
-				 dentry->d_name.len);
+	upper = ovl_lookup_upper_unlocked(ofs, dentry->d_name.name, upperdir,
+					  dentry->d_name.len);
 	err = PTR_ERR(upper);
 	if (IS_ERR(upper))
-		goto out_unlock;
+		goto out_dput;
 
 	err = -ESTALE;
 	if ((opaquedir && upper != opaquedir) ||
@@ -786,6 +782,10 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
 		goto out_dput_upper;
 	}
 
+	err = ovl_lock_rename_workdir(workdir, NULL, upperdir, upper);
+	if (err)
+		goto out_dput_upper;
+
 	err = ovl_cleanup_and_whiteout(ofs, upperdir, upper);
 	if (err)
 		goto out_d_drop;
@@ -793,10 +793,9 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
 	ovl_dir_modified(dentry->d_parent, true);
 out_d_drop:
 	d_drop(dentry);
+	unlock_rename(workdir, upperdir);
 out_dput_upper:
 	dput(upper);
-out_unlock:
-	unlock_rename(workdir, upperdir);
 out_dput:
 	dput(opaquedir);
 out:
-- 
2.49.0



* [PATCH 16/20] ovl: change ovl_cleanup_and_whiteout() to take rename lock as needed
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (14 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 15/20] ovl: narrow locking on ovl_remove_and_whiteout() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 13:50   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 17/20] ovl: narrow locking in ovl_whiteout() NeilBrown
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Rather than locking the directory or directories before calling
ovl_cleanup_and_whiteout(), change it (and ovl_whiteout()) to do the
locking, so the locking can be fine-grained, as will be needed for the
proposed locking changes.

Sometimes this is called to white out something in the index dir, in
which case only that dir must be locked.  In one case it is called on
something in an upperdir, so two directories must be locked.  We use
ovl_lock_rename_workdir() for this and remove the restriction that
upperdir cannot be indexdir, because now sometimes it is.

Signed-off-by: NeilBrown <neil@brown.name>
---
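Illustration only, not part of the patch: once the two directories
passed to the locking helper may be the same one, the helper has to lock
it only once.  A minimal userspace sketch, assuming pthread mutexes and
ordering by address to avoid ABBA deadlocks; that ordering rule is a
swapped-in simplification and is not how lock_rename() orders
directories.  lock_pair() and unlock_pair() are hypothetical names.

```c
#include <pthread.h>
#include <stdint.h>

/*
 * Lock one or two mutexes.  If both arguments name the same mutex it
 * is locked once.  Returns the number of locks taken.
 */
static int lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	if (a == b) {
		pthread_mutex_lock(a);
		return 1;
	}
	/* Fixed order (lowest address first) for distinct mutexes. */
	if ((uintptr_t)a > (uintptr_t)b) {
		pthread_mutex_t *tmp = a;

		a = b;
		b = tmp;
	}
	pthread_mutex_lock(a);
	pthread_mutex_lock(b);
	return 2;
}

static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	pthread_mutex_unlock(a);
	if (a != b)
		pthread_mutex_unlock(b);
}
```

With this shape a single helper serves both the index-dir case (one
directory) and the upperdir case (two directories).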
 fs/overlayfs/dir.c     | 20 +++++++++-----------
 fs/overlayfs/readdir.c |  3 ---
 fs/overlayfs/util.c    |  7 -------
 3 files changed, 9 insertions(+), 21 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 8580cd5c61e4..086719129be3 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -77,7 +77,6 @@ struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir)
 	return temp;
 }
 
-/* caller holds i_mutex on workdir */
 static struct dentry *ovl_whiteout(struct ovl_fs *ofs)
 {
 	int err;
@@ -85,6 +84,7 @@ static struct dentry *ovl_whiteout(struct ovl_fs *ofs)
 	struct dentry *workdir = ofs->workdir;
 	struct inode *wdir = workdir->d_inode;
 
+	inode_lock_nested(wdir, I_MUTEX_PARENT);
 	if (!ofs->whiteout) {
 		whiteout = ovl_lookup_temp(ofs, workdir);
 		if (IS_ERR(whiteout))
@@ -118,14 +118,13 @@ static struct dentry *ovl_whiteout(struct ovl_fs *ofs)
 	whiteout = ofs->whiteout;
 	ofs->whiteout = NULL;
 out:
+	inode_unlock(wdir);
 	return whiteout;
 }
 
-/* Caller must hold i_mutex on both workdir and dir */
 int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct dentry *dir,
 			     struct dentry *dentry)
 {
-	struct inode *wdir = ofs->workdir->d_inode;
 	struct dentry *whiteout;
 	int err;
 	int flags = 0;
@@ -138,18 +137,22 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct dentry *dir,
 	if (d_is_dir(dentry))
 		flags = RENAME_EXCHANGE;
 
-	err = ovl_do_rename(ofs, ofs->workdir, whiteout, dir, dentry, flags);
+	err = ovl_lock_rename_workdir(ofs->workdir, whiteout, dir, dentry);
+	if (!err) {
+		err = ovl_do_rename(ofs, ofs->workdir, whiteout, dir, dentry, flags);
+		unlock_rename(ofs->workdir, dir);
+	}
 	if (err)
 		goto kill_whiteout;
 	if (flags)
-		ovl_cleanup(ofs, wdir, dentry);
+		ovl_cleanup_unlocked(ofs, ofs->workdir, dentry);
 
 out:
 	dput(whiteout);
 	return err;
 
 kill_whiteout:
-	ovl_cleanup(ofs, wdir, whiteout);
+	ovl_cleanup_unlocked(ofs, ofs->workdir, whiteout);
 	goto out;
 }
 
@@ -782,10 +785,6 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
 		goto out_dput_upper;
 	}
 
-	err = ovl_lock_rename_workdir(workdir, NULL, upperdir, upper);
-	if (err)
-		goto out_dput_upper;
-
 	err = ovl_cleanup_and_whiteout(ofs, upperdir, upper);
 	if (err)
 		goto out_d_drop;
@@ -793,7 +792,6 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
 	ovl_dir_modified(dentry->d_parent, true);
 out_d_drop:
 	d_drop(dentry);
-	unlock_rename(workdir, upperdir);
 out_dput_upper:
 	dput(upper);
 out_dput:
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 6cc5f885e036..4127d1f160b3 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -1179,7 +1179,6 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 	int err;
 	struct dentry *indexdir = ofs->workdir;
 	struct dentry *index = NULL;
-	struct inode *dir = indexdir->d_inode;
 	struct path path = { .mnt = ovl_upper_mnt(ofs), .dentry = indexdir };
 	LIST_HEAD(list);
 	struct ovl_cache_entry *p;
@@ -1231,9 +1230,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 			 * Whiteout orphan index to block future open by
 			 * handle after overlay nlink dropped to zero.
 			 */
-			inode_lock_nested(dir, I_MUTEX_PARENT);
 			err = ovl_cleanup_and_whiteout(ofs, indexdir, index);
-			inode_unlock(dir);
 		} else {
 			/* Cleanup orphan index entries */
 			err = ovl_cleanup_unlocked(ofs, indexdir, index);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 7369193b11ec..5218a477551b 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -1071,7 +1071,6 @@ static void ovl_cleanup_index(struct dentry *dentry)
 {
 	struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
 	struct dentry *indexdir = ovl_indexdir(dentry->d_sb);
-	struct inode *dir = indexdir->d_inode;
 	struct dentry *lowerdentry = ovl_dentry_lower(dentry);
 	struct dentry *upperdentry = ovl_dentry_upper(dentry);
 	struct dentry *index = NULL;
@@ -1113,10 +1112,8 @@ static void ovl_cleanup_index(struct dentry *dentry)
 		index = NULL;
 	} else if (ovl_index_all(dentry->d_sb)) {
 		/* Whiteout orphan index to block future open by handle */
-		inode_lock_nested(dir, I_MUTEX_PARENT);
 		err = ovl_cleanup_and_whiteout(OVL_FS(dentry->d_sb),
 					       indexdir, index);
-		inode_unlock(dir);
 	} else {
 		/* Cleanup orphan index entries */
 		err = ovl_cleanup_unlocked(ofs, indexdir, index);
@@ -1224,10 +1221,6 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *work,
 {
 	struct dentry *trap;
 
-	/* Workdir should not be the same as upperdir */
-	if (workdir == upperdir)
-		goto err;
-
 	/* Workdir should not be subdir of upperdir and vice versa */
 	trap = lock_rename(workdir, upperdir);
 	if (IS_ERR(trap))
-- 
2.49.0



* [PATCH 17/20] ovl: narrow locking in ovl_whiteout()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (15 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 16/20] ovl: change ovl_cleanup_and_whiteout() to take rename lock as needed NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 15:19   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 18/20] ovl: narrow locking in ovl_check_rename_whiteout() NeilBrown
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

ovl_whiteout() relies on the workdir i_rwsem to provide exclusive access
to ofs->whiteout, which it manipulates.  Rather than depending on this,
add a new mutex, "whiteout_lock", to explicitly provide the required
locking.  Use guard(mutex) for this so that we can return without
needing to explicitly unlock.

Then take the lock on workdir only when needed: to look up the temp name
and to do the whiteout or link.

Signed-off-by: NeilBrown <neil@brown.name>
---
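Illustration only, not part of the patch: a minimal userspace sketch of
a scope-based lock guard, assuming the GCC/Clang cleanup attribute and a
pthread mutex.  GUARD_MUTEX(), unlock_cleanup(), take_whiteout() and
cached are hypothetical names; the patch itself uses the kernel's
guard(mutex).

```c
#include <pthread.h>

static pthread_mutex_t whiteout_lock = PTHREAD_MUTEX_INITIALIZER;
static int cached;	/* stand-in for a cached object; 0 means "none" */

/* Runs automatically when the guard variable goes out of scope. */
static void unlock_cleanup(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Lock now, unlock on every exit from the enclosing scope. */
#define GUARD_MUTEX(m)							\
	pthread_mutex_t *guard_						\
		__attribute__((cleanup(unlock_cleanup), unused)) =	\
		(pthread_mutex_lock(m), (m))

/* Hand out the cached value, creating it first if necessary. */
static int take_whiteout(int fail)
{
	int w;

	GUARD_MUTEX(&whiteout_lock);

	if (!cached) {
		if (fail)
			return -1;	/* early return: guard unlocks */
		cached = 42;
	}
	w = cached;
	cached = 0;
	return w;			/* normal return: guard unlocks */
}
```

The early return needs no "out:" label, which is the property the
commit message relies on.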
 fs/overlayfs/dir.c       | 49 +++++++++++++++++++++-------------------
 fs/overlayfs/ovl_entry.h |  1 +
 fs/overlayfs/params.c    |  2 ++
 3 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 086719129be3..fd89c25775bd 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -84,41 +84,44 @@ static struct dentry *ovl_whiteout(struct ovl_fs *ofs)
 	struct dentry *workdir = ofs->workdir;
 	struct inode *wdir = workdir->d_inode;
 
-	inode_lock_nested(wdir, I_MUTEX_PARENT);
+	guard(mutex)(&ofs->whiteout_lock);
+
 	if (!ofs->whiteout) {
+		inode_lock_nested(wdir, I_MUTEX_PARENT);
 		whiteout = ovl_lookup_temp(ofs, workdir);
-		if (IS_ERR(whiteout))
-			goto out;
-
-		err = ovl_do_whiteout(ofs, wdir, whiteout);
-		if (err) {
-			dput(whiteout);
-			whiteout = ERR_PTR(err);
-			goto out;
+		if (!IS_ERR(whiteout)) {
+			err = ovl_do_whiteout(ofs, wdir, whiteout);
+			if (err) {
+				dput(whiteout);
+				whiteout = ERR_PTR(err);
+			}
 		}
+		inode_unlock(wdir);
+		if (IS_ERR(whiteout))
+			return whiteout;
 		ofs->whiteout = whiteout;
 	}
 
 	if (!ofs->no_shared_whiteout) {
+		inode_lock_nested(wdir, I_MUTEX_PARENT);
 		whiteout = ovl_lookup_temp(ofs, workdir);
-		if (IS_ERR(whiteout))
-			goto out;
-
-		err = ovl_do_link(ofs, ofs->whiteout, wdir, whiteout);
-		if (!err)
-			goto out;
-
-		if (err != -EMLINK) {
-			pr_warn("Failed to link whiteout - disabling whiteout inode sharing(nlink=%u, err=%i)\n",
-				ofs->whiteout->d_inode->i_nlink, err);
-			ofs->no_shared_whiteout = true;
+		if (!IS_ERR(whiteout)) {
+			err = ovl_do_link(ofs, ofs->whiteout, wdir, whiteout);
+			if (err) {
+				dput(whiteout);
+				whiteout = ERR_PTR(err);
+			}
 		}
-		dput(whiteout);
+		inode_unlock(wdir);
+		if (!IS_ERR(whiteout) || PTR_ERR(whiteout) != -EMLINK)
+			return whiteout;
+
+		pr_warn("Failed to link whiteout - disabling whiteout inode sharing(nlink=%u, err=%i)\n",
+			ofs->whiteout->d_inode->i_nlink, err);
+		ofs->no_shared_whiteout = true;
 	}
 	whiteout = ofs->whiteout;
 	ofs->whiteout = NULL;
-out:
-	inode_unlock(wdir);
 	return whiteout;
 }
 
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index afb7762f873f..4c1bae935ced 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -88,6 +88,7 @@ struct ovl_fs {
 	/* Shared whiteout cache */
 	struct dentry *whiteout;
 	bool no_shared_whiteout;
+	struct mutex whiteout_lock;
 	/* r/o snapshot of upperdir sb's only taken on volatile mounts */
 	errseq_t errseq;
 };
diff --git a/fs/overlayfs/params.c b/fs/overlayfs/params.c
index f42488c01957..cb1a17c066cd 100644
--- a/fs/overlayfs/params.c
+++ b/fs/overlayfs/params.c
@@ -797,6 +797,8 @@ int ovl_init_fs_context(struct fs_context *fc)
 	fc->s_fs_info		= ofs;
 	fc->fs_private		= ctx;
 	fc->ops			= &ovl_context_ops;
+
+	mutex_init(&ofs->whiteout_lock);
 	return 0;
 
 out_err:
-- 
2.49.0



* [PATCH 18/20] ovl: narrow locking in ovl_check_rename_whiteout()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (16 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 17/20] ovl: narrow locking in ovl_whiteout() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11 13:54   ` Amir Goldstein
  2025-07-10 23:03 ` [PATCH 19/20] ovl: change ovl_create_real() to receive dentry parent NeilBrown
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

ovl_check_rename_whiteout() now only holds the directory lock when
needed, and takes it again if necessary.

This makes way for future changes where locks are taken on individual
dentries rather than the whole directory.

Signed-off-by: NeilBrown <neil@brown.name>
---
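Note: dropping the directory lock and taking it again means the child may
have moved in between, which is why the parent_lock() helper in this
series rechecks ->d_parent.  A minimal userspace sketch of that pattern
follows, assuming pthreads; struct node, node_parent_lock and
node_parent_unlock are hypothetical names modelled on parent_lock() and
parent_unlock(), not kernel code.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct node {
	pthread_mutex_t lock;	/* models the directory's i_rwsem */
	struct node *parent;	/* models dentry->d_parent */
};

/*
 * Lock the parent, then confirm the child still belongs to it.
 * On mismatch the lock is dropped again and -EINVAL is returned.
 */
static int node_parent_lock(struct node *parent, struct node *child)
{
	pthread_mutex_lock(&parent->lock);
	if (!child || child->parent == parent)
		return 0;
	pthread_mutex_unlock(&parent->lock);
	return -EINVAL;
}

static void node_parent_unlock(struct node *parent)
{
	pthread_mutex_unlock(&parent->lock);
}
```

A caller that gets -EINVAL holds no lock and should skip the operation
rather than act on a child that now lives elsewhere.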
 fs/overlayfs/super.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 23f43f8131dd..78f4fcfb9ff6 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -559,7 +559,6 @@ static int ovl_get_upper(struct super_block *sb, struct ovl_fs *ofs,
 static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
 {
 	struct dentry *workdir = ofs->workdir;
-	struct inode *dir = d_inode(workdir);
 	struct dentry *temp;
 	struct dentry *dest;
 	struct dentry *whiteout;
@@ -580,19 +579,22 @@ static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
 	err = PTR_ERR(dest);
 	if (IS_ERR(dest)) {
 		dput(temp);
-		goto out_unlock;
+		parent_unlock(workdir);
+		return err;
 	}
 
 	/* Name is inline and stable - using snapshot as a copy helper */
 	take_dentry_name_snapshot(&name, temp);
 	err = ovl_do_rename(ofs, workdir, temp, workdir, dest, RENAME_WHITEOUT);
+	parent_unlock(workdir);
 	if (err) {
 		if (err == -EINVAL)
 			err = 0;
 		goto cleanup_temp;
 	}
 
-	whiteout = ovl_lookup_upper(ofs, name.name.name, workdir, name.name.len);
+	whiteout = ovl_lookup_upper_unlocked(ofs, name.name.name,
+					     workdir, name.name.len);
 	err = PTR_ERR(whiteout);
 	if (IS_ERR(whiteout))
 		goto cleanup_temp;
@@ -601,18 +603,15 @@ static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
 
 	/* Best effort cleanup of whiteout and temp file */
 	if (err)
-		ovl_cleanup(ofs, dir, whiteout);
+		ovl_cleanup_unlocked(ofs, workdir, whiteout);
 	dput(whiteout);
 
 cleanup_temp:
-	ovl_cleanup(ofs, dir, temp);
+	ovl_cleanup_unlocked(ofs, workdir, temp);
 	release_dentry_name_snapshot(&name);
 	dput(temp);
 	dput(dest);
 
-out_unlock:
-	parent_unlock(workdir);
-
 	return err;
 }
 
-- 
2.49.0



* [PATCH 19/20] ovl: change ovl_create_real() to receive dentry parent
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (17 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 18/20] ovl: narrow locking in ovl_check_rename_whiteout() NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-10 23:03 ` [PATCH 20/20] ovl: rename ovl_cleanup_unlocked() to ovl_cleanup() NeilBrown
  2025-07-11 16:41 ` [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem Amir Goldstein
  20 siblings, 0 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

Instead of passing an inode *dir, pass a dentry *parent.  This makes the
calling slightly cleaner.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/dir.c       | 7 ++++---
 fs/overlayfs/overlayfs.h | 2 +-
 fs/overlayfs/super.c     | 3 +--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index fd89c25775bd..58078ce67d6a 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -159,9 +159,10 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct dentry *dir,
 	goto out;
 }
 
-struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
+struct dentry *ovl_create_real(struct ovl_fs *ofs, struct dentry *parent,
 			       struct dentry *newdentry, struct ovl_cattr *attr)
 {
+	struct inode *dir = parent->d_inode;
 	int err;
 
 	if (IS_ERR(newdentry))
@@ -222,7 +223,7 @@ struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
 {
 	struct dentry *ret;
 	inode_lock(workdir->d_inode);
-	ret = ovl_create_real(ofs, d_inode(workdir),
+	ret = ovl_create_real(ofs, workdir,
 			      ovl_lookup_temp(ofs, workdir), attr);
 	inode_unlock(workdir->d_inode);
 	return ret;
@@ -328,7 +329,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 	int err;
 
 	inode_lock_nested(udir, I_MUTEX_PARENT);
-	newdentry = ovl_create_real(ofs, udir,
+	newdentry = ovl_create_real(ofs, upperdir,
 				    ovl_lookup_upper(ofs, dentry->d_name.name,
 						     upperdir, dentry->d_name.len),
 				    attr);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index ca74be44dddd..bda25287c510 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -855,7 +855,7 @@ struct ovl_cattr {
 #define OVL_CATTR(m) (&(struct ovl_cattr) { .mode = (m) })
 
 struct dentry *ovl_create_real(struct ovl_fs *ofs,
-			       struct inode *dir, struct dentry *newdentry,
+			       struct dentry *parent, struct dentry *newdentry,
 			       struct ovl_cattr *attr);
 int ovl_cleanup(struct ovl_fs *ofs, struct inode *dir, struct dentry *dentry);
 int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir, struct dentry *dentry);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 78f4fcfb9ff6..3c012c8f7c88 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -625,8 +625,7 @@ static struct dentry *ovl_lookup_or_create(struct ovl_fs *ofs,
 	inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
 	child = ovl_lookup_upper(ofs, name, parent, len);
 	if (!IS_ERR(child) && !child->d_inode)
-		child = ovl_create_real(ofs, parent->d_inode, child,
-					OVL_CATTR(mode));
+		child = ovl_create_real(ofs, parent, child, OVL_CATTR(mode));
 	inode_unlock(parent->d_inode);
 	dput(parent);
 
-- 
2.49.0



* [PATCH 20/20] ovl: rename ovl_cleanup_unlocked() to ovl_cleanup()
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (18 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 19/20] ovl: change ovl_create_real() to receive dentry parent NeilBrown
@ 2025-07-10 23:03 ` NeilBrown
  2025-07-11  9:57   ` Amir Goldstein
  2025-07-11 16:41 ` [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem Amir Goldstein
  20 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-10 23:03 UTC (permalink / raw)
  To: Miklos Szeredi, Amir Goldstein; +Cc: linux-unionfs, linux-fsdevel

The only remaining user of ovl_cleanup() is ovl_cleanup_locked(), so we
no longer need both.

This patch moves ovl_cleanup() code into ovl_cleanup_locked(), and then
renames ovl_cleanup_locked() to ovl_cleanup().

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/copy_up.c   |  6 ++---
 fs/overlayfs/dir.c       | 52 ++++++++++++++++------------------------
 fs/overlayfs/overlayfs.h |  3 +--
 fs/overlayfs/readdir.c   | 10 ++++----
 fs/overlayfs/super.c     |  4 ++--
 fs/overlayfs/util.c      |  2 +-
 6 files changed, 33 insertions(+), 44 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 7b84a39c081f..f345f2899ccf 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -570,7 +570,7 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
 	parent_unlock(indexdir);
 out:
 	if (err)
-		ovl_cleanup_unlocked(ofs, indexdir, temp);
+		ovl_cleanup(ofs, indexdir, temp);
 	ovl_end_write(dentry);
 	dput(temp);
 free_name:
@@ -856,13 +856,13 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 cleanup:
 	unlock_rename(c->workdir, c->destdir);
 cleanup_unlocked:
-	ovl_cleanup_unlocked(ofs, c->workdir, temp);
+	ovl_cleanup(ofs, c->workdir, temp);
 	dput(temp);
 	goto out;
 
 cleanup_need_write:
 	ovl_start_write(c->dentry);
-	ovl_cleanup_unlocked(ofs, c->workdir, temp);
+	ovl_cleanup(ofs, c->workdir, temp);
 	ovl_end_write(c->dentry);
 	dput(temp);
 	return err;
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 58078ce67d6a..7e7f701c7ae4 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -24,16 +24,21 @@ MODULE_PARM_DESC(redirect_max,
 
 static int ovl_set_redirect(struct dentry *dentry, bool samedir);
 
-int ovl_cleanup(struct ovl_fs *ofs, struct inode *wdir, struct dentry *wdentry)
+int ovl_cleanup(struct ovl_fs *ofs, struct dentry *workdir,
+			 struct dentry *wdentry)
 {
 	int err;
 
-	dget(wdentry);
-	if (d_is_dir(wdentry))
-		err = ovl_do_rmdir(ofs, wdir, wdentry);
-	else
-		err = ovl_do_unlink(ofs, wdir, wdentry);
-	dput(wdentry);
+	err = parent_lock(workdir, wdentry);
+	if (!err) {
+		dget(wdentry);
+		if (d_is_dir(wdentry))
+			err = ovl_do_rmdir(ofs, workdir->d_inode, wdentry);
+		else
+			err = ovl_do_unlink(ofs, workdir->d_inode, wdentry);
+		dput(wdentry);
+		parent_unlock(workdir);
+	}
 
 	if (err) {
 		pr_err("cleanup of '%pd2' failed (%i)\n",
@@ -43,21 +48,6 @@ int ovl_cleanup(struct ovl_fs *ofs, struct inode *wdir, struct dentry *wdentry)
 	return err;
 }
 
-int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir,
-			 struct dentry *wdentry)
-{
-	int err;
-
-	err = parent_lock(workdir, wdentry);
-	if (err)
-		return err;
-
-	ovl_cleanup(ofs, workdir->d_inode, wdentry);
-	parent_unlock(workdir);
-
-	return err;
-}
-
 struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir)
 {
 	struct dentry *temp;
@@ -148,14 +138,14 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct dentry *dir,
 	if (err)
 		goto kill_whiteout;
 	if (flags)
-		ovl_cleanup_unlocked(ofs, ofs->workdir, dentry);
+		ovl_cleanup(ofs, ofs->workdir, dentry);
 
 out:
 	dput(whiteout);
 	return err;
 
 kill_whiteout:
-	ovl_cleanup_unlocked(ofs, ofs->workdir, whiteout);
+	ovl_cleanup(ofs, ofs->workdir, whiteout);
 	goto out;
 }
 
@@ -350,7 +340,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 	return 0;
 
 out_cleanup:
-	ovl_cleanup_unlocked(ofs, upperdir, newdentry);
+	ovl_cleanup(ofs, upperdir, newdentry);
 	dput(newdentry);
 	return err;
 }
@@ -409,7 +399,7 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 	unlock_rename(workdir, upperdir);
 
 	ovl_cleanup_whiteouts(ofs, upper, list);
-	ovl_cleanup_unlocked(ofs, workdir, upper);
+	ovl_cleanup(ofs, workdir, upper);
 
 	/* dentry's upper doesn't match now, get rid of it */
 	d_drop(dentry);
@@ -419,7 +409,7 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 out_cleanup:
 	unlock_rename(workdir, upperdir);
 out_cleanup_unlocked:
-	ovl_cleanup_unlocked(ofs, workdir, opaquedir);
+	ovl_cleanup(ofs, workdir, opaquedir);
 	dput(opaquedir);
 out:
 	return ERR_PTR(err);
@@ -514,7 +504,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 		if (err)
 			goto out_cleanup;
 
-		ovl_cleanup_unlocked(ofs, workdir, upper);
+		ovl_cleanup(ofs, workdir, upper);
 	} else {
 		err = ovl_do_rename(ofs, workdir, newdentry, upperdir, upper, 0);
 		unlock_rename(workdir, upperdir);
@@ -524,7 +514,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 	ovl_dir_modified(dentry->d_parent, false);
 	err = ovl_instantiate(dentry, inode, newdentry, hardlink, NULL);
 	if (err) {
-		ovl_cleanup_unlocked(ofs, upperdir, newdentry);
+		ovl_cleanup(ofs, upperdir, newdentry);
 		dput(newdentry);
 	}
 out_dput:
@@ -539,7 +529,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 out_cleanup_locked:
 	unlock_rename(workdir, upperdir);
 out_cleanup:
-	ovl_cleanup_unlocked(ofs, workdir, newdentry);
+	ovl_cleanup(ofs, workdir, newdentry);
 	dput(newdentry);
 	goto out_dput;
 }
@@ -1266,7 +1256,7 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 	unlock_rename(new_upperdir, old_upperdir);
 
 	if (cleanup_whiteout)
-		ovl_cleanup_unlocked(ofs, old_upperdir, newdentry);
+		ovl_cleanup(ofs, old_upperdir, newdentry);
 
 	if (overwrite && d_inode(new)) {
 		if (new_is_dir)
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index bda25287c510..1bebfdcd4d90 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -857,8 +857,7 @@ struct ovl_cattr {
 struct dentry *ovl_create_real(struct ovl_fs *ofs,
 			       struct dentry *parent, struct dentry *newdentry,
 			       struct ovl_cattr *attr);
-int ovl_cleanup(struct ovl_fs *ofs, struct inode *dir, struct dentry *dentry);
-int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir, struct dentry *dentry);
+int ovl_cleanup(struct ovl_fs *ofs, struct dentry *workdir, struct dentry *dentry);
 struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir);
 struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
 			       struct ovl_cattr *attr);
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 4127d1f160b3..5a05842c60c5 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -1048,7 +1048,7 @@ void ovl_cleanup_whiteouts(struct ovl_fs *ofs, struct dentry *upper,
 			continue;
 		}
 		if (dentry->d_inode)
-			ovl_cleanup_unlocked(ofs, upper, dentry);
+			ovl_cleanup(ofs, upper, dentry);
 		dput(dentry);
 	}
 }
@@ -1156,7 +1156,7 @@ int ovl_workdir_cleanup(struct ovl_fs *ofs, struct dentry *parent,
 	int err;
 
 	if (!d_is_dir(dentry) || level > 1)
-		return ovl_cleanup_unlocked(ofs, parent, dentry);
+		return ovl_cleanup(ofs, parent, dentry);
 
 	err = parent_lock(parent, dentry);
 	if (err)
@@ -1168,7 +1168,7 @@ int ovl_workdir_cleanup(struct ovl_fs *ofs, struct dentry *parent,
 
 		err = ovl_workdir_cleanup_recurse(ofs, &path, level + 1);
 		if (!err)
-			err = ovl_cleanup_unlocked(ofs, parent, dentry);
+			err = ovl_cleanup(ofs, parent, dentry);
 	}
 
 	return err;
@@ -1217,7 +1217,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 			goto next;
 		} else if (err == -ESTALE) {
 			/* Cleanup stale index entries */
-			err = ovl_cleanup_unlocked(ofs, indexdir, index);
+			err = ovl_cleanup(ofs, indexdir, index);
 		} else if (err != -ENOENT) {
 			/*
 			 * Abort mount to avoid corrupting the index if
@@ -1233,7 +1233,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 			err = ovl_cleanup_and_whiteout(ofs, indexdir, index);
 		} else {
 			/* Cleanup orphan index entries */
-			err = ovl_cleanup_unlocked(ofs, indexdir, index);
+			err = ovl_cleanup(ofs, indexdir, index);
 		}
 
 		if (err)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 3c012c8f7c88..e3dd60c459e2 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -603,11 +603,11 @@ static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
 
 	/* Best effort cleanup of whiteout and temp file */
 	if (err)
-		ovl_cleanup_unlocked(ofs, workdir, whiteout);
+		ovl_cleanup(ofs, workdir, whiteout);
 	dput(whiteout);
 
 cleanup_temp:
-	ovl_cleanup_unlocked(ofs, workdir, temp);
+	ovl_cleanup(ofs, workdir, temp);
 	release_dentry_name_snapshot(&name);
 	dput(temp);
 	dput(dest);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 5218a477551b..c91c3a9187b0 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -1116,7 +1116,7 @@ static void ovl_cleanup_index(struct dentry *dentry)
 					       indexdir, index);
 	} else {
 		/* Cleanup orphan index entries */
-		err = ovl_cleanup_unlocked(ofs, indexdir, index);
+		err = ovl_cleanup(ofs, indexdir, index);
 	}
 	if (err)
 		goto fail;
-- 
2.49.0



* Re: [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir()
  2025-07-10 23:03 ` [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir() NeilBrown
@ 2025-07-11  8:25   ` Amir Goldstein
  2025-07-11 10:30     ` Amir Goldstein
  2025-07-14  0:13     ` NeilBrown
  0 siblings, 2 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11  8:25 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> If ovl_copy_up_data() fails the error is not immediately handled but the
> code continues on to call ovl_start_write() and lock_rename(),
> presumably because both of these locks are needed for the cleanup.
> Only then (if the lock was successful) is the error checked.
>
> This makes the code a little hard to follow and could be fragile.
>
> This patch changes to handle the error immediately.  A new
> ovl_cleanup_unlocked() is created which takes the required directory
> lock (though it doesn't take the write lock on the filesystem).  This
> will be used extensively in later patches.
>
> In general we need to check the parent is still correct after taking the
> lock (as ovl_copy_up_workdir() does after a successful lock_rename()) so
> that is included in ovl_cleanup_unlocked() using new lock_parent() and
> unlock_parent() calls (it is planned to move this API into VFS code
> eventually, though in a slightly different form).

Since you are not planning to move it to VFS with this name
AND since I assume you want to merge this ovl cleanup prior
to the rest of the patches, please do not add an ovl helper without
the ovl_ namespace prefix. Also, you have a typo above:
it's parent_lock(), not lock_parent().

And apropos lock helper names, at the tip of your branch
the lock helpers used in ovl_cleanup() are named:
lock_and_check_dentry()/dentry_unlock()

I have multiple comments on your choice of names for those helpers:
1. Please use a consistent name pattern for lock/unlock.
    The pattern <obj-or-lock-type>_{lock,unlock}_* is far more common
    than the pattern lock_<obj-or-lock-type> in the kernel, but at least
    be consistent with dentry_lock_and_check() or better yet
    parent_lock() and later parent_lock_get_child()
2. dentry_unlock() is a very strange name for a helper that
    unlocks the parent. The fact that you document what it does
    in Kernel-doc does not stop people reading the code using it
    from being confused and writing bugs.
3. Why not call it parent_unlock() like I suggested and like you
    used in this patch set and why not introduce it in VFS to begin with?
    For that matter parent_unlock_{put,return}_child() is more clear IMO.
4. The name dentry_unlock_rename(&rd) also does not balance nicely with
    the name lookup_and_lock_rename(&rd) and has nothing to do with the
    dentry_ prefix. How about lookup_done_and_unlock_rename(&rd)?

Hope this is not too much complaining for review of a small cleanup patch :-p

>
> A fresh cleanup block is added which doesn't share code with other
> cleanup blocks.  It will get a new user in the next patch.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/copy_up.c   | 12 ++++++++++--
>  fs/overlayfs/dir.c       | 15 +++++++++++++++
>  fs/overlayfs/overlayfs.h |  6 ++++++
>  fs/overlayfs/util.c      | 10 ++++++++++
>  4 files changed, 41 insertions(+), 2 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 8a3c0d18ec2e..5d21b8d94a0a 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -794,6 +794,9 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>          */
>         path.dentry = temp;
>         err = ovl_copy_up_data(c, &path);
> +       if (err)
> +               goto cleanup_need_write;
> +
>         /*
>          * We cannot hold lock_rename() throughout this helper, because of
>          * lock ordering with sb_writers, which shouldn't be held when calling
> @@ -809,8 +812,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>                 if (IS_ERR(trap))
>                         goto out;
>                 goto unlock;
> -       } else if (err) {
> -               goto cleanup;
>         }
>
>         err = ovl_copy_up_metadata(c, temp);
> @@ -857,6 +858,13 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>         ovl_cleanup(ofs, wdir, temp);
>         dput(temp);
>         goto unlock;
> +
> +cleanup_need_write:
> +       ovl_start_write(c->dentry);
> +       ovl_cleanup_unlocked(ofs, c->workdir, temp);
> +       ovl_end_write(c->dentry);
> +       dput(temp);
> +       return err;
>  }
>

Sorry, I will not accept more messy goto routines.
I rewrote your simplification based on the tip of your branch.
Much simpler and no need for this extra routine.
Just always use ovl_cleanup_unlocked() in this function and
ovl_start_write() before goto cleanup_unlocked:

--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -794,13 +794,16 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
         */
        path.dentry = temp;
        err = ovl_copy_up_data(c, &path);
+       ovl_start_write(c->dentry);
+       if (err)
+               goto cleanup_unlocked;
+
        /*
         * We cannot hold lock_rename() throughout this helper, because of
         * lock ordering with sb_writers, which shouldn't be held when calling
         * ovl_copy_up_data(), so lock workdir and destdir and make sure that
         * temp wasn't moved before copy up completion or cleanup.
         */
-       ovl_start_write(c->dentry);
        trap = lock_rename(c->workdir, c->destdir);
        if (trap || temp->d_parent != c->workdir) {
                /* temp or workdir moved underneath us? abort without cleanup */
@@ -809,8 +812,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
                if (IS_ERR(trap))
                        goto out;
                goto unlock;
-       } else if (err) {
-               goto cleanup;
        }

        err = ovl_copy_up_metadata(c, temp);
@@ -846,17 +847,17 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
        ovl_inode_update(inode, temp);
        if (S_ISDIR(inode->i_mode))
                ovl_set_flag(OVL_WHITEOUTS, inode);
-unlock:
-       unlock_rename(c->workdir, c->destdir);
 out:
        ovl_end_write(c->dentry);

        return err;

 cleanup:
-       ovl_cleanup(ofs, wdir, temp);
+       unlock_rename(c->workdir, c->destdir);
+cleanup_unlocked:
+       ovl_cleanup_unlocked(ofs, c->workdir, temp);
        dput(temp);
-       goto unlock;
+       goto out;
 }
---

>  /* Copyup using O_TMPFILE which does not require cross dir locking */
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 4fc221ea6480..cee35d69e0e6 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -43,6 +43,21 @@ int ovl_cleanup(struct ovl_fs *ofs, struct inode *wdir, struct dentry *wdentry)
>         return err;
>  }
>
> +int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir,
> +                        struct dentry *wdentry)
> +{
> +       int err;
> +
> +       err = parent_lock(workdir, wdentry);
> +       if (err)
> +               return err;
> +
> +       ovl_cleanup(ofs, workdir->d_inode, wdentry);
> +       parent_unlock(workdir);
> +
> +       return err;
> +}
> +
>  struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir)
>  {
>         struct dentry *temp;
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 42228d10f6b9..68dc78c712a8 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -416,6 +416,11 @@ static inline bool ovl_open_flags_need_copy_up(int flags)
>  }
>
>  /* util.c */
> +int parent_lock(struct dentry *parent, struct dentry *child);
> +static inline void parent_unlock(struct dentry *parent)
> +{
> +       inode_unlock(parent->d_inode);
> +}

ovl_parent_unlock() or move to vfs please.

>  int ovl_get_write_access(struct dentry *dentry);
>  void ovl_put_write_access(struct dentry *dentry);
>  void ovl_start_write(struct dentry *dentry);
> @@ -843,6 +848,7 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs,
>                                struct inode *dir, struct dentry *newdentry,
>                                struct ovl_cattr *attr);
>  int ovl_cleanup(struct ovl_fs *ofs, struct inode *dir, struct dentry *dentry);
> +int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir, struct dentry *dentry);
>  struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir);
>  struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
>                                struct ovl_cattr *attr);
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 2b4754c645ee..a5105d68f6b4 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -1544,3 +1544,13 @@ void ovl_copyattr(struct inode *inode)
>         i_size_write(inode, i_size_read(realinode));
>         spin_unlock(&inode->i_lock);
>  }
> +
> +int parent_lock(struct dentry *parent, struct dentry *child)
> +{
> +       inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> +       if (!child || child->d_parent == parent)
> +               return 0;
> +
> +       inode_unlock(parent->d_inode);
> +       return -EINVAL;
> +}

ovl_parent_lock() or move to vfs please.

Thanks,
Amir.


* Re: [PATCH 20/20] ovl: rename ovl_cleanup_unlocked() to ovl_cleanup()
  2025-07-10 23:03 ` [PATCH 20/20] ovl: rename ovl_cleanup_unlocked() to ovl_cleanup() NeilBrown
@ 2025-07-11  9:57   ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11  9:57 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

Don't worry, I did not work through the entire patch set.
I'm just reviewing the easy one as a snack when I am tired/hungry ;)

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> The only remaining user of ovl_cleanup() is ovl_cleanup_locked(), so we

You meant ovl_cleanup_unlocked()

> no longer need both.
>
> This patch moves ovl_cleanup() code into ovl_cleanup_locked(), and then
> renames ovl_cleanup_locked() to ovl_cleanup().

I know I wrote in v1 review that it may be ok to combine the helpers,
but looking at this patch I think I prefer to keep it a rename only

ovl_cleanup() => ovl_cleanup_locked()
ovl_cleanup_unlocked() => ovl_cleanup()

You can either leave ovl_cleanup_locked() exported or make it static
I am fine with either way.

Thanks,
Amir.

>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/copy_up.c   |  6 ++---
>  fs/overlayfs/dir.c       | 52 ++++++++++++++++------------------------
>  fs/overlayfs/overlayfs.h |  3 +--
>  fs/overlayfs/readdir.c   | 10 ++++----
>  fs/overlayfs/super.c     |  4 ++--
>  fs/overlayfs/util.c      |  2 +-
>  6 files changed, 33 insertions(+), 44 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 7b84a39c081f..f345f2899ccf 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -570,7 +570,7 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
>         parent_unlock(indexdir);
>  out:
>         if (err)
> -               ovl_cleanup_unlocked(ofs, indexdir, temp);
> +               ovl_cleanup(ofs, indexdir, temp);
>         ovl_end_write(dentry);
>         dput(temp);
>  free_name:
> @@ -856,13 +856,13 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>  cleanup:
>         unlock_rename(c->workdir, c->destdir);
>  cleanup_unlocked:
> -       ovl_cleanup_unlocked(ofs, c->workdir, temp);
> +       ovl_cleanup(ofs, c->workdir, temp);
>         dput(temp);
>         goto out;
>
>  cleanup_need_write:
>         ovl_start_write(c->dentry);
> -       ovl_cleanup_unlocked(ofs, c->workdir, temp);
> +       ovl_cleanup(ofs, c->workdir, temp);
>         ovl_end_write(c->dentry);
>         dput(temp);
>         return err;
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 58078ce67d6a..7e7f701c7ae4 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -24,16 +24,21 @@ MODULE_PARM_DESC(redirect_max,
>
>  static int ovl_set_redirect(struct dentry *dentry, bool samedir);
>
> -int ovl_cleanup(struct ovl_fs *ofs, struct inode *wdir, struct dentry *wdentry)
> +int ovl_cleanup(struct ovl_fs *ofs, struct dentry *workdir,
> +                        struct dentry *wdentry)
>  {
>         int err;
>
> -       dget(wdentry);
> -       if (d_is_dir(wdentry))
> -               err = ovl_do_rmdir(ofs, wdir, wdentry);
> -       else
> -               err = ovl_do_unlink(ofs, wdir, wdentry);
> -       dput(wdentry);
> +       err = parent_lock(workdir, wdentry);
> +       if (!err) {
> +               dget(wdentry);
> +               if (d_is_dir(wdentry))
> +                       err = ovl_do_rmdir(ofs, workdir->d_inode, wdentry);
> +               else
> +                       err = ovl_do_unlink(ofs, workdir->d_inode, wdentry);
> +               dput(wdentry);
> +               parent_unlock(workdir);
> +       }
>
>         if (err) {
>                 pr_err("cleanup of '%pd2' failed (%i)\n",
> @@ -43,21 +48,6 @@ int ovl_cleanup(struct ovl_fs *ofs, struct inode *wdir, struct dentry *wdentry)
>         return err;
>  }
>
> -int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir,
> -                        struct dentry *wdentry)
> -{
> -       int err;
> -
> -       err = parent_lock(workdir, wdentry);
> -       if (err)
> -               return err;
> -
> -       ovl_cleanup(ofs, workdir->d_inode, wdentry);
> -       parent_unlock(workdir);
> -
> -       return err;
> -}
> -
>  struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir)
>  {
>         struct dentry *temp;
> @@ -148,14 +138,14 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct dentry *dir,
>         if (err)
>                 goto kill_whiteout;
>         if (flags)
> -               ovl_cleanup_unlocked(ofs, ofs->workdir, dentry);
> +               ovl_cleanup(ofs, ofs->workdir, dentry);
>
>  out:
>         dput(whiteout);
>         return err;
>
>  kill_whiteout:
> -       ovl_cleanup_unlocked(ofs, ofs->workdir, whiteout);
> +       ovl_cleanup(ofs, ofs->workdir, whiteout);
>         goto out;
>  }
>
> @@ -350,7 +340,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
>         return 0;
>
>  out_cleanup:
> -       ovl_cleanup_unlocked(ofs, upperdir, newdentry);
> +       ovl_cleanup(ofs, upperdir, newdentry);
>         dput(newdentry);
>         return err;
>  }
> @@ -409,7 +399,7 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
>         unlock_rename(workdir, upperdir);
>
>         ovl_cleanup_whiteouts(ofs, upper, list);
> -       ovl_cleanup_unlocked(ofs, workdir, upper);
> +       ovl_cleanup(ofs, workdir, upper);
>
>         /* dentry's upper doesn't match now, get rid of it */
>         d_drop(dentry);
> @@ -419,7 +409,7 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
>  out_cleanup:
>         unlock_rename(workdir, upperdir);
>  out_cleanup_unlocked:
> -       ovl_cleanup_unlocked(ofs, workdir, opaquedir);
> +       ovl_cleanup(ofs, workdir, opaquedir);
>         dput(opaquedir);
>  out:
>         return ERR_PTR(err);
> @@ -514,7 +504,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>                 if (err)
>                         goto out_cleanup;
>
> -               ovl_cleanup_unlocked(ofs, workdir, upper);
> +               ovl_cleanup(ofs, workdir, upper);
>         } else {
>                 err = ovl_do_rename(ofs, workdir, newdentry, upperdir, upper, 0);
>                 unlock_rename(workdir, upperdir);
> @@ -524,7 +514,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>         ovl_dir_modified(dentry->d_parent, false);
>         err = ovl_instantiate(dentry, inode, newdentry, hardlink, NULL);
>         if (err) {
> -               ovl_cleanup_unlocked(ofs, upperdir, newdentry);
> +               ovl_cleanup(ofs, upperdir, newdentry);
>                 dput(newdentry);
>         }
>  out_dput:
> @@ -539,7 +529,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>  out_cleanup_locked:
>         unlock_rename(workdir, upperdir);
>  out_cleanup:
> -       ovl_cleanup_unlocked(ofs, workdir, newdentry);
> +       ovl_cleanup(ofs, workdir, newdentry);
>         dput(newdentry);
>         goto out_dput;
>  }
> @@ -1266,7 +1256,7 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         unlock_rename(new_upperdir, old_upperdir);
>
>         if (cleanup_whiteout)
> -               ovl_cleanup_unlocked(ofs, old_upperdir, newdentry);
> +               ovl_cleanup(ofs, old_upperdir, newdentry);
>
>         if (overwrite && d_inode(new)) {
>                 if (new_is_dir)
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index bda25287c510..1bebfdcd4d90 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -857,8 +857,7 @@ struct ovl_cattr {
>  struct dentry *ovl_create_real(struct ovl_fs *ofs,
>                                struct dentry *parent, struct dentry *newdentry,
>                                struct ovl_cattr *attr);
> -int ovl_cleanup(struct ovl_fs *ofs, struct inode *dir, struct dentry *dentry);
> -int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir, struct dentry *dentry);
> +int ovl_cleanup(struct ovl_fs *ofs, struct dentry *workdir, struct dentry *dentry);
>  struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir);
>  struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
>                                struct ovl_cattr *attr);
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index 4127d1f160b3..5a05842c60c5 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -1048,7 +1048,7 @@ void ovl_cleanup_whiteouts(struct ovl_fs *ofs, struct dentry *upper,
>                         continue;
>                 }
>                 if (dentry->d_inode)
> -                       ovl_cleanup_unlocked(ofs, upper, dentry);
> +                       ovl_cleanup(ofs, upper, dentry);
>                 dput(dentry);
>         }
>  }
> @@ -1156,7 +1156,7 @@ int ovl_workdir_cleanup(struct ovl_fs *ofs, struct dentry *parent,
>         int err;
>
>         if (!d_is_dir(dentry) || level > 1)
> -               return ovl_cleanup_unlocked(ofs, parent, dentry);
> +               return ovl_cleanup(ofs, parent, dentry);
>
>         err = parent_lock(parent, dentry);
>         if (err)
> @@ -1168,7 +1168,7 @@ int ovl_workdir_cleanup(struct ovl_fs *ofs, struct dentry *parent,
>
>                 err = ovl_workdir_cleanup_recurse(ofs, &path, level + 1);
>                 if (!err)
> -                       err = ovl_cleanup_unlocked(ofs, parent, dentry);
> +                       err = ovl_cleanup(ofs, parent, dentry);
>         }
>
>         return err;
> @@ -1217,7 +1217,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>                         goto next;
>                 } else if (err == -ESTALE) {
>                         /* Cleanup stale index entries */
> -                       err = ovl_cleanup_unlocked(ofs, indexdir, index);
> +                       err = ovl_cleanup(ofs, indexdir, index);
>                 } else if (err != -ENOENT) {
>                         /*
>                          * Abort mount to avoid corrupting the index if
> @@ -1233,7 +1233,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>                         err = ovl_cleanup_and_whiteout(ofs, indexdir, index);
>                 } else {
>                         /* Cleanup orphan index entries */
> -                       err = ovl_cleanup_unlocked(ofs, indexdir, index);
> +                       err = ovl_cleanup(ofs, indexdir, index);
>                 }
>
>                 if (err)
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 3c012c8f7c88..e3dd60c459e2 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -603,11 +603,11 @@ static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
>
>         /* Best effort cleanup of whiteout and temp file */
>         if (err)
> -               ovl_cleanup_unlocked(ofs, workdir, whiteout);
> +               ovl_cleanup(ofs, workdir, whiteout);
>         dput(whiteout);
>
>  cleanup_temp:
> -       ovl_cleanup_unlocked(ofs, workdir, temp);
> +       ovl_cleanup(ofs, workdir, temp);
>         release_dentry_name_snapshot(&name);
>         dput(temp);
>         dput(dest);
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 5218a477551b..c91c3a9187b0 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -1116,7 +1116,7 @@ static void ovl_cleanup_index(struct dentry *dentry)
>                                                indexdir, index);
>         } else {
>                 /* Cleanup orphan index entries */
> -               err = ovl_cleanup_unlocked(ofs, indexdir, index);
> +               err = ovl_cleanup(ofs, indexdir, index);
>         }
>         if (err)
>                 goto fail;
> --
> 2.49.0
>

* Re: [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir()
  2025-07-11  8:25   ` Amir Goldstein
@ 2025-07-11 10:30     ` Amir Goldstein
  2025-07-14  0:13     ` NeilBrown
  1 sibling, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 10:30 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 10:25 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> >
> > If ovl_copy_up_data() fails the error is not immediately handled but the
> > code continues on to call ovl_start_write() and lock_rename(),
> > presumably because both of these locks are needed for the cleanup.
> > > Only then (if the lock was successful) is the error checked.
> >
> > This makes the code a little hard to follow and could be fragile.
> >
> > This patch changes to handle the error immediately.  A new
> > ovl_cleanup_unlocked() is created which takes the required directory
> > lock (though it doesn't take the write lock on the filesystem).  This
> > will be used extensively in later patches.
> >
> > In general we need to check the parent is still correct after taking the
> > lock (as ovl_copy_up_workdir() does after a successful lock_rename()) so
> > that is included in ovl_cleanup_unlocked() using new lock_parent() and
> > unlock_parent() calls (it is planned to move this API into VFS code
> > eventually, though in a slightly different form).
>
> Since you are not planning to move it to VFS with this name
> AND since I assume you want to merge this ovl cleanup prior
> to the rest of the patches, please do not use an ovl helper without
> the ovl_ namespace prefix. Also, you have a typo above:
> it's parent_lock(), not lock_parent().
>
> And apropos lock helper names, at the tip of your branch
> the lock helpers used in ovl_cleanup() are named:
> lock_and_check_dentry()/dentry_unlock()
>
> I have multiple comments on your choice of names for those helpers:
> 1. Please use a consistent name pattern for lock/unlock.
>     The pattern <obj-or-lock-type>_{lock,unlock}_* is far more common
>     than the pattern lock_<obj-or-lock-type> in the kernel, but at least
>     be consistent with dentry_lock_and_check() or better yet
>     parent_lock() and later parent_lock_get_child()
> 2. dentry_unlock() is a very strange name for a helper that
>     unlocks the parent. The fact that you document what it does
>     in Kernel-doc does not stop people reading the code using it
>     from being confused and writing bugs.
> 3. Why not call it parent_unlock() like I suggested and like you
>     used in this patch set and why not introduce it in VFS to begin with?
>     For that matter parent_unlock_{put,return}_child() is more clear IMO.
> 4. The name dentry_unlock_rename(&rd) also does not balance nicely with
>     the name lookup_and_lock_rename(&rd) and has nothing to do with the
>     dentry_ prefix. How about lookup_done_and_unlock_rename(&rd)?
>
> Hope this is not too much complaining for review of a small cleanup patch :-p
>
> >
> > A fresh cleanup block is added which doesn't share code with other
> > > cleanup blocks.  It will get a new user in the next patch.
> >
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  fs/overlayfs/copy_up.c   | 12 ++++++++++--
> >  fs/overlayfs/dir.c       | 15 +++++++++++++++
> >  fs/overlayfs/overlayfs.h |  6 ++++++
> >  fs/overlayfs/util.c      | 10 ++++++++++
> >  4 files changed, 41 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > index 8a3c0d18ec2e..5d21b8d94a0a 100644
> > --- a/fs/overlayfs/copy_up.c
> > +++ b/fs/overlayfs/copy_up.c
> > @@ -794,6 +794,9 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >          */
> >         path.dentry = temp;
> >         err = ovl_copy_up_data(c, &path);
> > +       if (err)
> > +               goto cleanup_need_write;
> > +
> >         /*
> >          * We cannot hold lock_rename() throughout this helper, because of
> >          * lock ordering with sb_writers, which shouldn't be held when calling
> > @@ -809,8 +812,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >                 if (IS_ERR(trap))
> >                         goto out;
> >                 goto unlock;
> > -       } else if (err) {
> > -               goto cleanup;
> >         }
> >
> >         err = ovl_copy_up_metadata(c, temp);
> > @@ -857,6 +858,13 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >         ovl_cleanup(ofs, wdir, temp);
> >         dput(temp);
> >         goto unlock;
> > +
> > +cleanup_need_write:
> > +       ovl_start_write(c->dentry);
> > +       ovl_cleanup_unlocked(ofs, c->workdir, temp);
> > +       ovl_end_write(c->dentry);
> > +       dput(temp);
> > +       return err;
> >  }
> >
>
> Sorry, I will not accept more messy goto routines.
> I rewrote your simplification based on the tip of your branch.
> Much simpler and no need for this extra routine.
> Just always use ovl_cleanup_unlocked() in this function and
> ovl_start_write() before goto cleanup_unlocked:
>
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -794,13 +794,16 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>          */
>         path.dentry = temp;
>         err = ovl_copy_up_data(c, &path);
> +       ovl_start_write(c->dentry);
> +       if (err)
> +               goto cleanup_unlocked;
> +
>         /*
>          * We cannot hold lock_rename() throughout this helper, because of
>          * lock ordering with sb_writers, which shouldn't be held when calling
>          * ovl_copy_up_data(), so lock workdir and destdir and make sure that
>          * temp wasn't moved before copy up completion or cleanup.
>          */
> -       ovl_start_write(c->dentry);
>         trap = lock_rename(c->workdir, c->destdir);
>         if (trap || temp->d_parent != c->workdir) {
>                 /* temp or workdir moved underneath us? abort without cleanup */
> @@ -809,8 +812,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>                 if (IS_ERR(trap))
>                         goto out;
>                 goto unlock;
> -       } else if (err) {
> -               goto cleanup;
>         }
>
>         err = ovl_copy_up_metadata(c, temp);
> @@ -846,17 +847,17 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>         ovl_inode_update(inode, temp);
>         if (S_ISDIR(inode->i_mode))
>                 ovl_set_flag(OVL_WHITEOUTS, inode);
> -unlock:
> -       unlock_rename(c->workdir, c->destdir);
>  out:
>         ovl_end_write(c->dentry);
>
>         return err;
>
>  cleanup:
> -       ovl_cleanup(ofs, wdir, temp);
> +       unlock_rename(c->workdir, c->destdir);
> +cleanup_unlocked:
> +       ovl_cleanup_unlocked(ofs, wdir, temp);
>         dput(temp);
> -       goto unlock;
> +       goto out;
>  }
> ---
>
> >  /* Copyup using O_TMPFILE which does not require cross dir locking */
> > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> > index 4fc221ea6480..cee35d69e0e6 100644
> > --- a/fs/overlayfs/dir.c
> > +++ b/fs/overlayfs/dir.c
> > @@ -43,6 +43,21 @@ int ovl_cleanup(struct ovl_fs *ofs, struct inode *wdir, struct dentry *wdentry)
> >         return err;
> >  }
> >
> > +int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir,
> > +                        struct dentry *wdentry)
> > +{
> > +       int err;
> > +
> > +       err = parent_lock(workdir, wdentry);
> > +       if (err)
> > +               return err;
> > +
> > +       ovl_cleanup(ofs, workdir->d_inode, wdentry);
> > +       parent_unlock(workdir);
> > +
> > +       return err;
> > +}
> > +
> >  struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir)
> >  {
> >         struct dentry *temp;
> > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> > index 42228d10f6b9..68dc78c712a8 100644
> > --- a/fs/overlayfs/overlayfs.h
> > +++ b/fs/overlayfs/overlayfs.h
> > @@ -416,6 +416,11 @@ static inline bool ovl_open_flags_need_copy_up(int flags)
> >  }
> >
> >  /* util.c */
> > +int parent_lock(struct dentry *parent, struct dentry *child);
> > +static inline void parent_unlock(struct dentry *parent)
> > +{
> > +       inode_unlock(parent->d_inode);
> > +}
>
> ovl_parent_unlock() or move to vfs please.
>
> >  int ovl_get_write_access(struct dentry *dentry);
> >  void ovl_put_write_access(struct dentry *dentry);
> >  void ovl_start_write(struct dentry *dentry);
> > @@ -843,6 +848,7 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs,
> >                                struct inode *dir, struct dentry *newdentry,
> >                                struct ovl_cattr *attr);
> >  int ovl_cleanup(struct ovl_fs *ofs, struct inode *dir, struct dentry *dentry);
> > +int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir, struct dentry *dentry);
> >  struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir);
> >  struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
> >                                struct ovl_cattr *attr);
> > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> > index 2b4754c645ee..a5105d68f6b4 100644
> > --- a/fs/overlayfs/util.c
> > +++ b/fs/overlayfs/util.c
> > @@ -1544,3 +1544,13 @@ void ovl_copyattr(struct inode *inode)
> >         i_size_write(inode, i_size_read(realinode));
> >         spin_unlock(&inode->i_lock);
> >  }
> > +
> > +int parent_lock(struct dentry *parent, struct dentry *child)
> > +{
> > +       inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> > +       if (!child || child->d_parent == parent)
> > +               return 0;
> > +
> > +       inode_unlock(parent->d_inode);
> > +       return -EINVAL;
> > +}
>
> ovl_parent_lock() or move to vfs please.
>

BTW, in case I wasn't clear, I prefer to define them in vfs (in a separate
patch), where you can later rename them to:
parent_lock_get_child()/parent_unlock_put_child()
and fork the parallel lookup variants.

Thanks,
Amir.
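
For readers following the thread, the lock-then-recheck idea behind
parent_lock()/parent_unlock() can be sketched in userspace C. This is only an
illustration under stated assumptions, not the kernel code: struct node,
node_init(), node_parent_lock() and node_parent_unlock() are hypothetical
names, and a pthread mutex stands in for the directory inode lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a dentry: a node with a parent link. */
struct node {
        struct node *parent;
        pthread_mutex_t lock;   /* plays the role of the directory inode lock */
};

/* Initialise a node under the given parent (NULL for a root). */
static void node_init(struct node *n, struct node *parent)
{
        n->parent = parent;
        pthread_mutex_init(&n->lock, NULL);
}

/*
 * Lock 'parent', then confirm that 'child' (if any) still lives in it.
 * On a mismatch the lock is dropped and -EINVAL is returned, so the
 * caller never holds the lock on the error path.
 */
static int node_parent_lock(struct node *parent, struct node *child)
{
        assert(parent != NULL);

        pthread_mutex_lock(&parent->lock);
        if (!child || child->parent == parent)
                return 0;

        pthread_mutex_unlock(&parent->lock);
        return -EINVAL;
}

/* Release the lock taken by a successful node_parent_lock(). */
static void node_parent_unlock(struct node *parent)
{
        pthread_mutex_unlock(&parent->lock);
}
```

The point of folding the parent check into the lock helper is that every
caller gets the "was it moved while unlocked?" test for free, and the unlock
side only needs the parent.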

* Re: [PATCH 02/20] ovl: change ovl_create_index() to take write and dir locks
  2025-07-10 23:03 ` [PATCH 02/20] ovl: change ovl_create_index() to take write and dir locks NeilBrown
@ 2025-07-11 10:41   ` Amir Goldstein
  2025-07-14  0:14     ` NeilBrown
  0 siblings, 1 reply; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 10:41 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> ovl_copy_up_workdir() currently takes a rename lock on two directories,
> then uses the lock to create a file in one directory, perform a
> rename, and possibly unlink the file for cleanup.  This is incompatible
> with proposed changes which will lock just the dentry of objects being
> acted on.
>
> This patch moves the call to ovl_create_index() earlier in
> ovl_copy_up_workdir() to before the lock is taken, and also before write
> access to the filesystem is gained (this last is not strictly necessary
> but seems cleaner).

With my proposed change to patch 1, ovl_create_index() will be
called with ovl_start_write() held so you won't need to add it.

>
> ovl_create_index() then takes the required locks and drops them before
> returning.
>
> Signed-off-by: NeilBrown <neil@brown.name>

With that fixed, feel free to add:

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Thanks,
Amir.

> ---
>  fs/overlayfs/copy_up.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 5d21b8d94a0a..25be0b80a40b 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -517,8 +517,6 @@ static int ovl_set_upper_fh(struct ovl_fs *ofs, struct dentry *upper,
>
>  /*
>   * Create and install index entry.
> - *
> - * Caller must hold i_mutex on indexdir.
>   */
>  static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
>                             struct dentry *upper)
> @@ -550,7 +548,10 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
>         if (err)
>                 return err;
>
> +       ovl_start_write(dentry);
> +       inode_lock(dir);
>         temp = ovl_create_temp(ofs, indexdir, OVL_CATTR(S_IFDIR | 0));
> +       inode_unlock(dir);
>         err = PTR_ERR(temp);
>         if (IS_ERR(temp))
>                 goto free_name;
> @@ -559,6 +560,9 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
>         if (err)
>                 goto out;
>
> +       err = parent_lock(indexdir, temp);
> +       if (err)
> +               goto out;
>         index = ovl_lookup_upper(ofs, name.name, indexdir, name.len);
>         if (IS_ERR(index)) {
>                 err = PTR_ERR(index);
> @@ -566,9 +570,11 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
>                 err = ovl_do_rename(ofs, indexdir, temp, indexdir, index, 0);
>                 dput(index);
>         }
> +       parent_unlock(indexdir);
>  out:
>         if (err)
> -               ovl_cleanup(ofs, dir, temp);
> +               ovl_cleanup_unlocked(ofs, indexdir, temp);
> +       ovl_end_write(dentry);
>         dput(temp);
>  free_name:
>         kfree(name.name);
> @@ -797,6 +803,12 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>         if (err)
>                 goto cleanup_need_write;
>
> +       if (S_ISDIR(c->stat.mode) && c->indexed) {
> +               err = ovl_create_index(c->dentry, c->origin_fh, temp);
> +               if (err)
> +                       goto cleanup_need_write;
> +       }
> +
>         /*
>          * We cannot hold lock_rename() throughout this helper, because of
>          * lock ordering with sb_writers, which shouldn't be held when calling
> @@ -818,12 +830,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>         if (err)
>                 goto cleanup;
>
> -       if (S_ISDIR(c->stat.mode) && c->indexed) {
> -               err = ovl_create_index(c->dentry, c->origin_fh, temp);
> -               if (err)
> -                       goto cleanup;
> -       }
> -
>         upper = ovl_lookup_upper(ofs, c->destname.name, c->destdir,
>                                  c->destname.len);
>         err = PTR_ERR(upper);
> --
> 2.49.0
>

* Re: [PATCH 03/20] ovl: Call ovl_create_temp() without lock held.
  2025-07-10 23:03 ` [PATCH 03/20] ovl: Call ovl_create_temp() without lock held NeilBrown
@ 2025-07-11 11:10   ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 11:10 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> ovl currently locks a directory or two and then performs multiple actions
> in one or both directories.  This is incompatible with proposed changes
> which will lock just the dentry of objects being acted on.
>
> This patch moves calls to ovl_create_temp() out of the locked regions and
> has it take and release the relevant lock itself.
>
> The lock that was taken before this function was called is now taken
> after.  This means that any code between where the lock was taken and
> ovl_create_temp() is now unlocked.  This necessitates the use of
> ovl_cleanup_unlocked() and the creation of ovl_lookup_upper_unlocked().
> These will be used more widely in future patches.
>
> Now that the file is created before the lock is taken for rename, we
> need to ensure the parent wasn't changed before the lock was gained.
> ovl_lock_rename_workdir() is changed to optionally receive the dentries
> that will be involved in the rename.  If either is present but has the
> wrong parent, an error is returned.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/copy_up.c   |  5 ---
>  fs/overlayfs/dir.c       | 67 ++++++++++++++++++++--------------------
>  fs/overlayfs/overlayfs.h | 12 ++++++-
>  fs/overlayfs/super.c     | 11 ++++---
>  fs/overlayfs/util.c      |  7 ++++-
>  5 files changed, 58 insertions(+), 44 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 25be0b80a40b..eafb46686854 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -523,7 +523,6 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
>  {
>         struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
>         struct dentry *indexdir = ovl_indexdir(dentry->d_sb);
> -       struct inode *dir = d_inode(indexdir);
>         struct dentry *index = NULL;
>         struct dentry *temp = NULL;
>         struct qstr name = { };
> @@ -549,9 +548,7 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
>                 return err;
>
>         ovl_start_write(dentry);
> -       inode_lock(dir);
>         temp = ovl_create_temp(ofs, indexdir, OVL_CATTR(S_IFDIR | 0));
> -       inode_unlock(dir);
>         err = PTR_ERR(temp);
>         if (IS_ERR(temp))
>                 goto free_name;
> @@ -785,9 +782,7 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>                 return err;
>
>         ovl_start_write(c->dentry);
> -       inode_lock(wdir);
>         temp = ovl_create_temp(ofs, c->workdir, &cattr);
> -       inode_unlock(wdir);
>         ovl_end_write(c->dentry);
>         ovl_revert_cu_creds(&cc);
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index cee35d69e0e6..144e1753d0c9 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -214,8 +214,12 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
>  struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
>                                struct ovl_cattr *attr)
>  {
> -       return ovl_create_real(ofs, d_inode(workdir),
> -                              ovl_lookup_temp(ofs, workdir), attr);
> +       struct dentry *ret;
> +       inode_lock(workdir->d_inode);
> +       ret = ovl_create_real(ofs, d_inode(workdir),
> +                             ovl_lookup_temp(ofs, workdir), attr);
> +       inode_unlock(workdir->d_inode);
> +       return ret;
>  }
>
>  static int ovl_set_opaque_xerr(struct dentry *dentry, struct dentry *upper,
> @@ -353,7 +357,6 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
>         struct dentry *workdir = ovl_workdir(dentry);
>         struct inode *wdir = workdir->d_inode;
>         struct dentry *upperdir = ovl_dentry_upper(dentry->d_parent);
> -       struct inode *udir = upperdir->d_inode;
>         struct path upperpath;
>         struct dentry *upper;
>         struct dentry *opaquedir;
> @@ -363,28 +366,25 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
>         if (WARN_ON(!workdir))
>                 return ERR_PTR(-EROFS);
>
> -       err = ovl_lock_rename_workdir(workdir, upperdir);
> -       if (err)
> -               goto out;
> -
>         ovl_path_upper(dentry, &upperpath);
>         err = vfs_getattr(&upperpath, &stat,
>                           STATX_BASIC_STATS, AT_STATX_SYNC_AS_STAT);
>         if (err)
> -               goto out_unlock;
> +               goto out;
>
>         err = -ESTALE;
>         if (!S_ISDIR(stat.mode))
> -               goto out_unlock;
> +               goto out;
>         upper = upperpath.dentry;
> -       if (upper->d_parent->d_inode != udir)
> -               goto out_unlock;
>
>         opaquedir = ovl_create_temp(ofs, workdir, OVL_CATTR(stat.mode));
>         err = PTR_ERR(opaquedir);
>         if (IS_ERR(opaquedir))
> -               goto out_unlock;
> -
> +               /* workdir was unlocked, no upperdir */
> +               goto out;

Strong lint error here. Don't use multi-line bodies (incl. comments) without {}.
TBH this comment adds no clarity for me. I would remove it.

> +       err = ovl_lock_rename_workdir(workdir, opaquedir, upperdir, upper);
> +       if (err)
> +               goto out_cleanup_unlocked;

Nit: please keep the empty line after goto as it was in the code before.
You removed this empty line in other patches as well and it hurts my eyes.
I know we do not have a 100% consistent style in that regard in overlayfs code
(e.g. S_ISDIR check above), but please try to avoid changing the existing
style of code in that regard.

>         err = ovl_copy_xattr(dentry->d_sb, &upperpath, opaquedir);
>         if (err)
>                 goto out_cleanup;
> @@ -413,10 +413,10 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
>         return opaquedir;
>
>  out_cleanup:
> -       ovl_cleanup(ofs, wdir, opaquedir);
> -       dput(opaquedir);
> -out_unlock:
>         unlock_rename(workdir, upperdir);
> +out_cleanup_unlocked:
> +       ovl_cleanup_unlocked(ofs, workdir, opaquedir);
> +       dput(opaquedir);
>  out:
>         return ERR_PTR(err);
>  }
> @@ -454,15 +454,11 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>                         return err;
>         }
>
> -       err = ovl_lock_rename_workdir(workdir, upperdir);
> -       if (err)
> -               goto out;
> -
> -       upper = ovl_lookup_upper(ofs, dentry->d_name.name, upperdir,
> -                                dentry->d_name.len);
> +       upper = ovl_lookup_upper_unlocked(ofs, dentry->d_name.name, upperdir,
> +                                         dentry->d_name.len);
>         err = PTR_ERR(upper);
>         if (IS_ERR(upper))
> -               goto out_unlock;
> +               goto out;
>
>         err = -ESTALE;
>         if (d_is_negative(upper) || !ovl_upper_is_whiteout(ofs, upper))
> @@ -473,6 +469,10 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>         if (IS_ERR(newdentry))
>                 goto out_dput;
>
> +       err = ovl_lock_rename_workdir(workdir, newdentry, upperdir, upper);
> +       if (err)
> +               goto out_cleanup;
> +

goto out_cleanup_unlocked here please
and leave the rest of the goto cleanup be
just like you did in ovl_clear_empty().

This looks way better than v1 patch 2 that overflowed my review context stack.
With minor nits above fixed, feel free to add:

Reviewed-by: Amir Goldstein <amir73il@gmail.com>


Thanks,
Amir.
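
The label layout asked for above (out_cleanup drops the lock and falls through
to out_cleanup_unlocked) can be sketched in plain C. This is a minimal sketch
under assumptions, not overlayfs code: struct trace and replace_with_temp()
are hypothetical, and counters stand in for the real lock, unlock and cleanup
calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters standing in for real side effects. */
struct trace {
        int locks;      /* times the lock was taken */
        int unlocks;    /* times the lock was released */
        int cleanups;   /* times the temp object was removed */
};

/*
 * Create a temp object with no lock held, take the lock, do the work.
 * Failures before the lock is held jump to out_cleanup_unlocked;
 * failures while it is held jump to out_cleanup, which unlocks and
 * then falls through to the same unlocked cleanup.
 */
static int replace_with_temp(struct trace *t, bool lock_fails, bool copy_fails)
{
        int err;

        assert(t != NULL);

        /* Step 1: the temp object is assumed to exist already. */

        /* Step 2: take the lock; on failure it is not held. */
        err = lock_fails ? -EIO : 0;
        if (err)
                goto out_cleanup_unlocked;
        t->locks++;

        /* Step 3: work that needs the lock. */
        err = copy_fails ? -ENOSPC : 0;
        if (err)
                goto out_cleanup;

        t->unlocks++;
        return 0;

out_cleanup:
        t->unlocks++;
out_cleanup_unlocked:
        t->cleanups++;
        return err;
}
```

With this shape each error site only has to know whether it holds the lock,
and the cleanup itself is written once.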

* Re: [PATCH 04/20] ovl: narrow the locked region in ovl_copy_up_workdir()
  2025-07-10 23:03 ` [PATCH 04/20] ovl: narrow the locked region in ovl_copy_up_workdir() NeilBrown
@ 2025-07-11 12:03   ` Amir Goldstein
  2025-07-14  0:29     ` NeilBrown
  0 siblings, 1 reply; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 12:03 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> In ovl_copy_up_workdir() unlock immediately after the rename, and then
> use ovl_cleanup_unlocked() with separate locking rather than using the
> same lock to protect both.
>
> This makes way for future changes where locks are taken on individual
> dentries rather than the whole directory.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/copy_up.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index eafb46686854..7b84a39c081f 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -765,7 +765,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>  {
>         struct ovl_fs *ofs = OVL_FS(c->dentry->d_sb);
>         struct inode *inode;
> -       struct inode *wdir = d_inode(c->workdir);
>         struct path path = { .mnt = ovl_upper_mnt(ofs) };
>         struct dentry *temp, *upper, *trap;
>         struct ovl_cu_creds cc;
> @@ -816,9 +815,9 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>                 /* temp or workdir moved underneath us? abort without cleanup */
>                 dput(temp);
>                 err = -EIO;
> -               if (IS_ERR(trap))
> -                       goto out;
> -               goto unlock;
> +               if (!IS_ERR(trap))
> +                       unlock_rename(c->workdir, c->destdir);
> +               goto out;

I now see that this bit was missing from my proposed patch 1
variant, but with this in patch 1, this patch becomes trivial.

Thanks,
Amir.
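
The "unlock only if the lock was really taken" logic in the hunk above can be
illustrated in userspace C. This is a rough sketch, assuming a lock call with
three outcomes as the quoted code implies; err_ptr(), is_err(), struct
pair_lock, pair_lock_acquire(), pair_lock_release() and do_rename_step() are
all hypothetical names that merely imitate the kernel's error-pointer
convention.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel's error-pointer encoding. */
#define MAX_ERRNO 4095

static void *err_ptr(long err)
{
        return (void *)(intptr_t)err;
}

static bool is_err(const void *p)
{
        return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct pair_lock {
        int held;       /* nesting count, 0 when unlocked */
};

/*
 * Hypothetical two-directory lock with three outcomes:
 *   error pointer   -> lock NOT taken
 *   non-NULL 'trap' -> lock taken, but the pair is unusable
 *   NULL            -> lock taken, proceed
 */
static void *pair_lock_acquire(struct pair_lock *l, bool fail, void *trap)
{
        assert(l != NULL);
        if (fail)
                return err_ptr(-EXDEV);
        l->held++;
        return trap;
}

static void pair_lock_release(struct pair_lock *l)
{
        l->held--;
}

/* Abort with -EIO on any bad outcome, unlocking only if the lock is held. */
static int do_rename_step(struct pair_lock *l, bool fail, void *trap, bool moved)
{
        void *t = pair_lock_acquire(l, fail, trap);

        if (t || moved) {
                if (!is_err(t))
                        pair_lock_release(l);
                return -EIO;
        }

        /* The rename itself would happen here, under the lock. */
        pair_lock_release(l);
        return 0;
}
```

Whatever path is taken, the lock count ends at zero, which is the property the
reworked error path is meant to preserve.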

>         }
>
>         err = ovl_copy_up_metadata(c, temp);
> @@ -832,9 +831,10 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>                 goto cleanup;
>
>         err = ovl_do_rename(ofs, c->workdir, temp, c->destdir, upper, 0);
> +       unlock_rename(c->workdir, c->destdir);
>         dput(upper);
>         if (err)
> -               goto cleanup;
> +               goto cleanup_unlocked;
>
>         inode = d_inode(c->dentry);
>         if (c->metacopy_digest)
> @@ -848,17 +848,17 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>         ovl_inode_update(inode, temp);
>         if (S_ISDIR(inode->i_mode))
>                 ovl_set_flag(OVL_WHITEOUTS, inode);
> -unlock:
> -       unlock_rename(c->workdir, c->destdir);
>  out:
>         ovl_end_write(c->dentry);
>
>         return err;
>
>  cleanup:
> -       ovl_cleanup(ofs, wdir, temp);
> +       unlock_rename(c->workdir, c->destdir);
> +cleanup_unlocked:
> +       ovl_cleanup_unlocked(ofs, c->workdir, temp);
>         dput(temp);
> -       goto unlock;
> +       goto out;
>
>  cleanup_need_write:
>         ovl_start_write(c->dentry);
> --
> 2.49.0
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 05/20] ovl: narrow locking in ovl_create_upper()
  2025-07-10 23:03 ` [PATCH 05/20] ovl: narrow locking in ovl_create_upper() NeilBrown
@ 2025-07-11 12:09   ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 12:09 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> Drop the directory lock immediately after the ovl_create_real() call and
> take a separate lock later for cleanup in ovl_cleanup_unlocked() - if
> needed.
>
> This makes way for future changes where locks are taken on individual
> dentries rather than the whole directory.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/dir.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 144e1753d0c9..fa438e13e8b1 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -326,9 +326,9 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
>                                     ovl_lookup_upper(ofs, dentry->d_name.name,
>                                                      upperdir, dentry->d_name.len),
>                                     attr);
> -       err = PTR_ERR(newdentry);
> +       inode_unlock(udir);
>         if (IS_ERR(newdentry))
> -               goto out_unlock;
> +               return PTR_ERR(newdentry);
>
>         if (ovl_type_merge(dentry->d_parent) && d_is_dir(newdentry) &&
>             !ovl_allow_offline_changes(ofs)) {
> @@ -340,14 +340,12 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
>         err = ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink, NULL);
>         if (err)
>                 goto out_cleanup;
> -out_unlock:
> -       inode_unlock(udir);
> -       return err;
> +       return 0;
>
>  out_cleanup:
> -       ovl_cleanup(ofs, udir, newdentry);
> +       ovl_cleanup_unlocked(ofs, upperdir, newdentry);
>         dput(newdentry);
> -       goto out_unlock;
> +       return err;
>  }
>

Thank you for getting rid of this goto chain!

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 06/20] ovl: narrow locking in ovl_clear_empty()
  2025-07-10 23:03 ` [PATCH 06/20] ovl: narrow locking in ovl_clear_empty() NeilBrown
@ 2025-07-11 12:27   ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 12:27 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

Neil,

Halfway through the review I noticed that your individual patches are
not tagged v2.
This makes a bit of a mess when trying to search the inbox for
previous versions.

Please do git format-patch -v3 for the next posting.

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> Drop the locks immediately after rename, and use a separate lock for
> cleanup.
>
> This makes way for future changes where locks are taken on individual
> dentries rather than the whole directory.
>
> Note that ovl_cleanup_whiteouts() operates on "upper", a child of
> "upperdir" and does not require upperdir or workdir to be locked.
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>

You may keep my RVB, but...

> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/dir.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index fa438e13e8b1..b3d858654f23 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -353,7 +353,6 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
>  {
>         struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
>         struct dentry *workdir = ovl_workdir(dentry);
> -       struct inode *wdir = workdir->d_inode;
>         struct dentry *upperdir = ovl_dentry_upper(dentry->d_parent);
>         struct path upperpath;
>         struct dentry *upper;
> @@ -400,10 +399,10 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
>         err = ovl_do_rename(ofs, workdir, opaquedir, upperdir, upper, RENAME_EXCHANGE);
>         if (err)
>                 goto out_cleanup;
> +       unlock_rename(workdir, upperdir);

If you look at out_cleanup now, it basically does unlock_rename(workdir, upperdir);
and then out_cleanup_unlocked:
so I think it would look a bit nicer to bring the unlock closer
to do_rename?

         err = ovl_do_rename(ofs, workdir, opaquedir, upperdir, upper, RENAME_EXCHANGE);
+       unlock_rename(workdir, upperdir);
         if (err)
-                 goto out_cleanup;
+                 goto out_cleanup_unlocked;


and leave a newline after the goto please :)

Thanks,
Amir.
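
The ordering suggested above can be sketched in userspace C. This is only
an illustration under stated assumptions: a plain flag stands in for the
rename lock, and lock_model(), do_rename_model(), cleanup_unlocked_model()
and clear_empty_model() are hypothetical names, not overlayfs functions.
Dropping the lock before testing err means both the success path and the
failure path continue with the lock released, so a single unlocked cleanup
label is enough.

```c
#include <assert.h>
#include <errno.h>

/* A flag standing in for the lock pair taken by lock_rename() (assumption). */
static int lock_held;
static int cleanups_run;	/* counts calls to the cleanup helper */

static void lock_model(void)   { assert(!lock_held); lock_held = 1; }
static void unlock_model(void) { assert(lock_held);  lock_held = 0; }

/* Hypothetical rename step: must be called with the lock held. */
static int do_rename_model(int fail)
{
	assert(lock_held);
	return fail ? -EIO : 0;
}

/* Hypothetical cleanup that takes the lock itself, like an _unlocked helper. */
static void cleanup_unlocked_model(void)
{
	lock_model();
	cleanups_run++;
	unlock_model();
}

/* Returns 0 or a negative errno; the lock is never held on return. */
int clear_empty_model(int fail)
{
	int err;

	lock_model();
	err = do_rename_model(fail);
	unlock_model();		/* drop the lock before looking at err */
	if (err)
		goto out_cleanup_unlocked;

	return 0;

out_cleanup_unlocked:
	cleanup_unlocked_model();
	return err;
}
```

With this shape there is no label that is entered with the lock held, so
the locked variant of the cleanup label can go away.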

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 07/20] ovl: narrow locking in ovl_create_over_whiteout()
  2025-07-10 23:03 ` [PATCH 07/20] ovl: narrow locking in ovl_create_over_whiteout() NeilBrown
@ 2025-07-11 12:42   ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 12:42 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> Unlock the parents immediately after the rename, and use
> ovl_cleanup_unlocked() for cleanup, which takes a separate lock.
>
> This makes way for future changes where locks are taken on individual
> dentries rather than the whole directory.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/dir.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index b3d858654f23..687d5e12289c 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -432,9 +432,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>  {
>         struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
>         struct dentry *workdir = ovl_workdir(dentry);
> -       struct inode *wdir = workdir->d_inode;
>         struct dentry *upperdir = ovl_dentry_upper(dentry->d_parent);
> -       struct inode *udir = upperdir->d_inode;
>         struct dentry *upper;
>         struct dentry *newdentry;
>         int err;
> @@ -505,22 +503,23 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>
>                 err = ovl_do_rename(ofs, workdir, newdentry, upperdir, upper,
>                                     RENAME_EXCHANGE);
> +               unlock_rename(workdir, upperdir);
>                 if (err)
> -                       goto out_cleanup_locked;
> +                       goto out_cleanup;
>
> -               ovl_cleanup(ofs, wdir, upper);
> +               ovl_cleanup_unlocked(ofs, workdir, upper);
>         } else {
>                 err = ovl_do_rename(ofs, workdir, newdentry, upperdir, upper, 0);
> +               unlock_rename(workdir, upperdir);
>                 if (err)
> -                       goto out_cleanup_locked;
> +                       goto out_cleanup;
>         }

With my suggested changes to labels in patch 3, those lines would
change to out_cleanup => out_cleanup_unlocked

Other than that, feel free to add:

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Thanks,
Amir.

>         ovl_dir_modified(dentry->d_parent, false);
>         err = ovl_instantiate(dentry, inode, newdentry, hardlink, NULL);
>         if (err) {
> -               ovl_cleanup(ofs, udir, newdentry);
> +               ovl_cleanup_unlocked(ofs, upperdir, newdentry);
>                 dput(newdentry);
>         }
> -       unlock_rename(workdir, upperdir);
>  out_dput:
>         dput(upper);
>  out:
> --
> 2.49.0
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 08/20] ovl: narrow locking in ovl_rename()
  2025-07-10 23:03 ` [PATCH 08/20] ovl: narrow locking in ovl_rename() NeilBrown
@ 2025-07-11 13:03   ` Amir Goldstein
  2025-07-14  1:00     ` NeilBrown
  0 siblings, 1 reply; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 13:03 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> Drop the rename lock immediately after the rename, and use
> ovl_cleanup_unlocked() for cleanup.
>
> This makes way for future changes where locks are taken on individual
> dentries rather than the whole directory.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/dir.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 687d5e12289c..d01e83f9d800 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -1262,9 +1262,10 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>                             new_upperdir, newdentry, flags);
>         if (err)
>                 goto out_dput;
> +       unlock_rename(new_upperdir, old_upperdir);
>
>         if (cleanup_whiteout)
> -               ovl_cleanup(ofs, old_upperdir->d_inode, newdentry);
> +               ovl_cleanup_unlocked(ofs, old_upperdir, newdentry);
>
>         if (overwrite && d_inode(new)) {
>                 if (new_is_dir)
> @@ -1283,12 +1284,8 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         if (d_inode(new) && ovl_dentry_upper(new))
>                 ovl_copyattr(d_inode(new));
>
> -out_dput:
>         dput(newdentry);
> -out_dput_old:
>         dput(olddentry);
> -out_unlock:
> -       unlock_rename(new_upperdir, old_upperdir);
>  out_revert_creds:
>         ovl_revert_creds(old_cred);
>         if (update_nlink)
> @@ -1299,6 +1296,14 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         dput(opaquedir);
>         ovl_cache_free(&list);
>         return err;
> +
> +out_dput:
> +       dput(newdentry);
> +out_dput_old:
> +       dput(olddentry);
> +out_unlock:
> +       unlock_rename(new_upperdir, old_upperdir);
> +       goto out_revert_creds;
>  }
>
>  static int ovl_create_tmpfile(struct file *file, struct dentry *dentry,
> --
> 2.49.0
>

I think we can end up with fewer, easier-to-understand goto labels
with a relatively simple trick:

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index fe493f3ed6b6..7cddaa7b263e 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -1069,8 +1069,8 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
        int err;
        struct dentry *old_upperdir;
        struct dentry *new_upperdir;
-       struct dentry *olddentry;
-       struct dentry *newdentry;
+       struct dentry *olddentry = NULL;
+       struct dentry *newdentry = NULL;
        struct dentry *trap;
        bool old_opaque;
        bool new_opaque;
@@ -1187,18 +1187,22 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
        olddentry = ovl_lookup_upper(ofs, old->d_name.name, old_upperdir,
                                     old->d_name.len);
        err = PTR_ERR(olddentry);
-       if (IS_ERR(olddentry))
+       if (IS_ERR(olddentry)) {
+               olddentry = NULL;
                goto out_unlock;
+       }

        err = -ESTALE;
        if (!ovl_matches_upper(old, olddentry))
-               goto out_dput_old;
+               goto out_unlock;

        newdentry = ovl_lookup_upper(ofs, new->d_name.name, new_upperdir,
                                     new->d_name.len);
        err = PTR_ERR(newdentry);
-       if (IS_ERR(newdentry))
-               goto out_dput_old;
+       if (IS_ERR(newdentry)) {
+               newdentry = NULL;
+               goto out_unlock;
+       }

        old_opaque = ovl_dentry_is_opaque(old);
        new_opaque = ovl_dentry_is_opaque(new);
@@ -1207,28 +1211,28 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
        if (d_inode(new) && ovl_dentry_upper(new)) {
                if (opaquedir) {
                        if (newdentry != opaquedir)
-                               goto out_dput;
+                               goto out_unlock;
                } else {
                        if (!ovl_matches_upper(new, newdentry))
-                               goto out_dput;
+                               goto out_unlock;
                }
        } else {
                if (!d_is_negative(newdentry)) {
                        if (!new_opaque || !ovl_upper_is_whiteout(ofs, newdentry))
-                               goto out_dput;
+                               goto out_unlock;
                } else {
                        if (flags & RENAME_EXCHANGE)
-                               goto out_dput;
+                               goto out_unlock;
                }
        }

        if (olddentry == trap)
-               goto out_dput;
+               goto out_unlock;
        if (newdentry == trap)
-               goto out_dput;
+               goto out_unlock;

        if (olddentry->d_inode == newdentry->d_inode)
-               goto out_dput;
+               goto out_unlock;

        err = 0;
        if (ovl_type_merge_or_lower(old))
@@ -1236,7 +1240,7 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
        else if (is_dir && !old_opaque && ovl_type_merge(new->d_parent))
                err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
        if (err)
-               goto out_dput;
+               goto out_unlock;

        if (!overwrite && ovl_type_merge_or_lower(new))
                err = ovl_set_redirect(new, samedir);
@@ -1244,15 +1248,16 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
                 ovl_type_merge(old->d_parent))
                err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
        if (err)
-               goto out_dput;
+               goto out_unlock;

        err = ovl_do_rename(ofs, old_upperdir->d_inode, olddentry,
                            new_upperdir->d_inode, newdentry, flags);
        if (err)
-               goto out_dput;
+               goto out_unlock;
+       unlock_rename(new_upperdir, old_upperdir);

        if (cleanup_whiteout)
-               ovl_cleanup(ofs, old_upperdir->d_inode, newdentry);
+               ovl_cleanup_unlocked(ofs, old_upperdir->d_inode, newdentry);

        if (overwrite && d_inode(new)) {
                if (new_is_dir)
@@ -1271,12 +1276,6 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
        if (d_inode(new) && ovl_dentry_upper(new))
                ovl_copyattr(d_inode(new));

-out_dput:
-       dput(newdentry);
-out_dput_old:
-       dput(olddentry);
-out_unlock:
-       unlock_rename(new_upperdir, old_upperdir);
 out_revert_creds:
        ovl_revert_creds(old_cred);
        if (update_nlink)
@@ -1284,9 +1283,15 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
        else
                ovl_drop_write(old);
 out:
+       dput(newdentry);
+       dput(olddentry);
        dput(opaquedir);
        ovl_cache_free(&list);
        return err;
+
+out_unlock:
+       unlock_rename(new_upperdir, old_upperdir);
+       goto out_revert_creds;
 }
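
The trick in the diff above relies on dput() accepting a NULL pointer, so
pointers that start out NULL can be released unconditionally at one exit
point. A minimal userspace sketch of that idea follows. It assumes a toy
reference-counted struct in place of a dentry; obj_put(), obj_lookup() and
rename_model() are hypothetical names for illustration. Unlike the diff,
the toy lookup returns NULL directly instead of an error pointer that is
then reset to NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy reference-counted object standing in for a dentry (assumption). */
struct obj {
	int refs;
};

/* Like dput(): dropping a NULL pointer is a no-op. */
static void obj_put(struct obj *o)
{
	if (o)
		o->refs--;
}

/* Hypothetical lookup: returns a referenced object, or NULL on failure. */
static struct obj *obj_lookup(struct obj *o, int fail)
{
	if (fail)
		return NULL;
	o->refs++;
	return o;
}

/* Every failure jumps to the same label; no per-pointer labels are needed. */
int rename_model(struct obj *a, struct obj *b, int fail_a, int fail_b)
{
	struct obj *oldobj = NULL;
	struct obj *newobj = NULL;
	int err = -ENOENT;

	oldobj = obj_lookup(a, fail_a);
	if (!oldobj)
		goto out;

	newobj = obj_lookup(b, fail_b);
	if (!newobj)
		goto out;

	err = 0;
out:
	obj_put(newobj);	/* safe even if the second lookup never ran */
	obj_put(oldobj);
	return err;
}
```

The cost is one extra NULL assignment per failed lookup; the gain is that
the exit path no longer depends on how far the function got.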

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH 10/20] ovl: narrow locking in ovl_cleanup_index()
  2025-07-10 23:03 ` [PATCH 10/20] ovl: narrow locking in ovl_cleanup_index() NeilBrown
@ 2025-07-11 13:12   ` Amir Goldstein
  2025-07-14  1:03     ` NeilBrown
  0 siblings, 1 reply; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 13:12 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> ovl_cleanup_index() takes a lock on the directory and then does a lookup
> and possibly one of two different cleanups.
> This patch narrows the locking to use the _unlocked() versions of the
> lookup and one cleanup, and just takes the lock for the other cleanup.
>
> A subsequent patch will take the lock into the cleanup.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/util.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 9ce9fe62ef28..7369193b11ec 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -1107,21 +1107,20 @@ static void ovl_cleanup_index(struct dentry *dentry)
>                 goto out;
>         }
>
> -       inode_lock_nested(dir, I_MUTEX_PARENT);
> -       index = ovl_lookup_upper(ofs, name.name, indexdir, name.len);
> +       index = ovl_lookup_upper_unlocked(ofs, name.name, indexdir, name.len);
>         err = PTR_ERR(index);
>         if (IS_ERR(index)) {
>                 index = NULL;
>         } else if (ovl_index_all(dentry->d_sb)) {
>                 /* Whiteout orphan index to block future open by handle */
> +               inode_lock_nested(dir, I_MUTEX_PARENT);

Don't we need to verify that index wasn't moved with
parent_lock(indexdir, index)?

Thanks,
Amir.

>                 err = ovl_cleanup_and_whiteout(OVL_FS(dentry->d_sb),
>                                                indexdir, index);
> +               inode_unlock(dir);
>         } else {
>                 /* Cleanup orphan index entries */
> -               err = ovl_cleanup(ofs, dir, index);
> +               err = ovl_cleanup_unlocked(ofs, indexdir, index);
>         }
> -
> -       inode_unlock(dir);
>         if (err)
>                 goto fail;
>
> --
> 2.49.0
>
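
The question raised in this review is whether a dentry looked up without
the lock is still in the expected directory once the lock is taken. A
userspace sketch of a lock-then-revalidate helper is shown below. It is a
model under assumptions, not the kernel helper: struct node, the locked
flag, parent_lock_model() and parent_unlock_model() are hypothetical, and
the -EINVAL return value is assumed. The thread itself only shows that
parent_lock() returns an int error and is suggested to be called with a
NULL child when there is nothing to verify.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy directory entry: a parent pointer plus a flag modelling the dir lock. */
struct node {
	struct node *parent;
	int locked;
};

/*
 * Lock @parent, then confirm that @child (if given) still lives in it.
 * On a mismatch the lock is dropped again and -EINVAL is returned, so the
 * caller never proceeds with a child that was moved while unlocked.
 */
int parent_lock_model(struct node *parent, struct node *child)
{
	parent->locked = 1;
	if (!child || child->parent == parent)
		return 0;
	parent->locked = 0;
	return -EINVAL;
}

void parent_unlock_model(struct node *parent)
{
	parent->locked = 0;
}
```

A caller would check the return value and skip the cleanup when it is
non-zero, since the lock is not held in that case.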

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 14/20] ovl: change ovl_workdir_cleanup() to take dir lock as needed.
  2025-07-10 23:03 ` [PATCH 14/20] ovl: change ovl_workdir_cleanup() to take dir lock as needed NeilBrown
@ 2025-07-11 13:28   ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 13:28 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> Rather than calling ovl_workdir_cleanup() with the dir already locked,
> change it to take the dir lock only when needed.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/overlayfs.h |  2 +-
>  fs/overlayfs/readdir.c   | 30 +++++++++++++-----------------
>  fs/overlayfs/super.c     |  4 +---
>  3 files changed, 15 insertions(+), 21 deletions(-)
>
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index ec804d6bb2ef..ca74be44dddd 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -738,7 +738,7 @@ void ovl_cleanup_whiteouts(struct ovl_fs *ofs, struct dentry *upper,
>  void ovl_cache_free(struct list_head *list);
>  void ovl_dir_cache_free(struct inode *inode);
>  int ovl_check_d_type_supported(const struct path *realpath);
> -int ovl_workdir_cleanup(struct ovl_fs *ofs, struct inode *dir,
> +int ovl_workdir_cleanup(struct ovl_fs *ofs, struct dentry *parent,
>                         struct vfsmount *mnt, struct dentry *dentry, int level);
>  int ovl_indexdir_cleanup(struct ovl_fs *ofs);
>
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index b3d44bf56c78..6cc5f885e036 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -1096,7 +1096,6 @@ static int ovl_workdir_cleanup_recurse(struct ovl_fs *ofs, const struct path *pa
>                                        int level)
>  {
>         int err;
> -       struct inode *dir = path->dentry->d_inode;
>         LIST_HEAD(list);
>         struct ovl_cache_entry *p;
>         struct ovl_readdir_data rdd = {
> @@ -1139,11 +1138,9 @@ static int ovl_workdir_cleanup_recurse(struct ovl_fs *ofs, const struct path *pa
>                 dentry = ovl_lookup_upper_unlocked(ofs, p->name, path->dentry, p->len);
>                 if (IS_ERR(dentry))
>                         continue;
> -               if (dentry->d_inode) {
> -                       inode_lock_nested(dir, I_MUTEX_PARENT);
> -                       err = ovl_workdir_cleanup(ofs, dir, path->mnt, dentry, level);
> -                       inode_unlock(dir);
> -               }
> +               if (dentry->d_inode)
> +                       err = ovl_workdir_cleanup(ofs, path->dentry, path->mnt,
> +                                                 dentry, level);
>                 dput(dentry);
>                 if (err)
>                         break;
> @@ -1153,24 +1150,25 @@ static int ovl_workdir_cleanup_recurse(struct ovl_fs *ofs, const struct path *pa
>         return err;
>  }
>
> -int ovl_workdir_cleanup(struct ovl_fs *ofs, struct inode *dir,
> +int ovl_workdir_cleanup(struct ovl_fs *ofs, struct dentry *parent,
>                         struct vfsmount *mnt, struct dentry *dentry, int level)
>  {
>         int err;
>
> -       if (!d_is_dir(dentry) || level > 1) {
> -               return ovl_cleanup(ofs, dir, dentry);
> -       }
> +       if (!d_is_dir(dentry) || level > 1)
> +               return ovl_cleanup_unlocked(ofs, parent, dentry);
>
> -       err = ovl_do_rmdir(ofs, dir, dentry);
> +       err = parent_lock(parent, dentry);
> +       if (err)
> +               return err;
> +       err = ovl_do_rmdir(ofs, parent->d_inode, dentry);
> +       parent_unlock(parent);

At this point, the code looks correct,
but it replaces unsafe uses of inode_lock_nested() with correct use of
parent_lock().

Please fix patches 11-13

Thanks,
Amir.

>         if (err) {
>                 struct path path = { .mnt = mnt, .dentry = dentry };
>
> -               inode_unlock(dir);
>                 err = ovl_workdir_cleanup_recurse(ofs, &path, level + 1);
> -               inode_lock_nested(dir, I_MUTEX_PARENT);
>                 if (!err)
> -                       err = ovl_cleanup(ofs, dir, dentry);
> +                       err = ovl_cleanup_unlocked(ofs, parent, dentry);
>         }
>
>         return err;
> @@ -1210,9 +1208,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>                 }
>                 /* Cleanup leftover from index create/cleanup attempt */
>                 if (index->d_name.name[0] == '#') {
> -                       inode_lock_nested(dir, I_MUTEX_PARENT);
> -                       err = ovl_workdir_cleanup(ofs, dir, path.mnt, index, 1);
> -                       inode_unlock(dir);
> +                       err = ovl_workdir_cleanup(ofs, indexdir, path.mnt, index, 1);
>                         if (err)
>                                 break;
>                         goto next;
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 239ae1946edf..23f43f8131dd 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -319,9 +319,7 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
>                                 goto out;
>
>                         retried = true;
> -                       inode_lock_nested(dir, I_MUTEX_PARENT);
> -                       err = ovl_workdir_cleanup(ofs, dir, mnt, work, 0);
> -                       inode_unlock(dir);
> +                       err = ovl_workdir_cleanup(ofs, ofs->workbasedir, mnt, work, 0);
>                         dput(work);
>                         if (err == -EINVAL) {
>                                 work = ERR_PTR(err);
> --
> 2.49.0
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 11/20] ovl: narrow locking in ovl_workdir_create()
  2025-07-10 23:03 ` [PATCH 11/20] ovl: narrow locking in ovl_workdir_create() NeilBrown
@ 2025-07-11 13:32   ` Amir Goldstein
  2025-07-14  1:08     ` NeilBrown
  0 siblings, 1 reply; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 13:32 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> In ovl_workdir_create() don't hold the dir lock for the whole time, but
> only take it when needed.
>
> It now gets taken separately for ovl_workdir_cleanup().  A subsequent
> patch will move the locking into that function.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/super.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 9cce3251dd83..239ae1946edf 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -299,8 +299,8 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
>         int err;
>         bool retried = false;
>
> -       inode_lock_nested(dir, I_MUTEX_PARENT);
>  retry:
> +       inode_lock_nested(dir, I_MUTEX_PARENT);
>         work = ovl_lookup_upper(ofs, name, ofs->workbasedir, strlen(name));
>
>         if (!IS_ERR(work)) {
> @@ -311,23 +311,27 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
>
>                 if (work->d_inode) {
>                         err = -EEXIST;
> +                       inode_unlock(dir);
>                         if (retried)
>                                 goto out_dput;
>
>                         if (persist)
> -                               goto out_unlock;
> +                               goto out;
>
>                         retried = true;
> +                       inode_lock_nested(dir, I_MUTEX_PARENT);

Feels like this should be parent_lock(ofs->workbasedir, work)
and parent_lock(ofs->workbasedir, NULL) in retry:

>                         err = ovl_workdir_cleanup(ofs, dir, mnt, work, 0);
> +                       inode_unlock(dir);
>                         dput(work);
>                         if (err == -EINVAL) {
>                                 work = ERR_PTR(err);
> -                               goto out_unlock;
> +                               goto out;
>                         }
>                         goto retry;
>                 }
>
>                 work = ovl_do_mkdir(ofs, dir, work, attr.ia_mode);
> +               inode_unlock(dir);
>                 err = PTR_ERR(work);
>                 if (IS_ERR(work))
>                         goto out_err;
> @@ -365,11 +369,11 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
>                 if (err)
>                         goto out_dput;
>         } else {
> +               inode_unlock(dir);
>                 err = PTR_ERR(work);
>                 goto out_err;
>         }
> -out_unlock:
> -       inode_unlock(dir);
> +out:
>         return work;
>
>  out_dput:
> @@ -378,7 +382,7 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
>         pr_warn("failed to create directory %s/%s (errno: %i); mounting read-only\n",
>                 ofs->config.workdir, name, -err);
>         work = NULL;
> -       goto out_unlock;
> +       goto out;

might as well be return NULL now.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 12/20] ovl: narrow locking in ovl_indexdir_cleanup()
  2025-07-10 23:03 ` [PATCH 12/20] ovl: narrow locking in ovl_indexdir_cleanup() NeilBrown
@ 2025-07-11 13:33   ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 13:33 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> Instead of taking the directory lock for the whole cleanup, only take it
> when needed.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/readdir.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index 2a222b8185a3..3a4bbc178203 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -1194,7 +1194,6 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>         if (err)
>                 goto out;
>
> -       inode_lock_nested(dir, I_MUTEX_PARENT);
>         list_for_each_entry(p, &list, l_node) {
>                 if (p->name[0] == '.') {
>                         if (p->len == 1)
> @@ -1202,7 +1201,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>                         if (p->len == 2 && p->name[1] == '.')
>                                 continue;
>                 }
> -               index = ovl_lookup_upper(ofs, p->name, indexdir, p->len);
> +               index = ovl_lookup_upper_unlocked(ofs, p->name, indexdir, p->len);
>                 if (IS_ERR(index)) {
>                         err = PTR_ERR(index);
>                         index = NULL;
> @@ -1210,7 +1209,9 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>                 }
>                 /* Cleanup leftover from index create/cleanup attempt */
>                 if (index->d_name.name[0] == '#') {
> +                       inode_lock_nested(dir, I_MUTEX_PARENT);

parent_lock()

>                         err = ovl_workdir_cleanup(ofs, dir, path.mnt, index, 1);
> +                       inode_unlock(dir);
>                         if (err)
>                                 break;
>                         goto next;
> @@ -1220,7 +1221,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>                         goto next;
>                 } else if (err == -ESTALE) {
>                         /* Cleanup stale index entries */
> -                       err = ovl_cleanup(ofs, dir, index);
> +                       err = ovl_cleanup_unlocked(ofs, indexdir, index);
>                 } else if (err != -ENOENT) {
>                         /*
>                          * Abort mount to avoid corrupting the index if
> @@ -1233,10 +1234,12 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>                          * Whiteout orphan index to block future open by
>                          * handle after overlay nlink dropped to zero.
>                          */
> +                       inode_lock_nested(dir, I_MUTEX_PARENT);

parent_lock()

Thanks,
Amir.

>                         err = ovl_cleanup_and_whiteout(ofs, indexdir, index);
> +                       inode_unlock(dir);
>                 } else {
>                         /* Cleanup orphan index entries */
> -                       err = ovl_cleanup(ofs, dir, index);
> +                       err = ovl_cleanup_unlocked(ofs, indexdir, index);
>                 }
>
>                 if (err)
> @@ -1247,7 +1250,6 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>                 index = NULL;
>         }
>         dput(index);
> -       inode_unlock(dir);
>  out:
>         ovl_cache_free(&list);
>         if (err)
> --
> 2.49.0
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 13/20] ovl: narrow locking in ovl_workdir_cleanup_recurse()
  2025-07-10 23:03 ` [PATCH 13/20] ovl: narrow locking in ovl_workdir_cleanup_recurse() NeilBrown
@ 2025-07-11 13:35   ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 13:35 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> Only take the dir lock when needed, rather than for the whole loop.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/readdir.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index 3a4bbc178203..b3d44bf56c78 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -1122,7 +1122,6 @@ static int ovl_workdir_cleanup_recurse(struct ovl_fs *ofs, const struct path *pa
>         if (err)
>                 goto out;
>
> -       inode_lock_nested(dir, I_MUTEX_PARENT);
>         list_for_each_entry(p, &list, l_node) {
>                 struct dentry *dentry;
>
> @@ -1137,16 +1136,18 @@ static int ovl_workdir_cleanup_recurse(struct ovl_fs *ofs, const struct path *pa
>                         err = -EINVAL;
>                         break;
>                 }
> -               dentry = ovl_lookup_upper(ofs, p->name, path->dentry, p->len);
> +               dentry = ovl_lookup_upper_unlocked(ofs, p->name, path->dentry, p->len);
>                 if (IS_ERR(dentry))
>                         continue;
> -               if (dentry->d_inode)
> +               if (dentry->d_inode) {
> +                       inode_lock_nested(dir, I_MUTEX_PARENT);
>                         err = ovl_workdir_cleanup(ofs, dir, path->mnt, dentry, level);
> +                       inode_unlock(dir);


parent_lock()

Thanks,
Amir.
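
For readers following along, the per-entry locking suggested above can be
modelled in user space.  This is only a sketch under stated assumptions:
toy_dentry, toy_parent_lock() and toy_cleanup_one() are hypothetical
stand-ins, the "lock" is a plain flag rather than i_rwsem, and returning
-EINVAL when the child has moved is an assumption, not something shown in
this thread.  The idea it illustrates is that the lock is taken for one
entry at a time and the parent is re-checked after the lock is obtained.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry; not the kernel's struct dentry. */
struct toy_dentry {
	struct toy_dentry *d_parent;
	bool locked;		/* plays the role of the directory's i_rwsem */
};

/* Hypothetical model of parent_lock(): lock, then re-check parentage. */
static int toy_parent_lock(struct toy_dentry *parent, struct toy_dentry *child)
{
	assert(!parent->locked);	/* single-threaded model: no contention */
	parent->locked = true;
	if (child->d_parent == parent)
		return 0;
	/* The child moved while we were unlocked: back out (assumed errno). */
	parent->locked = false;
	return -EINVAL;
}

static void toy_parent_unlock(struct toy_dentry *parent)
{
	assert(parent->locked);
	parent->locked = false;
}

/* One loop iteration: the lock covers only the work on this entry. */
static int toy_cleanup_one(struct toy_dentry *dir, struct toy_dentry *entry,
			   int *cleaned)
{
	int err = toy_parent_lock(dir, entry);

	if (err)
		return err;
	(*cleaned)++;			/* stands in for the real cleanup work */
	toy_parent_unlock(dir);
	return 0;
}
```

Calling toy_cleanup_one() for each entry leaves the directory unlocked
between iterations, which is the property the narrowed loop is after.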


* Re: [PATCH 15/20] ovl: narrow locking on ovl_remove_and_whiteout()
  2025-07-10 23:03 ` [PATCH 15/20] ovl: narrow locking on ovl_remove_and_whiteout() NeilBrown
@ 2025-07-11 13:42   ` Amir Goldstein
  2025-07-14  1:35     ` NeilBrown
  0 siblings, 1 reply; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 13:42 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> Normally it is ok to include a lookup with the subsequent operation on
> the result.  However in this case ovl_cleanup_and_whiteout() already
> (potentially) creates a whiteout inode so we need separate locking.

The change itself looks fine and simple, but I didn't understand the text above.

Can you please explain?

Thanks,
Amir.

>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/dir.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index d01e83f9d800..8580cd5c61e4 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -769,15 +769,11 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
>                         goto out;
>         }
>
> -       err = ovl_lock_rename_workdir(workdir, NULL, upperdir, NULL);
> -       if (err)
> -               goto out_dput;
> -
> -       upper = ovl_lookup_upper(ofs, dentry->d_name.name, upperdir,
> -                                dentry->d_name.len);
> +       upper = ovl_lookup_upper_unlocked(ofs, dentry->d_name.name, upperdir,
> +                                         dentry->d_name.len);
>         err = PTR_ERR(upper);
>         if (IS_ERR(upper))
> -               goto out_unlock;
> +               goto out_dput;
>
>         err = -ESTALE;
>         if ((opaquedir && upper != opaquedir) ||
> @@ -786,6 +782,10 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
>                 goto out_dput_upper;
>         }
>
> +       err = ovl_lock_rename_workdir(workdir, NULL, upperdir, upper);
> +       if (err)
> +               goto out_dput_upper;
> +
>         err = ovl_cleanup_and_whiteout(ofs, upperdir, upper);
>         if (err)
>                 goto out_d_drop;
> @@ -793,10 +793,9 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
>         ovl_dir_modified(dentry->d_parent, true);
>  out_d_drop:
>         d_drop(dentry);
> +       unlock_rename(workdir, upperdir);
>  out_dput_upper:
>         dput(upper);
> -out_unlock:
> -       unlock_rename(workdir, upperdir);
>  out_dput:
>         dput(opaquedir);
>  out:
> --
> 2.49.0
>


* Re: [PATCH 16/20] ovl: change ovl_cleanup_and_whiteout() to take rename lock as needed
  2025-07-10 23:03 ` [PATCH 16/20] ovl: change ovl_cleanup_and_whiteout() to take rename lock as needed NeilBrown
@ 2025-07-11 13:50   ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 13:50 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> Rather than locking the directory(s) before calling
> ovl_cleanup_and_whiteout(), change it (and ovl_whiteout()) to do the
> locking, so the locking can be fine grained as will be needed for
> proposed locking changes.
>
> Sometimes this is called to whiteout something in the index dir, in
> which case only that dir must be locked.  In one case it is called on
> something in an upperdir, so two directories must be locked.  We use
> ovl_lock_rename_workdir() for this and remove the restriction that
> upperdir cannot be indexdir - because now sometimes it is.
>
> Signed-off-by: NeilBrown <neil@brown.name>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Thanks,
Amir.
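
As an aside for readers, the "one directory or two" case described in the
commit message can be sketched in user space.  The names toy_dir,
toy_lock_rename() and toy_unlock_rename() are hypothetical, and ordering
the two locks by address is a simplification chosen for this sketch; the
real lock_rename() orders by ancestry and also takes a per-superblock
rename mutex.  What the sketch does show is the point relied on above:
when both arguments are the same directory, it is locked exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy directory with a flag standing in for its lock. */
struct toy_dir {
	bool locked;
};

static void toy_lock(struct toy_dir *d)
{
	assert(!d->locked);	/* would deadlock if taken twice */
	d->locked = true;
}

static void toy_unlock(struct toy_dir *d)
{
	assert(d->locked);
	d->locked = false;
}

/* Lock one or two directories; the same directory is locked only once. */
static void toy_lock_rename(struct toy_dir *a, struct toy_dir *b)
{
	if (a == b) {
		toy_lock(a);
		return;
	}
	/* Simplified ordering rule (assumption): lower address first. */
	if ((uintptr_t)a > (uintptr_t)b) {
		struct toy_dir *tmp = a;

		a = b;
		b = tmp;
	}
	toy_lock(a);
	toy_lock(b);
}

static void toy_unlock_rename(struct toy_dir *a, struct toy_dir *b)
{
	toy_unlock(a);
	if (a != b)
		toy_unlock(b);
}
```

With this shape a caller does not need to know in advance whether the
whiteout target lives in the workdir itself or in a separate upperdir.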

> ---
>  fs/overlayfs/dir.c     | 20 +++++++++-----------
>  fs/overlayfs/readdir.c |  3 ---
>  fs/overlayfs/util.c    |  7 -------
>  3 files changed, 9 insertions(+), 21 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 8580cd5c61e4..086719129be3 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -77,7 +77,6 @@ struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir)
>         return temp;
>  }
>
> -/* caller holds i_mutex on workdir */
>  static struct dentry *ovl_whiteout(struct ovl_fs *ofs)
>  {
>         int err;
> @@ -85,6 +84,7 @@ static struct dentry *ovl_whiteout(struct ovl_fs *ofs)
>         struct dentry *workdir = ofs->workdir;
>         struct inode *wdir = workdir->d_inode;
>
> +       inode_lock_nested(wdir, I_MUTEX_PARENT);
>         if (!ofs->whiteout) {
>                 whiteout = ovl_lookup_temp(ofs, workdir);
>                 if (IS_ERR(whiteout))
> @@ -118,14 +118,13 @@ static struct dentry *ovl_whiteout(struct ovl_fs *ofs)
>         whiteout = ofs->whiteout;
>         ofs->whiteout = NULL;
>  out:
> +       inode_unlock(wdir);
>         return whiteout;
>  }
>
> -/* Caller must hold i_mutex on both workdir and dir */
>  int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct dentry *dir,
>                              struct dentry *dentry)
>  {
> -       struct inode *wdir = ofs->workdir->d_inode;
>         struct dentry *whiteout;
>         int err;
>         int flags = 0;
> @@ -138,18 +137,22 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct dentry *dir,
>         if (d_is_dir(dentry))
>                 flags = RENAME_EXCHANGE;
>
> -       err = ovl_do_rename(ofs, ofs->workdir, whiteout, dir, dentry, flags);
> +       err = ovl_lock_rename_workdir(ofs->workdir, whiteout, dir, dentry);
> +       if (!err) {
> +               err = ovl_do_rename(ofs, ofs->workdir, whiteout, dir, dentry, flags);
> +               unlock_rename(ofs->workdir, dir);
> +       }
>         if (err)
>                 goto kill_whiteout;
>         if (flags)
> -               ovl_cleanup(ofs, wdir, dentry);
> +               ovl_cleanup_unlocked(ofs, ofs->workdir, dentry);
>
>  out:
>         dput(whiteout);
>         return err;
>
>  kill_whiteout:
> -       ovl_cleanup(ofs, wdir, whiteout);
> +       ovl_cleanup_unlocked(ofs, ofs->workdir, whiteout);
>         goto out;
>  }
>
> @@ -782,10 +785,6 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
>                 goto out_dput_upper;
>         }
>
> -       err = ovl_lock_rename_workdir(workdir, NULL, upperdir, upper);
> -       if (err)
> -               goto out_dput_upper;
> -
>         err = ovl_cleanup_and_whiteout(ofs, upperdir, upper);
>         if (err)
>                 goto out_d_drop;
> @@ -793,7 +792,6 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
>         ovl_dir_modified(dentry->d_parent, true);
>  out_d_drop:
>         d_drop(dentry);
> -       unlock_rename(workdir, upperdir);
>  out_dput_upper:
>         dput(upper);
>  out_dput:
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index 6cc5f885e036..4127d1f160b3 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -1179,7 +1179,6 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>         int err;
>         struct dentry *indexdir = ofs->workdir;
>         struct dentry *index = NULL;
> -       struct inode *dir = indexdir->d_inode;
>         struct path path = { .mnt = ovl_upper_mnt(ofs), .dentry = indexdir };
>         LIST_HEAD(list);
>         struct ovl_cache_entry *p;
> @@ -1231,9 +1230,7 @@ int ovl_indexdir_cleanup(struct ovl_fs *ofs)
>                          * Whiteout orphan index to block future open by
>                          * handle after overlay nlink dropped to zero.
>                          */
> -                       inode_lock_nested(dir, I_MUTEX_PARENT);
>                         err = ovl_cleanup_and_whiteout(ofs, indexdir, index);
> -                       inode_unlock(dir);
>                 } else {
>                         /* Cleanup orphan index entries */
>                         err = ovl_cleanup_unlocked(ofs, indexdir, index);
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 7369193b11ec..5218a477551b 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -1071,7 +1071,6 @@ static void ovl_cleanup_index(struct dentry *dentry)
>  {
>         struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
>         struct dentry *indexdir = ovl_indexdir(dentry->d_sb);
> -       struct inode *dir = indexdir->d_inode;
>         struct dentry *lowerdentry = ovl_dentry_lower(dentry);
>         struct dentry *upperdentry = ovl_dentry_upper(dentry);
>         struct dentry *index = NULL;
> @@ -1113,10 +1112,8 @@ static void ovl_cleanup_index(struct dentry *dentry)
>                 index = NULL;
>         } else if (ovl_index_all(dentry->d_sb)) {
>                 /* Whiteout orphan index to block future open by handle */
> -               inode_lock_nested(dir, I_MUTEX_PARENT);
>                 err = ovl_cleanup_and_whiteout(OVL_FS(dentry->d_sb),
>                                                indexdir, index);
> -               inode_unlock(dir);
>         } else {
>                 /* Cleanup orphan index entries */
>                 err = ovl_cleanup_unlocked(ofs, indexdir, index);
> @@ -1224,10 +1221,6 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *work,
>  {
>         struct dentry *trap;
>
> -       /* Workdir should not be the same as upperdir */
> -       if (workdir == upperdir)
> -               goto err;
> -
>         /* Workdir should not be subdir of upperdir and vice versa */
>         trap = lock_rename(workdir, upperdir);
>         if (IS_ERR(trap))
> --
> 2.49.0
>


* Re: [PATCH 18/20] ovl: narrow locking in ovl_check_rename_whiteout()
  2025-07-10 23:03 ` [PATCH 18/20] ovl: narrow locking in ovl_check_rename_whiteout() NeilBrown
@ 2025-07-11 13:54   ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 13:54 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> ovl_check_rename_whiteout() now only holds the directory lock when
> needed, and takes it again if necessary.
>
> This makes way for future changes where locks are taken on individual
> dentries rather than the whole directory.
>
> Signed-off-by: NeilBrown <neil@brown.name>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Thanks,
Amir.

> ---
>  fs/overlayfs/super.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 23f43f8131dd..78f4fcfb9ff6 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -559,7 +559,6 @@ static int ovl_get_upper(struct super_block *sb, struct ovl_fs *ofs,
>  static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
>  {
>         struct dentry *workdir = ofs->workdir;
> -       struct inode *dir = d_inode(workdir);
>         struct dentry *temp;
>         struct dentry *dest;
>         struct dentry *whiteout;
> @@ -580,19 +579,22 @@ static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
>         err = PTR_ERR(dest);
>         if (IS_ERR(dest)) {
>                 dput(temp);
> -               goto out_unlock;
> +               parent_unlock(workdir);
> +               return err;
>         }
>
>         /* Name is inline and stable - using snapshot as a copy helper */
>         take_dentry_name_snapshot(&name, temp);
>         err = ovl_do_rename(ofs, workdir, temp, workdir, dest, RENAME_WHITEOUT);
> +       parent_unlock(workdir);
>         if (err) {
>                 if (err == -EINVAL)
>                         err = 0;
>                 goto cleanup_temp;
>         }
>
> -       whiteout = ovl_lookup_upper(ofs, name.name.name, workdir, name.name.len);
> +       whiteout = ovl_lookup_upper_unlocked(ofs, name.name.name,
> +                                            workdir, name.name.len);
>         err = PTR_ERR(whiteout);
>         if (IS_ERR(whiteout))
>                 goto cleanup_temp;
> @@ -601,18 +603,15 @@ static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
>
>         /* Best effort cleanup of whiteout and temp file */
>         if (err)
> -               ovl_cleanup(ofs, dir, whiteout);
> +               ovl_cleanup_unlocked(ofs, workdir, whiteout);
>         dput(whiteout);
>
>  cleanup_temp:
> -       ovl_cleanup(ofs, dir, temp);
> +       ovl_cleanup_unlocked(ofs, workdir, temp);
>         release_dentry_name_snapshot(&name);
>         dput(temp);
>         dput(dest);
>
> -out_unlock:
> -       parent_unlock(workdir);
> -
>         return err;
>  }
>
> --
> 2.49.0
>


* Re: [PATCH 17/20] ovl: narrow locking in ovl_whiteout()
  2025-07-10 23:03 ` [PATCH 17/20] ovl: narrow locking in ovl_whiteout() NeilBrown
@ 2025-07-11 15:19   ` Amir Goldstein
  2025-07-14  1:44     ` NeilBrown
  0 siblings, 1 reply; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 15:19 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> ovl_whiteout() relies on the workdir i_rwsem to provide exclusive access
> to ofs->whiteout which it manipulates.  Rather than depending on this,
> add a new mutex, "whiteout_lock" to explicitly provide the required
> locking.  Use guard(mutex) for this so that we can return without
> needing to explicitly unlock.
>
> Then take the lock on workdir only when needed - to lookup the temp name
> and to do the whiteout or link.
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/dir.c       | 49 +++++++++++++++++++++-------------------
>  fs/overlayfs/ovl_entry.h |  1 +
>  fs/overlayfs/params.c    |  2 ++
>  3 files changed, 29 insertions(+), 23 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 086719129be3..fd89c25775bd 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -84,41 +84,44 @@ static struct dentry *ovl_whiteout(struct ovl_fs *ofs)
>         struct dentry *workdir = ofs->workdir;
>         struct inode *wdir = workdir->d_inode;
>
> -       inode_lock_nested(wdir, I_MUTEX_PARENT);
> +       guard(mutex)(&ofs->whiteout_lock);
> +
>         if (!ofs->whiteout) {
> +               inode_lock_nested(wdir, I_MUTEX_PARENT);
>                 whiteout = ovl_lookup_temp(ofs, workdir);
> -               if (IS_ERR(whiteout))
> -                       goto out;
> -
> -               err = ovl_do_whiteout(ofs, wdir, whiteout);
> -               if (err) {
> -                       dput(whiteout);
> -                       whiteout = ERR_PTR(err);
> -                       goto out;
> +               if (!IS_ERR(whiteout)) {
> +                       err = ovl_do_whiteout(ofs, wdir, whiteout);
> +                       if (err) {
> +                               dput(whiteout);
> +                               whiteout = ERR_PTR(err);
> +                       }
>                 }
> +               inode_unlock(wdir);
> +               if (IS_ERR(whiteout))
> +                       return whiteout;
>                 ofs->whiteout = whiteout;
>         }
>
>         if (!ofs->no_shared_whiteout) {
> +               inode_lock_nested(wdir, I_MUTEX_PARENT);
>                 whiteout = ovl_lookup_temp(ofs, workdir);
> -               if (IS_ERR(whiteout))
> -                       goto out;
> -
> -               err = ovl_do_link(ofs, ofs->whiteout, wdir, whiteout);
> -               if (!err)
> -                       goto out;
> -
> -               if (err != -EMLINK) {
> -                       pr_warn("Failed to link whiteout - disabling whiteout inode sharing(nlink=%u, err=%i)\n",
> -                               ofs->whiteout->d_inode->i_nlink, err);
> -                       ofs->no_shared_whiteout = true;
> +               if (!IS_ERR(whiteout)) {
> +                       err = ovl_do_link(ofs, ofs->whiteout, wdir, whiteout);
> +                       if (err) {
> +                               dput(whiteout);
> +                               whiteout = ERR_PTR(err);
> +                       }
>                 }
> -               dput(whiteout);
> +               inode_unlock(wdir);
> +               if (!IS_ERR(whiteout) || PTR_ERR(whiteout) != -EMLINK)
> +                       return whiteout;

+               if (!IS_ERR(whiteout))
+                       return whiteout;

> +
> +               pr_warn("Failed to link whiteout - disabling whiteout inode sharing(nlink=%u, err=%i)\n",
> +                       ofs->whiteout->d_inode->i_nlink, err);
> +               ofs->no_shared_whiteout = true;

The logic was changed here.
The pr_warn() above and no_shared_whiteout = true are for the case of
PTR_ERR(whiteout) != -EMLINK.

>         }
>         whiteout = ofs->whiteout;
>         ofs->whiteout = NULL;

The outcome is the same with all errors - we return and reset
ofs->whiteout - but with EMLINK this is expected and not a warning,
while other errors are unexpected, so we warn and do not try again
to hardlink to the singleton whiteout.

Thanks,
Amir.
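
The intended error handling can be written down as a small user-space
model.  This is a sketch only: toy_ofs, toy_result and
toy_share_whiteout() are hypothetical names, the warning is reduced to a
counter, and link_err stands in for the return value of the link attempt.
It encodes the behaviour described above: every link failure falls back
to handing out the singleton whiteout, but only an error other than
-EMLINK warns and disables further sharing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* What the caller ends up with. */
enum toy_result {
	TOY_LINKED,	/* a fresh hardlink to the shared whiteout */
	TOY_SINGLETON,	/* the shared whiteout itself was handed out */
};

/* Toy subset of the filesystem state relevant to whiteout sharing. */
struct toy_ofs {
	bool no_shared_whiteout;
	int warnings;	/* counts would-be pr_warn() calls */
};

/* link_err: 0 if the link attempt succeeds, a negative errno otherwise. */
static enum toy_result toy_share_whiteout(struct toy_ofs *ofs, int link_err)
{
	assert(link_err <= 0);

	if (!ofs->no_shared_whiteout) {
		if (!link_err)
			return TOY_LINKED;
		if (link_err != -EMLINK) {
			/* Unexpected failure: warn and stop trying to share. */
			ofs->warnings++;
			ofs->no_shared_whiteout = true;
		}
		/* -EMLINK is expected once the link count is exhausted. */
	}
	return TOY_SINGLETON;
}
```

Feeding -EMLINK leaves warnings at zero and sharing enabled, while any
other error bumps the counter once and makes later calls skip the link
attempt entirely.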


* Re: [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem
  2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
                   ` (19 preceding siblings ...)
  2025-07-10 23:03 ` [PATCH 20/20] ovl: rename ovl_cleanup_unlocked() to ovl_cleanup() NeilBrown
@ 2025-07-11 16:41 ` Amir Goldstein
  2025-07-14  5:57   ` Amir Goldstein
  20 siblings, 1 reply; 54+ messages in thread
From: Amir Goldstein @ 2025-07-11 16:41 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
>
> This is a revised set of patches following helpful feedback.  There are
> now more patches, but they should be a lot easier to review.

I confirm that this set was "reviewable" :)

No major comments on my part, mostly petty nits.

I would prefer to see parent_lock/unlock helpers in vfs for v3,
but if you prefer to keep the prep patches internal to ovl, that's fine too.
In that case I'd prefer to use ovl_parent_lock/unlock, but if that's too
painful, don't bother.

Thanks,
Amir.

>
> These patches are all in a git tree at
>    https://github.com/neilbrown/linux/commits/pdirops
> though there are a lot more patches there too - demonstrating what is to come.
> 0eaa1c629788 ovl: rename ovl_cleanup_unlocked() to ovl_cleanup()
> is the last in the series posted here.
>
> I welcome further review.
>
> Original description:
>
> This series of patches for overlayfs is primarily focussed on preparing
> for some proposed changes to directory locking.  In the new scheme we
> will lock individual dentries in a directory rather than the whole
> directory.
>
> ovl currently will sometimes lock a directory on the upper filesystem
> and do a few different things while holding the lock.  This is
> incompatible with the new scheme.
>
> This series narrows the region of code protected by the directory lock,
> taking it multiple times when necessary.  This theoretically opens up the
> possibility of other changes happening on the upper filesystem between the
> unlock and the lock.  To some extent the patches guard against that by
> checking the dentries still have the expected parent after retaking the
> lock.  In general, I think ovl would have trouble if upperfs were being
> changed independently, and I don't think the changes here increase the
> problem in any important way.
>
> I have tested this with fstests, both generic and unionfs tests.  I
> wouldn't be surprised if I missed something though, so please review
> carefully.
>
> After this series (with any needed changes) lands I will resubmit my
> change to vfs_rmdir() behaviour to have it drop the lock on error.  ovl
> will be much better positioned to handle that change.  It will come with
> the new "lookup_and_lock" API that I am proposing.
>
> Thanks,
> NeilBrown
>
>
>  [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir()
>  [PATCH 02/20] ovl: change ovl_create_index() to take write and dir
>  [PATCH 03/20] ovl: Call ovl_create_temp() without lock held.
>  [PATCH 04/20] ovl: narrow the locked region in ovl_copy_up_workdir()
>  [PATCH 05/20] ovl: narrow locking in ovl_create_upper()
>  [PATCH 06/20] ovl: narrow locking in ovl_clear_empty()
>  [PATCH 07/20] ovl: narrow locking in ovl_create_over_whiteout()
>  [PATCH 08/20] ovl: narrow locking in ovl_rename()
>  [PATCH 09/20] ovl: narrow locking in ovl_cleanup_whiteouts()
>  [PATCH 10/20] ovl: narrow locking in ovl_cleanup_index()
>  [PATCH 11/20] ovl: narrow locking in ovl_workdir_create()
>  [PATCH 12/20] ovl: narrow locking in ovl_indexdir_cleanup()
>  [PATCH 13/20] ovl: narrow locking in ovl_workdir_cleanup_recurse()
>  [PATCH 14/20] ovl: change ovl_workdir_cleanup() to take dir lock as
>  [PATCH 15/20] ovl: narrow locking on ovl_remove_and_whiteout()
>  [PATCH 16/20] ovl: change ovl_cleanup_and_whiteout() to take rename
>  [PATCH 17/20] ovl: narrow locking in ovl_whiteout()
>  [PATCH 18/20] ovl: narrow locking in ovl_check_rename_whiteout()
>  [PATCH 19/20] ovl: change ovl_create_real() to receive dentry parent
>  [PATCH 20/20] ovl: rename ovl_cleanup_unlocked() to ovl_cleanup()


* Re: [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir()
  2025-07-11  8:25   ` Amir Goldstein
  2025-07-11 10:30     ` Amir Goldstein
@ 2025-07-14  0:13     ` NeilBrown
  2025-07-14  5:42       ` parent_lock/unlock (Was: [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir()) Amir Goldstein
  1 sibling, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-14  0:13 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, 11 Jul 2025, Amir Goldstein wrote:
> On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> >
> > If ovl_copy_up_data() fails the error is not immediately handled but the
> > code continues on to call ovl_start_write() and lock_rename(),
> > presumably because both of these locks are needed for the cleanup.
> > > Only then (if the lock was successful) is the error checked.
> >
> > This makes the code a little hard to follow and could be fragile.
> >
> > This patch changes to handle the error immediately.  A new
> > ovl_cleanup_unlocked() is created which takes the required directory
> > lock (though it doesn't take the write lock on the filesystem).  This
> > will be used extensively in later patches.
> >
> > In general we need to check the parent is still correct after taking the
> > lock (as ovl_copy_up_workdir() does after a successful lock_rename()) so
> > that is included in ovl_cleanup_unlocked() using new lock_parent() and
> > unlock_parent() calls (it is planned to move this API into VFS code
> > eventually, though in a slightly different form).
> 
> Since you are not planning to move it to VFS with this name
> AND since I assume you want to merge this ovl cleanup prior
> to the rest of the patches, please use an ovl helper without
> the ovl_ namespace prefix and you have a typo above:
> it's parent_lock() not lock_parent().

I think you mean "with" rather than "without" ?
But you separately say you would much rather this go into the VFS code
first. 

For me a core issue is how the patches will land.  If you are happy for
these patches (once they are all approved of course) to land via the vfs
tree, then I can certainly submit the new interfaces in VFS code first,
then the ovl cleanups that use them.

However I assumed that they were so substantial that you would want them
to land via an ovl tree.  In that case I wouldn't want to have to wait
for a couple of new interfaces to land in VFS before you could take the
cleanups.

What process do you imagine?

> 
> And apropos lock helper names, at the tip of your branch
> the lock helpers used in ovl_cleanup() are named:
> lock_and_check_dentry()/dentry_unlock()
> 
> I have multiple comments on your choice of names for those helpers:
> 1. Please use a consistent name pattern for lock/unlock.
>     The pattern <obj-or-lock-type>_{lock,unlock}_* is far more common
>     than the pattern lock_<obj-or-lock-type> in the kernel, but at least
>     be consistent with dentry_lock_and_check() or better yet
>     parent_lock() and later parent_lock_get_child()

dentry_lock_and_check() does make sense - thanks.

> 2. dentry_unlock() is a very strange name for a helper that
>     unlocks the parent. The fact that you document what it does
>     in Kernel-doc does not stop people reading the code using it
>     from being confused and writing bugs.

The plan is that dentry_lookup_and_lock() will only lock the parent during a
short interim period.  Maybe there will be one full release where that
is the case.  As soon as practical (and we know this sort of large change
cannot move quickly) dentry_lookup_and_lock() etc will only lock the
dentry, not the directory.  The directory will only get locked
immediately before calling the inode_operations - for filesystems that
haven't opted out.  The patches in my git tree don't fully reflect this
yet (though the hints are there at the end) but that is my current
plan, based on most recent feedback from Al Viro.

> 3. Why not call it parent_unlock() like I suggested and like you
>     used in this patch set and why not introduce it in VFS to begin with?
>     For that matter parent_unlock_{put,return}_child() is more clear IMO.

Because, as I say above, it is only incidentally about the parent. It is
primarily about the dentry.

> 4. The name dentry_unlock_rename(&rd) also does not balance nicely with
>     the name lookup_and_lock_rename(&rd) and has nothing to do with the
>     dentry_ prefix. How about lookup_done_and_unlock_rename(&rd)?

That is probably my least favourite name....  I did try some "done"
variants (following one from done_path_create()).  But I felt it should
be "done_$function-that-started-this-interaction()" and that resulted in
   done_dentry_lookup_and_lock()
or similar, and having "lock" in an unlock function was weird.
Your "done_and_unlock" addresses this but results in a long name that
feels clumsy to me.

I chose the dentry_ prefix before I decided to pass the renamedata
around (and I'm really happy about that latter choice).  So
reconsidering the name is definitely appropriate.
Maybe  renamedata_lock() and renamedata_unlock() ???
renamedata_lock() can do lookups as well as locking, but maybe that is
implied by the presence of old_last and new_last in renamedata...

> 
> Hope this is not too much complaining for review of a small cleanup patch :-p

It's review as requested, not complaining.  Thanks for it.

> 
> >
> > A fresh cleanup block is added which doesn't share code with other
> > > cleanup blocks.  It will get a new user in the next patch.
> >
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  fs/overlayfs/copy_up.c   | 12 ++++++++++--
> >  fs/overlayfs/dir.c       | 15 +++++++++++++++
> >  fs/overlayfs/overlayfs.h |  6 ++++++
> >  fs/overlayfs/util.c      | 10 ++++++++++
> >  4 files changed, 41 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > index 8a3c0d18ec2e..5d21b8d94a0a 100644
> > --- a/fs/overlayfs/copy_up.c
> > +++ b/fs/overlayfs/copy_up.c
> > @@ -794,6 +794,9 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >          */
> >         path.dentry = temp;
> >         err = ovl_copy_up_data(c, &path);
> > +       if (err)
> > +               goto cleanup_need_write;
> > +
> >         /*
> >          * We cannot hold lock_rename() throughout this helper, because of
> >          * lock ordering with sb_writers, which shouldn't be held when calling
> > @@ -809,8 +812,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >                 if (IS_ERR(trap))
> >                         goto out;
> >                 goto unlock;
> > -       } else if (err) {
> > -               goto cleanup;
> >         }
> >
> >         err = ovl_copy_up_metadata(c, temp);
> > @@ -857,6 +858,13 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >         ovl_cleanup(ofs, wdir, temp);
> >         dput(temp);
> >         goto unlock;
> > +
> > +cleanup_need_write:
> > +       ovl_start_write(c->dentry);
> > +       ovl_cleanup_unlocked(ofs, c->workdir, temp);
> > +       ovl_end_write(c->dentry);
> > +       dput(temp);
> > +       return err;
> >  }
> >
> 
> Sorry, I will not accept more messy goto routines.
> I rewrote your simplification based on the tip of your branch.
> Much simpler and no need for this extra routine.
> Just always use ovl_cleanup_unlocked() in this function and
> ovl_start_write() before goto cleanup_unlocked:

Yes, that's much nicer.  Thanks.

A couple of minor changes I've noted below just for completeness.

Thanks,
NeilBrown


> 
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -794,13 +794,16 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>          */
>         path.dentry = temp;
>         err = ovl_copy_up_data(c, &path);
> +       ovl_start_write(c->dentry);
> +       if (err)
> +               goto cleanup_unlocked;
> +
>         /*
>          * We cannot hold lock_rename() throughout this helper, because of
>          * lock ordering with sb_writers, which shouldn't be held when calling
>          * ovl_copy_up_data(), so lock workdir and destdir and make sure that
>          * temp wasn't moved before copy up completion or cleanup.
>          */
> -       ovl_start_write(c->dentry);
>         trap = lock_rename(c->workdir, c->destdir);
>         if (trap || temp->d_parent != c->workdir) {
>                 /* temp or workdir moved underneath us? abort without cleanup */
> @@ -809,8 +812,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>                 if (IS_ERR(trap))
>                         goto out;
>                 goto unlock;
> -       } else if (err) {
> -               goto cleanup;
>         }
> 
>         err = ovl_copy_up_metadata(c, temp);
> @@ -846,17 +847,17 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>         ovl_inode_update(inode, temp);
>         if (S_ISDIR(inode->i_mode))
>                 ovl_set_flag(OVL_WHITEOUTS, inode);
> -unlock:
> -       unlock_rename(c->workdir, c->destdir);

We need to leave this unlock_rename() here.

>  out:
>         ovl_end_write(c->dentry);
> 
>         return err;
> 
>  cleanup:
> -       ovl_cleanup(ofs, wdir, temp);
> +       unlock_rename(c->workdir, c->destdir);
> +cleanup_unlocked:
> +       ovl_cleanup_unlocked(ofs, wdir, temp);

"wdir" becomes "c->workdir". 

>         dput(temp);
> -       goto unlock;
> +       goto out;
>  }
> ---
> 
> >  /* Copyup using O_TMPFILE which does not require cross dir locking */
> > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> > index 4fc221ea6480..cee35d69e0e6 100644
> > --- a/fs/overlayfs/dir.c
> > +++ b/fs/overlayfs/dir.c
> > @@ -43,6 +43,21 @@ int ovl_cleanup(struct ovl_fs *ofs, struct inode *wdir, struct dentry *wdentry)
> >         return err;
> >  }
> >
> > +int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir,
> > +                        struct dentry *wdentry)
> > +{
> > +       int err;
> > +
> > +       err = parent_lock(workdir, wdentry);
> > +       if (err)
> > +               return err;
> > +
> > +       ovl_cleanup(ofs, workdir->d_inode, wdentry);
> > +       parent_unlock(workdir);
> > +
> > +       return err;
> > +}
> > +
> >  struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir)
> >  {
> >         struct dentry *temp;
> > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> > index 42228d10f6b9..68dc78c712a8 100644
> > --- a/fs/overlayfs/overlayfs.h
> > +++ b/fs/overlayfs/overlayfs.h
> > @@ -416,6 +416,11 @@ static inline bool ovl_open_flags_need_copy_up(int flags)
> >  }
> >
> >  /* util.c */
> > +int parent_lock(struct dentry *parent, struct dentry *child);
> > +static inline void parent_unlock(struct dentry *parent)
> > +{
> > +       inode_unlock(parent->d_inode);
> > +}
> 
> ovl_parent_unlock() or move to vfs please.
> 
> >  int ovl_get_write_access(struct dentry *dentry);
> >  void ovl_put_write_access(struct dentry *dentry);
> >  void ovl_start_write(struct dentry *dentry);
> > @@ -843,6 +848,7 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs,
> >                                struct inode *dir, struct dentry *newdentry,
> >                                struct ovl_cattr *attr);
> >  int ovl_cleanup(struct ovl_fs *ofs, struct inode *dir, struct dentry *dentry);
> > +int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir, struct dentry *dentry);
> >  struct dentry *ovl_lookup_temp(struct ovl_fs *ofs, struct dentry *workdir);
> >  struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
> >                                struct ovl_cattr *attr);
> > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> > index 2b4754c645ee..a5105d68f6b4 100644
> > --- a/fs/overlayfs/util.c
> > +++ b/fs/overlayfs/util.c
> > @@ -1544,3 +1544,13 @@ void ovl_copyattr(struct inode *inode)
> >         i_size_write(inode, i_size_read(realinode));
> >         spin_unlock(&inode->i_lock);
> >  }
> > +
> > +int parent_lock(struct dentry *parent, struct dentry *child)
> > +{
> > +       inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> > +       if (!child || child->d_parent == parent)
> > +               return 0;
> > +
> > +       inode_unlock(parent->d_inode);
> > +       return -EINVAL;
> > +}
> 
> ovl_parent_lock() or move to vfs please.
> 
> Thanks,
> Amir.
> 



* Re: [PATCH 02/20] ovl: change ovl_create_index() to take write and dir locks
  2025-07-11 10:41   ` Amir Goldstein
@ 2025-07-14  0:14     ` NeilBrown
  0 siblings, 0 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-14  0:14 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, 11 Jul 2025, Amir Goldstein wrote:
> On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> >
> > ovl_copy_up_workdir() currently takes a rename lock on two directories,
> > then uses the lock to create a file in one directory, perform a
> > rename, and possibly unlink the file for cleanup.  This is incompatible
> > with proposed changes which will lock just the dentry of objects being
> > acted on.
> >
> > This patch moves the call to ovl_create_index() earlier in
> > ovl_copy_up_workdir() to before the lock is taken, and also before write
> > access to the filesystem is gained (this last is not strictly necessary
> > but seems cleaner).
> 
> With my proposed change to patch 1, ovl_create_index() will be
> called with ovl_start_write() held so you won't need to add it.
> 
> >
> > ovl_create_index() then takes the required locks and drops them before
> > returning.
> >
> > Signed-off-by: NeilBrown <neil@brown.name>
> 
> With that fixed, feel free to add:
> 
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Done - thanks.

NeilBrown


> 
> Thanks,
> Amir.
> 
> > ---
> >  fs/overlayfs/copy_up.c | 24 +++++++++++++++---------
> >  1 file changed, 15 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > index 5d21b8d94a0a..25be0b80a40b 100644
> > --- a/fs/overlayfs/copy_up.c
> > +++ b/fs/overlayfs/copy_up.c
> > @@ -517,8 +517,6 @@ static int ovl_set_upper_fh(struct ovl_fs *ofs, struct dentry *upper,
> >
> >  /*
> >   * Create and install index entry.
> > - *
> > - * Caller must hold i_mutex on indexdir.
> >   */
> >  static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
> >                             struct dentry *upper)
> > @@ -550,7 +548,10 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
> >         if (err)
> >                 return err;
> >
> > +       ovl_start_write(dentry);
> > +       inode_lock(dir);
> >         temp = ovl_create_temp(ofs, indexdir, OVL_CATTR(S_IFDIR | 0));
> > +       inode_unlock(dir);
> >         err = PTR_ERR(temp);
> >         if (IS_ERR(temp))
> >                 goto free_name;
> > @@ -559,6 +560,9 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
> >         if (err)
> >                 goto out;
> >
> > +       err = parent_lock(indexdir, temp);
> > +       if (err)
> > +               goto out;
> >         index = ovl_lookup_upper(ofs, name.name, indexdir, name.len);
> >         if (IS_ERR(index)) {
> >                 err = PTR_ERR(index);
> > @@ -566,9 +570,11 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
> >                 err = ovl_do_rename(ofs, indexdir, temp, indexdir, index, 0);
> >                 dput(index);
> >         }
> > +       parent_unlock(indexdir);
> >  out:
> >         if (err)
> > -               ovl_cleanup(ofs, dir, temp);
> > +               ovl_cleanup_unlocked(ofs, indexdir, temp);
> > +       ovl_end_write(dentry);
> >         dput(temp);
> >  free_name:
> >         kfree(name.name);
> > @@ -797,6 +803,12 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >         if (err)
> >                 goto cleanup_need_write;
> >
> > +       if (S_ISDIR(c->stat.mode) && c->indexed) {
> > +               err = ovl_create_index(c->dentry, c->origin_fh, temp);
> > +               if (err)
> > +                       goto cleanup_need_write;
> > +       }
> > +
> >         /*
> >          * We cannot hold lock_rename() throughout this helper, because of
> >          * lock ordering with sb_writers, which shouldn't be held when calling
> > @@ -818,12 +830,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >         if (err)
> >                 goto cleanup;
> >
> > -       if (S_ISDIR(c->stat.mode) && c->indexed) {
> > -               err = ovl_create_index(c->dentry, c->origin_fh, temp);
> > -               if (err)
> > -                       goto cleanup;
> > -       }
> > -
> >         upper = ovl_lookup_upper(ofs, c->destname.name, c->destdir,
> >                                  c->destname.len);
> >         err = PTR_ERR(upper);
> > --
> > 2.49.0
> >
> 



* Re: [PATCH 04/20] ovl: narrow the locked region in ovl_copy_up_workdir()
  2025-07-11 12:03   ` Amir Goldstein
@ 2025-07-14  0:29     ` NeilBrown
  0 siblings, 0 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-14  0:29 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, 11 Jul 2025, Amir Goldstein wrote:
> On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> >
> > In ovl_copy_up_workdir() unlock immediately after the rename, and then
> > use ovl_cleanup_unlocked() with separate locking rather than using the
> > same lock to protect both.
> >
> > This makes way for future changes where locks are taken on individual
> > dentries rather than the whole directory.
> >
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  fs/overlayfs/copy_up.c | 18 +++++++++---------
> >  1 file changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > index eafb46686854..7b84a39c081f 100644
> > --- a/fs/overlayfs/copy_up.c
> > +++ b/fs/overlayfs/copy_up.c
> > @@ -765,7 +765,6 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >  {
> >         struct ovl_fs *ofs = OVL_FS(c->dentry->d_sb);
> >         struct inode *inode;
> > -       struct inode *wdir = d_inode(c->workdir);
> >         struct path path = { .mnt = ovl_upper_mnt(ofs) };
> >         struct dentry *temp, *upper, *trap;
> >         struct ovl_cu_creds cc;
> > @@ -816,9 +815,9 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >                 /* temp or workdir moved underneath us? abort without cleanup */
> >                 dput(temp);
> >                 err = -EIO;
> > -               if (IS_ERR(trap))
> > -                       goto out;
> > -               goto unlock;
> > +               if (!IS_ERR(trap))
> > +                       unlock_rename(c->workdir, c->destdir);
> > +               goto out;
> 
> I now see that this bit was missing from my proposed patch 1
> variant, but with this in patch 1, this patch becomes trivial.

I missed that too :-)
As you say - nice and trivial here now.

NeilBrown


> 
> Thanks,
> Amir.
> 
> >         }
> >
> >         err = ovl_copy_up_metadata(c, temp);
> > @@ -832,9 +831,10 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >                 goto cleanup;
> >
> >         err = ovl_do_rename(ofs, c->workdir, temp, c->destdir, upper, 0);
> > +       unlock_rename(c->workdir, c->destdir);
> >         dput(upper);
> >         if (err)
> > -               goto cleanup;
> > +               goto cleanup_unlocked;
> >
> >         inode = d_inode(c->dentry);
> >         if (c->metacopy_digest)
> > @@ -848,17 +848,17 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >         ovl_inode_update(inode, temp);
> >         if (S_ISDIR(inode->i_mode))
> >                 ovl_set_flag(OVL_WHITEOUTS, inode);
> > -unlock:
> > -       unlock_rename(c->workdir, c->destdir);
> >  out:
> >         ovl_end_write(c->dentry);
> >
> >         return err;
> >
> >  cleanup:
> > -       ovl_cleanup(ofs, wdir, temp);
> > +       unlock_rename(c->workdir, c->destdir);
> > +cleanup_unlocked:
> > +       ovl_cleanup_unlocked(ofs, c->workdir, temp);
> >         dput(temp);
> > -       goto unlock;
> > +       goto out;
> >
> >  cleanup_need_write:
> >         ovl_start_write(c->dentry);
> > --
> > 2.49.0
> >
> 



* Re: [PATCH 08/20] ovl: narrow locking in ovl_rename()
  2025-07-11 13:03   ` Amir Goldstein
@ 2025-07-14  1:00     ` NeilBrown
  2025-07-14  5:12       ` Amir Goldstein
  0 siblings, 1 reply; 54+ messages in thread
From: NeilBrown @ 2025-07-14  1:00 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, 11 Jul 2025, Amir Goldstein wrote:
> On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> >
> > Drop the rename lock immediately after the rename, and use
> > ovl_cleanup_unlocked() for cleanup.
> >
> > This makes way for future changes where locks are taken on individual
> > dentries rather than the whole directory.
> >
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  fs/overlayfs/dir.c | 15 ++++++++++-----
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> > index 687d5e12289c..d01e83f9d800 100644
> > --- a/fs/overlayfs/dir.c
> > +++ b/fs/overlayfs/dir.c
> > @@ -1262,9 +1262,10 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
> >                             new_upperdir, newdentry, flags);
> >         if (err)
> >                 goto out_dput;
> > +       unlock_rename(new_upperdir, old_upperdir);
> >
> >         if (cleanup_whiteout)
> > -               ovl_cleanup(ofs, old_upperdir->d_inode, newdentry);
> > +               ovl_cleanup_unlocked(ofs, old_upperdir, newdentry);
> >
> >         if (overwrite && d_inode(new)) {
> >                 if (new_is_dir)
> > @@ -1283,12 +1284,8 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
> >         if (d_inode(new) && ovl_dentry_upper(new))
> >                 ovl_copyattr(d_inode(new));
> >
> > -out_dput:
> >         dput(newdentry);
> > -out_dput_old:
> >         dput(olddentry);
> > -out_unlock:
> > -       unlock_rename(new_upperdir, old_upperdir);
> >  out_revert_creds:
> >         ovl_revert_creds(old_cred);
> >         if (update_nlink)
> > @@ -1299,6 +1296,14 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
> >         dput(opaquedir);
> >         ovl_cache_free(&list);
> >         return err;
> > +
> > +out_dput:
> > +       dput(newdentry);
> > +out_dput_old:
> > +       dput(olddentry);
> > +out_unlock:
> > +       unlock_rename(new_upperdir, old_upperdir);
> > +       goto out_revert_creds;
> >  }
> >
> >  static int ovl_create_tmpfile(struct file *file, struct dentry *dentry,
> > --
> > 2.49.0
> >
> 
> I think we end up with fewer, easier-to-understand goto labels
> with a relatively simple trick:
> 
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index fe493f3ed6b6..7cddaa7b263e 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -1069,8 +1069,8 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         int err;
>         struct dentry *old_upperdir;
>         struct dentry *new_upperdir;
> -       struct dentry *olddentry;
> -       struct dentry *newdentry;
> +       struct dentry *olddentry = NULL;
> +       struct dentry *newdentry = NULL;
>         struct dentry *trap;
>         bool old_opaque;
>         bool new_opaque;
> @@ -1187,18 +1187,22 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         olddentry = ovl_lookup_upper(ofs, old->d_name.name, old_upperdir,
>                                      old->d_name.len);
>         err = PTR_ERR(olddentry);
> -       if (IS_ERR(olddentry))
> +       if (IS_ERR(olddentry)) {
> +               olddentry = NULL;
>                 goto out_unlock;
> +       }
> 
>         err = -ESTALE;
>         if (!ovl_matches_upper(old, olddentry))
> -               goto out_dput_old;
> +               goto out_unlock;
> 
>         newdentry = ovl_lookup_upper(ofs, new->d_name.name, new_upperdir,
>                                      new->d_name.len);
>         err = PTR_ERR(newdentry);
> -       if (IS_ERR(newdentry))
> -               goto out_dput_old;
> +       if (IS_ERR(newdentry)) {
> +               newdentry = NULL;
> +               goto out_unlock;
> +       }
> 
>         old_opaque = ovl_dentry_is_opaque(old);
>         new_opaque = ovl_dentry_is_opaque(new);
> @@ -1207,28 +1211,28 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         if (d_inode(new) && ovl_dentry_upper(new)) {
>                 if (opaquedir) {
>                         if (newdentry != opaquedir)
> -                               goto out_dput;
> +                               goto out_unlock;
>                 } else {
>                         if (!ovl_matches_upper(new, newdentry))
> -                               goto out_dput;
> +                               goto out_unlock;
>                 }
>         } else {
>                 if (!d_is_negative(newdentry)) {
>                         if (!new_opaque || !ovl_upper_is_whiteout(ofs, newdentry))
> -                               goto out_dput;
> +                               goto out_unlock;
>                 } else {
>                         if (flags & RENAME_EXCHANGE)
> -                               goto out_dput;
> +                               goto out_unlock;
>                 }
>         }
> 
>         if (olddentry == trap)
> -               goto out_dput;
> +               goto out_unlock;
>         if (newdentry == trap)
> -               goto out_dput;
> +               goto out_unlock;
> 
>         if (olddentry->d_inode == newdentry->d_inode)
> -               goto out_dput;
> +               goto out_unlock;
> 
>         err = 0;
>         if (ovl_type_merge_or_lower(old))
> @@ -1236,7 +1240,7 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         else if (is_dir && !old_opaque && ovl_type_merge(new->d_parent))
>                 err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
>         if (err)
> -               goto out_dput;
> +               goto out_unlock;
> 
>         if (!overwrite && ovl_type_merge_or_lower(new))
>                 err = ovl_set_redirect(new, samedir);
> @@ -1244,15 +1248,16 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>                  ovl_type_merge(old->d_parent))
>                 err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
>         if (err)
> -               goto out_dput;
> +               goto out_unlock;
> 
>         err = ovl_do_rename(ofs, old_upperdir->d_inode, olddentry,
>                             new_upperdir->d_inode, newdentry, flags);
>         if (err)
> -               goto out_dput;
> +               goto out_unlock;
> +       unlock_rename(new_upperdir, old_upperdir);
> 
>         if (cleanup_whiteout)
> -               ovl_cleanup(ofs, old_upperdir->d_inode, newdentry);
> +               ovl_cleanup_unlocked(ofs, old_upperdir, newdentry);
> 
>         if (overwrite && d_inode(new)) {
>                 if (new_is_dir)
> @@ -1271,12 +1276,6 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         if (d_inode(new) && ovl_dentry_upper(new))
>                 ovl_copyattr(d_inode(new));
> 
> -out_dput:
> -       dput(newdentry);
> -out_dput_old:
> -       dput(olddentry);
> -out_unlock:
> -       unlock_rename(new_upperdir, old_upperdir);
>  out_revert_creds:
>         ovl_revert_creds(old_cred);
>         if (update_nlink)
> @@ -1284,9 +1283,15 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         else
>                 ovl_drop_write(old);
>  out:
> +       dput(newdentry);
> +       dput(olddentry);
>         dput(opaquedir);
>         ovl_cache_free(&list);
>         return err;
> +
> +out_unlock:
> +       unlock_rename(new_upperdir, old_upperdir);
> +       goto out_revert_creds;
>  }
> 

I decided to make the goto changes a separate patch, as follows.  My
version is slightly different from yours (see new var "de").

Thanks,
NeilBrown

From: NeilBrown <neil@brown.name>
Date: Mon, 14 Jul 2025 10:44:03 +1000
Subject: [PATCH] ovl: simplify gotos in ovl_rename()

Rather than having three separate goto labels (out_unlock, out_dput_old,
and out_dput), make use of the fact that dput() happily accepts a NULL
pointer to reduce this to just one goto label: out_unlock.

olddentry and newdentry are initialised to NULL and only set once a
valid dentry is found.  They are then put late in the function.

Suggested-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/overlayfs/dir.c | 54 +++++++++++++++++++++++-----------------------
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index e094adf9d169..63460bdd71cf 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -1082,9 +1082,9 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 	int err;
 	struct dentry *old_upperdir;
 	struct dentry *new_upperdir;
-	struct dentry *olddentry;
-	struct dentry *newdentry;
-	struct dentry *trap;
+	struct dentry *olddentry = NULL;
+	struct dentry *newdentry = NULL;
+	struct dentry *trap, *de;
 	bool old_opaque;
 	bool new_opaque;
 	bool cleanup_whiteout = false;
@@ -1197,21 +1197,23 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 		goto out_revert_creds;
 	}
 
-	olddentry = ovl_lookup_upper(ofs, old->d_name.name, old_upperdir,
-				     old->d_name.len);
-	err = PTR_ERR(olddentry);
-	if (IS_ERR(olddentry))
+	de = ovl_lookup_upper(ofs, old->d_name.name, old_upperdir,
+			      old->d_name.len);
+	err = PTR_ERR(de);
+	if (IS_ERR(de))
 		goto out_unlock;
+	olddentry = de;
 
 	err = -ESTALE;
 	if (!ovl_matches_upper(old, olddentry))
-		goto out_dput_old;
+		goto out_unlock;
 
-	newdentry = ovl_lookup_upper(ofs, new->d_name.name, new_upperdir,
-				     new->d_name.len);
-	err = PTR_ERR(newdentry);
-	if (IS_ERR(newdentry))
-		goto out_dput_old;
+	de = ovl_lookup_upper(ofs, new->d_name.name, new_upperdir,
+			      new->d_name.len);
+	err = PTR_ERR(de);
+	if (IS_ERR(de))
+		goto out_unlock;
+	newdentry = de;
 
 	old_opaque = ovl_dentry_is_opaque(old);
 	new_opaque = ovl_dentry_is_opaque(new);
@@ -1220,28 +1222,28 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 	if (d_inode(new) && ovl_dentry_upper(new)) {
 		if (opaquedir) {
 			if (newdentry != opaquedir)
-				goto out_dput;
+				goto out_unlock;
 		} else {
 			if (!ovl_matches_upper(new, newdentry))
-				goto out_dput;
+				goto out_unlock;
 		}
 	} else {
 		if (!d_is_negative(newdentry)) {
 			if (!new_opaque || !ovl_upper_is_whiteout(ofs, newdentry))
-				goto out_dput;
+				goto out_unlock;
 		} else {
 			if (flags & RENAME_EXCHANGE)
-				goto out_dput;
+				goto out_unlock;
 		}
 	}
 
 	if (olddentry == trap)
-		goto out_dput;
+		goto out_unlock;
 	if (newdentry == trap)
-		goto out_dput;
+		goto out_unlock;
 
 	if (olddentry->d_inode == newdentry->d_inode)
-		goto out_dput;
+		goto out_unlock;
 
 	err = 0;
 	if (ovl_type_merge_or_lower(old))
@@ -1249,7 +1251,7 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 	else if (is_dir && !old_opaque && ovl_type_merge(new->d_parent))
 		err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
 	if (err)
-		goto out_dput;
+		goto out_unlock;
 
 	if (!overwrite && ovl_type_merge_or_lower(new))
 		err = ovl_set_redirect(new, samedir);
@@ -1257,12 +1259,12 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 		 ovl_type_merge(old->d_parent))
 		err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
 	if (err)
-		goto out_dput;
+		goto out_unlock;
 
 	err = ovl_do_rename(ofs, old_upperdir, olddentry,
 			    new_upperdir, newdentry, flags);
 	if (err)
-		goto out_dput;
+		goto out_unlock;
 
 	if (cleanup_whiteout)
 		ovl_cleanup(ofs, old_upperdir->d_inode, newdentry);
@@ -1284,10 +1286,6 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 	if (d_inode(new) && ovl_dentry_upper(new))
 		ovl_copyattr(d_inode(new));
 
-out_dput:
-	dput(newdentry);
-out_dput_old:
-	dput(olddentry);
 out_unlock:
 	unlock_rename(new_upperdir, old_upperdir);
 out_revert_creds:
@@ -1297,6 +1295,8 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
 	else
 		ovl_drop_write(old);
 out:
+	dput(newdentry);
+	dput(olddentry);
 	dput(opaquedir);
 	ovl_cache_free(&list);
 	return err;
-- 
2.49.0
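
For readers following the thread, here is a minimal userspace sketch of
the single-label cleanup pattern that the commit message above describes.
It is not kernel code: struct obj, obj_get(), obj_put(), live_objs and
do_rename_model() are hypothetical stand-ins, and a failed lookup is
modelled with NULL rather than ERR_PTR().  The "tmp" variable plays the
role of "de" in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted dentry. */
struct obj {
	int refs;
};

int live_objs;	/* objects allocated and not yet freed */

/* Model of a lookup; returns NULL on failure (the kernel uses ERR_PTR). */
struct obj *obj_get(int fail)
{
	struct obj *o;

	if (fail)
		return NULL;
	o = malloc(sizeof(*o));
	if (!o)
		return NULL;
	o->refs = 1;
	live_objs++;
	return o;
}

/* Like dput(): dropping a NULL reference is a no-op. */
void obj_put(struct obj *o)
{
	if (!o)
		return;
	assert(o->refs > 0);
	if (--o->refs == 0) {
		free(o);
		live_objs--;
	}
}

/*
 * Two lookups and one exit label.  fail_at selects the failing step:
 * 0 = none, 1 = first lookup, 2 = second lookup, 3 = a later check.
 */
int do_rename_model(int fail_at)
{
	struct obj *oldo = NULL;
	struct obj *newo = NULL;
	struct obj *tmp;
	int err;

	tmp = obj_get(fail_at == 1);
	err = -ENOENT;
	if (!tmp)
		goto out;
	oldo = tmp;	/* only assigned once the lookup succeeded */

	tmp = obj_get(fail_at == 2);
	if (!tmp)
		goto out;
	newo = tmp;

	err = -ESTALE;
	if (fail_at == 3)
		goto out;

	err = 0;
out:
	/* Safe on every path: pointers never set are still NULL. */
	obj_put(newo);
	obj_put(oldo);
	return err;
}
```

Because the long-lived pointers are only assigned from a successful
lookup, every exit path can share the same pair of put calls, which is
what lets the out_dput and out_dput_old labels go away.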



* Re: [PATCH 10/20] ovl: narrow locking in ovl_cleanup_index()
  2025-07-11 13:12   ` Amir Goldstein
@ 2025-07-14  1:03     ` NeilBrown
  0 siblings, 0 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-14  1:03 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, 11 Jul 2025, Amir Goldstein wrote:
> On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> >
> > ovl_cleanup_index() takes a lock on the directory and then does a lookup
> > and possibly one of two different cleanups.
> > This patch narrows the locking to use the _unlocked() versions of the
> > lookup and one cleanup, and just takes the lock for the other cleanup.
> >
> > A subsequent patch will take the lock into the cleanup.
> >
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  fs/overlayfs/util.c | 9 ++++-----
> >  1 file changed, 4 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> > index 9ce9fe62ef28..7369193b11ec 100644
> > --- a/fs/overlayfs/util.c
> > +++ b/fs/overlayfs/util.c
> > @@ -1107,21 +1107,20 @@ static void ovl_cleanup_index(struct dentry *dentry)
> >                 goto out;
> >         }
> >
> > -       inode_lock_nested(dir, I_MUTEX_PARENT);
> > -       index = ovl_lookup_upper(ofs, name.name, indexdir, name.len);
> > +       index = ovl_lookup_upper_unlocked(ofs, name.name, indexdir, name.len);
> >         err = PTR_ERR(index);
> >         if (IS_ERR(index)) {
> >                 index = NULL;
> >         } else if (ovl_index_all(dentry->d_sb)) {
> >                 /* Whiteout orphan index to block future open by handle */
> > +               inode_lock_nested(dir, I_MUTEX_PARENT);
> 
> Don't we need to verify that index wasn't moved with
> parent_lock(indexdir, index)?

Yes, thanks.  I've changed it to use parent_lock() (or whatever we end up
calling it).

Thanks,
NeilBrown


> 
> Thanks,
> Amir.
> 
> >                 err = ovl_cleanup_and_whiteout(OVL_FS(dentry->d_sb),
> >                                                indexdir, index);
> > +               inode_unlock(dir);
> >         } else {
> >                 /* Cleanup orphan index entries */
> > -               err = ovl_cleanup(ofs, dir, index);
> > +               err = ovl_cleanup_unlocked(ofs, indexdir, index);
> >         }
> > -
> > -       inode_unlock(dir);
> >         if (err)
> >                 goto fail;
> >
> > --
> > 2.49.0
> >
> 



* Re: [PATCH 11/20] ovl: narrow locking in ovl_workdir_create()
  2025-07-11 13:32   ` Amir Goldstein
@ 2025-07-14  1:08     ` NeilBrown
  0 siblings, 0 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-14  1:08 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, 11 Jul 2025, Amir Goldstein wrote:
> On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> >
> > In ovl_workdir_create() don't hold the dir lock for the whole time, but
> > only take it when needed.
> >
> > It now gets taken separately for ovl_workdir_cleanup().  A subsequent
> > patch will move the locking into that function.
> >
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  fs/overlayfs/super.c | 16 ++++++++++------
> >  1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> > index 9cce3251dd83..239ae1946edf 100644
> > --- a/fs/overlayfs/super.c
> > +++ b/fs/overlayfs/super.c
> > @@ -299,8 +299,8 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
> >         int err;
> >         bool retried = false;
> >
> > -       inode_lock_nested(dir, I_MUTEX_PARENT);
> >  retry:
> > +       inode_lock_nested(dir, I_MUTEX_PARENT);
> >         work = ovl_lookup_upper(ofs, name, ofs->workbasedir, strlen(name));
> >
> >         if (!IS_ERR(work)) {
> > @@ -311,23 +311,27 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
> >
> >                 if (work->d_inode) {
> >                         err = -EEXIST;
> > +                       inode_unlock(dir);
> >                         if (retried)
> >                                 goto out_dput;
> >
> >                         if (persist)
> > -                               goto out_unlock;
> > +                               goto out;
> >
> >                         retried = true;
> > +                       inode_lock_nested(dir, I_MUTEX_PARENT);
> 
> Feels like this should be parent_lock(ofs->workbasedir, work)
> and parent_lock(ofs->workbasedir, NULL) in retry:

Agreed.

> 
> >                         err = ovl_workdir_cleanup(ofs, dir, mnt, work, 0);
> > +                       inode_unlock(dir);
> >                         dput(work);
> >                         if (err == -EINVAL) {
> >                                 work = ERR_PTR(err);
> > -                               goto out_unlock;
> > +                               goto out;
> >                         }
> >                         goto retry;
> >                 }
> >
> >                 work = ovl_do_mkdir(ofs, dir, work, attr.ia_mode);
> > +               inode_unlock(dir);
> >                 err = PTR_ERR(work);
> >                 if (IS_ERR(work))
> >                         goto out_err;
> > @@ -365,11 +369,11 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
> >                 if (err)
> >                         goto out_dput;
> >         } else {
> > +               inode_unlock(dir);
> >                 err = PTR_ERR(work);
> >                 goto out_err;
> >         }
> > -out_unlock:
> > -       inode_unlock(dir);
> > +out:
> >         return work;
> >
> >  out_dput:
> > @@ -378,7 +382,7 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
> >         pr_warn("failed to create directory %s/%s (errno: %i); mounting read-only\n",
> >                 ofs->config.workdir, name, -err);
> >         work = NULL;
> > -       goto out_unlock;
> > +       goto out;
> 
> might as well be return NULL now.

Done.  I got rid of the out: label completely.

NeilBrown

> 
> Thanks,
> Amir.
> 



* Re: [PATCH 15/20] ovl: narrow locking on ovl_remove_and_whiteout()
  2025-07-11 13:42   ` Amir Goldstein
@ 2025-07-14  1:35     ` NeilBrown
  0 siblings, 0 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-14  1:35 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Fri, 11 Jul 2025, Amir Goldstein wrote:
> On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> >
> > Normally it is ok to include a lookup with the subsequent operation on
> > the result.  However in this case ovl_cleanup_and_whiteout() already
> > (potentially) creates a whiteout inode so we need separate locking.
> 
> The change itself looks fine and simple, but I didn't understand the text above.
> 
> Can you please explain?

Maybe that was really a note to myself - at first glance the change
looked a little misguided.

While it is possible to perform the lookups outside the directory lock,
then take the lock, check the parents, and perform the operation, it is
generally better to combine the lookup with the lock (hence my proposed
lookup_and_lock operations).

In the current locking scheme, performing the lookup and the operation
under the one lock avoids some races.
In my new code we don't avoid the race but the lookup-and-lock can
detect the race and repeat the lookup.

So in general we can avoid returning -EINVAL if the parent check
fails.

So changing code that did a lookup and rename in the same lock to code
which takes the lock twice seems wrong.  I wanted to justify it, and the
justification is the need to create the whiteout between the lookup and
the rename.

A different way to do this might be to create the whiteout before doing
the lookup_upper.  That would require a larger refactoring that probably
isn't justified.

I've changed it to:

===========
This code:
  performs a lookup_upper
  creates a whiteout object
  renames the whiteout over the result of the lookup

The create and the rename must be locked separately for proposed
directory locking changes.  This patch takes a first step of moving the
lookup out of the locked region.
===========

Thanks,
NeilBrown
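
For readers following the thread, the "lookup outside the lock, then
take the lock and check the parent" sequence discussed above can be
sketched in userspace C roughly as follows.  This is only a model under
stated assumptions: struct node, dir_lock(), dir_unlock() and
parent_lock_model() are hypothetical names, and a plain flag stands in
for the directory's inode lock.  The check itself mirrors the
parent_lock() helper quoted earlier in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry with a parent pointer. */
struct node {
	struct node *parent;
	int locked;	/* models the directory's inode lock */
};

void dir_lock(struct node *dir)
{
	assert(!dir->locked);	/* single-threaded model: no contention */
	dir->locked = 1;
}

void dir_unlock(struct node *dir)
{
	assert(dir->locked);
	dir->locked = 0;
}

/*
 * Lock the parent, then confirm the child still lives in it.
 * On success the parent stays locked; on failure it is unlocked
 * again and -EINVAL is returned.  A NULL child skips the check.
 */
int parent_lock_model(struct node *parent, struct node *child)
{
	dir_lock(parent);
	if (!child || child->parent == parent)
		return 0;

	dir_unlock(parent);
	return -EINVAL;
}
```

A caller that did its lookup without the lock held would call
parent_lock_model() before operating on the child and treat -EINVAL as
"the child was moved in the meantime".  A lookup-and-lock operation, as
described above, could instead repeat the lookup in that case.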


> 
> Thanks,
> Amir.
> 
> >
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  fs/overlayfs/dir.c | 17 ++++++++---------
> >  1 file changed, 8 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> > index d01e83f9d800..8580cd5c61e4 100644
> > --- a/fs/overlayfs/dir.c
> > +++ b/fs/overlayfs/dir.c
> > @@ -769,15 +769,11 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
> >                         goto out;
> >         }
> >
> > -       err = ovl_lock_rename_workdir(workdir, NULL, upperdir, NULL);
> > -       if (err)
> > -               goto out_dput;
> > -
> > -       upper = ovl_lookup_upper(ofs, dentry->d_name.name, upperdir,
> > -                                dentry->d_name.len);
> > +       upper = ovl_lookup_upper_unlocked(ofs, dentry->d_name.name, upperdir,
> > +                                         dentry->d_name.len);
> >         err = PTR_ERR(upper);
> >         if (IS_ERR(upper))
> > -               goto out_unlock;
> > +               goto out_dput;
> >
> >         err = -ESTALE;
> >         if ((opaquedir && upper != opaquedir) ||
> > @@ -786,6 +782,10 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
> >                 goto out_dput_upper;
> >         }
> >
> > +       err = ovl_lock_rename_workdir(workdir, NULL, upperdir, upper);
> > +       if (err)
> > +               goto out_dput_upper;
> > +
> >         err = ovl_cleanup_and_whiteout(ofs, upperdir, upper);
> >         if (err)
> >                 goto out_d_drop;
> > @@ -793,10 +793,9 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
> >         ovl_dir_modified(dentry->d_parent, true);
> >  out_d_drop:
> >         d_drop(dentry);
> > +       unlock_rename(workdir, upperdir);
> >  out_dput_upper:
> >         dput(upper);
> > -out_unlock:
> > -       unlock_rename(workdir, upperdir);
> >  out_dput:
> >         dput(opaquedir);
> >  out:
> > --
> > 2.49.0
> >
> 



* Re: [PATCH 17/20] ovl: narrow locking in ovl_whiteout()
  2025-07-11 15:19   ` Amir Goldstein
@ 2025-07-14  1:44     ` NeilBrown
  0 siblings, 0 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-14  1:44 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Sat, 12 Jul 2025, Amir Goldstein wrote:
> On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> >
> > ovl_whiteout() relies on the workdir i_rwsem to provide exclusive access
> > to ofs->whiteout which it manipulates.  Rather than depending on this,
> > add a new mutex, "whiteout_lock" to explicitly provide the required
> > locking.  Use guard(mutex) for this so that we can return without
> > needing to explicitly unlock.
> >
> > Then take the lock on workdir only when needed - to lookup the temp name
> > and to do the whiteout or link.
> >
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  fs/overlayfs/dir.c       | 49 +++++++++++++++++++++-------------------
> >  fs/overlayfs/ovl_entry.h |  1 +
> >  fs/overlayfs/params.c    |  2 ++
> >  3 files changed, 29 insertions(+), 23 deletions(-)
> >
> > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> > index 086719129be3..fd89c25775bd 100644
> > --- a/fs/overlayfs/dir.c
> > +++ b/fs/overlayfs/dir.c
> > @@ -84,41 +84,44 @@ static struct dentry *ovl_whiteout(struct ovl_fs *ofs)
> >         struct dentry *workdir = ofs->workdir;
> >         struct inode *wdir = workdir->d_inode;
> >
> > -       inode_lock_nested(wdir, I_MUTEX_PARENT);
> > +       guard(mutex)(&ofs->whiteout_lock);
> > +
> >         if (!ofs->whiteout) {
> > +               inode_lock_nested(wdir, I_MUTEX_PARENT);
> >                 whiteout = ovl_lookup_temp(ofs, workdir);
> > -               if (IS_ERR(whiteout))
> > -                       goto out;
> > -
> > -               err = ovl_do_whiteout(ofs, wdir, whiteout);
> > -               if (err) {
> > -                       dput(whiteout);
> > -                       whiteout = ERR_PTR(err);
> > -                       goto out;
> > +               if (!IS_ERR(whiteout)) {
> > +                       err = ovl_do_whiteout(ofs, wdir, whiteout);
> > +                       if (err) {
> > +                               dput(whiteout);
> > +                               whiteout = ERR_PTR(err);
> > +                       }
> >                 }
> > +               inode_unlock(wdir);
> > +               if (IS_ERR(whiteout))
> > +                       return whiteout;
> >                 ofs->whiteout = whiteout;
> >         }
> >
> >         if (!ofs->no_shared_whiteout) {
> > +               inode_lock_nested(wdir, I_MUTEX_PARENT);
> >                 whiteout = ovl_lookup_temp(ofs, workdir);
> > -               if (IS_ERR(whiteout))
> > -                       goto out;
> > -
> > -               err = ovl_do_link(ofs, ofs->whiteout, wdir, whiteout);
> > -               if (!err)
> > -                       goto out;
> > -
> > -               if (err != -EMLINK) {
> > -                       pr_warn("Failed to link whiteout - disabling whiteout inode sharing(nlink=%u, err=%i)\n",
> > -                               ofs->whiteout->d_inode->i_nlink, err);
> > -                       ofs->no_shared_whiteout = true;
> > +               if (!IS_ERR(whiteout)) {
> > +                       err = ovl_do_link(ofs, ofs->whiteout, wdir, whiteout);
> > +                       if (err) {
> > +                               dput(whiteout);
> > +                               whiteout = ERR_PTR(err);
> > +                       }
> >                 }
> > -               dput(whiteout);
> > +               inode_unlock(wdir);
> > +               if (!IS_ERR(whiteout) || PTR_ERR(whiteout) != -EMLINK)
> > +                       return whiteout;
> 
> +               if (!IS_ERR(whiteout))
> +                       return whiteout;
> 
> > +
> > +               pr_warn("Failed to link whiteout - disabling whiteout inode sharing(nlink=%u, err=%i)\n",
> > +                       ofs->whiteout->d_inode->i_nlink, err);
> > +               ofs->no_shared_whiteout = true;
> 
> Logic was changed.
> The above pr_warn and no_shared_whiteout = true and for the case of
> PTR_ERR(whiteout) != -EMLINK
> 
> >         }
> >         whiteout = ofs->whiteout;
> >         ofs->whiteout = NULL;
> 
> The outcome is the same with all errors - we return and reset
> ofs->whiteout, but with EMLINK this is expected and not a warning
> with other errors unexpected and warning and we do not try again
> to hardlink to singleton whiteout.

I see that now - thanks.  I've fixed up the code.
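
For the record, the behaviour you describe can be modelled in a few
lines of userspace C.  This is only a sketch of the intended decision
logic, not the fixed-up patch itself: struct toy_ofs, toy_whiteout()
and the warnings counter are invented for illustration, and link_err
stands in for the result of the hard link to the singleton whiteout.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model of the state involved; not the real struct ovl_fs. */
struct toy_ofs {
	bool have_singleton;		/* models ofs->whiteout != NULL */
	bool no_shared_whiteout;
	int warnings;			/* counts would-be pr_warn() calls */
};

enum wh_source { WH_LINKED, WH_SINGLETON };

/*
 * link_err is the simulated result of linking to the singleton:
 * 0 on success, or a negative errno.
 */
static enum wh_source toy_whiteout(struct toy_ofs *ofs, int link_err)
{
	if (!ofs->no_shared_whiteout) {
		if (!link_err)
			return WH_LINKED;	/* singleton stays cached */
		if (link_err != -EMLINK) {
			/* Unexpected: warn and stop trying to share. */
			ofs->warnings++;
			ofs->no_shared_whiteout = true;
		}
		/* -EMLINK is expected: no warning, sharing stays on. */
	}
	/* Every failure hands out the singleton and resets the cache. */
	ofs->have_singleton = false;
	return WH_SINGLETON;
}
```

So the return value is the same for every error, and only the warning
and the no_shared_whiteout flag distinguish -EMLINK from the rest.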

Thanks,
NeilBrown


* Re: [PATCH 08/20] ovl: narrow locking in ovl_rename()
  2025-07-14  1:00     ` NeilBrown
@ 2025-07-14  5:12       ` Amir Goldstein
  0 siblings, 0 replies; 54+ messages in thread
From: Amir Goldstein @ 2025-07-14  5:12 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel

On Mon, Jul 14, 2025 at 3:00 AM NeilBrown <neil@brown.name> wrote:
>
> On Fri, 11 Jul 2025, Amir Goldstein wrote:
> > On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> > >
> > > Drop the rename lock immediately after the rename, and use
> > > ovl_cleanup_unlocked() for cleanup.
> > >
> > > This makes way for future changes where locks are taken on individual
> > > dentries rather than the whole directory.
> > >
> > > Signed-off-by: NeilBrown <neil@brown.name>
> > > ---
> > >  fs/overlayfs/dir.c | 15 ++++++++++-----
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> > > index 687d5e12289c..d01e83f9d800 100644
> > > --- a/fs/overlayfs/dir.c
> > > +++ b/fs/overlayfs/dir.c
> > > @@ -1262,9 +1262,10 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
> > >                             new_upperdir, newdentry, flags);
> > >         if (err)
> > >                 goto out_dput;
> > > +       unlock_rename(new_upperdir, old_upperdir);
> > >
> > >         if (cleanup_whiteout)
> > > -               ovl_cleanup(ofs, old_upperdir->d_inode, newdentry);
> > > +               ovl_cleanup_unlocked(ofs, old_upperdir, newdentry);
> > >
> > >         if (overwrite && d_inode(new)) {
> > >                 if (new_is_dir)
> > > @@ -1283,12 +1284,8 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
> > >         if (d_inode(new) && ovl_dentry_upper(new))
> > >                 ovl_copyattr(d_inode(new));
> > >
> > > -out_dput:
> > >         dput(newdentry);
> > > -out_dput_old:
> > >         dput(olddentry);
> > > -out_unlock:
> > > -       unlock_rename(new_upperdir, old_upperdir);
> > >  out_revert_creds:
> > >         ovl_revert_creds(old_cred);
> > >         if (update_nlink)
> > > @@ -1299,6 +1296,14 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
> > >         dput(opaquedir);
> > >         ovl_cache_free(&list);
> > >         return err;
> > > +
> > > +out_dput:
> > > +       dput(newdentry);
> > > +out_dput_old:
> > > +       dput(olddentry);
> > > +out_unlock:
> > > +       unlock_rename(new_upperdir, old_upperdir);
> > > +       goto out_revert_creds;
> > >  }
> > >
> > >  static int ovl_create_tmpfile(struct file *file, struct dentry *dentry,
> > > --
> > > 2.49.0
> > >
> >
> > I think we get end up with fewer and clearer to understand goto labels
> > with a relatively simple trick:
> >
> > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> > index fe493f3ed6b6..7cddaa7b263e 100644
> > --- a/fs/overlayfs/dir.c
> > +++ b/fs/overlayfs/dir.c
> > @@ -1069,8 +1069,8 @@ static int ovl_rename(struct mnt_idmap *idmap,
> > struct inode *olddir,
> >         int err;
> >         struct dentry *old_upperdir;
> >         struct dentry *new_upperdir;
> > -       struct dentry *olddentry;
> > -       struct dentry *newdentry;
> > +       struct dentry *olddentry = NULL;
> > +       struct dentry *newdentry = NULL;
> >         struct dentry *trap;
> >         bool old_opaque;
> >         bool new_opaque;
> > @@ -1187,18 +1187,22 @@ static int ovl_rename(struct mnt_idmap *idmap,
> > struct inode *olddir,
> >         olddentry = ovl_lookup_upper(ofs, old->d_name.name, old_upperdir,
> >                                      old->d_name.len);
> >         err = PTR_ERR(olddentry);
> > -       if (IS_ERR(olddentry))
> > +       if (IS_ERR(olddentry)) {
> > +               olddentry = NULL;
> >                 goto out_unlock;
> > +       }
> >
> >         err = -ESTALE;
> >         if (!ovl_matches_upper(old, olddentry))
> > -               goto out_dput_old;
> > +               goto out_unlock;
> >
> >         newdentry = ovl_lookup_upper(ofs, new->d_name.name, new_upperdir,
> >                                      new->d_name.len);
> >         err = PTR_ERR(newdentry);
> > -       if (IS_ERR(newdentry))
> > -               goto out_dput_old;
> > +       if (IS_ERR(newdentry)) {
> > +               newdentry = NULL;
> > +               goto out_unlock;
> > +       }
> >
> >         old_opaque = ovl_dentry_is_opaque(old);
> >         new_opaque = ovl_dentry_is_opaque(new);
> > @@ -1207,28 +1211,28 @@ static int ovl_rename(struct mnt_idmap *idmap,
> > struct inode *olddir,
> >         if (d_inode(new) && ovl_dentry_upper(new)) {
> >                 if (opaquedir) {
> >                         if (newdentry != opaquedir)
> > -                               goto out_dput;
> > +                               goto out_unlock;
> >                 } else {
> >                         if (!ovl_matches_upper(new, newdentry))
> > -                               goto out_dput;
> > +                               goto out_unlock;
> >                 }
> >         } else {
> >                 if (!d_is_negative(newdentry)) {
> >                         if (!new_opaque || !ovl_upper_is_whiteout(ofs,
> > newdentry))
> > -                               goto out_dput;
> > +                               goto out_unlock;
> >                 } else {
> >                         if (flags & RENAME_EXCHANGE)
> > -                               goto out_dput;
> > +                               goto out_unlock;
> >                 }
> >         }
> >
> >         if (olddentry == trap)
> > -               goto out_dput;
> > +               goto out_unlock;
> >         if (newdentry == trap)
> > -               goto out_dput;
> > +               goto out_unlock;
> >
> >         if (olddentry->d_inode == newdentry->d_inode)
> > -               goto out_dput;
> > +               goto out_unlock;
> >
> >         err = 0;
> >         if (ovl_type_merge_or_lower(old))
> > @@ -1236,7 +1240,7 @@ static int ovl_rename(struct mnt_idmap *idmap,
> > struct inode *olddir,
> >         else if (is_dir && !old_opaque && ovl_type_merge(new->d_parent))
> >                 err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
> >         if (err)
> > -               goto out_dput;
> > +               goto out_unlock;
> >
> >         if (!overwrite && ovl_type_merge_or_lower(new))
> >                 err = ovl_set_redirect(new, samedir);
> > @@ -1244,15 +1248,16 @@ static int ovl_rename(struct mnt_idmap *idmap,
> > struct inode *olddir,
> >                  ovl_type_merge(old->d_parent))
> >                 err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
> >         if (err)
> > -               goto out_dput;
> > +               goto out_unlock;
> >
> >         err = ovl_do_rename(ofs, old_upperdir->d_inode, olddentry,
> >                             new_upperdir->d_inode, newdentry, flags);
> >         if (err)
> > -               goto out_dput;
> > +               goto out_unlock;
> > +       unlock_rename(new_upperdir, old_upperdir);
> >
> >         if (cleanup_whiteout)
> > -               ovl_cleanup(ofs, old_upperdir->d_inode, newdentry);
> > +               ovl_cleanup_unlocked(ofs, old_upperdir->d_inode, newdentry);
> >
> >         if (overwrite && d_inode(new)) {
> >                 if (new_is_dir)
> > @@ -1271,12 +1276,6 @@ static int ovl_rename(struct mnt_idmap *idmap,
> > struct inode *olddir,
> >         if (d_inode(new) && ovl_dentry_upper(new))
> >                 ovl_copyattr(d_inode(new));
> >
> > -out_dput:
> > -       dput(newdentry);
> > -out_dput_old:
> > -       dput(olddentry);
> > -out_unlock:
> > -       unlock_rename(new_upperdir, old_upperdir);
> >  out_revert_creds:
> >         ovl_revert_creds(old_cred);
> >         if (update_nlink)
> > @@ -1284,9 +1283,15 @@ static int ovl_rename(struct mnt_idmap *idmap,
> > struct inode *olddir,
> >         else
> >                 ovl_drop_write(old);
> >  out:
> > +       dput(newdentry);
> > +       dput(olddentry);
> >         dput(opaquedir);
> >         ovl_cache_free(&list);
> >         return err;
> > +
> > +out_unlock:
> > +       unlock_rename(new_upperdir, old_upperdir);
> > +       goto out_revert_creds;
> >  }
> >
>
> I decided to make the goto changed into a separate patch as follows.

Good idea.

> My version is slightly different to yours (see new var "de").
>

Looks nicer.
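
For anyone following along, the pattern boils down to the userspace
sketch below.  It is a toy model under my own assumptions, not the
ovl_rename() code: struct obj, obj_put(), toy_lookup() and the toy_*
error-pointer helpers are invented stand-ins for dentries, dput(),
ovl_lookup_upper() and ERR_PTR()/IS_ERR()/PTR_ERR().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static void *toy_err_ptr(long err) { return (void *)(intptr_t)err; }
static long toy_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct obj { int refs; };

/* Like dput(): releasing a NULL pointer is a harmless no-op. */
static void obj_put(struct obj *o)
{
	if (o)
		o->refs--;
}

/* Simulated lookup: takes a reference, or returns an error pointer. */
static struct obj *toy_lookup(struct obj *o, int err)
{
	if (err)
		return toy_err_ptr(err);
	o->refs++;
	return o;
}

static int toy_rename(struct obj *a, int err_a, struct obj *b, int err_b)
{
	struct obj *old = NULL, *new = NULL, *de;
	int err;

	/* Only store the result once it is known to be a valid object. */
	de = toy_lookup(a, err_a);
	err = (int)toy_ptr_err(de);
	if (toy_is_err(de))
		goto out;
	old = de;

	de = toy_lookup(b, err_b);
	err = (int)toy_ptr_err(de);
	if (toy_is_err(de))
		goto out;
	new = de;

	err = 0;	/* the real work would happen here */
out:
	/* One label is enough: anything still NULL is simply skipped. */
	obj_put(new);
	obj_put(old);
	return err;
}
```

The temporary "de" is what keeps an error pointer from ever reaching
the put at the end.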

Thanks,
Amir.

> Thanks,
> NeilBrown
>
> From: NeilBrown <neil@brown.name>
> Date: Mon, 14 Jul 2025 10:44:03 +1000
> Subject: [PATCH] ovl: simplify gotos in ovl_rename()
>
> Rather than having three separate goto label: out_unlock, out_dput_old,
> and out_dput, make use of that fact that dput() happily accepts a NULL
> point to reduce this to just one goto label: out_unlock.
>
> olddentry and newdentry are initialised to NULL and only set once a
> value dentry is found.  They are then put late in the function.
>
> Suggested-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/overlayfs/dir.c | 54 +++++++++++++++++++++++-----------------------
>  1 file changed, 27 insertions(+), 27 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index e094adf9d169..63460bdd71cf 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -1082,9 +1082,9 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         int err;
>         struct dentry *old_upperdir;
>         struct dentry *new_upperdir;
> -       struct dentry *olddentry;
> -       struct dentry *newdentry;
> -       struct dentry *trap;
> +       struct dentry *olddentry = NULL;
> +       struct dentry *newdentry = NULL;
> +       struct dentry *trap, *de;
>         bool old_opaque;
>         bool new_opaque;
>         bool cleanup_whiteout = false;
> @@ -1197,21 +1197,23 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>                 goto out_revert_creds;
>         }
>
> -       olddentry = ovl_lookup_upper(ofs, old->d_name.name, old_upperdir,
> -                                    old->d_name.len);
> -       err = PTR_ERR(olddentry);
> -       if (IS_ERR(olddentry))
> +       de = ovl_lookup_upper(ofs, old->d_name.name, old_upperdir,
> +                             old->d_name.len);
> +       err = PTR_ERR(de);
> +       if (IS_ERR(de))
>                 goto out_unlock;
> +       olddentry = de;
>
>         err = -ESTALE;
>         if (!ovl_matches_upper(old, olddentry))
> -               goto out_dput_old;
> +               goto out_unlock;
>
> -       newdentry = ovl_lookup_upper(ofs, new->d_name.name, new_upperdir,
> -                                    new->d_name.len);
> -       err = PTR_ERR(newdentry);
> -       if (IS_ERR(newdentry))
> -               goto out_dput_old;
> +       de = ovl_lookup_upper(ofs, new->d_name.name, new_upperdir,
> +                             new->d_name.len);
> +       err = PTR_ERR(de);
> +       if (IS_ERR(de))
> +               goto out_unlock;
> +       newdentry = de;
>
>         old_opaque = ovl_dentry_is_opaque(old);
>         new_opaque = ovl_dentry_is_opaque(new);
> @@ -1220,28 +1222,28 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         if (d_inode(new) && ovl_dentry_upper(new)) {
>                 if (opaquedir) {
>                         if (newdentry != opaquedir)
> -                               goto out_dput;
> +                               goto out_unlock;
>                 } else {
>                         if (!ovl_matches_upper(new, newdentry))
> -                               goto out_dput;
> +                               goto out_unlock;
>                 }
>         } else {
>                 if (!d_is_negative(newdentry)) {
>                         if (!new_opaque || !ovl_upper_is_whiteout(ofs, newdentry))
> -                               goto out_dput;
> +                               goto out_unlock;
>                 } else {
>                         if (flags & RENAME_EXCHANGE)
> -                               goto out_dput;
> +                               goto out_unlock;
>                 }
>         }
>
>         if (olddentry == trap)
> -               goto out_dput;
> +               goto out_unlock;
>         if (newdentry == trap)
> -               goto out_dput;
> +               goto out_unlock;
>
>         if (olddentry->d_inode == newdentry->d_inode)
> -               goto out_dput;
> +               goto out_unlock;
>
>         err = 0;
>         if (ovl_type_merge_or_lower(old))
> @@ -1249,7 +1251,7 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         else if (is_dir && !old_opaque && ovl_type_merge(new->d_parent))
>                 err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
>         if (err)
> -               goto out_dput;
> +               goto out_unlock;
>
>         if (!overwrite && ovl_type_merge_or_lower(new))
>                 err = ovl_set_redirect(new, samedir);
> @@ -1257,12 +1259,12 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>                  ovl_type_merge(old->d_parent))
>                 err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
>         if (err)
> -               goto out_dput;
> +               goto out_unlock;
>
>         err = ovl_do_rename(ofs, old_upperdir, olddentry,
>                             new_upperdir, newdentry, flags);
>         if (err)
> -               goto out_dput;
> +               goto out_unlock;
>
>         if (cleanup_whiteout)
>                 ovl_cleanup(ofs, old_upperdir->d_inode, newdentry);
> @@ -1284,10 +1286,6 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         if (d_inode(new) && ovl_dentry_upper(new))
>                 ovl_copyattr(d_inode(new));
>
> -out_dput:
> -       dput(newdentry);
> -out_dput_old:
> -       dput(olddentry);
>  out_unlock:
>         unlock_rename(new_upperdir, old_upperdir);
>  out_revert_creds:
> @@ -1297,6 +1295,8 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
>         else
>                 ovl_drop_write(old);
>  out:
> +       dput(newdentry);
> +       dput(olddentry);
>         dput(opaquedir);
>         ovl_cache_free(&list);
>         return err;
> --
> 2.49.0
>


* Re: parent_lock/unlock (Was: [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir())
  2025-07-14  0:13     ` NeilBrown
@ 2025-07-14  5:42       ` Amir Goldstein
  2025-07-16  3:55         ` NeilBrown
  0 siblings, 1 reply; 54+ messages in thread
From: Amir Goldstein @ 2025-07-14  5:42 UTC (permalink / raw)
  To: NeilBrown, Christian Brauner
  Cc: Miklos Szeredi, overlayfs, linux-fsdevel, Al Viro, Jan Kara

[CC vfs maintainers who were not personally CCed on your patches
and changed the subject to focus on the topic at hand.]

On Mon, Jul 14, 2025 at 2:13 AM NeilBrown <neil@brown.name> wrote:
>
> On Fri, 11 Jul 2025, Amir Goldstein wrote:
> > On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> > >
> > > If ovl_copy_up_data() fails the error is not immediately handled but the
> > > code continues on to call ovl_start_write() and lock_rename(),
> > > presumably because both of these locks are needed for the cleanup.
> > > On then (if the lock was successful) is the error checked.
> > >
> > > This makes the code a little hard to follow and could be fragile.
> > >
> > > This patch changes to handle the error immediately.  A new
> > > ovl_cleanup_unlocked() is created which takes the required directory
> > > lock (though it doesn't take the write lock on the filesystem).  This
> > > will be used extensively in later patches.
> > >
> > > In general we need to check the parent is still correct after taking the
> > > lock (as ovl_copy_up_workdir() does after a successful lock_rename()) so
> > > that is included in ovl_cleanup_unlocked() using new lock_parent() and
> > > unlock_parent() calls (it is planned to move this API into VFS code
> > > eventually, though in a slightly different form).
> >
> > Since you are not planning to move it to VFS with this name
> > AND since I assume you want to merge this ovl cleanup prior
> > to the rest of of patches, please use an ovl helper without
> > the ovl_ namespace prefix and you have a typo above
> > its parent_lock() not lock_parent().
>
> I think you mean "with" rather than "without" ?

Yeh.

> But you separately say you would much rather this go into the VFS code
> first.

On second thought, no strong feeling either way.
Using an internal ovl helper without ovl_ prefix is not good practice,
but I can also live with that for a short while, or at the very least
I am willing to defer the decision to the vfs maintainers.

Pasting the helper here for context:

> > > +
> > > +int parent_lock(struct dentry *parent, struct dentry *child)
> > > +{
> > > +       inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> > > +       if (!child || child->d_parent == parent)
> > > +               return 0;
> > > +
> > > +       inode_unlock(parent->d_inode);
> > > +       return -EINVAL;
> > > +}

FWIW, as I mentioned before, this helper could be factored out
of the first part of lock_rename_child().
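
To illustrate how a caller is expected to use it, here is a userspace
model of the helper pair.  It is a sketch only: struct toy_dentry, the
locked flag, toy_parent_unlock() and toy_cleanup() are made up for this
example, and the flag merely stands in for the parent's inode lock.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_dentry {
	struct toy_dentry *d_parent;
	int locked;	/* models the inode lock when used as a parent */
};

/* Same shape as the helper above: lock, then verify the parent. */
static int toy_parent_lock(struct toy_dentry *parent,
			   struct toy_dentry *child)
{
	assert(!parent->locked);
	parent->locked = 1;
	if (!child || child->d_parent == parent)
		return 0;

	parent->locked = 0;
	return -EINVAL;
}

static void toy_parent_unlock(struct toy_dentry *parent)
{
	assert(parent->locked);
	parent->locked = 0;
}

/*
 * Caller pattern: act on the child only if the lock was taken and the
 * child still belongs to this parent.
 */
static int toy_cleanup(struct toy_dentry *parent,
		       struct toy_dentry *child, int *removed)
{
	int err = toy_parent_lock(parent, child);

	if (err)
		return err;	/* child moved away: nothing is locked */
	*removed = 1;		/* the actual removal would go here */
	toy_parent_unlock(parent);
	return 0;
}
```

On failure the lock has already been dropped, so the caller has no
unlock to do on the error path.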

>
> For me a core issue is how the patches will land.  If you are happy for
> these patches (once they are all approved of course) to land via the vfs
> tree, then I can certainly submit the new interfaces in VFS code first,
> then the ovl cleanups that use them.
>
> However I assumed that they were so substantial that you would want them
> to land via an ovl tree.  In that case I wouldn't want to have to wait
> for a couple of new interfaces to land in VFS before you could take the
> cleanups.
>
> What process do you imagine?
>

Whatever process we choose will be coordinated with the vfs
maintainers.

Right now, there are a few ovl patches on Christian's vfs-6.17.file
branch and zero patches on overlayfs-next branch.

What I would like to do is personally apply and test your patches
(based on vfs-6.17.file).

Then I will either send a PR to Christian before the merge window
or send the PR to Linus during the merge window and after vfs-6.17.file
PR lands.

Within these options we have plenty of freedom to decide if we want
to keep parent_lock/unlock internal ovl helpers or vfs helpers.
It's really up to the vfs maintainers.

> >
> > And apropos lock helper names, at the tip of your branch

Reference for people who just joined:

   https://github.com/neilbrown/linux/commits/pdirops

> > the lock helpers used in ovl_cleanup() are named:
> > lock_and_check_dentry()/dentry_unlock()
> >
> > I have multiple comments on your choice of names for those helpers:
> > 1. Please use a consistent name pattern for lock/unlock.
> >     The pattern <obj-or-lock-type>_{lock,unlock}_* is far more common
> >     then the pattern lock_<obj-or-lock-type> in the kernel, but at least
> >     be consistent with dentry_lock_and_check() or better yet
> >     parent_lock() and later parent_lock_get_child()
>
> dentry_lock_and_check() does make sense - thanks.
>
> > 2. dentry_unlock() is a very strange name for a helper that
> >     unlocks the parent. The fact that you document what it does
> >     in Kernel-doc does not stop people reading the code using it
> >     from being confused and writing bugs.
>
> The plan is that dentry_lookup_and_lock() will only lock the parent during a
> short interim period.  Maybe there will be one full release where that
> is the case.  As soon a practical (and we know this sort of large change
> cannot move quickly) dentry_lookup_and_lock() etc will only lock the
> dentry, not the directory.  The directory will only get locked
> immediately before call the inode_operations - for filesystems that
> haven't opted out.  Thus patches in my git tree don't full reflect this
> yet (Though the hints are there are the end) but that is my current
> plan, based on most recent feedback from Al Viro.
>
> > 3. Why not call it parent_unlock() like I suggested and like you
> >     used in this patch set and why not introduce it in VFS to begin with?
> >     For that matter parent_unlock_{put,return}_child() is more clear IMO.
>
> Because, as I say about, it is only incidentally about the parent. It is
> primarily about the dentry.

When you have a helper named dentry_unlock() that unlocks the
parent inode, it's not good naming IMO.

When you have a helper called parent_unlock_put_child()
or dentry_put_and_unlock_parent() there is no ambiguity about
the subject of the operations.

>
> > 4. The name dentry_unlock_rename(&rd) also does not balance nicely with
> >     the name lookup_and_lock_rename(&rd) and has nothing to do with the
> >     dentry_ prefix. How about lookup_done_and_unlock_rename(&rd)?
>
> The is probably my least favourite name....  I did try some "done"
> variants (following one from done_path_create()).  But if felt it should
> be "done_$function-that-started-this-interaction()" and that resulted in
>    done_dentry_lookup_and_lock()
> or similar, and having "lock" in an unlock function was weird.
> Your "done_and_unlock" addresses this but results and long name that
> feels clumsy to me.
>
> I chose the dentry_ prefix before I decided to pass the renamedata
> around (and I'm really happy about that latter choice).  So
> reconsidering the name is definitely appropriate.
> Maybe  renamedata_lock() and renamedata_unlock() ???
> renamedata_lock() can do lookups as well as locking, but maybe that is
> implied by the presense of old_last and new_last in renamedata...
>

My biggest complaint was about the unbalanced lock/unlock name pattern.
renamedata_lock/unlock() is fine by me and aligns very well with existing
lock helper name patterns.

Thanks,
Amir.


* Re: [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem
  2025-07-11 16:41 ` [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem Amir Goldstein
@ 2025-07-14  5:57   ` Amir Goldstein
  2025-07-16  0:13     ` NeilBrown
  0 siblings, 1 reply; 54+ messages in thread
From: Amir Goldstein @ 2025-07-14  5:57 UTC (permalink / raw)
  To: NeilBrown
  Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel, Christian Brauner,
	Al Viro, Jan Kara

[CC vfs maintainers]

On Fri, Jul 11, 2025 at 6:41 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> >
> > This is a revised set of patches following helpful feedback.  There are
> > now more patches, but they should be a lot easier to review.
>
> I confirm that this set was "reviewable" :)
>
> No major comments on my part, mostly petty nits.
>
> I would prefer to see parent_lock/unlock helpers in vfs for v3,
> but if you prefer to keep the prep patches internal to ovl, that's fine too.
> In that case I'd prefer to use ovl_parent_lock/unlock, but if that's too
> painful, don't bother.
>
> Thanks,
> Amir.
>
> >
> > These patches are all in a git tree at
> >    https://github.com/neilbrown/linux/commits/pdirops
> > though there a lot more patches there too - demonstrating what is to come.
> > 0eaa1c629788 ovl: rename ovl_cleanup_unlocked() to ovl_cleanup()
> > is the last in the series posted here.
> >
> > I welcome further review.
> >
> > Original description:
> >
> > This series of patches for overlayfs is primarily focussed on preparing
> > for some proposed changes to directory locking.  In the new scheme we
> > will lock individual dentries in a directory rather than the whole
> > directory.
> >
> > ovl currently will sometimes lock a directory on the upper filesystem
> > and do a few different things while holding the lock.  This is
> > incompatible with the new scheme.
> >
> > This series narrows the region of code protected by the directory lock,
> > taking it multiple times when necessary.  This theoretically open up the
> > possibilty of other changes happening on the upper filesytem between the
> > unlock and the lock.  To some extent the patches guard against that by
> > checking the dentries still have the expect parent after retaking the
> > lock.  In general, I think ovl would have trouble if upperfs were being
> > changed independantly, and I don't think the changes here increase the
> > problem in any important way.
> >
> > I have tested this with fstests, both generic and unionfs tests.  I
> > wouldn't be surprised if I missed something though, so please review
> > carefully.
> >
> > After this series (with any needed changes) lands I will resubmit my
> > change to vfs_rmdir() behaviour to have it drop the lock on error.  ovl
> > will be much better positioned to handle that change.  It will come with
> > the new "lookup_and_lock" API that I am proposing.
> >

Slightly off topic.  I know how much ovl code currently depends on
(perhaps even abuses) the directory inode lock beyond its vfs uses
(e.g. to synchronize internal ovl dir cache changes), so here is an idea
that came to mind for your followup patches:
consider adding an assertion in WRAP_DIR_ITER() that disallows
i_op->no_dir_lock.
Not that any of the current users of WRAP_DIR_ITER() are candidates
for parallel dir ops (?), but it's an easy assertion to add.

Thanks,
Amir.


* Re: [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem
  2025-07-14  5:57   ` Amir Goldstein
@ 2025-07-16  0:13     ` NeilBrown
  0 siblings, 0 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-16  0:13 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, linux-unionfs, linux-fsdevel, Christian Brauner,
	Al Viro, Jan Kara

On Mon, 14 Jul 2025, Amir Goldstein wrote:
> [CC vfs maintainers]
> 
> On Fri, Jul 11, 2025 at 6:41 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> > >
> > > This is a revised set of patches following helpful feedback.  There are
> > > now more patches, but they should be a lot easier to review.
> >
> > I confirm that this set was "reviewable" :)
> >
> > No major comments on my part, mostly petty nits.
> >
> > I would prefer to see parent_lock/unlock helpers in vfs for v3,
> > but if you prefer to keep the prep patches internal to ovl, that's fine too.
> > In that case I'd prefer to use ovl_parent_lock/unlock, but if that's too
> > painful, don't bother.
> >
> > Thanks,
> > Amir.
> >
> > >
> > > These patches are all in a git tree at
> > >    https://github.com/neilbrown/linux/commits/pdirops
> > > though there are a lot more patches there too - demonstrating what is to come.
> > > 0eaa1c629788 ovl: rename ovl_cleanup_unlocked() to ovl_cleanup()
> > > is the last in the series posted here.
> > >
> > > I welcome further review.
> > >
> > > Original description:
> > >
> > > This series of patches for overlayfs is primarily focussed on preparing
> > > for some proposed changes to directory locking.  In the new scheme we
> > > will lock individual dentries in a directory rather than the whole
> > > directory.
> > >
> > > ovl currently will sometimes lock a directory on the upper filesystem
> > > and do a few different things while holding the lock.  This is
> > > incompatible with the new scheme.
> > >
> > > This series narrows the region of code protected by the directory lock,
> > > taking it multiple times when necessary.  This theoretically opens up the
> > > possibility of other changes happening on the upper filesystem between the
> > > unlock and the lock.  To some extent the patches guard against that by
> > > checking the dentries still have the expected parent after retaking the
> > > lock.  In general, I think ovl would have trouble if upperfs were being
> > > changed independently, and I don't think the changes here increase the
> > > problem in any important way.
> > >
> > > I have tested this with fstests, both generic and unionfs tests.  I
> > > wouldn't be surprised if I missed something though, so please review
> > > carefully.
> > >
> > > After this series (with any needed changes) lands I will resubmit my
> > > change to vfs_rmdir() behaviour to have it drop the lock on error.  ovl
> > > will be much better positioned to handle that change.  It will come with
> > > the new "lookup_and_lock" API that I am proposing.
> > >
> 
> Slightly off topic. As I know how much ovl code currently depends on
> (perhaps even abuses) the directory inode lock beyond its vfs uses
> (e.g. to synchronize internal ovl dir cache changes) just an idea that
> came to my head for your followup patches -
> Consider adding an assertion in WRAP_DIR_ITER() that disallows
> i_op->no_dir_lock.
> Not that any of the current users of WRAP_DIR_ITER() are candidates
> for parallel dir ops (?), but it's an easy assertion to add.

That's a sensible suggestion - thanks.
Though removing the need for WRAP_DIR_ITER() would be nice too... Not an
easy task, of course.
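
To make the suggestion concrete, here is a rough userspace sketch of what
such an assertion could look like.  Everything in it is an assumption for
illustration: the model_* names are hypothetical, no_dir_lock is only the
flag proposed above and does not exist in mainline, and the real
WRAP_DIR_ITER() is a macro in the VFS headers rather than a function like
this.  In the kernel the check would more likely be a WARN_ON_ONCE() than
a plain error return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures; not the real types. */
struct model_inode_ops {
	bool no_dir_lock;	/* proposed opt-out flag, assumed here */
};

struct model_inode {
	const struct model_inode_ops *i_op;
};

typedef int (*model_iter_fn)(struct model_inode *dir);

/* A trivial iterator used only to exercise the wrapper. */
static int model_iterate(struct model_inode *dir)
{
	(void)dir;
	return 0;
}

/*
 * Model of a wrapped directory iterator: it depends on the exclusive
 * directory lock, so refuse to run for a directory that opted out of it.
 */
static int model_wrap_dir_iter(struct model_inode *dir, model_iter_fn iter)
{
	if (dir->i_op->no_dir_lock)
		return -EINVAL;
	return iter(dir);
}
```

With this shape, a filesystem that sets the flag fails loudly the first
time it reaches a wrapped iterator, instead of silently running without
the lock it depends on.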

Thanks,
NeilBrown


* Re: parent_lock/unlock (Was: [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir())
  2025-07-14  5:42       ` parent_lock/unlock (Was: [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir()) Amir Goldstein
@ 2025-07-16  3:55         ` NeilBrown
  0 siblings, 0 replies; 54+ messages in thread
From: NeilBrown @ 2025-07-16  3:55 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Christian Brauner, Miklos Szeredi, overlayfs, linux-fsdevel,
	Al Viro, Jan Kara

On Mon, 14 Jul 2025, Amir Goldstein wrote:
> [CC vfs maintainers who were not personally CCed on your patches
> and changed the subject to focus on the topic at hand.]
> 
> On Mon, Jul 14, 2025 at 2:13 AM NeilBrown <neil@brown.name> wrote:
> >
> > On Fri, 11 Jul 2025, Amir Goldstein wrote:
> > > On Fri, Jul 11, 2025 at 1:21 AM NeilBrown <neil@brown.name> wrote:
> > > >
> > > > If ovl_copy_up_data() fails the error is not immediately handled but the
> > > > code continues on to call ovl_start_write() and lock_rename(),
> > > > presumably because both of these locks are needed for the cleanup.
> > > > Only then (if the lock was successful) is the error checked.
> > > >
> > > > This makes the code a little hard to follow and could be fragile.
> > > >
> > > > This patch changes to handle the error immediately.  A new
> > > > ovl_cleanup_unlocked() is created which takes the required directory
> > > > lock (though it doesn't take the write lock on the filesystem).  This
> > > > will be used extensively in later patches.
> > > >
> > > > In general we need to check the parent is still correct after taking the
> > > > lock (as ovl_copy_up_workdir() does after a successful lock_rename()) so
> > > > that is included in ovl_cleanup_unlocked() using new lock_parent() and
> > > > unlock_parent() calls (it is planned to move this API into VFS code
> > > > eventually, though in a slightly different form).
> > >
> > > Since you are not planning to move it to VFS with this name
> > > AND since I assume you want to merge this ovl cleanup prior
> > > to the rest of the patches, please use an ovl helper without
> > > the ovl_ namespace prefix and you have a typo above
> > > it's parent_lock() not lock_parent().
> >
> > I think you mean "with" rather than "without" ?
> 
> Yeh.
> 
> > But you separately say you would much rather this go into the VFS code
> > first.
> 
> On second thought, no strong feeling either way.
> Using an internal ovl helper without ovl_ prefix is not good practice,
> but I can also live with that for a short while, or at the very least
> I am willing to defer the decision to the vfs maintainers.
> 
> Pasting the helper here for context:
> 
> > > > +
> > > > +int parent_lock(struct dentry *parent, struct dentry *child)
> > > > +{
> > > > +       inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> > > > +       if (!child || child->d_parent == parent)
> > > > +               return 0;
> > > > +
> > > > +       inode_unlock(parent->d_inode);
> > > > +       return -EINVAL;
> > > > +}
> 
> FWIW, as I mentioned before, this helper could be factored out
> of the first part of lock_rename_child().
> 
> >
> > For me a core issue is how the patches will land.  If you are happy for
> > these patches (once they are all approved of course) to land via the vfs
> > tree, then I can certainly submit the new interfaces in VFS code first,
> > then the ovl cleanups that use them.
> >
> > However I assumed that they were so substantial that you would want them
> > to land via an ovl tree.  In that case I wouldn't want to have to wait
> > for a couple of new interfaces to land in VFS before you could take the
> > cleanups.
> >
> > What process do you imagine?
> >
> 
> Whatever process we choose is going to be collaborated with the vfs
> maintainers.
> 
> Right now, there are a few ovl patches on Christian's vfs-6.17.file
> branch and zero patches on overlayfs-next branch.
> 
> What I would like to do is personally apply and test your patches
> (based on vfs-6.17.file).
> 
> Then I will either send a PR to Christian before the merge window
> or send the PR to Linus during the merge window and after vfs-6.17.file
> PR lands.
> 
> Within these options we have plenty of freedom to decide if we want
> to keep parent_lock/unlock internal ovl helpers or vfs helpers.
> It's really up to the vfs maintainers.

My preference is for the ovl patches to land somewhere with an
ovl_parent_lock() helper which is expected to be short-lived.
(I have today posted a new set of patches to fs-devel and elsewhere).

Then I can send patches to introduce new VFS APIs and we can have the
naming discussion then.  Meanwhile I'll revise my name choice based on
your input.
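
For reference, the lock-then-recheck pattern that such a short-lived
helper would follow can be modelled in userspace as below.  This is only
a sketch under stated assumptions: model_dentry and the model_* functions
are hypothetical stand-ins, a boolean flag replaces the directory inode
lock, and the logic simply mirrors the parent_lock() helper quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct dentry. */
struct model_dentry {
	struct model_dentry *d_parent;
	bool dir_locked;	/* models the directory inode lock */
};

/*
 * Take the parent's lock, then confirm the child still belongs to it.
 * Returns 0 with the lock held, or -EINVAL with the lock dropped.
 */
static int model_parent_lock(struct model_dentry *parent,
			     struct model_dentry *child)
{
	assert(!parent->dir_locked);
	parent->dir_locked = true;
	if (!child || child->d_parent == parent)
		return 0;
	parent->dir_locked = false;
	return -EINVAL;
}

static void model_parent_unlock(struct model_dentry *parent)
{
	assert(parent->dir_locked);
	parent->dir_locked = false;
}

/* One narrow locked region: lock, do a single step, unlock. */
static int model_cleanup(struct model_dentry *parent,
			 struct model_dentry *child, int *removed)
{
	int err = model_parent_lock(parent, child);

	if (err)
		return err;
	(*removed)++;		/* stands in for the actual unlink/rmdir */
	model_parent_unlock(parent);
	return 0;
}
```

If the child has been moved to another directory between two locked
regions, model_cleanup() returns -EINVAL without doing any work and
without leaving the lock held.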

Thanks,
NeilBrown

> 
> > >
> > > And apropos lock helper names, at the tip of your branch
> 
> Reference for people who just joined:
> 
>    https://github.com/neilbrown/linux/commits/pdirops
> 
> > > the lock helpers used in ovl_cleanup() are named:
> > > lock_and_check_dentry()/dentry_unlock()
> > >
> > > I have multiple comments on your choice of names for those helpers:
> > > 1. Please use a consistent name pattern for lock/unlock.
> > >     The pattern <obj-or-lock-type>_{lock,unlock}_* is far more common
> > >     than the pattern lock_<obj-or-lock-type> in the kernel, but at least
> > >     be consistent with dentry_lock_and_check() or better yet
> > >     parent_lock() and later parent_lock_get_child()
> >
> > dentry_lock_and_check() does make sense - thanks.
> >
> > > 2. dentry_unlock() is a very strange name for a helper that
> > >     unlocks the parent. The fact that you document what it does
> > >     in Kernel-doc does not stop people reading the code using it
> > >     from being confused and writing bugs.
> >
> > The plan is that dentry_lookup_and_lock() will only lock the parent during a
> > short interim period.  Maybe there will be one full release where that
> > is the case.  As soon as practical (and we know this sort of large change
> > cannot move quickly) dentry_lookup_and_lock() etc will only lock the
> > dentry, not the directory.  The directory will only get locked
> > immediately before calling the inode_operations - for filesystems that
> > haven't opted out.  The patches in my git tree don't fully reflect this
> > yet (though the hints are there at the end) but that is my current
> > plan, based on most recent feedback from Al Viro.
> >
> > > 3. Why not call it parent_unlock() like I suggested and like you
> > >     used in this patch set and why not introduce it in VFS to begin with?
> > >     For that matter parent_unlock_{put,return}_child() is more clear IMO.
> >
> > Because, as I say above, it is only incidentally about the parent. It is
> > primarily about the dentry.
> 
> When you have a helper named dentry_unlock() that unlocks the
> parent inode, it's not good naming IMO.
> 
> When you have a helper called parent_unlock_put_child()
> or dentry_put_and_unlock_parent() there is no ambiguity about
> the subject of the operations.
> 
> >
> > > 4. The name dentry_unlock_rename(&rd) also does not balance nicely with
> > >     the name lookup_and_lock_rename(&rd) and has nothing to do with the
> > >     dentry_ prefix. How about lookup_done_and_unlock_rename(&rd)?
> >
> > That is probably my least favourite name....  I did try some "done"
> > variants (following one from done_path_create()).  But I felt it should
> > be "done_$function-that-started-this-interaction()" and that resulted in
> >    done_dentry_lookup_and_lock()
> > or similar, and having "lock" in an unlock function was weird.
> > Your "done_and_unlock" addresses this but results in a long name that
> > feels clumsy to me.
> >
> > I chose the dentry_ prefix before I decided to pass the renamedata
> > around (and I'm really happy about that latter choice).  So
> > reconsidering the name is definitely appropriate.
> > Maybe  renamedata_lock() and renamedata_unlock() ???
> > renamedata_lock() can do lookups as well as locking, but maybe that is
> > implied by the presence of old_last and new_last in renamedata...
> >
> 
> My biggest complaint was about the non balanced lock/unlock name pattern.
> renamedata_lock/unlock() is fine by me and aligns very well with existing
> lock helper name patterns.
> 
> Thanks,
> Amir.
> 



end of thread, other threads:[~2025-07-16  3:55 UTC | newest]

Thread overview: 54+ messages
2025-07-10 23:03 [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem NeilBrown
2025-07-10 23:03 ` [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir() NeilBrown
2025-07-11  8:25   ` Amir Goldstein
2025-07-11 10:30     ` Amir Goldstein
2025-07-14  0:13     ` NeilBrown
2025-07-14  5:42       ` parent_lock/unlock (Was: [PATCH 01/20] ovl: simplify an error path in ovl_copy_up_workdir()) Amir Goldstein
2025-07-16  3:55         ` NeilBrown
2025-07-10 23:03 ` [PATCH 02/20] ovl: change ovl_create_index() to take write and dir locks NeilBrown
2025-07-11 10:41   ` Amir Goldstein
2025-07-14  0:14     ` NeilBrown
2025-07-10 23:03 ` [PATCH 03/20] ovl: Call ovl_create_temp() without lock held NeilBrown
2025-07-11 11:10   ` Amir Goldstein
2025-07-10 23:03 ` [PATCH 04/20] ovl: narrow the locked region in ovl_copy_up_workdir() NeilBrown
2025-07-11 12:03   ` Amir Goldstein
2025-07-14  0:29     ` NeilBrown
2025-07-10 23:03 ` [PATCH 05/20] ovl: narrow locking in ovl_create_upper() NeilBrown
2025-07-11 12:09   ` Amir Goldstein
2025-07-10 23:03 ` [PATCH 06/20] ovl: narrow locking in ovl_clear_empty() NeilBrown
2025-07-11 12:27   ` Amir Goldstein
2025-07-10 23:03 ` [PATCH 07/20] ovl: narrow locking in ovl_create_over_whiteout() NeilBrown
2025-07-11 12:42   ` Amir Goldstein
2025-07-10 23:03 ` [PATCH 08/20] ovl: narrow locking in ovl_rename() NeilBrown
2025-07-11 13:03   ` Amir Goldstein
2025-07-14  1:00     ` NeilBrown
2025-07-14  5:12       ` Amir Goldstein
2025-07-10 23:03 ` [PATCH 09/20] ovl: narrow locking in ovl_cleanup_whiteouts() NeilBrown
2025-07-10 23:03 ` [PATCH 10/20] ovl: narrow locking in ovl_cleanup_index() NeilBrown
2025-07-11 13:12   ` Amir Goldstein
2025-07-14  1:03     ` NeilBrown
2025-07-10 23:03 ` [PATCH 11/20] ovl: narrow locking in ovl_workdir_create() NeilBrown
2025-07-11 13:32   ` Amir Goldstein
2025-07-14  1:08     ` NeilBrown
2025-07-10 23:03 ` [PATCH 12/20] ovl: narrow locking in ovl_indexdir_cleanup() NeilBrown
2025-07-11 13:33   ` Amir Goldstein
2025-07-10 23:03 ` [PATCH 13/20] ovl: narrow locking in ovl_workdir_cleanup_recurse() NeilBrown
2025-07-11 13:35   ` Amir Goldstein
2025-07-10 23:03 ` [PATCH 14/20] ovl: change ovl_workdir_cleanup() to take dir lock as needed NeilBrown
2025-07-11 13:28   ` Amir Goldstein
2025-07-10 23:03 ` [PATCH 15/20] ovl: narrow locking on ovl_remove_and_whiteout() NeilBrown
2025-07-11 13:42   ` Amir Goldstein
2025-07-14  1:35     ` NeilBrown
2025-07-10 23:03 ` [PATCH 16/20] ovl: change ovl_cleanup_and_whiteout() to take rename lock as needed NeilBrown
2025-07-11 13:50   ` Amir Goldstein
2025-07-10 23:03 ` [PATCH 17/20] ovl: narrow locking in ovl_whiteout() NeilBrown
2025-07-11 15:19   ` Amir Goldstein
2025-07-14  1:44     ` NeilBrown
2025-07-10 23:03 ` [PATCH 18/20] ovl: narrow locking in ovl_check_rename_whiteout() NeilBrown
2025-07-11 13:54   ` Amir Goldstein
2025-07-10 23:03 ` [PATCH 19/20] ovl: change ovl_create_real() to receive dentry parent NeilBrown
2025-07-10 23:03 ` [PATCH 20/20] ovl: rename ovl_cleanup_unlocked() to ovl_cleanup() NeilBrown
2025-07-11  9:57   ` Amir Goldstein
2025-07-11 16:41 ` [PATCH 00/20 v2] ovl: narrow regions protected by i_rw_sem Amir Goldstein
2025-07-14  5:57   ` Amir Goldstein
2025-07-16  0:13     ` NeilBrown
