ecryptfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes
@ 2025-06-08 23:09 NeilBrown
  2025-06-08 23:09 ` [PATCH 1/5] VFS: merge lookup_one_qstr_excl_raw() back into lookup_one_qstr_excl() NeilBrown
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: NeilBrown @ 2025-06-08 23:09 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Jeff Layton, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

The following 5 patches provide further cleanup that serves as
preparation for some dir-locking API changes that I want to make.  The
most interesting is the last which makes another change to vfs_mkdir().
As well as returning the dentry or consuming it on failure (a recent
change) it now also unlocks on failure.  This will be needed when we
transition to locking just the dentry, not the whole directory.

This leaves some rather clumsy code in overlayfs.  Overlayfs sometimes
takes a rename lock (two directories) and then possibly does a
vfs_mkdir() in one of those directories.  When that fails we now need to
unlock only the other directory.

I hope to go through overlayfs to narrow the directory locking so each
lock only covers a single operation (possibly including a lookup first).
That should remove the clumsiness and will also be needed for the
proposed API change.

As well as the cleanups here I have a few for smb/server.  I will send
those separately as they deserve careful review by the smb team and I
don't want them to be buried in this patch set.

After these, and the mentioned overlayfs changes, I have a series which
adds a collection of APIs with names like "lookup_and_lock()" which
combine the locking and the lookup, and then a set which changes all
code which currently locks a directory for name-based operations to
instead use the new look_and_lock() interfaces.  This will mean that the
changes to directory locking can be done in one central place.

After that there are a few more cleanups to stop filesystems from usng
d_drop() in the middle of name operations (at the end is OK, but not in
the middle) and then the core patches for this work which introduce an
alternate way to provide all the locking that the VFS needs for name
operations without taking i_rw_sem.  Filesystems can then opt into using
only this locking and to not depend on i_rw_sem.  This allows create and
remove of different names is the same directory to continue concurrently
with each other and with renames.  Renames are also concurrent though
cross-directory renames block some other cross-directory renames in the
same part of the tree.

Note that i_rw_sem will still be used for the target of rmdir, and will
still be held as a shared lock by readdir() so that we never try reading
in a directory being removed.  It might still be used (shared) for
lookups for the same reason, though I haven't completely settled my
design there yet.

Thanks,
NeilBrown

 [PATCH 1/5] VFS: merge lookup_one_qstr_excl_raw() back into
 [PATCH 2/5] VFS: Minor fixes for porting.rst
 [PATCH 3/5] coda: use iterate_dir() in coda_readdir()
 [PATCH 4/5] exportfs: use lookup_one_unlocked()
 [PATCH 5/5] Change vfs_mkdir() to unlock on failure.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/5] VFS: merge lookup_one_qstr_excl_raw() back into lookup_one_qstr_excl()
  2025-06-08 23:09 [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes NeilBrown
@ 2025-06-08 23:09 ` NeilBrown
  2025-06-09 12:13   ` Jeff Layton
  2025-06-08 23:09 ` [PATCH 2/5] VFS: Minor fixes for porting.rst NeilBrown
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: NeilBrown @ 2025-06-08 23:09 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Jeff Layton, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

The effect of lookup_one_qstr_excl_raw() can be achieved by passing
LOOKUP_CREATE() to lookup_one_qstr_excl() - we don't need a separate
function.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/namei.c | 37 ++++++++++++++-----------------------
 1 file changed, 14 insertions(+), 23 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 4bb889fc980b..dc42bfac5c57 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1665,9 +1665,17 @@ static struct dentry *lookup_dcache(const struct qstr *name,
 	return dentry;
 }
 
-static struct dentry *lookup_one_qstr_excl_raw(const struct qstr *name,
-					       struct dentry *base,
-					       unsigned int flags)
+/*
+ * Parent directory has inode locked exclusive.  This is one
+ * and only case when ->lookup() gets called on non in-lookup
+ * dentries - as the matter of fact, this only gets called
+ * when directory is guaranteed to have no in-lookup children
+ * at all.
+ * Will return -ENOENT if name isn't found and LOOKUP_CREATE wasn't passed.
+ * Will return -EEXIST if name is found and LOOKUP_EXCL was passed.
+ */
+struct dentry *lookup_one_qstr_excl(const struct qstr *name,
+				    struct dentry *base, unsigned int flags)
 {
 	struct dentry *dentry;
 	struct dentry *old;
@@ -1675,7 +1683,7 @@ static struct dentry *lookup_one_qstr_excl_raw(const struct qstr *name,
 
 	dentry = lookup_dcache(name, base, flags);
 	if (dentry)
-		return dentry;
+		goto found;
 
 	/* Don't create child dentry for a dead directory. */
 	dir = base->d_inode;
@@ -1691,24 +1699,7 @@ static struct dentry *lookup_one_qstr_excl_raw(const struct qstr *name,
 		dput(dentry);
 		dentry = old;
 	}
-	return dentry;
-}
-
-/*
- * Parent directory has inode locked exclusive.  This is one
- * and only case when ->lookup() gets called on non in-lookup
- * dentries - as the matter of fact, this only gets called
- * when directory is guaranteed to have no in-lookup children
- * at all.
- * Will return -ENOENT if name isn't found and LOOKUP_CREATE wasn't passed.
- * Will return -EEXIST if name is found and LOOKUP_EXCL was passed.
- */
-struct dentry *lookup_one_qstr_excl(const struct qstr *name,
-				    struct dentry *base, unsigned int flags)
-{
-	struct dentry *dentry;
-
-	dentry = lookup_one_qstr_excl_raw(name, base, flags);
+found:
 	if (IS_ERR(dentry))
 		return dentry;
 	if (d_is_negative(dentry) && !(flags & LOOKUP_CREATE)) {
@@ -2790,7 +2781,7 @@ struct dentry *kern_path_locked_negative(const char *name, struct path *path)
 	if (unlikely(type != LAST_NORM))
 		return ERR_PTR(-EINVAL);
 	inode_lock_nested(parent_path.dentry->d_inode, I_MUTEX_PARENT);
-	d = lookup_one_qstr_excl_raw(&last, parent_path.dentry, 0);
+	d = lookup_one_qstr_excl(&last, parent_path.dentry, LOOKUP_CREATE);
 	if (IS_ERR(d)) {
 		inode_unlock(parent_path.dentry->d_inode);
 		return d;
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/5] VFS: Minor fixes for porting.rst
  2025-06-08 23:09 [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes NeilBrown
  2025-06-08 23:09 ` [PATCH 1/5] VFS: merge lookup_one_qstr_excl_raw() back into lookup_one_qstr_excl() NeilBrown
@ 2025-06-08 23:09 ` NeilBrown
  2025-06-09 12:13   ` Jeff Layton
  2025-06-08 23:09 ` [PATCH 3/5] coda: use iterate_dir() in coda_readdir() NeilBrown
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: NeilBrown @ 2025-06-08 23:09 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Jeff Layton, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

This paragraph was relevant for an earlier version of the code which
passed the qstr as a struct instead of a point.  The version that landed
passed the pointer in all cases so this para is now pointless.

Signed-off-by: NeilBrown <neil@brown.name>
---
 Documentation/filesystems/porting.rst | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index 3616d7161dab..e8c9f21582d1 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -1224,9 +1224,6 @@ lookup_noperm_unlocked(), lookup_noperm_positive_unlocked().  They now
 take a qstr instead of separate name and length.  QSTR() can be used
 when strlen() is needed for the length.
 
-For try_lookup_noperm() a reference to the qstr is passed in case the
-hash might subsequently be needed.
-
 These function no longer do any permission checking - they previously
 checked that the caller has 'X' permission on the parent.  They must
 ONLY be used internally by a filesystem on itself when it knows that
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/5] coda: use iterate_dir() in coda_readdir()
  2025-06-08 23:09 [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes NeilBrown
  2025-06-08 23:09 ` [PATCH 1/5] VFS: merge lookup_one_qstr_excl_raw() back into lookup_one_qstr_excl() NeilBrown
  2025-06-08 23:09 ` [PATCH 2/5] VFS: Minor fixes for porting.rst NeilBrown
@ 2025-06-08 23:09 ` NeilBrown
  2025-06-09 12:17   ` Jeff Layton
  2025-06-09 14:35   ` Jan Harkes
  2025-06-08 23:09 ` [PATCH 4/5] exportfs: use lookup_one_unlocked() NeilBrown
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: NeilBrown @ 2025-06-08 23:09 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Jeff Layton, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

The code in coda_readdir() is nearly identical to iterate_dir().
Differences are:
 - iterate_dir() is killable
 - iterate_dir() adds permission checking and accessing notifications

I believe these are not harmful for coda so it is best to use
iterate_dir() directly.  This will allow locking changes without
touching the code in coda.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/coda/dir.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/fs/coda/dir.c b/fs/coda/dir.c
index ab69d8f0cec2..ca9990017265 100644
--- a/fs/coda/dir.c
+++ b/fs/coda/dir.c
@@ -429,17 +429,9 @@ static int coda_readdir(struct file *coda_file, struct dir_context *ctx)
 	cfi = coda_ftoc(coda_file);
 	host_file = cfi->cfi_container;
 
-	if (host_file->f_op->iterate_shared) {
-		struct inode *host_inode = file_inode(host_file);
-		ret = -ENOENT;
-		if (!IS_DEADDIR(host_inode)) {
-			inode_lock_shared(host_inode);
-			ret = host_file->f_op->iterate_shared(host_file, ctx);
-			file_accessed(host_file);
-			inode_unlock_shared(host_inode);
-		}
+	ret = iterate_dir(host_file, ctx);
+	if (ret != -ENOTDIR)
 		return ret;
-	}
 	/* Venus: we must read Venus dirents from a file */
 	return coda_venus_readdir(coda_file, ctx);
 }
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/5] exportfs: use lookup_one_unlocked()
  2025-06-08 23:09 [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes NeilBrown
                   ` (2 preceding siblings ...)
  2025-06-08 23:09 ` [PATCH 3/5] coda: use iterate_dir() in coda_readdir() NeilBrown
@ 2025-06-08 23:09 ` NeilBrown
  2025-06-09 12:18   ` Jeff Layton
  2025-06-09 14:01   ` Chuck Lever
  2025-06-08 23:09 ` [PATCH 5/5] Change vfs_mkdir() to unlock on failure NeilBrown
  2025-06-11 11:43 ` [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes Christian Brauner
  5 siblings, 2 replies; 25+ messages in thread
From: NeilBrown @ 2025-06-08 23:09 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Jeff Layton, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

rather than locking the directory and using lookup_one(), just use
lookup_one_unlocked().  This keeps locking code centralised.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/exportfs/expfs.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index cdefea17986a..d3e55de4a2a2 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -549,15 +549,13 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len,
 			goto err_result;
 		}
 
-		inode_lock(target_dir->d_inode);
-		nresult = lookup_one(mnt_idmap(mnt), &QSTR(nbuf), target_dir);
+		nresult = lookup_one_unlocked(mnt_idmap(mnt), &QSTR(nbuf), target_dir);
 		if (!IS_ERR(nresult)) {
 			if (unlikely(nresult->d_inode != result->d_inode)) {
 				dput(nresult);
 				nresult = ERR_PTR(-ESTALE);
 			}
 		}
-		inode_unlock(target_dir->d_inode);
 		/*
 		 * At this point we are done with the parent, but it's pinned
 		 * by the child dentry anyway.
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/5] Change vfs_mkdir() to unlock on failure.
  2025-06-08 23:09 [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes NeilBrown
                   ` (3 preceding siblings ...)
  2025-06-08 23:09 ` [PATCH 4/5] exportfs: use lookup_one_unlocked() NeilBrown
@ 2025-06-08 23:09 ` NeilBrown
  2025-06-09  0:50   ` Al Viro
                     ` (2 more replies)
  2025-06-11 11:43 ` [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes Christian Brauner
  5 siblings, 3 replies; 25+ messages in thread
From: NeilBrown @ 2025-06-08 23:09 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Jeff Layton, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

Proposed changes to directory-op locking will lock the dentry rather
than the whole directory.  So the dentry will need to be unlocked.

vfs_mkdir() consumes the dentry on error, so there will be no dentry to
be unlocked.

So change vfs_mkdir() to unlock on error as well as releasing the
dentry.  This requires various other functions in various callers to
also unlock on error.

At present this results in some clumsy code.  Once the transition to
dentry locking is complete the clumsiness will be gone.

overlayfs looks particularly clumsy as in some cases a double-directory
rename lock is taken, and a mkdir is then performed in one of the
directories.  If that fails the other directory must be unlocked.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/cachefiles/namei.c    | 10 +++++++---
 fs/ecryptfs/inode.c      |  3 ++-
 fs/namei.c               | 10 +++++++---
 fs/nfsd/nfs4recover.c    | 12 +++++++-----
 fs/nfsd/vfs.c            | 12 ++++++++++--
 fs/overlayfs/copy_up.c   | 21 +++++++++++++++------
 fs/overlayfs/dir.c       | 31 +++++++++++++++++++++++--------
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/super.c     | 14 ++++++++++----
 fs/xfs/scrub/orphanage.c |  2 +-
 10 files changed, 83 insertions(+), 33 deletions(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index aecfc5c37b49..6644f0694169 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -131,8 +131,11 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 		ret = cachefiles_inject_write_error();
 		if (ret == 0)
 			subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
-		else
+		else {
+			/* vfs_mkdir() unlocks on failure so we must too */
+			inode_unlock(d_inode(dir));
 			subdir = ERR_PTR(ret);
+		}
 		if (IS_ERR(subdir)) {
 			trace_cachefiles_vfs_error(NULL, d_inode(dir), ret,
 						   cachefiles_trace_mkdir_error);
@@ -196,9 +199,10 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 	return ERR_PTR(-EBUSY);
 
 mkdir_error:
-	inode_unlock(d_inode(dir));
-	if (!IS_ERR(subdir))
+	if (!IS_ERR(subdir)) {
+		inode_unlock(d_inode(dir));
 		dput(subdir);
+	}
 	pr_err("mkdir %s failed with error %d\n", dirname, ret);
 	return ERR_PTR(ret);
 
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 493d7f194956..c513e912ae3c 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -520,7 +520,7 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 				 lower_dentry, mode);
 	rc = PTR_ERR(lower_dentry);
 	if (IS_ERR(lower_dentry))
-		goto out;
+		goto out_unlocked;
 	rc = 0;
 	if (d_unhashed(lower_dentry))
 		goto out;
@@ -532,6 +532,7 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	set_nlink(dir, lower_dir->i_nlink);
 out:
 	inode_unlock(lower_dir);
+out_unlocked:
 	if (d_really_is_negative(dentry))
 		d_drop(dentry);
 	return ERR_PTR(rc);
diff --git a/fs/namei.c b/fs/namei.c
index dc42bfac5c57..cefbb681d2f5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4163,9 +4163,10 @@ EXPORT_SYMBOL(kern_path_create);
 
 void done_path_create(struct path *path, struct dentry *dentry)
 {
-	if (!IS_ERR(dentry))
+	if (!IS_ERR(dentry)) {
 		dput(dentry);
-	inode_unlock(path->dentry->d_inode);
+		inode_unlock(path->dentry->d_inode);
+	}
 	mnt_drop_write(path->mnt);
 	path_put(path);
 }
@@ -4328,7 +4329,8 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
  * negative or unhashes it and possibly splices a different one returning it,
  * the original dentry is dput() and the alternate is returned.
  *
- * In case of an error the dentry is dput() and an ERR_PTR() is returned.
+ * In case of an error the dentry is dput(), the parent is unlocked, and
+ * an ERR_PTR() is returned.
  */
 struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 			 struct dentry *dentry, umode_t mode)
@@ -4366,6 +4368,8 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	return dentry;
 
 err:
+	/* Caller only needs to unlock if dentry is not an error */
+	inode_unlock(dir);
 	dput(dentry);
 	return ERR_PTR(error);
 }
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 82785db730d9..5aedadebebe4 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -222,7 +222,8 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
 	dentry = lookup_one(&nop_mnt_idmap, &QSTR(dname), dir);
 	if (IS_ERR(dentry)) {
 		status = PTR_ERR(dentry);
-		goto out_unlock;
+		inode_unlock(d_inode(dir));
+		goto out;
 	}
 	if (d_really_is_positive(dentry))
 		/*
@@ -235,13 +236,14 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
 		 */
 		goto out_put;
 	dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
-	if (IS_ERR(dentry))
+	if (IS_ERR(dentry)) {
 		status = PTR_ERR(dentry);
+		goto out;
+	}
 out_put:
-	if (!status)
-		dput(dentry);
-out_unlock:
+	dput(dentry);
 	inode_unlock(d_inode(dir));
+out:
 	if (status == 0) {
 		if (nn->in_grace)
 			__nfsd4_create_reclaim_record_grace(clp, dname,
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index cd689df2ca5d..be29a18a23b2 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1489,7 +1489,9 @@ nfsd_check_ignore_resizing(struct iattr *iap)
 		iap->ia_valid &= ~ATTR_SIZE;
 }
 
-/* The parent directory should already be locked: */
+/* The parent directory should already be locked.  The lock
+ * will be dropped on error.
+ */
 __be32
 nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		   struct nfsd_attrs *attrs,
@@ -1555,8 +1557,11 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	err = nfsd_create_setattr(rqstp, fhp, resfhp, attrs);
 
 out:
-	if (!IS_ERR(dchild))
+	if (!IS_ERR(dchild)) {
+		if (err)
+			inode_unlock(dirp);
 		dput(dchild);
+	}
 	return err;
 
 out_nfserr:
@@ -1613,6 +1618,9 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	if (err != nfs_ok)
 		goto out_unlock;
 	err = nfsd_create_locked(rqstp, fhp, attrs, type, rdev, resfhp);
+	if (err)
+		/* lock will have been dropped */
+		return err;
 	fh_fill_post_attrs(fhp);
 out_unlock:
 	inode_unlock(dentry->d_inode);
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index d7310fcf3888..324429d02569 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -518,7 +518,7 @@ static int ovl_set_upper_fh(struct ovl_fs *ofs, struct dentry *upper,
 /*
  * Create and install index entry.
  *
- * Caller must hold i_mutex on indexdir.
+ * Caller must hold i_mutex on indexdir.  It will be unlocked on error.
  */
 static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
 			    struct dentry *upper)
@@ -539,16 +539,22 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
 	 * TODO: implement create index for non-dir, so we can call it when
 	 * encoding file handle for non-dir in case index does not exist.
 	 */
-	if (WARN_ON(!d_is_dir(dentry)))
+	if (WARN_ON(!d_is_dir(dentry))) {
+		inode_unlock(dir);
 		return -EIO;
+	}
 
 	/* Directory not expected to be indexed before copy up */
-	if (WARN_ON(ovl_test_flag(OVL_INDEX, d_inode(dentry))))
+	if (WARN_ON(ovl_test_flag(OVL_INDEX, d_inode(dentry)))) {
+		inode_unlock(dir);
 		return -EIO;
+	}
 
 	err = ovl_get_index_name_fh(fh, &name);
-	if (err)
+	if (err) {
+		inode_unlock(dir);
 		return err;
+	}
 
 	temp = ovl_create_temp(ofs, indexdir, OVL_CATTR(S_IFDIR | 0));
 	err = PTR_ERR(temp);
@@ -567,8 +573,10 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
 		dput(index);
 	}
 out:
-	if (err)
+	if (err) {
 		ovl_cleanup(ofs, dir, temp);
+		inode_unlock(dir);
+	}
 	dput(temp);
 free_name:
 	kfree(name.name);
@@ -781,7 +789,8 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 	ovl_start_write(c->dentry);
 	inode_lock(wdir);
 	temp = ovl_create_temp(ofs, c->workdir, &cattr);
-	inode_unlock(wdir);
+	if (!IS_ERR(wdir))
+		inode_unlock(wdir);
 	ovl_end_write(c->dentry);
 	ovl_revert_cu_creds(&cc);
 
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index fe493f3ed6b6..b4d92b51b63f 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -138,13 +138,16 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct inode *dir,
 	goto out;
 }
 
+/* dir will be unlocked on failure */
 struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 			       struct dentry *newdentry, struct ovl_cattr *attr)
 {
 	int err;
 
-	if (IS_ERR(newdentry))
+	if (IS_ERR(newdentry)) {
+		inode_unlock(dir);
 		return newdentry;
+	}
 
 	err = -ESTALE;
 	if (newdentry->d_inode)
@@ -189,13 +192,16 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 	}
 out:
 	if (err) {
-		if (!IS_ERR(newdentry))
+		if (!IS_ERR(newdentry)) {
+			inode_unlock(dir);
 			dput(newdentry);
+		}
 		return ERR_PTR(err);
 	}
 	return newdentry;
 }
 
+/* Note workdir will be unlocked on failure */
 struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
 			       struct ovl_cattr *attr)
 {
@@ -309,7 +315,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 				    attr);
 	err = PTR_ERR(newdentry);
 	if (IS_ERR(newdentry))
-		goto out_unlock;
+		goto out;
 
 	if (ovl_type_merge(dentry->d_parent) && d_is_dir(newdentry) &&
 	    !ovl_allow_offline_changes(ofs)) {
@@ -323,6 +329,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 		goto out_cleanup;
 out_unlock:
 	inode_unlock(udir);
+out:
 	return err;
 
 out_cleanup:
@@ -367,9 +374,12 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 
 	opaquedir = ovl_create_temp(ofs, workdir, OVL_CATTR(stat.mode));
 	err = PTR_ERR(opaquedir);
-	if (IS_ERR(opaquedir))
-		goto out_unlock;
-
+	if (IS_ERR(opaquedir)) {
+		/* workdir was unlocked, no upperdir */
+		inode_unlock(upperdir->d_inode);
+		mutex_unlock(&upperdir->d_sb->s_vfs_rename_mutex);
+		return ERR_PTR(err);
+	}
 	err = ovl_copy_xattr(dentry->d_sb, &upperpath, opaquedir);
 	if (err)
 		goto out_cleanup;
@@ -455,8 +465,13 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 
 	newdentry = ovl_create_temp(ofs, workdir, cattr);
 	err = PTR_ERR(newdentry);
-	if (IS_ERR(newdentry))
-		goto out_dput;
+	if (IS_ERR(newdentry)) {
+		/* workdir was unlocked, not upperdir */
+		inode_unlock(upperdir->d_inode);
+		mutex_unlock(&upperdir->d_sb->s_vfs_rename_mutex);
+		dput(upper);
+		goto out;
+	}
 
 	/*
 	 * mode could have been mutilated due to umask (e.g. sgid directory)
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 8baaba0a3fe5..44df3a2449e7 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -248,6 +248,7 @@ static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs,
 {
 	dentry = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
 	pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(dentry));
+	/* Note: dir will have been unlocked on failure */
 	return dentry;
 }
 
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index e19940d649ca..5f3267e919dd 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -366,14 +366,17 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 			goto out_dput;
 	} else {
 		err = PTR_ERR(work);
+		inode_unlock(dir);
 		goto out_err;
 	}
 out_unlock:
-	inode_unlock(dir);
+	if (work && !IS_ERR(work))
+		inode_unlock(dir);
 	return work;
 
 out_dput:
 	dput(work);
+	inode_unlock(dir);
 out_err:
 	pr_warn("failed to create directory %s/%s (errno: %i); mounting read-only\n",
 		ofs->config.workdir, name, -err);
@@ -569,7 +572,7 @@ static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
 	temp = ovl_create_temp(ofs, workdir, OVL_CATTR(S_IFREG | 0));
 	err = PTR_ERR(temp);
 	if (IS_ERR(temp))
-		goto out_unlock;
+		return err;
 
 	dest = ovl_lookup_temp(ofs, workdir);
 	err = PTR_ERR(dest);
@@ -620,10 +623,13 @@ static struct dentry *ovl_lookup_or_create(struct ovl_fs *ofs,
 
 	inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
 	child = ovl_lookup_upper(ofs, name, parent, len);
-	if (!IS_ERR(child) && !child->d_inode)
+	if (!IS_ERR(child) && !child->d_inode) {
 		child = ovl_create_real(ofs, parent->d_inode, child,
 					OVL_CATTR(mode));
-	inode_unlock(parent->d_inode);
+		if (!IS_ERR(child))
+			inode_unlock(parent->d_inode);
+	} else
+		inode_unlock(parent->d_inode);
 	dput(parent);
 
 	return child;
diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
index 9c12cb844231..c95bded4e8a7 100644
--- a/fs/xfs/scrub/orphanage.c
+++ b/fs/xfs/scrub/orphanage.c
@@ -170,7 +170,7 @@ xrep_orphanage_create(
 					     orphanage_dentry, 0750);
 		error = PTR_ERR(orphanage_dentry);
 		if (IS_ERR(orphanage_dentry))
-			goto out_unlock_root;
+			goto out_dput_root;
 	}
 
 	/* Not a directory? Bail out. */
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] Change vfs_mkdir() to unlock on failure.
  2025-06-08 23:09 ` [PATCH 5/5] Change vfs_mkdir() to unlock on failure NeilBrown
@ 2025-06-09  0:50   ` Al Viro
  2025-06-09  5:22     ` NeilBrown
  2025-06-09 12:23   ` Jeff Layton
  2025-06-12  6:59   ` Amir Goldstein
  2 siblings, 1 reply; 25+ messages in thread
From: Al Viro @ 2025-06-09  0:50 UTC (permalink / raw)
  To: NeilBrown
  Cc: Christian Brauner, Jan Kara, Chuck Lever, Jeff Layton,
	Amir Goldstein, Jan Harkes, David Howells, Tyler Hicks,
	Miklos Szeredi, Carlos Maiolino, linux-fsdevel, coda, codalist,
	linux-nfs, netfs, ecryptfs, linux-unionfs, linux-xfs,
	linux-kernel

On Mon, Jun 09, 2025 at 09:09:37AM +1000, NeilBrown wrote:
> Proposed changes to directory-op locking will lock the dentry rather
> than the whole directory.  So the dentry will need to be unlocked.

Please, repost your current proposal _before_ that one goes anywhere.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] Change vfs_mkdir() to unlock on failure.
  2025-06-09  0:50   ` Al Viro
@ 2025-06-09  5:22     ` NeilBrown
  2025-06-09  5:34       ` Al Viro
  0 siblings, 1 reply; 25+ messages in thread
From: NeilBrown @ 2025-06-09  5:22 UTC (permalink / raw)
  To: Al Viro
  Cc: Christian Brauner, Jan Kara, Chuck Lever, Jeff Layton,
	Amir Goldstein, Jan Harkes, David Howells, Tyler Hicks,
	Miklos Szeredi, Carlos Maiolino, linux-fsdevel, coda, codalist,
	linux-nfs, netfs, ecryptfs, linux-unionfs, linux-xfs,
	linux-kernel

On Mon, 09 Jun 2025, Al Viro wrote:
> On Mon, Jun 09, 2025 at 09:09:37AM +1000, NeilBrown wrote:
> > Proposed changes to directory-op locking will lock the dentry rather
> > than the whole directory.  So the dentry will need to be unlocked.
> 
> Please, repost your current proposal _before_ that one goes anywhere.
> 

I've posted my proposal for the new API.  This makes the value of the
vfs_mkdir() change clear (I hope).

Would you also like me to post the patches which introduce the new
locking scheme?

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] Change vfs_mkdir() to unlock on failure.
  2025-06-09  5:22     ` NeilBrown
@ 2025-06-09  5:34       ` Al Viro
  2025-06-10  8:26         ` Al Viro
  0 siblings, 1 reply; 25+ messages in thread
From: Al Viro @ 2025-06-09  5:34 UTC (permalink / raw)
  To: NeilBrown
  Cc: Christian Brauner, Jan Kara, Chuck Lever, Jeff Layton,
	Amir Goldstein, Jan Harkes, David Howells, Tyler Hicks,
	Miklos Szeredi, Carlos Maiolino, linux-fsdevel, coda, codalist,
	linux-nfs, netfs, ecryptfs, linux-unionfs, linux-xfs,
	linux-kernel

On Mon, Jun 09, 2025 at 03:22:00PM +1000, NeilBrown wrote:
> On Mon, 09 Jun 2025, Al Viro wrote:
> > On Mon, Jun 09, 2025 at 09:09:37AM +1000, NeilBrown wrote:
> > > Proposed changes to directory-op locking will lock the dentry rather
> > > than the whole directory.  So the dentry will need to be unlocked.
> > 
> > Please, repost your current proposal _before_ that one goes anywhere.
> > 
> 
> I've posted my proposal for the new API.  This makes the value of the
> vfs_mkdir() change clear (I hope).
> 
> Would you also like me to post the patches which introduce the new
> locking scheme?

Yes, seeing that the rest does not make much sense without that.

I would really like a description of that locking scheme as well,
TBH, but if you prefer to start with the patches, then so be it.

I can't promise a response tonight - going down in an hour or so
and I'd like to do enough reordering of #work.mount to be able
to post the initial variant of at least some of that in the
morning...

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/5] VFS: merge lookup_one_qstr_excl_raw() back into lookup_one_qstr_excl()
  2025-06-08 23:09 ` [PATCH 1/5] VFS: merge lookup_one_qstr_excl_raw() back into lookup_one_qstr_excl() NeilBrown
@ 2025-06-09 12:13   ` Jeff Layton
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2025-06-09 12:13 UTC (permalink / raw)
  To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara,
	Chuck Lever, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

On Mon, 2025-06-09 at 09:09 +1000, NeilBrown wrote:
> The effect of lookup_one_qstr_excl_raw() can be achieved by passing
> LOOKUP_CREATE() to lookup_one_qstr_excl() - we don't need a separate
> function.
> 
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/namei.c | 37 ++++++++++++++-----------------------
>  1 file changed, 14 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index 4bb889fc980b..dc42bfac5c57 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1665,9 +1665,17 @@ static struct dentry *lookup_dcache(const struct qstr *name,
>  	return dentry;
>  }
>  
> -static struct dentry *lookup_one_qstr_excl_raw(const struct qstr *name,
> -					       struct dentry *base,
> -					       unsigned int flags)
> +/*
> + * Parent directory has inode locked exclusive.  This is one
> + * and only case when ->lookup() gets called on non in-lookup
> + * dentries - as the matter of fact, this only gets called
> + * when directory is guaranteed to have no in-lookup children
> + * at all.
> + * Will return -ENOENT if name isn't found and LOOKUP_CREATE wasn't passed.
> + * Will return -EEXIST if name is found and LOOKUP_EXCL was passed.
> + */
> +struct dentry *lookup_one_qstr_excl(const struct qstr *name,
> +				    struct dentry *base, unsigned int flags)
>  {
>  	struct dentry *dentry;
>  	struct dentry *old;
> @@ -1675,7 +1683,7 @@ static struct dentry *lookup_one_qstr_excl_raw(const struct qstr *name,
>  
>  	dentry = lookup_dcache(name, base, flags);
>  	if (dentry)
> -		return dentry;
> +		goto found;
>  
>  	/* Don't create child dentry for a dead directory. */
>  	dir = base->d_inode;
> @@ -1691,24 +1699,7 @@ static struct dentry *lookup_one_qstr_excl_raw(const struct qstr *name,
>  		dput(dentry);
>  		dentry = old;
>  	}
> -	return dentry;
> -}
> -
> -/*
> - * Parent directory has inode locked exclusive.  This is one
> - * and only case when ->lookup() gets called on non in-lookup
> - * dentries - as the matter of fact, this only gets called
> - * when directory is guaranteed to have no in-lookup children
> - * at all.
> - * Will return -ENOENT if name isn't found and LOOKUP_CREATE wasn't passed.
> - * Will return -EEXIST if name is found and LOOKUP_EXCL was passed.
> - */
> -struct dentry *lookup_one_qstr_excl(const struct qstr *name,
> -				    struct dentry *base, unsigned int flags)
> -{
> -	struct dentry *dentry;
> -
> -	dentry = lookup_one_qstr_excl_raw(name, base, flags);
> +found:
>  	if (IS_ERR(dentry))
>  		return dentry;
>  	if (d_is_negative(dentry) && !(flags & LOOKUP_CREATE)) {
> @@ -2790,7 +2781,7 @@ struct dentry *kern_path_locked_negative(const char *name, struct path *path)
>  	if (unlikely(type != LAST_NORM))
>  		return ERR_PTR(-EINVAL);
>  	inode_lock_nested(parent_path.dentry->d_inode, I_MUTEX_PARENT);
> -	d = lookup_one_qstr_excl_raw(&last, parent_path.dentry, 0);
> +	d = lookup_one_qstr_excl(&last, parent_path.dentry, LOOKUP_CREATE);
>  	if (IS_ERR(d)) {
>  		inode_unlock(parent_path.dentry->d_inode);
>  		return d;

Nice little cleanup.

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/5] VFS: Minor fixes for porting.rst
  2025-06-08 23:09 ` [PATCH 2/5] VFS: Minor fixes for porting.rst NeilBrown
@ 2025-06-09 12:13   ` Jeff Layton
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2025-06-09 12:13 UTC (permalink / raw)
  To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara,
	Chuck Lever, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

On Mon, 2025-06-09 at 09:09 +1000, NeilBrown wrote:
> This paragraph was relevant for an earlier version of the code which
> passed the qstr as a struct instead of a point.  The version that landed
> passed the pointer in all cases so this para is now pointless.
> 
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  Documentation/filesystems/porting.rst | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
> index 3616d7161dab..e8c9f21582d1 100644
> --- a/Documentation/filesystems/porting.rst
> +++ b/Documentation/filesystems/porting.rst
> @@ -1224,9 +1224,6 @@ lookup_noperm_unlocked(), lookup_noperm_positive_unlocked().  They now
>  take a qstr instead of separate name and length.  QSTR() can be used
>  when strlen() is needed for the length.
>  
> -For try_lookup_noperm() a reference to the qstr is passed in case the
> -hash might subsequently be needed.
> -
>  These function no longer do any permission checking - they previously
>  checked that the caller has 'X' permission on the parent.  They must
>  ONLY be used internally by a filesystem on itself when it knows that

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] coda: use iterate_dir() in coda_readdir()
  2025-06-08 23:09 ` [PATCH 3/5] coda: use iterate_dir() in coda_readdir() NeilBrown
@ 2025-06-09 12:17   ` Jeff Layton
  2025-06-09 13:00     ` Jan Kara
  2025-06-09 14:35   ` Jan Harkes
  1 sibling, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2025-06-09 12:17 UTC (permalink / raw)
  To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara,
	Chuck Lever, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

On Mon, 2025-06-09 at 09:09 +1000, NeilBrown wrote:
> The code in coda_readdir() is nearly identical to iterate_dir().
> Differences are:
>  - iterate_dir() is killable
>  - iterate_dir() adds permission checking and accessing notifications
> 
> I believe these are not harmful for coda so it is best to use
> iterate_dir() directly.  This will allow locking changes without
> touching the code in coda.
> 
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/coda/dir.c | 12 ++----------
>  1 file changed, 2 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/coda/dir.c b/fs/coda/dir.c
> index ab69d8f0cec2..ca9990017265 100644
> --- a/fs/coda/dir.c
> +++ b/fs/coda/dir.c
> @@ -429,17 +429,9 @@ static int coda_readdir(struct file *coda_file, struct dir_context *ctx)
>  	cfi = coda_ftoc(coda_file);
>  	host_file = cfi->cfi_container;
>  
> -	if (host_file->f_op->iterate_shared) {
> -		struct inode *host_inode = file_inode(host_file);
> -		ret = -ENOENT;
> -		if (!IS_DEADDIR(host_inode)) {
> -			inode_lock_shared(host_inode);
> -			ret = host_file->f_op->iterate_shared(host_file, ctx);
> -			file_accessed(host_file);
> -			inode_unlock_shared(host_inode);
> -		}
> +	ret = iterate_dir(host_file, ctx);
> +	if (ret != -ENOTDIR)
>  		return ret;
> -	}
>  	/* Venus: we must read Venus dirents from a file */
>  	return coda_venus_readdir(coda_file, ctx);
>  }


Is it already time for my annual ask of "Who the heck is using coda
these days?" Anyway, this patch looks fine to me.

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] exportfs: use lookup_one_unlocked()
  2025-06-08 23:09 ` [PATCH 4/5] exportfs: use lookup_one_unlocked() NeilBrown
@ 2025-06-09 12:18   ` Jeff Layton
  2025-06-09 14:01   ` Chuck Lever
  1 sibling, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2025-06-09 12:18 UTC (permalink / raw)
  To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara,
	Chuck Lever, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

On Mon, 2025-06-09 at 09:09 +1000, NeilBrown wrote:
> rather than locking the directory and using lookup_one(), just use
> lookup_one_unlocked().  This keeps locking code centralised.
> 
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/exportfs/expfs.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> index cdefea17986a..d3e55de4a2a2 100644
> --- a/fs/exportfs/expfs.c
> +++ b/fs/exportfs/expfs.c
> @@ -549,15 +549,13 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len,
>  			goto err_result;
>  		}
>  
> -		inode_lock(target_dir->d_inode);
> -		nresult = lookup_one(mnt_idmap(mnt), &QSTR(nbuf), target_dir);
> +		nresult = lookup_one_unlocked(mnt_idmap(mnt), &QSTR(nbuf), target_dir);
>  		if (!IS_ERR(nresult)) {
>  			if (unlikely(nresult->d_inode != result->d_inode)) {
>  				dput(nresult);
>  				nresult = ERR_PTR(-ESTALE);
>  			}
>  		}
> -		inode_unlock(target_dir->d_inode);
>  		/*
>  		 * At this point we are done with the parent, but it's pinned
>  		 * by the child dentry anyway.

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] Change vfs_mkdir() to unlock on failure.
  2025-06-08 23:09 ` [PATCH 5/5] Change vfs_mkdir() to unlock on failure NeilBrown
  2025-06-09  0:50   ` Al Viro
@ 2025-06-09 12:23   ` Jeff Layton
  2025-06-12  6:59   ` Amir Goldstein
  2 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2025-06-09 12:23 UTC (permalink / raw)
  To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara,
	Chuck Lever, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

On Mon, 2025-06-09 at 09:09 +1000, NeilBrown wrote:
> Proposed changes to directory-op locking will lock the dentry rather
> than the whole directory.  So the dentry will need to be unlocked.
> 
> vfs_mkdir() consumes the dentry on error, so there will be no dentry to
> be unlocked.
> 
> So change vfs_mkdir() to unlock on error as well as releasing the
> dentry.  This requires various other functions in various callers to
> also unlock on error.
> 
> At present this results in some clumsy code.  Once the transition to
> dentry locking is complete the clumsiness will be gone.
> 
> overlayfs looks particularly clumsy as in some cases a double-directory
> rename lock is taken, and a mkdir is then performed in one of the
> directories.  If that fails the other directory must be unlocked.
> 
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/cachefiles/namei.c    | 10 +++++++---
>  fs/ecryptfs/inode.c      |  3 ++-
>  fs/namei.c               | 10 +++++++---
>  fs/nfsd/nfs4recover.c    | 12 +++++++-----
>  fs/nfsd/vfs.c            | 12 ++++++++++--
>  fs/overlayfs/copy_up.c   | 21 +++++++++++++++------
>  fs/overlayfs/dir.c       | 31 +++++++++++++++++++++++--------
>  fs/overlayfs/overlayfs.h |  1 +
>  fs/overlayfs/super.c     | 14 ++++++++++----
>  fs/xfs/scrub/orphanage.c |  2 +-
>  10 files changed, 83 insertions(+), 33 deletions(-)
> 
> diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
> index aecfc5c37b49..6644f0694169 100644
> --- a/fs/cachefiles/namei.c
> +++ b/fs/cachefiles/namei.c
> @@ -131,8 +131,11 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
>  		ret = cachefiles_inject_write_error();
>  		if (ret == 0)
>  			subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
> -		else
> +		else {
> +			/* vfs_mkdir() unlocks on failure so we must too */
> +			inode_unlock(d_inode(dir));
>  			subdir = ERR_PTR(ret);
> +		}
>  		if (IS_ERR(subdir)) {
>  			trace_cachefiles_vfs_error(NULL, d_inode(dir), ret,
>  						   cachefiles_trace_mkdir_error);
> @@ -196,9 +199,10 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
>  	return ERR_PTR(-EBUSY);
>  
>  mkdir_error:
> -	inode_unlock(d_inode(dir));
> -	if (!IS_ERR(subdir))
> +	if (!IS_ERR(subdir)) {
> +		inode_unlock(d_inode(dir));
>  		dput(subdir);
> +	}
>  	pr_err("mkdir %s failed with error %d\n", dirname, ret);
>  	return ERR_PTR(ret);
>  
> diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> index 493d7f194956..c513e912ae3c 100644
> --- a/fs/ecryptfs/inode.c
> +++ b/fs/ecryptfs/inode.c
> @@ -520,7 +520,7 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  				 lower_dentry, mode);
>  	rc = PTR_ERR(lower_dentry);
>  	if (IS_ERR(lower_dentry))
> -		goto out;
> +		goto out_unlocked;
>  	rc = 0;
>  	if (d_unhashed(lower_dentry))
>  		goto out;
> @@ -532,6 +532,7 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  	set_nlink(dir, lower_dir->i_nlink);
>  out:
>  	inode_unlock(lower_dir);
> +out_unlocked:
>  	if (d_really_is_negative(dentry))
>  		d_drop(dentry);
>  	return ERR_PTR(rc);
> diff --git a/fs/namei.c b/fs/namei.c
> index dc42bfac5c57..cefbb681d2f5 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -4163,9 +4163,10 @@ EXPORT_SYMBOL(kern_path_create);
>  
>  void done_path_create(struct path *path, struct dentry *dentry)
>  {
> -	if (!IS_ERR(dentry))
> +	if (!IS_ERR(dentry)) {
>  		dput(dentry);
> -	inode_unlock(path->dentry->d_inode);
> +		inode_unlock(path->dentry->d_inode);
> +	}
>  	mnt_drop_write(path->mnt);
>  	path_put(path);
>  }
> @@ -4328,7 +4329,8 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, umode_t, mode, unsigned, d
>   * negative or unhashes it and possibly splices a different one returning it,
>   * the original dentry is dput() and the alternate is returned.
>   *
> - * In case of an error the dentry is dput() and an ERR_PTR() is returned.
> + * In case of an error the dentry is dput(), the parent is unlocked, and
> + * an ERR_PTR() is returned.
>   */
>  struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  			 struct dentry *dentry, umode_t mode)
> @@ -4366,6 +4368,8 @@ struct dentry *vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
>  	return dentry;
>  
>  err:
> +	/* Caller only needs to unlock if dentry is not an error */
> +	inode_unlock(dir);
>  	dput(dentry);
>  	return ERR_PTR(error);
>  }
> diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> index 82785db730d9..5aedadebebe4 100644
> --- a/fs/nfsd/nfs4recover.c
> +++ b/fs/nfsd/nfs4recover.c
> @@ -222,7 +222,8 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
>  	dentry = lookup_one(&nop_mnt_idmap, &QSTR(dname), dir);
>  	if (IS_ERR(dentry)) {
>  		status = PTR_ERR(dentry);
> -		goto out_unlock;
> +		inode_unlock(d_inode(dir));
> +		goto out;
>  	}
>  	if (d_really_is_positive(dentry))
>  		/*
> @@ -235,13 +236,14 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
>  		 */
>  		goto out_put;
>  	dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
> -	if (IS_ERR(dentry))
> +	if (IS_ERR(dentry)) {
>  		status = PTR_ERR(dentry);
> +		goto out;
> +	}
>  out_put:
> -	if (!status)
> -		dput(dentry);
> -out_unlock:
> +	dput(dentry);
>  	inode_unlock(d_inode(dir));
> +out:
>  	if (status == 0) {
>  		if (nn->in_grace)
>  			__nfsd4_create_reclaim_record_grace(clp, dname,
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index cd689df2ca5d..be29a18a23b2 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1489,7 +1489,9 @@ nfsd_check_ignore_resizing(struct iattr *iap)
>  		iap->ia_valid &= ~ATTR_SIZE;
>  }
>  
> -/* The parent directory should already be locked: */
> +/* The parent directory should already be locked.  The lock
> + * will be dropped on error.
> + */
>  __be32
>  nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  		   struct nfsd_attrs *attrs,
> @@ -1555,8 +1557,11 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  	err = nfsd_create_setattr(rqstp, fhp, resfhp, attrs);
>  
>  out:
> -	if (!IS_ERR(dchild))
> +	if (!IS_ERR(dchild)) {
> +		if (err)
> +			inode_unlock(dirp);
>  		dput(dchild);
> +	}
>  	return err;
>  
>  out_nfserr:
> @@ -1613,6 +1618,9 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  	if (err != nfs_ok)
>  		goto out_unlock;
>  	err = nfsd_create_locked(rqstp, fhp, attrs, type, rdev, resfhp);
> +	if (err)
> +		/* lock will have been dropped */
> +		return err;
>  	fh_fill_post_attrs(fhp);
>  out_unlock:
>  	inode_unlock(dentry->d_inode);
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index d7310fcf3888..324429d02569 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -518,7 +518,7 @@ static int ovl_set_upper_fh(struct ovl_fs *ofs, struct dentry *upper,
>  /*
>   * Create and install index entry.
>   *
> - * Caller must hold i_mutex on indexdir.
> + * Caller must hold i_mutex on indexdir.  It will be unlocked on error.
>   */
>  static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
>  			    struct dentry *upper)
> @@ -539,16 +539,22 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
>  	 * TODO: implement create index for non-dir, so we can call it when
>  	 * encoding file handle for non-dir in case index does not exist.
>  	 */
> -	if (WARN_ON(!d_is_dir(dentry)))
> +	if (WARN_ON(!d_is_dir(dentry))) {
> +		inode_unlock(dir);
>  		return -EIO;
> +	}
>  
>  	/* Directory not expected to be indexed before copy up */
> -	if (WARN_ON(ovl_test_flag(OVL_INDEX, d_inode(dentry))))
> +	if (WARN_ON(ovl_test_flag(OVL_INDEX, d_inode(dentry)))) {
> +		inode_unlock(dir);
>  		return -EIO;
> +	}
>  
>  	err = ovl_get_index_name_fh(fh, &name);
> -	if (err)
> +	if (err) {
> +		inode_unlock(dir);
>  		return err;
> +	}
>  
>  	temp = ovl_create_temp(ofs, indexdir, OVL_CATTR(S_IFDIR | 0));
>  	err = PTR_ERR(temp);
> @@ -567,8 +573,10 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
>  		dput(index);
>  	}
>  out:
> -	if (err)
> +	if (err) {
>  		ovl_cleanup(ofs, dir, temp);
> +		inode_unlock(dir);
> +	}
>  	dput(temp);
>  free_name:
>  	kfree(name.name);
> @@ -781,7 +789,8 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>  	ovl_start_write(c->dentry);
>  	inode_lock(wdir);
>  	temp = ovl_create_temp(ofs, c->workdir, &cattr);
> -	inode_unlock(wdir);
> +	if (!IS_ERR(wdir))
> +		inode_unlock(wdir);
>  	ovl_end_write(c->dentry);
>  	ovl_revert_cu_creds(&cc);
>  
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index fe493f3ed6b6..b4d92b51b63f 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -138,13 +138,16 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct inode *dir,
>  	goto out;
>  }
>  
> +/* dir will be unlocked on failure */
>  struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
>  			       struct dentry *newdentry, struct ovl_cattr *attr)
>  {
>  	int err;
>  
> -	if (IS_ERR(newdentry))
> +	if (IS_ERR(newdentry)) {
> +		inode_unlock(dir);
>  		return newdentry;
> +	}
>  
>  	err = -ESTALE;
>  	if (newdentry->d_inode)
> @@ -189,13 +192,16 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
>  	}
>  out:
>  	if (err) {
> -		if (!IS_ERR(newdentry))
> +		if (!IS_ERR(newdentry)) {
> +			inode_unlock(dir);
>  			dput(newdentry);
> +		}
>  		return ERR_PTR(err);
>  	}
>  	return newdentry;
>  }
>  
> +/* Note workdir will be unlocked on failure */
>  struct dentry *ovl_create_temp(struct ovl_fs *ofs, struct dentry *workdir,
>  			       struct ovl_cattr *attr)
>  {
> @@ -309,7 +315,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
>  				    attr);
>  	err = PTR_ERR(newdentry);
>  	if (IS_ERR(newdentry))
> -		goto out_unlock;
> +		goto out;
>  
>  	if (ovl_type_merge(dentry->d_parent) && d_is_dir(newdentry) &&
>  	    !ovl_allow_offline_changes(ofs)) {
> @@ -323,6 +329,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
>  		goto out_cleanup;
>  out_unlock:
>  	inode_unlock(udir);
> +out:
>  	return err;
>  
>  out_cleanup:
> @@ -367,9 +374,12 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
>  
>  	opaquedir = ovl_create_temp(ofs, workdir, OVL_CATTR(stat.mode));
>  	err = PTR_ERR(opaquedir);
> -	if (IS_ERR(opaquedir))
> -		goto out_unlock;
> -
> +	if (IS_ERR(opaquedir)) {
> +		/* workdir was unlocked, no upperdir */
> +		inode_unlock(upperdir->d_inode);
> +		mutex_unlock(&upperdir->d_sb->s_vfs_rename_mutex);
> +		return ERR_PTR(err);
> +	}
>  	err = ovl_copy_xattr(dentry->d_sb, &upperpath, opaquedir);
>  	if (err)
>  		goto out_cleanup;
> @@ -455,8 +465,13 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>  
>  	newdentry = ovl_create_temp(ofs, workdir, cattr);
>  	err = PTR_ERR(newdentry);
> -	if (IS_ERR(newdentry))
> -		goto out_dput;
> +	if (IS_ERR(newdentry)) {
> +		/* workdir was unlocked, not upperdir */
> +		inode_unlock(upperdir->d_inode);
> +		mutex_unlock(&upperdir->d_sb->s_vfs_rename_mutex);
> +		dput(upper);
> +		goto out;
> +	}
>  
>  	/*
>  	 * mode could have been mutilated due to umask (e.g. sgid directory)
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 8baaba0a3fe5..44df3a2449e7 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -248,6 +248,7 @@ static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs,
>  {
>  	dentry = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
>  	pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(dentry));
> +	/* Note: dir will have been unlocked on failure */
>  	return dentry;
>  }
>  
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index e19940d649ca..5f3267e919dd 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -366,14 +366,17 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
>  			goto out_dput;
>  	} else {
>  		err = PTR_ERR(work);
> +		inode_unlock(dir);
>  		goto out_err;
>  	}
>  out_unlock:
> -	inode_unlock(dir);
> +	if (work && !IS_ERR(work))
> +		inode_unlock(dir);
>  	return work;
>  
>  out_dput:
>  	dput(work);
> +	inode_unlock(dir);
>  out_err:
>  	pr_warn("failed to create directory %s/%s (errno: %i); mounting read-only\n",
>  		ofs->config.workdir, name, -err);
> @@ -569,7 +572,7 @@ static int ovl_check_rename_whiteout(struct ovl_fs *ofs)
>  	temp = ovl_create_temp(ofs, workdir, OVL_CATTR(S_IFREG | 0));
>  	err = PTR_ERR(temp);
>  	if (IS_ERR(temp))
> -		goto out_unlock;
> +		return err;
>  
>  	dest = ovl_lookup_temp(ofs, workdir);
>  	err = PTR_ERR(dest);
> @@ -620,10 +623,13 @@ static struct dentry *ovl_lookup_or_create(struct ovl_fs *ofs,
>  
>  	inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
>  	child = ovl_lookup_upper(ofs, name, parent, len);
> -	if (!IS_ERR(child) && !child->d_inode)
> +	if (!IS_ERR(child) && !child->d_inode) {
>  		child = ovl_create_real(ofs, parent->d_inode, child,
>  					OVL_CATTR(mode));
> -	inode_unlock(parent->d_inode);
> +		if (!IS_ERR(child))
> +			inode_unlock(parent->d_inode);
> +	} else
> +		inode_unlock(parent->d_inode);
>  	dput(parent);
>  
>  	return child;
> diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
> index 9c12cb844231..c95bded4e8a7 100644
> --- a/fs/xfs/scrub/orphanage.c
> +++ b/fs/xfs/scrub/orphanage.c
> @@ -170,7 +170,7 @@ xrep_orphanage_create(
>  					     orphanage_dentry, 0750);
>  		error = PTR_ERR(orphanage_dentry);
>  		if (IS_ERR(orphanage_dentry))
> -			goto out_unlock_root;
> +			goto out_dput_root;
>  	}
>  
>  	/* Not a directory? Bail out. */

The patch looks sane at first glance, but you're correct that it makes
the code uglier for now. Assuming that later cleanup will improve
things:

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] coda: use iterate_dir() in coda_readdir()
  2025-06-09 12:17   ` Jeff Layton
@ 2025-06-09 13:00     ` Jan Kara
  2025-06-09 13:12       ` Jan Harkes
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Kara @ 2025-06-09 13:00 UTC (permalink / raw)
  To: Jeff Layton
  Cc: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara,
	Chuck Lever, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino, linux-fsdevel, coda,
	codalist, linux-nfs, netfs, ecryptfs, linux-unionfs, linux-xfs,
	linux-kernel

On Mon 09-06-25 08:17:15, Jeff Layton wrote:
> On Mon, 2025-06-09 at 09:09 +1000, NeilBrown wrote:
> > The code in coda_readdir() is nearly identical to iterate_dir().
> > Differences are:
> >  - iterate_dir() is killable
> >  - iterate_dir() adds permission checking and accessing notifications
> > 
> > I believe these are not harmful for coda so it is best to use
> > iterate_dir() directly.  This will allow locking changes without
> > touching the code in coda.
> > 
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  fs/coda/dir.c | 12 ++----------
> >  1 file changed, 2 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/coda/dir.c b/fs/coda/dir.c
> > index ab69d8f0cec2..ca9990017265 100644
> > --- a/fs/coda/dir.c
> > +++ b/fs/coda/dir.c
> > @@ -429,17 +429,9 @@ static int coda_readdir(struct file *coda_file, struct dir_context *ctx)
> >  	cfi = coda_ftoc(coda_file);
> >  	host_file = cfi->cfi_container;
> >  
> > -	if (host_file->f_op->iterate_shared) {
> > -		struct inode *host_inode = file_inode(host_file);
> > -		ret = -ENOENT;
> > -		if (!IS_DEADDIR(host_inode)) {
> > -			inode_lock_shared(host_inode);
> > -			ret = host_file->f_op->iterate_shared(host_file, ctx);
> > -			file_accessed(host_file);
> > -			inode_unlock_shared(host_inode);
> > -		}
> > +	ret = iterate_dir(host_file, ctx);
> > +	if (ret != -ENOTDIR)
> >  		return ret;
> > -	}
> >  	/* Venus: we must read Venus dirents from a file */
> >  	return coda_venus_readdir(coda_file, ctx);
> >  }
> 
> 
> Is it already time for my annual ask of "Who the heck is using coda
> these days?" Anyway, this patch looks fine to me.
> 
> Reviewed-by: Jeff Layton <jlayton@kernel.org>

Send a patch proposing deprecating it and we might learn that :) Searching
the web seems to suggest it is indeed pretty close to dead.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] coda: use iterate_dir() in coda_readdir()
  2025-06-09 13:00     ` Jan Kara
@ 2025-06-09 13:12       ` Jan Harkes
  2025-06-09 13:21         ` Jeff Layton
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Harkes @ 2025-06-09 13:12 UTC (permalink / raw)
  To: Jan Kara, Jeff Layton
  Cc: NeilBrown, Alexander Viro, Christian Brauner, Chuck Lever,
	Amir Goldstein, David Howells, Tyler Hicks, Miklos Szeredi,
	Carlos Maiolino, linux-fsdevel, coda, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

There are definitely still users of Coda at CMU, I don't track who else uses it, but it cannot be too many for sure.

At this point it mostly keeps you honest about little locking details in the vfs. There are some tricky details in how the inode mappings are accessed and such. I think that  is helpful for overlay and user filesystems like fuse, overlayfs, etc. but Coda is quite small so it is easy to reason about how it uses these features.

Jan

On June 9, 2025 9:00:31 AM EDT, Jan Kara <jack@suse.cz> wrote:
>On Mon 09-06-25 08:17:15, Jeff Layton wrote:
>> On Mon, 2025-06-09 at 09:09 +1000, NeilBrown wrote:
>> > The code in coda_readdir() is nearly identical to iterate_dir().
>> > Differences are:
>> >  - iterate_dir() is killable
>> >  - iterate_dir() adds permission checking and accessing notifications
>> > 
>> > I believe these are not harmful for coda so it is best to use
>> > iterate_dir() directly.  This will allow locking changes without
>> > touching the code in coda.
>> > 
>> > Signed-off-by: NeilBrown <neil@brown.name>
>> > ---
>> >  fs/coda/dir.c | 12 ++----------
>> >  1 file changed, 2 insertions(+), 10 deletions(-)
>> > 
>> > diff --git a/fs/coda/dir.c b/fs/coda/dir.c
>> > index ab69d8f0cec2..ca9990017265 100644
>> > --- a/fs/coda/dir.c
>> > +++ b/fs/coda/dir.c
>> > @@ -429,17 +429,9 @@ static int coda_readdir(struct file *coda_file, struct dir_context *ctx)
>> >  	cfi = coda_ftoc(coda_file);
>> >  	host_file = cfi->cfi_container;
>> >  
>> > -	if (host_file->f_op->iterate_shared) {
>> > -		struct inode *host_inode = file_inode(host_file);
>> > -		ret = -ENOENT;
>> > -		if (!IS_DEADDIR(host_inode)) {
>> > -			inode_lock_shared(host_inode);
>> > -			ret = host_file->f_op->iterate_shared(host_file, ctx);
>> > -			file_accessed(host_file);
>> > -			inode_unlock_shared(host_inode);
>> > -		}
>> > +	ret = iterate_dir(host_file, ctx);
>> > +	if (ret != -ENOTDIR)
>> >  		return ret;
>> > -	}
>> >  	/* Venus: we must read Venus dirents from a file */
>> >  	return coda_venus_readdir(coda_file, ctx);
>> >  }
>> 
>> 
>> Is it already time for my annual ask of "Who the heck is using coda
>> these days?" Anyway, this patch looks fine to me.
>> 
>> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>
>Send a patch proposing deprecating it and we might learn that :) Searching
>the web seems to suggest it is indeed pretty close to dead.
>
>								Honza

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] coda: use iterate_dir() in coda_readdir()
  2025-06-09 13:12       ` Jan Harkes
@ 2025-06-09 13:21         ` Jeff Layton
  2025-06-09 13:33           ` Jan Harkes
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2025-06-09 13:21 UTC (permalink / raw)
  To: Jan Harkes, Jan Kara
  Cc: NeilBrown, Alexander Viro, Christian Brauner, Chuck Lever,
	Amir Goldstein, David Howells, Tyler Hicks, Miklos Szeredi,
	Carlos Maiolino, linux-fsdevel, coda, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

On Mon, 2025-06-09 at 09:12 -0400, Jan Harkes wrote:
> There are definitely still users of Coda at CMU, I don't track who else uses it, but it cannot be too many for sure.
> 
> At this point it mostly keeps you honest about little locking details in the vfs. There are some tricky details in how the inode mappings are accessed and such. I think that  is helpful for overlay and user filesystems like fuse, overlayfs, etc. but Coda is quite small so it is easy to reason about how it uses these features.
> 
> 

The reason I ask is that it's one of the places that we often have to
do odd fixups like this when making changes to core VFS APIs. It's also
not seen any non-collateral changes since 2021. I'm just wondering
whether it's worth it to keep coda in-tree for so few users.

IIRC, it's also the only fs in the kernel that changes its
inode->i_mapping pointer after inode instantiation. If not for coda, we
could probably replace the i_mapping pointer with some macro wizardry
and shrink struct inode by 8 bytes.


> On June 9, 2025 9:00:31 AM EDT, Jan Kara <jack@suse.cz> wrote:
> > On Mon 09-06-25 08:17:15, Jeff Layton wrote:
> > > On Mon, 2025-06-09 at 09:09 +1000, NeilBrown wrote:
> > > > The code in coda_readdir() is nearly identical to iterate_dir().
> > > > Differences are:
> > > >  - iterate_dir() is killable
> > > >  - iterate_dir() adds permission checking and accessing notifications
> > > > 
> > > > I believe these are not harmful for coda so it is best to use
> > > > iterate_dir() directly.  This will allow locking changes without
> > > > touching the code in coda.
> > > > 
> > > > Signed-off-by: NeilBrown <neil@brown.name>
> > > > ---
> > > >  fs/coda/dir.c | 12 ++----------
> > > >  1 file changed, 2 insertions(+), 10 deletions(-)
> > > > 
> > > > diff --git a/fs/coda/dir.c b/fs/coda/dir.c
> > > > index ab69d8f0cec2..ca9990017265 100644
> > > > --- a/fs/coda/dir.c
> > > > +++ b/fs/coda/dir.c
> > > > @@ -429,17 +429,9 @@ static int coda_readdir(struct file *coda_file, struct dir_context *ctx)
> > > >  	cfi = coda_ftoc(coda_file);
> > > >  	host_file = cfi->cfi_container;
> > > >  
> > > > -	if (host_file->f_op->iterate_shared) {
> > > > -		struct inode *host_inode = file_inode(host_file);
> > > > -		ret = -ENOENT;
> > > > -		if (!IS_DEADDIR(host_inode)) {
> > > > -			inode_lock_shared(host_inode);
> > > > -			ret = host_file->f_op->iterate_shared(host_file, ctx);
> > > > -			file_accessed(host_file);
> > > > -			inode_unlock_shared(host_inode);
> > > > -		}
> > > > +	ret = iterate_dir(host_file, ctx);
> > > > +	if (ret != -ENOTDIR)
> > > >  		return ret;
> > > > -	}
> > > >  	/* Venus: we must read Venus dirents from a file */
> > > >  	return coda_venus_readdir(coda_file, ctx);
> > > >  }
> > > 
> > > 
> > > Is it already time for my annual ask of "Who the heck is using coda
> > > these days?" Anyway, this patch looks fine to me.
> > > 
> > > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> > 
> > Send a patch proposing deprecating it and we might learn that :) Searching
> > the web seems to suggest it is indeed pretty close to dead.
> > 
> > 								Honza

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] coda: use iterate_dir() in coda_readdir()
  2025-06-09 13:21         ` Jeff Layton
@ 2025-06-09 13:33           ` Jan Harkes
  2025-06-09 14:02             ` Jeff Layton
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Harkes @ 2025-06-09 13:33 UTC (permalink / raw)
  To: Jeff Layton, Jan Kara
  Cc: NeilBrown, Alexander Viro, Christian Brauner, Chuck Lever,
	Amir Goldstein, David Howells, Tyler Hicks, Miklos Szeredi,
	Carlos Maiolino, linux-fsdevel, coda, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel



On June 9, 2025 9:21:20 AM EDT, Jeff Layton <jlayton@kernel.org> wrote:
>On Mon, 2025-06-09 at 09:12 -0400, Jan Harkes wrote:
>> There are definitely still users of Coda at CMU, I don't track who else uses it, but it cannot be too many for sure.
>> 
>> At this point it mostly keeps you honest about little locking details in the vfs. There are some tricky details in how the inode mappings are accessed and such. I think that  is helpful for overlay and user filesystems like fuse, overlayfs, etc. but Coda is quite small so it is easy to reason about how it uses these features.
>> 
>> 
>
>The reason I ask is that it's one of the places that we often have to
>do odd fixups like this when making changes to core VFS APIs. It's also
>not seen any non-collateral changes since 2021. I'm just wondering
>whether it's worth it to keep coda in-tree for so few users.

I have no problem maintaining an external module. I already do that to both make development easier and to support building a custom module in case some distro didn't ship with a prebuilt Coda kernel module.

>IIRC, it's also the only fs in the kernel that changes its
>inode->i_mapping pointer after inode instantiation. If not for coda, we
>could probably replace the i_mapping pointer with some macro wizardry
>and shrink struct inode by 8 bytes.

And that would be why an out-of-tree solution will not work in the long run. A small change like that to shave 8 bytes off the inode structure is a fairly easy call to make when there are no in-tree filesystems that use it anymore.

Ultimately it is up to you guys to decide if the extra burden on VFS maintenance is worth it.

Jan

>> On June 9, 2025 9:00:31 AM EDT, Jan Kara <jack@suse.cz> wrote:
>> > On Mon 09-06-25 08:17:15, Jeff Layton wrote:
>> > > On Mon, 2025-06-09 at 09:09 +1000, NeilBrown wrote:
>> > > > The code in coda_readdir() is nearly identical to iterate_dir().
>> > > > Differences are:
>> > > >  - iterate_dir() is killable
>> > > >  - iterate_dir() adds permission checking and accessing notifications
>> > > > 
>> > > > I believe these are not harmful for coda so it is best to use
>> > > > iterate_dir() directly.  This will allow locking changes without
>> > > > touching the code in coda.
>> > > > 
>> > > > Signed-off-by: NeilBrown <neil@brown.name>
>> > > > ---
>> > > >  fs/coda/dir.c | 12 ++----------
>> > > >  1 file changed, 2 insertions(+), 10 deletions(-)
>> > > > 
>> > > > diff --git a/fs/coda/dir.c b/fs/coda/dir.c
>> > > > index ab69d8f0cec2..ca9990017265 100644
>> > > > --- a/fs/coda/dir.c
>> > > > +++ b/fs/coda/dir.c
>> > > > @@ -429,17 +429,9 @@ static int coda_readdir(struct file *coda_file, struct dir_context *ctx)
>> > > >  	cfi = coda_ftoc(coda_file);
>> > > >  	host_file = cfi->cfi_container;
>> > > >  
>> > > > -	if (host_file->f_op->iterate_shared) {
>> > > > -		struct inode *host_inode = file_inode(host_file);
>> > > > -		ret = -ENOENT;
>> > > > -		if (!IS_DEADDIR(host_inode)) {
>> > > > -			inode_lock_shared(host_inode);
>> > > > -			ret = host_file->f_op->iterate_shared(host_file, ctx);
>> > > > -			file_accessed(host_file);
>> > > > -			inode_unlock_shared(host_inode);
>> > > > -		}
>> > > > +	ret = iterate_dir(host_file, ctx);
>> > > > +	if (ret != -ENOTDIR)
>> > > >  		return ret;
>> > > > -	}
>> > > >  	/* Venus: we must read Venus dirents from a file */
>> > > >  	return coda_venus_readdir(coda_file, ctx);
>> > > >  }
>> > > 
>> > > 
>> > > Is it already time for my annual ask of "Who the heck is using coda
>> > > these days?" Anyway, this patch looks fine to me.
>> > > 
>> > > Reviewed-by: Jeff Layton <jlayton@kernel.org>
>> > 
>> > Send a patch proposing deprecating it and we might learn that :) Searching
>> > the web seems to suggest it is indeed pretty close to dead.
>> > 
>> > 								Honza
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] exportfs: use lookup_one_unlocked()
  2025-06-08 23:09 ` [PATCH 4/5] exportfs: use lookup_one_unlocked() NeilBrown
  2025-06-09 12:18   ` Jeff Layton
@ 2025-06-09 14:01   ` Chuck Lever
  1 sibling, 0 replies; 25+ messages in thread
From: Chuck Lever @ 2025-06-09 14:01 UTC (permalink / raw)
  To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Amir Goldstein, Jan Harkes, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, codalist, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

On 6/8/25 7:09 PM, NeilBrown wrote:
> rather than locking the directory and using lookup_one(), just use
> lookup_one_unlocked().  This keeps locking code centralised.
> 
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/exportfs/expfs.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> index cdefea17986a..d3e55de4a2a2 100644
> --- a/fs/exportfs/expfs.c
> +++ b/fs/exportfs/expfs.c
> @@ -549,15 +549,13 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len,
>  			goto err_result;
>  		}
>  
> -		inode_lock(target_dir->d_inode);
> -		nresult = lookup_one(mnt_idmap(mnt), &QSTR(nbuf), target_dir);
> +		nresult = lookup_one_unlocked(mnt_idmap(mnt), &QSTR(nbuf), target_dir);
>  		if (!IS_ERR(nresult)) {
>  			if (unlikely(nresult->d_inode != result->d_inode)) {
>  				dput(nresult);
>  				nresult = ERR_PTR(-ESTALE);
>  			}
>  		}
> -		inode_unlock(target_dir->d_inode);
>  		/*
>  		 * At this point we are done with the parent, but it's pinned
>  		 * by the child dentry anyway.

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] coda: use iterate_dir() in coda_readdir()
  2025-06-09 13:33           ` Jan Harkes
@ 2025-06-09 14:02             ` Jeff Layton
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2025-06-09 14:02 UTC (permalink / raw)
  To: Jan Harkes, Jan Kara
  Cc: NeilBrown, Alexander Viro, Christian Brauner, Chuck Lever,
	Amir Goldstein, David Howells, Tyler Hicks, Miklos Szeredi,
	Carlos Maiolino, linux-fsdevel, coda, linux-nfs, netfs, ecryptfs,
	linux-unionfs, linux-xfs, linux-kernel

On Mon, 2025-06-09 at 09:33 -0400, Jan Harkes wrote:
> 
> On June 9, 2025 9:21:20 AM EDT, Jeff Layton <jlayton@kernel.org> wrote:
> > On Mon, 2025-06-09 at 09:12 -0400, Jan Harkes wrote:
> > > There are definitely still users of Coda at CMU, I don't track who else uses it, but it cannot be too many for sure.
> > > 
> > > At this point it mostly keeps you honest about little locking details in the vfs. There are some tricky details in how the inode mappings are accessed and such. I think that  is helpful for overlay and user filesystems like fuse, overlayfs, etc. but Coda is quite small so it is easy to reason about how it uses these features.
> > > 
> > > 
> > 
> > The reason I ask is that it's one of the places that we often have to
> > do odd fixups like this when making changes to core VFS APIs. It's also
> > not seen any non-collateral changes since 2021. I'm just wondering
> > whether it's worth it to keep coda in-tree for so few users.
> 
> I have no problem maintaining an external module. I already do that to both make development easier and to support building a custom module in case some distro didn't ship with a prebuilt Coda kernel module.
> 
> > IIRC, it's also the only fs in the kernel that changes its
> > inode->i_mapping pointer after inode instantiation. If not for coda, we
> > could probably replace the i_mapping pointer with some macro wizardry
> > and shrink struct inode by 8 bytes.
> 
> And that would be why an out-of-tree solution will not work in the long run. A small change like that to shave 8 bytes off the inode structure is a fairly easy call to make when there are no in-tree filesystems that use it anymore.
> 

Yes. That would break coda as it's currently designed.  Actually, it
looks like DAX replaces i_mapping too, so we'd have to figure out
something there if we wanted to do this anyway.

> Ultimately it is up to you guys to decide if the extra burden on VFS maintenance is worth it.
> 

Fair enough. I don't have any current plans to push for removing it,
but there is a maintenance burden here.

> 
> > > On June 9, 2025 9:00:31 AM EDT, Jan Kara <jack@suse.cz> wrote:
> > > > On Mon 09-06-25 08:17:15, Jeff Layton wrote:
> > > > > On Mon, 2025-06-09 at 09:09 +1000, NeilBrown wrote:
> > > > > > The code in coda_readdir() is nearly identical to iterate_dir().
> > > > > > Differences are:
> > > > > >  - iterate_dir() is killable
> > > > > >  - iterate_dir() adds permission checking and accessing notifications
> > > > > > 
> > > > > > I believe these are not harmful for coda so it is best to use
> > > > > > iterate_dir() directly.  This will allow locking changes without
> > > > > > touching the code in coda.
> > > > > > 
> > > > > > Signed-off-by: NeilBrown <neil@brown.name>
> > > > > > ---
> > > > > >  fs/coda/dir.c | 12 ++----------
> > > > > >  1 file changed, 2 insertions(+), 10 deletions(-)
> > > > > > 
> > > > > > diff --git a/fs/coda/dir.c b/fs/coda/dir.c
> > > > > > index ab69d8f0cec2..ca9990017265 100644
> > > > > > --- a/fs/coda/dir.c
> > > > > > +++ b/fs/coda/dir.c
> > > > > > @@ -429,17 +429,9 @@ static int coda_readdir(struct file *coda_file, struct dir_context *ctx)
> > > > > >  	cfi = coda_ftoc(coda_file);
> > > > > >  	host_file = cfi->cfi_container;
> > > > > >  
> > > > > > -	if (host_file->f_op->iterate_shared) {
> > > > > > -		struct inode *host_inode = file_inode(host_file);
> > > > > > -		ret = -ENOENT;
> > > > > > -		if (!IS_DEADDIR(host_inode)) {
> > > > > > -			inode_lock_shared(host_inode);
> > > > > > -			ret = host_file->f_op->iterate_shared(host_file, ctx);
> > > > > > -			file_accessed(host_file);
> > > > > > -			inode_unlock_shared(host_inode);
> > > > > > -		}
> > > > > > +	ret = iterate_dir(host_file, ctx);
> > > > > > +	if (ret != -ENOTDIR)
> > > > > >  		return ret;
> > > > > > -	}
> > > > > >  	/* Venus: we must read Venus dirents from a file */
> > > > > >  	return coda_venus_readdir(coda_file, ctx);
> > > > > >  }
> > > > > 
> > > > > 
> > > > > Is it already time for my annual ask of "Who the heck is using coda
> > > > > these days?" Anyway, this patch looks fine to me.
> > > > > 
> > > > > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> > > > 
> > > > Send a patch proposing deprecating it and we might learn that :) Searching
> > > > the web seems to suggest it is indeed pretty close to dead.
> > > > 
> > > > 								Honza
> > 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] coda: use iterate_dir() in coda_readdir()
  2025-06-08 23:09 ` [PATCH 3/5] coda: use iterate_dir() in coda_readdir() NeilBrown
  2025-06-09 12:17   ` Jeff Layton
@ 2025-06-09 14:35   ` Jan Harkes
  1 sibling, 0 replies; 25+ messages in thread
From: Jan Harkes @ 2025-06-09 14:35 UTC (permalink / raw)
  To: NeilBrown, Alexander Viro, Christian Brauner, Jan Kara,
	Chuck Lever, Jeff Layton, Amir Goldstein, David Howells,
	Tyler Hicks, Miklos Szeredi, Carlos Maiolino
  Cc: linux-fsdevel, coda, linux-nfs, netfs, ecryptfs, linux-unionfs,
	linux-xfs, linux-kernel

That change is probably good.

The Coda user space always writes directory data to a file, so the normal path always uses coda_venus_readdir.

The iterate_dir code was afaik mostly used while developing the original kernel module during the Linux-2.1 era. It was using a trivial user space helper that would simply re-export an existing filesystem subtree. Sort of like a bind mount before bind mounts existed.

Jan

On June 8, 2025 7:37:25 PM EDT, NeilBrown <neil@brown.name> wrote:
>The code in coda_readdir() is nearly identical to iterate_dir().
>Differences are:
> - iterate_dir() is killable
> - iterate_dir() adds permission checking and accessing notifications
>
>I believe these are not harmful for coda so it is best to use
>iterate_dir() directly.  This will allow locking changes without
>touching the code in coda.
>
>Signed-off-by: NeilBrown <neil@brown.name>
>---
> fs/coda/dir.c | 12 ++----------
> 1 file changed, 2 insertions(+), 10 deletions(-)
>
>diff --git a/fs/coda/dir.c b/fs/coda/dir.c
>index ab69d8f0cec2..ca9990017265 100644
>--- a/fs/coda/dir.c
>+++ b/fs/coda/dir.c
>@@ -429,17 +429,9 @@ static int coda_readdir(struct file *coda_file, struct dir_context *ctx)
> 	cfi = coda_ftoc(coda_file);
> 	host_file = cfi->cfi_container;
> 
>-	if (host_file->f_op->iterate_shared) {
>-		struct inode *host_inode = file_inode(host_file);
>-		ret = -ENOENT;
>-		if (!IS_DEADDIR(host_inode)) {
>-			inode_lock_shared(host_inode);
>-			ret = host_file->f_op->iterate_shared(host_file, ctx);
>-			file_accessed(host_file);
>-			inode_unlock_shared(host_inode);
>-		}
>+	ret = iterate_dir(host_file, ctx);
>+	if (ret != -ENOTDIR)
> 		return ret;
>-	}
> 	/* Venus: we must read Venus dirents from a file */
> 	return coda_venus_readdir(coda_file, ctx);
> }

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] Change vfs_mkdir() to unlock on failure.
  2025-06-09  5:34       ` Al Viro
@ 2025-06-10  8:26         ` Al Viro
  0 siblings, 0 replies; 25+ messages in thread
From: Al Viro @ 2025-06-10  8:26 UTC (permalink / raw)
  To: NeilBrown
  Cc: Christian Brauner, Jan Kara, Chuck Lever, Jeff Layton,
	Amir Goldstein, Jan Harkes, David Howells, Tyler Hicks,
	Miklos Szeredi, Carlos Maiolino, linux-fsdevel, coda, codalist,
	linux-nfs, netfs, ecryptfs, linux-unionfs, linux-xfs,
	linux-kernel

On Mon, Jun 09, 2025 at 06:34:42AM +0100, Al Viro wrote:

> I can't promise a response tonight - going down in an hour or so
> and I'd like to do enough reordering of #work.mount to be able
> to post the initial variant of at least some of that in the
> morning...

Grr...  Sorry, that took longer than I hoped - fun propagate_mnt_busy()
bug had eaten a lot of time ;-/

I'll go through your series when I get up; apologies for delay...

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes
  2025-06-08 23:09 [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes NeilBrown
                   ` (4 preceding siblings ...)
  2025-06-08 23:09 ` [PATCH 5/5] Change vfs_mkdir() to unlock on failure NeilBrown
@ 2025-06-11 11:43 ` Christian Brauner
  2025-06-11 22:35   ` NeilBrown
  5 siblings, 1 reply; 25+ messages in thread
From: Christian Brauner @ 2025-06-11 11:43 UTC (permalink / raw)
  To: NeilBrown
  Cc: Alexander Viro, Jan Kara, Chuck Lever, Jeff Layton,
	Amir Goldstein, Jan Harkes, David Howells, Tyler Hicks,
	Miklos Szeredi, Carlos Maiolino, linux-fsdevel, coda, codalist,
	linux-nfs, netfs, ecryptfs, linux-unionfs, linux-xfs,
	linux-kernel

On Mon, Jun 09, 2025 at 09:09:32AM +1000, NeilBrown wrote:
> The following 5 patches provide further cleanup that serves as
> preparation for some dir-locking API changes that I want to make.  The
> most interesting is the last which makes another change to vfs_mkdir().
> As well as returning the dentry or consuming it on failure (a recent
> change) it now also unlocks on failure.  This will be needed when we
> transition to locking just the dentry, not the whole directory.

All of the patches except the vfs_mkdir() one that Al is looking at
make sense as independent cleanups imho. So I'd take them unless I hear
screams.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes
  2025-06-11 11:43 ` [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes Christian Brauner
@ 2025-06-11 22:35   ` NeilBrown
  0 siblings, 0 replies; 25+ messages in thread
From: NeilBrown @ 2025-06-11 22:35 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexander Viro, Jan Kara, Chuck Lever, Jeff Layton,
	Amir Goldstein, Jan Harkes, David Howells, Tyler Hicks,
	Miklos Szeredi, Carlos Maiolino, linux-fsdevel, coda, codalist,
	linux-nfs, netfs, ecryptfs, linux-unionfs, linux-xfs,
	linux-kernel

On Wed, 11 Jun 2025, Christian Brauner wrote:
> On Mon, Jun 09, 2025 at 09:09:32AM +1000, NeilBrown wrote:
> > The following 5 patches provide further cleanup that serves as
> > preparation for some dir-locking API changes that I want to make.  The
> > most interesting is the last which makes another change to vfs_mkdir().
> > As well as returning the dentry or consuming it on failure (a recent
> > change) it now also unlocks on failure.  This will be needed when we
> > transition to locking just the dentry, not the whole directory.
> 
> All of the patches except the vfs_mkdir() one that Al is looking at
> make sense as independent cleanups imho. So I'd take them unless I hear
> screams.
> 

Thanks.  I'm glad you didn't include the vfs_mkdir() change as I found a
problem in the overlayfs code.  I'm make sure I really understand that
code before trying that one again.

Meanwhile I noticed I have a couple of other mostly-independent
cleanups.  I'll send those.

NeilBrown

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] Change vfs_mkdir() to unlock on failure.
  2025-06-08 23:09 ` [PATCH 5/5] Change vfs_mkdir() to unlock on failure NeilBrown
  2025-06-09  0:50   ` Al Viro
  2025-06-09 12:23   ` Jeff Layton
@ 2025-06-12  6:59   ` Amir Goldstein
  2 siblings, 0 replies; 25+ messages in thread
From: Amir Goldstein @ 2025-06-12  6:59 UTC (permalink / raw)
  To: NeilBrown
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Chuck Lever,
	Jeff Layton, Jan Harkes, David Howells, Tyler Hicks,
	Miklos Szeredi, Carlos Maiolino, linux-fsdevel, coda, codalist,
	linux-nfs, netfs, ecryptfs, linux-unionfs, linux-xfs,
	linux-kernel

On Mon, Jun 9, 2025 at 1:10 AM NeilBrown <neil@brown.name> wrote:
>
> Proposed changes to directory-op locking will lock the dentry rather
> than the whole directory.  So the dentry will need to be unlocked.
>
> vfs_mkdir() consumes the dentry on error, so there will be no dentry to
> be unlocked.
>
> So change vfs_mkdir() to unlock on error as well as releasing the
> dentry.  This requires various other functions in various callers to
> also unlock on error.
>

That's a scary subtle API change to make.
If the change to mkdir API wasn't only in v6.15, that would
have been a lethal backporting bug landmine.
Anyway, a shiny porting.rst comment is due.

> At present this results in some clumsy code.  Once the transition to
> dentry locking is complete the clumsiness will be gone.
>
> overlayfs looks particularly clumsy as in some cases a double-directory
> rename lock is taken, and a mkdir is then performed in one of the
> directories.  If that fails the other directory must be unlocked.

Can some of this mess be abstracted with a helper like
unlock_new_dir(struct dentry *newdir)
which is tolerant to PTR_ERR?

I will refrain from reviewing the ovl patch because you said you found
a bug in it and because I hope it may be easier to review with the
proposed cleanup helper.

>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
...

> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 8baaba0a3fe5..44df3a2449e7 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -248,6 +248,7 @@ static inline struct dentry *ovl_do_mkdir(struct ovl_fs *ofs,
>  {
>         dentry = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
>         pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, PTR_ERR_OR_ZERO(dentry));
> +       /* Note: dir will have been unlocked on failure */
>         return dentry;
>  }
>

Your previous APi change introduced a regression here.
The name will not be printed in the error case.
I will post a fix.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2025-06-12  7:00 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-08 23:09 [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes NeilBrown
2025-06-08 23:09 ` [PATCH 1/5] VFS: merge lookup_one_qstr_excl_raw() back into lookup_one_qstr_excl() NeilBrown
2025-06-09 12:13   ` Jeff Layton
2025-06-08 23:09 ` [PATCH 2/5] VFS: Minor fixes for porting.rst NeilBrown
2025-06-09 12:13   ` Jeff Layton
2025-06-08 23:09 ` [PATCH 3/5] coda: use iterate_dir() in coda_readdir() NeilBrown
2025-06-09 12:17   ` Jeff Layton
2025-06-09 13:00     ` Jan Kara
2025-06-09 13:12       ` Jan Harkes
2025-06-09 13:21         ` Jeff Layton
2025-06-09 13:33           ` Jan Harkes
2025-06-09 14:02             ` Jeff Layton
2025-06-09 14:35   ` Jan Harkes
2025-06-08 23:09 ` [PATCH 4/5] exportfs: use lookup_one_unlocked() NeilBrown
2025-06-09 12:18   ` Jeff Layton
2025-06-09 14:01   ` Chuck Lever
2025-06-08 23:09 ` [PATCH 5/5] Change vfs_mkdir() to unlock on failure NeilBrown
2025-06-09  0:50   ` Al Viro
2025-06-09  5:22     ` NeilBrown
2025-06-09  5:34       ` Al Viro
2025-06-10  8:26         ` Al Viro
2025-06-09 12:23   ` Jeff Layton
2025-06-12  6:59   ` Amir Goldstein
2025-06-11 11:43 ` [PATCH 0/5] Minor cleanup preparation for some dir-locking API changes Christian Brauner
2025-06-11 22:35   ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).