Linux NFS development
 help / color / mirror / Atom feed
* [PATCH 0/8 RFC] tidy up various VFS lookup functions
@ 2025-03-14  0:34 NeilBrown
  2025-03-14  0:34 ` [PATCH 1/8] VFS: improve interface for lookup_one functions NeilBrown
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: NeilBrown @ 2025-03-14  0:34 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, David Howells,
	Chuck Lever, Jeff Layton
  Cc: linux-nfs, netfs, linux-fsdevel

VFS has some functions with names containing "lookup_one_len" and others
without the "_len".  This difference has nothing to do with "len".

The functions without "_len" take a "mnt_idmap" pointer.  This is found
in the "vfsmount" and that is an important question when choosing which
to use: do you have a vfsmount, or are you "inside" the filesystem.  A
related question is "is permission checking relevant here?".

nfsd and cachefiles *do* have a vfsmount but *don't* use the non-_len
functions.  They pass nop_mnt_idmap which is not correct if the vfsmount
is actually idmaped.  For cachefiles it probably (?) doesn't matter as
the accesses to the backing filesystem are always does with elevated privileged (?).

For nfsd it would matter if anyone exported an idmapped filesystem.  I
wonder if anyone has tried...

These patches change the "lookup_one" functions to take a vfsmount
instead of a mnt_idmap because I think that makes the intention clearer.

It also renames the "_one" functions to be "_noperm" and removes the
permission checking completely.  In all cases where they are (correctly)
used permission checking is irrelevant.

I haven't included changes to afs because there are patches in vfs.all
which make a lot of changes to lookup in afs.  I think (if they are seen
as a good idea) these patches should aim to land after the afs patches
and any further fixup in afs can happen then.

The nfsd and cachefiles patches probably should be separate.  Maybe I
should submit those to relevant maintainers first, and one afs,
cachefiles, and nfsd changes have landed I can submit this series with
appropriate modifications.

May main question for review is : have I understood mnt_idmap correctly?

These patches are based on vfs-6.15.async.dir as they touch mkdir
related code.  There is a small conflict with the recently posted patch
to remove locking from try_lookup_one_len() calls.

Thanks,
NeilBrown


 [PATCH 1/8] VFS: improve interface for lookup_one functions
 [PATCH 2/8] nfsd: Use lookup_one() rather than lookup_one_len()
 [PATCH 3/8] nfsd: use correct idmap for all accesses.
 [PATCH 4/8] cachefiles: Use lookup_one() rather than lookup_one_len()
 [PATCH 5/8] cachefiles: use correct mnt_idmap
 [PATCH 6/8] VFS: rename lookup_one_len() family to lookup_noperm()
 [PATCH 7/8] Use try_lookup_noperm() instead of d_hash_and_lookup()
 [PATCH 8/8] VFS: change lookup_one_common and lookup_noperm_common to

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/8] VFS: improve interface for lookup_one functions
  2025-03-14  0:34 [PATCH 0/8 RFC] tidy up various VFS lookup functions NeilBrown
@ 2025-03-14  0:34 ` NeilBrown
  2025-03-14  0:34 ` [PATCH 2/8] nfsd: Use lookup_one() rather than lookup_one_len() NeilBrown
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2025-03-14  0:34 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, David Howells,
	Chuck Lever, Jeff Layton
  Cc: linux-nfs, netfs, linux-fsdevel

The family of functions:
  lookup_one()
  lookup_one_unlocked()
  lookup_one_positive_unlocked()

appear designed to be used by external clients of the filesystem rather
than by filesystems acting on themselves as the lookup_one_len family
are used.

They are used by:
   btrfs/ioctl - which is a user-space interface rather than an internal
     activity
   exportfs - i.e. from nfsd or the open_by_handle_at interface
   overlayfs - at access the underlying filesystems
   smb/server - for file service

They should be used by nfsd (more than just the exportfs path) and
cachefs but aren't - likely because of the need for an "idmap" which
may not be well understood by developers.

It would help if the documentation didn't claim they should "not be
called by generic code" (which is the only place they are called, and if
they accepted a struct vfsmount rather then mnt_idmap.  All callers can
easily provide a vfsmount and developers are likely more familiar with
this object.

Also the path component name is passed as "name" and "len" which are
(confusingly?) separate by the "base".  In some cases the len in simply
"strlen" and so passing a qstr using QSTR() would make the calling
clearer.
Other callers do pass separate name and len which are stored in a
struct.  Sometimes these are already stored in a qstr, other times it
easily could be.

So this patch changes these three functions to receive a 'struct
vfsmount *' and a 'struct qstr', and improves the documentation.

QSTR_LEN() is added to make it easy to pass a QSTR containing a known
len.

This does result in some code duplication in the functions in
fs/namei.c.  Some of this (lookup_one_len_unlocked) will be addressed in
a later patch.

Signed-off-by: NeilBrown <neil@brown.name>
---
 Documentation/filesystems/porting.rst | 10 ++++
 fs/btrfs/ioctl.c                      |  9 ++--
 fs/exportfs/expfs.c                   |  5 +-
 fs/namei.c                            | 77 +++++++++++++++------------
 fs/overlayfs/namei.c                  | 14 ++---
 fs/overlayfs/overlayfs.h              |  3 +-
 fs/overlayfs/readdir.c                |  9 ++--
 fs/smb/server/smb2pdu.c               |  7 +--
 include/linux/dcache.h                |  3 +-
 include/linux/namei.h                 | 13 +++--
 10 files changed, 84 insertions(+), 66 deletions(-)

diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index 6817614e0820..481b2656d7a0 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -1196,3 +1196,13 @@ should use d_drop();d_splice_alias() and return the result of the latter.
 If a positive dentry cannot be returned for some reason, in-kernel
 clients such as cachefiles, nfsd, smb/server may not perform ideally but
 will fail-safe.
+
+---
+
+** mandatory**
+
+lookup_one(), lookup_one_unlocked(), lookup_one_positive_unlocked() now
+take a vfsmount instead of a mnt_idmap, and a qstr instead of a name and len.
+These, not the "one_len" versions, should be used whenever accessing a
+filesystem from outside that filesysmtem, through a mount point - unless
+a permission check has already been performed.
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 6c18bad53cd3..9c8323335fd2 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -911,7 +911,7 @@ static noinline int btrfs_mksubvol(const struct path *parent,
 	if (error == -EINTR)
 		return error;
 
-	dentry = lookup_one(idmap, name, parent->dentry, namelen);
+	dentry = lookup_one(parent->mnt, QSTR_LEN(name, namelen), parent->dentry);
 	error = PTR_ERR(dentry);
 	if (IS_ERR(dentry))
 		goto out_unlock;
@@ -2289,7 +2289,6 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 	struct btrfs_ioctl_vol_args_v2 *vol_args2 = NULL;
 	struct mnt_idmap *idmap = file_mnt_idmap(file);
 	char *subvol_name, *subvol_name_ptr = NULL;
-	int subvol_namelen;
 	int ret = 0;
 	bool destroy_parent = false;
 
@@ -2412,10 +2411,8 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 			goto out;
 	}
 
-	subvol_namelen = strlen(subvol_name);
-
 	if (strchr(subvol_name, '/') ||
-	    strncmp(subvol_name, "..", subvol_namelen) == 0) {
+	    strcmp(subvol_name, "..") == 0) {
 		ret = -EINVAL;
 		goto free_subvol_name;
 	}
@@ -2428,7 +2425,7 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 	ret = down_write_killable_nested(&dir->i_rwsem, I_MUTEX_PARENT);
 	if (ret == -EINTR)
 		goto free_subvol_name;
-	dentry = lookup_one(idmap, subvol_name, parent, subvol_namelen);
+	dentry = lookup_one(file->f_path.mnt, QSTR(subvol_name), parent);
 	if (IS_ERR(dentry)) {
 		ret = PTR_ERR(dentry);
 		goto out_unlock_dir;
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 0c899cfba578..cc9a5c45447d 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -145,7 +145,7 @@ static struct dentry *reconnect_one(struct vfsmount *mnt,
 	if (err)
 		goto out_err;
 	dprintk("%s: found name: %s\n", __func__, nbuf);
-	tmp = lookup_one_unlocked(mnt_idmap(mnt), nbuf, parent, strlen(nbuf));
+	tmp = lookup_one_unlocked(mnt, QSTR(nbuf), parent);
 	if (IS_ERR(tmp)) {
 		dprintk("lookup failed: %ld\n", PTR_ERR(tmp));
 		err = PTR_ERR(tmp);
@@ -551,8 +551,7 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len,
 		}
 
 		inode_lock(target_dir->d_inode);
-		nresult = lookup_one(mnt_idmap(mnt), nbuf,
-				     target_dir, strlen(nbuf));
+		nresult = lookup_one(mnt, QSTR(nbuf), target_dir);
 		if (!IS_ERR(nresult)) {
 			if (unlikely(nresult->d_inode != result->d_inode)) {
 				dput(nresult);
diff --git a/fs/namei.c b/fs/namei.c
index b5abf456c5f4..fb8dbabc5cbf 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2922,19 +2922,17 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
 EXPORT_SYMBOL(lookup_one_len);
 
 /**
- * lookup_one - filesystem helper to lookup single pathname component
- * @idmap:	idmap of the mount the lookup is performed from
- * @name:	pathname component to lookup
+ * lookup_one - lookup single pathname component
+ * @mnt:	the mount the lookup is performed from
+ * @name:	qstr holding pathname component to lookup
  * @base:	base directory to lookup from
- * @len:	maximum length @len should be interpreted to
  *
- * Note that this routine is purely a helper for filesystem usage and should
- * not be called by generic code.
+ * This can be used for in-kernel filesystem clients such as file servers.
  *
  * The caller must hold base->i_mutex.
  */
-struct dentry *lookup_one(struct mnt_idmap *idmap, const char *name,
-			  struct dentry *base, int len)
+struct dentry *lookup_one(struct vfsmount *mnt, struct qstr name,
+			  struct dentry *base)
 {
 	struct dentry *dentry;
 	struct qstr this;
@@ -2942,7 +2940,7 @@ struct dentry *lookup_one(struct mnt_idmap *idmap, const char *name,
 
 	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
 
-	err = lookup_one_common(idmap, name, base, len, &this);
+	err = lookup_one_common(mnt_idmap(mnt), name.name, base, name.len, &this);
 	if (err)
 		return ERR_PTR(err);
 
@@ -2952,27 +2950,24 @@ struct dentry *lookup_one(struct mnt_idmap *idmap, const char *name,
 EXPORT_SYMBOL(lookup_one);
 
 /**
- * lookup_one_unlocked - filesystem helper to lookup single pathname component
- * @idmap:	idmap of the mount the lookup is performed from
- * @name:	pathname component to lookup
+ * lookup_one_unlocked - lookup single pathname component
+ * @mnt:	the mount the lookup is performed from
+ * @name:	qstr olding pathname component to lookup
  * @base:	base directory to lookup from
- * @len:	maximum length @len should be interpreted to
  *
- * Note that this routine is purely a helper for filesystem usage and should
- * not be called by generic code.
+ * This can be used for in-kernel filesystem clients such as file servers.
  *
  * Unlike lookup_one_len, it should be called without the parent
- * i_mutex held, and will take the i_mutex itself if necessary.
+ * i_rwsem held, and will take the i_rwsem itself if necessary.
  */
-struct dentry *lookup_one_unlocked(struct mnt_idmap *idmap,
-				   const char *name, struct dentry *base,
-				   int len)
+struct dentry *lookup_one_unlocked(struct vfsmount *mnt,
+				   struct qstr name, struct dentry *base)
 {
 	struct qstr this;
 	int err;
 	struct dentry *ret;
 
-	err = lookup_one_common(idmap, name, base, len, &this);
+	err = lookup_one_common(mnt_idmap(mnt), name.name, base, name.len, &this);
 	if (err)
 		return ERR_PTR(err);
 
@@ -2984,12 +2979,10 @@ struct dentry *lookup_one_unlocked(struct mnt_idmap *idmap,
 EXPORT_SYMBOL(lookup_one_unlocked);
 
 /**
- * lookup_one_positive_unlocked - filesystem helper to lookup single
- *				  pathname component
- * @idmap:	idmap of the mount the lookup is performed from
- * @name:	pathname component to lookup
+ * lookup_one_positive_unlocked - lookup single pathname component
+ * @mnt:	the mount the lookup is performed from
+ * @name:	qstr holding pathname component to lookup
  * @base:	base directory to lookup from
- * @len:	maximum length @len should be interpreted to
  *
  * This helper will yield ERR_PTR(-ENOENT) on negatives. The helper returns
  * known positive or ERR_PTR(). This is what most of the users want.
@@ -2998,16 +2991,15 @@ EXPORT_SYMBOL(lookup_one_unlocked);
  * time, so callers of lookup_one_unlocked() need to be very careful; pinned
  * positives have >d_inode stable, so this one avoids such problems.
  *
- * Note that this routine is purely a helper for filesystem usage and should
- * not be called by generic code.
+ * This can be used for in-kernel filesystem clients such as file servers.
  *
- * The helper should be called without i_mutex held.
+ * The helper should be called without i_rwsem held.
  */
-struct dentry *lookup_one_positive_unlocked(struct mnt_idmap *idmap,
-					    const char *name,
-					    struct dentry *base, int len)
+struct dentry *lookup_one_positive_unlocked(struct vfsmount *mnt,
+					    struct qstr name,
+					    struct dentry *base)
 {
-	struct dentry *ret = lookup_one_unlocked(idmap, name, base, len);
+	struct dentry *ret = lookup_one_unlocked(mnt, name, base);
 
 	if (!IS_ERR(ret) && d_flags_negative(smp_load_acquire(&ret->d_flags))) {
 		dput(ret);
@@ -3032,7 +3024,18 @@ EXPORT_SYMBOL(lookup_one_positive_unlocked);
 struct dentry *lookup_one_len_unlocked(const char *name,
 				       struct dentry *base, int len)
 {
-	return lookup_one_unlocked(&nop_mnt_idmap, name, base, len);
+	struct qstr this;
+	int err;
+	struct dentry *ret;
+
+	err = lookup_one_common(&nop_mnt_idmap, name, base, len, &this);
+	if (err)
+		return ERR_PTR(err);
+
+	ret = lookup_dcache(&this, base, 0);
+	if (!ret)
+		ret = lookup_slow(&this, base, 0);
+	return ret;
 }
 EXPORT_SYMBOL(lookup_one_len_unlocked);
 
@@ -3047,7 +3050,13 @@ EXPORT_SYMBOL(lookup_one_len_unlocked);
 struct dentry *lookup_positive_unlocked(const char *name,
 				       struct dentry *base, int len)
 {
-	return lookup_one_positive_unlocked(&nop_mnt_idmap, name, base, len);
+	struct dentry *ret = lookup_one_len_unlocked(name, base, len);
+
+	if (!IS_ERR(ret) && d_flags_negative(smp_load_acquire(&ret->d_flags))) {
+		dput(ret);
+		ret = ERR_PTR(-ENOENT);
+	}
+	return ret;
 }
 EXPORT_SYMBOL(lookup_positive_unlocked);
 
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index be5c65d6f848..7ee0c43e6630 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -205,8 +205,9 @@ static struct dentry *ovl_lookup_positive_unlocked(struct ovl_lookup_data *d,
 						   struct dentry *base, int len,
 						   bool drop_negative)
 {
-	struct dentry *ret = lookup_one_unlocked(mnt_idmap(d->layer->mnt), name,
-						 base, len);
+	struct dentry *ret = lookup_one_unlocked(d->layer->mnt,
+						 QSTR_LEN(name, len),
+						 base);
 
 	if (!IS_ERR(ret) && d_flags_negative(smp_load_acquire(&ret->d_flags))) {
 		if (drop_negative && ret->d_lockref.count == 1) {
@@ -789,8 +790,8 @@ struct dentry *ovl_lookup_index(struct ovl_fs *ofs, struct dentry *upper,
 	if (err)
 		return ERR_PTR(err);
 
-	index = lookup_one_positive_unlocked(ovl_upper_mnt_idmap(ofs), name.name,
-					     ofs->workdir, name.len);
+	index = lookup_one_positive_unlocked(ovl_upper_mnt(ofs), name,
+					     ofs->workdir);
 	if (IS_ERR(index)) {
 		err = PTR_ERR(index);
 		if (err == -ENOENT) {
@@ -1394,9 +1395,8 @@ bool ovl_lower_positive(struct dentry *dentry)
 		struct dentry *this;
 		struct ovl_path *parentpath = &ovl_lowerstack(poe)[i];
 
-		this = lookup_one_positive_unlocked(
-				mnt_idmap(parentpath->layer->mnt),
-				name->name, parentpath->dentry, name->len);
+		this = lookup_one_positive_unlocked(parentpath->layer->mnt,
+						    *name, parentpath->dentry);
 		if (IS_ERR(this)) {
 			switch (PTR_ERR(this)) {
 			case -ENOENT:
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 6f2f8f4cfbbc..38978bd4be48 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -402,7 +402,8 @@ static inline struct dentry *ovl_lookup_upper(struct ovl_fs *ofs,
 					      const char *name,
 					      struct dentry *base, int len)
 {
-	return lookup_one(ovl_upper_mnt_idmap(ofs), name, base, len);
+	return lookup_one(ovl_upper_mnt(ofs),
+			  QSTR_LEN(name, len), base);
 }
 
 static inline bool ovl_open_flags_need_copy_up(int flags)
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 881ec5592da5..aaba9777f3bf 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -271,7 +271,6 @@ static bool ovl_fill_merge(struct dir_context *ctx, const char *name,
 static int ovl_check_whiteouts(const struct path *path, struct ovl_readdir_data *rdd)
 {
 	int err;
-	struct ovl_cache_entry *p;
 	struct dentry *dentry, *dir = path->dentry;
 	const struct cred *old_cred;
 
@@ -280,9 +279,11 @@ static int ovl_check_whiteouts(const struct path *path, struct ovl_readdir_data
 	err = down_write_killable(&dir->d_inode->i_rwsem);
 	if (!err) {
 		while (rdd->first_maybe_whiteout) {
-			p = rdd->first_maybe_whiteout;
+			struct ovl_cache_entry *p =
+				rdd->first_maybe_whiteout;
 			rdd->first_maybe_whiteout = p->next_maybe_whiteout;
-			dentry = lookup_one(mnt_idmap(path->mnt), p->name, dir, p->len);
+			dentry = lookup_one(path->mnt,
+					    QSTR_LEN(p->name, p->len), dir);
 			if (!IS_ERR(dentry)) {
 				p->is_whiteout = ovl_is_whiteout(dentry);
 				dput(dentry);
@@ -492,7 +493,7 @@ static int ovl_cache_update(const struct path *path, struct ovl_cache_entry *p,
 		}
 	}
 	/* This checks also for xwhiteouts */
-	this = lookup_one(mnt_idmap(path->mnt), p->name, dir, p->len);
+	this = lookup_one(path->mnt, QSTR_LEN(p->name, p->len), dir);
 	if (IS_ERR_OR_NULL(this) || !this->d_inode) {
 		/* Mark a stale entry */
 		p->is_whiteout = true;
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
index f1efcd027475..9c699bbe3d47 100644
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -4091,9 +4091,10 @@ static int process_query_dir_entries(struct smb2_query_dir_private *priv)
 			return -EINVAL;
 
 		lock_dir(priv->dir_fp);
-		dent = lookup_one(idmap, priv->d_info->name,
-				  priv->dir_fp->filp->f_path.dentry,
-				  priv->d_info->name_len);
+		dent = lookup_one(priv->dir_fp->filp->f_path.mnt,
+				  QSTR_LEN(priv->d_info->name,
+					   priv->d_info->name_len),
+				  priv->dir_fp->filp->f_path.dentry);
 		unlock_dir(priv->dir_fp);
 
 		if (IS_ERR(dent)) {
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 45bff10d3773..1f01f4e734c5 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -57,7 +57,8 @@ struct qstr {
 };
 
 #define QSTR_INIT(n,l) { { { .len = l } }, .name = n }
-#define QSTR(n) (struct qstr)QSTR_INIT(n, strlen(n))
+#define QSTR_LEN(n,l) (struct qstr)QSTR_INIT(n,l)
+#define QSTR(n) QSTR_LEN(n, strlen(n))
 
 extern const struct qstr empty_name;
 extern const struct qstr slash_name;
diff --git a/include/linux/namei.h b/include/linux/namei.h
index e3042176cdf4..5df2c1a4af25 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -73,13 +73,12 @@ extern struct dentry *try_lookup_one_len(const char *, struct dentry *, int);
 extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
 extern struct dentry *lookup_one_len_unlocked(const char *, struct dentry *, int);
 extern struct dentry *lookup_positive_unlocked(const char *, struct dentry *, int);
-struct dentry *lookup_one(struct mnt_idmap *, const char *, struct dentry *, int);
-struct dentry *lookup_one_unlocked(struct mnt_idmap *idmap,
-				   const char *name, struct dentry *base,
-				   int len);
-struct dentry *lookup_one_positive_unlocked(struct mnt_idmap *idmap,
-					    const char *name,
-					    struct dentry *base, int len);
+struct dentry *lookup_one(struct vfsmount *, struct qstr, struct dentry *);
+struct dentry *lookup_one_unlocked(struct vfsmount *mnt,
+				   struct qstr name, struct dentry *base);
+struct dentry *lookup_one_positive_unlocked(struct vfsmount *mnt,
+					    struct qstr name,
+					    struct dentry *base);
 
 extern int follow_down_one(struct path *);
 extern int follow_down(struct path *path, unsigned int flags);
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/8] nfsd: Use lookup_one() rather than lookup_one_len()
  2025-03-14  0:34 [PATCH 0/8 RFC] tidy up various VFS lookup functions NeilBrown
  2025-03-14  0:34 ` [PATCH 1/8] VFS: improve interface for lookup_one functions NeilBrown
@ 2025-03-14  0:34 ` NeilBrown
  2025-03-14  0:34 ` [PATCH 3/8] nfsd: use correct idmap for all accesses NeilBrown
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2025-03-14  0:34 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, David Howells,
	Chuck Lever, Jeff Layton
  Cc: linux-nfs, netfs, linux-fsdevel

lookup_one_len() does not support idmapped mounts and does permission
checking on the non-mapped uids.  This means that nfsd cannot correctly
export idmapped mounts.

This patch changes to use lookup_one(), lookup_one_unlocked(), and
lookup_one_positive_unlocked() passing the relevant vfsmount so
idmapping can be honoured.

This requires passing the name in a qstr.  Currently this is a little
clumsy, but if nfsd is changed to use qstr more broadly it will result
in a net improvement.

Note that there are still many places where nfsd uses nop_mnt_idmap so
more work is needed to properly support idmapped mounts.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/nfsd/nfs3proc.c    |  3 ++-
 fs/nfsd/nfs3xdr.c     |  4 +++-
 fs/nfsd/nfs4proc.c    |  4 +++-
 fs/nfsd/nfs4recover.c | 13 +++++++------
 fs/nfsd/nfs4xdr.c     |  4 +++-
 fs/nfsd/nfsproc.c     |  5 +++--
 fs/nfsd/vfs.c         | 17 +++++++++--------
 fs/nfsd/vfs.h         | 12 ++++++++++++
 8 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 372bdcf5e07a..457638bf0f32 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -284,7 +284,8 @@ nfsd3_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp,
 
 	inode_lock_nested(inode, I_MUTEX_PARENT);
 
-	child = lookup_one_len(argp->name, parent, argp->len);
+	child = lookup_one(fh_mnt(fhp), QSTR_LEN(argp->name, argp->len),
+			   parent);
 	if (IS_ERR(child)) {
 		status = nfserrno(PTR_ERR(child));
 		goto out;
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index a7a07470c1f8..54fcc814e398 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -1001,7 +1001,9 @@ compose_entry_fh(struct nfsd3_readdirres *cd, struct svc_fh *fhp,
 		} else
 			dchild = dget(dparent);
 	} else
-		dchild = lookup_positive_unlocked(name, dparent, namlen);
+		dchild = lookup_one_positive_unlocked(fh_mnt(fhp),
+						      QSTR_LEN(name, namlen),
+						      dparent);
 	if (IS_ERR(dchild))
 		return rv;
 	if (d_mountpoint(dchild))
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index f6e06c779d09..9e29663aaeb1 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -266,7 +266,9 @@ nfsd4_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp,
 
 	inode_lock_nested(inode, I_MUTEX_PARENT);
 
-	child = lookup_one_len(open->op_fname, parent, open->op_fnamelen);
+	child = lookup_one(fh_mnt(fhp),
+			   QSTR_LEN(open->op_fname, open->op_fnamelen),
+			   parent);
 	if (IS_ERR(child)) {
 		status = nfserrno(PTR_ERR(child));
 		goto out;
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index c1d9bd07285f..10d24bec532f 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -218,7 +218,7 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
 	/* lock the parent */
 	inode_lock(d_inode(dir));
 
-	dentry = lookup_one_len(dname, dir, HEXDIR_LEN-1);
+	dentry = lookup_one(nn->rec_file->f_path.mnt, QSTR(dname), dir);
 	if (IS_ERR(dentry)) {
 		status = PTR_ERR(dentry);
 		goto out_unlock;
@@ -316,7 +316,8 @@ nfsd4_list_rec_dir(recdir_func *f, struct nfsd_net *nn)
 	list_for_each_entry_safe(entry, tmp, &ctx.names, list) {
 		if (!status) {
 			struct dentry *dentry;
-			dentry = lookup_one_len(entry->name, dir, HEXDIR_LEN-1);
+			dentry = lookup_one(nn->rec_file->f_path.mnt,
+					    QSTR(entry->name), dir);
 			if (IS_ERR(dentry)) {
 				status = PTR_ERR(dentry);
 				break;
@@ -339,16 +340,16 @@ nfsd4_list_rec_dir(recdir_func *f, struct nfsd_net *nn)
 }
 
 static int
-nfsd4_unlink_clid_dir(char *name, int namlen, struct nfsd_net *nn)
+nfsd4_unlink_clid_dir(char *name, struct nfsd_net *nn)
 {
 	struct dentry *dir, *dentry;
 	int status;
 
-	dprintk("NFSD: nfsd4_unlink_clid_dir. name %.*s\n", namlen, name);
+	dprintk("NFSD: nfsd4_unlink_clid_dir. name %s\n", name);
 
 	dir = nn->rec_file->f_path.dentry;
 	inode_lock_nested(d_inode(dir), I_MUTEX_PARENT);
-	dentry = lookup_one_len(name, dir, namlen);
+	dentry = lookup_one(nn->rec_file->f_path.mnt, QSTR(name), dir);
 	if (IS_ERR(dentry)) {
 		status = PTR_ERR(dentry);
 		goto out_unlock;
@@ -408,7 +409,7 @@ nfsd4_remove_clid_dir(struct nfs4_client *clp)
 	if (status < 0)
 		goto out_drop_write;
 
-	status = nfsd4_unlink_clid_dir(dname, HEXDIR_LEN-1, nn);
+	status = nfsd4_unlink_clid_dir(dname, nn);
 	nfs4_reset_creds(original_cred);
 	if (status == 0) {
 		vfs_fsync(nn->rec_file, 0);
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index e67420729ecd..64ab2c605e93 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3812,7 +3812,9 @@ nfsd4_encode_entry4_fattr(struct nfsd4_readdir *cd, const char *name,
 	__be32 nfserr;
 	int ignore_crossmnt = 0;
 
-	dentry = lookup_positive_unlocked(name, cd->rd_fhp->fh_dentry, namlen);
+	dentry = lookup_one_positive_unlocked(fh_mnt(cd->rd_fhp),
+					      QSTR_LEN(name, namlen),
+					      cd->rd_fhp->fh_dentry);
 	if (IS_ERR(dentry))
 		return nfserrno(PTR_ERR(dentry));
 
diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
index 6dda081eb24c..a84719935690 100644
--- a/fs/nfsd/nfsproc.c
+++ b/fs/nfsd/nfsproc.c
@@ -312,7 +312,8 @@ nfsd_proc_create(struct svc_rqst *rqstp)
 	}
 
 	inode_lock_nested(dirfhp->fh_dentry->d_inode, I_MUTEX_PARENT);
-	dchild = lookup_one_len(argp->name, dirfhp->fh_dentry, argp->len);
+	dchild = lookup_one(fh_mnt(dirfhp), QSTR_LEN(argp->name, argp->len),
+			    dirfhp->fh_dentry);
 	if (IS_ERR(dchild)) {
 		resp->status = nfserrno(PTR_ERR(dchild));
 		goto out_unlock;
@@ -331,7 +332,7 @@ nfsd_proc_create(struct svc_rqst *rqstp)
 		 */
 		resp->status = nfserr_acces;
 		if (!newfhp->fh_dentry) {
-			printk(KERN_WARNING 
+			printk(KERN_WARNING
 				"nfsd_proc_create: file handle not verified\n");
 			goto out_unlock;
 		}
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 34d7aa531662..2829799bcd08 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -265,7 +265,8 @@ nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp,
 				goto out_nfserr;
 		}
 	} else {
-		dentry = lookup_one_len_unlocked(name, dparent, len);
+		dentry = lookup_one_unlocked(fh_mnt(fhp), QSTR_LEN(name, len),
+					     dparent);
 		host_err = PTR_ERR(dentry);
 		if (IS_ERR(dentry))
 			goto out_nfserr;
@@ -923,7 +924,7 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
 	 * directories, but we never have and it doesn't seem to have
 	 * caused anyone a problem.  If we were to change this, note
 	 * also that our filldir callbacks would need a variant of
-	 * lookup_one_len that doesn't check permissions.
+	 * lookup_one_positive_unlocked() that doesn't check permissions.
 	 */
 	if (type == S_IFREG)
 		may_flags |= NFSD_MAY_OWNER_OVERRIDE;
@@ -1555,7 +1556,7 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		return nfserrno(host_err);
 
 	inode_lock_nested(dentry->d_inode, I_MUTEX_PARENT);
-	dchild = lookup_one_len(fname, dentry, flen);
+	dchild = lookup_one(fh_mnt(fhp), QSTR_LEN(fname, flen), dentry);
 	host_err = PTR_ERR(dchild);
 	if (IS_ERR(dchild)) {
 		err = nfserrno(host_err);
@@ -1660,7 +1661,7 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp,
 
 	dentry = fhp->fh_dentry;
 	inode_lock_nested(dentry->d_inode, I_MUTEX_PARENT);
-	dnew = lookup_one_len(fname, dentry, flen);
+	dnew = lookup_one(fh_mnt(fhp), QSTR_LEN(fname, flen), dentry);
 	if (IS_ERR(dnew)) {
 		err = nfserrno(PTR_ERR(dnew));
 		inode_unlock(dentry->d_inode);
@@ -1726,7 +1727,7 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
 	dirp = d_inode(ddir);
 	inode_lock_nested(dirp, I_MUTEX_PARENT);
 
-	dnew = lookup_one_len(name, ddir, len);
+	dnew = lookup_one(fh_mnt(ffhp), QSTR_LEN(name, len), ddir);
 	if (IS_ERR(dnew)) {
 		err = nfserrno(PTR_ERR(dnew));
 		goto out_unlock;
@@ -1839,7 +1840,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
 	if (err != nfs_ok)
 		goto out_unlock;
 
-	odentry = lookup_one_len(fname, fdentry, flen);
+	odentry = lookup_one(fh_mnt(ffhp), QSTR_LEN(fname, flen), fdentry);
 	host_err = PTR_ERR(odentry);
 	if (IS_ERR(odentry))
 		goto out_nfserr;
@@ -1851,7 +1852,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
 	if (odentry == trap)
 		goto out_dput_old;
 
-	ndentry = lookup_one_len(tname, tdentry, tlen);
+	ndentry = lookup_one(fh_mnt(tfhp), QSTR_LEN(tname, tlen), tdentry);
 	host_err = PTR_ERR(ndentry);
 	if (IS_ERR(ndentry))
 		goto out_dput_old;
@@ -1948,7 +1949,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 	dirp = d_inode(dentry);
 	inode_lock_nested(dirp, I_MUTEX_PARENT);
 
-	rdentry = lookup_one_len(fname, dentry, flen);
+	rdentry = lookup_one(fh_mnt(fhp), QSTR_LEN(fname, flen), dentry);
 	host_err = PTR_ERR(rdentry);
 	if (IS_ERR(rdentry))
 		goto out_unlock;
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index f9b09b842856..5a60004468b8 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -193,4 +193,16 @@ static inline __be32 fh_getattr(const struct svc_fh *fh, struct kstat *stat)
 				    AT_STATX_SYNC_AS_STAT));
 }
 
+/**
+ * fh_mnt - access vfsmount of a given file handle
+ * @fhp:  the filehandle
+ *
+ * Returns the struct vfsmount from the export referenced in the
+ * filehandle.
+ */
+static inline struct vfsmount *fh_mnt(struct svc_fh *fhp)
+{
+	return fhp->fh_export->ex_path.mnt;
+}
+
 #endif /* LINUX_NFSD_VFS_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/8] nfsd: use correct idmap for all accesses.
  2025-03-14  0:34 [PATCH 0/8 RFC] tidy up various VFS lookup functions NeilBrown
  2025-03-14  0:34 ` [PATCH 1/8] VFS: improve interface for lookup_one functions NeilBrown
  2025-03-14  0:34 ` [PATCH 2/8] nfsd: Use lookup_one() rather than lookup_one_len() NeilBrown
@ 2025-03-14  0:34 ` NeilBrown
  2025-03-14  0:34 ` [PATCH 4/8] cachefiles: Use lookup_one() rather than lookup_one_len() NeilBrown
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2025-03-14  0:34 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, David Howells,
	Chuck Lever, Jeff Layton
  Cc: linux-nfs, netfs, linux-fsdevel

When accessing the exported filesystem, or the filesystem storing
state-recovery data, we should use the idmap associated with the mount,
or incorrect behaviour could eventuate of an idmapped filesystem were in
use.

This patch adds fh_idmap() to return the mnt_idmap for a given svc_fh()
and uses that or other means to provide the correct mnt_idmap.  nfsd no
longer users nop_mnt_idmap.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/nfsd/nfs2acl.c     |  4 ++--
 fs/nfsd/nfs3acl.c     |  4 ++--
 fs/nfsd/nfs3proc.c    |  2 +-
 fs/nfsd/nfs4recover.c |  7 ++++---
 fs/nfsd/nfs4state.c   |  9 ++++++---
 fs/nfsd/nfs4xdr.c     |  2 +-
 fs/nfsd/state.h       |  4 +++-
 fs/nfsd/vfs.h         | 12 ++++++++++++
 8 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/fs/nfsd/nfs2acl.c b/fs/nfsd/nfs2acl.c
index 5fb202acb0fd..cb15d2d0dd50 100644
--- a/fs/nfsd/nfs2acl.c
+++ b/fs/nfsd/nfs2acl.c
@@ -115,11 +115,11 @@ static __be32 nfsacld_proc_setacl(struct svc_rqst *rqstp)
 
 	inode_lock(inode);
 
-	error = set_posix_acl(&nop_mnt_idmap, fh->fh_dentry, ACL_TYPE_ACCESS,
+	error = set_posix_acl(fh_idmap(fh), fh->fh_dentry, ACL_TYPE_ACCESS,
 			      argp->acl_access);
 	if (error)
 		goto out_drop_lock;
-	error = set_posix_acl(&nop_mnt_idmap, fh->fh_dentry, ACL_TYPE_DEFAULT,
+	error = set_posix_acl(fh_idmap(fh), fh->fh_dentry, ACL_TYPE_DEFAULT,
 			      argp->acl_default);
 	if (error)
 		goto out_drop_lock;
diff --git a/fs/nfsd/nfs3acl.c b/fs/nfsd/nfs3acl.c
index 7b5433bd3019..2e92a5673021 100644
--- a/fs/nfsd/nfs3acl.c
+++ b/fs/nfsd/nfs3acl.c
@@ -105,11 +105,11 @@ static __be32 nfsd3_proc_setacl(struct svc_rqst *rqstp)
 
 	inode_lock(inode);
 
-	error = set_posix_acl(&nop_mnt_idmap, fh->fh_dentry, ACL_TYPE_ACCESS,
+	error = set_posix_acl(fh_idmap(fh), fh->fh_dentry, ACL_TYPE_ACCESS,
 			      argp->acl_access);
 	if (error)
 		goto out_drop_lock;
-	error = set_posix_acl(&nop_mnt_idmap, fh->fh_dentry, ACL_TYPE_DEFAULT,
+	error = set_posix_acl(fh_idmap(fh), fh->fh_dentry, ACL_TYPE_DEFAULT,
 			      argp->acl_default);
 
 out_drop_lock:
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 457638bf0f32..d32ce5956ca0 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -344,7 +344,7 @@ nfsd3_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	status = fh_fill_pre_attrs(fhp);
 	if (status != nfs_ok)
 		goto out;
-	host_err = vfs_create(&nop_mnt_idmap, inode, child, iap->ia_mode, true);
+	host_err = vfs_create(fh_idmap(fhp), inode, child, iap->ia_mode, true);
 	if (host_err < 0) {
 		status = nfserrno(host_err);
 		goto out;
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 10d24bec532f..f8fe23941873 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -233,7 +233,8 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
 		 * as well be forgiving and just succeed silently.
 		 */
 		goto out_put;
-	dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
+	dentry = vfs_mkdir(mnt_idmap(nn->rec_file->f_path.mnt), d_inode(dir),
+				  dentry, S_IRWXU);
 	if (IS_ERR(dentry))
 		status = PTR_ERR(dentry);
 out_put:
@@ -357,7 +358,7 @@ nfsd4_unlink_clid_dir(char *name, struct nfsd_net *nn)
 	status = -ENOENT;
 	if (d_really_is_negative(dentry))
 		goto out;
-	status = vfs_rmdir(&nop_mnt_idmap, d_inode(dir), dentry);
+	status = vfs_rmdir(mnt_idmap(nn->rec_file->f_path.mnt), d_inode(dir), dentry);
 out:
 	dput(dentry);
 out_unlock:
@@ -447,7 +448,7 @@ purge_old(struct dentry *parent, struct dentry *child, struct nfsd_net *nn)
 	if (nfs4_has_reclaimed_state(name, nn))
 		goto out_free;
 
-	status = vfs_rmdir(&nop_mnt_idmap, d_inode(parent), child);
+	status = vfs_rmdir(mnt_idmap(nn->rec_file->f_path.mnt), d_inode(parent), child);
 	if (status)
 		printk("failed to remove client recovery directory %pd\n",
 				child);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 153eeea2c7c9..1796a6aeedd8 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -9079,7 +9079,8 @@ static bool set_cb_time(struct timespec64 *cb, const struct timespec64 *orig,
 	return true;
 }
 
-static int cb_getattr_update_times(struct dentry *dentry, struct nfs4_delegation *dp)
+static int cb_getattr_update_times(struct vfsmount *mnt, struct dentry *dentry,
+				   struct nfs4_delegation *dp)
 {
 	struct inode *inode = d_inode(dentry);
 	struct timespec64 now = current_time(inode);
@@ -9111,7 +9112,7 @@ static int cb_getattr_update_times(struct dentry *dentry, struct nfs4_delegation
 
 	attrs.ia_valid |= ATTR_DELEG;
 	inode_lock(inode);
-	ret = notify_change(&nop_mnt_idmap, dentry, &attrs, NULL);
+	ret = notify_change(mnt_idmap(mnt), dentry, &attrs, NULL);
 	inode_unlock(inode);
 	return ret;
 }
@@ -9120,6 +9121,7 @@ static int cb_getattr_update_times(struct dentry *dentry, struct nfs4_delegation
  * nfsd4_deleg_getattr_conflict - Recall if GETATTR causes conflict
  * @rqstp: RPC transaction context
  * @dentry: dentry of inode to be checked for a conflict
+ * @exp: svc_export being accessed
  * @pdp: returned WRITE delegation, if one was found
  *
  * This function is called when there is a conflict between a write
@@ -9135,6 +9137,7 @@ static int cb_getattr_update_times(struct dentry *dentry, struct nfs4_delegation
  */
 __be32
 nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct dentry *dentry,
+			     struct svc_export *exp,
 			     struct nfs4_delegation **pdp)
 {
 	__be32 status;
@@ -9203,7 +9206,7 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct dentry *dentry,
 		 * not update the file's metadata with the client's
 		 * modified size
 		 */
-		err = cb_getattr_update_times(dentry, dp);
+		err = cb_getattr_update_times(exp->ex_path.mnt, dentry, dp);
 		if (err) {
 			status = nfserrno(err);
 			goto out_status;
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 64ab2c605e93..e7c87653e979 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3618,7 +3618,7 @@ nfsd4_encode_fattr4(struct svc_rqst *rqstp, struct xdr_stream *xdr,
 	    (attrmask[1] & (FATTR4_WORD1_TIME_ACCESS |
 			    FATTR4_WORD1_TIME_MODIFY |
 			    FATTR4_WORD1_TIME_METADATA))) {
-		status = nfsd4_deleg_getattr_conflict(rqstp, dentry, &dp);
+		status = nfsd4_deleg_getattr_conflict(rqstp, dentry, exp, &dp);
 		if (status)
 			goto out;
 	}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 74d2d7b42676..8ef2bec43afa 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -826,5 +826,7 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
 }
 
 extern __be32 nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp,
-		struct dentry *dentry, struct nfs4_delegation **pdp);
+					   struct dentry *dentry,
+					   struct svc_export *exp,
+					   struct nfs4_delegation **pdp);
 #endif   /* NFSD4_STATE_H */
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 5a60004468b8..1bb75d740427 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -205,4 +205,16 @@ static inline struct vfsmount *fh_mnt(struct svc_fh *fhp)
 	return fhp->fh_export->ex_path.mnt;
 }
 
+/**
+ * fh_idmap - access idmap of vfsmount of a given file handle
+ * @fhp:  the filehandle
+ *
+ * Returns the struct idmap from the vfsmount of the export referenced in the
+ * filehandle.
+ */
+static inline struct mnt_idmap *fh_idmap(struct svc_fh *fhp)
+{
+	return mnt_idmap(fh_mnt(fhp));;
+}
+
 #endif /* LINUX_NFSD_VFS_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/8] cachefiles: Use lookup_one() rather than lookup_one_len()
  2025-03-14  0:34 [PATCH 0/8 RFC] tidy up various VFS lookup functions NeilBrown
                   ` (2 preceding siblings ...)
  2025-03-14  0:34 ` [PATCH 3/8] nfsd: use correct idmap for all accesses NeilBrown
@ 2025-03-14  0:34 ` NeilBrown
  2025-03-14  0:34 ` [PATCH 5/8] cachefiles: use correct mnt_idmap NeilBrown
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2025-03-14  0:34 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, David Howells,
	Chuck Lever, Jeff Layton
  Cc: linux-nfs, netfs, linux-fsdevel

lookup_one_len() does not support idmapped mounts and does permission
checking on the non-mapped uids.  This means that cachefiles cannot
correctly use idmapped mounts as a backing store.

This patch changes to use lookup_one() and lookup_one_positive_unlocked(),
passing the relevant vfsmount so idmapping can be honoured.

This requires passing the name in a qstr.  This is easily done with
QSTR() as the name is always nul terminated, and often strlen is used
anyway.  ->d_name_len is removed as no longer useful.

Note that there are still many places where cachefiles uses
nop_mnt_idmap so more work is needed to properly support idmapped
mounts.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/cachefiles/internal.h |  1 -
 fs/cachefiles/key.c      |  1 -
 fs/cachefiles/namei.c    | 14 +++++++-------
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 38c236e38cef..b62cd3e9a18e 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -71,7 +71,6 @@ struct cachefiles_object {
 	int				debug_id;
 	spinlock_t			lock;
 	refcount_t			ref;
-	u8				d_name_len;	/* Length of filename */
 	enum cachefiles_content		content_info:8;	/* Info about content presence */
 	unsigned long			flags;
 #define CACHEFILES_OBJECT_USING_TMPFILE	0		/* Have an unlinked tmpfile */
diff --git a/fs/cachefiles/key.c b/fs/cachefiles/key.c
index bf935e25bdbe..4927b533b9ae 100644
--- a/fs/cachefiles/key.c
+++ b/fs/cachefiles/key.c
@@ -132,7 +132,6 @@ bool cachefiles_cook_key(struct cachefiles_object *object)
 success:
 	name[len] = 0;
 	object->d_name = name;
-	object->d_name_len = len;
 	_leave(" = %s", object->d_name);
 	return true;
 }
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 83a60126de0f..a440a2ff5d41 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -98,7 +98,7 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 retry:
 	ret = cachefiles_inject_read_error();
 	if (ret == 0)
-		subdir = lookup_one_len(dirname, dir, strlen(dirname));
+		subdir = lookup_one(cache->mnt, QSTR(dirname), dir);
 	else
 		subdir = ERR_PTR(ret);
 	trace_cachefiles_lookup(NULL, dir, subdir);
@@ -337,7 +337,7 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 		return -EIO;
 	}
 
-	grave = lookup_one_len(nbuffer, cache->graveyard, strlen(nbuffer));
+	grave = lookup_one(cache->mnt, QSTR(nbuffer), cache->graveyard);
 	if (IS_ERR(grave)) {
 		unlock_rename(cache->graveyard, dir);
 		trace_cachefiles_vfs_error(object, d_inode(cache->graveyard),
@@ -629,8 +629,8 @@ bool cachefiles_look_up_object(struct cachefiles_object *object)
 	/* Look up path "cache/vol/fanout/file". */
 	ret = cachefiles_inject_read_error();
 	if (ret == 0)
-		dentry = lookup_positive_unlocked(object->d_name, fan,
-						  object->d_name_len);
+		dentry = lookup_one_positive_unlocked(volume->cache->mnt,
+						      QSTR(object->d_name), fan);
 	else
 		dentry = ERR_PTR(ret);
 	trace_cachefiles_lookup(object, fan, dentry);
@@ -682,7 +682,7 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 	inode_lock_nested(d_inode(fan), I_MUTEX_PARENT);
 	ret = cachefiles_inject_read_error();
 	if (ret == 0)
-		dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+		dentry = lookup_one(cache->mnt, QSTR(object->d_name), fan);
 	else
 		dentry = ERR_PTR(ret);
 	if (IS_ERR(dentry)) {
@@ -701,7 +701,7 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 		dput(dentry);
 		ret = cachefiles_inject_read_error();
 		if (ret == 0)
-			dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+			dentry = lookup_one(cache->mnt, QSTR(object->d_name), fan);
 		else
 			dentry = ERR_PTR(ret);
 		if (IS_ERR(dentry)) {
@@ -750,7 +750,7 @@ static struct dentry *cachefiles_lookup_for_cull(struct cachefiles_cache *cache,
 
 	inode_lock_nested(d_inode(dir), I_MUTEX_PARENT);
 
-	victim = lookup_one_len(filename, dir, strlen(filename));
+	victim = lookup_one(cache->mnt, QSTR(filename), dir);
 	if (IS_ERR(victim))
 		goto lookup_error;
 	if (d_is_negative(victim))
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/8] cachefiles: use correct mnt_idmap
  2025-03-14  0:34 [PATCH 0/8 RFC] tidy up various VFS lookup functions NeilBrown
                   ` (3 preceding siblings ...)
  2025-03-14  0:34 ` [PATCH 4/8] cachefiles: Use lookup_one() rather than lookup_one_len() NeilBrown
@ 2025-03-14  0:34 ` NeilBrown
  2025-03-14  0:34 ` [PATCH 6/8] VFS: rename lookup_one_len() family to lookup_noperm() and remove permission check NeilBrown
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2025-03-14  0:34 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, David Howells,
	Chuck Lever, Jeff Layton
  Cc: linux-nfs, netfs, linux-fsdevel

If cachesfiles is configured on a mount that is idmapped, it needs to
use the idmap from the mount point.  This patch changes all occurrences
of nop_mnt_idmap in cachefiles to file the correct mnt_idmap.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/cachefiles/interface.c |  6 ++++--
 fs/cachefiles/namei.c     | 14 ++++++++------
 fs/cachefiles/xattr.c     | 12 +++++++-----
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 3e63cfe15874..b2b9867d8428 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -143,7 +143,8 @@ static int cachefiles_adjust_size(struct cachefiles_object *object)
 		newattrs.ia_size = oi_size & PAGE_MASK;
 		ret = cachefiles_inject_remove_error();
 		if (ret == 0)
-			ret = notify_change(&nop_mnt_idmap, file->f_path.dentry,
+			ret = notify_change(mnt_idmap(file->f_path.mnt),
+					    file->f_path.dentry,
 					    &newattrs, NULL);
 		if (ret < 0)
 			goto truncate_failed;
@@ -153,7 +154,8 @@ static int cachefiles_adjust_size(struct cachefiles_object *object)
 	newattrs.ia_size = ni_size;
 	ret = cachefiles_inject_write_error();
 	if (ret == 0)
-		ret = notify_change(&nop_mnt_idmap, file->f_path.dentry,
+		ret = notify_change(file_mnt_idmap(file),
+				    file->f_path.dentry,
 				    &newattrs, NULL);
 
 truncate_failed:
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index a440a2ff5d41..b156a0671a3d 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -130,7 +130,8 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 			goto mkdir_error;
 		subdir = ERR_PTR(cachefiles_inject_write_error());
 		if (!IS_ERR(subdir))
-			subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
+			subdir = vfs_mkdir(mnt_idmap(cache->mnt), d_inode(dir),
+					   subdir, 0700);
 		ret = PTR_ERR(subdir);
 		if (IS_ERR(subdir)) {
 			trace_cachefiles_vfs_error(NULL, d_inode(dir), ret,
@@ -247,7 +248,8 @@ static int cachefiles_unlink(struct cachefiles_cache *cache,
 
 	ret = cachefiles_inject_remove_error();
 	if (ret == 0) {
-		ret = vfs_unlink(&nop_mnt_idmap, d_backing_inode(dir), dentry, NULL);
+		ret = vfs_unlink(mnt_idmap(cache->mnt),
+				 d_backing_inode(dir), dentry, NULL);
 		if (ret == -EIO)
 			cachefiles_io_error(cache, "Unlink failed");
 	}
@@ -386,10 +388,10 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 		cachefiles_io_error(cache, "Rename security error %d", ret);
 	} else {
 		struct renamedata rd = {
-			.old_mnt_idmap	= &nop_mnt_idmap,
+			.old_mnt_idmap	= mnt_idmap(cache->mnt),
 			.old_dir	= d_inode(dir),
 			.old_dentry	= rep,
-			.new_mnt_idmap	= &nop_mnt_idmap,
+			.new_mnt_idmap	= mnt_idmap(cache->mnt),
 			.new_dir	= d_inode(cache->graveyard),
 			.new_dentry	= grave,
 		};
@@ -455,7 +457,7 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
 
 	ret = cachefiles_inject_write_error();
 	if (ret == 0) {
-		file = kernel_tmpfile_open(&nop_mnt_idmap, &parentpath,
+		file = kernel_tmpfile_open(mnt_idmap(cache->mnt), &parentpath,
 					   S_IFREG | 0600,
 					   O_RDWR | O_LARGEFILE | O_DIRECT,
 					   cache->cache_cred);
@@ -714,7 +716,7 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 
 	ret = cachefiles_inject_read_error();
 	if (ret == 0)
-		ret = vfs_link(object->file->f_path.dentry, &nop_mnt_idmap,
+		ret = vfs_link(object->file->f_path.dentry, file_mnt_idmap(object->file),
 			       d_inode(fan), dentry, NULL);
 	if (ret < 0) {
 		trace_cachefiles_vfs_error(object, d_inode(fan), ret,
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index 52383b1d0ba6..a2ab9100ba74 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -67,7 +67,7 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object)
 	if (ret == 0) {
 		ret = mnt_want_write_file(file);
 		if (ret == 0) {
-			ret = vfs_setxattr(&nop_mnt_idmap, dentry,
+			ret = vfs_setxattr(file_mnt_idmap(file), dentry,
 					   cachefiles_xattr_cache, buf,
 					   sizeof(struct cachefiles_xattr) + len, 0);
 			mnt_drop_write_file(file);
@@ -116,7 +116,8 @@ int cachefiles_check_auxdata(struct cachefiles_object *object, struct file *file
 
 	xlen = cachefiles_inject_read_error();
 	if (xlen == 0)
-		xlen = vfs_getxattr(&nop_mnt_idmap, dentry, cachefiles_xattr_cache, buf, tlen);
+		xlen = vfs_getxattr(file_mnt_idmap(file), dentry,
+				    cachefiles_xattr_cache, buf, tlen);
 	if (xlen != tlen) {
 		if (xlen < 0) {
 			ret = xlen;
@@ -167,7 +168,7 @@ int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
 	if (ret == 0) {
 		ret = mnt_want_write(cache->mnt);
 		if (ret == 0) {
-			ret = vfs_removexattr(&nop_mnt_idmap, dentry,
+			ret = vfs_removexattr(mnt_idmap(cache->mnt), dentry,
 					      cachefiles_xattr_cache);
 			mnt_drop_write(cache->mnt);
 		}
@@ -230,7 +231,7 @@ bool cachefiles_set_volume_xattr(struct cachefiles_volume *volume)
 	if (ret == 0) {
 		ret = mnt_want_write(volume->cache->mnt);
 		if (ret == 0) {
-			ret = vfs_setxattr(&nop_mnt_idmap, dentry,
+			ret = vfs_setxattr(mnt_idmap(volume->cache->mnt), dentry,
 					   cachefiles_xattr_cache,
 					   buf, len, 0);
 			mnt_drop_write(volume->cache->mnt);
@@ -276,7 +277,8 @@ int cachefiles_check_volume_xattr(struct cachefiles_volume *volume)
 
 	xlen = cachefiles_inject_read_error();
 	if (xlen == 0)
-		xlen = vfs_getxattr(&nop_mnt_idmap, dentry, cachefiles_xattr_cache, buf, len);
+		xlen = vfs_getxattr(mnt_idmap(volume->cache->mnt),
+				    dentry, cachefiles_xattr_cache, buf, len);
 	if (xlen != len) {
 		if (xlen < 0) {
 			ret = xlen;
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/8] VFS: rename lookup_one_len() family to lookup_noperm() and remove permission check
  2025-03-14  0:34 [PATCH 0/8 RFC] tidy up various VFS lookup functions NeilBrown
                   ` (4 preceding siblings ...)
  2025-03-14  0:34 ` [PATCH 5/8] cachefiles: use correct mnt_idmap NeilBrown
@ 2025-03-14  0:34 ` NeilBrown
  2025-03-14  0:34 ` [PATCH 7/8] Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS NeilBrown
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2025-03-14  0:34 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, David Howells,
	Chuck Lever, Jeff Layton
  Cc: linux-nfs, netfs, linux-fsdevel

The lookup_one_len family of functions is (now) only used internally by
a filesystem on itself either
- in a context where permission checking is irrelevant such as by a
  virtual filesystem populating itself, or xfs accessing its ORPHANAGE
  or dquota accessing the quota file; or
- in a context where a permission check (MAY_EXEC on the parent) has just
  been performed such as a network filesystem finding a "silly-rename"
  file in the same directory.  This is also the context after the
  _parentat() functions where currently lookup_one_qstr_excl() is used.

So the permission check is pointless and given that nop_mnt_idmap is
currently used, it is possibly erroneous on a idmapped mount.

The name "one_len" is unhelpful in understanding the purpose of these
functions and should be changed.  Most of the callers pass the len as
"strlen()" so using a qstr and QSTR() can simplify the code.

This patch renames these functions (include lookup_positive_unlocked()
which is part of the family despite the name) to have a name based on
"lookup_noperm".  They are changed to receive a 'struct qstr' instead
of separate name and len.  In a few cases the use of QSTR() results in a
new call to strlen().

try_lookup_noperm() takes a pointer to a qstr instead of the whole
qstr.  This is consistent with d_hash_and_lookup() (which is nearly
identical) and useful for lookup_noperm_unlocked().

The new lookup_noperm_common() doesn't take a qstr yet.  That will be
tidied up in a subsequent patch.

Signed-off-by: NeilBrown <neil@brown.name>
---
 Documentation/filesystems/porting.rst | 20 +++++++
 arch/s390/hypfs/inode.c               |  2 +-
 drivers/android/binderfs.c            |  4 +-
 drivers/infiniband/hw/qib/qib_fs.c    |  4 +-
 fs/afs/dir.c                          |  2 +-
 fs/afs/dir_silly.c                    |  6 +-
 fs/autofs/dev-ioctl.c                 |  3 +-
 fs/binfmt_misc.c                      |  2 +-
 fs/debugfs/inode.c                    |  6 +-
 fs/ecryptfs/inode.c                   | 16 +++---
 fs/kernfs/mount.c                     |  4 +-
 fs/namei.c                            | 82 ++++++++++++++-------------
 fs/nfs/unlink.c                       | 11 ++--
 fs/overlayfs/export.c                 |  6 +-
 fs/overlayfs/namei.c                  |  2 +-
 fs/quota/dquot.c                      |  2 +-
 fs/smb/client/cached_dir.c            |  5 +-
 fs/smb/client/cifsfs.c                |  3 +-
 fs/tracefs/inode.c                    |  2 +-
 fs/xfs/scrub/orphanage.c              |  3 +-
 include/linux/namei.h                 |  8 +--
 ipc/mqueue.c                          |  5 +-
 kernel/bpf/inode.c                    |  2 +-
 security/apparmor/apparmorfs.c        |  4 +-
 security/inode.c                      |  2 +-
 25 files changed, 113 insertions(+), 93 deletions(-)

diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index 481b2656d7a0..49a7be912e5f 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -1206,3 +1206,23 @@ take a vfsmount instead of a mnt_idmap, and a qstr instead of a name and len.
 These, not the "one_len" versions, should be used whenever accessing a
 filesystem from outside that filesysmtem, through a mount point - unless
 a permission check has already been performed.
+
+---
+
+** mandatory**
+
+Functions try_lookup_one_len(), lookup_one_len(),
+lookup_one_len_unlocked() and lookup_positive_unlocked() have been
+renamed to try_lookup_noperm(), lookup_noperm(),
+lookup_noperm_unlocked(), lookup_noperm_positive_unlocked().  They now
+take a qstr instead of separate name and length.  QSTR() can be used
+when strlen() is needed for the length.
+
+For try_lookup_noperm() a reference to the qstr is passed in case the
+hash might subsequently be needed.
+
+These function no longer do any permission checking - they previously
+checked that the caller has 'X' permission on the parent.  They must
+ONLY be used internally by a filesystem on itself when it knows that
+permissions are irrelevant or in a context where permission checks have
+already been performed such as after vfs_path_parent_lookup()
diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c
index d428635abf08..2eed957393f4 100644
--- a/arch/s390/hypfs/inode.c
+++ b/arch/s390/hypfs/inode.c
@@ -341,7 +341,7 @@ static struct dentry *hypfs_create_file(struct dentry *parent, const char *name,
 	struct inode *inode;
 
 	inode_lock(d_inode(parent));
-	dentry = lookup_one_len(name, parent, strlen(name));
+	dentry = lookup_noperm(QSTR(name), parent);
 	if (IS_ERR(dentry)) {
 		dentry = ERR_PTR(-ENOMEM);
 		goto fail;
diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c
index bc6bae76ccaf..172e825168c8 100644
--- a/drivers/android/binderfs.c
+++ b/drivers/android/binderfs.c
@@ -187,7 +187,7 @@ static int binderfs_binder_device_create(struct inode *ref_inode,
 	inode_lock(d_inode(root));
 
 	/* look it up */
-	dentry = lookup_one_len(name, root, name_len);
+	dentry = lookup_noperm(QSTR(name), root);
 	if (IS_ERR(dentry)) {
 		inode_unlock(d_inode(root));
 		ret = PTR_ERR(dentry);
@@ -486,7 +486,7 @@ static struct dentry *binderfs_create_dentry(struct dentry *parent,
 {
 	struct dentry *dentry;
 
-	dentry = lookup_one_len(name, parent, strlen(name));
+	dentry = lookup_noperm(QSTR(name), parent);
 	if (IS_ERR(dentry))
 		return dentry;
 
diff --git a/drivers/infiniband/hw/qib/qib_fs.c b/drivers/infiniband/hw/qib/qib_fs.c
index b27791029fa9..c3aa40c3fb8f 100644
--- a/drivers/infiniband/hw/qib/qib_fs.c
+++ b/drivers/infiniband/hw/qib/qib_fs.c
@@ -89,7 +89,7 @@ static int create_file(const char *name, umode_t mode,
 	int error;
 
 	inode_lock(d_inode(parent));
-	*dentry = lookup_one_len(name, parent, strlen(name));
+	*dentry = lookup_noperm(QSTR(name), parent);
 	if (!IS_ERR(*dentry))
 		error = qibfs_mknod(d_inode(parent), *dentry,
 				    mode, fops, data);
@@ -432,7 +432,7 @@ static int remove_device_files(struct super_block *sb,
 	char unit[10];
 
 	snprintf(unit, sizeof(unit), "%u", dd->unit);
-	dir = lookup_one_len_unlocked(unit, sb->s_root, strlen(unit));
+	dir = lookup_noperm_unlocked(QSTR(unit), sb->s_root);
 
 	if (IS_ERR(dir)) {
 		pr_err("Lookup of %s failed\n", unit);
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 5bddcc20786e..0e633535d2c7 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -943,7 +943,7 @@ static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry)
 		}
 
 		strcpy(p, name);
-		ret = lookup_one_len(buf, dentry->d_parent, len);
+		ret = lookup_noperm(QSTR(buf), dentry->d_parent);
 		if (IS_ERR(ret) || d_is_positive(ret))
 			goto out_s;
 		dput(ret);
diff --git a/fs/afs/dir_silly.c b/fs/afs/dir_silly.c
index a1e581946b93..292a5c8c752a 100644
--- a/fs/afs/dir_silly.c
+++ b/fs/afs/dir_silly.c
@@ -113,16 +113,14 @@ int afs_sillyrename(struct afs_vnode *dvnode, struct afs_vnode *vnode,
 
 	sdentry = NULL;
 	do {
-		int slen;
-
 		dput(sdentry);
 		sillycounter++;
 
 		/* Create a silly name.  Note that the ".__afs" prefix is
 		 * understood by the salvager and must not be changed.
 		 */
-		slen = scnprintf(silly, sizeof(silly), ".__afs%04X", sillycounter);
-		sdentry = lookup_one_len(silly, dentry->d_parent, slen);
+		scnprintf(silly, sizeof(silly), ".__afs%04X", sillycounter);
+		sdentry = lookup_noperm(QSTR(silly), dentry->d_parent);
 
 		/* N.B. Better to return EBUSY here ... it could be dangerous
 		 * to delete the file while it's in use.
diff --git a/fs/autofs/dev-ioctl.c b/fs/autofs/dev-ioctl.c
index 6d57efbb8110..a577b175c38a 100644
--- a/fs/autofs/dev-ioctl.c
+++ b/fs/autofs/dev-ioctl.c
@@ -461,7 +461,8 @@ static int autofs_dev_ioctl_timeout(struct file *fp,
 				"prevent shutdown\n");
 
 		inode_lock_shared(inode);
-		dentry = try_lookup_one_len(param->path, base, path_len);
+		dentry = try_lookup_noperm(&QSTR_LEN(param->path, path_len),
+					   base);
 		inode_unlock_shared(inode);
 		if (IS_ERR_OR_NULL(dentry))
 			return dentry ? PTR_ERR(dentry) : -ENOENT;
diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index 5a7ebd160724..f0996a76ee7c 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -842,7 +842,7 @@ static ssize_t bm_register_write(struct file *file, const char __user *buffer,
 	}
 
 	inode_lock(d_inode(root));
-	dentry = lookup_one_len(e->name, root, strlen(e->name));
+	dentry = lookup_noperm(QSTR(e->name), root);
 	err = PTR_ERR(dentry);
 	if (IS_ERR(dentry))
 		goto out;
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 75715d8877ee..feb5ccb5bba8 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -346,7 +346,7 @@ struct dentry *debugfs_lookup(const char *name, struct dentry *parent)
 	if (!parent)
 		parent = debugfs_mount->mnt_root;
 
-	dentry = lookup_positive_unlocked(name, parent, strlen(name));
+	dentry = lookup_noperm_positive_unlocked(QSTR(name), parent);
 	if (IS_ERR(dentry))
 		return NULL;
 	return dentry;
@@ -388,7 +388,7 @@ static struct dentry *start_creating(const char *name, struct dentry *parent)
 	if (unlikely(IS_DEADDIR(d_inode(parent))))
 		dentry = ERR_PTR(-ENOENT);
 	else
-		dentry = lookup_one_len(name, parent, strlen(name));
+		dentry = lookup_noperm(QSTR(name), parent);
 	if (!IS_ERR(dentry) && d_really_is_positive(dentry)) {
 		if (d_is_dir(dentry))
 			pr_err("Directory '%s' with parent '%s' already present!\n",
@@ -872,7 +872,7 @@ int __printf(2, 3) debugfs_change_name(struct dentry *dentry, const char *fmt, .
 	}
 	if (strcmp(old_name.name.name, new_name) == 0)
 		goto out;
-	target = lookup_one_len(new_name, parent, strlen(new_name));
+	target = lookup_noperm(QSTR(new_name), parent);
 	if (IS_ERR(target)) {
 		error = PTR_ERR(target);
 		goto out;
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 51a5c54eb740..9a0d40bb43d6 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -394,8 +394,8 @@ static struct dentry *ecryptfs_lookup(struct inode *ecryptfs_dir_inode,
 	char *encrypted_and_encoded_name = NULL;
 	struct ecryptfs_mount_crypt_stat *mount_crypt_stat;
 	struct dentry *lower_dir_dentry, *lower_dentry;
-	const char *name = ecryptfs_dentry->d_name.name;
-	size_t len = ecryptfs_dentry->d_name.len;
+	struct qstr qname = QSTR_INIT(ecryptfs_dentry->d_name.name,
+				      ecryptfs_dentry->d_name.len);
 	struct dentry *res;
 	int rc = 0;
 
@@ -404,23 +404,25 @@ static struct dentry *ecryptfs_lookup(struct inode *ecryptfs_dir_inode,
 	mount_crypt_stat = &ecryptfs_superblock_to_private(
 				ecryptfs_dentry->d_sb)->mount_crypt_stat;
 	if (mount_crypt_stat->flags & ECRYPTFS_GLOBAL_ENCRYPT_FILENAMES) {
+		size_t len = qname.len;
 		rc = ecryptfs_encrypt_and_encode_filename(
 			&encrypted_and_encoded_name, &len,
-			mount_crypt_stat, name, len);
+			mount_crypt_stat, qname.name, len);
 		if (rc) {
 			printk(KERN_ERR "%s: Error attempting to encrypt and encode "
 			       "filename; rc = [%d]\n", __func__, rc);
 			return ERR_PTR(rc);
 		}
-		name = encrypted_and_encoded_name;
+		qname.name = encrypted_and_encoded_name;
+		qname.len = len;
 	}
 
-	lower_dentry = lookup_one_len_unlocked(name, lower_dir_dentry, len);
+	lower_dentry = lookup_noperm_unlocked(qname, lower_dir_dentry);
 	if (IS_ERR(lower_dentry)) {
-		ecryptfs_printk(KERN_DEBUG, "%s: lookup_one_len() returned "
+		ecryptfs_printk(KERN_DEBUG, "%s: lookup_noperm() returned "
 				"[%ld] on lower_dentry = [%s]\n", __func__,
 				PTR_ERR(lower_dentry),
-				name);
+				qname.name);
 		res = ERR_CAST(lower_dentry);
 	} else {
 		res = ecryptfs_lookup_interpose(ecryptfs_dentry, lower_dentry);
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 1358c21837f1..46943a7fb4df 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -233,8 +233,8 @@ struct dentry *kernfs_node_dentry(struct kernfs_node *kn,
 			dput(dentry);
 			return ERR_PTR(-EINVAL);
 		}
-		dtmp = lookup_positive_unlocked(kntmp->name, dentry,
-					       strlen(kntmp->name));
+		dtmp = lookup_noperm_positive_unlocked(QSTR(kntmp->name),
+							 dentry);
 		dput(dentry);
 		if (IS_ERR(dtmp))
 			return dtmp;
diff --git a/fs/namei.c b/fs/namei.c
index fb8dbabc5cbf..f89ed09c0b0b 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2833,9 +2833,9 @@ int vfs_path_lookup(struct dentry *dentry, struct vfsmount *mnt,
 }
 EXPORT_SYMBOL(vfs_path_lookup);
 
-static int lookup_one_common(struct mnt_idmap *idmap,
-			     const char *name, struct dentry *base, int len,
-			     struct qstr *this)
+static int lookup_noperm_common(const char *name, struct dentry *base,
+				  int len,
+				  struct qstr *this)
 {
 	this->name = name;
 	this->len = len;
@@ -2860,51 +2860,59 @@ static int lookup_one_common(struct mnt_idmap *idmap,
 		if (err < 0)
 			return err;
 	}
+	return 0;
+}
 
+static int lookup_one_common(struct mnt_idmap *idmap,
+			     const char *name, struct dentry *base, int len,
+			     struct qstr *this) {
+	int err;
+	err = lookup_noperm_common(name, base, len, this);
+	if (err < 0)
+		return err;
 	return inode_permission(idmap, base->d_inode, MAY_EXEC);
 }
 
 /**
- * try_lookup_one_len - filesystem helper to lookup single pathname component
- * @name:	pathname component to lookup
+ * try_lookup_noperm - filesystem helper to lookup single pathname component
+ * @name:	qstr storing pathname component to lookup
  * @base:	base directory to lookup from
- * @len:	maximum length @len should be interpreted to
  *
  * Look up a dentry by name in the dcache, returning NULL if it does not
  * currently exist.  The function does not try to create a dentry.
  *
  * Note that this routine is purely a helper for filesystem usage and should
- * not be called by generic code.
+ * not be called by generic code.  It does no permission checking.
  *
  * The caller must hold base->i_mutex.
  */
-struct dentry *try_lookup_one_len(const char *name, struct dentry *base, int len)
+struct dentry *try_lookup_noperm(struct qstr *name, struct dentry *base)
 {
 	struct qstr this;
 	int err;
 
 	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
 
-	err = lookup_one_common(&nop_mnt_idmap, name, base, len, &this);
+	err = lookup_noperm_common(name->name, base, name->len, &this);
 	if (err)
 		return ERR_PTR(err);
 
-	return lookup_dcache(&this, base, 0);
+	name->hash = this.hash;
+	return lookup_dcache(name, base, 0);
 }
-EXPORT_SYMBOL(try_lookup_one_len);
+EXPORT_SYMBOL(try_lookup_noperm);
 
 /**
- * lookup_one_len - filesystem helper to lookup single pathname component
- * @name:	pathname component to lookup
+ * lookup_noperm - filesystem helper to lookup single pathname component
+ * @name:	qstr storing pathname component to lookup
  * @base:	base directory to lookup from
- * @len:	maximum length @len should be interpreted to
  *
  * Note that this routine is purely a helper for filesystem usage and should
- * not be called by generic code.
+ * not be called by generic code.  It does no permission checking.
  *
  * The caller must hold base->i_mutex.
  */
-struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
+struct dentry *lookup_noperm(struct qstr name, struct dentry *base)
 {
 	struct dentry *dentry;
 	struct qstr this;
@@ -2912,14 +2920,14 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
 
 	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
 
-	err = lookup_one_common(&nop_mnt_idmap, name, base, len, &this);
+	err = lookup_noperm_common(name.name, base, name.len, &this);
 	if (err)
 		return ERR_PTR(err);
 
 	dentry = lookup_dcache(&this, base, 0);
 	return dentry ? dentry : __lookup_slow(&this, base, 0);
 }
-EXPORT_SYMBOL(lookup_one_len);
+EXPORT_SYMBOL(lookup_noperm);
 
 /**
  * lookup_one - lookup single pathname component
@@ -2957,7 +2965,7 @@ EXPORT_SYMBOL(lookup_one);
  *
  * This can be used for in-kernel filesystem clients such as file servers.
  *
- * Unlike lookup_one_len, it should be called without the parent
+ * Unlike lookup_one, it should be called without the parent
  * i_rwsem held, and will take the i_rwsem itself if necessary.
  */
 struct dentry *lookup_one_unlocked(struct vfsmount *mnt,
@@ -3010,47 +3018,41 @@ struct dentry *lookup_one_positive_unlocked(struct vfsmount *mnt,
 EXPORT_SYMBOL(lookup_one_positive_unlocked);
 
 /**
- * lookup_one_len_unlocked - filesystem helper to lookup single pathname component
+ * lookup_noperm_unlocked - filesystem helper to lookup single pathname component
  * @name:	pathname component to lookup
  * @base:	base directory to lookup from
  * @len:	maximum length @len should be interpreted to
  *
  * Note that this routine is purely a helper for filesystem usage and should
- * not be called by generic code.
+ * not be called by generic code. It does no permission checking.
  *
- * Unlike lookup_one_len, it should be called without the parent
- * i_mutex held, and will take the i_mutex itself if necessary.
+ * Unlike lookup_noperm, it should be called without the parent
+ * i_rwsem held, and will take the i_rwsem itself if necessary.
  */
-struct dentry *lookup_one_len_unlocked(const char *name,
-				       struct dentry *base, int len)
+struct dentry *lookup_noperm_unlocked(struct qstr name,
+					struct dentry *base)
 {
-	struct qstr this;
-	int err;
 	struct dentry *ret;
 
-	err = lookup_one_common(&nop_mnt_idmap, name, base, len, &this);
-	if (err)
-		return ERR_PTR(err);
-
-	ret = lookup_dcache(&this, base, 0);
+	ret = try_lookup_noperm(&name, base);
 	if (!ret)
-		ret = lookup_slow(&this, base, 0);
+		ret = lookup_slow(&name, base, 0);
 	return ret;
 }
-EXPORT_SYMBOL(lookup_one_len_unlocked);
+EXPORT_SYMBOL(lookup_noperm_unlocked);
 
 /*
- * Like lookup_one_len_unlocked(), except that it yields ERR_PTR(-ENOENT)
+ * Like lookup_noperm_unlocked(), except that it yields ERR_PTR(-ENOENT)
  * on negatives.  Returns known positive or ERR_PTR(); that's what
  * most of the users want.  Note that pinned negative with unlocked parent
- * _can_ become positive at any time, so callers of lookup_one_len_unlocked()
+ * _can_ become positive at any time, so callers of lookup_noperm_unlocked()
  * need to be very careful; pinned positives have ->d_inode stable, so
  * this one avoids such problems.
  */
-struct dentry *lookup_positive_unlocked(const char *name,
-				       struct dentry *base, int len)
+struct dentry *lookup_noperm_positive_unlocked(struct qstr name,
+						 struct dentry *base)
 {
-	struct dentry *ret = lookup_one_len_unlocked(name, base, len);
+	struct dentry *ret = lookup_noperm_unlocked(name, base);
 
 	if (!IS_ERR(ret) && d_flags_negative(smp_load_acquire(&ret->d_flags))) {
 		dput(ret);
@@ -3058,7 +3060,7 @@ struct dentry *lookup_positive_unlocked(const char *name,
 	}
 	return ret;
 }
-EXPORT_SYMBOL(lookup_positive_unlocked);
+EXPORT_SYMBOL(lookup_noperm_positive_unlocked);
 
 #ifdef CONFIG_UNIX98_PTYS
 int path_pts(struct path *path)
diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index bf77399696a7..f52e57c31ae4 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -464,18 +464,17 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry)
 
 	sdentry = NULL;
 	do {
-		int slen;
 		dput(sdentry);
 		sillycounter++;
-		slen = scnprintf(silly, sizeof(silly),
-				SILLYNAME_PREFIX "%0*llx%0*x",
-				SILLYNAME_FILEID_LEN, fileid,
-				SILLYNAME_COUNTER_LEN, sillycounter);
+		scnprintf(silly, sizeof(silly),
+			  SILLYNAME_PREFIX "%0*llx%0*x",
+			  SILLYNAME_FILEID_LEN, fileid,
+			  SILLYNAME_COUNTER_LEN, sillycounter);
 
 		dfprintk(VFS, "NFS: trying to rename %pd to %s\n",
 				dentry, silly);
 
-		sdentry = lookup_one_len(silly, dentry->d_parent, slen);
+		sdentry = lookup_noperm(QSTR(silly), dentry->d_parent);
 		/*
 		 * N.B. Better to return EBUSY here ... it could be
 		 * dangerous to delete the file while it's in use.
diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 444aeeccb6da..a51025e8b2d0 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -385,11 +385,9 @@ static struct dentry *ovl_lookup_real_one(struct dentry *connected,
 	 */
 	take_dentry_name_snapshot(&name, real);
 	/*
-	 * No idmap handling here: it's an internal lookup.  Could skip
-	 * permission checking altogether, but for now just use non-idmap
-	 * transformed ids.
+	 * No idmap handling here: it's an internal lookup.
 	 */
-	this = lookup_one_len(name.name.name, connected, name.name.len);
+	this = lookup_noperm(name.name, connected);
 	release_dentry_name_snapshot(&name);
 	err = PTR_ERR(this);
 	if (IS_ERR(this)) {
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 7ee0c43e6630..46fa8a41f6cc 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -758,7 +758,7 @@ struct dentry *ovl_get_index_fh(struct ovl_fs *ofs, struct ovl_fh *fh)
 	if (err)
 		return ERR_PTR(err);
 
-	index = lookup_positive_unlocked(name.name, ofs->workdir, name.len);
+	index = lookup_noperm_positive_unlocked(name, ofs->workdir);
 	kfree(name.name);
 	if (IS_ERR(index)) {
 		if (PTR_ERR(index) == -ENOENT)
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 825c5c2e0962..ea426467f26b 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -2560,7 +2560,7 @@ int dquot_quota_on_mount(struct super_block *sb, char *qf_name,
 	struct dentry *dentry;
 	int error;
 
-	dentry = lookup_positive_unlocked(qf_name, sb->s_root, strlen(qf_name));
+	dentry = lookup_noperm_positive_unlocked(QSTR(qf_name), sb->s_root);
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index fe738623cf1b..64308cccd1be 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -109,7 +109,8 @@ path_to_dentry(struct cifs_sb_info *cifs_sb, const char *path)
 		while (*s && *s != sep)
 			s++;
 
-		child = lookup_positive_unlocked(p, dentry, s - p);
+		child = lookup_noperm_positive_unlocked(QSTR_LEN(p, s - p),
+							dentry);
 		dput(dentry);
 		dentry = child;
 	} while (!IS_ERR(dentry));
@@ -207,7 +208,7 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon,
 	spin_unlock(&cfids->cfid_list_lock);
 
 	/*
-	 * Skip any prefix paths in @path as lookup_positive_unlocked() ends up
+	 * Skip any prefix paths in @path as lookup_noperm_positive_unlocked() ends up
 	 * calling ->lookup() which already adds those through
 	 * build_path_from_dentry().  Also, do it earlier as we might reconnect
 	 * below when trying to send compounded request and then potentially
diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
index 6a3bd652d251..71eadc0305b5 100644
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -925,7 +925,8 @@ cifs_get_root(struct smb3_fs_context *ctx, struct super_block *sb)
 		while (*s && *s != sep)
 			s++;
 
-		child = lookup_positive_unlocked(p, dentry, s - p);
+		child = lookup_noperm_positive_unlocked(QSTR_LEN(p, s - p),
+							dentry);
 		dput(dentry);
 		dentry = child;
 	} while (!IS_ERR(dentry));
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index cb1af30b49f5..aa7982e9b579 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -555,7 +555,7 @@ struct dentry *tracefs_start_creating(const char *name, struct dentry *parent)
 	if (unlikely(IS_DEADDIR(d_inode(parent))))
 		dentry = ERR_PTR(-ENOENT);
 	else
-		dentry = lookup_one_len(name, parent, strlen(name));
+		dentry = lookup_noperm(QSTR(name), parent);
 	if (!IS_ERR(dentry) && d_inode(dentry)) {
 		dput(dentry);
 		dentry = ERR_PTR(-EEXIST);
diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
index 3537f3cca6d5..987af5b2bb82 100644
--- a/fs/xfs/scrub/orphanage.c
+++ b/fs/xfs/scrub/orphanage.c
@@ -153,8 +153,7 @@ xrep_orphanage_create(
 
 	/* Try to find the orphanage directory. */
 	inode_lock_nested(root_inode, I_MUTEX_PARENT);
-	orphanage_dentry = lookup_one_len(ORPHANAGE, root_dentry,
-			strlen(ORPHANAGE));
+	orphanage_dentry = lookup_noperm(QSTR(ORPHANAGE), root_dentry);
 	if (IS_ERR(orphanage_dentry)) {
 		error = PTR_ERR(orphanage_dentry);
 		goto out_unlock_root;
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 5df2c1a4af25..6492c0c85049 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -69,10 +69,10 @@ int vfs_path_parent_lookup(struct filename *filename, unsigned int flags,
 int vfs_path_lookup(struct dentry *, struct vfsmount *, const char *,
 		    unsigned int, struct path *);
 
-extern struct dentry *try_lookup_one_len(const char *, struct dentry *, int);
-extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
-extern struct dentry *lookup_one_len_unlocked(const char *, struct dentry *, int);
-extern struct dentry *lookup_positive_unlocked(const char *, struct dentry *, int);
+extern struct dentry *try_lookup_noperm(struct qstr *, struct dentry *);
+extern struct dentry *lookup_noperm(struct qstr, struct dentry *);
+extern struct dentry *lookup_noperm_unlocked(struct qstr, struct dentry *);
+extern struct dentry *lookup_noperm_positive_unlocked(struct qstr, struct dentry *);
 struct dentry *lookup_one(struct vfsmount *, struct qstr, struct dentry *);
 struct dentry *lookup_one_unlocked(struct vfsmount *mnt,
 				   struct qstr name, struct dentry *base);
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 35b4f8659904..f3d76c4b6013 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -913,7 +913,7 @@ static int do_mq_open(const char __user *u_name, int oflag, umode_t mode,
 
 	ro = mnt_want_write(mnt);	/* we'll drop it in any case */
 	inode_lock(d_inode(root));
-	path.dentry = lookup_one_len(name->name, root, strlen(name->name));
+	path.dentry = lookup_noperm(QSTR(name->name), root);
 	if (IS_ERR(path.dentry)) {
 		error = PTR_ERR(path.dentry);
 		goto out_putfd;
@@ -969,8 +969,7 @@ SYSCALL_DEFINE1(mq_unlink, const char __user *, u_name)
 	if (err)
 		goto out_name;
 	inode_lock_nested(d_inode(mnt->mnt_root), I_MUTEX_PARENT);
-	dentry = lookup_one_len(name->name, mnt->mnt_root,
-				strlen(name->name));
+	dentry = lookup_noperm(QSTR(name->name), mnt->mnt_root);
 	if (IS_ERR(dentry)) {
 		err = PTR_ERR(dentry);
 		goto out_unlock;
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index dc3aa91a6ba0..77fda4101b06 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -421,7 +421,7 @@ static int bpf_iter_link_pin_kernel(struct dentry *parent,
 	int ret;
 
 	inode_lock(parent->d_inode);
-	dentry = lookup_one_len(name, parent, strlen(name));
+	dentry = lookup_noperm(QSTR(name), parent);
 	if (IS_ERR(dentry)) {
 		inode_unlock(parent->d_inode);
 		return PTR_ERR(dentry);
diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
index 6039afae4bfc..dfc56cec5ab5 100644
--- a/security/apparmor/apparmorfs.c
+++ b/security/apparmor/apparmorfs.c
@@ -283,7 +283,7 @@ static struct dentry *aafs_create(const char *name, umode_t mode,
 	dir = d_inode(parent);
 
 	inode_lock(dir);
-	dentry = lookup_one_len(name, parent, strlen(name));
+	dentry = lookup_noperm(QSTR(name), parent);
 	if (IS_ERR(dentry)) {
 		error = PTR_ERR(dentry);
 		goto fail_lock;
@@ -2551,7 +2551,7 @@ static int aa_mk_null_file(struct dentry *parent)
 		return error;
 
 	inode_lock(d_inode(parent));
-	dentry = lookup_one_len(NULL_FILE_NAME, parent, strlen(NULL_FILE_NAME));
+	dentry = lookup_noperm(QSTR(NULL_FILE_NAME), parent);
 	if (IS_ERR(dentry)) {
 		error = PTR_ERR(dentry);
 		goto out;
diff --git a/security/inode.c b/security/inode.c
index da3ab44c8e57..d04a52a52bdd 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -128,7 +128,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 	dir = d_inode(parent);
 
 	inode_lock(dir);
-	dentry = lookup_one_len(name, parent, strlen(name));
+	dentry = lookup_noperm(QSTR(name), parent);
 	if (IS_ERR(dentry))
 		goto out;
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 7/8] Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS
  2025-03-14  0:34 [PATCH 0/8 RFC] tidy up various VFS lookup functions NeilBrown
                   ` (5 preceding siblings ...)
  2025-03-14  0:34 ` [PATCH 6/8] VFS: rename lookup_one_len() family to lookup_noperm() and remove permission check NeilBrown
@ 2025-03-14  0:34 ` NeilBrown
  2025-03-14  0:34 ` [PATCH 8/8] VFS: change lookup_one_common and lookup_noperm_common to take a qstr NeilBrown
  2025-03-14 10:38 ` [PATCH 0/8 RFC] tidy up various VFS lookup functions Christian Brauner
  8 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2025-03-14  0:34 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, David Howells,
	Chuck Lever, Jeff Layton
  Cc: linux-nfs, netfs, linux-fsdevel

try_lookup_noperm() and d_hash_and_lookup() are nearly identical.  The
former does some validation of the name wile the latter doesn't.
Outside of the VFS that validation is likely valuable, and having only
one exported function for this task is certainly a good idea.

So make d_hash_and_lookup() local to VFS files and change all other
callers to try_lookup_noperm().  Note that the arguments are swapped.

Signed-off-by: NeilBrown <neil@brown.name>
---
 Documentation/filesystems/porting.rst | 11 +++++++++++
 fs/dcache.c                           |  1 -
 fs/efivarfs/super.c                   |  8 ++++----
 fs/internal.h                         |  1 +
 fs/proc/base.c                        |  2 +-
 fs/smb/client/readdir.c               |  3 ++-
 fs/xfs/scrub/orphanage.c              |  4 ++--
 include/linux/dcache.h                |  1 -
 net/sunrpc/rpc_pipe.c                 | 12 ++++++------
 security/selinux/selinuxfs.c          |  4 ++--
 10 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index 49a7be912e5f..672cd427d1e4 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -1226,3 +1226,14 @@ checked that the caller has 'X' permission on the parent.  They must
 ONLY be used internally by a filesystem on itself when it knows that
 permissions are irrelevant or in a context where permission checks have
 already been performed such as after vfs_path_parent_lookup()
+
+---
+
+** mandatory**
+
+d_hash_and_lookup() is no longer exported or available outside the VFS.
+Use try_lookup_noperm() instead.  This adds name validation and takes
+arguments in the opposite order but is otherwise identical.
+
+Using try_lookup_noperm() will require linux/namei.h to be included.
+
diff --git a/fs/dcache.c b/fs/dcache.c
index 726a5be2747b..17f8e0b7f04f 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2395,7 +2395,6 @@ struct dentry *d_hash_and_lookup(struct dentry *dir, struct qstr *name)
 	}
 	return d_lookup(dir, name);
 }
-EXPORT_SYMBOL(d_hash_and_lookup);
 
 /*
  * When a file is deleted, we have two options:
diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c
index 09fcf731e65d..c16206581b2d 100644
--- a/fs/efivarfs/super.c
+++ b/fs/efivarfs/super.c
@@ -219,7 +219,7 @@ bool efivarfs_variable_is_present(efi_char16_t *variable_name,
 
 	qstr.name = name;
 	qstr.len = strlen(name);
-	dentry = d_hash_and_lookup(sb->s_root, &qstr);
+	dentry = try_lookup_noperm(&qstr, sb->s_root);
 	kfree(name);
 	if (!IS_ERR_OR_NULL(dentry))
 		dput(dentry);
@@ -402,8 +402,8 @@ static bool efivarfs_actor(struct dir_context *ctx, const char *name, int len,
 {
 	unsigned long size;
 	struct efivarfs_ctx *ectx = container_of(ctx, struct efivarfs_ctx, ctx);
-	struct qstr qstr = { .name = name, .len = len };
-	struct dentry *dentry = d_hash_and_lookup(ectx->sb->s_root, &qstr);
+	struct qstr qstr = QSTR_INIT(name, len);
+	struct dentry *dentry = try_lookup_noperm(&qstr, ectx->sb->s_root);
 	struct inode *inode;
 	struct efivar_entry *entry;
 	int err;
@@ -451,7 +451,7 @@ static int efivarfs_check_missing(efi_char16_t *name16, efi_guid_t vendor,
 
 	qstr.name = name;
 	qstr.len = strlen(name);
-	dentry = d_hash_and_lookup(sb->s_root, &qstr);
+	dentry = try_lookup_noperm(&qstr, sb->s_root);
 	if (IS_ERR(dentry)) {
 		err = PTR_ERR(dentry);
 		goto out;
diff --git a/fs/internal.h b/fs/internal.h
index e7f02ae1e098..c21534a23196 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -66,6 +66,7 @@ int do_linkat(int olddfd, struct filename *old, int newdfd,
 int vfs_tmpfile(struct mnt_idmap *idmap,
 		const struct path *parentpath,
 		struct file *file, umode_t mode);
+struct dentry *d_hash_and_lookup(struct dentry *, struct qstr *);
 
 /*
  * namespace.c
diff --git a/fs/proc/base.c b/fs/proc/base.c
index cd89e956c322..7d36c7567c31 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2124,7 +2124,7 @@ bool proc_fill_cache(struct file *file, struct dir_context *ctx,
 	unsigned type = DT_UNKNOWN;
 	ino_t ino = 1;
 
-	child = d_hash_and_lookup(dir, &qname);
+	child = try_lookup_noperm(&qname, dir);
 	if (!child) {
 		DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 		child = d_alloc_parallel(dir, &qname, &wq);
diff --git a/fs/smb/client/readdir.c b/fs/smb/client/readdir.c
index 50f96259d9ad..7329ec532bcf 100644
--- a/fs/smb/client/readdir.c
+++ b/fs/smb/client/readdir.c
@@ -9,6 +9,7 @@
  *
  */
 #include <linux/fs.h>
+#include <linux/namei.h>
 #include <linux/pagemap.h>
 #include <linux/slab.h>
 #include <linux/stat.h>
@@ -78,7 +79,7 @@ cifs_prime_dcache(struct dentry *parent, struct qstr *name,
 
 	cifs_dbg(FYI, "%s: for %s\n", __func__, name->name);
 
-	dentry = d_hash_and_lookup(parent, name);
+	dentry = try_lookup_noperm(name, parent);
 	if (!dentry) {
 		/*
 		 * If we know that the inode will need to be revalidated
diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c
index 987af5b2bb82..f42ffad5a7b9 100644
--- a/fs/xfs/scrub/orphanage.c
+++ b/fs/xfs/scrub/orphanage.c
@@ -444,7 +444,7 @@ xrep_adoption_check_dcache(
 	if (!d_orphanage)
 		return 0;
 
-	d_child = d_hash_and_lookup(d_orphanage, &qname);
+	d_child = try_lookup_noperm(&qname, d_orphanage);
 	if (d_child) {
 		trace_xrep_adoption_check_child(sc->mp, d_child);
 
@@ -481,7 +481,7 @@ xrep_adoption_zap_dcache(
 	if (!d_orphanage)
 		return;
 
-	d_child = d_hash_and_lookup(d_orphanage, &qname);
+	d_child = try_lookup_noperm(&qname, d_orphanage);
 	while (d_child != NULL) {
 		trace_xrep_adoption_invalidate_child(sc->mp, d_child);
 
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 1f01f4e734c5..cf37ae54955d 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -288,7 +288,6 @@ extern void d_exchange(struct dentry *, struct dentry *);
 extern struct dentry *d_ancestor(struct dentry *, struct dentry *);
 
 extern struct dentry *d_lookup(const struct dentry *, const struct qstr *);
-extern struct dentry *d_hash_and_lookup(struct dentry *, struct qstr *);
 
 static inline unsigned d_count(const struct dentry *dentry)
 {
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index eadc00410ebc..98f78cd55905 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -631,7 +631,7 @@ static struct dentry *__rpc_lookup_create_exclusive(struct dentry *parent,
 					  const char *name)
 {
 	struct qstr q = QSTR(name);
-	struct dentry *dentry = d_hash_and_lookup(parent, &q);
+	struct dentry *dentry = try_lookup_noperm(&q, parent);
 	if (!dentry) {
 		dentry = d_alloc(parent, &q);
 		if (!dentry)
@@ -658,7 +658,7 @@ static void __rpc_depopulate(struct dentry *parent,
 	for (i = start; i < eof; i++) {
 		name.name = files[i].name;
 		name.len = strlen(files[i].name);
-		dentry = d_hash_and_lookup(parent, &name);
+		dentry = try_lookup_noperm(&name, parent);
 
 		if (dentry == NULL)
 			continue;
@@ -1190,7 +1190,7 @@ static const struct rpc_filelist files[] = {
 struct dentry *rpc_d_lookup_sb(const struct super_block *sb,
 			       const unsigned char *dir_name)
 {
-	return d_hash_and_lookup(sb->s_root, &QSTR(dir_name));
+	return try_lookup_noperm(&QSTR(dir_name), sb->s_root);
 }
 EXPORT_SYMBOL_GPL(rpc_d_lookup_sb);
 
@@ -1301,7 +1301,7 @@ rpc_gssd_dummy_populate(struct dentry *root, struct rpc_pipe *pipe_data)
 	struct dentry *pipe_dentry = NULL;
 
 	/* We should never get this far if "gssd" doesn't exist */
-	gssd_dentry = d_hash_and_lookup(root, &QSTR(files[RPCAUTH_gssd].name));
+	gssd_dentry = try_lookup_noperm(&QSTR(files[RPCAUTH_gssd].name), root);
 	if (!gssd_dentry)
 		return ERR_PTR(-ENOENT);
 
@@ -1311,8 +1311,8 @@ rpc_gssd_dummy_populate(struct dentry *root, struct rpc_pipe *pipe_data)
 		goto out;
 	}
 
-	clnt_dentry = d_hash_and_lookup(gssd_dentry,
-					&QSTR(gssd_dummy_clnt_dir[0].name));
+	clnt_dentry = try_lookup_noperm(&QSTR(gssd_dummy_clnt_dir[0].name),
+					  gssd_dentry);
 	if (!clnt_dentry) {
 		__rpc_depopulate(gssd_dentry, gssd_dummy_clnt_dir, 0, 1);
 		pipe_dentry = ERR_PTR(-ENOENT);
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index 47480eb2189b..e67a8ce4b64c 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -2158,8 +2158,8 @@ static int __init init_sel_fs(void)
 		return err;
 	}
 
-	selinux_null.dentry = d_hash_and_lookup(selinux_null.mnt->mnt_root,
-						&null_name);
+	selinux_null.dentry = try_lookup_noperm(&null_name,
+						  selinux_null.mnt->mnt_root);
 	if (IS_ERR(selinux_null.dentry)) {
 		pr_err("selinuxfs:  could not lookup null!\n");
 		err = PTR_ERR(selinux_null.dentry);
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 8/8] VFS: change lookup_one_common and lookup_noperm_common to take a qstr
  2025-03-14  0:34 [PATCH 0/8 RFC] tidy up various VFS lookup functions NeilBrown
                   ` (6 preceding siblings ...)
  2025-03-14  0:34 ` [PATCH 7/8] Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS NeilBrown
@ 2025-03-14  0:34 ` NeilBrown
  2025-03-14 10:38 ` [PATCH 0/8 RFC] tidy up various VFS lookup functions Christian Brauner
  8 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2025-03-14  0:34 UTC (permalink / raw)
  To: Alexander Viro, Christian Brauner, Jan Kara, David Howells,
	Chuck Lever, Jeff Layton
  Cc: linux-nfs, netfs, linux-fsdevel

These function already take a qstr of course, but they also currently
take a name/len was well and fill in the qstr.
Now they take a qstr that is already filled in, which is what all the
callers have.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/namei.c | 44 +++++++++++++++++++-------------------------
 1 file changed, 19 insertions(+), 25 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index f89ed09c0b0b..a47a3795f2c7 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2833,13 +2833,12 @@ int vfs_path_lookup(struct dentry *dentry, struct vfsmount *mnt,
 }
 EXPORT_SYMBOL(vfs_path_lookup);
 
-static int lookup_noperm_common(const char *name, struct dentry *base,
-				  int len,
-				  struct qstr *this)
+static int lookup_noperm_common(struct qstr *qname, struct dentry *base)
 {
-	this->name = name;
-	this->len = len;
-	this->hash = full_name_hash(base, name, len);
+	const char *name = qname->name;
+	u32 len = qname->len;
+
+	qname->hash = full_name_hash(base, name, len);
 	if (!len)
 		return -EACCES;
 
@@ -2856,7 +2855,7 @@ static int lookup_noperm_common(const char *name, struct dentry *base,
 	 * to use its own hash..
 	 */
 	if (base->d_flags & DCACHE_OP_HASH) {
-		int err = base->d_op->d_hash(base, this);
+		int err = base->d_op->d_hash(base, qname);
 		if (err < 0)
 			return err;
 	}
@@ -2864,10 +2863,10 @@ static int lookup_noperm_common(const char *name, struct dentry *base,
 }
 
 static int lookup_one_common(struct mnt_idmap *idmap,
-			     const char *name, struct dentry *base, int len,
-			     struct qstr *this) {
+			     struct qstr *qname, struct dentry *base)
+{
 	int err;
-	err = lookup_noperm_common(name, base, len, this);
+	err = lookup_noperm_common(qname, base);
 	if (err < 0)
 		return err;
 	return inode_permission(idmap, base->d_inode, MAY_EXEC);
@@ -2888,16 +2887,14 @@ static int lookup_one_common(struct mnt_idmap *idmap,
  */
 struct dentry *try_lookup_noperm(struct qstr *name, struct dentry *base)
 {
-	struct qstr this;
 	int err;
 
 	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
 
-	err = lookup_noperm_common(name->name, base, name->len, &this);
+	err = lookup_noperm_common(name, base);
 	if (err)
 		return ERR_PTR(err);
 
-	name->hash = this.hash;
 	return lookup_dcache(name, base, 0);
 }
 EXPORT_SYMBOL(try_lookup_noperm);
@@ -2915,17 +2912,16 @@ EXPORT_SYMBOL(try_lookup_noperm);
 struct dentry *lookup_noperm(struct qstr name, struct dentry *base)
 {
 	struct dentry *dentry;
-	struct qstr this;
 	int err;
 
 	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
 
-	err = lookup_noperm_common(name.name, base, name.len, &this);
+	err = lookup_noperm_common(&name, base);
 	if (err)
 		return ERR_PTR(err);
 
-	dentry = lookup_dcache(&this, base, 0);
-	return dentry ? dentry : __lookup_slow(&this, base, 0);
+	dentry = lookup_dcache(&name, base, 0);
+	return dentry ? dentry : __lookup_slow(&name, base, 0);
 }
 EXPORT_SYMBOL(lookup_noperm);
 
@@ -2943,17 +2939,16 @@ struct dentry *lookup_one(struct vfsmount *mnt, struct qstr name,
 			  struct dentry *base)
 {
 	struct dentry *dentry;
-	struct qstr this;
 	int err;
 
 	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
 
-	err = lookup_one_common(mnt_idmap(mnt), name.name, base, name.len, &this);
+	err = lookup_one_common(mnt_idmap(mnt), &name, base);
 	if (err)
 		return ERR_PTR(err);
 
-	dentry = lookup_dcache(&this, base, 0);
-	return dentry ? dentry : __lookup_slow(&this, base, 0);
+	dentry = lookup_dcache(&name, base, 0);
+	return dentry ? dentry : __lookup_slow(&name, base, 0);
 }
 EXPORT_SYMBOL(lookup_one);
 
@@ -2971,17 +2966,16 @@ EXPORT_SYMBOL(lookup_one);
 struct dentry *lookup_one_unlocked(struct vfsmount *mnt,
 				   struct qstr name, struct dentry *base)
 {
-	struct qstr this;
 	int err;
 	struct dentry *ret;
 
-	err = lookup_one_common(mnt_idmap(mnt), name.name, base, name.len, &this);
+	err = lookup_one_common(mnt_idmap(mnt), &name, base);
 	if (err)
 		return ERR_PTR(err);
 
-	ret = lookup_dcache(&this, base, 0);
+	ret = lookup_dcache(&name, base, 0);
 	if (!ret)
-		ret = lookup_slow(&this, base, 0);
+		ret = lookup_slow(&name, base, 0);
 	return ret;
 }
 EXPORT_SYMBOL(lookup_one_unlocked);
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/8 RFC] tidy up various VFS lookup functions
  2025-03-14  0:34 [PATCH 0/8 RFC] tidy up various VFS lookup functions NeilBrown
                   ` (7 preceding siblings ...)
  2025-03-14  0:34 ` [PATCH 8/8] VFS: change lookup_one_common and lookup_noperm_common to take a qstr NeilBrown
@ 2025-03-14 10:38 ` Christian Brauner
  2025-03-17  2:06   ` NeilBrown
  8 siblings, 1 reply; 12+ messages in thread
From: Christian Brauner @ 2025-03-14 10:38 UTC (permalink / raw)
  To: NeilBrown
  Cc: Alexander Viro, Jan Kara, David Howells, Chuck Lever, Jeff Layton,
	linux-nfs, netfs, linux-fsdevel

On Fri, Mar 14, 2025 at 11:34:06AM +1100, NeilBrown wrote:
> VFS has some functions with names containing "lookup_one_len" and others
> without the "_len".  This difference has nothing to do with "len".
> 
> The functions without "_len" take a "mnt_idmap" pointer.  This is found

When we added idmapped mounts there were all these *_len() helpers and I
orignally had just ported them to pass mnt_idmap. But we decided to not
do this. The argument might have been that most callers don't need to be
switched (I'm not actually sure if that's still true now that we have
quite a few filesystems that do support idmapped mounts.).

So then we added new helper and then we decided to use better naming
then that *_len() stuff. That's about it.

> in the "vfsmount" and that is an important question when choosing which
> to use: do you have a vfsmount, or are you "inside" the filesystem.  A
> related question is "is permission checking relevant here?".
> 
> nfsd and cachefiles *do* have a vfsmount but *don't* use the non-_len
> functions.  They pass nop_mnt_idmap which is not correct if the vfsmount
> is actually idmaped.  For cachefiles it probably (?) doesn't matter as
> the accesses to the backing filesystem are always does with elevated privileged (?).

Cachefiles explicitly refuse being mounted on top of an idmapped mount
and they require that the mount is attached (check_mnt()) and an
attached mount can never be idmapped as it has already been exposed in
the filesystem hierarchy.

> 
> For nfsd it would matter if anyone exported an idmapped filesystem.  I
> wonder if anyone has tried...

nfsd doesn't support exporting idmapped mounts. See check_export() where
that's explicitly checked.

If there are ways to circumvent this I'd be good to know.

> 
> These patches change the "lookup_one" functions to take a vfsmount
> instead of a mnt_idmap because I think that makes the intention clearer.

Please don't!

These internal lookup helpers intentionally do not take a vfsmount.
First, because they can be called in places where access to a vfsmount
isn't possible and we don't want to pass vfsmounts down to filesystems
ever!

Second, the mnt_idmap pointer is - with few safe exceptions - is
retrieved once in the VFS and then passed down so that e.g., permission
checking and file creation are guaranteed to use the same mnt_idmap
pointer.

A caller may start out with a non-idmapped detached mount (e.g., via
fsmount() or OPEN_TREE_CLONE) (nop_mnt_idmap) and call
inode_permission(). Now someone idmaps that mount. Now further down the
callchain someone calls lookup_one() which now retrieves the idmapping
again and now it's an idmapped mount. Now permission checking is out of
sync. That's an unlikely scenario but it's possible so lookup_one() is
not supposed to retrieve the idmapping again. Please keep passing it
explicitly. I've also written that down in the Documenation somewhere.

> 
> It also renames the "_one" functions to be "_noperm" and removes the
> permission checking completely.  In all cases where they are (correctly)
> used permission checking is irrelevant.

Ok, that sounds fine. Though I haven't taken the time to check the
callers yet. I'll try to do that during the weekend.

> 
> I haven't included changes to afs because there are patches in vfs.all
> which make a lot of changes to lookup in afs.  I think (if they are seen
> as a good idea) these patches should aim to land after the afs patches
> and any further fixup in afs can happen then.
> 
> The nfsd and cachefiles patches probably should be separate.  Maybe I
> should submit those to relevant maintainers first, and one afs,
> cachefiles, and nfsd changes have landed I can submit this series with
> appropriate modifications.
> 
> May main question for review is : have I understood mnt_idmap correctly?

I mean, you didn't ask semantic questions so much as syntactic, I think.
I hope I explained the reasoning sufficiently.

Thanks for the cleanups.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/8 RFC] tidy up various VFS lookup functions
  2025-03-14 10:38 ` [PATCH 0/8 RFC] tidy up various VFS lookup functions Christian Brauner
@ 2025-03-17  2:06   ` NeilBrown
  2025-03-18 13:57     ` Christian Brauner
  0 siblings, 1 reply; 12+ messages in thread
From: NeilBrown @ 2025-03-17  2:06 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexander Viro, Jan Kara, David Howells, Chuck Lever, Jeff Layton,
	linux-nfs, netfs, linux-fsdevel

On Fri, 14 Mar 2025, Christian Brauner wrote:
> On Fri, Mar 14, 2025 at 11:34:06AM +1100, NeilBrown wrote:
> > VFS has some functions with names containing "lookup_one_len" and others
> > without the "_len".  This difference has nothing to do with "len".
> > 
> > The functions without "_len" take a "mnt_idmap" pointer.  This is found
> 
> When we added idmapped mounts there were all these *_len() helpers and I
> orignally had just ported them to pass mnt_idmap. But we decided to not
> do this. The argument might have been that most callers don't need to be
> switched (I'm not actually sure if that's still true now that we have
> quite a few filesystems that do support idmapped mounts.).
> 
> So then we added new helper and then we decided to use better naming
> then that *_len() stuff. That's about it.
> 
> > in the "vfsmount" and that is an important question when choosing which
> > to use: do you have a vfsmount, or are you "inside" the filesystem.  A
> > related question is "is permission checking relevant here?".
> > 
> > nfsd and cachefiles *do* have a vfsmount but *don't* use the non-_len
> > functions.  They pass nop_mnt_idmap which is not correct if the vfsmount
> > is actually idmaped.  For cachefiles it probably (?) doesn't matter as
> > the accesses to the backing filesystem are always does with elevated privileged (?).
> 
> Cachefiles explicitly refuse being mounted on top of an idmapped mount
> and they require that the mount is attached (check_mnt()) and an
> attached mount can never be idmapped as it has already been exposed in
> the filesystem hierarchy.
> 
> > 
> > For nfsd it would matter if anyone exported an idmapped filesystem.  I
> > wonder if anyone has tried...
> 
> nfsd doesn't support exporting idmapped mounts. See check_export() where
> that's explicitly checked.
> 
> If there are ways to circumvent this I'd be good to know.

I should have checked that they rejected idmapped mounts
(is_idmapped_mnt()).  But I think that just changes my justification for
the change, not my desire to make the change.

There are two contexts in which lookup is done.  One is the common
context when there is a vfsmount present and permission checking is
expected.  nfsd and cachefiles both fit this context.

The other is when there is no vfsmount and/or permission checking is not
relevant.  This happens after lookup_parentat when the permission check
has already been performed, and in various virtual filesystems when the
filesystem itself is adding/removing files or in normal filesystems
where dedicated names like "lost+found" and "quota" are being accessed.

I would like to make a clear distinction between these, and for that to
be done nfsd and cachefiles need to be changed to clearly fit the first
context.  Whether they should allow idmapped mounts or not is to some
extent a separate question.  They do want to do permission checking
(certainly nfsd does) so they should use the same API as other
permission-checking code.

> 
> > 
> > These patches change the "lookup_one" functions to take a vfsmount
> > instead of a mnt_idmap because I think that makes the intention clearer.
> 
> Please don't!
> 
> These internal lookup helpers intentionally do not take a vfsmount.
> First, because they can be called in places where access to a vfsmount
> isn't possible and we don't want to pass vfsmounts down to filesystems
> ever!

There are two sorts of internal lookup helpers.
Those that currently don't even take a mnt_idmap and are called, as you
say, in places where a vfsmount isn't available.
And those that are currently called with a mnt_idmap and called (after a
few cleanup) in places where a vfsmount is readily available.


> 
> Second, the mnt_idmap pointer is - with few safe exceptions - is
> retrieved once in the VFS and then passed down so that e.g., permission
> checking and file creation are guaranteed to use the same mnt_idmap
> pointer.

In every case that I changed a call to pass a vfsmount instead of a
mnt_idmap, the mnt_idmap had recently been fetched from the vfsmount,
often by mnt_idmap() in the first argument to lookup_one().  Sometimes
by file_mnt_idmap() or similar.  So the patch never changed the safety
of the idmap.

One purpose of this change is to force callers to either acknowledge
that not permission checking is needed (by using a _noperm interface),
or to provide the correct idmap (not nop_mnt_idmap).

I understand that for object creation interfaces (vfs_mkdir etc) this
wouldn't work.  For these the idmap isn't needed only for permission
checking, but also for assigning an initial uid so the simple "vfsmount
or noperm" distinction doesn't work.  But for lookup I think it does.

> 
> A caller may start out with a non-idmapped detached mount (e.g., via
> fsmount() or OPEN_TREE_CLONE) (nop_mnt_idmap) and call
> inode_permission(). Now someone idmaps that mount. Now further down the
> callchain someone calls lookup_one() which now retrieves the idmapping
> again and now it's an idmapped mount. Now permission checking is out of
> sync. That's an unlikely scenario but it's possible so lookup_one() is
> not supposed to retrieve the idmapping again. Please keep passing it
> explicitly. I've also written that down in the Documenation somewhere.
> 

I couldn't easily find such documentation but I didn't look *very* hard.
fsmount and OPEN_TREE_CLONE only seem to be used in selftests though I
guess they could get used elsewhere.  However current callers of
lookup_one() always get the idmap from the vfsmount shortly before
calling lookup_one.

> 
> > 
> > It also renames the "_one" functions to be "_noperm" and removes the
> > permission checking completely.  In all cases where they are (correctly)
> > used permission checking is irrelevant.
> 
> Ok, that sounds fine. Though I haven't taken the time to check the
> callers yet. I'll try to do that during the weekend.
> 
> > 
> > I haven't included changes to afs because there are patches in vfs.all
> > which make a lot of changes to lookup in afs.  I think (if they are seen
> > as a good idea) these patches should aim to land after the afs patches
> > and any further fixup in afs can happen then.
> > 
> > The nfsd and cachefiles patches probably should be separate.  Maybe I
> > should submit those to relevant maintainers first, and one afs,
> > cachefiles, and nfsd changes have landed I can submit this series with
> > appropriate modifications.
> > 
> > May main question for review is : have I understood mnt_idmap correctly?
> 
> I mean, you didn't ask semantic questions so much as syntactic, I think.
> I hope I explained the reasoning sufficiently.
> 

You have helped me see some aspects of idmapping which I hadn't seen as
clearly before - thanks.

NeilBrown

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/8 RFC] tidy up various VFS lookup functions
  2025-03-17  2:06   ` NeilBrown
@ 2025-03-18 13:57     ` Christian Brauner
  0 siblings, 0 replies; 12+ messages in thread
From: Christian Brauner @ 2025-03-18 13:57 UTC (permalink / raw)
  To: NeilBrown
  Cc: Alexander Viro, Jan Kara, David Howells, Chuck Lever, Jeff Layton,
	linux-nfs, netfs, linux-fsdevel

On Mon, Mar 17, 2025 at 01:06:57PM +1100, NeilBrown wrote:
> On Fri, 14 Mar 2025, Christian Brauner wrote:
> > On Fri, Mar 14, 2025 at 11:34:06AM +1100, NeilBrown wrote:
> > > VFS has some functions with names containing "lookup_one_len" and others
> > > without the "_len".  This difference has nothing to do with "len".
> > > 
> > > The functions without "_len" take a "mnt_idmap" pointer.  This is found
> > 
> > When we added idmapped mounts there were all these *_len() helpers and I
> > orignally had just ported them to pass mnt_idmap. But we decided to not
> > do this. The argument might have been that most callers don't need to be
> > switched (I'm not actually sure if that's still true now that we have
> > quite a few filesystems that do support idmapped mounts.).
> > 
> > So then we added new helper and then we decided to use better naming
> > then that *_len() stuff. That's about it.
> > 
> > > in the "vfsmount" and that is an important question when choosing which
> > > to use: do you have a vfsmount, or are you "inside" the filesystem.  A
> > > related question is "is permission checking relevant here?".
> > > 
> > > nfsd and cachefiles *do* have a vfsmount but *don't* use the non-_len
> > > functions.  They pass nop_mnt_idmap which is not correct if the vfsmount
> > > is actually idmaped.  For cachefiles it probably (?) doesn't matter as
> > > the accesses to the backing filesystem are always does with elevated privileged (?).
> > 
> > Cachefiles explicitly refuse being mounted on top of an idmapped mount
> > and they require that the mount is attached (check_mnt()) and an
> > attached mount can never be idmapped as it has already been exposed in
> > the filesystem hierarchy.
> > 
> > > 
> > > For nfsd it would matter if anyone exported an idmapped filesystem.  I
> > > wonder if anyone has tried...
> > 
> > nfsd doesn't support exporting idmapped mounts. See check_export() where
> > that's explicitly checked.
> > 
> > If there are ways to circumvent this I'd be good to know.
> 
> I should have checked that they rejected idmapped mounts
> (is_idmapped_mnt()).  But I think that just changes my justification for
> the change, not my desire to make the change.
> 
> There are two contexts in which lookup is done.  One is the common
> context when there is a vfsmount present and permission checking is
> expected.  nfsd and cachefiles both fit this context.
> 
> The other is when there is no vfsmount and/or permission checking is not
> relevant.  This happens after lookup_parentat when the permission check
> has already been performed, and in various virtual filesystems when the
> filesystem itself is adding/removing files or in normal filesystems
> where dedicated names like "lost+found" and "quota" are being accessed.
> 
> I would like to make a clear distinction between these, and for that to
> be done nfsd and cachefiles need to be changed to clearly fit the first
> context.  Whether they should allow idmapped mounts or not is to some
> extent a separate question.  They do want to do permission checking
> (certainly nfsd does) so they should use the same API as other
> permission-checking code.
> 
> > 
> > > 
> > > These patches change the "lookup_one" functions to take a vfsmount
> > > instead of a mnt_idmap because I think that makes the intention clearer.
> > 
> > Please don't!
> > 
> > These internal lookup helpers intentionally do not take a vfsmount.
> > First, because they can be called in places where access to a vfsmount
> > isn't possible and we don't want to pass vfsmounts down to filesystems
> > ever!
> 
> There are two sorts of internal lookup helpers.
> Those that currently don't even take a mnt_idmap and are called, as you
> say, in places where a vfsmount isn't available.
> And those that are currently called with a mnt_idmap and called (after a
> few cleanup) in places where a vfsmount is readily available.

That's not the point. The vfsmount is the wrong data structure to pass
to lookup_one(). The mount is completely immaterial to what it does. The
only thing that matters is the idmap. Passing the vfsmount is actively
misleading.

> 
> 
> > 
> > Second, the mnt_idmap pointer is - with few safe exceptions - is
> > retrieved once in the VFS and then passed down so that e.g., permission
> > checking and file creation are guaranteed to use the same mnt_idmap
> > pointer.
> 
> In every case that I changed a call to pass a vfsmount instead of a
> mnt_idmap, the mnt_idmap had recently been fetched from the vfsmount,
> often by mnt_idmap() in the first argument to lookup_one().  Sometimes
> by file_mnt_idmap() or similar.  So the patch never changed the safety
> of the idmap.

Taking btrfs:

        dentry = lookup_one(parent->mnt, QSTR_LEN(name, namelen), parent->dentry);
        error = PTR_ERR(dentry);
        if (IS_ERR(dentry))
                goto out_unlock;

        error = btrfs_may_create(idmap, dir, dentry);
        if (error)
                goto out_dput;

You fetch the idmap pointer twice here. The only reason that this is
safe is because I made it so that while an active writer is on a mount
the idmap cannot be changed. But that's subtle knowledge. By passing the
idmap pointer directly it is visually trivial to figure out that it is
guaranteed to be the same idmap without having to know how
mnt_want_write_file() interacts with mount_setattr().

The changes here also make following the logic complex. The idmap
pointer is fetched once and passed down everywhere it is needed. But
suddenly that's not true anymore for lookup_one() where its the
vfsmount. First question this raises is whether the mount topology
somehow matters for the lookup when really it doesn't.

Your cleanup series is really good with or without stuffing vfsmount
into all of these helpers. So please just either resend it and continue
passing struct mnt_idmap or I'll change it when I apply.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-03-18 13:57 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-14  0:34 [PATCH 0/8 RFC] tidy up various VFS lookup functions NeilBrown
2025-03-14  0:34 ` [PATCH 1/8] VFS: improve interface for lookup_one functions NeilBrown
2025-03-14  0:34 ` [PATCH 2/8] nfsd: Use lookup_one() rather than lookup_one_len() NeilBrown
2025-03-14  0:34 ` [PATCH 3/8] nfsd: use correct idmap for all accesses NeilBrown
2025-03-14  0:34 ` [PATCH 4/8] cachefiles: Use lookup_one() rather than lookup_one_len() NeilBrown
2025-03-14  0:34 ` [PATCH 5/8] cachefiles: use correct mnt_idmap NeilBrown
2025-03-14  0:34 ` [PATCH 6/8] VFS: rename lookup_one_len() family to lookup_noperm() and remove permission check NeilBrown
2025-03-14  0:34 ` [PATCH 7/8] Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS NeilBrown
2025-03-14  0:34 ` [PATCH 8/8] VFS: change lookup_one_common and lookup_noperm_common to take a qstr NeilBrown
2025-03-14 10:38 ` [PATCH 0/8 RFC] tidy up various VFS lookup functions Christian Brauner
2025-03-17  2:06   ` NeilBrown
2025-03-18 13:57     ` Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox